AZURE_BLOB Provider
The AZURE_BLOB provider uses Microsoft Azure Blob Storage for storing files. Azure Blob Storage is ideal for cloud-native deployments and cluster configurations where multiple webPDF Server nodes need to share file storage.
Overview
- Provider Name: AZURE_BLOB
- Supports: File Storage only
- Configuration File: provider-azure-blob.json
- Use Case: Azure cloud deployments, cluster file storage, geo-redundant storage
- Protocol: Azure Storage REST API
Features
- ✓ Microsoft Azure native integration
- ✓ High-performance blob storage
- ✓ Cluster-ready with shared file access
- ✓ Multiple redundancy options (LRS, ZRS, GRS, RA-GRS)
- ✓ Integrated Azure security features
- ✓ Lifecycle management and tiering
- ✓ ETag-based local caching for improved performance
- ✓ Concurrent file access
- ✓ Automatic retry with exponential backoff
- ✓ Four authentication methods (Shared Key, SAS Token, Connection String, Managed Identity)
- ✓ Automatic container creation with cluster race condition handling
- ✗ Requires Azure Storage account
- ✗ Cloud-dependent (requires internet connectivity)
Prerequisites
- Azure Storage Account (General Purpose v2 or Blob Storage)
- Azure Storage Account connection string or access keys
- Container created or automatic container creation enabled
- Network connectivity to Azure (direct or via VPN/ExpressRoute)
Configuration
provider.json
{
  "fileStorage": {
    "name": "AZURE_BLOB",
    "checks": {
      "enabled": true,
      "interval": 10000
    }
  }
}
provider-azure-blob.json
Configuration file for Azure Blob Storage:
{
"url": "https://webpdfstorageacct.blob.core.windows.net",
"username": "webpdfstorageacct",
"password": "storage-account-key-or-connection-string",
"containerName": "webpdf-files",
"authenticationType": "SHARED_KEY"
}
Configuration Attributes
Required Attributes
| Attribute | Type | Description | Example |
|---|---|---|---|
| url | string | Azure Blob Storage endpoint URL | https://account.blob.core.windows.net |
| username | string | Storage account name | webpdfstorageacct |
| password | string | Storage account key, SAS token, or connection string (depends on authenticationType) | (see below) |
Optional Attributes
| Attribute | Type | Description | Default |
|---|---|---|---|
| containerName | string | Azure container name for file storage | Auto-generated |
| authenticationType | string | Authentication method: SHARED_KEY, SAS_TOKEN, CONNECTION_STRING, DEFAULT_AZURE_CREDENTIAL | SHARED_KEY |
Authentication Methods
The Azure Blob provider supports four authentication types. The password field serves different purposes depending on the authentication method:
1. Shared Key Authentication (Default)
Uses storage account name and account key:
{
"url": "https://webpdfstorageacct.blob.core.windows.net",
"password": "base64-encoded-account-key",
"authenticationType": "SHARED_KEY"
}
Implementation details:
- Supports both Azurite format (http://host:port/accountname) and Azure format (https://accountname.blob.core.windows.net)
- The URL must include the account name
- Account name is extracted from the URL
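The extraction rule described above can be sketched in a few lines of Python. This is only an illustration of the two URL shapes, not the provider's actual code: the function name `extract_account_name` and the constant `AZURE_BLOB_SUFFIX` are hypothetical, and the sketch assumes the public-cloud host suffix.

```python
from urllib.parse import urlparse

# Assumption: public-cloud suffix only; sovereign clouds use other suffixes.
AZURE_BLOB_SUFFIX = ".blob.core.windows.net"


def extract_account_name(url: str) -> str:
    """Return the storage account name embedded in a blob endpoint URL."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    # Azure format: https://<account>.blob.core.windows.net
    if host.endswith(AZURE_BLOB_SUFFIX):
        return host.split(".", 1)[0]
    # Azurite format: http://<host>:<port>/<account>
    first_segment = parsed.path.strip("/").split("/", 1)[0]
    if first_segment:
        return first_segment
    raise ValueError("Unable to extract account name from URL")
```

A URL with neither a recognizable host nor a path segment, such as `http://127.0.0.1:10000`, raises the error, which mirrors the "Unable to extract account name from URL" message listed under Troubleshooting.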
2. SAS Token Authentication
Uses Shared Access Signature tokens for time-limited access:
{
"url": "https://webpdfstorageacct.blob.core.windows.net",
"password": "?sv=2021-06-08&ss=b&srt=sco&sp=rwdlacx&se=...",
"authenticationType": "SAS_TOKEN"
}
Note: With this method, the password field contains the SAS token, and the URL must include the account name.
3. Connection String Authentication
Uses a complete Azure connection string:
{
"password": "DefaultEndpointsProtocol=https;AccountName=webpdfstorageacct;AccountKey=...;EndpointSuffix=core.windows.net",
"authenticationType": "CONNECTION_STRING"
}
Note: The password field contains the full connection string when using this method.
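Because the whole connection string travels in a single field, it can help to split it into its parts when checking a configuration by hand. The following is a minimal sketch, assuming the usual `Key=Value;Key=Value` layout. The helper name `parse_connection_string` is hypothetical and not part of webPDF.

```python
def parse_connection_string(value: str) -> dict[str, str]:
    """Split an Azure-style connection string into a key/value mapping."""
    parts: dict[str, str] = {}
    for segment in value.split(";"):
        if not segment.strip():
            continue  # tolerate a trailing semicolon
        # Split on the first "=" only, so base64 padding in AccountKey survives.
        key, separator, val = segment.partition("=")
        if not separator:
            raise ValueError(f"Malformed segment: {segment!r}")
        parts[key.strip()] = val.strip()
    return parts
```

Checking that keys such as `AccountName` and `AccountKey` are present in the result is a quick way to spot a truncated string.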
4. Default Azure Credential (Recommended for Production)
Uses Azure Managed Identity for passwordless authentication:
{
"url": "https://webpdfstorageacct.blob.core.windows.net",
"authenticationType": "DEFAULT_AZURE_CREDENTIAL"
}
Authentication chain (credentials are tried in order):
- Environment Variables
- Managed Identity
- Azure CLI credentials
- Interactive browser login (development only)
Advantages:
- No credentials stored in configuration
- Automatic token rotation
- Enhanced security compliance
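The four configuration examples in this section each use a different subset of fields. A small pre-flight check can make that explicit. The sketch below is an assumption drawn only from those examples (it does not reflect how the provider itself validates settings), and `REQUIRED_FIELDS` and `missing_fields` are hypothetical names.

```python
# Fields each authentication type uses, as inferred from the examples above.
REQUIRED_FIELDS = {
    "SHARED_KEY": ("url", "password"),
    "SAS_TOKEN": ("url", "password"),
    "CONNECTION_STRING": ("password",),
    "DEFAULT_AZURE_CREDENTIAL": ("url",),
}


def missing_fields(config: dict) -> list[str]:
    """Return the fields that are absent or empty for the chosen auth type."""
    # SHARED_KEY is the documented default when authenticationType is omitted.
    auth_type = config.get("authenticationType", "SHARED_KEY")
    if auth_type not in REQUIRED_FIELDS:
        raise ValueError(f"Unknown authenticationType: {auth_type}")
    return [name for name in REQUIRED_FIELDS[auth_type] if not config.get(name)]
```

For example, a configuration that sets only `url` and no `authenticationType` would report `password` as missing, because the default is Shared Key.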
Container Naming
If containerName is not specified, it is automatically generated using the namespace configuration:
{org}-{app}-{env}-{version}
Example: webpdf-webpdf-prod-v10-0
Azure container names must be 3 to 63 characters long, lowercase, and can contain only letters, numbers, and hyphens. The name must start and end with a letter or number, and hyphens cannot be consecutive.
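To illustrate how a generated name relates to Azure's naming rules, here is a hedged sketch. It assumes that characters outside the allowed set (such as the dot in a version like `v10.0`) are replaced with hyphens, which is consistent with the example name above but is not confirmed behavior. It also applies Azure's published limits of 3 to 63 characters and no consecutive hyphens. Both function names are hypothetical.

```python
import re


def default_container_name(org: str, app: str, env: str, version: str) -> str:
    """Build a name from the {org}-{app}-{env}-{version} template."""
    raw = f"{org}-{app}-{env}-{version}".lower()
    # Assumption: disallowed characters (for example ".") become hyphens.
    cleaned = re.sub(r"[^a-z0-9-]+", "-", raw)
    # Collapse runs of hyphens and trim them from both ends.
    return re.sub(r"-{2,}", "-", cleaned).strip("-")


def is_valid_container_name(name: str) -> bool:
    """Check a name against Azure's container naming rules."""
    # Groups of lowercase letters or digits separated by single hyphens.
    well_formed = re.fullmatch(r"[a-z0-9]+(?:-[a-z0-9]+)*", name) is not None
    return well_formed and 3 <= len(name) <= 63
```

Running the validator on a manually chosen `containerName` before deployment catches uppercase letters or underscores early, before they surface as errors during file operations.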
Environment Variables
# Azure Blob file storage
WEBPDF_AZURE_BLOB_FILE_STORAGE_SETTINGS_URL=https://webpdfstorageacct.blob.core.windows.net
WEBPDF_AZURE_BLOB_FILE_STORAGE_SETTINGS_USERNAME=webpdfstorageacct
WEBPDF_AZURE_BLOB_FILE_STORAGE_SETTINGS_PASSWORD=account-key
WEBPDF_AZURE_BLOB_FILE_STORAGE_SETTINGS_CONTAINER_NAME=webpdf-files
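The variable names follow a common prefix plus the attribute name in upper snake case. The sketch below shows that correspondence by collecting the four variables listed above into a settings dictionary. The mapping is inferred from the names alone, and `settings_from_env` is a hypothetical helper, not a webPDF API.

```python
import os

PREFIX = "WEBPDF_AZURE_BLOB_FILE_STORAGE_SETTINGS_"

# Environment variable suffix -> attribute name in provider-azure-blob.json
ENV_TO_ATTRIBUTE = {
    "URL": "url",
    "USERNAME": "username",
    "PASSWORD": "password",
    "CONTAINER_NAME": "containerName",
}


def settings_from_env(environ=None) -> dict:
    """Collect the Azure Blob settings that are present in the environment."""
    environ = os.environ if environ is None else environ
    settings = {}
    for suffix, attribute in ENV_TO_ATTRIBUTE.items():
        value = environ.get(PREFIX + suffix)
        if value is not None:
            settings[attribute] = value
    return settings
```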
Health Checks
Health checks verify connectivity to Azure Blob Storage:
{
  "fileStorage": {
    "name": "AZURE_BLOB",
    "checks": {
      "enabled": true,
      "interval": 15000
    }
  }
}
- enabled: Enable health checks (recommended: true)
- interval: Check interval in milliseconds (recommended: 10000-30000)
Health Check Implementation
The health check performs a lightweight connectivity test:
- Service Properties Check - Validates network connectivity and authentication by reading the client properties
- Authentication Validation - Ensures credentials are valid
- Service Availability - Confirms Azure Blob Storage service is accessible
What is NOT checked:
- Container existence (container may not exist yet during initial setup)
- Write permissions (to minimize costs)
- Storage capacity
Health Status Results:
- UP - Connection established, authentication successful
- DOWN - Connection failed, authentication error, or service unavailable
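The mapping from a probe to the two status values can be pictured as follows. This is a deliberately small sketch. The `probe` callable stands in for whatever lightweight service call the provider makes, and the name `check_health` is hypothetical.

```python
from typing import Callable


def check_health(probe: Callable[[], object]) -> str:
    """Run a lightweight connectivity probe and map the outcome to UP or DOWN."""
    try:
        # The probe only needs to reach the service and authenticate.
        # It does not touch the container or write any data.
        probe()
    except Exception:
        # Connection failure, authentication error, or service unavailable.
        return "DOWN"
    return "UP"
```

Because the probe never touches the container, a result of UP does not guarantee that later file operations will succeed.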
Security Considerations
Access Control
- Managed Identity – Use Azure Managed Identity for passwordless authentication
- SAS Tokens – Time-limited access with granular permissions
- Account Keys – Rotate regularly (Azure supports two keys for zero-downtime rotation)
- Private Endpoints – Use Azure Private Link for network isolation
Network Security
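For Private Endpoints, use the private link URL in the url field. Note that Microsoft's guidance for private endpoints is to keep the regular endpoint URL and let DNS resolve it to the private address, so confirm which form works in your environment: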
{
"url": "https://webpdfstorageacct.privatelink.blob.core.windows.net"
}
Encryption
- At Rest – Azure automatically encrypts data (256-bit AES)
- In Transit – Always use HTTPS endpoints
- Customer-Managed Keys – Optional integration with Azure Key Vault
Never store Azure storage account keys directly in configuration files. Use environment variables, Azure Key Vault, or Managed Identity.
Azure Storage Tiers
Choose appropriate storage tier based on access patterns:
| Tier | Use Case | Cost | Access Time |
|---|---|---|---|
| Hot | Frequent access | Higher storage, lower access | Immediate |
| Cool | Infrequent access (>30 days) | Lower storage, higher access | Immediate |
| Archive | Rare access (>180 days) | Lowest storage, highest access | Hours |
webPDF typically uses Hot tier for active file storage. Consider lifecycle policies to move old files to Cool/Archive tiers.
Container Management
Automatic Container Creation
The Azure Blob provider attempts to create the container automatically if it doesn't exist.
Implementation details:
- The provider checks for container existence during initialization
- If the container doesn't exist, it attempts to create it
- Handles race conditions in cluster environments (HTTP 409 Conflict)
- If another node creates the container simultaneously, the provider retrieves the existing container client
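The check-then-create sequence and its race handling can be sketched with an in-memory stand-in. Everything here is hypothetical illustration: `FakeBlobService` and `ContainerAlreadyExists` replace the real Azure client and its HTTP 409 Conflict error, and `ensure_container` is not the provider's actual function.

```python
class ContainerAlreadyExists(Exception):
    """Stand-in for the HTTP 409 Conflict returned for an existing container."""


class FakeBlobService:
    """Hypothetical in-memory stand-in for a blob service client."""

    def __init__(self):
        self.containers = set()

    def exists(self, name):
        return name in self.containers

    def create(self, name):
        if name in self.containers:
            raise ContainerAlreadyExists(name)
        self.containers.add(name)
        return name

    def get(self, name):
        return name


def ensure_container(service, name):
    """Return a handle to the container, creating it if necessary."""
    if service.exists(name):
        return service.get(name)
    try:
        return service.create(name)
    except ContainerAlreadyExists:
        # Another node created it between exists() and create(): reuse it.
        return service.get(name)
```

Treating the conflict as success keeps startup idempotent, so several nodes can start at the same time without coordinating.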
Manual Container Creation
Using Azure CLI:
# Login to Azure
az login
# Create container
az storage container create \
  --name webpdf-files \
  --account-name webpdfstorageacct \
  --public-access off
Using Azure Portal:
- Navigate to Storage Account
- Select "Containers" under Data Storage
- Click "+ Container"
- Set name and private access level
Container Access Levels
- Private (recommended) – No anonymous access
- Blob – Anonymous read access to blobs
- Container – Anonymous read access to blobs and container
High Availability
Azure Blob Storage provides built-in redundancy:
Redundancy Options
- LRS (Locally Redundant Storage) – 3 copies in single datacenter
- ZRS (Zone Redundant Storage) – 3 copies across availability zones
- GRS (Geo-Redundant Storage) – 6 copies across two regions
- RA-GRS (Read-Access Geo-Redundant) – GRS + read access to secondary region
Recommended Configuration
For production webPDF clusters:
Storage Account Type: General Purpose v2
Performance: Standard
Replication: ZRS or GRS
Access Tier: Hot
Performance Tuning
Azure Storage Optimization
- Use Premium Storage – For higher throughput requirements
- Enable CDN – For geographically distributed users
- Optimize Blob Size – Larger blobs have better throughput
- Use Parallel Operations – Azure supports high concurrency
Built-in Retry Policy
The client implements exponential backoff for transient failures:
- Max Retries: 3 attempts
- Base Delay: 1 second
- Max Delay: 16 seconds
- Strategy: Exponential backoff
This configuration handles transient network issues and Azure throttling automatically.
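To make the numbers concrete, the sketch below computes a delay schedule from the listed parameters and wraps an operation in a retry loop. It assumes the delay doubles on each retry and that "3 attempts" means three retries after the initial call. The real client may also add jitter. The names `backoff_delays` and `call_with_retry` are hypothetical.

```python
import time


def backoff_delays(max_retries=3, base_delay=1.0, max_delay=16.0):
    """Delays in seconds before each retry: doubling, capped at max_delay."""
    return [min(base_delay * 2 ** attempt, max_delay) for attempt in range(max_retries)]


def call_with_retry(operation, is_transient, sleep=time.sleep, **policy):
    """Call operation(), retrying transient failures with exponential backoff."""
    for delay in backoff_delays(**policy):
        try:
            return operation()
        except Exception as error:
            if not is_transient(error):
                raise  # permanent errors are not retried
            sleep(delay)
    return operation()  # final attempt: any error propagates to the caller
```

Under these assumptions the default schedule is 1, 2, and 4 seconds, so the 16-second cap would only come into play if more retries were configured.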
File Caching
The implementation uses ETag-based caching:
- Local Cache - Files are cached locally after download
- ETag Validation - ETags are compared before re-downloading
- Automatic Invalidation - Cache is updated when ETags don't match
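The three steps above can be illustrated with a small in-memory model. This is a sketch of the general ETag technique, not the provider's implementation: `FakeRemote` is a hypothetical stand-in for blob storage and `EtagCache` is a hypothetical name for the local cache.

```python
class FakeRemote:
    """Hypothetical stand-in for blob storage that assigns a new ETag per write."""

    def __init__(self):
        self._blobs = {}  # name -> (etag, content)
        self._version = 0
        self.downloads = 0

    def put(self, name, content):
        self._version += 1
        self._blobs[name] = (f"etag-{self._version}", content)

    def etag(self, name):
        return self._blobs[name][0]

    def download(self, name):
        self.downloads += 1
        return self._blobs[name]


class EtagCache:
    """Serve local copies while the remote ETag is unchanged."""

    def __init__(self, remote):
        self._remote = remote
        self._entries = {}  # name -> (etag, content)

    def read(self, name):
        cached = self._entries.get(name)
        if cached is not None and cached[0] == self._remote.etag(name):
            return cached[1]  # ETag matches: skip the download
        etag, content = self._remote.download(name)
        self._entries[name] = (etag, content)  # refresh stale or missing entry
        return content
```

The saving comes from comparing ETags, which is a metadata-only request, instead of transferring the file body again.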
Monitoring
Azure Monitor Integration
Monitor storage metrics:
- Requests – Success rate, latency
- Availability – Service availability percentage
- Capacity – Used storage capacity
- Egress/Ingress – Data transfer
Alerts
Configure Azure Monitor alerts for:
- High latency (>1000ms)
- Error rate >1%
- Capacity thresholds
- Unusual access patterns
Troubleshooting
Cannot Connect to Azure Storage
- Verify storage account exists: az storage account show --name webpdfstorageacct
- Check network connectivity to Azure
- Verify account key or connection string
- Check firewall rules in Storage Account settings
Common error messages:
- "Connection not possible" - Network connectivity issue or invalid endpoint
- "Authentication failed" - Invalid credentials or expired SAS token
- "No container name set" - Container name not configured
- "Unable to extract account name from URL" - Invalid URL format
Container Not Found
The health check does not validate container existence. Container-related errors occur during file operations:
- Verify container exists: az storage container list --account-name webpdfstorageacct
- Check container name matches configuration (must be lowercase)
- Verify access permissions
- Enable automatic container creation
Note: In cluster environments, if you see HTTP 409 errors during startup, this is normal and indicates another node is creating the container simultaneously.
Authentication Errors
Different authentication types have different troubleshooting steps:
Shared Key:
- Verify account name is correct
- Check account key hasn't been rotated
- Validate URL format matches expected pattern
SAS Token:
- Validate token hasn't expired
- Check token permissions include required operations
- Ensure token is properly formatted (starts with ?)
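The expiry check for a SAS token can be done offline, since the expiry time is carried in the token's `se` query parameter. The following is a minimal sketch, assuming an ISO 8601 UTC timestamp in `se`. The helper names `sas_expiry` and `is_expired` are hypothetical.

```python
from datetime import datetime, timezone
from urllib.parse import parse_qs


def sas_expiry(token: str) -> datetime:
    """Read the signed expiry time ('se') from a SAS token."""
    params = parse_qs(token.lstrip("?"))
    if "se" not in params:
        raise ValueError("SAS token has no 'se' (signed expiry) parameter")
    # Normalize the trailing "Z" so older Python versions can parse it.
    expiry = datetime.fromisoformat(params["se"][0].replace("Z", "+00:00"))
    if expiry.tzinfo is None:
        expiry = expiry.replace(tzinfo=timezone.utc)  # SAS times are UTC
    return expiry


def is_expired(token: str, now=None) -> bool:
    """Return True when the token's expiry time has passed."""
    now = now or datetime.now(timezone.utc)
    return sas_expiry(token) <= now
```

The same parsed parameters also expose `sp`, the granted permissions, which is useful when an operation fails even though the token is still valid.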
Connection String:
- Verify complete connection string format
- Check all required parameters are present
- Validate endpoint suffix matches the Azure cloud environment (core.windows.net for the public cloud)
Default Azure Credential:
- Verify Managed Identity is assigned to resource
- Check environment variables are set correctly
- Ensure Azure CLI is authenticated (for development)
- Review Azure Identity authentication chain order
Slow Upload/Download
- Check network bandwidth to Azure
- Review Azure Storage metrics
- Consider Premium Storage tier
- Enable Azure CDN for downloads
- Check if geo-replication is causing latency
Examples
Basic Azure Blob Setup (Shared Key)
{
"url": "https://webpdfstorageacct.blob.core.windows.net",
"password": "AccountKey123...==",
"containerName": "webpdf-files",
"authenticationType": "SHARED_KEY"
}
With Connection String
{
"password": "DefaultEndpointsProtocol=https;AccountName=webpdfstorageacct;AccountKey=...==;EndpointSuffix=core.windows.net",
"containerName": "webpdf-prod-files",
"authenticationType": "CONNECTION_STRING"
}
With Private Endpoint
{
"url": "https://webpdfstorageacct.privatelink.blob.core.windows.net",
"username": "webpdfstorageacct",
"password": "AccountKey123...==",
"containerName": "webpdf-files",
"authenticationType": "SHARED_KEY"
}
With Managed Identity (Production Recommended)
{
"url": "https://webpdfstorageacct.blob.core.windows.net",
"containerName": "webpdf-prod-files",
"authenticationType": "DEFAULT_AZURE_CREDENTIAL"
}
Note: Requires an Azure Managed Identity assigned to the VM/App Service with an appropriate Storage Blob Data role (for example, Storage Blob Data Contributor).
Azurite (Local Development)
{
"url": "http://127.0.0.1:10000/devstoreaccount1",
"password": "Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq/K1SZFPTOtr/KBHBeksoGMGw==",
"containerName": "webpdf-dev",
"authenticationType": "SHARED_KEY"
}
Note: Uses Azurite emulator with default credentials for local development and testing.