Containers
webPDF can run as a container on a Docker host, with Docker Compose, or on container
orchestration platforms such as Kubernetes and OpenShift. The same application startup and
shutdown behavior applies in all cases: the container starts the webPDF server as its main
process, and the container runtime stops it by sending SIGTERM to that process.
For installation and image details, see Docker and Kubernetes containers.
Start a Container
The official container exposes the webPDF server on port 8080. A minimal Docker host
start command maps this port to the host:
docker run --name webpdf -p 8080:8080 softvisiondev/webpdf:latest
After startup, the portal is available at:
http://PORTAL_URL/webPDF/
For local Docker testing, http://localhost:8080/webPDF/ is commonly used.
The image also contains a built-in container health check against /webPDF/health.
You can inspect the runtime state with:
docker ps
docker inspect --format='{{json .State.Health}}' webpdf
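You can also query the health endpoint directly with any HTTP client. A minimal check, assuming the port mapping from the example above:

curl -s http://localhost:8080/webPDF/health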
Start with Persistent Data
For anything beyond a temporary test container, mount volumes for configuration, logs, and keystore data. This keeps the relevant runtime data independent from the container lifecycle:
docker run --name webpdf -p 8080:8080 \
  -v webpdf-config:/opt/webpdf/conf \
  -v webpdf-logs:/opt/webpdf/logs \
  -v webpdf-keystore:/opt/webpdf/keystore \
  --shm-size=2gb \
  softvisiondev/webpdf:latest
The container initializes missing configuration files from the image defaults when the configuration volume is empty. For the location of configuration files, logs, and other runtime directories, see Directories.
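To verify what was initialized, you can list the contents of the configuration volume with a throwaway helper container (the alpine image here is an arbitrary choice, not part of webPDF):

docker run --rm -v webpdf-config:/inspect alpine ls -la /inspect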
Allocate sufficient shared memory for document conversion workloads. The official Docker
Compose example uses shm_size: 2GB; with docker run, use --shm-size=2gb.
Docker Compose
With Docker Compose, start the service in detached mode:
docker compose up -d
Stop the service with:
docker compose stop
Restart the service after configuration changes with:
docker compose restart
Use docker compose logs webPDF to inspect startup and shutdown output when the service
name follows the official Compose example.
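For orientation, a minimal docker-compose.yml along these lines combines the options from the docker run examples above. This is a sketch, not the official Compose example; the service name and volume layout are assumptions based on this page:

services:
  webPDF:
    image: softvisiondev/webpdf:latest
    ports:
      - "8080:8080"
    shm_size: 2GB
    volumes:
      - webpdf-config:/opt/webpdf/conf
      - webpdf-logs:/opt/webpdf/logs
      - webpdf-keystore:/opt/webpdf/keystore

volumes:
  webpdf-config:
  webpdf-logs:
  webpdf-keystore: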
Stop a Container
Use docker stop for normal shutdown:
docker stop webpdf
Docker sends SIGTERM first and waits for the container stop timeout before it sends
SIGKILL. If webPDF is processing long-running jobs, set a stop timeout that is long enough
for the configured shutdown behavior:
docker stop --time 120 webpdf
When creating a long-running container, you can also define the stop timeout up front:
docker run --name webpdf --stop-timeout 120 -p 8080:8080 softvisiondev/webpdf:latest
Avoid docker kill for normal operation. It bypasses the graceful shutdown path and can
interrupt active document processing immediately.
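While the container stops, you can follow the shutdown output in a second terminal:

docker logs -f webpdf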
Container Configuration
Container settings are usually passed as environment variables. For example, use
JAVA_PARAMETERS for JVM memory settings and standard Linux environment variables for locale
and timezone:
docker run --name webpdf -p 8080:8080 \
  -e JAVA_PARAMETERS="-Xmx4g -Xms1g" \
  -e LANG=en_US.UTF-8 \
  -e LC_ALL=en_US.UTF-8 \
  -e LANGUAGE=en_US.UTF-8 \
  -e TZ=America/New_York \
  softvisiondev/webpdf:latest
webPDF-specific Environment Settings can also be passed as container environment variables. For configuration precedence and naming conventions, see the relevant configuration topics.
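For example, the shutdown settings described below can be passed the same way. A sketch reusing the WEBPDF_SERVER_SHUTDOWN_TIMEOUT variable documented in the configuration tables further down:

docker run --name webpdf -p 8080:8080 \
  -e WEBPDF_SERVER_SHUTDOWN_TIMEOUT=25 \
  softvisiondev/webpdf:latest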
Graceful Shutdown on Orchestrated Platforms
When webPDF runs as a container in Kubernetes or OpenShift, a naive shutdown can lead to brief but visible connection errors for clients, even during normal rolling deployments. The following sections explain the root cause and how to configure a graceful shutdown that prevents these errors.
The Problem: The Race Window
When Kubernetes terminates a pod, two independent processes run in parallel without synchronization:
Kubernetes API: delete pod
        │
        ├──────────────────────────────────┐
        │ Path A                           │ Path B
        │ Endpoint deregistration          │ Container termination
        │                                  │
        ▼                                  ▼
Endpoint controller                kubelet runs preStop hook
removes pod from                   (if configured), then sends
Service endpoint slice             SIGTERM to container
        │                                  │
        ▼                                  ▼
kube-proxy / iptables              Container shuts down,
propagates to all nodes            stops accepting connections
(~1–5+ seconds)
        │
        ▼
Load balancer stops routing
traffic to this pod
The race window: Path A (endpoint propagation) takes 1–5+ seconds to complete across
all cluster nodes. During this time the container has already received SIGTERM and may
have stopped Tomcat — but the load balancer is still routing new requests to the pod.
The result is Connection refused errors and 502 Bad Gateway responses for clients.
This race condition occurs during every rolling deployment, scale-down, and node drain — not just during emergency shutdowns. It is not a bug but a fundamental property of how Kubernetes manages endpoint propagation.
In OpenShift, the situation is amplified by the HAProxy-based router, which reloads its backend configuration at a configurable interval (often every 1–5 seconds). This can extend the race window compared to standard Kubernetes Ingress controllers.
Health Probes: Live vs. Ready
Kubernetes uses two independent health probes to manage a pod's lifecycle and traffic routing:
| Probe | Path | k8s reaction on failure | Role during shutdown |
|---|---|---|---|
| Liveness | /health/live | Restart the container | Must stay UP — prevents unwanted restart |
| Readiness | /health/ready | Remove pod from Service endpoints | Set DOWN immediately — stops new traffic |
This distinction is the key to a clean shutdown:
- Setting /health/ready to DOWN signals the load balancer to stop routing new traffic, without triggering a container restart.
- Setting /health/live to DOWN would cause Kubernetes to kill and restart the container, which is the opposite of what is wanted during an intentional shutdown.
kubelet polls the readiness probe at the interval defined by periodSeconds. Only after
failureThreshold consecutive failures does it mark the pod as "not ready" and trigger
removal from the endpoint slice. With periodSeconds: 5 and failureThreshold: 2, it
takes up to 10 seconds from the first 503 response until the load balancer stops
routing traffic. Your drain period must be longer than this window.
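For example, the probe settings used in the deployment examples below produce exactly this 10-second window:

readinessProbe:
  httpGet:
    path: /health/ready
    port: 8080
  periodSeconds: 5      # poll every 5 seconds
  failureThreshold: 2   # 2 consecutive failures → up to 10s until endpoint removal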
The Solution: Two-Phase Shutdown
The Two-Phase Shutdown resolves the race condition by inserting a traffic drain phase before the application begins its actual shutdown sequence:
Phase 1 – Traffic Drain
Immediately on SIGTERM:
- /health/ready → 503 Service Unavailable: the load balancer stops routing new requests
- /health/live → 200 OK: Kubernetes does not restart the container
- The server keeps running and continues to process all in-flight requests
- After a configurable wait period, the endpoint propagation has completed
Phase 2 – Full Shutdown
After the drain period:
- All health checks are set to DOWN (including liveness)
- Session management is stopped
- Active worker threads are given time to complete (see Worker Thread Handling)
- Tomcat is shut down gracefully
SIGTERM received
  │
  ▼ Phase 1 – Traffic Drain
  │   /health/ready → 503  (load balancer removes pod, ~2–10s)
  │   /health/live  → 200  (no container restart)
  │   Server keeps running, in-flight requests complete
  │
  ├─ wait WEBPDF_CONTAINER_DRAIN_SECONDS
  │
  ▼ Phase 2 – Full Shutdown
      All probes → DOWN
      Active workers finish (or are stopped gracefully)
      Tomcat shuts down
      JVM terminates
Complete shutdown sequence
SIGTERM
  │
  ▼ JVM Shutdown Hook
  │
  ▼ Application shutdown starts
  │   │
  │   ▼ Application services stop
  │   │
  │   │  ╔══════════════════════════════════════════════════════╗
  │   │  ║ PHASE 1 – Traffic Drain                              ║
  │   │  ╠══════════════════════════════════════════════════════╣
  │   │  ║ Readiness is set to DOWN                             ║
  │   │  ║   /health/ready → 503  ← kubelet removes pod         ║
  │   │  ║   /health/live  → 200  ← no container restart        ║
  │   │  ║                                                      ║
  │   │  ║ sleep(WEBPDF_CONTAINER_DRAIN_SECONDS)                ║
  │   │  ║   Tomcat still up, in-flight requests served         ║
  │   │  ║   Workers completing reduce load for Phase 2         ║
  │   │  ╠══════════════════════════════════════════════════════╣
  │   │  ║ PHASE 2 – Full Shutdown                              ║
  │   │  ╠══════════════════════════════════════════════════════╣
  │   │  ║ Compute worker-shutdown outer timeout                ║
  │   │  ║   = WEBPDF_SERVER_SHUTDOWN_GRACEFUL_TIMEOUT          ║
  │   │  ║     (or calculated from current load if 0)           ║
  │   │  ║   + WEBPDF_SERVER_SHUTDOWN_TIMEOUT × 2               ║
  │   │  ║   + 10s internal margin                              ║
  │   │  ║                                                      ║
  │   │  ║ All probes → DOWN (incl. /health/live)               ║
  │   │  ║ Session expiry is disabled during shutdown           ║
  │   │  ║                                                      ║
  │   │  ║ Worker shutdown [bounded by outer timeout above]     ║
  │   │  ║   WAIT   workers finish naturally (calculated time)  ║
  │   │  ║   STOP   no new jobs, running jobs complete          ║
  │   │  ║          timeout: WEBPDF_SERVER_SHUTDOWN_TIMEOUT     ║
  │   │  ║   ABORT  interrupt remaining threads at checkpoints  ║
  │   │  ║          timeout: WEBPDF_SERVER_SHUTDOWN_TIMEOUT     ║
  │   │  ║                                                      ║
  │   │  ║ Background services stop                             ║
  │   │  ║   Chromium conversion support stops                  ║
  │   │  ║   Runtime configuration resources stop               ║
  │   │  ║   Worker monitoring stops (max. 15s)                 ║
  │   │  ║   Billing tasks stop (max. 15s)                      ║
  │   │  ║   Billing data is saved                              ║
  │   │  ║                                                      ║
  │   │  ║ Management resources stop                            ║
  │   │  ║ Temporary working directories are removed            ║
  │   │  ╚══════════════════════════════════════════════════════╝
  │   │
  │   └─ Cluster resources stop (if cluster mode is active)
  │
  ▼ Embedded web server stops
  │
  ▼ JVM terminates
Configuration
All shutdown-related settings are read at server startup from either an environment variable or a JVM system property. Environment variables take precedence when both are set. The naming convention is:
| Mechanism | Format | Example |
|---|---|---|
| Environment variable | UPPERCASE_WITH_UNDERSCORES | WEBPDF_CONTAINER_DRAIN_SECONDS=20 |
| JVM system property | lowercase.with.dots | -Dwebpdf.container.drain.seconds=20 |
Drain Period
The drain period controls how long Phase 1 waits before the full shutdown begins:
| Environment Variable | System Property | Default | Description |
|---|---|---|---|
| WEBPDF_CONTAINER_DRAIN_SECONDS | webpdf.container.drain.seconds | 0 (disabled) | Seconds to wait in Phase 1 before proceeding with full shutdown |
Set this to 0 or leave it unset to skip the drain phase entirely (backwards-compatible
default for non-container deployments).
How to choose the right value:
WEBPDF_CONTAINER_DRAIN_SECONDS
  ≥ readinessProbe.periodSeconds × readinessProbe.failureThreshold
    + endpoint propagation buffer (5s for Kubernetes, 10s for OpenShift)
    + safety margin (2–3s)

Example: periodSeconds=5, failureThreshold=2, Kubernetes
  = 5 × 2 + 5 + 3 = 18s → set 20s
Worker Shutdown Timeouts
After Phase 1, the server waits for active conversion and processing threads to finish before forcibly interrupting them. These timeouts can be set as environment variables or JVM system properties:
| Environment Variable | System Property | Default | Description |
|---|---|---|---|
| WEBPDF_SERVER_SHUTDOWN_GRACEFUL | webpdf.server.shutdown.graceful | true | Wait for active workers to finish naturally |
| WEBPDF_SERVER_SHUTDOWN_GRACEFUL_TIMEOUT | webpdf.server.shutdown.graceful.timeout | 0 (auto) | Maximum seconds to wait (0 = calculated from current load) |
| WEBPDF_SERVER_SHUTDOWN_TIMEOUT | webpdf.server.shutdown.timeout | 30 | Timeout for the stop and abort phases (seconds) |
When WEBPDF_SERVER_SHUTDOWN_GRACEFUL_TIMEOUT is set to 0 (the default), the server
calculates the WAIT duration from the actual workload at shutdown time. This is optimal
for standalone deployments, but it makes terminationGracePeriodSeconds hard to size
deterministically in container environments. For Kubernetes and OpenShift, set it to a
finite value (e.g. 30) that reflects your typical maximum job duration; this caps the
WAIT phase and allows a reliable terminationGracePeriodSeconds formula.
webPDF automatically derives the outer timeout for the worker shutdown process from your configured values — there is no separate hard limit to worry about:
outer timeout = WEBPDF_SERVER_SHUTDOWN_GRACEFUL_TIMEOUT (if set, else calculated from load)
                + WEBPDF_SERVER_SHUTDOWN_TIMEOUT × 2
                + 10s internal margin
This means you can freely increase WEBPDF_SERVER_SHUTDOWN_TIMEOUT without hitting an
invisible ceiling. The only requirement is that terminationGracePeriodSeconds in your
Deployment spec is large enough to accommodate the total shutdown duration (see below).
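As a worked example with the values used in the Kubernetes deployment below (graceful timeout 30s, shutdown timeout 25s):

outer timeout = 30 + 25 × 2 + 10 = 90s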
Kubernetes terminationGracePeriodSeconds
Kubernetes sends SIGKILL after terminationGracePeriodSeconds regardless of the
application state. All phases must complete within this window:
terminationGracePeriodSeconds
  ≥ preStop duration (e.g. 5s)
    + WEBPDF_CONTAINER_DRAIN_SECONDS (e.g. 20s)
    + WEBPDF_SERVER_SHUTDOWN_GRACEFUL_TIMEOUT (e.g. 30s, or max. expected job duration)
    + WEBPDF_SERVER_SHUTDOWN_TIMEOUT × 2 (STOP + ABORT, e.g. 25 × 2 = 50s)
    + internal margin (10s)
    + background cleanup (~15s)
    + safety margin (10s)

Formula: preStop + drain + graceful.timeout + timeout×2 + 35s
Example: 5 + 20 + 30 + 50 + 35 = 140s → set 150s
Set WEBPDF_SERVER_SHUTDOWN_GRACEFUL_TIMEOUT to a finite value so you can calculate a
fixed terminationGracePeriodSeconds. When left at 0, the WAIT phase duration depends
on the actual workload at shutdown time, which may exceed a conservatively sized grace period
during peak load.
Complete Kubernetes Example
apiVersion: apps/v1
kind: Deployment
metadata:
  name: webpdf
spec:
  replicas: 2
  selector:
    matchLabels:
      app: webpdf
  template:
    metadata:
      labels:
        app: webpdf
    spec:
      # preStop(5) + drain(20) + graceful(30) + stop+abort(50) + 35s overhead = 140s → 150s
      terminationGracePeriodSeconds: 150
      containers:
        - name: webpdf
          image: softvisiondev/webpdf:latest
          env:
            # Phase 1: traffic drain duration
            - name: WEBPDF_CONTAINER_DRAIN_SECONDS
              value: "20"
            # Phase 2: cap WAIT phase at 30s (enables deterministic terminationGracePeriodSeconds)
            - name: WEBPDF_SERVER_SHUTDOWN_GRACEFUL_TIMEOUT
              value: "30"
            # Phase 2: timeout per STOP and ABORT phase
            - name: WEBPDF_SERVER_SHUTDOWN_TIMEOUT
              value: "25"
          lifecycle:
            preStop:
              exec:
                # Gives Kubernetes time to propagate the endpoint removal
                # internally before SIGTERM starts the drain timer.
                command: ["sh", "-c", "sleep 5"]
          livenessProbe:
            httpGet:
              path: /health/live
              port: 8080
            initialDelaySeconds: 30
            periodSeconds: 10
            failureThreshold: 3
            timeoutSeconds: 5
          readinessProbe:
            httpGet:
              path: /health/ready
              port: 8080
            initialDelaySeconds: 10
            periodSeconds: 5    # Check frequently during drain
            failureThreshold: 2 # → max 10s until pod is removed from endpoints
            timeoutSeconds: 3
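To roll this out and watch the result, standard kubectl commands can be used (the manifest filename here is just an example):

kubectl apply -f webpdf-deployment.yaml
kubectl rollout status deployment/webpdf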
OpenShift Example
OpenShift uses an HAProxy-based router that reloads its backend configuration at intervals (typically every 1–5 seconds). This can extend the effective race window and requires a slightly longer drain period:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: webpdf
spec:
  replicas: 2
  selector:
    matchLabels:
      app: webpdf
  template:
    metadata:
      labels:
        app: webpdf
    spec:
      # preStop(10) + drain(30) + graceful(30) + stop+abort(50) + 35s overhead = 155s → 165s
      terminationGracePeriodSeconds: 165
      containers:
        - name: webpdf
          image: softvisiondev/webpdf:latest
          env:
            # Longer drain for OpenShift HAProxy router reload latency
            - name: WEBPDF_CONTAINER_DRAIN_SECONDS
              value: "30"
            - name: WEBPDF_SERVER_SHUTDOWN_GRACEFUL_TIMEOUT
              value: "30"
            - name: WEBPDF_SERVER_SHUTDOWN_TIMEOUT
              value: "25"
          lifecycle:
            preStop:
              exec:
                # Slightly longer sleep for OpenShift router propagation
                command: ["sh", "-c", "sleep 10"]
          readinessProbe:
            httpGet:
              path: /health/ready
              port: 8080
            periodSeconds: 5
            failureThreshold: 2
            timeoutSeconds: 3
          livenessProbe:
            httpGet:
              path: /health/live
              port: 8080
            periodSeconds: 10
            failureThreshold: 3
            timeoutSeconds: 5
Worker Thread Handling
Active conversion jobs (PDF conversion, signing, OCR, etc.) run in dedicated worker threads that are independent of the HTTP lifecycle. The shutdown process handles them in three stages after the drain period:
| Stage | Action | Timeout |
|---|---|---|
| Wait | Workers that are still running are given time to finish naturally. The wait time is calculated from the current load. | Calculated (configurable via WEBPDF_SERVER_SHUTDOWN_GRACEFUL_TIMEOUT) |
| Stop | New jobs are rejected. Running jobs continue to completion. | WEBPDF_SERVER_SHUTDOWN_TIMEOUT (default 30s) |
| Abort | Running threads are interrupted. Workers check for the interrupt signal at defined checkpoints and stop cleanly. | WEBPDF_SERVER_SHUTDOWN_TIMEOUT (default 30s) |
Workers that complete during the drain period (Phase 1) are no longer counted in the wait calculation — the drain time is not "wasted", it actively reduces the worker shutdown duration.
Workers interrupted in the Abort stage stop at defined checkpoints inside the conversion pipeline. Partial results are discarded and temporary files are cleaned up. The client receives an error response for the interrupted request.
Optional: Manual Pre-Shutdown Endpoint
webPDF provides a management endpoint that triggers Phase 1 manually, independently of
SIGTERM. This is useful for preStop hooks, rolling deployment automation, or
manual maintenance scenarios.
| Property | Value |
|---|---|
| Method | POST |
| Path | /health/shutdown |
| Auth | Authorization: Bearer <token> (required; 401 if missing/invalid, 404 if no token configured) |
| Effect | Sets readiness to DOWN, liveness stays UP |
| Response | 200 OK with current readiness state as JSON |
| Idempotent | Yes; repeated calls have no additional effect |
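A manual call could look like this, assuming a token configured as described in the next section:

curl -X POST \
  -H "Authorization: Bearer your-secret-token" \
  http://localhost:8080/health/shutdown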
Authentication (Bearer Token)
The endpoint is disabled by default. It must be explicitly activated by configuring a
Bearer Token — without a token, POST /health/shutdown returns 404 Not Found.
Token resolution priority (highest to lowest):
| Priority | Source | Example |
|---|---|---|
| 1 | System property | -Dwebpdf.health.shutdown.token=secret |
| 2 | Environment variable | WEBPDF_HEALTH_SHUTDOWN_TOKEN=secret |
| 3 | server.xml | <health><shutdown token="secret"/></health> |
The environment variable and system property work without any server.xml configuration
— no <health> element is required when using WEBPDF_HEALTH_SHUTDOWN_TOKEN.
Option A: Environment variable only (recommended for containers)
WEBPDF_HEALTH_SHUTDOWN_TOKEN=your-secret-token
Option B: server.xml
<server>
  <health>
    <shutdown token="your-secret-token"/>
  </health>
</server>
An empty token (token="" or blank) is treated as not configured in all sources.
Store the token as a Kubernetes Secret and inject it as an environment variable:
env:
  - name: WEBPDF_HEALTH_SHUTDOWN_TOKEN
    valueFrom:
      secretKeyRef:
        name: webpdf-shutdown-secret
        key: token
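The referenced secret can be created up front, for example with kubectl (names match the snippet above):

kubectl create secret generic webpdf-shutdown-secret --from-literal=token=your-secret-token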
Usage in a preStop hook (with authentication):
lifecycle:
  preStop:
    exec:
      command:
        - sh
        - -c
        - |
          curl -sf -X POST \
            -H "Authorization: Bearer $(cat /var/run/secrets/shutdown-token/token)" \
            http://localhost:8080/health/shutdown || true
          sleep 20
Or with an inline token (only suitable for non-sensitive environments):
lifecycle:
  preStop:
    exec:
      command:
        - sh
        - -c
        - |
          curl -sf -X POST \
            -H "Authorization: Bearer ${WEBPDF_HEALTH_SHUTDOWN_TOKEN}" \
            http://localhost:8080/health/shutdown || true
          sleep 20
If you use POST /health/shutdown in a preStop hook and set
WEBPDF_CONTAINER_DRAIN_SECONDS > 0, the drain wait time accumulates: the preStop sleep
plus the in-process drain period both run. Use one or the other:
- Use WEBPDF_CONTAINER_DRAIN_SECONDS for the automatic drain triggered by SIGTERM.
- Use POST /health/shutdown plus a sleep in a preStop hook if you need the drain to start before SIGTERM arrives (this allows a shorter WEBPDF_CONTAINER_DRAIN_SECONDS, or 0).
The /health/shutdown endpoint requires a Bearer Token (Authorization: Bearer <token>).
Without a configured token, all requests return 404 Not Found — the endpoint is
completely inaccessible. For additional network-level restriction, use a Kubernetes
NetworkPolicy to limit access to cluster-internal traffic, or restrict access at the
embedded web server so only local requests can reach the endpoint.
Quick Reference
| Setting | Recommended Value | Notes |
|---|---|---|
| WEBPDF_CONTAINER_DRAIN_SECONDS | 20 (k8s) / 30 (OpenShift) | System property: webpdf.container.drain.seconds. Must be ≥ periodSeconds × failureThreshold + propagation buffer |
| WEBPDF_SERVER_SHUTDOWN_GRACEFUL_TIMEOUT | 30 | System property: webpdf.server.shutdown.graceful.timeout. Caps the WAIT phase; enables deterministic terminationGracePeriodSeconds sizing |
| WEBPDF_SERVER_SHUTDOWN_TIMEOUT | 25 | System property: webpdf.server.shutdown.timeout. Timeout per STOP and ABORT phase |
| WEBPDF_SERVER_SHUTDOWN_GRACEFUL | true (default) | System property: webpdf.server.shutdown.graceful. Keep enabled for clean job completion |
| WEBPDF_HEALTH_SHUTDOWN_TOKEN | (Kubernetes Secret) | System property: webpdf.health.shutdown.token. Activates POST /health/shutdown. Without a token the endpoint returns 404 Not Found. Takes precedence over server.xml. |
| terminationGracePeriodSeconds | preStop + drain + graceful.timeout + timeout×2 + 35 | Kubernetes Deployment spec. Example: 5+20+30+50+35 = 140 → set 150 |
| preStop: sleep | 5s (k8s) / 10s (OpenShift) | Endpoint propagation safety net |
| readinessProbe.periodSeconds | 5 | Frequent checks for fast drain detection |
| readinessProbe.failureThreshold | 2 | Balances speed vs. flap sensitivity |