Version: 10.0

Metrics and Monitoring

Experimental feature

Metrics and the /metrics endpoint are experimental features in the current version of webPDF. They are disabled by default and must be explicitly enabled in the configuration.

webPDF provides comprehensive metrics collection and export capabilities for monitoring application performance, system resources, and business operations. The metrics system is built on Micrometer and supports both Prometheus and JMX registries.

Overview

The metrics architecture implements a 4-layer monitoring system that measures performance at different stages of web service request processing:

Web Service Type Coverage
  • REST (Jersey): Uses all 4 layers (HTTP, API, Service, Thread Pool)
  • SOAP (JAX-WS): Uses Layer 3 (Service) and Layer 4 (Thread Pool) only

The HTTP Transport and REST API layers (Layer 1 and 2) are specific to REST endpoints and do not apply to SOAP web services.

Layered Architecture

Layer 1: HTTP Transport http_server_* [REST only]

  • Full request cycle: authentication, CORS, routing
  • Measures complete request/response time

  ↓ framework overhead

Layer 2: REST API api_operation_* [REST only]

  • Jersey endpoint execution and request/response processing

  ↓ serialization overhead

Layer 3: Business Logic service_method_* [REST + SOAP]

  • Web service operations (converter, PDF/A, signature, etc.)
  • Core processing logic

  ↓ queue wait time

Layer 4: Thread Pool threadpool_* [REST + SOAP]

  • Worker execution and queue management
  • Actual worker processing time

This layered approach allows precise identification of performance bottlenecks at each stage. Each layer's metrics measure different aspects, with Layer 1 being the most comprehensive (full request) and Layer 4 focusing on the actual worker execution.

Quick Start

  1. Enable metrics in conf/application.xml:

     <application>
       <metrics enabled="true">
         <prometheus enabled="true"/>
       </metrics>
     </application>

  2. Configure authentication (required by default):

     # Environment settings
     export WEBPDF_METRICS_AUTH_USERNAME=prometheus
     export WEBPDF_METRICS_AUTH_PASSWORD=your-secure-password

  3. Restart the webPDF server.

  4. Access the metrics endpoint:

     curl -u prometheus:your-secure-password http://localhost:8080/webPDF/metrics

  5. Configure Prometheus to scrape the endpoint (see Integration Guide).
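For the last step, a minimal Prometheus scrape configuration might look like the following sketch. The job name and target are illustrative; the credentials must match the values configured above, and `metrics_path` must include your context path:

```yaml
scrape_configs:
  - job_name: "webpdf"             # illustrative job name
    metrics_path: /webPDF/metrics  # includes the default context path
    scheme: http
    basic_auth:
      username: prometheus
      password: your-secure-password
    static_configs:
      - targets: ["localhost:8080"]
```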

Configuration

Metrics can be configured via conf/application.xml or environment variables.

For complete configuration details including parameters, presets, and environment settings, see the Configuration Guide.

Accessing Metrics

Prometheus Endpoint

Metrics are exposed via HTTP at:

http://<HOST_URL>/webPDF/metrics
Context Path

The endpoint URL includes the context path (default: webPDF). If you have changed the context path, adjust the URL accordingly.

Authentication is required by default. See the Authentication Guide for configuration details.

Example:

curl -u prometheus:your-password http://localhost:8080/webPDF/metrics

JMX Access

Metrics are also available via JMX under the configured domain (default: webpdf.metrics).

See Java Management Extension (JMX) for JMX configuration and access.

Available Metrics

webPDF collects metrics across multiple layers and components:

HTTP Transport Layer

  • Request duration, throughput, and active requests
  • Request/response payload sizes
  • Error rates (4xx/5xx)

REST API Layer

  • Endpoint execution timing
  • Operation-specific metrics

Business Logic Layer

  • Service method execution time
  • Request counts and error rates
  • Per-operation metrics (converter, PDF/A, signature, etc.)

Thread Pool Layer

  • Thread pool utilization
  • Queue depth and capacity
  • Rejected tasks (critical alert indicator)

Event Pipeline (Cluster)

  • Event consumer processing metrics
  • Dead letter queue (DLQ) metrics
  • Reprocessing statistics

JVM & System

  • Application version information
  • Memory usage (heap/non-heap)
  • Garbage collection metrics
  • JVM runtime information
  • CPU usage, disk space, file descriptors
  • Thread counts, class loader stats

See Metric Reference for complete details on all 58+ metrics.

Memory Usage Estimation

The metrics system provides memory forecasting based on configuration:

  • Base overhead: ~50 MB (registries, JVM metrics, system metrics)
  • Per timer (with percentiles): ~1 MB
  • Per timer (without percentiles): ~0.1 MB
  • Per gauge/counter: ~0.05 MB

Example calculation for 50 endpoints:

  • HTTP timers: 50 × 1 MB = 50 MB
  • API timers: 50 × 1 MB = 50 MB
  • Service timers: 10 × 1 MB = 10 MB
  • Thread pool gauges: 10 × 0.05 MB = 0.5 MB
  • Base overhead: 50 MB
  • Total: ~160 MB

Disabling percentiles reduces this to ~65 MB.
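As a sanity check, the example calculation above can be reproduced directly. The per-meter costs are the rough estimates quoted on this page, not measured values:

```shell
# Rough memory estimate for 50 endpoints, using the per-meter costs above
total_mb=$(awk 'BEGIN {
  base = 50          # registries, JVM metrics, system metrics
  http = 50 * 1.0    # 50 HTTP timers with percentiles
  api  = 50 * 1.0    # 50 API timers with percentiles
  svc  = 10 * 1.0    # 10 service timers with percentiles
  pool = 10 * 0.05   # 10 thread pool gauges
  printf "%.1f", base + http + api + svc + pool
}')
echo "estimated total: ${total_mb} MB"   # ~160 MB
```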

Health Monitoring

The metrics system includes automatic health monitoring that logs periodic snapshots of meter count and heap usage:

Metrics health snapshot: meters=1523 (delta=12), heap=256 MB / 2048 MB (delta=8 MB)

Warnings are emitted when:

  • Meter count exceeds the configured warning threshold (default: 10000, see Configuration Guide)
  • Heap usage exceeds 80% of maximum
  • Unusual growth detected (>1000 meters or >256 MB per interval)
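When scripting against the health log, values such as the meter count can be pulled out of a snapshot line. A minimal sketch, assuming the line format shown above:

```shell
# Extract the meter count from a health snapshot line (format as shown above)
line='Metrics health snapshot: meters=1523 (delta=12), heap=256 MB / 2048 MB (delta=8 MB)'
meters=$(printf '%s\n' "$line" | sed -n 's/.*meters=\([0-9]*\).*/\1/p')
echo "meters=${meters}"   # meters=1523
```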

Health Monitoring Configuration

Health logging can be configured via environment settings:

Environment Variable                 | Default | Description
------------------------------------ | ------- | ----------------------------------------------------
WEBPDF_METRICS_HEALTH_LOG_INTERVAL   | 300000  | Health snapshot interval in milliseconds (min: 30000)
WEBPDF_METRICS_HEALTH_WARN_COOLDOWN  | 3600000 | Warning cooldown period in milliseconds (min: 60000)

Example:

# Log health snapshots every 5 minutes, warn at most once per hour
export WEBPDF_METRICS_HEALTH_LOG_INTERVAL=300000
export WEBPDF_METRICS_HEALTH_WARN_COOLDOWN=3600000
Note

Health monitoring starts automatically when metrics are enabled. The logger runs as a daemon thread and logs to the standard application logger.

Common Use Cases

Monitor Request Performance

Track request latency and throughput:

# P95 latency
http_server_requests{quantile="0.95"}

# Request rate (per second)
rate(http_server_requests_count[5m])

# Error rate
rate(http_server_errors_server_total[5m])

Monitor Resource Utilization

Track thread pool and system resources:

# Thread pool utilization
(threadpool_active / threadpool_pool_size) * 100

# Queue depth
threadpool_queue_size

# Heap usage percentage
(jvm_memory_used_bytes{area="heap"} / jvm_memory_max_bytes{area="heap"}) * 100

Detect Performance Bottlenecks

Compare metrics across layers:

# HTTP overhead (auth, CORS)
http_server_requests{quantile="0.95"} - api_operation_duration{quantile="0.95"}

# Serialization overhead
api_operation_duration{quantile="0.95"} - service_method_duration{quantile="0.95"}

Alert on Critical Conditions

# Queue overflow (immediate action required!)
increase(threadpool_rejected_total[5m]) > 0

# High error rate
(sum(rate(http_server_errors_server_total[5m])) / sum(rate(http_server_requests_count[5m]))) > 0.05
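The expressions above can be packaged as Prometheus alerting rules. A sketch using the standard rule-file format; the rule names, durations, and labels are illustrative:

```yaml
groups:
  - name: webpdf-metrics            # illustrative group name
    rules:
      - alert: WebPDFQueueOverflow
        expr: increase(threadpool_rejected_total[5m]) > 0
        labels:
          severity: critical
        annotations:
          summary: "webPDF thread pool is rejecting tasks"
      - alert: WebPDFHighErrorRate
        expr: >
          (sum(rate(http_server_errors_server_total[5m]))
           / sum(rate(http_server_requests_count[5m]))) > 0.05
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "webPDF 5xx error rate above 5%"
```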

Best Practices

Production Deployments

  1. Disable percentiles to save memory (500-1000 MB reduction)
  2. Set cardinality limits to prevent metric explosion
  3. Enable authentication for security
  4. Use HTTPS for metrics endpoint (TLS configuration)
  5. Monitor health logs for warnings
  6. Set up alerts for critical conditions (queue overflow, high error rate)

Development/Testing

  1. Enable all metrics for detailed analysis
  2. Disable authentication for easy access (local only!)
  3. Enable startup logging to see all metrics
  4. Use percentiles for latency analysis

Performance

  1. Disable the HTTP layer if overhead is a concern (it has the highest impact)
  2. Limit percentiles to 1-2 values (e.g., only p95)
  3. Use minimal preset for resource-constrained environments

See Also