Version: 10.0

Metrics Integration

This guide shows how to integrate webPDF metrics with monitoring systems like Prometheus and Grafana.

Prometheus Integration

Prometheus is the recommended monitoring system for webPDF metrics. It scrapes the metrics endpoint periodically and stores time-series data.

Basic Configuration

Configure Prometheus to scrape the metrics endpoint with authentication:

Using Bearer Token Authentication

prometheus.yml:

scrape_configs:
  - job_name: 'webpdf'
    static_configs:
      - targets: ['localhost:8080']
    metrics_path: '/webPDF/metrics'
    bearer_token: 'your-secure-api-token'
    scrape_interval: 15s

Using Basic Authentication

prometheus.yml:

scrape_configs:
  - job_name: 'webpdf'
    static_configs:
      - targets: ['localhost:8080']
    metrics_path: '/webPDF/metrics'
    basic_auth:
      username: 'prometheus'
      password: 'your-secure-password'
    scrape_interval: 15s
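
Before wiring credentials into Prometheus, it can help to send the same kind of request by hand. The following is a minimal Python sketch, not part of webPDF or Prometheus. It assumes the endpoint accepts the standard Authorization header in the form Prometheus sends it (Bearer or Basic). The function names are hypothetical, and the host and credentials are the placeholders from the configurations above.

```python
import base64
import urllib.request

METRICS_PATH = "/webPDF/metrics"  # path taken from the scrape configs above


def bearer_request(base_url: str, token: str) -> urllib.request.Request:
    """Build a GET request that sends the token as a Bearer credential."""
    req = urllib.request.Request(base_url.rstrip("/") + METRICS_PATH)
    req.add_header("Authorization", f"Bearer {token}")
    return req


def basic_request(base_url: str, username: str, password: str) -> urllib.request.Request:
    """Build a GET request that sends HTTP Basic credentials."""
    raw = f"{username}:{password}".encode("utf-8")
    req = urllib.request.Request(base_url.rstrip("/") + METRICS_PATH)
    req.add_header("Authorization", "Basic " + base64.b64encode(raw).decode("ascii"))
    return req


def fetch_metrics(req: urllib.request.Request, timeout: float = 10.0) -> str:
    """Send the request and return the response body as text (needs a reachable server)."""
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

For example, fetch_metrics(bearer_request("http://localhost:8080", "your-secure-api-token")) should return the metrics text if the token is accepted, and raise urllib.error.HTTPError if the server rejects it.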

Production Configuration

For production deployments, store credentials in separate files:

File-Based Basic Authentication

prometheus.yml:

scrape_configs:
  - job_name: 'webpdf'
    static_configs:
      - targets: ['webpdf-prod-01:8080', 'webpdf-prod-02:8080']
    metrics_path: '/webPDF/metrics'
    basic_auth:
      username: 'prometheus'
      password_file: '/etc/prometheus/webpdf_password.txt'
    scrape_interval: 15s
    scrape_timeout: 10s

/etc/prometheus/webpdf_password.txt:

your-secure-password

Set restrictive permissions:

chmod 600 /etc/prometheus/webpdf_password.txt
chown prometheus:prometheus /etc/prometheus/webpdf_password.txt
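
The permissions can also be checked from a script, for example as part of a deployment check. This is a minimal sketch that assumes a POSIX file system. The helper name is hypothetical.

```python
import os
import stat


def is_owner_only(path: str) -> bool:
    """Return True if no group or other permission bits are set on the file."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0
```

A file with mode 600 passes this check, while a file with mode 644 does not. The same check works for any credential file that Prometheus reads.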

File-Based Bearer Token

prometheus.yml:

scrape_configs:
  - job_name: 'webpdf'
    static_configs:
      - targets: ['webpdf-prod-01:8080', 'webpdf-prod-02:8080']
    metrics_path: '/webPDF/metrics'
    bearer_token_file: '/etc/prometheus/webpdf_token.txt'
    scrape_interval: 15s
    scrape_timeout: 10s

/etc/prometheus/webpdf_token.txt:

your-secure-api-token

Multiple Instances

For monitoring multiple webPDF instances, use labels to identify them:

scrape_configs:
  - job_name: 'webpdf'
    static_configs:
      - targets: ['webpdf-prod-01:8080']
        labels:
          environment: 'production'
          datacenter: 'us-east-1'
          instance_name: 'webpdf-prod-01'
      - targets: ['webpdf-prod-02:8080']
        labels:
          environment: 'production'
          datacenter: 'us-west-1'
          instance_name: 'webpdf-prod-02'
      - targets: ['webpdf-staging:8080']
        labels:
          environment: 'staging'
          datacenter: 'us-east-1'
          instance_name: 'webpdf-staging'
    metrics_path: '/webPDF/metrics'
    basic_auth:
      username: 'prometheus'
      password_file: '/etc/prometheus/webpdf_password.txt'
    scrape_interval: 15s

HTTPS Configuration

When webPDF uses TLS, configure HTTPS scraping:

scrape_configs:
  - job_name: 'webpdf'
    scheme: https
    static_configs:
      - targets: ['webpdf.example.com:8443']
    metrics_path: '/webPDF/metrics'
    bearer_token_file: '/etc/prometheus/webpdf_token.txt'
    tls_config:
      ca_file: /etc/prometheus/certs/ca.crt
      # For self-signed certificates (not recommended):
      # insecure_skip_verify: true
    scrape_interval: 15s

Verify Scraping

Check the Prometheus targets page to verify that scraping works:

http://prometheus-server:9090/targets

Look for:

  • State: UP (green) = successful scraping
  • State: DOWN (red) = scraping failed, check authentication/network
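
The same information is available from the Prometheus HTTP API at /api/v1/targets, which is useful for scripted checks. The sketch below filters a decoded response for targets that are not up. The function name is hypothetical, and the sample payload is an illustrative, trimmed response rather than output from a real server.

```python
import json


def unhealthy_targets(payload: dict, job: str = "webpdf") -> list[tuple[str, str]]:
    """Return (scrapeUrl, lastError) for every active target of `job` that is not up."""
    targets = payload.get("data", {}).get("activeTargets", [])
    return [
        (t.get("scrapeUrl", ""), t.get("lastError", ""))
        for t in targets
        if t.get("labels", {}).get("job") == job and t.get("health") != "up"
    ]


# Illustrative, trimmed response body; real responses carry more fields.
SAMPLE = json.loads("""
{"status": "success", "data": {"activeTargets": [
  {"labels": {"job": "webpdf", "instance": "webpdf-prod-01:8080"},
   "scrapeUrl": "http://webpdf-prod-01:8080/webPDF/metrics",
   "health": "up", "lastError": ""},
  {"labels": {"job": "webpdf", "instance": "webpdf-prod-02:8080"},
   "scrapeUrl": "http://webpdf-prod-02:8080/webPDF/metrics",
   "health": "down", "lastError": "server returned HTTP status 401 Unauthorized"}
]}}
""")

for url, error in unhealthy_targets(SAMPLE):
    print(f"{url}: {error}")
```

In the sample, an authentication failure appears in the lastError field of the affected target.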

Grafana Integration

Grafana provides visualization and dashboards for Prometheus metrics.

Add Prometheus Data Source

  1. Open Grafana: http://grafana-server:3000
  2. Navigate to Configuration → Data Sources
  3. Click Add data source
  4. Select Prometheus
  5. Configure:
    • URL: http://prometheus-server:9090
    • Access: Server (default)
  6. Click Save & Test

Example Dashboard Panels

Request Rate

# Total request rate (requests per second)
sum(rate(http_server_requests_count[5m]))

# Request rate by endpoint
sum by (endpoint) (rate(http_server_requests_count[5m]))

# Request rate by status code
sum by (status) (rate(http_server_requests_count[5m]))
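
Panel queries such as these can be tried outside Grafana through the Prometheus HTTP API endpoint /api/v1/query. The sketch below only builds the request URL, so that the PromQL expression is encoded correctly. The function name is hypothetical, and the server address is the placeholder used in this guide.

```python
from urllib.parse import urlencode


def instant_query_url(prometheus_base: str, promql: str) -> str:
    """Build the URL for an instant query, percent-encoding the PromQL expression."""
    return prometheus_base.rstrip("/") + "/api/v1/query?" + urlencode({"query": promql})


url = instant_query_url(
    "http://prometheus-server:9090",
    "sum by (status) (rate(http_server_requests_count[5m]))",
)
print(url)
```

Fetching the URL returns a JSON document whose data.result field lists one entry per series that the expression produces.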

Error Rate

# Total error rate (4xx + 5xx)
sum(rate(http_server_errors_client_total[5m])) + sum(rate(http_server_errors_server_total[5m]))

# Server error percentage (5xx as a share of all requests)
(sum(rate(http_server_errors_server_total[5m])) / sum(rate(http_server_requests_count[5m]))) * 100

# Errors by endpoint
sum by (endpoint) (rate(http_server_errors_server_total[5m]))

Response Time (Latency)

# P95 latency
http_server_requests{quantile="0.95"}

# P99 latency
http_server_requests{quantile="0.99"}

# Average latency (5min window)
rate(http_server_requests_sum[5m]) / rate(http_server_requests_count[5m])

# Latency by endpoint
http_server_requests{quantile="0.95", endpoint="/rest/converter"}

Thread Pool Utilization

# Thread pool utilization percentage
(threadpool_active / threadpool_pool_size) * 100

# Queue depth
threadpool_queue_size

# Queue utilization percentage
(threadpool_queue_size / threadpool_queue_capacity) * 100

# Rejected tasks (critical!)
increase(threadpool_rejected_total[5m])

Memory Usage

# Heap memory usage percentage
(jvm_memory_used_bytes{area="heap"} / jvm_memory_max_bytes{area="heap"}) * 100

# Heap used (MB)
jvm_memory_used_bytes{area="heap"} / 1024 / 1024

# GC pause time rate
rate(jvm_gc_pause_sum[5m])

CPU Usage

# Process CPU usage (percentage)
process_cpu_usage * 100

# System CPU usage (percentage)
system_cpu_usage * 100

Service Method Performance

# Service method request rate
sum by (method) (rate(service_method_requests_total[5m]))

# Service method error rate
sum by (method) (rate(service_method_errors_total[5m]))

# Service method duration P95
service_method_duration{quantile="0.95"}

Alert Rules

Create alerts in Prometheus to notify on critical conditions:

prometheus-alerts.yml:

groups:
  - name: webpdf_alerts
    interval: 30s
    rules:
      # High error rate alert
      - alert: WebPDFHighErrorRate
        expr: |
          (sum(rate(http_server_errors_server_total[5m])) / sum(rate(http_server_requests_count[5m]))) > 0.05
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "webPDF high error rate detected"
          description: "Error rate is {{ $value | humanizePercentage }} (threshold: 5%)"

      # Thread pool saturation
      - alert: WebPDFThreadPoolSaturated
        expr: |
          (threadpool_active / threadpool_pool_size) > 0.9
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "webPDF thread pool {{ $labels.pool }} saturated"
          description: "Thread pool utilization is {{ $value | humanizePercentage }}"

      # Queue overflow (critical!)
      - alert: WebPDFQueueOverflow
        expr: |
          increase(threadpool_rejected_total[5m]) > 0
        labels:
          severity: critical
        annotations:
          summary: "webPDF queue overflow - requests being rejected!"
          description: "{{ $value }} tasks rejected in pool {{ $labels.pool }}"

      # High memory usage
      - alert: WebPDFHighMemoryUsage
        expr: |
          (jvm_memory_used_bytes{area="heap"} / jvm_memory_max_bytes{area="heap"}) > 0.85
        for: 15m
        labels:
          severity: warning
        annotations:
          summary: "webPDF high memory usage"
          description: "Heap usage is {{ $value | humanizePercentage }}"

      # High P99 latency
      - alert: WebPDFHighLatency
        expr: |
          http_server_requests{quantile="0.99"} > 5
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "webPDF high latency detected"
          description: "P99 latency is {{ $value }}s for {{ $labels.endpoint }}"

      # Service is down
      - alert: WebPDFServiceDown
        expr: |
          up{job="webpdf"} == 0
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "webPDF service is down"
          description: "Instance {{ $labels.instance }} is not responding"

      # DLQ buildup
      - alert: WebPDFDLQBuildup
        expr: |
          dlq_store_entries{status="pending"} > 100
        for: 30m
        labels:
          severity: warning
        annotations:
          summary: "webPDF DLQ has many pending entries"
          description: "{{ $value }} pending DLQ entries need attention"

Load alerts into Prometheus:

# prometheus.yml
rule_files:
  - 'prometheus-alerts.yml'
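
The for: field in a rule delays firing until the expression has been true at every evaluation for that long. A rule without for: fires on the first evaluation that matches. The following is a simplified Python simulation of that behavior for a single series, intended only to illustrate the timing. It is not Prometheus code, and the function name and sample values are hypothetical.

```python
def firing_steps(values, threshold, for_seconds, eval_interval):
    """For each evaluation, report whether an `expr > threshold` alert would be firing.

    values: one sample per rule evaluation, in order.
    for_seconds: the rule's `for:` duration in seconds (0 if the rule has none).
    eval_interval: seconds between rule evaluations.
    """
    result = []
    held = None  # seconds the condition has been continuously true, None if false
    for value in values:
        if value > threshold:
            held = 0 if held is None else held + eval_interval
            result.append(held >= for_seconds)
        else:
            held = None  # any false evaluation resets the pending state
            result.append(False)
    return result


# Error ratio sampled every 30 s against a 5% threshold with a 60 s hold.
print(firing_steps([0.10, 0.10, 0.10, 0.01, 0.10], 0.05, 60, 30))
```

A single evaluation below the threshold resets the timer, which is why short spikes do not trigger alerts that use a long for: duration.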

Other Monitoring Systems

JMX Integration

Metrics are also exposed via JMX under the domain webpdf.metrics (configurable).

See Java Management Extension (JMX) for details.

JConsole

jconsole localhost:9999

Navigate to MBeans → webpdf.metrics

VisualVM

  1. Install VisualVM
  2. Connect to webPDF JVM
  3. Open MBeans tab
  4. Navigate to webpdf.metrics

Datadog Integration

Use Prometheus scraping with the Datadog Agent:

datadog.yaml:

prometheus_url: http://localhost:9090
metrics:
  - http_server_*
  - threadpool_*
  - jvm_*
  - service_method_*

New Relic Integration

Use Prometheus Remote Write:

prometheus.yml:

remote_write:
  - url: https://metric-api.newrelic.com/prometheus/v1/write?prometheus_server=webpdf
    bearer_token_file: /etc/prometheus/newrelic_token.txt

Performance Considerations

Scrape Interval

  • 15 seconds: Good balance for production (default recommendation)
  • 30 seconds: Reduced load, suitable for less critical monitoring
  • 5 seconds: High-resolution monitoring, increases load
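
The load difference between these intervals can be estimated with simple arithmetic: each time series produces one sample per scrape. The sketch below uses a hypothetical figure of 1,000 series per instance. Substitute the series count of your own deployment.

```python
SECONDS_PER_DAY = 86_400


def samples_per_day(series_count: int, scrape_interval_s: int) -> int:
    """Samples ingested per day when every series yields one sample per scrape."""
    return series_count * (SECONDS_PER_DAY // scrape_interval_s)


SERIES = 1_000  # hypothetical series count for one webPDF instance
for interval in (5, 15, 30):
    print(f"{interval:>2}s interval: {samples_per_day(SERIES, interval):,} samples/day")
```

Under this assumption, a 15-second interval ingests 5,760,000 samples per day. A 5-second interval triples that, and a 30-second interval halves it.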

Cardinality Management

High cardinality (many unique tag combinations) creates many time series. Cap the number of distinct values in the webPDF metrics configuration:

<metrics enabled="true">
  <application maxEndpoints="100"
               maxUris="500"
               maxStatusCodes="20"/>
</metrics>

See Configuration for details.

Retention

Configure Prometheus retention based on needs:

# 30 days retention
prometheus --storage.tsdb.retention.time=30d

# 100GB maximum storage
prometheus --storage.tsdb.retention.size=100GB

Troubleshooting Integration

Prometheus Can't Scrape

  1. Check authentication: Verify credentials in prometheus.yml
  2. Check network: curl -u username:password http://webpdf:8080/webPDF/metrics
  3. Check TLS: Verify certificate configuration if using HTTPS
  4. Check logs: Review Prometheus logs for error messages
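
Once the curl check returns data, it can be useful to confirm that the response contains the metrics you expect. The following is a minimal sketch that extracts metric names from a response in the Prometheus text format. The function name is hypothetical, and the sample text, including its label values, is illustrative and not real webPDF output.

```python
def metric_names(exposition_text: str) -> set[str]:
    """Collect the sample names from a Prometheus text-format response."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # The sample name ends at the first '{' (labels) or space (value).
        cut = len(line)
        for sep in ("{", " "):
            idx = line.find(sep)
            if idx != -1:
                cut = min(cut, idx)
        names.add(line[:cut])
    return names


SAMPLE = """\
# HELP threadpool_active Active threads
# TYPE threadpool_active gauge
threadpool_active{pool="default"} 4
process_cpu_usage 0.12
"""

print(sorted(metric_names(SAMPLE)))
```

If a metric used in a dashboard or alert is missing from the result, the cause is more likely the webPDF metrics configuration than Prometheus.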

No Data in Grafana

  1. Check data source: Test Prometheus connection in Grafana
  2. Check query: Verify PromQL syntax in panel query
  3. Check time range: Ensure time range includes scraped data
  4. Check labels: Verify label names match your metrics

High Prometheus Memory Usage

  1. Reduce cardinality: Set limits in webPDF metrics configuration
  2. Reduce retention: Lower --storage.tsdb.retention.time
  3. Disable unused metrics: Turn off metric layers that you do not need

See Also