Metrics Integration
This guide shows how to integrate webPDF metrics with monitoring systems like Prometheus and Grafana.
Prometheus Integration
Prometheus is the recommended monitoring system for webPDF metrics. It scrapes the metrics endpoint periodically and stores time-series data.
Basic Configuration
Configure Prometheus to scrape the metrics endpoint with authentication:
Using Bearer Token Authentication
prometheus.yml:
scrape_configs:
  - job_name: 'webpdf'
    static_configs:
      - targets: ['localhost:8080']
    metrics_path: '/webPDF/metrics'
    bearer_token: 'your-secure-api-token'
    scrape_interval: 15s
Using Basic Authentication
prometheus.yml:
scrape_configs:
  - job_name: 'webpdf'
    static_configs:
      - targets: ['localhost:8080']
    metrics_path: '/webPDF/metrics'
    basic_auth:
      username: 'prometheus'
      password: 'your-secure-password'
    scrape_interval: 15s
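Before pointing Prometheus at the endpoint, you can issue the same authenticated request by hand to confirm the credentials work. A minimal Python sketch; the URL, token, and password are placeholders to replace with your real values:

```python
import base64
import urllib.request

# Placeholder values - substitute your real host and credentials.
METRICS_URL = "http://localhost:8080/webPDF/metrics"

# Bearer-token variant: the same header Prometheus sends for bearer_token.
bearer_req = urllib.request.Request(
    METRICS_URL,
    headers={"Authorization": "Bearer your-secure-api-token"},
)

# Basic-auth variant: base64-encoded "username:password".
credentials = base64.b64encode(b"prometheus:your-secure-password").decode()
basic_req = urllib.request.Request(
    METRICS_URL,
    headers={"Authorization": f"Basic {credentials}"},
)

# Uncomment to run against a live instance:
# with urllib.request.urlopen(bearer_req, timeout=10) as resp:
#     print(resp.status, resp.read(200))
```

A 200 response with metric lines in the body means the credentials are correct; a 401 points at the token or password.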
Production Configuration
For production deployments, store credentials in separate files:
File-Based Basic Authentication
prometheus.yml:
scrape_configs:
  - job_name: 'webpdf'
    static_configs:
      - targets: ['webpdf-prod-01:8080', 'webpdf-prod-02:8080']
    metrics_path: '/webPDF/metrics'
    basic_auth:
      username: 'prometheus'
      password_file: '/etc/prometheus/webpdf_password.txt'
    scrape_interval: 15s
    scrape_timeout: 10s
/etc/prometheus/webpdf_password.txt:
your-secure-password
Set restrictive permissions:
chmod 600 /etc/prometheus/webpdf_password.txt
chown prometheus:prometheus /etc/prometheus/webpdf_password.txt
File-Based Bearer Token
prometheus.yml:
scrape_configs:
  - job_name: 'webpdf'
    static_configs:
      - targets: ['webpdf-prod-01:8080', 'webpdf-prod-02:8080']
    metrics_path: '/webPDF/metrics'
    bearer_token_file: '/etc/prometheus/webpdf_token.txt'
    scrape_interval: 15s
    scrape_timeout: 10s
/etc/prometheus/webpdf_token.txt:
your-secure-api-token
Multiple Instances
For monitoring multiple webPDF instances, use labels to identify them:
scrape_configs:
  - job_name: 'webpdf'
    static_configs:
      - targets: ['webpdf-prod-01:8080']
        labels:
          environment: 'production'
          datacenter: 'us-east-1'
          instance_name: 'webpdf-prod-01'
      - targets: ['webpdf-prod-02:8080']
        labels:
          environment: 'production'
          datacenter: 'us-west-1'
          instance_name: 'webpdf-prod-02'
      - targets: ['webpdf-staging:8080']
        labels:
          environment: 'staging'
          datacenter: 'us-east-1'
          instance_name: 'webpdf-staging'
    metrics_path: '/webPDF/metrics'
    basic_auth:
      username: 'prometheus'
      password_file: '/etc/prometheus/webpdf_password.txt'
    scrape_interval: 15s
HTTPS Configuration
When webPDF uses TLS, configure HTTPS scraping:
scrape_configs:
  - job_name: 'webpdf'
    scheme: https
    static_configs:
      - targets: ['webpdf.example.com:8443']
    metrics_path: '/webPDF/metrics'
    bearer_token_file: '/etc/prometheus/webpdf_token.txt'
    tls_config:
      ca_file: /etc/prometheus/certs/ca.crt
      # For self-signed certificates (not recommended):
      # insecure_skip_verify: true
    scrape_interval: 15s
Verify Scraping
Open the Prometheus targets page to verify that scraping is working:
http://prometheus-server:9090/targets
Look for:
- State: UP (green) = successful scraping
- State: DOWN (red) = scraping failed, check authentication/network
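You can also fetch the endpoint directly and inspect the payload. Scraped output uses the Prometheus text exposition format; a small sketch that parses simple sample lines (the metric values below are invented for illustration):

```python
# Parse simple lines of the Prometheus text exposition format:
#   metric_name{label="value",...} <number>
def parse_metrics(payload: str) -> dict:
    samples = {}
    for line in payload.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip HELP/TYPE comment lines
            continue
        name_part, _, value = line.rpartition(" ")
        samples[name_part] = float(value)
    return samples

# Hypothetical excerpt of a webPDF scrape (values are illustrative):
payload = """\
# HELP http_server_requests_count Total HTTP requests
# TYPE http_server_requests_count counter
http_server_requests_count{endpoint="/rest/converter"} 1523
http_server_requests_count{endpoint="/rest/toolbox"} 847
"""
samples = parse_metrics(payload)
print(samples['http_server_requests_count{endpoint="/rest/converter"}'])  # 1523.0
```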
Grafana Integration
Grafana provides visualization and dashboards for Prometheus metrics.
Add Prometheus Data Source
- Open Grafana: http://grafana-server:3000
- Navigate to Configuration → Data Sources
- Click Add data source
- Select Prometheus
- Configure:
  - URL: http://prometheus-server:9090
  - Access: Server (default)
- Click Save & Test
Example Dashboard Panels
Request Rate
# Total request rate (requests per second)
rate(http_server_requests_count[5m])
# Request rate by endpoint
sum by (endpoint) (rate(http_server_requests_count[5m]))
# Request rate by status code
sum by (status) (rate(http_server_requests_count[5m]))
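rate() estimates the per-second increase of a counter over the lookback window. The underlying arithmetic between two samples can be sketched in Python (timestamps and counter values are invented for illustration):

```python
# Two scrapes of a monotonic counter, 15 seconds apart (invented values).
t1, v1 = 1700000000, 1500.0   # earlier scrape: timestamp, counter value
t2, v2 = 1700000015, 1560.0   # later scrape

# Per-second rate over the interval, as rate() computes between samples.
per_second = (v2 - v1) / (t2 - t1)
print(per_second)  # 4.0 requests/second
```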
Error Rate
# Total error rate (4xx + 5xx)
sum(rate(http_server_errors_client_total[5m])) + sum(rate(http_server_errors_server_total[5m]))
# Server error percentage
(sum(rate(http_server_errors_server_total[5m])) / sum(rate(http_server_requests_count[5m]))) * 100
# Errors by endpoint
sum by (endpoint) (rate(http_server_errors_server_total[5m]))
Response Time (Latency)
# P95 latency
http_server_requests{quantile="0.95"}
# P99 latency
http_server_requests{quantile="0.99"}
# Average latency (5min window)
rate(http_server_requests_sum[5m]) / rate(http_server_requests_count[5m])
# Latency by endpoint
http_server_requests{quantile="0.95", endpoint="/rest/converter"}
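The average-latency query divides the rate of http_server_requests_sum (total seconds spent handling requests) by the rate of http_server_requests_count (total requests). With invented counter deltas over a 5-minute window:

```python
# Invented counter increases over a 300-second window.
sum_delta = 90.0      # seconds of total request time accumulated
count_delta = 1200.0  # requests served in the window

# rate(sum)/rate(count): both rates share the same window length,
# so the result reduces to sum_delta / count_delta.
avg_latency = (sum_delta / 300) / (count_delta / 300)
print(avg_latency)  # 0.075 seconds per request
```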
Thread Pool Utilization
# Thread pool utilization percentage
(threadpool_active / threadpool_pool_size) * 100
# Queue depth
threadpool_queue_size
# Queue utilization percentage
(threadpool_queue_size / threadpool_queue_capacity) * 100
# Rejected tasks (critical!)
increase(threadpool_rejected_total[5m])
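The same utilization arithmetic, sketched with invented gauge readings from a single scrape:

```python
# Invented gauge readings (not real webPDF values).
active, pool_size = 18, 20
queue_size, queue_capacity = 45, 100

# Percent utilization, as in the PromQL expressions above.
pool_utilization = 100 * active / pool_size        # 90.0
queue_utilization = 100 * queue_size / queue_capacity  # 45.0

# A pool above 90% with a filling queue is approaching rejection territory.
print(pool_utilization, queue_utilization)
```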
Memory Usage
# Heap memory usage percentage
(jvm_memory_used_bytes{area="heap"} / jvm_memory_max_bytes{area="heap"}) * 100
# Heap used (MB)
jvm_memory_used_bytes{area="heap"} / 1024 / 1024
# GC pause time rate
rate(jvm_gc_pause_sum[5m])
CPU Usage
# Process CPU usage (percentage)
process_cpu_usage * 100
# System CPU usage (percentage)
system_cpu_usage * 100
Service Method Performance
# Service method request rate
sum by (method) (rate(service_method_requests_total[5m]))
# Service method error rate
sum by (method) (rate(service_method_errors_total[5m]))
# Service method duration P95
service_method_duration{quantile="0.95"}
Alert Rules
Create alerts in Prometheus to notify on critical conditions:
prometheus-alerts.yml:
groups:
  - name: webpdf_alerts
    interval: 30s
    rules:
      # High error rate alert
      - alert: WebPDFHighErrorRate
        expr: |
          (sum(rate(http_server_errors_server_total[5m])) / sum(rate(http_server_requests_count[5m]))) > 0.05
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "webPDF high error rate detected"
          description: "Error rate is {{ $value | humanizePercentage }} (threshold: 5%)"
      # Thread pool saturation
      - alert: WebPDFThreadPoolSaturated
        expr: |
          (threadpool_active / threadpool_pool_size) > 0.9
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "webPDF thread pool {{ $labels.pool }} saturated"
          description: "Thread pool utilization is {{ $value | humanizePercentage }}"
      # Queue overflow (critical!)
      - alert: WebPDFQueueOverflow
        expr: |
          increase(threadpool_rejected_total[5m]) > 0
        labels:
          severity: critical
        annotations:
          summary: "webPDF queue overflow - requests being rejected!"
          description: "{{ $value }} tasks rejected in pool {{ $labels.pool }}"
      # High memory usage
      - alert: WebPDFHighMemoryUsage
        expr: |
          (jvm_memory_used_bytes{area="heap"} / jvm_memory_max_bytes{area="heap"}) > 0.85
        for: 15m
        labels:
          severity: warning
        annotations:
          summary: "webPDF high memory usage"
          description: "Heap usage is {{ $value | humanizePercentage }}"
      # High P99 latency
      - alert: WebPDFHighLatency
        expr: |
          http_server_requests{quantile="0.99"} > 5
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "webPDF high latency detected"
          description: "P99 latency is {{ $value }}s for {{ $labels.endpoint }}"
      # Service is down
      - alert: WebPDFServiceDown
        expr: |
          up{job="webpdf"} == 0
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "webPDF service is down"
          description: "Instance {{ $labels.instance }} is not responding"
      # DLQ buildup
      - alert: WebPDFDLQBuildup
        expr: |
          dlq_store_entries{status="pending"} > 100
        for: 30m
        labels:
          severity: warning
        annotations:
          summary: "webPDF DLQ has many pending entries"
          description: "{{ $value }} pending DLQ entries need attention"
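The for: clause means an alert only fires once its expression has been continuously true for that long; shorter spikes stay in the pending state. That evaluation logic can be sketched as follows (the evaluation history below is invented):

```python
# Simulate 'for: 5m' with 30-second evaluation intervals: the expression
# must be true for 10 consecutive evaluations before the alert fires.
def firing(history: list, for_intervals: int) -> bool:
    """history is a list of booleans, oldest first."""
    if len(history) < for_intervals:
        return False
    return all(history[-for_intervals:])

# True for only 8 of the required 10 evaluations: still pending.
print(firing([True] * 8, 10))       # False

# Continuously true for 10 evaluations: the alert fires.
print(firing([True] * 10, 10))      # True
```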
Load alerts into Prometheus:
# prometheus.yml
rule_files:
  - 'prometheus-alerts.yml'
Other Monitoring Systems
JMX Integration
Metrics are also exposed via JMX under the domain webpdf.metrics (configurable).
See Java Management Extensions (JMX) for details.
JConsole
jconsole localhost:9999
Navigate to MBeans → webpdf.metrics
VisualVM
- Install VisualVM
- Connect to webPDF JVM
- Open MBeans tab
- Navigate to webpdf.metrics
Datadog Integration
Use Prometheus scraping with Datadog Agent:
datadog.yaml:
prometheus_url: http://localhost:9090
metrics:
- http_server_*
- threadpool_*
- jvm_*
- service_method_*
New Relic Integration
Use Prometheus Remote Write:
prometheus.yml:
remote_write:
  - url: https://metric-api.newrelic.com/prometheus/v1/write?prometheus_server=webpdf
    bearer_token_file: /etc/prometheus/newrelic_token.txt
Performance Considerations
Scrape Interval
- 15 seconds: Good balance for production (default recommendation)
- 30 seconds: Reduced load, suitable for less critical monitoring
- 5 seconds: High-resolution monitoring, increases load
Cardinality Management
High cardinality (many unique tag combinations) multiplies the number of time series Prometheus must store; cap it in the webPDF metrics configuration:
<metrics enabled="true">
<application maxEndpoints="100"
maxUris="500"
maxStatusCodes="20"/>
</metrics>
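These limits bound worst-case cardinality. As a rough upper bound, a single metric tagged with both endpoint and status can produce up to maxEndpoints × maxStatusCodes distinct series; this is a simplification, since actual cardinality depends on which tag combinations really occur:

```python
# Limits from the configuration above.
max_endpoints = 100
max_status_codes = 20

# Worst-case series count for one metric carrying both tags.
worst_case_series = max_endpoints * max_status_codes
print(worst_case_series)  # 2000
```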
See Configuration for details.
Retention
Configure Prometheus retention based on needs:
# 30 days retention
prometheus --storage.tsdb.retention.time=30d
# 100GB maximum storage
prometheus --storage.tsdb.retention.size=100GB
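Retention needs can be estimated from series count and scrape interval. Prometheus typically stores on the order of 1-2 bytes per sample after compression; a back-of-envelope sketch in which the series count and bytes-per-sample figure are assumptions to adjust for your deployment:

```python
series = 5000            # assumed number of active time series
scrape_interval_s = 15
bytes_per_sample = 2     # conservative end of the ~1-2 bytes/sample estimate
retention_days = 30

samples_per_day = 86400 / scrape_interval_s * series
total_bytes = samples_per_day * retention_days * bytes_per_sample
print(f"{total_bytes / 1024**3:.2f} GiB")  # 1.61 GiB
```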
Troubleshooting Integration
Prometheus Can't Scrape
- Check authentication: Verify credentials in prometheus.yml
- Check network: curl -u username:password http://webpdf:8080/webPDF/metrics
- Check TLS: Verify certificate configuration if using HTTPS
- Check logs: Review Prometheus logs for error messages
No Data in Grafana
- Check data source: Test Prometheus connection in Grafana
- Check query: Verify PromQL syntax in panel query
- Check time range: Ensure time range includes scraped data
- Check labels: Verify label names match your metrics
High Prometheus Memory Usage
- Reduce cardinality: Set limits in webPDF metrics configuration
- Reduce retention: Lower --storage.tsdb.retention.time
- Disable unused metrics: Disable metric layers that are not needed
See Also
- Metrics Overview - Main metrics documentation
- Configuration Guide - Configure metrics, layers, and presets
- Authentication - Secure metrics endpoint
- Metric Reference - Complete list of all metrics
- Troubleshooting - Common issues