Version: 10.0

Metrics Troubleshooting

This guide helps diagnose and resolve common issues with webPDF metrics.

High Memory Usage

Metrics can consume significant memory, especially with percentiles enabled.

Symptoms

  • High JVM heap usage
  • Frequent garbage collections
  • OutOfMemoryError in logs
  • Server startup warnings about metric memory usage

Diagnosis

Check server logs for memory forecasts:

Metrics memory forecast (heuristic): 2048 MB (warning threshold: 1024 MB)
Metrics health snapshot: meters=15234 (delta=1200), heap=1800 MB / 2048 MB (delta=256 MB)
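To avoid reading the log by eye, the forecast line can be compared against its threshold with a small script. The following is a minimal sketch, assuming the log lines contain the forecast text exactly as in the example above; the function name `check_memory_forecast` and the log path in the usage comment are illustrative, not part of webPDF.

```shell
# Reads server log lines on stdin and reports every metrics memory forecast.
# Assumes the message format shown in the example above.
check_memory_forecast() {
  sed -n 's/.*Metrics memory forecast (heuristic): \([0-9][0-9]*\) MB (warning threshold: \([0-9][0-9]*\) MB).*/\1 \2/p' |
  while read -r forecast threshold; do
    if [ "$forecast" -gt "$threshold" ]; then
      echo "OVER forecast=${forecast}MB threshold=${threshold}MB"
    else
      echo "OK forecast=${forecast}MB threshold=${threshold}MB"
    fi
  done
}

# Usage (placeholder path, adjust to your installation):
#   check_memory_forecast < /path/to/server.log
```

Lines reported as OVER are candidates for the solutions below.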

Solutions

1. Disable Percentiles (Most Effective)

Percentiles consume ~1 MB per timer. Disabling can reduce memory by 50-80%.

conf/application.xml:

<application>
<metrics enabled="true">
<application percentilesEnabled="false" />
</metrics>
</application>

Memory savings: ~500-1000 MB for typical deployment

2. Set Cardinality Limits

Limit unique tag combinations to prevent metric explosion:

<application>
<metrics enabled="true">
<application maxEndpoints="100"
maxUris="500"
maxStatusCodes="20" />
</metrics>
</application>

Memory savings: ~200-500 MB depending on traffic patterns

3. Disable Unused Layers

Turn off metric layers you don't need:

<application>
<metrics enabled="true">
<httpMetrics enabled="false" /> <!-- Highest overhead -->
<apiMetrics enabled="true" />
<serviceMetrics enabled="true" />
<threadPoolMetrics enabled="true" />
</metrics>
</application>

To disable event consumer and DLQ metrics, set WEBPDF_METRICS_EVENT_PIPELINE_ENABLED=false.

Memory savings: ~100-300 MB per disabled layer

4. Increase JVM Heap

If metrics are essential, increase heap size:

webPDF.vmoptions / webPDF.service.vmoptions:

-Xms2048m
-Xmx4096m

See Java Configuration for details.

Missing Metrics

Metrics endpoint is accessible but returns no data or partial data.

Symptoms

  • Empty response from /webPDF/metrics
  • Some metrics missing
  • Prometheus shows "No data"

Diagnosis Steps

1. Check Global Enable

Look for this message in the server logs:

Metrics manager globally disabled by configuration

Fix (via XML):

<application>
<metrics enabled="true">
</metrics>
</application>

Or via Environment Settings:

export WEBPDF_METRICS_ENABLED=true

2. Check Registry Configuration

Look for this message in the server logs:

No registry enabled - Metrics will not be exported

Fix:

<application>
<metrics enabled="true">
<prometheus enabled="true" />
</metrics>
</application>

3. Check Layer Configuration

Specific metrics missing? Check layer settings:

<application>
<metrics enabled="true">
<httpMetrics enabled="true" /> <!-- http_server_* metrics -->
<apiMetrics enabled="true" /> <!-- api_operation_* metrics -->
<serviceMetrics enabled="true" /> <!-- service_method_* metrics -->
<threadPoolMetrics enabled="true" /> <!-- threadpool_* metrics -->
<jvmMetrics enabled="true" /> <!-- jvm_* metrics -->
<systemMetrics enabled="true" /> <!-- system_*, process_* -->
</metrics>
</application>

Event consumer and DLQ metrics are controlled by WEBPDF_METRICS_EVENT_PIPELINE_ENABLED.

4. Check Endpoint Configuration

Verify endpoint is enabled:

<application>
<metrics enabled="true">
<application endpointEnabled="true" />
</metrics>
</application>

5. Verify Authentication

See Authentication Issues below.
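The log checks from steps 1 and 2 can be combined into one pass over the server log. This is a sketch only: it matches the two messages quoted in those steps, and the function name `diagnose_missing_metrics` is hypothetical.

```shell
# Maps the log messages quoted in steps 1 and 2 to the matching fix.
# Reads log lines on stdin and prints one hint per problem found.
diagnose_missing_metrics() {
  awk '
    /Metrics manager globally disabled by configuration/ { global = 1 }
    /No registry enabled - Metrics will not be exported/ { registry = 1 }
    END {
      if (global)   print "global: set <metrics enabled=\"true\"> or WEBPDF_METRICS_ENABLED=true"
      if (registry) print "registry: enable a registry, e.g. <prometheus enabled=\"true\" />"
      if (!global && !registry) print "no known message found: check layer and endpoint settings"
    }
  '
}

# Usage (placeholder path):
#   diagnose_missing_metrics < /path/to/server.log
```

If neither message appears, continue with the layer, endpoint, and authentication checks in steps 3 to 5.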

Authentication Issues

Cannot access metrics endpoint due to authentication failures.

401 Unauthorized

Symptoms

$ curl -i http://localhost:8080/webPDF/metrics
HTTP/1.1 401 Unauthorized
WWW-Authenticate: Basic realm="Metrics", Bearer
Unauthorized: Invalid credentials

Diagnosis

  1. Check configured authentication methods:

Server logs:

Metrics authentication enabled: Basic Auth (username=prometheus)
# or
Metrics authentication enabled: Bearer Token
# or
Metrics authentication is required but no credentials configured!

  2. Verify credentials:

Environment variables:

# Linux/Mac
echo $WEBPDF_METRICS_AUTH_USERNAME
echo $WEBPDF_METRICS_AUTH_PASSWORD
echo $WEBPDF_METRICS_AUTH_TOKEN

# Windows
echo %WEBPDF_METRICS_AUTH_USERNAME%
echo %WEBPDF_METRICS_AUTH_PASSWORD%
echo %WEBPDF_METRICS_AUTH_TOKEN%
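Echoing the variables prints the password and token to the terminal. On Linux/Mac, a check that only reports whether each variable is set avoids that. The sketch below uses the variable names from this section; the function name `check_metrics_auth_env` is illustrative. It inspects the current shell only, so run it in the same environment that starts the webPDF server.

```shell
# Reports whether each metrics auth variable is set, without echoing secrets.
check_metrics_auth_env() {
  for name in WEBPDF_METRICS_AUTH_USERNAME WEBPDF_METRICS_AUTH_PASSWORD WEBPDF_METRICS_AUTH_TOKEN; do
    # Indirect lookup via eval keeps this compatible with POSIX sh.
    eval "value=\${$name:-}"
    if [ -n "$value" ]; then
      echo "$name=set"
    else
      echo "$name=unset"
    fi
  done
}

# Usage:
#   check_metrics_auth_env
```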

Solutions

Set credentials via environment settings:

export WEBPDF_METRICS_AUTH_USERNAME=prometheus
export WEBPDF_METRICS_AUTH_PASSWORD=secure-password

Test with curl:

# Basic Auth
curl -u prometheus:secure-password http://localhost:8080/webPDF/metrics

# Bearer Token
curl -H "Authorization: Bearer your-token" http://localhost:8080/webPDF/metrics

Restart webPDF server after configuration changes.

Wrong Authentication Method

Symptoms

Using Bearer Token but only Basic Auth is configured (or vice versa).

Diagnosis

Check WWW-Authenticate header in 401 response:

  • Basic realm="Metrics" only → Only Basic Auth configured
  • Bearer only → Only Bearer Token configured
  • Both → Either method accepted
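The header check can be scripted. The following sketch reads response headers on stdin and prints which methods the server advertises; the function name `accepted_auth_methods` is hypothetical, and the header format is assumed to match the 401 example shown earlier.

```shell
# Reads HTTP response headers on stdin and prints the advertised
# authentication methods: basic, bearer, basic+bearer, or none.
accepted_auth_methods() {
  awk '
    tolower($0) ~ /^www-authenticate:/ {
      line = tolower($0)
      if (line ~ /basic/)  basic = 1
      if (line ~ /bearer/) bearer = 1
    }
    END {
      if (basic && bearer) print "basic+bearer"
      else if (basic)      print "basic"
      else if (bearer)     print "bearer"
      else                 print "none"
    }
  '
}

# Usage: dump only the response headers and pipe them in.
#   curl -s -D - -o /dev/null http://localhost:8080/webPDF/metrics | accepted_auth_methods
```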

Solution

Use the configured authentication method or configure the method you want to use.

Disable Authentication (Development Only)

<application>
<metrics enabled="true">
<auth enabled="false" />
</metrics>
</application>

Caution: This exposes metrics publicly. Only use it in development or trusted environments.

Performance Impact

Metrics collection adds overhead to request processing.

Symptoms

  • Increased request latency
  • Higher CPU usage
  • More memory allocations

Expected Overhead

Layer               | Overhead per Request | Component
HTTP metrics        | ~0.1-0.2 ms          | HTTP request instrumentation
API metrics         | ~0.05-0.1 ms         | REST API instrumentation
Service metrics     | ~0.02-0.05 ms        | Service operation instrumentation
Thread pool metrics | Negligible           | Gauge reads

Total expected overhead: ~0.2-0.4 ms per request (< 1% for typical 50ms requests)
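The relative figure depends on how long your requests take, so it is worth recomputing for your own workload. A minimal sketch of the arithmetic (overhead divided by request duration), with a hypothetical function name `overhead_percent`:

```shell
# Prints metrics overhead as a percentage of request duration.
# Arguments: overhead in ms, request duration in ms.
overhead_percent() {
  LC_ALL=C awk -v o="$1" -v d="$2" 'BEGIN { printf "%.1f\n", o / d * 100 }'
}

# Usage: upper bound of the expected overhead on a 50 ms request.
#   overhead_percent 0.4 50
```

For the range above, a 50 ms request gives 0.4% to 0.8%. The same 0.4 ms on a 5 ms request is 8%, so very fast endpoints feel the overhead most.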

Diagnosis

Compare request duration with/without metrics:

With metrics:

http_server_requests{quantile="0.95"}

Disable metrics temporarily:

<metrics enabled="false" />

Restart and measure performance difference.

Solutions

1. Disable HTTP Layer (Highest Overhead)

HTTP metrics have the most overhead due to frequent execution:

<metrics enabled="true">
<httpMetrics enabled="false" />
<apiMetrics enabled="true" />
<serviceMetrics enabled="true" />
<threadPoolMetrics enabled="true" />
</metrics>

Overhead reduction: ~50-60% of metrics overhead

2. Reduce Percentile Count

Fewer percentiles = faster recording:

<metrics enabled="true">
<application percentilesEnabled="true">
<percentiles>
<percentile>0.95</percentile>
</percentiles>
</application>
</metrics>

3. Consider Minimal Configuration

Only essential metrics:

<application>
<metrics enabled="true">
<httpMetrics enabled="false" />
<apiMetrics enabled="false" />
<serviceMetrics enabled="true" />
<threadPoolMetrics enabled="true" />
<jvmMetrics enabled="false" />
<systemMetrics enabled="false" />
<application percentilesEnabled="false" />
</metrics>
</application>

Prometheus Scraping Issues

Prometheus can't scrape metrics or reports errors.

Target Down in Prometheus

Symptoms

Prometheus targets page shows webPDF as DOWN (red).

Diagnosis

Check network connectivity:

curl -v http://webpdf:8080/webPDF/metrics

Check authentication:

curl -u prometheus:password http://webpdf:8080/webPDF/metrics

Check Prometheus logs:

level=error msg="Error scraping target" target=webpdf err="authentication required"

Solutions

  1. Fix authentication in prometheus.yml:

basic_auth:
  username: 'prometheus'
  password: 'correct-password'

  2. Fix network/firewall:
  • Ensure Prometheus can reach the webPDF server
  • Check firewall rules
  • Verify DNS resolution

  3. Fix TLS configuration:

scheme: https
tls_config:
  ca_file: /path/to/ca.crt

Scrape Timeout

Symptoms

level=warn msg="Scrape took longer than timeout" target=webpdf

Solutions

Increase scrape timeout:

scrape_configs:
  - job_name: 'webpdf'
    scrape_timeout: 15s # Increase from default 10s
    scrape_interval: 30s

Or reduce metric count:

  • Set cardinality limits
  • Disable unused layers

Health Monitoring Issues

Metrics health logger reports warnings.

Meter Count Warning

Log Message

Metrics health warning: meters=12500 (threshold=10000), heap=1200 MB / 2048 MB

Meaning

The number of registered meters exceeds the warning threshold, which may indicate a memory problem.

Solutions

  1. Increase warning threshold (if memory is okay):

System property:

-Dwebpdf.metrics.application.metric.count.warning.threshold=15000

Or in conf/application.xml:

<metrics enabled="true">
<application metricCountWarningThreshold="15000" />
</metrics>

  2. Reduce metric count:
  • Set cardinality limits
  • Disable unused layers

High Heap Usage Warning

Log Message

Metrics health warning: heap=1700 MB / 2048 MB (83% usage)

Solutions

See High Memory Usage above.
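To track heap usage over time, the percentage can be derived from any health log line that carries the heap=used / max pair. This sketch assumes the format shown in the log messages in this section; the function name `heap_usage_percent` is illustrative.

```shell
# Reads health log lines on stdin and prints heap usage in percent
# (rounded down) for every line containing "heap=<used> MB / <max> MB".
heap_usage_percent() {
  sed -n 's/.*heap=\([0-9][0-9]*\) MB \/ \([0-9][0-9]*\) MB.*/\1 \2/p' |
  while read -r used max; do
    echo "$(( used * 100 / max ))"
  done
}

# Usage (placeholder path):
#   grep 'Metrics health' /path/to/server.log | heap_usage_percent
```

For the sample warning above (1700 MB of 2048 MB) it prints 83, matching the percentage in the log message.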

Unusual Growth Warning

Log Message

Metrics health snapshot: meters=15000 (delta=1200), heap=1800 MB / 2048 MB (delta=512 MB)
Metrics health warning: growth: metersDelta=1200, heapDelta=512 MB

Meaning

The meter count or heap usage grew rapidly between snapshots, which may indicate a metric explosion.

Diagnosis

Check for unbounded cardinality:

  • URI tags with unique IDs (/user/123, /user/456, ...)
  • Endpoint tags with dynamic values
  • Status codes beyond typical range

Solutions

Set cardinality limits:

<metrics enabled="true">
<application maxEndpoints="100"
maxUris="500"
maxStatusCodes="20" />
</metrics>

Use templated URIs: Ensure REST resources use path templates such as /user/{id} instead of concrete values such as /user/123 or /user/456.

Metric Cardinality Explosion

Too many unique tag combinations create excessive metrics.

Symptoms

  • Rapid memory growth
  • Hundreds of thousands of metrics
  • Prometheus high memory usage
  • Slow metric queries

Common Causes

  1. Concrete IDs in URIs:
http_server_requests{uri="/document/abc-123-def"}
http_server_requests{uri="/document/xyz-456-ghi"}
...thousands more...

Should be: {uri="/document/{id}"}

  2. Dynamic endpoint names: Each unique endpoint creates a new time series.

  3. Unbounded status codes: Custom status codes create many unique values.
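A quick way to test for the first cause is to count how many distinct uri label values appear in the scrape output. The sketch below assumes the uri label and the http_server_requests metric prefix shown in the example above; the function name `count_distinct_uris` is hypothetical.

```shell
# Reads Prometheus text-format metrics on stdin and prints the number of
# distinct uri label values found on http_server_requests series.
count_distinct_uris() {
  awk '
    /^http_server_requests/ && match($0, /uri="[^"]*"/) {
      seen[substr($0, RSTART, RLENGTH)] = 1
    }
    END { n = 0; for (k in seen) n++; print n }
  '
}

# Usage:
#   curl -s http://localhost:8080/webPDF/metrics | count_distinct_uris
```

A count that keeps rising with traffic suggests concrete identifiers are leaking into the uri tag instead of path templates.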

Solutions

1. Enable Cardinality Limits

Prevents unbounded growth:

<metrics enabled="true">
<application maxEndpoints="100"
maxUris="500"
maxStatusCodes="20" />
</metrics>

When a limit is reached, new unique values are rejected; existing metrics continue to be recorded.

2. Fix URI Templates

Ensure REST resources use path templates such as /document/{id} instead of concrete identifiers such as /document/abc-123-def.

3. Monitor Cardinality

Check metrics count:

curl -s http://localhost:8080/webPDF/metrics | grep -c "^http_server_requests"

  • Should be: hundreds to low thousands
  • Warning: tens of thousands
  • Critical: hundreds of thousands
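For automated checks, the series count can be mapped to a severity level. The guide gives only orders of magnitude, so the cutoffs in this sketch (10000 for warning, 100000 for critical) are assumptions to tune for your deployment, and the function name `classify_series_count` is hypothetical.

```shell
# Classifies a series count as ok, warning, or critical.
# Cutoffs are assumed values, not documented webPDF thresholds.
classify_series_count() {
  count=$1
  if [ "$count" -ge 100000 ]; then
    echo critical
  elif [ "$count" -ge 10000 ]; then
    echo warning
  else
    echo ok
  fi
}

# Usage:
#   classify_series_count "$(curl -s http://localhost:8080/webPDF/metrics | grep -c '^http_server_requests')"
```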

Log Messages Reference

Startup Messages

✓ Using the metrics configuration: MetricsConfiguration{enabled=true, ...}
✓ Metrics startup memory baseline (measured): before=150 MB, after=180 MB, delta=30 MB
✓ Metrics authentication enabled: Basic Auth (username=prometheus)
✓ Metrics health logger started (interval=300s, warnCooldown=3600s)

Warning Messages

⚠ Metrics configuration warnings: More than 5 percentiles configured - high memory usage expected
⚠ Metrics memory forecast (heuristic): 1536 MB (warning threshold: 1024 MB) - consider disabling percentiles
⚠ Metrics count 12000 exceeds warning threshold 10000
⚠ Metrics authentication disabled (recommended for development mode only)
⚠ Metrics authentication is required but no credentials configured!

Error Messages

✗ Metrics manager globally disabled by configuration
✗ No registry available - Metrics disabled
✗ MetricsManager not found in ServletContext - authentication disabled

Getting Help

If issues persist after trying these solutions:

  1. Check server logs for detailed error messages
  2. Enable debug logging for metrics components
  3. Review configuration in conf/application.xml
  4. Test endpoint directly with curl
  5. Verify Prometheus configuration and logs
