Version: 10.0

Metrics Reference

This page provides a complete reference of all metrics collected by webPDF. Metrics are organized by layer and component.

Application Information

| Metric | Type | Tags | Unit | Description |
|---|---|---|---|---|
| webpdf | Gauge | version, version_date, beta | - | Constant metric with webPDF version metadata |

Architecture Overview

webPDF implements a 4-layer monitoring architecture:

Note: The values for Duration and other measurements on this page are examples only.

Layer                      Duration   Overhead   What's Measured
─────────────────────────────────────────────────────────────────────────
HTTP (http_server)         50ms       100%       Full request cycle
  ↓ (5ms auth/CORS)
API (api_operation)        45ms       90%        5ms: Auth, CORS, routing
  ↓ (5ms Jersey)
Service (service_method)   40ms       80%        5ms: Jersey serialization
  ↓ (5ms queue wait)
Worker Execution           35ms       70%        5ms: Queue wait time
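
The figures in the overview follow from simple arithmetic: each layer's share is its duration divided by the HTTP duration, and the overhead a layer adds is its duration minus that of the next inner layer. A minimal Python sketch of that calculation, using the example values from this page (the function name `layer_breakdown` is hypothetical and not part of webPDF):

```python
def layer_breakdown(durations_ms):
    """Return (layer, duration, share_of_total, overhead) tuples.

    durations_ms: list of (layer_name, duration_ms), ordered from the
    outermost layer (HTTP) to the innermost (worker execution).
    """
    total = durations_ms[0][1]
    rows = []
    for i, (name, duration) in enumerate(durations_ms):
        if i + 1 < len(durations_ms):
            # Overhead added by this layer: its duration minus the next inner layer's.
            overhead = duration - durations_ms[i + 1][1]
        else:
            overhead = None  # innermost layer: nothing left to subtract
        rows.append((name, duration, duration / total, overhead))
    return rows


# Example values from this page (illustrative only).
example = [
    ("http_server", 50),
    ("api_operation", 45),
    ("service_method", 40),
    ("worker", 35),
]

for name, duration, share, overhead in layer_breakdown(example):
    print(f"{name}: {duration}ms, {share:.0%} of total, overhead to next layer: {overhead}")
```

With real data, the inputs would be mean durations taken from the corresponding timers over the same time window.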

HTTP Transport Layer

Prefix: http_server
Measures: Complete HTTP request lifecycle including network I/O, authentication, CORS, and routing overhead

| Metric | Type | Tags | Unit | Description |
|---|---|---|---|---|
| http_server_requests | Timer | endpoint, method, uri, status | seconds | Total HTTP request duration from network arrival to response |
| http_server_requests_active | Gauge | endpoint, method, uri | requests | Currently active HTTP requests in processing |
| http_server_request_size | DistributionSummary | endpoint, method, uri | bytes | HTTP request body size |
| http_server_response_size | DistributionSummary | endpoint, method, uri, status | bytes | HTTP response body size |
| http_server_errors_client | Counter | endpoint, method, uri | errors | Client errors (4xx status codes) |
| http_server_errors_server | Counter | endpoint, method, uri | errors | Server errors (5xx status codes) |

Example Tags:

  • endpoint: /rest/converter, /rest/pdfa, /soap/signature
  • method: GET, POST, PUT, DELETE
  • uri: /rest/converter, /rest/pdfa/{id}
  • status: 200, 400, 401, 500
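
The error counters above are most useful as a change over time rather than as absolute values. The following Python sketch turns two successive readings into an error ratio. It assumes the counters are cumulative and start again at zero when the process restarts. The function names are hypothetical, and how the readings are obtained depends on your monitoring backend.

```python
def counter_increase(previous, current):
    """Increase of a cumulative counter between two readings.

    If the current value is lower than the previous one, the process is
    assumed to have restarted and the counter to have begun again at zero.
    """
    if current < previous:
        return current
    return current - previous


def error_ratio(errors_prev, errors_now, requests_prev, requests_now):
    """Share of requests that failed between two readings (0.0 if no traffic)."""
    requests = counter_increase(requests_prev, requests_now)
    if requests == 0:
        return 0.0
    return counter_increase(errors_prev, errors_now) / requests


# Hypothetical readings taken one collection interval apart.
ratio = error_ratio(errors_prev=4, errors_now=7, requests_prev=1000, requests_now=1150)
print(f"5xx ratio in interval: {ratio:.2%}")
```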

REST API Layer

Prefix: api_operation
Measures: Jersey REST endpoint execution including request deserialization, resource method invocation, and response serialization (excludes auth/CORS overhead)

| Metric | Type | Tags | Unit | Description |
|---|---|---|---|---|
| api_operation_duration | Timer | endpoint, operation | seconds | REST endpoint processing time |

Example Tags:

  • endpoint: ConverterResource, PdfaResource, SignatureResource
  • operation: convert, validatePdfa, signDocument

Business Logic Layer

Prefix: service_method
Measures: Execution of business logic (web service operations) including processing time, request count, and errors (excludes Jersey overhead)

| Metric | Type | Tags | Unit | Description |
|---|---|---|---|---|
| service_method_duration | Timer | method | seconds | Service operation duration from worker submission to response |
| service_method_requests_total | Counter | method | requests | Total number of service requests executed |
| service_method_errors_total | Counter | method | errors | Total number of service execution errors |

Method Tag Values:

  • converter - Document conversion (Office, images, etc.)
  • pdfa - PDF/A conversion and validation
  • signature - Digital signatures
  • barcode - Barcode generation and recognition
  • ocr - Optical character recognition
  • toolbox - PDF manipulation operations
  • urlconverter - URL to PDF conversion

Thread Pool Layer

Prefix: threadpool
Measures: Thread pool utilization, queue depth, and worker lifecycle metrics (live gauges, no timing)

| Metric | Type | Tags | Unit | Description |
|---|---|---|---|---|
| threadpool_active | Gauge | pool | threads | Currently executing worker threads |
| threadpool_pool_size | Gauge | pool | threads | Configured maximum thread pool capacity |
| threadpool_largest_pool_size | Gauge | pool | threads | Peak number of concurrent threads observed |
| threadpool_completed_total | Gauge | pool | tasks | Total number of completed worker tasks |
| threadpool_rejected_total | Gauge | pool | tasks | Tasks rejected due to full queue ⚠️ |
| threadpool_queue_size | Gauge | pool | tasks | Current number of tasks waiting in queue |
| threadpool_queue_remaining_capacity | Gauge | pool | tasks | Available queue slots for new tasks |
| threadpool_queue_capacity | Gauge | pool | tasks | Maximum configured queue capacity |

Pool Tag Values:

  • converter, pdfa, barcode, ocr, signature, toolbox, urlconverter

Queue Monitoring:

queue_size = 0 → No backlog, immediate processing
queue_size > 0 → Tasks waiting for a free worker, potential latency
queue_size == capacity → Queue full, the next submit will be rejected
rejected_total > 0 → CRITICAL: System overload, scale up required

Critical Alert

threadpool_rejected_total > 0 indicates queue overflow and request rejection (503 errors). Immediate action required: scale up or investigate the bottleneck.
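
The queue rules above can be expressed as a small classification function. This is a sketch only: the function name `queue_state` and the returned labels are hypothetical, and the inputs are assumed to be the current values of `threadpool_queue_size`, `threadpool_queue_capacity`, and `threadpool_rejected_total` for a single pool.

```python
def queue_state(queue_size, queue_capacity, rejected_total):
    """Classify one thread pool from its gauge readings, most severe first."""
    if rejected_total > 0:
        return "critical: tasks rejected, system overloaded"
    if queue_size >= queue_capacity:
        return "warning: queue full, next submit will be rejected"
    if queue_size > 0:
        return "notice: tasks waiting for a free worker"
    return "ok: no backlog"


# Hypothetical readings per pool: (queue_size, queue_capacity, rejected_total).
readings = {
    "converter": (0, 100, 0),
    "ocr": (12, 100, 0),
    "pdfa": (100, 100, 0),
    "signature": (100, 100, 4),
}

for pool, (size, capacity, rejected) in readings.items():
    print(f"{pool}: {queue_state(size, capacity, rejected)}")
```

Checking the rejection count first keeps the most severe condition from being masked by a queue that has since drained.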

Event Pipeline Metrics

Prefix: event_consumer, dlq
Measures: Event processing and dead letter queue operations

Event Consumer Metrics

| Metric | Type | Tags | Unit | Description |
|---|---|---|---|---|
| event_consumer_read_batches_total | Counter | - | batches | Total consumer readGroup batches |
| event_consumer_read_messages_total | Counter | - | messages | Total messages returned by consumer readGroup |
| event_consumer_messages_total | Counter | outcome | messages | Event consumer outcomes by result class |
| event_consumer_file_storage_delete_total | Counter | result | operations | File storage deletion results in event consumer |
| event_consumer_errors_total | Counter | type | errors | Consumer loop errors by type |

Outcome Tag Values:

  • processed - Successfully processed
  • retried - Temporary failure, retry scheduled
  • acked_invalid - Invalid message acknowledged
  • dlq - Moved to dead letter queue
  • max_deliveries - Maximum delivery attempts reached

File Storage Result Tag Values:

  • success - File deleted successfully
  • retryable_failure - Temporary failure, will retry
  • permanent_failure - Permanent failure, cannot delete
  • skipped_non_cloud - Skipped (not cloud storage)

Error Type Tag Values:

  • redis_connection - Redis connection error
  • redis_timeout - Redis operation timeout
  • redis_exception - Other Redis exception
  • unexpected - Unexpected error

Dead Letter Queue (DLQ) Metrics

| Metric | Type | Tags | Unit | Description |
|---|---|---|---|---|
| dlq_published_total | Counter | origin | entries | DLQ publications by origin |
| dlq_reprocess_claimed_total | Counter | - | entries | Total claimed replay entries |
| dlq_reprocess_result_total | Counter | result | operations | DLQ replay outcomes |
| dlq_lineage_blocked_on_publish_total | Counter | - | operations | DLQ publish operations blocked by lineage protection |
| dlq_archive_failures_total | Counter | sink | operations | DLQ archive failures by sink type |
| dlq_reprocess_cleanup_removed_total | Counter | - | entries | Removed DLQ entries during retention cleanup |
| dlq_store_entries | Gauge | status | entries | Current replay store entries by status |
| dlq_store_blocked_lineages | Gauge | - | lineages | Current number of permanently blocked replay lineages |
| dlq_store_oldest_pending_age_seconds | Gauge | - | seconds | Age of oldest pending DLQ entry |

Origin Tag Values:

  • event_writer - Published by event writer
  • event_consumer - Published by event consumer

Reprocess Result Tag Values:

  • succeeded - Reprocessing succeeded
  • retry - Temporary failure, retry scheduled
  • permanent_failed - Permanent failure
  • skipped_blocked_lineage - Reprocessing skipped because the entry belongs to a blocked lineage

Archive Sink Tag Values:

  • best_effort - Best-effort archive sink
  • local_mandatory - Local mandatory archive sink

Store Status Tag Values:

  • pending - Waiting for reprocessing
  • processing - Currently being reprocessed
  • succeeded - Reprocessing succeeded
  • failed_permanent - Permanent failure

JVM Metrics

Prefix: jvm
Measures: Java Virtual Machine performance

Memory Metrics

| Metric | Type | Tags | Unit | Description |
|---|---|---|---|---|
| jvm_memory_used_bytes | Gauge | area, id | bytes | Used memory by memory pool |
| jvm_memory_committed_bytes | Gauge | area, id | bytes | Committed memory by memory pool |
| jvm_memory_max_bytes | Gauge | area, id | bytes | Maximum memory by memory pool |

Area Tag Values: heap, nonheap
ID Tag Values: PS Eden Space, PS Old Gen, PS Survivor Space, Code Cache, Metaspace, etc.

Garbage Collection Metrics

| Metric | Type | Tags | Unit | Description |
|---|---|---|---|---|
| jvm_gc_pause | Timer | action, cause | seconds | Garbage collection pause duration |
| jvm_gc_memory_allocated_bytes_total | Counter | - | bytes | Total memory allocated |
| jvm_gc_memory_promoted_bytes_total | Counter | - | bytes | Memory promoted to old generation |

Thread Metrics

| Metric | Type | Tags | Unit | Description |
|---|---|---|---|---|
| jvm_threads_live | Gauge | - | threads | Current live thread count |
| jvm_threads_daemon | Gauge | - | threads | Current daemon thread count |
| jvm_threads_peak | Gauge | - | threads | Peak live thread count |
| jvm_threads_states | Gauge | state | threads | Current thread count by state |

State Tag Values: runnable, blocked, waiting, timed-waiting, new, terminated

Class Loader Metrics

| Metric | Type | Tags | Unit | Description |
|---|---|---|---|---|
| jvm_classes_loaded | Gauge | - | classes | Currently loaded class count |
| jvm_classes_unloaded_total | Counter | - | classes | Total unloaded class count |

JVM Info Metrics

| Metric | Type | Tags | Unit | Description |
|---|---|---|---|---|
| jvm_info | Gauge | runtime, vendor, version | - | Constant info metric with JVM runtime metadata |

System Metrics

Prefix: system, process
Measures: Operating system and process metrics

CPU Metrics

| Metric | Type | Tags | Unit | Description |
|---|---|---|---|---|
| system_cpu_usage | Gauge | - | ratio (0-1) | System-wide CPU usage |
| system_cpu_count | Gauge | - | cores | Number of available processors |
| process_cpu_usage | Gauge | - | ratio (0-1) | Process CPU usage |

Uptime Metrics

| Metric | Type | Tags | Unit | Description |
|---|---|---|---|---|
| process_uptime_seconds | Gauge | - | seconds | Process uptime since start |
| process_start_time_seconds | Gauge | - | seconds | Process start time (Unix epoch) |

Disk Space Metrics

| Metric | Type | Tags | Unit | Description |
|---|---|---|---|---|
| disk_free_bytes | Gauge | path | bytes | Free disk space on filesystem |
| disk_total_bytes | Gauge | path | bytes | Total disk space on filesystem |

File Descriptor Metrics

| Metric | Type | Tags | Unit | Description |
|---|---|---|---|---|
| process_files_open | Gauge | - | files | Currently open file descriptors |
| process_files_max | Gauge | - | files | Maximum file descriptors |

Log4j2 Metrics

| Metric | Type | Tags | Unit | Description |
|---|---|---|---|---|
| log4j2_events_total | Counter | level | events | Log events by level |

Level Tag Values: trace, debug, info, warn, error, fatal

Tomcat Metrics

Prefix: tomcat
Measures: Embedded Apache Tomcat server metrics

Session Metrics

| Metric | Type | Tags | Unit | Description |
|---|---|---|---|---|
| tomcat_sessions_active_current | Gauge | - | sessions | Current active sessions |
| tomcat_sessions_active_max | Gauge | - | sessions | Maximum active sessions observed |
| tomcat_sessions_created_total | Counter | - | sessions | Total created sessions |
| tomcat_sessions_expired_total | Counter | - | sessions | Total expired sessions |
| tomcat_sessions_rejected_total | Counter | - | sessions | Total rejected sessions |

Thread Metrics

| Metric | Type | Tags | Unit | Description |
|---|---|---|---|---|
| tomcat_threads_current | Gauge | name | threads | Current thread count in thread pool |
| tomcat_threads_busy | Gauge | name | threads | Busy thread count in thread pool |

Name Tag: http-nio-8080 (depends on connector configuration)

Metric Correlation

Understanding Performance Bottlenecks

Compare metrics across layers to identify bottlenecks:

If api_operation_duration >> service_method_duration:

  • High serialization/deserialization overhead
  • Consider optimizing JSON parsing or reducing payload size

If service_method_duration is high while threadpool_active stays low:

  • Few workers are busy even though service calls are slow
  • Processing time is likely dominated by external dependencies

If threadpool_queue_size > 0:

  • Requests waiting for available worker threads
  • service_method_duration increases due to queue wait time
  • Consider scaling up thread pool
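
The cross-layer comparison can be automated once mean durations are available for the same time window. The sketch below covers the first and last rule only. The function name `diagnose` is hypothetical, and `factor` is an assumed threshold that reads ">>" as "at least `factor` times larger"; neither is defined by webPDF.

```python
def diagnose(api_duration, service_duration, queue_size, factor=2.0):
    """Return a list of findings from cross-layer metric readings.

    api_duration, service_duration: mean durations in seconds for the
    same time window. queue_size: current threadpool_queue_size.
    """
    findings = []
    # API layer much slower than the service layer it wraps.
    if service_duration > 0 and api_duration >= factor * service_duration:
        findings.append("API layer overhead: check serialization and payload size")
    # Any queued task adds wait time to service_method_duration.
    if queue_size > 0:
        findings.append("queue wait: requests are waiting for worker threads")
    return findings


# Hypothetical readings.
for finding in diagnose(api_duration=0.120, service_duration=0.040, queue_size=3):
    print(finding)
```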

Common Alert Patterns

Queue Buildup:

threadpool_queue_size > 0
→ Requests waiting for worker threads
→ service_method_duration increases
→ Consider scaling thread pool

Thread Pool Saturation:

threadpool_active == threadpool_pool_size
→ All threads busy, new requests must queue
→ Scale up or investigate slow workers

System Overload (Critical):

threadpool_rejected_total > 0
→ Queue full, requests being rejected
→ http_server_errors_server increases (503)
→ IMMEDIATE ACTION REQUIRED

High Memory Usage:

jvm_memory_used_bytes{area="heap"} / jvm_memory_max_bytes{area="heap"} > 0.8
→ Memory pressure, possible GC thrashing
→ Review heap settings or optimize memory usage
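
Because the memory gauges are reported per memory pool, an overall heap ratio needs the pool values to be combined. A minimal Python sketch, assuming that a pool without a defined limit reports a non-positive maximum (this can happen with some garbage collectors and should be verified for your setup). The function name and the sample values are hypothetical.

```python
def heap_usage_ratio(used_by_pool, max_by_pool):
    """Total heap usage divided by the total of all defined pool maximums.

    Both arguments map a pool id (for example "PS Old Gen") to bytes and
    should contain only series with area="heap". Returns None if no pool
    reports a usable maximum.
    """
    used = sum(used_by_pool.values())
    # Ignore pools whose maximum is undefined (assumed to be <= 0).
    limit = sum(value for value in max_by_pool.values() if value > 0)
    if limit <= 0:
        return None
    return used / limit


used = {"PS Eden Space": 300e6, "PS Survivor Space": 20e6, "PS Old Gen": 500e6}
maximum = {"PS Eden Space": 400e6, "PS Survivor Space": 50e6, "PS Old Gen": 550e6}

ratio = heap_usage_ratio(used, maximum)
if ratio is not None and ratio > 0.8:
    print(f"heap usage {ratio:.0%}: memory pressure, review heap settings")
```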

DLQ Buildup:

dlq_store_entries{status="pending"} > 100
→ Many failed events waiting for reprocessing
→ Check event consumer errors
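
The alert expressions on this page use a `name{tag="value"}` notation. Assuming the metrics are available as text lines in that same notation (an assumption about your setup, not a statement about webPDF's export format), the following Python sketch parses such lines and evaluates the DLQ buildup rule. The function names and the sample text are hypothetical.

```python
import re

# name, optional {labels}, then a numeric value; quoted label values may contain braces.
SAMPLE_RE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)'
    r'(?:\{(?P<labels>(?:[^"}]|"(?:[^"\\]|\\.)*")*)\})?'
    r'\s+(?P<value>\S+)'
)
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_samples(text):
    """Parse metric lines into (name, labels, value) tuples, skipping comments."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            continue
        labels = dict(LABEL_RE.findall(match.group("labels") or ""))
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples


def pending_dlq_entries(samples):
    """Sum of dlq_store_entries with status="pending"."""
    return sum(
        value
        for name, labels, value in samples
        if name == "dlq_store_entries" and labels.get("status") == "pending"
    )


sample_text = """
# hypothetical sample lines
dlq_store_entries{status="pending"} 120
dlq_store_entries{status="processing"} 3
http_server_requests_active{method="GET",uri="/rest/pdfa/{id}"} 2
"""

pending = pending_dlq_entries(parse_samples(sample_text))
if pending > 100:
    print(f"DLQ buildup: {pending:.0f} pending entries, check event consumer errors")
```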

See Also