Aiven for Apache Kafka® metrics available via Prometheus

The following list contains only the most common metrics available via Prometheus for an Aiven for Apache Kafka® service.

To retrieve the complete list of metrics available for your service, send a request to the Prometheus endpoint, substituting the following values:

  • the Aiven project certificate (ca.pem)

  • the Prometheus credentials (<PROMETHEUS_USER>:<PROMETHEUS_PASSWORD>)

  • the Aiven for Apache Kafka hostname (<KAFKA_HOSTNAME>)

  • the Prometheus port (<PROMETHEUS_PORT>)

curl --cacert ca.pem \
    --user '<PROMETHEUS_USER>:<PROMETHEUS_PASSWORD>' \
    'https://<KAFKA_HOSTNAME>:<PROMETHEUS_PORT>/metrics'
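
The same request can be scripted when you want to work with the metric names programmatically. The following is a minimal Python sketch, not an official Aiven client. The function names fetch_metrics and metric_names are hypothetical, the sample payload (including its labels and values) is made up, and the code assumes that the endpoint returns the standard Prometheus text exposition format and accepts HTTP basic authentication, as the curl command above suggests.

```python
import base64
import re
import ssl
import urllib.request

# In the Prometheus text format, each sample line starts with the metric name.
_NAME_RE = re.compile(r"^([a-zA-Z_:][a-zA-Z0-9_:]*)")


def metric_names(exposition_text):
    """Return the sorted, de-duplicated metric names found in Prometheus text output."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = _NAME_RE.match(line)
        if match:
            names.add(match.group(1))
    return sorted(names)


def fetch_metrics(hostname, port, user, password, ca_path="ca.pem"):
    """Fetch the raw /metrics payload (hypothetical helper mirroring the curl call)."""
    context = ssl.create_default_context(cafile=ca_path)
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(
        f"https://{hostname}:{port}/metrics",
        headers={"Authorization": f"Basic {token}"},
    )
    with urllib.request.urlopen(request, context=context) as response:
        return response.read().decode("utf-8")


# Made-up sample payload for illustration; real output is much longer.
SAMPLE = """\
# HELP cpu_usage_user example help text
# TYPE cpu_usage_user gauge
cpu_usage_user{cpu="cpu-total",host="kafka-1"} 3.5
disk_used_percent{host="kafka-1",path="/data"} 41.2
kafka_server_BrokerTopicMetrics_MessagesInPerSec_Count{host="kafka-1"} 1024
kafka_server_BrokerTopicMetrics_MessagesInPerSec_Count{host="kafka-2"} 2048
"""

print(metric_names(SAMPLE))
```

With a real service, you would pass the result of fetch_metrics(...) to metric_names(...) instead of the sample text. The names are de-duplicated because the same metric typically appears once per label combination.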

Tip

For details on using Prometheus with Aiven, see the dedicated document.

CPU utilization

  • cpu_usage_guest

  • cpu_usage_guest_nice

  • cpu_usage_idle

  • cpu_usage_iowait

  • cpu_usage_irq

  • cpu_usage_nice

  • cpu_usage_softirq

  • cpu_usage_steal

  • cpu_usage_system

  • cpu_usage_user

  • system_load1

  • system_load15

  • system_load5

  • system_n_cpus

  • system_n_users

  • system_uptime

Disk space utilization

  • disk_free

  • disk_inodes_free

  • disk_inodes_total

  • disk_inodes_used

  • disk_total

  • disk_used

  • disk_used_percent

Disk input and output

  • diskio_io_time

  • diskio_iops_in_progress

  • diskio_merged_reads

  • diskio_merged_writes

  • diskio_read_bytes

  • diskio_read_time

  • diskio_reads

  • diskio_weighted_io_time

  • diskio_write_bytes

  • diskio_write_time

  • diskio_writes

Garbage collector MXBean

  • java_lang_GarbageCollector_G1_Young_Generation_CollectionCount: returns the total number of collections that have occurred

  • java_lang_GarbageCollector_G1_Young_Generation_CollectionTime: returns the approximate accumulated collection elapsed time in milliseconds

  • java_lang_GarbageCollector_G1_Young_Generation_duration

Memory usage

  • java_lang_Memory_committed: returns the amount of memory in bytes that is committed for the Java virtual machine to use

  • java_lang_Memory_init: returns the amount of memory in bytes that the Java virtual machine initially requests from the operating system for memory management

  • java_lang_Memory_max: returns the maximum amount of memory in bytes that can be used for memory management

  • java_lang_Memory_used: returns the amount of used memory in bytes

  • java_lang_Memory_ObjectPendingFinalizationCount

Apache Kafka Connect

The list of Apache Kafka Connect metrics is available on the dedicated page.

Apache Kafka broker

Descriptions of the metrics below are available in the Monitoring section of the Apache Kafka documentation.

Note

Metrics with a _Count suffix are cumulative counters, for example kafka_controller_ControllerStats_LeaderElectionRateAndTimeMs_Count.

A metric such as kafka_server_BrokerTopicMetrics_MessagesInPerSec_Count is likewise a cumulative count of incoming messages, despite the PerSec suffix in its name.

To see the rate of change of a _Count metric, apply a rate function to it, for example the rate() function in PromQL.
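
In PromQL, such a query could look like rate(kafka_server_BrokerTopicMetrics_MessagesInPerSec_Count[5m]). To make explicit what a rate over a cumulative counter means, the following Python sketch derives a per-second rate from two samples. It is a simplified illustration under stated assumptions, not how PromQL is implemented: the function name per_second_rate and the sample values are hypothetical, it looks at only two samples, and it treats any decrease as a counter reset. PromQL's rate() works over all samples in the range and extrapolates, so its results can differ.

```python
def per_second_rate(earlier, later):
    """Approximate per-second rate between two (timestamp_seconds, counter_value) samples."""
    t0, v0 = earlier
    t1, v1 = later
    elapsed = t1 - t0
    if elapsed <= 0:
        raise ValueError("later sample must have a greater timestamp")
    # A drop in a cumulative counter usually means it restarted from zero
    # (for example after a broker restart), so count only the new value.
    increase = v1 - v0 if v1 >= v0 else v1
    return increase / elapsed


# Made-up samples of a _Count metric taken 60 seconds apart:
# the counter grew by 3000 messages, which is 50 messages per second.
rate = per_second_rate((1_700_000_000, 120_000), (1_700_000_060, 123_000))
print(rate)  # 50.0
```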

Apache Kafka controller

  • kafka_controller_ControllerStats_LeaderElectionRateAndTimeMs_50thPercentile

  • kafka_controller_ControllerStats_LeaderElectionRateAndTimeMs_75thPercentile

  • kafka_controller_ControllerStats_LeaderElectionRateAndTimeMs_95thPercentile

  • kafka_controller_ControllerStats_LeaderElectionRateAndTimeMs_98thPercentile

  • kafka_controller_ControllerStats_LeaderElectionRateAndTimeMs_999thPercentile

  • kafka_controller_ControllerStats_LeaderElectionRateAndTimeMs_99thPercentile

  • kafka_controller_ControllerStats_LeaderElectionRateAndTimeMs_Count

  • kafka_controller_ControllerStats_LeaderElectionRateAndTimeMs_FifteenMinuteRate

  • kafka_controller_ControllerStats_LeaderElectionRateAndTimeMs_FiveMinuteRate

  • kafka_controller_ControllerStats_LeaderElectionRateAndTimeMs_Max

  • kafka_controller_ControllerStats_LeaderElectionRateAndTimeMs_Mean

  • kafka_controller_ControllerStats_LeaderElectionRateAndTimeMs_MeanRate

  • kafka_controller_ControllerStats_LeaderElectionRateAndTimeMs_Min

  • kafka_controller_ControllerStats_LeaderElectionRateAndTimeMs_OneMinuteRate

  • kafka_controller_ControllerStats_LeaderElectionRateAndTimeMs_StdDev

  • kafka_controller_ControllerStats_UncleanLeaderElectionsPerSec_Count

  • kafka_controller_KafkaController_ActiveBrokerCount_Value

  • kafka_controller_KafkaController_ActiveControllerCount_Value

  • kafka_controller_KafkaController_FencedBrokerCount_Value

  • kafka_controller_KafkaController_OfflinePartitionsCount_Value

  • kafka_controller_KafkaController_PreferredReplicaImbalanceCount_Value

  • kafka_controller_KafkaController_ReplicasIneligibleToDeleteCount_Value

  • kafka_controller_KafkaController_ReplicasToDeleteCount_Value

  • kafka_controller_KafkaController_TopicsIneligibleToDeleteCount_Value

  • kafka_controller_KafkaController_TopicsToDeleteCount_Value

Jolokia collector collect time

  • kafka_jolokia_collector_collect_time

Apache Kafka log

  • kafka_log_LogCleaner_cleaner_recopy_percent_Value

  • kafka_log_LogCleanerManager_time_since_last_run_ms_Value

  • kafka_log_LogCleaner_max_clean_time_secs_Value

  • kafka_log_LogFlushStats_LogFlushRateAndTimeMs_50thPercentile

  • kafka_log_LogFlushStats_LogFlushRateAndTimeMs_75thPercentile

  • kafka_log_LogFlushStats_LogFlushRateAndTimeMs_95thPercentile

  • kafka_log_LogFlushStats_LogFlushRateAndTimeMs_98thPercentile

  • kafka_log_LogFlushStats_LogFlushRateAndTimeMs_999thPercentile

  • kafka_log_LogFlushStats_LogFlushRateAndTimeMs_99thPercentile

  • kafka_log_LogFlushStats_LogFlushRateAndTimeMs_Count

  • kafka_log_LogFlushStats_LogFlushRateAndTimeMs_FifteenMinuteRate

  • kafka_log_LogFlushStats_LogFlushRateAndTimeMs_FiveMinuteRate

  • kafka_log_LogFlushStats_LogFlushRateAndTimeMs_Max

  • kafka_log_LogFlushStats_LogFlushRateAndTimeMs_Mean

  • kafka_log_LogFlushStats_LogFlushRateAndTimeMs_MeanRate

  • kafka_log_LogFlushStats_LogFlushRateAndTimeMs_Min

  • kafka_log_LogFlushStats_LogFlushRateAndTimeMs_OneMinuteRate

  • kafka_log_LogFlushStats_LogFlushRateAndTimeMs_StdDev

  • kafka_log_Log_LogEndOffset_Value

  • kafka_log_Log_LogStartOffset_Value

  • kafka_log_Log_Size_Value

Apache Kafka network

  • kafka_network_RequestChannel_RequestQueueSize_Value

  • kafka_network_RequestChannel_ResponseQueueSize_Value

  • kafka_network_RequestMetrics_RequestsPerSec_Count

  • kafka_network_RequestMetrics_TotalTimeMs_95thPercentile

  • kafka_network_RequestMetrics_TotalTimeMs_Count

  • kafka_network_RequestMetrics_TotalTimeMs_Mean

  • kafka_network_SocketServer_NetworkProcessorAvgIdlePercent_Value

Apache Kafka server

  • kafka_server_BrokerTopicMetrics_BytesInPerSec_Count

  • kafka_server_BrokerTopicMetrics_BytesOutPerSec_Count

  • kafka_server_BrokerTopicMetrics_BytesRejectedPerSec_Count

  • kafka_server_BrokerTopicMetrics_FailedFetchRequestsPerSec_Count

  • kafka_server_BrokerTopicMetrics_FailedProduceRequestsPerSec_Count

  • kafka_server_BrokerTopicMetrics_FetchMessageConversionsPerSec_Count

  • kafka_server_BrokerTopicMetrics_MessagesInPerSec_Count

  • kafka_server_BrokerTopicMetrics_ProduceMessageConversionsPerSec_Count

  • kafka_server_BrokerTopicMetrics_ReassignmentBytesInPerSec_Count

  • kafka_server_BrokerTopicMetrics_ReassignmentBytesOutPerSec_Count

  • kafka_server_BrokerTopicMetrics_ReplicationBytesInPerSec_Count

  • kafka_server_BrokerTopicMetrics_ReplicationBytesOutPerSec_Count

  • kafka_server_BrokerTopicMetrics_TotalFetchRequestsPerSec_Count

  • kafka_server_BrokerTopicMetrics_TotalProduceRequestsPerSec_Count

  • kafka_server_DelayedOperationPurgatory_NumDelayedOperations_Value

  • kafka_server_DelayedOperationPurgatory_PurgatorySize_Value

  • kafka_server_KafkaRequestHandlerPool_RequestHandlerAvgIdlePercent_OneMinuteRate

  • kafka_server_KafkaServer_BrokerState_Value

  • kafka_server_ReplicaManager_IsrExpandsPerSec_Count

  • kafka_server_ReplicaManager_IsrShrinksPerSec_Count

  • kafka_server_ReplicaManager_LeaderCount_Value

  • kafka_server_ReplicaManager_PartitionCount_Value

  • kafka_server_ReplicaManager_UnderMinIsrPartitionCount_Value

  • kafka_server_ReplicaManager_UnderReplicatedPartitions_Value

  • kafka_server_group_coordinator_metrics_group_completed_rebalance_count

  • kafka_server_group_coordinator_metrics_group_completed_rebalance_rate

  • kafka_server_group_coordinator_metrics_offset_commit_count

  • kafka_server_group_coordinator_metrics_offset_commit_rate

  • kafka_server_group_coordinator_metrics_offset_deletion_count

  • kafka_server_group_coordinator_metrics_offset_deletion_rate

  • kafka_server_group_coordinator_metrics_offset_expiration_count

  • kafka_server_group_coordinator_metrics_offset_expiration_rate

Kernel

  • kernel_boot_time

  • kernel_context_switches

  • kernel_entropy_avail

  • kernel_interrupts

  • kernel_processes_forked

Generic memory

  • mem_active

  • mem_available

  • mem_available_percent

  • mem_buffered

  • mem_cached

  • mem_commit_limit

  • mem_committed_as

  • mem_dirty

  • mem_free

  • mem_high_free

  • mem_high_total

  • mem_huge_pages_free

  • mem_huge_page_size

  • mem_huge_pages_total

  • mem_inactive

  • mem_low_free

  • mem_low_total

  • mem_mapped

  • mem_page_tables

  • mem_shared

  • mem_slab

  • mem_swap_cached

  • mem_swap_free

  • mem_swap_total

  • mem_total

  • mem_used

  • mem_used_percent

  • mem_vmalloc_chunk

  • mem_vmalloc_total

  • mem_vmalloc_used

  • mem_wired

  • mem_write_back

  • mem_write_back_tmp

Network

  • net_bytes_recv

  • net_bytes_sent

  • net_drop_in

  • net_drop_out

  • net_err_in

  • net_err_out

  • net_icmp_inaddrmaskreps

  • net_icmp_inaddrmasks

  • net_icmp_incsumerrors

  • net_icmp_indestunreachs

  • net_icmp_inechoreps

  • net_icmp_inechos

  • net_icmp_inerrors

  • net_icmp_inmsgs

  • net_icmp_inparmprobs

  • net_icmp_inredirects

  • net_icmp_insrcquenchs

  • net_icmp_intimeexcds

  • net_icmp_intimestampreps

  • net_icmp_intimestamps

  • net_icmpmsg_intype3

  • net_icmpmsg_intype8

  • net_icmpmsg_outtype0

  • net_icmpmsg_outtype3

  • net_icmp_outaddrmaskreps

  • net_icmp_outaddrmasks

  • net_icmp_outdestunreachs

  • net_icmp_outechoreps

  • net_icmp_outechos

  • net_icmp_outerrors

  • net_icmp_outmsgs

  • net_icmp_outparmprobs

  • net_icmp_outredirects

  • net_icmp_outsrcquenchs

  • net_icmp_outtimeexcds

  • net_icmp_outtimestampreps

  • net_icmp_outtimestamps

  • net_ip_defaultttl

  • net_ip_forwarding

  • net_ip_forwdatagrams

  • net_ip_fragcreates

  • net_ip_fragfails

  • net_ip_fragoks

  • net_ip_inaddrerrors

  • net_ip_indelivers

  • net_ip_indiscards

  • net_ip_inhdrerrors

  • net_ip_inreceives

  • net_ip_inunknownprotos

  • net_ip_outdiscards

  • net_ip_outnoroutes

  • net_ip_outrequests

  • net_ip_reasmfails

  • net_ip_reasmoks

  • net_ip_reasmreqds

  • net_ip_reasmtimeout

  • net_packets_recv

  • net_packets_sent

  • netstat_tcp_close

  • netstat_tcp_close_wait

  • netstat_tcp_closing

  • netstat_tcp_established

  • netstat_tcp_fin_wait1

  • netstat_tcp_fin_wait2

  • netstat_tcp_last_ack

  • netstat_tcp_listen

  • netstat_tcp_none

  • netstat_tcp_syn_recv

  • netstat_tcp_syn_sent

  • netstat_tcp_time_wait

  • netstat_udp_socket

  • net_tcp_activeopens

  • net_tcp_attemptfails

  • net_tcp_currestab

  • net_tcp_estabresets

  • net_tcp_incsumerrors

  • net_tcp_inerrs

  • net_tcp_insegs

  • net_tcp_maxconn

  • net_tcp_outrsts

  • net_tcp_outsegs

  • net_tcp_passiveopens

  • net_tcp_retranssegs

  • net_tcp_rtoalgorithm

  • net_tcp_rtomax

  • net_tcp_rtomin

  • net_udp_ignoredmulti

  • net_udp_incsumerrors

  • net_udp_indatagrams

  • net_udp_inerrors

  • net_udplite_ignoredmulti

  • net_udplite_incsumerrors

  • net_udplite_indatagrams

  • net_udplite_inerrors

  • net_udplite_noports

  • net_udplite_outdatagrams

  • net_udplite_rcvbuferrors

  • net_udplite_sndbuferrors

  • net_udp_noports

  • net_udp_outdatagrams

  • net_udp_rcvbuferrors

  • net_udp_sndbuferrors

Processes

  • processes_blocked

  • processes_dead

  • processes_idle

  • processes_paging

  • processes_running

  • processes_sleeping

  • processes_stopped

  • processes_total

  • processes_total_threads

  • processes_unknown

  • processes_zombies

Swap usage

  • swap_free

  • swap_in

  • swap_out

  • swap_total

  • swap_used

  • swap_used_percent