Issue
We installed a Kafka KRaft cluster, version 3.5, on 3 RHEL Linux VMs (as part of lab testing).
The cluster is working well: all brokers are up and running, and the controllers are running as well.
The controller and broker configurations are set in the following paths:
/home/controller_home/kafka_2.13-3.5.1/config/kraft/controller.properties
/home/broker_home/kafka_2.13-3.5.1/config/kraft/broker.properties
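For reference, the key KRaft settings in such files typically look like the sketch below; the node ID, hostnames, and ports here are placeholders, not our exact lab values (only the log.dirs path matches the metadata directory shown later):

```properties
# controller.properties sketch (placeholder values)
process.roles=controller
node.id=1
controller.quorum.voters=1@controller1:9093,2@controller2:9093,3@controller3:9093
listeners=CONTROLLER://:9093
controller.listener.names=CONTROLLER
log.dirs=/var/kafka/controller
```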
We successfully created topics and can view them with the Kafka CLI as well.
The last thing that isn't working is viewing the metadata,
as in the following example (from our lab):
./kafka-metadata-shell.sh --snapshot /var/kafka/controller/__cluster_metadata-0/00000000000000000000.log
Loading...
[2023-09-13 16:11:38,419] WARN [snapshotReaderQueue] cleanup event threw exception (org.apache.kafka.queue.KafkaEventQueue)
java.nio.channels.NonWritableChannelException
at java.base/sun.nio.ch.FileChannelImpl.truncate(FileChannelImpl.java:399)
at org.apache.kafka.common.record.FileRecords.truncateTo(FileRecords.java:270)
at org.apache.kafka.common.record.FileRecords.trim(FileRecords.java:231)
at org.apache.kafka.common.record.FileRecords.close(FileRecords.java:205)
at org.apache.kafka.metadata.util.SnapshotFileReader$ShutdownEvent.run(SnapshotFileReader.java:195)
at org.apache.kafka.queue.KafkaEventQueue$EventHandler.run(KafkaEventQueue.java:186)
at java.base/java.lang.Thread.run(Thread.java:829)
In rare cases it works, as in:
./kafka-metadata-shell.sh --snapshot /var/kafka/controller/__cluster_metadata-0/00000000000000000000.log
Loading...
Starting...
[2023-09-13 15:33:18,481] WARN [snapshotReaderQueue] cleanup event threw exception (org.apache.kafka.queue.KafkaEventQueue)
java.nio.channels.NonWritableChannelException
at java.base/sun.nio.ch.FileChannelImpl.truncate(FileChannelImpl.java:399)
at org.apache.kafka.common.record.FileRecords.truncateTo(FileRecords.java:270)
at org.apache.kafka.common.record.FileRecords.trim(FileRecords.java:231)
at org.apache.kafka.common.record.FileRecords.close(FileRecords.java:205)
at org.apache.kafka.metadata.util.SnapshotFileReader$ShutdownEvent.run(SnapshotFileReader.java:195)
at org.apache.kafka.queue.KafkaEventQueue$EventHandler.run(KafkaEventQueue.java:186)
at java.base/java.lang.Thread.run(Thread.java:829)
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.jline.terminal.impl.exec.ExecTerminalProvider (file:/xxxx/controller/kafka_2.13-3.5.1/libs/jline-3.22.0.jar) to constructor java.lang.ProcessBuilder$RedirectPipeImpl()
WARNING: Please consider reporting this to the maintainers of org.jline.terminal.impl.exec.ExecTerminalProvider
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
[ Kafka Metadata Shell ]
>> ls /
local image
But in roughly 99% of tries we cannot explore the Kafka metadata using the metadata shell.
We don't understand whether this is a bug in this Kafka KRaft version or something else.
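The stack trace shows the failure in FileRecords.close(), where truncateTo() is called on a channel that is not writable. One workaround we can sketch (an assumption based on that trace, not a confirmed fix) is to point the shell at a writable copy of the segment instead of the live file:

```shell
# Sketch of a workaround (assumption: the exception comes from the shell
# being unable to write to the live segment when it closes the file).
# Copy the segment somewhere writable and open the copy instead.
TMP_DIR=$(mktemp -d)
cp /var/kafka/controller/__cluster_metadata-0/00000000000000000000.log "$TMP_DIR/"
chmod u+w "$TMP_DIR/00000000000000000000.log"
./kafka-metadata-shell.sh --snapshot "$TMP_DIR/00000000000000000000.log"
```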
Our Kafka cluster details:
openjdk version "11.0.11"
3 broker PIDs
3 controller PIDs
Kafka version - 3.5 (KRaft mode)
Example output from kafka-metadata-quorum.sh
(run on the first Kafka node):
./kafka-metadata-quorum.sh --bootstrap-server xxxxxx:xxxxx describe --replication
NodeId LogEndOffset Lag LastFetchTimestamp LastCaughtUpTimestamp Status
1 30368 0 1694623038767 1694623038767 Leader
2 30368 0 1694623038606 1694623038606 Follower
3 30368 0 1694623038607 1694623038607 Follower
33 30368 0 1694623038606 1694623038606 Observer
22 30368 0 1694623038606 1694623038606 Observer
11 30368 0 1694623038606 1694623038606 Observer
Here are the files under the __cluster_metadata-0
folder (the metadata topic):
-rwxrwxrwx 1 root root 43 Sep 13 12:09 partition.metadata
-rwxrwxrwx 1 root root 4772 Sep 13 14:12 00000000000000012930-0000000036.checkpoint
-rw-r--r-- 1 root root 10 Sep 13 15:06 00000000000000019468.snapshot
-rw-r--r-- 1 root root 154 Sep 13 15:58 quorum-state
-rw-r--r-- 1 root root 202 Sep 13 15:58 leader-epoch-checkpoint
-r-------- 1 root root 4772 Sep 13 16:06 00000000000000026669-0000000084.checkpoint
-rwxrwxrwx 1 root root 10485756 Sep 13 16:44 00000000000000000000.timeindex
-rwxrwxrwx 1 root root 10485760 Sep 13 16:44 00000000000000000000.index
-rwxrwxrwx 1 root root 2239162 Sep 13 16:44 00000000000000000000.log
Here is an example of the expected output (from the Apache documentation):
> bin/kafka-metadata-shell.sh --snapshot metadata_log_dir/__cluster_metadata-0/00000000000000000000.log
>> ls /
brokers local metadataQuorum topicIds topics
>> ls /topics
foo
>> cat /topics/foo/0/data
{
"partitionId" : 0,
"topicId" : "5zoAlv-xEh9xRANKXt1Lbg",
"replicas" : [ 1 ],
"isr" : [ 1 ],
"removingReplicas" : null,
"addingReplicas" : null,
"leader" : 1,
"leaderEpoch" : 0,
"partitionEpoch" : 0
}
>> exit
Related links
https://zhuanlan.zhihu.com/p/595020396
IMPORTANT NOTE
Since we are using Kafka KRaft version 3.5, we checked the 3.5.1 release notes for a hint of a fixed bug related to our problem, but we found nothing relevant in the following list:
Bug
[KAFKA-15053] - Regression for security.protocol validation starting from 3.3.0
[KAFKA-15080] - Fetcher's lag never set when partition is idle
[KAFKA-15096] - CVE 2023-34455 - Vulnerability identified with Apache kafka
[KAFKA-15098] - KRaft migration does not proceed and broker dies if authorizer.class.name is set
[KAFKA-15114] - StorageTool help specifies user as parameter not name
[KAFKA-15137] - Don't log the entire request in KRaftControllerChannelManager
[KAFKA-15145] - AbstractWorkerSourceTask re-processes records filtered out by SMTs on retriable exceptions
[KAFKA-15149] - Fix not sending UMR and LISR RPCs in dual-write mode when there are new partitions
Solution
Looking at https://kafka.apache.org/documentation/,
it gives the following example of how to check the metadata:
> bin/kafka-dump-log.sh --cluster-metadata-decoder --files metadata_log_dir/__cluster_metadata-0/00000000000000000000.log
The above option of passing the xxxxxxxxxxxxxxx.log file
throws the exception and does not work reliably.
The option that does work is to pass the file(s) -xxxxxxxxxxxxxxxxxxxxxxxxx.checkpoint
instead of xxxxxxxxxxxxxxxxxxxxxxxx.log.
Example
kafka-metadata-shell.sh --snapshot ..../__cluster_metadata-0/00000000000005727401-0000003391.checkpoint
Loading...
Starting...
[ Kafka Metadata Shell ]
>> ls /
image local
>> ls /image
acls clientQuotas cluster configs delegationToken features producerIds provenance scram topics
>> ls /local
commitId version
>>
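To avoid typing a checkpoint file name by hand each time, the most recent checkpoint can be selected automatically. A small sketch (the directory path is the one from our lab; adjust it to your log.dirs):

```shell
# Pick the most recently modified checkpoint and open it in the shell.
META_DIR=/var/kafka/controller/__cluster_metadata-0
CKPT=$(ls -t "$META_DIR"/*.checkpoint | head -n 1)
echo "Using checkpoint: $CKPT"
./kafka-metadata-shell.sh --snapshot "$CKPT"
```

Note that a fresh cluster may not have written a checkpoint yet, so the glob can come up empty; in that case, wait for the controller to emit a metadata snapshot first.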
Answered By - Judy Answer Checked By - Clifford M. (WPSolving Volunteer)