Kafka: reading from a topic log file
I have a topic log file and the corresponding .index file. I would like to read the messages in a streaming fashion and process them. How and where should I start?
- Should I load these files with a Kafka producer and then read from the topic?
- Can I write a consumer that reads the data directly from the files and processes it?
I have gone through the Kafka website, but all of the examples there use pre-built Kafka producers and consumers, so I couldn't find enough guidance.
I want to read the data in a streaming fashion in Java.
The text looks encrypted, so I am not posting the input files.
Any help is really appreciated.
apache-kafka apache-kafka-streams
I cannot follow. What exactly do you mean by "I have a topic log file"?
– Matthias J. Sax
Nov 24 '18 at 21:52
It's not encrypted. It's serialized as raw bytes, and standard consumers deserialize that for you. Otherwise, what you want is the CLI tool that dumps log segments, but it's not clear why you want these raw files.
– cricket_007
Nov 24 '18 at 22:19
@matthias-j-sax There are two files in a folder, named 0000000000.index and 000000000.log. I want to read the files and do aggregate operations.
– Thomson
Nov 24 '18 at 23:36
You would need to build a deserializer that understands the internal format used by Kafka. Overall, this seems to be a very special request: those files are not designed to be consumed by any other application. As @cricket_007 pointed out, there is a dump log segments tool. It should help you read the files, provided you have the correct deserializers for keys and values.
– Matthias J. Sax
Nov 25 '18 at 4:22
asked Nov 24 '18 at 18:17
Thomson
1 Answer
You can dump the log segments with the DumpLogSegments tool (kafka.tools.DumpLogSegments) and use its deep iteration option (--deep-iteration) to deserialize the data into something more readable.
If you want to "stream it", use a standard Unix pipe to send that output to some other tool.
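Since the question asks for Java, here is a minimal sketch of what the "other tool" at the end of that pipe could look like. It assumes the dump is run with payload printing turned on and that each record line contains the text `payload: ` followed by the value; the exact line layout differs between Kafka versions, so check it against your own output first. The class name `PayloadCounter` and the count-per-payload aggregation are purely illustrative.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical helper: counts how often each payload appears in dump output read from stdin.
final class PayloadCounter {
    // Assumption: the dump prints the record value after this marker.
    private static final String MARKER = "payload: ";

    // Reads lines lazily, keeps only record lines, and counts occurrences per payload.
    static Map<String, Long> countPayloads(BufferedReader reader) {
        return reader.lines()
                .filter(line -> line.contains(MARKER))
                .map(line -> line.substring(line.indexOf(MARKER) + MARKER.length()).trim())
                .collect(Collectors.groupingBy(p -> p, TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        BufferedReader stdin = new BufferedReader(
                new InputStreamReader(System.in, StandardCharsets.UTF_8));
        // Print one "payload<TAB>count" line per distinct payload, sorted by payload.
        countPayloads(stdin).forEach((payload, count) ->
                System.out.println(payload + "\t" + count));
    }
}
```

You would pipe the dump tool's output into `java PayloadCounter`. Lines without the marker (headers, batch summaries) are simply skipped.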
As for "do aggregate operations": use Kafka Streams to actually read from the topic, across all partitions, rather than only the single partition stored on that single broker.
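If you still want to look at the .log file directly from Java, as discussed in the comments, the following is a rough sketch of walking its outer framing only. It relies on my understanding of the on-disk layout: every entry starts with an 8-byte offset and a 4-byte length (both big-endian), and the magic (format version) byte sits 4 bytes into the body. It does not decode keys, values, or compressed batches; that part depends on the magic value and on your serializers. `SegmentWalker` is a hypothetical name, and you should verify the layout against the format documentation for your broker version.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: lists {offset, length, magic} for each entry in a segment file.
final class SegmentWalker {
    // Assumed framing: 8-byte offset followed by a 4-byte body length.
    static final int ENTRY_HEADER_BYTES = 12;

    static List<long[]> walk(ByteBuffer buf) {
        List<long[]> entries = new ArrayList<>();
        while (buf.remaining() >= ENTRY_HEADER_BYTES) {
            long offset = buf.getLong();   // ByteBuffer is big-endian by default
            int length = buf.getInt();     // number of bytes in the body that follows
            // Stop on a truncated or implausible entry instead of guessing.
            if (length < 5 || length > buf.remaining()) {
                break;
            }
            int bodyStart = buf.position();
            byte magic = buf.get(bodyStart + 4); // format version byte (assumed position)
            entries.add(new long[] {offset, length, magic});
            buf.position(bodyStart + length);    // jump to the next entry
        }
        return entries;
    }

    public static void main(String[] args) throws IOException {
        // Reads the whole file into memory; fine for a sketch, not for very large segments.
        byte[] bytes = Files.readAllBytes(Paths.get(args[0]));
        for (long[] e : walk(ByteBuffer.wrap(bytes))) {
            System.out.printf("offset=%d length=%d magic=%d%n", e[0], e[1], e[2]);
        }
    }
}
```

This only tells you where entries begin and which format version they use. For anything beyond inspection, loading the data into a running broker and using a normal consumer or Kafka Streams is the supported route.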
edited Nov 25 '18 at 5:01
answered Nov 24 '18 at 22:23
cricket_007