Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support ANY field in Protobuf descriptors #543

Open
wants to merge 1 commit into
base: dev
Choose a base branch
from

Conversation

xakassi
Copy link
Contributor

@xakassi xakassi commented Dec 16, 2020

This a fix for a bug described in PR 529.

Any field is now supported for Protobuf descriptors. Multiple layer complex objects with Any fields on any layers are also supported using recursion. Please, see unit tests for examples.

The principle of operation is as follows. If a message contains a field with type Any, we can find out an actual field type at the runtime from the message, get descriptors for this field type and add them to TypeRegistry. To get the descriptors for this field type I pre-built a map with pairs - for all types provided in all descriptors.

Hi, @tchiotludo , @jorgheymans ! Please, take a look at this PR.
@jorgheymans I hope your case is fully covered now, please, test on your data.

@xakassi xakassi force-pushed the bug/protobuf_with_any_field branch 2 times, most recently from 6df6117 to 60aba97 Compare December 17, 2020 06:36
@jorgheymans
Copy link
Contributor

Going to test this out hopefully today, feedback soon.

@jorgheymans
Copy link
Contributor

Looking at this, the proto serde is not working and i'm not seeing any error messages. This leads me to believe it's a configuration error. Can you confirm this config should be correct:

    deserialization:
      protobuf:
        descriptors-folder: "/opt/akhq/descriptors"
        topics-mapping:
          - topic-regex: "cc.*"
            descriptor-file: "cc-message-proto.protobin"
            value-message-type: "cc.message.CCMessage"
          - topic-regex: "han.*"
            descriptor-file: "han-cdc-proto.protobin"
            value-message-type: "han.message.HanMessage"

Note that the descriptor files are not self-containing as it doesn't make sense once you have Any. So the /opt/akhq/descriptors directory contains many protobin files, I am assuming you scan them all ?

So basically we have KafkaMessage.proto

message KafkaMessage {
    ..........
    Any payload
}

And then many payload types, each type has its own protobin file. All these protobin files are in the descriptors-folder.

@xakassi
Copy link
Contributor Author

xakassi commented Dec 19, 2020 via email

@xakassi
Copy link
Contributor Author

xakassi commented Dec 21, 2020

Hi, @jorgheymans !
This is my whole config:

akhq:
  connections:
    kafka:
      properties:
        bootstrap.servers: "localhost:9094"
      deserialization:
        protobuf:
          descriptors-folder: "D:\\protobuf_desc"
          topics-mapping:
            - topic-regex: "test.*"
              descriptor-file: "streaming.desc"
              key-message-type: "Row"
              value-message-type: "Envelope"

It should not be a problem that descriptor files are not self-containing. I believe it's just configuration issue since there are no any error messages. Seems like deserialization is put in the wrong place and is not recognized at all.

@jorgheymans
Copy link
Contributor

Indeed the config block was placed under topics, i have corrected it and as soon as I'm able to test this again i'll let you know, may take a few days with all the holidays going on.

@tchiotludo
Copy link
Owner

@jorgheymans : please tell me what you test is going on.
I would like to release a new AKHQ version will a full support of protobuf (v1).
Thanks for your time @xakassi & @jorgheymans !

@jorgheymans
Copy link
Contributor

jorgheymans commented Dec 28, 2020 via email

@tchiotludo
Copy link
Owner

@jorgheymans : have a good holidays !
see you next week !

@xakassi
Copy link
Contributor Author

xakassi commented Jan 11, 2021

Hi, @jorgheymans !
Did you have any chance to test it yet?

@mdkouki
Copy link

mdkouki commented Jan 11, 2021

Hi,following Jorg inputs test done and we got an exception here I share with you what we got :
proto_exception

cordially.

@xakassi
Copy link
Contributor Author

xakassi commented Jan 11, 2021

Hi!
Are your descriptor files ok? It's an exception from standard function FileDescriptorSet.parseFrom(descriptorFile), it cannot parse your descriptor file.
Could you provide more info about your descriptor files and data? Maybe some fake data to reproduce.

@jorgheymans
Copy link
Contributor

jorgheymans commented Jan 11, 2021 via email

@xakassi xakassi force-pushed the bug/protobuf_with_any_field branch from 60aba97 to 90d63a1 Compare January 11, 2021 16:00
@xakassi
Copy link
Contributor Author

xakassi commented Jan 11, 2021

I have added a unit test for an object with Timestamp field. @jorgheymans , please, see the unit test deserializeFilm(). Film object now looks like this:

syntax = "proto3";
package org.akhq.utils;

import "google/protobuf/timestamp.proto";

option java_package = "org.akhq.utils";
option java_outer_classname = "FilmProto";

message Film {
  string name = 1;
  string producer = 2;
  int32 release_year = 3;
  int32 duration = 4;
  repeated string starring = 5;
  google.protobuf.Timestamp timestamp = 6;
}

And it deserialized successfully.
Do your objects looks like this also? Or I have specified Timestamp in a different way?

@xakassi xakassi force-pushed the bug/protobuf_with_any_field branch 4 times, most recently from f44ea1f to b0615a7 Compare January 12, 2021 08:48
@jorgheymans
Copy link
Contributor

The error surfaces if you have a timestamp field in a type that is defined as Any.

@xakassi
Copy link
Contributor Author

xakassi commented Jan 12, 2021

Yes, indeed. But in my case (when I test) I get other error in a different function (not here FileDescriptorSet.parseFrom(descriptorFile)).

But anyway my current approach of collecting necessary descriptors is incomplete. And it would be better to just add all custom descriptors from a specified folder and add all well-known-descriptors.

The one question is - how to get all well-known-descriptors. @jorgheymans could you, please, tell me how you implemented that? How do you get all well-known-descriptors? It would be easier and faster for me :)

@xakassi
Copy link
Contributor Author

xakassi commented Jan 12, 2021

@jorgheymans
Maybe you just call something like that for every well-known-type?

com.google.protobuf.StringValue.getDescriptor();

@geertvb
Copy link

geertvb commented Jan 12, 2021

@xakassi The .proto files for the well-known-descriptors can be found in the com.google.protobuf:protobuf-java jar under the folder google/protobuf. You can compile these proto definitions and generate a single protobin file. We have a maven pom file to do that.
Once you have this protobin file, it can be loaded and used like any other file descriptor set.

@xakassi
Copy link
Contributor Author

xakassi commented Jan 12, 2021

Thanks, @geertvb ! I will try this approach!

@xakassi xakassi force-pushed the bug/protobuf_with_any_field branch from b0615a7 to ea2f01b Compare January 12, 2021 15:27
@xakassi
Copy link
Contributor Author

xakassi commented Jan 12, 2021

I had some troubles with making single DescriptorSet, and I have implemented it in a different way. I think, it's better in my case.

Please, guys, try your use case again with my new fixed code! @jorgheymans

@geertvb
Copy link

geertvb commented Jan 12, 2021

@xakassi Initially we also hard coded the well defined types like that but it is less flexible. Besides that we needed the protobin files for another project where we can't use that trick because we aren't using the google libraries.

@xakassi
Copy link
Contributor Author

xakassi commented Jan 12, 2021

@geertvb You can put any additional protobin files in descriptors-folder, Descriptors will be extracted from them and will be used for fields with Any type if needed.

@dert1714
Copy link

Hi, @tchiotludo!
Do you have any another assumtion to merge this PR except unresolved conflicts?

@tchiotludo tchiotludo force-pushed the bug/protobuf_with_any_field branch from 8376a48 to 35cf9ec Compare April 26, 2022 20:06
@tchiotludo
Copy link
Owner

at @dert1714, I don't have a lot of knowledge of protobuf.
I resolved the conflict, If you can test if it's working I could merge it

@tchiotludo
Copy link
Owner

@dert1714 build failed due to incorrect unit that need to be fixed before merging

@stoibazarov
Copy link

I was able to fix this error by building proto.desc file with argument protoc --include_imports

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: In progress
Development

Successfully merging this pull request may close these issues.

7 participants