
Instructions to train / fine tune on our own data #1

Open
IamShubhamGupto opened this issue Oct 11, 2023 · 11 comments

Comments

@IamShubhamGupto

Hey

Thank you for releasing nanoowl, I think it's really helpful for my ongoing work. Is there a way to fine-tune the weights for my own data?

Instructions on how to train / fine-tune would be great!

Thank you

@jaybdub
Contributor

jaybdub commented Oct 11, 2023

Hey @IamShubhamGupto ,

Thanks for reaching out!

We don't have this feature at the moment, but I'll update this thread if that changes.

Depending on your use case, you might be able to provide image embeddings instead of text embeddings for querying objects. We haven't implemented this yet either, though 😅.

Let me know if you have any questions, or anything else I can do to help.

John
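The image-embedding idea above boils down to swapping the source of the query vector while keeping the scoring step the same. A minimal NumPy sketch of that scoring step follows; `region_embeds` and `query_embed` are made-up stand-ins for real OWL-ViT outputs, not nanoowl's actual API:

```python
import numpy as np

def cosine_similarity(query, matrix):
    """Cosine similarity between one query vector and each row of a matrix."""
    query = query / np.linalg.norm(query)
    matrix = matrix / np.linalg.norm(matrix, axis=-1, keepdims=True)
    return matrix @ query

# Hypothetical embeddings: 4 detected regions, embedding dim 8.
rng = np.random.default_rng(0)
region_embeds = rng.normal(size=(4, 8))

# Query with an *image* embedding instead of a text embedding; the
# scoring is identical, only the source of the query vector changes.
query_embed = region_embeds[2] + 0.01 * rng.normal(size=8)  # near region 2

scores = cosine_similarity(query_embed, region_embeds)
best = int(np.argmax(scores))
print(best)  # region 2 matches the query best
```

In a real pipeline the query vector would come from the model's image encoder rather than from random data, but the matching logic is this dot product.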

@IamShubhamGupto
Copy link
Author

Hey @jaybdub

Thank you for the feedback! Understood. I'll look into it in my own time as well; essentially we want to tackle very niche object-detection use cases with nanoowl, since training our own model is painful.

For now I guess prompt engineering is the way to go.

@elfar

elfar commented Nov 3, 2023

Hi @jaybdub - truly awesome stuff! Any plans on adding the image-embedding querying you mentioned in your comment as an option as well? E.g. selecting a bounding box of something in image A and looking for the contents of that selection in image B? Alternatively, could you roughly point me in the right direction in case I find the time to implement that feature myself?

-elfar
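The box-in-A-to-match-in-B workflow elfar describes can be sketched end to end with NumPy alone. `toy_embed` below is a deliberately crude placeholder (mean colour); a real implementation would embed the crop with the model's image encoder (the Hugging Face `OwlViTForObjectDetection` model exposes a similar image-guided detection mode, though that is separate from this repo):

```python
import numpy as np

def crop(image, box):
    """Crop a (y0, x0, y1, x1) box out of an HWC image array."""
    y0, x0, y1, x1 = box
    return image[y0:y1, x0:x1]

def toy_embed(patch):
    """Stand-in for a real image encoder: mean colour as a 3-d 'embedding'."""
    return patch.reshape(-1, patch.shape[-1]).mean(axis=0)

# Image A with a bright patch; select it with a bounding box.
image_a = np.zeros((64, 64, 3))
image_a[10:20, 10:20] = [1.0, 0.2, 0.2]
query = toy_embed(crop(image_a, (10, 10, 20, 20)))

# Image B: score candidate boxes against the query embedding.
image_b = np.zeros((64, 64, 3))
image_b[40:50, 30:40] = [1.0, 0.2, 0.2]
candidates = [(0, 0, 10, 10), (40, 30, 50, 40), (20, 20, 30, 30)]
scores = [float(np.dot(query, toy_embed(crop(image_b, b)))) for b in candidates]
best_box = candidates[int(np.argmax(scores))]
print(best_box)  # (40, 30, 50, 40)
```

The candidate boxes here are hand-picked; in practice they would be the detector's proposed regions.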

@Aki1991

Aki1991 commented Aug 9, 2024

Hi @jaybdub,

Is there any update on training this model on a custom dataset? This model is used in Metropolis, so it would be incredibly helpful to have this implemented.

As another option, I have trained the model from the official OWL-ViT repo, but here the model from Hugging Face transformers is used. I tried converting that trained model to a .engine file, but it is not working. Is there a way to build an .engine file from the official repo's model?

-Akash
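On the .engine question: nothing in this thread confirms a supported path, but a common route is to export a Hugging Face checkpoint to ONNX and then build an engine with TensorRT's trtexec. The checkpoint path and output names below are illustrative, and the export step assumes your optimum version supports the OWL-ViT architecture:

```shell
# Export the fine-tuned checkpoint to ONNX (path and architecture support are assumptions).
optimum-cli export onnx --model ./my-finetuned-owlvit onnx_out/

# Build a TensorRT engine from the exported graph.
trtexec --onnx=onnx_out/model.onnx --saveEngine=owlvit.engine --fp16
```

Note that checkpoints trained with the official (Scenic/JAX) OWL-ViT repo would first need converting to the transformers format before this route applies.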

@TaugenichtsZZY
Copy link


Hello @Aki1991! I recently wanted to fine-tune OWL-ViT with my own dataset. May I ask how you fine-tuned it? What format do I need to convert my dataset to? The official OWL-ViT repo only gives very simple code to run it, and I am somewhat confused. I would appreciate any advice if it is convenient, thank you very much!

@Aki1991

Aki1991 commented Sep 16, 2024

Hi @TaugenichtsZZY,

At which step are you having a problem? Did you follow these steps to fine-tune your model?

First, you will have to create a dataset for your custom objects and build it with the tfds library. You can find the procedure here. Change the name of the dataset in the config file according to the name you gave your dataset, then make the changes in the decoder function as shown here. I created my dataset in COCO format and built it; that way I was able to use the decoder function with minimal changes.
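The COCO-to-tfds step described above mostly comes down to grouping annotations by image before yielding examples. A stand-alone sketch of that grouping (no tfds dependency; the field names follow the standard COCO layout, and the sample data is invented):

```python
import json
from collections import defaultdict

def generate_examples(annotations_json):
    """Yield (key, record) pairs the way a tfds _generate_examples would,
    grouping COCO annotations under their image."""
    coco = json.loads(annotations_json)
    by_image = defaultdict(list)
    for ann in coco["annotations"]:
        by_image[ann["image_id"]].append(ann)
    for img in coco["images"]:
        record = {
            "image": img["file_name"],
            "width": img["width"],
            "height": img["height"],
            "objects": [
                {"bbox": a["bbox"], "label": a["category_id"]}
                for a in by_image[img["id"]]
            ],
        }
        # tfds requires the first element of the yield to be a unique key.
        yield img["file_name"], record

# Tiny in-memory COCO sample: one image with two boxes.
sample = json.dumps({
    "images": [{"id": 1, "file_name": "img_0001.jpg",
                "width": 640, "height": 480}],
    "annotations": [
        {"image_id": 1, "bbox": [10, 10, 50, 50], "category_id": 1},
        {"image_id": 1, "bbox": [100, 40, 30, 60], "category_id": 2},
    ],
})
examples = dict(generate_examples(sample))
print(len(examples["img_0001.jpg"]["objects"]))  # 2
```

In a real builder this generator would sit inside a `tfds.core.GeneratorBasedBuilder` subclass and the `image` field would be a path read by `tfds.features.Image`.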

@TaugenichtsZZY

@Aki1991 Thank you for your response! I'll give it a try.

@TaugenichtsZZY
Copy link


Hello! I made my own COCO dataset (with 14 images and two categories), but even after modifying it many times according to the tfds documentation, testing always reports many different kinds of errors. I also can't find a more specific example of converting a COCO dataset to tfds. Could I see how your dataset_builder.py and dataset_builder_test.py are written? Or could you give me some specific suggestions on what needs to be modified? Much appreciated!

@Aki1991

Aki1991 commented Sep 18, 2024

Hi @TaugenichtsZZY,

Yes, it is a bit confusing to create the dataset for tfds. You can take some hints from the lvis builder. In the repository you will find many other datasets from which you can see how they were created; take a look at the builder.py files of some of them. It took me almost a week to build my dataset and some more days to perfect it.

This is my file.
golf_mug_dataset_builder.txt

@Aki1991
Copy link

Aki1991 commented Sep 18, 2024

features=tfds.features.FeaturesDict and def _generate_examples(self, path): are the main things you should focus on.

features includes all the features that you want to have in your dataset, like image_id, name, height, width, bounding box, etc.

In _generate_examples, the way you create each record is a bit tricky. What I found is that at yield you have to give the name of the image first and then the data of the image (the record). I think you will get a good idea from my file. Let me know if you have any confusion.
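The features-declaration side of this can be mimicked without tfds at all, which makes it easy to sanity-check records before building. The dict below is a plain-Python stand-in for `tfds.features.FeaturesDict` (real code would use `tfds.features.Image`, `tfds.features.BBoxFeature`, etc.), and the field names are just the ones listed above:

```python
# Plain-dict stand-in for tfds.features.FeaturesDict; types are illustrative.
FEATURES = {
    "image": str,      # file name (tfds.features.Image in real code)
    "image_id": int,
    "width": int,
    "height": int,
    "objects": list,   # sequence of {bbox, label} dicts
}

def validate(record):
    """Check that a record matches the declared feature types before yielding it."""
    for name, typ in FEATURES.items():
        if not isinstance(record.get(name), typ):
            raise TypeError(f"field {name!r} should be {typ.__name__}")
    return True

record = {"image": "img_0001.jpg", "image_id": 1,
          "width": 640, "height": 480,
          "objects": [{"bbox": [10, 10, 50, 50], "label": 1}]}
print(validate(record))  # True
```

Mismatches between the declared features and the yielded records are a common source of the build-time errors mentioned earlier in the thread.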

@TaugenichtsZZY
Copy link


Your documentation and guidance have been a great help to me! I will spend some time working through your examples. Thank you so much for your help; it means a lot to me.
