Code example for Object Detection using Mask RCNN #1134

Open
wants to merge 20 commits into
base: main
from

Conversation

Om-Doiphode

Developed training example for object detection #1117

@CLAassistant

CLAassistant commented May 6, 2023

CLA assistant check
All committers have signed the CLA.

@Om-Doiphode
Author

@HAOCHENYE, I have issued a PR, please review. Thanks!

Collaborator

@HAOCHENYE HAOCHENYE left a comment

Hi, thanks for your contribution. Can you provide the training logs for this example?

BTW, the KITTI dataset seems rather large to download. Is there a smaller dataset? A slice of the COCO dataset may be enough, for example coco128, which we can divide into a training set and a validation set. Such a small dataset will cause overfitting, but I think that is acceptable for an example.

It seems this example still uses the `Accuracy` metric. I suggest using an IoU metric to evaluate the model instead. MMEval defines many metrics that you can try.
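
The train/validation split suggested above could be sketched roughly as follows. This is a minimal illustration, not code from the PR: the helper name `split_indices`, the 25% validation fraction, and the fixed seed are all assumptions chosen for the example.

```python
import random
from typing import List, Tuple


def split_indices(num_samples: int,
                  val_fraction: float = 0.2,
                  seed: int = 0) -> Tuple[List[int], List[int]]:
    """Shuffle sample indices reproducibly and split them into train/val lists."""
    if not 0.0 < val_fraction < 1.0:
        raise ValueError("val_fraction must be strictly between 0 and 1")
    indices = list(range(num_samples))
    # A dedicated Random instance keeps the split reproducible without
    # touching the global random state.
    random.Random(seed).shuffle(indices)
    num_val = max(1, int(round(num_samples * val_fraction)))
    val_idx = sorted(indices[:num_val])
    train_idx = sorted(indices[num_val:])
    return train_idx, val_idx


# coco128 has 128 images; hold out a quarter of them for validation.
train_idx, val_idx = split_indices(128, val_fraction=0.25)
```

The two index lists could then be wrapped with `torch.utils.data.Subset(dataset, train_idx)` and `torch.utils.data.Subset(dataset, val_idx)` so that a single dataset object backs both dataloaders.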

@Om-Doiphode
Author

Om-Doiphode commented May 9, 2023

Hi @HAOCHENYE, I tried using the coco128 dataset, but I am running into the following issue:
image

I have tried debugging it but have been unable to resolve it. Can you please guide me on how to solve this problem?

@HAOCHENYE
Collaborator

HAOCHENYE commented May 10, 2023

The current implementation seems a little complicated. The error indicates that the index given by the sampler does not match the implementation here:

https://github.com/open-mmlab/mmengine/pull/1134/files#diff-ee540e9d5ba67d4968f7f2640915c3e1e333de63a16bfd6d1620cabfc60c005bR220

Before continuing development with coco128, we need to figure out whether coco128 is a good fit for the example:

  1. Can we download coco128 in Python code and then convert it to the target format, so that we can reuse an existing dataset class such as torchvision.datasets.CocoDetection?
  2. Could we train a simpler algorithm such as RetinaNet? I'm not sure whether coco128 contains sufficient label information for instance segmentation tasks.
  3. coco128 is only a suggestion; any other smaller dataset is acceptable.
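
For the format conversion raised in point 1, a rough sketch is shown below. It assumes the copy of coco128 in use stores box-only labels as YOLO-style text lines (`class cx cy w h`, normalized to the image size); that layout is an assumption about the download, and the helper name `yolo_line_to_coco_annotation` is hypothetical. Box-only labels would also bear on point 2, since they carry no masks.

```python
from typing import Dict


def yolo_line_to_coco_annotation(line: str, img_w: int, img_h: int) -> Dict:
    """Convert one normalized 'class cx cy w h' label line into a
    COCO-style annotation with an absolute [x_min, y_min, width, height] box."""
    class_id, cx, cy, w, h = line.split()
    box_w = float(w) * img_w
    box_h = float(h) * img_h
    # YOLO stores the box centre; COCO stores the top-left corner.
    x_min = float(cx) * img_w - box_w / 2
    y_min = float(cy) * img_h - box_h / 2
    return {
        "category_id": int(class_id),
        "bbox": [x_min, y_min, box_w, box_h],
        "area": box_w * box_h,
        "iscrowd": 0,
    }


annotation = yolo_line_to_coco_annotation("0 0.5 0.5 0.5 0.5", img_w=100, img_h=200)
```

A full COCO annotation file would additionally need `images` and `categories` sections plus per-annotation `id` and `image_id` fields before a class such as torchvision.datasets.CocoDetection could read it.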

@Om-Doiphode
Author

Om-Doiphode commented May 11, 2023

Hi @HAOCHENYE, the issue still persists.

  1. The COCO dataset cannot be downloaded through the torchvision library, so we need to download it externally.

image

The following error occurs when using dict():
image

image
When using the PyTorch DataLoader, the following error occurs:
image

The same error occurs even with the KITTI dataset, and also when using RetinaNet instead of Mask RCNN. Can you please guide me on how to solve this problem?

@Om-Doiphode
Author

Hi @HAOCHENYE, after implementing a custom collate function for the COCO dataset, I am getting the following error:
image

@HAOCHENYE
Collaborator

It seems that your collate_fn returns improperly formatted batch data. See this tutorial for more information about collate_fn.
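
For context, a common shape for a detection collate_fn is sketched below. It is not the function from the PR: it assumes each dataset item is an `(image, target)` pair, and the name `detection_collate` is made up for illustration. The default collate function tries to stack samples into one tensor, which fails when images differ in size or when targets hold different numbers of boxes, so the sketch keeps them as per-sample lists instead.

```python
from typing import Any, List, Sequence, Tuple


def detection_collate(batch: Sequence[Tuple[Any, Any]]) -> Tuple[List[Any], List[Any]]:
    """Group a batch of (image, target) pairs into (images, targets)
    without stacking, so that samples may differ in shape."""
    images, targets = zip(*batch)
    return list(images), list(targets)


# Plain Python objects stand in for image tensors and target dicts here.
batch = [("img0", {"boxes": [[0, 0, 5, 5]]}), ("img1", {"boxes": []})]
images, targets = detection_collate(batch)
```

It would be passed as `DataLoader(dataset, batch_size=2, collate_fn=detection_collate)`. Whatever structure the function returns is what the model's forward receives, so the two have to agree.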

@HAOCHENYE
Collaborator

Besides, if there is no available URL to download coco128, you can also use this dataset:

https://download.openmmlab.com/mmyolo/data/balloon_dataset.zip

and use torch.hub.download_url_to_file to download it in the code. However, this dataset has only one class, so you will need to change the model structure.
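
A sketch of that download step might look like the following. The URL is the one given above; the `data` target directory and the helper names `archive_name` and `fetch_dataset` are assumptions, and the layout of the extracted files is not checked here.

```python
import zipfile
from pathlib import Path

BALLOON_URL = "https://download.openmmlab.com/mmyolo/data/balloon_dataset.zip"


def archive_name(url: str) -> str:
    """Return the file name component of a download URL."""
    return url.rsplit("/", 1)[-1]


def fetch_dataset(url: str = BALLOON_URL, root: str = "data") -> Path:
    """Download the archive once, extract it under `root`, and return `root`."""
    # Imported here so the module can be loaded without PyTorch installed.
    import torch

    root_dir = Path(root)
    root_dir.mkdir(parents=True, exist_ok=True)
    archive = root_dir / archive_name(url)
    if not archive.exists():
        torch.hub.download_url_to_file(url, str(archive))
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(root_dir)
    return root_dir
```

Calling `fetch_dataset()` at the top of the example script would make it runnable without a manual download step.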

@Om-Doiphode
Author

Om-Doiphode commented May 15, 2023

Hi @HAOCHENYE, the issue with the collate function seems to be solved, but I am now getting errors while calculating accuracy from the IoU score. I have defined my own Accuracy class, but the model sometimes predicts fewer bounding boxes than there are ground-truth boxes, and sometimes more. Sometimes it returns an empty list of predicted bounding boxes. Can you please guide me on how to solve this?
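
One way to cope with unequal box counts is to match predictions to ground-truth boxes by IoU instead of comparing them position by position. The pure-Python sketch below illustrates the idea; it is not the Accuracy class from the PR, and it assumes boxes are `[x1, y1, x2, y2]` lists, predictions are already sorted by descending score, and 0.5 is an acceptable IoU threshold.

```python
from typing import List, Sequence, Tuple


def box_iou(a: Sequence[float], b: Sequence[float]) -> float:
    """Intersection over union of two [x1, y1, x2, y2] boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def match_boxes(pred_boxes: List[Sequence[float]],
                gt_boxes: List[Sequence[float]],
                iou_thr: float = 0.5) -> Tuple[int, int, int]:
    """Greedily match predictions to ground truth and return
    (true_positives, false_positives, false_negatives)."""
    unmatched_gt = list(range(len(gt_boxes)))
    true_pos = 0
    for pred in pred_boxes:
        best_iou, best_gt = 0.0, None
        for g in unmatched_gt:
            iou = box_iou(pred, gt_boxes[g])
            if iou > best_iou:
                best_iou, best_gt = iou, g
        if best_gt is not None and best_iou >= iou_thr:
            true_pos += 1
            unmatched_gt.remove(best_gt)  # each ground-truth box matches once
    return true_pos, len(pred_boxes) - true_pos, len(unmatched_gt)
```

Extra predictions then count as false positives, missed objects as false negatives, and an empty prediction list simply yields zero true positives rather than an error.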

@Om-Doiphode
Author

Hi @HAOCHENYE, I have fixed the issue with model training as you can see:
image

The only thing remaining is the implementation of the Accuracy class. I am using mmeval for the IoU calculation (MeanIoU, to be more precise). However, the model's predictions contain more bounding boxes than the ground truth, so I am working on fixing this with techniques such as NMS (non-maximum suppression).
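
For reference, greedy NMS combined with a score threshold can be sketched in plain Python as below. This is an illustration of the technique, not the code in the PR; the `[x1, y1, x2, y2]` box layout, the function names, and both default thresholds are assumptions. In practice `torchvision.ops.nms` performs the same suppression on tensors, and torchvision's detection models already apply NMS internally, so raising the score threshold is often what trims the surplus boxes.

```python
from typing import List, Sequence


def box_iou(a: Sequence[float], b: Sequence[float]) -> float:
    """Intersection over union of two [x1, y1, x2, y2] boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: List[Sequence[float]],
        scores: List[float],
        iou_thr: float = 0.5,
        score_thr: float = 0.0) -> List[int]:
    """Return indices of kept boxes, highest score first."""
    # Drop low-confidence boxes, then visit the rest in descending score order.
    order = sorted((i for i, s in enumerate(scores) if s >= score_thr),
                   key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap too much with a kept one.
        if all(box_iou(boxes[i], boxes[k]) <= iou_thr for k in keep):
            keep.append(i)
    return keep
```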

@Om-Doiphode
Author

Hi @HAOCHENYE, the model is working successfully, albeit the accuracy is low. Please review, thanks!

Collaborator

@HAOCHENYE HAOCHENYE left a comment

Hi, please include coco.py and detection_example.py in the detection directory, and provide a README.md that tells users how to train this example.

Collaborator

If this util module is only used for converting the dataset format, I think we can save the converted data in our OSS and download the dataset in the example code, just like torchvision.datasets does. Besides, could the mini-balloon_dataset be used in this example?

@HAOCHENYE
Collaborator

> Hi @HAOCHENYE, the model is working successfully, albeit the accuracy is low. Please review, thanks!

BTW, the official mask-rcnn is designed for detecting 80 different classes, while it appears that the example provided is focused on single class detection. Could this be a contributing factor to the lower accuracy observed?
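
Adapting a pretrained torchvision Mask R-CNN to a different number of classes usually means replacing its box and mask prediction heads. The sketch below follows that pattern; it is not the PR's code, and the names `num_model_classes` and `build_mask_rcnn` as well as the `hidden_layer=256` default are assumptions. One detail that is easy to miss is that torchvision's detection models reserve label 0 for background, so a single-class dataset needs two output classes.

```python
def num_model_classes(num_foreground_classes: int) -> int:
    """torchvision detection heads expect one extra class for background."""
    return num_foreground_classes + 1


def build_mask_rcnn(num_foreground_classes: int, hidden_layer: int = 256):
    """Load a COCO-pretrained Mask R-CNN and swap in heads sized for the dataset."""
    # Imported here so the helper above can be used without torchvision installed.
    from torchvision.models.detection import maskrcnn_resnet50_fpn
    from torchvision.models.detection.faster_rcnn import FastRCNNPredictor
    from torchvision.models.detection.mask_rcnn import MaskRCNNPredictor

    num_classes = num_model_classes(num_foreground_classes)
    model = maskrcnn_resnet50_fpn(weights="DEFAULT")

    # Replace the box classification/regression head.
    in_features = model.roi_heads.box_predictor.cls_score.in_features
    model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)

    # Replace the mask head as well, since it also predicts per class.
    in_channels = model.roi_heads.mask_predictor.conv5_mask.in_channels
    model.roi_heads.mask_predictor = MaskRCNNPredictor(
        in_channels, hidden_layer, num_classes)
    return model
```

For the single-class balloon dataset this would be `build_mask_rcnn(1)`. The new heads start from random weights, so some fine-tuning is needed before accuracy becomes meaningful.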

@Om-Doiphode
Author

Hi, @HAOCHENYE, I have made the readme, please review. Thanks!

@Om-Doiphode
Author

> Hi @HAOCHENYE, the model is working successfully, albeit the accuracy is low. Please review, thanks!

> BTW, the official mask-rcnn is designed for detecting 80 different classes, while it appears that the example provided is focused on single class detection. Could this be a contributing factor to the lower accuracy observed?

It seems the dataset I was using wasn't appropriate for this object detection example: I tried training for more epochs and tweaked the hyperparameters, but the accuracy still didn't increase. So I have now switched to COCO128, and the accuracy is much better.

@Om-Doiphode
Author

Om-Doiphode commented Jun 2, 2023

Hi @HAOCHENYE, is there anything else that needs to be done for this PR? Can you please elaborate?
Also, I am now using Faster RCNN instead of Mask RCNN.

Collaborator

@HAOCHENYE HAOCHENYE left a comment

The purpose of the examples is to demonstrate how convenient it is to build an algorithm pipeline using MMEngine. Specific algorithm implementations, dataset implementations, and metric implementations should not take up too much space.

For example, we directly use torchvision's models and datasets to construct the training process. We can also use a Metric defined in MMEval or TorchMetrics to calculate validation accuracy. MMEngine serves as a combination of these modules, and the examples should not contain too many implementation details.

The current implementation exposes too many details about the implementations, which is not ideal for an example.

@Om-Doiphode
Author

Om-Doiphode commented Jun 6, 2023

Hi @HAOCHENYE, as per your suggestions, I have made the following change:
Removed the coco_utils.py file.

Furthermore, I am already utilizing the COCODetection metric from mmeval to calculate accuracy:

image

The remaining implementations are for dataset formatting only, which I think is necessary for the custom dataset we are using.
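
As an example of the kind of formatting involved, the sketch below converts torchvision-style detections (corner-format `[x1, y1, x2, y2]` boxes with labels and scores) into records in the standard COCO results layout, which uses `[x, y, width, height]` boxes. The function name `to_coco_results` is hypothetical, and whether mmeval's COCODetection metric expects exactly this layout is not confirmed here; its documentation should be checked for the input format it requires.

```python
from typing import Dict, List, Sequence


def to_coco_results(image_id: int,
                    boxes_xyxy: Sequence[Sequence[float]],
                    labels: Sequence[int],
                    scores: Sequence[float]) -> List[Dict]:
    """Convert one image's detections into COCO results-format records."""
    results = []
    for (x1, y1, x2, y2), label, score in zip(boxes_xyxy, labels, scores):
        results.append({
            "image_id": image_id,
            "category_id": int(label),
            # COCO stores the top-left corner plus width and height.
            "bbox": [x1, y1, x2 - x1, y2 - y1],
            "score": float(score),
        })
    return results


records = to_coco_results(7, [[10, 20, 30, 60]], [1], [0.9])
```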

Please review. Thanks!

Collaborator

@HAOCHENYE HAOCHENYE Jun 15, 2023

Hello! Could you provide me with a step-by-step guide on running this example? I find that the section titled "Prepare your dataset" is not detailed enough.

Author

I have made the necessary changes. Please review. Thanks!

3 participants