Required changes to train the skeleton action recognition model with custom data #277

davelbit · 2022-07-06T13:59:46Z

Hello opendr-team,
I am currently working on my master's thesis, which is about skeleton action recognition, and that is how I found this great project. So far I have tested the pose estimation, the skeleton action recognition with the provided pre-trained models, and the skeleton extraction on my own dataset, which currently contains only 5 video files (5 classes) and will be extended later if everything works as expected.

For the last step I need to train the models with the custom dataset. Therefore, I was wondering if someone can enlighten me on what changes need to be made to train one of the provided models with either the spatio_temporal_gcn_learner or the progressive_spatio_temporal_gcn_learner.

negarhdr · 2022-07-11T10:15:50Z

Collaborator

Hi,

Thank you for your interest in the OpenDR project.

To train the models on a custom skeleton dataset, you first need to use the Lightweight OpenPose method to extract body poses from the videos. I have code for that in this directory:
opendr/projects/perception/skeleton_based_action_recognition/demos/skeleton_extraction.py

After extracting the body skeletons, you need to do some preprocessing as well. The models receive a sequence of 300 skeletons for each sample, so if a video has fewer than 300 frames, you need to pad the sequence up to that length. We also assumed that each frame contains two skeletons (because the NTU-RGBD and Kinetics datasets include actions performed by two interacting people, like "shaking hands"). So you need to check each frame: if it contains fewer than 2 skeletons, you need to pad it, and if it contains more than 2 skeletons, you need to choose 2 of them (this is included in the code).
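The padding and selection steps described above can be sketched roughly as follows. This is only an illustration in NumPy, not the code from the demos directory: the function name `build_sample`, the zero-padding, the rule for choosing two skeletons (keep those with the most detected joints), and the extra person axis in the output are all assumptions, so please compare with skeleton_extraction.py before relying on it.

```python
import numpy as np


def build_sample(frames, num_frames=300, num_joints=18, num_channels=2, max_persons=2):
    """Pack per-frame detections into one fixed-size array (hypothetical helper).

    frames: list over time; each item is a list of arrays of shape
            (num_joints, num_channels), one array per detected person.
    Returns an array of shape (num_channels, num_frames, num_joints, max_persons).
    """
    # Starting from zeros means short videos and missing persons are zero-padded.
    sample = np.zeros((num_channels, num_frames, num_joints, max_persons), dtype=np.float32)

    # Videos longer than num_frames are truncated (assumption).
    for t, persons in enumerate(frames[:num_frames]):
        # Assumed selection rule: prefer skeletons with the most non-zero joints.
        ranked = sorted(
            persons,
            key=lambda p: np.count_nonzero(np.abs(p).sum(axis=1)),
            reverse=True,
        )
        for m, person in enumerate(ranked[:max_persons]):
            # (num_joints, num_channels) -> (num_channels, num_joints)
            sample[:, t, :, m] = np.asarray(person, dtype=np.float32).T
    return sample
```

Calling `build_sample(frames)` on a short clip returns an array whose unused frames and unused person slot are all zeros.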

So the first step in preparing a custom dataset is to follow the preprocessing steps and make sure each input video is represented as a tensor of size CxTxV (2x300x18).
Please check out my code in this directory, and if you have further questions, we can discuss them:

https://github.com/opendr-eu/opendr/tree/d3beb3b56fddb5f82e8b62ae5b4678e7f09a5f33/projects/perception/skeleton_based_action_recognition/demos

2 replies

davelbit Jul 20, 2022
Author

Thanks for your response. I've completed all the steps you mentioned. What I'm still missing is the training process itself, for example for the ST-GCN model. Do I simply need to run the fit() function for any of the GCN models (STGCN or PSTGCN) with the new dataset?

negarhdr Aug 9, 2022
Collaborator

Yes, if you have the extracted skeletons and the corresponding labels ready, all you need to do is create an instance of the SpatioTemporalGCNLearner class and set its parameters according to your dataset (e.g. num_points, num_classes, graph_type, etc.).
Then you can call the fit() function to train a model.
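
Having the skeletons and labels "ready" usually means packing them into files that the learner can read from a dataset folder. The sketch below assumes the layout used by the original ST-GCN data tools: one .npy array of shape (N, C, T, V, M) and one pickle holding a (sample_names, labels) tuple. The helper name `save_split`, the default file names, and the generated sample names are assumptions for illustration, so check them against the arguments of fit() in the OpenDR documentation.

```python
import pickle
from pathlib import Path

import numpy as np


def save_split(samples, labels, out_dir,
               data_name="train_joints.npy", labels_name="train_labels.pkl"):
    """Write one dataset split to disk (hypothetical helper).

    samples: list of equally shaped arrays, e.g. (C, T, V, M) per video.
    labels:  list of integer class indices, one per sample.
    """
    if len(samples) != len(labels):
        raise ValueError("samples and labels must have the same length")

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # Stack the per-video arrays along a new leading axis: (N, C, T, V, M).
    data = np.stack(samples).astype(np.float32)
    # Placeholder sample names; real file names could be used instead.
    names = [f"sample_{i:05d}" for i in range(len(samples))]

    np.save(out_dir / data_name, data)
    with open(out_dir / labels_name, "wb") as f:
        pickle.dump((names, [int(label) for label in labels]), f)
    return data.shape
```

A validation split could be written the same way by passing different file names for `data_name` and `labels_name`.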

Answer selected by passalis
