Install the required libraries from requirements.txt (for example, with pip install -r requirements.txt).
Datasets can be downloaded from the sources mentioned in the paper. Language-to-dataset mapping: Italian: EMOVO, Persian: ShEMO, Urdu: Urdu SER.
Install PyTorch, torchvision, and torchaudio (CUDA 11.3 builds):
pip install torch==1.10.0+cu113 torchvision==0.11.1+cu113 torchaudio==0.10.0+cu113 -f https://download.pytorch.org/whl/cu113/torch_stable.html
For the hyperbolic layers we use the implementation from Hazy Research: https://github.com/HazyResearch/hgcn. Clone the hgcn repository and enter its directory:
git clone https://github.com/HazyResearch/hgcn.git
cd hgcn
Place each dataset in the folder named below. The (dataset)_train.csv files assume this layout when reading the speech files.
- URDU: create a folder named 'urdu-language-speech-dataset', load data here.
- ShEMO: create a folder named 'shemo-persian-speech-emotion-detection-database', load data here.
- EMOVO: create a folder named 'emovo', load data here.
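The folder layout above can be created in one step. The following is a minimal sketch, not part of the repository: the helper name create_dataset_dirs and the choice of the current directory as the root are assumptions, while the folder names are the ones listed above.

```python
from pathlib import Path

# Folder names taken from the list above, keyed by dataset name.
DATASET_DIRS = {
    "Urdu SER": "urdu-language-speech-dataset",
    "ShEMO": "shemo-persian-speech-emotion-detection-database",
    "EMOVO": "emovo",
}


def create_dataset_dirs(root="."):
    """Create the three dataset folders under root (hypothetical helper)."""
    created = []
    for folder in DATASET_DIRS.values():
        path = Path(root) / folder
        # exist_ok=True makes the call safe to repeat.
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created


if __name__ == "__main__":
    for path in create_dataset_dirs():
        print("ready:", path)
```

After running it, download each dataset from its source and place the files inside the matching folder.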
For data loading, create a folder named 'data' and copy the dataset-specific files from the 'data_files' folder into it. Similarly, vocab.json can be replaced with the dataset-specific file. Example command: !cp /data_files/train_shemo.csv /data/train.csv
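The copy step can also be scripted. The sketch below is an illustration under assumptions: it assumes every dataset-specific file in 'data_files' follows the pattern seen in the example command (such as train_shemo.csv), and the helper name stage_dataset_files and its relative default paths are hypothetical. It does not handle vocab.json, since the name of the dataset-specific vocabulary file is not stated here.

```python
import shutil
from pathlib import Path


def stage_dataset_files(dataset, data_files_dir="data_files", data_dir="data"):
    """Copy <split>_<dataset>.csv files to <data_dir>/<split>.csv.

    Hypothetical helper; the naming pattern is inferred from the single
    example train_shemo.csv -> train.csv and may not cover every file.
    """
    src_dir = Path(data_files_dir)
    dst_dir = Path(data_dir)
    dst_dir.mkdir(parents=True, exist_ok=True)

    suffix = f"_{dataset}.csv"
    copied = []
    for src in sorted(src_dir.glob(f"*{suffix}")):
        # Strip the dataset suffix: train_shemo.csv -> train.csv
        dst = dst_dir / (src.name[: -len(suffix)] + ".csv")
        shutil.copyfile(src, dst)
        copied.append(dst)
    return copied


if __name__ == "__main__":
    for path in stage_dataset_files("shemo"):
        print("copied:", path)
```

Switching datasets then amounts to calling the function again with a different dataset name, which overwrites the files in 'data'.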
- model_run.py includes the processing and training functions used by the model.
- To try the different hyperbolic variants (HVIB, HVIB-C, and ADAPT-VIB), change self.c in Wav2Vec2ClassificationHeadViBERTHyperbolic() (lines 575-582) accordingly. The dataset-specific hyperbolicity can also be changed here.
- To run the base model or the VIB model, replace trainer_hyp_vib.train() with trainer_base.train() or with the VIB trainer's train() call, respectively.
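To give a feel for what changing the curvature value does, the sketch below applies the standard Poincaré-ball exponential map at the origin to one vector for several values of c. It is a plain-Python illustration, not code from model_run.py or hgcn: the function names are hypothetical, and it is an assumption that self.c plays the role of c in this formula.

```python
import math


def expmap0(u, c):
    """Map a Euclidean vector u onto the Poincare ball with curvature -c.

    Standard formula: tanh(sqrt(c) * ||u||) * u / (sqrt(c) * ||u||).
    """
    sqrt_c = math.sqrt(c)
    # Guard against division by zero for the zero vector.
    norm = max(math.sqrt(sum(x * x for x in u)), 1e-15)
    scale = math.tanh(sqrt_c * norm) / (sqrt_c * norm)
    return [scale * x for x in u]


def euclidean_norm(v):
    return math.sqrt(sum(x * x for x in v))


if __name__ == "__main__":
    u = [3.0, 4.0]
    for c in (0.5, 1.0, 2.0):
        p = expmap0(u, c)
        radius = 1 / math.sqrt(c)
        print(f"c={c}: norm={euclidean_norm(p):.4f}, ball radius={radius:.4f}")
```

A larger c gives a smaller ball (radius 1/sqrt(c)), so the same input vector is squeezed closer to the origin; for very small inputs the map is close to the identity.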