Add pseudo-labeling based semi-supervised training recipe
zhu-han committed Aug 16, 2024
1 parent 1730fce commit 60bfbff
Showing 17 changed files with 6,979 additions and 0 deletions.
egs/librispeech/PL/README.md (118 additions)
# Introduction

This is a pseudo-labeling based semi-supervised ASR recipe for the LibriSpeech dataset. The ASR model is a Zipformer transducer. The labeled data is LibriSpeech train-clean-100. The unlabeled data is either LibriSpeech "train-clean-360 + train-other-500" for conventional semi-supervised learning, or the TedLium3 training set for unsupervised domain adaptation.

## Description of the recipe

### Preparation of data

The data required by this recipe is the same as for the LibriSpeech and TedLium3 ASR recipes, and the LibriSpeech tokenizer is used to build the model. Therefore, we can reuse the `prepare.sh` scripts from those recipes, as sketched below.
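
A minimal sketch, assuming the standard icefall directory layout; run from the repository root and adjust the paths to your checkout:

```
# Prepare LibriSpeech data (this also builds the tokenizer reused by this recipe).
cd egs/librispeech/ASR
./prepare.sh

# Prepare TedLium3 data (only needed for the domain adaptation setting).
cd ../../tedlium3/ASR
./prepare.sh
```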

### Supervised training for the seed ASR model

First, we perform supervised training on the LibriSpeech train-clean-100 subset to obtain the seed model for the subsequent pseudo-labeling based semi-supervised training.

```
export CUDA_VISIBLE_DEVICES="0,1,2,3"
./zipformer/train_seed.py \
--world-size 4 \
--num-epochs 70 \
--start-epoch 1 \
--use-fp16 1 \
--exp-dir zipformer/exp_seed \
--max-duration 1000
```
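
Optionally, training progress can be monitored with TensorBoard; a sketch, assuming the recipe follows the usual icefall convention of writing logs under `<exp-dir>/tensorboard`:

```
# Point TensorBoard at the experiment's log directory (icefall convention assumed).
tensorboard --logdir zipformer/exp_seed/tensorboard --port 6006
```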

To get a better-performing seed model, we average the checkpoints as follows:

```
./zipformer/generate_averaged_model.py \
--epoch 70 \
--avg 30 \
--exp-dir ./zipformer/exp_seed
```

The above command generates the final seed model `./zipformer/exp_seed/epoch-70-avg-30.pt`, i.e., the average of the checkpoints from the last 30 epochs.

### Semi-supervised training for the final ASR model

Then, we perform semi-supervised training, using the seed model as the initialization.

- Conventional semi-supervised learning setting where unlabeled data is "train-clean-360 + train-other-500":

```
./zipformer/train_pl.py \
--world-size 4 \
--num-epochs 20 \
--start-epoch 1 \
--use-fp16 1 \
--exp-dir zipformer/exp_pl_librispeech \
--max-duration 1000 \
--seed-model-path "zipformer/exp_seed/epoch-70-avg-30.pt" \
--unlabeled-dataset "librispeech"
```

- Unsupervised domain adaptation setting where unlabeled data is TedLium3 training set:

```
./zipformer/train_pl.py \
--world-size 4 \
--num-epochs 20 \
--start-epoch 1 \
--use-fp16 1 \
--exp-dir zipformer/exp_pl_tedlium \
--max-duration 1000 \
--seed-model-path "zipformer/exp_seed/epoch-70-avg-30.pt" \
--unlabeled-dataset "tedlium"
```
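
If a training run is interrupted, it can typically be resumed via `--start-epoch` (in icefall recipes, `--start-epoch N` usually loads the checkpoint of epoch `N-1`); a sketch for the LibriSpeech setting, assuming this recipe follows that convention:

```
# Resume the conventional semi-supervised run at epoch 11
# (loads zipformer/exp_pl_librispeech/epoch-10.pt, assuming icefall's
# usual checkpoint naming).
./zipformer/train_pl.py \
--world-size 4 \
--num-epochs 20 \
--start-epoch 11 \
--use-fp16 1 \
--exp-dir zipformer/exp_pl_librispeech \
--max-duration 1000 \
--seed-model-path "zipformer/exp_seed/epoch-70-avg-30.pt" \
--unlabeled-dataset "librispeech"
```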

### Decode

Finally, we run decoding to evaluate the performance of the ASR models.

- Evaluate on the LibriSpeech dataset:

```
./zipformer/decode.py \
--epoch 20 \
--avg 10 \
--exp-dir ./zipformer/exp_pl_librispeech \
--max-duration 600 \
--decoding-method modified_beam_search \
--beam-size 4 \
--dataset "librispeech"
```

- Evaluate on the TedLium3 dataset:

```
./zipformer/decode.py \
--epoch 20 \
--avg 10 \
--exp-dir ./zipformer/exp_pl_tedlium \
--max-duration 600 \
--decoding-method modified_beam_search \
--beam-size 4 \
--dataset "tedlium"
```
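
If a single standalone checkpoint of the final model is needed (e.g., for release), the same averaging script used for the seed model should apply; a sketch, assuming it accepts the same flags for the semi-supervised experiment directory:

```
# Export an averaged checkpoint of the final model
# (mirrors the --epoch/--avg values used for decoding above).
./zipformer/generate_averaged_model.py \
--epoch 20 \
--avg 10 \
--exp-dir ./zipformer/exp_pl_librispeech
```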

## Results

- Conventional semi-supervised learning (labeled: LibriSpeech 100h; unlabeled: LibriSpeech 860h), WER (%):

| Model | test-clean | test-other | comment |
|-------------------------|------------|------------|---------------------|
| supervised seed model | 5.45 | 13.7 | --epoch 70 --avg 30 |
| pseudo-labeling model | 4.33 | 9.61 | --epoch 20 --avg 10 |

- Unsupervised domain adaptation (labeled: LibriSpeech 100h; unlabeled: TedLium3), WER (%):

| Model                 | TedLium3 dev | TedLium3 test | comment             |
|-----------------------|--------------|---------------|---------------------|
| supervised seed model | 18.29        | 18.16         | --epoch 70 --avg 30 |
| pseudo-labeling model | 14.97        | 14.65         | --epoch 20 --avg 10 |


## Pre-trained models and logs

You can find the pre-trained models, training logs, TensorBoard logs, decoding logs, and decoding results at <https://huggingface.co/zhu-han/icefall-pl-librispeech-zipformer-medium-2023-08-06>.
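
To fetch the checkpoints locally, one option is cloning the Hugging Face repository (assuming `git-lfs` is installed):

```
# git-lfs is required so the large model files are actually downloaded.
git lfs install
git clone https://huggingface.co/zhu-han/icefall-pl-librispeech-zipformer-medium-2023-08-06
```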
