Loss becomes NaN after a large number of training steps #118

Wxl-stars · 2025-01-02T03:41:28Z

Apart from changing the dataset, I used the settings from the paper.
But once training reaches about 80,000 steps (8w), the loss becomes NaN. What can I do to fix this?
