Implement ELO rating function #4

paulbricman · 2023-05-30T12:31:49Z

We should have a function which receives as arguments:

a list of model names

a "number of games" parameter (needs looking into: are we randomly pitting "players" against each other, or are we rather going through all possible games?)

And returns a dictionary whose keys are model names and values are ELO ratings.

After every game, the winning player takes points from the losing one. (https://en.wikipedia.org/wiki/Elo_rating_system)
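A minimal sketch of what such a function could look like, not a settled design. The names `elo_ratings`, `expected_score`, and `play_game`, the K-factor of 32, and the starting rating of 1000 are all assumptions for illustration. `play_game` stands in for whatever actually runs a game between two models. To cover both options raised above, passing `n_games=None` goes through every pair once, while an integer samples that many random pairings.

```python
import itertools
import random
from typing import Callable, Dict, List, Optional


def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score of player A against player B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def elo_ratings(
    model_names: List[str],
    play_game: Callable[[str, str], float],  # hypothetical: returns A's score (1.0 win, 0.5 draw, 0.0 loss)
    n_games: Optional[int] = None,  # None: every pair once; int: that many random pairings
    k: float = 32.0,  # assumed K-factor
    initial: float = 1000.0,  # assumed starting rating
    seed: Optional[int] = None,
) -> Dict[str, float]:
    """Return a dict mapping each model name to its rating after all games."""
    if len(model_names) < 2:
        raise ValueError("need at least two players")

    rng = random.Random(seed)
    # Names are assumed to be unique here; duplicates would share one dict key.
    ratings = {name: initial for name in model_names}
    pairs = list(itertools.combinations(model_names, 2))

    if n_games is None:
        schedule = pairs
    else:
        schedule = [rng.choice(pairs) for _ in range(n_games)]

    for a, b in schedule:
        score_a = play_game(a, b)
        # The winner takes exactly the points the loser gives up (zero-sum update).
        delta = k * (score_a - expected_score(ratings[a], ratings[b]))
        ratings[a] += delta
        ratings[b] -= delta

    return ratings
```

Because each update moves the same number of points from one player to the other, the sum of all ratings stays constant, which makes for an easy sanity check in tests.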

This part on the wiki page also seems relevant for implementation:

An example may help to clarify: Suppose player A has a rating of 1613...

paulbricman · 2023-06-06T12:08:51Z

Suggestion: Test using multiple small models: distilgpt2, gpt2, gpt2-medium, for example. Actually, it should be possible to simply send in a list of e.g. three identical model names, too, right?
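One caveat with identical names, assuming the ratings come back as a dictionary keyed by model name: three copies of the same name would collapse into a single key. A possible workaround, sketched below, is to give duplicates distinct labels before rating them. The helper name `label_players` and the `name#index` label format are hypothetical.

```python
from collections import Counter
from typing import List


def label_players(model_names: List[str]) -> List[str]:
    """Append '#<index>' to names that occur more than once; leave unique names as-is."""
    totals = Counter(model_names)
    seen: Counter = Counter()
    labels = []
    for name in model_names:
        if totals[name] > 1:
            labels.append(f"{name}#{seen[name]}")
            seen[name] += 1
        else:
            labels.append(name)
    return labels
```

With this, a list of three `gpt2` entries becomes three separate players, while a list of distinct names passes through unchanged.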

paulbricman assigned y-mx May 30, 2023
