I've been experimenting with my own small BPE implementations in other programming languages.
The major bottlenecks seem to be counting the pair frequencies on each iteration and merging the most frequent pair.
I've noticed that instead taking the top two most frequent pairs and merging both of them each iteration already gives a noticeable speedup for me, with no noticeable loss of quality.
It should be possible to scale this up quite a bit before hitting diminishing returns or a drop in quality.
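A minimal sketch of the idea, assuming a byte-level list-of-ints token representation (the helper names `get_pair_counts`, `merge`, and `train_bpe_topk` are illustrative, not from this repo). The caveat is that the counts for the second pair are stale after the first merge of the pass, which is presumably where quality starts to degrade as `k` grows:

```python
from collections import Counter

def get_pair_counts(ids):
    """Count occurrences of each adjacent token pair."""
    return Counter(zip(ids, ids[1:]))

def merge(ids, pair, new_id):
    """Replace every (non-overlapping, left-to-right) occurrence of `pair` with `new_id`."""
    out, i = [], 0
    while i < len(ids):
        if i < len(ids) - 1 and (ids[i], ids[i + 1]) == pair:
            out.append(new_id)
            i += 2
        else:
            out.append(ids[i])
            i += 1
    return out

def train_bpe_topk(ids, num_merges, k=2):
    """Perform `num_merges` merges, taking the top-k pairs per counting pass.

    With k=1 this is plain BPE training; with k>1 each counting pass is
    amortized over k merges, at the cost of the later merges in a pass
    being chosen from slightly stale counts.
    """
    merges = {}
    next_id = 256  # assume a byte-level base vocabulary
    while len(merges) < num_merges:
        counts = get_pair_counts(ids)
        if not counts:
            break
        for pair, _ in counts.most_common(k):
            if len(merges) >= num_merges:
                break
            merges[pair] = next_id
            ids = merge(ids, pair, next_id)
            next_id += 1
    return ids, merges
```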
There isn't really any point in doing this. The algorithm used in this repo is quite slow and can be improved to linear time without changing the functionality.
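The key idea behind the faster approaches is to stop rescanning the whole sequence every iteration and instead update the pair counts incrementally around each merge site, since only the pairs touching a merge change. A minimal sketch of that bookkeeping, assuming the same list-of-ints representation (`merge_and_update` and `pair_counts` are hypothetical helper names, not the repo's API):

```python
from collections import Counter

def pair_counts(ids):
    """Full count of adjacent token pairs (done once, up front)."""
    return Counter(zip(ids, ids[1:]))

def merge_and_update(ids, counts, pair, new_id):
    """Merge `pair` into `new_id`, adjusting `counts` only at merge sites.

    Each merge removes the merged pair and the pairs it touched on either
    side, and adds the new pairs formed with `new_id`, so the cost per
    merge is proportional to the number of occurrences, not len(ids).
    """
    out, i = [], 0
    while i < len(ids):
        if i < len(ids) - 1 and (ids[i], ids[i + 1]) == pair:
            counts[pair] -= 1                      # the merged pair disappears
            if out:                                # pair with the previous token changes
                counts[(out[-1], ids[i])] -= 1
                counts[(out[-1], new_id)] += 1
            if i + 2 < len(ids):                   # pair with the following token changes
                counts[(ids[i + 1], ids[i + 2])] -= 1
                counts[(new_id, ids[i + 2])] += 1
            out.append(new_id)
            i += 2
        else:
            out.append(ids[i])
            i += 1
    return out
```

Combined with a priority structure over `counts` to fetch the current maximum, this avoids the O(vocab_size × text_length) recounting cost of the naive loop.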