Showing 4 changed files with 32,552 additions and 1 deletion.
@@ -0,0 +1,42 @@

# makemore

makemore is the most accessible way of tinkering with a GPT.

The one-file script `makemore.py` takes one text file as input, where each line is assumed to be one training thing, and generates more things like it. For example, we can feed it a database of names, and then use it to generate new cool baby name ideas that sound name-like but are not already existing names. Or if we feed it a database of company names, then we can generate new ideas for a company name. Or we can just feed it valid Scrabble words and generate English-like babble.

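To make the "one line, one training thing" input format concrete, here is a minimal sketch of how such a file could be split into items and mapped to integer character ids. The helper names (`load_words`, `build_vocab`, `encode`, `decode`) and the choice to reserve id 0 are assumptions for illustration, not necessarily what `makemore.py` does internally.

```python
def load_words(text):
    """Split raw file contents into one item per non-empty line."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def build_vocab(words):
    """Map each distinct character to an integer id.

    Id 0 is left unused here so it can serve as a start/stop token
    (an assumption of this sketch).
    """
    chars = sorted(set("".join(words)))
    stoi = {ch: i + 1 for i, ch in enumerate(chars)}
    itos = {i: ch for ch, i in stoi.items()}
    return stoi, itos


def encode(word, stoi):
    """Turn a string into a list of character ids."""
    return [stoi[ch] for ch in word]


def decode(ids, itos):
    """Turn a list of character ids back into a string."""
    return "".join(itos[i] for i in ids)


# A tiny stand-in for the contents of a dataset file.
sample_text = "emma\nolivia\nava\n"
words = load_words(sample_text)
stoi, itos = build_vocab(words)
print(words)
print(encode("emma", stoi))
```

With a real dataset you would read the file contents with `open(path).read()` and pass them to `load_words` in the same way.
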
Under the hood, the script trains a (character-level) Transformer, identical to the one that powers [GPT and friends]().

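"Character-level" here means the model predicts the next character given the characters so far. A common way to set that up, sketched below, is to shift each encoded item by one position so that every input position has a target. The `START_STOP` id and the `make_example` helper are hypothetical names for this sketch; the script's own data pipeline may differ in detail.

```python
START_STOP = 0  # hypothetical id reserved for the start/stop token


def make_example(ids):
    """Build (input, target) sequences for next-character prediction.

    The input starts with the start token; the target is the same
    sequence shifted left by one and ends with the stop token.
    """
    x = [START_STOP] + ids
    y = ids + [START_STOP]
    return x, y


# Character ids for a short word, e.g. as produced by some encoder.
ids = [2, 5, 5, 1]
x, y = make_example(ids)
for inp, tgt in zip(x, y):
    print(f"given {inp} -> predict {tgt}")
```

Each position in `x` is paired with the character that follows it, so one item of length n yields n + 1 prediction targets.
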
This is not meant to be a heavyweight library with switches and knobs. It's one hackable file of ~500 lines of code. [PyTorch](https://pytorch.org) is the only requirement. Go nuts.

### Usage

The included `names.txt` dataset, as an example, has the most common 32K names taken from [ssa.gov](https://www.ssa.gov/oact/babynames/) for the year 2018. It looks like:

```
emma
olivia
ava
isabella
sophia
charlotte
...
```

Let's point the script at it:

```bash
$ python makemore.py -i names.txt -o names
```

Training progress, logs, and the model will all be saved to the working directory `names`. The default model is a super tiny 200K param transformer. Many more training configurations are available; see the argparse and read the code. Training does not require any special hardware: it runs on my MacBook Air and will run on anything else, but if you have a GPU then training will fly. As training progresses, the script will print some samples throughout. However, if you'd like to sample manually, you can use the `--sample-only` flag, e.g. in a separate terminal do:

```bash
$ python makemore.py -i names.txt -o names --sample-only
```

This will load the best model so far and print more samples on demand. Have fun.

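For intuition about what "printing samples" involves, the sketch below generates items one character at a time until an end marker is drawn, then keeps only the ones not already in the training data. It deliberately swaps the Transformer for a simple bigram count table so it runs with the standard library alone; the `END` marker and the helper names are assumptions of this sketch, not the script's API.

```python
import random
from collections import defaultdict

END = "."  # hypothetical marker for the start and end of an item


def fit_bigram_counts(words):
    """Count how often each character follows another (a stand-in model)."""
    counts = defaultdict(lambda: defaultdict(int))
    for w in words:
        seq = [END] + list(w) + [END]
        for a, b in zip(seq, seq[1:]):
            counts[a][b] += 1
    return counts


def sample_word(counts, rng, max_len=20):
    """Draw characters one at a time until END is sampled or max_len is hit."""
    out = []
    prev = END
    for _ in range(max_len):
        candidates = list(counts[prev].keys())
        weights = [counts[prev][c] for c in candidates]
        nxt = rng.choices(candidates, weights=weights, k=1)[0]
        if nxt == END:
            break
        out.append(nxt)
        prev = nxt
    return "".join(out)


words = ["emma", "olivia", "ava", "isabella", "sophia", "charlotte"]
counts = fit_bigram_counts(words)
rng = random.Random(0)  # fixed seed so runs are repeatable
samples = [sample_word(counts, rng) for _ in range(5)]

# Keep only samples that are not already in the training data.
novel = [s for s in samples if s not in set(words)]
print(samples)
print(novel)
```

A trained Transformer would replace the count table with a learned distribution conditioned on the whole prefix, but the generate-until-stop loop has the same shape.
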
### License

MIT