Enable easy swapping of PyTorch models #2

john-hewitt · 2019-04-05T23:44:54Z

Right now, to test a new representation learner, one must:

1. Use the representation learner to write hidden state vectors for each token (or subword) to disk. (A better idea for subword models: decide how to combine subword representations, then write the resulting token embeddings to disk.)
2. Run the structural probe code on the hidden states saved to disk.
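
One possible way to "combine subword representations" is to average the vectors of the subwords that make up each token. The sketch below assumes mean pooling, which this issue does not prescribe, and uses plain Python lists in place of real hidden states; `pool_subwords` and its arguments are hypothetical names, not part of the existing code.

```python
from typing import List, Sequence


def pool_subwords(subword_vectors: Sequence[Sequence[float]],
                  subwords_per_token: Sequence[int]) -> List[List[float]]:
    """Average consecutive subword vectors into one vector per token.

    Assumption: mean pooling. Other choices (first subword, max, ...) would
    slot in at the line marked below.
    """
    if sum(subwords_per_token) != len(subword_vectors):
        raise ValueError("subword counts must cover every subword vector")

    token_vectors = []
    start = 0
    for count in subwords_per_token:
        if count < 1:
            raise ValueError("every token needs at least one subword")
        group = subword_vectors[start:start + count]
        # Mean over the subwords of this token, one dimension at a time.
        token_vectors.append([sum(column) / count for column in zip(*group)])
        start += count
    return token_vectors


# Toy example: the first token was split into 3 subwords, the second into 1.
vectors = [[1.0, 0.0], [3.0, 2.0], [5.0, 4.0], [7.0, 7.0]]
token_vectors = pool_subwords(vectors, [3, 1])
print(token_vectors)  # [[3.0, 2.0], [7.0, 7.0]]
```

Writing only the pooled token vectors to disk would store one vector per token rather than one per subword.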

This is "nice" in that the hidden states don't need to be recomputed on each pass (BERT is big and slow; I actually run most experiments on CPUs because probe training is so fast and CPUs are so plentiful).

However, it's "not nice" that one can't swap representation model parameters on the fly, and especially that the huge vectors take up a lot of disk space (115GB for BERT-large on PTB WSJ train, 40k sentences).

We'd like to enable easy swapping of new models by defining a new class in model.py. We'll need to read in the tokenizer (and perhaps the subword tokenizer) so that we pass the model words as identified by its own vocabulary and can map from subword representations back to token representations. There's also the problem of the inefficiency of aligning subword representations to token representations at every batch.
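
As a rough illustration of what such a class might look like, here is a minimal sketch. Everything in it is hypothetical: the class name, the two callables it takes (a subword tokenizer and an encoder), and the choice of mean pooling are assumptions made for illustration, not the existing interface of model.py. It caches the token-to-subword alignment per sentence, which is one possible way to avoid redoing the alignment at every batch.

```python
from functools import lru_cache
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


class OnTheFlyRepresentationModel:
    """Hypothetical wrapper: takes tokens, returns one vector per token."""

    def __init__(self,
                 subword_tokenize: Callable[[str], List[str]],
                 encode_subwords: Callable[[List[str]], List[Vector]]):
        # Both callables are stand-ins for a real tokenizer and a real model.
        self.subword_tokenize = subword_tokenize
        self.encode_subwords = encode_subwords
        # Cache alignments so a sentence seen again (e.g. in the next epoch)
        # is not re-tokenized and re-aligned.
        self._align = lru_cache(maxsize=None)(self._compute_alignment)

    def _compute_alignment(
            self, tokens: Tuple[str, ...]
    ) -> Tuple[Tuple[str, ...], Tuple[int, ...]]:
        pieces: List[str] = []
        counts: List[int] = []
        for token in tokens:
            token_pieces = self.subword_tokenize(token)
            if not token_pieces:
                raise ValueError(f"tokenizer returned no subwords for {token!r}")
            pieces.extend(token_pieces)
            counts.append(len(token_pieces))
        return tuple(pieces), tuple(counts)

    def __call__(self, tokens: Sequence[str]) -> List[Vector]:
        pieces, counts = self._align(tuple(tokens))
        subword_vectors = self.encode_subwords(list(pieces))
        token_vectors, start = [], 0
        for count in counts:
            group = subword_vectors[start:start + count]
            # Assumption: mean-pool the subwords belonging to each token.
            token_vectors.append([sum(col) / count for col in zip(*group)])
            start += count
        return token_vectors


# Toy stand-ins so the sketch runs without a real model.
def toy_tokenize(word: str) -> List[str]:
    return [word[:3]] + ["##" + word[i:i + 3] for i in range(3, len(word), 3)]


def toy_encode(pieces: List[str]) -> List[Vector]:
    return [[float(len(piece)), 1.0] for piece in pieces]


model = OnTheFlyRepresentationModel(toy_tokenize, toy_encode)
reprs = model(["probing", "is", "fun"])
print(len(reprs))  # 3, one vector per input token
```

Swapping in a different representation learner would then mean passing a different pair of callables, with the probe code unchanged. In a real implementation the encoder would return tensors and the pooling would likely be batched; the plain lists here only keep the example self-contained.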


john-hewitt added the "enhancement" (New feature or request) label Apr 5, 2019
