Releases: TransformerLensOrg/TransformerLens
v1.3.0
What's Changed
- fix outdated link in Exploratory Analysis Demo by @daspartho in #259
- Finish patching docs by @ckkissane in #261
- Fix from_pretrained with redwood_attn_2l by @ArthurConmy in #268
- Added list of demos to tutorial section. by @JayBaileyCS in #263
- Improving head detector by @MatthewBaggins in #255
- Optimize imports in HookedTransformer by @rusheb in #260
- Baidicoot main - Implemented functionality for loading mingpt-style models off HF (e.g. othello-gpt) by @jbloomAus in #272
- Upgrade to typeguard 3 by @dkamm in #269
- Install autoformatting tools and add formatting checks to CI by @rusheb in #270
- Add TransformerLens logo to docs and GitHub by @koayon in #273
- Wrap docstrings and comments in HookedTransformer by @luciaquirke in #274
- Format array in test_transformer_lens.py by @rusheb in #275
- Introducing HookedEncoder by @rusheb in #276
- Add tests for tokenization methods by @Aprillion in #280
- Fix broken link in issue template by @rusheb in #278
- Various memory solutions. Ultimately used gc to "hide" memory issue which should be solved soon. by @jbloomAus in #296
- FactoredMatrix getitem (#224) by @glerzing in #295
- Add tiny stories by @Felhof in #292
- from_pretrained custom parameters (#288) by @glerzing in #298
- Add better __name__ annotation to full_hooks by @ArthurConmy in #302
- Multiple minor corrections by @glerzing in #301
- Add get_basic_config util function by @adamyedidia in #294
- Fix bug: HookedEncoder not being moved to GPU by @rusheb in #307
- Fix tokenization tests on GPU by @rusheb in #308
- Add prepend option to model.add_hook by @ArthurConmy in #303
- Fix tiny stories model names by @Felhof in #305
- Add hook_mlp_in by @ArthurConmy in #313
- Ignore some functions in the documentation (#310) by @glerzing in #312
- Add assertion to refactor_factored_attn_matrices by @ArthurConmy in #320
- Update evals.py to not directly call cuda, instead have default cuda … by @dennis-akar in #324
- Add SVD interpretability feature to TransformerLens by @JayBaileyCS in #311
- Fix svd tests on GPU by @slavachalnev in #330
- Reduce memory use when loading model by @slavachalnev in #327
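The prepend option added to model.add_hook in #303 concerns the order in which hooks run. A minimal sketch of the idea follows, assuming hooks are kept in an ordered list and applied in sequence. TinyHookPoint is a hypothetical stand-in written for illustration, not the TransformerLens class, and the real method may take further arguments.

```python
from typing import Callable, List


class TinyHookPoint:
    """Hypothetical stand-in for a hook point; not the TransformerLens class."""

    def __init__(self) -> None:
        self.hooks: List[Callable[[float], float]] = []

    def add_hook(self, hook: Callable[[float], float], prepend: bool = False) -> None:
        # prepend=True places the hook ahead of every hook already registered.
        if prepend:
            self.hooks.insert(0, hook)
        else:
            self.hooks.append(hook)

    def __call__(self, value: float) -> float:
        # Hooks run in list order, each receiving the previous hook's output.
        for hook in self.hooks:
            value = hook(value)
        return value


point = TinyHookPoint()
point.add_hook(lambda x: x + 1)                  # registered first
point.add_hook(lambda x: x * 10, prepend=True)   # registered second, runs first
result = point(2)  # computed as (2 * 10) + 1
```

Without prepend=True the same two hooks would run in registration order and give (2 + 1) * 10 instead.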
New Contributors
- @MatthewBaggins made their first contribution in #255
- @koayon made their first contribution in #273
- @luciaquirke made their first contribution in #274
- @Aprillion made their first contribution in #280
- @glerzing made their first contribution in #295
- @Felhof made their first contribution in #292
- @dennis-akar made their first contribution in #324
Full Changelog: v1.2.2...v1.3.0
v1.2.2
What's Changed
Too many commit messages so let's summarise them.
General Features
- Pipeline Parallelism
- Cache now doesn't move tensors across devices unless told to
New Models:
- Redwood 2L
- New Pythia Models
- LLaMA
Analysis Features:
- Add apply_ln to stack_head_results and stack_neuron_results
- Context Manager for Hooks
- Attention Head Detectors
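To illustrate what a context manager for hooks buys you, here is a rough sketch in plain Python. It assumes hooks live in a simple name-to-list registry. The names registry, temporary_hooks and "example_hook" are hypothetical and chosen for illustration only, so this is not the library's implementation.

```python
from contextlib import contextmanager
from typing import Callable, Dict, Iterator, List

# Hypothetical registry mapping a hook name to its active hook functions.
registry: Dict[str, List[Callable]] = {}


@contextmanager
def temporary_hooks(hooks: Dict[str, Callable]) -> Iterator[None]:
    """Attach hooks on entry and detach them on exit, even if the body raises."""
    for name, fn in hooks.items():
        registry.setdefault(name, []).append(fn)
    try:
        yield
    finally:
        # The finally clause runs on normal exit and on exceptions alike.
        for name, fn in hooks.items():
            registry[name].remove(fn)


def double(x):
    return 2 * x


with temporary_hooks({"example_hook": double}):
    active_inside = len(registry["example_hook"])  # hook is attached here

active_after = len(registry["example_hook"])  # hook has been detached
```

The point of the pattern is that hooks cannot leak out of the with block, which keeps later runs of the model unaffected by earlier experiments.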
Thanks to all the Contributors!
Many thanks to: @rusheb, @ckkissane, @slavachalnev, @JayBaileyCS, @zshn-gvg, @jbloomAus, @adzcai, @adamyedidia, @ArthurConmy, @bryce13950, @daspartho, @haileyschoelkopf, @0amp
Full Changelog: v1.2.1...v1.2.2
v1.2.1
New minor release with a variety of improvements relating to testing, documentation and development. The transition from torchtyping to jaxtyping is one of the most significant changes.
What's Changed
- Replace torchtyping with jaxtyping by @dkamm in #171
- Run poetry lock by @rusheb in #178
- Add verbose flag to disable tqdm on model.generate(...) by @afspies in #185
- Make tracr plot show outside on colab by @ArthurConmy in #184
- Add positional_embedding_type to model properties table by @ckkissane in #176
- Run poetry lock --check in CI by @rusheb in #182
- Slice: doc and tests by @Xmaster6y in #166
- Separate tests into unit and acceptance tests by @rusheb in #191
- Grokking demo by @neelnanda-io in #193
- Configure coverage reports to measure branch coverage by @rusheb in #192
- Silence DeprecationWarning for distutils by @rusheb in #187
- Clone pos embed by @slavachalnev in #194
- Add pos embed hook tests by @slavachalnev in #196
- Fix test command in Readme by @valedan in #197
- Test constructor of FactoredMatrix by @rusheb in #188
- Add helper for logit attribution by @dkamm in #135
- issue and pr templates by @jbloomAus in #203
New Contributors
- @Xmaster6y made their first contribution in #166
- @slavachalnev made their first contribution in #194
- @valedan made their first contribution in #197
Full Changelog: v1.2...v1.2.1
v1.2.0
What's Changed
- Test cache names by @neelnanda-io in #169
- Implement from_pretrained_no_processing by @lukasberglund in #161
- Minor Readme Changes by @jbloomAus in #170
- Add QKV split by @ArthurConmy in #158
- Add gotcha to HookedTransformer.to_str_tokens docstring by @rusheb in #173
- Test coverage reports by @jbloomAus in #172
- Sphinx documentation Solving issue #132 by @jbloomAus in #174
- retarget pages at main pushes by @jbloomAus in #175
New Contributors
- @lukasberglund made their first contribution in #161
- @jbloomAus made their first contribution in #170
- @rusheb made their first contribution in #173
Full Changelog: v1.1.1...v1.2
v1.1.1
What's Changed
- Created a demo for attribution patching with minor bug fixes to demos and code on activation patching by @neelnanda-io in #168
Full Changelog: v1.1.0...v1.1.1
v1.1.0
New release with a bunch of quality-of-life improvements, including activation patching utilities and early stopping.
What's Changed
- Tracr demo by @neelnanda-io in #142
- Added option to stop running the model at an earlier layer by @neelnanda-io in #143
- Arthur/loss per token by @neelnanda-io in #144
- Make project versioning clearer by @alan-cooney in #146
- let 'get_dataset' function pass kwargs by @afspies in #141
- Fix deps by @jas-ho in #149
- Fix OPT BOS Prepending issue by @afspies in #154
- Induction heads phase changes demo by @ckkissane in #148
- Add helper enum for torch typing by @dkamm in #145
- In run_with_hooks, remove hooks, even when an error is thrown. by @joelburget in #156
- Correct incorrect equation by @epurdy in #159
- add hook tokens by @callummcdougall in #147
- Update pythia-19m to 70m by @ArthurConmy in #162
- Added Utilities for Activation Patching + A Demo of how to use them by @neelnanda-io in #165
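The early-stopping option from #143 can be pictured as running only a prefix of the layer stack. The sketch below is a toy version in plain Python. The function run_layers, the parameter name stop_at_layer and its exclusive semantics are assumptions made for illustration, so check the library documentation for the actual interface.

```python
from typing import Callable, List, Optional


def run_layers(
    x: float,
    layers: List[Callable[[float], float]],
    stop_at_layer: Optional[int] = None,
) -> float:
    """Apply layers in order, stopping before index stop_at_layer if given."""
    # Slicing with None keeps every layer, so the default runs the full stack.
    for layer in layers[:stop_at_layer]:
        x = layer(x)
    return x


layers = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3]
full = run_layers(1.0, layers)                      # all three layers
partial = run_layers(1.0, layers, stop_at_layer=2)  # layers 0 and 1 only
```

Stopping early avoids computing layers whose outputs are not needed, which is useful when only intermediate activations are of interest.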
New Contributors
- @afspies made their first contribution in #141
- @jas-ho made their first contribution in #149
- @ckkissane made their first contribution in #148
- @dkamm made their first contribution in #145
- @epurdy made their first contribution in #159
- @callummcdougall made their first contribution in #147
Full Changelog: v1.0.0...v1.1.0
v1.0.0
Creating a new version to represent the library being fairly stable!
What's Changed
- Add devcontainer support by @alan-cooney in #122
- Added sample_datapoint, get_dataset and token truncation by default by @neelnanda-io in #131
- Added code to load in fine-tuned versions of SoLU 1L and SoLU 4L on 4.8B tokens of wikipedia by @neelnanda-io in #133
- Implement permanent hooks by @derpyplops in #117
- Typing Checks by @neelnanda-io in #137
- Bump ws-action-parse-semver by @alan-cooney in #140
New Contributors
- @derpyplops made their first contribution in #117
Full Changelog: v0.2.0...v1.0.0
v0.2.0
Initial release to PyPI