Authors: @edwardyang12 @catherinelee274
We like to read machine learning papers (roughly) every week! We take turns leading the discussion.
We follow this structure:
- What the paper introduces
- Data
- Training/Finetuning
- Evaluation
- Discussion
Papers we've read:
- Formal Algorithms for Transformers
- The Power of Scale for Parameter-Efficient Prompt Tuning
- FlashAttention
- LoRA: Low-Rank Adaptation of Large Language Models (need to insert date)
- Big Bird
- NeRF
- Retrieval-Augmented Generation
- Survey on Deep Reinforcement Learning
- Latent Diffusion Models 01/07/2023
- Mixture of Experts
- Direct Preference Optimization
- MAPPO
- Mamba
- Contrastive Language-Image Pretraining (CLIP) 03/17/2024
- DeepSpeed Inference 03/24/2024
- LLaMA 03/31/2024
- Multimodal Learning with Transformers 04/14/2024
- Chinchilla 05/05/2024
- Accessing GPT-4 level Mathematical Olympiad Solutions via Monte Carlo Tree Self-refine with LLaMa-3 8B: A Technical Report 06/30/2024