📌 Submission for the C4AI Scholars Program Take-Home Challenge (Sept 2024)
The project consists of debugging, training, and optimizing a SmolLM-135M model. The challenge is divided into three parts:
- Bug Fixing: Identify and fix 10 bugs in the provided implementation that builds a SmolLM-135M model from scratch.
- SFT + DPO: Fine-tune the provided model on the Grammarly CoEdIT dataset for grammatical error correction, create datasets for DPO, and implement DPO.
- DPO Variants Exploration: Implement and evaluate an alternative Direct Preference Optimization (DPO) variant to improve model performance.
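
For readers unfamiliar with the DPO step mentioned above, here is a minimal sketch of the standard DPO loss in plain Python. It is not the code from this submission: the function name `dpo_loss`, its argument names, and the default `beta=0.1` are assumptions chosen for illustration, and the inputs are assumed to be per-example summed log-probabilities of the chosen and rejected responses under the policy and the frozen reference model.

```python
import math


def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Mean DPO loss over a batch of preference pairs (illustrative sketch).

    Each argument is a list of floats: the summed log-probability of a
    response under the policy or the reference model. `beta` (assumed
    default) scales how strongly the policy is pushed from the reference.
    """
    losses = []
    for pc, pr, rc, rr in zip(policy_chosen_logps, policy_rejected_logps,
                              ref_chosen_logps, ref_rejected_logps):
        # Difference of log-ratios: how much more the policy prefers the
        # chosen response over the rejected one, relative to the reference.
        x = beta * ((pc - rc) - (pr - rr))
        # -log(sigmoid(x)) == log(1 + exp(-x)), written in a stable form.
        losses.append(max(-x, 0.0) + math.log1p(math.exp(-abs(x))))
    return sum(losses) / len(losses)


# When the policy matches the reference, the loss is log(2).
baseline = dpo_loss([-1.0], [-2.0], [-1.0], [-2.0])

# When the policy favors the chosen response more than the reference does,
# the loss drops below that baseline.
improved = dpo_loss([-0.5], [-3.0], [-1.0], [-2.0])
```

In a real training loop the same expression would typically be computed on tensors so gradients flow through the policy log-probabilities; the list-based version here only shows the arithmetic.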