-
Notifications
You must be signed in to change notification settings - Fork 43
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove system prompt from data generation #96
Comments
+1, would be good to introduce system role during the data mixing phase which prepares the dataset for training - this makes it a tad bit cleaner to understand - as the system role is only applicable to training |
This issue has been automatically marked as stale because it has not had activity within 90 days. It will be automatically closed if no further activity occurs within 30 days. |
Unmarking as stale, because this is still relevant and as we more cleanly separate data generation from mixing that would be the time to ensure we aren't passing around system prompts during generation. |
…actions/rojopolis/spellcheck-github-actions-0.38.0 Bump rojopolis/spellcheck-github-actions from 0.37.0 to 0.38.0
We're closer to removing this, but the system prompt is still used for the legacy test_*.jsonl file we generate as well as adding to all samples during the data mixing phase. |
Remove system prompt from data generation and will be re-introduced in the mixing phase.
sdg/src/instructlab/sdg/generate_data.py
Line 38 in b28a12b
The text was updated successfully, but these errors were encountered: