Home
Name | Train dataset | Test dataset |
---|---|---|
Sentiment Analysis (SA) | FinGPT/fingpt-sentiment-train | fpb, fiqa, tfns, nwgi |
Named Entity Recognition (NER) | FinGPT/fingpt-ner-cls (train split) | FinGPT/fingpt-ner-cls (test split) |
Headline classification | FinGPT/fingpt-headline (train split) | FinGPT/fingpt-headline (test split) |
XBRL tag extraction | XBRL QA dataset (train split) | XBRL QA dataset (test split) |
XBRL value extraction | XBRL QA dataset (train split) | XBRL QA dataset (test split) |
XBRL formula calculation | XBRL QA dataset (train split) | XBRL QA dataset (test split) |
- fpb: takala/financial_phrasebank
- fiqa: pauri32/fiqa-2018
- nwgi: oliverwang15/news_with_gpt_instructions
- tfns: zeroshot/twitter-financial-news-sentiment
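All of the datasets above are hosted on the Hugging Face Hub. A minimal loading sketch with the `datasets` library (dataset IDs are from the tables above; the config and split names are assumptions):

```python
from datasets import load_dataset

# Sentiment analysis train set from the table above
train_ds = load_dataset("FinGPT/fingpt-sentiment-train", split="train")

# fpb maps to takala/financial_phrasebank; the "sentences_50agree" config
# is an assumption -- the dataset ships several agreement-level configs.
fpb = load_dataset("takala/financial_phrasebank", "sentences_50agree", split="train")

print(train_ds[0])  # instruction / input / output fields
print(fpb[0])       # sentence plus an integer sentiment label
```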
- Prepare train/test splits of the XBRL datasets for all three XBRL tasks
- Run initial testing on the three XBRL tasks with the 8B and 70B base models
- Fine-tune 8B (4-bit and 8-bit quantization) and 70B (4-bit quantization) on the three XBRL train datasets concatenated together (a setup sketch follows this list)
- Evaluate the base and fine-tuned models on the XBRL test datasets
- Record inference speed and resource usage
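For reference, here is a minimal sketch of the quantized-LoRA setup implied by the model names in the results tables below (the `-4bits`/`-8bits` and `-r4`/`-r8` suffixes). The LoRA target modules and hyperparameters here are assumptions, not the exact training recipe:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

base = "meta-llama/Llama-3.1-8B-Instruct"

# 4-bit NF4 quantization via bitsandbytes (the "-4bits" variants below);
# swap in load_in_8bit=True for the "-8bits" variants.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(
    base, quantization_config=bnb_config, device_map="auto"
)

# LoRA adapter; r=4 / r=8 correspond to the "-r4" / "-r8" suffixes.
# target_modules is an assumption; attention projections are a common choice.
lora = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # only the adapter weights are trainable
```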
Model | fiqa | fpb | headline | ner | nwgi | tfns |
---|---|---|---|---|---|---|
state-spaces/mamba-130m-hf | 0.0734 | 0.1841 | 0.0185 | - | - | - |
meta-llama/Llama-3.1-8B-Instruct | 0.4655 | 0.6873 | 0.4534 | 0.4889 | 0.4658 | 0.6997 |
meta-llama/Llama-3.1-8B-Instruct-4bits-r4 | 0.7309 | 0.8630 | - | - | 0.8095 | 0.8827 |
meta-llama/Llama-3.1-8B-Instruct-4bits-r8 | - | - | - | - | - | - |
meta-llama/Llama-3.1-8B-Instruct-8bits-r4 | - | - | - | - | - | - |
meta-llama/Llama-3.1-8B-Instruct-8bits-r8 | 0.8036 | 0.8284 | 0.8334 | 0.9103 | 0.8396 | 0.8405 |
Model | fiqa | fpb | headline | ner | nwgi | tfns |
---|---|---|---|---|---|---|
state-spaces/mamba-130m-hf | 0.1328 | 0.1711 | 0.0205 | - | - | - |
meta-llama/Llama-3.1-8B-Instruct | 0.5571 | 0.6768 | 0.5576 | 0.5686 | 0.4117 | 0.6834 |
meta-llama/Llama-3.1-8B-Instruct-4bits-r4 | 0.7811 | 0.8600 | - | - | 0.8029 | 0.8824 |
meta-llama/Llama-3.1-8B-Instruct-4bits-r8 | - | - | - | - | - | - |
meta-llama/Llama-3.1-8B-Instruct-8bits-r4 | - | - | - | - | - | - |
meta-llama/Llama-3.1-8B-Instruct-8bits-r8 | 0.8177 | 0.8302 | 0.8402 | 0.9095 | 0.8492 | 0.8436 |
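As context for how per-dataset scores like the ones above can be produced, here is a minimal exact-match evaluation sketch on the tfns test set. The prompt wording, label mapping, and answer parsing are assumptions; the actual evaluation harness may differ:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from datasets import load_dataset

model_id = "meta-llama/Llama-3.1-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# tfns maps to zeroshot/twitter-financial-news-sentiment (see list above)
test = load_dataset("zeroshot/twitter-financial-news-sentiment", split="validation")
labels = {0: "bearish", 1: "bullish", 2: "neutral"}  # assumed label order

correct = 0
for row in test:
    prompt = (
        "What is the sentiment of this tweet? "
        "Answer with bearish, bullish, or neutral.\n"
        f"{row['text']}\nAnswer:"
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=5, do_sample=False)
    answer = tokenizer.decode(
        out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    ).strip().lower()
    correct += int(answer.startswith(labels[row["label"]]))

print(f"accuracy: {correct / len(test):.4f}")
```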
- Fine-tuning VRAM usage, GPU hours, and number of trainable parameters (see the measurement sketch after this list)
- Model size and adapter size
- Evaluation inference speed
- Training/eval loss convergence plots
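A minimal sketch of how the size and speed metrics in this list could be collected with plain PyTorch (the helper names are illustrative, not an existing API):

```python
import time
import torch

def count_parameters(model):
    """Total vs. trainable parameters; for a LoRA model the trainable
    count approximates the adapter size."""
    total = sum(p.numel() for p in model.parameters())
    trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
    return total, trainable

def measure_generation(model, tokenizer, prompt, max_new_tokens=64):
    """Peak VRAM (GiB) and tokens/sec for a single greedy generation."""
    torch.cuda.reset_peak_memory_stats()
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    start = time.perf_counter()
    out = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    elapsed = time.perf_counter() - start
    new_tokens = out.shape[1] - inputs["input_ids"].shape[1]
    peak_gib = torch.cuda.max_memory_allocated() / 1024**3
    return peak_gib, new_tokens / elapsed
```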
Name | # of parameters |
---|---|
meta-llama/Llama-3.1-8B-Instruct | 8B |
meta-llama/Llama-3.1-70B-Instruct | 70B |