
Tasks

| Name | Train dataset | Test dataset |
| --- | --- | --- |
| Sentiment Analysis (SA) | FinGPT/fingpt-sentiment-train | fpb, fiqa, tfns, nwgi |
| Named Entity Recognition (NER) | FinGPT/fingpt-ner-cls (train split) | FinGPT/fingpt-ner-cls (test split) |
| Headline classification | FinGPT/fingpt-headline (train split) | FinGPT/fingpt-headline (test split) |
| XBRL tag extraction | XBRL QA dataset (train split) | XBRL QA dataset (test split) |
| XBRL value extraction | XBRL QA dataset (train split) | XBRL QA dataset (test split) |
| XBRL formula calculation | XBRL QA dataset (train split) | XBRL QA dataset (test split) |
  • fpb: takala/financial_phrasebank
  • fiqa: pauri32/fiqa-2018
  • nwgi: oliverwang15/news_with_gpt_instructions
  • tfns: zeroshot/twitter-financial-news-sentiment
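
As a quick sketch of how these datasets can be loaded, assuming the Hugging Face `datasets` library; the dataset IDs come from the table above, but the split name for the tfns test set is an assumption:

```python
from datasets import load_dataset

# Sentiment-analysis training set.
sa_train = load_dataset("FinGPT/fingpt-sentiment-train", split="train")

# NER and headline tasks ship with explicit train/test splits.
ner_train = load_dataset("FinGPT/fingpt-ner-cls", split="train")
ner_test = load_dataset("FinGPT/fingpt-ner-cls", split="test")

# One of the sentiment test sets (tfns); the "validation" split name is an assumption.
tfns_test = load_dataset("zeroshot/twitter-financial-news-sentiment", split="validation")
```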

Todo list

  • Prepare train/test splits of the XBRL QA dataset for all three XBRL tasks
  • Run initial tests of the three XBRL tasks on the 8B and 70B models
  • Fine-tune 8B (4-bit and 8-bit quantization) and 70B (4-bit quantization) on the three XBRL train datasets concatenated together (see the sketch after this list)
  • Evaluate the base and fine-tuned models on the XBRL test datasets
  • Record inference speed and resource usage
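
A minimal sketch of that quantized-LoRA fine-tuning setup, assuming the `transformers`, `peft`, `bitsandbytes`, and `datasets` libraries; the rank and quantization settings mirror the 4bits-r4 naming in the results below, and the XBRL dataset IDs are hypothetical placeholders:

```python
import torch
from datasets import concatenate_datasets, load_dataset
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Concatenate the three XBRL train splits into one fine-tuning set.
# The dataset IDs below are hypothetical placeholders.
xbrl_train = concatenate_datasets([
    load_dataset("FinGPT/xbrl-tag-extraction", split="train"),
    load_dataset("FinGPT/xbrl-value-extraction", split="train"),
    load_dataset("FinGPT/xbrl-formula-calculation", split="train"),
])

# 4-bit NF4 quantization, matching the "4bits" model variants below.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-8B-Instruct",
    quantization_config=bnb_config,
    device_map="auto",
)

# LoRA rank 4, matching the "r4" variants; the target modules are an assumption.
lora_config = LoraConfig(
    r=4,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```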

Results

Accuracy

| Model | fiqa | fpb | headline | ner | nwgi | tfns |
| --- | --- | --- | --- | --- | --- | --- |
| state-spaces/mamba-130m-hf | 0.0734 | 0.1841 | 0.0185 | - | - | - |
| meta-llama/Llama-3.1-8B-Instruct | 0.4655 | 0.6873 | 0.4534 | 0.4889 | 0.4658 | 0.6997 |
| meta-llama/Llama-3.1-8B-Instruct-4bits-r4 | 0.7309 | 0.8630 | - | - | 0.8095 | 0.8827 |
| meta-llama/Llama-3.1-8B-Instruct-4bits-r8 | - | - | - | - | - | - |
| meta-llama/Llama-3.1-8B-Instruct-8bits-r4 | - | - | - | - | - | - |
| meta-llama/Llama-3.1-8B-Instruct-8bits-r8 | 0.8036 | 0.8284 | 0.8334 | 0.9103 | 0.8396 | 0.8405 |

F1 Score

| Model | fiqa | fpb | headline | ner | nwgi | tfns |
| --- | --- | --- | --- | --- | --- | --- |
| state-spaces/mamba-130m-hf | 0.1328 | 0.1711 | 0.0205 | - | - | - |
| meta-llama/Llama-3.1-8B-Instruct | 0.5571 | 0.6768 | 0.5576 | 0.5686 | 0.4117 | 0.6834 |
| meta-llama/Llama-3.1-8B-Instruct-4bits-r4 | 0.7811 | 0.8600 | - | - | 0.8029 | 0.8824 |
| meta-llama/Llama-3.1-8B-Instruct-4bits-r8 | - | - | - | - | - | - |
| meta-llama/Llama-3.1-8B-Instruct-8bits-r4 | - | - | - | - | - | - |
| meta-llama/Llama-3.1-8B-Instruct-8bits-r8 | 0.8177 | 0.8302 | 0.8402 | 0.9095 | 0.8492 | 0.8436 |
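
For reference, metrics like the ones above can be computed from per-example predictions along these lines, assuming scikit-learn and string-label outputs; the weighted F1 averaging mode is an assumption:

```python
from sklearn.metrics import accuracy_score, f1_score

# Gold labels and model outputs for one test set (illustrative values).
y_true = ["positive", "negative", "neutral", "negative"]
y_pred = ["positive", "neutral", "neutral", "negative"]

accuracy = accuracy_score(y_true, y_pred)
# Weighted F1 accounts for class imbalance across labels.
f1 = f1_score(y_true, y_pred, average="weighted")
print(f"acc={accuracy:.4f}  f1={f1:.4f}")
```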

Future data

  • Fine-tuning VRAM usage, GPU hours, and number of trainable parameters
  • Model size and adapter size
  • Evaluation inference speed (see the sketch after this list)
  • Training/evaluation loss convergence plots
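
A sketch of how the inference-speed and VRAM numbers could be collected, assuming PyTorch on a CUDA device; the `measure_inference` helper is hypothetical:

```python
import time
import torch

def measure_inference(model, tokenizer, prompt, max_new_tokens=64):
    """Report tokens/second and peak VRAM for one generation call (hypothetical helper)."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    torch.cuda.reset_peak_memory_stats()
    torch.cuda.synchronize()
    start = time.perf_counter()
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    torch.cuda.synchronize()
    elapsed = time.perf_counter() - start
    new_tokens = output.shape[-1] - inputs["input_ids"].shape[-1]
    peak_gib = torch.cuda.max_memory_allocated() / 1024 ** 3
    print(f"{new_tokens / elapsed:.1f} tokens/s, peak VRAM {peak_gib:.2f} GiB")
```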

Models

| Name | # of parameters |
| --- | --- |
| meta-llama/Llama-3.1-8B-Instruct | 8B |
| meta-llama/Llama-3.1-70B-Instruct | 70B |