Model Card for gherke/mistral-7b-quantized-lora-finetuned
This is a financial sentiment analysis model, fine-tuned to return a sentiment score between -1 (very negative) and 1 (very positive). It was specifically trained to analyze financial news and assess its impact on financial market trends.
Model Details
Model Description
This is a quantized version of the Mistral-7B model, fine-tuned using the LoRA (Low-Rank Adaptation) technique for sentiment analysis tasks. The model was trained on the takala/financial_phrasebank dataset to detect sentiment related to economic or market-relevant information. The output is a single sentiment score between -1 (very negative) and 1 (very positive).
- Developed by: Gabriella Herke
- Funded by [optional]: [More Information Needed]
- Shared by [optional]: Gabriella Herke
- Model type: Causal Language Model fine-tuned for Sentiment Analysis
- Language(s) (NLP): English
- License: [More Information Needed]
- Finetuned from model [optional]: thesven/Mistral-7B-Instruct-v0.3-GPTQ
Model Sources [optional]
- Repository: Link to the Hugging Face Model Repository
- Paper [optional]: [More Information Needed]
- Demo [optional]: [More Information Needed]
Uses
Direct Use
The model can be used for analyzing financial news and producing a sentiment score that indicates the potential impact on financial market trends.
Downstream Use [optional]
The model can be incorporated into larger financial analysis pipelines or trading bots to assess market sentiment.
Out-of-Scope Use
The model should not be used for general sentiment analysis outside of the financial context, as it was specifically trained on financial news.
Bias, Risks, and Limitations
The model is limited by the nature of its training dataset (takala/financial_phrasebank), which may not be representative of all financial scenarios or market conditions. It may produce biased results if applied to other sectors.
Recommendations
Users (both direct and downstream) should be aware of the risks, biases, and limitations of the model. Careful consideration is needed when applying the sentiment scores in automated trading decisions, as biases in the data can lead to incorrect assessments.
How to Get Started with the Model
Use the code below to get started with the model:
from transformers import AutoModelForCausalLM, AutoTokenizer
# Hugging Face Hub ID of the fine-tuned model
model_name = "gherke/mistral-7b-quantized-lora-finetuned"
# Load the model weights
model = AutoModelForCausalLM.from_pretrained(model_name)
# Load the matching tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_name)
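Because this is a causal language model, the sentiment score comes back as generated text rather than as a number. This card does not document the prompt or output format, so the following is only a minimal sketch that assumes the decoded completion contains the score as a plain decimal such as "-0.75". The helper name `parse_sentiment_score` is hypothetical and not part of the model's API.

```python
import re
from typing import Optional

# Matches the first signed decimal number, e.g. "-0.75", "0.4" or "1".
_NUMBER = re.compile(r"[-+]?\d*\.?\d+")


def parse_sentiment_score(generated_text: str) -> Optional[float]:
    """Return the first number found in the text, clamped to [-1, 1].

    Returns None when the text contains no number. Assumes the text is
    only the model's completion, not the prompt or the news item itself.
    """
    match = _NUMBER.search(generated_text)
    if match is None:
        return None
    value = float(match.group())
    # Clamp so that downstream code can rely on the documented range.
    return max(-1.0, min(1.0, value))


# Hypothetical completions, used here only to exercise the parser.
print(parse_sentiment_score("Sentiment: -0.75"))
print(parse_sentiment_score("no score given"))
```

Clamping guards against a completion that drifts outside the documented range, and returning None lets callers decide how to handle an unparseable answer.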
Training Details
Training Data
The model was fine-tuned on the takala/financial_phrasebank dataset, which contains financial news phrases labeled for sentiment (positive, negative, neutral).
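The dataset provides three categorical labels, while the model outputs a score between -1 and 1. This card does not say how labels were converted into target scores, so the mapping below is an assumption for illustration: negative to -1, neutral to 0, positive to 1. The integer class order (0 = negative, 1 = neutral, 2 = positive) is likewise assumed and should be checked against the dataset's own label definition.

```python
# Assumed class order; verify against the dataset before relying on it.
LABEL_NAMES = ("negative", "neutral", "positive")

# Assumed mapping from categorical label to numeric target score.
LABEL_TO_SCORE = {"negative": -1.0, "neutral": 0.0, "positive": 1.0}


def label_to_score(label) -> float:
    """Convert a label name or an integer class id to a score in [-1, 1]."""
    if isinstance(label, int):
        label = LABEL_NAMES[label]
    return LABEL_TO_SCORE[label]


print([label_to_score(name) for name in LABEL_NAMES])
```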
Training Procedure
Preprocessing [optional]
The financial news phrases were tokenized and preprocessed using the Hugging Face tokenizer, with truncation applied for long texts.
Training Hyperparameters
- Training regime: 4-bit quantized training with LoRA adaptation
- Learning Rate: 2e-4
- Batch Size: 8
- Number of Epochs: 20
Speeds, Sizes, Times [optional]
Training was performed using an 8-bit paged AdamW optimizer, with gradient accumulation steps set to 4.
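The hyperparameters above can be collected as follows. This is a sketch, not the author's training script: the keys follow the argument names of `transformers.TrainingArguments`, and reading "Batch Size: 8" as the per-device batch size is an assumption. The helper `effective_batch_size` is hypothetical and only shows how gradient accumulation changes the number of samples per optimizer step.

```python
# Values taken from this card; key names mirror transformers.TrainingArguments.
training_kwargs = {
    "learning_rate": 2e-4,
    "per_device_train_batch_size": 8,  # assumption: the stated batch size is per device
    "num_train_epochs": 20,
    "gradient_accumulation_steps": 4,
    "optim": "paged_adamw_8bit",       # 8-bit paged AdamW optimizer
}


def effective_batch_size(kwargs: dict, num_devices: int = 1) -> int:
    """Samples contributing to each optimizer step."""
    return (
        kwargs["per_device_train_batch_size"]
        * kwargs["gradient_accumulation_steps"]
        * num_devices
    )


# With a single GPU, 8 samples x 4 accumulation steps = 32 per update.
print(effective_batch_size(training_kwargs))
```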
Evaluation
Testing Data, Factors & Metrics
Testing Data
The model was evaluated on the test split of the takala/financial_phrasebank dataset.
Factors
The evaluation measured the model's ability to predict sentiment labels accurately in financial contexts.
Metrics
Mean Squared Error (MSE) and correlation with human-labeled sentiment scores were used to evaluate model performance.
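Both metrics can be computed directly from paired lists of reference and predicted scores. The sketch below uses only the standard library. The card does not state which correlation coefficient was used, so Pearson correlation is assumed here, and the function names are illustrative.

```python
import math
from typing import Sequence


def mean_squared_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average of squared differences between reference and predicted scores."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def pearson_correlation(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Toy values for illustration only; these are not results from this model.
reference = [1.0, 0.0, -1.0]
predicted = [0.5, 0.0, -0.5]
print(mean_squared_error(reference, predicted))
print(pearson_correlation(reference, predicted))
```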
Results
The model achieved reasonable accuracy in predicting sentiment scores within the financial domain, performing well on positive and negative examples but showing some difficulty in identifying neutral cases.
Environmental Impact
Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).
- Hardware Type: NVIDIA A100 GPU
- Hours used: 15
- Cloud Provider: AWS
- Compute Region: Europe (London)
- Carbon Emitted: Approximately 25 kg CO2eq
Technical Specifications [optional]
Model Architecture and Objective
The model is based on the Mistral-7B architecture, with LoRA applied for efficient fine-tuning in the sentiment analysis task.
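LoRA keeps the pretrained weight matrix frozen and learns a low-rank update, so a layer's output becomes h = W0 x + (alpha / r) B (A x), where A has shape r x d_in and B has shape d_out x r. The sketch below shows that computation in plain Python on tiny matrices. The rank and alpha used for this model are not stated in the card, so the values here are hypothetical.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def lora_forward(w0, a, b, x, alpha, rank):
    """Compute h = W0 x + (alpha / rank) * B (A x).

    w0: frozen pretrained weights, shape d_out x d_in
    a:  trainable down-projection, shape rank x d_in
    b:  trainable up-projection, shape d_out x rank
    """
    base = matvec(w0, x)
    update = matvec(b, matvec(a, x))
    scale = alpha / rank
    return [h + scale * d for h, d in zip(base, update)]


# Hypothetical 2x2 layer with rank 1 and alpha 2.
w0 = [[1.0, 0.0], [0.0, 1.0]]
a = [[1.0, 1.0]]
x = [1.0, 2.0]

# B starts at zero in standard LoRA, so the adapted layer initially
# matches the frozen base layer.
print(lora_forward(w0, a, [[0.0], [0.0]], x, alpha=2.0, rank=1))

# After training, B is non-zero and shifts the output.
print(lora_forward(w0, a, [[1.0], [2.0]], x, alpha=2.0, rank=1))
```

Only A and B are trained, which is why the adapter is small compared with the 7B-parameter base model.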
Compute Infrastructure
Hardware
The model was trained on a single NVIDIA A100 GPU with 40 GB VRAM.
Software
- Transformers Library: Hugging Face Transformers v4.31.0
- PEFT Library: v0.12.0
More Information [optional]
For further questions or inquiries about this model, please reach out to Gabriella Herke.
Model Card Contact
For more information, contact Gabriella Herke at [contact information].
Framework versions
- PEFT 0.12.0
Model tree for gherke/mistral-7b-quantized-lora-finetuned
Base model
mistralai/Mistral-7B-v0.3