---
language:
- en
- zh
tags:
- paligemma
- multitask
- object-detection
- text-generation
- visual-question-answering
datasets:
- custom
license: mit
model-index:
- name: Extremely4606/paligemma24_12_30
  results:
  - task:
      type: object-detection
    dataset:
      type: custom
      name: Defect Detection
    metrics:
    - name: mAP
      type: mean_average_precision
      value: 0.85  # replace with the actual evaluation value
  - task:
      type: text-generation
    dataset:
      type: custom
      name: Text Description
    metrics:
    - name: BLEU
      type: bleu
      value: 0.78  # replace with the actual evaluation value
  - task:
      type: visual-question-answering
    dataset:
      type: custom
      name: Visual QA
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.90  # replace with the actual evaluation value
---
PaliGemma Multitask Model
This is a multitask model based on PaliGemma that can perform:
- Object Detection (defect detection)
- Text Generation (defect description)
- Visual Question Answering
Model Description
The model is fine-tuned from google/paligemma-3b-mix-224 using LoRA (low-rank adaptation).
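To give a rough sense of why LoRA is attractive for fine-tuning a model of this size, the sketch below compares the number of trainable parameters in one weight matrix under full fine-tuning and under LoRA, where the update is factored as B (d_out x rank) times A (rank x d_in). The layer dimensions and rank are illustrative assumptions; this card does not state the LoRA configuration that was actually used.

```python
def full_params(d_out: int, d_in: int) -> int:
    """Trainable parameters when the whole d_out x d_in weight matrix is updated."""
    return d_out * d_in


def lora_trainable_params(d_out: int, d_in: int, rank: int) -> int:
    """Trainable parameters in the LoRA factors B (d_out x rank) and A (rank x d_in)."""
    return rank * (d_out + d_in)


# Hypothetical layer shape and rank, chosen only for illustration.
d_out, d_in, rank = 2048, 2048, 8

full = full_params(d_out, d_in)
lora = lora_trainable_params(d_out, d_in, rank)
print(f"full fine-tuning: {full:,} parameters")
print(f"LoRA (rank {rank}): {lora:,} parameters")
```

The base weights stay frozen, so only the two small factors are trained and stored per adapted layer.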
Usage
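from PIL import Image
from transformers import AutoProcessor, AutoModelForVision2Seq
# Load model and processor
processor = AutoProcessor.from_pretrained("Extremely4606/paligemma24_12_30")
model = AutoModelForVision2Seq.from_pretrained("Extremely4606/paligemma24_12_30")
# Process image and text ("image.jpg" and the prompt are placeholders; use your own)
image = Image.open("image.jpg").convert("RGB")
text = "detect defect"
inputs = processor(images=image, text=text, return_tensors="pt")
# Get predictions: generate tokens, then decode only the newly generated part
outputs = model.generate(**inputs, max_new_tokens=64)
prompt_len = inputs["input_ids"].shape[-1]
print(processor.decode(outputs[0][prompt_len:], skip_special_tokens=True))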
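For the object detection task, the decoded text has to be turned into pixel coordinates. The sketch below assumes this fine-tuned model keeps the base PaliGemma detection format, in which each box is four `<locNNNN>` tokens in the order y_min, x_min, y_max, x_max (values binned to 0-1023), followed by a label, with detections separated by `;`. The helper name `parse_detections` is hypothetical, and the format should be checked against this model's actual output.

```python
import re

# Four location tokens followed by a label, as in base PaliGemma detection output.
_DETECTION = re.compile(
    r"<loc(\d{4})><loc(\d{4})><loc(\d{4})><loc(\d{4})>\s*([^;<]+)"
)


def parse_detections(text: str, width: int, height: int) -> list[dict]:
    """Convert PaliGemma-style detection text into labeled pixel boxes.

    Boxes are returned as (x_min, y_min, x_max, y_max) in pixels.
    """
    detections = []
    for match in _DETECTION.finditer(text):
        # Token order is y_min, x_min, y_max, x_max, each normalized by 1024.
        y0, x0, y1, x1 = (int(g) / 1024 for g in match.groups()[:4])
        detections.append(
            {
                "label": match.group(5).strip(),
                "box": (x0 * width, y0 * height, x1 * width, y1 * height),
            }
        )
    return detections


# Made-up output string, used only to exercise the parser.
sample = "<loc0256><loc0128><loc0768><loc0896> scratch ; <loc0000><loc0000><loc0512><loc0512> dent"
for det in parse_detections(sample, width=1024, height=512):
    print(det["label"], det["box"])
```

Pass the original image width and height, not the 224 x 224 model input size, so that the boxes line up with the source image.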