model-index:
- name: Extremely4606/paligemma24_12_30
  results:
  - task:
      type: object-detection
    dataset:
      type: custom
      name: Defect Detection
    metrics:
    - name: mAP
      type: mean_average_precision
      value: 0.85  # placeholder: replace with the actual evaluation value
  - task:
      type: text-generation
    dataset:
      type: custom
      name: Text Description
    metrics:
    - name: BLEU
      type: bleu
      value: 0.78  # placeholder: replace with the actual evaluation value
  - task:
      type: visual-question-answering
    dataset:
      type: custom
      name: Visual QA
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.90  # placeholder: replace with the actual evaluation value

PaliGemma Multitask Model

This is a multitask model based on PaliGemma that can perform:

Object Detection (defect detection)
Text Generation (defect description)
Visual Question Answering

Model Description

The model is fine-tuned from google/paligemma-3b-mix-224 using LoRA (low-rank adaptation).
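For readers unfamiliar with LoRA, the following is a minimal pure-Python sketch of the general idea, not the training code used for this model. LoRA keeps a pretrained weight matrix W frozen and learns a low-rank update, so the effective weight is W + (alpha / r) * B A. The helper names, matrix sizes, and values below are illustrative assumptions.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_effective_weight(w, a, b, alpha):
    """Return W + (alpha / r) * B @ A, where r is the adapter rank.

    w: frozen weight, shape (out_features, in_features)
    a: trainable matrix A, shape (r, in_features)
    b: trainable matrix B, shape (out_features, r)
    """
    r = len(a)
    scale = alpha / r
    delta = matmul(b, a)  # (out, r) @ (r, in) -> (out, in)
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


# Toy example with rank r = 1 (all values are made up for illustration).
w = [[1.0, 0.0], [0.0, 1.0]]
a = [[1.0, 2.0]]
b = [[0.5], [0.0]]
merged = lora_effective_weight(w, a, b, alpha=2.0)
print(merged)
```

Only A and B are trained, which is why a LoRA fine-tune touches far fewer parameters than updating W directly.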

Usage

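import torch
from PIL import Image
from transformers import AutoProcessor, AutoModelForVision2Seq

# Load model and processor
processor = AutoProcessor.from_pretrained("Extremely4606/paligemma24_12_30")
model = AutoModelForVision2Seq.from_pretrained("Extremely4606/paligemma24_12_30")
model.eval()

# Process image and text
# (the file path and prompt are placeholders; use the prompts the model was trained with)
image = Image.open("example.jpg").convert("RGB")
text = "detect defect"
inputs = processor(images=image, text=text, return_tensors="pt")

# Generate predictions and decode only the newly generated tokens
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=64)
prompt_len = inputs["input_ids"].shape[-1]
prediction = processor.decode(output_ids[0][prompt_len:], skip_special_tokens=True)
print(prediction)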
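
For the detection task, PaliGemma models typically emit each box as four `<locNNNN>` tokens (y_min, x_min, y_max, x_max, each binned into 1024 steps) followed by a label, with multiple objects separated by `;`. The sketch below parses such a string into pixel coordinates. It assumes this fine-tuned model keeps that output convention, which the card does not confirm; `parse_detections` and the sample string are hypothetical.

```python
import re

# Four consecutive location tokens followed by a label (up to the next ';' or '<').
LOC_PATTERN = re.compile(r"((?:<loc\d{4}>){4})\s*([^;<]+)")


def parse_detections(text, width, height):
    """Convert '<locYYYY><locXXXX><locYYYY><locXXXX> label' groups into pixel boxes."""
    detections = []
    for locs, label in LOC_PATTERN.findall(text):
        # Token order is assumed to be y_min, x_min, y_max, x_max on a 0..1023 grid.
        y0, x0, y1, x1 = (int(v) / 1024 for v in re.findall(r"<loc(\d{4})>", locs))
        detections.append({
            "label": label.strip(),
            "box": (x0 * width, y0 * height, x1 * width, y1 * height),  # x_min, y_min, x_max, y_max
        })
    return detections


# Made-up model output for illustration only.
sample = "<loc0256><loc0128><loc0512><loc0768> scratch ; <loc0000><loc0000><loc1023><loc1023> dent"
detections = parse_detections(sample, width=1024, height=1024)
for det in detections:
    print(det["label"], det["box"])
```

Pass the original image width and height so the boxes map back onto the full-resolution image rather than the 224x224 model input.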