Usage in Model Card is incorrect

#30

by Kars - opened 24 days ago

Kars

24 days ago

There are several problems with the usage example in the model card. The code imports AutoFeatureExtractor but then calls AutoProcessor.
The line processor = AutoProcessor.from_pretrained("facebook/w2v-bert-2.0") also fails, because no tokenizer for this model is present on Hugging Face:

OSError: Can't load tokenizer for 'facebook/w2v-bert-2.0'. If you were trying to load it from 'https://huggingface.co/models', make sure you don't have a local directory with the same name. Otherwise, make sure 'facebook/w2v-bert-2.0' is the correct path to a directory containing all relevant files for a Wav2Vec2CTCTokenizer tokenizer.

Using AutoFeatureExtractor works though.

from transformers import AutoFeatureExtractor, Wav2Vec2BertModel
import torch
from datasets import load_dataset

dataset = load_dataset("hf-internal-testing/librispeech_asr_demo", "clean", split="validation")
dataset = dataset.sort("id")
sampling_rate = dataset.features["audio"].sampling_rate

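# As written in the model card: AutoProcessor is never imported, and loading it raises the OSError quoted above
processor = AutoProcessor.from_pretrained("facebook/w2v-bert-2.0")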
model = Wav2Vec2BertModel.from_pretrained("facebook/w2v-bert-2.0")

# audio file is decoded on the fly
inputs = processor(dataset[0]["audio"]["array"], sampling_rate=sampling_rate, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

sishaar

19 days ago

Could you provide a working example? Are you saying that you replace this line?

- processor = AutoProcessor.from_pretrained("facebook/w2v-bert-2.0")
+ processor = AutoFeatureExtractor.from_pretrained("facebook/w2v-bert-2.0")

Kars

19 days ago

Yes, that works. But I'm a bit confused that the feature_extractor also seems to do tokenization. Also, no w2v-bert-2.0 processor can be loaded from Hugging Face via AutoProcessor. Please correct me if I'm wrong, but the standard Hugging Face convention is for the processor to perform both feature extraction and tokenization.

In [1]: AutoFeatureExtractor.from_pretrained("facebook/w2v-bert-2.0")
Out[1]:  Wav2Vec2Processor:
- feature_extractor: Wav2Vec2FeatureExtractor {
  "do_normalize": true,
  "feature_extractor_type": "Wav2Vec2FeatureExtractor",
  "feature_size": 1,
  "padding_side": "right",
  "padding_value": 0.0,
  "return_attention_mask": false,
  "sampling_rate": 16000
}

- tokenizer: Wav2Vec2CTCTokenizer(name_or_path='facebook/wav2vec2-base', vocab_size=32, model_max_length=1000000000000000019884624838656, is_fast=False, padding_side='right', truncation_side='right', special_tokens={'bos_token': '<s>', 'eos_token': '</s>', 'unk_token': '<unk>', 'pad_token': '<pad>'}, clean_up_tokenization_spaces=False, added_tokens_decoder={
    0: AddedToken("<pad>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=False),
    1: AddedToken("<s>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=False),
    2: AddedToken("</s>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=False),
    3: AddedToken("<unk>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=False),
}
)

{
  "processor_class": "Wav2Vec2Processor"
}
In [2]: AutoProcessor.from_pretrained("facebook/w2v-bert-2.0")
Out[2]: ---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
File ~/miniconda3/envs/vaanienv/lib/python3.12/site-packages/transformers/models/wav2vec2_bert/processing_wav2vec2_bert.py:56, in Wav2Vec2BertProcessor.from_pretrained(cls, pretrained_model_name_or_path, **kwargs)
     55 try:
---> 56     return super().from_pretrained(pretrained_model_name_or_path, **kwargs)
     57 except OSError:

File ~/miniconda3/envs/vaanienv/lib/python3.12/site-packages/transformers/processing_utils.py:974, in ProcessorMixin.from_pretrained(cls, pretrained_model_name_or_path, cache_dir, force_download, local_files_only, token, revision, **kwargs)
    972     kwargs["token"] = token
--> 974 args = cls._get_arguments_from_pretrained(pretrained_model_name_or_path, **kwargs)
    975 processor_dict, kwargs = cls.get_processor_dict(pretrained_model_name_or_path, **kwargs)

File ~/miniconda3/envs/vaanienv/lib/python3.12/site-packages/transformers/processing_utils.py:1020, in ProcessorMixin._get_arguments_from_pretrained(cls, pretrained_model_name_or_path, **kwargs)
   1018         attribute_class = getattr(transformers_module, class_name)
-> 1020     args.append(attribute_class.from_pretrained(pretrained_model_name_or_path, **kwargs))
   1021 return args

File ~/miniconda3/envs/vaanienv/lib/python3.12/site-packages/transformers/models/auto/tokenization_auto.py:943, in AutoTokenizer.from_pretrained(cls, pretrained_model_name_or_path, *inputs, **kwargs)
    942 if tokenizer_class_py is not None:
--> 943     return tokenizer_class_py.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
    944 else:

File ~/miniconda3/envs/vaanienv/lib/python3.12/site-packages/transformers/tokenization_utils_base.py:2016, in PreTrainedTokenizerBase.from_pretrained(cls, pretrained_model_name_or_path, cache_dir, force_download, local_files_only, token, revision, trust_remote_code, *init_inputs, **kwargs)
   2015 if all(full_file_name is None for full_file_name in resolved_vocab_files.values()) and not gguf_file:
-> 2016     raise EnvironmentError(
   2017         f"Can't load tokenizer for '{pretrained_model_name_or_path}'. If you were trying to load it from "
   2018         "'https://huggingface.co/models', make sure you don't have a local directory with the same name. "
   2019         f"Otherwise, make sure '{pretrained_model_name_or_path}' is the correct path to a directory "
   2020         f"containing all relevant files for a {cls.__name__} tokenizer."
   2021     )
   2023 for file_id, file_path in vocab_files.items():

OSError: Can't load tokenizer for 'facebook/w2v-bert-2.0'. If you were trying to load it from 'https://huggingface.co/models', make sure you don't have a local directory with the same name. Otherwise, make sure 'facebook/w2v-bert-2.0' is the correct path to a directory containing all relevant files for a Wav2Vec2CTCTokenizer tokenizer.

During handling of the above exception, another exception occurred:

OSError                                   Traceback (most recent call last)
Cell In[4], line 2
      1 from transformers import AutoProcessor
----> 2 processor = AutoProcessor.from_pretrained("facebook/w2v-bert-2.0")

File ~/miniconda3/envs/vaanienv/lib/python3.12/site-packages/transformers/models/auto/processing_auto.py:328, in AutoProcessor.from_pretrained(cls, pretrained_model_name_or_path, **kwargs)
    324     return processor_class.from_pretrained(
    325         pretrained_model_name_or_path, trust_remote_code=trust_remote_code, **kwargs
    326     )
    327 elif processor_class is not None:
--> 328     return processor_class.from_pretrained(
    329         pretrained_model_name_or_path, trust_remote_code=trust_remote_code, **kwargs
    330     )
    331 # Last try: we use the PROCESSOR_MAPPING.
    332 elif type(config) in PROCESSOR_MAPPING:

File ~/miniconda3/envs/vaanienv/lib/python3.12/site-packages/transformers/models/wav2vec2_bert/processing_wav2vec2_bert.py:68, in Wav2Vec2BertProcessor.from_pretrained(cls, pretrained_model_name_or_path, **kwargs)
     58 warnings.warn(
     59     f"Loading a tokenizer inside {cls.__name__} from a config that does not"
     60     " include a `tokenizer_class` attribute is deprecated and will be "
   (...)
     64     FutureWarning,
     65 )
     67 feature_extractor = SeamlessM4TFeatureExtractor.from_pretrained(pretrained_model_name_or_path, **kwargs)
---> 68 tokenizer = Wav2Vec2CTCTokenizer.from_pretrained(pretrained_model_name_or_path, **kwargs)
     70 return cls(feature_extractor=feature_extractor, tokenizer=tokenizer)

File ~/miniconda3/envs/vaanienv/lib/python3.12/site-packages/transformers/tokenization_utils_base.py:2016, in PreTrainedTokenizerBase.from_pretrained(cls, pretrained_model_name_or_path, cache_dir, force_download, local_files_only, token, revision, trust_remote_code, *init_inputs, **kwargs)
   2013 # If one passes a GGUF file path to `gguf_file` there is no need for this check as the tokenizer will be
   2014 # loaded directly from the GGUF file.
   2015 if all(full_file_name is None for full_file_name in resolved_vocab_files.values()) and not gguf_file:
-> 2016     raise EnvironmentError(
   2017         f"Can't load tokenizer for '{pretrained_model_name_or_path}'. If you were trying to load it from "
   2018         "'https://huggingface.co/models', make sure you don't have a local directory with the same name. "
   2019         f"Otherwise, make sure '{pretrained_model_name_or_path}' is the correct path to a directory "
   2020         f"containing all relevant files for a {cls.__name__} tokenizer."
   2021     )
   2023 for file_id, file_path in vocab_files.items():
   2024     if file_id not in resolved_vocab_files:

OSError: Can't load tokenizer for 'facebook/w2v-bert-2.0'. If you were trying to load it from 'https://huggingface.co/models', make sure you don't have a local directory with the same name. Otherwise, make sure 'facebook/w2v-bert-2.0' is the correct path to a directory containing all relevant files for a Wav2Vec2CTCTokenizer tokenizer.
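
In the traceback above, the arrow inside Wav2Vec2BertProcessor.from_pretrained points at the Wav2Vec2CTCTokenizer line (68), which comes after the SeamlessM4TFeatureExtractor line (67), so only the tokenizer step is shown failing. One defensive pattern is to try AutoProcessor first and fall back to AutoFeatureExtractor when an OSError is raised. The following is a minimal sketch of that idea, not the library's or the model card's recommended approach; the helper name load_preprocessor and its loaders parameter are hypothetical and exist only for illustration.

```python
from typing import Any, Callable, Optional, Sequence

# A loader takes a model id and returns a processor-like object.
Loader = Callable[[str], Any]


def load_preprocessor(model_id: str, loaders: Optional[Sequence[Loader]] = None) -> Any:
    """Try each loader in order and return the first object that loads.

    Hypothetical helper: with the default loaders it tries AutoProcessor
    first and falls back to AutoFeatureExtractor if that raises OSError.
    """
    if loaders is None:
        # Imported lazily so the fallback logic can be exercised with
        # stand-in loaders even where transformers is not installed.
        from transformers import AutoFeatureExtractor, AutoProcessor

        loaders = (AutoProcessor.from_pretrained, AutoFeatureExtractor.from_pretrained)

    last_error: Optional[Exception] = None
    for load in loaders:
        try:
            return load(model_id)
        except OSError as err:
            # Same exception type as in the thread ("Can't load tokenizer for ...").
            last_error = err
    raise OSError(f"No preprocessor could be loaded for {model_id!r}") from last_error
```

With transformers installed, load_preprocessor("facebook/w2v-bert-2.0") would attempt the two default loaders in that order. Catching only OSError is a deliberate choice here: other exceptions, such as a typo in an argument, still surface immediately.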

Kars

19 days ago

Ah, AutoFeatureExtractor seems to be loading a Wav2Vec2Processor object.

Kars

19 days ago

Closing because there is already an unmerged pull request for this at https://huggingface.co/facebook/w2v-bert-2.0/discussions/16

Kars changed discussion status to closed 19 days ago
