Usage in Model Card is incorrect

#30
by Kars - opened

There are several problems with the usage example. The code imports AutoFeatureExtractor but then calls AutoProcessor.
processor = AutoProcessor.from_pretrained("facebook/w2v-bert-2.0") does not work, apparently because no processor (tokenizer) is present in the repository on Hugging Face. It fails with: OSError: Can't load tokenizer for 'facebook/w2v-bert-2.0'. If you were trying to load it from 'https://huggingface.co/models', make sure you don't have a local directory with the same name. Otherwise, make sure 'facebook/w2v-bert-2.0' is the correct path to a directory containing all relevant files for a Wav2Vec2CTCTokenizer tokenizer.

Using AutoFeatureExtractor works though.

from transformers import AutoFeatureExtractor, Wav2Vec2BertModel
import torch
from datasets import load_dataset

dataset = load_dataset("hf-internal-testing/librispeech_asr_demo", "clean", split="validation")
dataset = dataset.sort("id")
sampling_rate = dataset.features["audio"].sampling_rate

processor = AutoProcessor.from_pretrained("facebook/w2v-bert-2.0")
model = Wav2Vec2BertModel.from_pretrained("facebook/w2v-bert-2.0")

# audio file is decoded on the fly
inputs = processor(dataset[0]["audio"]["array"], sampling_rate=sampling_rate, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

Could you provide a working example? Are you saying that you replace this line?

- processor = AutoProcessor.from_pretrained("facebook/w2v-bert-2.0")
+ processor = AutoFeatureExtractor.from_pretrained("facebook/w2v-bert-2.0")

Yes, that works. But I'm a bit confused that the feature extractor apparently also does tokenization. And there is no w2v-bert-2.0 processor available on Hugging Face via AutoProcessor. Please correct me if I'm wrong, but the standard Hugging Face convention is for the processor to perform both feature extraction and tokenization.

In [1]: AutoFeatureExtractor.from_pretrained("facebook/w2v-bert-2.0")
Out[1]:  Wav2Vec2Processor:
- feature_extractor: Wav2Vec2FeatureExtractor {
  "do_normalize": true,
  "feature_extractor_type": "Wav2Vec2FeatureExtractor",
  "feature_size": 1,
  "padding_side": "right",
  "padding_value": 0.0,
  "return_attention_mask": false,
  "sampling_rate": 16000
}

- tokenizer: Wav2Vec2CTCTokenizer(name_or_path='facebook/wav2vec2-base', vocab_size=32, model_max_length=1000000000000000019884624838656, is_fast=False, padding_side='right', truncation_side='right', special_tokens={'bos_token': '<s>', 'eos_token': '</s>', 'unk_token': '<unk>', 'pad_token': '<pad>'}, clean_up_tokenization_spaces=False, added_tokens_decoder={
    0: AddedToken("<pad>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=False),
    1: AddedToken("<s>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=False),
    2: AddedToken("</s>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=False),
    3: AddedToken("<unk>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=False),
}
)

{
  "processor_class": "Wav2Vec2Processor"
}
In [2]: AutoProcessor.from_pretrained("facebook/w2v-bert-2.0")
Out[2]: ---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
File ~/miniconda3/envs/vaanienv/lib/python3.12/site-packages/transformers/models/wav2vec2_bert/processing_wav2vec2_bert.py:56, in Wav2Vec2BertProcessor.from_pretrained(cls, pretrained_model_name_or_path, **kwargs)
     55 try:
---> 56     return super().from_pretrained(pretrained_model_name_or_path, **kwargs)
     57 except OSError:

File ~/miniconda3/envs/vaanienv/lib/python3.12/site-packages/transformers/processing_utils.py:974, in ProcessorMixin.from_pretrained(cls, pretrained_model_name_or_path, cache_dir, force_download, local_files_only, token, revision, **kwargs)
    972     kwargs["token"] = token
--> 974 args = cls._get_arguments_from_pretrained(pretrained_model_name_or_path, **kwargs)
    975 processor_dict, kwargs = cls.get_processor_dict(pretrained_model_name_or_path, **kwargs)

File ~/miniconda3/envs/vaanienv/lib/python3.12/site-packages/transformers/processing_utils.py:1020, in ProcessorMixin._get_arguments_from_pretrained(cls, pretrained_model_name_or_path, **kwargs)
   1018         attribute_class = getattr(transformers_module, class_name)
-> 1020     args.append(attribute_class.from_pretrained(pretrained_model_name_or_path, **kwargs))
   1021 return args

File ~/miniconda3/envs/vaanienv/lib/python3.12/site-packages/transformers/models/auto/tokenization_auto.py:943, in AutoTokenizer.from_pretrained(cls, pretrained_model_name_or_path, *inputs, **kwargs)
    942 if tokenizer_class_py is not None:
--> 943     return tokenizer_class_py.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
    944 else:

File ~/miniconda3/envs/vaanienv/lib/python3.12/site-packages/transformers/tokenization_utils_base.py:2016, in PreTrainedTokenizerBase.from_pretrained(cls, pretrained_model_name_or_path, cache_dir, force_download, local_files_only, token, revision, trust_remote_code, *init_inputs, **kwargs)
   2015 if all(full_file_name is None for full_file_name in resolved_vocab_files.values()) and not gguf_file:
-> 2016     raise EnvironmentError(
   2017         f"Can't load tokenizer for '{pretrained_model_name_or_path}'. If you were trying to load it from "
   2018         "'https://huggingface.co/models', make sure you don't have a local directory with the same name. "
   2019         f"Otherwise, make sure '{pretrained_model_name_or_path}' is the correct path to a directory "
   2020         f"containing all relevant files for a {cls.__name__} tokenizer."
   2021     )
   2023 for file_id, file_path in vocab_files.items():

OSError: Can't load tokenizer for 'facebook/w2v-bert-2.0'. If you were trying to load it from 'https://huggingface.co/models', make sure you don't have a local directory with the same name. Otherwise, make sure 'facebook/w2v-bert-2.0' is the correct path to a directory containing all relevant files for a Wav2Vec2CTCTokenizer tokenizer.

During handling of the above exception, another exception occurred:

OSError                                   Traceback (most recent call last)
Cell In[4], line 2
      1 from transformers import AutoProcessor
----> 2 processor = AutoProcessor.from_pretrained("facebook/w2v-bert-2.0")

File ~/miniconda3/envs/vaanienv/lib/python3.12/site-packages/transformers/models/auto/processing_auto.py:328, in AutoProcessor.from_pretrained(cls, pretrained_model_name_or_path, **kwargs)
    324     return processor_class.from_pretrained(
    325         pretrained_model_name_or_path, trust_remote_code=trust_remote_code, **kwargs
    326     )
    327 elif processor_class is not None:
--> 328     return processor_class.from_pretrained(
    329         pretrained_model_name_or_path, trust_remote_code=trust_remote_code, **kwargs
    330     )
    331 # Last try: we use the PROCESSOR_MAPPING.
    332 elif type(config) in PROCESSOR_MAPPING:

File ~/miniconda3/envs/vaanienv/lib/python3.12/site-packages/transformers/models/wav2vec2_bert/processing_wav2vec2_bert.py:68, in Wav2Vec2BertProcessor.from_pretrained(cls, pretrained_model_name_or_path, **kwargs)
     58 warnings.warn(
     59     f"Loading a tokenizer inside {cls.__name__} from a config that does not"
     60     " include a `tokenizer_class` attribute is deprecated and will be "
   (...)
     64     FutureWarning,
     65 )
     67 feature_extractor = SeamlessM4TFeatureExtractor.from_pretrained(pretrained_model_name_or_path, **kwargs)
---> 68 tokenizer = Wav2Vec2CTCTokenizer.from_pretrained(pretrained_model_name_or_path, **kwargs)
     70 return cls(feature_extractor=feature_extractor, tokenizer=tokenizer)

File ~/miniconda3/envs/vaanienv/lib/python3.12/site-packages/transformers/tokenization_utils_base.py:2016, in PreTrainedTokenizerBase.from_pretrained(cls, pretrained_model_name_or_path, cache_dir, force_download, local_files_only, token, revision, trust_remote_code, *init_inputs, **kwargs)
   2013 # If one passes a GGUF file path to `gguf_file` there is no need for this check as the tokenizer will be
   2014 # loaded directly from the GGUF file.
   2015 if all(full_file_name is None for full_file_name in resolved_vocab_files.values()) and not gguf_file:
-> 2016     raise EnvironmentError(
   2017         f"Can't load tokenizer for '{pretrained_model_name_or_path}'. If you were trying to load it from "
   2018         "'https://huggingface.co/models', make sure you don't have a local directory with the same name. "
   2019         f"Otherwise, make sure '{pretrained_model_name_or_path}' is the correct path to a directory "
   2020         f"containing all relevant files for a {cls.__name__} tokenizer."
   2021     )
   2023 for file_id, file_path in vocab_files.items():
   2024     if file_id not in resolved_vocab_files:

OSError: Can't load tokenizer for 'facebook/w2v-bert-2.0'. If you were trying to load it from 'https://huggingface.co/models', make sure you don't have a local directory with the same name. Otherwise, make sure 'facebook/w2v-bert-2.0' is the correct path to a directory containing all relevant files for a Wav2Vec2CTCTokenizer tokenizer.

Ah, AutoFeatureExtractor seems to be loading a Wav2Vec2Processor object.
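
For anyone hitting the same OSError before the model card is fixed, the behaviour described in this thread (AutoProcessor fails, AutoFeatureExtractor loads) could be handled with a small fallback helper. This is only a sketch: load_first_available is a hypothetical helper name, not part of transformers, and it assumes each loader takes the model ID as its only required argument and raises OSError on failure, as in the traceback above.

```python
from typing import Any, Callable, Sequence, Tuple

# A loader is any callable that takes a model ID and returns a loaded object.
Loader = Callable[[str], Any]


def load_first_available(
    model_id: str, loaders: Sequence[Tuple[str, Loader]]
) -> Tuple[str, Any]:
    """Try each (name, loader) pair in order.

    Returns (name, loaded_object) for the first loader that does not
    raise OSError. If every loader fails, raises OSError summarising
    the individual failures.
    """
    failures = []
    for name, loader in loaders:
        try:
            return name, loader(model_id)
        except OSError as exc:
            # Keep only the first line of the message; these errors can be long.
            message = str(exc)
            first_line = message.splitlines()[0] if message else type(exc).__name__
            failures.append(f"{name}: {first_line}")
    raise OSError(f"No loader could load '{model_id}': " + "; ".join(failures))
```

With transformers installed, the loaders list would be [("AutoProcessor", AutoProcessor.from_pretrained), ("AutoFeatureExtractor", AutoFeatureExtractor.from_pretrained)], and the returned name tells you which one actually succeeded for "facebook/w2v-bert-2.0".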

Closing because there is already an unmerged pull request for this at https://huggingface.co/facebook/w2v-bert-2.0/discussions/16

Kars changed discussion status to closed
