Usage in Model Card is incorrect
There are several problems. The code imports AutoFeatureExtractor but uses AutoProcessor.
processor = AutoProcessor.from_pretrained("facebook/w2v-bert-2.0")
does not work because the tokenizer it needs is not present on Hugging Face:
OSError: Can't load tokenizer for 'facebook/w2v-bert-2.0'. If you were trying to load it from 'https://huggingface.co/models', make sure you don't have a local directory with the same name. Otherwise, make sure 'facebook/w2v-bert-2.0' is the correct path to a directory containing all relevant files for a Wav2Vec2CTCTokenizer tokenizer.
Using AutoFeatureExtractor works though.
from transformers import AutoFeatureExtractor, Wav2Vec2BertModel
import torch
from datasets import load_dataset
dataset = load_dataset("hf-internal-testing/librispeech_asr_demo", "clean", split="validation")
dataset = dataset.sort("id")
sampling_rate = dataset.features["audio"].sampling_rate
processor = AutoProcessor.from_pretrained("facebook/w2v-bert-2.0")
model = Wav2Vec2BertModel.from_pretrained("facebook/w2v-bert-2.0")
# audio file is decoded on the fly
inputs = processor(dataset[0]["audio"]["array"], sampling_rate=sampling_rate, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)
Could you provide a working example? Are you saying that you replaced this line?
- processor = AutoProcessor.from_pretrained("facebook/w2v-bert-2.0")
+ processor = AutoFeatureExtractor.from_pretrained("facebook/w2v-bert-2.0")
Yes, that works. But I'm a bit confused that the feature extractor apparently also does tokenization. There is also no w2v-bert-2.0 processor available on Hugging Face via AutoProcessor. Please correct me if I'm wrong, but the standard Hugging Face convention is for the processor to perform both feature extraction and tokenization.
In [1]: AutoFeatureExtractor.from_pretrained("facebook/w2v-bert-2.0")
Out[1]: Wav2Vec2Processor:
- feature_extractor: Wav2Vec2FeatureExtractor {
"do_normalize": true,
"feature_extractor_type": "Wav2Vec2FeatureExtractor",
"feature_size": 1,
"padding_side": "right",
"padding_value": 0.0,
"return_attention_mask": false,
"sampling_rate": 16000
}
- tokenizer: Wav2Vec2CTCTokenizer(name_or_path='facebook/wav2vec2-base', vocab_size=32, model_max_length=1000000000000000019884624838656, is_fast=False, padding_side='right', truncation_side='right', special_tokens={'bos_token': '<s>', 'eos_token': '</s>', 'unk_token': '<unk>', 'pad_token': '<pad>'}, clean_up_tokenization_spaces=False, added_tokens_decoder={
0: AddedToken("<pad>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=False),
1: AddedToken("<s>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=False),
2: AddedToken("</s>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=False),
3: AddedToken("<unk>", rstrip=True, lstrip=True, single_word=False, normalized=False, special=False),
}
)
{
"processor_class": "Wav2Vec2Processor"
}
In [2]: AutoProcessor.from_pretrained("facebook/w2v-bert-2.0")
Out[2]: ---------------------------------------------------------------------------
OSError Traceback (most recent call last)
File ~/miniconda3/envs/vaanienv/lib/python3.12/site-packages/transformers/models/wav2vec2_bert/processing_wav2vec2_bert.py:56, in Wav2Vec2BertProcessor.from_pretrained(cls, pretrained_model_name_or_path, **kwargs)
55 try:
---> 56 return super().from_pretrained(pretrained_model_name_or_path, **kwargs)
57 except OSError:
File ~/miniconda3/envs/vaanienv/lib/python3.12/site-packages/transformers/processing_utils.py:974, in ProcessorMixin.from_pretrained(cls, pretrained_model_name_or_path, cache_dir, force_download, local_files_only, token, revision, **kwargs)
972 kwargs["token"] = token
--> 974 args = cls._get_arguments_from_pretrained(pretrained_model_name_or_path, **kwargs)
975 processor_dict, kwargs = cls.get_processor_dict(pretrained_model_name_or_path, **kwargs)
File ~/miniconda3/envs/vaanienv/lib/python3.12/site-packages/transformers/processing_utils.py:1020, in ProcessorMixin._get_arguments_from_pretrained(cls, pretrained_model_name_or_path, **kwargs)
1018 attribute_class = getattr(transformers_module, class_name)
-> 1020 args.append(attribute_class.from_pretrained(pretrained_model_name_or_path, **kwargs))
1021 return args
File ~/miniconda3/envs/vaanienv/lib/python3.12/site-packages/transformers/models/auto/tokenization_auto.py:943, in AutoTokenizer.from_pretrained(cls, pretrained_model_name_or_path, *inputs, **kwargs)
942 if tokenizer_class_py is not None:
--> 943 return tokenizer_class_py.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
944 else:
File ~/miniconda3/envs/vaanienv/lib/python3.12/site-packages/transformers/tokenization_utils_base.py:2016, in PreTrainedTokenizerBase.from_pretrained(cls, pretrained_model_name_or_path, cache_dir, force_download, local_files_only, token, revision, trust_remote_code, *init_inputs, **kwargs)
2015 if all(full_file_name is None for full_file_name in resolved_vocab_files.values()) and not gguf_file:
-> 2016 raise EnvironmentError(
2017 f"Can't load tokenizer for '{pretrained_model_name_or_path}'. If you were trying to load it from "
2018 "'https://huggingface.co/models', make sure you don't have a local directory with the same name. "
2019 f"Otherwise, make sure '{pretrained_model_name_or_path}' is the correct path to a directory "
2020 f"containing all relevant files for a {cls.__name__} tokenizer."
2021 )
2023 for file_id, file_path in vocab_files.items():
OSError: Can't load tokenizer for 'facebook/w2v-bert-2.0'. If you were trying to load it from 'https://huggingface.co/models', make sure you don't have a local directory with the same name. Otherwise, make sure 'facebook/w2v-bert-2.0' is the correct path to a directory containing all relevant files for a Wav2Vec2CTCTokenizer tokenizer.
During handling of the above exception, another exception occurred:
OSError Traceback (most recent call last)
Cell In[4], line 2
1 from transformers import AutoProcessor
----> 2 processor = AutoProcessor.from_pretrained("facebook/w2v-bert-2.0")
File ~/miniconda3/envs/vaanienv/lib/python3.12/site-packages/transformers/models/auto/processing_auto.py:328, in AutoProcessor.from_pretrained(cls, pretrained_model_name_or_path, **kwargs)
324 return processor_class.from_pretrained(
325 pretrained_model_name_or_path, trust_remote_code=trust_remote_code, **kwargs
326 )
327 elif processor_class is not None:
--> 328 return processor_class.from_pretrained(
329 pretrained_model_name_or_path, trust_remote_code=trust_remote_code, **kwargs
330 )
331 # Last try: we use the PROCESSOR_MAPPING.
332 elif type(config) in PROCESSOR_MAPPING:
File ~/miniconda3/envs/vaanienv/lib/python3.12/site-packages/transformers/models/wav2vec2_bert/processing_wav2vec2_bert.py:68, in Wav2Vec2BertProcessor.from_pretrained(cls, pretrained_model_name_or_path, **kwargs)
58 warnings.warn(
59 f"Loading a tokenizer inside {cls.__name__} from a config that does not"
60 " include a `tokenizer_class` attribute is deprecated and will be "
(...)
64 FutureWarning,
65 )
67 feature_extractor = SeamlessM4TFeatureExtractor.from_pretrained(pretrained_model_name_or_path, **kwargs)
---> 68 tokenizer = Wav2Vec2CTCTokenizer.from_pretrained(pretrained_model_name_or_path, **kwargs)
70 return cls(feature_extractor=feature_extractor, tokenizer=tokenizer)
File ~/miniconda3/envs/vaanienv/lib/python3.12/site-packages/transformers/tokenization_utils_base.py:2016, in PreTrainedTokenizerBase.from_pretrained(cls, pretrained_model_name_or_path, cache_dir, force_download, local_files_only, token, revision, trust_remote_code, *init_inputs, **kwargs)
2013 # If one passes a GGUF file path to `gguf_file` there is no need for this check as the tokenizer will be
2014 # loaded directly from the GGUF file.
2015 if all(full_file_name is None for full_file_name in resolved_vocab_files.values()) and not gguf_file:
-> 2016 raise EnvironmentError(
2017 f"Can't load tokenizer for '{pretrained_model_name_or_path}'. If you were trying to load it from "
2018 "'https://huggingface.co/models', make sure you don't have a local directory with the same name. "
2019 f"Otherwise, make sure '{pretrained_model_name_or_path}' is the correct path to a directory "
2020 f"containing all relevant files for a {cls.__name__} tokenizer."
2021 )
2023 for file_id, file_path in vocab_files.items():
2024 if file_id not in resolved_vocab_files:
OSError: Can't load tokenizer for 'facebook/w2v-bert-2.0'. If you were trying to load it from 'https://huggingface.co/models', make sure you don't have a local directory with the same name. Otherwise, make sure 'facebook/w2v-bert-2.0' is the correct path to a directory containing all relevant files for a Wav2Vec2CTCTokenizer tokenizer.
Ah, AutoFeatureExtractor seems to be loading a Wav2Vec2Processor object.
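For anyone hitting the same error before the model card is fixed, here is a minimal sketch of a fallback: try AutoProcessor first and fall back to AutoFeatureExtractor when it raises the OSError shown above. The helper name load_first_available is hypothetical (it is not part of transformers), and the transformers usage is left as a comment because it needs network access; I have only assumed that both from_pretrained methods accept the model id as their first argument, as in the calls earlier in this thread.

```python
from typing import Any, Callable, Sequence


def load_first_available(
    loaders: Sequence[Callable[[str], Any]],
    model_id: str,
) -> Any:
    """Try each loader in order and return the first object that loads.

    Only OSError is treated as "try the next loader", since that is the
    exception raised in the traceback above when tokenizer files are missing.
    (Hypothetical helper, not a transformers API.)
    """
    errors = []
    for loader in loaders:
        try:
            return loader(model_id)
        except OSError as exc:
            # Remember why this loader failed, then move on to the next one.
            errors.append(exc)
    raise OSError(f"No loader could load {model_id!r}: {errors}")


# Intended usage with transformers (not executed here; needs network access):
#
#   from transformers import AutoFeatureExtractor, AutoProcessor
#   processor = load_first_available(
#       [AutoProcessor.from_pretrained, AutoFeatureExtractor.from_pretrained],
#       "facebook/w2v-bert-2.0",
#   )
```

This only hides the symptom. If you know you only need feature extraction, calling AutoFeatureExtractor.from_pretrained directly, as in the diff above, is simpler.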
Closing because there is already an unmerged pull request for this at https://huggingface.co/facebook/w2v-bert-2.0/discussions/16