Did something happen to the token_dictionary_gc95M.pkl?
I've been playing with this a lot recently and, because of our infrastructure, I download and install Geneformer fresh each time I load it. For the past few days I've had no problems loading the token_dictionary_gc95M file.
Now, when I try to extract embeddings, I get:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/tmp/Geneformer/geneformer/emb_extractor.py", line 521, in __init__
    self.gene_token_dict = pickle.load(f)
_pickle.UnpicklingError: invalid load key, 'v'.
And if I try to load the .pkl directly, I get:
import pickle
token_file = "../Geneformer/geneformer/token_dictionary_gc95M.pkl"
with open(token_file, "rb") as file:
    token_dict = pickle.load(file)
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
_pickle.UnpicklingError: invalid load key, 'v'.
Am I the only one? I'm using Python 3.10.16.
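Thanks for your question - this can happen when you aren't using Git LFS. Without it, the clone contains a small text pointer file in place of the real .pkl. That pointer begins with the word "version", which is why pickle reports an invalid load key 'v'.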
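One quick way to confirm this is to check whether the file on disk is a Git LFS pointer stub rather than the actual pickle. The following is a minimal sketch, assuming the standard pointer format whose first line is the LFS spec version. `is_git_lfs_pointer` is a hypothetical helper written for illustration, not part of Geneformer.

```python
# First line of a standard Git LFS pointer file (assumption: spec v1 format).
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_git_lfs_pointer(path):
    """Return True if the file at `path` starts like a Git LFS pointer stub."""
    with open(path, "rb") as f:
        head = f.read(len(LFS_POINTER_PREFIX))
    return head == LFS_POINTER_PREFIX


# Example usage (path taken from the question above):
# is_git_lfs_pointer("../Geneformer/geneformer/token_dictionary_gc95M.pkl")
```

If this returns True, the real file was never downloaded. Running `git lfs install` followed by `git lfs pull` inside the clone should fetch it, provided Git LFS itself is installed correctly.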
Yep, totally right - our Git LFS install had a problem on respin. Once we fixed that, everything's peachy again! tyty!