---
license: apache-2.0
language:
- en
---

Original model from https://huggingface.co/openlm-research/open_llama_3b_600bt_preview.

This repo includes:

1) Ported `LlamaTokenizer` to `LlamaTokenizerFast` with a few lines of code. Loading the original slow tokenizer via `AutoTokenizer` takes 3 to 4 minutes; the fast tokenizer loads in a few seconds:

```python
from transformers import LlamaTokenizerFast
from tokenizers import AddedToken

# Convert the slow SentencePiece tokenizer to the fast (Rust-backed) version,
# registering the special tokens explicitly.
tokenizer = LlamaTokenizerFast.from_pretrained(
    "openlm-research/open_llama_3b_600bt_preview",
    add_bos_token=True,
    add_eos_token=True,
    bos_token=AddedToken("<s>", single_word=True),
    eos_token=AddedToken("</s>", single_word=True),
    unk_token=AddedToken("<unk>", single_word=True),
    pad_token=AddedToken("<unk>", single_word=True),
)
tokenizer.push_to_hub("open_llama_3b_600bt_preview")
```
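
Once pushed, the fast tokenizer can be loaded straight through `AutoTokenizer`. A minimal sketch, with `your-username` as a hypothetical placeholder for wherever the tokenizer was pushed:

```python
from transformers import AutoTokenizer

# Loads in seconds: AutoTokenizer picks up the serialized fast tokenizer
# (tokenizer.json) instead of re-converting the slow SentencePiece model.
# "your-username" is a placeholder, not a real repo id.
tokenizer = AutoTokenizer.from_pretrained("your-username/open_llama_3b_600bt_preview")
print(type(tokenizer).__name__)  # LlamaTokenizerFast
```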

2) The stock `AutoTokenizer` does not recognize the BOS, EOS, and UNK tokens: every tokenization wrongly prepends and appends token id 0 (`<unk>`), when it should prepend 1 (`<s>`) and append 2 (`</s>`). See the sanity check after this list.

3) Manually added the BOS `<s>`, EOS `</s>`, and UNK `<unk>` tokens, with the PAD (padding) token also set to `<unk>`.
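
A minimal sanity check for points 2) and 3), assuming the tokenizer loaded above and the standard OpenLLaMA vocabulary ids (`<unk>` = 0, `<s>` = 1, `</s>` = 2):

```python
ids = tokenizer("hello world").input_ids

# With add_bos_token/add_eos_token enabled, encodings should start with the
# BOS id (1) and end with the EOS id (2), not with the UNK id (0).
assert ids[0] == tokenizer.bos_token_id == 1
assert ids[-1] == tokenizer.eos_token_id == 2

# PAD deliberately reuses the <unk> token (id 0).
assert tokenizer.pad_token_id == tokenizer.unk_token_id == 0
```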