bew/t5_sentence_to_triplet_xl

Text2Text Generation

bew committed on Jul 1, 2023

Commit 486de62 · 1 Parent(s): 1711062

Update README.md

Files changed (1)

README.md +18 -3

@@ -8,11 +8,26 @@ library_name: peft
 pipeline_tag: text2text-generation
 ---
-This is a model trained on the [KELM Corpus](https://github.com/google-research-datasets/KELM-corpus) to take in sentences and output triplets of the form `subject-relation-object` to be used for knowledge graph generation.
 The model uses custom tokens to delimit triplets:
 ```
 special_tokens = ['<triplet>', '</triplet>', '<relation>', '<object>']
 tokenizer.add_tokens(special_tokens)
-```

 pipeline_tag: text2text-generation
 ---
+This is a version of `flan-t5-xl` fine-tuned on the [KELM Corpus](https://github.com/google-research-datasets/KELM-corpus) to take in sentences and output triplets of the form `subject-relation-object` to be used for knowledge graph generation.
 The model uses custom tokens to delimit triplets:
 ```
 special_tokens = ['<triplet>', '</triplet>', '<relation>', '<object>']
 tokenizer.add_tokens(special_tokens)
+```
+You can use it like this:
+```
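+import torch
+
+# Assumes `model` and `tokenizer` are already loaded, with the special tokens above added.
+device = "cuda" if torch.cuda.is_available() else "cpu"
+model = model.to(device)
+model.eval()
+new_input = "Hugging Face, Inc. is an American company that develops tools for building applications using machine learning."
+inputs = tokenizer(new_input, return_tensors="pt")
+with torch.no_grad():
+    # Keep the inputs on the same device as the model.
+    outputs = model.generate(input_ids=inputs["input_ids"].to(device))
+# skip_special_tokens=False leaves <pad> and </s> in the decoded string.
+print(tokenizer.batch_decode(outputs.detach().cpu().numpy(), skip_special_tokens=False)[0])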
+```
+Output: `<pad><triplet> Hugging Face <relation> instance of <object> Business </triplet></s>`
+This model still isn't perfect, and may make mistakes! I'm working on fine-tuning it for longer and on a more diverse set of data.
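
For knowledge graph generation, the decoded string still has to be turned into structured triplets. The following is a minimal sketch of one way to do that, assuming every triplet follows the `<triplet> subject <relation> relation <object> object </triplet>` layout of the single example output shown above. The helper name `parse_triplets` is hypothetical and is not part of the model repository.

```python
import re

# Assumed layout, inferred from the one example output in the README:
#   <triplet> SUBJECT <relation> RELATION <object> OBJECT </triplet>
TRIPLET_RE = re.compile(
    r"<triplet>(.*?)<relation>(.*?)<object>(.*?)</triplet>",
    re.DOTALL,
)


def parse_triplets(decoded):
    """Return a list of (subject, relation, object) tuples found in `decoded`."""
    return [
        tuple(part.strip() for part in match.groups())
        for match in TRIPLET_RE.finditer(decoded)
    ]


example = "<pad><triplet> Hugging Face <relation> instance of <object> Business </triplet></s>"
print(parse_triplets(example))
# [('Hugging Face', 'instance of', 'Business')]
```

Text outside a complete `<triplet> ... </triplet>` span, such as `<pad>` and `</s>`, is ignored, so a malformed or truncated generation yields an empty list rather than an error.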