doberst committed · verified
Commit 449ac8b · Parent: bce4718

Update README.md

Files changed (1): README.md (+7 −10)
README.md CHANGED

@@ -1,14 +1,14 @@
 ---
 license: apache-2.0
 inference: false
-tags: [green, llmware-rag, p1, ov]
+tags: [green, llmware-rag, p1, onnx]
 ---
 
-# bling-tiny-llama-ov
+# bling-tiny-llama-onnx
 
 <!-- Provide a quick summary of what the model is/does. -->
 
-**bling-tiny-llama-ov** is an OpenVino int4 quantized version of BLING Tiny-Llama 1B, providing a very fast, very small inference implementation, optimized for AI PCs using Intel GPU, CPU and NPU.
+**bling-tiny-llama-onnx** is an ONNX int4 quantized version of BLING Tiny-Llama 1B, providing a very fast, very small inference implementation, optimized for AI PCs using Intel GPU, CPU and NPU.
 
 [**bling-tiny-llama**](https://huggingface.co/llmware/bling-tiny-llama-v0) is a fact-based question-answering model, optimized for complex business documents.
 
@@ -16,23 +16,20 @@ Get started right away
 
 1. Install dependencies
 
-```
+```bash
 pip3 install llmware
-pip3 install openvino
-pip3 install openvino_genai
+pip3 install onnxruntime_genai
 ```
 
 2. Hello World
 
-```
+```python
 from llmware.models import ModelCatalog
-model = ModelCatalog().load_model("bling-tiny-llama-ov")
+model = ModelCatalog().load_model("bling-tiny-llama-onnx")
 response = model.inference("The stock price is $45.\nWhat is the stock price?")
 print("response: ", response)
 ```
 
-Get started right away with [OpenVino](https://github.com/openvinotoolkit/openvino)
-
 Looking for AI PC solutions and demos, contact us at [llmware](https://www.llmware.ai)
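The Hello World snippet in the updated README can be extended into a small batch Q&A helper. The following is a sketch, not part of the commit: the prompt layout (context passage, newline, question) mirrors the README's example, but the `build_prompts`, `answer_all`, and `load_bling_onnx` helper names are hypothetical, and the assumption that `model.inference(...)` returns a dict containing an `"llm_response"` key is mine, not stated in the README.

```python
# Sketch: asking several fact-based questions against one passage with
# bling-tiny-llama-onnx. Prompt layout (context + "\n" + question) follows
# the README's Hello World; the "llm_response" key is an assumption about
# llmware's return value.

def build_prompts(context: str, questions: list) -> list:
    """Pair one context passage with each question, README-style."""
    return [f"{context}\n{q}" for q in questions]

def answer_all(model, prompts: list) -> list:
    """Run each prompt through an already-loaded llmware model."""
    return [model.inference(p)["llm_response"] for p in prompts]

def load_bling_onnx():
    """Hypothetical helper: load the model as shown in the README."""
    from llmware.models import ModelCatalog  # requires: pip3 install llmware
    return ModelCatalog().load_model("bling-tiny-llama-onnx")

context = "The stock price is $45. The company is headquartered in Austin."
questions = ["What is the stock price?", "Where is the company headquartered?"]
prompts = build_prompts(context, questions)
# answers = answer_all(load_bling_onnx(), prompts)  # uncomment with llmware installed
```

Note that `load_bling_onnx()` is only invoked in the commented-out line, so the snippet does not require `llmware` to be installed just to construct the prompts.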