Update README.md
README.md CHANGED
@@ -1,14 +1,14 @@
 ---
 license: apache-2.0
 inference: false
-tags: [green, llmware-rag, p1,
+tags: [green, llmware-rag, p1, onnx]
 ---
 
-# bling-tiny-llama-
+# bling-tiny-llama-onnx
 
 <!-- Provide a quick summary of what the model is/does. -->
 
-**bling-tiny-llama-
+**bling-tiny-llama-onnx** is an ONNX int4 quantized version of BLING Tiny-Llama 1B, providing a very fast, very small inference implementation, optimized for AI PCs using Intel GPU, CPU and NPU.
 
 [**bling-tiny-llama**](https://huggingface.co/llmware/bling-tiny-llama-v0) is a fact-based question-answering model, optimized for complex business documents.
 
@@ -16,23 +16,20 @@ Get started right away
 
 1. Install dependencies
 
-```
+```bash
 pip3 install llmware
-pip3 install
-pip3 install openvino_genai
+pip3 install onnxruntime_genai
 ```
 
 2. Hello World
 
-```
+```python
 from llmware.models import ModelCatalog
-model = ModelCatalog().load_model("bling-tiny-llama-
+model = ModelCatalog().load_model("bling-tiny-llama-onnx")
 response = model.inference("The stock price is $45.\nWhat is the stock price?")
 print("response: ", response)
 ```
 
-Get started right away with [OpenVino](https://github.com/openvinotoolkit/openvino)
-
 Looking for AI PC solutions and demos, contact us at [llmware](https://www.llmware.ai)
 
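The Hello World step above prints the raw response object. A minimal sketch of pulling out just the answer text, assuming (as in recent llmware versions) that `model.inference()` returns a dict whose generated text sits under an `"llm_response"` key; the sample dict below is illustrative, not real model output:

```python
# Hypothetical response shaped like an llmware inference result
# (assumption: generated text lives under "llm_response").
sample_response = {
    "llm_response": " $45. ",             # illustrative generated text
    "usage": {"input": 18, "output": 5},  # illustrative token counts
}

def extract_answer(response: dict) -> str:
    """Return the generated text from an inference response, trimmed."""
    return str(response.get("llm_response", "")).strip()

print(extract_answer(sample_response))  # prints "$45."
```

Trimming matters because small instruct models often pad the answer with leading or trailing whitespace; check the exact response keys against the llmware version you installed.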