Update README.md
README.md
@@ -67,7 +67,7 @@ We adopt the architecture of FLM-101B as the backbone for Tele-FLM, with several
 - SwiGLU for activation function
 - Linear bias disabled
 - Embedding and language model head untied
-- Input and output
+- Input and output multiplier
 
 Consequently, Tele-FLM is largely compatible with Llama architecturally.
 To maximize convenience for the community, we made minimal adjustments to Llama's code to adapt it to Tele-FLM and released it as open source.
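The four bullets in the hunk above correspond to concrete Transformer design choices. The sketch below is a minimal PyTorch illustration of all four in one place; it is not the released Tele-FLM code, and every name in it (`SwiGLU`, `TinyLM`, `input_mult`, `output_mult`, the layer sizes) is an illustrative assumption. In particular, the input and output multipliers are shown here as μP-style scalar factors applied to the embeddings and the logits, which is one common reading of that bullet.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SwiGLU(nn.Module):
    """SwiGLU feed-forward block: silu(x W_gate) * (x W_up), projected down."""

    def __init__(self, hidden: int, intermediate: int):
        super().__init__()
        # Linear bias disabled throughout, matching the second bullet.
        self.gate = nn.Linear(hidden, intermediate, bias=False)
        self.up = nn.Linear(hidden, intermediate, bias=False)
        self.down = nn.Linear(intermediate, hidden, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.down(F.silu(self.gate(x)) * self.up(x))


class TinyLM(nn.Module):
    """Toy causal-LM skeleton showing the four listed modifications."""

    def __init__(self, vocab: int = 32000, hidden: int = 512,
                 input_mult: float = 1.0, output_mult: float = 1.0):
        super().__init__()
        self.input_mult = input_mult    # scales embeddings on the way in
        self.output_mult = output_mult  # scales logits on the way out
        self.embed = nn.Embedding(vocab, hidden)
        self.mlp = SwiGLU(hidden, 4 * hidden)
        # Untied head: its weight is independent of self.embed.weight.
        self.lm_head = nn.Linear(hidden, vocab, bias=False)

    def forward(self, input_ids: torch.Tensor) -> torch.Tensor:
        h = self.embed(input_ids) * self.input_mult
        h = h + self.mlp(h)  # stand-in for the full attention/MLP stack
        return self.lm_head(h) * self.output_mult
```

A quick sanity check: `TinyLM()(torch.randint(0, 32000, (1, 8)))` returns logits of shape `(1, 8, 32000)`. Because the multipliers and the bias-free, untied layout sit at the embedding and head boundaries, the interior blocks can follow Llama's structure unchanged, which is what makes the minimal-adjustment adaptation described above possible.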