---
tags:
- text-generation-inference
- transformers
- facebook
- meta
- pytorch
- gguf
- reasoning
- context-dynamic
- small-models
- synthetic-data
- function-calls
- open-source
- llama
- NeuraLake
- 🇧🇷
- 256K
license: apache-2.0
model_creator: Celso H A Diniz
model_name: iSA-02-Nano-1B-Preview
---

**⚠️ Experimental Release Notice:** This model is in an **experimental phase** on Hugging Face and is **still undergoing training**. Expect further enhancements and updates in the coming weeks.

# NeuraLake iSA-02 Series: Advanced Small-Scale Reasoning Models

## Overview

The **NeuraLake iSA-02 Series** comprises compact reasoning models optimized for efficient logical processing in resource-constrained environments. Designed for applications requiring nuanced decision-making and complex problem-solving, these models balance performance with computational efficiency.

## Release Information

Model weights for each variant (1B, 2B, 3B, and 7B parameters) will be released after comprehensive training and optimization, to ensure high performance and safety standards.

# iSA-02-Nano-1B-Preview v1.1 (**No Structured Tags Variant**)

The **iSA-02-Nano-1B-Preview** is the latest addition to the iSA-02 series, trained with synthetic data to prioritize **"thinking before speaking."** This focus strengthens its reasoning capabilities, making it well suited for applications that require thoughtful, logical text generation within a compact framework.

### What is a Reasoning Model?

A **reasoning model** simulates human-like logical thinking, enabling the analysis of information, the drawing of inferences, and decision-making based on data. Unlike traditional language models that generate text from surface patterns, reasoning models excel at understanding, planning, and executing multi-step processes.

![image/png](https://cdn-uploads.huggingface.co/production/uploads/67355d00728f9dcf37212c02/whZHzNAYQ6eGtpjJJlUM6.png)

### Name and Inspiration

- **iSA:** Stands for **Intelligent, Small, Autonomous**, reflecting the mission to create compact AI systems with adaptive and intelligent behavior.
- **Development:** Initiated in January 2024, the series emerged from experiments combining diverse datasets, which revealed initial reasoning capabilities in the base model. Unlike models derived from OpenAI, iSA-02 emphasizes unique reasoning enhancements through innovative synthetic data and contextual refinement.

### Lineage

Based on **[meta-llama/Llama-3.2-1B-Instruct](https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct)** and refined with synthetic datasets from **[NeuraLake](https://www.neuralake.com.br)**, the iSA-02-Nano-1B-Preview targets improvements in reasoning, long-context handling, and adaptive behaviors.

## Key Features

- **Extended Context Window:** Supports up to **256K tokens** for complex reasoning and Retrieval-Augmented Generation (RAG).
- **Adaptive Reasoning:** Adjusts reasoning depth based on context size: concise for contexts under 8K tokens, detailed for contexts over 16K tokens.
- **Efficiency Optimized:** Balances advanced reasoning with low computational demands, suitable for resource-limited settings.
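As a quick-start illustration of these features, here is a minimal sketch of running one of the GGUF quantizations (listed in the table below) with the `llama-cpp-python` bindings. The local file name, context size, and thread count are assumptions chosen for this example, not a tested official recipe:

```python
# Minimal sketch: running a GGUF build of iSA-02-Nano-1B-Preview with
# llama-cpp-python. The file name matches the quantization table below;
# the 16K context is an assumption sized for detailed reasoning.
from llama_cpp import Llama

llm = Llama(
    model_path="iSA-02-Nano-1B-NoTags.Q4_K_M.gguf",  # downloaded GGUF file
    n_ctx=16384,    # larger contexts (up to 256K) need proportionally more RAM
    n_threads=8,    # tune to your CPU
)

response = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Walk through the steps to reconcile two conflicting log entries."},
    ],
    max_tokens=1024,
    temperature=0.3,
)
print(response["choices"][0]["message"]["content"])
```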
## Model Specifications

### Architecture

- **Type:** Transformer-based
- **Layers:** 16
- **Hidden Size:** 2048
- **Attention Heads:** 32
- **Feed-Forward Size:** 8192
- **Vocabulary Size:** 128,256

### Training Parameters

- **Precision:** Mixed Precision (fp16)
- **Context Window:**
  - **Text Generation:** 1,024–4,096 tokens
  - **Logical Reasoning:** 16,000–64,000 tokens

### Quantization Versions

| Version | Format | Precision | Parameters | Download |
|---------|------------------|-------|------------|----------|
| F32 | Custom Llama 3.2 | FP32 | 1.24B | [Download](https://huggingface.co/NeuraLakeAi/iSA-02-1B-NoTags-Preview/resolve/main/iSA-02-Nano-1B-NoTags.F32.gguf) |
| F16 | Custom Llama 3.2 | FP16 | 1.24B | [Download](https://huggingface.co/NeuraLakeAi/iSA-02-1B-NoTags-Preview/resolve/main/iSA-02-Nano-1B-NoTags.F16.gguf) |
| Q4_0 | Custom Llama 3.2 | 4-bit | 1.24B | [Download](https://huggingface.co/NeuraLakeAi/iSA-02-1B-NoTags-Preview/resolve/main/iSA-02-Nano-1B-NoTags.Q4_0.gguf) |
| Q4_K_M | Custom Llama 3.2 | 4-bit | 1.24B | [Download](https://huggingface.co/NeuraLakeAi/iSA-02-1B-NoTags-Preview/resolve/main/iSA-02-Nano-1B-NoTags.Q4_K_M.gguf) |
| Q5_K_M | Custom Llama 3.2 | 5-bit | 1.24B | [Download](https://huggingface.co/NeuraLakeAi/iSA-02-1B-NoTags-Preview/resolve/main/iSA-02-Nano-1B-NoTags.Q5_0.gguf) |
| Q8_0 | Custom Llama 3.2 | 8-bit | 1.24B | [Download](https://huggingface.co/NeuraLakeAi/iSA-02-1B-NoTags-Preview/resolve/main/iSA-02-Nano-1B-NoTags.Q8_0.gguf) |

### Hardware Requirements

| Version | Quantization | Size | Memory (RAM/VRAM) |
|---------|--------------|--------|-------------------|
| F32 | FP32 | 4.95 GB | 9.9 GB |
| F16 | FP16 | 2.48 GB | 4.96 GB |
| Q4_0 | 4-bit | 771 MB | 1.56 GB |
| Q4_K_M | 4-bit | 808 MB | 1.62 GB |
| Q5_K_M | 5-bit | 893 MB | 1.84 GB |
| Q8_0 | 8-bit | 1.32 GB | 2.64 GB |

## Training and Fine-Tuning

The iSA-02 series is trained on synthetic datasets tailored to enhance logical reasoning, multi-step task execution, and contextual tool usage, supporting robust performance and adaptive behavior in complex scenarios.

## Use Cases

### Applications

- **Logical Reasoning & Decision-Making:** Generate analytical reports from system logs.
- **Dynamic Tool Integration:** Ideal for long-context RAG tasks such as querying large databases.
- **Structured Content Generation:** Well suited to correcting OCR outputs and filling in missing data.

### Limitations

- **Unsuitable for:**
  - High-throughput text generation.
  - Latency-sensitive applications.
- **Challenges:**
  - Potential biases from synthetic data.
  - Redundant or verbose reasoning.

## Improvements in Version 1.1

- **Enhanced Reasoning:** Faster processing with reduced overthinking.
- **Better Tool Utilization:** More effective use of external tools.
- **Improved Context Understanding:** Aligns actions with user intentions.
- **Reduced Redundancy:** More concise responses.
- **Less Task Aversion:** Fewer refusals of routine tasks.
- **Optimized Context Management:** Efficient handling of the 256K context window.

## Best Practices

### Configuration Recommendations

- **max_tokens:**
  - **Simple Tasks:** 1,024–4,096 tokens
  - **Complex Tasks:** 8,000–16,000 tokens
- **temperature:**
  - **Objective Responses:** 0.1–0.3
  - **Creative Reasoning:** 0.7–1.0
- **top_p:**
  - **Focused Outputs:** 0.85
  - **Precision Tasks:** 0.1
- **stop_sequences:**
  - Use specific sequences such as `"Therefore, the answer is"` to minimize redundancy (see the sketch after this list).
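The following sketch applies these recommended settings through an OpenAI-compatible endpoint, such as a local llama.cpp server. The `base_url`, `api_key`, and model name are placeholders, not published values:

```python
# Hypothetical sketch: applying the recommended sampling settings via an
# OpenAI-compatible endpoint (e.g. a local llama.cpp server). All connection
# details below are placeholders to adapt to your own setup.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed")

completion = client.chat.completions.create(
    model="iSA-02-Nano-1B-Preview",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize the failure pattern in these logs: ..."},
    ],
    max_tokens=4096,                    # simple-task budget: 1,024-4,096 tokens
    temperature=0.2,                    # objective response: 0.1-0.3
    top_p=0.85,                         # focused output
    stop=["Therefore, the answer is"],  # trim redundant closing phrases
)
print(completion.choices[0].message.content)
```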
### Prompt Engineering

- **Simple Tasks:**
  - **Example:** `"You are a helpful assistant."`
- **Complex Tasks:**
  - **Example:** `"Transform OCR outputs into valid JSON; return only the JSON data as output."`
- **Structured Reasoning:** Not applicable to this "No Structured Tags" variant; structured reasoning tags are neither required nor supported.

A worked example combining these prompts with the recommended sampling settings appears at the end of this card.

### Supervision and Monitoring

- **Clear Prompts:** Ensure instructions are specific and unambiguous to reduce errors and redundancy.

## Known Issues (Addressed in V1.1)

- **Task Management:** Improved handling of complex tasks and function calls.
- **Unusual Behavior:** Reduced instances of unsolicited online searches or autonomous interactions.
- **Conversational Redirection:** Enhanced stability in maintaining topic focus.
- **Function Call Execution:** Ensured simulated function calls are actionable.

## Citation

```bibtex
@misc{isa02,
  author  = {NeuraLake},
  title   = {iSA-02: The First Small Reasoning Model with Context-Dynamic Behavior},
  year    = {2024},
  license = {Apache 2.0},
  url     = {https://huggingface.co/NeuraLake/iSA-02},
}
```

**Note:** This model card is under development and will be updated with additional details, evaluation metrics, and the final model name.
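## Example: Putting It Together

To tie the prompt-engineering and configuration guidance together, here is a minimal end-to-end sketch using the Hugging Face `transformers` pipeline. The repository ID is taken from this card's download links and is assumed to also host standard (non-GGUF) weights; treat this as an untested starting point rather than an official recipe:

```python
# Sketch: complex-task prompt (OCR -> JSON) with objective-response sampling.
# The repository ID is assumed from this card's download links; verify that it
# hosts transformers-compatible weights before running.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="NeuraLakeAi/iSA-02-1B-NoTags-Preview",
    torch_dtype="auto",
)

messages = [
    # Complex-task system prompt from the Prompt Engineering section above.
    {"role": "system", "content": "Transform OCR outputs into valid JSON; return only the JSON data as output."},
    # Sample OCR text with typical character confusions (O vs 0).
    {"role": "user", "content": "Invoice No 1O42  Total: 19,90  Date 2O24-O1-15"},
]

result = generator(
    messages,
    max_new_tokens=1024,  # simple-task budget
    temperature=0.2,      # objective response: 0.1-0.3
    top_p=0.85,           # focused output
    do_sample=True,
)
# The pipeline returns the full chat; the last message is the model's reply.
print(result[0]["generated_text"][-1]["content"])
```

For creative reasoning tasks, the recommendations above suggest raising `temperature` toward 0.7–1.0 while keeping the same prompt structure.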