# Model Overview

<figure><img src="https://3886194858-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FaIMUIIIWTZWKkkBAT0Fc%2Fuploads%2FSN47zXEj6lMuenfinvNX%2F1701735656-hero-solana%20(1).svg?alt=media&#x26;token=83c52c97-f9f7-43e2-a187-ca7c86d99b17" alt=""><figcaption></figcaption></figure>

**Lumo-8B-Instruct** is the first-ever cutting-edge AI model specifically designed to empower developers and users within the Solana ecosystem. Built upon the foundation of the robust LLaMa 3.1 8B parameter language model, Lumo is fine-tuned on a comprehensive dataset of Solana-related questions and answers, enabling it to provide exceptional assistance in various domains.

{% hint style="info" %}
**Lumo is the first ever to launch a fine-tuned model tailored for the Solana ecosystem.**
{% endhint %}

### About the Model

```python
import torch
from transformers import LlamaForCausalLM, AutoTokenizer
from llama_recipes.configs import train_config as TRAIN_CONFIG

train_config = TRAIN_CONFIG()
train_config.model_name = "meta-llama/Meta-Llama-3.1-8B-Instruct"
train_config.num_epochs = 2
train_config.run_validation = False
train_config.gradient_accumulation_steps = 4
train_config.batch_size_training = 1
train_config.lr = 3e-4
train_config.use_fast_kernels = True
train_config.use_fp16 = True
train_config.context_length = 4096
train_config.batching_strategy = "packing"
train_config.output_dir = "Lumo-8B-Instruct"

from transformers import BitsAndBytesConfig
config = BitsAndBytesConfig(
    load_in_8bit=True,
)

model = LlamaForCausalLM.from_pretrained(
    train_config.model_name,
    device_map="auto",
    quantization_config=config,
    use_cache=False,
    attn_implementation="sdpa" if train_config.use_fast_kernels else None,
    torch_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(train_config.model_name)
tokenizer.pad_token = tokenizer.eos_token
```

* **Base Model:** Lumo is founded upon the LLaMa 3.1 8B parameter model, a state-of-the-art decoder-only transformer architecture renowned for its exceptional language generation capabilities.
  * **Key Architectural Features:**
    * **Transformer Architecture:** Lumo leverages the attention mechanism of transformers to effectively capture long-range dependencies within the input sequence and generate coherent and contextually relevant responses.
    * **Decoder-Only Model:** Lumo is designed as a decoder-only model, focusing on generating text outputs based on given inputs, making it well-suited for tasks like text completion, summarization, and question answering.
    * **8 Billion Parameters:** The model boasts 8 billion parameters, enabling it to learn complex patterns and relationships within the data and generate highly sophisticated outputs.

```python
from peft import get_peft_model, prepare_model_for_kbit_training, LoraConfig
from dataclasses import asdict
from llama_recipes.configs import lora_config as LORA_CONFIG

lora_config = LORA_CONFIG()
lora_config.r = 8
lora_config.lora_alpha = 32
lora_dropout: float = 0.01

peft_config = LoraConfig(**asdict(lora_config))

model = prepare_model_for_kbit_training(model)
model = get_peft_model(model, peft_config)
```

* **Fine-tuning:** To specialize Lumo for the Solana ecosystem, the base model undergoes a fine-tuning process on the Lumo-8B-DS-Instruct dataset. This dataset comprises over 28,518 high-quality question-answer pairs specifically curated for Solana, covering a wide range of topics:
  * **Solana Fundamentals:** Blockchain architecture, consensus mechanisms (Proof-of-History, Proof-of-Stake), tokenomics.
  * **Development:** Smart contract development (using languages like Rust, Solidity), interacting with the Solana RPC, using Solana developer tools.
  * **Ecosystem:** DeFi protocols, NFTs, dApps, governance, and the broader Solana ecosystem.
  * **Technical Concepts:** Cryptography, cryptography algorithms used in Solana (e.g., Ed25519), data structures (e.g., Merkle trees).

```python
import torch.optim as optim
from llama_recipes.utils.train_utils import train
from torch.optim.lr_scheduler import StepLR

# Initialize wandb with new project name
wandb_run = wandb.init(project="finetune-llama-lumo-8b")
wandb_run.config.update(train_config)

model.train()

# Initialize optimizer with potentially adjusted hyperparameters for Lumo dataset
optimizer = optim.AdamW(
    model.parameters(),
    lr=train_config.lr,
    weight_decay=train_config.weight_decay,
)

# Keep the same scheduler structure
scheduler = StepLR(optimizer, step_size=1, gamma=train_config.gamma)

# Start training with updated dataloaders
results = train(
    model,
    train_dataloader,
    eval_dataloader,
    tokenizer,
    optimizer,
    scheduler,
    train_config.gradient_accumulation_steps,
    train_config,
    None,
    None,
    None,
    wandb_run,
)
```

* **Parameter-Efficient Fine-Tuning (PEFT):** To optimize the fine-tuning process and enhance efficiency, Lumo employs PEFT techniques. Specifically, we utilize **LoRA (Low-Rank Adaptation)**, a method that introduces trainable rank-decomposition matrices to the model's attention layers.
  * **LoRA Parameters:**
    * **Rank:** 8 (r = 8)
    * **Alpha:** 32 (alpha = 32)
    * **Dropout:** 0.01
  * **Benefits of LoRA:**
    * **Reduced Training Time:** Trains significantly faster than fine-tuning all model parameters.
    * **Reduced Memory Footprint:** Requires significantly less memory during training.
    * **Preserves Pre-trained Knowledge:** Minimizes the risk of catastrophic forgetting, where the model loses its pre-trained knowledge during fine-tuning.

### Check out the model

Lumo-8B-Instruct is open-source, and deployed on HuggingFace, click the embedding below to check out the model.

{% embed url="<https://huggingface.co/lumolabs-ai/Lumo-8B-Instruct>" %}
