Model Overview

Lumo-8B-Instruct is an AI model designed specifically to empower developers and users within the Solana ecosystem. Built on the LLaMa 3.1 8B parameter language model, Lumo is fine-tuned on a comprehensive dataset of Solana-related questions and answers, enabling it to provide detailed assistance across a range of Solana-focused domains.

Lumo is the first fine-tuned model launched specifically for the Solana ecosystem.

About the Model

import torch
from transformers import LlamaForCausalLM, AutoTokenizer
from llama_recipes.configs import train_config as TRAIN_CONFIG

train_config = TRAIN_CONFIG()
train_config.model_name = "meta-llama/Meta-Llama-3.1-8B-Instruct"
train_config.num_epochs = 2
train_config.run_validation = False
train_config.gradient_accumulation_steps = 4
train_config.batch_size_training = 1
train_config.lr = 3e-4
train_config.use_fast_kernels = True
train_config.use_fp16 = True
train_config.context_length = 4096
train_config.batching_strategy = "packing"
train_config.output_dir = "Lumo-8B-Instruct"

from transformers import BitsAndBytesConfig
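# 8-bit quantization keeps the frozen 8B base weights within single-GPU memory during fine-tuning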
config = BitsAndBytesConfig(
    load_in_8bit=True,
)

model = LlamaForCausalLM.from_pretrained(
    train_config.model_name,
    device_map="auto",
    quantization_config=config,
    use_cache=False,
    attn_implementation="sdpa" if train_config.use_fast_kernels else None,
    torch_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(train_config.model_name)
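# Llama tokenizers do not define a pad token by default, so reuse the EOS token for padding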
tokenizer.pad_token = tokenizer.eos_token
  • Base Model: Lumo is built on the LLaMa 3.1 8B parameter model, a state-of-the-art decoder-only transformer architecture known for its strong language generation capabilities.

    • Key Architectural Features:

      • Transformer Architecture: Lumo leverages the attention mechanism of transformers to effectively capture long-range dependencies within the input sequence and generate coherent and contextually relevant responses.

      • Decoder-Only Model: Lumo is designed as a decoder-only model, focusing on generating text outputs from given inputs, which makes it well-suited for tasks like text completion, summarization, and question answering (a minimal generation sketch follows this list).

      • 8 Billion Parameters: The model boasts 8 billion parameters, enabling it to learn complex patterns and relationships within the data and generate highly sophisticated outputs.
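
To make the decoder-only behaviour concrete, here is a minimal generation sketch. It sits outside the fine-tuning pipeline shown on this page; the prompt and generation settings are illustrative, and the model name is the same base checkpoint used in the training configuration above.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# A decoder-only model continues the input sequence one token at a time.
base_name = "meta-llama/Meta-Llama-3.1-8B-Instruct"
tok = AutoTokenizer.from_pretrained(base_name)
lm = AutoModelForCausalLM.from_pretrained(
    base_name,
    torch_dtype=torch.float16,
    device_map="auto",
)

prompt = "Proof-of-History in Solana is"
inputs = tok(prompt, return_tensors="pt").to(lm.device)
with torch.no_grad():
    out = lm.generate(**inputs, max_new_tokens=64, do_sample=False)
print(tok.decode(out[0], skip_special_tokens=True))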

from peft import get_peft_model, prepare_model_for_kbit_training, LoraConfig
from dataclasses import asdict
from llama_recipes.configs import lora_config as LORA_CONFIG

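# LoRA hyperparameters: low-rank adapters of rank 8 with scaling alpha 32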
lora_config = LORA_CONFIG()
lora_config.r = 8
lora_config.lora_alpha = 32
lora_config.lora_dropout = 0.01

peft_config = LoraConfig(**asdict(lora_config))

model = prepare_model_for_kbit_training(model)
model = get_peft_model(model, peft_config)

  • Fine-tuning: To specialize Lumo for the Solana ecosystem, the base model is fine-tuned on the Lumo-8B-DS-Instruct dataset. This dataset comprises 28,518 high-quality question-answer pairs curated specifically for Solana, covering a wide range of topics (a data-loading sketch follows this list):

    • Solana Fundamentals: Blockchain architecture, consensus mechanisms (Proof-of-History, Proof-of-Stake), tokenomics.

    • Development: On-chain program (smart contract) development, primarily in Rust, interacting with the Solana RPC API, and using Solana developer tools.

    • Ecosystem: DeFi protocols, NFTs, dApps, governance, and the broader Solana ecosystem.

    • Technical Concepts: Cryptographic algorithms used in Solana (e.g., Ed25519), data structures (e.g., Merkle trees).
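
As a rough illustration of how such a Q&A dataset can be pulled into the pipeline, the sketch below loads it with the Hugging Face datasets library and reshapes each pair into chat messages. The repository id and the column names ("question"/"answer") are assumptions rather than confirmed details; check the published Lumo-8B-DS-Instruct dataset card and adjust accordingly.

from datasets import load_dataset

# Placeholder repo id: replace with the actual Lumo-8B-DS-Instruct repository on Hugging Face.
DATASET_REPO = "<huggingface-org>/Lumo-8B-DS-Instruct"

dataset = load_dataset(DATASET_REPO, split="train")

def to_chat(example):
    # Column names are assumed; adapt them to the real dataset schema.
    return {
        "messages": [
            {"role": "user", "content": example["question"]},
            {"role": "assistant", "content": example["answer"]},
        ]
    }

chat_dataset = dataset.map(to_chat)
print(chat_dataset[0]["messages"])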

import torch.optim as optim
import wandb
from llama_recipes.utils.train_utils import train
from torch.optim.lr_scheduler import StepLR

# Initialize Weights & Biases run tracking
wandb_run = wandb.init(project="finetune-llama-lumo-8b")
wandb_run.config.update(train_config)

model.train()

# AdamW optimizer using the learning rate and weight decay from train_config
optimizer = optim.AdamW(
    model.parameters(),
    lr=train_config.lr,
    weight_decay=train_config.weight_decay,
)

# Step-decay learning-rate schedule
scheduler = StepLR(optimizer, step_size=1, gamma=train_config.gamma)

# Start training; train_dataloader and eval_dataloader are assumed to be built from the Lumo-8B-DS-Instruct dataset (construction not shown here)
results = train(
    model,
    train_dataloader,
    eval_dataloader,
    tokenizer,
    optimizer,
    scheduler,
    train_config.gradient_accumulation_steps,
    train_config,
    None,  # fsdp_config (FSDP not used here)
    None,  # local_rank
    None,  # rank
    wandb_run,
)

  • Parameter-Efficient Fine-Tuning (PEFT): To optimize the fine-tuning process and enhance efficiency, Lumo employs PEFT techniques. Specifically, we utilize LoRA (Low-Rank Adaptation), a method that introduces trainable rank-decomposition matrices into the model's attention layers (a short verification sketch follows this list).

    • LoRA Parameters:

      • Rank: 8 (r = 8)

      • Alpha: 32 (alpha = 32)

      • Dropout: 0.01

    • Benefits of LoRA:

      • Reduced Training Time: Trains significantly faster than fine-tuning all model parameters.

      • Reduced Memory Footprint: Requires significantly less memory during training.

      • Preserves Pre-trained Knowledge: Minimizes the risk of catastrophic forgetting, where the model loses its pre-trained knowledge during fine-tuning.
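
The memory savings are straightforward to verify: once the model has been wrapped with get_peft_model, PEFT can report how few parameters are actually trainable, and the adapter weights can be saved on their own. The short sketch below reuses the model, tokenizer, and train_config objects defined above; saving is not shown in the original snippets and is included here only as an assumption about the natural next step.

# Only the LoRA adapter matrices are trainable; the 8B base weights stay frozen.
model.print_trainable_parameters()  # prints trainable vs. total parameter counts

# Persist just the adapter (and the tokenizer) after training.
model.save_pretrained(train_config.output_dir)
tokenizer.save_pretrained(train_config.output_dir)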

Check out the model

Lumo-8B-Instruct is open source and deployed on Hugging Face; see the embed below to check out the model.
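
For a quick start, the snippet below sketches how the published model could be loaded and queried with transformers. The repository id is a placeholder (use the real one from the embed), and the prompt and generation settings are illustrative.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

REPO_ID = "<huggingface-org>/Lumo-8B-Instruct"  # placeholder: take the repo id from the embed

tokenizer = AutoTokenizer.from_pretrained(REPO_ID)
model = AutoModelForCausalLM.from_pretrained(
    REPO_ID,
    torch_dtype=torch.float16,
    device_map="auto",
)

messages = [{"role": "user", "content": "How do accounts work on Solana?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))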
