Model Overview
Lumo-8B-Instruct is the first AI model designed specifically to empower developers and users within the Solana ecosystem. Built on the robust LLaMa 3.1 8B parameter language model, Lumo is fine-tuned on a comprehensive dataset of Solana-related questions and answers, enabling it to provide exceptional assistance across a wide range of domains.
About the Model
import torch
from transformers import LlamaForCausalLM, AutoTokenizer
from llama_recipes.configs import train_config as TRAIN_CONFIG

# Training configuration for the fine-tuning run
train_config = TRAIN_CONFIG()
train_config.model_name = "meta-llama/Meta-Llama-3.1-8B-Instruct"
train_config.num_epochs = 2
train_config.run_validation = False
train_config.gradient_accumulation_steps = 4
train_config.batch_size_training = 1
train_config.lr = 3e-4
train_config.use_fast_kernels = True
train_config.use_fp16 = True
train_config.context_length = 4096
train_config.batching_strategy = "packing"
train_config.output_dir = "Lumo-8B-Instruct"

from transformers import BitsAndBytesConfig

# Load the base model weights in 8-bit to fit on a single GPU
config = BitsAndBytesConfig(
    load_in_8bit=True,
)

model = LlamaForCausalLM.from_pretrained(
    train_config.model_name,
    device_map="auto",
    quantization_config=config,
    use_cache=False,
    attn_implementation="sdpa" if train_config.use_fast_kernels else None,
    torch_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(train_config.model_name)
tokenizer.pad_token = tokenizer.eos_token
Base Model: Lumo is founded upon the LLaMa 3.1 8B parameter model, a state-of-the-art decoder-only transformer architecture renowned for its exceptional language generation capabilities.
Key Architectural Features:
Transformer Architecture: Lumo leverages the attention mechanism of transformers to effectively capture long-range dependencies within the input sequence and generate coherent and contextually relevant responses.
Decoder-Only Model: Lumo is designed as a decoder-only model, focusing on generating text outputs based on given inputs, making it well-suited for tasks like text completion, summarization, and question answering.
8 Billion Parameters: With 8 billion parameters, the model can learn complex patterns and relationships in the data and generate highly sophisticated outputs (see the memory sketch below).
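For a rough sense of why the loading code above uses 8-bit quantization, here is a back-of-the-envelope memory estimate for the 8B-parameter weights alone. The figures are illustrative and exclude activations, optimizer state, and the KV cache:

# Illustrative only: memory needed just to hold 8B parameters in memory
params = 8e9

fp16_gb = params * 2 / 1e9   # 2 bytes per parameter -> ~16 GB
int8_gb = params * 1 / 1e9   # 1 byte per parameter  -> ~8 GB

print(f"fp16 weights: ~{fp16_gb:.0f} GB, 8-bit weights: ~{int8_gb:.0f} GB")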
from peft import get_peft_model, prepare_model_for_kbit_training, LoraConfig
from dataclasses import asdict
from llama_recipes.configs import lora_config as LORA_CONFIG

# LoRA hyperparameters: rank 8, alpha 32, dropout 0.01
lora_config = LORA_CONFIG()
lora_config.r = 8
lora_config.lora_alpha = 32
lora_config.lora_dropout = 0.01

peft_config = LoraConfig(**asdict(lora_config))

# Prepare the 8-bit model for training and attach the LoRA adapters
model = prepare_model_for_kbit_training(model)
model = get_peft_model(model, peft_config)
Fine-tuning: To specialize Lumo for the Solana ecosystem, the base model is fine-tuned on the Lumo-8B-DS-Instruct dataset. This dataset comprises over 28,518 high-quality question-answer pairs curated specifically for Solana, covering a wide range of topics (a formatting sketch follows this list):
Solana Fundamentals: Blockchain architecture, consensus mechanisms (Proof-of-History, Proof-of-Stake), tokenomics.
Development: Smart contract (program) development, primarily in Rust, interacting with the Solana RPC, and using Solana developer tools.
Ecosystem: DeFi protocols, NFTs, dApps, governance, and the broader Solana ecosystem.
Technical Concepts: Cryptographic algorithms used in Solana (e.g., Ed25519) and data structures (e.g., Merkle trees).
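The exact schema of Lumo-8B-DS-Instruct is not reproduced here, so the field names below ("question", "answer") and the sample pair are assumptions. The sketch only illustrates how one Q&A pair could be rendered into the Llama 3.1 chat format for supervised fine-tuning, using the tokenizer loaded earlier:

# Illustrative only: assumed field names, not the published dataset schema
example = {
    "question": "What consensus mechanisms does Solana combine?",
    "answer": "Solana combines Proof-of-History with Proof-of-Stake...",
}

messages = [
    {"role": "user", "content": example["question"]},
    {"role": "assistant", "content": example["answer"]},
]

# apply_chat_template renders the pair in the model's native instruction format
text = tokenizer.apply_chat_template(messages, tokenize=False)
print(text)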
import torch.optim as optim
import wandb
from llama_recipes.utils.train_utils import train
from torch.optim.lr_scheduler import StepLR

# Initialize wandb for experiment tracking
wandb_run = wandb.init(project="finetune-llama-lumo-8b")
wandb_run.config.update(train_config)

model.train()

# Optimizer with hyperparameters adjusted for the Lumo dataset
optimizer = optim.AdamW(
    model.parameters(),
    lr=train_config.lr,
    weight_decay=train_config.weight_decay,
)

# Step-decay learning-rate scheduler
scheduler = StepLR(optimizer, step_size=1, gamma=train_config.gamma)

# Start training. train_dataloader and eval_dataloader are the dataloaders built
# from the Lumo-8B-DS-Instruct dataset; the three None arguments correspond to the
# unused fsdp_config, local_rank, and rank parameters of llama_recipes' train().
results = train(
    model,
    train_dataloader,
    eval_dataloader,
    tokenizer,
    optimizer,
    scheduler,
    train_config.gradient_accumulation_steps,
    train_config,
    None,
    None,
    None,
    wandb_run,
)
Parameter-Efficient Fine-Tuning (PEFT): To optimize the fine-tuning process and enhance efficiency, Lumo employs PEFT techniques. Specifically, we utilize LoRA (Low-Rank Adaptation), a method that introduces trainable rank-decomposition matrices into the model's attention layers (a parameter-count sketch follows the settings below).
LoRA Parameters:
Rank: 8 (r = 8)
Alpha: 32 (alpha = 32)
Dropout: 0.01
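As a rough illustration of how lightweight these settings are, the sketch below estimates the trainable LoRA parameter count. It assumes the adapters target the q_proj and v_proj attention projections (the llama-recipes default; the exact target modules used for Lumo are not stated here) and uses Llama 3.1 8B's published dimensions (32 decoder layers, hidden size 4096, 8 KV heads with head dimension 128):

# Illustrative estimate, assuming LoRA targets q_proj and v_proj with r = 8
r, layers, hidden, kv_dim = 8, 32, 4096, 1024  # kv_dim = 8 KV heads * head_dim 128

per_layer = r * (hidden + hidden)   # q_proj: A is r x 4096, B is 4096 x r
per_layer += r * (hidden + kv_dim)  # v_proj: A is r x 4096, B is 1024 x r

trainable = per_layer * layers
print(f"~{trainable / 1e6:.1f}M trainable LoRA parameters")  # ~3.4M
print(f"share of the 8B base weights: {trainable / 8e9:.4%}")  # ~0.04%

Under these assumptions, only a few million parameters (about 0.04% of the 8 billion base weights) are updated, which is what drives the training-time and memory savings listed below.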
Benefits of LoRA:
Reduced Training Time: Trains significantly faster than fine-tuning all model parameters.
Reduced Memory Footprint: Requires significantly less memory during training.
Preserves Pre-trained Knowledge: Minimizes the risk of catastrophic forgetting, where the model loses its pre-trained knowledge during fine-tuning.
Check out the model
Lumo-8B-Instruct is open source and deployed on Hugging Face; click the embed below to check out the model.
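For a quick start, here is a minimal inference sketch. The repository id lumolabs-ai/Lumo-8B-Instruct is an assumed path; substitute the actual id shown on the Hugging Face page.

# Minimal inference sketch; the repo id below is an assumption, adjust as needed
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "lumolabs-ai/Lumo-8B-Instruct"  # hypothetical path

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", torch_dtype=torch.float16)

messages = [{"role": "user", "content": "What is Proof-of-History on Solana?"}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)

output = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))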