

Model Overview


Lumo-8B-Instruct is the first cutting-edge AI model designed specifically to empower developers and users within the Solana ecosystem. Built on the robust LLaMa 3.1 8B parameter language model, Lumo is fine-tuned on a comprehensive dataset of Solana-related questions and answers, enabling it to provide exceptional assistance across a wide range of domains.

Lumo is the first fine-tuned model launched specifically for the Solana ecosystem.

About the Model

import torch
from transformers import LlamaForCausalLM, AutoTokenizer
from llama_recipes.configs import train_config as TRAIN_CONFIG

# Fine-tuning hyperparameters for Lumo-8B-Instruct
train_config = TRAIN_CONFIG()
train_config.model_name = "meta-llama/Meta-Llama-3.1-8B-Instruct"
train_config.num_epochs = 2
train_config.run_validation = False
train_config.gradient_accumulation_steps = 4
train_config.batch_size_training = 1
train_config.lr = 3e-4
train_config.use_fast_kernels = True
train_config.use_fp16 = True
train_config.context_length = 4096
train_config.batching_strategy = "packing"
train_config.output_dir = "Lumo-8B-Instruct"

# Load the base model with 8-bit quantization to reduce GPU memory usage during fine-tuning
from transformers import BitsAndBytesConfig
config = BitsAndBytesConfig(
    load_in_8bit=True,
)

model = LlamaForCausalLM.from_pretrained(
    train_config.model_name,
    device_map="auto",
    quantization_config=config,
    use_cache=False,
    attn_implementation="sdpa" if train_config.use_fast_kernels else None,
    torch_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(train_config.model_name)
tokenizer.pad_token = tokenizer.eos_token
  • Base Model: Lumo is founded upon the LLaMa 3.1 8B parameter model, a state-of-the-art decoder-only transformer architecture renowned for its exceptional language generation capabilities.

    • Key Architectural Features:

      • Transformer Architecture: Lumo leverages the attention mechanism of transformers to effectively capture long-range dependencies within the input sequence and generate coherent and contextually relevant responses.

      • Decoder-Only Model: Lumo is designed as a decoder-only model, focusing on generating text outputs based on given inputs, making it well-suited for tasks like text completion, summarization, and question answering.

      • 8 Billion Parameters: The model has 8 billion parameters, enabling it to learn complex patterns and relationships within the data and generate highly sophisticated outputs (the short sketch after this list shows how to verify the count on the loaded model).
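
As a quick sanity check, the snippet below (illustrative, continuing from the model object loaded above) prints the parameter count and a couple of architecture details from the config; 8-bit quantization preserves weight shapes, so the count still reflects the full model.

# Illustrative check on the model loaded above: confirm the ~8B parameter count.
total_params = sum(p.numel() for p in model.parameters())
print(f"Total parameters: {total_params / 1e9:.2f}B")   # roughly 8B for LLaMa 3.1 8B

# A few architecture details exposed by the config (decoder-only transformer).
print(model.config.num_hidden_layers)   # 32 decoder layers
print(model.config.hidden_size)         # 4096 hidden dimension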

from peft import get_peft_model, prepare_model_for_kbit_training, LoraConfig
from dataclasses import asdict
from llama_recipes.configs import lora_config as LORA_CONFIG

# LoRA hyperparameters: rank 8, alpha 32, dropout 0.01
lora_config = LORA_CONFIG()
lora_config.r = 8
lora_config.lora_alpha = 32
lora_config.lora_dropout = 0.01

peft_config = LoraConfig(**asdict(lora_config))

# Prepare the quantized model for training and attach the LoRA adapters
model = prepare_model_for_kbit_training(model)
model = get_peft_model(model, peft_config)

  • Fine-tuning: To specialize Lumo for the Solana ecosystem, the base model is fine-tuned on the Lumo-8B-DS-Instruct dataset, which comprises over 28,518 high-quality question-answer pairs curated specifically for Solana and covering a wide range of topics (a minimal loading sketch follows this list):

    • Solana Fundamentals: Blockchain architecture, consensus mechanisms (Proof-of-History, Proof-of-Stake), tokenomics.

    • Development: On-chain program (smart contract) development in Rust, interacting with the Solana RPC, and using Solana developer tools.

    • Ecosystem: DeFi protocols, NFTs, dApps, governance, and the broader Solana ecosystem.

    • Technical Concepts: Cryptographic algorithms used in Solana (e.g., Ed25519), data structures (e.g., Merkle trees).
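
For reference, here is a minimal sketch of loading and inspecting the dataset with the Hugging Face datasets library. The repository id lumolabs-ai/Lumo-8B-DS-Instruct and the field layout are assumptions based on the naming above; see the Lumo Dataset section for the exact location and schema.

from datasets import load_dataset

# Assumed repository id; check the Lumo Dataset pages for the published location.
dataset = load_dataset("lumolabs-ai/Lumo-8B-DS-Instruct", split="train")

print(len(dataset))   # expected to be in the ~28.5k range, per the description above
print(dataset[0])     # one Solana question-answer pair (field names depend on the schema)

The train_dataloader and eval_dataloader passed to train() below are built from this dataset (see the Dataset Preparation page).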

import torch.optim as optim
import wandb
from llama_recipes.utils.train_utils import train
from torch.optim.lr_scheduler import StepLR

# Track the fine-tuning run with Weights & Biases
wandb_run = wandb.init(project="finetune-llama-lumo-8b")
wandb_run.config.update(train_config)

model.train()

# AdamW optimizer using the learning rate and weight decay from train_config
optimizer = optim.AdamW(
    model.parameters(),
    lr=train_config.lr,
    weight_decay=train_config.weight_decay,
)

# Step-decay learning-rate schedule
scheduler = StepLR(optimizer, step_size=1, gamma=train_config.gamma)

# Start training; train_dataloader and eval_dataloader are built from the
# Lumo-8B-DS-Instruct dataset (see the Dataset Preparation page)
results = train(
    model,
    train_dataloader,
    eval_dataloader,
    tokenizer,
    optimizer,
    scheduler,
    train_config.gradient_accumulation_steps,
    train_config,
    None,
    None,
    None,
    wandb_run,
)

  • Parameter-Efficient Fine-Tuning (PEFT): To optimize the fine-tuning process and enhance efficiency, Lumo employs PEFT techniques. Specifically, we utilize LoRA (Low-Rank Adaptation), a method that introduces trainable rank-decomposition matrices into the model's attention layers (see the sketch after the list below).

    • LoRA Parameters:

      • Rank: 8 (r = 8)

      • Alpha: 32 (alpha = 32)

      • Dropout: 0.01

    • Benefits of LoRA:

      • Reduced Training Time: Trains significantly faster than fine-tuning all model parameters.

      • Reduced Memory Footprint: Requires significantly less memory during training.

      • Preserves Pre-trained Knowledge: Minimizes the risk of catastrophic forgetting, where the model loses its pre-trained knowledge during fine-tuning.
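
To see these benefits concretely, the sketch below (continuing from the LoRA-wrapped model and train_config above) prints the trainable-parameter count and saves the adapter after training; treat it as illustrative rather than part of the original training script.

# Only the injected rank-8 matrices are trainable; the 8B base weights stay frozen.
model.print_trainable_parameters()
# Expect a few million trainable parameters out of ~8B (well under 0.1%).

# After training, persist just the lightweight LoRA adapter weights.
model.save_pretrained(train_config.output_dir)

# Depending on the peft version, model.merge_and_unload() can fold the adapter back into
# the base weights for standalone deployment (merging may require a non-quantized base).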

Check out the model

Lumo-8B-Instruct is open-source and deployed on Hugging Face; click the embed below to check out the model.

lumolabs-ai/Lumo-8B-Instruct · Hugging Face
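
Below is a minimal inference sketch using the transformers text-generation pipeline with the repository shown above. It assumes a recent transformers version that applies the model's chat template automatically; see the How to Inference page for the recommended setup.

import torch
from transformers import pipeline

# Illustrative only: load the published checkpoint and ask a Solana question.
generator = pipeline(
    "text-generation",
    model="lumolabs-ai/Lumo-8B-Instruct",
    torch_dtype=torch.float16,
    device_map="auto",
)

messages = [{"role": "user", "content": "What is Proof-of-History on Solana?"}]
output = generator(messages, max_new_tokens=256)
print(output[0]["generated_text"])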