

Training Metrics



The Lumo-8B-Instruct dataset was used to fine-tune the Lumo model. The following metrics were closely monitored during the training process:

  • Training Loss:

    • The primary metric used to evaluate the model's performance during training.

    • Calculated using the cross-entropy loss function, which measures the difference between the model's predicted probabilities and the true probabilities of the next token in the sequence.

    • Lower training loss generally indicates better model performance.

  • Validation Loss:

    • Calculated on the validation set during each training epoch.

    • Used to monitor the model's performance on unseen data and detect overfitting.

  • Perplexity:

    • The exponential of the average cross-entropy loss; equivalently, the inverse geometric mean of the probabilities the model assigns to the true next tokens (see the sketch after this list).

    • Lower perplexity indicates that the model is better at predicting the next token, suggesting a better understanding of the data.
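
For concreteness, here is a minimal PyTorch sketch of how training loss and perplexity relate. The tensor shapes and vocabulary size are illustrative assumptions, not values from the Lumo training run:

```python
import torch
import torch.nn.functional as F

# Illustrative shapes only: (batch, seq_len, vocab_size) logits and
# (batch, seq_len) ground-truth next-token ids. Random tensors stand in
# for real model output; this is not the Lumo training code.
vocab_size = 32000
logits = torch.randn(2, 16, vocab_size)
targets = torch.randint(0, vocab_size, (2, 16))

# Training loss: cross-entropy between the predicted distribution
# and the true next token, averaged over all token positions.
loss = F.cross_entropy(logits.view(-1, vocab_size), targets.view(-1))

# Perplexity: the exponential of the cross-entropy loss.
perplexity = torch.exp(loss)
print(f"loss={loss.item():.4f}  perplexity={perplexity.item():.2f}")
```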

Training Process

  • Optimizer: The AdamW optimizer was used to update the model's parameters during training.

  • Learning Rate: The learning rate was set to 3e-4.

  • Gradient Accumulation: Gradients were accumulated over several smaller micro-batches before each optimizer step, simulating a larger effective batch size while reducing memory consumption and improving training stability (see the sketch after this list).

  • Learning Rate Scheduler: A StepLR scheduler was used to decay the learning rate at fixed intervals during training, allowing the model to converge more effectively.
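
As a rough illustration of how these pieces fit together, the following PyTorch sketch wires AdamW, a StepLR scheduler, and gradient accumulation into a minimal loop. The model, data, loss function, accumulation count, and scheduler hyperparameters (`step_size`, `gamma`) are placeholder assumptions; only the optimizer choice and the 3e-4 learning rate come from this page:

```python
import torch
import torch.nn.functional as F
from torch.optim import AdamW
from torch.optim.lr_scheduler import StepLR
from torch.utils.data import DataLoader, TensorDataset

# Placeholder model and data; the real run fine-tunes Lumo on its dataset
# with a token-level cross-entropy loss rather than this toy regression.
model = torch.nn.Linear(10, 10)
dataset = TensorDataset(torch.randn(64, 10), torch.randn(64, 10))
train_loader = DataLoader(dataset, batch_size=8)

optimizer = AdamW(model.parameters(), lr=3e-4)         # AdamW at 3e-4 (from this page)
scheduler = StepLR(optimizer, step_size=1, gamma=0.9)  # step_size/gamma are assumed

accumulation_steps = 4  # assumed; accumulate gradients over 4 micro-batches

for epoch in range(3):
    for i, (inputs, labels) in enumerate(train_loader):
        loss = F.mse_loss(model(inputs), labels)
        # Scale so the accumulated gradient matches one large-batch step.
        (loss / accumulation_steps).backward()
        if (i + 1) % accumulation_steps == 0:
            optimizer.step()       # AdamW parameter update
            optimizer.zero_grad()  # reset accumulated gradients
    scheduler.step()  # StepLR: decay the learning rate once per epoch
```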

Through careful monitoring of these metrics and adjustment of training hyperparameters as needed, the Lumo model was successfully fine-tuned on the Lumo-8B-Instruct dataset, achieving state-of-the-art performance on Solana-related tasks.