
LTSM-Bundle: A Toolbox and Benchmark on Large Language Models for Time Series Forecasting


Empowering forecasts with precision and efficiency.

Table of Contents

  • Overview
  • Why LTSM-bundle?
  • Features
  • Installation
  • Training Examples
  • Inference Examples
  • Project Structure
  • Datasets and Prompts
  • Model Access
  • Cite This Work
  • License
  • Acknowledgments


Overview

This work investigates the transition from traditional Time Series Forecasting (TSF) to Large Time Series Models (LTSMs), leveraging large transformer-based models like GPT. Training LTSMs on diverse time series data introduces challenges due to varying frequencies, dimensions, and patterns.

We explore multiple design choices, including pre-processing strategies, tokenization, model architectures, and dataset setups. We introduce:

  • Time Series Prompt: A statistical prompting strategy
  • LTSM-bundle: A toolkit encapsulating effective design practices

The project is developed by the Data Lab at Rice University.
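To make the Time Series Prompt concrete: the idea is to summarize each input series with global statistics and feed that summary to the model alongside the raw values. The sketch below is only an illustration of that idea; the exact statistics and formatting used by LTSM-bundle live in ltsm/prompt_reader and the prompt_bank files.

import numpy as np
import pandas as pd

def statistical_prompt(series: pd.Series) -> np.ndarray:
    # Fixed-length vector of global statistics (illustrative choice,
    # not the exact statistic set shipped with LTSM-bundle).
    s = pd.Series(series.to_numpy(dtype=float))
    return np.array([s.mean(), s.std(), s.min(), s.max(),
                     s.median(), s.skew(), s.kurt()])

# Example: summarize a synthetic temperature-like signal
toy = pd.Series(np.sin(np.linspace(0, 20, 500)), name="T (degC)")
print(statistical_prompt(toy))  # 7 summary values used as a prompt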


Why LTSM-bundle?

The LTSM-bundle leverages HuggingFace transformers, allowing flexible integration of large-scale pre-trained language models for time series tasks. Users can customize the pipeline to fit specific forecasting needs with minimal overhead, making it adaptable across various domains and industries.

Key highlights:

  • Plug-and-play with GPT-style backbones
  • Modular pipeline for easy experimentation
  • Support for statistical and text prompts

Features

| Category | Highlights |
| --- | --- |
| ⚙️ Architecture | Modular design, GPT-style transformers for time series |
| 📝 Prompting | Time Series Prompt & Text Prompt support |
| ⚡ Performance | GPU acceleration, optimized pipelines |
| 🔧 Integrations | LoRA support, JSON/CSV-based dataset and prompt interfaces |
| 🔬 Testing | Unit and integration tests, GitHub Actions CI |
| 📊 Data | Built-in data loaders, scalers, and tokenizers |
| 📂 Documentation | Tutorials in English and Chinese |

Installation

We recommend using Conda:

conda create -n ltsm python=3.8.0
conda activate ltsm

Then install the package:

git clone https://github.com/datamllab/ltsm.git
cd ltsm
pip install -e .
pip install -r requirements.txt
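A quick sanity check of the installation (this only assumes that ltsm and torch import cleanly; it does not run any training code):

import torch
import ltsm  # should import without errors after `pip install -e .`

print("ltsm imported from:", ltsm.__file__)
print("CUDA available:", torch.cuda.is_available())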

🔧 Training Examples

from ltsm.data_pipeline import StatisticalTrainingPipeline, get_args, seed_all
from ltsm.models.base_config import LTSMConfig
from ltsm.common.base_training_pipeline import TrainingConfig

# Option 1: Load config via command-line arguments
config = get_args()

# Option 2: Load config from a JSON file
config = TrainingConfig.load("example.json")

# Option 3: Manually customize a supported model config in code
# (e.g., LTSMConfig, DLinearConfig, InformerConfig, etc.)
config = LTSMConfig(seq_len=336, pred_len=96)

# Set random seeds for reproducibility
seed = config.train_params["seed"]
seed_all(seed)

# Initialize the training pipeline with the loaded config
pipeline = StatisticalTrainingPipeline(config)

# Run the training and evaluation process
pipeline.run()
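For Option 2, example.json mirrors the fields of the training configuration. The snippet below writes a minimal file as a sketch only: apart from seq_len, pred_len, and the seed read via train_params above, the key names are assumptions, so consult TrainingConfig for the exact schema.

import json

# Hypothetical minimal config for Option 2; keys other than seq_len,
# pred_len, and train_params["seed"] are illustrative assumptions.
minimal_config = {
    "seq_len": 336,
    "pred_len": 96,
    "train_params": {"seed": 2024},
}

with open("example.json", "w") as f:
    json.dump(minimal_config, f, indent=2)

# Then, as in Option 2 above:
# config = TrainingConfig.load("example.json")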

πŸ” Inference Examples

import os
import torch
import pandas as pd
from huggingface_hub import hf_hub_download
from safetensors.torch import load_file
from ltsm.models import LTSMConfig, ltsm_model

# Download model config and weights from Hugging Face
config_path  = hf_hub_download("LSC2204/LTSM-bundle", "config.json")
weights_path = hf_hub_download("LSC2204/LTSM-bundle", "model.safetensors")

# Load model and weights
model_config = LTSMConfig()
model_config.load(config_path)
model = ltsm_model.LTSM(model_config)

state_dict = load_file(weights_path)
model.load_state_dict(state_dict)

# Move model to device
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device).eval()

# Load your dataset (e.g., weather)
df_weather = pd.read_csv("/path/to/dataset.csv")
print("Loaded data shape:", df_weather.shape)

# Load prompts per feature
feature_prompts = {}
prompt_dir = "/path/to/prompts/"
for feature, filename in {
    "T (degC)": "weather_T (degC)_prompt.pth.tar",
    "rain (mm)": "weather_rain (mm)_prompt.pth.tar"
}.items():
    prompt_tensor = torch.load(os.path.join(prompt_dir, filename))
    feature_prompts[feature] = prompt_tensor.squeeze(0).float().to(device)

# Predict (custom code here depending on your model usage)
# For example:
with torch.no_grad():
    inputs = feature_prompts["T (degC)"].unsqueeze(0)
    preds = model(inputs)
    print("Prediction output shape:", preds.shape)

Project Structure

└── ltsm-package/
    ├── datasets
    │   └── README.md
    ├── imgs
    │   ├── ltsm_model.png
    │   ├── prompt_csv_tsne.png
    │   └── stat_prompt.png
    ├── ltsm
    │   ├── common                  # Base classes
    │   ├── data_pipeline           # Model lifecycle management and training pipeline
    │   ├── data_provider           # Dataset construction
    │   ├── data_reader             # Read input data from various formats (CSV, JSON, etc.)
    │   ├── evaluate_pipeline       # Evaluation workflow for model performance
    │   ├── layers                  # Custom neural network components
    │   ├── models                  # Implementations: LTSM, DLinear, Informer, PatchTST
    │   ├── prompt_reader           # Prompt generation and formatting
    │   ├── sk_interface            # Scikit-learn style interface
    │   └── utils                   # Shared helper functions
    ├── multi_agents_pipeline       # Multi-agent time series reasoning framework
    │   ├── Readme.md
    │   ├── agents                  # Agent definitions: Planner, QA, TS, Reward
    │   ├── llm-server.py           # Local LLM server interface
    │   ├── ltsm_inference.py       # Inference script using LTSM pipeline
    │   ├── main.py                 # Pipeline entry point
    │   └── model_config.yaml       # Configuration file for models and agents
    ├── requirements.txt
    ├── setup.py
    ├── tests                       # Unit tests for LTSM modules
    │   ├── common
    │   ├── data_pipeline
    │   ├── data_provider
    │   ├── data_reader
    │   ├── evaluate_pipeline
    │   ├── models
    │   └── test_scripts
    └── tutorial
        └── README.md

Datasets and Prompts

Download the datasets from the Google Drive folder below and place them under datasets/:

https://drive.google.com/drive/folders/1hLFbz0FRxdiDCzgFYtKCOPJYSBVvwW9P

Download the time series prompts from the same Google Drive folder and place them under prompt_bank/prompt_data_csv/.

Model Access

You can find our trained LTSM models on Hugging Face:

➡️ https://huggingface.co/LSC2204/LTSM-bundle


Cite This Work

If you find this work useful, please cite:

@misc{chuang2025ltsmbundletoolboxbenchmarklarge,
      title={LTSM-Bundle: A Toolbox and Benchmark on Large Language Models for Time Series Forecasting},
      author={Yu-Neng Chuang and Songchen Li and Jiayi Yuan and Guanchu Wang and Kwei-Herng Lai and Songyuan Sui and Leisheng Yu and Sirui Ding and Chia-Yuan Chang and Qiaoyu Tan and Daochen Zha and Xia Hu},
      year={2025},
      eprint={2406.14045},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2406.14045},
}

License

This project is licensed under the MIT License. See the LICENSE file for details.


Acknowledgments

We thank all contributors and collaborators involved in the LTSM project. Special thanks to the Data Lab at Rice University and the open-source community for enabling fast prototyping and reproducible research.


