
hipamod (HIghly PArallel MOdel Deployment)

Large Language Models (LLMs) are becoming more common, which is exciting, but deploying them requires enormous computing resources (see the metaseq API for a reference point). What if we could run LLMs using only a few laptops?

Goal

This project seeks to do two things:

  • clarify the problem: how do you perform a hardware-dependent scaling analysis for LLM deployment? (see the sketch after this list)
  • solve it if the scaling analysis shows it is solvable, otherwise show why it is not.
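To make the first bullet concrete, here is a minimal back-of-envelope sketch of what such a hardware-dependent scaling analysis could look like. Everything in it is an illustrative assumption rather than a project decision: the even sharding of weights, the fp16 storage, and the throughput model (autoregressive decoding has to stream every weight once per token, so memory bandwidth sets a rough upper bound).

```python
import math

def deployment_estimate(n_params: float, bytes_per_param: float,
                        ram_per_node_gb: float, mem_bw_gb_s: float) -> dict:
    """Back-of-envelope node count and decoding-speed bound.

    Assumes weights are sharded evenly across nodes and that decoding is
    memory-bandwidth bound: each node streams its shard once per token.
    """
    model_gb = n_params * bytes_per_param / 1e9      # total weight size
    nodes = math.ceil(model_gb / ram_per_node_gb)    # nodes to hold the weights
    shard_gb = model_gb / nodes                      # weights held per node
    tokens_per_s = mem_bw_gb_s / shard_gb            # ideal pipelined bound
    return {"model_gb": model_gb, "nodes": nodes, "tokens_per_s": tokens_per_s}
```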

To be clear, I'm aware that consumer CPUs cannot compete head-on with TPU/GPU clusters. But if we can reach even 3% of the efficiency of those far more expensive setups, that could still be very useful.

Take models 100x bigger, compress them to be 100x smaller, and run them 100x faster.

In short, the mission is to run OPT-175B on distributed CPUs alone.
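As a sanity check on that mission, we can plug OPT-175B into the sketch above using assumed laptop specs (16 GB of usable RAM and ~20 GB/s of sustained memory bandwidth per node; both numbers are guesses for illustration):

```python
print(deployment_estimate(
    n_params=175e9,        # OPT-175B
    bytes_per_param=2.0,   # fp16 weights: ~350 GB total
    ram_per_node_gb=16.0,  # assumed usable RAM per laptop
    mem_bw_gb_s=20.0,      # assumed sustained bandwidth per laptop
))
# -> about 22 nodes and ~1.3 tokens/s even in the ideal, perfectly
#    pipelined case; hence the interest in ~100x compression above.
```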
