Repositories list
19 repositories
- TeEFusion (Public)
  TeEFusion: Blending Text Embeddings to Distill Classifier-Free Guidance (ICCV 2025)
- Marco-Voice (Public)
- flashinfer (Public)
- A unified model that seamlessly integrates multimodal understanding, text-to-image generation, and image editing within a single powerful framework.
- Awesome Unified Multimodal Models
- Ovis (Public)
  A novel Multimodal Large Language Model (MLLM) architecture, designed to structurally align visual and textual embeddings.
- 🎉 The code repository for "Parrot: Multilingual Visual Instruction Tuning" in PyTorch.
- Marco-o1 (Public)
- TransBench (Public)
- TG-LLaVA (Public)
- Wings (Public)
  The code repository for "Wings: Learning Multimodal LLMs without Text-only Forgetting" [NeurIPS 2024]
- M3Bench (Public)
- AutoGPTQ (Public)