AIDC-AI


19 repositories

TeEFusion
Public
TeEFusion: Blending Text Embeddings to Distill Classifier-Free Guidance (ICCV 2025)
text-to-image distillation-model sd3 classifier-free-guidance
Python
•
Other
•0•2•0•0•Updated Jul 25, 2025
CHATS
Public
CHATS: Combining Human-Aligned Optimization and Test-Time Sampling for Text-to-Image Generation (ICML 2025)
text-to-image dpo sdxl
Python
•
Apache License 2.0
•2•137•1•0•Updated Jul 24, 2025
ComfyUI-Copilot
Public
An AI-powered custom node for ComfyUI designed to enhance workflow automation and provide intelligent assistance
agent flux ai copilot rag gpt-4 stable-diffusion comfyui llm-agent deepseek
TypeScript
•
MIT License
•165•2.3k•18•0•Updated Jul 24, 2025
Marco-Voice
Public
Apache License 2.0
•0•0•0•0•Updated Jul 22, 2025
Marco-Bench-MIF
Public
0•6•1•0•Updated Jul 18, 2025
flashinfer
Public
FlashInfer: Kernel Library for LLM Serving
Cuda
•
Apache License 2.0
•400•1•0•0•Updated Jul 15, 2025
UNIC-Adapter
Public
Python
•
MIT License
•0•7•1•0•Updated Jul 10, 2025
Ovis-U1
Public
A unified model that integrates multimodal understanding, text-to-image generation, and image editing within a single framework.
image-editing text-to-image multimodal-large-language-models
Python
•
Apache License 2.0
•9•367•4•0•Updated Jul 8, 2025
Agentic-ADK
Public
Alibaba LangEngine is an AI application development framework written in Java.
Java
•
Apache License 2.0
•57•370•3•2•Updated Jul 3, 2025
Awesome-Unified-Multimodal-Models
Public
Awesome Unified Multimodal Models
multimodal-models text-to-image-generation vision-language-model multimodal-large-language-models unified-multimodal-models
13•495•3•4•Updated Jul 2, 2025
Ovis
Public
A novel Multimodal Large Language Model (MLLM) architecture, designed to structurally align visual and textual embeddings.
chatbot multimodality multimodal vision-language-model multimodal-large-language-models vision-language-learning qwen llama3
Python
•
Apache License 2.0
•66•995•53•2•Updated Jun 14, 2025
Parrot
Public
🎉 The code repository for "Parrot: Multilingual Visual Instruction Tuning" in PyTorch.
multilingual mixture-of-experts vision-language-model multimodal-large-language-models
Python
•
Apache License 2.0
•2•87•1•0•Updated Jun 12, 2025
Marco-o1
Public
An Open Large Reasoning Model for Real-World Solutions
Python
•
Other
•82•1.5k•10•0•Updated May 30, 2025
TransBench
Public
2•34•2•0•Updated May 29, 2025
TG-LLaVA
Public
Python
•
Apache License 2.0
•0•6•0•0•Updated Jan 14, 2025
Wings
Public
The code repository for "Wings: Learning Multimodal LLMs without Text-only Forgetting" (NeurIPS 2024)
deep-learning mllm multimodal-large-language-models multimodal-llm text-only-forgetting
Python
•
Apache License 2.0
•1•20•0•0•Updated Dec 28, 2024
M3Bench
Public
Python
•
Apache License 2.0
•4•2•0•0•Updated Dec 15, 2024
Meissonic
Public
Python
•
Other
•0•3•0•0•Updated Nov 14, 2024
AutoGPTQ
Public
An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.
Python
•
Other
•523•3•0•0•Updated Nov 4, 2024
