FocusPocusAI

Image generation from screen capture, webcam capture, and/or simple brush strokes. The functions are designed primarily for use in architecture, and for sketching in the early stages of a project. Stable Diffusion with LCM-LoRA serves as the AI backbone for the generative process, and IP-Adapter support is included! The original Gradio code from https://github.com/flowtyone/flowty-realtime-lcm-canvas was adapted to PySide6 and extended with the screen capture functionality.
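As a rough idea of how such a backbone is typically assembled with diffusers, here is a minimal sketch (the model id, prompt, and parameters are illustrative, not necessarily what lcm.py uses):

```python
import torch
from PIL import Image
from diffusers import AutoPipelineForImage2Image, LCMScheduler

# Load a Stable Diffusion 1.5 img2img pipeline (model id is illustrative)
pipe = AutoPipelineForImage2Image.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Swap in the LCM scheduler and attach the LCM-LoRA weights
pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)
pipe.load_lora_weights("latent-consistency/lcm-lora-sdv1-5")

# Any 512 x 512 source image works: a screen capture, webcam frame, or sketch
source = Image.new("RGB", (512, 512), "white")

# With LCM-LoRA, a handful of steps and low guidance are enough for realtime use
result = pipe(
    prompt="modern concrete house, architectural render",
    image=source,
    num_inference_steps=4,
    guidance_scale=1.0,
    strength=0.6,
).images[0]
```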


Any app can be used as a design inspiration source!

Examples of screen captures that can be a great source of information for diffusion:

  • Creating simple shapes in Blender
  • Painting in Photoshop/Krita
  • Pausing a video on a specific frame
  • Browsing Google Earth or Google Street View
  • ...
Example showing a screen capture from Blender (on the left).

Example showing a screen capture from a video (on the left).

Installation

  • Install CUDA (if not already done)
  • Clone the repo and create a virtual environment (venv).
  • Install torch (a quick GPU sanity check is sketched after this list). Example for CUDA 11.8:
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118

(see https://pytorch.org/get-started/locally/)

  • Install other dependencies (see requirements):
    • opencv-python
    • accelerate
    • diffusers
    • transformers
    • PySide6. Note: the app is known to work with PySide6 6.5.2; newer versions can cause problems with the loading of UI elements.
  • Launch main.py
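Before launching main.py, a quick check that torch was installed with CUDA support can save debugging time. A minimal sketch:

```python
import torch

# Confirm the CUDA build of torch is installed and that a GPU is visible
print("torch", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
```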

Usage

Screen capture a 512 x 512 window on top of any app (the dimensions can be adapted depending on your GPU). By default, the capture timestep is 1 second. Then paint with a brush or add simple shapes and watch the proposed image adapt live.

Use CTRL + mouse wheel to adjust the cursor size. The SD model can be changed in the lcm.py file or chosen from a drop-down menu. Voilà!
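Under the hood, the loop roughly amounts to grabbing the capture region and re-running the img2img pipeline on it. A minimal sketch using Pillow's ImageGrab and the pipeline from the earlier sketch (the region coordinates and prompt are illustrative; the actual app drives capture and display through PySide6):

```python
import time
from PIL import ImageGrab

prompt = "modern timber pavilion, architectural visualization"  # illustrative
bbox = (100, 100, 612, 612)  # a 512 x 512 region: left, top, right, bottom

while True:
    frame = ImageGrab.grab(bbox=bbox)  # grab the on-screen capture region
    result = pipe(                     # `pipe` from the pipeline sketch above
        prompt=prompt,
        image=frame,
        num_inference_steps=4,         # LCM-LoRA needs only a few steps
        guidance_scale=1.0,
        strength=0.6,
    ).images[0]
    result.save("latest_output.png")   # the app displays this live instead
    time.sleep(1.0)                    # default capture timestep: 1 second
```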

Demo video: FocusPocus_output_light.mp4

Included models

The user can choose the inference model from within the UI (beware of hard drive space!). Here are the available built-in models:

Credits

The 'lcm.py' file is adapted from https://github.com/flowtyone
