ComfyUI Tutorial Series: Ep01 - Introduction and Installation
pixaroma
Welcome to the first episode of the ComfyUI Tutorial Series! In this series, I will guide you throug...
Video Transcript:
Hello! I'm starting a new series of tutorials on Stable Diffusion AI using the ComfyUI interface. Every day I'm learning more about it, and I'll share the things I've learned with you. I'll explain it from a graphic designer's perspective, in a way that's easy to understand, so anyone can use it. I'll structure it into episodes, allowing you to watch in a certain order and progress from beginner to advanced. In this first episode I'll discuss what ComfyUI is, along with its advantages and disadvantages. Then we'll quickly go through the installation process and download different models, so by the end of this video you'll be able to generate your first image for free using your own computer.

When it comes to Stable Diffusion, there are multiple interfaces to choose from, each with its own characteristics. Some are easier to use, while others are more complex or faster. Popular interfaces include AUTOMATIC1111, Forge UI, Invoke, and today's focus, ComfyUI. ComfyUI is a user interface framework that lets you create and manage workflows by visually connecting different tasks, much like building with Lego blocks. Each block, or node, represents a specific function, and by connecting them you can easily construct complex processes.

Advantages of using ComfyUI: it lets you create workflows quickly and flexibly, without being limited by preset options. Each node shows a specific function, making it easy to see the entire process. You can easily share your workflows and use ones made by others, promoting collaboration. ComfyUI requires no coding; just drag and drop nodes to create and customize workflows for your needs.

Downsides of using ComfyUI: the organization of nodes can vary between workflows, which can be confusing when using other people's setups. The detailed view of processes might overwhelm average users who prefer simplicity. Despite no coding requirement, there is a learning curve to effectively use nodes and build workflows. Complex workflows can also affect performance, slowing down your computer if it doesn't meet the system requirements. Despite these downsides, ComfyUI remains a strong and versatile tool for creating and managing workflows, especially for users who need detailed control, and once you have your workflow set up, it is fast and efficient. Initially I avoided ComfyUI because it seemed too complex with all the nodes, but after spending a few days with some tutorials, I started to appreciate its capabilities.

So let's start with the installation. Visit the GitHub page for ComfyUI. In the features section you'll find everything it can do, such as Stable Diffusion, video, Stable Audio, and other supported features. There's also a list of shortcuts available, and finally you'll find the installation instructions. I have the Windows operating system and an Nvidia RTX 4090 card, so I'll demonstrate the installation for Windows and Nvidia; if you have a different operating system or graphics card, you can find installation instructions below for those systems. For system requirements, you'll need a recent operating system, and the more VRAM your video card has, the faster it will generate, especially if it's from the RTX series, which is why Nvidia cards are preferred for their speed. I tested it on a 6 GB VRAM video card and it worked okay, but not very quickly, so I would recommend opting for an Nvidia card with at least 8 GB of VRAM, and ensure you have 16 GB of RAM in your system for a smoother workflow.

For Windows it's quite straightforward: just click on the direct download link for the portable version and choose a folder where you want to store it. I'll put it on my D drive and create a folder named ComfyUI. The archive is in 7z format, and you can extract it using 7-Zip (you can find a download link here), or, in my case, I have WinRAR and will use that to extract it. Now let's navigate into that newly extracted folder; you'll notice it indicates that it's portable. Inside you'll find several .bat files and a readme file that you should read. The readme explains which .bat file you need to run depending on what you want to do; for optimal performance, use the Nvidia GPU one, because it's much faster than the CPU. The readme also includes additional information, such as where to place the models and other relevant details.

Double-click on the run_nvidia_gpu.bat file and a command window will open and launch ComfyUI. There you can view details about your graphics card, including how much VRAM and RAM you have. Once it finishes, a new browser window will automatically open, displaying the ComfyUI interface. You can use the mouse wheel to zoom in or out, or use shortcuts like Alt+ and Alt-. You can move the canvas around by clicking on it and dragging, or by pressing the space bar and moving the cursor. You can also click on nodes and drag them around, so arrange the interface so that it fits your screen comfortably. Those little rectangular windows are called nodes. In ComfyUI you connect these nodes to load models, input text, create images, and save your work in a step-by-step process. The lines between nodes show how information moves between them, connecting each part of your work together.

The Queue Prompt button in ComfyUI adds your current image generation task to a list, allowing multiple tasks to run one after another automatically. If I click on it, it should generate an image, but instead I get an error. First of all, when you encounter an error, it's helpful to look at which node the error occurred in. You can see that the Load Checkpoint node has a red outline, and the message says "value not in list", which means it couldn't find a checkpoint with that name on our computer. If I click on that list and try to select a model, you'll notice I can't find one, because I haven't downloaded any checkpoints yet. For this to work, we need a Stable Diffusion model there.
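As a point of reference, the extracted portable folder looks roughly like this; treat it as a sketch rather than an exact listing, since file names can vary between releases:

```
ComfyUI_windows_portable/
├── run_nvidia_gpu.bat           <- launches ComfyUI on an Nvidia GPU
├── run_cpu.bat                  <- slower CPU-only fallback
├── README_VERY_IMPORTANT.txt    <- the readme mentioned above
└── ComfyUI/
    └── models/
        └── checkpoints/         <- Stable Diffusion checkpoint files go here
```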
The easiest way to get models is to visit the Civitai website. There you'll find a tab labeled Models; click on it to see various types of models. What I like to do is sort them by highest rating or most downloads. You can also use filters to select a specific time period, like a week or a month. For file format, I prefer SafeTensors, because it's safer than the ckpt format. The most popular models are typically v1.5, SDXL, and, recently, SD3. I will select an SDXL model, which is the one I usually use. Let's go to the search and look for Juggernaut models. Here we have a few options: this one is the SDXL version and this one is the v1.5 version. I'll choose the XL version. You can see there are different versions available, such as the Hyper version; however, I prefer version X because, in my opinion, it offers better quality. Let's download this model by clicking on the download button with the down-arrow icon. After that, you need to place it in a specific folder: navigate to the ComfyUI folder, then locate the models directory, and inside there's a folder named checkpoints, where you store all your Stable Diffusion models, regardless of the version. Click save and wait for the download to complete; since it's quite large, around 6 GB, it will take some time to finish.

Let's download an SD v1.5 model as well, suitable for those with less powerful video cards who want to generate images faster at lower sizes. You can go to the filters and select only v1.5 models, or you can search for Juggernaut again and choose the v1.5 version. The latest version is called Reborn, so you can download that one. Before downloading, make sure to check the base model label to ensure it's the correct base model. For example, if it says v1.5, it's crucial to use the same base model across all your models. This matters because features like ControlNet will only work with the same base model type: v1.5 models need v1.5 ControlNet and related extensions, while SDXL models require corresponding SDXL components and extensions. For now, it's enough to understand that the base model exists and what type it is; I'll talk more about it in future episodes, so don't stress too much. Click download and place it in the same checkpoints folder as the previous model. If you check other models, you'll notice that each one specifies its base model, often with different versions from various dates. Generally, newer versions have undergone more training and provide better quality, so you'll likely prefer the latest version.

If I quickly navigate to ComfyUI, then go to models and checkpoints, you'll see that the download isn't finished yet; it's still in progress. If I skip ahead a few minutes, both models, the Reborn and X versions, are now here and ready. The installation of ComfyUI isn't complicated at all: you simply download Comfy and a model and you're done. I could have made a two-minute video showing just that; however, I prefer to explain each step in detail so you have a better understanding of what's happening. This way, you'll be able to do more than just click generate and get an image; you'll understand how to use it effectively.

If you try to load the newly downloaded model and nothing happens, it's likely because ComfyUI was already open when we downloaded the model. In this case, click on Refresh, and now you should see the models appear in the list; both the Reborn and X versions should be visible. I'll select the SDXL version. I will resize the node to see the model name better; you can only adjust it from the bottom right corner. After resizing, you can move the node into place. Now if I try to generate again using Queue Prompt, you'll notice that we have a green outline around this checkpoint node, but that's not all: you can follow the flow to the next node and see that it's also green. If each node turns green all the way to the end, it means the image was successfully generated; if any node turns red, there's a problem that needs to be addressed. And for this prompt, here is the result.

Now I'll choose the other model, the Reborn version, to see if it works, and click on Queue Prompt. As you can see, the flow moves quickly from one node to the next. You'll notice in the command window that it generated in just a few seconds; the first time may take longer because it's also loading the new model. If I generate again, you'll see it only took one second for a 512-pixel image.

In the next episode I'll provide more detailed explanations of what each node does, but for now, here's the short version. This workflow loads the model using the Load Checkpoint node. It encodes the positive prompt, where we describe what we want, like a glass bottle, and the negative prompt, with words we want to avoid, like watermark, using two CLIP Text Encode (Prompt) nodes. These encoded prompts are then fed into the KSampler node, which uses parameters such as seed, steps, and CFG to generate a latent image based on these inputs. The latent image is decoded using the VAE Decode node, and the final step saves the generated image with the Save Image node. Each model performs best with specific settings, and many models come with recommended configurations.
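If you're curious how this graph looks in code: ComfyUI also exposes a small HTTP API, and the same default workflow can be written as JSON and queued from a script. Here is a minimal Python sketch, assuming ComfyUI is running locally on its default port 8188; the checkpoint filename is a placeholder for whatever file sits in your checkpoints folder:

```python
# Minimal sketch of ComfyUI's default text-to-image graph, sent to the
# local HTTP API. Assumes ComfyUI is already running on port 8188.
import json
import urllib.request

workflow = {
    # Load Checkpoint: outputs are MODEL (0), CLIP (1), VAE (2)
    "1": {"class_type": "CheckpointLoaderSimple",
          "inputs": {"ckpt_name": "juggernautXL.safetensors"}},  # placeholder name
    # CLIP Text Encode (Prompt): positive and negative prompts
    "2": {"class_type": "CLIPTextEncode",
          "inputs": {"text": "a glass bottle", "clip": ["1", 1]}},
    "3": {"class_type": "CLIPTextEncode",
          "inputs": {"text": "watermark", "clip": ["1", 1]}},
    # Empty Latent Image: the blank canvas the sampler refines
    "4": {"class_type": "EmptyLatentImage",
          "inputs": {"width": 512, "height": 512, "batch_size": 1}},
    # KSampler: seed, steps, and CFG live here
    "5": {"class_type": "KSampler",
          "inputs": {"model": ["1", 0], "positive": ["2", 0],
                     "negative": ["3", 0], "latent_image": ["4", 0],
                     "seed": 42, "steps": 20, "cfg": 7.0,
                     "sampler_name": "euler", "scheduler": "normal",
                     "denoise": 1.0}},
    # VAE Decode turns the latent into pixels; Save Image writes it to disk
    "6": {"class_type": "VAEDecode",
          "inputs": {"samples": ["5", 0], "vae": ["1", 2]}},
    "7": {"class_type": "SaveImage",
          "inputs": {"images": ["6", 0], "filename_prefix": "ComfyUI"}},
}

req = urllib.request.Request(
    "http://127.0.0.1:8188/prompt",
    data=json.dumps({"prompt": workflow}).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
urllib.request.urlopen(req)  # same effect as clicking Queue Prompt
```

Each numbered entry is one node from the canvas, and a reference like ["1", 1] is simply a wire from node 1's second output into that input.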
If we return to the model page and scroll down, clicking on Show More reveals all the details. Here you'll find information such as the recommended image size, number of steps, and other relevant settings; these details help you get the best performance out of each model. As you can see, it indicates 512 x 768 pixels for the image width and height. The v1.5 base models were trained on images sized 512 by 512 pixels. You can go slightly larger, up to 768 pixels, but exceeding this might cause the model to interpret the image as two 512-pixel images, resulting in double heads and various distortions. So we can create square, landscape, and portrait images, as long as they are close to those values. Let's try the portrait orientation: look for the Empty Latent Image node, where you can enter the width and height; I will use 768 for the height and press OK.

Let's see what else we have here. It mentions the sampler DPM++ 2M Karras, so we'll look for the sampler in the list. We can find it here; although it doesn't have two plus signs, it has two Ps, so it's essentially the same, just with a different name. However, we don't have Karras in the name either; instead, Karras goes in the scheduler field, the next one down, so where the scheduler says "normal", we replace normal with karras. It also mentions steps 35 and CFG 7, so we'll add those values as well.
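If you're following along with the API sketch from earlier, those recommended values map onto the node inputs like this; dpmpp_2m and karras are ComfyUI's internal spellings of DPM++ 2M and Karras, which you can double-check against the dropdowns in your install:

```python
# Apply the model page's recommended v1.5 settings to the sketch above
workflow["4"]["inputs"].update({"width": 512, "height": 768})   # portrait
workflow["5"]["inputs"].update({
    "steps": 35,
    "cfg": 7.0,
    "sampler_name": "dpmpp_2m",  # shown as "DPM++ 2M" on the model page
    "scheduler": "karras",       # Karras goes in the scheduler field
})
```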
Since we're testing things out, let's also try their example. If I look at the images and click on the "i" button, I can see the prompts and settings used to generate that image. I can copy the positive prompt and paste it into my positive prompt field, and do the same for the negative prompt. You'll notice the settings are the same; only the seed is fixed, but I'll leave mine random. When I queue the prompt, I get this beautiful portrait image; not bad for a v1.5 model. Let me try a few more times. Yes, it seems to be working just fine.

Now that we've configured all the settings and everything is working well, we don't want to set up these configurations every time. Instead, we can save the entire workflow, so we can return to it later or share it with others. Click on Save and save it in a location where you can easily find it again; I recommend creating a workflows folder somewhere on your drive for organization. Give it a descriptive name so you know what it's all about, and there you have it: you've created your working v1.5 workflow for this specific model.

Let's go to the SDXL model page for the Juggernaut version X and click on Show More. Here we have recommended settings for both the normal version and the Hyper version; since we downloaded the normal version, we'll use the settings for that one. Now let's load that SDXL model into our ComfyUI interface. I prefer the SDXL model over v1.5...