I’m very happy to share my LoRA Caption workflow for ComfyUI, which lets you run a batch of images through two different captioning methods: the Florence 2 model or CLIP Interrogator. Both are paired with the WD14 Tagger node, which generates additional tags/keywords for the caption.
Images must be in PNG format; JPEGs are not currently supported by the workflow due to a limitation in one of the custom nodes.
You can enter the LoRA training “trigger” word, which is added to the prompt.
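For context, the output is simply one text file per image, which is the format most LoRA trainers expect. Here is a minimal sketch of what the workflow assembles (the function and file names are mine, for illustration only):

```python
from pathlib import Path

def write_caption(image_path: Path, caption: str, tags: list[str], trigger: str) -> None:
    # Combine trigger word, main caption and WD14 tags into one line
    parts = [trigger, caption.strip(), ", ".join(tags)]
    text = ", ".join(p for p in parts if p)
    # LoRA trainers typically look for a .txt file next to each image
    image_path.with_suffix(".txt").write_text(text, encoding="utf-8")

write_caption(Path("dataset/img001.png"),
              "a woman standing on a beach at sunset",
              ["1girl", "beach", "sunset"],
              trigger="myTriggerWord")
```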
So I spun up my local instance of ComfyUI with Flux on my computer and started to see some incredible results.
The workflow is available for download here.
In order to run this, you need ComfyUI (update to the latest version) and then download these files.
Place the model in the models\unet folder, the VAE in models\vae and the CLIP files in the models\clip folder of your ComfyUI directory. Make sure you restart ComfyUI and refresh your browser.
The workflow utilises Flux Schnell to generate the initial image and then Flux Dev to generate a more detailed image. The final upscale is done using an upscale model.
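For the curious, the same two-stage idea can be sketched outside ComfyUI with the diffusers library. This is an assumption-laden sketch, not the workflow itself; it assumes a recent diffusers release with Flux support, access to the gated FLUX.1-dev weights, and enough memory for offloading:

```python
import torch
from diffusers import FluxPipeline, FluxImg2ImgPipeline

prompt = "a lighthouse on a cliff at dawn, dramatic clouds"

# Stage 1: Schnell drafts an image quickly (4 steps, no guidance)
schnell = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-schnell", torch_dtype=torch.bfloat16)
schnell.enable_model_cpu_offload()  # helps on lower-VRAM GPUs
draft = schnell(prompt, num_inference_steps=4, guidance_scale=0.0).images[0]

# Stage 2: Dev refines the draft img2img-style for more detail
dev = FluxImg2ImgPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16)
dev.enable_model_cpu_offload()
final = dev(prompt, image=draft, strength=0.6,
            num_inference_steps=25, guidance_scale=3.5).images[0]
final.save("flux_two_stage.png")
```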
I made a couple of changes to improve it for my use, but I don’t think they are major, so explore and edit as you like.
The results and image quality are absolutely stunning!!
Dev is a higher-quality model than Schnell, but Schnell is much faster (4 steps). These are big models, though: both weigh a whopping 23.8GB each and require a lot of VRAM to run. It is recommended that you have 32GB of RAM.
However, don’t be sad, because there is a way to run them on lower-VRAM GPUs. I have an RTX 4080 with 16GB and I can run both Dev and Schnell; the only difference is that Dev takes about 3 minutes to generate a 1024px by 1536px image while Schnell takes only 30-40 seconds to generate the same.
The buzz at the moment is that these models are on par with Midjourney; in my testing I would go further and say they are better. They win on many aspects, actually:
Most importantly, it doesn’t apply its own recipe or secret sauce to “improve” your image, so it stays as close to your prompt as possible. With Midjourney, by contrast, there is always an influence the model adds to the image to make it better, which can often make it hard to control the result with just a text prompt.
The default workflows are provided by ComfyAnonymous on their GitHub page.
My adapted workflows are available for download as well. I provide two workflows, Text 2 Image and Image 2 Image; just drag the PNG files from the zip into ComfyUI. Install any missing nodes using ComfyUI Manager.
Flux.1 Txt2Img and Img2Img Workflows (745 downloads)

My Image to Image workflow utilises the Florence 2 LLM and CLIP Interrogator (I got the original version online from somewhere I can’t recall) to generate an accompanying prompt to help guide Flux. So you have the image influencing the generation plus a text prompt, and together they make the result super!!
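If you’re curious what the Florence 2 step does under the hood, here is a rough sketch using the Hugging Face transformers API, based on the Florence-2 model card (the ComfyUI node wraps something similar):

```python
import torch
from PIL import Image
from transformers import AutoModelForCausalLM, AutoProcessor

device = "cuda" if torch.cuda.is_available() else "cpu"
model = AutoModelForCausalLM.from_pretrained(
    "microsoft/Florence-2-large", trust_remote_code=True).to(device)
processor = AutoProcessor.from_pretrained(
    "microsoft/Florence-2-large", trust_remote_code=True)

image = Image.open("input.png").convert("RGB")
task = "<MORE_DETAILED_CAPTION>"  # Florence-2 task token for long captions
inputs = processor(text=task, images=image, return_tensors="pt").to(device)
ids = model.generate(input_ids=inputs["input_ids"],
                     pixel_values=inputs["pixel_values"],
                     max_new_tokens=256, num_beams=3)
raw = processor.batch_decode(ids, skip_special_tokens=False)[0]
caption = processor.post_process_generation(raw, task=task, image_size=image.size)
print(caption[task])  # the prompt text you would feed into Flux
```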
It’s been a wonderful breath of fresh air to get a model that can produce such high-quality, coherent results, and it has kicked off the month of August with a bang. I wonder what other excitement awaits us next. For me, I keep exploring Flux and have already downscaled my Midjourney subscription; is it time to ditch Midjourney altogether? We will see.
The custom node comes with a sample workflow that can be imported into ComfyUI so you can get started generating your own animated live character from an image. The workflow is quite simple, and all you need is a source portrait image and a guiding video.
Ideally the two should have the same aspect ratio: if the video is 1:1, use an image in the same ratio. It does work without that, but the results may be a bit skewed.
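If your image doesn’t match, a quick pre-crop does the trick. Here is a hypothetical helper using Pillow (the names are mine):

```python
from PIL import Image

def crop_to_ratio(img: Image.Image, ratio_w: int, ratio_h: int) -> Image.Image:
    w, h = img.size
    target = ratio_w / ratio_h
    if w / h > target:          # too wide -> trim the sides
        new_w = int(h * target)
        left = (w - new_w) // 2
        return img.crop((left, 0, left + new_w, h))
    new_h = int(w / target)     # too tall -> trim top/bottom
    top = (h - new_h) // 2
    return img.crop((0, top, w, top + new_h))

portrait = crop_to_ratio(Image.open("portrait.png"), 1, 1)  # match a 1:1 video
```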
Based on the guiding video and the number of frames it has, the workflow will generate the same number of output frames. The Video Combine node then compiles the final video at the frame rate you set.
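A quick way to check what to expect from a guiding video, sketched with OpenCV:

```python
import cv2

cap = cv2.VideoCapture("driving_video.mp4")  # your guiding video
frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
fps = cap.get(cv2.CAP_PROP_FPS)
cap.release()
# Output length depends on the frame rate you set in Video Combine
print(f"{frames} frames; at {fps:.0f} fps that's {frames / fps:.1f}s of output")
```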
There are a few guiding videos available in the original repo; you can download and use them to get started. The most fun part, however, is recording yourself and creating some cool, unique results.
The default workflow creates a merged video of the guiding video and the resulting Live Portrait. However, it’s easy to change that and create only the Live Portrait: simply connect the full_image output of LivePortraitProcess to the images input of Video Combine. This will bypass the merged video and output the final result on its own.
I also shared a quick video on how to setup and use LivePortrait over on YouTube.
Playing around with all these different characters and expressions was so much fun; it’s quite addictive.
Abe after he's had a few too many!!#liveportrait #aianimation pic.twitter.com/FmRYJDvyOM
— Harmeet Gabha 🇦🇺 (@HarmeetGabha) July 9, 2024
Midjourney woman meets LivePortrait#liveportrait #aianimation pic.twitter.com/QbM63zeQID
— WeirdWonderfulAI.art (@wwAIArt) July 12, 2024
#gen3 not required pic.twitter.com/nVzcuXWi1O
— Harmeet Gabha 🇦🇺 (@HarmeetGabha) July 5, 2024
Midjourney girl with LivePortrait!!#liveportrait pic.twitter.com/ehGGJtTC8o
— Harmeet Gabha 🇦🇺 (@HarmeetGabha) July 12, 2024
Let me know if you have any questions or issues, but installing and using LivePortrait is pretty easy.
I added a comparison node that lets you easily see the difference by overlaying the two images on top of one another. I found this workflow so useful that I decided to share the ComfyUI workflow.
The workflow locks the seed down using the Global Seed custom node and also includes the HiRes Script node with a custom KSampler, so if you don’t have these custom nodes you will need to install them using ComfyUI Manager. I have a very quick 2-min tutorial below on how to do this.
The workflow is pretty straightforward, as you can see below: load an image, resize it to a workable size, duplicate it to generate multiple frames and then add the zooming effect, finally producing an MP4 file at the end.
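Conceptually, the zoom effect boils down to cropping progressively tighter and resizing back up. A minimal standalone sketch with Pillow (not the node’s actual code):

```python
import os
from PIL import Image

src = Image.open("input.png")
w, h = src.size
num_frames, max_zoom = 60, 2.0
os.makedirs("frames", exist_ok=True)

for i in range(num_frames):
    # Linearly interpolate the zoom factor from 1.0 to max_zoom
    zoom = 1.0 + (max_zoom - 1.0) * i / (num_frames - 1)
    cw, ch = int(w / zoom), int(h / zoom)
    left, top = (w - cw) // 2, (h - ch) // 2
    frame = src.crop((left, top, left + cw, top + ch)).resize((w, h), Image.LANCZOS)
    frame.save(f"frames/frame_{i:04d}.png")
# Combine the frames into an MP4 with e.g. ffmpeg or the Video Combine node
```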
The custom node bundle includes a lot of other fun nodes that are useful for generating cool results, which can be explored further in future posts; this post focused only on the infinite zoom loop you can generate from a single image in ComfyUI.
Infinite Zoom in #comfyui
Workflow included 🔗👇 pic.twitter.com/EHAdn8Imup
— Harmeet Gabha 🇦🇺 (@HarmeetGabha) June 3, 2024
One such amazing creator is Zho, who has published some amazing workflows via a Git repo. However, some of the content is published in Mandarin or Cantonese, so you can translate it using Google. I have linked the whole repo in English using Google Translate, so check out the whole page in English here.
He is very active and always creating new stuff. In the time it took to put this blog post together, Zho had already released new nodes, so chances are that by the time you read this there are many more custom workflows available on the GitHub page.
Simply download the JSON or PNG files provided in the links below and start using them by loading them in your ComfyUI interface. If you are missing some nodes (these will show up in red), follow this post and video to install missing nodes.
Zho has published a collection of Stable Cascade Workflows that let you create txt2img, use Canny ControlNet, in-painting and img2img. The workflows are beautifully laid out and organised on the screen.
The Differential Diffusion workflow lets you in-paint into an existing image using a mask and text prompts. It’s a wonderful way to edit existing images and to create, enhance and add new elements to them. Quite a fun workflow, in my opinion.
You can download the Simple workflow or the Text2Img workflow.
A very powerful AI-based object detection and segmentation workflow that lets you easily identify various objects in a scene using existing models, and then generate a mask with which you can manipulate the image if you so desire. It can work nicely when you want to in-paint something in an image. You could also combine it with some of the other workflows above.
A word cloud workflow that takes an image generated from a Txt2Img prompt (“a kangaroo” or anything else you like), segments it to remove the background, and then evaluates the text you provide to generate a word cloud in the shape of the generated image. I tried it myself and it’s pretty fun to generate some word clouds.
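The shaped word cloud itself is easy to reproduce in plain Python with the wordcloud library, assuming you already have the segmented image as a mask (white background, subject in darker pixels):

```python
import numpy as np
from PIL import Image
from wordcloud import WordCloud

# White (255) areas of the mask are excluded; the subject shape is filled
mask = np.array(Image.open("kangaroo_segmented.png").convert("L"))
text = "kangaroo outback australia marsupial hopping bush wildlife"

wc = WordCloud(background_color="white", mask=mask).generate(text)
wc.to_file("kangaroo_wordcloud.png")
```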
An InstantID workflow that lets you recreate a reference face based on the initial input image. You do need to install and have some models in place to use it, so make sure you follow the instructions on the page and read the English version.
Zho has added some cool touches to this one, letting you choose an artist style and art movement using custom nodes. If you are missing the custom nodes, use ComfyUI Manager to install them (they show as red nodes in your workflow).
APISR (Anime Production Inspired Real-World Anime Super-Resolution) lets you take anime/cartoon images and videos and increase their resolution by 2x or 4x using the provided models. Zho has implemented the APISR code in ComfyUI, and the workflow example below allows you to increase the resolution of anime/cartoons. An English version of the readme is available.
Download the workflows here. Two JSON files are provided for use in ComfyUI. Make sure you use ComfyUI Manager to install missing nodes.
These are some of the coolest workflows available so far, but Zho keeps adding more and more; I’ve just seen an SVD version that Zho is working on, so make sure you check it out.
I used the pre-built ComfyUI template available on RunPod.io, which installs all the necessary components so ComfyUI is ready to go. I will perhaps share my RunPod.io workflow in more detail in the coming days.
Here, however, we are talking about ComfyUI IPAdapter Plus, which you can install using ComfyUI Manager on your instance. You then need to download a bunch of models (yes, the list is long), and specifically with the ClipVision models you need to remember to rename them as instructed by the author. It doesn’t matter whether you are running locally or in the cloud; the steps are the same.
The first challenge is downloading all the models, so I built a Jupyter Lab notebook (shown below) which you can simply upload via the file explorer and run its cells. It will download and install all the models. You will probably need to create an ipadapter folder under ComfyUI\models, as the custom node does not seem to create this folder, at least not in my instance.
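To give you an idea, a cell in the notebook boils down to something like this (a condensed sketch, run from the ComfyUI root; the URLs shown are just examples from the IP-Adapter repo, so use the full list from the IPAdapter Plus README and remember the ClipVision renaming):

```python
import os
import urllib.request

MODELS = {
    "models/ipadapter": [
        "https://huggingface.co/h94/IP-Adapter/resolve/main/models/ip-adapter_sd15.safetensors",
    ],
    "models/clip_vision": [
        # download, then rename as instructed by the author
        "https://huggingface.co/h94/IP-Adapter/resolve/main/models/image_encoder/model.safetensors",
    ],
}

for folder, urls in MODELS.items():
    os.makedirs(folder, exist_ok=True)   # the ipadapter folder may not exist yet
    for url in urls:
        dest = os.path.join(folder, url.rsplit("/", 1)[-1])
        if not os.path.exists(dest):
            urllib.request.urlretrieve(url, dest)
```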
You also need to install InsightFace if you want to use the FaceID models. If you are running on Windows, you should review this thread, which covers how to make it work, as you need specific build requirements.
However, I was on a Linux-based OS, where it’s quite easy. Just run this command on your computer or in your venv. (On the RunPod.io image there is no venv, so it installs into the base environment.)
pip install insightface onnxruntime
This will compile and build InsightFace based on its requirements. Next, make sure you restart the ComfyUI service and then refresh your browser. This is a crucial step you should perform each time something new is installed.
I then downloaded the workflows available from the author’s site and loaded them into ComfyUI to start experimenting. In the video below the author shares more details on the features and the different workflows.
I started experimenting with different workflows and created a few of my own variations. The first uses IPAdapters and upscales the output using Ultimate SD Upscaler; the result is that the face is consistently applied to the upscaled image, and all of this happens without the use of any ControlNet.
In the next scenario I tried Style Transfer, which is hidden away in the IPAdapter node under weight type. The first image is the face reference and the second image is the style reference. I found the resulting image takes on the colour aesthetics of the style reference but doesn’t mimic the medium or other details. Maybe I just haven’t figured out the optimum settings yet.
You can download all these resources via the link below and try them out yourself. The Jupyter notebook is also included in the download.
IPAdapter Workflow & Notebook (737 downloads)

IPAdapter Plus certainly appears to be a great addition to the ComfyUI toolkit, and I’m glad I started experimenting with these custom nodes and workflows. I will continue exploring as I get familiar with the different nodes. The video is a great way to learn the different workflows, but of course watching is not enough; you need to install and set up IPAdapter Plus yourself to try out the different settings and combinations.
As shared in an earlier post, Stable Cascade produces very nice and coherent results and is leaps and bounds ahead of Stable Diffusion XL when it comes to producing text.
As you can see below, the text is pretty damn good!! Spelling may still be a problem, but it’s an evolution.
So here are my two workflows that will let you play around with ComfyUI and Stable Cascade. You can download them below.
You will need to download the models and place them in the folders specified below. You have the choice of downloading the full models, or the bf16 or lite versions; the smaller you go, the lower the quality compared to the bigger files.
ADVICE: Always download the .safetensors version of models.
When you import the workflow you will likely get some nodes that are missing (showing in red). Use ComfyUI Manager to install the missing nodes and you should be good to go!!!
Here are a bunch of images generated using the workflow above, which works quite well.
ComfyUI is very powerful when it comes to building custom workflows to generate images, videos and all kinds of creative content.
In this video tutorial I show you how you can quickly and easily install ComfyUI Manager in your ComfyUI install. The steps are very easy and hopefully with this video you are able to get started and benefit from the vast ecosystem of ComfyUI extensions/custom nodes.
Open a terminal and navigate to the custom_nodes folder (ComfyUI/custom_nodes) inside your ComfyUI folder. Then run:
git clone https://github.com/ltdrdata/ComfyUI-Manager.git
This will clone the repository onto your computer. You now have a new “Manager” button in the ComfyUI interface, which you can use to install new custom nodes, including missing nodes that can appear when you download a pre-built workflow. You can also keep ComfyUI and your extensions updated, and do much more, via ComfyUI Manager without having to run command lines and scripts.