Oct 2
Recently, François Chollet, the creator of Keras, announced that Stable Diffusion is now available for TensorFlow through KerasCV.
So, I decided to give it a shot on Google Colab.
But before we start, how does Stable Diffusion work?
References: BigThink, YouTube
This is quite a simple explanation of how Stable Diffusion works, but I think it makes sense, at least to me.
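The core idea, greatly simplified: start from pure random noise and repeatedly subtract the noise a trained network predicts, until an image remains. Here is a toy sketch of that loop, where the hypothetical `fake_denoiser` stands in for the real trained U-Net and text conditioning is omitted entirely:

```python
import numpy as np

def fake_denoiser(x, t):
    # Hypothetical stand-in for the trained U-Net:
    # predicts the noise present in x at diffusion step t.
    return 0.1 * x

def generate(steps=50, shape=(64, 64, 3), seed=0):
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(shape)      # start from pure noise
    for t in reversed(range(steps)):
        predicted_noise = fake_denoiser(x, t)
        x = x - predicted_noise         # peel away a little predicted noise each step
    return x

image = generate()
```

The real model does the same kind of iterative denoising, just with a large network conditioned on your text prompt.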
Anyway, here’s my review of Stable Diffusion with TensorFlow on Google Colab. Just to note, all of this is running on GPU.
To run the Colab notebook, it takes about two minutes to download the required packages, one minute to load them, and a further minute to produce a single image.
So, in total it takes about four to five minutes to produce the initial image, and about one minute for each follow-up image. When producing batches, each image adds a further minute: four images take four minutes, not counting the roughly three-minute set-up.
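The arithmetic above, as a throwaway helper (the timings are assumptions from my runs: about three minutes of set-up and about one minute per image):

```python
def estimated_minutes(num_images, setup_minutes=3, minutes_per_image=1):
    """Rough wall-clock estimate for generating images in this Colab notebook."""
    return setup_minutes + num_images * minutes_per_image

# Four images: 3 + 4 = 7 minutes including set-up,
# or 4 minutes once the notebook is already warm.
```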
For something running in a non-production environment, I didn't have high hopes that it would even generate an image, but I consider five minutes quite good, especially with subsequent images taking only a further minute each.
I guess you could make this faster by purchasing more cloud resources or deploying the package on a decent rig, but I'm satisfied with the wait time.
I’m also on the DALL-E beta, so comparing the two in terms of speed, DALL-E is a whole lot faster: it produces four images in around 30 seconds, while Stable Diffusion on this Colab notebook takes four minutes. These aren’t comparable situations, as DALL-E is a production API, but I mention the times out of interest.
As mentioned, I've already used DALL-E, so how does Stable Diffusion compare?
I've found them comparable but I haven't pushed either model because I'm not great at writing inputs.
Using a small sample size, I would say that both systems produce high-quality and decent images.
Here's an example of outputs from Stable Diffusion and DALL-E.
Prompt: “DSLR photograph of an astronaut riding a horse.”
[Image: Stable Diffusion output] [Image: DALL-E output]
They both look pretty good if you ask me!
Stable Diffusion is trained on the LAION dataset, which is built from billions of image-text pairs scraped from the web, with images scored on their aesthetic value.
I found it a bit hard to discover the underlying training set for DALL-E, but my guess, based on some OpenAI articles (DALL-E, CLIP), is that a chunk of the learning came from ImageNet and the internet in general.
In the big scheme of things, I don't think the finer details of the training set of either model matter as we're talking about billions of images here.
I would probably just say that Stable Diffusion produces good quality images and with the right inputs you’re going to create stunning images.
Websites such as NightCafe and Midjourney have already productionized Stable Diffusion, and their outputs also look fantastic.
Stable Diffusion is quite easy to use. Like DALL-E, all you need to do is enter your input phrase, wait for the image to load, and there you go.
I've yet to put the TensorFlow variant of Stable Diffusion into production, but I don't feel it would be extremely hard.
I would imagine it involves creating a user interface (say, with React) that sends the input prompt over an API to the TensorFlow Stable Diffusion model, which generates the images and returns them to the interface. I haven't done it myself, but I don't think it would be tremendously difficult given time, basic programming skills in Python and JavaScript, and some knowledge of cloud computing.
So, is it worth using? The answer is yes. For a free generative AI model, it works fantastically: you don't need to pay to make images; all you have to do is wait.
It's still early days, so it's hard to know whether Stable Diffusion will be a DALL-E killer, but given the free source code, it just might be.
However, it all comes down to price and service. Even if DALL-E and Stable Diffusion offer the same high-quality images, similar wait times, and training sets of billions of images, it'll be the service that provides the better customer support that wins out.
So, we'll just need to wait and see.
If this article has enlightened you, please sign up for Medium and become a member using my referral link.