Making AI Art with Midjourney

Midjourney is an independent research lab exploring AI applications, and it recently launched a bot that produces art from text inputs.

Others have built AI models that generate images from text, such as DALL-E and Wombo. But until now, Midjourney has been the best experience, at least for me.

My AI paintings 🎨

Further down I explain how to use the Midjourney algorithm on Discord, but first check out my paintings, all generated from text prompts!

"Data Scientist walking around a dystopian future market and buying goods with cryptocurrency"

Data Scientist walking around a dystopian future market and buying goods with cryptocurrency.png

"A leek with the face of wolverine, trying to fight a rainbow of puppets"

A leek with the face of wolverine, trying to fight a rainbow of puppets.jpg

"Fernando Pessoa in a very futuristic style"

mac.rodrigues_Fernando_Pessoa_in_a_very_futuristic_style_5640d2ee-97e6-473f-8e56-448cf8aa6705.png

"Gorgeous Mexican woman with freckles, dark hair, dark eyes and chiapas clothes"

gorgeous mexican woman with freckles, dark hair, dark eyes and chiapas clothes.png

Using Midjourney

The AI bot is still in beta, but you can try it on Discord. You can access the Discord server through Midjourney's website.

Once you are on Midjourney's Discord, you will find instructions for using the bot. Basically, you just need to join a "newbie" channel and enter:

/imagine <your quote>

The output will be 4 images like this:

mac.rodrigues_Fernando_Pessoa_in_a_very_futuristic_style_9ffbd501-c7d0-48e8-a8b9-0fe1e9f42224.png

You can then choose to get a high-resolution version of one of the 4 images, or get more versions of one of them.

Note that you don't get many tries as a free user (around 25 queries), so make sure to prompt the best ideas that come to mind! (or not, just have fun 😄)

Under the hood 💻

The technical background behind the magic quickly gets complex to explain, but it mainly involves 3 parts to output the image: the training data, the deep learning model, and the latent space.

Schematic AI ART.png

The training data passes through the deep learning model (a CNN) and is encoded into a low-dimensional latent space before classification. The latent space basically allows for data compression.

At the end of training, the last layer of the model captures the patterns in the input that are needed for image classification. In the latent space, images labelled as the same object have very close representations.

Source: https://www.baeldung.com/cs/dl-latent-space
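
To make the latent space idea a bit more concrete, here is a tiny Python sketch. It is only my own toy illustration, not Midjourney's model and not a real CNN: the "images" are made-up lists of 8 pixel values, and the hypothetical `encode` function simply averages blocks of pixels instead of learning anything from training data. It just shows the two ideas described above: the encoding is smaller than the input (compression), and images of the same kind end up close together.

```python
import math


def encode(image, block=4):
    """Toy 'encoder' (an assumption for illustration): average each block
    of pixels, turning 8 pixel values into a 2-number latent vector."""
    return [sum(image[i:i + block]) / block for i in range(0, len(image), block)]


def distance(a, b):
    """Euclidean distance between two latent vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Made-up 8-pixel "images": two that are bright on the left (same label)
# and one that is bright on the right (different label).
images = {
    "cat_1": [0.9, 0.8, 1.0, 0.9, 0.1, 0.0, 0.2, 0.1],
    "cat_2": [1.0, 0.9, 0.9, 1.0, 0.0, 0.1, 0.1, 0.0],
    "dog_1": [0.1, 0.2, 0.0, 0.1, 0.9, 1.0, 0.8, 0.9],
}

# Encode every image into the low-dimensional latent space.
latent = {name: encode(pixels) for name, pixels in images.items()}

same_label = distance(latent["cat_1"], latent["cat_2"])
different_label = distance(latent["cat_1"], latent["dog_1"])

print("latent vectors:", latent)
print("cat_1 vs cat_2:", round(same_label, 3))
print("cat_1 vs dog_1:", round(different_label, 3))
```

In a real model the encoder is learned during training rather than hard-coded, and the latent space has many more dimensions, but the intuition is the same: the two "cat" images land much closer to each other than to the "dog" image.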

For more technical information, I found this video very enlightening:

Other sources: