AI Generated Portraits of Two People

Reading Time: 8 minutes

Using Dreambooth to train more than one subject token at a time to generate portraits using Stable Diffusion

TLDR; It didn’t work.

My instagram feed is full of selfies that are AI generated self portraits. Often surprising and evocative, a passel of sites have quickly spun up apps which let people stylize a headshot. More figurative images seem to possess the same sensibility and style as those of well-known artists. I have been doing a deeper dive into the world of generative art. Unlike the online apps like Lensa, which stylize a single image, I have been using Stable Diffusion to create a model that can generate images of multiple specific people in a wide range of scenes.

I got ahead of myself and conducted my 4th, 5th and sixth experiments without writing them up. In my second experiment I moved to customizing a Stable diffusion model.  In the third experiment I used a larger training set of better quality images. Taking the learnings from that experiment I resolved to try a different training set.  At this point I think people are getting tired of seeing my AI generated self portraits, so I found a new model. (My partner is a good sport) I also decided to fine-tune the model on two tokens at once. In this case my daughter Anne and my partner, Meighen. Instead of Shivram’s colab I used fast-DreamBooth colab from https://github.com/TheLastBen/fast-stable-diffusion. No real motivation except that online comments said it was faster. 

Preparing the Self Portrait Instance Images for Dreambooth

For this experiment I used 45 images of meighen and 26 of Anne. Some people have gotten results from using as few as 5 instance images. My experience has been a different in my previous experiments. One prepares the instance images by naming them all with the name of the token. E.g. token.jpg, token (1).jpg, token(2).jpg. This takes some doing.  I used Adobe Photoshop to batch process. First, I opened all the instance images I was going to use for that token and processed them by correcting the color, tone, balance, removing extraneous objects or people and cropping them to square aspect ratio.

Next, I picked one image and recorded a series of actions. First, I resized the image to 512 x 512 pixels and saved to token.jpg. I stopped recording. Next, Having completed one image, I automated a batch process for all the open images.  In the batch dialog box, I clicked override saving commands and built a little naming template that gave me result I wanted.  With all the instances ready.  I opened up the colab and clicked to the steps. Tool tip: It’s much faster to drag and drop your images into the file browser in the colab interface than to use the script. 

Contact sheet of instance photos of notmeighen
Contact sheet of instance photos of notmeighen

The Right Number of Training Steps

In my first run, I over-fitted the AI generated self portrait model with 20,000 steps.  This became apparent when I went to create images. I could not get any reasonable likeness – Only mangled blobs. For the sake of my model’s cooperation, I deleted the images and started over. I ran the training again with a more reasonable 10,000 steps and was ready to run some inferences. 

AI generated self portraits by prompting with two tokens trained at the same time.

I started off with a simple prompt: A photo of Anne and Meighen together:

AI Image Generator Stable Diffusion Dreambooth custom token image of girl and woman, features bleeding together
Photo of notmeighen and notanne.

As you can see the features of each token blended together and I got a composite. This bleed over could be very hard to overcome.using just prompts. Accordingly, I decided that I would pull on that thread later and concentrated on seeing what kind of results I could get for just one of the two  tokens I had trained. I apologize for not being very methodical in my attempts. I wanted to try all the myriad prompts I could imagine. The rabbit hole was deep, and I had a hard time climbing out of it. After thrashing around for hours, I got back down to business and started and started running the same prompts that I ran for my last experiment.  

First results for just one of the tokens

I started out with the default prompt in the notebook. I configured the inference console to generate 4 images at 512px x 512px.

AI Image Generator Stable Diffusion Dreambooth custom token image of woman digital painting
photo of notmeighen person, digital painting

Not a great likeness of Meighen if you compare against the training instance images. I got some really different results than I did the last time with the model fine-tuned on the token representing my likeness. 

Other Prompts

I ran through some of the highlight prompts from my other experiments. Starting with Studio 54

AI Image Generator Stable Diffusion Dreambooth custom token image of woman at Studio 54
closeup photograph of notmeighen at Studio 54 20 steps, 7 cfg, euler a
closeup photo of notmeighen woman partying with two men at Studio 54 20 steps, 7 cfg, euler a

I Trolled Myself

Even though I know these are synthetic pictures, I started to get a little jealous.  Who are these guys in the pictures?  I don’t remember hearing about any of this.

The WebUI Interface from Automatic111

The WebUI Interface is a lot more powerful than the interface I was using inline in Shivam Shirao’s colab. (image of interface) There are a lot more tools to tweak. The full documentation of the WebUI features is here. I was able to choose which sampler to use, the CFG Scale, the number of steps, and the seed to get image results that were more coherent.  In the following images, I will try to indicate the settings for each of these. 

Movie Key Art

beautiful curvaceous ample shapely notmeighen woman and handsome man, sci-fi movie key art by Bob Peak, drawn by Drew Struzan, by Richard Amsel, by John Alvin, Paris, detailed, hyperrealistic, photorealistic, psychedelic 40 steps, 20 cfg, DDIM
buxom, beautiful, notmeighen woman and attractive debonair 55-year-old man, Text reads “Schatz” Campy romantic comedy movie key art by Bob Peak, drawn by Drew Struzan, by Richard Amsel, by John Alvin, By Norman Rockwell, trending on artstation, film noir, San Francisco, detailed, photorealistic, psychedelic 40 steps, 7 cfg, DDIM
beautiful curvaceous ample shapely notmeighen woman and handsome man, sci-fi movie key art by Bob Peak, drawn by Drew Struzan, by Richard Amsel, by John Alvin, Paris, detailed, hyperrealistic, photorealistic, psychedelic 40 steps, 20 cfg, DDIM
beautiful curvaceous ample shapely notmeighen woman and handsome man, sci-fi movie key art by Bob Peak, drawn by Drew Struzan, by Richard Amsel, by John Alvin, Paris, detailed, hyperrealistic, photorealistic, psychedelic 40 steps, 20 cfg, DDIM

James Bond Cast As A Woman

The model didn’t really want to cooperate here. Perhaps I over trained the model so that it wouldn’t stylize. To compensate, I prompted the token along with the words Natasha Romanov, Black Widow from the Marvel Cinematic Universe.

Portrait of notmeighen as james bond,  key art, running, highly detailed, digital painting, trending on artstation, concept art, cinematic lighting, sharp focus, illustration, by gaston bussiere, alphonse mucha, john alvin 40 steps 1 cfg euler a
Portrait of notmeighen as natasha romanov,  key art, running, highly detailed, digital painting, trending on artstation, concept art, cinematic lighting, sharp focus, illustration, by gaston bussiere, alphonse mucha, john alvin 20 steps, 7 cfg euler a

Cross-gender Dating App Photo

Gender swapping shows the real power of AI generated self portraits. Trying to do a gender swap dating profile proved difficult. I guess that Meighen’s training images are just too girly to easily switch.  My first couple of attempts yielded generated images that were all decidedly female. To overcome this tendency, I added adjectives that primarily used with men until I got the AI to spit out a picture of a man. 

handsome, attractive, masculine, large, lithe, notmeighen man dating profile photo 30 steps 4 cfg euler a
handsome, attractive, masculine, large, lithe, notmeighen man dating profile photo 30 steps 4 cfg euler a
handsome, attractive, masculine, large, lithe, notmeighen man dating profile photo 30 steps 4 cfg euler a
handsome, attractive, masculine, large, lithe, notmeighen man dating profile photo 30 steps 4 cfg euler a

Better luck with Rembrandt Than Annie Leibovitz

notmeighen woman by Rembrandt
notmeighen woman by Annie Leibovitz

The AI generated self portraits from the prompt of Rembrandt gave a good likeness of Meighen and clear style of Rembrandt. For Annie Leibovitz, the AI seemingly could not resist including the features of the photographer’s likeness to the generated image.

Miscellaneous Other Prompts for AI Generated Self Portraits

Basically I tried to use the same prompts as in my other two experiments. Results varied.

A painting of a smug notmeighen woman by Henry Asencio, by Paul Wright, by Davide Cambria 20 steps 7 cfg
notmeighen woman as a warrior in a medieval fantasy, intricate clothing, Lord of the Rings style, 4k, high definition, photorealistic, unreal engine 5, 50mm, f1.8, handsome face, insanely detailed, cinematic lighting, dynamic pose, oil painting, art by artgerm, trending in artstation, art by Greg Rutkowski
a closeup photo of notmeighen woman as a post apocalyptic goddess at burning man festival playa, powerful, cinematic, beautifully lit, by artgerm, by craig mullins, by galan pang, 3 d, trending on artstation, octane render, 8 k  40 steps 9 cfg
Closeup Movie still of a mischievous notmeighen woman film noir
realsabin female elegant, ultra highly detailed, digital painting, smooth, sharp focus, artstation, pixiv, art by Ina Wong, Bo Chen, artgerm, rossdraws, sakimichan 30- steps 7 cfg
beautiful-artificial-cybernetic-cyberpunk-beautiful-cyborg-robot-notmeighen-female-gorgeous-exquisite-detail-woman
beautiful artificial cybernetic cyberpunk beautiful cyborg robot-notmeighen female gorgeous exquisite detail woman

Classic Mugshot

The mugshot is turning out to be one of my reference prompts. It captures a mood and a context that many other prompts don’t. The generated images also seem to pull out the most distinctive features of the token in the model and render them in the worst possible light. I got a really good one with the notmeighen token.

police booking mugshot photo of notmeighen woman 20 steps 7 cfg

Next Experiment

I generated a lot more images. For example the prompt “photo of attractive notmeighen woman corporate headshot technology, smart 40 steps 12 cfg” but the are mediocre and don’t add to the understanding of the potential of fine-tuning Stable Diffusion v15 with Dreambooth. Since then i have run a couple more experiments. For my next experiment, I trained a model on all four members of the Speiser family. My intention was to put Christmas card of ridiculous scenarios. Hilarious results ensue!