Filed Under #mlforweb

Final: Anime/Manga-Style Filter with PoseNet and Style Transfer

alt text

I made a selfie filter that adjusts a photo in the browser using style transfer and poseNet models with ml5. The filter mimics the art style of anime and manga (Japanese animation and comics), transforming a photo of the user into a comic character. Similar to filter apps like Meitu, the webpage will take in a user’s photo, process it, and then let them download the result.

Instagram post shared by Jane Douglas (@penny_dreadful)

Style Transfer

First, I tested different style transfer models to find the one that would best give a photo an ‘anime style’. I wasn’t sure how different styles of anime drawings would translate to style transfer models, so I tried several types of ‘anime style’ depictions. I think the first one gave the best result, although I like the fourth one too. I used this method to train a style transfer model remotely with Spell and convert it for use with ml5.

alt text

As you can see, style transfer doesn’t change the size and shape of the face. So even if the drawings have exaggerated eyes, the style transfer output won’t. Yining, my teacher, suggested I use poseNet to enlarge the eyes and add other elements. I photoshopped the original photo with the effects I wanted in order to test how they would look after style transfer.

PoseNet

I used poseNet with ml5 to detect the eyes and nose of the person in the photo. I then used those keypoints as anchors to enlarge the eyes and to place blush and sparkles relative to them. I started by testing poseNet on a webcam video stream and then moved to an image input. alt text

alt text alt text alt text
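
As a rough illustration of how the eye keypoints can drive the enlargement, the sketch below computes a source rectangle and a larger destination rectangle for each eye. It assumes the pose object has the shape that ml5's poseNet hands back (a keypoints array whose entries carry part and position). The helper names, the 1.5 scale, and the 0.5 box ratio are hypothetical choices for this example, not values taken from the project.

```javascript
// Hypothetical helper: look up a keypoint position by its PoseNet part name.
function findPart(pose, partName) {
  const keypoint = pose.keypoints.find((k) => k.part === partName);
  return keypoint ? keypoint.position : null;
}

// Hypothetical helper: for each eye, return the rectangle to copy from (src)
// and the larger rectangle to draw it into (dst), both centered on the eye.
function eyeRegions(pose, scale = 1.5, boxRatio = 0.5) {
  const left = findPart(pose, 'leftEye');
  const right = findPart(pose, 'rightEye');
  if (!left || !right) return [];

  // Size everything relative to the distance between the eyes so the effect
  // scales with how large the face is in the photo.
  const eyeDistance = Math.hypot(left.x - right.x, left.y - right.y);
  const srcSize = eyeDistance * boxRatio;
  const dstSize = srcSize * scale;

  const centeredBox = (center, size) => ({
    x: center.x - size / 2,
    y: center.y - size / 2,
    w: size,
    h: size,
  });

  return [left, right].map((eye) => ({
    src: centeredBox(eye, srcSize),
    dst: centeredBox(eye, dstSize),
  }));
}

// Example with made-up keypoints.
const samplePose = {
  keypoints: [
    { part: 'nose', position: { x: 250, y: 260 }, score: 0.99 },
    { part: 'leftEye', position: { x: 300, y: 200 }, score: 0.98 },
    { part: 'rightEye', position: { x: 200, y: 200 }, score: 0.98 },
  ],
};
const sampleRegions = eyeRegions(samplePose);
console.log(sampleRegions[0].src, sampleRegions[0].dst);
```

In a p5 sketch, each src/dst pair could be passed to copy(img, sx, sy, sw, sh, dx, dy, dw, dh) to draw the enlarged eye over the original.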

The blush marks are ellipses drawn relative to the eye and nose keypoints. The sparkles are just a transparent PNG image. alt text alt text
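
One way to place blush relative to the keypoints is sketched below. The proportions (a quarter of the eye distance outward, cheeks at nose height) and the function name are assumptions made for illustration. The project's actual offsets may differ.

```javascript
// Hypothetical helper: compute one blush ellipse per cheek from three keypoints.
// Each argument is an {x, y} point in canvas pixels.
function blushEllipses(leftEye, rightEye, nose) {
  const eyeDistance = Math.hypot(leftEye.x - rightEye.x, leftEye.y - rightEye.y);
  return [leftEye, rightEye].map((eye) => {
    // Push the blush away from the nose so it lands on the outer cheek.
    const outward = Math.sign(eye.x - nose.x);
    return {
      x: eye.x + outward * eyeDistance * 0.25,
      y: nose.y, // assumed: cheeks sit at roughly nose height
      w: eyeDistance * 0.5,
      h: eyeDistance * 0.25,
    };
  });
}

// Example with made-up keypoints.
const sampleCheeks = blushEllipses(
  { x: 300, y: 200 },
  { x: 200, y: 200 },
  { x: 250, y: 260 }
);
console.log(sampleCheeks);
```

Each result could then be drawn in p5 with noStroke(), a semi-transparent pink fill(), and ellipse(x, y, w, h).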

I then saved frames of the canvas and created an image of the first frame. I fed that image into the style transfer model to get the final result. alt text

The results are a bit scary, mainly because of the rough edges of the enlarged eyes. In the future, I would like to use p5.mask with a radial gradient circle image instead of just drawing an ellipse, to blend the layers better.
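
To make the radial gradient idea concrete, here is a minimal sketch of the falloff such a mask could use: fully opaque in the middle, fading linearly to transparent at the edge. The function names and the 0.5 solid ratio are hypothetical, and a real mask would need these values written into the alpha channel of an image.

```javascript
// Hypothetical helper: opacity (0-255) at a given distance from the mask center.
function radialAlpha(distance, radius, solidRatio = 0.5) {
  const solidRadius = radius * solidRatio;
  if (distance <= solidRadius) return 255; // fully opaque core
  if (distance >= radius) return 0; // fully transparent outside the circle
  return Math.round((255 * (radius - distance)) / (radius - solidRadius));
}

// Build a square, single-channel mask as a flat array of alpha values.
function buildRadialMask(size, solidRatio = 0.5) {
  const center = (size - 1) / 2;
  const radius = size / 2;
  const mask = new Uint8ClampedArray(size * size);
  for (let y = 0; y < size; y++) {
    for (let x = 0; x < size; x++) {
      const distance = Math.hypot(x - center, y - center);
      mask[y * size + x] = radialAlpha(distance, radius, solidRatio);
    }
  }
  return mask;
}

const sampleMask = buildRadialMask(5);
console.log(Array.from(sampleMask));
```

The soft edge is what hides the seam: pixels near the rim of the enlarged eye are mostly transparent, so the original face shows through there.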


I tested with another image of myself to see if the results would be comparable. The original image must be 500 by 500 pixels right now. alt text alt text alt text
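
The 500 by 500 requirement could be relaxed by cropping any upload to a centered square and scaling it down. The sketch below only computes the crop rectangle, and the function name is hypothetical.

```javascript
// Hypothetical helper: largest centered square inside a width x height image.
function centerSquare(width, height) {
  const size = Math.min(width, height);
  return {
    sx: Math.floor((width - size) / 2),
    sy: Math.floor((height - size) / 2),
    size,
  };
}

const sampleCrop = centerSquare(1200, 800);
console.log(sampleCrop);
```

In p5, the crop could be drawn onto a 500 by 500 canvas with image(img, 0, 0, 500, 500, crop.sx, crop.sy, crop.size, crop.size).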

It works the same as the first image I tried but it’s still scary.

Edit 12/18/18: Improvements

I worked on other aspects of this filter for another class and was able to improve the blending around the eyes with a PNG mask file. alt text

My final results as of now with different style transfer models: alt text alt text

It works with more than 1 pose too if you change the poseNet function call from poseNet.singlePose(img); to poseNet.multiPose(img);. alt text alt text
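
If the filter code loops over the results instead of reading only the first one, the same drawing logic can serve both calls. This sketch assumes the results arrive as an array of objects with a pose that has a score, as in the ml5 poseNet examples. The helper name and the score threshold are hypothetical.

```javascript
// Hypothetical helper: run the same drawing callback for every detected person
// whose overall pose score is high enough. Returns how many poses were used.
function forEachConfidentPose(results, minScore, drawFilter) {
  let applied = 0;
  for (const result of results) {
    if (result.pose.score >= minScore) {
      drawFilter(result.pose);
      applied++;
    }
  }
  return applied;
}

// Example with made-up results for two people, one of them a weak detection.
const sampleResults = [
  { pose: { score: 0.92, keypoints: [] } },
  { pose: { score: 0.12, keypoints: [] } },
];
const usedCount = forEachConfidentPose(sampleResults, 0.3, (pose) => {
  console.log('applying filter to pose with score', pose.score);
});
console.log(usedCount);
```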

To Do

Originally, I also wanted to use body-pix to detect the silhouette of the figure and remove the background, but I ran out of time. That would let me place the filtered person in the original photorealistic environment or replace the background with something else completely.
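
The compositing step could look something like the sketch below, which has not been tried in this project. It assumes the segmentation is available as one value per pixel (1 for person, 0 for background) and that both images are flat RGBA arrays of the same size. The function name is hypothetical.

```javascript
// Hypothetical helper: keep person pixels from the filtered image and take
// everything else from a replacement background.
// filtered, background: flat RGBA arrays of the same size (4 bytes per pixel).
// personMask: one value per pixel, 1 for person and 0 for background (assumed).
function compositePerson(filtered, background, personMask) {
  const out = new Uint8ClampedArray(filtered.length);
  for (let i = 0; i < personMask.length; i++) {
    const source = personMask[i] === 1 ? filtered : background;
    for (let c = 0; c < 4; c++) {
      out[i * 4 + c] = source[i * 4 + c];
    }
  }
  return out;
}

// Example with two made-up pixels: the first is person, the second is not.
const sampleFiltered = [255, 0, 0, 255, 255, 0, 0, 255];
const sampleBackground = [0, 0, 255, 255, 0, 0, 255, 255];
console.log(Array.from(compositePerson(sampleFiltered, sampleBackground, [1, 0])));
```

Passing the original photo as the background would keep the photorealistic environment. Passing any other image would swap the background out.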

Photoshopped idea of what I was going for: alt text

Most importantly, I want people to be able to upload a photo and then download the result. The user interface still needs to be built. I would also like them to be able to choose between single pose and multiple poses, and between different style transfer models.

One issue I thought of is that this model probably won’t work well for people with darker skin tones. The Meitu app had a similar issue. I may need to use a different type of style transfer or experiment with parameters so that the colors of the style are not applied as strongly.

Link to project github repo

Try the demo here

PoseNet single pose image example

Fast style transfer with Spell and ml5 tutorial

Blog post on typography part of filter (in progress)

Written on December 11, 2018

Final: Ideas and Test

For the final, I wanted to use style transfer to experiment with formatting memes. I have been making memes on Instagram in a specific style over the past two years. I noticed that some friends who also do this each have their own style for formatting the images and text. I wanted to see if I could train a model for each style of meme and then, for example, use style transfer to change the formatting of a meme so it looks like it was made by me. alt text Images used for this test are by me, @djinn_kazama, and @renaissance__man.

The final result was not that good. Maybe by adjusting the settings I can keep more of the input image in the output and less of the model’s style? I would also like to know if I can train a model using multiple images (since I have a lot of memes with the same style) rather than just selecting one. The one I selected to train the model with has green and red accents, and I sometimes use other specific colors like blue and magenta.

alt text try it here live

github repo

I understand that my project isn’t using style transfer for exactly what it was intended for, since I’m also using text. I’m just interested in having varied and interesting outputs rather than the output being readable as a meme. I may have better results with a more complex visual style (if I trained the model on my friend’s stuff instead).

My second idea, which I didn’t get to test, is to make a style transfer filter that makes a user’s selfie look like a computer-generated graphic, kind of like The Sims or IMVU.

Written on November 27, 2018

Training LSTM: Fortune Cookie Fortune Generator

I wanted to see if I could generate new fortune cookie fortunes. I trained an LSTM network on this dataset of fortunes. The dataset is rather small (around 2,800 quotes). I chose to try the LSTM network this week because I didn’t get a chance to in the first week.

The final results are not that readable or understandable. To get more fortune cookie-like results, I think I should clean the dataset: get rid of the unnecessary separator characters (%) and remove the names of the people each quote is attributed to.
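
A cleanup pass along those lines might look like the sketch below. It assumes the file puts a lone % on the line between quotes and that attribution lines start with "--" after optional whitespace. I haven't checked that the dataset follows this layout everywhere, and the function name is hypothetical.

```javascript
// Hypothetical helper: turn a raw fortune file into an array of clean quotes.
function cleanFortunes(raw) {
  return raw
    .split(/^%[ \t]*$/m) // split on lines that contain only the % separator
    .map((entry) =>
      entry
        .split(/\r?\n/)
        .filter((line) => !/^\s*--/.test(line)) // drop attribution lines (assumed format)
        .map((line) => line.trim())
        .filter((line) => line.length > 0)
        .join(' ')
    )
    .filter((entry) => entry.length > 0);
}

// Example with a made-up excerpt.
const sampleRaw =
  'You will make many friends.\n%\nFortune favors the bold.\n\t\t-- Virgil\n%\n';
console.log(cleanFortunes(sampleRaw));
```

Joining the cleaned quotes with newlines would give a training file with one fortune per line.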

try it here
link to github repo

Written on November 20, 2018

Exploring ML5 Examples

I made a simple webapp that recognizes if you are showing it a teddy bear. The webpage asks you to show the bear and when it recognizes it, it thanks you. I adapted the ml5 p5.js image classification video example.

My code looks for the “teddy, teddy bear” result, but you could change this to any of the other results that are available.
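
The check can be as small as the sketch below. The field names on classifier results have differed between ml5 releases as far as I know (label and confidence in some, className and probability in others), so this hypothetical helper accepts either. The 0.5 confidence threshold is also an assumption, not something from my code.

```javascript
// Hypothetical helper: does any classification result match the target label
// with at least the given confidence?
function sawTarget(results, target = 'teddy, teddy bear', minConfidence = 0.5) {
  return results.some((result) => {
    const name = result.label !== undefined ? result.label : result.className;
    const confidence =
      result.confidence !== undefined ? result.confidence : result.probability;
    return name === target && confidence >= minConfidence;
  });
}

// Example with made-up results.
const sampleClassification = [
  { label: 'teddy, teddy bear', confidence: 0.83 },
  { label: 'toy poodle', confidence: 0.05 },
];
console.log(sawTarget(sampleClassification) ? 'Thank you!' : 'Please show me the bear.');
```

Passing a different label as the second argument switches the page to another object.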

Video demo:

my github repo with code

Written on November 6, 2018