You may have seen that, last year, an AI-made painting sold for hundreds of thousands of dollars. Other research teams have designed algorithms capable of generating text that is uncannily realistic… At the same time, companies like Jukedeck or Amper believe that AI can be used to create realistic-sounding music. All these examples raise the question of whether it is possible to understand creativity and perhaps reimplement it in a program. Could robots actually learn to be creative and artistic beings? Is creativity an inherently human trait, or can we somehow analyze what makes us creative and turn it into a piece of software?
In a way, it is not true to say that these AI algorithms “create by themselves”. They have learnt: they have been trained on huge numbers of human-made references and are able to (re)produce patterns that we interpret as newly created art. Hence, many journalists and critics wonder to what extent the question is not “can AI invent?” but rather “can AI make us believe it invents?”.
With this project, Cali Rezo and I took a look at how to apply machine learning to art generation and analysis, and this series of articles will present some of our reflections, results and questions. It is split into six articles, one every Monday for the upcoming weeks:
- Project & Goals (this article): This article is about the project, how we came up with the idea and what goals we had in mind – even if they were quite fuzzy, I’ll admit, and it was more about trying out things and seeing what would happen!
- Generative models: In the second article, I will present the two most common generative models in AI, Variational Autoencoders (VAEs) and Generative Adversarial Networks (GANs), along with some images that our models produced with them
- Art analysis and classification: The third article will be devoted to tracking patterns in images and making predictions about the type of an image
- Follow-up on model explainability and uncertainty: The fourth article will focus more on the ecosystem of AI. The growing adoption of AI has confronted many data engineers with two new problems: how confident AI models are about their predictions, and how well we can actually understand the results of our models. Be it for risk assessment, ethics or pure R&D, the ability to open up the “black-box” model and look under the hood is sometimes preferred to raw performance, which has led the field of XAI (Explainable AI) to slowly develop
- Cali’s experience: The fifth article will present Cali’s point of view on the project and what she got out of it
- Conclusion & Additional Notes: Finally, the last article will sum up the key things we came across during this great project, the various questions it raised, and a few other things about Cali’s future events and the technologies we used
AI & Art: the project
I have always been fascinated by procedural generation and the idea of having mathematical rules and predefined code structures interact to create enough complexity for our brain to see the result as a brand new creation. To me, AI models and neural networks are themselves examples of emergence – the idea that a system built from simple parts has more properties than the parts on their own. In fact, a very crude description of the fancy AI models and neural networks everybody praises is simply: “a set of buttons that are gradually adjusted to approximate a function well”. To think that, from such a basic toolbox, you can get such complex behavior is quite incredible!
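To make this “button-tuning” image concrete, here is a toy sketch (an illustration, not part of our project’s code) where a single parameter is nudged by gradient descent until it approximates the function y = 3x:

```python
# A toy illustration of "buttons adjusted to approximate a function":
# fit a single weight w so that w * x matches the data y = 3 * x.
def fit_weight(pairs, lr=0.01, steps=1000):
    w = 0.0  # our single "button", starting from an arbitrary value
    for _ in range(steps):
        # gradient of the mean squared error with respect to w
        grad = sum(2 * (w * x - y) * x for x, y in pairs) / len(pairs)
        w -= lr * grad  # nudge the button against the gradient
    return w

pairs = [(x, 3 * x) for x in range(1, 6)]
w = fit_weight(pairs)
print(round(w, 3))  # → 3.0
```

A real neural network does exactly this, only with millions of such buttons adjusted at once.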
In a way, we already talked about this kind of unforeseen complexity when we discussed Conway’s Game of Life: with only a few rules, it can generate huge self-sustaining patterns, simulate real-life systems and even enter the world of fractals by having one Game of Life be a building cell of a bigger Game of Life…
Also, as you are probably aware, I have an on-going collaboration with the abstract painter Cali Rezo. We have both been curious about the links between art and technology for a long time, so it was kind of logical that we would end up collaborating on a project on this topic!
Cali is a hard worker who is able to fill dozens of notebooks in a few weeks. I am truly mesmerized by how beautiful and well-organized those notebooks can be, and it seems I am not the only one who likes them, given the reactions her videos get on her Instagram channel…
She kindly agreed to let us use her research notebooks as training examples for our AI models. We hoped the various models could learn to create similar forms or to analyze the artwork. Here is a quick peek at one of our reference notebooks:
When we started this project, we didn’t really have a clear goal. We just wanted to have fun with some AI models and see what we could come up with! We were more focused on generation: we wanted to test some simple ML structures to try and create images close to Cali’s work.
A bit later, when we started to actually get some results, we examined how different types of neural networks responded to our dataset and we thought of experimenting with classification.
The project also led us to interesting discussions that we thought we would share through this series.
The work process
As is often the case in data science and AI development, our working process was not only about programming and training a neural network. We also needed to do some pre- and post-processing on the images.
Since Cali has the good habit of doing her research in neatly arranged small boxes, I was able to write a small algorithm to identify bounding boxes and automatically extract a batch of training references from a scanned page of her notebooks. The program searches for clusters of pixels forming an image whose size or area is close to a given reference, and can then cut the initial image into multiple separate pieces. For example, here is what my program would extract from this page (with the small boxes’ average size as the reference):
This allowed us to quickly get about 600 training examples!
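As a rough sketch of how such an extraction step can work (an illustrative reimplementation, not our exact code), one can flood-fill connected clusters of dark pixels and keep those whose area is close to the reference:

```python
from collections import deque

def find_clusters(grid, ref_area, tol=0.5):
    """Return bounding boxes (top, left, bottom, right) of connected
    dark-pixel clusters whose area is within tol of ref_area.
    grid: 2D list of 0/1 values, where 1 marks a dark pixel."""
    h, w = len(grid), len(grid[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for i in range(h):
        for j in range(w):
            if grid[i][j] and not seen[i][j]:
                # flood-fill one cluster, tracking its extent and area
                queue, area = deque([(i, j)]), 0
                seen[i][j] = True
                top, left, bot, right = i, j, i, j
                while queue:
                    y, x = queue.popleft()
                    area += 1
                    top, bot = min(top, y), max(bot, y)
                    left, right = min(left, x), max(right, x)
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < h and 0 <= nx < w and grid[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                if abs(area - ref_area) <= tol * ref_area:
                    boxes.append((top, left, bot, right))
    return boxes

# A 4-pixel square is kept; the lone stray pixel is filtered out.
grid = [[1, 1, 0, 0, 0],
        [1, 1, 0, 0, 1],
        [0, 0, 0, 0, 0]]
print(find_clusters(grid, ref_area=4))  # → [(0, 0, 1, 1)]
```

Each surviving bounding box can then be cropped out of the scan as one training example.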
Because of limited computing power, and to allow more R&D on various topics, we decided to stick with quite small image sizes: 100×100 pixels as input and output dimensions. Even though our actual scanned references were larger, we reduced them so that our models wouldn’t take too long to train. Unfortunately, because of the architecture of those models, this also means that the images we can produce are limited to 100×100 pixels. But, still, this is great for our first experiments!
Upon loading, the images would also be “binarized”, meaning that each pixel would be fixed to either true black or true white, removing all the intermediate grays.
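A minimal version of this binarization step might look like the following (the threshold value of 128 is an assumption, not necessarily the one we used):

```python
import numpy as np

def binarize(img, threshold=128):
    """Map a grayscale array (values 0-255) to pure black (0) or
    pure white (255), erasing all intermediate gray levels."""
    return np.where(img >= threshold, 255, 0).astype(np.uint8)

img = np.array([[0, 90, 128, 200, 255]], dtype=np.uint8)
print(binarize(img))  # → [[  0   0 255 255 255]]
```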
The generative models
Once ready, the images could be fed directly to our generative models. This was the training phase.
And, finally, we would ask our models to generate new images and do some post-processing: removing pixel clusters that are really too tiny (and only make small holes on a black background or little black “blobs” in the middle of a white zone), smoothing, averaging…
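One simple way to implement this kind of cleanup (a sketch, not the exact filters we applied) is a 3×3 majority vote, which both erases isolated pixels and smooths jagged edges:

```python
import numpy as np

def majority_smooth(img):
    """img: 2D array of 0s and 1s. Each pixel takes the majority value
    of its 3x3 neighbourhood (borders padded by edge replication)."""
    h, w = img.shape
    padded = np.pad(img, 1, mode="edge")
    # summing the 9 shifted copies of the image gives, for every pixel,
    # the number of 1s in its 3x3 neighbourhood
    neighbourhood = sum(padded[dy:dy + h, dx:dx + w]
                        for dy in range(3) for dx in range(3))
    return (neighbourhood >= 5).astype(img.dtype)  # majority of 9

img = np.zeros((5, 5), dtype=int)
img[2, 2] = 1                       # a lone dark pixel in a white zone
print(majority_smooth(img).sum())   # → 0: the isolated blob is erased
```

Larger clusters survive the vote, so only the tiny artifacts disappear.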
Just as a teaser, here are some results we got with our VAE:
As you can see, it is far from perfect, and it probably lacks the true inventiveness that Cali has, but we were still impressed to get this after only half an hour of training!
For the classifiers, as will be discussed in the third article of the series, there was generally an additional step of feature extraction to help our models identify patterns a bit more easily.
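As an illustration, hand-crafted features can be as simple as global statistics of the image; the ones below are hypothetical examples, not the actual features we used:

```python
import numpy as np

def extract_features(img):
    """img: 2D array where 1 = ink and 0 = paper.
    Returns a small feature vector of global image statistics."""
    ink_ratio = img.mean()                       # fraction of dark pixels
    lr_symmetry = (img == img[:, ::-1]).mean()   # left-right mirror symmetry
    tb_symmetry = (img == img[::-1, :]).mean()   # top-bottom mirror symmetry
    return np.array([ink_ratio, lr_symmetry, tb_symmetry])

square = np.ones((4, 4), dtype=int)   # a fully inked, fully symmetric patch
print(extract_features(square))       # → [1. 1. 1.]
```

A classifier can then work on these compact vectors instead of raw pixels.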
After training them, we could then ask our models to predict the “type” of an image.
A small caveat, though: because we did supervised learning, we had to avoid using the same images for training and testing – otherwise, the model would look deceptively accurate because of overfitting! This meant that we had a smaller training dataset for this part of the project (I’ll talk about this more in depth in the third article).
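In practice, keeping the two sets strictly separate boils down to a split like this (a minimal sketch; the 20% test fraction is an arbitrary example):

```python
import random

def split_dataset(items, test_fraction=0.2, seed=42):
    """Shuffle the dataset and carve off a held-out test set that the
    model never sees during training, so measured accuracy reflects
    generalization rather than memorization."""
    items = list(items)
    random.Random(seed).shuffle(items)  # seeded shuffle for reproducibility
    n_test = int(len(items) * test_fraction)
    return items[n_test:], items[:n_test]  # (train, test)

images = list(range(600))        # stand-ins for our ~600 extracted examples
train, test = split_dataset(images)
print(len(train), len(test))     # → 480 120
```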
This article was just a short overview of what we did in this project, and we’ll give more details in the rest of the series. I hope you’ll enjoy reading about it as much as I enjoyed working on it!
In the next article, we’ll take a look at the VAEs and the GANs, two types of generative models that we used to create some “Cali Rezo-like” images.
And many thanks to Cali for graciously cleaning up some images generated by our models to make icons for this series of articles!
- Cali Rezo’s website: http://www.calirezo.com/site2015/
- Jukedeck’s website: https://www.jukedeck.com/
- Amper’s website: https://www.ampermusic.com/
- T. Graham, “Art made by AI is selling for thousands – is it any good?” (http://www.bbc.com/culture/story/20181210-art-made-by-ai-is-selling-for-thousands-is-it-any-good), December 2018 [Online; last access 21-April-2019].
- R. Metz, “This AI is so good at writing that its creators won’t let you use it” (https://edition.cnn.com/2019/02/18/tech/dangerous-ai-text-generator/index.html), February 2019 [Online; last access 21-April-2019].
- Wikimedia Foundation, “Autoencoder” (https://en.wikipedia.org/wiki/Autoencoder), March 2019 [Online; last access 21-April-2019].
- Wikimedia Foundation, “Generative adversarial network” (https://en.wikipedia.org/wiki/Generative_adversarial_network), April 2019 [Online; last access 21-April-2019].
- Wikimedia Foundation, “Explainable artificial intelligence” (https://en.wikipedia.org/wiki/Explainable_artificial_intelligence), April 2019 [Online; last access 21-April-2019].
- Wikimedia Foundation, “Emergence” (https://en.wikipedia.org/wiki/Emergence), April 2019 [Online; last access 21-April-2019].
- Wikimedia Foundation, “Overfitting” (https://en.wikipedia.org/wiki/Overfitting), February 2019 [Online; last access 22-April-2019].