DeepDream: Today Psychedelic Images, Tomorrow Unemployed Artists
Kaj Sotala
2015-09-25 00:00:00

If you know how DeepDream (DD) works, this is not too surprising in retrospect. The algorithm, much like the human visual system, works by first learning to recognize simple geometric shapes, such as (possibly curvy) lines. It then learns higher-level features that combine those lower-level ones, like learning that you can get an eyeball by combining lines in a certain way. The DD algorithm looks for either low- or high-level features and strengthens them.
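
To make the mechanism concrete, here’s a minimal sketch of this kind of gradient ascent in PyTorch. Everything here is an illustrative assumption on my part: the choice of VGG16, the layer indices, and the step sizes are placeholders, not the settings of the actual DeepDream implementation.

```python
# Minimal DeepDream-style gradient ascent sketch. Illustrative only:
# input preprocessing is omitted and hyperparameters are placeholders.
import torch
import torchvision.models as models

# Any convnet trained on a large image dataset will do; VGG16 is one choice.
model = models.vgg16(weights=models.VGG16_Weights.DEFAULT).features.eval()
for p in model.parameters():
    p.requires_grad_(False)

def dream(image, layer_index, steps=20, lr=0.05):
    """Exaggerate whatever features a chosen layer already 'sees' in the image.

    A low layer_index targets simple features (edges, curvy lines);
    a high layer_index targets complex ones (eyeballs, faces, buildings).
    """
    image = image.clone().requires_grad_(True)
    for _ in range(steps):
        activations = image
        for i, module in enumerate(model):
            activations = module(activations)
            if i == layer_index:
                break
        # Gradient *ascent* on the activation norm: change the image so the
        # layer responds more strongly, i.e. strengthen the features it detects.
        activations.norm().backward()
        with torch.no_grad():
            image += lr * image.grad / (image.grad.abs().mean() + 1e-8)
            image.grad.zero_()
    return image.detach()
```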





Lines in a low-quality image are noisy versions of lines in a high-quality image. The DD algorithm has learned to “know” what lines “should” look like, so if you run it on a low-level setting, it takes anything that could plausibly be interpreted as a high-quality (possibly curvy) line and turns it into one. Of course, what makes this fun is that it’s overly aggressive and also adds curvy lines that shouldn’t actually be there, but it wouldn’t necessarily need to do that. With the right tweaking, you could probably make it into a general-purpose image quality enhancer.
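
Continuing the sketch above: whether you get “clean up the lines” or “hallucinate extra ones” is largely a matter of which layer you target and how hard you push. These layer indices and step counts are arbitrary stand-ins for “low-level and gentle” versus “high-level and aggressive”:

```python
import torch

noisy_photo = torch.rand(1, 3, 224, 224)  # stand-in for a grainy photo

# Gentle, low-level run: nudge noisy edges toward clean "canonical" lines.
cleaned = dream(noisy_photo, layer_index=3, steps=5, lr=0.01)

# Aggressive, high-level run: the classic psychedelic eyeballs-everywhere mode.
psychedelic = dream(noisy_photo, layer_index=28, steps=40, lr=0.05)
```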



A very good one, since it wouldn’t be limited to just using the information that was actually in the image. Suppose you gave an artist a grainy image of a church, and asked them to draw something using that grainy picture as a reference. They could use that to draw a very detailed and high-quality picture of a church, because they would have seen enough churches to imagine what the building in the grainy image should look like in real life. A neural net trained on a sufficiently large dataset of images would effectively be doing the same.
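
As a sketch of how that training could work (these are my own assumptions, roughly in the spirit of super-resolution approaches, not anything described in this post): artificially degrade high-quality images, then teach a small convolutional net to undo the damage. The priors it absorbs from the training set are what would let it “imagine” detail that was never in the input.

```python
# Hedged sketch: train a small CNN to reverse a simulated degradation.
# Architecture and hyperparameters are illustrative placeholders.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Enhancer(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=9, padding=4), nn.ReLU(),
            nn.Conv2d(64, 32, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv2d(32, 3, kernel_size=5, padding=2),
        )

    def forward(self, x):
        return self.net(x)

def degrade(batch):
    """Simulate a cheap camera: downsample, upsample back, add noise."""
    small = F.interpolate(batch, scale_factor=0.25, mode="bilinear")
    blurry = F.interpolate(small, size=batch.shape[-2:], mode="bilinear")
    return (blurry + 0.05 * torch.randn_like(blurry)).clamp(0, 1)

model = Enhancer()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for batch in loader:  # `loader` yields batches of high-quality images (assumed)
    loss = F.mse_loss(model(degrade(batch)), batch)
    opt.zero_grad()
    loss.backward()
    opt.step()
```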



Suddenly, even if you were using a cheap and low-quality camera to take your photos, you could make them all look like high-quality ones. Of course, the neural net might be forced to invent some details, so your processed photos might differ somewhat from actual high-quality photos, but it would often be good enough.



But why stop there? We’ve already established that the net could use its prior knowledge of the world to fill in details that aren’t necessarily in the original picture. After all, it’s doing that with all the psychedelic pictures. The next version would be a network that could turn sketches into full-blown artwork.



Just imagine it. Maybe you’re making a game and need lots of art for it, but can’t afford to actually pay an artist. So you take a neural net and feed it a large dataset of the kind of art you want. Then you start making sketches that aren’t very good, but are at least recognizable as elven rangers or something. You give those to the neural net and have it fill in the details and correct your mistakes, and there you go!
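
In terms of the enhancer sketch earlier, this is the same training recipe with the degradation step swapped out. Here `to_sketch` and `art_loader` are hypothetical placeholders of mine, not real library functions:

```python
import torch.nn.functional as F

# Same training loop as the enhancer sketch, but the "damage" to undo is
# now "reduce finished artwork to a rough sketch". `to_sketch` is a
# hypothetical placeholder (e.g. an edge detector plus noise).
for artwork in art_loader:  # batches of the art style you want (assumed)
    loss = F.mse_loss(model(to_sketch(artwork)), artwork)
    opt.zero_grad()
    loss.backward()
    opt.step()
```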



If NN-generated art always had a distinctive, recognizable style, it would probably quickly come to be seen as cheap and low-status, especially if it wasn’t good at filling in the details. But it might not acquire that signature style, depending on how large a dataset was actually needed to train it.



Currently, deep learning approaches tend to require very large datasets, but as time goes on you might be able to make do with less. Then you could get an endless variety of different art styles, simply by combining any number of artists or art styles into a new training set, feeding that to a network, and getting a blend of their styles to use. People might even get paid to do nothing but look for good combinations of styles, and then sell the trained networks.
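
As one sketch of how blending might work, you could also mix styles without retraining at all, using the Gram-matrix notion of “style” from the style-copying algorithm mentioned in the edit at the end of this post (Gatys et al.). The weighting scheme here is my own illustrative assumption:

```python
import torch

def gram(features):
    """Gram matrix of conv features: channel-to-channel correlations,
    a rough 'fingerprint' of style that ignores where things are."""
    b, c, h, w = features.shape
    f = features.view(b, c, h * w)
    return f @ f.transpose(1, 2) / (c * h * w)

def blended_style_loss(image_features, artist_features, weights):
    """Distance between an image's style and a weighted blend of several
    artists' styles. `artist_features`: one feature tensor per artist,
    e.g. taken from a pretrained convnet layer; `weights` sum to 1."""
    target = sum(w * gram(f) for w, f in zip(weights, artist_features))
    return torch.mean((gram(image_features) - target) ** 2)

# E.g. 70% Artist A, 30% Artist B: the "good combination" someone might sell.
# loss = blended_style_loss(feats(img), [feats(a), feats(b)], [0.7, 0.3])
```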



Using neural nets to generate art would be limited to simple 2D images at first, but you could imagine it getting to the point of full-blown 3D models and CGI eventually.



And yes, this is obviously going to be used for porn as well. Here’s a bit of a creepy thing: nobody will need to hack the iCloud accounts of celebrities to get naked pictures of them anymore. Just take a picture of any clothed person, feed it to the right network, and it’ll probably be capable of showing you what that picture would look like if the person were naked. Or engaged in any number of kinks and fetishes.



It’s interesting that for all the talk about robots stealing our jobs, we always assumed that the creative class would basically be safe. Not necessarily so.



How far are we from that? Hard to tell, but I would expect at least the image quality enhancement versions to pop up very soon. Neural nets can already be trained on text corpora and generate lots of novel text that almost kind of makes sense. Magic cards, too. I would naively guess image enhancement to be an easier problem than generating genuinely sensible text (which seems AI-complete). And we just got an algorithm that can take two images of a scene and synthesize a third image from a different point of view, to name just the latest fun image-related result from my news feed. But then, I’m not an expert on predicting AI progress (few if any people are), so we’ll see.





EDITED TO ADD: On August 28th, less than two months after the publication of this article, the news broke of an algorithm that could learn to copy the style of an artist.



Image #1: "me, before and after drugs"