Date: Sat, 18 Nov 2017 20:47:42 -0800
From: Bret Victor
Subject: Re: using AI to augment human intelligence
Hi Michael and Shan,

Thanks for sending me the draft.  I read it carefully and wrote some notes below.

The notes may read as oppositional, because, well, I'm opposed.  But I hope this perspective on the work can be useful to you.  (If you don't think it's useful, feel free to discard.)

Best,
-Bret


Research on IA has often been in competition with research on artificial intelligence (AI): competition for funding, competition for the interest of talented researchers

They've perhaps been in competition for "overriding conceptual paradigm", but I can't (offhand) think of examples of talented researchers who were torn between the two, or funding diverted from one to the other.

font tool

(Dragging doesn't work in Safari.)

Such generative models are similar in some ways to how scientific theories work. Scientific theories often greatly simplify the description of what appear to be complex phenomena, reducing large numbers of variables to just a few variables from which many aspects of system behaviour can be deduced. Furthermore, good scientific theories sometimes enable us to generalize to discover new phenomena.

Although the greatest concern, and driver, of many scientists is the facts that don't fit the theory (e.g. the Lamb shift).  Scientific theories strive to account for all the facts, and a single misprediction both invalidates the theory and indicates a promising direction for investigation.  ML models, and their users, don't seem to operate in this mode.

The heuristic of preserving enclosed negative space is not a priori obvious. However, it’s done in many professionally designed fonts. If you examine examples like those shown above closely it’s easy to see why: it improves legibility. During training, our generative model has automatically inferred this principle from the examples it’s seen. And our bolding interface then automatically makes this available to the user.

Makes what available to the user?  The bolded font, or the knowledge of the principle that's supposedly been inferred?

I would argue that a tool that augments one's intelligence must leave that person with a better understanding of whatever domain they're working in.  If the tool develops an "understanding", but does not induce that understanding into the user, I don't think it can be called an IA tool.

One of Alan's favorite analogies is the prosthesis on a healthy limb.  Here's one phrasing of it (from here):

Put a prosthetic on a healthy limb and it withers. Using the logic of current day education, we could say that since students are going to be drivers as adults, at age two we should put them in a little motorized vehicle and they will just stay there and learn how to be much better drivers. Now, we would think that was pretty horrible. But what if we gave the same person a bike? We're not going to feel so badly [because] the bike allows that person to go flat out with his body and it amplifies that. [The bike is] one of the great force amplifiers of all time because it doesn't detract from us--it takes everything we've got and amplifies it. Most computers today are sold like cars, where as many things as possible are done for you. You don't have to understand how it works and, in fact, you don't have to understand how to think because the most popular stuff is prepackaged solutions for this and that. When you put a person into a car, their muscles wither. You put a person into an information car, and their thinking ability withers. I wouldn't put a person within 15 yards of a computer unless I was absolutely sure that it was a kind of a bike for them.

The most important question for an IA tool is: does it grow the person?  Or does it create a dependency which ultimately weakens them?

Turkle's "Simulation and Its Discontents" has some powerful observations about students and practitioners who were weakened in this way by computational tools.  They could produce work faster (for some definition of "work"), but knew less and understood less.

(And, among other things, would make egregious "what were you even thinking" mistakes.  Because the tool had prevented them from learning to think.)

Thus, the tool expands ordinary people’s ability to explore the space of meaningful fonts.

No, it generates ugly fonts for ordinary people who don't know any better.

Augmenting a person's "font intelligence" might mean assisting them in developing fluency in the "language" of fonts, developing an eye for what matters to skilled human designers, crafting fonts that meet cultural, aesthetic, and functional criteria, or even just articulating those criteria.

A good book or class does these things.  A tool that mechanically interpolates between works of human craftsmanship does not.

A huge part of what makes a font "meaningful" is its cultural and historical context.  Around the time I joined Apple I was on an Optima kick, and I remember the first time I used Optima in a design, my boss told me to knock it off, saying "this is Apple, not an Italian restaurant or the Munich Olympics".

I am honestly terrified of a public whose intelligence is "augmented" by scraping 50000 fonts off the open web and analyzing the pixels while discarding all human and humanistic associations.  

A thing's "meaning" isn't inferred from an isolated training set, but from the entire human culture, and from the experience of growing up as a human being within it.

Using the same interface, we can use a generative model to create human faces, and manipulate them using qualities such as smiling, gender, and hair color. Or sentences, and manipulate them using length, or sarcasm or tone.

Why?  And how is this related to any reasonable notion of "intelligence"?
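As far as I can tell, the "manipulation" being described is just vector arithmetic in the model's latent space -- slide a point along a learned "smiling" direction and decode it again.  A minimal sketch of that idea, in Python; the decoder and the attribute direction here are hypothetical placeholders, not anything taken from your draft:

import numpy as np

# Stand-ins for learned quantities: a latent code for one face, and a
# direction in the same latent space that correlates with "smiling".
# In a real model these would come from an encoder and from
# attribute-labeled examples; here they're random placeholders.
rng = np.random.default_rng(0)
z_face = rng.normal(size=64)
smile_direction = rng.normal(size=64)
smile_direction /= np.linalg.norm(smile_direction)

def decode(z):
    # Hypothetical decoder: latent vector -> image pixels.
    # In practice this is a trained neural network.
    return np.tanh(z).reshape(8, 8)

# "Manipulating" the face is sliding along the attribute direction.
more_smiling = decode(z_face + 2.0 * smile_direction)
less_smiling = decode(z_face - 2.0 * smile_direction)

Everything interesting lives in what the network has learned; none of it is surfaced to the person doing the sliding.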

Such interfaces provide a kind of cartography of generative models, ways for humans to explore and make meaning using those models

What does "make meaning" mean here?  Meaning is derived from context, and unlike any human-designed examples, all of these examples are born and exist in a contextual void.

Let’s look at another example using machine learning models to augment human creativity

In what way is the shoe/landscape tool creativity?  Isn't it just selection?  Doesn't "human creativity" mean that something is being created by a human?  What here is being created by a human?

But the underlying idea is still to find a low-dimensional latent space which can be used to represent (say) all landscape images, and map that latent space to a corresponding image.

It might be worthwhile to meditate a bit on why some people might find this whole concept repugnant.

Do you not see a difference between representing a gas as a point in a low-dimensional space, and representing a work of human creativity?  Can you see why, to some people, claiming that Sophocles can be not just located at, but generated from, one parameter set, and Tolstoy from another, does violence to the entire history of human endeavor?

To put another way, classic visual manipulation paradigm does not prevent the user from “falling off” the manifold of natural images.

The essence of invention is falling off the manifold of what is believed to be natural.  A truly intelligence-augmenting tool would help people fall off -- help people think thoughts that are not thinkable in the current paradigm -- instead of constraining people to the paradigm.  (I realize you address this in a later section.)

Much of the work of a font designer, for example, consists of competent recombination of the best existing practices.... For such work, the generative interfaces we’ve been discussing are promising

This is a misleadingly reductionist approach to creation.  It's broadly true at the level of themes, tropes, idioms, etc.  But almost no actual creative practice reduces to literally making collages of the works of others.  What I feel in my heart when I am creating music, or poetry, or visual art, or an essay, is not at all represented by the phrase "competent recombination".  I'm sure that most font designers would say something similar.

What you describe here reminds me of the prolefeed-generating machines in "1984", generating a constant stream of pulp novels from the same passages endlessly permuted.  I think most creative people aspire to something more, if for nothing else than to give meaning to their own lives.

Do generative interfaces -- replacing creating with searching -- bring a person closer to, or further from, feeling like they have touched the limits of their creative potential and created something truly from their heart?

Does an artisan improve their quality of life by joining an assembly line?

https://twitter.com/edelwax/status/715441843006259200

The model would have discovered stronger abstractions than human experts. Imagine a generative model trained on paintings up until just before the time of the cubists; might it be that by exploring that model it would be possible to discover cubism?

Why is a machine "discovering" cubism a good thing?  Doesn't the beauty of cubism lie in the communication between artist and viewer -- the statement that the artist is making about the human condition?

This is perhaps not high creativity of a Picasso-esque level. But it is still surprising. It’s certainly unlike images most of us have ever seen before.

Cyriak specializes in this sort of grotesque.  The beauty/fascination of Cyriak's work comes not from ha-ha-breadcat-is-so-random, but from the demented coherence of the worlds he creates.  I think that's part of why Cyriak is (at least some would argue) indeed exhibiting high creativity of a Picasso-esque level, and pix2pix is not.

Breadcat is not part of any larger world, breadcat has no story.  Breadcat does not allow us to marvel at the depraved depths of the human imagination.

Aspirationally, our machine learning models will help us build interfaces which reify deep principles in ways meaningful to the user. For that to happen, the models would have to discover deep principles about the world, recognize those principles, and then surface them as vividly as possible in an interface, in a way comprehensible by the user.

Ultimately, I question these priorities.  This seems like a lot of outsourcing to the machine, to "just" discover deep principles, recognize them, and represent them for people!  With all this AI, I'm not sure what's left for IA here.

Aside from mere technical difficulty, one question is whether this new process (in realistic extrapolation) strips away the life-meaning of creative human beings.  Is this the good life?  Will it resemble gold-farming more than playing the game?

A more subtle question is what a "deep principle" is, and how this new process changes the meaning of that.  An infinite number of theorems can be derived from any set of axioms; an infinite number of theories can be claimed about the world.  People pursue theorems and theories that appear meaningful due to all-too-human criteria such as cultural interest (astronomy), cultural needs (medicine), beauty and elegance (Maxwell's equations), subcultural sport (Fermat's theorem, the Riemann hypothesis), personal interest, etc.

No doubt ML models will discover some sort of principles, and these principles will become important because they've been discovered.  But they may not be the same principles that human beings would have discovered on their pursuit to discover the most enriching principles for human life.

And what happens when human beings lose the ability to discover deep principles for themselves, and become dependent on tools to find meaning in the world?  Is this the civilization we really want?