Date: Thu, 14 May 2015 19:12:23 -0700
From: Bret Victor
Subject: virtual sprawl
[struggling through some half-formed thoughts]

Tying together a few recent influences:

Craig Mod: The Digital-Physical -- http://craigmod.com/journal/digital_physical/
[ethereality of virtual representations]

There’s a feeling of thinness that I believe many of us grapple with working digitally. It's a product of the ethereality inherent to computer work. The more the entirety of the creation process lives in bits, the less solid the things we’re creating feel in our minds. Put in more concrete terms: a folder with one item looks just like a folder with a billion items. Feels just like a folder with a billion items. And even then, when open, with most of our current interfaces, we see at best only a screenful of information, a handful of items at a time. 

Perceptually, beyond some low threshold, data becomes boundless to us. Cloud storage compounds this: we don't even worry about HDs filling up anymore! Even when digital streams have clear beginnings and ends, I think we — humans — do a bad job at keeping those edges in view. In trying to reflect upon vast experiences or datasets captured entirely in bits with most standard interfaces, we run into the same wall as in trying to imagine infinity: we can’t.

Viznut: The resource leak bug of our civilization -- http://countercomplex.blogspot.com/2014/08/the-resource-leak-bug-of-our.html
[unconstrained sprawl and complexity due to encapsulation and black-boxing] 

Since the computing world is virtually limitless, it can serve as an interesting laboratory example where the growth-for-its-own-sake ideology takes a rather pure and extreme form. Nearly every methodology, language and tool used in the virtual world focuses on cumulative growth while neglecting many other aspects.

To concretize, consider web applications. There is a plethora of different browser versions and hardware configurations. It is difficult for developers to take all the diversity into account, so the problem has been solved by encapsulation: monolithic libraries (such as jQuery) that provide cross-browser-compatible utility blocks for client-side scripting. Also, many websites share similar basic functionality, so it would be a waste of labor time to implement everything specifically for each application. This problem has also been solved with encapsulation: huge frameworks and engines that can be customized for specific needs. These masses of code have usually been built upon previous masses of code (such as PHP) that have been designed for exactly the same purpose. Frameworks encapsulate legacy frameworks, and eventually, most of the computing resources are wasted by the intermediate bloat. Accumulation of unnecessary code dependencies also makes software more bug-prone, and debugging becomes increasingly difficult because of the ever-growing pile of potentially buggy intermediate layers.

Software developers tend to use encapsulation as the default strategy for just about everything. It may feel like a simple, pragmatic and universal choice, but this feeling is mainly due to the tools and the philosophies they stem from. The tools make it simple to encapsulate and accumulate, and the industrial processes of software engineering emphasize these ideas. Alternatives remain underdeveloped. Mainstream tools make it far more cumbersome to do things like metacoding, static analysis and automatic code transformations, which would be far more relevant than static frameworks for problems such as cross-browser compatibility.

... The way of building complex systems from more-or-less black boxes is also the way our industrial society is constructed. Computing just takes it to a greater extreme.
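
(To make the jQuery point concrete -- a minimal sketch of the kind of cross-browser "utility block" these libraries encapsulate by the thousand; the helper name is mine:)

    // One function papering over two incompatible event APIs:
    // the W3C addEventListener vs. old IE's attachEvent.
    // Multiply by every DOM quirk, and you get a jQuery.
    function addEvent(element, type, handler) {
        if (element.addEventListener) {
            element.addEventListener(type, handler, false);  // W3C browsers
        } else if (element.attachEvent) {
            element.attachEvent("on" + type, handler);       // IE 8 and earlier
        }
    }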

Alan Kay: STEPS NSF Report
[orders-of-magnitude reduction in complexity by inventing architectural concepts and representing meaning]

Many software systems today are made from millions to hundreds of millions of lines of program code that is too large, complex and fragile to be improved, fixed, or integrated. (One hundred million lines of code at 50 lines per page is 5000 books of 400 pages each! This is beyond human scale.)

What if this could be made literally 1000 times smaller -- or more? And made more powerful, clear, simple, and robust? This would bring one of the most important technologies of our time from a state that is almost out of human reach -- and dangerously close to being out of control -- back into human scale.

An analogy from daily life is to compare the great pyramid of Giza, which is mostly solid bricks piled on top of each other with very little usable space inside, to a structure of similar size made from the same materials, but using the later invention of the arch. The result would be mostly usable space and require roughly 1/1000 the number of bricks. In other words, as size and complexity increase, architectural design dominates materials.

Life of a Chromium Developer

It takes a special kind of machine to build Chromium.

Chromium is a large project!
 - ~5000 build objects
 - ~100 library objects
 - 1 massive linked executable (~1.3GB on Linux Debug)

Even if you're building a 32-bit executable, you need a 64-bit machine since linking requires >4GB virtual memory.

Development Machine
General requirements:
 - Lots of cores
 - Lots of RAM
 - Second hard drive for source code and building

[Downloading all the source code] can take a few hours.  Grab a coffee. 
For best results check out the code on a dedicated hard drive.

[The checkout will be from 6GB to 22GB, depending on your choices]

Chromium:  23 million lines in 29 languages -- https://www.openhub.net/p/chrome/analyses/latest/languages_summary
Lua: under 6000 lines of ANSI C -- http://www.lua.org/ddj.html


- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 

Sprawl
(Unchecked growth is what happens when you can't check out what you're growing.)

My hypothesis here is that the enabling factor for this uncontrolled complexity is virtual representations seen through tiny rectangles.

 - [virtual] The ethereality/weightlessness of source code means that, unlike physical material, accumulation isn't felt.  It doesn't feel costly to add more.
 - [pinhole] Through the tiny rectangle, you never see more than a screenful of code at once.  The rest is out of sight, out of mind.  Nobody ever sees the whole.

These two properties have the effect of incentivizing sprawl.  The default action is "add more", and that "more" quickly disappears from view.  

Coding is somewhat "speech-like" in this way.  The act of speaking consists of continually generating new ethereal stuff which immediately disappears.  But unlike speech, code is not entirely ephemeral -- it accumulates and becomes a burden.

As I've mentioned before, virtual representations give the illusion of being "writing-like", because they use language, they simulate persistence, etc.  But they're fundamentally different creatures, and this seems to be another instance where the illusion fails.

 speech-like -- synthesized on demand and dissipated afterwards, persisting not in any "place" but just as an observer's memory of an event
 writing-like -- having an independent stable existence in a location

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 

Accretion

Most forms of art and craft are not done in this "unbounded accretion" style.  A drawing or painting starts with a sketch of the whole and overlays detail until it's done.  Most of the work on a novel is subtractive, trimming the excess until it's done.  In both cases, the work has finite spatial bounds (the bounds of a canvas, or the beginning-middle-end of a story) and finite temporal bounds (the project ends, you publish it).

Working on a circuit design or mechanical design isn't normally a matter of "adding more" -- it's revising and rethinking the properties and connections of a handful of components.  In this case, the material (and the tooling for assembling the material) have cost, and the primary incentive is to minimize.

In terms of temporal bounds, a long-running software project like Chromium might be more similar to an infrastructural project, such as maintaining the roads or sewage system in a city.  But these projects aren't characterized by accretion either -- repairs and improvements do not make the system more complex.  I think this has something to do with the properties of physical space and physical material, and there is something to be learned from these properties.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 

Boundedness
(To understand the whole system, you must first be able to see the whole system.)

My first sketch for STEPS was imagining the OS as an e-book, where the subsystems were chapters and functions were sections.  The book would be the running "source code" of the OS.  https://github.com/damelang/nile-viz-comps/tree/master/2011-09-21
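
(A crude sketch of the mapping as data -- all of these names are hypothetical, just to show the shape:)

    // The OS as a book: subsystems are chapters, functions are sections,
    // and the book is the running source code itself.
    var os_book = {
        title: "The Operating System",
        chapters: [
            { subsystem: "graphics",   sections: ["rasterize", "composite", "blit"] },
            { subsystem: "networking", sections: ["connect", "send", "receive"] },
            { subsystem: "filesystem", sections: ["open", "read", "write"] }
        ]
    };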


Part of the impetus was to make the system visible and explorable, etc., because arches don't do any good unless people can see them and understand them.

But it was also important that it feel bounded.  You could look at the table of contents and feel like "there is a finite amount of things to know here, and I can see them all at once."  (You can't really see them all at once on a screen, of course.  You can only see the opportunity to play peek-a-boo with each one.  But I was still thinking in screens at the time.)

A phrase that I started using a lot in the last year is seeing the "shape of the whole".  But you can only see the shape of the whole when 
  (1) the whole can be given a shape (this is the design problem, both of the system structure and of its representations), and 
  (2) the whole can be seen at once (for a system of any complexity, this requires large-scale spatial representations).

I don't think one can or will ever see the shape of Chromium.  Neither (1) nor (2) is possible.  Instead, Google provides the same entry point that they do to the unbounded and unboundable web -- a search field.

  https://www.chromium.org/developers/code-browsing-in-chromium


- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 

Wrappers

Following Viznut's point about abstraction:  when building hardware, if you (say) have a complicated microcontroller, you don't lay down a simpler microcontroller next to it, run all the signals through it, and claim you've now simplified the interface.  In hardware, the accumulation of material can be seen and felt, it takes up space, it must be paid for.

In hardware, there are certainly modules that contain components, but the modules typically serve a special and distinctly different purpose from their internal components -- a Bluetooth module that uses an ARM processor, for example.  As opposed to software wrappers, which serve the same purpose as the wrapped API, but provide a different aesthetic or abstraction.  My own LayerCake, which wrapped Core Animation, which wrapped OpenGL, which wrapped...  Or jQuery's delay function, which wraps JavaScript's setTimeout, which wraps some timer in Chromium, which wraps the POSIX select function, which wraps some kernel timer, which wraps some CPU interrupt, with undoubtedly a dozen more layers tucked away in there.
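
(For the record, jQuery's delay is itself only a few lines -- roughly this, simplified from the jQuery source:)

    // jQuery.fn.delay, approximately: the "new" API is a thin shell
    // that queues a call to the layer below.
    jQuery.fn.delay = function(time) {
        return this.queue(function(next) {
            setTimeout(next, time);  // and setTimeout is itself a shell...
        });
    };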

Creating a software wrapper feels like simplification, because you go from a complex API to an apparently simpler one.  But the system as a whole has gotten more complex.  The trick only works if the weight of the whole -- the implementations of the wrapper and the wrapped -- can be hidden out-of-sight.  Virtual representations hide very well, because the screen is so tiny that you never see more than a fraction of the whole anyway.  Virtual representations also hide very well because, dream-like, they don't physically exist in the first place.
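
(A minimal sketch of the move, with made-up names.  The visible surface shrinks, but the running system now carries both layers:)

    // A "simplifying" wrapper around setTimeout: one clean function
    // for the caller, one more layer for the system.
    function sleep(ms, callback) {
        setTimeout(callback, ms);
    }

    // And the next person, finding sleep's contract not quite right,
    // wraps the wrapper:
    function pause(seconds, callback) {
        sleep(seconds * 1000, callback);
    }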

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 

Black Boxes

There's an undercurrent running through the Research Agenda -- a kind of "anti-magic" pervasive transparency -- a feeling that there is something deeply wrong with black boxes, and a preference for tools for "creating from scratch with powerful primitives, visibly and observably" rather than "grabbing and bolting together prebuilt modules, hiding details under the hood".

Black boxes disempower by creating dependence -- believing someone's argument based on their black-box models is blindly trusting them.  Using a black box tool (as in "Simulation and its Discontents") engenders helplessness, and induces a "grain" which can hinder or corrupt a work.

Black boxes disempower by removing the context and detail needed to gain a full understanding.  The whole point of "Ladder of Abstraction" is that you can always go down a level to get immersed in that detail, build those associations, understand why.  Without being able to go down, myth and superstition take the place of true understanding.  (This was also the point of "Ten Brighter Ideas" etc.)

As Viznut points out, virtual material is well-suited for crafting black boxes, since the interface of the box need bear no relation to its innards.  Physical interfaces are constrained by physical implementation -- the design of a car betrays the structure of the engine and drive train; you can't make a fridge that's smaller than the food inside.

Virtual interfaces can be perfect "abstractions", betraying nothing about what's inside.  But when you can't see inside, you can't understand inside, and you can't control inside.  So you build your own protective abstraction over it, one that you can see inside and understand and control.  And the next person does the same to your abstraction.  Before long, you have 23 million lines of Chromium.

C++ and Rust advertise "zero-cost abstractions".  Their cost metric is tied to execution speed.  But the cost we're concerned with here is related to simplicity, transparency, understandability, a grasp of the whole.  The confidence of the user, the absence of myth and superstition.  What are "zero-cost abstractions" here?  What material are they built from?