Date: Sun, 12 Apr 2015 12:22:17 -0700
From: Robert M Ochshorn
Subject: Re: [journal] I would trust it on the highway. ( - or - how a little fiddling and some proper nouns helped me identify pages in a book, a partial dispatch )
On Apr 5, 2015, at 1:02 PM, Robert M Ochshorn wrote:

On Apr 4, 2015, at 1:15 PM, Robert M Ochshorn wrote:

On Apr 4, 2015, at 12:42 PM, Toby Schachman wrote:

What's the difference with liblinear? The documentation makes it sound like it's just more efficient, not doing anything different.

I don't fully understand this, but from the documentation:

SVC and NuSVC implement the “one-against-one” approach (Knerr et al., 1990) for multi-class classification. [...]

On the other hand, LinearSVC implements “one-vs-the-rest” multi-class strategy, thus training n_class models. If there are only two classes, only one model is trained.
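
To put rough numbers on that difference, here is a minimal sketch that only counts how many binary models each strategy would train. It assumes one class per page of a roughly 350-page book, which the documentation excerpt does not say and which may not match the actual setup; the function names are made up for illustration and are not part of scikit-learn.

```python
def one_vs_one_models(n_classes):
    """One binary classifier for every unordered pair of classes."""
    return n_classes * (n_classes - 1) // 2


def one_vs_rest_models(n_classes):
    """One binary classifier per class; two classes need only a single model."""
    return 1 if n_classes == 2 else n_classes


# Assumption: each of the ~350 pages is treated as its own class.
n_pages = 350
print(one_vs_one_models(n_pages))   # 61075
print(one_vs_rest_models(n_pages))  # 350
```

The count alone does not explain any accuracy difference. It only shows that the two strategies set up very different problems once the number of classes is large.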


I’m not sure if this is right, but what I hope LinearSVC is doing relates to this work on Exemplar SVM by Tomasz Malisiewicz et al. at CMU. Malisiewicz’s thesis was both inspiring and literate (and started with the memex!), though if you’re just interested in this particular classification approach, the short paper is probably better.

So, the night before the opening I was training my classifier on the second book (aka the “blue book”), which is much more challenging because it’s much more text heavy, has fewer custom layouts, and is much longer (~350 pages). Here is a typical spread, as seen from the overhead camera:

 

I ran my SVM classifier again and panicked when the computer claimed a pathetic 7% validation rate, i.e., it was completely useless. After calming down, I realized that I was accidentally using LIBSVM again instead of LIBLINEAR, and flipping the switch (“this one weird trick”) magically yielded a 98% success rate on my testing data (and even that 2% error was probably just error in dataset preparation). The realtime page detection ended up being among the most robust components of the whole thing, proving insensitive to occlusions and different usage patterns, &c.

One thing I didn’t do was auto-alignment: I depended on the book being in a predictable position. This required the design and construction of a “jail” for each book, in the form of binder clips (with their handles removed) that were screwed into the table between wooden supports. Here’s LAPD’s Austin Hines about to pull open one of the clips:


I think I took this video of flipping through pages before final calibrations and alignments, but it gives some sense of the rough timing of the system.

Between the slightly-elevated mounting, the precision illumination, and the use of original documents from the 70s, quite an aura was produced/preserved:


Several hundred people showed up in the course of the evening (the show got some press before opening, but I think mostly it was word of mouth, aka “street cred,” that was bringing in the crowd):


Since each book was controlling a projection, it felt at times like a very elaborate book-based DJ/turntable system:


The design of the projections was a bit ad hoc. I inverted the color of the pages to white on black, and then allowed the highlights to expand. The LAPD theater group did several rehearsals reading from the book, and their readings were baked into the projected book. Additionally, external interviews and narration were “linked” to pages of the book. Here is a clickthrough of the projection, as if someone were flipping through the book. There’s a very minimal map at the top showing where you are (yellow) and where there is dynamic content (green).
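
The white-on-black inversion mentioned above can be sketched in a few lines. This is only an illustration, assuming the page is available as rows of 8-bit grayscale values; the real projection pipeline is not described here, and `invert_gray` is a hypothetical helper. The expanding highlights are not covered.

```python
def invert_gray(pixels):
    """Invert 8-bit grayscale values so dark ink becomes bright on a dark page."""
    return [[255 - v for v in row] for row in pixels]


# Assumption: 255 is white paper and 0 is black ink.
page = [[255, 255, 0],
        [255, 0, 255]]
print(invert_gray(page))  # [[0, 0, 255], [0, 255, 0]]
```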


All of the content was prepared within archive UIs that I was writing as we were preparing the show, which, predictably enough, got pretty messy. This is the horrendous state of the linking interface (subset of video <-> subset of PDF) as of yesterday:



Now that most of the gnarly lower-level details are somewhat resolved (the opening was not without a few technical hiccups: one of the books popped out of its clamps midway through, and I had to refresh Chrome a couple of times through Screen Sharing), I’ve been reflecting on how it actually worked, from a non-technical perspective, and I keep coming back to some earlier ideas about the “dreams” of computer-information systems. More on this later.

Thanks for all of your help, ideas, and support. 

Your correspondent,

R.M.O.