It’s amazing how far “against the grain”[0] of web/media such a basic experiment as this one runs:
http://rmozone.com/snapshots/2014/12/backforwardaudiojerk.html
(This is an exploratory building block for a time-flickable video player—it’s not intended to be anything on its own.)
I’m not expecting any insight to come from these footnoted hoops I jumped through, but I thought to share because I think this frustration is not unique[1] and is somewhat emblematic of the modern ambivalence or even malaise towards the premise/promise of “high level” abstractions. A few months ago, Glen sent me a link to a paper about the UNIX shell as a 4GL:
A fourth generation programming language (4GL) should
make possible the simple statement of what you want, rather
than a detailed procedure of how to produce it. Although there
are many products calling themselves 4GL today, they are
mostly rewrites of COBOL and report writers. They are too
low level and tedious. This is definitely not what a 4GL
should be.
There’s a lot to disentangle from this tendency, which seems to continue unabated even today (and at CDG, &c). I think there are two beliefs that are often confused when advocating a “4GL” or similar. The first is the belief that computers are fast and compiler theory is improving, so the programmer need not be responsible for squeezing in every “ounce” of optimization. It’s sad that, still, that doesn’t seem true for many common domains—it’s disheartening that in 2014 I need to be careful with memory/CPU/bandwidth when doing simple a/v manipulations. But there’s a second belief, which is often implicit and which is more troubling in my view, which is that we will be able to dismantle the ladder of abstraction, if you will, once we’ve reached the top, because everything we need will be in the attic. Or, to mix metaphors:
I love in Bret’s essay how often the ladder is used in both directions and the “spirit of abstraction” is beautifully characterized as “[omitting] some details in the interest of exposing patterns.” When you put it that way, it seems obvious: (1) that abstractions are great, useful, quotidian, &c; and (2) that you’ll usually want the details back eventually. Open source is appealing, at least in theory, because the abstractions provided are “descendable.” Unfortunately it’s painful to go down into the Linux kernel, &c. I find that HLLs, DSLs, dataflow block environments, and the W3C/WHATWG all tend to assume that they’re making a binary break with the past: ascent! I keep looking for a way to design for descent—start “high” and rough, get into the details as needed—but don’t have many good examples where that’s encouraged by toolsets.
Happy holidays, all!
R.M.O.
[0] Crossing a couple grains:
1. There is a “playbackRate” property on MediaElements in HTML5. My first attempt was to use that directly on a <video>. However, that worked extremely poorly. I could set the playbackRate, but if I did so repeatedly (e.g. to adjust the rate smoothly) the whole thing slowed to a halt. And playing backwards, I was lucky to get more than one FPS.
2. The Web Audio API also has a playbackRate property, on its AudioBufferSourceNode objects (which play an in-memory AudioBuffer), though not on its streaming nodes, so you need to have the entire audio file in memory if you want to adjust playback speed. I tried that, moving to audio-only for simplicity, and was pleased to see it sounded alright. BUT, when playbackRate was set less than 0, it simply ignored the request. Eventually, I found the discussion where the standards authors decided that “Playing backwards complicates the model.” Worse is better lives!
3. What I ended up needing to do was to load the entire audio buffer into memory, make a duplicate buffer with a reversed copy, and when the playback rate got less than zero, switch playback sources, doing some sort of insane* index calculations to keep track of exactly where to start the reversed buffer while swapping it in.
4. I’m a bit surprised that (3) worked at all. I was expecting to go down to the sample level, which would have been much simpler (I did a similar thing in ~10 lines of python/numpy a few weeks ago).
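
For the curious, here is a minimal sketch of the bookkeeping in (3) and the sample-level alternative in (4). It is plain Python rather than numpy, and it is not the original code: the function names (reversed_index, reversed_offset, render) are made up for illustration. It also assumes the current playhead position is already known, which in the browser has to be inferred from timing and is not addressed here.

```python
def reversed_index(i, n):
    """Index in the reversed copy holding the same sample as index i
    in the original buffer of length n."""
    return n - 1 - i


def reversed_offset(t, duration):
    """Start offset (in seconds) into the reversed buffer that corresponds
    to playhead time t in the original buffer. Assumes both buffers have
    the same duration."""
    return duration - t


def render(samples, start, rate, count):
    """Sample-level playback: produce up to `count` output samples,
    starting at (possibly fractional) index `start` and advancing by
    `rate` input samples per output sample. A negative rate plays
    backwards. Uses linear interpolation and stops at the buffer edges."""
    out = []
    pos = float(start)
    last = len(samples) - 1
    for _ in range(count):
        if pos < 0 or pos > last:
            break
        lo = int(pos)
        hi = min(lo + 1, last)
        frac = pos - lo
        out.append(samples[lo] * (1 - frac) + samples[hi] * frac)
        pos += rate
    return out


if __name__ == "__main__":
    samples = [0.0, 1.0, 2.0, 3.0, 4.0]
    flipped = samples[::-1]  # the "duplicate buffer with a reversed copy"

    # Playing backwards from index 3 in the original ...
    backwards = render(samples, 3, -1, 3)
    # ... should match playing forwards in the reversed copy from the mapped index.
    swapped = render(flipped, reversed_index(3, len(samples)), 1, 3)
    print(backwards, swapped)
```

With sample-level access, a signed rate is a single loop. The reversed-copy approach needs only the index mapping, plus an accurate idea of where the playhead was at the moment of the swap.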
--
* Not insane as in inherently difficult, but insane as in keeping track of the workings of a black box through some very specific timing assumptions inferred between the lines of the “standard.”
[1] The “not unique” part may be wishful thinking: my Whorfian/cynical side insinuates that few people fight against anticipated affordances of tools, APIs, libraries, and languages… they just do the things Web Audio API designers expected of them. But I don’t think this is quite right…