Comparative Latencies

It is usually hard to get an intuitive sense of how the time taken by various fundamental operations compares. It does matter, but it’s hard to feel it viscerally: intervals measured in nanoseconds, microseconds, and milliseconds just aren’t felt in the same way.

I came across the idea of representing the smallest number as a single second and expressing everything else in terms of it, so that the relationships between the numbers come through at a more human scale. That gives the following table:

[Table: operation latencies, rescaled so that the fastest operation takes one second]
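
(The rescaling itself is trivial; here’s a little Common Lisp sketch of it. The nanosecond figures below are just the commonly quoted ballpark numbers, not measurements of mine, and the names are my own.)

(defvar *latencies*
  ;; Rough, commonly quoted ballpark figures, in nanoseconds.
  '(("L1 cache reference"                . 0.5)
    ("Main memory reference"             . 100)
    ("SSD random read"                   . 150000)
    ("Disk seek"                         . 10000000)
    ("Network round trip, US to Europe"  . 150000000)))

(defun humanize (latencies)
  "Rescale so that the smallest latency comes out as one second."
  (let ((smallest (reduce #'min latencies :key #'cdr)))
    (dolist (entry latencies)
      (format t "~&~A: ~:D second~:P~%"
              (car entry)
              (round (cdr entry) smallest)))))

With these figures, (humanize *latencies*) prints a main-memory reference as a few minutes and the transatlantic round trip as several years, which is exactly the gut punch the table is going for.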

I wanted to show this in a single chart, but then only the largest couple of values are visible, so I had to break it down into a series of smaller charts. (I could have used a log scale instead, but that would have lessened the impact of seeing these numbers side by side.)

[A series of charts, one per scale; their captions:]

  • Mostly similar, as long as it’s on the chip
  • This is a big jump!
  • Tremendous jump! Main memory latency is like 0.1% of the SSD access time!
  • … so imagine how much slower disk is, compared to RAM.
  • And if that was slow, you should check out the internet …
  • … exceeded only by an operation on the “machine”; this is finally when we can feel seconds ticking by.
  • Obviously, the worst thing you can do is try to restart the “real” system.


Summing up

I came across this post talking about numerical speed in Clojure, so I thought I would try out the equivalent in Common Lisp (Clozure CL) on my MacBook:

CL-USER> (let* ((arr-size 3000000)
                (double-arr (make-array arr-size :element-type 'single-float)))
           (dotimes (i arr-size)
             (setf (aref double-arr i) (random 1.0)))
           (time (loop for i from 0 below arr-size
                       summing (aref double-arr i))))
(LOOP FOR I FROM 0 BELOW ARR-SIZE SUMMING (AREF DOUBLE-ARR I))
took 45,649 microseconds (0.045649 seconds) to run.
During that period, and with 4 available CPU cores,
     45,558 microseconds (0.045558 seconds) were spent in user mode
         57 microseconds (0.000057 seconds) were spent in system mode
1500183.5

So — 45 milliseconds, not bad.
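
(If I wanted to push this further, the usual Common Lisp move would be to add type declarations, roughly as sketched below; whether it actually buys anything depends on the implementation, and I haven’t timed it.)

;; Same computation, with the array's type declared and speed favored.
(let* ((arr-size 3000000)
       (arr (make-array arr-size :element-type 'single-float)))
  (declare (type (simple-array single-float (*)) arr)
           (optimize (speed 3)))
  (dotimes (i arr-size)
    (setf (aref arr i) (random 1.0)))
  (time (loop for i of-type fixnum from 0 below arr-size
              summing (aref arr i) of-type single-float)))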

Lispium?

What if you did the following:

  • Take a Chromebook
  • Modify the Chromium build to run SBCL within it.
  • Create Lisp bindings to the internal surface, so that all UI elements can be created and manipulated within the Lisp image.
  • Allow downloading, compiling, and running arbitrary Lisp code
  • One of the tabs is always a REPL
  • Add a caching filesystem that would persist part or all of the image

… might this create a modern-day Lisp machine? Maybe.

Did I miss anything obvious here? If not, this sounds doable in a few years.

I’m lazy, so if you like this idea (I’m sure there’s a guaranteed niche market for these machines), go ahead and throw it on Kickstarter. Or something.

What has speed brought?

Many good things, to be sure, but more has been omitted.

Perhaps Kent Pitman expressed it best:

My only problem with this is that things are ALREADY sped up. What’s the point of running a zillion times faster than the machines of yesteryear, yet still not be willing to sacrifice a dime of it to anything other than doing the same kinds of boring computations that you did before? I want speedups not just to make my same old boring life faster, but to buy me the flexibility to do something I wasn’t willing to do at slower speeds.

Is JavaScript the new Lisp?

I seem to have been blissfully unaware of just how far JavaScript has come over the last decade or so.

I wanted to make a foray into mobile app programming (ah ok, nothing serious! Just a toy game or two) — and when I looked around, it seemed I had to immerse myself deeply in either the entire iOS ecosystem or the entire Android ecosystem.

Well, in fact, it’s worse — because I first have to make that choice!

So there are platform-independent alternatives — Xamarin (C#? no thanks), Titanium (maybe), and PhoneGap (heard good things!). Eventually, though, I came across [this nifty open-source framework], which seemed like it would fit my use case of “a toy game project” just fine.

It was super easy to get started (hey! a simulator in the browser! — contrast that with a completely non-working emulator for Android). But very soon I ran into (what seemed like) a huge problem — how the $#% was I supposed to debug anything here?

The running server (context: the app development environment is basically a node.js environment) just printed nonsensical error messages about “native view: undefined” or some such. This was horrible! How did anyone ever use this?

And then I encountered the obvious solution — the tooling for JavaScript is just really, really good, right out of the box. The simulator mentioned earlier is just another object in my web page — so I can drill into it, print out the values of any running objects, and — isn’t this the live-debugging nirvana? — modify them in-place to fix any error and resume.

Put that in your REPL and smoke it

The JavaScript console (sorry, my experience so far has only been with Chrome) is phenomenal. I can evaluate anything I like, getting not just text but entire, arbitrarily nested hierarchical objects dumped out for my inspection.

Yes, there is the whole “dynamic types => errors from typos” problem, and I ran into this pretty early on when a missing comma gave me a lot of grief. But this is somewhat made up for by the source-level debugging at the console, where I can see the problem and fix it right away!

WTF? Everything just works? And there are tons of libraries too!

And here I was, thinking that the solution to the Lisp GUI problem was to tie in Webkit bindings to an MVC framework, to create a modern version of CLIM — but there’s already a (non-lispy) version of that out there!

(I’m confused)

Propagating, artfully

Just came across [this paper] today, and it was super-interesting. (I’m assuming the title is a reference to [an earlier one] by Steele and Sussman).

You can read the abstract, or [watch Sussman talk about it], so I won’t waste your time summarizing it here. Instead, I’ll skip all the way ahead to the [thesis version] of the paper, past the “Conclusions” and on to the section titled Philosophical Insights. This part was quite … enlightening.

First off, the “concurrency problem” really isn’t one:

… the problem comes from trying to maintain the illusion that the events of a concurrent system occur in a particular chronological order.

But this artificiality is about concurrency in programs, not in the natural world, since

… concurrency is the natural state of affairs, and synchronicity is what’s difficult. The physical world in which we live is perfectly concurrent: every little patch of universe evolves on its own, according to local rules, and physical effects travel from one patch to another at a finite speed …

More importantly,

Our experience of time appears as a linear stream of events only because our memory imposes an order on them, which is the order in which events are remembered to have occurred … the illusion of synchronous time is sustainable for our society, however, because the patch of universe we occupy synchronizes far faster than the temporal resolution of “event” our society expects.

So, then, why is this a problem? Because we work around the behavior of the “analog world” to create the “digital world”.

A great deal of engineering effort is expended to build from this substrate computational devices that provide the illusion of synchronous time.

This would be fine, except that in a multi-core machine these cores share memory, and memory is what creates the problem.

This large memory implicitly imposes a linear order on the reads or writes that it experiences. Since synchronizing concurrent things well is hard, the implicit synchronization done by such a memory is terrible.

And this is what makes concurrent programming so hard, which is where the propagator network steps in.

… the structure of a propagator network is analogous to space, and the speed at which information propagates through it is analogous to the speed of light … the idea of partial information lets us build a propagation infrastructure that does not force the notion of synchronous time.

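To make that concrete, here is a toy cell-and-propagator sketch in Common Lisp (the system in the paper is a real Scheme implementation; everything below, including the names, is just my own illustration). Cells hold partial information (numeric intervals, in this case), and adding information to a cell merges it with whatever is already there, so the order in which propagators happen to fire doesn’t change the final answer.

;; A cell holds partial information and a list of propagators to poke
;; whenever that information improves.
(defstruct cell
  (content nil)        ; NIL means "no information yet"
  (neighbors '()))

(defstruct interval lo hi)

(defun merge-info (old new)
  "Intersect two intervals; NIL stands for no information at all."
  (cond ((null old) new)
        ((null new) old)
        (t (make-interval :lo (max (interval-lo old) (interval-lo new))
                          :hi (min (interval-hi old) (interval-hi new))))))

(defun add-info (cell info)
  "Merge INFO into CELL; if the cell learned something new, alert its neighbors."
  (let ((merged (merge-info (cell-content cell) info)))
    (unless (equalp merged (cell-content cell))
      (setf (cell-content cell) merged)
      (mapc #'funcall (cell-neighbors cell)))))

(defun propagate (inputs output fn)
  "Attach FN so that it re-runs whenever any input cell gains information."
  (let ((run (lambda ()
               (when (every #'cell-content inputs)
                 (add-info output (apply fn (mapcar #'cell-content inputs)))))))
    (dolist (c inputs) (push run (cell-neighbors c)))
    (funcall run)))

;; Example: *y* is twice *x*, as intervals; refining *x* later refines *y* too.
(defvar *x* (make-cell))
(defvar *y* (make-cell))
(propagate (list *x*) *y*
           (lambda (iv) (make-interval :lo (* 2 (interval-lo iv))
                                       :hi (* 2 (interval-hi iv)))))
(add-info *x* (make-interval :lo 1 :hi 5))   ; *y* now holds [2, 10]
(add-info *x* (make-interval :lo 2 :hi 3))   ; *x* narrows, so *y* follows: [4, 6]

The real thing has much more (a scheduler, dependency tracking, and so on), but even this toy has the “merge partial information, never mind the order” flavor the quote is describing.
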
Let’s take a step back now and consider the bigger picture.

An electronic computer is a fixed electrical circuit that performs a universal computation, that is, it implements a layer of indirection which allows it to mimic any computation that is described to it … we understand the meaning of machine instructions as sequences of things for the computer to do, with a very powerful notion of the flow of time

Seen in this way, we can derive many concepts in Computer Science as notions introduced to rearrange or describe parts of these sequences, such as imperative programming …

… pure linear sequences, of course, are insufficient for describing the computations we want to describe, so we add means of naming portions of such sequences and referring to them …

… or virtual memory …

… virtual memory introduces a notion of space. The memories of separate programs running on the same computer are kept separate … each program, therefore, has its own flow of time …

… or lexical scope …

… lexical scope introduces a finer-grained notion of space … each lexical scope is therefore a sort of point in space; and their interconnections give space a topology.

This leads us, inexorably, to thinking of the fundamental structure of computations as trees rather than lines.

Tree-shaped computing systems can of course be made out of line-shaped ones with some additional mechanism, such as a recursive interpreter of the tree structure of a program.

… and I love this next bit …

Conceptually, the space of such a system is topologically a tree, and time is inherited from the underlying linear interpreter, and therefore constitutes some traversal of the tree that is space. The use of names to refer to reusable partial results of a tree computation turns it into a directed graph computation. The sequential time of the underlying computer constitutes a topological sort of the computation, forcing the graph to be acyclic.

Whew. Ok! So now we know! With this insight, let’s tackle the next bit, the bane of all FP advocates: side effects.

(Note: I would like the paper to end right here, at this high point. I could split this article into two halves, but if I stop here I probably won’t get around to the next one.)

The essential reason why side effects tend to mess up models of computation is that side effects inescapably introduce time.

So the fundamental tension can be expressed as …

… we want to be able to build programs whose observable actions are what we want them to be; for that we need the observable actions of our programs to be predictable from the programs themselves.

… or equivalently, as …

… between giving our computers the freedom to operate as they will on their internal structures, where we are not concerned with the details of progress but only with its result; and constraining our computers to act the way we wish externally …

Ok, we know this, we’ve seen this before. What exactly is new about this?

The same tension exists in the architecture of human minds. Indeed, people too can both think and act … we have a word for the transition humans make between thought and action: “decision”.

This is something I found quite novel. Thinking doesn’t respect the flow of time at all, but action is constrained entirely by it. So we are faced with an (apparent?) Cartesian-like duality of thought and action, and we have to choose how to deal with it.

So the result is either that the primitives for action amount to always doing everything whenever one thinks about doing it, or that the story about thinking offers no primitives for action at all.

So either thoughts are constrained in a way to make actions predictable, or else “bridges” between these two worlds of thought and action are needed. Again, the propagator paradigm can step in and resolve this.

We can have three worlds, of thoughts, actions and decisions, each being some sort of propagator. To put it more concretely,

… when a network has many fan-ins and cycles, and operates on interesting partial information structures, propagators can run in a chaotic order, and be run many times, and let the merging of the partial information ensure that they produce a coherent result at the end; this is thought.
… when a network is acyclic and the “partial” information structure is the all-or-nothing structure, then propagation mimics conventional evaluation, and the order of events is perfectly predictable; this is action.
… propagators that will accept partial inputs, that can perhaps be refined with the passage of time, and will take upon themselves the responsibility of picking a moment, making a choice, and committing to an all-or-nothing answer; this is decision.

That’s it! Read it once, read it twice; I can guarantee that you won’t think about computation the same way again!