Sujith Jay Nair Thinking Aloud

The Blue Flower : A Review

The Blue Flower by Penelope Fitzgerald
My rating: 5 of 5 stars

Historical fiction can work at such disparate levels: an era as a backdrop for the narrative, familiar textbook history unraveling as background score to the symphony of the lead characters’ lives, the idiosyncrasies of a bygone era pictured in contrast to the era of the writer. The Blue Flower uses every one of these devices to perfection, but it is so much more.

It is a purported biography of the early life of Novalis, a romantic poet & philosopher from 18th/19th century Saxony. It is an unusual love story. I do not use ‘unusual’ as a moral judgement on love across an uncomfortable age-divide, but to mean the stark contrast between the lovers in (for want of better words) their levels of intellect & emotional range. To highlight my point, let me present my favorite exchange of words between Novalis and his ladylove Sophie:

‘Should you like to be born again?’, asks Novalis, expecting a conversation on the philosophy of transmigration.
Sophie considered a little. ‘Yes, if I could have fair hair.’


Such an unbridgeable divide, but Fitzgerald convinces us of the irrational sway of love (love of the truly, madly, deeply variety).

In addition, the book is an account of the lives of Lower German nobility; a comical sketch of the reaction of this landed gentry to the contemporaneous French Revolution, the epochal ideas of liberty & egalitarianism that it espoused, and the subsequent march of Napoleon.

Lastly, but foremost for me, the book’s thin underlying veneer of (Fichtean) philosophy leaves you wanting more of it & wanting to know more about it.

Why should poetry, reason and religion not be higher forms of Mathematics? All that is needed is a grammar of their common language.

S3 and HDFS

Cluster storage systems have, over the past decade, moved their gold standards from directory-oriented file-systems such as HDFS to object-stores such as AWS S3. The two storage models have been dissected & compared over and over again from multiple perspectives 1 2. Depending on your use-case, you might be more interested in a certain cross-section of differences between S3 & HDFS than in others. I am not trying to repeat those analyses here.

I wrote this short, bullet-style compilation as a quick refresher for myself on the ways S3 differs from HDFS; it is focused on the APIs & interactions Hadoop-like data-processing systems (such as Hadoop, Spark, or Flink) might have with storage systems.

.. Read More

Converse Conway's Law

Melvin Conway in his 1968 paper How Do Committees Invent? postulated the now-famous Conway’s Law.

Organisations which design systems are constrained to produce designs which are copies of the communication structures of these organisations.

This homomorphism between organisational communication structures and the systems designed by them has become an adage in software management. It implies a one-way effect, though. But does it work in the other direction?

Given a mature (say, software) system, can we infer organisational communication structures? Particularly, informal communication structures? 1 Do informal communication structures affect system design in the first place?

Footnotes
  1. Formal communication structures are defined by organisational reporting structures. 

Providing Streaming Joins as a Service at Facebook

Providing Streaming Joins as a Service at Facebook. Jacques-Silva, G., Lei, R., Cheng, L., et al. (2018). Proceedings of the VLDB Endowment, 11(12), 1809-1821.

Stream-stream joins are a hard problem to solve at scale. “Providing Streaming Joins as a Service at Facebook” gives an overview of the systems within Facebook that support stream-stream joins.

The key contributions of the paper are:

  1. a stream synchronization scheme based on event-time to pace the parsing of new data and reduce memory consumption,

  2. a query planner which produces streaming join plans that support application updates, and

  3. a stream time estimation scheme that handles the variations on the distribution of event-times observed in real-world streams and achieves high join accuracy.

Trade-offs in Stream-Stream Joins

Stream-stream joins have a 3-way trade-off between output latency, join accuracy, and memory footprint. One extreme of this trade-off is to provide best-effort (in terms of join accuracy) processing-time joins. The other extreme is to persist metadata associated with every joinable event on a replicated distributed store to ensure that all joinable events get matched. This approach provides excellent guarantees on output latency & join accuracy, but its memory footprint sky-rockets for large, time-skewed streams.

The approach of the paper is in the middle: it is best-effort with a facility to improve join accuracy by pacing the consumption of the input streams based on dynamically estimated watermarks on event-time.
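To make the idea of pacing concrete, here is a minimal sketch of a best-effort, event-time windowed join over two in-memory streams. It is not the paper's implementation: the names `Event` and `paced_join` are hypothetical, and the "watermark" is simply the largest event-time seen so far on each stream, which is a much cruder estimate than the stream time estimation scheme the paper describes. The sketch always reads from the stream that is behind in event-time, and evicts buffered events that have fallen out of the join window, which is what keeps the memory footprint bounded.

```python
from typing import Iterable, Iterator, List, NamedTuple, Tuple


class Event(NamedTuple):
    key: str
    event_time: int  # e.g. seconds since epoch
    payload: str


def paced_join(
    left: Iterable[Event], right: Iterable[Event], window: int
) -> Iterator[Tuple[Event, Event]]:
    """Best-effort inner join on `key` for events whose event-times
    differ by at most `window`. Yields (left_event, right_event) pairs."""
    iters = [iter(left), iter(right)]
    buffers: List[List[Event]] = [[], []]
    # Assumption: watermark = max event-time seen so far on that stream.
    watermarks = [float("-inf"), float("-inf")]
    done = [False, False]

    while not all(done):
        # Pacing: consume from the stream that is behind in event-time,
        # unless it is exhausted.
        side = 0 if watermarks[0] <= watermarks[1] else 1
        if done[side]:
            side = 1 - side
        other = 1 - side

        event = next(iters[side], None)
        if event is None:
            done[side] = True
            continue

        watermarks[side] = max(watermarks[side], event.event_time)

        # Probe the other stream's buffer for matches within the window.
        for candidate in buffers[other]:
            if (
                candidate.key == event.key
                and abs(candidate.event_time - event.event_time) <= window
            ):
                yield (event, candidate) if side == 0 else (candidate, event)

        buffers[side].append(event)

        # Evict events on the other side that are too old to match anything
        # this stream is still expected to produce. Late (out-of-order)
        # events may miss their match: that is the best-effort part.
        cutoff = watermarks[side] - window
        buffers[other] = [e for e in buffers[other] if e.event_time >= cutoff]
```

Calling `list(paced_join(left, right, window=3))` on two small lists of `Event` values returns the matched pairs. A larger `window` improves join accuracy for skewed streams at the cost of bigger buffers, which is the trade-off described above.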

.. Read More

Natural Languages are Interfaceless

In The Design of Everyday Things, Donald Norman talks about the temperature knobs on his refrigerator:

I used to own an ordinary, two-compartment refrigerator - nothing very fancy about it. The problem was that I couldn’t set the temperature properly. There were only two things to do: adjust the temperature of the freezer compartment and adjust the temperature of the fresh food compartment. And there were two controls, one labeled “freezer”, the other “refrigerator”. What’s the problem? Oh, perhaps I’d better warn you. The two controls are not independent. The freezer control also affects the fresh food temperature, and the fresh food control also affects the freezer.
In fact, there is only one thermostat and only one cooling mechanism. One control adjusts the thermostat setting, the other the relative proportion of cold air sent to each of the two compartments of the refrigerator. It’s not hard to imagine why this would be a good design for a cheap fridge: it requires only one cooling mechanism and only one thermostat. Resources are saved by not duplicating components - at the cost of confused customers.

Norman is talking about the lack of a (good) interface here: a layer to translate (and hide) the structure of the underlying mechanism to the users of the mechanism. 1 The need to translate to the user arises in two scenarios:

  1. There is a divide between what the user wants and how the mechanism is structured. I like to call it the what-how divide. 2
  2. Although the mechanism & the user’s want are aligned, the mechanism is too convoluted for the user to use in a direct way. A facilitator is needed.

In both cases, a translation is needed, and the translator is termed an interface.

Languages are Interfaceless

(Inter)Faceless a.k.a No-Face

(Natural) Languages are the quintessential human way of communication. Our advanced languages are arguably the lone differentiators of our species from our cousins in the primate family, and the larger animal kingdom. 3

We have been inventing, honing, assimilating, and discarding languages since the start of our existence as a species. But we do not develop languages with the intent that they be translated. Languages are not meant by their inventors to be translated. Every language is developed as if it were the only language in existence, and as if everyone else understood it.

.. Read More