GitHub Sponsors

24 May 2019 • OPEN-SOURCE TWEETSTORM

1/ The recent announcement of @github sponsors (https://t.co/OjxZU1t6UT) is an interesting development in OSS. I feel it is a great time to revisit my essay on the questions facing open-source software: https://t.co/UgK40taf9R #GitHubSponsors 1/N
— Sujith Jay Nair (@suj1th) May 23, 2019

Q.1: Can we make the open-source movement self-sustaining? Open source survives on philanthropy: the altruism of the initiator of an open source project, the unpaid labour of the maintainer, and the monetary donations to foundations. Is there an alternative, self-sustaining way?
— Sujith Jay Nair (@suj1th) May 23, 2019

GitHub Sponsors could, prima facie, remove the reliance of OSS projects on foundations. It would continue to be based on philanthropy. Of the sponsors. Is that an improvement over the present? I am not sure.
— Sujith Jay Nair (@suj1th) May 23, 2019

Q.2:
a) Can we pay back for the effort of the maintainer and the individual contributor?
b) Can we provide economic incentives to the maintainers and contributors to help continued development?
c) How do we assign value to an open source project & contribution?
— Sujith Jay Nair (@suj1th) May 23, 2019

Catastrophic Forgetting

01 Dec 2018 • DEEP-LEARNING TWEETSTORM

1/ Catastrophic Forgetting is a long-recognised problem in neural networks, and is of great interest in the cognitive sciences. In plain words, it is the destructive interference effect of learning a new skill on pre-existing skills. #deeplearning
— Sujith Jay Nair (@suj1th) December 1, 2018

2/ Research in Deep Learning has focused on this problem in recent times, particularly in the realm of reinforcement learning, e.g. [Rolnick, Ahuja, Schwarz et al. 2018], [Shin, Lee, Kim et al. 2017] among others. #deeplearning
— Sujith Jay Nair (@suj1th) December 1, 2018

3/ A common thread in this research is the replay of past data to reinforce previously acquired skills. Rolnick et al. (https://t.co/jluXqvN2Qr) choose a 50-50 split of replay vs. new task data. #deeplearning
— Sujith Jay Nair (@suj1th) December 1, 2018
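
The 50-50 split mentioned above can be illustrated with a minimal sketch of batch construction. This is not the implementation from Rolnick et al.; the function name `mixed_batch`, its parameters, and the tagged toy data are hypothetical and only show the idea of mixing replayed and new-task examples in a fixed ratio.

```python
import random


def mixed_batch(new_task_data, replay_buffer, batch_size,
                replay_fraction=0.5, rng=None):
    """Build one training batch from replayed and new-task examples.

    Hypothetical helper: `replay_fraction` is the share of the batch
    drawn from past data (0.5 gives the 50-50 split).
    """
    rng = rng or random.Random()
    # With an empty replay buffer, fall back to new-task data only.
    n_replay = int(batch_size * replay_fraction) if replay_buffer else 0
    n_new = batch_size - n_replay
    # Sample with replacement from each source, then shuffle so the
    # two sources are interleaved within the batch.
    batch = (rng.choices(replay_buffer, k=n_replay)
             + rng.choices(new_task_data, k=n_new))
    rng.shuffle(batch)
    return batch


# Toy data: each example is tagged with the source it came from.
old_examples = [("old", i) for i in range(100)]
new_examples = [("new", i) for i in range(100)]
batch = mixed_batch(new_examples, old_examples, batch_size=8,
                    rng=random.Random(0))
```

Keeping the ratio as a parameter makes it easy to compare other splits against the 50-50 choice.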

Caveat to Open Source Disruption

20 Oct 2018 • OPEN-SOURCE TWEETSTORM

I believe an important caveat exists to this postulate: the physical capital necessary for the production & innovation of the resource should have low-cost access & wide distribution. I will try & explore this caveat a bit. https://t.co/p57V70PiCr
— Sujith Jay Nair (@suj1th) October 20, 2018

I will use the case of the Pharmaceutical industry. Modern drug discovery is a patent-heavy process, which should make it a ripe candidate for open source disruption. But this has not been the case, yet.
— Sujith Jay Nair (@suj1th) October 20, 2018

My argument for why this is so is the concentrated nature of the physical asset (lab infrastructure, capital for clinical trials) needed for innovation in drug discovery - it is limited to large pharmaceutical firms and some university departments.
— Sujith Jay Nair (@suj1th) October 20, 2018

The concentrated nature of the physical asset ensures that the opportunity cost of losing out on innovation, which the resource could have garnered as a commons, is very low. This, in turn, reduces the effective implementation cost of property for the resource. Hence, patents!
— Sujith Jay Nair (@suj1th) October 20, 2018

Open Source Eats Patents

19 Oct 2018 • OPEN-SOURCE TWEETSTORM

@asynchio postulates that every patent-heavy industry will be disintermediated by open source. A thread on why this prediction could turn out to be true. 1/N
— Sujith Jay Nair (@suj1th) October 19, 2018

Demsetz' Theory on Property Rights models the emergence of property around a resource as a function of the cost of implementing & enforcing property rights.
— Sujith Jay Nair (@suj1th) October 19, 2018

A resource managed as property could evolve into a commons when the implementation cost of property rights exceeds the value of the increase in the efficiency of utilisation of the resource caused by the adoption of property rights.
— Sujith Jay Nair (@suj1th) October 19, 2018
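
The condition in the tweet above can be written as a simple inequality. The symbols are assumptions introduced here for illustration: $C_p$ is the cost of implementing and enforcing property rights over the resource, and $\Delta E$ is the value of the efficiency gain in utilisation that property rights would bring.

```latex
\[
C_{p} > \Delta E
\quad\Longrightarrow\quad
\text{the resource may evolve from property into a commons}
\]
```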

Dynamo vs Cassandra: Systems Design of NoSQL Databases

02 Oct 2018 • DATA-SYSTEMS NOSQL DATABASES

State-of-the-art distributed databases represent a distillation of years of research in distributed systems. The concepts underlying any distributed system can thus be overwhelming to comprehend. This is especially true of databases without strong consistency guarantees. Such databases come in a range of flavours, but they are grouped under a single category called NoSQL databases.

NoSQL databases do not represent a single kind of data model, nor do they offer uniform guarantees regarding consistency and availability. However, they are built on very similar principles and ideas.

From a historical perspective, the advent of NoSQL databases was precipitated by the publication of Dynamo by Amazon¹ and Bigtable by Google, and by the emergence of a number of open-source distributed data stores that were (improved?) clones of either or both of these systems. Bigtable-inspired NoSQL stores are referred to as column-stores (e.g. Hypertable, HBase), whereas Dynamo influenced most of the key/value-stores. We will loosely term the latter Dynamo-family databases; they include Riak, Aerospike, Project Voldemort, and Cassandra.

In this article, I would like to focus on systems design ideas in Dynamo-family NoSQL databases, and on Cassandra in particular. The approach is to compare and contrast Cassandra with Dynamo and, in the process, touch upon the underlying ideas. Expect a lot of homework and further reading; there are copious references throughout the article.

Sujith Jay Nair Thinking Aloud
