Sujith Jay Nair Thinking Aloud

Broadcast Hash Joins in Apache Spark

Introduction

This post is part of my series on Joins in Apache Spark SQL. Joins are amongst the most computationally expensive operations in Spark SQL. As a distributed SQL engine, Spark SQL implements a host of strategies to tackle the common use-cases around joins.

In this post, we will delve deep and acquaint ourselves better with the most performant of the join strategies, Broadcast Hash Join.

An Early Employee's Field Guide to Workplace Arguments

TL;DR: Conflicts are common in an early-stage startup. This post lists a set of mental models an early employee can use to prevent, judge, defuse and leverage conflicts.

Multiple Parameter Lists in Scala

Note: I wrote this article as part of a contribution to Scala Documentation. The original post can be found here.

Methods may define multiple parameter lists. When a method is called with fewer parameter lists than it defines, the call yields a function that takes the missing parameter lists as its arguments. Defining a method this way is known as currying; calling it with only some of its parameter lists is partial application.
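As a minimal sketch of the idea, the snippet below defines a method with two parameter lists and calls it with only the first. The names `CurryingSketch`, `scale` and `twice` are hypothetical and chosen purely for illustration; they do not come from the original article.

```scala
object CurryingSketch {
  // Hypothetical method with two parameter lists.
  def scale(factor: Int)(value: Int): Int = factor * value

  // Only the first parameter list is supplied. Because the expected type
  // is Int => Int, the compiler turns the remainder into a function.
  val twice: Int => Int = scale(2)

  def main(args: Array[String]): Unit = {
    println(scale(2)(21))                  // both parameter lists supplied
    println(twice(21))                     // second list supplied later
    println(List(1, 2, 3).map(scale(10)))  // partially applied method passed as a function
  }
}
```

Passing `scale(10)` directly to `map` is the typical payoff: the first parameter list fixes the configuration, and the resulting function is applied to each element.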
