Column Store Joins
Transitioning intermediate results from row-oriented to column-oriented storage to maximize cache locality and throughput.
The Transition
The benefits of column store versus row store organization are well-established for analytical workloads. However, our baseline implementation was still using a row store format for intermediate results produced during joins.
The Goal: To keep intermediate results in a column store structure. This lets us exploit CPU cache locality (values of the same attribute stay adjacent in memory) and significantly accelerates the execution of subsequent joins.
Architectural Shift
Row store (before):   {id, val} {id, val} ...
Column store (after): [id, id, ...] [val, val, ...]
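A minimal sketch of the layout change, assuming 64-bit integer attributes; the names RowTuple, RowStore, and ColumnStore are illustrative, not the original's types:

```cpp
#include <cstdint>
#include <vector>

// Row store: id and val interleaved in memory, one struct per tuple.
struct RowTuple { std::int64_t id; std::int64_t val; };
using RowStore = std::vector<RowTuple>;

// Column store: one contiguous buffer per attribute.
struct ColumnStore {
    std::vector<std::int64_t> ids;
    std::vector<std::int64_t> vals;
};

// Converting row-major tuples to columns illustrates the layout shift;
// it is not the production code path, which produces columns directly.
ColumnStore to_columns(const RowStore& rows) {
    ColumnStore out;
    out.ids.reserve(rows.size());
    out.vals.reserve(rows.size());
    for (const auto& r : rows) {
        out.ids.push_back(r.id);
        out.vals.push_back(r.val);
    }
    return out;
}
```

A scan over `ids` alone now touches only id bytes, so far more keys fit per cache line than in the interleaved row layout.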
Paged Storage for Intermediate Results
The intermediate results of each join are now stored as a collection of physically contiguous buffers (pages), each packing many elements tightly together.
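One way such a paged column can be sketched; the page capacity and the names page_t and paged_column are assumptions, not the original's column_t definition:

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

using value_t = std::int64_t;            // placeholder element type
constexpr std::size_t kPageCapacity = 4096;

// A page: a fixed-size, physically contiguous buffer of packed elements.
struct page_t {
    std::array<value_t, kPageCapacity> data;
    std::size_t count = 0;               // number of valid elements
};

class paged_column {
public:
    void push_back(value_t v) {
        // Allocation happens only once per kPageCapacity appends,
        // keeping malloc/new out of the per-element hot path.
        if (pages_.empty() || pages_.back()->count == kPageCapacity)
            pages_.push_back(std::make_unique<page_t>());
        page_t& p = *pages_.back();
        p.data[p.count++] = v;
    }
    std::size_t size() const {
        std::size_t n = 0;
        for (const auto& p : pages_) n += p->count;
        return n;
    }
private:
    std::vector<std::unique_ptr<page_t>> pages_;
};
```

Appending is a branch plus an array store in the common case; the amortized allocation cost is spread over thousands of elements.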
Memory Efficiency
By using large pages instead of allocating memory for individual tuples or small vectors, we minimize the overhead of memory allocation (malloc/new) directly within the algorithm's hot path.
Future-Proofing for Parallelism
This paged structure is a precursor to parallel probing. It allows each thread to write results into its own private pages without locking.
Partial results from all threads can then be trivially combined by simply collecting the list of pages produced by each thread, requiring zero data copying.
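The merge-by-collecting-pages idea can be sketched as follows, assuming pages are owned through unique_ptr so combining them moves pointers rather than elements; all names here are illustrative:

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <thread>
#include <utility>
#include <vector>

using value_t = std::int64_t;
using page_t = std::vector<value_t>;                 // one result buffer
using page_list = std::vector<std::unique_ptr<page_t>>;

// Merge by splicing page pointers: O(number of pages), zero element copies.
void absorb(page_list& dst, page_list&& src) {
    for (auto& p : src) dst.push_back(std::move(p));
    src.clear();
}

// Each thread writes into its own private page list with no locking;
// the partial lists are combined only after all threads have joined.
page_list parallel_produce(int num_threads, int per_thread) {
    std::vector<page_list> partials(num_threads);
    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t)
        workers.emplace_back([t, per_thread, &partials] {
            auto page = std::make_unique<page_t>();
            for (int i = 0; i < per_thread; ++i)
                page->push_back(static_cast<value_t>(t) * per_thread + i);
            partials[t].push_back(std::move(page));  // private list: no lock
        });
    for (auto& w : workers) w.join();

    page_list combined;
    for (auto& pl : partials) absorb(combined, std::move(pl));
    return combined;
}
```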
Implementation Details
- New structure: column_t, a paged structure holding value_t elements (defined in the previous step) contiguously. This replaces the old row-based vectors.
- ScanNode modification: ScanNodes were updated to produce intermediate results as vector<column_t> instead of rows.
- Hash join iteration: The hash join logic was rewritten to iterate over this new columnar structure and produce output in the same format. Crucially, no row store structures remain in execute() after this optimization.
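A hedged sketch of what column-to-column hash join iteration can look like, using a plain std::vector as a stand-in for the paged column_t and a standard multimap as the hash table; the function and struct names are assumptions:

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

using value_t = std::int64_t;
using column_t = std::vector<value_t>;   // stand-in for the paged column_t

// Output is itself columnar: one column per emitted attribute,
// so no row structs are materialized anywhere in the join.
struct JoinResult { column_t build_vals, probe_vals; };

JoinResult hash_join(const column_t& build_keys, const column_t& build_vals,
                     const column_t& probe_keys, const column_t& probe_vals) {
    // Build phase: map each key to its position in the build columns.
    std::unordered_multimap<value_t, std::size_t> ht;
    for (std::size_t i = 0; i < build_keys.size(); ++i)
        ht.emplace(build_keys[i], i);

    // Probe phase: iterate the probe key column sequentially and append
    // matches column-by-column into the output.
    JoinResult out;
    for (std::size_t j = 0; j < probe_keys.size(); ++j) {
        auto range = ht.equal_range(probe_keys[j]);
        for (auto it = range.first; it != range.second; ++it) {
            out.build_vals.push_back(build_vals[it->second]);
            out.probe_vals.push_back(probe_vals[j]);
        }
    }
    return out;
}
```

Because both input and output are columns, the result can feed directly into the next join's build or probe phase without any format conversion.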