From 5bea7ddee28e675913e3cb9c35fb87d454ca09ca Mon Sep 17 00:00:00 2001 From: Andrei Pechkurov Date: Fri, 20 Mar 2026 11:15:24 +0200 Subject: [PATCH 1/2] Document multi-RHS table support in HORIZON JOIN --- documentation/query/sql/horizon-join.md | 104 +++++++++++++++++++++--- 1 file changed, 93 insertions(+), 11 deletions(-) diff --git a/documentation/query/sql/horizon-join.md b/documentation/query/sql/horizon-join.md index 230b8289e..98b5f295c 100644 --- a/documentation/query/sql/horizon-join.md +++ b/documentation/query/sql/horizon-join.md @@ -15,6 +15,10 @@ It is a variant of the [`JOIN` keyword](/docs/query/sql/join/) that combines [ASOF JOIN](/docs/query/sql/asof-join/) matching with a set of forward (or backward) time offsets, computing aggregations at each offset in a single pass. +HORIZON JOIN supports joining against **multiple right-hand-side tables** in a +single query, enabling aggregation of columns from several time-series sources +against a common left-hand table and offset grid. + ## Syntax ### RANGE form @@ -25,7 +29,10 @@ given `STEP`: ```questdb-sql title="HORIZON JOIN with RANGE" SELECT [,] FROM AS -HORIZON JOIN AS [ON ()] +HORIZON JOIN AS [ON ()] +[HORIZON JOIN AS [ON ()]] +... +HORIZON JOIN AS [ON ()] RANGE FROM TO STEP AS [WHERE ] [GROUP BY ] @@ -42,7 +49,10 @@ Specify explicit offsets as interval literals: ```questdb-sql title="HORIZON JOIN with LIST" SELECT [,] FROM AS -HORIZON JOIN AS [ON ()] +HORIZON JOIN AS [ON ()] +[HORIZON JOIN AS [ON ()]] +... +HORIZON JOIN AS [ON ()] LIST (, ...) AS [WHERE ] [GROUP BY ] @@ -53,18 +63,29 @@ For example, `LIST (0, 1s, 5s, 30s, 1m)` generates offsets at those specific points. Offsets must be monotonically increasing. Unitless `0` is allowed as shorthand for zero offset. +When using multiple HORIZON JOINs, only the **last** HORIZON JOIN in the chain +carries the `RANGE`/`LIST` and `AS` clauses. Preceding HORIZON JOINs omit them. +Each right-hand table can independently use or omit the `ON` clause — you can +freely mix keyed and non-keyed (timestamp-only ASOF) joins within the same +query. + ## How it works For each row in the left-hand table and each offset in the horizon: 1. Compute `left_timestamp + offset` -2. Perform an ASOF match against the right-hand table at that computed timestamp +2. Perform an ASOF match against each right-hand table at that computed timestamp 3. When join keys are provided (via `ON`), only right-hand rows matching the keys are considered +With multiple right-hand tables, each table is matched independently at each +offset. If a right-hand table has no match for a given row/offset combination, +its columns resolve to `NULL` (or `NaN` for numeric types). + Results are implicitly grouped by the non-aggregate SELECT columns (horizon offset, left-hand table keys, etc.), and aggregate functions are applied across -all matched rows. +all matched rows. Aggregate expressions can reference columns from different +right-hand tables (e.g., `avg(b.bid + a.ask)`). ## The horizon pseudo-table @@ -196,6 +217,63 @@ WHERE t.timestamp IN '$now-1h..$now' ORDER BY horizon_sec; ``` +### Multi-table: bid and ask spread around trades + +Join against two separate tables (bids and asks) in a single HORIZON JOIN query +to compute the average spread at each horizon offset: + +```questdb-sql title="Multi-table HORIZON JOIN with bid/ask spread" +SELECT + h.offset / 1000000000 AS horizon_sec, + t.sym, + avg(b.bid) AS avg_bid, + avg(a.ask) AS avg_ask, + avg(a.ask - b.bid) AS avg_spread +FROM trades AS t +HORIZON JOIN bids AS b ON (t.sym = b.sym) +HORIZON JOIN asks AS a ON (t.sym = a.sym) + RANGE FROM -2s TO 2s STEP 2s AS h +GROUP BY horizon_sec, t.sym +ORDER BY t.sym, horizon_sec; +``` + +Note that the `RANGE` clause appears only on the last HORIZON JOIN. + +### Multi-table: keyed and non-keyed mix + +Combine a keyed join (matching by symbol) with a non-keyed join (timestamp-only +ASOF) in the same query — useful when one source is symbol-specific and another +is market-wide: + +```questdb-sql title="Mixed keyed and non-keyed HORIZON JOIN" +SELECT + avg(p.price) AS avg_price, + avg(r.rate) AS avg_rate +FROM trades AS t +HORIZON JOIN prices AS p ON (t.sym = p.sym) +HORIZON JOIN rates AS r + LIST (0, 1s, 5s) AS h; +``` + +Here `prices` is matched by symbol (keyed), while `rates` uses timestamp-only +ASOF matching (non-keyed). + +### Multi-table: three right-hand tables + +HORIZON JOIN supports more than two right-hand tables: + +```questdb-sql title="Three-table HORIZON JOIN" +SELECT + avg(b.bid) AS avg_bid, + avg(a.ask) AS avg_ask, + avg(m.mid) AS avg_mid +FROM trades AS t +HORIZON JOIN bids AS b ON (t.sym = b.sym) +HORIZON JOIN asks AS a ON (t.sym = a.sym) +HORIZON JOIN mids AS m ON (t.sym = m.sym) + LIST (0) AS h; +``` + ## Mixed-precision timestamps The left-hand and right-hand tables can use different timestamp resolutions @@ -228,22 +306,26 @@ Look for these indicators in the plan: - **Async Horizon Join**: Parallel execution using Java-evaluated expressions - **Async JIT Horizon Join**: Parallel execution using Just-In-Time-compiled expressions for better performance +- **Async Multi Horizon Join**: Parallel execution with multiple right-hand + tables (shown with `tables: N` indicating the number of right-hand tables) ## Current limitations -- **No other joins**: HORIZON JOIN cannot be combined with other joins in the - same level of the query. Joins can be done in an outer query. +- **No other join types**: HORIZON JOIN cannot be combined with other join types + (e.g., `JOIN`, `ASOF JOIN`) in the same level of the query. Multiple HORIZON + JOINs are allowed, but mixing with non-HORIZON joins is not. Other joins can + be done in an outer query. - **No window functions**: Window functions cannot be used in HORIZON JOIN queries. Wrap the HORIZON JOIN in a subquery and apply window functions in the outer query. - **No SAMPLE BY**: `SAMPLE BY` cannot be used with HORIZON JOIN. Use `GROUP BY` with a time-bucketing expression instead. - **WHERE filters left-hand table only**: The `WHERE` clause can only reference - columns from the left-hand table. References to right-hand table columns or - horizon pseudo-table columns (`h.offset`, `h.timestamp`) are not allowed. -- **Both tables must have a designated timestamp**: The left-hand and right-hand - tables must each have a designated timestamp column. -- **Right-hand side must be a table**: The right-hand side of HORIZON JOIN must + columns from the left-hand table. References to any right-hand table columns + or horizon pseudo-table columns (`h.offset`, `h.timestamp`) are not allowed. +- **All tables must have a designated timestamp**: The left-hand and all + right-hand tables must each have a designated timestamp column. +- **Right-hand side must be a table**: Each right-hand side of HORIZON JOIN must be a table, not a subquery. - **RANGE constraints**: `STEP` must be positive; `FROM` must be less than or equal to `TO`. From e9c08bead0bcb520038d03036c29a33b9a9a17ce Mon Sep 17 00:00:00 2001 From: Marko Topolnik Date: Thu, 26 Mar 2026 11:38:30 +0100 Subject: [PATCH 2/2] Some improvements - specify in more detail which queries are allowed - add negative numbers to RANGE and LIST examples - use active voice - add underscores to millions and billions --- documentation/query/sql/horizon-join.md | 82 +++++++++++++------------ 1 file changed, 44 insertions(+), 38 deletions(-) diff --git a/documentation/query/sql/horizon-join.md b/documentation/query/sql/horizon-join.md index 98b5f295c..95a67af40 100644 --- a/documentation/query/sql/horizon-join.md +++ b/documentation/query/sql/horizon-join.md @@ -39,8 +39,8 @@ RANGE FROM TO STEP AS [ORDER BY ...] ``` -For example, `RANGE FROM 0s TO 5m STEP 1m` generates offsets at 0s, 1m, 2m, -3m, 4m, 5m. +For example, `RANGE FROM -3m TO 3m STEP 1m` generates offsets at -3m, -2m, -1m, +0m, 1m, 2m, 3m. ### LIST form @@ -59,9 +59,9 @@ LIST (, ...) AS [ORDER BY ...] ``` -For example, `LIST (0, 1s, 5s, 30s, 1m)` generates offsets at those specific -points. Offsets must be monotonically increasing. Unitless `0` is allowed as -shorthand for zero offset. +For example, `LIST (-1m, -5s, -1s, 0, 1s, 5s, 1m)` generates offsets at those +specific points. Offsets must be monotonically increasing. Unitless `0` is +allowed as shorthand for zero offset. When using multiple HORIZON JOINs, only the **last** HORIZON JOIN in the chain carries the `RANGE`/`LIST` and `AS` clauses. Preceding HORIZON JOINs omit them. @@ -74,18 +74,19 @@ query. For each row in the left-hand table and each offset in the horizon: 1. Compute `left_timestamp + offset` -2. Perform an ASOF match against each right-hand table at that computed timestamp -3. When join keys are provided (via `ON`), only right-hand rows matching the - keys are considered +2. Perform an ASOF match against each right-hand table at that computed + timestamp +3. When join keys are provided (via `ON`), consider only the right-hand rows + matching the keys -With multiple right-hand tables, each table is matched independently at each -offset. If a right-hand table has no match for a given row/offset combination, -its columns resolve to `NULL` (or `NaN` for numeric types). +With multiple right-hand tables, QuestDB matches each table, at each offset, +independently. If a right-hand table has no match for a given row/offset +combination, its columns resolve to `NULL`. -Results are implicitly grouped by the non-aggregate SELECT columns (horizon -offset, left-hand table keys, etc.), and aggregate functions are applied across -all matched rows. Aggregate expressions can reference columns from different -right-hand tables (e.g., `avg(b.bid + a.ask)`). +QuestDB implicitly groups the results by the non-aggregate SELECT columns +(horizon offset, left-hand table keys, etc.), and applies aggregate functions +across all matched rows. Aggregate expressions can reference columns from +different right-hand tables (e.g., `avg(b.bid + a.ask)`). ## The horizon pseudo-table @@ -94,15 +95,15 @@ the `AS` clause. This pseudo-table exposes two columns: | Column | Type | Description | |--------|------|-------------| -| `.offset` | `LONG` | The offset value in the left-hand table's designated timestamp resolution. For example, with microsecond timestamps, `h.offset / 1000000` converts to seconds; with nanosecond timestamps, `h.offset / 1000000000` converts to seconds or `h.offset / 1000000` converts to milliseconds. | +| `.offset` | `LONG` | The offset value in the left-hand table's designated timestamp resolution. For example, with microsecond timestamps, `h.offset / 1_000_000` converts to seconds; with nanosecond timestamps, `h.offset / 1_000_000_000` converts to seconds. | | `.timestamp` | `TIMESTAMP` | The computed horizon timestamp (`left_timestamp + offset`). Available for grouping or expressions. | ## Interval units All offset values in `RANGE` (`FROM`, `TO`, `STEP`) and `LIST` **must include a -unit suffix**. Bare numbers are not valid — write `5s`, not `5` or `5000000000`. -The only exception is `0`, which is allowed without a unit as shorthand for zero -offset. +unit suffix**. Bare numbers are not valid — write `5s`, not `5` or +`5_000_000_000`. The only exception is `0`, which is allowed without a unit as +shorthand for zero offset. Both `RANGE` and `LIST` use the same interval expression syntax as [SAMPLE BY](/docs/query/sql/sample-by/): @@ -121,7 +122,8 @@ Note that `h.offset` is always returned as a `LONG` in the left-hand table's timestamp resolution (e.g., nanoseconds for `TIMESTAMP_NS` tables), regardless of the unit used in the `RANGE` or `LIST` definition. When matching offset values in a `PIVOT ... FOR offset IN (...)` clause, use the raw numeric value -(e.g., `1800000000000` for 30 minutes in nanoseconds), not the interval literal. +(e.g., `1_800_000_000_000` for 30 minutes in nanoseconds), not the interval +literal. ## GROUP BY rules @@ -131,21 +133,21 @@ by all non-aggregate `SELECT` columns. When `GROUP BY` is present, it follows stricter rules than regular `GROUP BY`: - Each `GROUP BY` expression must **exactly match** a non-aggregate `SELECT` - expression (with table prefix tolerance, e.g., `t.symbol` matches `symbol`) - or a `SELECT` column alias. + expression (with table prefix tolerance, e.g., `t.symbol` matches `symbol`) or + a `SELECT` column alias. - Every non-aggregate `SELECT` column must appear in `GROUP BY`. - Column index references are supported (e.g., `GROUP BY 1, 2`). -For example, if the `SELECT` list contains `h.offset / 1000000000 AS +For example, if the `SELECT` list contains `h.offset / 1_000_000_000 AS horizon_sec`, the `GROUP BY` must use either the alias `horizon_sec` or the full -expression `h.offset / 1000000000` — using just `h.offset` is not valid because -it does not exactly match any non-aggregate `SELECT` expression. +expression `h.offset / 1_000_000_000` — using just `h.offset` is not valid +because it does not exactly match any non-aggregate `SELECT` expression. ## Examples -The examples below use the [demo dataset](/docs/cookbook/demo-data-schema/) tables -`fx_trades` (trade executions) and `market_data` (order book snapshots with 2D -arrays for bids/asks). +The examples below use the [demo dataset](/docs/cookbook/demo-data-schema/) +tables `fx_trades` (trade executions) and `market_data` (order book snapshots +with 2D arrays for bids/asks). ### Post-trade markout at uniform horizons @@ -154,7 +156,7 @@ way to evaluate execution quality and price impact: ```questdb-sql title="Post-trade markout curve" demo SELECT - h.offset / 1000000000 AS horizon_sec, + h.offset / 1_000_000_000 AS horizon_sec, t.symbol, avg((m.best_bid + m.best_ask) / 2) AS avg_mid FROM fx_trades AS t @@ -169,11 +171,11 @@ nanoseconds. Dividing by 1,000,000,000 converts to seconds. ### Markout P&L at non-uniform horizons -Compute the average post-trade markout at specific horizons using `LIST`: +Compute the average post-trade markout value at specific horizons using `LIST`: ```questdb-sql title="Markout at specific time points" demo SELECT - h.offset / 1000000000 AS horizon_sec, + h.offset / 1_000_000_000 AS horizon_sec, t.symbol, avg((m.best_bid + m.best_ask) / 2 - t.price) AS avg_markout FROM fx_trades AS t @@ -190,7 +192,7 @@ detecting information leakage or adverse selection: ```questdb-sql title="Price movement around trade events" demo SELECT - h.offset / 1000000000 AS horizon_sec, + h.offset / 1_000_000_000 AS horizon_sec, t.symbol, avg((m.best_bid + m.best_ask) / 2) AS avg_mid, count() AS sample_size @@ -203,11 +205,11 @@ ORDER BY t.symbol, horizon_sec; ### Volume-weighted markout -Compute an overall volume-weighted markout without grouping by symbol: +Compute an overall volume-weighted markout value without grouping by symbol: ```questdb-sql title="Volume-weighted markout across all symbols" demo SELECT - h.offset / 1000000000 AS horizon_sec, + h.offset / 1_000_000_000 AS horizon_sec, sum(((m.best_bid + m.best_ask) / 2 - t.price) * t.quantity) / sum(t.quantity) AS vwap_markout FROM fx_trades AS t @@ -220,11 +222,11 @@ ORDER BY horizon_sec; ### Multi-table: bid and ask spread around trades Join against two separate tables (bids and asks) in a single HORIZON JOIN query -to compute the average spread at each horizon offset: +to compute the average spread at each horizon: ```questdb-sql title="Multi-table HORIZON JOIN with bid/ask spread" SELECT - h.offset / 1000000000 AS horizon_sec, + h.offset / 1_000_000_000 AS horizon_sec, t.sym, avg(b.bid) AS avg_bid, avg(a.ask) AS avg_ask, @@ -291,7 +293,7 @@ verify parallelization: ```questdb-sql title="Analyze HORIZON JOIN execution plan" demo EXPLAIN SELECT - h.offset / 1000000000 AS horizon_sec, + h.offset / 1_000_000_000 AS horizon_sec, t.symbol, avg((m.best_bid + m.best_ask) / 2) AS avg_mid FROM fx_trades AS t @@ -326,7 +328,11 @@ Look for these indicators in the plan: - **All tables must have a designated timestamp**: The left-hand and all right-hand tables must each have a designated timestamp column. - **Right-hand side must be a table**: Each right-hand side of HORIZON JOIN must - be a table, not a subquery. + be a table with an optional filter, more complex subqueries aren't supported. +- **Left-hand side queries are restricted as well**. On the left hand side, the + query that works best is a table with an optional filter. Some other query + types are also supported, but they degrade the query plan to single-threaded + processing. - **RANGE constraints**: `STEP` must be positive; `FROM` must be less than or equal to `TO`. - **LIST constraints**: Offsets must be interval literals (e.g., `1s`, `-2m`,