Skip to content

Use PartitionRecognizer and PartitionCache to handle window function's partition.#17280

Open
Sh-Zh-7 wants to merge 2 commits intomasterfrom
perf/szh/optimize_window_partition
Open

Use PartitionRecognizer and PartitionCache to handle window function's partition.#17280
Sh-Zh-7 wants to merge 2 commits intomasterfrom
perf/szh/optimize_window_partition

Conversation

@Sh-Zh-7
Copy link
Contributor

@Sh-Zh-7 Sh-Zh-7 commented Mar 9, 2026

This PR reuses TableFunctionOperator's component to help window function handle partition.

@sonarqubecloud
Copy link

sonarqubecloud bot commented Mar 9, 2026

@codecov
Copy link

codecov bot commented Mar 9, 2026

Codecov Report

❌ Patch coverage is 73.50993% with 40 lines in your changes missing coverage. Please review.
✅ Project coverage is 39.70%. Comparing base (8782662) to head (0d78f7b).
⚠️ Report is 19 commits behind head on master.

Files with missing lines Patch % Lines
...n/operator/process/window/partition/Partition.java 63.04% 34 Missing ⚠️
...r/process/rowpattern/PatternPartitionExecutor.java 0.00% 4 Missing ⚠️
...n/operator/process/window/TableWindowOperator.java 95.34% 2 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #17280      +/-   ##
============================================
- Coverage     39.70%   39.70%   -0.01%     
  Complexity      282      282              
============================================
  Files          5101     5100       -1     
  Lines        341925   341920       -5     
  Branches      43527    43556      +29     
============================================
- Hits         135773   135751      -22     
- Misses       206152   206169      +17     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR refactors window-function partition handling to reuse the existing PartitionRecognizer + PartitionCache infrastructure (previously used by table functions), reducing duplicated partition-boundary logic in TableWindowOperator.

Changes:

  • Replaced TableWindowOperator’s manual partition detection/caching with PartitionRecognizer state processing and PartitionCache-backed buffering.
  • Updated Partition to support construction from Slice lists (and to operate on column “segments” rather than cached TsBlocks).
  • Adjusted PartitionExecutor and PatternPartitionExecutor to read/write through the updated Partition APIs.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
.../window/partition/PartitionExecutor.java Adds constructors that accept a Partition directly and routes row copying through Partition helpers.
.../window/partition/Partition.java Reworks partition storage to Column[] segments; adds ctor from Slice and updates access methods accordingly.
.../window/TableWindowOperator.java Switches partition handling to PartitionRecognizer + PartitionCache, and updates memory reservation flow.
.../rowpattern/PatternPartitionExecutor.java Uses Partition.writeTo(...) instead of directly accessing underlying TsBlock columns.
.../function/partition/PartitionCache.java Exposes getSlices() and isEmpty() for window operator integration.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

You can also share your feedback on Copilot code review. Take the survey.

Comment on lines +203 to 217
private void finalizeCurrentPartition() {
if (!partitionCache.isEmpty()) {
Partition partition = new Partition(partitionCache.getSlices());
PartitionExecutor partitionExecutor =
new PartitionExecutor(
partition,
inputDataTypes,
outputChannels,
windowFunctions,
frameInfoList,
sortChannels);
cachedPartitionExecutors.addLast(partitionExecutor);
releasePartitionCacheMemory();
partitionCache.clear();
}
Copy link

Copilot AI Mar 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

releasePartitionCacheMemory() releases memory based on partitionCache.getEstimatedSize(), but PartitionCache.clear() does not reset its internal estimatedSize field. After the first partition is finalized, subsequent partitions will cause released to be too large (and totalMemorySize can go negative), leading to incorrect cumulative memory accounting. Fix by resetting estimatedSize to 0 when clearing the cache (or provide a dedicated reset method and call it alongside clear()).

Copilot uses AI. Check for mistakes.
Comment on lines 65 to 70
public int getPositionCount() {
if (cachedPositionCount == -1) {
// Lazy initialized
cachedPositionCount = 0;
for (TsBlock block : tsBlocks) {
cachedPositionCount += block.getPositionCount();
for (Column[] segment : segments) {
cachedPositionCount += segment[0].getPositionCount();
}
Copy link

Copilot AI Mar 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getPositionCount() (and getPartitionIndex() below) assume every segment has at least one column and uses segment[0] to derive the row count. If a window partition is built with 0 required/value columns (possible for window functions like row_number() or count(*) with no input symbols), this will throw ArrayIndexOutOfBoundsException. Consider tracking segment row counts separately (e.g., store int[] segmentPositionCounts from TsBlock.getPositionCount() / Slice.getSize()), so partitions with zero columns are still supported.

Copilot uses AI. Check for mistakes.
this.tsBlocks.add(lastBlock);
}

private Partition(List<Column[]> segments, boolean directSegments) {
Copy link

Copilot AI Mar 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The private constructor Partition(List<Column[]> segments, boolean directSegments) takes a directSegments flag that is never read. This is confusing and makes the API harder to understand/maintain. Either remove the parameter or store it and document what behavioral difference it is meant to control.

Suggested change
private Partition(List<Column[]> segments, boolean directSegments) {
private Partition(List<Column[]> segments) {

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants