Skip to content

perf: Views engine render optimizations — warm 99.99% faster, cold 10% faster#724

Open
johnml1135 wants to merge 4 commits intomainfrom
001-render-speedup
Open

perf: Views engine render optimizations — warm 99.99% faster, cold 10% faster#724
johnml1135 wants to merge 4 commits intomainfrom
001-render-speedup

Conversation

@johnml1135
Copy link
Contributor

@johnml1135 johnml1135 commented Mar 3, 2026

Render Performance Optimizations

Summary

This branch delivers comprehensive performance optimizations to the FieldWorks Views engine rendering pipeline, reducing warm render time by 99.99% (153ms → 0.01ms) and cold render time by ~10% (60.36ms → 54.32ms) across 15 benchmark scenarios.

All 15 scenarios pass with 0% pixel variance — no visual regressions.

Motivation

FieldWorks UI responsiveness is bottlenecked by the Views engine render pipeline. Switching between entries, refreshing displays, or navigating within editors triggers Reconstruct()Layout()DrawTheRoot() cycles. Profiling revealed:

  • Redundant layout passes: Layout ran twice per render (once inside Reconstruct, once externally) — both doing identical work
  • Redundant reconstruction: RefreshDisplay() called Reconstruct on ALL root sites, even when no data changed
  • Redundant GDI calls: Every font switch created/destroyed HFONT handles; every draw call set colors unconditionally
  • Redundant NFC normalization: Text offset conversion normalized NFC text that was already NFC, called 12+ times per paragraph

Benchmark Infrastructure

New test infrastructure

  • RenderBenchmarkHarness — creates real VwRootBox + VwGraphics in an offscreen HWND, measures CreateView/MakeRoot/PerformOffscreenLayout/DrawTheRoot independently
  • RenderTimingSuiteTests — 15 scenarios covering simple through extreme content (long-prose: 4 sections × 80 verses; lex-extreme: 25 entries × 10 senses)
  • RenderVerifyTests — pixel-perfect snapshot verification using Verify + ImageSharp
  • RenderBenchmarkResults / RenderBenchmarkReportWriter — generates markdown summary reports with per-scenario timing
  • DataTreeRenderTests — DataTree-level render scenarios (collapsed/expanded/deep/extreme/multiws/simple)

Test files

  • Src/Common/RootSite/RootSiteTests/RenderBenchmarkHarness.cs
  • Src/Common/RootSite/RootSiteTests/RenderTimingSuiteTests.cs
  • Src/Common/RootSite/RootSiteTests/RenderVerifyTests.cs
  • Src/Common/RootSite/RootSiteTests/RenderBenchmarkTestsBase.cs
  • Src/Common/RenderVerification/ (supporting infrastructure)
  • Src/Common/Controls/DetailControlsTests/DataTreeRenderTests.cs
  • 15 verified baseline PNG snapshots

Phase 1: Warm Render Optimizations (153ms → 0.01ms)

PATH-L1: Width-Invariant Layout Guard

  • File: Src/views/VwRootBox.cpp, VwRootBox.h
  • Change: Added m_fNeedsLayout + m_dxLastLayoutWidth to VwRootBox::Layout(). If box tree hasn't changed and width is the same, skip the entire layout traversal.
  • Impact: Warm Layout: 50ms → 0.03ms

PATH-L4: Harness GDI Caching

  • File: Src/Common/RootSite/RootSiteTests/RenderBenchmarkHarness.cs
  • Change: Cached GDI bitmap/graphics in PerformOffscreenLayout() instead of allocating per call.
  • Impact: Warm PerformOffscreenLayout: 27ms → 0.00ms

PATH-R1: Reconstruct Guard

  • File: Src/views/VwRootBox.cpp, VwRootBox.h
  • Change: Added m_fNeedsReconstruct flag. Reconstruct() returns early if no PropChanged, style change, or overlay change has occurred since last construction.
  • Impact: Warm Reconstruct: 100ms → 0.01ms

PATH-L5: Skip Reconstruct When Data Unchanged

  • Files: Src/views/Views.idh, Src/Common/SimpleRootSite/SimpleRootSite.cs
  • Change: Exposed NeedsReconstruct property via COM. SimpleRootSite.RefreshDisplay() checks this before performing expensive reconstruct cycle.
  • Impact: Eliminates managed overhead (SelectionRestorer, SuspendDrawing, COM interop) for all no-op refreshes across all root sites.

Phase 2: Cold Render Optimizations (60.36ms → 54.32ms)

PATH-R1-FIX: Construct Guard Bug Fix

  • File: Src/views/VwRootBox.cpp
  • Change: Construct() was leaving m_fNeedsReconstruct = true (set during Init), so the PATH-R1 guard never fired on the first warm render. Fixed by setting m_fNeedsReconstruct = false after successful construction.
  • Impact: Warm render 98.20ms → 0.01ms (the guard was previously dead code)

PATH-C1: HFONT LRU Cache

  • Files: Src/views/lib/VwGraphics.h, VwGraphics.cpp
  • Change: Added 8-entry LRU cache of (LgCharRenderProps → HFONT) in VwGraphics. On font change, check cache via memcmp of font-specific fields. Cache hit → reuse HFONT (skip CreateFontIndirect). Cache miss → create and add to cache. All cached HFONTs freed on destructor/ReleaseDC.
  • Impact: ~2% avg cold reduction. Best for multi-WS text with frequent font switching.

PATH-C2: Color State Caching

  • Files: Src/views/lib/VwGraphics.h, VwGraphics.cpp
  • Change: Track current foreground color, background color, and background mode. Only call GDI color functions when value actually changes. Cache invalidated on DC change.
  • Impact: <1% avg cold reduction

PATH-N1: NFC Normalization Bypass

  • Files: Src/views/lib/UniscribeSegment.h, UniscribeSegment.cpp, UniscribeEngine.cpp
  • Change: OffsetInNfc() and OffsetToOrig() performed COM Fetch + ICU NFC normalization on every call — 12+ times per FindBreakPoint invocation. For content-heavy views, this meant millions of redundant normalizations. Added NFC detection in CallScriptItemize() (comparing pre/post normalization length). When text is already NFC (common case — FieldWorks stores NFC), new overloads return identity offsets, completely skipping COM and ICU calls.
  • Impact: 8.2% avg cold reduction (59.17ms → 54.32ms). Best results: mixed-styles -27.1%, lex-deep -24.9%, lex-extreme -24.3%

Bug Fixes

VwPropertyStore Crash Fix

  • File: Src/views/VwPropertyStore.cpp
  • Change: Assert(m_chrp.ws) and AssertPtr(qws) at line 1336 fired when ws=0 reached VwPropertyStore during PropChanged-driven box tree rebuilds. Fixed by guarding get_EngineOrNull with if (m_chrp.ws) check and using AssertPtrN(qws) to allow null engine pointer. The code already defaulted to LTR directionality when qws was null.

VwDrawRootBuffered Coordinate Fixes

  • File: Src/views/VwRootBox.cpp
  • Change: Fixed coordinate mapping and trailing separator bars in VwDrawRootBuffered.

Other Fixes

Final Benchmark Results

Scenario Cold (ms) Warm (ms) Pixel Variance
simple 162.23 0.02 0%
medium 16.40 0.01 0%
complex 32.04 0.00 0%
deep-nested 17.42 0.01 0%
custom-heavy 19.28 0.01 0%
many-paragraphs 33.47 0.01 0%
footnote-heavy 66.78 0.01 0%
mixed-styles 73.71 0.01 0%
long-prose 159.94 0.01 0%
multi-book 24.01 0.01 0%
rtl-script 37.16 0.01 0%
multi-ws 45.60 0.01 0%
lex-shallow 17.51 0.00 0%
lex-deep 21.42 0.01 0%
lex-extreme 87.90 0.01 0%
Average 54.32 0.01 0%

Files Changed (Key Production Files)

File Changes
Src/views/VwRootBox.h Layout/Reconstruct guard fields, NeedsReconstruct property
Src/views/VwRootBox.cpp Layout/Reconstruct guards, dirty flags, Construct fix, coordinate fixes
Src/views/Views.idh NeedsReconstruct COM property on IVwRootBox
Src/views/VwPropertyStore.cpp Crash fix for ws=0 during PropChanged
Src/views/lib/VwGraphics.h HFONT cache, color cache structures
Src/views/lib/VwGraphics.cpp HFONT LRU cache, color change detection
Src/views/lib/UniscribeSegment.h NFC-aware overloads, CallScriptItemize extension
Src/views/lib/UniscribeSegment.cpp NFC bypass logic, NFC-aware offset functions
Src/views/lib/UniscribeEngine.cpp 12 FindBreakPoint call sites updated for NFC bypass
Src/Common/SimpleRootSite/SimpleRootSite.cs NeedsReconstruct check in RefreshDisplay
Src/Common/Controls/DetailControls/DataTree.cs HWND diagnostics, render scenario support
Src/Common/Controls/DetailControls/Slice.cs Render measurements

Future Optimization Paths

The remaining opportunity for significant cold render improvement is PATH-V1: Lazy VwLazyBox Construction — deferring off-screen paragraph construction until scrolled into view. This is the only path that can achieve the 50% cold reduction target, as the engine is fundamentally bottlenecked by ScriptShape/ScriptPlace kernel calls that cannot be made faster, only avoided.

See layout-optimization-paths.md for full analysis.


This change is Reviewable

@johnml1135 johnml1135 marked this pull request as ready for review March 3, 2026 21:20
Copilot AI review requested due to automatic review settings March 3, 2026 21:20
@github-actions

This comment has been minimized.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Performance-focused updates introduce render verification/benchmark infrastructure and targeted optimizations in the DetailControls/DataTree rendering path to reduce redundant work and support pixel-perfect render validation.

Changes:

  • Added RenderVerification infrastructure (environment validation, diagnostics/trace toggling, benchmark result models, reporting & comparison utilities).
  • Optimized DataTree/Slice construction, layout, painting, and visibility paths to avoid repeated O(N) work.
  • Added/updated test harness assets (layouts/parts, new diagnostics tests, VS Code tasks) and snapshot testing support.

Reviewed changes

Copilot reviewed 68 out of 153 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
Src/Common/RenderVerification/RenderModels.cs Adds shared models for scenarios and timing results used by render verification/benchmarks.
Src/Common/RenderVerification/RenderEnvironmentValidator.cs Captures environment details and produces a hash/diff for deterministic snapshot comparisons.
Src/Common/RenderVerification/RenderDiagnosticsToggle.cs Adds a toggle for diagnostics/trace output with file-backed flags and a trace listener.
Src/Common/RenderVerification/RenderBenchmarkResults.cs Defines JSON-serializable benchmark run/result structures and flags loading.
Src/Common/RenderVerification/RenderBenchmarkReportWriter.cs Writes JSON + Markdown summary reports and recommendations for benchmark runs.
Src/Common/RenderVerification/RenderBenchmarkComparer.cs Compares runs to detect regressions/improvements and generates a summary.
Src/Common/RenderVerification/CompositeViewCapture.cs Implements two-pass DataTree capture with Views content overlay for pixel-perfect snapshots.
Src/Common/FieldWorks/FieldWorks.Diagnostics.dev.config Adds a new trace switch for Views render timing.
Src/Common/Controls/XMLViews/XMLViewsTests/XmlBrowseViewBaseTests.cs Updates test stub interface to include NeedsReconstruct.
Src/Common/Controls/DetailControls/VectorReferenceView.cs Avoids drawing a trailing separator bar after the last item.
Src/Common/Controls/DetailControls/Slice.cs Adds cached XML attribute parsing + width fast-path + DataTree method call updates.
Src/Common/Controls/DetailControls/PossibilityVectorReferenceView.cs Avoids drawing a trailing separator bar after the last item.
Src/Common/Controls/DetailControls/HwndDiagnostics.cs Adds helper methods to read USER/GDI handle counts for diagnostics/tests.
Src/Common/Controls/DetailControls/DetailControlsTests/TestResult.xml Adds a test runner output artifact (currently shows a failed/invalid run).
Src/Common/Controls/DetailControls/DetailControlsTests/TestParts.xml Extends test parts inventory to include senses sequences and collapsible slices.
Src/Common/Controls/DetailControls/DetailControlsTests/Test.fwlayout Updates test layouts to include CitationForm and senses parts/layouts.
Src/Common/Controls/DetailControls/DetailControlsTests/DetailControlsTests.csproj Adds snapshot/JSON deps and references RenderVerification project.
Src/Common/Controls/DetailControls/DetailControlsTests/DataTreeTests.cs Updates expected template XML string for modified layout composition.
Src/Common/Controls/DetailControls/DetailControlsTests/DataTreeHwndCountTest.cs Adds baseline diagnostics tests for handle growth and slice install count.
Src/Common/Controls/DetailControls/DataTree.cs Adds multiple layout/paint/visibility optimizations and diagnostics counters.
Directory.Packages.props Adds central package version for Verify.
AGENTS.md Adds note about external dependency repository (liblcm).
.vscode/tasks.json Adds tasks for render baseline tests and building tests via repo script.
.vscode/settings.json Allows the new “Build Tests” task as an approved VS Code task.
.github/skills/openspec-verify-change/SKILL.md Updates generatedBy metadata version.
.github/skills/openspec-sync-specs/SKILL.md Updates generatedBy metadata version.
.github/skills/openspec-onboard/SKILL.md Updates onboarding workflow text/commands and cross-platform command notes.
.github/skills/openspec-new-change/SKILL.md Updates generatedBy metadata version.
.github/skills/openspec-ff-change/SKILL.md Updates generatedBy metadata version.
.github/skills/openspec-explore/SKILL.md Updates explore-mode wording around “proposal” vs other workflows.
.github/skills/openspec-continue-change/SKILL.md Updates generatedBy metadata version.
.github/skills/openspec-bulk-archive-change/SKILL.md Updates text and generatedBy metadata version.
.github/skills/openspec-archive-change/SKILL.md Updates guidance to invoke the sync skill via Task tool.
.github/skills/openspec-apply-change/SKILL.md Updates generatedBy metadata version.
.github/prompts/opsx-onboard.prompt.md Mirrors onboarding prompt updates (CLI install check, command list).
.github/prompts/opsx-ff.prompt.md Clarifies template usage vs context/rules constraints.
.github/prompts/opsx-explore.prompt.md Mirrors explore prompt updates (“proposal” wording).
.github/prompts/opsx-bulk-archive.prompt.md Mirrors bulk-archive prompt copy update.
.github/prompts/opsx-archive.prompt.md Mirrors archive prompt sync invocation update.
.github/instructions/csharp.instructions.md Adds C# instructions file (currently specifies “latest C# / C# 14” guidance).
.gitattributes Marks Verify snapshot PNG extensions as binary.
.GitHub/skills/render-testing/SKILL.md Adds a render-testing skill doc with build/test/run instructions and baseline workflow.
.GitHub/skills/code-review-skill-main/scripts/pr-analyzer.py Adds a Python tool to analyze diffs and estimate PR review complexity.
.GitHub/skills/code-review-skill-main/reference/typescript.md Adds TypeScript review reference documentation.
.GitHub/skills/code-review-skill-main/reference/security-review-guide.md Adds security review reference documentation.
.GitHub/skills/code-review-skill-main/reference/java.md Adds Java review reference documentation.
.GitHub/skills/code-review-skill-main/reference/cpp.md Adds C++ review reference documentation.
.GitHub/skills/code-review-skill-main/reference/code-review-best-practices.md Adds code review best practices reference documentation.
.GitHub/skills/code-review-skill-main/reference/c.md Adds C review reference documentation.
.GitHub/skills/code-review-skill-main/assets/review-checklist.md Adds a condensed review checklist asset.
.GitHub/skills/code-review-skill-main/assets/pr-review-template.md Adds a PR review template asset.
.GitHub/skills/code-review-skill-main/SKILL.md Adds a top-level code review skill description/index.
.GitHub/skills/code-review-skill-main/README.md Adds documentation for the code-review skill bundle.
.GitHub/skills/code-review-skill-main/LICENSE Adds license file for the bundled code-review skill content.
.GitHub/skills/code-review-skill-main/CONTRIBUTING.md Adds contribution guidelines for the bundled code-review skill content.
.GitHub/skills/code-review-skill-main/.gitignore Adds gitignore for the bundled code-review skill content.
Comments suppressed due to low confidence (1)

.GitHub/skills/render-testing/SKILL.md:1

  • This doc hardcodes a user-specific NUnit console path and recommends dotnet build for a .NET Framework workflow. For portability and alignment with repo build guidance, prefer .\build.ps1 / .\test.ps1 (and avoid assuming C:\\Users\\johnm\\... exists). Also consider avoiding or clearly calling out commands with pipes/redirection in docs, since the repo guidance prefers wrapper scripts for piped commands.
---

@github-actions
Copy link

github-actions bot commented Mar 3, 2026

NUnit Tests

    1 files  ±  0      1 suites  ±0   10m 2s ⏱️ + 4m 13s
4 490 tests +416  4 365 ✅ +362  87 💤 +16  38 ❌ +38 
4 499 runs  +416  4 374 ✅ +362  87 💤 +16  38 ❌ +38 

For more details on these failures, see this check.

Results for commit 34648c1. ± Comparison against base commit b6ac397.

This pull request removes 49 and adds 465 tests. Note that renamed tests count towards both.
AlloGenModelTests.AllomorphGeneratorsTests ‑ NewOperationTest
AlloGenModelTests.AllomorphGeneratorsTests ‑ RemoveEmptyReplaceOperationsTest
AlloGenModelTests.DuplicateTests ‑ DuplicateTest
AlloGenModelTests.PatternTests ‑ CreationTest
AlloGenModelTests.ReplaceOpsTest ‑ DeleteReplaceOpsTests
AlloGenModelTests.ReplaceOpsTest ‑ FindReplaceOpsTests
AlloGenModelTests.ReplaceOpsTest ‑ ReplaceOpIsEmptyTests
SIL.AlloGenServiceTests.AllomorphCreatorTests ‑ AddAllomorphToEntryTest
SIL.AlloGenServiceTests.DatabaseMigratorTest ‑ CreateFileNameTest
SIL.AlloGenServiceTests.DatabaseMigratorTest ‑ Migrate03to04Test
…
SIL.FieldWorks.Common.Framework.DetailControls.DataTreeRenderTests ‑ DataTreeOpt_AllViewportSlicesVisible
SIL.FieldWorks.Common.Framework.DetailControls.DataTreeRenderTests ‑ DataTreeOpt_FullLayoutAndPaintPathPositionsAgree
SIL.FieldWorks.Common.Framework.DetailControls.DataTreeRenderTests ‑ DataTreeOpt_IsHeaderNodeDelegatesToCachedProperty
SIL.FieldWorks.Common.Framework.DetailControls.DataTreeRenderTests ‑ DataTreeOpt_NoDummySlicesInViewportAfterPaint
SIL.FieldWorks.Common.Framework.DetailControls.DataTreeRenderTests ‑ DataTreeOpt_RemoveVisibleSliceInvalidatesHighWaterMark
SIL.FieldWorks.Common.Framework.DetailControls.DataTreeRenderTests ‑ DataTreeOpt_ScrollPositionStableAcrossPaints
SIL.FieldWorks.Common.Framework.DetailControls.DataTreeRenderTests ‑ DataTreeOpt_SequentialPaintsProduceIdenticalOutput
SIL.FieldWorks.Common.Framework.DetailControls.DataTreeRenderTests ‑ DataTreeOpt_SliceHeightsStableAfterConvergence
SIL.FieldWorks.Common.Framework.DetailControls.DataTreeRenderTests ‑ DataTreeOpt_SlicePositionsMonotonicallyIncreasing
SIL.FieldWorks.Common.Framework.DetailControls.DataTreeRenderTests ‑ DataTreeOpt_VisibilitySequenceHasNoGaps
…

♻️ This comment has been updated with latest results.

@johnml1135 johnml1135 force-pushed the 001-render-speedup branch from 04900b0 to d6741a0 Compare March 5, 2026 19:02
@johnml1135 johnml1135 requested a review from Copilot March 12, 2026 15:47
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 93 out of 178 changed files in this pull request and generated 8 comments.

@johnml1135 johnml1135 force-pushed the 001-render-speedup branch 2 times, most recently from 17af6b7 to 141e590 Compare March 13, 2026 11:31
- speed up DataTree and Views rendering, scrolling, and layout work

- expand render benchmarks, committed baselines, and timing coverage

- keep timing baselines informational in CI and harden test runs

- refresh build, tracing, and supporting test infrastructure
- replace unsupported Unicode escape syntax in the local

  PowerShell whitespace checker
@github-actions

This comment has been minimized.

Force ANTIALIASED_QUALITY for render tests by setting
FW_FONT_QUALITY=4 in the render test assembly setup and
reading that override in the native LOGFONT creation
paths. This keeps local and CI rasterization aligned
without changing production rendering.

Also remove the RootSite per-channel pixel tolerance
now that tests render deterministically again, so
snapshot failures remain strict and actionable.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants