IB Memory Reduction by danieljvickers · Pull Request #1307 · MFlowCode/MFC

danieljvickers · 2026-03-13T21:12:47Z

Description

Before scaling up, I was looking for any free memeory that we have been loose with and cleaning it up.

This removes the inner point array, as it is just a waste of memory resources. It also means we can reduce the number of ghost points to keep in memory. I also added protection preventing arbitrary memory allocation in non-moving cases.

Type of change

Bug fix

Testing

How did you test your changes?

I reran all mibm examples to ensure we did not run out of memory

GPU changes (expand if you modified src/simulation/)

GPU results match CPU results
Tested on NVIDIA GPU or AMD GPU

AI code reviews

Reviews are not triggered automatically. To request a review, comment on the PR:

@coderabbitai review — incremental review (new changes only)
@coderabbitai full review — full review from scratch
/review — Qodo review
/improve — Qodo code suggestions

…ry IBs, and removal of inner point array

github-actions · 2026-03-13T21:15:06Z

Claude Code Review

Head SHA: 23a49931ad4070320a7013bc4def2a2134ef46d8
Files changed: 1 — src/simulation/m_ibm.fpp
+20 / -62

Summary

Removes the inner_points array and num_inner_gps counter, reducing GPU/CPU memory usage for IBM cases.
Removes the inner-points momentum-correction loop from s_ibm_correct_state.
Changes ghost_points allocation from a generous over-allocated buffer to an exact-count allocation.
Adds a moving_immersed_boundary_flag guard around the MPI allreduce + cap, skipping it for static IB cases.
Updates s_find_num_ghost_points and s_find_ghost_points signatures to drop the inner-points output/argument.

Findings

1. Correctness — removed inner-points momentum-correction loop (high concern)
src/simulation/m_ibm.fpp, s_ibm_correct_state (around old line 401–418):
The loop that zeroed momentum components (q_cons_vf(q)%sf(j,k,l) = 0._wp for q = momxb, momxe) for all cells classified as inner points (i.e., cells fully inside an immersed boundary) has been removed entirely. The PR description does not explain why this correction is no longer needed. If any fluid solver path writes nonzero momentum into those cells, the immersed boundary no longer resets them, which could affect mass/momentum conservation near solid boundaries.

Please confirm: is this loop genuinely redundant (e.g., because inner points are never touched by the Riemann solver), or does removing it change physics results? The testing note says "reran all mibm examples to ensure we did not run out of memory" — it would be valuable to also confirm that pressure/velocity fields inside IBs are unchanged.

2. Dead code — count_i declared but now unused in s_find_ghost_points
src/simulation/m_ibm.fpp, s_find_ghost_points:

The only code that used count_i was the removed inner-points else branch. count_i is now declared but never referenced. Strict compilers (Intel ifx, Cray ftn with -warn unused) may emit a warning; some configurations treat this as an error. Remove the declaration.

3. Exact allocation without buffer for non-moving case (low-risk but worth noting)
src/simulation/m_ibm.fpp, s_ibm_setup (around new line 129):

For moving_immersed_boundary_flag = .false., num_gps is now the raw local GPU-counted value with no allreduce and no safety factor. The old code allocated (max_num_gps + max_num_inner_gps) * 2, which was admittedly very generous. The tightened allocation is safe if and only if the two GPU parallel loops in s_find_num_ghost_points and s_find_ghost_points see an identical grid state. If ib_markers is updated between the two calls (e.g., by s_find_ghost_points itself writing ib_markers%sf(i,j,k) = patch_id), a second call path through s_update_ib_markers calling s_find_num_ghost_points + s_find_ghost_points would re-count on already-modified markers. Confirm this is safe in the moving-boundary update path.

Minor

The PR description checkboxes ("Bug fix", "GPU results match CPU results", "Tested on NVIDIA GPU or AMD GPU") are all unchecked. For a simulation-only GPU module change, please fill in the GPU testing checklist.

github-actions · 2026-03-14T14:06:31Z

Claude Code Review

Head SHA: 4150b12

Files changed: 1

src/simulation/m_ibm.fpp (+20 / -62)

Summary

Removes the inner_points array and num_inner_gps counter, saving memory for IB simulations.
Adds a guard: MPI allreduce and 2× buffer allocation only happen when moving_immersed_boundary_flag is true; non-moving IBs allocate exactly num_gps elements.
Removes the inner-point momentum-zeroing loop in s_ibm_correct_state.
Simplifies s_find_num_ghost_points and s_find_ghost_points to operate only on ghost points.

Findings

1. Medium — `ib_markers%sf` not decoded for inner IB cells (potential downstream correctness issue)

File: src/simulation/m_ibm.fpp, s_find_ghost_points (removed else branch around line 630)

Previously, for inner IB cells the code decoded the encoded patch ID and wrote back the plain patch_id to ib_markers%sf(i,j,k). This step is now removed. Any code that later reads ib_markers%sf for a deeply-interior IB cell will now see the encoded value (with periodicity bits set) rather than the clean patch ID.

If interior cells are never read after this point (e.g., they are completely masked), this is harmless. But it warrants confirming that no downstream consumer of ib_markers%sf can reach interior cells and misinterpret the encoded value.

2. Medium — Removal of inner-point momentum zeroing (physics correctness)

File: src/simulation/m_ibm.fpp, s_ibm_correct_state (removed ~lines 398–419)

The deleted loop set q_cons_vf(q)%sf(j,k,l) = 0 for all momentum components of every inner IB cell. Removing this means deeply-interior cells will accumulate whatever state the solver writes there. For a stationary IB this is generally benign (they never influence the exterior solution), but for moving IBs where a cell transitions from interior to ghost/fluid, stale momentum in those cells could contaminate the solution momentarily.

Since you tested mibm examples this may be validated, but consider adding a note in a comment or PR body about why removing this is safe for both static and moving IB cases.

3. Low — Non-moving IB allocation uses exact count with no buffer

File: src/simulation/m_ibm.fpp, s_ibm_setup (~line 130)

For non-moving IBs, num_gps is the local rank's exact count with no additional buffer. s_find_ghost_points then fills up to exactly that count, so there is no out-of-bounds risk for the static case. This is intentional and safe, just worth a brief comment explaining the design decision (static = exact, moving = 2× global sum).

4. Low — Verify `s_finalize_ibm_module` was updated

The diff does not show s_finalize_ibm_module being touched. Confirm that any prior @:DEALLOCATE(inner_points(...)) in that routine was also removed. If it was not present, this means inner_points was a pre-existing memory leak now resolved by elimination.

Improvement Opportunity

The comment at the top of the new moving-IB block could clarify why a 2× factor is used:

This would help future maintainers understand the intent without reading git history.

Overall this is a clean, well-scoped reduction. The main item to double-check is the ib_markers%sf encoding for interior cells and whether any downstream reader can hit those cells.

…nto ib-memory-reduction

github-actions · 2026-03-14T18:53:18Z

Claude Code Review

Head SHA: 48a588009f9aaadae65ce8f321241e787a290ed4

Files changed: 1

src/simulation/m_ibm.fpp (+22 / -62)

Summary

Removes inner_points array and num_inner_gps counter, reducing GPU/CPU memory for IB simulations.
Adds a moving_immersed_boundary_flag guard: static IBs allocate exactly num_gps locally; moving IBs allreduce-sum and apply a 2× cap.
Removes inner-point momentum-zeroing loop from s_ibm_correct_state.
Simplifies s_find_num_ghost_points and s_find_ghost_points signatures to drop inner-point arguments.

Findings

1. Medium — `ib_markers%sf` encoding not cleaned for interior cells

File: src/simulation/m_ibm.fpp, s_find_ghost_points (removed else branch ~line 630)

The removed else block decoded the encoded patch ID and wrote the clean patch_id back to ib_markers%sf(i,j,k) for deeply-interior cells. That writeback is now gone. Any downstream code reading ib_markers%sf for an interior cell will see the encoded value (with periodicity bits) instead of the plain patch ID.

If interior cells are fully masked in all downstream readers this is benign, but please confirm no consumer of ib_markers%sf can reach interior cells and misinterpret the encoded value.

2. Medium — Removal of inner-point momentum zeroing (physics change)

File: src/simulation/m_ibm.fpp, s_ibm_correct_state (removed ~lines 401–418)

The deleted loop set q_cons_vf(q)%sf(j,k,l) = 0 for momentum of every fully-interior IB cell. For a static IB these cells are inert, but for a moving IB a cell that transitions from interior to ghost/fluid could carry stale nonzero momentum until the next correction cycle.

Testing "reran all mibm examples" covers memory correctness; it would also be valuable to confirm pressure/velocity inside IBs are unchanged between the old and new code, particularly for moving-boundary cases.

3. Low — Possible unused `count_i` variable in `s_find_ghost_points`

File: src/simulation/m_ibm.fpp, s_find_ghost_points

The diff removes all uses of count_i (it was only used in the removed inner-points else branch), but the declaration line (integer :: count, count_i, local_idx) is not shown being removed. If count_i is still declared, Intel ifx and Cray ftn may emit an unused-variable warning — or a build error under strict flags. Please verify the declaration is removed.

4. Low — Verify `s_finalize_ibm_module` was updated

The diff does not include any change to the finalization subroutine. Please confirm that @:DEALLOCATE(inner_points(...)) (or equivalent GPU deallocation) is also removed there. If no deallocation existed before, this was a pre-existing memory/GPU-data leak now resolved by eliminating the array.

Minor

PR description checkboxes ("Bug fix", "GPU results match CPU results", "Tested on NVIDIA GPU") are all unchecked. For a src/simulation/ change touching GPU memory layout, filling in the GPU testing checklist would strengthen confidence.
A short comment explaining why the 2× factor is used for the moving-IB allocation (headroom for boundary motion between setup and first timestep update) would help future readers.

Memeory footprint reduction to prevent very large arrays for stationa…

23a4993

…ry IBs, and removal of inner point array

Merge branch 'master' into ib-memory-reduction

4150b12

danieljvickers added 2 commits March 14, 2026 14:50

Fix an issue with setting num_gps too large for later use

ee9b269

Merge branch 'ib-memory-reduction' of github.com:danieljvickers/MFC i…

48a5880

…nto ib-memory-reduction

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

IB Memory Reduction#1307

IB Memory Reduction#1307
danieljvickers wants to merge 4 commits intoMFlowCode:masterfrom
danieljvickers:ib-memory-reduction

danieljvickers commented Mar 13, 2026

Uh oh!

github-actions bot commented Mar 13, 2026

Uh oh!

github-actions bot commented Mar 14, 2026

Uh oh!

github-actions bot commented Mar 14, 2026

Uh oh!

Reviewers

Assignees

Labels

Milestone

Development

Uh oh!

2 participants

Conversation

danieljvickers commented Mar 13, 2026

Description

Type of change

Testing

AI code reviews

Uh oh!

github-actions bot commented Mar 13, 2026

Claude Code Review

Summary

Findings

Minor

Uh oh!

github-actions bot commented Mar 14, 2026

Claude Code Review

Summary

Findings

1. Medium — ib_markers%sf not decoded for inner IB cells (potential downstream correctness issue)

2. Medium — Removal of inner-point momentum zeroing (physics correctness)

3. Low — Non-moving IB allocation uses exact count with no buffer

4. Low — Verify s_finalize_ibm_module was updated

Improvement Opportunity

Uh oh!

github-actions bot commented Mar 14, 2026

Claude Code Review

Summary

Findings

1. Medium — ib_markers%sf encoding not cleaned for interior cells

2. Medium — Removal of inner-point momentum zeroing (physics change)

3. Low — Possible unused count_i variable in s_find_ghost_points

4. Low — Verify s_finalize_ibm_module was updated

Minor

Uh oh!

Reviewers

Assignees

Labels

Milestone

Development

Uh oh!

2 participants

1. Medium — `ib_markers%sf` not decoded for inner IB cells (potential downstream correctness issue)

4. Low — Verify `s_finalize_ibm_module` was updated

1. Medium — `ib_markers%sf` encoding not cleaned for interior cells

3. Low — Possible unused `count_i` variable in `s_find_ghost_points`

4. Low — Verify `s_finalize_ibm_module` was updated