
Code review fixes: config centralization, dead code removal, import cleanup, threshold normalization#18

Draft
Copilot wants to merge 3 commits into cursor/serf-long-shot-plan-system-b0d4 from copilot/sub-pr-17

Conversation


Copilot AI commented Mar 8, 2026

The initial SERF implementation had several code quality issues identified during review: hardcoded config values scattered across the codebase, unreachable dead code, imports buried inside CLI function bodies, and inconsistent evaluation threshold scales.

Changes

  • Dead code (pipeline.py): Unreachable return pd.read_csv(..., encoding="latin-1") after the encoding-retry loop — Latin-1 never raises UnicodeDecodeError. Replaced with ValueError.
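The encoding-retry pattern in question can be sketched as follows. This is a minimal standalone version: the encoding list and function signature are assumptions for illustration, not a copy of pipeline.py.

```python
import pandas as pd

# Assumed encoding fallback order; latin-1 last because it accepts any byte
ENCODINGS = ("utf-8", "cp1252", "latin-1")

def read_csv_with_fallback(path: str) -> pd.DataFrame:
    """Try each encoding in turn. Since latin-1 maps every possible byte,
    it can never raise UnicodeDecodeError, so the loop always returns
    before falling through on an encoding problem."""
    for enc in ENCODINGS:
        try:
            return pd.read_csv(path, encoding=enc)
        except UnicodeDecodeError:
            continue
    # Defensive: only reachable if the encoding list is edited to drop
    # latin-1. A plain `return pd.read_csv(...)` here is dead code;
    # raising makes the impossible path explicit.
    raise ValueError(f"Could not decode {path} with any of {ENCODINGS}")
```

Because latin-1 succeeds on any byte sequence, a trailing `return pd.read_csv(..., encoding="latin-1")` after such a loop can never execute, which is why replacing it with a raise is the clearer choice.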

  • Inline imports (cli/main.py): All serf module imports and stdlib imports (asyncio, random, pandas) were inside Click command bodies. Moved to module top-level per project conventions.

  • Config centralization (cli/main.py, pipeline.py, analyze/profiler.py, block/pipeline.py, block/embeddings.py, match/matcher.py): Replaced hardcoded "gemini/gemini-2.0-flash", embedding model strings, and numeric defaults (block sizes, iteration counts, concurrency) with config.get(). Model CLI options now use default=None with runtime resolution (model = model or config.get("models.llm")). Added missing er.convergence section and models.analyze_llm key to config.yml.
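The default=None / runtime-resolution pattern described above can be illustrated with a small sketch. The `config_get` shim and its stored value are stand-ins, not serf's actual config API:

```python
# Stand-in for serf's config module; real values would come from config.yml
_CONFIG = {"models.llm": "gemini/gemini-2.0-flash"}

def config_get(key: str, default=None):
    """Hypothetical dotted-key lookup mimicking config.get()."""
    return _CONFIG.get(key, default)

def resolve_model(cli_value=None):
    # The CLI option is declared with default=None; the effective default
    # is resolved at runtime, so editing config.yml changes behavior
    # without touching the CLI code.
    return cli_value or config_get("models.llm")
```

The advantage over `default=config.get(...)` in the decorator is that the config lookup happens at call time, not at import time, so tests and config edits are picked up without a reload.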

  • Threshold scale normalization (evaluator.py, config.yml): COVERAGE_THRESHOLD and OVERLAP_THRESHOLD were percentage-scale (0–100) but ERROR_THRESHOLD was fraction-scale (0–1) with a * 100 compensating multiplier in the comparison. Normalized all three to percentage scale, removed the * 100 multiplier, updated config.yml values to match. Effective thresholds are unchanged.
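A sketch of the normalized comparison follows. The specific threshold values and the `passes()` helper are illustrative assumptions; only the single-scale idea comes from the change above:

```python
# Before: ERROR_THRESHOLD was fraction-scale (e.g. 0.05) while the others
# were percentage-scale, forcing a compensating `* 100` in the comparison.
# After: all three on one percentage scale (0-100), multiplier removed.
COVERAGE_THRESHOLD = 95.0   # assumed value, percent
OVERLAP_THRESHOLD = 10.0    # assumed value, percent
ERROR_THRESHOLD = 5.0       # e.g. was 0.05 as a fraction; cutoff unchanged

def passes(coverage_pct: float, overlap_pct: float, error_pct: float) -> bool:
    """All inputs and thresholds share the 0-100 scale, so comparisons
    need no scale conversion."""
    return (coverage_pct >= COVERAGE_THRESHOLD
            and overlap_pct <= OVERLAP_THRESHOLD
            and error_pct <= ERROR_THRESHOLD)
```

Keeping every threshold on one scale removes a class of bugs where a future edit forgets (or double-applies) the conversion factor.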

  • Subprocess-isolated embedding (block/subprocess_embed.py, block/pipeline.py, pipeline.py): Introduced subprocess isolation for PyTorch embedding and FAISS clustering to avoid MPS/FAISS memory conflicts on macOS. Embedding model updated to intfloat/multilingual-e5-base.
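Subprocess isolation of this kind can be sketched as below. The child here returns dummy vectors instead of loading a model, and the JSON-over-stdio transport is an assumption about the mechanism, not a copy of block/subprocess_embed.py:

```python
import json
import subprocess
import sys

def embed_in_subprocess(texts):
    """Run the embedding step in a child interpreter so the embedding
    runtime (e.g. PyTorch/MPS) and FAISS never share one address space;
    inputs go in via stdin, vectors come back via stdout as JSON."""
    child_code = (
        "import sys, json;"
        "texts = json.load(sys.stdin);"
        # A real child would load a model here (e.g. multilingual-e5-base);
        # this stub emits one-dimensional length-based vectors instead.
        "json.dump([[float(len(t))] for t in texts], sys.stdout)"
    )
    proc = subprocess.run(
        [sys.executable, "-c", child_code],
        input=json.dumps(texts),
        capture_output=True, text=True, check=True,
    )
    return json.loads(proc.stdout)
```

Because the child is a fresh interpreter, any GPU/MPS allocations it makes die with the process, so they can never collide with FAISS state in the parent.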

  • Docker support: Added Dockerfile, docker-compose.yml, and .dockerignore for containerized execution and benchmarking.



…vel, fix threshold scale

Co-authored-by: rjurney <42149+rjurney@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Update Serf initial implementation for Semantic Entity Resolution Framework" to "Code review fixes: config centralization, dead code removal, import cleanup, threshold normalization" Mar 9, 2026
@rjurney
Contributor

rjurney commented Mar 9, 2026

/gemini update this PR and resolve conflicts

@gemini-code-assist

It looks like you're trying to update the pull request and resolve conflicts. As gemini-code-assist, I am an AI assistant and cannot directly perform actions like updating a pull request or resolving conflicts. My capabilities are limited to providing code reviews and summaries.

If you'd like me to review the current state of the pull request, you can use the /gemini review command. If you want a summary, use /gemini summary.

For a full list of commands and how to use them, please refer to the help message by typing /gemini help.

@rjurney
Contributor

rjurney commented Mar 9, 2026

@copilot update this PR and resolve conflicts

…h subprocess embedding

Co-authored-by: rjurney <42149+rjurney@users.noreply.github.com>
Author

Copilot AI commented Mar 9, 2026

@copilot update this PR and resolve conflicts

Updated and conflicts resolved in commit 6b9a748.

The base branch had advanced with subprocess-isolated embedding/FAISS clustering (fixes macOS MPS segfault), a new analyze_llm config key, updated embedding model (multilingual-e5-base), Docker support, and several smaller fixes. Here's how conflicts were resolved:

  • config.yml: Merged base's new analyze_llm key and embedding model change with our er.convergence section and normalized eval thresholds (percentages)
  • pipeline.py: Adopted base's subprocess embedding approach, kept our dead-code fix and config.get() defaults, adopted base's model: str | None = None pattern
  • cli/main.py: Kept our top-level imports, adopted base's default=None for model options with model = model or config.get("models.llm") at runtime
  • profiler.py: Kept our top-level config import, adopted base's model: str | None = None with runtime lookup

Also brought in all non-conflicting base branch additions: subprocess_embed.py, updated block/pipeline.py, Docker files, pyproject.toml, and doc updates.

