
feat(python): use pyproject.toml metadata for root SBOM component #368

Merged
ruromero merged 2 commits into main from TC-3894
Mar 26, 2026

@ruromero
Collaborator

Summary

  • Read project name, version, and license from pyproject.toml metadata (PEP 621 and Poetry formats) instead of using hardcoded defaults for the root SBOM component
  • Add getRootComponentName() / getRootComponentVersion() virtual methods in PythonProvider base class for subclass override
  • Override readLicenseFromManifest() in PythonPyprojectProvider to extract license from TOML before falling back to LICENSE file
  • Cache parsed TOML object to avoid redundant parsing
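
The override hook described in the summary can be pictured with a minimal sketch. The class names below are hypothetical, simplified stand-ins rather than the real `PythonProvider` and `PythonPyprojectProvider`, and `describeRoot()` is an invented caller used only for illustration. The default values are the ones named in this PR's verification report (`default-pip-root` / `0.0.0`).

```java
// Hypothetical stand-in for the base provider: callers read the root
// component identity through overridable hooks instead of constants.
class PythonProviderSketch {
  static final String DEFAULT_ROOT_NAME = "default-pip-root";
  static final String DEFAULT_ROOT_VERSION = "0.0.0";

  // Hook: subclasses may override to supply a name from their manifest.
  protected String getRootComponentName() {
    return DEFAULT_ROOT_NAME;
  }

  // Hook: subclasses may override to supply a version from their manifest.
  protected String getRootComponentVersion() {
    return DEFAULT_ROOT_VERSION;
  }

  // Invented caller standing in for the SBOM-building code paths.
  final String describeRoot() {
    return getRootComponentName() + "@" + getRootComponentVersion();
  }
}

// Hypothetical stand-in for the pyproject-based provider. The metadata is
// passed in directly here; the real provider reads it from parsed TOML.
class PyprojectProviderSketch extends PythonProviderSketch {
  private final String name;
  private final String version;

  PyprojectProviderSketch(String name, String version) {
    this.name = name;
    this.version = version;
  }

  @Override
  protected String getRootComponentName() {
    return isPresent(name) ? name : super.getRootComponentName();
  }

  @Override
  protected String getRootComponentVersion() {
    return isPresent(version) ? version : super.getRootComponentVersion();
  }

  private static boolean isPresent(String value) {
    return value != null && !value.isBlank();
  }
}
```

A provider that does not override the hooks keeps the defaults, which is why the requirements.txt behavior is expected to stay unchanged.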

Test plan

  • PEP 621 name/version extraction from [project] section
  • Poetry name/version extraction from [tool.poetry] section
  • Fallback to defaults when no metadata present
  • PEP 621 license extraction
  • Poetry license extraction
  • Existing requirements.txt provider behavior unchanged
  • All 26 pyproject tests pass (8 new + 18 existing)

Implements TC-3894

🤖 Generated with Claude Code

Read project name, version, and license from pyproject.toml instead of
using hardcoded defaults. Supports both PEP 621 ([project]) and Poetry
([tool.poetry]) formats with graceful fallback to existing defaults.

- Add getRootComponentName/Version overrides in PythonPyprojectProvider
- Add readLicenseFromManifest override for TOML license extraction
- Cache parsed TOML to avoid redundant parsing
- Add virtual methods in PythonProvider base class for subclass override

Implements TC-3894

Assisted-by: Claude Code
@qodo-code-review
Contributor


Review Summary by Qodo

Extract pyproject.toml metadata for root SBOM component

✨ Enhancement


Walkthroughs

Description
• Extract project metadata from pyproject.toml for root SBOM component
• Support both PEP 621 and Poetry configuration formats
• Add virtual methods in base class for subclass override capability
• Cache parsed TOML to avoid redundant parsing operations
• Add comprehensive test coverage for metadata extraction scenarios
Diagram
flowchart LR
  A["PythonProvider<br/>Base Class"] -->|"adds virtual methods"| B["getRootComponentName<br/>getRootComponentVersion"]
  C["PythonPyprojectProvider<br/>Subclass"] -->|"overrides methods"| D["Extract from<br/>pyproject.toml"]
  D -->|"PEP 621"| E["project.name<br/>project.version<br/>project.license"]
  D -->|"Poetry"| F["tool.poetry.name<br/>tool.poetry.version<br/>tool.poetry.license"]
  E -->|"fallback"| G["Default values"]
  F -->|"fallback"| G
  H["TOML Cache"] -->|"avoids re-parsing"| D


File Changes

1. src/main/java/io/github/guacsec/trustifyda/providers/PythonProvider.java ✨ Enhancement +10/-4

Add virtual methods for root component metadata

• Add protected virtual methods getRootComponentName() and getRootComponentVersion() returning default constants
• Replace hardcoded default values with method calls in provideStack() and provideComponent()
• Enable subclass override capability for metadata extraction



2. src/main/java/io/github/guacsec/trustifyda/providers/PythonPyprojectProvider.java ✨ Enhancement +73/-4

Extract metadata from pyproject.toml with caching

• Add cachedToml field to cache parsed TOML and avoid redundant parsing
• Implement getToml() method with error handling and caching logic
• Override getRootComponentName() to extract from PEP 621 project.name or Poetry tool.poetry.name
• Override getRootComponentVersion() to extract from PEP 621 project.version or Poetry tool.poetry.version
• Override readLicenseFromManifest() to extract from PEP 621 project.license or Poetry tool.poetry.license with fallback to LICENSE file
• Refactor parseDependencyStrings() to use cached getToml() method



3. src/test/java/io/github/guacsec/trustifyda/providers/Python_Pyproject_Provider_Test.java 🧪 Tests +69/-0

Add comprehensive metadata extraction test coverage

• Add 8 new test cases covering metadata extraction scenarios
• Test PEP 621 name/version extraction with fallback to defaults
• Test Poetry name/version extraction with fallback to defaults
• Test PEP 621 license extraction from project.license
• Test Poetry license extraction from tool.poetry.license



4. src/test/resources/tst_manifests/pip/pip_pyproject_toml_no_metadata/pyproject.toml 🧪 Tests +4/-0

Test fixture for missing metadata scenario

• New test fixture with minimal pyproject.toml containing only dependencies
• Used to verify fallback to default name and version values



5. src/test/resources/tst_manifests/pip/pip_pyproject_toml_pep621_license/pyproject.toml 🧪 Tests +7/-0

Test fixture for PEP 621 metadata format

• New test fixture with PEP 621 format metadata including name, version, and license
• Contains project.name, project.version, and project.license fields



6. src/test/resources/tst_manifests/pip/pip_pyproject_toml_poetry_license/pyproject.toml 🧪 Tests +8/-0

Test fixture for Poetry metadata format

• New test fixture with Poetry format metadata including name, version, and license
• Contains tool.poetry section with name, version, and license fields




@qodo-code-review
Contributor

qodo-code-review bot commented Mar 25, 2026

Code Review by Qodo

🐞 Bugs (0) 📘 Rule violations (0) 📎 Requirement gaps (0) 📐 Spec deviations (0)



Action required

1. Caches errored TOML result 🐞 Bug ✓ Correctness
Description
PythonPyprojectProvider.getToml() assigns cachedToml before checking hasErrors(). If errors exist, it
throws IOException but leaves cachedToml non-null, so subsequent calls skip validation and return an
errored TomlParseResult. This can lead to incorrect or empty metadata or dependency extraction
instead of failing fast.
Code

src/main/java/io/github/guacsec/trustifyda/providers/PythonPyprojectProvider.java[R59-67]

+  private TomlParseResult getToml() throws IOException {
+    if (cachedToml == null) {
+      cachedToml = Toml.parse(manifest);
+      if (cachedToml.hasErrors()) {
+        throw new IOException(
+            "Invalid pyproject.toml format: " + cachedToml.errors().get(0).getMessage());
+      }
+    }
+    return cachedToml;
Evidence
getToml() sets the instance field cachedToml, then throws on hasErrors() without clearing/resetting
it. Since later calls only validate when cachedToml == null, an errored parse can be returned on
subsequent calls without throwing, creating inconsistent behavior across calls within the same
provider instance.

src/main/java/io/github/guacsec/trustifyda/providers/PythonPyprojectProvider.java[59-68]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`PythonPyprojectProvider#getToml()` caches the `TomlParseResult` before validating `hasErrors()`. If parsing yields errors, an `IOException` is thrown but the `cachedToml` field remains set, so later calls can return an errored parse result without re-validating.
### Issue Context
This can produce inconsistent behavior within a single provider instance (first call throws, subsequent calls may silently proceed with an errored `TomlParseResult`).
### Fix Focus Areas
- src/main/java/io/github/guacsec/trustifyda/providers/PythonPyprojectProvider.java[59-68]
### Suggested change
Parse into a local variable, validate, then assign:
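
The reviewer's own snippet is not reproduced in this page. As a rough illustration of "parse, validate, then assign", the sketch below uses hypothetical stand-ins: `ParseResultSketch` plays the role of `TomlParseResult`, and the injected `Supplier` plays the role of `Toml.parse(manifest)`. It is a sketch of the idea under those assumptions, not the code that was merged.

```java
import java.io.IOException;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical stand-in for a parse result that may carry errors.
final class ParseResultSketch {
  private final List<String> errors;

  ParseResultSketch(List<String> errors) {
    this.errors = errors;
  }

  boolean hasErrors() {
    return !errors.isEmpty();
  }

  List<String> errors() {
    return errors;
  }
}

// Hypothetical cache holder: only a validated result is ever stored.
final class TomlCacheSketch {
  private final Supplier<ParseResultSketch> parser; // stand-in for the real parse call
  private ParseResultSketch cached;

  TomlCacheSketch(Supplier<ParseResultSketch> parser) {
    this.parser = parser;
  }

  ParseResultSketch get() throws IOException {
    if (cached == null) {
      // Parse into a local variable first so a failed parse is never cached.
      ParseResultSketch parsed = parser.get();
      if (parsed.hasErrors()) {
        throw new IOException("Invalid pyproject.toml format: " + parsed.errors().get(0));
      }
      cached = parsed;
    }
    return cached;
  }
}
```

With this ordering, every call on an invalid manifest throws, while a valid manifest is parsed once and reused.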




Remediation recommended

2. Swallowed TOML parse errors 🐞 Bug ✓ Correctness
Description
PythonPyprojectProvider silently ignores IOExceptions from TOML parsing in root name/version and
license extraction, making it difficult to diagnose why SBOM root metadata or license falls back to
defaults/LicenseUtils behavior.
Code

src/main/java/io/github/guacsec/trustifyda/providers/PythonPyprojectProvider.java[R70-85]

+  @Override
+  protected String getRootComponentName() {
+    try {
+      TomlParseResult toml = getToml();
+      String name = toml.getString("project.name");
+      if (name != null && !name.isBlank()) {
+        return name;
+      }
+      String poetryName = toml.getString("tool.poetry.name");
+      if (poetryName != null && !poetryName.isBlank()) {
+        return poetryName;
+      }
+    } catch (IOException e) {
+      // fall through to default
+    }
+    return super.getRootComponentName();
Evidence
The catch blocks for metadata extraction explicitly swallow IOException with only comments and no
logging, so operators have no signal that parsing failed and fallbacks were used.

src/main/java/io/github/guacsec/trustifyda/providers/PythonPyprojectProvider.java[70-86]
src/main/java/io/github/guacsec/trustifyda/providers/PythonPyprojectProvider.java[88-104]
src/main/java/io/github/guacsec/trustifyda/providers/PythonPyprojectProvider.java[106-127]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`getRootComponentName()`, `getRootComponentVersion()`, and `readLicenseFromManifest()` catch `IOException` from `getToml()` and fall back, but they do so silently. This makes it hard to understand why defaults were used.
### Issue Context
Fallback behavior is fine/intentional, but a debug/warn log when TOML parsing fails would improve debuggability.
### Fix Focus Areas
- src/main/java/io/github/guacsec/trustifyda/providers/PythonPyprojectProvider.java[70-86]
- src/main/java/io/github/guacsec/trustifyda/providers/PythonPyprojectProvider.java[88-104]
- src/main/java/io/github/guacsec/trustifyda/providers/PythonPyprojectProvider.java[106-127]
### Suggested change
Add a logger to `PythonPyprojectProvider` (similar to other providers) and log the exception message at `FINE`/`WARNING` when falling back, e.g.:
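
The example that followed this suggestion is not present in this page. A minimal sketch of a logged fallback, assuming `java.util.logging` (suggested by the `FINE`/`WARNING` level names) and using a hypothetical `MetadataReader` interface and `readOrDefault` helper that are not part of the PR, could look like this:

```java
import java.io.IOException;
import java.util.logging.Level;
import java.util.logging.Logger;

final class LoggedFallbackSketch {
  private static final Logger LOG = Logger.getLogger(LoggedFallbackSketch.class.getName());

  // Hypothetical abstraction over "read one value from the parsed manifest".
  interface MetadataReader {
    String read() throws IOException;
  }

  // Returns the manifest value when present, otherwise the fallback.
  // A parse failure is logged at FINE instead of being silently swallowed.
  static String readOrDefault(MetadataReader reader, String fallback) {
    try {
      String value = reader.read();
      if (value != null && !value.isBlank()) {
        return value;
      }
    } catch (IOException e) {
      LOG.log(Level.FINE, "pyproject.toml could not be parsed; using default: {0}", e.getMessage());
    }
    return fallback;
  }
}
```

The fallback behavior is unchanged; the only difference is that operators who enable `FINE` logging can see why a default was used.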




@github-actions
Contributor

github-actions bot commented Mar 25, 2026

Test Results

379 tests: 379 passed ✅, 0 skipped 💤, 0 failed ❌
25 suites, 25 files, 1m 45s ⏱️

Results for commit 2b0fde8.

♻️ This comment has been updated with latest results.

- Parse TOML into local variable before caching to avoid retaining
  errored parse results across subsequent calls
- Add FINE-level debug logging when TOML parsing fails and metadata
  extraction falls back to defaults

Implements TC-3894

Assisted-by: Claude Code
@ruromero
Collaborator Author

Verification Report for TC-3894

Check | Result | Details
Scope Containment | WARN | 4 out-of-scope test fixture files (justified: new test data)
Diff Size | PASS | +177/-8 across 6 files, proportionate to scope
Commit Traceability | PASS | Both commits reference TC-3894
Sensitive Patterns | PASS | No secrets detected
CI Status | WARN | Unit tests pass; some integration tests failing (pre-existing, unrelated to this PR: osv-github high count mismatch in yarn-berry); many checks still pending
Acceptance Criteria | PASS | 6 of 6 criteria met
Verification Commands | N/A | No verification commands in task

Overall: WARN

Notes:

  • Out-of-scope files are 3 new test fixture pyproject.toml files + the test class itself — all directly support the acceptance criteria (unit test coverage). No production code outside scope was modified.
  • CI failures are integration test flakes (osv-github high count mismatch: expected 7 got 6) — same failure observed on other PRs, not caused by this change.
  • Qodo review findings (cached errored TOML, swallowed exceptions) were addressed in the second commit.

Acceptance Criteria Detail

# | Criterion | Result
1 | Root SBOM uses [project].name / [project].version | PASS
2 | Falls back to [tool.poetry].name / [tool.poetry].version | PASS
3 | Falls back to defaults (default-pip-root / 0.0.0) | PASS
4 | License from [project].license or [tool.poetry].license with LICENSE file fallback | PASS
5 | Existing requirements.txt behavior unchanged | PASS
6 | Unit tests cover all metadata resolution scenarios | PASS
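
Criteria 1 to 3 describe a precedence order: the PEP 621 value first, then the Poetry value, then the default. A minimal sketch of that order follows, assuming a flat `Map` keyed by dotted paths as a stand-in for the parsed TOML document; the `resolve` helper is hypothetical and not taken from the PR.

```java
import java.util.Map;

final class RootMetadataSketch {
  // Returns the first non-blank value among the PEP 621 key and the Poetry
  // key, in that order, or the fallback when neither is set.
  static String resolve(
      Map<String, String> toml, String pep621Key, String poetryKey, String fallback) {
    for (String key : new String[] {pep621Key, poetryKey}) {
      String value = toml.get(key);
      if (value != null && !value.isBlank()) {
        return value;
      }
    }
    return fallback;
  }
}
```

For example, a manifest that sets both `project.name` and `tool.poetry.name` resolves to the `project.name` value, and an empty manifest resolves to `default-pip-root`.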

This comment was AI-generated by sdlc-workflow/verify-pr v0.5.0.

@qodo-code-review
Contributor

qodo-code-review bot commented Mar 25, 2026

CI Feedback 🧐

(Feedback updated until commit 2b0fde8)

A test triggered by this PR failed. Here is an AI-generated analysis of the failure:

Action: call-shared / integration-tests (ubuntu-latest, gradle-groovy)

Failed stage: Run Integration Tests [❌]

Failed test name: stack_analysis validation

Failure summary:

The GitHub Action failed during the integration test validation step for stack analysis results.
- The stack analysis output validation detected a mismatch for provider rhtpa using source osv-github: remediations count was expected to be 14 but the CLI returned 2 (log line ~15254).
- Because of this remediations mismatch, the test suite reported Stack analysis validation failed and the job exited with code 1 (log lines ~15255-15256).

Relevant error logs:
1:  ##[group]Runner Image Provisioner
2:  Hosted Compute Agent
...

287:  env:
288:  TRUSTIFY_DA_DEV_MODE: true
289:  TRUSTIFY_DA_BACKEND_URL: https://exhort.stage.devshift.net
290:  PYTHONIOENCODING: utf-8
291:  PYTHONUNBUFFERED: 1
292:  pythonLocation: /opt/hostedtoolcache/Python/3.11.15/x64
293:  PKG_CONFIG_PATH: /opt/hostedtoolcache/Python/3.11.15/x64/lib/pkgconfig
294:  Python_ROOT_DIR: /opt/hostedtoolcache/Python/3.11.15/x64
295:  Python2_ROOT_DIR: /opt/hostedtoolcache/Python/3.11.15/x64
296:  Python3_ROOT_DIR: /opt/hostedtoolcache/Python/3.11.15/x64
297:  LD_LIBRARY_PATH: /opt/hostedtoolcache/Python/3.11.15/x64/lib
298:  ##[endgroup]
299:  + python -u shared-scripts/run_tests_no_runtime.py java artifact gradle-groovy
300:  ---
301:  Scenario: No runtime available
302:  Description: It fails when no runtime is available
303:  Manifest: /home/runner/work/trustify-da-java-client/trustify-da-java-client/integration-tests/scenarios/gradle-groovy/simple/build.gradle
304:  Expecting failure (no runtime available)
305:  Executing: java -jar /home/runner/work/trustify-da-java-client/trustify-da-java-client/integration-tests/artifact/cli.jar component /home/runner/work/trustify-da-java-client/trustify-da-java-client/integration-tests/scenarios/gradle-groovy/simple/build.gradle
306:  ✅ Command failed as expected (no runtime available)
307:  Executing: java -jar /home/runner/work/trustify-da-java-client/trustify-da-java-client/integration-tests/artifact/cli.jar stack /home/runner/work/trustify-da-java-client/trustify-da-java-client/integration-tests/scenarios/gradle-groovy/simple/build.gradle
308:  ✅ Command failed as expected (no runtime available)
309:  Executing: java -jar /home/runner/work/trustify-da-java-client/trustify-da-java-client/integration-tests/artifact/cli.jar stack /home/runner/work/trustify-da-java-client/trustify-da-java-client/integration-tests/scenarios/gradle-groovy/simple/build.gradle --summary
310:  ✅ Command failed as expected (no runtime available)
311:  Executing: java -jar /home/runner/work/trustify-da-java-client/trustify-da-java-client/integration-tests/artifact/cli.jar stack /home/runner/work/trustify-da-java-client/trustify-da-java-client/integration-tests/scenarios/gradle-groovy/simple/build.gradle --html
312:  ✅ Command failed as expected (no runtime available)
313:  ---
...

12518:  "name": "Apache License 2.0",
12519:  "isDeprecated": false,
12520:  "isOsiApproved": true,
12521:  "isFsfLibre": true,
12522:  "category": "PERMISSIVE"
12523:  }
12524:  ],
12525:  "expression": "Apache-2.0",
12526:  "name": "Apache License 2.0",
12527:  "category": "PERMISSIVE",
12528:  "source": "deps.dev",
12529:  "sourceUrl": "https://api.deps.dev"
12530:  }
12531:  ]
12532:  },
12533:  "pkg:maven/com.google.errorprone/error_prone_annotations@2.10.0?scope=compile": {
12534:  "concluded": {
...

15240:  "direct": 4,
15241:  "transitive": 9,
15242:  "total": 13,
15243:  "dependencies": 10,
15244:  "critical": 1,
15245:  "high": 4,
15246:  "medium": 8,
15247:  "low": 0,
15248:  "remediations": 6,
15249:  "recommendations": 119,
15250:  "unscanned": 0
15251:  }
15252:  }
15253:  }
15254:  ❌ stack_analysis provider rhtpa source osv-github remediations mismatch: expected 14, got 2
15255:  ❌ Stack analysis validation failed
15256:  ##[error]Process completed with exit code 1.
15257:  Post job cleanup.

@ruromero ruromero requested a review from soul2zimate March 25, 2026 22:45
@ruromero ruromero merged commit 7f8eff5 into main Mar 26, 2026
19 of 40 checks passed
@ruromero ruromero deleted the TC-3894 branch March 26, 2026 08:33
@ruromero ruromero restored the TC-3894 branch March 26, 2026 11:21
@ruromero ruromero deleted the TC-3894 branch March 26, 2026 11:21