Add /selftest.extension core extension to test other extensions #1758
dhilipkumars wants to merge 16 commits into github:main
Pull request overview
Adds an experimental sandbox under tests/extension-commands/ intended to evaluate whether LLM/agent workflows can discover extension command definitions and “execute” their mapped markdown instructions.
Changes:
- Introduces a mock Python entrypoint (`main.py`) that prints timestamped outputs for `--lint` and `--deploy`.
- Adds a `.specify/` sandbox containing command markdown files (`lint.md`, `deploy.md`) and an `extensions.yml` mapping.
- Adds `TESTING.md` with a copy/paste prompt for running the exercise in an LLM chat.
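As a rough illustration of the mock entrypoint described above, here is a minimal sketch of a CLI that accepts either `--lint` or `--deploy` through an argparse mutually exclusive group and returns a timestamped line. The function names (`build_parser`, `run`) and the exact output wording are assumptions made for this example; they are not taken from the PR's `main.py`.

```python
import argparse
from datetime import datetime, timezone


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that accepts exactly one of --lint or --deploy."""
    parser = argparse.ArgumentParser(
        description="Illustrative mock CLI (hypothetical, not the PR's main.py)."
    )
    # A required mutually exclusive group rejects both "no flag" and "both flags".
    group = parser.add_mutually_exclusive_group(required=True)
    group.add_argument("--lint", action="store_true", help="Pretend to run the linter")
    group.add_argument("--deploy", action="store_true", help="Pretend to run a deployment")
    return parser


def run(argv: list[str]) -> str:
    """Parse argv and return a timestamped message for the chosen action."""
    args = build_parser().parse_args(argv)
    stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
    action = "lint" if args.lint else "deploy"
    # The message format is an assumption made for this sketch.
    return f"[{stamp}] mock {action} completed"


print(run(["--lint"]))
```

Passing both flags, or neither, makes argparse print a usage error and exit with status 2, which gives an agent-driven test an unambiguous failure signal.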
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| tests/extension-commands/main.py | Mock CLI script for lint/deploy command execution output |
| tests/extension-commands/TESTING.md | Manual LLM evaluation instructions and expected terminal-style output |
| tests/extension-commands/.specify/extensions.yml | Declares command mappings for the sandbox |
| tests/extension-commands/.specify/lint.md | Markdown “command file” instructing how to run the mock linter |
| tests/extension-commands/.specify/deploy.md | Markdown “command file” instructing how to run the mock deploy |
Pull request overview
Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.
@copilot code review[agent]
Wild idea, maybe as an extension with its own command /selftest "extension" ?
Interesting. Where do you think such a
Probably in the core repo under extensions/selftest ?
perfect let me re-work my PR
Pull request overview
Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.
…ructure and argparse mutually exclusive groups
…ual tests sandbox
9d5649a to 5048406
Pull request overview
Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.
mnriem
left a comment
Can you address the Copilot feedback? Note this will be an official extension so it should be in the regular catalog.json and then refer to the full URL it will land on once we merge it in (extensions/selftest directory) as the download URL? Do we support referring to a directory instead of a tgz file? If we do NOT we should ;)
Pull request overview
Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.
Pull request overview
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
mnriem
left a comment
Can you address the Copilot feedback where applicable? If not applicable, please explain why.
@mnriem sure, on it. I have made some changes to address your previous comment and was testing them locally last night. I should be able to get the changes in tonight or tomorrow.
Pull request overview
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
commands:
  - name: speckit.selftest.extension
    file: commands/selftest.md
    description: Validate the lifecycle of an extension from the catalog.
The PR description/sample output uses /speckit.selftest.run ..., but the extension manifest registers the command as speckit.selftest.extension. Either update the docs/sample invocation to match the registered command name, or rename the command here so users can run the command shown in the PR description.
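The mismatch flagged in this comment could be caught mechanically. The following is a minimal sketch, assuming the manifest has already been parsed into a dictionary with a top-level `commands` list (the real manifest nesting may differ) and that invocations in the docs look like `/speckit.<name>.<name>`. The helper names `registered_commands` and `unregistered_invocations` are hypothetical.

```python
import re

# Matches slash-invocations such as "/speckit.selftest.run" while ignoring
# path-like text such as "a/speckit.x" (assumed invocation style).
COMMAND_PATTERN = re.compile(r"(?<![\w/])/(speckit(?:\.[\w-]+)+)")


def registered_commands(manifest: dict) -> set[str]:
    """Collect command names from an already-parsed manifest (assumed layout)."""
    return {entry["name"] for entry in manifest.get("commands", [])}


def unregistered_invocations(manifest: dict, text: str) -> list[str]:
    """Return commands invoked in the text that the manifest does not register."""
    invoked = set(COMMAND_PATTERN.findall(text))
    return sorted(invoked - registered_commands(manifest))


# Values mirror the names quoted in the review comment.
manifest = {
    "commands": [
        {"name": "speckit.selftest.extension", "file": "commands/selftest.md"}
    ]
}
docs = "Run /speckit.selftest.run ... to validate an extension."

print(unregistered_invocations(manifest, docs))  # ['speckit.selftest.run']
```

An empty list would mean every command shown in the documentation is actually registered by the extension.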
yes we do have that already as apart from that I have addressed all of Copilot's review comments and updated
This PR adds an experimental testing sandbox under tests/extension-commands to evaluate how LLMs/Agents parse extension specifications and execute their mapped commands.
Sample Output from Gemini LLM
In my Gemini CLI, I entered the prompt below.
It outputs the following.
AI Usage disclosure: Used Antigravity to build this out.
Future Goal: Currently, triggering these tests is manual, but at some point we can build a workflow for Copilot to run these /selftests.