Skill v1.0.1
currentAutomated scan100/1003 files
version: "1.0.1" name: qa description: Run quality assessment on a SpecWeave increment with risk scoring and quality gate decisions
/sw:qa - Quality Assessment Command
IMPORTANT: You MUST invoke the CLI specweave qa command using the Bash tool. The slash command provides guidance and orchestration only.
Purpose
Run comprehensive quality assessment on an increment using:
- ✅ Gate 1: Rule-based validation (130+ automated checks)
- ✅ Gate 2: LLM-as-Judge (AI quality assessment with chain-of-thought reasoning)
- ✅ Gate 3: Risk scoring (BMAD Probability × Impact quantitative assessment)
- ✅ Quality gate decisions (PASS/CONCERNS/FAIL)
LLM-as-Judge Pattern
This command implements the LLM-as-Judge pattern - an established AI/ML evaluation technique where an LLM evaluates outputs using structured reasoning.
How it works:
┌─────────────────────────────────────────────────────────────┐│ LLM-as-Judge Gate │├─────────────────────────────────────────────────────────────┤│ Input: spec.md, plan.md, tasks.md ││ ││ Process: ││ 1. Chain-of-thought analysis (7 dimensions) ││ 2. Evidence-based scoring (0-100 per dimension) ││ 3. Risk identification (BMAD P×I formula) ││ 4. Formal verdict (PASS/CONCERNS/FAIL) ││ ││ Output: Structured quality report with: ││ - Blockers (MUST fix) ││ - Concerns (SHOULD fix) ││ - Recommendations (NICE to fix) │└─────────────────────────────────────────────────────────────┘
Why LLM-as-Judge?
- Consistency: Applies uniform evaluation criteria
- Depth: Catches nuanced issues humans might miss
- Speed: ~30 seconds vs hours of manual review
- Documented reasoning: Explains WHY something is an issue
Usage
/sw:qa <increment-id> [options]
Examples
# Quick mode (default)/sw:qa 0008# Pre-implementation check/sw:qa 0008 --pre# Quality gate check (comprehensive)/sw:qa 0008 --gate# Export blockers to tasks.md/sw:qa 0008 --export# CI mode (exit 1 on FAIL)/sw:qa 0008 --ci# Skip AI assessment (rule-based only)/sw:qa 0008 --no-ai# Force run even if rule-based fails/sw:qa 0008 --force
Options
--quick- Quick mode (default) - Fast assessment with core checks--pre- Pre-implementation mode - Check before starting work--gate- Quality gate mode - Comprehensive check before closing--full- Full multi-agent mode (Phase 3)--ci- CI mode - Exit 1 on FAIL (for automation)--no-ai- Skip AI assessment - Rule-based validation only (free, fast)--export- Export blockers/concerns to tasks.md--force- Force run even if rule-based validation fails-v, --verbose- Show recommendations in addition to blockers/concerns
What It Does
Step 1: Rule-Based Validation (Always First, Always Free)
The command runs 120+ validation checks on increment files:
- ✅ File existence (spec.md, plan.md, tasks.md)
- ✅ YAML frontmatter structure
- ✅ AC-ID traceability (spec.md → tasks.md)
- ✅ Link integrity
- ✅ Format consistency
If rule-based fails → Stop (don't waste AI tokens) unless --force flag used
Step 2: AI Quality Assessment (Optional, skip with --no-ai)
IMPORTANT: This step uses the increment-quality-judge-v2 skill (auto-activated).
The skill provides guidance and the CLI handles execution:
# CLI invokes quality assessment directlyspecweave qa 0008 --pre
DO NOT spawn agents for quality assessment - use the CLI command which handles everything internally.
The assessment evaluates:
- 7 Dimensions:
- Clarity (18% weight)
- Testability (22% weight)
- Completeness (18% weight)
- Feasibility (13% weight)
- Maintainability (9% weight)
- Edge Cases (9% weight)
- Risk Assessment (11% weight)
Risk Assessment uses quantitative method:
- Probability (0.0-1.0) × Impact (1-10) = Risk Score (0.0-10.0)
- 4 categories: Security, Technical, Implementation, Operational
- Severity: CRITICAL (≥9.0), HIGH (6.0-8.9), MEDIUM (3.0-5.9), LOW (<3.0)
Step 3: Quality Gate Decision
Based on thresholds:
FAIL if any:
- Risk score ≥ 9.0 (CRITICAL)
- Test coverage < 60%
- Spec quality < 50
- Critical security vulnerabilities ≥ 1
CONCERNS if any:
- Risk score 6.0-8.9 (HIGH)
- Test coverage < 80%
- Spec quality < 70
- High security vulnerabilities ≥ 1
PASS otherwise
Step 4: Display Report
Show results with:
- 🟢 PASS / 🟡 CONCERNS / 🔴 FAIL decision
- Blockers (MUST fix)
- Concerns (SHOULD fix)
- Recommendations (NICE to fix, with
--verbose) - Spec quality scores (7 dimensions)
- Summary (duration, tokens, cost)
Step 5: Export (Optional)
If --export flag provided:
- Append blockers/concerns to tasks.md
- Add priority (P0 for blockers, P1 for concerns)
- Include mitigation strategies
Implementation
When user runs `/qa <increment-id>`:
- Parse and normalize arguments
```typescript let incrementId = args[0]; // e.g., "0008" or "0008-feature-name"
// Normalize increment ID if (incrementId.includes('-')) { // Extract numeric portion: "0008-feature-name" → "0008" incrementId = incrementId.split('-')[0]; } // Convert to 4-digit format: "8" → "0008" incrementId = incrementId.padStart(4, '0');
const options = parseOptions(args.slice(1)); `` Both formats work: /sw:qa 0153 or /sw:qa 0153-feature-name`
- Invoke CLI command via Bash tool
``bash specweave qa 0008 --pre --export ``
- CLI handles everything:
- Rule-based validation
- AI assessment invocation
- Quality gate decision
- Report display
- Export to tasks.md
- Return result to user
- Show CLI output (already formatted)
- Suggest next steps based on decision
Modes Explained
Quick Mode (Default)
Use when: Quick check during development Checks: Rule-based + AI spec quality + risk assessment Time: ~30 seconds Cost: ~$0.025-$0.050
Pre-Implementation Mode (--pre)
Use when: Before starting increment work Checks: All quick mode checks + architecture review Time: ~1 minute Cost: ~$0.05-$0.10
Quality Gate Mode (--gate)
Use when: Before closing increment (via /sw:done) Checks: All pre-implementation checks + test coverage + security audit Time: ~2-3 minutes Cost: ~$0.10-$0.20
Full Multi-Agent Mode (--full, Phase 3)
Use when: Comprehensive audit for critical increments Checks: 6 specialized subagents in parallel Time: ~5 minutes Cost: ~$0.50-$1.00
Cost Breakdown
| Mode | Tokens | Cost (USD) | Time | |
|---|---|---|---|---|
| Quick | ~2,500 | ~$0.025 | 30s | |
| Pre | ~5,000 | ~$0.050 | 1m | |
| Gate | ~10,000 | ~$0.100 | 2-3m | |
| Full | ~50,000 | ~$0.500 | 5m |
Optimization: Use Haiku model by default (cheapest, fastest)
Exit Codes (for CI)
When --ci flag used:
- Exit 0: PASS or CONCERNS (warning, but not blocking)
- Exit 1: FAIL (blocking issues found)
CI Integration Example:
# .github/workflows/qa-check.yml- name: Run QA Checkrun: specweave qa ${{ env.INCREMENT_ID }} --gate --ci
Error Handling
Common errors:
- ❌ Increment not found → Check ID format (4 digits: 0001, 0008)
- ❌ Missing files → Run
/sw:incto create increment first - ❌ Rule-based fails → Fix validation errors before AI assessment
- ❌ AI timeout → Retry with
--quickmode or--no-ai
Integration Points
Auto-invoked by:
/sw:done- Runs--gatemode before closing increment- Post-task-completion hook (optional) - Runs
--quickmode after tasks complete
Manual invocation:
- During development -
/qa 0008for quick checks - Before commit -
/qa 0008 --preto catch issues early - Before PR -
/qa 0008 --gate --exportfor comprehensive check
Best Practices
- Run early and often - Use
--quickduring development - Fix blockers immediately - Don't proceed with FAIL decision
- Address concerns before release - CONCERNS = should fix
- Use risk scores to prioritize - Fix CRITICAL (≥9.0) risks first
- Export to tasks.md - Convert blockers/concerns to actionable tasks
- CI integration - Block PRs with FAIL decision
Related
- Skill:
increment-quality-judge-v2(7 dimensions with risk assessment) - Command:
/sw:done(auto-runs QA gate) - CLI:
specweave qa(direct invocation) - Types:
src/core/qa/types.ts(TypeScript definitions) - Tests:
tests/unit/qa/(58 test cases, 100% passing)
Example Session
User: /sw:qa 0008