Merge Findings into One Report

Two sources of truth, one report.

The AI thinks the code is fine. The tests are red. Both are real. The job of the merge step is to surface both, and to make sure a green AI review doesn't mask a broken build.

Outcome

Update the CLI to parse lifecycle.testResult.stdout/stderr into test findings, merge them with the AI review's findings, recalculate overallRisk, and print the unified result.

Fast Track

  1. Import parseTestFailures in the CLI.
  2. After the analyzer call, parse test failures from the combined stdout/stderr.
  3. Build a combined review where overallRisk escalates to 'high' if any tests failed.

Hands-on exercise

Open src/cli.ts and update the action body:

import { Command } from 'commander';
import { runSandboxLifecycle } from './sandbox-lifecycle';
import { analyzeRepository } from './analyze';
import { parseTestFailures } from './test-runner';
 
function isValidGitHubRepoUrl(input: string): boolean {
  return /^https:\/\/github\.com\/[\w.-]+\/[\w.-]+\/?$/.test(input);
}
 
const program = new Command();
 
program
  .name('repo-review')
  .description('Clone and review a GitHub repository in a Sandbox')
  .version('0.1.0');
 
program
  .command('review <repoUrl>')
  .description('Run a Sandbox review against a GitHub repository URL')
  .action(async (repoUrl: string) => {
    if (!isValidGitHubRepoUrl(repoUrl)) {
      console.error(`Invalid GitHub repository URL: ${repoUrl}`);
      console.error('Expected format: https://github.com/<owner>/<repo>');
      process.exitCode = 2;
      return;
    }
 
    console.log(`Reviewing ${repoUrl}...`);
 
    try {
      const lifecycle = await runSandboxLifecycle(repoUrl);
      console.log(`Sandbox: ${lifecycle.sandboxId}`);
      console.log(`Collected ${lifecycle.files.length} file(s) for analysis.`);
      console.log(`Tests (${lifecycle.testResult.packageManager}): exit code ${lifecycle.testResult.exitCode}`);
 
      // Skip the AI call entirely when there is nothing to analyze.
      const aiReview = lifecycle.files.length === 0
        ? { overallRisk: 'low' as const, findings: [] }
        : await analyzeRepository(lifecycle.files);
 
      // Scan stdout and stderr together so failure markers in either
      // stream are caught.
      const testFindings = parseTestFailures(
        `${lifecycle.testResult.stdout}\n${lifecycle.testResult.stderr}`
      );
 
      // Blunt combine rule: any test failure forces overall risk to
      // high; otherwise trust the AI's assessment.
      const combined = {
        overallRisk: testFindings.length > 0 ? 'high' as const : aiReview.overallRisk,
        aiFindings: aiReview.findings,
        testFindings
      };
 
      console.log(`\nOverall risk: ${combined.overallRisk}`);
      console.log(`AI findings: ${combined.aiFindings.length}`);
      for (const finding of combined.aiFindings) {
        console.log(`  [${finding.severity}] ${finding.summary} (${finding.file})`);
      }
 
      console.log(`Test findings: ${combined.testFindings.length}`);
      for (const finding of combined.testFindings) {
        console.log(`  [${finding.severity}] ${finding.details}`);
      }
    } catch (error) {
      console.error('Review failed:', error instanceof Error ? error.message : error);
      process.exitCode = 1;
    }
  });
 
program.parse();

The combine rule is intentionally blunt: any test failure forces overall risk to high, regardless of what the AI thought. A passing test suite means we trust the AI's assessment.

You could imagine more nuanced rules (severity-weighted blends, configurable thresholds), and those are reasonable in a production tool. For this course, blunt is fine. The point is that test failures stop being invisible to the summary.
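As a taste of what a severity-weighted blend could look like, here is a minimal sketch; weightedRisk, the weight values, and the threshold are all invented for illustration, not part of this course's code:

type Severity = 'low' | 'medium' | 'high';
 
// Illustrative weights and threshold; tune to taste.
const WEIGHTS: Record<Severity, number> = { low: 1, medium: 3, high: 9 };
 
function weightedRisk(severities: Severity[], highThreshold = 9): Severity {
  const score = severities.reduce((sum, s) => sum + WEIGHTS[s], 0);
  if (score >= highThreshold) return 'high';
  if (score >= highThreshold / 3) return 'medium';
  return 'low';
}

You would feed it both lenses at once, e.g. weightedRisk([...aiReview.findings, ...testFindings].map((f) => f.severity)).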

Notice we're also keeping AI findings and test findings as separate arrays in the output, not jamming them into one combined list. That makes the report easier to read; a reader can see at a glance "the AI thinks this, the tests prove that." The two lenses don't need to look identical.
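If you want that shape pinned down as a type, something like the sketch below matches the object the CLI builds. The finding fields are inferred from the print statements above; check them against your actual types in analyze.ts and test-runner.ts:

type Risk = 'low' | 'medium' | 'high';
 
// Field names inferred from what the CLI prints; your real finding
// types are exported by analyze.ts and test-runner.ts.
interface AiFinding { severity: Risk; summary: string; file: string; }
interface TestFinding { severity: Risk; details: string; }
 
interface CombinedReview {
  overallRisk: Risk;
  aiFindings: AiFinding[];
  testFindings: TestFinding[];
}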

Troubleshooting: test findings empty when tests failed

If the test exit code is non-zero but combined.testFindings is empty, the parser didn't recognize the failure markers. Print the raw stdout and check what your test runner uses. Add the marker to FAILURE_MARKERS in test-runner.ts.
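The exact shape of FAILURE_MARKERS depends on how you wrote test-runner.ts earlier in the course. A plausible minimal version is a list of substrings the parser scans each output line for; '✕' and 'FAIL' match the sample output in this lesson, the rest is assumption:

// test-runner.ts (sketch): substrings that mark a failing line of
// test output. Add your runner's marker here if it is missing.
const FAILURE_MARKERS = ['FAIL', '✕', 'not ok'];
 
function lineIndicatesFailure(line: string): boolean {
  return FAILURE_MARKERS.some((marker) => line.includes(marker));
}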

Troubleshooting: combined risk feels too harsh

If you don't want flaky test suites to auto-escalate to high risk, add a flakyAllowance threshold (e.g. allow up to N failures before escalating). Out of scope for this course, but it's a one-line change.
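Concretely, the change would land in the combine step. flakyAllowance is a name made up for this sketch, not something the course defines:

// Sketch: tolerate up to flakyAllowance failures before escalating.
const flakyAllowance = 2; // assumed value; could come from config
 
const combined = {
  overallRisk:
    testFindings.length > flakyAllowance ? ('high' as const) : aiReview.overallRisk,
  aiFindings: aiReview.findings,
  testFindings
};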

Try It

Run against a repo with failing tests:

pnpm review https://github.com/<repo-with-failing-tests>

Expected output:

Reviewing https://github.com/<...>...
Sandbox: sbx_7N2k4A...
Collected 2 file(s) for analysis.
Tests (pnpm): exit code 1

Overall risk: high
AI findings: 2
  [medium] Missing input validation (lib/auth.ts)
  [low] Inconsistent error handling (src/index.ts)
Test findings: 2
  [high] ✕ src/auth.test.ts > login rejects empty password
  [high] FAIL  src/auth.test.ts

Then a clean repo:

pnpm review https://github.com/<a-passing-repo>
Reviewing https://github.com/<...>...
Sandbox: sbx_7N2k4A...
Collected 2 file(s) for analysis.
Tests (pnpm): exit code 0

Overall risk: low
AI findings: 1
  [low] Consider extracting magic number (src/index.ts)
Test findings: 0

Both reports are useful. The first one tells you to look at tests before merging; the second one tells you you're in fine shape.

Commit

git add src/cli.ts
git commit -m "feat(cli): merge ai findings with test failures in one report"

Done-When

  • CLI imports and calls parseTestFailures
  • Combined report has separate aiFindings and testFindings sections
  • overallRisk escalates to 'high' when any test failed
  • Clean run (no failures) preserves the AI's risk level (the test sketch after this list checks this bullet and the one above)
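To check the last two bullets without spinning up a Sandbox, you can extract the combine rule into a helper and assert on it directly. This sketch uses Node's built-in test runner; combineRisk is a name invented here, mirroring the CLI's ternary:

// combine-risk.test.ts (sketch): run with `node --test` (compiled)
// or a TypeScript-aware runner.
import test from 'node:test';
import assert from 'node:assert/strict';
 
type Risk = 'low' | 'medium' | 'high';
 
function combineRisk(aiRisk: Risk, testFailureCount: number): Risk {
  return testFailureCount > 0 ? 'high' : aiRisk;
}
 
test('any test failure escalates overall risk to high', () => {
  assert.equal(combineRisk('low', 3), 'high');
});
 
test('a clean run preserves the AI risk level', () => {
  assert.equal(combineRisk('medium', 0), 'medium');
});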

Solution

src/cli.ts (relevant section)
const lifecycle = await runSandboxLifecycle(repoUrl);
 
// Skip the AI call entirely when there is nothing to analyze.
const aiReview = lifecycle.files.length === 0
  ? { overallRisk: 'low' as const, findings: [] }
  : await analyzeRepository(lifecycle.files);
 
// Scan stdout and stderr together so failure markers in either stream are caught.
const testFindings = parseTestFailures(
  `${lifecycle.testResult.stdout}\n${lifecycle.testResult.stderr}`
);
 
// Blunt combine rule: any test failure forces overall risk to high.
const combined = {
  overallRisk: testFindings.length > 0 ? 'high' as const : aiReview.overallRisk,
  aiFindings: aiReview.findings,
  testFindings
};
 
console.log(`\nOverall risk: ${combined.overallRisk}`);
console.log(`AI findings: ${combined.aiFindings.length}`);
for (const finding of combined.aiFindings) {
  console.log(`  [${finding.severity}] ${finding.summary} (${finding.file})`);
}
console.log(`Test findings: ${combined.testFindings.length}`);
for (const finding of combined.testFindings) {
  console.log(`  [${finding.severity}] ${finding.details}`);
}
