---
title: "Merge AI and Test Findings"
description: "In the CLI, parse the test output from the lifecycle, turn failures into structured findings, and merge them with the AI review. Recalculate overall risk so a failing test bumps the report's severity even if the AI thought everything was fine."
canonical_url: "https://vercel.com/academy/vercel-sandbox/merge-ai-and-test-findings"
md_url: "https://vercel.com/academy/vercel-sandbox/merge-ai-and-test-findings.md"
docset_id: "vercel-academy"
doc_version: "1.0"
last_updated: "2026-05-17T23:44:57.320Z"
content_type: "lesson"
course: "vercel-sandbox"
course_title: "Vercel Sandbox"
prerequisites:  []
---

<agent-instructions>
Vercel Academy — structured learning, not reference docs.
Lessons are sequenced.
Adapt commands to the human's actual environment (OS, package manager, shell, editor) — detect from project context or ask, don't assume.
The lesson shows one path; if the human's project diverges, adapt concepts to their setup.
Preserve the learning goal over literal steps.
Quizzes are pedagogical — engage, don't spoil.
Quiz answers are included for your reference.
</agent-instructions>

# Merge AI and Test Findings

# Merge Findings into One Report

Two sources of truth, one report.

The AI thinks the code is fine. The tests are red. Both are real. The job of the merge step is to surface both, and to make sure a green AI review doesn't mask a broken build.

## Outcome

Update the CLI to parse `lifecycle.testResult.stdout/stderr` into test findings, merge them with the AI review's findings, recalculate `overallRisk`, and print the unified result.

## Fast Track

1. Import `parseTestFailures` in the CLI.
2. After the analyzer call, parse test failures from the combined stdout/stderr.
3. Build a `combined` review where `overallRisk` escalates to `'high'` if any tests failed.

## Hands-on exercise

Open `src/cli.ts` and update the action body:

```ts
import { Command } from 'commander';
import { runSandboxLifecycle } from './sandbox-lifecycle';
import { analyzeRepository } from './analyze';
import { parseTestFailures } from './test-runner';

function isValidGitHubRepoUrl(input: string): boolean {
  return /^https:\/\/github\.com\/[\w.-]+\/[\w.-]+\/?$/.test(input);
}

const program = new Command();

program
  .name('repo-review')
  .description('Clone and review a GitHub repository in a Sandbox')
  .version('0.1.0');

program
  .command('review <repoUrl>')
  .description('Run a Sandbox review against a GitHub repository URL')
  .action(async (repoUrl: string) => {
    if (!isValidGitHubRepoUrl(repoUrl)) {
      console.error(`Invalid GitHub repository URL: ${repoUrl}`);
      console.error('Expected format: https://github.com/<owner>/<repo>');
      process.exitCode = 2;
      return;
    }

    console.log(`Reviewing ${repoUrl}...`);

    try {
      const lifecycle = await runSandboxLifecycle(repoUrl);
      console.log(`Sandbox: ${lifecycle.sandboxId}`);
      console.log(`Collected ${lifecycle.files.length} file(s) for analysis.`);
      console.log(`Tests (${lifecycle.testResult.packageManager}): exit code ${lifecycle.testResult.exitCode}`);

      const aiReview = lifecycle.files.length === 0
        ? { overallRisk: 'low' as const, findings: [] }
        : await analyzeRepository(lifecycle.files);

      const testFindings = parseTestFailures(
        `${lifecycle.testResult.stdout}\n${lifecycle.testResult.stderr}`
      );

      const combined = {
        overallRisk: testFindings.length > 0 ? 'high' as const : aiReview.overallRisk,
        aiFindings: aiReview.findings,
        testFindings
      };

      console.log(`\nOverall risk: ${combined.overallRisk}`);
      console.log(`AI findings: ${combined.aiFindings.length}`);
      for (const finding of combined.aiFindings) {
        console.log(`  [${finding.severity}] ${finding.summary} (${finding.file})`);
      }

      console.log(`Test findings: ${combined.testFindings.length}`);
      for (const finding of combined.testFindings) {
        console.log(`  [${finding.severity}] ${finding.details}`);
      }
    } catch (error) {
      console.error('Review failed:', error instanceof Error ? error.message : error);
      process.exitCode = 1;
    }
  });

program.parse();
```

The combine rule is intentionally blunt: any test failure forces overall risk to high, regardless of what the AI thought. A passing test suite means we trust the AI's assessment.

You could imagine more nuanced rules (severity-weighted blends, configurable thresholds), and those are reasonable in a production tool. For this course, blunt is fine. The point is that test failures stop being invisible to the summary.

Notice we're also keeping AI findings and test findings as separate arrays in the output, not jamming them into one combined list. That makes the report easier to read; a reader can see at a glance "the AI thinks this, the tests prove that." The two lenses don't need to look identical.

\*\*Warning: Troubleshooting: test findings empty when tests failed\*\*

If the test exit code is non-zero but `combined.testFindings` is empty, the parser didn't recognize the failure markers. Print the raw stdout and check what your test runner uses. Add the marker to `FAILURE_MARKERS` in `test-runner.ts`.

\*\*Note: Troubleshooting: combined risk feels too harsh\*\*

If you want flaky test suites to not auto-escalate to high risk, add a `flakyAllowance` threshold (e.g. allow up to N failures before escalating). Out of scope for this course, but it's a one-line change.

## Try It

Run against a repo with failing tests:

```bash
pnpm review https://github.com/<repo-with-failing-tests>
```

Expected output:

```txt
Reviewing https://github.com/<...>...
Sandbox: sbx_7N2k4A...
Collected 2 file(s) for analysis.
Tests (pnpm): exit code 1

Overall risk: high
AI findings: 2
  [medium] Missing input validation (lib/auth.ts)
  [low] Inconsistent error handling (src/index.ts)
Test findings: 2
  [high] ✕ src/auth.test.ts > login rejects empty password
  [high] FAIL  src/auth.test.ts
```

Then a clean repo:

```bash
pnpm review https://github.com/<a-passing-repo>
```

```txt
Reviewing https://github.com/<...>...
Sandbox: sbx_7N2k4A...
Collected 2 file(s) for analysis.
Tests (pnpm): exit code 0

Overall risk: low
AI findings: 1
  [low] Consider extracting magic number (src/index.ts)
Test findings: 0
```

Both reports are useful. The first one tells you to look at tests before merging; the second one tells you you're in fine shape.

## Commit

```bash
git add src/cli.ts
git commit -m "feat(cli): merge ai findings with test failures in one report"
```

## Done-When

- [ ] CLI imports and calls `parseTestFailures`
- [ ] Combined report has separate `aiFindings` and `testFindings` sections
- [ ] `overallRisk` escalates to `'high'` when any test failed
- [ ] Clean run (no failures) preserves the AI's risk level

## Solution

```ts title="src/cli.ts (relevant section)"
const lifecycle = await runSandboxLifecycle(repoUrl);

const aiReview = lifecycle.files.length === 0
  ? { overallRisk: 'low' as const, findings: [] }
  : await analyzeRepository(lifecycle.files);

const testFindings = parseTestFailures(
  `${lifecycle.testResult.stdout}\n${lifecycle.testResult.stderr}`
);

const combined = {
  overallRisk: testFindings.length > 0 ? 'high' as const : aiReview.overallRisk,
  aiFindings: aiReview.findings,
  testFindings
};

console.log(`\nOverall risk: ${combined.overallRisk}`);
console.log(`AI findings: ${combined.aiFindings.length}`);
for (const finding of combined.aiFindings) {
  console.log(`  [${finding.severity}] ${finding.summary} (${finding.file})`);
}
console.log(`Test findings: ${combined.testFindings.length}`);
for (const finding of combined.testFindings) {
  console.log(`  [${finding.severity}] ${finding.details}`);
}
```


---

[Full course index](/academy/llms.txt) · [Sitemap](/academy/sitemap.md)
