---
title: "Resilient Error Handling"
description: "Right now, a failure anywhere in the pipeline aborts the whole review. In this lesson, we wrap each stage so a broken test runner doesn't hide the AI findings (and a failed AI call doesn't hide the test results)."
canonical_url: "https://vercel.com/academy/vercel-sandbox/resilient-error-handling"
md_url: "https://vercel.com/academy/vercel-sandbox/resilient-error-handling.md"
docset_id: "vercel-academy"
doc_version: "1.0"
last_updated: "2026-05-18T00:44:06.999Z"
content_type: "lesson"
course: "vercel-sandbox"
course_title: "Vercel Sandbox"
prerequisites:  []
---

<agent-instructions>
Vercel Academy — structured learning, not reference docs.
Lessons are sequenced.
Adapt commands to the human's actual environment (OS, package manager, shell, editor) — detect from project context or ask, don't assume.
The lesson shows one path; if the human's project diverges, adapt concepts to their setup.
Preserve the learning goal over literal steps.
Quizzes are pedagogical — engage, don't spoil.
Quiz answers are included for your reference.
</agent-instructions>

# Resilient Error Handling

# Don't Let One Broken Stage Kill the Whole Run

The current pipeline is all-or-nothing. If the AI call times out, you get no findings, no test results, and no idea what was wrong with the repo. If the test runner blows up, the AI review is wasted.

Production tools don't behave like that. A failed test step should still let the AI review come through, and vice versa. We want partial results.

## Outcome

Refactor the CLI's pipeline so the stages that can fail independently (AI analysis and test parsing) catch their own errors and report them as warnings in the output, instead of aborting the whole run. A lifecycle failure stays fatal.

## Fast Track

1. Add a `safe(label, fn, fallback)` helper that runs `fn` and returns `fallback` if it throws, logging a warning.
2. Wrap the AI analysis call in `safe`.
3. Keep the lifecycle outside the safe wrapper (a failed lifecycle is fatal; everything else builds on it).

## Hands-on exercise

Open `src/cli.ts`. We're adding one helper and changing how stages are wrapped:

```ts
import { Command } from 'commander';
import { runSandboxLifecycle } from './sandbox-lifecycle';
import { analyzeRepository } from './analyze';
import { parseTestFailures } from './test-runner';

function isValidGitHubRepoUrl(input: string): boolean {
  return /^https:\/\/github\.com\/[\w.-]+\/[\w.-]+\/?$/.test(input);
}

async function time<T>(label: string, fn: () => Promise<T>): Promise<T> {
  const startedAt = Date.now();
  try {
    return await fn();
  } finally {
    console.log(`  ⏱  ${label}: ${Date.now() - startedAt}ms`);
  }
}

async function safe<T>(label: string, fn: () => Promise<T>, fallback: T): Promise<T> {
  try {
    return await fn();
  } catch (error) {
    console.warn(
      `⚠  ${label} failed: ${error instanceof Error ? error.message : error}`
    );
    return fallback;
  }
}

const program = new Command();

program
  .name('repo-review')
  .description('Clone and review a GitHub repository in a Sandbox')
  .version('0.1.0');

program
  .command('review <repoUrl>')
  .description('Run a Sandbox review against a GitHub repository URL')
  .action(async (repoUrl: string) => {
    if (!isValidGitHubRepoUrl(repoUrl)) {
      console.error(`Invalid GitHub repository URL: ${repoUrl}`);
      console.error('Expected format: https://github.com/<owner>/<repo>');
      process.exitCode = 2;
      return;
    }

    console.log(`Reviewing ${repoUrl}...`);
    const totalStart = Date.now();

    try {
      const lifecycle = await time('sandbox lifecycle', () => runSandboxLifecycle(repoUrl));

      const aiReview = lifecycle.files.length === 0
        ? { overallRisk: 'low' as const, findings: [] }
        : await time('ai analysis', () =>
            safe(
              'ai analysis',
              () => analyzeRepository(lifecycle.files),
              { overallRisk: 'low' as const, findings: [] }
            )
          );

      // Await the wrapped call directly; a parser failure falls back to no findings.
      const testFindings = await safe(
        'test parsing',
        async () =>
          parseTestFailures(
            `${lifecycle.testResult.stdout}\n${lifecycle.testResult.stderr}`
          ),
        []
      );

      const combined = {
        overallRisk: testFindings.length > 0 ? 'high' as const : aiReview.overallRisk,
        aiFindings: aiReview.findings,
        testFindings
      };

      console.log(`\nOverall risk: ${combined.overallRisk}`);
      console.log(`AI findings: ${combined.aiFindings.length}`);
      for (const finding of combined.aiFindings) {
        console.log(`  [${finding.severity}] ${finding.summary} (${finding.file})`);
      }
      console.log(`Test findings: ${combined.testFindings.length}`);
      for (const finding of combined.testFindings) {
        console.log(`  [${finding.severity}] ${finding.details}`);
      }

      console.log(`\nTotal: ${Date.now() - totalStart}ms`);
    } catch (error) {
      console.error('Review failed:', error instanceof Error ? error.message : error);
      process.exitCode = 1;
    }
  });

program.parse();
```

The rule is: catch errors at stages that can fail independently, and let everything else propagate.

The lifecycle stays outside `safe` because if the Sandbox didn't boot or the clone failed, there's literally nothing to review. That's a real abort condition.

The AI analysis is the textbook `safe` candidate. It can time out, hit rate limits, or fail schema validation, and none of those are reasons to throw away a perfectly good test report.

Test parsing is wrapped too, even though `parseTestFailures` is pure and shouldn't throw. It's a defensive habit: if a future change adds I/O to the parser, the parser gains a failure mode the caller wasn't expecting, and the wrapper already covers it.
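A minimal sketch of why the wrapper works even for a synchronous parser. `brokenParser` is a hypothetical stand-in for a future `parseTestFailures` that throws; it is not part of the lesson's code. Because the callback is an `async` arrow function, a synchronous throw inside it becomes a rejected promise, which the `await` in `safe` catches.

```ts
// Same helper as in src/cli.ts.
async function safe<T>(label: string, fn: () => Promise<T>, fallback: T): Promise<T> {
  try {
    return await fn();
  } catch (error) {
    console.warn(`⚠  ${label} failed: ${error instanceof Error ? error.message : error}`);
    return fallback;
  }
}

// Hypothetical parser that throws synchronously (an assumption for illustration).
function brokenParser(output: string): string[] {
  throw new Error(`cannot parse ${output.length} bytes of test output`);
}

async function demoSyncThrow(): Promise<string[]> {
  // The async arrow turns the synchronous throw into a rejection,
  // so `safe` logs a warning and resolves to the empty fallback.
  return safe('test parsing', async () => brokenParser('FAIL src/a.test.ts'), []);
}
```

Calling `demoSyncThrow()` should resolve to an empty array rather than reject, which is the same behavior the CLI relies on for its test findings.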

We're not changing the exit code logic. If the lifecycle dies, we still exit 1. If the AI fails but the test parser succeeds, we exit 0 with a warning. The review is partial, but it ran.

**Warning: Troubleshooting: stage warnings still throw**

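If you see a stack trace instead of a `⚠` warning, the error is escaping `safe`. The most likely cause is a deferred throw in the wrapped function, such as `setImmediate(() => { throw ... })`. By the time that callback runs, the `try/catch` in `safe` has already exited, so it can't catch the error. Wrap the callback in a `Promise` and call `reject(...)` instead of throwing, so the failure reaches the `await` inside `safe`.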
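A minimal sketch of that conversion, assuming a hypothetical callback-style function named `legacyScan` (it does not exist in the lesson's code). The wrapper `scanAsync` turns the callback error into a promise rejection, which `safe` can then catch.

```ts
// Same helper as in src/cli.ts.
async function safe<T>(label: string, fn: () => Promise<T>, fallback: T): Promise<T> {
  try {
    return await fn();
  } catch (error) {
    console.warn(`⚠  ${label} failed: ${error instanceof Error ? error.message : error}`);
    return fallback;
  }
}

// Hypothetical callback-style API (an assumption for illustration). It reports
// failure through its callback on a later tick instead of throwing there.
function legacyScan(onDone: (error: Error | null, fileCount: number) => void): void {
  setTimeout(() => onDone(new Error('scan failed'), 0), 0);
}

// Promise wrapper: a callback error becomes a rejection instead of a deferred throw.
function scanAsync(): Promise<number> {
  return new Promise<number>((resolve, reject) => {
    legacyScan((error, fileCount) => {
      if (error) reject(error);
      else resolve(fileCount);
    });
  });
}

async function demoDeferredFailure(): Promise<number> {
  // The rejection reaches the `await` inside `safe`, so this resolves to -1.
  return safe('file scan', () => scanAsync(), -1);
}
```

The fallback of `-1` is only a placeholder for the sketch; in the CLI, pick a fallback that downstream code can treat as "no data".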

**Note: Troubleshooting: when to abort vs continue**

The rule of thumb: abort when the next stage literally cannot run without this one's output. Continue when the next stage can run with a fallback. The lifecycle has to abort; analysis and parsing don't.
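The rule can be sketched with a two-stage pipeline. The stage functions `cloneStage` and `analysisStage` are hypothetical stand-ins for the lifecycle and the AI call, not the lesson's real functions. The first stage has no fallback, so its error propagates; the second runs through `safe`, so its error degrades to an empty result.

```ts
type Review = { files: string[]; findings: string[] };

// Same helper as in src/cli.ts.
async function safe<T>(label: string, fn: () => Promise<T>, fallback: T): Promise<T> {
  try {
    return await fn();
  } catch (error) {
    console.warn(`⚠  ${label} failed: ${error instanceof Error ? error.message : error}`);
    return fallback;
  }
}

// Hypothetical fatal stage: nothing downstream can run without its output.
async function cloneStage(shouldFail: boolean): Promise<string[]> {
  if (shouldFail) throw new Error('clone failed');
  return ['src/index.ts'];
}

// Hypothetical optional stage that always fails, to show the fallback path.
async function analysisStage(files: string[]): Promise<string[]> {
  throw new Error(`rate limited while analyzing ${files.length} file(s)`);
}

async function runPipeline(cloneFails: boolean): Promise<Review> {
  // Abort: no wrapper, so a clone error rejects the whole pipeline.
  const files = await cloneStage(cloneFails);
  // Continue: a failed analysis becomes an empty findings list plus a warning.
  const findings = await safe('analysis', () => analysisStage(files), []);
  return { files, findings };
}
```

With `runPipeline(false)` the review completes with one file and no findings; with `runPipeline(true)` the promise rejects, which is where the CLI's outer `catch` would set exit code 1.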

## Try It

Force an AI failure to see partial results. Easiest way: temporarily set the model in `src/analyze.ts` to a model you don't have access to. Then run:

```bash
pnpm review https://github.com/<a-repo-with-passing-tests>
```

Expected output:

```txt
Reviewing https://github.com/<...>...
  ⏱  sandbox lifecycle: 9420ms
⚠  ai analysis failed: Model "openai/nonsense-model" not available
  ⏱  ai analysis: 1240ms

Overall risk: low
AI findings: 0
Test findings: 0

Total: 10660ms
```

Note three things:

- The pipeline didn't abort. Exit code is 0.
- AI findings are empty (fallback) and a warning explains why.
- The test result still came through.

Reset the model in `src/analyze.ts` back to a real one before moving on.

## Commit

```bash
git add src/cli.ts
git commit -m "feat(cli): per-stage error handling so one failure doesn't abort the run"
```

## Done-When

- [ ] `safe(label, fn, fallback)` helper returns the fallback on error and logs a warning
- [ ] AI analysis failures don't abort the pipeline
- [ ] Lifecycle failures still abort with exit code 1
- [ ] Partial results print with `⚠` warnings explaining what failed

## Solution

```ts title="src/cli.ts (helper)"
async function safe<T>(label: string, fn: () => Promise<T>, fallback: T): Promise<T> {
  try {
    return await fn();
  } catch (error) {
    console.warn(
      `⚠  ${label} failed: ${error instanceof Error ? error.message : error}`
    );
    return fallback;
  }
}
```


