Extraction - Your First AI Script
Now that you've learned some theory and set up your project, it's time to ship some code. You will build and run a script that extracts information from text using the AI SDK's generateText function, and see firsthand how tweaking your prompt or swapping models instantly changes your results.
How Your Script Works
Analyzing the Starter Script
Open your project code and find app/(1-extraction)/extraction.ts and essay.txt.
Update the contents of extraction.ts with this code, which extracts names from the essay:
import dotenvFlow from 'dotenv-flow';
dotenvFlow.config(); // Load environment variables (API keys, etc.)

import fs from 'fs';
import { generateText } from 'ai'; // AI SDK's core text generation function

// Read the essay file that we'll extract names from
const essay = fs.readFileSync('app/(1-extraction)/essay.txt', 'utf-8');

async function main() {
  // Call the LLM with our extraction prompt
  const result = await generateText({
    model: 'openai/gpt-4.1', // The model to use (could swap for openai/gpt-5, anthropic/claude-sonnet-4, etc.)
    prompt: `Extract all the names mentioned in this essay. List them separated by commas.

Essay:
${essay}`, // Instruction + the actual essay content
  });

  // The AI's response is in result.text
  console.log('\n--- AI Response ---');
  console.log(result.text); // Something like: "John Smith, Jane Doe, ..."
  console.log('-------------------');
}

// Run the async function and catch any errors
main().catch(console.error);
Run Your First AI Script!
From your terminal, run:
pnpm extraction
You'll see the AI extracting names from the essay. Your first feature works. Nice!
--- AI Response ---
Here are all the names mentioned in the essay, separated by commas:
Brian Chesky, Ron Conway, Steve Jobs, John Sculley
-------------------
Verification Task
Check app/(1-extraction)/essay.txt and use search (Cmd+F/Ctrl+F) to verify the names. Did the AI nail them all, or miss some?
Understanding Token Usage
LLMs process text as 'tokens' (roughly 4 characters each in English). Understanding tokens helps you optimize for speed and cost:
- Visualize tokenization at tiktokenizer.vercel.app
- Count tokens programmatically with the tiktoken package: pnpm add tiktoken
- Monitor usage to estimate costs and stay within context limits
Try pasting different prompts into Tiktokenizer to see surprising patterns (spaces matter!).
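Before you reach for a tokenizer library, the 4-characters-per-token rule of thumb already lets you ballpark prompt size and cost. The sketch below is a toy estimator, not part of the lesson's code: the chars-per-token ratio is a heuristic and the per-token price is a made-up placeholder (check your model's real pricing; use tiktoken when you need exact counts).

```typescript
// Rough token/cost estimator. CHARS_PER_TOKEN is a heuristic for
// English text; PRICE_PER_1M_INPUT_TOKENS is an illustrative
// placeholder, not real pricing.
const CHARS_PER_TOKEN = 4;
const PRICE_PER_1M_INPUT_TOKENS = 2.0; // hypothetical $/1M tokens

function approxTokens(text: string): number {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

function approxInputCostUSD(text: string): number {
  return (approxTokens(text) / 1_000_000) * PRICE_PER_1M_INPUT_TOKENS;
}

// Example: estimate before sending a large file to the model
const sampleText = 'a'.repeat(8_000); // stand-in for essay.txt contents
console.log(approxTokens(sampleText)); // → 2000
console.log(approxInputCostUSD(sampleText).toFixed(4)); // → "0.0040"
```

A quick estimate like this is enough to notice when a prompt is about to blow past a context limit or an expected cost budget.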
Iteration is Everything
Running the script once is just the start. Working with LLMs is all about iteration. Play with the prompt and see for yourself:
Challenge 1: Prompt Engineering – Change the Task
- Task: Swap the prompt to the following:
// Inside the prompt backticks:
What is the key takeaway of this piece in 50 words?
Essay:
${essay}
- Action: Save and re-run pnpm extraction
- Observe: See how one prompt change completely transforms what your app does
Challenge 2: Model Swapping – Upgrade the Brain
- Task: Keep the summary prompt but change the model using the following code block:
// Change this line:
model: 'openai/gpt-5',
- Action: Save and run again
- Observe: Compare results. Better quality? Worth the extra cost/time?
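When you start comparing models seriously, a tiny harness saves a lot of manual re-running. This is my own sketch, not lesson code: the generate function is injected so the harness itself is plain plumbing, and the commented-out wrapper shows how you might plug in the AI SDK's generateText.

```typescript
// Run the same prompt against several model ids and record latency.
// `generate` is injected so this harness works with any backend.
type GenerateFn = (model: string, prompt: string) => Promise<string>;

async function compareModels(
  models: string[],
  prompt: string,
  generate: GenerateFn,
): Promise<{ model: string; ms: number; text: string }[]> {
  const results: { model: string; ms: number; text: string }[] = [];
  for (const model of models) {
    const start = Date.now();
    const text = await generate(model, prompt); // one call per model id
    results.push({ model, ms: Date.now() - start, text });
  }
  return results;
}

// Hypothetical wiring to the AI SDK (requires an API key to run):
// const aiGenerate: GenerateFn = async (model, prompt) =>
//   (await generateText({ model, prompt })).text;
// await compareModels(['openai/gpt-4.1', 'openai/gpt-5'], summaryPrompt, aiGenerate);
```

Comparing outputs side by side with timings makes the quality-versus-cost question in the challenge much easier to answer.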
Model Selection Guide
Available Models via Vercel AI Gateway:
OpenAI:
- openai/gpt-5 - Most capable for complex reasoning
- openai/gpt-4.1 - Fast & cost-effective for most tasks (non-reasoning)
- openai/gpt-5-nano - Fastest for simple tasks
- openai/gpt-4.1-mini - Previous generation, still capable (non-reasoning)

Anthropic:
- anthropic/claude-sonnet-4 - Strong reasoning & analysis

Google:
- google/gemini-2.5-pro - Advanced multimodal capabilities
- google/gemini-2.5-flash - Fast responses, good balance
- google/gemini-2.5-flash-lite - Lightweight & quick
- google/gemini-2.0-flash - Previous flash version
See the Vercel AI Gateway models for pricing & details, or the OpenAI models documentation for OpenAI-specific info.
Simply swap the model string to experiment - the AI SDK handles all the provider differences for you!
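One way to make the power/speed/cost tradeoff concrete in code is a small model-selection helper. The tier names and the mapping below are my own illustrative policy, not an official recommendation; only the model id strings come from the guide above.

```typescript
type TaskTier = 'simple' | 'standard' | 'complex';

// Illustrative policy mapping a task tier to a gateway model id.
// Tune this mapping for your own quality/cost targets.
function pickModel(tier: TaskTier): string {
  switch (tier) {
    case 'simple':
      return 'openai/gpt-5-nano'; // fastest for simple tasks
    case 'standard':
      return 'openai/gpt-4.1'; // cost-effective default
    case 'complex':
      return 'openai/gpt-5'; // strongest reasoning
  }
}

console.log(pickModel('standard')); // → "openai/gpt-4.1"
```

Centralizing the choice in one function means upgrading a tier later is a one-line change instead of a hunt through every generateText call.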
🧩 Side Quests
- Extraction Expert
- Advanced Prompt Engineering
- Streaming Extraction Pipeline
Real-World Applications
This simple extraction pattern powers serious production features like:
- Content Moderation: Finding problematic content
- Research Tools: Pulling key data from papers
- Data Pipelines: Converting messy text to clean data
- Compliance Systems: Identifying PII/sensitive info
It's the same pattern: send content + instructions, process the response.
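All four features share that one call shape, so it's natural to factor it out. A minimal sketch (the helper name and layout are my own, not from the lesson):

```typescript
// Generic "instructions + content" prompt builder: the shape every
// extraction-style feature in the list above shares.
function buildExtractionPrompt(instruction: string, content: string): string {
  return `${instruction}\n\nContent:\n${content}`;
}

// With the AI SDK, the full pattern would be (hypothetical wiring,
// needs an API key to run):
// const { text } = await generateText({
//   model: 'openai/gpt-4.1',
//   prompt: buildExtractionPrompt('List any email addresses found.', doc),
// });

console.log(buildExtractionPrompt('Extract all names.', 'Alice met Bob.'));
```

Moderation, research, pipelines, and compliance then differ only in the instruction string you pass in.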
Key things to remember
- generateText = your basic AI workhorse
- The prompt = what guides the AI
- The model = power/speed/cost tradeoff
- Iteration = the key to success
Further Reading (Optional)
- AI SDK Documentation: Official documentation for the core function we used in this lesson. Explore all parameters and options available.
- Tiktokenizer: Interactive tokenization visualizer built with Next.js. See exactly how your text breaks down into tokens across different models. (Open source on GitHub)
- Prompt Engineering Guide: Explore advanced prompting techniques to further improve your AI interactions beyond the basics covered in this lesson.
- Vercel AI Gateway Model Library: Understand the capabilities, strengths, cost, and trade-offs of different models to make informed choices for your applications.
What surprised you most when changing prompts vs models? How does this hands-on experience change how you think about working with AI?
What's Next: Model Types and Performance
You've built your first AI script and experienced the power of prompt engineering. In the next lesson, you'll learn about different model types and their performance characteristics. Understanding when to use fast models vs reasoning models is crucial for building AI features that deliver the right user experience.
After that, you'll be ready for "invisible AI" - behind-the-scenes features that enhance your product's UX using the patterns you've learned here.