A template for building your own custom ChatGPT-style doc search powered by Next.js, OpenAI, and Supabase.
This starter takes all the `.mdx` files in the `pages` directory and processes them to use as custom context within OpenAI Text Completion prompts.
Deploy this starter to Vercel. The Supabase integration will automatically set the required environment variables and configure your database schema. All you have to do is set your `OPENAI_KEY` and you're ready to go!
Building your own custom ChatGPT involves four steps:

1. [Build time] Pre-process the knowledge base (your `.mdx` files in your `pages` folder).
2. [Build time] Create and store the embeddings in Postgres with pgvector.
3. [Runtime] Perform a vector similarity search to find the content relevant to the question.
4. [Runtime] Inject the relevant content into the OpenAI text completion prompt and stream the response to the client.

Steps 1 and 2 happen at build time, e.g. when Vercel builds your Next.js app. During this time the `generate-embeddings` script is executed, which performs the following tasks:
```mermaid
sequenceDiagram
    participant Vercel
    participant DB (pgvector)
    participant OpenAI (API)
    loop 1. Pre-process the knowledge base
        Vercel->>Vercel: Chunk .mdx pages into sections
        loop 2. Create & store embeddings
            Vercel->>OpenAI (API): create embedding for page section
            OpenAI (API)->>Vercel: embedding vector(1536)
            Vercel->>DB (pgvector): store embedding for page section
        end
    end
```
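The chunking step can be sketched as follows. This is a minimal illustration that splits on markdown headings; the `chunkIntoSections` helper is hypothetical, and the actual `generate-embeddings` script parses MDX more thoroughly:

```typescript
// Hypothetical sketch: split an .mdx document into sections at each heading.
// The real script uses a proper MDX parser; this only shows the general idea.
export function chunkIntoSections(mdx: string): { heading: string; content: string }[] {
  const sections: { heading: string; content: string }[] = []
  let current = { heading: '', content: '' }
  for (const line of mdx.split('\n')) {
    if (/^#{1,6}\s/.test(line)) {
      // A new heading starts a new section; flush the previous one if non-empty.
      if (current.heading || current.content.trim()) sections.push(current)
      current = { heading: line.replace(/^#{1,6}\s/, ''), content: '' }
    } else {
      current.content += line + '\n'
    }
  }
  if (current.heading || current.content.trim()) sections.push(current)
  return sections
}
```

Each resulting section is then small enough to embed individually, which also makes the later similarity search more precise.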
In addition to storing the embeddings, this script generates a checksum for each of your `.mdx` files and stores this in another database table to make sure the embeddings are only regenerated when the file has changed.
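The checksum comparison amounts to something like the sketch below, using Node's built-in `crypto` module. The function names are illustrative assumptions, not the template's actual code:

```typescript
import { createHash } from 'node:crypto'

// Hash the file contents; the digest is stored in a database table
// alongside the page path.
export function checksum(contents: string): string {
  return createHash('sha256').update(contents).digest('hex')
}

// Re-embed only when the stored checksum is missing or no longer matches,
// i.e. the file is new or its contents changed since the last build.
export function needsRegeneration(contents: string, storedChecksum: string | null): boolean {
  return storedChecksum !== checksum(contents)
}
```

This keeps build times (and OpenAI API costs) proportional to what actually changed rather than to the size of the whole knowledge base.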
Steps 3 and 4 happen at runtime, whenever the user submits a question. When this happens, the following sequence of tasks is performed:
```mermaid
sequenceDiagram
    participant Client
    participant Edge Function
    participant DB (pgvector)
    participant OpenAI (API)
    Client->>Edge Function: { query: lorem ipsum }
    critical 3. Perform vector similarity search
        Edge Function->>OpenAI (API): create embedding for query
        OpenAI (API)->>Edge Function: embedding vector(1536)
        Edge Function->>DB (pgvector): vector similarity search
        DB (pgvector)->>Edge Function: relevant docs content
    end
    critical 4. Inject content into prompt
        Edge Function->>OpenAI (API): completion request prompt: query + relevant docs content
        OpenAI (API)-->>Client: text/event-stream: completions response
    end
```
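The similarity search in step 3 ranks stored embeddings by how close they are to the query embedding. pgvector computes this inside Postgres; the plain TypeScript version below is only a sketch of the underlying metric:

```typescript
// Cosine similarity between two embedding vectors:
// 1 means same direction (very similar), 0 means orthogonal (unrelated).
export function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('dimension mismatch')
  let dot = 0, normA = 0, normB = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    normA += a[i] * a[i]
    normB += b[i] * b[i]
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB))
}

// Rank stored sections by similarity to the query embedding and keep the top k.
export function topK(
  query: number[],
  sections: { content: string; embedding: number[] }[],
  k: number
): { content: string; similarity: number }[] {
  return sections
    .map((s) => ({ content: s.content, similarity: cosineSimilarity(query, s.embedding) }))
    .sort((x, y) => y.similarity - x.similarity)
    .slice(0, k)
}
```

In practice the ranking happens in a single SQL query against the pgvector index, so only the top matches ever leave the database.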
The relevant files for this are the `SearchDialog` (Client) component and the `vector-search` (Edge Function).
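Step 4's prompt assembly can be sketched as below. The exact prompt wording and the token budgeting in the template differ; this hypothetical `buildPrompt` helper only illustrates the injection idea:

```typescript
// Sketch: concatenate the most relevant sections into a context block, then
// wrap the user's question in a completion prompt. Token counting is omitted;
// the real Edge Function trims the context to fit the model's limit.
export function buildPrompt(query: string, contextSections: string[]): string {
  const context = contextSections.join('\n---\n')
  return [
    'Answer the question using only the context below.',
    '',
    'Context:',
    context,
    '',
    `Question: ${query}`,
    'Answer:',
  ].join('\n')
}
```

Grounding the model in retrieved context this way is what lets the completion answer from your docs instead of from its general training data.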
The initialization of the database, including the setup of the `pgvector` extension, is stored in the `supabase/migrations` folder, which is automatically applied to your local Postgres instance when running `supabase start`.
1. Copy the example environment file:

   ```bash
   cp .env.example .env
   ```

2. Set your `OPENAI_KEY` in the newly created `.env` file.
3. Set `NEXT_PUBLIC_SUPABASE_ANON_KEY` and `SUPABASE_SERVICE_ROLE_KEY`.
   Note: You have to run Supabase locally to retrieve the keys.
Make sure you have Docker installed and running locally. Then run:

```bash
supabase start
```

To retrieve `NEXT_PUBLIC_SUPABASE_ANON_KEY` and `SUPABASE_SERVICE_ROLE_KEY`, run:

```bash
supabase status
```
In a new terminal window, run:

```bash
pnpm dev
```
1. By default, your documentation needs to be in `.mdx` format. This can be done by renaming existing (or compatible) markdown `.md` files.
2. Run `pnpm run embeddings` to regenerate the embeddings.
   Note: Make sure Supabase is running. To check, run `supabase status`. If it is not running, run `supabase start`.
3. Run `pnpm dev` again to refresh the Next.js page rendered at localhost:3000.

Apache 2.0