Document search with embeddings

Overview

This example demonstrates how to use the Gemini API to create embeddings so that you can perform document search. You will use the JS client library to build text embeddings that let you compare search strings, or questions, to document contents.

In this tutorial, you’ll use embeddings to search a small set of documents and answer questions about the Google car.

Setup

Install the Google GenAI SDK

Install the Google GenAI SDK from npm.

$ npm install @google/genai

Set up your API key

You can create your API key using Google AI Studio with a single click.

Remember to treat your API key like a password. Don’t accidentally save it in a notebook or source file you later commit to GitHub. In this notebook we will be storing the API key in a .env file. You can also set it as an environment variable or use a secret manager.

Here’s how to set it up in a .env file:

$ touch .env
$ echo "GEMINI_API_KEY=<YOUR_API_KEY>" >> .env

Tip

Another option is to set the API key as an environment variable. You can do this in your terminal with the following command:

$ export GEMINI_API_KEY="<YOUR_API_KEY>"

Load the API key

To load the API key from the .env file, we will use the dotenv package. This package loads environment variables from a .env file into process.env.

$ npm install dotenv

Then, we can load the API key in our code:

const dotenv = require("dotenv") as typeof import("dotenv");

dotenv.config({
  path: "../.env",
});

const GEMINI_API_KEY = process.env.GEMINI_API_KEY ?? "";
if (!GEMINI_API_KEY) {
  throw new Error("GEMINI_API_KEY is not set in the environment variables");
}
console.log("GEMINI_API_KEY is set in the environment variables");

GEMINI_API_KEY is set in the environment variables

Note

In this case the .env file is one directory up from the notebook, so the path uses ../ to go up one directory. If the .env file is in the same directory as the notebook, you can omit the path option altogether.

│
├── .env
└── examples
    └── Talk_to_documents_with_embeddings.ipynb

Initialize SDK Client

With the new SDK, you only need to initialize a client with your API key (or OAuth if using Vertex AI). The model is now set in each call.

const google = require("@google/genai") as typeof import("@google/genai");

const ai = new google.GoogleGenAI({ apiKey: GEMINI_API_KEY });

Select a model

Now select the model you want to use in this guide, either by selecting one in the list or writing it down. Keep in mind that some models, like the 2.5 ones, are thinking models and thus take slightly more time to respond (see the thinking notebook for more details, in particular how to switch thinking off).

const tslab = require("tslab") as typeof import("tslab");

const MODEL_ID = "gemini-2.5-flash-preview-05-20";

Embedding generation

In this section, you will see how to generate embeddings for a piece of text using the embedding model from the Gemini API.

See the Embeddings quickstart to learn more about the taskType parameter used below.

const EMBEDDING_MODEL_ID = "gemini-embedding-001";

const embeddingResponse = await ai.models.embedContent({
  model: EMBEDDING_MODEL_ID,
  contents: `
    Title: The next generation of AI for developers and Google Workspace
    Full article:
    Gemini API & Google AI Studio: An approachable way to explore and
    prototype with generative AI applications
  `,
  config: {
    taskType: "RETRIEVAL_DOCUMENT",
    title: "The next generation of AI for developers and Google Workspace",
  },
});
console.log("Embedding response:", embeddingResponse.embeddings?.[0].values?.slice(0, 5), "...");

Embedding response: [ -0.01921668, 0.015680602, 0.0067214905, -0.057462018, 0.012769895 ] ...

Building an embeddings database

Here are three sample texts to use to build the embeddings database. You will use the Gemini API to create embeddings of each of the documents. Turn them into a dataframe for better visualization.

const DOCUMENT1 = {
  title: "Operating the Climate Control System",
  content:
    "Your Googlecar has a climate control system that allows you to adjust the temperature and airflow in the car. To operate the climate control system, use the buttons and knobs located on the center console.  Temperature: The temperature knob controls the temperature inside the car. Turn the knob clockwise to increase the temperature or counterclockwise to decrease the temperature. Airflow: The airflow knob controls the amount of airflow inside the car. Turn the knob clockwise to increase the airflow or counterclockwise to decrease the airflow. Fan speed: The fan speed knob controls the speed of the fan. Turn the knob clockwise to increase the fan speed or counterclockwise to decrease the fan speed. Mode: The mode button allows you to select the desired mode. The available modes are: Auto: The car will automatically adjust the temperature and airflow to maintain a comfortable level. Cool: The car will blow cool air into the car. Heat: The car will blow warm air into the car. Defrost: The car will blow warm air onto the windshield to defrost it.",
};
const DOCUMENT2 = {
  title: "Touchscreen",
  content:
    'Your Googlecar has a large touchscreen display that provides access to a variety of features, including navigation, entertainment, and climate control. To use the touchscreen display, simply touch the desired icon.  For example, you can touch the "Navigation" icon to get directions to your destination or touch the "Music" icon to play your favorite songs.',
};
const DOCUMENT3 = {
  title: "Shifting Gears",
  content:
    "Your Googlecar has an automatic transmission. To shift gears, simply move the shift lever to the desired position.  Park: This position is used when you are parked. The wheels are locked and the car cannot move. Reverse: This position is used to back up. Neutral: This position is used when you are stopped at a light or in traffic. The car is not in gear and will not move unless you press the gas pedal. Drive: This position is used to drive forward. Low: This position is used for driving in snow or other slippery conditions.",
};

const documents = [DOCUMENT1, DOCUMENT2, DOCUMENT3];

Organize the array of documents into a dataframe for better visualization.

const danfo = require("danfojs-node") as typeof import("danfojs-node");

const df = new danfo.DataFrame(documents);
df.print();

╔════════════╤═══════════════════╤═══════════════════╗
║            │ title             │ content           ║
╟────────────┼───────────────────┼───────────────────╢
║ 0          │ Operating the C…  │ Your Googlecar …  ║
╟────────────┼───────────────────┼───────────────────╢
║ 1          │ Touchscreen       │ Your Googlecar …  ║
╟────────────┼───────────────────┼───────────────────╢
║ 2          │ Shifting Gears    │ Your Googlecar …  ║
╚════════════╧═══════════════════╧═══════════════════╝

Get the embeddings for each of these bodies of text. Add this information to the dataframe.

async function embed(title: string, content: string) {
  const response = await ai.models.embedContent({
    model: EMBEDDING_MODEL_ID,
    contents: content,
    config: {
      taskType: "RETRIEVAL_DOCUMENT",
      title,
    },
  });
  return response.embeddings?.[0].values;
}

const embeddings = await Promise.all(documents.map((doc) => embed(doc.title, doc.content)));
df.addColumn("embedding", new danfo.Series(embeddings), { inplace: true });
df.print();

╔════════════╤═══════════════════╤═══════════════════╤═══════════════════╗
║            │ title             │ content           │ embedding         ║
╟────────────┼───────────────────┼───────────────────┼───────────────────╢
║ 0          │ Operating the C…  │ Your Googlecar …  │ 0.02483931,-0.0…  ║
╟────────────┼───────────────────┼───────────────────┼───────────────────╢
║ 1          │ Touchscreen       │ Your Googlecar …  │ 0.008149438,-0.…  ║
╟────────────┼───────────────────┼───────────────────┼───────────────────╢
║ 2          │ Shifting Gears    │ Your Googlecar …  │ 0.009464946,0.0…  ║
╚════════════╧═══════════════════╧═══════════════════╧═══════════════════╝

Document search with Q&A

Now that the embeddings are generated, let’s create a Q&A system to search these documents. You will ask a question about shifting gears in the Google car, create an embedding of the question, and compare it against the collection of embeddings in the dataframe.

The embedding of the question will be a vector (list of float values), which will be compared against the vector of the documents using the dot product. This vector returned from the API is already normalized. The dot product represents the similarity in direction between two vectors.
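Whether a vector is normalized can be checked by computing its L2 norm, which should be 1. The following is a minimal sketch of that check on a toy two-dimensional vector rather than a real embedding; the helper names `l2Norm` and `normalize` are hypothetical and are not part of the SDK.

```typescript
// L2 norm (Euclidean length) of a vector.
function l2Norm(v: number[]): number {
  return Math.sqrt(v.reduce((sum, x) => sum + x * x, 0));
}

// Scale a vector to unit length; the zero vector has no direction, so reject it.
function normalize(v: number[]): number[] {
  const norm = l2Norm(v);
  if (norm === 0) {
    throw new Error("Cannot normalize the zero vector");
  }
  return v.map((x) => x / norm);
}

// Toy vector for illustration only, not an embedding returned by the API.
const raw = [3, 4];
const unit = normalize(raw);

console.log(l2Norm(raw)); // 5
console.log(unit); // [ 0.6, 0.8 ]
console.log(Math.abs(l2Norm(unit) - 1) < 1e-12); // true
```

If a vector is not already unit length, normalizing it first makes its dot product with another unit vector equal to their cosine similarity.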

The values of the dot product can range between -1 and 1, inclusive. If the dot product between two vectors is 1, then the vectors are in the same direction. If the dot product value is 0, then these vectors are orthogonal, or unrelated, to each other. Lastly, if the dot product is -1, then the vectors point in the opposite direction and are not similar to each other.
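The three cases above can be reproduced with hand-picked unit vectors. This is only an illustrative sketch: the vectors are two-dimensional toys and the helper name `dot` is an assumption of this example.

```typescript
// Dot product of two vectors of equal length.
function dot(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  return a.reduce((sum, value, i) => sum + value * b[i], 0);
}

// Toy unit vectors, not real embeddings.
const east = [1, 0];
const north = [0, 1];
const west = [-1, 0];

console.log(dot(east, east)); // 1: same direction
console.log(dot(east, north)); // 0: orthogonal
console.log(dot(east, west)); // -1: opposite direction
```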

Note: with the new embeddings model (gemini-embedding-001), specify the task type as RETRIEVAL_QUERY when embedding a user query and RETRIEVAL_DOCUMENT when embedding document text.

Task Type	Description
RETRIEVAL_QUERY	Specifies the given text is a query in a search/retrieval setting.
RETRIEVAL_DOCUMENT	Specifies the given text is a document in a search/retrieval setting.
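One way to avoid mixing up the two task types is to derive the config from the role the text plays. The sketch below assumes the config shape used elsewhere in this notebook (a `taskType` and an optional `title`); the helper names `taskTypeFor` and `embedConfig` are hypothetical and do not come from the SDK.

```typescript
// The two task type names listed in the table above.
type RetrievalTaskType = "RETRIEVAL_QUERY" | "RETRIEVAL_DOCUMENT";
type Role = "query" | "document";

// Hypothetical helper: choose the task type from the role of the text.
function taskTypeFor(role: Role): RetrievalTaskType {
  return role === "query" ? "RETRIEVAL_QUERY" : "RETRIEVAL_DOCUMENT";
}

// Hypothetical helper: build a config object, attaching a title only to documents.
function embedConfig(role: Role, title?: string) {
  return role === "document" && title
    ? { taskType: taskTypeFor(role), title }
    : { taskType: taskTypeFor(role) };
}

console.log(embedConfig("query")); // { taskType: 'RETRIEVAL_QUERY' }
console.log(embedConfig("document", "Shifting Gears"));
// { taskType: 'RETRIEVAL_DOCUMENT', title: 'Shifting Gears' }
```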

// Embed the question as a retrieval query; the documents use the document task type.
const request = await ai.models.embedContent({
  model: EMBEDDING_MODEL_ID,
  contents: "How to shift gears in the Google car?",
  config: {
    taskType: "RETRIEVAL_QUERY",
  },
});

Use the findBestPassage function to calculate the dot product between the query and each document embedding, then return the passage with the largest dot product value as the most relevant one in the database.

/* eslint-disable @typescript-eslint/no-unsafe-call, @typescript-eslint/no-unsafe-member-access */

import { DataFrame } from "danfojs-node";

function dotProduct(a: number[], b: number[]): number {
  return a.reduce((sum, value, index) => sum + value * b[index], 0);
}

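async function findBestPassage(query: string, dataframe: DataFrame): Promise<string> {
  // Embed the question with the query task type, matching the table above.
  const queryEmbedding = await ai.models.embedContent({
    model: EMBEDDING_MODEL_ID,
    contents: query,
    config: {
      taskType: "RETRIEVAL_QUERY",
    },
  });
  const queryValues = queryEmbedding.embeddings?.[0]?.values;
  if (!queryValues) {
    throw new Error("The embedding response did not contain any values");
  }

  // This code treats each stored embedding as a comma-separated string and parses it into numbers.
  const dotProducts = dataframe.embedding.values.map((embedding: string) => {
    const embeddingArray = embedding.split(",").map(Number);
    return dotProduct(embeddingArray, queryValues);
  }) as number[];
  // Return the content of the passage with the highest dot product.
  const maxIndex = dotProducts.indexOf(Math.max(...dotProducts));
  return dataframe.content.values[maxIndex] as string;
}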
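findBestPassage returns only the single best match. When several candidates are needed, the same dot product scores can be used to sort every passage from most to least similar. The following is a sketch over plain arrays instead of the dataframe; the toy three-dimensional vectors stand in for real embeddings, and the names `dotScore` and `rankPassages` are hypothetical.

```typescript
interface Passage {
  content: string;
  embedding: number[];
}

interface ScoredPassage {
  content: string;
  score: number;
}

function dotScore(a: number[], b: number[]): number {
  return a.reduce((sum, value, i) => sum + value * b[i], 0);
}

// Score every passage against the query and sort from largest to smallest score.
function rankPassages(queryEmbedding: number[], passages: Passage[]): ScoredPassage[] {
  return passages
    .map((p) => ({ content: p.content, score: dotScore(queryEmbedding, p.embedding) }))
    .sort((a, b) => b.score - a.score);
}

// Toy data for illustration only; real embeddings have far more dimensions.
const toyPassages: Passage[] = [
  { content: "climate", embedding: [1, 0, 0] },
  { content: "touchscreen", embedding: [0, 1, 0] },
  { content: "gears", embedding: [0, 0, 1] },
];
const toyQuery = [0.1, 0.2, 0.9];

const ranked = rankPassages(toyQuery, toyPassages);
console.log(ranked.map((p) => p.content)); // [ 'gears', 'touchscreen', 'climate' ]
```

Taking `ranked.slice(0, k)` would give the top k passages, for example to include more than one passage in a prompt.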

View the most relevant document from the database:

tslab.display.markdown(await findBestPassage("How to shift gears in the Google car?", df));

Your Googlecar has an automatic transmission. To shift gears, simply move the shift lever to the desired position. Park: This position is used when you are parked. The wheels are locked and the car cannot move. Reverse: This position is used to back up. Neutral: This position is used when you are stopped at a light or in traffic. The car is not in gear and will not move unless you press the gas pedal. Drive: This position is used to drive forward. Low: This position is used for driving in snow or other slippery conditions.

Question answering application

Now use the text generation API to build a Q&A system. Input your own custom data below to create a simple question answering example. You will still use the dot product as the similarity metric.

function makePrompt(query: string, relevantPassage: string): string {
  const escaped = relevantPassage.replace(/'/g, "").replace(/"/g, "").replace(/\n/g, " ");
  const prompt = `
        You are a helpful and informative bot that answers questions using text
        from the reference passage included below. Be sure to respond in a
        complete sentence, being comprehensive, including all relevant
        background information.
    
        However, you are talking to a non-technical audience, so be sure to
        break down complicated concepts and strike a friendly and conversational
        tone. If the passage is irrelevant to the answer, you may ignore it.
    
        QUESTION: '${query}'
        PASSAGE: '${escaped}'
    
        ANSWER:
    `;
  return prompt;
}

const prompt = makePrompt(
  "How to shift gears in the Google car?",
  await findBestPassage("How to shift gears in the Google car?", df)
);
tslab.display.markdown(prompt);

    You are a helpful and informative bot that answers questions using text
    from the reference passage included below. Be sure to respond in a
    complete sentence, being comprehensive, including all relevant
    background information.

    However, you are talking to a non-technical audience, so be sure to
    break down complicated concepts and strike a friendly and conversational
    tone. If the passage is irrelevant to the answer, you may ignore it.

    QUESTION: 'How to shift gears in the Google car?'
    PASSAGE: 'Your Googlecar has an automatic transmission. To shift gears, simply move the shift lever to the desired position.  Park: This position is used when you are parked. The wheels are locked and the car cannot move. Reverse: This position is used to back up. Neutral: This position is used when you are stopped at a light or in traffic. The car is not in gear and will not move unless you press the gas pedal. Drive: This position is used to drive forward. Low: This position is used for driving in snow or other slippery conditions.'

    ANSWER:
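The quote and newline stripping inside makePrompt can be tried in isolation to see exactly what reaches the model. This sketch applies the same three replacements as makePrompt; the helper name `stripQuotesAndNewlines` and the sample string are hypothetical.

```typescript
// Same replacements as in makePrompt: drop single and double quotes, turn newlines into spaces.
function stripQuotesAndNewlines(text: string): string {
  return text.replace(/'/g, "").replace(/"/g, "").replace(/\n/g, " ");
}

// Hypothetical sample passage containing both quote styles and a line break.
const sample = 'Touch the "Music" icon.\nIt\'s easy.';

console.log(stripQuotesAndNewlines(sample)); // Touch the Music icon. Its easy.
```

Because apostrophes are removed as well, contractions such as "It's" become "Its" in the prompt. That keeps the passage from closing the surrounding quotes early, at the cost of slightly altered text.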

Pass the prompt to one of the Gemini content generation models, here the model selected earlier as MODEL_ID, to generate an answer to your query.

const answer = await ai.models.generateContent({
  model: MODEL_ID,
  contents: prompt,
});
tslab.display.markdown(answer.text ?? "");

Shifting gears in your Googlecar is quite simple because it has an automatic transmission, which means you don’t need to worry about a clutch! To change gears, you just move the shift lever to the position you need.

Here’s a breakdown of what each position does:

Park (P): You’ll use this when you’re done driving and parked. It locks the wheels, so your car won’t move at all.
Reverse (R): This position is for when you need to back up.
Neutral (N): If you’re stopped at a traffic light or in slow-moving traffic, you can put it in Neutral. In this position, the car isn’t actually in gear, so it won’t move unless you press the gas pedal (though you should still keep your foot on the brake when stopped!).
Drive (D): This is the position you’ll use for most of your driving, letting you move forward.
Low (L): This setting is handy for tricky situations like driving in snow or on other slippery surfaces, giving you a bit more control.

Next steps

Check out the embeddings quickstart to learn more, and browse the cookbook for more examples.