Weaviate Query Agent with Gemini API

This notebook will show you how to define the Weaviate Query Agent as a tool through the Gemini API.

Requirements

  1. Weaviate Cloud instance (WCD): The Weaviate Query Agent is only accessible through WCD at the moment. You can create a serverless cluster or a free 14-day sandbox here.
  2. Have a GCP project and Gemini API key (generate one here)
  3. Install the Google Gen AI SDK with npm install @google/genai
  4. Install the Weaviate JavaScript client and the agents package with npm install weaviate-client weaviate-agents
  5. You’ll need a Weaviate cluster with data. If you don’t have one, check out this notebook to import the Weaviate Blogs.

Setup

Install the Google GenAI SDK

Install the Google GenAI SDK from npm.

$ npm install @google/genai

Setup your API key

You can create your API key using Google AI Studio with a single click.

Remember to treat your API key like a password. Don’t accidentally save it in a notebook or source file you later commit to GitHub. In this notebook we will be storing the API key in a .env file. You can also set it as an environment variable or use a secret manager.

Here’s how to set it up in a .env file:

$ touch .env
$ echo "GEMINI_API_KEY=<YOUR_API_KEY>" >> .env
Tip

Another option is to set the API key as an environment variable. You can do this in your terminal with the following command:

$ export GEMINI_API_KEY="<YOUR_API_KEY>"

Load the API key

To load the API key from the .env file, we will use the dotenv package. This package loads environment variables from a .env file into process.env.

$ npm install dotenv

Then, we can load the API key in our code:

const dotenv = require("dotenv") as typeof import("dotenv");

dotenv.config({
  path: "../../.env",
});

const GEMINI_API_KEY = process.env.GEMINI_API_KEY ?? "";
if (!GEMINI_API_KEY) {
  throw new Error("GEMINI_API_KEY is not set in the environment variables");
}
console.log("GEMINI_API_KEY is set in the environment variables");
GEMINI_API_KEY is set in the environment variables
Note

In our particular case the .env file is two directories up from the notebook, hence we need to use ../../ to go up two directories. If the .env file is in the same directory as the notebook, you can omit the path option altogether.

│
├── .env
└── examples
    └── weaviate
        └── query-agent-as-a-tool.ipynb

Initialize SDK Client

With the new SDK, you now only need to initialize a client with your API key (or with OAuth if using Vertex AI). The model is now set in each call.

const google = require("@google/genai") as typeof import("@google/genai");

const ai = new google.GoogleGenAI({ apiKey: GEMINI_API_KEY });

Select a model

Now select the model you want to use in this guide, either by selecting one from the list or writing its name down. Keep in mind that some models, such as the 2.5 series, are thinking models and thus take slightly more time to respond (cf. the thinking notebook for more details, and in particular to learn how to switch thinking off).

const tslab = require("tslab") as typeof import("tslab");

const MODEL_ID = "gemini-2.5-flash-preview-05-20";

Define Query Agent function

import weaviate from "weaviate-client";
import { QueryAgent } from "weaviate-agents";

async function query_agent_request(query: string): Promise<string> {
  const client = await weaviate.connectToWeaviateCloud(process.env.WEAVIATE_URL!, {
    authCredentials: new weaviate.ApiKey(process.env.WEAVIATE_API_KEY!),
    headers: {}, // If your collection's vectorizer needs a model-provider API key, pass it here, e.g. `{ "X-Goog-Studio-Api-Key": GEMINI_API_KEY }`; not needed for this example
  });
  try {
    const agent = new QueryAgent(client, {
      collections: ["WeaviateBlogChunk"],
    });
    const response = await agent.run(query);
    return response.finalAnswer;
  } finally {
    await client.close(); // release the connection once the query completes
  }
}

Let’s test the Weaviate Query Agent function. This function will take a question as input and return the answer from the Weaviate cluster.

const answer = await query_agent_request(
  "Can you explain what the 'Weaviate retrieval plugin' does and how it was created?"
);
tslab.display.markdown(answer);

The Weaviate retrieval plugin is designed to connect ChatGPT to an instance of the Weaviate vector database. Its main functionality is to enable ChatGPT to query relevant documents stored in the vector database, effectively allowing it to “know” and use custom data for answering questions. Beyond querying, the plugin allows ChatGPT to upsert (store or update) documents for remembering information for later use, as well as delete documents to forget information. This integration serves as a form of long-term memory for ChatGPT, enabling it to persist knowledge beyond the short-lived memory of a chat session. This capability makes the vector database act as the cortex of ChatGPT, grounding its responses in the context of data stored externally.

Regarding its creation, the Weaviate retrieval plugin was developed primarily using Python with FastAPI as the server framework to run the plugin. The development process follows a typical plugin-building approach, involving setting up server endpoints that handle the plugin’s core functionalities: /query to retrieve documents, /upsert to add or update documents, and /delete to remove documents. The complete source code and development details, including implementation challenges and how they were overcome, are openly available on GitHub. The creation process highlights the plugin’s extendable design, which allows for customizing ChatGPT by integrating it with third-party vector databases like Weaviate to enhance its knowledge base.

In summary, the Weaviate retrieval plugin connects ChatGPT seamlessly with a vector database to provide a persistent, context-aware information layer, and it was built using familiar web development tools and practices, primarily Python and FastAPI, with openly shared resources to guide others in creating similar plugins.

Configure Tool

import { CallableTool, FunctionCall, FunctionDeclaration, Part, Tool, Type } from "@google/genai";

const query_agent_request_tool: FunctionDeclaration = {
  name: "query_agent_request",
  description: "A function that takes a question as input and returns the answer from the Weaviate cluster.",
  parameters: {
    type: Type.OBJECT,
    properties: {
      query: {
        type: Type.STRING,
        description: "The question to be answered by the Weaviate cluster.",
      },
    },
    required: ["query"],
  },
  response: {
    type: Type.STRING,
    description: "The answer from the Weaviate cluster.",
  },
};

const query_agent_request_callable: CallableTool = {
  callTool: async (functionCalls: FunctionCall[]): Promise<Part[]> => {
    console.log("Query Agent Function Calls:", JSON.stringify(functionCalls, null, 2));
    // Answer every requested call, not just the first one
    return Promise.all(
      functionCalls.map(async (fc) => {
        const { query } = fc.args as { query: string };
        return google.createPartFromFunctionResponse(fc.id ?? "", fc.name ?? "", {
          name: fc.name,
          result: await query_agent_request(query),
        });
      })
    );
  },
  tool: async (): Promise<Tool> => ({ functionDeclarations: [query_agent_request_tool] }),
};

Query Time

const chat = ai.chats.create({
  model: MODEL_ID,
  config: {
    tools: [query_agent_request_callable],
    toolConfig: {
      functionCallingConfig: {
        mode: google.FunctionCallingConfigMode.AUTO,
      },
    },
  },
});

// temporarily make console.warn a no-op to avoid warnings in the output (non-text part in GenerateContentResponse caused by accessing .text)
// https://github.com/googleapis/js-genai/blob/d82aba244bdb804b063ef8a983b2916c00b901d2/src/types.ts#L2005
// copy the original console.warn function to restore it later
const warn_fn = console.warn;
// eslint-disable-next-line @typescript-eslint/no-empty-function, no-empty-function
console.warn = function () {};

const chat_response = await chat.sendMessageStream({
  message:
    "You are connected to a database that has a blog post on deploying Weaviate on Docker. Can you answer how I can use Weaviate with Docker?",
});
let response_parts = "";
for await (const part of chat_response) {
  response_parts += part.text ?? "";
}
console.warn = warn_fn; // restore the original console.warn
tslab.display.markdown(response_parts);
Query Agent Function Calls: [
  {
    "name": "query_agent_request",
    "args": {
      "query": "How can I use Weaviate with Docker?"
    }
  }
]

Weaviate can be used with Docker in two primary ways:

1. Quick Start with Docker Run: You can quickly launch a Weaviate instance using a simple Docker command:

docker run -p 8080:8080 -p 50051:50051 cr.weaviate.io/semitechnologies/weaviate:1.24.1

This command exposes the necessary ports for API and gRPC communication.

2. Custom Setup with Docker Compose: For more customized deployments, you can configure a docker-compose.yml file. This allows you to define the Weaviate service and enable other modules.

Here’s an example of a docker-compose.yml to run Weaviate with OpenAI modules:

version: '3.4'
services:
  weaviate:
    image: semitechnologies/weaviate:1.17.3
    ports:
      - 8080:8080
    environment:
      QUERY_DEFAULTS_LIMIT: 25
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: 'true'
      PERSISTENCE_DATA_PATH: '/var/lib/weaviate'
      DEFAULT_VECTORIZER_MODULE: 'text2vec-openai'
      ENABLE_MODULES: 'text2vec-openai,generative-openai'
      OPENAI_APIKEY: sk-foobar # Replace with your actual OpenAI API key or provide at runtime
      CLUSTER_HOSTNAME: 'node1'
    command:
      - --host
      - 0.0.0.0
      - --port
      - '8080'
      - --scheme
      - http
    restart: on-failure:0

After creating or modifying your docker-compose.yml, navigate to the directory containing the file and run:

docker-compose up -d

This will start Weaviate with your specified configuration. Remember to replace sk-foobar with your actual OpenAI API key if you’re using OpenAI modules.

Advanced Setups: You can integrate additional modules (e.g., multi2vec-bind for multimodal data, or other vectorizers like Cohere or Hugging Face) by adjusting the ENABLE_MODULES environment variable and adding relevant services and environment variables to your docker-compose.yml. When integrating with other projects like Auto-GPT, you’ll need to configure Docker Compose for inter-container communication and update environment variables with your Weaviate instance URL and API keys.

Verification: To verify that Weaviate is running, you can access http://localhost:8080 in your browser.