LLM Tools: From Chatbot to Real-World Agent (Part 1)
Learn how to give LLMs the ability to call functions and interact with APIs for real-world problem solving using TypeScript and type-safe tool integration.
Series Overview
This is part 1 of a four-part series on building production-ready AI agents:
- Part 1: Building Your First LLM Agent: From Chatbot to Tool-Using Assistant - This post
- Part 2: Scaling LLM Agents with MCP (Model Context Protocol)
- Part 3: Securing LLM Agents with Authentication
- Part 4: Real-time LLM Responses in Production
I hope you find the code in this series helpful! The complete implementation for this post can be found here and the final code for the project can be found here. Feel free to fork it and adapt it for your own projects.
Note: This project includes comprehensive testing with a carefully configured Jest/SWC setup for TypeScript monorepos. Testing LLM applications with MCP can be quite tricky, so if you fork this project, don’t ignore the valuable testing configuration—it includes solutions for common issues like workspace package mocking, module resolution, and proper test isolation.
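As a rough illustration of what that kind of setup involves (this is a sketch, not the repo's actual config, and the workspace scope name is a placeholder), a Jest/SWC configuration for a TypeScript monorepo typically looks something like this:
// jest.config.js (illustrative sketch only)
module.exports = {
  testEnvironment: "node",
  // Compile TypeScript with SWC instead of ts-jest for faster test runs
  transform: {
    "^.+\\.(t|j)sx?$": "@swc/jest",
  },
  // Map workspace packages to their source so tests run without a build step
  // ("@my-scope" is a placeholder for your monorepo's package scope)
  moduleNameMapper: {
    "^@my-scope/(.*)$": "<rootDir>/packages/$1/src",
  },
  // Reset mock state between tests for proper isolation
  clearMocks: true,
  resetMocks: true,
};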
Architecture Note: This series implements MCP protocol manually rather than using the official MCP TypeScript SDK for educational purposes and maximum control. You’ll learn exactly how MCP works under the hood, making debugging and customization easier. The patterns shown here can easily be adapted to use the official SDK if preferred.
The repository includes an AI.md file with comprehensive guidance for developers and LLMs who want to modify and extend this code as a starting point for their own projects. It covers architecture patterns, extension points, testing configuration, and production considerations.
You build an LLM-powered chatbot, and it’s great at conversation. It can write, summarize, and answer questions about almost anything. But then you realize: it’s stuck in a bubble.
Your LLM doesn’t know what’s in your database right now. It can’t check if that product is actually in stock. It can’t trigger any real actions in your application. All it can do is generate text based on what it learned during training.
So how do we fix this? How do we connect the LLM’s reasoning power to actual, live data and real-world actions?
The solution is Tool Use (some folks call it Function Calling). Instead of asking the LLM to just answer questions, we give it a toolkit. The LLM’s job becomes: “Figure out which tool to use, and how to use it, to get the information I need.”
This turns your chatbot into something much more powerful: an agent that can actually interact with your application. Let’s build one to see how it works.
The Problem: When Your AI Can’t Actually Help
Let’s say we’re building a real estate app with an AI assistant. A user asks:
“Find me active 3-bedroom houses in Portland under $800,000.”
Send this to an LLM, and you’ll get something like: “I don’t have access to current real estate listings, but here’s how you could search for properties…” Not exactly helpful.
The LLM has zero connection to your database. It can’t search listings, check what’s available, or give you actual data. It’s just generating text based on its training.
The Fix: Give Your LLM Some Tools
Here’s what we’re going to do: give our LLM access to tools that can actually query our real estate database. But we’re going to be smart about it.
We’ll build this in TypeScript so our LLM’s understanding of our tools stays in sync with our actual code. No more “schema drift” where your AI integration breaks because someone changed a database field.
First, let’s define our data types. These are simplified versions of what a real estate application might use:
📁 View complete file on GitHub
// src/tools/listings.types.ts
/**
* Defines the search criteria for finding property listings.
* The LLM will learn to populate this structure from natural language.
*/
export interface ListingFilters {
status?: "Active" | "Pending" | "Sold";
city?: string;
state?: string;
minBedrooms?: number;
maxPrice?: number;
}
/**
* Represents a single property listing.
* This is the data structure our tool will return.
*/
export interface Listing {
listingId: string;
address: {
street: string;
city: string;
state: string;
zip: string;
};
price: number;
bedrooms: number;
bathrooms: number;
status: "Active" | "Pending" | "Sold";
}
Now let’s implement the tools that use these types:
📁 View complete file on GitHub
// src/tools/listings.ts
import { Listing, ListingFilters } from "./listings.types";
import { mockListings } from "./mock.data"; // Assume we have mock data
/**
* A tool that finds property listings based on a set of filters.
* In a real application, this would query a database.
*/
export function findListings(filters: ListingFilters): Listing[] {
console.log("--- Calling 'findListings' tool with filters:", filters);
// Mock implementation: filter a static list of properties
// the real implementation would use the filters to create a database query
return mockListings.filter((listing) => {
return (
(!filters.status || listing.status === filters.status) &&
(!filters.city ||
listing.address.city.toLowerCase() === filters.city.toLowerCase()) &&
(!filters.state ||
listing.address.state.toLowerCase() === filters.state.toLowerCase()) &&
(!filters.minBedrooms || listing.bedrooms >= filters.minBedrooms) &&
(!filters.maxPrice || listing.price <= filters.maxPrice)
);
});
}
/**
* A tool that sends an email report of specified listings.
* In a real application, this would trigger an email service.
*/
export function sendListingReport(
listingIds: string[],
recipientEmail: string,
): { success: boolean; message: string } {
console.log(`--- Calling 'sendListingReport' tool ---`);
console.log(
`Emailing report for listings ${listingIds.join(", ")} to ${recipientEmail}`,
);
// Mock implementation
if (!recipientEmail.includes("@")) {
return { success: false, message: "Invalid email address provided." };
}
return {
success: true,
message: `Report sent successfully to ${recipientEmail}.`,
};
}
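The mock.data module itself isn't shown in the post; here's a minimal sketch of what it might contain so the code above runs on its own (the zip codes are made up, and the rest of the values mirror the example conversation later in this post):
// src/tools/mock.data.ts (hypothetical contents -- the repo has its own fixtures)
import { Listing } from "./listings.types";

export const mockListings: Listing[] = [
  {
    listingId: "L001",
    address: { street: "123 Oak Street", city: "Portland", state: "OR", zip: "97201" },
    price: 825000,
    bedrooms: 3,
    bathrooms: 2,
    status: "Active",
  },
  {
    listingId: "L002",
    address: { street: "456 Pine Avenue", city: "Portland", state: "OR", zip: "97202" },
    price: 799000,
    bedrooms: 4,
    bathrooms: 3,
    status: "Active",
  },
];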
The Magic: Turning Types Into Schemas
Here’s where it gets clever. LLMs need JSON schemas to understand how to use your tools. But writing these schemas by hand? That’s a recipe for bugs and maintenance headaches. Change your TypeScript interface, forget to update the schema, and boom—your AI integration breaks.
So let’s not do that. Instead, we’ll generate the JSON schema directly from our TypeScript types. One source of truth, no drift, no manual schema updates.
Here’s what the ListingFilters TypeScript interface becomes as a JSON schema:
📁 View schema generation on GitHub
// The TypeScript interface (what we write)
interface ListingFilters {
status?: "Active" | "Pending" | "Sold";
city?: string;
state?: string;
minBedrooms?: number;
maxPrice?: number;
}
// The generated JSON schema (what the LLM sees)
const listingFiltersSchema = {
type: "object",
properties: {
status: {
type: "string",
enum: ["Active", "Pending", "Sold"],
description: "The listing status to filter by",
},
city: {
type: "string",
description: "The city to search in",
},
state: {
type: "string",
description: "The state to search in",
},
minBedrooms: {
type: "number",
description: "Minimum number of bedrooms",
},
maxPrice: {
type: "number",
description: "Maximum price in dollars",
},
},
required: [],
additionalProperties: false,
};
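One detail worth noting: typescript-json-schema (introduced below) reads property descriptions from JSDoc comments, so to get description strings like the ones above into the generated schema you annotate the interface itself. The comment text here is illustrative:
// src/tools/listings.types.ts (with property-level JSDoc for schema descriptions)
export interface ListingFilters {
  /** The listing status to filter by */
  status?: "Active" | "Pending" | "Sold";
  /** The city to search in */
  city?: string;
  /** The state to search in */
  state?: string;
  /** Minimum number of bedrooms */
  minBedrooms?: number;
  /** Maximum price in dollars */
  maxPrice?: number;
}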
You don’t want to generate this manually every time. Libraries like typescript-json-schema handle this for you:
npm install typescript-json-schema
Then you add it to your build process:
📁 View complete package.json on GitHub
// package.json
{
"scripts": {
"build": "tsc && npm run generate-schemas",
"generate-schemas": "npm run generate-schemas:listings && npm run generate-schemas:tools",
"generate-schemas:listings": "typescript-json-schema src/tools/listings.types.ts ListingFilters --required false > src/schemas/listing-filters.json",
"generate-schemas:tools": "typescript-json-schema src/tools/listings.types.ts '*' --required false > src/schemas/all-tools.json",
"dev": "npm run generate-schemas && tsx watch src/index.ts"
}
}
Now your schemas regenerate automatically whenever you build or start the dev server. This ensures your LLM tools always match your TypeScript types.
Even better, you can generate schemas programmatically:
📁 View complete schema generator on GitHub
// scripts/generate-schemas.ts
import * as TJS from "typescript-json-schema";
import fs from "fs";
import path from "path";
// Configure the schema generator
const program = TJS.getProgramFromFiles(["src/tools/listings.types.ts"], {
strictNullChecks: true,
});
// Generate schema for ListingFilters
const schema = TJS.generateSchema(program, "ListingFilters", {
required: false,
ref: false,
});
// Save to file for use in other parts of the app
// Write into src/schemas so the import in src/tools/definitions.ts resolves
const schemasDir = path.join(__dirname, "../src/schemas");
if (!fs.existsSync(schemasDir)) {
fs.mkdirSync(schemasDir, { recursive: true });
}
fs.writeFileSync(
path.join(schemasDir, "listing-filters.schema.json"),
JSON.stringify(schema, null, 2),
);
console.log("✅ Generated schema for ListingFilters");
Then in your main application, you’d import and use these generated schemas:
📁 View complete tools config on GitHub
// src/tools/definitions.ts
import listingFiltersSchema from "../schemas/listing-filters.schema.json";
import { findListings, sendListingReport } from "./listings";
// Define your tools with auto-generated schemas
export const tools = [
{
type: "function",
function: {
name: "findListings",
description: "Find property listings based on filters",
parameters: listingFiltersSchema, // Type-safe schema!
},
},
{
type: "function",
function: {
name: "sendListingReport",
description: "Send email report of listings",
parameters: {
// This could also be auto-generated from a TypeScript type
type: "object",
properties: {
listingIds: { type: "array", items: { type: "string" } },
recipientEmail: { type: "string" },
},
required: ["listingIds", "recipientEmail"],
},
},
},
];
// Export for use in your LLM integration
export const toolHandlers = {
findListings,
sendListingReport,
};
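As the comment above hints, the sendListingReport parameters could be generated from a type as well. A sketch of what that might look like (the SendListingReportParams name is ours, not from the repo):
// Hypothetical addition to src/tools/listings.types.ts
/** Parameters accepted by the sendListingReport tool. */
export interface SendListingReportParams {
  /** IDs of the listings to include in the report */
  listingIds: string[];
  /** Email address the report should be sent to */
  recipientEmail: string;
}
Running the same generator against SendListingReportParams (with required: true, since neither field is optional) would produce a schema you could drop into the parameters field, just like listingFiltersSchema.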
Add this to your package.json to run the generator (this replaces the CLI-based generate-schemas script shown earlier):
{
"scripts": {
"generate-schemas": "tsx scripts/generate-schemas.ts"
}
}
That’s the beauty of this approach: your TypeScript types drive everything. Update a type, and the schema updates automatically. No manual sync required.
Connecting to LLMs with OpenRouter
Before we see tool integration in action, let’s talk about our LLM connection. We’ll use OpenRouter, which gives us access to multiple models through a single API.
Why OpenRouter? Instead of managing multiple API keys and SDKs for different providers, you get:
- One API for everything - Claude, GPT-4, Gemini, Kimi, and more through a single endpoint
- Automatic fallbacks if a model is unavailable or rate-limited
- Usage-based pricing across all models with transparent costs
- Simple REST API that works with any HTTP client - no vendor-specific SDKs required
We’re using a dual-model approach for optimal performance:
- Kimi K2 for tool use - it excels at understanding complex queries and selecting the right tools with proper parameters
- Gemini 2.0 Flash for conversational responses - it’s fast and cost-effective for generating natural, friendly responses to users
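To keep this dual-model setup in one place, you might centralize the OpenRouter settings in a small config module. This is just a sketch; the repo may organize it differently:
// src/config.ts (illustrative)
export const OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions";

// Kimi K2 picks and parameterizes tools; Gemini 2.0 Flash writes the replies
export const TOOL_MODEL = "moonshotai/kimi-k2";
export const CHAT_MODEL = "google/gemini-2.0-flash-exp:free";

export const OPENROUTER_HEADERS = {
  Authorization: `Bearer ${process.env.OPENROUTER_API_KEY}`,
  "HTTP-Referer": process.env.YOUR_SITE_URL || "http://localhost:3000",
  "X-Title": "Real Estate AI Agent",
  "Content-Type": "application/json",
};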
A Real-World Conversation: From Search to Action
Here’s how the complete flow works when tools are integrated directly into your application. Let’s walk through exactly what happens when a user makes a request:
Managing System Prompts
Before we see the conversation flow, let’s set up proper prompt management. Instead of hardcoding prompts, we’ll create specialized prompts for different purposes:
📁 View complete prompts file on GitHub
// src/agents/system.prompts.ts
export const TOOL_SELECTION_PROMPT = `
You are a professional real estate assistant.
When users ask about properties, use the findListings tool to search.
When users ask to send reports, use the sendListingReport tool.
Only use available tools - do not make up functions or code.
If you are asked to do something that is not in the tools, respond normally without using tools.
`;
export const RESPONSE_GENERATION_PROMPT = `
You are a professional real estate assistant.
Please be professional but friendly.
Always tell users the listings that you found.
If a user asks for a report, for each report that is sent, confirm to the user that the report has been sent.
Identify each report sent with this format: <street address> <city> <state>.
If you are asked to do something that is not in the tools, say you cannot do it.
Do not generate code or mention tool calls. Just provide a natural, conversational response based on the data.
`;
This separation ensures that:
- Kimi K2 gets focused instructions for tool selection
- Gemini 2.0 Flash gets instructions for natural conversation
- Both models follow consistent behavioral guidelines
Step 1: The Search
User: “Find active listings in Portland, OR with at least 3 bedrooms and under $850,000.”
Here’s how we send this to OpenRouter with our tool definitions:
📁 View complete agents service on GitHub
// src/agents/agents.service.ts
import axios from 'axios';
import { TOOL_SELECTION_PROMPT, RESPONSE_GENERATION_PROMPT } from './system.prompts';
import { tools } from '../tools/definitions';
async function handleUserQuery(userMessage: string) {
const response = await axios.post('https://openrouter.ai/api/v1/chat/completions', {
model: 'moonshotai/kimi-k2', // Kimi K2 for tool selection
messages: [
{
role: 'system',
content: TOOL_SELECTION_PROMPT
},
{
role: 'user',
content: userMessage
}
],
tools: tools, // Our generated tool schemas
tool_choice: 'auto'
}, {
headers: {
'Authorization': `Bearer ${process.env.OPENROUTER_API_KEY}`,
'HTTP-Referer': process.env.YOUR_SITE_URL || 'http://localhost:3000',
'X-Title': 'Real Estate AI Agent',
'Content-Type': 'application/json'
}
});
return response.data;
}
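Calling this helper and pulling the tool call out of the response looks roughly like this; OpenRouter follows the OpenAI chat-completions shape, so tool calls live on choices[0].message:
const data = await handleUserQuery(
  "Find active listings in Portland, OR with at least 3 bedrooms and under $850,000.",
);

// When the model decides to use a tool, the assistant message carries
// tool_calls instead of a plain text answer
const assistantMessage = data.choices[0].message;
const toolCall = assistantMessage.tool_calls?.[0];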
When we send this to Kimi K2 along with our tool definitions, here’s what happens internally:
Kimi K2’s thought process: “The user is asking to find properties. The findListings tool is perfect for this. I need to extract the filters: status='Active', city='Portland', state='OR', minBedrooms=3, and maxPrice=850000.”
The model responds not with an answer, but with a tool call instruction:
{
"tool_calls": [
{
"id": "call_abc123",
"type": "function",
"function": {
"name": "findListings",
"arguments": "{\"status\":\"Active\",\"city\":\"Portland\",\"state\":\"OR\",\"minBedrooms\":3,\"maxPrice\":850000}"
}
}
]
}
Step 2: The Execution
Our application parses this instruction and calls the actual findListings function:
📁 View tool execution logic on GitHub
const filters = JSON.parse(toolCall.function.arguments);
const results = findListings(filters);
// Returns: [
// { listingId: "L001", address: {...}, price: 825000, bedrooms: 3, ... },
// { listingId: "L002", address: {...}, price: 799000, bedrooms: 4, ... }
// ]
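The snippet above calls findListings directly for clarity. In a fuller implementation you'd route the call by tool name and guard against malformed arguments, since the arguments string from the model isn't guaranteed to be valid JSON. A sketch of that dispatcher (the error handling here is our assumption, not the repo's exact code):
import { findListings, sendListingReport } from "./tools/listings";

// Hypothetical dispatcher that routes a tool call to the matching function
function executeToolCall(name: string, rawArguments: string): unknown {
  let args: any;
  try {
    // The model returns arguments as a JSON string
    args = JSON.parse(rawArguments);
  } catch {
    return { error: "Tool arguments were not valid JSON." };
  }

  switch (name) {
    case "findListings":
      return findListings(args);
    case "sendListingReport":
      return sendListingReport(args.listingIds, args.recipientEmail);
    default:
      return { error: `Unknown tool: ${name}` };
  }
}

const toolOutput = executeToolCall(toolCall.function.name, toolCall.function.arguments);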
We feed these results back to the LLM as part of the conversation history.
Step 3: The Response
After executing the tool, we switch to Gemini 2.0 Flash for the conversational response:
📁 View response generation on GitHub
// Format the tool results for a friendly response
const toolResults = findListings(filters); // Returns the 2 matching properties
// Use Gemini for natural conversation with clean context
const chatResponse = await axios.post('https://openrouter.ai/api/v1/chat/completions', {
model: 'google/gemini-2.0-flash-exp:free', // Gemini 2.0 Flash for conversation
messages: [
{
role: 'system',
content: RESPONSE_GENERATION_PROMPT
},
{
role: 'user',
content: userMessage
},
{
role: 'assistant',
content: `I found ${toolResults.length} properties matching your criteria. Here are the results: ${JSON.stringify(toolResults)}`
}
]
}, {
headers: {
'Authorization': `Bearer ${process.env.OPENROUTER_API_KEY}`,
'HTTP-Referer': process.env.YOUR_SITE_URL || 'http://localhost:3000',
'X-Title': 'Real Estate AI Agent',
'Content-Type': 'application/json'
}
});
Gemini to User: “I found 2 properties matching your criteria:
- 123 Oak Street - 3 bedrooms, 2 bathrooms - $825,000
- 456 Pine Avenue - 4 bedrooms, 3 bathrooms - $799,000
Would you like more details about either property or would you like me to do something else?”
Step 4: The Action
User: “Great, please email a report of these two to my client at jane.doe@example.com.”
We send this back to Kimi K2 for tool selection:
Kimi K2’s thought process: “The user wants to email a report. The sendListingReport tool is designed for this. I need the listingIds from the previous results and the recipientEmail from the user’s request.”
Kimi K2 generates another tool call:
{
"tool_calls": [
{
"id": "call_def456",
"type": "function",
"function": {
"name": "sendListingReport",
"arguments": "{\"listingIds\":[\"L001\",\"L002\"],\"recipientEmail\":\"jane.doe@example.com\"}"
}
}
]
}
Our application executes sendListingReport and returns the success message. We then use Gemini 2.0 Flash again for the friendly confirmation: “I’ve successfully sent a report of both properties to jane.doe@example.com.”
This dual-model approach gives us the best of both worlds:
- Kimi K2 handles the complex reasoning needed for tool selection and parameter extraction
- Gemini 2.0 Flash provides fast, natural conversational responses
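Putting the steps together, the whole request flow fits in one orchestration function. Here's a hedged end-to-end sketch: the function name runAgent and the exact message shapes are ours, and it reuses the executeToolCall dispatcher sketched earlier rather than the repo's exact agents service:
// src/agents/run-agent.ts (illustrative sketch of the full flow)
import axios from "axios";
import { TOOL_SELECTION_PROMPT, RESPONSE_GENERATION_PROMPT } from "./system.prompts";
import { tools } from "../tools/definitions";

const OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions";
const headers = {
  Authorization: `Bearer ${process.env.OPENROUTER_API_KEY}`,
  "Content-Type": "application/json",
};

export async function runAgent(userMessage: string): Promise<string> {
  // 1. Ask Kimi K2 which tool (if any) to call
  const toolResponse = await axios.post(OPENROUTER_URL, {
    model: "moonshotai/kimi-k2",
    messages: [
      { role: "system", content: TOOL_SELECTION_PROMPT },
      { role: "user", content: userMessage },
    ],
    tools,
    tool_choice: "auto",
  }, { headers });

  const message = toolResponse.data.choices[0].message;
  const toolCall = message.tool_calls?.[0];
  if (!toolCall) {
    // No tool needed -- return the model's text as-is
    return message.content;
  }

  // 2. Execute the tool locally (executeToolCall is the dispatcher sketched above)
  const toolOutput = executeToolCall(toolCall.function.name, toolCall.function.arguments);

  // 3. Hand the raw result to Gemini 2.0 Flash for a friendly reply
  const chatResponse = await axios.post(OPENROUTER_URL, {
    model: "google/gemini-2.0-flash-exp:free",
    messages: [
      { role: "system", content: RESPONSE_GENERATION_PROMPT },
      { role: "user", content: userMessage },
      { role: "assistant", content: `Tool results: ${JSON.stringify(toolOutput)}` },
    ],
  }, { headers });

  return chatResponse.data.choices[0].message.content;
}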
What’s Next: From Foundation to Production
Our real estate assistant works great for a single application. But what happens when success brings new challenges? What if the marketing team wants their own AI tool that searches listings for market analysis? What if different teams are building different tools that all need access to the same listing data?
This is where you’ll need to think about scaling your tool architecture. In Part 2 of this series, we’ll explore how to transform these direct tools into reusable services using the Model Context Protocol (MCP), allowing multiple applications to share the same tool logic while maintaining clean separation of concerns.
Once you have a scalable architecture, Part 3 covers adding authentication and security to protect your AI endpoints when shipping to real users.
Finally, Part 4 transforms the user experience with real-time streaming responses, making your AI feel responsive and production-ready rather than leaving users waiting in silence.
Conclusion
By giving your LLM tools through direct function calling, you transform it from a conversationalist into an agent capable of real-world interaction. The type-driven approach we’ve covered ensures your tools stay in sync with your application code, creating a robust bridge between the LLM’s reasoning capabilities and your application’s functionality.
This direct integration approach is perfect for getting started and works well for single applications. As your AI ecosystem grows, you’ll want to consider the scaling patterns we cover in Part 2, the security measures in Part 3, and the real-time user experience improvements in Part 4.
Next up: Part 2: Scaling LLM Agents with MCP (Model Context Protocol) - Transform your direct tools into reusable microservices that multiple applications can share.