LLM Tools: Securing Your MCP Architecture with Authentication (Part 3)

Transform your MCP-based AI agent from an open system into a secure, user-authenticated application ready for production deployment.



Series Overview

This is Part 3 of a four-part series on building production-ready AI agents.

I hope you find the code in this series helpful! The complete implementation for this post can be found here and the final code for the project can be found here. Feel free to fork it and adapt it for your own projects.

Note: This project includes comprehensive testing with a carefully configured Jest/SWC setup for TypeScript monorepos. Testing LLM applications with MCP can be quite tricky, so if you fork this project, don’t ignore the valuable testing configuration—it includes solutions for common issues like workspace package mocking, module resolution, and proper test isolation.

Architecture Note: This series implements the MCP protocol manually rather than using the official MCP TypeScript SDK, both for educational purposes and for maximum control. You’ll learn exactly how MCP works under the hood, making debugging and customization easier. The patterns shown here can easily be adapted to use the official SDK if preferred.

The repository includes an AI.md file with comprehensive guidance for developers and LLMs who want to modify and extend this code as a starting point for their own projects. It covers architecture patterns, extension points, testing configuration, and production considerations.


You’ve built a sophisticated AI real estate agent using MCP microservices. The architecture is clean, the tools work beautifully, and the chat interface feels polished. But there’s a fundamental problem: anyone can access it.

Your agent handles potentially sensitive real estate queries, generates detailed market reports, and could be making API calls that cost money with every conversation. Without authentication, you’re running an open system that’s vulnerable to abuse, offers no personalization, and provides no way to track usage or implement rate limiting.

Moving from prototype to production means securing your application. Users need accounts, conversations need to be private, and you need control over who can access your expensive LLM-powered features. This is where authentication transforms your MCP architecture from a demo into a production-ready application.

Why Authentication Matters for LLM Applications

Authentication isn’t just about security; it’s about building sustainable AI applications. Every conversation with your agent potentially triggers multiple API calls to OpenRouter, searches through your property database, and generates detailed reports. Without user accounts, you have no way to:

  • Control costs by implementing usage limits per user
  • Personalize responses based on user preferences and history
  • Debug issues when users report problems with specific conversations
  • Scale responsibly by understanding usage patterns and bottlenecks
  • Meet compliance requirements for handling user data and conversations

LLM applications are inherently expensive to run. A single complex real estate query might cost several cents in API calls, and those costs add up quickly when your application is publicly accessible. User authentication gives you the foundation to build a sustainable business model around your AI agent.

Our Authentication Strategy

We’ll implement a comprehensive authentication system that includes:

User Registration and Login using secure password hashing and session management. We’ll use Passport.js for proven authentication patterns that work seamlessly with both server-side rendering and API endpoints.

JWT Token Protection for our chat API endpoints, ensuring that only authenticated users can access the expensive LLM-powered features. The tokens will be short-lived and include user context that our agents can use for personalization.

Persistent Frontend Sessions that keep users logged in across page loads. The web interface stores the JWT client-side, attaches it to every API request, and redirects to the login page when the session expires.

MCP Server Security by adding authentication headers to tool calls, ensuring that even our microservices verify user identity before executing potentially sensitive operations.

This multi-layered approach protects both your application and your users while maintaining the clean MCP architecture we built in Part 2.

Setting Up Passport.js Authentication

We’ll use Passport.js with JWT tokens for our authentication system. Passport.js is the most widely used authentication library for Node.js applications and integrates seamlessly with NestJS.

Our authentication strategy combines several key components:

SQLite Database for user storage with a simple users table containing id, email, password_hash, name, and role fields. SQLite is perfect for development and small-scale production deployments, though you’d typically use PostgreSQL or MySQL for larger applications.
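
Where the service gets that database is worth a quick look. Here is a minimal sketch of what getDatabase() might do, assuming better-sqlite3 (the synchronous prepare/get calls in the service below suggest it), with the schema matching the fields listed above:

import Database from 'better-sqlite3';

let db: Database.Database | undefined;

export function getDatabase() {
  if (!db) {
    db = new Database('auth.db');
    // Illustrative schema; the repo's actual migration may differ
    db.exec(`
      CREATE TABLE IF NOT EXISTS users (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        email TEXT UNIQUE NOT NULL,
        password_hash TEXT NOT NULL,
        name TEXT NOT NULL,
        role TEXT NOT NULL DEFAULT 'user',
        created_at TEXT DEFAULT CURRENT_TIMESTAMP
      )
    `);
  }
  return db;
}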

Secure Password Hashing using bcrypt with 12 salt rounds. This provides strong protection against rainbow table attacks and ensures that even if your database is compromised, passwords remain secure.

JWT Token Management for stateless authentication that works seamlessly with both web and API endpoints. Tokens include user ID, email, and role information for quick authorization decisions.

The core authentication service handles the essential operations:

📁 View complete auth service on GitHub

import { Inject, Injectable } from '@nestjs/common';
import { JwtService } from '@nestjs/jwt';
import * as bcrypt from 'bcrypt';

export const DATABASE_TOKEN = 'DATABASE_CONNECTION';

@Injectable()
export class AuthService {
  constructor(
    private jwtService: JwtService,
    @Inject(DATABASE_TOKEN) private db: any
  ) {}

  async validateUser(email: string, password: string): Promise<User | null> {
    const stmt = this.db.prepare(`
      SELECT id, email, password_hash, name, role, created_at
      FROM users WHERE email = ?
    `);
    
    const user = stmt.get(email);
    if (!user) return null;
    
    const isValid = await bcrypt.compare(password, user.password_hash);
    return isValid ? user : null;
  }

  async login(user: User): Promise<LoginResult> {
    const payload = { sub: user.id, email: user.email, role: user.role };
    const access_token = this.jwtService.sign(payload);
    return { user, access_token };
  }
}

The service provides user validation for login, secure user creation with password hashing, and JWT token generation. Database operations use prepared statements for both security and performance. Notice how we inject the database as a dependency using a custom token; this pattern makes the service fully testable by allowing us to inject mock databases during testing.
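
The registration path isn’t excerpted above. A minimal sketch of what createUser might look like under the same assumptions (better-sqlite3-style API, bcrypt with the 12 salt rounds described earlier):

async createUser(email: string, password: string, name: string): Promise<User> {
  // 12 salt rounds, matching the cost factor described above
  const password_hash = await bcrypt.hash(password, 12);

  const stmt = this.db.prepare(`
    INSERT INTO users (email, password_hash, name, role)
    VALUES (?, ?, ?, 'user')
  `);
  const info = stmt.run(email, password_hash, name);

  return this.findUserById(info.lastInsertRowid as number);
}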

The authentication module configures these providers:

📁 View auth module on GitHub

import { Module } from '@nestjs/common';
import { JwtModule } from '@nestjs/jwt';

@Module({
  imports: [
    // Secret and expiry shown here are illustrative; match your deployment's config
    JwtModule.register({
      secret: process.env.JWT_SECRET,
      signOptions: { expiresIn: '1h' },
    }),
  ],
  providers: [
    {
      provide: DATABASE_TOKEN,
      useFactory: () => getDatabase(),
    },
    AuthService,
    LocalStrategy,
    JwtStrategy,
  ],
})
export class AuthModule {}

Implementing Passport.js Strategies

Passport.js integrates authentication into our NestJS request pipeline through two key strategies:

JWT Strategy for API endpoint protection. This strategy extracts Bearer tokens from Authorization headers and validates them by looking up the user in our database. The token payload contains user ID, email, and role information.

Local Strategy for form-based login flows. This handles traditional email/password authentication and validates credentials using our authentication service.

Both strategies follow the same pattern: extract credentials, validate them, and return user context for the request pipeline. The JWT strategy is particularly important because it secures our expensive chat API endpoints.

📁 View JWT strategy on GitHub

@Injectable()
export class JwtStrategy extends PassportStrategy(Strategy) {
  constructor(private authService: AuthService) {
    // Pull the Bearer token from the Authorization header and verify its signature
    super({
      jwtFromRequest: ExtractJwt.fromAuthHeaderAsBearerToken(),
      secretOrKey: process.env.JWT_SECRET,
    });
  }

  async validate(payload: JwtPayload) {
    const user = await this.authService.findUserById(payload.sub);
    if (!user) throw new UnauthorizedException('Invalid token');
    return user;
  }
}
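
The local strategy (not excerpted here) follows the same shape. A minimal sketch, assuming the standard passport-local integration with email as the username field:

import { Strategy } from 'passport-local';

@Injectable()
export class LocalStrategy extends PassportStrategy(Strategy) {
  constructor(private authService: AuthService) {
    // Our login form posts an email, not a username
    super({ usernameField: 'email' });
  }

  async validate(email: string, password: string) {
    const user = await this.authService.validateUser(email, password);
    if (!user) throw new UnauthorizedException('Invalid credentials');
    return user;
  }
}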

Protecting the Chat API

Our chat endpoint is where users interact with the expensive LLM agent, making it the most critical endpoint to secure. We’ll require JWT authentication and pass user context to our agents for personalization and audit logging.

The protection pattern is straightforward:

📁 View agents controller on GitHub

import { Body, Controller, Post, Request, UseGuards } from '@nestjs/common';

@Controller('agents')
@UseGuards(JwtAuthGuard)
export class AgentsController {
  @Post('chat')
  async chat(@Body() chatRequest: ChatRequestDto, @Request() req) {
    const user = req.user; // Automatically extracted from JWT
    
    const enrichedRequest = {
      ...chatRequest,
      user: { id: user.id, email: user.email, name: user.name }
    };

    return this.agentsService.processChat(enrichedRequest);
  }
}

The @UseGuards(JwtAuthGuard) decorator ensures only authenticated users can access the endpoints in the AgentsController class. User information is automatically extracted from the JWT token and passed to our agents service. This enables personalized responses and provides audit trails for debugging and cost tracking.

The JWT guard (full implementation) provides clean error handling and can be extended with role-based access control if your application needs different permission levels.
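
In its simplest form, that guard is just a thin wrapper that delegates to the 'jwt' strategy registered earlier; the linked implementation layers its cleaner error handling on top of this:

import { Injectable } from '@nestjs/common';
import { AuthGuard } from '@nestjs/passport';

@Injectable()
export class JwtAuthGuard extends AuthGuard('jwt') {}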

Creating Authentication Pages

Users need clean, professional login and registration pages. Our authentication UI focuses on simplicity and user experience with proper loading states, error handling, and responsive design.

The authentication pages (login template, signup template) follow a consistent pattern:

  • Clean card-based design that matches the main application aesthetic
  • Form validation with appropriate input types and required fields
  • Loading states to provide feedback during authentication requests
  • Error handling with user-friendly error messages
  • Responsive layout that works well on mobile devices

The login form captures email and password, while the signup form adds a name field. Both forms integrate with our JavaScript authentication manager for smooth user flows and automatic redirects after successful authentication.

Client-Side Authentication Logic

The frontend authentication manager (complete implementation) handles the user experience side of authentication with several key responsibilities:

Token Management — Stores JWT tokens in localStorage for persistence across browser sessions. The manager automatically includes tokens in API requests and handles token expiration gracefully.

Form Handling — Processes login and registration forms with proper validation, loading states, and error messaging. Users get immediate feedback when authentication requests are in progress.

Automatic Redirects — Seamlessly moves users between authentication pages and the main application. Unauthenticated users are redirected to login, while successful authentication leads to the chat interface.

The core authentication flow demonstrates the clean separation of concerns:

📁 View auth manager on GitHub

class AuthManager {
  async handleLogin(event) {
    event.preventDefault();
    const credentials = this.extractFormData(event);
    this.setLoadingState(true);

    try {
      const response = await fetch('/auth/login', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify(credentials)
      });

      if (!response.ok) {
        throw new Error('Login failed. Please check your credentials.');
      }

      // Parse the JSON body before reading the token
      const result = await response.json();
      this.setToken(result.access_token);
      window.location.href = '/';
    } catch (error) {
      this.showError(error.message);
    } finally {
      this.setLoadingState(false);
    }
  }
}

This approach provides a smooth user experience with proper error handling and automatic token management.
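
The token helpers used throughout the manager are thin wrappers over localStorage. A minimal sketch (the 'auth_token' key name is illustrative):

class AuthManager {
  setToken(token) {
    localStorage.setItem('auth_token', token);
  }

  getToken() {
    return localStorage.getItem('auth_token');
  }

  isAuthenticated() {
    return Boolean(this.getToken());
  }

  logout() {
    // Clear the stored token and send the user back to the login page
    localStorage.removeItem('auth_token');
    window.location.href = '/auth/login';
  }
}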

Updating the Chat Interface

The chat interface requires key updates to work with authentication. The main changes focus on token validation, secure API requests, and graceful error handling.

Authentication Checks: The chat interface verifies user authentication on initialization and redirects unauthenticated users to the login page. This prevents unauthorized access to expensive LLM endpoints.

Secure API Requests: All chat requests now include JWT tokens in Authorization headers. The interface handles token expiration gracefully by detecting 401 responses and redirecting to login.

User Context Display: The interface shows the logged-in user’s name and provides a logout option, creating a personalized experience.

The key authentication integration points:

📁 View chat interface on GitHub

class ChatInterface {
  constructor() {
    if (!this.authManager.isAuthenticated()) {
      window.location.href = '/auth/login';
      return;
    }
    // Initialize chat interface...
  }

  async sendMessage(message) {
    const token = this.authManager.getToken();
    const response = await fetch('/agents/chat', {
      method: 'POST',
      headers: {
        'Authorization': `Bearer ${token}`,
        'Content-Type': 'application/json'
      },
      body: JSON.stringify({ userMessage: message })
    });

    if (response.status === 401) {
      this.handleSessionExpired();
      return;
    }
    // Render the assistant's reply...
  }
}

The updated chat interface (complete implementation) now provides secure, authenticated access to your AI agent.

Securing MCP Tool Calls

Our MCP architecture enables an additional security layer by requiring authentication for all tool calls. Each MCP server validates JWT tokens and tracks user context for audit logging and personalization.

The MCP servers implement authentication using Fastify middleware that validates JWT tokens on every request:

📁 View MCP auth middleware on GitHub

import { FastifyReply, FastifyRequest } from 'fastify';
import jwt from 'jsonwebtoken';

export const authenticateToken = async (request: FastifyRequest, reply: FastifyReply) => {
  const authHeader = request.headers.authorization;
  const token = authHeader && authHeader.split(' ')[1];

  if (!token) {
    return reply.status(401).send({ error: 'Access token required' });
  }

  try {
    // The fallback secret is for local development only; always set JWT_SECRET in production
    const secretKey = process.env.JWT_SECRET || 'your-secret-key';
    const decoded = jwt.verify(token, secretKey) as any;

    // Extract user or service context
    request.userId = decoded.userId || decoded.sub;
    request.serviceId = decoded.serviceId;
  } catch (error) {
    return reply.status(403).send({ error: 'Invalid or expired token' });
  }
};
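
Since Fastify’s request type doesn’t include userId or serviceId, the assignments above rely on a TypeScript module augmentation somewhere in the server’s type declarations; a sketch of what that looks like:

declare module 'fastify' {
  interface FastifyRequest {
    userId?: string;    // set for user-attributed calls
    serviceId?: string; // set for service-to-service calls
  }
}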

Each MCP server applies this middleware to all tool routes, ensuring only authenticated requests can access expensive operations:

📁 View MCP tools routes on GitHub

import { FastifyInstance } from 'fastify';
// authenticateToken is the middleware defined above

export default async function routes(fastify: FastifyInstance) {
  // Apply authentication middleware to all routes in this context
  fastify.addHook('preHandler', authenticateToken);

  fastify.post('/call', async (request, reply) => {
    const { name, arguments: args } = request.body as ToolCallRequest;
    
    // Log tool execution with user context for audit trail
    fastify.log.info(`Tool ${name} executed by user: ${request.userId || request.serviceId}`);
    
    const result = await executeToolByName(name, args);
    return result;
  });
}

This approach enables several powerful capabilities:

  • Audit Logging: Track which users make which tool calls for debugging and usage analytics
  • Rate Limiting: Implement per-user limits to control API costs and prevent abuse
  • Personalization: Customize responses based on user context available in request.userId
  • Security Filtering: Restrict access to sensitive tools based on authentication status

Each MCP server can now make authorization decisions at the tool level and maintains full traceability of which authenticated users are accessing which tools.
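
As a concrete example of that last point, a server might gate its most expensive tools on having a real user identity rather than a service token. The tool names and policy below are hypothetical:

// Hypothetical policy: expensive tools require a user-attributed token
const SENSITIVE_TOOLS = new Set(['generate_market_report']);

export function canExecute(toolName: string, userId?: string, serviceId?: string): boolean {
  if (SENSITIVE_TOOLS.has(toolName)) {
    return Boolean(userId); // service-to-service calls are not enough here
  }
  return Boolean(userId || serviceId);
}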

User Context in Agent Responses

With authentication in place, our agents can provide personalized experiences by incorporating user context into both system prompts and tool calls.

The agents service (complete implementation) receives user information from the authenticated chat request and uses it in several ways:

Personalized System Prompts: The LLM receives user-specific context, including the user’s name and preferences. This enables responses like “Hi Sarah, I found 3 properties in your price range” instead of generic interactions.

User-Attributed Tool Calls: All MCP tool calls include user context, enabling audit logging and personalized results. Each property search or report generation is attributed to a specific user.

Enhanced Error Handling: Error messages can be personalized, and user-specific debugging information is available when issues arise.

To have the LLM remember whom it is speaking with, you can create a dynamic system prompt, or you can add an additional context message to your tool call. It would look something like the code below.

async processChat(request: AuthenticatedChatRequest) {
  const { userMessage, user } = request;

  // Dynamic system prompt so the LLM knows who it is speaking with
  const systemPrompt = `You are helping ${user.name} with real estate queries.
    USER CONTEXT: ${user.name} (${user.email})`;

  // toolName and args come from the LLM's tool-use response (elided);
  // tool calls include user context for audit logging
  const toolResults = await this.mcpClient.callTool(
    toolName,
    args,
    user // User context passed to MCP servers
  );

  // systemPrompt, userMessage, and toolResults feed the final LLM completion (elided)
}
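
On the client side of that call, the user object has to become the Bearer token the MCP middleware expects. A hypothetical sketch of the MCP client’s callTool (the endpoint path, serverUrl field, and five-minute token lifetime are all assumptions):

import jwt from 'jsonwebtoken';

async callTool(name: string, args: Record<string, unknown>, user: { id: number }) {
  // Sign a short-lived token carrying the user's identity for this one call
  const token = jwt.sign({ sub: user.id }, process.env.JWT_SECRET!, { expiresIn: '5m' });

  const response = await fetch(`${this.serverUrl}/tools/call`, {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${token}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({ name, arguments: args }),
  });

  return response.json();
}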

This approach transforms your AI agent from a generic chatbot into a personalized assistant that remembers who it’s helping and can provide user-specific responses and audit trails.

Security Best Practices

Authentication is just the foundation of application security. Here are the key practices we’ve implemented and additional considerations for production deployment:

Password Security is handled in our authentication service with bcrypt hashing and secure salt generation. Our implementation handles password complexity requirements and prevents timing attacks during login verification.

Token Security uses JWT tokens with reasonable expiration times and secure signing. Tokens are stored in localStorage on the client side, which provides a good balance of security and user experience for our use case. For higher-security applications, consider using HTTP-only cookies.

Session Management includes automatic token refresh and proper cleanup on logout. Our authentication system handles session validation and provides clean error handling when tokens expire.

Input Validation should be implemented at every layer. Validate user inputs on the frontend for user experience, validate again in your API endpoints for security, and validate once more in your MCP servers for defense in depth.
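
For the API layer, the ChatRequestDto used by the controller is a natural place for this. A sketch using class-validator (the exact constraints are assumptions; pair it with NestJS’s global ValidationPipe so malformed requests are rejected before reaching the agents service):

import { IsNotEmpty, IsString, MaxLength } from 'class-validator';

export class ChatRequestDto {
  @IsString()
  @IsNotEmpty()
  @MaxLength(4000) // cap prompt length to bound per-request LLM cost
  userMessage: string;
}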

Rate Limiting becomes crucial when you have user identification. Consider implementing per-user rate limiting to prevent abuse and control API costs. This can be done at the application level or using a reverse proxy like nginx.
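
A minimal in-memory sketch of per-user limiting (fine for a single process; use Redis or a proxy for anything distributed):

const WINDOW_MS = 60_000;  // 1-minute window
const MAX_REQUESTS = 20;   // illustrative per-user budget

const hits = new Map<string, { count: number; resetAt: number }>();

export function allowRequest(userId: string): boolean {
  const now = Date.now();
  const entry = hits.get(userId);

  // Start a fresh window if this user has no entry or the old one expired
  if (!entry || now > entry.resetAt) {
    hits.set(userId, { count: 1, resetAt: now + WINDOW_MS });
    return true;
  }

  entry.count += 1;
  return entry.count <= MAX_REQUESTS;
}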

HTTPS Only in production is non-negotiable. Authentication tokens and user data should never be transmitted over unencrypted connections. This includes communication between your main application and MCP servers if they’re deployed separately.

Remember that security is a process, not a destination. Regular security audits, dependency updates, and monitoring are essential for maintaining a secure application.

What’s Next: From Secure to Real-time

With authentication in place, your LLM agent is ready for production use. Users can create accounts, have private conversations, and access personalized features. Your MCP architecture provides a solid foundation for scaling, and authentication gives you the control needed to build a sustainable AI application.

But there’s one more crucial piece: user experience. Even with great functionality and solid security, users still have to wait in silence while your AI processes their requests. Those 3-5 second delays can make your application feel unresponsive.

In Part 4, we’ll transform the user experience with real-time streaming responses using Server-Sent Events (SSE). Users will see immediate feedback, progress updates, and tokens appearing in real-time as the LLM generates responses. The authentication foundation we’ve built here ensures that streaming connections remain secure and personalized.

You now have a production-ready, authenticated AI agent that can scale to real users while maintaining security and providing personalized experiences. The combination of MCP architecture and robust authentication creates a powerful foundation for building sophisticated AI applications.

Next up: Part 4: Real-time LLM Responses in Production - Add streaming responses to eliminate silent waits and create an engaging, responsive user experience.
