The feature image for the project WT-Chat.

AI CHAT APP | 2026

WT-Chat

A simple chat interface that lets users interact with two different LLM endpoints: Gemini and ChatGPT.

Next.js
FastAPI
Tailwind CSS
Pydantic
Pytest
OpenAI API
Gemini API

OVERVIEW

WT-Chat gives users more control than a typical single-model chat app. Users can switch between two live LLM backends (Google's Gemini and OpenAI's GPT) and set the response tone to playful or serious, adapting the AI's personality to their needs. Responses stream in real time, which shortens the perceived wait, and each conversation receives an auto-generated title so it is easy to find later. User authentication keeps conversations private, and a dark/light mode toggle provides visual comfort.

IMPLEMENTATION

TECH STACK

Framework

Next.js

Backend

FastAPI

Language

TypeScript, Python

AI

OpenAI, Gemini

Styling

Tailwind CSS

Testing

Pytest

UI

Tailgrids

Deployment

Render + Vercel

FEATURES

Authentication and Authorization

Dark / Light mode

Chat with Gemini (gemini-3-flash-preview) or GPT (gpt-5.4-nano)

Responses in a playful or serious tone

Auto-generated title for each conversation

Responses streamed in chunks

CHALLENGES & SOLUTIONS

Streaming LLM responses across a buffered deployment environment

Implemented token-by-token streaming via FastAPI's StreamingResponse and decoded chunks on the frontend using the Web Streams API with a TextDecoder. Added a deliberate delay between chunks to simulate smooth streaming, compensating for Render's response buffer that batches all tokens before delivery.
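The pacing idea can be sketched in plain Python with asyncio. This is a minimal illustration, not the project's code: the names `token_stream`, `paced`, and `read_all` and the delay value are hypothetical, and the source does not say on which side of the connection the delay is applied. A generator such as `token_stream` is the kind of async iterator that FastAPI's StreamingResponse accepts, and `read_all` plays the role of the frontend reader, using an incremental decoder in the way a TextDecoder in streaming mode would be used.

```python
import asyncio
import codecs
from collections.abc import AsyncIterator, Iterable

CHUNK_DELAY_SECONDS = 0.02  # hypothetical pacing value, not taken from the project


async def token_stream(tokens: Iterable[str]) -> AsyncIterator[bytes]:
    """Yield each token as UTF-8 bytes, one chunk per token."""
    for token in tokens:
        yield token.encode("utf-8")


async def paced(
    stream: AsyncIterator[bytes], delay: float = CHUNK_DELAY_SECONDS
) -> AsyncIterator[bytes]:
    """Re-emit chunks with a short pause after each one."""
    async for chunk in stream:
        yield chunk
        # Sleeping yields control to the event loop instead of blocking it.
        await asyncio.sleep(delay)


async def read_all(stream: AsyncIterator[bytes]) -> str:
    """Decode chunks as they arrive and return the joined text."""
    decoder = codecs.getincrementaldecoder("utf-8")()
    parts = []
    async for chunk in stream:
        # An incremental decoder tolerates multi-byte characters split across chunks.
        parts.append(decoder.decode(chunk))
    parts.append(decoder.decode(b"", final=True))
    return "".join(parts)


if __name__ == "__main__":
    tokens = ["Hel", "lo", ", ", "world"]
    print(asyncio.run(read_all(paced(token_stream(tokens)))))
```

In a FastAPI route, the generator would typically be returned as `StreamingResponse(token_stream(tokens), media_type="text/plain")`; the media type here is an assumption.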

Managing conversation history across multiple LLM provider APIs

Normalised chat history into a shared HistoryEntry format, then mapped it to each provider's required input type at the model layer — using EasyInputMessageParam for GPT and the equivalent structure for Gemini — keeping the frontend agnostic of provider-specific contracts.
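A rough sketch of that mapping layer follows. The field names on `HistoryEntry` and the two helper functions are hypothetical, and a dataclass is used here only to keep the example free of dependencies (the project lists Pydantic, so its model is likely defined differently). The OpenAI side produces plain dicts with the `role` and `content` keys that EasyInputMessageParam requires; the Gemini side assumes the `role` plus `parts` content shape, where the assistant role is named `model`.

```python
from dataclasses import dataclass
from typing import Literal


@dataclass(frozen=True)
class HistoryEntry:
    """Provider-neutral chat turn (hypothetical field names)."""

    role: Literal["user", "assistant"]
    content: str


def to_openai_input(history: list[HistoryEntry]) -> list[dict]:
    """Map shared history to role/content message dicts for GPT."""
    return [{"role": entry.role, "content": entry.content} for entry in history]


def to_gemini_contents(history: list[HistoryEntry]) -> list[dict]:
    """Map shared history to Gemini-style contents, renaming 'assistant' to 'model'."""
    return [
        {
            "role": "model" if entry.role == "assistant" else "user",
            "parts": [{"text": entry.content}],
        }
        for entry in history
    ]


if __name__ == "__main__":
    history = [
        HistoryEntry(role="user", content="Hi"),
        HistoryEntry(role="assistant", content="Hello!"),
    ]
    print(to_openai_input(history))
    print(to_gemini_contents(history))
```

Keeping both mappers next to the model clients means the frontend only ever sends and receives the shared format.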

Testing LLM endpoints without incurring excessive API costs

Designed tests to mock LLM calls using pytest-mock's mocker.patch() and mocker.Mock(), replacing live API calls with controlled fake responses. Reserved direct endpoint hits only for integration checks, minimising cost while still verifying that the wiring between the app and each LLM provider is correct.
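The mocking pattern might look like the sketch below. To keep it runnable without pytest, it uses the standard library's unittest.mock directly, which is what pytest-mock's `mocker` fixture wraps; with pytest-mock the patch line would become `mocker.patch.object(client, "complete", return_value=...)`. The `LLMClient` class and `answer` function are hypothetical stand-ins for the app's provider wrapper and route logic.

```python
from unittest.mock import Mock, patch


class LLMClient:
    """Hypothetical wrapper around a provider SDK."""

    def complete(self, prompt: str) -> str:
        # A real implementation would call the provider here.
        raise RuntimeError("live API call; should not run in unit tests")


client = LLMClient()


def answer(prompt: str) -> str:
    """App logic under test: call the LLM and tidy the reply."""
    return client.complete(prompt).strip()


def test_answer_uses_mocked_llm() -> None:
    fake_complete = Mock(return_value="  mocked reply  ")
    # Swap the live call for a controlled fake for the duration of the block.
    with patch.object(client, "complete", fake_complete):
        assert answer("hi") == "mocked reply"
    # Verify the wiring: the app passed the prompt through unchanged.
    fake_complete.assert_called_once_with("hi")


if __name__ == "__main__":
    test_answer_uses_mocked_llm()
    print("ok")
```

Because the fake never touches the network, tests like this cost nothing to run; only a small number of integration checks would need real credentials.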

CORS configuration and validation in FastAPI

Added CORSMiddleware to the FastAPI app with an explicit allowlist of frontend origins. Validated the setup in pytest using two complementary strategies: checking for access-control-allow-origin headers in standard responses, and verifying that preflight OPTIONS requests return a 200 status from allowed origins.