Case Study · 2025

Verdikt AI — Multi-LLM design reviews that think like senior designers

How I designed and built an AI-powered product design review tool that routes feedback through three large language models — giving designers and startups instant, structured critique on their UI work without waiting for a senior reviewer.

Web App

AI / LLM

Product Design

Full-Stack

LangChain

Overview

Product

Verdikt AI — Multi-LLM Product Design Review

Platform

Next.js · Supabase · Vercel

Tech

GPT 4.0 · Claude 3.5 Sonnet · Gemini 2.5 Flash · LangChain & LangGraph

My Role

Product Designer & Full-Stack AI Developer

Background

Designers needed honest, structured feedback — without relying on availability or seniority

Design feedback is among the most valuable things a designer can get, and among the hardest to get consistently. Junior designers often lack access to experienced reviewers who can pinpoint issues in layout, hierarchy, accessibility, and interaction patterns. Solo designers working independently have no one to challenge their decisions. Even within teams, disagreements about design direction can stall progress, and seeking external review is slow and expensive.

Verdikt AI was built to solve that problem. It's an AI-powered product design review tool with a threaded chat interface that simulates the role of a senior product designer. Users upload screenshots of their web or mobile app designs, describe the context and flow, and receive structured feedback — covering pros, cons, accessibility concerns, and an overall evaluation summary.

What makes Verdikt AI different is its multi-model approach. Instead of relying on a single LLM, the tool routes each review through three separate language models (GPT 4.0, Claude 3.5 Sonnet, and Gemini 2.5 Flash) using a feedback graph built with LangChain and LangGraph. The result is a three-panel review that gives designers three independent perspectives on every design, reducing single-model bias and surfacing issues that one model alone might miss.

How might we give designers and startup teams instant, structured design feedback that mirrors what a senior product designer would provide — without the wait, the cost, or the scheduling overhead?

App Features

A focused toolset for design critique at speed

Verdikt AI is intentionally scoped around one job: giving designers fast, useful feedback. Every feature serves that goal directly — no bloat, no distractions.

Chat Interface

A clean, familiar chat experience modelled on how designers already interact with AI tools like ChatGPT and Claude. Threaded messages keep feedback organised and easy to revisit across sessions.

Image Upload

Users can attach up to three design screenshots (5 MB max each). The images are processed directly by the LLMs, enabling visual understanding of layout, typography, spacing, and hierarchy.
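The upload limits can be checked before anything is sent to a model. The following is a minimal sketch, not the app's actual code: `UploadCandidate` and `validateUploads` are hypothetical names, and 5 MB is assumed to mean 5 × 1024 × 1024 bytes.

```typescript
// Hypothetical shape for an attached screenshot; only name and size matter here.
interface UploadCandidate {
  name: string;
  sizeBytes: number;
}

const MAX_IMAGES = 3;                    // "up to three design screenshots"
const MAX_IMAGE_BYTES = 5 * 1024 * 1024; // assumption: 5 MB read as 5 MiB

// Returns a list of human-readable problems; an empty list means the upload passes.
function validateUploads(files: UploadCandidate[]): string[] {
  const problems: string[] = [];
  if (files.length > MAX_IMAGES) {
    problems.push(`Attach at most ${MAX_IMAGES} screenshots (got ${files.length}).`);
  }
  for (const file of files) {
    if (file.sizeBytes > MAX_IMAGE_BYTES) {
      problems.push(`${file.name} is larger than 5 MB.`);
    }
  }
  return problems;
}
```

Returning every problem at once, rather than stopping at the first, lets the interface show all issues in a single message.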

Contextual Input

A text input lets designers describe the flow and context behind their screens before submitting for review. This ensures the AI evaluates the design within the right product context, not in isolation.

Three-Panel Review

Each review returns a structured summary from three LLMs — displaying Pros, Cons, Accessibility notes, and an overall evaluation side by side. Three models, three perspectives, one clear picture.
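The four sections suggest a fixed shape that every model's reply can be checked against before it is rendered. The sketch below assumes each model is asked to reply in JSON; the `ModelReview` field names and the `parseReview` helper are illustrative, not taken from the project.

```typescript
// Hypothetical shape of one model's review; field names are illustrative.
interface ModelReview {
  model: string;
  pros: string[];
  cons: string[];
  accessibility: string[];
  summary: string;
}

function isStringArray(value: unknown): value is string[] {
  return Array.isArray(value) && value.every((item) => typeof item === "string");
}

// Parses a raw JSON reply and returns null when it does not match the four sections.
function parseReview(model: string, rawJson: string): ModelReview | null {
  let data: unknown;
  try {
    data = JSON.parse(rawJson);
  } catch {
    return null; // the reply was not valid JSON
  }
  if (typeof data !== "object" || data === null) {
    return null;
  }
  const { pros, cons, accessibility, summary } = data as Record<string, unknown>;
  if (
    !isStringArray(pros) ||
    !isStringArray(cons) ||
    !isStringArray(accessibility) ||
    typeof summary !== "string"
  ) {
    return null;
  }
  return { model, pros, cons, accessibility, summary };
}
```

A `null` result gives the caller one place to decide whether to retry the model or show an error panel.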

UX Approach

How Verdikt AI was shaped

Familiar Patterns, New Purpose

The chat interface was chosen deliberately — it's a UX pattern designers already know from tools like ChatGPT and Claude. By reusing that mental model, onboarding is near-zero. Users immediately understand how to submit a query and interpret the response.

Multi-Model Architecture

Getting three LLMs to produce consistent, structured output was the biggest technical challenge. Finding the right models, configuring the LangChain feedback graph, and handling deployment edge cases required significant iteration — especially around Claude's API behaviour.

Error Handling & Polish

Deployment surfaced several issues — error states between the server and frontend needed clear visual indicators, and PDF report generation had formatting bugs that were identified and resolved through hands-on debugging.

Product Design Patterns

Design patterns that support the AI workflow

The interface is built around four core patterns that keep the experience clean while handling the complexity of multi-model AI responses under the hood.

Sidebar

Thread management

A persistent sidebar stores the user's review threads and past feedback. Designers can return to any previous review, track iterations, and delete threads they no longer need — keeping the workspace tidy.

Thinking states

While the three LLMs are processing a review, the UI shows a pulsing 'thinking' animation. This gives designers a clear signal that work is happening, which matters when responses from three models take a few moments to complete.

States

Success & error feedback

Every server-side event has a corresponding frontend indicator. Successful reviews land cleanly. Errors — whether from API limits, image processing, or network issues — surface clear, actionable messages rather than silent failures.
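One way to guarantee that no error fails silently is to map each error category to a message and a next step. This sketch takes its three categories from the sources named above; the type names and the message wording are placeholders, not the app's actual copy.

```typescript
// Hypothetical error categories, based on the three sources named in the text.
type ReviewErrorKind = "api_limit" | "image_processing" | "network";

interface UserFacingError {
  title: string;
  action: string; // what the designer can do next
}

// Maps every known error kind to a message; the switch is exhaustive,
// so adding a new kind without a message becomes a compile-time error.
function toUserFacingError(kind: ReviewErrorKind): UserFacingError {
  switch (kind) {
    case "api_limit":
      return { title: "Review limit reached", action: "Wait a moment, then resubmit the review." };
    case "image_processing":
      return { title: "A screenshot could not be read", action: "Re-export the image and upload it again." };
    case "network":
      return { title: "Connection problem", action: "Check your connection and retry." };
  }
}
```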

Reports

PDF download

After a review is complete, users can download a PDF report summarising the feedback from all three models. This makes it easy to share results with teammates or stakeholders outside the app.
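Before any PDF is rendered, the three reviews have to be flattened into one ordered document. The sketch below covers only that assembly step and produces plain text. It assumes a hypothetical `ReportPanel` shape and does not use any PDF library or reflect how the app generates its files.

```typescript
// Hypothetical shape of one model's panel in the report.
interface ReportPanel {
  model: string;
  pros: string[];
  cons: string[];
  accessibility: string[];
  summary: string;
}

// Renders a heading followed by indented bullets, or "(none)" for an empty section.
function bulletList(heading: string, items: string[]): string[] {
  const lines = items.length > 0 ? items.map((item) => `  - ${item}`) : ["  (none)"];
  return [heading, ...lines];
}

// Builds the report body, one block per model, in the order the panels are given.
function buildReportText(title: string, panels: ReportPanel[]): string {
  const lines: string[] = [title, ""];
  for (const panel of panels) {
    lines.push(`== ${panel.model} ==`);
    lines.push(...bulletList("Pros", panel.pros));
    lines.push(...bulletList("Cons", panel.cons));
    lines.push(...bulletList("Accessibility", panel.accessibility));
    lines.push(`Summary: ${panel.summary}`, "");
  }
  return lines.join("\n");
}
```

Keeping assembly separate from rendering means the report's content and order can be checked without opening a PDF.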

Tech Stack

An AI-native stack built for multi-model orchestration

Verdikt AI's stack was chosen to handle the unique challenge of routing design reviews through multiple AI models while keeping the frontend responsive and the data layer reliable.

Next.js — Application Framework

Next.js provides the full-stack foundation — server-side rendering for the marketing pages, API routes for the LLM orchestration layer, and fast client-side navigation for the chat interface.

Shadcn UI & Tailwind CSS — Design System

Shadcn components and Tailwind CSS ensured the interface stayed consistent and polished while iterating rapidly on the chat, sidebar, and review panel layouts.

LangChain & LangGraph — AI Orchestration

LangChain handles the connection to all three LLMs. LangGraph manages the feedback graph — a structured pipeline that routes each design review through GPT 4.0, Claude 3.5 Sonnet, and Gemini 2.5 Flash, then aggregates their outputs into a unified review.
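The route-then-aggregate shape of the feedback graph can be illustrated without any framework. The sketch below is a plain TypeScript stand-in, not LangGraph code: `Reviewer`, `runPanel`, and `runReview` are hypothetical names, and each reviewer is a stub where a real model call would go. It also assumes that one failing model should yield an error panel rather than fail the whole review, which the source does not state.

```typescript
// A reviewer is any async function that turns a prompt into feedback text.
// In a real app these would wrap LLM calls; here they are hypothetical stubs.
type Reviewer = (prompt: string) => Promise<string>;

interface PanelResult {
  model: string;
  ok: boolean;
  feedback: string; // review text on success, error message on failure
}

// Runs one reviewer and converts a thrown error into a failed panel,
// so a single failing model cannot reject the whole review.
async function runPanel(model: string, reviewer: Reviewer, prompt: string): Promise<PanelResult> {
  try {
    return { model, ok: true, feedback: await reviewer(prompt) };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { model, ok: false, feedback: message };
  }
}

// Sends the same prompt to every reviewer concurrently and
// returns the panels in the same order as the input list.
async function runReview(reviewers: Array<[string, Reviewer]>, prompt: string): Promise<PanelResult[]> {
  return Promise.all(reviewers.map(([model, reviewer]) => runPanel(model, reviewer, prompt)));
}
```

Because every call starts at once, total latency is roughly that of the slowest model rather than the sum of all three.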

Three LLMs — GPT 4.0, Claude 3.5 Sonnet, Gemini 2.5 Flash

Each model brings a different evaluation perspective. Running reviews through all three reduces single-model bias and gives designers a richer, more balanced critique of their work.

Supabase — Database & Auth

Supabase stores all review threads, user data, and feedback history. It also handles authentication, providing a seamless login experience without additional infrastructure.

GitHub & Vercel — Deployment

Version-controlled on GitHub and deployed via Vercel's continuous deployment pipeline. Every push to main triggers an automatic deployment, enabling fast iteration cycles.

User Journey

From landing page to design feedback in under a minute

The user journey is designed to minimise friction and get designers to their first review as quickly as possible — including a try-before-you-sign-up option that removes the biggest barrier to adoption.

Try Verdikt AI

New visitors can try the tool up to three times without creating an account. This lets designers experience the value of multi-LLM feedback before committing — removing the sign-up barrier entirely.
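The three-try allowance reduces to a small gate that runs before each anonymous review. This is a sketch under stated assumptions: `checkTrial` and `TrialDecision` are hypothetical names, and where the usage count is stored (cookie, local storage, or server side) is left open because the source does not say.

```typescript
const FREE_TRIAL_REVIEWS = 3; // "up to three times without creating an account"

interface TrialDecision {
  allowed: boolean;
  remaining: number; // free reviews left after this attempt
}

// Decides whether an anonymous visitor may run another review.
// `usedSoFar` is assumed to come from wherever the app tracks anonymous usage.
function checkTrial(usedSoFar: number): TrialDecision {
  const used = Math.max(0, Math.floor(usedSoFar)); // guard against negative or fractional counts
  if (used >= FREE_TRIAL_REVIEWS) {
    return { allowed: false, remaining: 0 };
  }
  return { allowed: true, remaining: FREE_TRIAL_REVIEWS - used - 1 };
}
```

Returning the remaining count along with the decision lets the interface warn a visitor before the last free review.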

Start a New Chat

The 'New Chat' button creates a fresh review thread. Each thread is a self-contained conversation — designers can run multiple reviews across different projects without losing context.

Upload & Describe

Users upload their design screenshots and describe the flow and context in the text input. This two-step submission gives the LLMs enough information to evaluate the design within its intended product context.

Review & Download

The three-panel review appears with structured feedback from each LLM. Once complete, a PDF report can be downloaded for sharing with teammates or archiving for future reference.

Design Impact

What Verdikt AI delivered in practice

Verdikt AI was put to the test in a real-world scenario — used on active design work to evaluate and improve production UI. The feedback proved immediately actionable, surfacing issues that would have otherwise reached development.

3

LLM perspectives on every design review, reducing single-model bias

Real

Used on live project work to catch typography, contrast, and accessibility issues before handoff

Fast

Design feedback delivered in seconds — replacing hours of waiting for senior review availability

Beyond speed, Verdikt AI proved valuable as a productivity tool for improving design quality. It consistently caught issues with typography, colour contrast, and accessibility compliance — areas that are easy to overlook under deadline pressure but critical to user experience.

Takeaways

Gather more designer feedback on desired features. The core review loop works — the next step is understanding what additional capabilities designers want most, whether that's component-level feedback, design system compliance checks, or competitive benchmarking.

Redesign the dashboard experience. The current dashboard is functional but minimal. A more polished UI — with better thread organisation, search, and visual hierarchy — would make Verdikt AI feel like a premium daily-use tool.

Launch with tiered memberships. The try-before-you-sign-up model validates interest. The next phase introduces paid tiers with expanded usage limits, team collaboration features, and priority access to newer models.
