Telecom AI Bot System Design v1.0

System Overview

AI-powered customer service platform handling text and voice interactions for telecom operations serving millions of subscribers.

🎙️ Voice + Text Bot

Multi-modal customer interaction via WebSocket voice streams and REST chat endpoints. Powered by open-source LLMs with domain-specific fine-tuning for telecom context.

Whisper, Piper TTS, vLLM, LangChain4j

🔌 MCP Tool Server

Model Context Protocol enabled APIs that expose telecom business functions as tools the LLM can invoke — billing, account lookup, plan management, and more.

Spring MCP, Tool Registry, JSON-RPC

🛡️ Guardrails Engine

Multi-layer safety: identity verification before data access, PII redaction, context boundary enforcement, and business-rule alignment for every response.

GuardrailsAI, JWT, OPA, Row-Level Security

📊 Data Isolation

PostgreSQL Row-Level Security (RLS) ensures zero data leakage between customers. Every query is scoped to the authenticated subscriber ID at the database level.

PostgreSQL RLS, pgcrypto, Tenant Context

Scalable Infrastructure

AWS EKS with auto-scaling pods, ALB for traffic distribution, ElastiCache for session management, and SQS for async processing of billing operations.

EKS, ALB, ElastiCache, SQS

📈 Observability

Full-stack observability with OpenTelemetry traces, Prometheus metrics, Grafana dashboards, and Loki for log aggregation across all microservices.

OpenTelemetry, Prometheus, Grafana, Loki

Key Design Principles

Zero-Trust Data Access

No customer data is returned without verified identity. Database queries are scoped via RLS policies. LLM context never contains other customers' data.

Context-Aware Conversations

Redis-backed session store maintains conversation state with TTL. The bot remembers what was discussed within a session but purges after timeout.
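
The TTL behaviour described above can be illustrated with a small in-memory stand-in for the Redis store. This is a minimal sketch, not the production design: the class name SessionStore, the sliding expiry (each new message refreshes the TTL), and the explicit clock parameter are all assumptions made for illustration.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory stand-in for the Redis-backed session store.
final class SessionStore {
    private record Session(List<String> history, Instant expiresAt) {}

    private final Map<String, Session> sessions = new ConcurrentHashMap<>();
    private final Duration ttl;

    SessionStore(Duration ttl) {
        this.ttl = ttl;
    }

    // Appends a message and refreshes the TTL (sliding expiry is an assumption).
    void append(String sessionId, String message, Instant now) {
        sessions.compute(sessionId, (id, existing) -> {
            boolean expired = existing == null || !now.isBefore(existing.expiresAt());
            List<String> history = expired ? new ArrayList<>() : new ArrayList<>(existing.history());
            history.add(message);
            return new Session(history, now.plus(ttl));
        });
    }

    // Returns the conversation history, or an empty list once the session has timed out.
    List<String> history(String sessionId, Instant now) {
        Session s = sessions.get(sessionId);
        if (s == null) {
            return List.of();
        }
        if (!now.isBefore(s.expiresAt())) {
            sessions.remove(sessionId, s); // purge the expired session
            return List.of();
        }
        return List.copyOf(s.history());
    }
}
```

With a 30-minute TTL, two messages sent a minute apart are both visible two minutes in, and the history is empty once 30 minutes pass without activity. In Redis the same effect would typically come from setting an expiry on the session key rather than checking timestamps in application code.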

Business-Aligned Responses

System prompts and guardrails constrain responses to telecom domain. Off-topic or harmful requests are intercepted before reaching the LLM.

Non-Functional Targets

Metric | Target
Response Latency (text) | < 2 seconds P95
Response Latency (voice) | < 3.5 seconds P95
Concurrent Users | 10,000+
Availability | 99.95%
Data Isolation | 100% (RLS enforced)
Context Accuracy | > 95% on-topic
Auth Before Data | 100% enforced
PII in Logs | 0 (redacted)

Application Architecture

Layered microservices architecture with MCP-enabled tool calling and multi-modal input processing.

🌐 Client Layer: Channels
Mobile App: React Native / Flutter
Web Widget: Embedded JS SDK
IVR Gateway: Twilio / Amazon Connect
WhatsApp: Business API
USSD: Short Code Gateway
▼ HTTPS / WSS / gRPC
🔀 API Gateway & Load Balancing
AWS ALB: TLS Termination
Spring Cloud Gateway: Rate Limiting + Routing
Kong / APISIX: Auth + Throttle
▼ JWT + mTLS
🤖 AI Orchestration Layer: Spring Boot 3.4+
Chat Controller: REST + SSE Streaming
Voice Controller: WebSocket + gRPC
Session Manager: Redis + TTL
LLM Orchestrator: LangChain4j + Spring AI
Guardrails Engine: Pre/Post Filters
▼ MCP JSON-RPC over HTTP/SSE
🔧 MCP Tool Server Layer
Billing Tools: get_bill, pay_bill
Account Tools: get_profile, update_info
Plan Tools: current_plan, upgrade
Invoice Tools: get_invoices, download
Network Tools: data_usage, speed_test
▼ SQL (RLS enforced) / REST
💾 Data & AI Model Layer
PostgreSQL 16: RLS + pgcrypto
Redis Cluster: Sessions + Cache
vLLM / Ollama: Mistral-7B / LLaMA-3
Whisper Server: ASR (Speech→Text)
Piper TTS: Text→Speech
▼ Metrics / Traces / Logs
📡 Observability & Infrastructure
OpenTelemetry: Distributed Tracing
Prometheus: Metrics Collection
Grafana: Dashboards
Loki: Log Aggregation
AWS EKS: Kubernetes Cluster

Component Catalog

Every component identified for development, with responsibilities and technology choices.

Core AI Orchestration Service

The central brain that coordinates LLM inference, tool selection, and response generation.

Sub-Component | Technology | Responsibility
LLM Client | LangChain4j + Spring AI | Unified interface to vLLM-hosted Mistral/LLaMA models with structured output parsing
Prompt Engine | Jinja2-style Templates | Dynamic system prompt construction with customer context, guardrail instructions, and tool schemas
Tool Router | Spring MCP Client | Discovers and invokes MCP tools based on LLM function-calling output
Context Builder | Custom Java Service | Assembles conversation history + auth state + business context into LLM context window
Response Validator | Custom Guardrails | Post-LLM output validation: PII detection, hallucination check, business alignment
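
One piece of the Context Builder's job, fitting conversation history into a bounded context window, could look roughly like the sketch below. The class name ContextWindow and the four-characters-per-token estimate are assumptions for illustration only; a real implementation would count tokens with the served model's tokenizer.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that trims conversation history to a token budget.
final class ContextWindow {
    // Rough estimate of ~4 characters per token (an assumption, not a tokenizer).
    static int estimateTokens(String text) {
        return Math.max(1, (text.length() + 3) / 4);
    }

    // Keeps the most recent messages whose combined estimate fits the budget,
    // preserving their original order.
    static List<String> trim(List<String> history, int maxTokens) {
        List<String> kept = new ArrayList<>();
        int used = 0;
        for (int i = history.size() - 1; i >= 0; i--) {
            int cost = estimateTokens(history.get(i));
            if (used + cost > maxTokens) {
                break; // older messages no longer fit
            }
            used += cost;
            kept.add(0, history.get(i));
        }
        return kept;
    }
}
```

Walking from newest to oldest drops the oldest turns first when the budget is exceeded, so the most recent exchange is the part of the conversation that survives.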

MCP Tool Server (Billing, Account, Plans)

MCP-compliant tool server exposing telecom business operations as callable tools with JSON Schema definitions.

Tool Name | Parameters | Action
get_current_bill | subscriber_id | Fetches current billing cycle amount, due date, status
get_bill_payment_link | subscriber_id, amount | Generates a one-time payment link via payment gateway
get_past_invoices | subscriber_id, months | Returns invoice history for the last N months
get_subscriber_profile | subscriber_id | Fetches name, email, plan, status (masked phone/email)
update_subscriber_info | subscriber_id, field, value | Updates profile fields after OTP verification
get_current_plan | subscriber_id | Returns active plan details: data, calls, validity
get_data_usage | subscriber_id, period | Data consumption for current cycle or custom period

Authentication & Authorization

Multi-step identity verification before any data access. No customer data is exposed without proof of identity.

Component | Technology | Details
Phone/OTP Verification | Spring Security + Twilio | SMS OTP sent to registered number; verified within session
JWT Session Tokens | Spring OAuth2 Resource Server | Short-lived JWT (15 min) with subscriber_id claim
Row-Level Security | PostgreSQL RLS Policies | SET app.current_subscriber_id before every query; the database enforces isolation
API Rate Limiting | Redis + Bucket4j | Per-subscriber rate limits: 60 req/min chat, 20 req/min voice
PII Redaction | Custom Filter Chain | Regex + NER-based PII detection in logs and LLM context
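
The per-subscriber limits in the table map naturally onto a token bucket, which is the algorithm Bucket4j implements. The plain-Java sketch below only illustrates that algorithm for the 60 req/min chat limit; it is not Bucket4j's API, the class name TokenBucket is hypothetical, and the clock is passed in explicitly so the behaviour is easy to check.

```java
import java.time.Duration;

// Hypothetical single-node token bucket; production would use Bucket4j backed by Redis.
final class TokenBucket {
    private final long nanosPerToken;   // time needed to earn one request
    private final long maxCreditNanos;  // a full bucket, expressed as time credit
    private long creditNanos;
    private long lastSeenNanos;

    TokenBucket(int capacity, Duration period, long nowNanos) {
        this.nanosPerToken = period.toNanos() / capacity;
        this.maxCreditNanos = nanosPerToken * capacity;
        this.creditNanos = maxCreditNanos; // start with a full bucket
        this.lastSeenNanos = nowNanos;
    }

    // Returns true and spends one token if the request is allowed at time nowNanos.
    synchronized boolean tryConsume(long nowNanos) {
        long elapsed = Math.max(0L, nowNanos - lastSeenNanos);
        creditNanos = Math.min(maxCreditNanos, creditNanos + elapsed); // refill, capped
        lastSeenNanos = Math.max(lastSeenNanos, nowNanos);
        if (creditNanos < nanosPerToken) {
            return false; // bucket empty: reject the request
        }
        creditNanos -= nanosPerToken;
        return true;
    }
}
```

A bucket created as new TokenBucket(60, Duration.ofMinutes(1), now) admits a burst of 60 requests, rejects the 61st, and then admits one further request per elapsed second. One bucket per subscriber ID (and a second, smaller one for voice) would give the limits listed above.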

Voice Processing Pipeline

Real-time voice interaction with streaming ASR and TTS for natural conversational experience.

Stage | Technology | Details
Audio Capture | WebSocket (Opus codec) | 16 kHz mono audio streamed from client in 100 ms frames
Speech-to-Text | Faster-Whisper (GPU) | CTranslate2-optimized Whisper model; real-time factor < 0.3
Language Detection | Whisper Built-in | Auto-detect Hindi, English, regional languages
Intent → LLM | Same AI Orchestrator | Transcribed text processed through same pipeline as chat
Text-to-Speech | Piper TTS / Coqui | Neural TTS with natural prosody; multilingual support
Audio Streaming | WebSocket response | TTS audio chunks streamed back as PCM/Opus frames
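
The figures in the table translate into concrete buffer sizes and time budgets. The sketch below works through that arithmetic; the 16-bit sample width is an assumption (it describes the PCM audio before Opus encoding or after decoding), and the class name AudioFrameMath is hypothetical.

```java
// Hypothetical helper that derives frame sizes from the pipeline parameters above.
final class AudioFrameMath {
    static final int SAMPLE_RATE_HZ = 16_000;
    static final int FRAME_MS = 100;
    static final int BYTES_PER_SAMPLE = 2; // assumption: 16-bit mono PCM

    // Samples in one 100 ms frame at 16 kHz.
    static int samplesPerFrame() {
        return SAMPLE_RATE_HZ * FRAME_MS / 1000;
    }

    // Uncompressed size of one frame in bytes.
    static int pcmBytesPerFrame() {
        return samplesPerFrame() * BYTES_PER_SAMPLE;
    }

    // Upper bound on transcription time for a clip at the given real-time factor.
    static double maxTranscriptionSeconds(double audioSeconds, double realTimeFactor) {
        return audioSeconds * realTimeFactor;
    }
}
```

Each frame therefore carries 1,600 samples (3,200 bytes uncompressed), and at a real-time factor of 0.3 a 10-second utterance should be transcribed in about 3 seconds or less, which is the main component of the 3.5-second voice latency target.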

Database & Caching

Store | Technology | Usage
Primary DB | PostgreSQL 16 (RDS) | Subscribers, billing, invoices, plans, all with RLS policies
Session Store | Redis 7 (ElastiCache) | Conversation history, auth state, TTL-based expiry
Vector Store | pgvector extension | FAQ embeddings, policy documents for RAG retrieval
Audit Log | PostgreSQL (append-only) | Immutable log of all tool invocations and data access
Rate Limit | Redis (Sliding Window) | Per-subscriber request counting with Bucket4j
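
An audit log row can record that a tool was invoked without storing the raw result. The sketch below builds such a record with a masked subscriber ID and a SHA-256 hash of the tool output. The class name AuditLog, the field names, and the masking scheme (a fixed prefix plus the last four characters) are assumptions for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.time.Instant;
import java.util.HexFormat;

// Hypothetical audit record builder; no raw tool output is retained.
final class AuditLog {
    record Entry(String tool, String subscriber, Instant timestamp, String resultHash) {}

    // Hex-encoded SHA-256 digest of the given text.
    static String sha256Hex(String value) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            return HexFormat.of().formatHex(digest.digest(value.getBytes(StandardCharsets.UTF_8)));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 is unavailable on this JVM", e);
        }
    }

    // Keeps only the last four characters of the subscriber ID (masking scheme is an assumption).
    static String mask(String subscriberId) {
        int keep = Math.min(4, subscriberId.length());
        return "SUB_XXXX" + subscriberId.substring(subscriberId.length() - keep);
    }

    // Builds the entry from the raw result, storing only its hash.
    static Entry entry(String tool, String subscriberId, String rawResult, Instant now) {
        return new Entry(tool, mask(subscriberId), now, sha256Hex(rawResult));
    }
}
```

Storing a hash rather than the payload lets an auditor later confirm that a given response matches what the tool returned, while the log itself holds no bill amounts or contact details.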

Security Architecture

Defense-in-depth approach ensuring zero data leakage, identity verification, and business-aligned responses.

🔒 Layer 1: Identity Verification Gate
Phone Number Input: Customer provides number
OTP via SMS: 6-digit, 5 min TTL
JWT Issuance: subscriber_id claim
Session Binding: Redis maps session → sub_id
▼ Only verified users proceed
🛡️ Layer 2: Pre-LLM Guardrails
Input Sanitizer: XSS, injection prevention
Topic Classifier: On-topic vs off-topic
Jailbreak Detector: Prompt injection filter
Context Boundary: Max history tokens
▼ Clean, validated input
🤖 Layer 3: LLM Execution with Constrained Tools
System Prompt: Role + domain constraints
Tool Whitelisting: Only registered MCP tools
Param Injection: subscriber_id from JWT only
Token Budget: Max output tokens enforced
▼ LLM response + tool results
✅ Layer 4: Post-LLM Validation
PII Redaction: Mask sensitive data in logs
Hallucination Check: Verify tool output matches response
Business Alignment: Domain vocabulary check
Audit Logger: Immutable access log
▼ Safe, verified response to customer
💾 Layer 5: Database Level Enforcement
RLS Policy: WHERE subscriber_id = current_setting('app.current_subscriber_id')::UUID
Column Encryption: pgcrypto for PII columns
Connection Pooling: PgBouncer with SET on checkout
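
The regex half of Layer 4's PII redaction could be sketched as follows. The patterns are illustrative assumptions (email addresses, and 10-digit Indian mobile numbers with an optional +91 prefix), the class name PiiRedactor is hypothetical, and the NER-based detection mentioned in the component catalog is out of scope here.

```java
import java.util.regex.Pattern;

// Hypothetical regex-only PII masker for log lines and LLM context.
final class PiiRedactor {
    private static final Pattern EMAIL =
            Pattern.compile("[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}");

    // Assumption: mobile numbers are 10 digits starting with 6-9, optionally prefixed by +91.
    private static final Pattern PHONE =
            Pattern.compile("(?<!\\d)(?:\\+91[\\s-]?)?[6-9]\\d{9}(?!\\d)");

    // Replaces every email address and phone number with a fixed placeholder.
    static String redact(String text) {
        String withoutEmails = EMAIL.matcher(text).replaceAll("[EMAIL]");
        return PHONE.matcher(withoutEmails).replaceAll("[PHONE]");
    }
}
```

The digit lookarounds keep the phone pattern from firing on longer digit runs such as invoice numbers, and short values like a bill amount pass through unchanged.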

⚠️ Critical Security Invariant

The subscriber_id used in MCP tool calls is NEVER taken from the user's message or LLM output. It is always injected server-side from the verified JWT token. This prevents any attempt by a user (or hallucinating LLM) to access another subscriber's data.
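
One way to enforce this invariant is a small guard between the LLM's tool-call output and the MCP client. The sketch below is an assumption about how that could be structured, not the system's actual code: the class name ToolCallGuard and the argument-map shape are hypothetical, while the tool names come from the tool catalog earlier in this document.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.UUID;

// Hypothetical guard applied to every tool call the LLM proposes.
final class ToolCallGuard {
    private static final Set<String> ALLOWED_TOOLS = Set.of(
            "get_current_bill", "get_bill_payment_link", "get_past_invoices",
            "get_subscriber_profile", "update_subscriber_info",
            "get_current_plan", "get_data_usage");

    // Rejects unregistered tools and overwrites subscriber_id with the JWT-verified value.
    static Map<String, Object> bind(String toolName,
                                    Map<String, Object> llmArgs,
                                    UUID verifiedSubscriberId) {
        if (!ALLOWED_TOOLS.contains(toolName)) {
            throw new IllegalArgumentException("Tool not registered: " + toolName);
        }
        Map<String, Object> args = new HashMap<>(llmArgs);
        // Whatever the model or user supplied is discarded here.
        args.put("subscriber_id", verifiedSubscriberId);
        return Collections.unmodifiableMap(args);
    }
}
```

Because the overwrite is unconditional, a prompt such as "show the bill for subscriber 2222..." can at most change the other arguments; the identity used for the query always comes from the verified token.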

Request Data Flow

End-to-end sequence for a customer asking "What is my current bill?"

1. Customer: "What is my current bill?" (Chat Widget / Voice)
2. API Gateway: validates the JWT, extracts subscriber_id from its claims, and checks the rate limit.
3. Chat Controller: receives the request, loads the session from Redis, and appends the message to the conversation history.
4. Guardrails (Pre): input sanitization; topic classification (billing = ✅ on-topic); no jailbreak detected.
5. LLM Orchestrator: builds the prompt (system instructions + conversation history + available MCP tools) and sends it to vLLM.
6. LLM (Mistral-7B): generates the tool call get_current_bill(subscriber_id); subscriber_id is injected server-side.
7. MCP Tool Server: executes get_current_bill, sets the RLS context, queries PostgreSQL, and returns ₹599, due Mar 15.
8. LLM (Round 2): receives the tool result and generates a natural response: "Your current bill is ₹599, due on March 15th."
9. Guardrails (Post): verifies the response matches the tool output; no PII leakage; business-aligned ✅.
10. Audit Logger: logs tool=get_current_bill, subscriber=SUB_XXXX, timestamp, result_hash (no raw PII in logs).
11. Customer: "Your current bill is ₹599, due on March 15th. Would you like a payment link?"

Voice Flow Addition

1. Customer: speaks "Mera current bill kitna hai?"; audio is sent via WebSocket (Opus, 16 kHz).
2. Voice Controller: buffers audio frames, sends them to Faster-Whisper, and receives the transcript plus the detected language (Hindi).
3. AI Orchestrator: same flow as text chat; the LLM responds in Hindi: "Aapka current bill ₹599 hai, due date 15 March"
4. Piper TTS: synthesizes Hindi audio and streams PCM chunks back to the client via WebSocket.

Technology Stack

Production-grade, open-source-first technology choices with AWS managed services for operations.

Application Layer

Category | Technology | Version
Runtime | Java 21 (GraalVM) | 21 LTS
Framework | Spring Boot | 3.4.x
AI Framework | LangChain4j + Spring AI | 1.0.x
MCP SDK | Spring MCP (spring-ai-mcp) | 1.0.x
WebSocket | Spring WebFlux | 6.2.x
Security | Spring Security + OAuth2 | 6.4.x
DB Access | Spring Data JPA + jOOQ | 3.19.x
Cache Client | Lettuce (Spring Data Redis) | 6.4.x
Rate Limiting | Bucket4j + Redis | 8.x
Resilience | Resilience4j | 2.2.x
API Docs | SpringDoc OpenAPI | 2.7.x

AI / ML Layer

Category | Technology | Details
LLM | Mistral-7B-Instruct / LLaMA-3-8B | Open-source, fine-tunable
LLM Serving | vLLM / Ollama | OpenAI-compatible API
ASR | Faster-Whisper (large-v3) | CTranslate2 optimized
TTS | Piper TTS / Coqui TTS | Neural, multilingual
Embeddings | all-MiniLM-L6-v2 | For RAG vector search
Vector DB | pgvector (PostgreSQL ext) | Embedded in primary DB
Guardrails | Custom Java + NeMo Guardrails | Pre/post LLM filters

Infrastructure

Service | AWS / Open Source
Container Orchestration | AWS EKS (Kubernetes)
Database | Amazon RDS PostgreSQL 16
Cache | Amazon ElastiCache (Redis 7)
Queue | Amazon SQS / RabbitMQ
Storage | Amazon S3 (invoices/audio)
Load Balancer | AWS ALB + NLB for WebSocket
GPU Instances | AWS g5.xlarge (A10G GPU)
IaC | Terraform + Helm Charts
CI/CD | GitHub Actions + ArgoCD

PoC Step-by-Step Guide

Prototype implementation roadmap — from database to working chatbot.

Step 1: Database Schema with RLS

Create PostgreSQL tables: subscribers, billing, invoices, plans. Enable Row-Level Security policies that scope all queries by subscriber_id set at connection time via SET app.current_subscriber_id.

    CREATE TABLE billing (
        id            UUID PRIMARY KEY DEFAULT gen_random_uuid(),
        subscriber_id UUID NOT NULL REFERENCES subscribers(id),
        amount        DECIMAL(10,2),
        due_date      DATE,
        status        VARCHAR(20)
    );

    ALTER TABLE billing ENABLE ROW LEVEL SECURITY;
    -- Table owners bypass RLS unless it is forced
    ALTER TABLE billing FORCE ROW LEVEL SECURITY;

    -- Rows are visible only when they belong to the subscriber set for this session
    CREATE POLICY billing_isolation ON billing
        USING (subscriber_id = current_setting('app.current_subscriber_id')::UUID);

Step 2: Spring Boot Project Setup

Initialize Spring Boot 3.4 project with dependencies: spring-ai, langchain4j, spring-data-jpa, spring-security, spring-websocket, spring-data-redis. Configure virtual threads for high concurrency.

    import org.springframework.boot.SpringApplication;
    import org.springframework.boot.autoconfigure.SpringBootApplication;

    @SpringBootApplication
    public class TelecomAiBotApplication {
        public static void main(String[] args) {
            SpringApplication.run(TelecomAiBotApplication.class, args);
        }
    }

    # application.yml
    spring:
      threads:
        virtual:
          enabled: true

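Step 3: MCP Tool Server Implementation

Build MCP-compliant tool server with @Tool annotations. Each tool maps to a business operation. The server exposes tool schemas via MCP discovery and handles JSON-RPC invocations.

    import org.springframework.ai.tool.annotation.Tool;
    import org.springframework.ai.tool.annotation.ToolParam;

    @Tool(name = "get_current_bill",
          description = "Fetches current billing cycle amount, due date, status")
    public BillResponse getCurrentBill(
            @ToolParam(description = "Subscriber ID") UUID subscriberId) {
        // subscriberId must be the value bound server-side from the verified JWT,
        // never one chosen by the user or the LLM (see Critical Security Invariant)
        rlsContextSetter.setSubscriber(subscriberId); // set the RLS context before querying
        return billingRepo.findCurrentBill(subscriberId);
    }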

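Step 4: AI Orchestrator with LangChain4j

Connect to vLLM-hosted Mistral model. Configure the agent with MCP tools, system prompt with telecom domain constraints, and conversation memory backed by Redis.

    import dev.langchain4j.service.AiServices;
    import java.time.Duration;

    // openAiModel: an OpenAI-compatible chat model pointed at the vLLM endpoint
    // RedisChatMemory: assumed to be a project-specific ChatMemory implementation
    var agent = AiServices.builder(TelecomAssistant.class)
            .chatModel(openAiModel) // named chatLanguageModel(...) before LangChain4j 1.0
            .tools(billingTools, accountTools, planTools)
            .chatMemoryProvider(id -> new RedisChatMemory(id, Duration.ofMinutes(30)))
            .build();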

Step 5: Guardrails & Security Pipeline

Implement pre-LLM filters (input validation, topic classification, jailbreak detection) and post-LLM validators (PII redaction, response-tool consistency check, business alignment).
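
For a PoC, the pre- and post-filters can start as simple rule-based checks and be replaced by trained classifiers later. The sketch below shows one such starting point; the keyword lists, the class name Guardrails, and the number-matching heuristic for response-tool consistency are all illustrative assumptions.

```java
import java.util.List;
import java.util.Locale;
import java.util.regex.MatchResult;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical rule-based guardrails for the PoC.
final class Guardrails {
    enum Verdict { ALLOW, OFF_TOPIC, JAILBREAK }

    // Illustrative keyword lists; substring matching is crude and will over-match.
    private static final List<String> TELECOM_TERMS =
            List.of("bill", "invoice", "plan", "data", "recharge", "payment", "network");
    private static final List<String> JAILBREAK_PHRASES =
            List.of("ignore previous instructions", "ignore all previous", "system prompt");
    private static final Pattern NUMBER = Pattern.compile("\\d+(?:\\.\\d+)?");

    // Pre-LLM filter: jailbreak phrases take priority over the topic check.
    static Verdict checkInput(String userMessage) {
        String text = userMessage.toLowerCase(Locale.ROOT);
        if (JAILBREAK_PHRASES.stream().anyMatch(text::contains)) {
            return Verdict.JAILBREAK;
        }
        if (TELECOM_TERMS.stream().noneMatch(text::contains)) {
            return Verdict.OFF_TOPIC;
        }
        return Verdict.ALLOW;
    }

    // Post-LLM filter: every number quoted in the response must occur in the tool output.
    // Limitation: "599" and "599.00" are treated as different values.
    static boolean isGrounded(String response, String toolOutput) {
        List<String> toolNumbers =
                NUMBER.matcher(toolOutput).results().map(MatchResult::group).toList();
        Matcher m = NUMBER.matcher(response);
        while (m.find()) {
            if (!toolNumbers.contains(m.group())) {
                return false;
            }
        }
        return true;
    }
}
```

A response quoting 599 and 15 passes when the tool returned an amount of 599 and a due date ending in 15, while a response quoting 499 is rejected, which catches the simplest class of hallucinated figures before they reach the customer.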

Step 6: Voice Pipeline Integration

Add WebSocket endpoint for audio streaming. Integrate Faster-Whisper for real-time speech-to-text and Piper TTS for response synthesis. Share the same AI orchestrator for text and voice.

Step 7: Authentication Flow (OTP)

Build OTP-based verification: customer provides phone number → bot sends OTP → customer confirms → JWT issued with subscriber_id → all subsequent tool calls use this verified identity.
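
The issue-and-verify part of this flow could be prototyped as below, using the 6-digit code and 5-minute TTL from the security architecture. This is a sketch under assumptions: the class name OtpService is hypothetical, storage is in-memory rather than Redis, SMS delivery and JWT issuance are left out, and a failed attempt consumes the code (a deliberately strict choice).

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.SecureRandom;
import java.time.Duration;
import java.time.Instant;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory OTP service for the PoC.
final class OtpService {
    private record Pending(String code, Instant expiresAt) {}

    private static final Duration TTL = Duration.ofMinutes(5);
    private final SecureRandom random = new SecureRandom();
    private final Map<String, Pending> pending = new ConcurrentHashMap<>();

    // Issues a zero-padded 6-digit code; the caller hands it to the SMS provider.
    String issue(String phoneNumber, Instant now) {
        String code = String.format("%06d", random.nextInt(1_000_000));
        pending.put(phoneNumber, new Pending(code, now.plus(TTL)));
        return code;
    }

    // Single-use check: the pending code is removed whether or not it matches.
    boolean verify(String phoneNumber, String code, Instant now) {
        Pending p = pending.remove(phoneNumber);
        return p != null
                && now.isBefore(p.expiresAt())
                && MessageDigest.isEqual(
                        p.code().getBytes(StandardCharsets.UTF_8),
                        code.getBytes(StandardCharsets.UTF_8)); // constant-time comparison
    }
}
```

Only after verify returns true would the service issue the short-lived JWT carrying the subscriber_id claim, so every later tool call inherits an identity that was proven through the registered phone number.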

Step 8: Docker Compose & Testing

Containerize all services. Docker Compose for local dev: PostgreSQL, Redis, vLLM/Ollama, Whisper, Piper, Spring Boot app. Integration tests verify data isolation across subscribers.