Sentiment Service

Overview

The Sentiment service is a hybrid HTTP + gRPC microservice that analyzes comment text and classifies it into one of four categories: positive, negative, neutral, or unrelated. It includes consumer registration, rate limiting, caching, and intentional failure simulation.

Dual Protocol Support:

gRPC (Port 3004): Primary interface for sentiment analysis and consumer registration
HTTP (Port 3005): Health checks, monitoring, and statistics endpoints

Purpose

Classify comment sentiment using keyword matching
Provide authenticated access via consumer registration
Enforce rate limits (100 requests/sec for registered consumers, 10 requests/sec for unregistered)
Cache results to reduce redundant analysis
Simulate real-world service failures
Expose health and monitoring endpoints via HTTP

Architecture

┌─────────────────────────────────────────────┐
│         Sentiment Service                   │
│                                             │
│  ┌──────────────┐      ┌──────────────┐   │
│  │ HTTP Server  │      │ gRPC Server  │   │
│  │ (Port 3005)  │      │ (Port 3004)  │   │
│  └──────┬───────┘      └──────┬───────┘   │
│         │                     │            │
│         ▼                     │            │
│  ┌─────────────┐      ┌───────┴────────┐  │
│  │   Health    │      │    Register    │  │
│  │  Endpoint   │      │    Consumer    │  │
│  │   /health   │      └────────────────┘  │
│  └─────────────┘              │            │
│                                │            │
│                        ┌───────┴────────┐  │
│                        │    Analyze     │  │
│                        │   Sentiment    │  │
│                        └───────┬────────┘  │
│                                │            │
│                   ┌────────────┴────────┐  │
│                   │                     │  │
│                   ▼                     ▼  │
│           ┌───────────┐         ┌──────────────┐
│           │   Rate    │         │   LRU Cache  │
│           │  Limiter  │         │  (500 items) │
│           └───────────┘         └──────────────┘
│                   │                     │  │
│                   └────────┬────────────┘  │
│                            ▼               │
│                  ┌──────────────────┐     │
│                  │  Keyword Matcher │     │
│                  │ (classification) │     │
│                  └──────────────────┘     │
└─────────────────────────────────────────────┘

Port Assignment:

3004: gRPC service (sentiment analysis, registration)
3005: HTTP service (health checks, monitoring)

gRPC Service Definition

Proto file: sentiment/proto/sentiment.proto

service SentimentService {
  rpc RegisterConsumer(RegisterRequest) returns (RegisterResponse);
  rpc AnalyzeSentiment(SentimentRequest) returns (SentimentResponse);
}
 
message RegisterRequest {
  string serviceName = 1;
}
 
message RegisterResponse {
  string consumerId = 1;
  int32 rateLimit = 2;
  string message = 3;
}
 
message SentimentRequest {
  string consumerId = 1;
  string text = 2;
  string timestamp = 3;
}
 
message SentimentResponse {
  string tag = 1;  // positive | negative | neutral | unrelated
  int32 processingTimeMs = 2;
}
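
For TypeScript callers, the messages above map onto plain object types. The sketch below is written by hand rather than generated from the proto file; `SENTIMENT_TAGS`, `isSentimentTag`, and `toSentimentResponse` are illustrative names, not part of the service.

```typescript
// Hand-written mirror of sentiment.proto (illustrative, not generated code).
const SENTIMENT_TAGS = ['positive', 'negative', 'neutral', 'unrelated'] as const
type SentimentTag = (typeof SENTIMENT_TAGS)[number]

interface RegisterRequest { serviceName: string }
interface RegisterResponse { consumerId: string; rateLimit: number; message: string }
interface SentimentRequest { consumerId: string; text: string; timestamp: string }
interface SentimentResponse { tag: SentimentTag; processingTimeMs: number }

// The proto declares `tag` as a plain string, so a client may want to
// narrow it before trusting the value.
function isSentimentTag(value: string): value is SentimentTag {
  return SENTIMENT_TAGS.some(tag => tag === value)
}

// Converts a loosely typed wire object into a checked SentimentResponse.
function toSentimentResponse(raw: { tag: string; processingTimeMs: number }): SentimentResponse {
  if (!isSentimentTag(raw.tag)) {
    throw new Error(`Unexpected sentiment tag: ${raw.tag}`)
  }
  return { tag: raw.tag, processingTimeMs: raw.processingTimeMs }
}
```

Narrowing at the boundary keeps the four-value contract in one place instead of scattering string comparisons through client code.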

Key Components

1. Registration Service

Location: sentiment/src/registration.service.ts

Purpose: Issues unique consumer IDs for authenticated access

Implementation:

import { Injectable } from '@nestjs/common'
import { v4 as uuidv4 } from 'uuid'
 
@Injectable()
export class RegistrationService {
  private registeredConsumers = new Map<string, RegisteredConsumer>()
 
  registerConsumer(serviceName: string): RegisterResponse {
    const consumerId = uuidv4()
    
    this.registeredConsumers.set(consumerId, {
      id: consumerId,
      serviceName,
      registeredAt: new Date(),
      requestCount: 0
    })
 
    return {
      consumerId,
      rateLimit: 100,  // SENTIMENT_AUTH_RATE_LIMIT
      message: `Registered successfully as ${consumerId}`
    }
  }
 
  isRegistered(consumerId: string): boolean {
    return this.registeredConsumers.has(consumerId)
  }
}

Why registration:

Distinguishes authenticated from unauthenticated requests
Enables per-consumer rate limiting
Could track usage statistics (request counts, etc.)
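
The usage-tracking idea could look roughly like the sketch below. It is a self-contained stand-in, not the service's RegistrationService: `ConsumerRegistry`, `recordRequest`, and `usageFor` are hypothetical names, and the ID generator is injected so the example needs no uuid dependency.

```typescript
interface RegisteredConsumer {
  id: string
  serviceName: string
  registeredAt: Date
  requestCount: number
}

// Hypothetical in-memory registry that also counts requests per consumer.
class ConsumerRegistry {
  private consumers = new Map<string, RegisteredConsumer>()
  private nextId: () => string

  constructor(nextId: () => string) {
    this.nextId = nextId
  }

  register(serviceName: string): string {
    const id = this.nextId()
    this.consumers.set(id, { id, serviceName, registeredAt: new Date(), requestCount: 0 })
    return id
  }

  // Returns false for unknown IDs so callers can treat them as unauthenticated.
  recordRequest(consumerId: string): boolean {
    const consumer = this.consumers.get(consumerId)
    if (consumer === undefined) return false
    consumer.requestCount += 1
    return true
  }

  usageFor(consumerId: string): number {
    return this.consumers.get(consumerId)?.requestCount ?? 0
  }
}
```

Calling `recordRequest` once per analysis request would be enough to populate the `requestCount` field that registration already initializes to 0.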

2. Rate Limiter Service

Location: sentiment/src/rate-limiter.service.ts

Two-tier rate limiting:

@Injectable()
export class RateLimiterService {
  private authenticatedLimit = 100  // per second
  private unauthenticatedLimit = 10  // per second
  
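  // registrationService and redis are assumed to be injected via the constructor (omitted here)
  async checkRateLimit(consumerId: string): Promise<boolean> {
    const isRegistered = this.registrationService.isRegistered(consumerId)
    
    const limit = isRegistered 
      ? this.authenticatedLimit 
      : this.unauthenticatedLimit
    
    const key = `ratelimit:${consumerId}`
    
    // INCR is atomic and returns the new count as a number, which avoids
    // comparing the string that GET returns and closes the read-then-write race.
    const current = await this.redis.incr(key)
    
    if (current === 1) {
      // Start the 1 second window once; re-arming EXPIRE on every request
      // would keep extending the window under steady traffic.
      await this.redis.expire(key, 1)
    }
    
    return current <= limit  // false = rate limit exceeded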
  }
}

Configuration:

SENTIMENT_AUTH_RATE_LIMIT=100
SENTIMENT_UNAUTH_RATE_LIMIT=10

Why two tiers:

Incentivizes registration
Prevents abuse from unregistered clients
Simulates API quota systems

Enforcement:

const allowed = await this.rateLimiter.checkRateLimit(consumerId)
 
if (!allowed) {
  throw new RpcException({
    code: status.RESOURCE_EXHAUSTED,
    message: 'Rate limit exceeded'
  })
}

3. LRU Cache

Configuration:

const sentimentCache = new LRUCache({
  max: 500,  // SENTIMENT_CACHE_SIZE
  ttl: 1000 * 60 * 60 * 24  // 24 hours
})

Usage:

const cacheKey = createHash('sha256').update(text).digest('hex')
const cached = sentimentCache.get(cacheKey)
 
if (cached) {
  return { tag: cached, processingTimeMs: 0 }
}
 
const tag = this.classifySentiment(text)
sentimentCache.set(cacheKey, tag)

Why caching:

The same comment text is often analyzed more than once (some duplicates slip past upstream deduplication because of timing)
Reduces CPU load on keyword matching
Faster response times
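
To make the eviction behavior concrete, here is a minimal LRU built on Map insertion order. It is a stand-in for illustration only: `TinyLru` is a hypothetical class, it has no TTL, and it keys on the raw text rather than a SHA-256 hash.

```typescript
// Minimal LRU cache: the Map's iteration order doubles as the recency order.
class TinyLru<V> {
  private entries = new Map<string, V>()
  private max: number

  constructor(max: number) {
    this.max = max
  }

  get(key: string): V | undefined {
    const value = this.entries.get(key)
    if (value === undefined) return undefined
    // Re-insert to mark the key as most recently used.
    this.entries.delete(key)
    this.entries.set(key, value)
    return value
  }

  set(key: string, value: V): void {
    this.entries.delete(key)
    this.entries.set(key, value)
    if (this.entries.size > this.max) {
      // The first key in iteration order is the least recently used.
      const oldest = this.entries.keys().next().value
      if (oldest !== undefined) this.entries.delete(oldest)
    }
  }

  get size(): number {
    return this.entries.size
  }
}
```

With a capacity of 500, the 501st distinct text pushes out whichever entry has gone longest without a read or write.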

4. Sentiment Classification

Keyword-based approach:

private classifySentiment(text: string): SentimentTag {
  const lowerText = text.toLowerCase()
  
  // Positive keywords
  const positiveWords = [
    'amazing', 'great', 'excellent', 'perfect', 'love',
    'best', 'wonderful', 'fantastic', 'delicious', 'outstanding'
  ]
  
  // Negative keywords
  const negativeWords = [
    'terrible', 'awful', 'worst', 'bad', 'horrible',
    'disappointing', 'disgusting', 'rude', 'cold', 'overpriced'
  ]
  
  // Unrelated patterns
  const unrelatedPatterns = [
    /what time/i, /parking/i, /wifi/i, /password/i,
    /reservation/i, /follow me/i, /link in bio/i, /crypto/i
  ]
  
  // Check unrelated first
  if (unrelatedPatterns.some(pattern => pattern.test(text))) {
    return 'unrelated'
  }
  
  const positiveCount = positiveWords.filter(word => 
    lowerText.includes(word)
  ).length
  
  const negativeCount = negativeWords.filter(word => 
    lowerText.includes(word)
  ).length
  
  if (positiveCount > negativeCount) return 'positive'
  if (negativeCount > positiveCount) return 'negative'
  return 'neutral'
}

Why keyword matching:

Simple and deterministic
Fast processing
Sufficient for PoC
Easily understandable

Limitations:

No context understanding
Sarcasm not detected
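Simple word counting: each keyword counts at most once, and substring matching also fires inside longer words (for example, "bad" in "badge")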
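
One way to see the word-counting limitation is to compare substring matching with whole-word matching. The sketch below is a possible refinement, not the service's current behavior; `countWholeWordMatches` and `countSubstringMatches` are hypothetical helper names.

```typescript
// Substring counting, as classifySentiment does it: each keyword counts once
// if it appears anywhere in the text, even inside a longer word.
function countSubstringMatches(text: string, keywords: string[]): number {
  const lower = text.toLowerCase()
  return keywords.filter(keyword => lower.includes(keyword)).length
}

// Whole-word counting: split on anything that is not a letter or apostrophe,
// then look each keyword up in the resulting word set.
function countWholeWordMatches(text: string, keywords: string[]): number {
  const words = new Set(
    text.toLowerCase().split(/[^a-z']+/).filter(word => word.length > 0)
  )
  return keywords.filter(keyword => words.has(keyword)).length
}
```

The trade-off is that whole-word matching no longer catches inflected forms such as "loved" for "love", so the keyword lists would need to grow.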

5. Failure Simulation

Configuration:

SENTIMENT_FAILURE_RATE=0.03125  # 1 in 32 requests

Implementation:

private shouldSimulateFailure(): boolean {
  return Math.random() < this.failureRate
}
 
async analyzeSentiment(request: SentimentRequest): Promise<SentimentResponse> {
  if (this.shouldSimulateFailure()) {
    throw new RpcException({
      code: status.UNAVAILABLE,
      message: 'Service temporarily unavailable'
    })
  }
  
  // Normal processing...
}

Why simulate failures:

Tests consumer retry mechanism
Simulates real-world service instability
Validates error handling paths
Populates dead-letter queue
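
A rough way to reason about how often retries are exhausted: if each attempt fails independently with probability p, all n attempts fail with probability p^n. The sketch below works that out; the retry budget (`attempts`) is a hypothetical parameter, since the consumer's actual retry count is not stated here, and only simulated failures are modeled.

```typescript
// Probability that every one of `attempts` independent tries hits a simulated failure.
function probabilityAllAttemptsFail(failureRate: number, attempts: number): number {
  let probability = 1
  for (let i = 0; i < attempts; i++) probability *= failureRate
  return probability
}

// Expected number of messages, out of `total`, that exhaust their retries.
function expectedDeadLetters(total: number, failureRate: number, attempts: number): number {
  return total * probabilityAllAttemptsFail(failureRate, attempts)
}
```

With the configured rate of 1 in 32 and an assumed budget of three attempts, about 1 message in 32,768 would fail every attempt, so a dead-letter queue fed only by simulated failures fills slowly unless the failure rate is raised.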

6. Processing Time Simulation

Simulates variable processing time:

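// Promise-based delay helper
const sleep = (ms: number) => new Promise(resolve => setTimeout(resolve, ms))
 
const processingTimeMs = text.length * 2  // 2ms per character
 
await sleep(processingTimeMs)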

Why simulate delay:

More realistic than instant responses
Tests timeout handling
Creates observable latency variations
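
The delay rule is simple enough to check by hand. A small sketch, where `MS_PER_CHARACTER`, `simulatedDelayMs`, and `maxTextLengthWithin` are illustrative names:

```typescript
const MS_PER_CHARACTER = 2

// Simulated delay for a given text, following the 2ms-per-character rule.
// String length counts UTF-16 code units, so most emoji count as two characters.
function simulatedDelayMs(text: string): number {
  return text.length * MS_PER_CHARACTER
}

// Hypothetical helper: the longest text whose simulated delay stays within a
// client-side timeout.
function maxTextLengthWithin(timeoutMs: number): number {
  return Math.floor(timeoutMs / MS_PER_CHARACTER)
}
```

For instance, a client timeout of 200ms (an assumed value) would start cutting off texts longer than 100 characters, before any network overhead is counted.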

Example:

"Great!" (6 chars) → 12ms
"Amazing food! Best restaurant..." (30 chars) → 60ms

Behavior Details

Registration Flow

Request:

{
  "serviceName": "consumer-service"
}

Response:

{
  "consumerId": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "rateLimit": 100,
  "message": "Registered successfully as a1b2c3d4-..."
}

The consumer stores the consumerId and includes it in every sentiment request.

Analysis Flow

Request:

{
  "consumerId": "a1b2c3d4-...",
  "text": "The food was amazing!",
  "timestamp": "2026-03-26T10:30:00Z"
}

Processing steps:

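Roll the simulated failure (1 in 32 chance to throw UNAVAILABLE; the analyzeSentiment snippet does this before any other processing)
Check if consumerId is registered
Check rate limit (100/sec if registered, 10/sec if not)
Check LRU cache for text hash; on a hit, return the cached tag with processingTimeMs: 0
If not cached, classify sentiment and cache the tag
Simulate processing time (42ms for 21 chars)
Return result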
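
The flow can be sketched as one synchronous function with its dependencies passed in. This is an illustration under assumptions, not the service's handler: `handleAnalyze`, `Deps`, and `Outcome` are hypothetical names, the failure check is placed first to match the analyzeSentiment snippet, its position relative to the rate-limit check is a guess, and the delay is reported rather than actually slept.

```typescript
type SentimentTag = 'positive' | 'negative' | 'neutral' | 'unrelated'

type Outcome =
  | { kind: 'ok'; tag: SentimentTag; processingTimeMs: number; cached: boolean }
  | { kind: 'error'; code: 'UNAVAILABLE' | 'RESOURCE_EXHAUSTED' }

// Hypothetical dependency bundle; the real service wires these as NestJS providers.
interface Deps {
  shouldFail: () => boolean
  allow: (consumerId: string) => boolean
  cache: Map<string, SentimentTag>
  classify: (text: string) => SentimentTag
}

function handleAnalyze(consumerId: string, text: string, deps: Deps): Outcome {
  // Simulated failure is rolled before any real work.
  if (deps.shouldFail()) return { kind: 'error', code: 'UNAVAILABLE' }

  // The limiter decides which tier applies to this consumerId.
  if (!deps.allow(consumerId)) return { kind: 'error', code: 'RESOURCE_EXHAUSTED' }

  // Cache hits report 0ms, as in the cache usage snippet.
  const hit = deps.cache.get(text)
  if (hit !== undefined) return { kind: 'ok', tag: hit, processingTimeMs: 0, cached: true }

  // Cache miss: classify, remember the tag, and report the 2ms-per-character delay.
  const tag = deps.classify(text)
  deps.cache.set(text, tag)
  return { kind: 'ok', tag, processingTimeMs: text.length * 2, cached: false }
}
```

Passing the failure roll and the limiter in as functions makes each branch easy to exercise in a unit test without randomness or Redis.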

Response:

{
  "tag": "positive",
  "processingTimeMs": 42
}

Rate Limiting Behavior

Scenario: Registered consumer

Request 1-100: Accepted
Request 101: Rejected (RESOURCE_EXHAUSTED)
[After 1 second]
Request 102-201: Accepted

Scenario: Unregistered consumer

Request 1-10: Accepted
Request 11: Rejected (RESOURCE_EXHAUSTED)
[After 1 second]
Request 12-21: Accepted

Error response:

gRPC error: code=RESOURCE_EXHAUSTED, message="Rate limit exceeded"
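
The two scenarios can be replayed without Redis using an in-memory fixed-window limiter with an injectable clock. This is a stand-in for illustration; `FixedWindowLimiter` is a hypothetical class, not the service's RateLimiterService.

```typescript
// Fixed-window limiter: each key gets a counter that resets once the window
// (1 second by default) has elapsed since the first request in that window.
class FixedWindowLimiter {
  private windows = new Map<string, { startedAt: number; count: number }>()
  private now: () => number
  private windowMs: number

  constructor(now: () => number, windowMs = 1000) {
    this.now = now
    this.windowMs = windowMs
  }

  // Returns true when the request fits in the current window for this key.
  allow(key: string, limit: number): boolean {
    const time = this.now()
    const current = this.windows.get(key)
    if (current === undefined || time - current.startedAt >= this.windowMs) {
      // No window yet, or the previous one expired: start a fresh window.
      this.windows.set(key, { startedAt: time, count: 1 })
      return limit >= 1
    }
    current.count += 1
    return current.count <= limit
  }
}
```

Injecting the clock lets a test jump forward one second instead of sleeping, which keeps the "after 1 second" step of each scenario deterministic.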

HTTP Endpoints

The sentiment service runs both HTTP and gRPC servers simultaneously on different ports.

Health Endpoint

URL: GET http://localhost:3005/health

Purpose:

Service health monitoring
Aggregated statistics from all internal services
Container readiness/liveness probes
Debugging and observability

Response:

{
  "status": "healthy",
  "service": "sentiment-grpc",
  "cache": {
    "size": 245,
    "maxSize": 500,
    "hitRate": 0.38
  },
  "rateLimiter": {
    "authenticatedLimit": 100,
    "unauthenticatedLimit": 10,
    "activeConsumers": 3
  },
  "registration": {
    "totalRegistered": 2,
    "consumers": [
      {
        "id": "a1b2c3d4...",
        "name": "consumer-service",
        "registeredAt": "2026-03-26T10:00:00Z"
      }
    ]
  },
  "timestamp": "2026-03-26T10:30:00Z"
}

Aggregated Metrics:

Cache stats: Current size, capacity, hit rate
Rate limiter stats: Active limits, consumer count
Registration stats: All registered consumers with timestamps
Service status: Overall health indicator
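
As an illustration of how the cache section of the payload might be derived, the sketch below computes the hit rate from hit and miss counters. The counter names and the hits / (hits + misses) definition are assumptions; the service's own bookkeeping is not shown here.

```typescript
interface CacheCounters {
  hits: number
  misses: number
  size: number
  maxSize: number
}

// Hit rate as a fraction in [0, 1], rounded to two decimals like the sample
// payload. Returns 0 before any lookups to avoid dividing by zero.
function hitRate(counters: CacheCounters): number {
  const lookups = counters.hits + counters.misses
  if (lookups === 0) return 0
  return Math.round((counters.hits / lookups) * 100) / 100
}

// Builds the "cache" object of the health response from raw counters.
function cacheSection(counters: CacheCounters): { size: number; maxSize: number; hitRate: number } {
  return { size: counters.size, maxSize: counters.maxSize, hitRate: hitRate(counters) }
}
```

Guarding the zero-lookup case matters for readiness probes, which typically hit the endpoint before any sentiment request has arrived.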

Why use HTTP instead of gRPC:

Easy to test with curl/Postman
Standard for Docker/Kubernetes health checks
Human-readable JSON output
No proto compilation needed for monitoring

Docker Healthcheck:

healthcheck:
  test: ["CMD", "curl", "-f", "http://localhost:3005/health"]
  interval: 10s
  timeout: 5s
  retries: 3

Hybrid Architecture Benefits

gRPC for Core Business Logic:

High performance binary protocol
Type-safe service contracts
Streaming support (if needed later)
Efficient for service-to-service calls

HTTP for Operations:

Easy monitoring and debugging
Standard healthcheck protocols
Browser-accessible endpoints
No special client tooling needed

Single Process, Dual Protocols:

Both servers run in same NestJS application
Shared service layer (cache, rate limiter, registration)
Separate ports keep operational traffic apart from gRPC traffic
Minimal overhead

Performance Characteristics

Throughput:

With cache hits: 1000+ requests/second
Without cache: 200-500 requests/second

Latency:

Cached: < 5ms
Uncached: 10-200ms (depends on text length)
Failed (simulated): ~0ms (exception thrown before any processing)

Cache effectiveness:

Hit rate: ~30-40% in typical scenarios
Varies based on duplicate rate from producer

Monitoring

Logs:

[Sentiment] Registered consumer: consumer-service (ID: a1b2c3d4-...)
[Sentiment] Analyzing sentiment for: "Great food!"
[Sentiment] Cache hit for text hash: 3a2f1b...
[Sentiment] Rate limit exceeded for consumer: unregistered
[Sentiment] Simulated failure (1 in 32)

Metrics to track:

Total requests
Registered vs unregistered requests
Cache hit rate
Rate limit violations
Simulated failures
Average processing time

Development

Run locally:

cd sentiment
pnpm dev

Test gRPC calls:

Using grpcurl:

# Register
grpcurl -plaintext -d '{"serviceName":"test"}' \
  localhost:3004 SentimentService/RegisterConsumer
 
# Analyze
grpcurl -plaintext -d '{"consumerId":"abc","text":"Great!"}' \
  localhost:3004 SentimentService/AnalyzeSentiment

Environment variables:

SENTIMENT_AUTH_RATE_LIMIT=100
SENTIMENT_UNAUTH_RATE_LIMIT=10
SENTIMENT_CACHE_SIZE=500
SENTIMENT_FAILURE_RATE=0.03125
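
A minimal sketch of reading these variables with fallbacks. It assumes the values listed above act as defaults, which is not confirmed here; `loadSentimentConfig`, `readNumber`, and the config field names are hypothetical.

```typescript
interface SentimentConfig {
  authRateLimit: number
  unauthRateLimit: number
  cacheSize: number
  failureRate: number
}

type Env = Record<string, string | undefined>

// Parses a numeric variable, falling back when it is unset, blank, or not a
// finite number.
function readNumber(env: Env, name: string, fallback: number): number {
  const raw = env[name]
  if (raw === undefined || raw.trim() === '') return fallback
  const parsed = Number(raw)
  return Number.isFinite(parsed) ? parsed : fallback
}

function loadSentimentConfig(env: Env): SentimentConfig {
  const failureRate = readNumber(env, 'SENTIMENT_FAILURE_RATE', 0.03125)
  return {
    authRateLimit: readNumber(env, 'SENTIMENT_AUTH_RATE_LIMIT', 100),
    unauthRateLimit: readNumber(env, 'SENTIMENT_UNAUTH_RATE_LIMIT', 10),
    cacheSize: readNumber(env, 'SENTIMENT_CACHE_SIZE', 500),
    // A probability outside [0, 1] is clamped rather than rejected.
    failureRate: Math.min(1, Math.max(0, failureRate)),
  }
}
```

In a Node process this would be called as `loadSentimentConfig(process.env)`; taking the environment as a parameter keeps the function testable with plain objects.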

Testing

Test Registration

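# Register a consumer; the response should include a consumerId
grpcurl -plaintext -d '{"serviceName":"test"}' \
  localhost:3004 SentimentService/RegisterConsumer
 
# Then confirm that registration.totalRegistered increased
curl http://localhost:3005/health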

Test Rate Limiting

Send 150 requests rapidly:

for i in {1..150}; do
  grpcurl -plaintext -d '{"consumerId":"test","text":"Test"}' \
    localhost:3004 SentimentService/AnalyzeSentiment
done
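# "test" is not a registered ID, so the 10/sec unauthenticated limit applies:
# expect RESOURCE_EXHAUSTED once more than 10 requests land in the same second.
# Use a consumerId returned by RegisterConsumer to exercise the 100/sec limit.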

Test Cache

# Send same text twice
grpcurl -plaintext -d '{"consumerId":"test","text":"Same text"}' ...
# Second call should be faster (processingTimeMs: 0)

Test Failure Simulation

# Send 100 requests
# ~3 should fail with UNAVAILABLE
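
Because failures are random, a run of 100 requests will not always show exactly 3. Treating each request as an independent trial, the count follows a binomial distribution; the sketch below computes the expectation and the chance of any given count. Function names are illustrative.

```typescript
// Expected number of simulated failures in n requests.
function expectedFailures(n: number, failureRate: number): number {
  return n * failureRate
}

// Probability of seeing exactly k failures in n independent requests.
function binomialProbability(n: number, k: number, failureRate: number): number {
  // Build the binomial coefficient incrementally to avoid large factorials.
  let coefficient = 1
  for (let i = 1; i <= k; i++) {
    coefficient = (coefficient * (n - k + i)) / i
  }
  return coefficient * Math.pow(failureRate, k) * Math.pow(1 - failureRate, n - k)
}
```

At a rate of 0.03125 the expectation over 100 requests is 3.125, and there is roughly a 4% chance of seeing no failures at all, so an automated test should assert on a range rather than an exact count.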

Next Steps

API & SSE - How processed data is served
Consumer Service - How sentiment is requested
Frontend - How tags are displayed