Orbit Social Hub — Real-time Messaging Architecture
A messaging fabric that turns scattered conversations into a single, observable surface. Built to scale horizontally and degrade gracefully under spike load.
- Role: Platform Architect
- Year: 2024
- Stack: NestJS, WebSockets, Redis Streams, Postgres, Kubernetes
Context
A B2B communications product needed to consolidate inbound channels (web chat, social DMs, internal notifications) into one operator surface — without losing the real-time feel that made the product worth using.
Approach
- WebSockets at the edge, Redis Streams in the middle. A clean split between fan-out and durability.
- Idempotent everything. Every event carries a stable hash, so replays are safe and observable.
- Backpressure as a feature. When an operator gets behind, the UI tells them and the system protects itself instead of dropping silently.
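The idempotency and backpressure ideas above can be sketched together as a bounded per-operator queue. This is a minimal illustration, not the production code: the event shape, `stableHash`, `eventKey`, `OperatorQueue`, and the high-water mark are all hypothetical names chosen for the example, and the 32-bit FNV-1a-style hash stands in for whatever stronger hash (SHA-256, for instance) a real deployment would likely use.

```typescript
// Hypothetical event shape; field names are assumptions for illustration.
interface InboundEvent {
  tenantId: string;
  channel: "web_chat" | "social_dm" | "notification";
  sourceMessageId: string; // id assigned by the originating channel
  body: string;
}

// FNV-1a-style 32-bit hash over UTF-16 code units. Deterministic, so the
// same input always yields the same key. A stand-in for a stronger hash.
function stableHash(input: string): string {
  let h = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    h ^= input.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h.toString(16).padStart(8, "0");
}

// The key depends only on identifying fields, never on the body,
// so a replayed copy of an event maps to the same key.
function eventKey(e: InboundEvent): string {
  return stableHash([e.tenantId, e.channel, e.sourceMessageId].join("\u0000"));
}

type EnqueueResult = "accepted" | "duplicate" | "rejected_backpressure";

// Bounded queue that deduplicates by event key and reports pressure
// to the caller instead of dropping silently.
class OperatorQueue {
  private readonly seen = new Set<string>(); // unbounded here; a real system would expire keys
  private readonly pending: InboundEvent[] = [];
  private readonly highWaterMark: number;

  constructor(highWaterMark: number) {
    this.highWaterMark = highWaterMark;
  }

  enqueue(e: InboundEvent): EnqueueResult {
    const key = eventKey(e);
    if (this.seen.has(key)) return "duplicate";
    // Checked before recording the key, so a rejected event can be retried later.
    if (this.pending.length >= this.highWaterMark) return "rejected_backpressure";
    this.seen.add(key);
    this.pending.push(e);
    return "accepted";
  }

  dequeue(): InboundEvent | undefined {
    return this.pending.shift();
  }

  // Values an operator UI could surface directly.
  get lag(): number {
    return this.pending.length;
  }

  get isBehind(): boolean {
    return this.pending.length >= this.highWaterMark;
  }
}
```

Returning an explicit result from `enqueue` is the point of the design: the caller always learns whether an event was accepted, recognised as a replay, or pushed back, and the `lag` and `isBehind` getters give the UI something concrete to show an operator who has fallen behind.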
Engineering decisions worth talking about
- A connection broker that holds tenant state in memory but writes session checkpoints to Redis every few seconds — fast normal path, no surprises during pod rotation.
- Lag and error-budget metrics piped directly into the operator UI: the team that runs the system sees what the system feels.
- A small library that lets product engineers add new event types without touching the realtime substrate.
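One way a library like the one in the last bullet could work is a typed registry: product engineers register an event definition, and the realtime layer only ever sees a generic envelope. The sketch below assumes that shape. `Envelope`, `EventDefinition`, `EventRegistry`, and the `typing_indicator` event are hypothetical names for illustration, not the project's actual API.

```typescript
// Generic envelope the realtime layer would carry; shape is an assumption.
interface Envelope {
  type: string;
  payload: unknown;
}

// What a product engineer supplies for a new event type.
interface EventDefinition<T> {
  type: string;
  parse(payload: unknown): T; // validates the payload; throws on bad input
  handle(event: T): void;
}

class EventRegistry {
  private readonly defs = new Map<string, EventDefinition<unknown>>();

  register<T>(def: EventDefinition<T>): void {
    if (this.defs.has(def.type)) {
      throw new Error(`Event type already registered: ${def.type}`);
    }
    this.defs.set(def.type, def as EventDefinition<unknown>);
  }

  // Returns false for unknown types so the caller can count or log them.
  dispatch(envelope: Envelope): boolean {
    const def = this.defs.get(envelope.type);
    if (!def) return false;
    def.handle(def.parse(envelope.payload));
    return true;
  }
}

// Example: adding a new event type without touching the registry itself.
interface TypingIndicator {
  conversationId: string;
  operatorId: string;
}

const typingEvents: TypingIndicator[] = [];
const registry = new EventRegistry();

registry.register<TypingIndicator>({
  type: "typing_indicator",
  parse(payload: unknown): TypingIndicator {
    const p = payload as Partial<TypingIndicator> | null;
    if (!p || typeof p.conversationId !== "string" || typeof p.operatorId !== "string") {
      throw new Error("Invalid typing_indicator payload");
    }
    return { conversationId: p.conversationId, operatorId: p.operatorId };
  },
  handle(event: TypingIndicator): void {
    typingEvents.push(event);
  },
});
```

With this shape, the transport code calls only `registry.dispatch(envelope)`. Validation and handling live next to the event definition, which is what keeps new event types out of the realtime substrate.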
Outcome
The system handled launch spikes that would have crushed the previous architecture. The on-call rotation became uneventful. The product kept the live, snappy feel users expected.
Outcomes
- Sustained 10× peak throughput during launch events without changing the deployment topology.
- Sub-300ms end-to-end latency at the 99th percentile across regions.
- Operator dashboards that surfaced live throughput, lag, and error budgets at a glance.