
# The Sub-50µs Cloud Lie

Why cloud vendors' latency claims don't match reality for trading. Real measurements and the hard limits of cloud infrastructure.

Intermediate · 20 min read · Expert Version →

## 🎯 What You'll Learn

- Understand why vendor latency claims are misleading
- Learn how to measure real trading latency
- Identify cloud infrastructure limitations
- Know when cloud works and when it doesn't

## The Marketing vs Reality Gap

Cloud vendors claim “sub-millisecond latency.” Your trading system measures 5-50ms. What’s going on?

```text
AWS claims: "Single-digit millisecond latency"
Your measurement: 15ms to Binance
Reality: Both are "correct" - but measuring different things
```

This lesson exposes the gap between marketing claims and trading reality.

---

## What "Latency" Actually Means

Vendors measure **inter-VM latency** within the same datacenter:

```text
EC2 instance-A → EC2 instance-B (same AZ)
AWS claims: ~50-100µs
```

What you actually need:

```text
Your EC2 → Internet → Exchange → Processing → Response
Reality: 5-50ms depending on exchange
```

**Marketing latency ≠ application latency**

---

## Measuring Real Latency

> **The hypervisor adds 5-20µs of jitter to every network operation.** You share physical hardware with other tenants. When they spike, you spike. This variability is invisible in averages but destroys your p99 latency.

Dedicated hardware doesn't have this problem.
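A quick simulation makes the point concrete. The numbers below are illustrative, not measurements: a ~50µs baseline where just 2% of operations also hit a 5-20µs hypervisor stall. The average barely moves; the p99 does.

```python
import random

def percentile(sorted_samples, p):
    """Nearest-rank percentile of an ascending-sorted list."""
    idx = max(0, int(len(sorted_samples) * p / 100) - 1)
    return sorted_samples[idx]

random.seed(42)

# Baseline: ~50µs wire latency plus a little measurement noise
baseline = [50 + abs(random.gauss(0, 2)) for _ in range(10_000)]

# Same workload, but 2% of operations also hit a 5-20µs hypervisor stall
jittery = [s + random.uniform(5, 20) if random.random() < 0.02 else s
           for s in baseline]

for name, samples in (("dedicated", baseline), ("shared", jittery)):
    s = sorted(samples)
    print(f"{name:>9}: avg {sum(s)/len(s):6.1f}µs  p99 {percentile(s, 99):6.1f}µs")
```

The spikes land in only 2% of samples, so the mean shifts by a fraction of a microsecond while the tail absorbs the full stall.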

### Measure VM-to-VM (What AWS Claims)

```bash
# Install sockperf on two EC2 instances
sudo apt install sockperf

# Server side
sockperf server -i 0.0.0.0 -p 12345

# Client side - measure latency
sockperf ping-pong -i <server-ip> -p 12345 --pps=max -t 60

# Typical AWS result: avg 60µs, p99 150µs
```

### Measure to Exchange (What You Actually Get)

```python
import time
import requests

def measure_exchange_latency(url, n=100):
    latencies = []
    for _ in range(n):
        start = time.perf_counter()
        requests.get(url, timeout=10)
        latencies.append((time.perf_counter() - start) * 1000)

    latencies.sort()
    print(f"Min: {latencies[0]:.1f}ms")
    print(f"Avg: {sum(latencies)/len(latencies):.1f}ms")
    print(f"P99: {latencies[int(n * 0.99) - 1]:.1f}ms")  # nearest-rank p99
    print(f"Max: {latencies[-1]:.1f}ms")

# Run from EC2
measure_exchange_latency("https://api.binance.com/api/v3/time")
# Typical: Min 15ms, Avg 25ms, P99 80ms
```

---

## Where Cloud Latency Comes From

| Source | Contribution | Fixable? |
|--------|--------------|----------|
| Physical distance | 1-50ms | Move to colo |
| Internet routing | 1-20ms | Pay for direct connect |
| Hypervisor overhead | 5-20µs | Bare metal instance |
| Kernel network stack | 10-50µs | Kernel tuning |
| Your application | Variable | Code optimization |

**90% of your latency is location + network path.** Optimizing code won't fix this.

---

## The Noisy Neighbor Problem

Shared infrastructure means shared variability:

```text
Normal operation:
  Your latency: 50µs

Neighbor running ML training:
  Your latency: 200µs (CPU steal)

Neighbor doing heavy I/O:
  Your latency: 500µs (network contention)
```

This variability is **random and unpredictable**. Your p99 suffers.

### Measuring CPU Steal

```bash
# Check if you're losing CPU to other tenants.
# 'st' (steal) is column 17 in current procps vmstat output; check the
# header row if your version differs.
vmstat 1 | awk 'NR > 2 {print "steal:", $17"%"}'

# >0% steal means others are taking your CPU time
```
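The same check can be scripted by diffing two `/proc/stat` CPU snapshots. The field layout below follows proc(5); the snapshot values are made up for illustration, and in real use you would read the first line of `/proc/stat` twice with a sleep in between.

```python
def steal_percent(before, after):
    """Steal time as a share of total CPU time between two /proc/stat
    'cpu' lines. Fields per proc(5): user nice system idle iowait irq
    softirq steal guest guest_nice - steal is the 8th value."""
    b = [int(x) for x in before.split()[1:]]
    a = [int(x) for x in after.split()[1:]]
    delta_total = sum(a) - sum(b)
    delta_steal = a[7] - b[7]
    return 100.0 * delta_steal / delta_total if delta_total else 0.0

# Illustrative snapshots taken one second apart (made-up numbers)
t0 = "cpu 1000 0 500 8000 100 0 50 20 0 0"
t1 = "cpu 1080 0 540 8870 105 0 55 50 0 0"
print(f"steal: {steal_percent(t0, t1):.1f}%")  # steal: 2.9%
```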

---

## AWS Instance Selection

| Instance Type | Latency Profile | Monthly Cost |
|---------------|-----------------|--------------|
| t3.medium | High variability, burst | $30 |
| c6i.2xlarge | Better, still shared | $250 |
| c6i.metal | Bare metal, no hypervisor | $3,000 |
| p4d.24xlarge | Dedicated network | $30,000+ |

**For trading:** Minimum c5n/c6i.xlarge with Enhanced Networking.

---

## Common Misconceptions

**Myth:** "Faster instance types = lower latency."
**Reality:** Instance type affects CPU, not network latency. A t3.micro and p4d.24xlarge have similar network latency to external destinations.

**Myth:** "AWS Direct Connect solves all latency problems."
**Reality:** Direct Connect reduces internet routing variability (~5-10ms savings) but doesn't fix hypervisor jitter or distance.

**Myth:** "My cloud setup is fast enough because average latency is low."
**Reality:** Averages hide tail latency. Your p99 or p99.9 is what matters for trading. One 500ms spike per minute is catastrophic.
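The arithmetic behind that last myth is easy to verify. Take 59 requests at a steady 20ms and one 500ms spike in a minute (illustrative numbers):

```python
import math

# 60 requests in one minute: 59 at a normal 20ms, one 500ms spike
samples = sorted([20.0] * 59 + [500.0])
avg = sum(samples) / len(samples)
p99 = samples[math.ceil(0.99 * len(samples)) - 1]   # nearest-rank p99
print(f"avg {avg:.0f}ms  p99 {p99:.0f}ms")          # avg 28ms  p99 500ms
```

A dashboard showing the 28ms average looks healthy; the order that hit the 500ms sample did not experience 28ms.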

---

## When Cloud Makes Sense

### Cloud is Fine For:
- Swing trading (minutes to days)
- Backtesting and research
- Non-latency-sensitive strategies
- Starting out / proving concepts

### Cloud is Not Fine For:
- Market making
- HFT strategies
- Arbitrage (especially cross-exchange)
- Any strategy where you compete on speed

---

## Honest Latency Budget

If you're serious about cloud trading:

```text
Fixed costs (can't optimize):
  Distance to exchange: 10-30ms
  Internet routing: 5-15ms
  TLS handshake: 5-10ms

Variable costs (can optimize):
  Application code: 0.1-10ms
  Network stack: 0.01-0.1ms

Realistic total: 25-70ms

Your competitor in colo: 0.1-1ms
```

You're 25-700x slower. Accept it or move to colo.
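The 25-700x figure is just the ratio of the two budgets, comparing each side's best and worst case:

```python
# Sanity check on the budget above (all values in ms)
cloud_best, cloud_worst = 25.0, 70.0   # realistic cloud round-trip
colo_best, colo_worst = 0.1, 1.0       # colocated competitor

print(f"best case:  {cloud_best / colo_worst:.0f}x slower")   # 25x slower
print(f"worst case: {cloud_worst / colo_best:.0f}x slower")   # 700x slower
```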

---

## Practice Exercises

### Exercise 1: Measure Your Reality
```bash
# From your trading server, measure to your exchange
while true; do
  curl -w "%{time_total}\n" -o /dev/null -s https://api.exchange.com/time
  sleep 1
done | tee latency.log
```
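A short Python helper (hypothetical, written for the log format above: one curl `time_total` value in seconds per line) turns that log into the percentiles that matter:

```python
import os

def summarize(path="latency.log"):
    """Nearest-rank latency summary of curl time_total samples
    (seconds, one per line)."""
    with open(path) as f:
        ms = sorted(float(line) * 1000 for line in f if line.strip())
    n = len(ms)
    return {
        "min": ms[0],
        "p50": ms[n // 2],
        "p99": ms[max(0, int(n * 0.99) - 1)],
        "max": ms[-1],
    }

if os.path.exists("latency.log"):
    stats = summarize()
    print("  ".join(f"{k} {v:.1f}ms" for k, v in stats.items()))
```

Look at p99 and max, not the average, for the reasons covered earlier in this lesson.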

### Exercise 2: Check for Steal Time
```bash
# Monitor for 1 hour ('st' is column 17 in current procps vmstat;
# check the header row if your version differs)
vmstat 1 3600 | awk 'NR > 2 {print $17}' > steal.log
# Any non-zero values?
```

### Exercise 3: Compare Instance Types
If budget allows:

- Spin up a c6i.xlarge and a c6i.metal
- Run the same latency test on both
- Compare p99 latency

---

## Key Takeaways

1. **Vendor claims measure the wrong thing** - VM-to-VM ≠ to-exchange
2. **Hypervisor adds jitter** - shared infrastructure = shared variability
3. **Distance dominates** - no amount of tuning fixes 10ms of physics
4. **Know your use case** - cloud works for some strategies, not others

## What's Next?

Want to go deeper? See the Expert Version linked at the top of this lesson.
