aegis · June 3, 2026 · 4 min read

Network Intrusion Detection Without Deep Packet Inspection

CAI Technology · Last reviewed: 6/3/2026

Roughly 95% of HTTPS connections rode TLS 1.3 or QUIC by late 2025 (Cloudflare Radar Year in Review 2024), which means the signature-based IDS still pattern-matching payloads on your perimeter is scanning ciphertext it can no longer read. A paper posted to arXiv this month proposes a different angle: treat the sequence of packet headers in a flow as a language, and learn what “benign” sounds like.

The model is called PLM-NIDS. It consumes L3/L4 metadata only (packet length, inter-arrival time, TTL, TCP flag combinations, hashed source/destination ports) and feeds it as token sequences into RWKV-4, a recurrent language-model architecture. Training used 344,232 unlabelled benign flows. Inference scores per-flow perplexity: high perplexity means the flow does not sound like normal traffic on this network.
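To make the tokenization step concrete, here is a minimal sketch of turning one packet's header metadata into discrete tokens. It is not the preprint's tokenizer: the bucket edges, the token names, the `Packet` fields, and the 64-bucket port hash are all assumptions chosen for illustration.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    """Header-only view of one packet (hypothetical field names)."""
    length: int       # bytes on the wire
    iat_ms: float     # gap since the previous packet in the same flow
    ttl: int
    tcp_flags: str    # e.g. "S", "SA", "PA"
    src_port: int
    dst_port: int


# Assumed bucket edges; a real deployment would fit these to its own traffic.
LEN_EDGES = [64, 128, 256, 512, 1024, 1500]
IAT_EDGES = [1, 5, 10, 50, 100, 500]
TTL_EDGES = [32, 64, 128]


def bucket(value: float, edges: list[float]) -> int:
    """Index of the first edge that `value` does not exceed, else len(edges)."""
    for i, edge in enumerate(edges):
        if value <= edge:
            return i
    return len(edges)


def hash_port(port: int, buckets: int = 64) -> int:
    """Map a port number to a small, stable bucket id."""
    digest = hashlib.sha256(str(port).encode()).digest()
    return int.from_bytes(digest[:4], "big") % buckets


def tokenize(pkt: Packet) -> list[str]:
    """Emit one token per metadata field for a single packet."""
    return [
        f"LEN_{bucket(pkt.length, LEN_EDGES)}",
        f"IAT_{bucket(pkt.iat_ms, IAT_EDGES)}",
        f"TTL_{bucket(pkt.ttl, TTL_EDGES)}",
        f"FLG_{pkt.tcp_flags}",
        f"SP_{hash_port(pkt.src_port)}",
        f"DP_{hash_port(pkt.dst_port)}",
    ]


def tokenize_flow(packets: list[Packet]) -> list[str]:
    """Concatenate per-packet tokens in arrival order."""
    return [tok for pkt in packets for tok in tokenize(pkt)]
```

Bucketing continuous fields keeps the vocabulary small, and hashing ports avoids a 65,536-entry vocabulary at the cost of collisions between unrelated services.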

The numbers from the PLM-NIDS preprint are worth quoting: PR-AUC 0.93 zero-shot, 0.94 after supervised fine-tuning on labelled attacks. That is competitive with classical supervised baselines that need full labelled corpora, and it is achieved without ever cracking a packet open.
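For readers who want to reproduce a PR-AUC figure on their own labelled flows, one common estimator is average precision. The sketch below assumes that estimator; the source does not say which one the preprint used, and the function name is illustrative.

```python
def average_precision(labels: list[int], scores: list[float]) -> float:
    """Average precision: mean of precision@k over the ranks k of true attacks.

    labels: 1 for attack, 0 for benign.
    scores: anomaly scores, higher means more suspicious (e.g. perplexity).
    Tied scores are kept in input order (Python's sort is stable).
    """
    if len(labels) != len(scores):
        raise ValueError("labels and scores must have the same length")
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("need at least one positive label")

    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            precision_sum += hits / rank  # precision at this recall step
    return precision_sum / total_pos
```

PR-AUC is the more informative metric here because attacks are rare: unlike ROC-AUC, it is not inflated by the large number of correctly ignored benign flows.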

Why this matters operationally

Three things change for an aegis-aligned detection stack:

Encryption invariance. TLS 1.3 with Encrypted Client Hello (ECH) hides the SNI extension; QUIC encrypts most of the transport handshake (RFC 9001). Header-only models do not care.
Cold-start posture. Zero-shot PR-AUC 0.93 means a new SOC can stand up baseline detection on captured benign traffic alone, with no labelled attack corpus required up front. This matters for the NIS2 obligation (Art. 23) on essential and important entities to issue an early warning within 24 hours of becoming aware of a significant incident.
Privacy alignment. No DPI means no payload retention, which simplifies GDPR Art. 6(1)(f) balancing tests and aligns with ENISA’s 2024 work on encrypted-traffic analytics.

2026-05-31T14:22:08Z plm-nids: flow_id=8af3 src=10.4.2.17:54221 dst=104.18.32.7:443
                     proto=tcp len_seq=[60,52,1460,1460,1460,89] iat_ms=[12,8,4,4,4,210]
                     perplexity=87.4 threshold=42.0 verdict=ANOMALY action=mirror_to_soc
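The perplexity and threshold fields in the record above can be related with a short sketch. It assumes the standard definition of perplexity, the exponential of the mean negative log-likelihood per token, and a simple greater-than comparison against the threshold. The `BENIGN` label and both function names are assumptions; only `ANOMALY` appears in the sample record.

```python
import math


def perplexity(token_log_probs: list[float]) -> float:
    """exp of the mean negative natural-log probability per token."""
    if not token_log_probs:
        raise ValueError("flow produced no tokens")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


def verdict(ppl: float, threshold: float) -> str:
    """Flag a flow whose perplexity exceeds the configured threshold."""
    return "ANOMALY" if ppl > threshold else "BENIGN"
```

With the values from the sample record, `verdict(87.4, 42.0)` returns `"ANOMALY"`. A model that assigns every token probability 0.5 yields a perplexity of 2, which is a convenient sanity check for any implementation.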

Architecture in one diagram

flowchart TD
    A[NIC mirror / SPAN port] --> B[Flow assembler<br/>5-tuple bucketing]
    B --> C[Metadata tokenizer<br/>len, IAT, TTL, flags]
    C --> D[RWKV-4 inference<br/>per-flow perplexity]
    D --> E{Perplexity > τ?}
    E -->|yes| F[SOAR enrichment<br/>analyst queue]
    E -->|no| G[Drop, retain flow hash only]
    classDef good fill:#dcfce7,stroke:#10b981
    classDef bad fill:#fee2e2,stroke:#ef4444
    classDef neutral fill:#f1f5f9,stroke:#94a3b8
    class F bad
    class G good
    class A,B,C,D neutral
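The first and last boxes of the diagram, 5-tuple bucketing and retaining only a flow hash, can be sketched as follows. This is an illustrative sketch, not the aegis implementation: `FiveTuple`, `FlowAssembler`, and `flow_hash` are hypothetical names, the key is unidirectional, and flow timeouts are omitted.

```python
import hashlib
from collections import defaultdict
from typing import NamedTuple


class FiveTuple(NamedTuple):
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    proto: str


class FlowAssembler:
    """Groups packet metadata by 5-tuple (unidirectional, no expiry)."""

    def __init__(self) -> None:
        self._flows: dict[FiveTuple, list[tuple[int, int]]] = defaultdict(list)

    def add(self, key: FiveTuple, timestamp_ms: int, length: int) -> None:
        """Record one packet's arrival time and length under its flow key."""
        self._flows[key].append((timestamp_ms, length))

    def features(self, key: FiveTuple) -> tuple[list[int], list[int]]:
        """Return (lengths, inter-arrival times in ms) in arrival order."""
        pkts = sorted(self._flows.get(key, []))
        lengths = [length for _, length in pkts]
        iats = [later[0] - earlier[0] for earlier, later in zip(pkts, pkts[1:])]
        return lengths, iats


def flow_hash(key: FiveTuple) -> str:
    """Short stable identifier to retain for flows that are dropped."""
    raw = "|".join(str(field) for field in key).encode()
    return hashlib.sha256(raw).hexdigest()[:16]
```

A production assembler would also canonicalize the key so that both directions of a connection land in the same bucket, and would expire idle flows to bound memory.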

The caveats nobody puts in the abstract

Header-only models inherit two known failure modes. First, low-and-slow exfiltration over long-lived QUIC sessions can mimic benign streaming flow shapes; exfiltration over an established C2 channel (MITRE ATT&CK T1041) falls squarely in this gap. Second, perplexity-based detectors degrade against adversarial flow shaping: an attacker who paces beacons to match the learned IAT distribution stays under the threshold.

CAI Technology’s position: header-language models are a high-recall first stage, not the only detector on the wire. Pair them with agentic SOC triage and you get the recall of language modeling with the precision of an analyst loop. Reserve DPI for the 0.1% of flows the language model flags as off-distribution — the only economically defensible posture once your fleet is fully encrypted. See how we wire this into our aegis detection pipeline.

