
LLM Infrastructure and Data Protection

This document describes the infrastructure, data flow, and security architecture of the Large Language Model (LLM) inference services used by the Streamdiver platform. It is intended to provide transparency for data protection officers and security assessors evaluating CLOUD Act exposure, data sovereignty, and the handling of sensitive content during AI processing.

Document Information
Document ID        TC-LLM-001
Version            1.3
Date               2026-02-16
Scope              Streamdiver LLM Inference Infrastructure
Applicable Law     GDPR (EU), öDSG (AT); compatible with BDSG (DE) and FADP/nDSG (CH)
Related Documents  TC-CDP-001 — Cryptography & Data Protection

1. Purpose

Streamdiver operates its own LLM inference infrastructure on dedicated servers in Europe. No external AI services (such as OpenAI, Google, Anthropic, or Microsoft) are used. All inference is performed within Streamdiver's own infrastructure boundary.


2. Architecture Overview


3. Inference Framework

Property    Details
Framework   vLLM
API         OpenAI-compatible REST API
License     Open Source (Apache 2.0)
Developer   vLLM Project (open-source community)
Deployment  Self-hosted on dedicated GPU servers

vLLM is a high-performance open-source inference engine. Streamdiver operates it as a self-hosted service — there is no service relationship, telemetry, or data exchange with any external party.
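Because the API is OpenAI-compatible, the inference endpoint can be called with a plain HTTP request. The sketch below is illustrative only: the host name, port, and model name are placeholder assumptions, not Streamdiver's actual configuration.

```python
import json
import urllib.request

# Hypothetical internal endpoint -- reachable only inside the private
# tailnet. Host, port, and model name are illustrative placeholders.
VLLM_URL = "http://llm.tailnet.internal:8000/v1/chat/completions"


def build_chat_request(model: str, prompt: str) -> dict:
    """Build an OpenAI-compatible chat completion payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 256,
        "temperature": 0.2,
    }


def complete(prompt: str, model: str = "example-model") -> str:
    """Send the prompt to the self-hosted vLLM server and return the reply."""
    body = json.dumps(build_chat_request(model, prompt)).encode("utf-8")
    req = urllib.request.Request(
        VLLM_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        data = json.loads(resp.read())
    return data["choices"][0]["message"]["content"]
```

Since the API contract is stable, exchanging the underlying model only changes the `model` field; calling code is unaffected.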


4. Models

Streamdiver uses state-of-the-art open-source models for all LLM-based processing tasks (summarization, question generation, RAG-based retrieval, entity extraction).

Property         Details
Model Source     Reputable open-source models from established providers (e.g., Meta LLaMA, Mistral AI)
Model Hosting    Downloaded and served locally on Streamdiver infrastructure
Model Selection  Architecture is model-agnostic; models can be exchanged without platform changes
Model Training   No fine-tuning or training on customer data
Model Updates    Models are updated periodically to leverage improvements in the open-source ecosystem

The specific model in use may change as the open-source ecosystem evolves. The architecture is designed to be model-agnostic — the inference API remains stable regardless of the underlying model.


5. Network Isolation

The LLM inference servers are not publicly accessible. They are exclusively reachable within the Streamdiver Tailscale tailnet — a private, encrypted overlay network.

Property         Details
Network          Streamdiver Tailscale tailnet (private overlay)
Encryption       WireGuard (ChaCha20-Poly1305, Curve25519 key exchange)
Access Control   Identity-based; only authorized Streamdiver services can reach the LLM endpoints
Public Exposure  None — no public IP, no public DNS, no internet-facing ports
Monitoring       All connections are logged within the tailnet control plane

No traffic to or from the LLM servers traverses the public internet.
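As a sketch of how such identity-based access control can be expressed, Tailscale ACL policies are written in HuJSON (JSON with comments). The tag names and port below are illustrative assumptions, not Streamdiver's actual policy:

```jsonc
{
  "acls": [
    // Only nodes tagged as platform services may reach the LLM
    // inference port; all other access is denied by default.
    {
      "action": "accept",
      "src": ["tag:platform"],
      "dst": ["tag:llm-inference:8000"]
    }
  ]
}
```

Tailnet ACLs are default-deny: any connection not matched by an accept rule is refused, which is what keeps the LLM endpoints unreachable from everywhere else.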


6. Data Flow

6.1 Inbound (Platform → LLM)

  1. The RAG engine within the Streamdiver platform constructs a prompt (context + query).
  2. The prompt is sent to the vLLM server via the encrypted Tailscale tunnel.
  3. The vLLM server processes the prompt in RAM and returns the result.
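The prompt construction in step 1 can be sketched in a few lines; the template below is a hypothetical illustration, not Streamdiver's actual RAG prompt format.

```python
def build_rag_prompt(context_chunks: list[str], query: str) -> str:
    """Assemble retrieved context and the user query into one prompt (step 1)."""
    context = "\n\n".join(context_chunks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )
```

The resulting string is what travels through the encrypted tunnel in step 2; it exists only in memory on both ends.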

6.2 Outbound (LLM → External)

None. The LLM server has no outbound internet access. It cannot:

  • Send data to any external AI service
  • Phone home to any vendor
  • Transmit telemetry or usage data
  • Access any resource outside the Tailscale tailnet
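One way such an outbound block can be enforced at the host level is a default-drop egress policy in nftables. This is a simplified, hypothetical sketch: the interface name is an assumption, and a real deployment must additionally permit the UDP transport that carries the WireGuard tunnel itself (omitted here for brevity):

```shell
# Default-drop egress policy: only loopback and the tailnet
# interface (tailscale0) may carry outbound traffic.
nft add table inet egress
nft add chain inet egress output \
    '{ type filter hook output priority 0 ; policy drop ; }'
nft add rule inet egress output oifname "lo" accept
nft add rule inet egress output oifname "tailscale0" accept
```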

6.3 Data Lifecycle on the LLM Server

Phase             Data Location                             Duration
Request received  RAM (prompt + context)                    Milliseconds to seconds
Inference         GPU VRAM + RAM                            Duration of inference
Response sent     Transmitted via WireGuard tunnel          Immediate
After response    Prompt and context released from memory   Immediate

No data is written to disk at any point during inference. The LLM server is stateless — it retains no customer data between requests.


7. Hosting

LLM inference workloads run on dedicated GPU servers provided by multiple European hosting partners. The specific provider is selected based on capacity and workload requirements; all providers meet the same baseline security criteria.

Property                      Details
Providers                     Exoscale (Switzerland/Austria), Hetzner Online GmbH (Germany), Verda (Finland)
Locations                     Austria, Germany, Finland
Provider Jurisdictions        Swiss, Austrian, German, and Finnish law — all within the EU/EEA or Switzerland (EU adequacy decision)
Infrastructure Certification  ISO 27001
Server Type                   Dedicated GPU servers (not shared / not multi-tenant)
Physical Security             Biometric access control, 24/7 surveillance, individual rack locking
CLOUD Act Exposure            None — all providers are European companies with no US ownership

8. CLOUD Act Assessment

The US CLOUD Act compels providers subject to US jurisdiction to disclose data in their possession, custody, or control, regardless of where that data is stored. The following assessment covers all layers of the LLM infrastructure:

Layer                Component                  Jurisdiction                            CLOUD Act Risk
Hosting Providers    Exoscale, Hetzner, Verda   Switzerland/Austria, Germany, Finland   None
Inference Framework  vLLM (open source)         N/A (open source, self-hosted)          None
Models               Open-source models         N/A (open source, self-hosted)          None
Network              Tailscale                  Canada (Tailscale Inc.)                 See below
Encryption           WireGuard                  N/A (open source)                       None

Tailscale Note: Tailscale Inc. is a Canadian company. The Tailscale control plane coordinates key exchange and ACL policies but has no access to data traffic: packets flow between nodes through WireGuard tunnels with end-to-end encryption, and even when a relay forwards them, it carries only encrypted packets. Tailscale cannot decrypt the traffic. Even in the event of a theoretical legal request, only connection metadata (which nodes are connected) would be available, never content.

Summary: No US company has access to data processed by or stored on the LLM infrastructure. There is no CLOUD Act exposure.


9. Customer Data Usage

Is customer data used to train or fine-tune models?
  No. Customer data is never used to train or fine-tune LLM models.
Is customer data retained after inference?
  No. The LLM server is stateless; data is released from memory immediately after the response is sent.
Is customer data shared with any third party?
  No. No data leaves the Streamdiver infrastructure.
Are inference prompts or results logged?
  No. No prompts or inference results are written to logs on the LLM server.