A buildable design for a self-hosted assistant a small team can use on confidential client material — threat model, three deployment options, model and hardware sizing, the hardening checklist, and a mapping from each control to the legal "reasonable steps" test.
For trade secrets the asset is confidentiality. So the only question that matters is: who or what can technically observe the prompts, the documents, the model's working memory, the disk, the logs, and the outputs — and can you prove the list is short?
Every design decision below shortens that list and makes it auditable. The legal payoff (covered in the business report) is that a short, controlled, documented list is exactly what "reasonable steps to keep it secret" looks like in practice.
Built for Option A (own hardware); the same stack lifts onto Option B unchanged. Everything runs on one machine, behind a default-deny firewall, reachable only over your LAN or a VPN.
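One way to make the "reachable only over your LAN or a VPN" rule checkable is to test every address a service is configured to listen on. The sketch below is illustrative only: `bind_is_contained` is a hypothetical helper name, it relies on Python's standard `ipaddress` module, and it assumes that any loopback or private-range address is acceptable, which may not match your own network plan.

```python
import ipaddress


def bind_is_contained(addr: str) -> bool:
    """Return True if a listen address stays on loopback or a private range.

    Assumption: "contained" means loopback or a private address. A VPN that
    hands out addresses from 100.64.0.0/10 is not treated as private by the
    ipaddress module and would need an explicit allow-list instead.
    """
    ip = ipaddress.ip_address(addr)
    # 0.0.0.0 and :: mean "listen on every interface", which defeats the rule.
    if ip.is_unspecified:
        return False
    return ip.is_loopback or ip.is_private


if __name__ == "__main__":
    for candidate in ["127.0.0.1", "10.0.0.5", "0.0.0.0", "8.8.8.8"]:
        print(candidate, bind_is_contained(candidate))
```

A check like this only inspects configuration values; it complements, and does not replace, looking at what the firewall and the listening sockets actually do.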
The open-weight leaderboard reshuffles monthly, so choose by durable criteria and slot in the current best release at build time.
| Model size | VRAM @ 4-bit* | Fits on |
|---|---|---|
| 7–9B | ~6–8 GB | Any modern GPU |
| 13–14B | ~10–12 GB | 16–20 GB (e.g. GEX44) |
| 30–34B | ~20–24 GB | 24–48 GB |
| 70–72B | ~40–48 GB | 48 GB (RTX 6000 Ada) — tight; 80 GB comfortable |
| 100B+ MoE | varies | Multi-GPU / 80 GB+ |
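The arithmetic behind a table like this can be sketched as follows. The weights alone need roughly parameters × bits / 8 bytes; everything else (KV cache, activations, runtime buffers) is folded into a single `overhead` factor here. That factor is an assumption chosen for illustration, not a figure from the table, and the function names are hypothetical.

```python
def weights_gb(params_billion: float, bits: int = 4) -> float:
    """Memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits / 8


def estimate_vram_gb(params_billion: float, bits: int = 4,
                     overhead: float = 0.25) -> float:
    """Weights plus an assumed proportional overhead for cache and buffers."""
    return weights_gb(params_billion, bits) * (1 + overhead)


def fits(params_billion: float, vram_gb: float, bits: int = 4,
         overhead: float = 0.25) -> bool:
    """True if the estimate fits within the given VRAM budget."""
    return estimate_vram_gb(params_billion, bits, overhead) <= vram_gb


if __name__ == "__main__":
    print(weights_gb(70))        # 35.0 GB of weights at 4-bit
    print(estimate_vram_gb(70))  # 43.75 GB with the assumed 25% overhead
    print(fits(70, 48), fits(70, 24))
```

Real usage depends heavily on context length and the serving runtime, so treat the result as a first-pass filter and measure on the actual hardware before committing.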
| Path | Spec | GPU / VRAM | Cost | Notes |
|---|---|---|---|---|
| A · Own box | Workstation: 1× pro GPU, 64–128 GB RAM, NVMe (LUKS) | RTX 6000 Ada 48 GB (≈ €6.8k) or RTX PRO 6000 Blackwell 96 GB (≈ €8k) | ~€8k–12k once | Runs 30–34B comfortably; 70B is tight on 48 GB and comfortable on 96 GB; full physical control |
| B · Hetzner GEX44 | Dedicated, EU | RTX 4000 SFF Ada 20 GB | €184/mo + €79 setup | Good for ≤14B models / lighter use |
| B · Hetzner GEX130 | Dedicated, EU | RTX 6000 Ada 48 GB | ~€838/mo + €79 setup | Matches the owned-box GPU, as OPEX |
| B · AWS dedicated | Dedicated host / bare-metal GPU, EU region | L4 / L40S / A10G class | Usage-based, higher | Strong isolation primitives; easy to misconfigure |
| C · Managed | Bedrock / Azure OpenAI, EU region | n/a (service) | Per token | No GPU ops; prompts and documents leave your own runtime and are processed by the provider |
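The buy-versus-rent comparison in the table comes down to simple arithmetic, sketched below using the table's own figures for the owned box (~€8k–12k once) and the GEX130 (~€838/mo + €79 setup). The function name is hypothetical, and the calculation deliberately ignores electricity, maintenance, staff time, and resale value, all of which would shift the result.

```python
def break_even_months(capex_eur: float, monthly_eur: float,
                      setup_eur: float = 0.0) -> float:
    """Months after which cumulative rent (setup + monthly * m) equals capex."""
    return (capex_eur - setup_eur) / monthly_eur


if __name__ == "__main__":
    low = break_even_months(8_000, 838, 79)    # cheaper end of the owned box
    high = break_even_months(12_000, 838, 79)  # upper end of the owned box
    print(f"break-even between {low:.1f} and {high:.1f} months")
```

On these assumptions the owned box pays for itself after roughly 9.5 to 14 months of continuous use, so a short engagement favours renting and a standing capability favours buying.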
Apply all of these for A and B. For C, the network/crypto items become contract + provider-config items (KMS, PrivateLink, region pinning, no-logging).
This table is the hand-off to the business report. It turns engineering into the evidence a court or a client wants: a documented, proportionate set of measures.
| Technical control | Maps to the legal requirement… |
|---|---|
| Self-hosted open-weight model; no external API | No disclosure of the secret outside the trusted circle |
| LAN/VPN-only, no public endpoint, egress locked | Restricting access; preventing onward transmission |
| Full-disk encryption, no/encrypted swap | "Appropriate technical measures" to secure the information |
| Individual accounts, MFA, RBAC, access log | Limiting the number of people with access; demonstrable control |
| Logging disabled / minimised; temp wiped | Not creating uncontrolled copies of the secret |
| Single-tenant infra + signed DPA (Option B) | Sufficient guarantees from any third party that is involved |
| Documented wipe / decommission | Evidence the holder actively maintained secrecy throughout |
| NDAs + access policy (organisational, not technical) | The contractual half of "reasonable steps" — pair with the above |
An outline, not copy-paste commands — adapt to your distro and provider. The ordering matters: bring the data path online after egress is cut.
    # 1. Provision & encrypt
    install Ubuntu LTS on LUKS-encrypted NVMe; disable swap (or encrypt it)
    harden SSH: keys only, no root login, restrict source IPs
    ufw default deny incoming; ufw default deny outgoing
    allow out only: OS + package mirrors (temporarily, for setup)

    # 2. Pull the model (still online), then go dark
    download weights on this box or a staging machine
    verify sha256 against the published checksum
    once installed: remove the temporary outbound allow-rules

    # 3. Serve + UI, no logging
    run vLLM (or Ollama) bound to 127.0.0.1; disable request logging
    run Open WebUI bound to LAN/VPN iface; create per-user accounts + RBAC
    confirm: no OpenTelemetry / analytics / crash reporting enabled

    # 4. Verify the boundary (the important step)
    tcpdump / firewall logs: confirm zero unexpected egress during a real chat
    grep the box for prompt text in logs/temp/swap; should find nothing

    # 5. Decommission (end of engagement)
    stop services; securely wipe data + model volumes
    on rented infra: delete the server, volumes, and snapshots
    record the wipe (date, method, operator) in the evidence pack
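Two of the steps in the outline lend themselves to a small script: the checksum verification in step 2 and the "grep the box for prompt text" check in step 4. The sketch below is one possible shape, not a prescribed tool. `sha256_of` and `find_canary` are hypothetical names, and the approach assumes you send a unique canary string in a test chat and then search the directories you care about for it.

```python
import hashlib
import os


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 so large weight files need little RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_canary(roots, canary: bytes):
    """Return paths of regular files under `roots` that contain `canary`.

    Limitations (assumptions of this sketch): the canary must not contain a
    newline, compressed or rotated logs are not unpacked, and swap or raw
    block devices are not inspected.
    """
    hits = []
    for root in roots:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                path = os.path.join(dirpath, name)
                # Skip symlinks and anything that is not a regular file.
                if os.path.islink(path) or not os.path.isfile(path):
                    continue
                try:
                    with open(path, "rb") as fh:
                        if any(canary in line for line in fh):
                            hits.append(path)
                except OSError:
                    continue  # unreadable file: skipped, not treated as clean
    return hits
```

In use, compare `sha256_of(weights_path)` with the checksum published alongside the model, then run something like `find_canary(["/var/log", "/tmp"], b"canary-7f3a")` after a test chat containing that string; an empty list is the expected result. The directory list and canary value here are placeholders to adapt.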
"Reasonable steps" is only worth what you can show. Maintain a short folder, versioned, that a client or a court can be walked through: