Buying guide · five stages

Start small.
Leave room to scale.

The cheapest mistake at Stage 1 is buying hardware that cannot host the next GPU generation. The most expensive mistake at Stage 3 is realising your facility was the constraint all along. Below is a five-stage buying guide for sovereign AI infrastructure - what each stage looks like, what it costs, and what you learn before the next one.

The stages

From on-paper to dedicated AI infrastructure.

Stage 00

Scoping the first workload

Pick ONE use case · estimate token volume, latency, concurrency · identify which data classes are involved

→ Answers: "What's the smallest real workload we can run?"

Footprint: None yet - sizing exercise on paper

Spend: Zero hardware spend

What you learn: Workload profile (params, throughput, SLA) · the foundation for every later spend decision
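
The paper sizing exercise can be sketched in a few lines of Python. This is a minimal sketch rather than a sizing tool: the `WorkloadProfile` class, its field names, and every number in the example are hypothetical placeholders, and the concurrency figure uses Little's law (requests in flight = arrival rate × time per request) as a first approximation.

```python
from dataclasses import dataclass


@dataclass
class WorkloadProfile:
    """Paper sizing for ONE use case. All values here are illustrative."""
    requests_per_day: int     # expected requests in a working day
    tokens_in: int            # average prompt tokens per request
    tokens_out: int           # average generated tokens per request
    busy_hours: float         # hours over which the traffic is spread
    peak_factor: float        # assumed peak-to-average traffic ratio
    target_latency_s: float   # acceptable end-to-end time per request

    def daily_tokens(self) -> int:
        """Total prompt plus generated tokens per day."""
        return self.requests_per_day * (self.tokens_in + self.tokens_out)

    def peak_requests_per_s(self) -> float:
        """Average arrival rate scaled by the assumed peak factor."""
        average = self.requests_per_day / (self.busy_hours * 3600)
        return average * self.peak_factor

    def peak_concurrency(self) -> float:
        """Little's law: requests in flight = arrival rate x time in system."""
        return self.peak_requests_per_s() * self.target_latency_s

    def peak_output_tokens_per_s(self) -> float:
        """Generation throughput the hardware must sustain at peak."""
        return self.peak_requests_per_s() * self.tokens_out


# Hypothetical internal-assistant workload, for illustration only.
profile = WorkloadProfile(
    requests_per_day=7200, tokens_in=1500, tokens_out=500,
    busy_hours=8, peak_factor=4, target_latency_s=10,
)

print(f"Tokens per day:        {profile.daily_tokens():,}")
print(f"Peak requests/s:       {profile.peak_requests_per_s():.2f}")
print(f"Peak concurrency:      {profile.peak_concurrency():.1f}")
print(f"Peak output tokens/s:  {profile.peak_output_tokens_per_s():.0f}")
```

For these made-up inputs the profile comes to 14.4 million tokens a day, about one request per second at peak, and roughly ten requests in flight at once. Those three numbers are the kind of workload profile the later stages are sized against.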

Stage 01

Starter node

Single workstation or 1U/2U server · 1–2 GPUs · runs a quantised open-weight model (8B–30B class) · inference only · single use case · limited users

→ Answers: "Does this work end-to-end inside our perimeter?"

Footprint: 1 box · single-room cooling

Spend: Low five figures EUR

What you learn: Real latency · real model quality on your data · real ops burden
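
Whether a quantised model in the 8B–30B class fits on one or two GPUs can be estimated before buying anything. The sketch below uses the common rule of thumb that weight memory is roughly parameters × bits per weight / 8. The function names and the 30% headroom for KV cache and runtime overhead are assumptions for illustration, and the 48 GB figure is simply the published memory size of an RTX 6000 Ada. Splitting a model across GPUs is treated as pooling memory, which ignores real parallelism overheads.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: int) -> float:
    """Approximate memory for model weights, in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8


def fits(params_billion: float, bits_per_weight: int,
         gpu_memory_gb: float, gpu_count: int = 1,
         headroom: float = 0.3) -> bool:
    """Rough fit check.

    headroom is an ASSUMED allowance for KV cache, activations and runtime
    overhead; measure it on the real workload before relying on it.
    """
    needed = weight_memory_gb(params_billion, bits_per_weight) * (1 + headroom)
    return needed <= gpu_memory_gb * gpu_count


for params, bits in [(8, 4), (30, 4), (30, 16)]:
    weights = weight_memory_gb(params, bits)
    one = fits(params, bits, gpu_memory_gb=48)
    two = fits(params, bits, gpu_memory_gb=48, gpu_count=2)
    print(f"{params}B @ {bits}-bit: {weights:.0f} GB weights, "
          f"1 GPU: {one}, 2 GPUs: {two}")
```

Under these assumptions a 30B model at 4-bit needs about 15 GB for weights and fits on a single 48 GB card, while the same model at 16-bit needs about 60 GB and only fits across two. The sketch says nothing about latency or output quality, which is what the starter node is there to measure.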

Stage 02

Production single-node

Properly racked server · 2–8 GPUs · redundant power · supported OS · monitoring · backups · larger model or higher concurrency · 1–3 use cases sharing the box

→ Answers: "Can we run this as a real internal service?"

Footprint: 1 rack-mounted server in an existing server room

Spend: Low to mid six figures EUR

What you learn: Where the bottleneck is - GPU memory, interconnect, storage IO, or cooling

Stage 03

Multi-node cluster

2–8 nodes · high-speed interconnect · shared storage · workload scheduler · larger models · fine-tuning · multi-team access

→ Answers: "Can we serve the whole company from this?"

Footprint: Dedicated rack(s) · liquid cooling starts paying off

Spend: High six to low seven figures EUR

What you learn: Whether your facility (power, cooling, network) is the next constraint

Stage 04

Dedicated AI infrastructure

Purpose-built environment · dense GPU racks · liquid cooling as default · capacity planning tied to roadmap · DR / failover · governance and audit at platform level

→ Answers: "Is AI now a core piece of company infra?"

Footprint: Dedicated room or co-lo cage

Spend: Seven figures EUR and up · ongoing

What you learn: How to forecast and amortise - this is now a capex programme, not a project

Design principles

Four principles that protect your early spend.

The decisions you take at Stage 1 either compound favourably or compound against you. These four principles - applied early - are the difference between a Stage 4 estate that grew gracefully from a Stage 1 box, and a Stage 4 estate that required throwing away a generation of hardware along the way.

Principle 01

Buy chassis and cooling that can host the next GPU generation

Today's 700 W GPUs are tomorrow's 1 000+ W GPUs. Air cooling at high density runs out of headroom faster than the GPU you bought. Specifying liquid cooling at Stage 1 is cheaper than rebuilding at Stage 3.
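
The arithmetic behind this principle is short enough to write down. The sketch below compares the GPU heat load of one chassis at 700 W and at 1 000 W per GPU. The eight-GPU count is an illustrative choice, the function name is hypothetical, and CPU, memory, storage and fan power are deliberately left out, so treat the result as a lower bound.

```python
def gpu_heat_load_kw(gpu_count: int, watts_per_gpu: float) -> float:
    """GPU-only heat load of one chassis in kW (other components excluded)."""
    return gpu_count * watts_per_gpu / 1000


GPUS_PER_CHASSIS = 8  # illustrative dense configuration

today_kw = gpu_heat_load_kw(GPUS_PER_CHASSIS, 700)
next_gen_kw = gpu_heat_load_kw(GPUS_PER_CHASSIS, 1000)
increase_pct = (next_gen_kw / today_kw - 1) * 100

print(f"Today:     {today_kw:.1f} kW")
print(f"Next gen:  {next_gen_kw:.1f} kW")
print(f"Increase:  {increase_pct:.0f} %")
```

Going from 5.6 kW to 8.0 kW of GPU heat in the same enclosure is an increase of roughly 43%. A cooling design sized with no margin for that step is the one that forces a rebuild.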

Principle 02

Standardise on an inference stack portable across stages

Whatever runtime, container layout, and observability stack you pick at Stage 1 should still work at Stage 4. The cost of porting a poorly chosen runtime across three rack moves is high.

Principle 03

Treat Stage 1 hardware as a lab asset after Stage 2 - not as throwaway

The starter node becomes the development and evaluation environment once the production single-node is live. Plan for that from the start; the box keeps earning its keep.

Principle 04

Decide the Stage 4 power and cooling envelope at Stage 2

The hardest constraint in scaling AI infrastructure is usually the facility, not the silicon. If you wait until Stage 3 to think about Stage 4 power and cooling, you have already missed the design window.
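
A rough envelope check makes the facility constraint concrete. The sketch below asks how many nodes a given power budget supports once a cooling and distribution overhead is added to each node's IT load. Every number is hypothetical, and the 30% overhead ratio is an assumption to be replaced with the figure measured for your own facility.

```python
import math


def nodes_supported(facility_kw: float, node_it_kw: float,
                    overhead_ratio: float = 0.3) -> int:
    """Whole nodes that fit inside a facility power envelope.

    overhead_ratio is an ASSUMED extra share of power for cooling and
    distribution losses on top of each node's IT load.
    """
    if facility_kw <= 0 or node_it_kw <= 0 or overhead_ratio < 0:
        raise ValueError("power values must be positive, overhead non-negative")
    per_node_kw = node_it_kw * (1 + overhead_ratio)
    return math.floor(facility_kw / per_node_kw)


# Hypothetical 60 kW envelope, with 10 kW nodes today and 14 kW nodes later.
print(nodes_supported(60, 10))
print(nodes_supported(60, 14))
```

With these placeholder figures the same 60 kW envelope holds four nodes today but only three of a hungrier next generation. A shortfall like that is cheap to find at Stage 2 and expensive to find at Stage 4.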

The platform

RM-4U8G is engineered for Stages 2 and 3.

A production single-node deployment (Stage 2) and a multi-node cluster (Stage 3) use the same chassis, the same cooling loop, and the same power topology. You upgrade by replacing GPUs and motherboard - the platform stays.

That continuity is the buying-guide thesis in physical form. The Stage 2 starting configuration (two RTX 6000 Ada GPUs) and the Stage 3 maximum (eight H200 NVL GPUs) share one chassis and one cooling design, with no rebuild in between.

See the platform

Reference figures

What you would actually buy at each stage.

Reference figures · examples drawn from real customer deployments. Final sizing is specified per workload: these are not bundles LM TEK sells but sketch BOMs (bills of materials) to anchor the buying conversation.

Stage	Configuration sketch	Indicative spend
Stage 01	2× RTX 6000 Ada, 1× CPU, 256 GB RAM, 2× 3.84 TB NVMe	~€80k–€120k
Stage 02	4× RTX A6000, 2× CPU, 512 GB RAM, 4× 3.84 TB NVMe	~€180k–€280k
Stage 03	8× H200 NVL, 2× CPU, 1 TB RAM, 8× 7.68 TB NVMe (per node × 2–4 nodes)	~€800k–€2M
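
The configuration sketches above can be put into a small data structure to total GPUs and raw NVMe capacity per stage. This is an illustrative sketch only: the `SketchBom` class and its field names are hypothetical, the capacities are raw drive sizes before any RAID or filesystem overhead, and spend is left out because the indicative ranges are not broken down per component.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SketchBom:
    """One row of the configuration table (names are hypothetical)."""
    stage: str
    gpus_per_node: int
    nvme_drives: int
    nvme_tb_each: float
    min_nodes: int = 1
    max_nodes: int = 1

    def gpu_range(self) -> tuple[int, int]:
        """Total GPUs at the smallest and largest node count."""
        return (self.gpus_per_node * self.min_nodes,
                self.gpus_per_node * self.max_nodes)

    def raw_nvme_tb_range(self) -> tuple[float, float]:
        """Raw NVMe capacity in TB at the smallest and largest node count."""
        per_node = self.nvme_drives * self.nvme_tb_each
        return (per_node * self.min_nodes, per_node * self.max_nodes)


BOMS = [
    SketchBom("Stage 01", gpus_per_node=2, nvme_drives=2, nvme_tb_each=3.84),
    SketchBom("Stage 02", gpus_per_node=4, nvme_drives=4, nvme_tb_each=3.84),
    SketchBom("Stage 03", gpus_per_node=8, nvme_drives=8, nvme_tb_each=7.68,
              min_nodes=2, max_nodes=4),
]

for bom in BOMS:
    lo_gpu, hi_gpu = bom.gpu_range()
    lo_tb, hi_tb = bom.raw_nvme_tb_range()
    print(f"{bom.stage}: {lo_gpu}-{hi_gpu} GPUs, "
          f"{lo_tb:.2f}-{hi_tb:.2f} TB raw NVMe")
```

Read this way, the Stage 03 sketch spans 16 to 32 GPUs and roughly 123 to 246 TB of raw NVMe across 2 to 4 nodes, against 2 GPUs and 7.68 TB for the Stage 01 sketch.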

Sizing your first deployment?

Tell us your stage, your workload, and the constraints in your facility. We will recommend a configuration that fits - and partners who can take it from spec to deployed system.
