Vessel GPU Racks: Power, Cooling, Security

By James Calder · 12 min read

Last quarter we did a site survey on a 62-meter motor yacht in Antibes. The owner had already purchased two NVIDIA H100 SXM cards and a DGX-class node. His integrator had proposed mounting it in a standard 42U server rack, bolted to the deck in a storage compartment behind the engine room bulkhead. No vibration isolation. No dedicated cooling loop. A single 30A circuit breaker feeding the whole stack.

The hardware was fine. The deployment plan would have destroyed it inside six months.

Picking the right GPU is half the problem. I covered that decision in the GPU selection guide. This post is about everything else: the power conditioning, thermal management, vibration isolation, and physical security that determine whether your on-vessel compute stack actually survives its first crossing. None of this is glamorous. All of it is load-bearing.

Power Budget: What GPU Racks Draw on a Vessel

Every vessel runs its own microgrid. Unlike a shore-side data center connected to utility-scale power, your generators are the entire supply. Understanding what your compute stack actually draws (and what your generators can actually spare) is the first constraint.

The numbers, for reference:

  • NVIDIA H100 SXM: 700W TDP per card
  • NVIDIA H100 PCIe: 350W TDP per card
  • NVIDIA L40S: 350W TDP per card (PCIe, passive cooled, dual-slot)
  • NVIDIA A100 SXM: 400W TDP per card
  • NVIDIA A100 PCIe: 250W TDP per card

A full DGX H100 system (eight H100 SXM cards plus CPUs, networking, storage, and fans) draws 10.2kW at maximum load. That is not the GPU power alone. That is the complete node.

For the single-vessel deployments we typically recommend (two L40S cards in a purpose-built inference server), the total system power is in the 3-4kW range including the host system, networking switch, and NAS storage. Add 20-30% overhead for cooling and UPS losses, and your total draw from the ship's bus is roughly 4-5kW.

A 45-meter superyacht typically runs two 125-150kW diesel generators. At anchor at night with guests aboard, generator load might sit at 60-80kW (HVAC, galley, lighting, water systems). Even with a single generator online, that leaves at least 40-60kW of headroom. A 5kW AI compute stack is a rounding error on that budget.

Where it gets tight: if you are deploying a full DGX node (10.2kW) with dedicated cooling infrastructure (another 3-5kW for the CDU and pumps), you are looking at 13-15kW total. Still manageable on a 60-meter yacht, but now you need to coordinate with the electrical engineer during the design phase, not after the fact.

The rule we follow: your AI compute stack should never exceed 10% of the vessel's minimum generator capacity. On a yacht with 2x150kW generators (one running at a time during anchor watch), that ceiling is 15kW. Enough for a serious inference deployment. Not enough for a training cluster. Training stays shore-side. The vessel runs inference, and that is the correct division of labor for a sovereign knowledge ark.
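The budget arithmetic above is easy to script as a sanity check before a site survey. What follows is a minimal sketch, not a sizing tool: the function names are ours, the 25% overhead default is simply the midpoint of the 20-30% range quoted earlier, and the 4kW figure for the CDU and pumps is an assumed mid-range value.

```python
def bus_draw_kw(it_load_kw: float, overhead_fraction: float = 0.25) -> float:
    """IT load plus cooling and UPS losses, as drawn from the ship's bus."""
    return it_load_kw * (1.0 + overhead_fraction)


def compute_ceiling_kw(smallest_generator_kw: float, share: float = 0.10) -> float:
    """Ceiling under the 10% rule, based on the smallest generator that runs alone."""
    return smallest_generator_kw * share


def within_ceiling(total_draw_kw: float, smallest_generator_kw: float) -> bool:
    """True if the compute stack stays at or under the 10% ceiling."""
    return total_draw_kw <= compute_ceiling_kw(smallest_generator_kw)


# Two-card L40S stack: 4 kW IT load with the assumed 25% overhead.
l40s_draw = bus_draw_kw(4.0)
# DGX H100 node (10.2 kW) plus an assumed mid-range 4 kW for CDU and pumps.
dgx_draw = 10.2 + 4.0

for name, draw in [("2x L40S", l40s_draw), ("DGX H100", dgx_draw)]:
    ok = within_ceiling(draw, smallest_generator_kw=150.0)
    print(f"{name}: {draw:.1f} kW, within ceiling: {ok}")
```

On a 150kW generator both configurations pass, but the DGX case sits close to the 15kW ceiling. That is why it needs the electrical engineer in the room early.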

Power Conditioning and UPS

Ship power is not clean power. Generator frequency varies during load transients. Voltage sags when the stabilizers kick in or the bow thruster fires. Switchover between generators (or between generator and shore power) introduces millisecond-scale interruptions that will hard-crash a GPU mid-inference.

Every vessel GPU deployment needs an online double-conversion UPS between the ship's bus and the compute rack. Double-conversion means the UPS continuously converts AC to DC to AC, completely isolating the output from input disturbances. The Eaton 9SX Marine series (1000VA to 3000VA, DNV-GL type approved per DNVGL-CP0395) is purpose-built for this. It ships with vibration dampers and is rated for the temperature and humidity ranges of a vessel engine room.

For a 4-5kW compute stack, you need two 3000VA/2700W UPS units in parallel (or a single larger marine-rated unit in the 5-6kVA class). Battery runtime does not need to be long. Five minutes is enough to execute a graceful shutdown if the generators fail entirely. The UPS exists to smooth transients and bridge switchovers, not to run the AI stack for hours.
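To make the sizing logic concrete, here is a rough sketch of the two numbers that matter: how many UPS units the load needs, and how much usable battery energy a five-minute bridge implies. The 2700W unit rating comes from the paragraph above. The 90% inverter efficiency and the function names are assumptions for illustration, not manufacturer data.

```python
import math


def ups_units_needed(load_w: float, unit_rating_w: float = 2700.0,
                     redundant: bool = False) -> int:
    """Units required to carry load_w; adds one spare unit if N+1 is wanted."""
    units = math.ceil(load_w / unit_rating_w)
    return units + 1 if redundant else units


def battery_energy_wh(load_w: float, runtime_min: float = 5.0,
                      inverter_efficiency: float = 0.9) -> float:
    """Usable battery energy needed to hold load_w for runtime_min minutes.

    inverter_efficiency is an assumed value, not a datasheet figure.
    """
    return load_w * (runtime_min / 60.0) / inverter_efficiency


for load in (4000.0, 5000.0):
    units = ups_units_needed(load)
    energy = battery_energy_wh(load)
    print(f"{load / 1000:.0f} kW load: {units} units, about {energy:.0f} Wh for 5 min")
```

Even at 5kW, the five-minute bridge needs well under half a kilowatt-hour. This is why runtime is rarely the constraint; unit power rating is.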

Input power should be 230V single-phase where possible. Some larger deployments run 400V three-phase to the rack PDU. Either way, run a dedicated circuit from the main switchboard with its own breaker. Do not share circuits with variable-frequency drives (thrusters, stabilizers, HVAC compressors) because their switching transients propagate back through the bus.
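When you specify that dedicated circuit, the line current is the number the electrical engineer will ask for. The sketch below uses the standard single-phase and three-phase power relations. The 0.95 power factor, the 32A breaker rating in the example, and the 80% continuous-load allowance are assumptions for illustration; the vessel's class rules and the engineer's calculations take precedence.

```python
import math


def line_current_a(power_w: float, voltage_v: float,
                   three_phase: bool = False, power_factor: float = 0.95) -> float:
    """Line current for a given real power.

    Single-phase: I = P / (V * pf)
    Three-phase:  I = P / (sqrt(3) * V_line * pf)
    """
    if three_phase:
        return power_w / (math.sqrt(3) * voltage_v * power_factor)
    return power_w / (voltage_v * power_factor)


def breaker_ok(current_a: float, breaker_a: float,
               continuous_fraction: float = 0.8) -> bool:
    """Check current against an assumed continuous-load allowance for the breaker."""
    return current_a <= breaker_a * continuous_fraction


single = line_current_a(5000.0, 230.0)
three = line_current_a(5000.0, 400.0, three_phase=True)
print(f"5 kW at 230 V single-phase: {single:.1f} A, fits 32 A breaker: {breaker_ok(single, 32.0)}")
print(f"5 kW at 400 V three-phase:  {three:.1f} A per line")
```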

Cooling at Sea: Why Air Alone Fails Above 20kW

In a shore-side data center, you have industrial HVAC, hot/cold aisle containment, and effectively unlimited airflow. On a vessel, you have a compartment that is fighting ambient temperatures of 35-45 degrees Celsius in the engine room vicinity, limited ventilation runs, and salt-laden air that will corrode unprotected electronics.

Air cooling works for total rack heat loads under 20kW. A properly configured pair of L40S cards in a standard rack with adequate inlet airflow (the L40S is passively cooled and relies entirely on the server's internal fans) will run fine in a climate-controlled tech space maintained at 20-25 degrees Celsius. You need roughly 500 CFM of filtered airflow through the rack, front to back, with the hot exhaust vented to an air handler or directly overboard via a ducted return.
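The airflow figure follows from a basic heat balance: the air must carry away the rack's heat output at whatever temperature rise you allow between inlet and exhaust. A minimal sketch, assuming sea-level air properties (density 1.2 kg/m³, specific heat 1005 J/kg·K) and a 15 degree Celsius rise, both of which are our assumptions rather than measured values:

```python
AIR_DENSITY_KG_M3 = 1.2    # assumed: dry air near sea level, about 20 C
AIR_CP_J_KG_K = 1005.0     # assumed: specific heat of air at constant pressure
M3_PER_S_TO_CFM = 2118.88  # 1 cubic meter per second in cubic feet per minute


def required_airflow_cfm(heat_w: float, delta_t_c: float) -> float:
    """Airflow needed to remove heat_w with an inlet-to-exhaust rise of delta_t_c.

    Heat balance: P = rho * cp * V_dot * delta_T, solved for V_dot.
    """
    if delta_t_c <= 0:
        raise ValueError("delta_t_c must be positive")
    m3_per_s = heat_w / (AIR_DENSITY_KG_M3 * AIR_CP_J_KG_K * delta_t_c)
    return m3_per_s * M3_PER_S_TO_CFM


for delta_t in (10.0, 15.0, 20.0):
    cfm = required_airflow_cfm(4000.0, delta_t)
    print(f"4 kW at {delta_t:.0f} C rise: {cfm:.0f} CFM")
```

For a 4kW rack and a 15 degree rise this lands a little under 500 CFM, in line with the rule of thumb above. Halve the allowed temperature rise and the required airflow doubles, which is how confined compartments run out of margin.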

Above 20kW, air cooling becomes thermodynamically impractical in a vessel compartment. Even a single DGX H100 node, at roughly half that heat load, demands 1,105 CFM of airflow at 80% fan PWM, operating within a 5-30 degree Celsius ambient envelope. In a confined vessel space, that volume of air exchange is difficult to achieve without industrial ducting, and that ducting competes for the same limited space as the rest of your cooling infrastructure.

The solutions, in order of complexity:

Rear-door heat exchangers (RDHx) attach to the back of a standard 19-inch rack and use chilled water to remove heat from the exhaust air before it enters the room. Effective up to approximately 60kW per rack. The chilled water loop ties into the vessel's existing HVAC chiller plant or a dedicated coolant distribution unit (CDU). This is the sweet spot for most vessel deployments: it uses standard rack-mount equipment, does not require specialized server hardware, and can be serviced by marine HVAC technicians who already understand closed-loop chilled water systems.

Direct-to-chip liquid cooling places cold plates directly on GPUs and CPUs, with a manifold distributing coolant through the server chassis. This handles 80-120kW per rack and is mandatory for high-density configurations like a full DGX H100 cluster. The trade-off: your servers must be liquid-cooling-ready from the factory, the coolant distribution unit adds another 2-4kW of power draw, and you need a marine engineer comfortable with pressurized fluid loops in a compute space.

For the majority of yacht deployments (two to four L40S or H100 PCIe cards), a well-designed air-cooled installation in a dedicated, climate-controlled tech compartment is sufficient. The key requirements: positive-pressure air filtration to keep salt particles out, dedicated HVAC cooling sized to at least 125% of the computed heat load, and sealed cable penetrations so engine room air does not migrate into the clean tech space.
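The thresholds in this section reduce to a small decision table. The sketch below encodes them as stated here. The handling of the exact boundaries (for example, treating precisely 20kW as no longer air-coolable) and the function names are our assumptions.

```python
def cooling_method(rack_heat_kw: float) -> str:
    """Map a rack heat load to the cooling approaches discussed in this section."""
    if rack_heat_kw <= 0:
        raise ValueError("rack_heat_kw must be positive")
    if rack_heat_kw < 20:
        return "air"
    if rack_heat_kw <= 60:
        return "rear-door heat exchanger"
    if rack_heat_kw <= 120:
        return "direct-to-chip liquid"
    raise ValueError("heat load above the range covered here")


def hvac_capacity_kw(heat_load_kw: float, sizing_factor: float = 1.25) -> float:
    """HVAC capacity sized at 125% of the computed heat load by default."""
    return heat_load_kw * sizing_factor


for load in (4.0, 15.0, 35.0, 90.0):
    print(f"{load:>5.1f} kW rack: {cooling_method(load)}")
print(f"HVAC for a 4 kW heat load: {hvac_capacity_kw(4.0):.1f} kW")
```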

Vibration, Shock, and Salt: The Physical Environment

A vessel is not a raised floor in Virginia. It moves. Continuously. The environment problems break down into three categories.

Vibration. Engine harmonics, propeller shaft rotation, wave-induced oscillation, and machinery resonance all transmit through the hull structure. IEC 60945 (the primary international standard for maritime navigation and communication equipment) specifies vibration endurance testing at frequencies up to 100 Hz. The dominant frequencies on most displacement yachts are in the 2-25 Hz range, which is exactly the resonant window for standard rack-mount server hardware.

The solution is shock-isolated racks. Companies like Martin Enclosures and Socitec build 19-inch rack frames suspended inside an outer shell on coil-spring and elastomeric isolators. Coil springs attenuate low-frequency, high-amplitude motion (swell, slamming). Elastomeric mounts handle higher-frequency engine vibration. A properly isolated rack can be specified to MIL-STD-810 test levels: 5 GRMS random vibration and 40G mechanical shock. These racks are available in sizes from 2RU to 16RU for edge deployments, and in full 42U configurations for larger installations.
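Why isolator selection matters can be seen from the textbook single-degree-of-freedom transmissibility curve. An isolator only attenuates vibration above roughly 1.41 times its own natural frequency, and it amplifies vibration near that frequency. The sketch below is a simplified model under stated assumptions: a real rack on mixed coil and elastomeric mounts has several modes, and the 3 Hz natural frequency and 0.1 damping ratio are illustrative values, not vendor data.

```python
import math


def transmissibility(forcing_hz: float, natural_hz: float,
                     damping_ratio: float = 0.1) -> float:
    """Fraction of base vibration passed to the isolated mass (1-DOF model).

    T = sqrt((1 + (2*z*r)^2) / ((1 - r^2)^2 + (2*z*r)^2)), with r = f / f_n.
    T > 1 means the mount amplifies; T < 1 means it isolates.
    """
    r = forcing_hz / natural_hz
    damping_term = (2.0 * damping_ratio * r) ** 2
    return math.sqrt((1.0 + damping_term) / ((1.0 - r * r) ** 2 + damping_term))


NATURAL_HZ = 3.0  # assumed isolator natural frequency, for illustration only

for f in (2.0, 3.0, 5.0, 10.0, 25.0):
    t = transmissibility(f, NATURAL_HZ)
    print(f"{f:>4.0f} Hz: transmissibility {t:.2f}")
```

With these assumed values the mount passes only a small fraction of 10 Hz and 25 Hz vibration but amplifies motion around 3 Hz. That is the reason the isolator's tuning has to be checked against the vessel's actual dominant frequencies rather than taken from a catalog.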

Do not skip the isolation. A standard H100 SXM card has a heat sink weighing over a kilogram cantilevered off the GPU package on thermal interface material. Six months of 10 Hz resonant vibration will fatigue the solder joints on BGA packages and crack thermal pads. The card will not fail dramatically. It will develop intermittent errors under load, and you will spend weeks chasing phantom inference failures before someone thinks to check the mechanical environment.

Shock. Heavy weather, docking impacts, and grounding events produce transient shocks in the 15-40G range depending on hull construction and location within the vessel. Racks mounted lower in the hull (closer to the waterline and keel) experience higher shock loads. The same spring-isolated rack that handles vibration also attenuates shock, but verify your isolator is rated for the peak G expected at the mounting location. A naval architect can model this from the vessel's structural drawings.

Salt and humidity. Marine air carries salt particles and sits at 70-95% relative humidity much of the year. Standard server components are rated for 20-80% RH non-condensing, so unconditioned marine air regularly exceeds their operating envelope. IEC 60945 specifies salt mist endurance testing for exposed maritime equipment.

For enclosed tech spaces, the answer is positive-pressure filtered air. Keep the compute compartment pressurized slightly above ambient (5-10 Pa positive) with HEPA-filtered supply air. This prevents salt-laden air from infiltrating through cable penetrations, door seals, or ventilation gaps. Conformal coating on exposed PCBs is an additional defense layer for equipment that may be exposed during servicing.
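The "non-condensing" part of the humidity rating is worth checking numerically. If warm, humid outside air reaches a surface colder than its dew point, water forms on that surface. The sketch below uses the Magnus approximation for dew point with commonly used coefficients (17.62 and 243.12 degrees Celsius). The function names and the example temperatures are illustrative assumptions.

```python
import math

MAGNUS_A = 17.62
MAGNUS_B = 243.12  # degrees Celsius


def dew_point_c(air_temp_c: float, relative_humidity_pct: float) -> float:
    """Approximate dew point from air temperature and RH (Magnus formula)."""
    if not 0.0 < relative_humidity_pct <= 100.0:
        raise ValueError("relative_humidity_pct must be in (0, 100]")
    gamma = (math.log(relative_humidity_pct / 100.0)
             + MAGNUS_A * air_temp_c / (MAGNUS_B + air_temp_c))
    return MAGNUS_B * gamma / (MAGNUS_A - gamma)


def will_condense(surface_temp_c: float, air_temp_c: float,
                  relative_humidity_pct: float) -> bool:
    """True if a surface at surface_temp_c is at or below the air's dew point."""
    return surface_temp_c <= dew_point_c(air_temp_c, relative_humidity_pct)


# Outside air at 35 C and 80% RH leaking onto equipment held at 22 C.
print(f"Dew point: {dew_point_c(35.0, 80.0):.1f} C")
print(f"Condenses on a 22 C surface: {will_condense(22.0, 35.0, 80.0)}")
```

In that example the dew point is around 31 degrees Celsius, well above the temperature of equipment in a cooled tech space. Any unfiltered leak path is therefore a condensation risk as well as a salt risk.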

Physical Security for the Vessel Compute Stack

A vessel GPU rack running local AI is not just expensive hardware. It is the knowledge ark. It holds the model weights, the vector databases, the guest data, the operational intelligence. If your zero-trust network architecture protects the data logically, physical security protects it materially.

The threat model for physical access to vessel compute is different from a data center. You do not have a staffed security desk and mantrap. You have rotating crew, visiting technicians, port authority inspections, and (on charter yachts) guests who wander. The compute compartment must be:

Locked and access-controlled. Biometric access (fingerprint scanners rated for marine environments) or smart card readers on a dedicated door. Physical keys are insufficient because they get copied, lost, or lent. Every access event should log to the vessel's security management system with timestamp, identity, and duration. A card reader with an offline buffer ensures logging continues even when the network is down.
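The offline-buffer behavior is simple to reason about in code: every event is queued locally first, and the queue drains only when the upstream system confirms receipt. This is a hypothetical sketch, not any vendor's firmware. `AccessEvent`, `BufferedAccessLog`, and the `send` callback are names invented for illustration.

```python
from collections import deque
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Deque


@dataclass(frozen=True)
class AccessEvent:
    timestamp: datetime
    identity: str
    duration_s: float


class BufferedAccessLog:
    """Queue access events locally and forward them when the link is up."""

    def __init__(self, send: Callable[[AccessEvent], bool],
                 max_events: int = 10_000) -> None:
        self._send = send  # returns True only when the event was accepted
        # Bounded queue: if it ever fills, the oldest events are dropped.
        self._pending: Deque[AccessEvent] = deque(maxlen=max_events)

    def record(self, event: AccessEvent) -> None:
        self._pending.append(event)
        self.flush()

    def flush(self) -> int:
        """Send queued events in order; stop at the first failure."""
        sent = 0
        while self._pending:
            if not self._send(self._pending[0]):
                break
            self._pending.popleft()
            sent += 1
        return sent

    @property
    def pending(self) -> int:
        return len(self._pending)
```

A reader built this way calls `record` on every door event and `flush` on a timer. Events are removed from the buffer only after a confirmed send, so a network outage delays the log without losing entries (up to the buffer size).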

Monitored. A camera covering the compartment entrance and the rack face, recording to local NVR storage (not dependent on the AI compute stack itself). Tamper sensors on the rack doors that trigger alerts to the bridge security console.

Environmentally separated. The compute compartment should be a distinct space with its own fire suppression (clean agent, not water), independent ventilation, and a bilge alarm. If this compartment floods, you want to know before the rack gets wet, not after.

Documented for crew. Every crew member with access must understand: do not power off the rack, do not unplug cables, do not prop the door open for ventilation. A laminated single-page SOP on the compartment door goes further than a 40-page IT manual nobody reads.

For yachts operating with sensitive principal data (HNW families, corporate executives, government officials), consider tamper-evident seals on drive bays and a hardware security module (HSM) for encryption key storage. If someone physically removes a drive, the data is worthless without the HSM. This is the physical layer of the sovereign AI promise: your data never leaves the hull, and even if someone takes the hardware, they cannot access what is on it.

Reference Architecture: A Practical Build

For a 50-65 meter motor yacht running a two-card L40S inference deployment with a guest concierge system and a 70B-parameter knowledge assistant, here is what a production installation looks like:

Rack: 12U shock-isolated cabinet, shock-rated to 40G, vibration-isolated to 5 GRMS, mounted on anti-vibration pads in a dedicated tech space aft of the crew mess, above the waterline.

Compute: 2U inference server with two NVIDIA L40S (350W each, 48GB GDDR6 per card, 96GB total VRAM). Total server power draw: approximately 1800W at peak inference.

Networking: 1U managed switch with 10GbE uplinks to the vessel's backbone and satellite connectivity stack.

Storage: 2U NAS with 4x 8TB NVMe (RAID-10) for model weights, vector databases, and local telemetry.

UPS: 2x Eaton 9SX Marine 3000VA (2700W) in parallel, providing five minutes of battery runtime and N+1 redundancy as long as the protected load stays within a single unit's 2700W rating. Input: dedicated 230V 32A circuit from the main switchboard.

Cooling: Positive-pressure filtered air supply from a dedicated HVAC split unit sized for 6kW heat removal (150% of the rack's maximum thermal output). Hot exhaust ducted to the vessel's general exhaust plenum.

Physical security: Biometric door lock, tamper-sensing rack door, IP camera on local NVR, clean-agent fire suppression, bilge sensor.

Total power from ship's bus: 4.5-5.5kW including cooling. Roughly 3-4% of a single 150kW generator's capacity.
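The headline numbers in this build can be cross-checked with a few lines of arithmetic. This is a sketch under stated assumptions: the 1800W server figure, the 2700W UPS rating, the 6kW cooling capacity, and the 150kW generator come from the list above, while the switch and NAS wattages (150W and 300W) are placeholder values we picked for illustration.

```python
def share_of_generator_pct(draw_kw: float, generator_kw: float) -> float:
    """Compute stack draw as a percentage of one generator's capacity."""
    return 100.0 * draw_kw / generator_kw


def n_plus_one_holds(protected_load_w: float, unit_rating_w: float,
                     units: int) -> bool:
    """True if the load still fits when one UPS unit is out of service."""
    return units >= 2 and protected_load_w <= unit_rating_w * (units - 1)


def cooling_margin_pct(cooling_capacity_kw: float, heat_load_kw: float) -> float:
    """Cooling capacity as a percentage of the heat load it must remove."""
    return 100.0 * cooling_capacity_kw / heat_load_kw


# Server figure from the build list; switch and NAS draws are assumed.
protected_load_w = 1800.0 + 150.0 + 300.0

low = share_of_generator_pct(4.5, 150.0)
high = share_of_generator_pct(5.5, 150.0)
print(f"Generator share: {low:.1f}% to {high:.1f}%")
print(f"N+1 with two 2700 W units: {n_plus_one_holds(protected_load_w, 2700.0, 2)}")
print(f"Cooling margin on a 4 kW heat load: {cooling_margin_pct(6.0, 4.0):.0f}%")
```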

Total installed cost: $45,000-$65,000 for the infrastructure (rack, UPS, cooling, security, installation). The GPUs and server hardware are separate. The infrastructure cost is the same whether you put L40S or H100 PCIe cards inside.

The Boring Work That Keeps the Ark Running

None of this is the exciting part of on-vessel AI. Nobody writes press releases about vibration isolators or UPS switchover times. But every deployment we have seen fail in the first year failed because of infrastructure, not because of model performance. The GPU ran the model perfectly. The rack was not isolated. The power was not conditioned. The compartment was not sealed.

If you are building a sovereign knowledge ark that survives when the satellite link drops, the compute must be as resilient as the concept it serves. That means treating the physical infrastructure with the same rigor you bring to model selection and prompt engineering.

The models are the brain. The infrastructure is the body. Both have to work at 3am in a Force 7 crossing. If you want to get the physical layer right the first time, talk to us. We have done this enough times to know where the failures hide.