After a long wait, the NVIDIA DGX Spark launched last year, with the hope of fostering a new wave of AI engineers and researchers. At StorageReview, we’ve received several DGX Spark systems from various partners for evaluation, and we’re developing some exciting cluster projects. However, the thermal and power characteristics across these different implementations proved too interesting to keep under wraps while we finish our full reviews.

What makes this comparison particularly compelling is the opportunity to observe how different OEM partners have interpreted NVIDIA’s reference design. While the core electronics and motherboard remain constant across all implementations, the thermal management solutions, chassis designs, and airflow strategies vary considerably. These engineering decisions manifest as measurable differences in operating temperatures, and understanding these differences provides valuable insight for end users in the market for a DGX Spark.

In this analysis, we present a side-by-side comparison of thermal and power performance for five DGX Spark systems: the NVIDIA Founders Edition, Gigabyte, Dell, Acer, and ASUS.
NVIDIA DGX Spark Thermal Test Methodology
The data presented here was gathered while running vLLM online serving benchmarks using OpenAI’s GPT-OSS-120B model. The benchmark comprises three distinct testing scenarios designed to exercise different aspects of the inference pipeline.
The equal scenario presents a balanced workload with 256 input and 256 output tokens, resulting in short, symmetric demands on both the prefill and decode phases. The prefill-heavy scenario shifts computational intensity toward input processing, with 4096 input tokens producing 512 output tokens; this configuration saturates the tensor cores during the prompt prefill/encoding phase while maintaining moderate decode demands, as in code-completion scenarios.
Conversely, the decode-heavy scenario reverses this relationship, using 512 input tokens to generate 4096 output tokens, similar to asking a model to write an application from a single prompt. This creates sustained memory bandwidth pressure during the autoregressive token generation phase.
Each scenario was tested across batch sizes of 1, 2, 4, 8, 16, 32, 64, and 128, resulting in 24 total test stages. A 30-second cooldown period was enforced between consecutive stages. These cooldown periods are visible in the charts as brief dips between test stages, marked by vertical dashed lines that separate the shaded scenario regions.
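For readers who want to reason about the test matrix, the following is a minimal Python sketch that enumerates the stages described above. The token counts, batch sizes, and cooldown length come from the methodology; the scenario order (equal, prefill-heavy, decode-heavy), the dictionary layout, and the function name `build_stages` are our own assumptions for illustration and are not taken from the actual benchmark harness.

```python
from itertools import product

# Token shapes from the methodology: (input_tokens, output_tokens).
# The ordering of scenarios here is an assumption.
SCENARIOS = {
    "equal": (256, 256),
    "prefill_heavy": (4096, 512),
    "decode_heavy": (512, 4096),
}
BATCH_SIZES = [1, 2, 4, 8, 16, 32, 64, 128]
COOLDOWN_SECONDS = 30


def build_stages():
    """Return the ordered list of test stages, one dict per stage."""
    stages = []
    # product() iterates scenario-major: all batch sizes for one scenario,
    # then all batch sizes for the next.
    for (name, (n_in, n_out)), batch in product(SCENARIOS.items(), BATCH_SIZES):
        stages.append(
            {
                "index": len(stages),
                "scenario": name,
                "input_tokens": n_in,
                "output_tokens": n_out,
                "batch_size": batch,
            }
        )
    return stages


stages = build_stages()
# One cooldown sits between each pair of consecutive stages.
total_cooldown = (len(stages) - 1) * COOLDOWN_SECONDS

print(len(stages))        # 24
print(total_cooldown)     # 690
```

Three scenarios times eight batch sizes gives the 24 stages, and the 23 gaps between them account for 690 seconds of idle time in a full run.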
Chart Alignment and Environmental Controls
The charts presented use a stage-aligned time axis, meaning the data from all systems is synchronized by test stage rather than absolute wall-clock time. This technique allows direct comparison of system behavior during identical workload phases, even though individual systems completed stages at slightly different rates due to their thermal throttling behavior and sustained performance characteristics.
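One way to build a stage-aligned axis is sketched below in Python. This is an illustration under our own assumptions, not the charting code used for this article: it maps each stage's samples onto a fixed number of evenly spaced positions between the stage's first and last sample, using linear interpolation. The function name `resample_stage` and the sample values are hypothetical.

```python
def resample_stage(samples, points=5):
    """Resample one stage's (seconds, value) samples onto `points` evenly
    spaced positions between the stage's first and last timestamp."""
    if not samples:
        raise ValueError("stage has no samples")
    if points < 2:
        raise ValueError("points must be at least 2")
    samples = sorted(samples)
    t0, t1 = samples[0][0], samples[-1][0]
    if t1 == t0:
        # Single timestamp: nothing to interpolate, repeat the value.
        return [float(samples[0][1])] * points

    out = []
    j = 0  # index of the left-hand sample of the current interval
    for i in range(points):
        target = t0 + (t1 - t0) * i / (points - 1)
        # Advance until the interval [samples[j], samples[j + 1]] covers target.
        while j < len(samples) - 2 and samples[j + 1][0] < target:
            j += 1
        (ta, va), (tb, vb) = samples[j], samples[j + 1]
        if tb == ta:
            out.append(float(vb))
            continue
        weight = (target - ta) / (tb - ta)
        out.append(va + (vb - va) * weight)
    return out


# Hypothetical values: the same stage finishing in 4 s on one system
# and 8 s on another, starting at different wall-clock times.
fast = [(0, 60), (2, 70), (4, 80)]
slow = [(10, 60), (14, 70), (18, 80)]

print(resample_stage(fast))  # [60.0, 65.0, 70.0, 75.0, 80.0]
print(resample_stage(slow))  # [60.0, 65.0, 70.0, 75.0, 80.0]
```

After resampling, both systems occupy the same horizontal span for the stage, so their curves can be overlaid even though one took twice as long in wall-clock terms.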
The five systems were tested simultaneously under identical environmental conditions, positioned adjacent to one another in a controlled room with a consistent ambient temperature throughout the test. This parallel testing approach ensures that any differences observed are directly attributable to the systems’ thermal designs rather than environmental variables such as ambient temperature drift, airflow patterns, or time-of-day variations.
Additionally, all models had the latest NVIDIA Ubuntu image installed.
Data Collection
System metrics were collected at 1-second intervals using a custom monitoring script that reads directly from Linux kernel interfaces and nvidia-smi. We did not take direct measurements with thermal probes or any external power monitoring.
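Our monitoring script is not published here, but a minimal Python sketch of the same idea looks like the following. It is an assumption-laden illustration: the article does not specify which kernel interfaces were read, so this example uses the generic `/sys/class/thermal/thermal_zone*/temp` files (which report millidegrees Celsius) together with the standard `nvidia-smi --query-gpu` CSV output. The function names are hypothetical, and NVMe or NIC sensors, which typically live elsewhere, are not covered.

```python
import glob
import subprocess
import time


def parse_millidegrees(raw):
    """Convert a sysfs thermal reading (millidegrees Celsius) to degrees."""
    return int(raw.strip()) / 1000.0


def parse_nvidia_smi(line):
    """Parse one 'temperature, power' CSV line from nvidia-smi.
    Fields the driver cannot report (e.g. '[N/A]') become None."""
    def to_float(text):
        try:
            return float(text)
        except ValueError:
            return None

    values = [to_float(field.strip()) for field in line.strip().split(",")]
    values += [None] * (2 - len(values))  # pad if a field is missing
    return values[0], values[1]


def read_thermal_zones():
    """Read every kernel thermal zone that exposes a temperature."""
    readings = {}
    for path in sorted(glob.glob("/sys/class/thermal/thermal_zone*/temp")):
        zone = path.split("/")[-2]
        try:
            with open(path) as handle:
                readings[zone] = parse_millidegrees(handle.read())
        except (OSError, ValueError):
            continue  # zone unreadable or empty on this platform
    return readings


def read_gpu():
    """Query GPU temperature (C) and power draw (W) via nvidia-smi."""
    cmd = [
        "nvidia-smi",
        "--query-gpu=temperature.gpu,power.draw",
        "--format=csv,noheader,nounits",
    ]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return parse_nvidia_smi(result.stdout.splitlines()[0])


def sample_once():
    """Take one timestamped sample of all available metrics."""
    gpu_temp, gpu_power = read_gpu()
    return {
        "time": time.time(),
        "zones": read_thermal_zones(),
        "gpu_temp_c": gpu_temp,
        "gpu_power_w": gpu_power,
    }
```

Calling `sample_once()` in a loop with a one-second sleep and appending each result to a log would approximate the 1-second collection interval described above. Because these are software-reported sensor values, they carry the same limitation noted in the text: no external probe verifies them.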
Note that NVMe drive temperatures cannot be compared apples-to-apples, as not all systems have identical storage drives. Notably, the ASUS system was the only unit configured with a 1TB Phison drive, the Dell featured a 4TB Phison drive, and all other systems we tested featured a 4TB Samsung drive. As we work through each review, we will explore NVMe drive thermals in more detail, as even factors such as thermal bridging between the drive and the case bottom aren’t the same across Spark models.
NVIDIA DGX Spark Thermal Analysis
Before diving into the individual metrics, it’s important to understand that the DGX Spark uses an integrated cooling system in which components share thermal pathways. As seen in the Founders Edition design, GPU heat directly influences the temperatures of adjacent components, including the CPU, NVMe storage, and network interfaces. This thermal interdependence means that GPU-intensive workloads create cascading thermal effects throughout the system, explaining why we see correlated temperature patterns across different components.

CPU Temperature
The CPU temperature data reveals the most dramatic differences between implementations. The Acer system stands out immediately, peaking at just 74.6°C during the demanding Prefill Heavy scenario while every other system climbed into the mid-to-upper 80s. That’s a 10-14°C advantage over the competition, substantial thermal headroom that suggests Acer invested significantly in cooling capacity.

The Founders Edition, Dell, and Gigabyte cluster together with nearly identical thermal profiles, all reaching 87-88°C during peak loads. This similarity indicates that most partners have essentially matched NVIDIA’s reference thermal design rather than pushing beyond it. ASUS lands in the middle, running a few degrees cooler than this group but still well above Acer’s temperatures.
The sawtooth pattern throughout Equal and Prefill Heavy reflects the benchmark’s cyclical nature, with clear recovery during cooldown periods. During Decode Heavy, temperatures stabilize at a more sustained level; this also hints at the memory-intensive nature of token generation and suggests that the memory configuration is a clear bottleneck in this workload.
GPU Temperature
GPU temperatures follow a similar story, given the shared cooling infrastructure. Acer peaks at just 68°C during Prefill Heavy, while the other four systems all hit 80-82°C, a consistent 12-14°C advantage that mirrors the CPU findings.

The tight clustering among Founders Edition, Dell, Gigabyte, and ASUS suggests these implementations are approaching similar thermal limits, likely where the GPU’s firmware-controlled thermal management begins to intervene.
NVMe Temperature
Storage temperatures reveal how well each chassis isolates the drive bay from compute heat. Acer again leads at 51.8°C peak versus 58-63°C for the others.

The gradual temperature rise throughout testing, independent of workload spikes, indicates heat soak from adjacent components accumulating over time. For extended fine-tuning runs or heavy model swapping, this is relevant for drive longevity and sustained write performance.
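A simple way to separate this kind of slow drift from workload spikes is to fit a straight line through the coolest reading of each stage and look at its slope. The Python sketch below illustrates the idea with an ordinary least-squares fit. It is an assumption on our part, not the analysis used for the charts, and the temperature values are synthetic placeholders rather than measurements from any of the five systems.

```python
def stage_baselines(stages):
    """For each stage, given as (minute, [temps]), keep the minimum temperature.
    The minimum approximates the post-cooldown baseline for that stage."""
    return [(minute, min(temps)) for minute, temps in stages]


def trend_slope(points):
    """Ordinary least-squares slope of y over x for (x, y) pairs."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return sxy / sxx


# Synthetic example: spikes vary per stage, but the baseline creeps upward.
synthetic_stages = [
    (0, [40.0, 47.5, 52.0]),
    (10, [42.0, 55.0, 49.0]),
    (20, [44.0, 51.0, 58.5]),
    (30, [46.0, 60.0, 53.0]),
]

baselines = stage_baselines(synthetic_stages)
slope = trend_slope(baselines)  # degrees C per minute
print(baselines)
print(round(slope, 3))  # 0.2
```

A baseline slope that stays positive across the run, regardless of how high individual spikes go, is the signature of heat soak described above.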
As noted earlier, drive configurations varied across systems, so these comparisons carry some caveats. But Acer’s consistent advantage across every thermal metric points to genuine cooling improvements rather than differences in drive performance.
NIC Temperature
Looking at the ConnectX-7 NIC temps, Acer peaks at 62°C while the Founders Edition hits 75°C, a 13°C spread consistent with the pattern we’ve seen across other components. Gigabyte actually runs cooler here than Dell and ASUS, despite running hotter on the CPU, suggesting that heat distribution varies across the chassis designs.

GPU Power Consumption
Finally, we wanted to check the GPU’s power consumption. Unsurprisingly, all five systems exhibited remarkably similar peak power consumption during Prefill Heavy, ranging from 69.3W (Acer) to 76.0W (Gigabyte).

The Takeaway
The data tells a clear story: the Acer unit we had ran as much as 10-15°C cooler than competitors across every thermal metric, suggesting a fundamentally superior cooling solution. Founders Edition, Dell, and Gigabyte cluster together thermally, essentially matching NVIDIA’s reference design. ASUS occupies the middle ground. Peak GPU power is similar across all five systems, indicating that the thermal differences stem primarily from the cooling implementation rather than from power draw. Gigabyte, however, did stand out from the pack. It offered one of the best cooling profiles with one of the highest GPU power levels, striking the best balance between cooling and power.
Overall, the partner models perform very similarly, and users cannot go wrong with any of them. But curious readers should stay tuned for a comprehensive review of each system, covering workload performance and everyone’s favorite: teardowns.
