L40S vs T4

Compare NVIDIA L40S and NVIDIA T4 specs, performance, and cloud pricing

L40S: 48 GB, from $0.820/hr

T4: 16 GB, from $0.220/hr

Architecture: Ada Lovelace vs Turing

FP16 Gap: 2.8x (L40S leads)

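| Specification | L40S | T4 |
| --- | --- | --- |
| VRAM | 48 GB | 16 GB |
| VRAM Type | GDDR6 | GDDR6 |
| FP16 TFLOPS | 366.5 TFLOPS | 130 TFLOPS |
| FP8 TFLOPS | 733 TFLOPS | N/A |
| Memory Bandwidth | 864 GB/s | 320 GB/s |
| TDP | 350 W | 70 W |
| Interconnect | PCIe Gen4 | PCIe Gen3 |
| Architecture | Ada Lovelace | Turing |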
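
The headline ratios can be reproduced directly from the specification figures. The following is a minimal sketch, assuming the table values are taken at face value as peak numbers; the `SPECS` dictionary and both helper functions are illustrative names, not part of any vendor tool.

```python
# Figures copied from the specification table; treated as peak values.
SPECS = {
    "L40S": {"vram_gb": 48, "fp16_tflops": 366.5, "bandwidth_gbs": 864, "tdp_w": 350},
    "T4": {"vram_gb": 16, "fp16_tflops": 130.0, "bandwidth_gbs": 320, "tdp_w": 70},
}


def spec_ratios(a: str, b: str) -> dict[str, float]:
    """Return each spec of GPU `a` divided by the same spec of GPU `b`."""
    return {key: SPECS[a][key] / SPECS[b][key] for key in SPECS[a]}


def tflops_per_watt(gpu: str) -> float:
    """Peak FP16 TFLOPS divided by TDP: a rough efficiency proxy, not a measurement."""
    return SPECS[gpu]["fp16_tflops"] / SPECS[gpu]["tdp_w"]


if __name__ == "__main__":
    for key, ratio in spec_ratios("L40S", "T4").items():
        print(f"{key}: {ratio:.1f}x")
    for gpu in SPECS:
        print(f"{gpu}: {tflops_per_watt(gpu):.2f} FP16 TFLOPS per watt")
```

With these inputs the FP16 ratio comes out at roughly 2.8x, matching the FP16 gap quoted at the top of the page. The L40S also draws 5x the power, so the T4 comes out ahead on peak FP16 TFLOPS per watt.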

Price Comparison

| Metric | L40S | T4 |
| --- | --- | --- |
| Cheapest On-Demand | $0.820/hr | $0.220/hr |
| Cheapest Spot | $0.440/hr | $0.120/hr |
| Providers Available | 5 | 5 |
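
To put the hourly rates in context, the sketch below estimates the spot discount and the cost of a fixed-length job for each GPU. It assumes the cheapest listed prices hold for the whole run and that spot capacity is never interrupted, which is rarely true in practice; `PRICES`, `spot_discount`, and `job_cost` are hypothetical names used only for illustration.

```python
# Hourly prices in USD, copied from the price table.
PRICES = {
    "L40S": {"on_demand": 0.820, "spot": 0.440},
    "T4": {"on_demand": 0.220, "spot": 0.120},
}


def spot_discount(gpu: str) -> float:
    """Fraction saved by choosing spot over on-demand (0.0 to 1.0)."""
    price = PRICES[gpu]
    return 1.0 - price["spot"] / price["on_demand"]


def job_cost(gpu: str, hours: float, tier: str = "on_demand") -> float:
    """Cost in USD of running one GPU for `hours` at the given pricing tier."""
    return PRICES[gpu][tier] * hours


if __name__ == "__main__":
    for gpu in PRICES:
        print(
            f"{gpu}: spot saves {spot_discount(gpu):.0%}, "
            f"100 h costs ${job_cost(gpu, 100):.2f} on-demand "
            f"or ${job_cost(gpu, 100, 'spot'):.2f} spot"
        )
```

Both GPUs show a spot discount of around 45 to 46 percent at these prices, so the choice of pricing tier does not change which card is cheaper per hour.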

Verdict

Best for Training: NVIDIA L40S (366.5 TFLOPS FP16 with 48 GB VRAM)

Best Value: NVIDIA T4 (591 FP16 TFLOPS per $/hr)

Best for Inference: NVIDIA L40S (733 TFLOPS FP8)

Use-Case Recommendations

Large-Scale Training

Training LLMs and large multi-modal models

Winner: L40S

366.5 TFLOPS FP16 with 48 GB of GDDR6 gives the L40S the higher training throughput of the two.

Inference at Scale

Deploying models in production for real-time inference

Winner: L40S

733 TFLOPS of FP8 compute gives the L40S the higher inference throughput; the T4 has no FP8 support.

Budget-Conscious Workloads

Getting the best performance per dollar

Winner: T4

Starting at $0.220/hr, the T4 delivers the most FP16 TFLOPS per dollar.
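
The value claim is simple arithmetic: peak TFLOPS divided by the cheapest on-demand hourly price. The sketch below works it through using only the figures quoted on this page; `tflops_per_dollar_hour` is an illustrative helper, and the result reflects peak spec-sheet throughput rather than measured performance.

```python
def tflops_per_dollar_hour(tflops: float, price_per_hour: float) -> float:
    """Peak TFLOPS obtained per dollar of hourly spend."""
    if price_per_hour <= 0:
        raise ValueError("price_per_hour must be positive")
    return tflops / price_per_hour


# Inputs are the page's own peak figures and cheapest on-demand prices.
t4_fp16 = tflops_per_dollar_hour(130.0, 0.220)
l40s_fp16 = tflops_per_dollar_hour(366.5, 0.820)
l40s_fp8 = tflops_per_dollar_hour(733.0, 0.820)

if __name__ == "__main__":
    print(f"T4   FP16: {t4_fp16:.0f} TFLOPS per $/hr")
    print(f"L40S FP16: {l40s_fp16:.0f} TFLOPS per $/hr")
    print(f"L40S FP8:  {l40s_fp8:.0f} TFLOPS per $/hr")
```

At FP16 the T4 leads, at about 591 versus about 447 TFLOPS per $/hr. If a workload can run in FP8, which the T4 does not support, the L40S figure rises to about 894, so the value ranking depends on the precision the model actually uses.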
