InferenceMAX: Open-Source Inference Benchmarking newsletter.semianalysis.com 4 points by pella 16 hours ago
pella 16 hours ago - https://github.com/InferenceMAX/InferenceMAX - "NVIDIA GB200 NVL72, AMD MI355X, Throughput Token per GPU, Latency Tok/s/user, Perf per Dollar, Cost per Million Tokens, Tokens per Provisioned Megawatt, DeepSeek R1 670B, GPTOSS 120B, Llama3 70B"