5 Nvidia models evaluated
| Rank | Model | Accuracy | Correct | Total | Incorrect | Errors |
|---|---|---|---|---|---|---|
| 1 | Nvidia/Llama-3.3-Nemotron-Super-49b-V1.5 | 95.5 ± 3.4% | 56 | 58 | 2 | 0 |
| 2 | Nvidia/Nemotron-Nano-9b-V2 | 95.4 ± 3.5% | 55 | 57 | 2 | 0 |
| 3 | Nvidia/Nemotron-Nano-9b-V2:free | 92.5 ± 4.8% | 57 | 61 | 3 | 1 |
| 4 | Nvidia/Llama-3.1-Nemotron-Ultra-253b-V1 | 85.9 ± 7.5% | 46 | 53 | 6 | 1 |
| 5 | Nvidia/Llama-3.1-Nemotron-70b-Instruct | 65.5 ± 18.2% | 12 | 18 | 2 | 4 |
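
The counts in the table can be sanity-checked with a few lines of Python. The sketch below assumes that Total is the sum of Correct, Incorrect, and Errors (which holds for every row) and prints the plain correct/total ratio. The helper names `check_row` and `raw_ratio` are illustrative only. The source does not say how the Accuracy column or its ± margin is derived; the raw ratios come out roughly one point above the reported figures, so some adjustment not described here is presumably applied.

```python
# Rows copied from the table: (model, correct, total, incorrect, errors).
ROWS = [
    ("Nvidia/Llama-3.3-Nemotron-Super-49b-V1.5", 56, 58, 2, 0),
    ("Nvidia/Nemotron-Nano-9b-V2", 55, 57, 2, 0),
    ("Nvidia/Nemotron-Nano-9b-V2:free", 57, 61, 3, 1),
    ("Nvidia/Llama-3.1-Nemotron-Ultra-253b-V1", 46, 53, 6, 1),
    ("Nvidia/Llama-3.1-Nemotron-70b-Instruct", 12, 18, 2, 4),
]


def check_row(correct: int, total: int, incorrect: int, errors: int) -> bool:
    """Return True when the three outcome counts add up to the total."""
    return correct + incorrect + errors == total


def raw_ratio(correct: int, total: int) -> float:
    """Plain correct/total percentage (an assumption, not the table's Accuracy)."""
    return 100.0 * correct / total


if __name__ == "__main__":
    for model, correct, total, incorrect, errors in ROWS:
        consistent = check_row(correct, total, incorrect, errors)
        ratio = raw_ratio(correct, total)
        print(f"{model}: {correct}/{total} = {ratio:.1f}% raw, counts consistent: {consistent}")
```

Note how the last row rests on only 18 samples, four of which are errors, which is in line with its much wider ± margin.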