6 Nousresearch models evaluated
| Rank | Model | Accuracy | Correct | Total | Incorrect | Errors |
|---|---|---|---|---|---|---|
| 1 | Nousresearch/Hermes-3-Llama-3.1-405b | 70.9 ± 14.8% | 18 | 25 | 6 | 1 |
| 2 | Nousresearch/Hermes-4-405b | 69.7 ± 15.3% | 17 | 24 | 7 | 0 |
| 3 | Nousresearch/Hermes-3-Llama-3.1-405b:free | 65.5 ± 18.2% | 12 | 18 | 6 | 0 |
| 4 | Nousresearch/Hermes-3-Llama-3.1-70b | 63.6 ± 19.1% | 11 | 17 | 5 | 1 |
| 5 | Nousresearch/Hermes-4-70b | 32.1 ± 33.0% | 2 | 7 | 4 | 1 |
| 6 | Nousresearch/Hermes-2-Pro-Llama-3-8b | 12.9 ± 39.2% | 0 | 4 | 3 | 1 |