5 Baidu models evaluated
| Rank | Model | Accuracy | Correct | Total | Incorrect | Errors |
|---|---|---|---|---|---|---|
| 1 | Baidu/Ernie-4.5-300b-A47b | 92.1 ± 5.1% | 54 | 58 | 4 | 0 |
| 2 | Baidu/Ernie-4.5-21b-A3b | 84.8 ± 8.1% | 42 | 49 | 6 | 1 |
| 3 | Baidu/Ernie-4.5-21b-A3b-Thinking | 78.9 ± 11.0% | 28 | 35 | 7 | 0 |
| 4 | Baidu/Ernie-4.5-Vl-424b-A47b | 74.7 ± 13.0% | 22 | 29 | 5 | 2 |
| 5 | Baidu/Ernie-4.5-Vl-28b-A3b | 70.1 ± 16.0% | 15 | 21 | 4 | 2 |