Qwen Model Performance

39 Qwen models evaluated

Model Performance

| Rank | Model | Accuracy | Correct | Total | Incorrect | Errors |
|---|---|---|---|---|---|---|
| 1 | Qwen/Qwen3-Max | 98.9 ± 1.1% | 61 | 61 | 0 | 0 |
| 2 | Qwen/Qwen3-Next-80b-A3b-Instruct | 97.3 ± 2.3% | 59 | 60 | 1 | 0 |
| 3 | Qwen/Qwen-Plus | 97.2 ± 2.4% | 57 | 58 | 1 | 0 |
| 3 | Qwen/Qwen3-Vl-30b-A3b-Instruct | 97.2 ± 2.4% | 57 | 58 | 1 | 0 |
| 4 | Qwen/Qwen3-30b-A3b-Thinking-2507 | 97.1 ± 2.5% | 56 | 57 | 1 | 0 |
| 4 | Qwen/Qwen3-Next-80b-A3b-Thinking | 97.1 ± 2.5% | 56 | 57 | 1 | 0 |
| 4 | Qwen/Qwen3-Vl-235b-A22b-Instruct | 97.1 ± 2.5% | 56 | 57 | 1 | 0 |
| 5 | Qwen/Qwen3-235b-A22b-Thinking-2507 | 96.9 ± 2.6% | 52 | 53 | 1 | 0 |
| 6 | Qwen/Qwq-32b | 96.9 ± 2.7% | 51 | 52 | 1 | 0 |
| 7 | Qwen/Qwen-Vl-Max | 95.6 ± 3.4% | 57 | 59 | 2 | 0 |
| 8 | Qwen/Qwen-2.5-Coder-32b-Instruct | 95.5 ± 3.4% | 56 | 58 | 2 | 0 |
| 8 | Qwen/Qwen-Max | 95.5 ± 3.4% | 56 | 58 | 1 | 1 |
| 8 | Qwen/Qwen-Plus-2025-07-28:thinking | 95.5 ± 3.4% | 56 | 58 | 2 | 0 |
| 9 | Qwen/Qwen-Plus-2025-07-28 | 95.4 ± 3.5% | 55 | 57 | 2 | 0 |
| 9 | Qwen/Qwen3-235b-A22b-2507 | 95.4 ± 3.5% | 55 | 57 | 2 | 0 |
| 10 | Qwen/Qwen3-235b-A22b:free | 95.0 ± 3.8% | 50 | 52 | 1 | 1 |
| 11 | Qwen/Qwen3-Vl-30b-A3b-Thinking | 94.1 ± 4.5% | 42 | 44 | 2 | 0 |
| 12 | Qwen/Qwen3-14b | 93.8 ± 4.3% | 55 | 58 | 3 | 0 |
| 12 | Qwen/Qwen3-30b-A3b-Instruct-2507 | 93.8 ± 4.3% | 55 | 58 | 3 | 0 |
| 13 | Qwen/Qwen3-Coder-Plus | 93.6 ± 4.5% | 53 | 56 | 3 | 0 |
| 14 | Qwen/Qwen3-8b | 93.1 ± 4.8% | 49 | 52 | 3 | 0 |
| 15 | Qwen/Qwen3-Vl-8b-Instruct | 92.0 ± 5.1% | 53 | 57 | 4 | 0 |
| 16 | Qwen/Qwen3-Vl-8b-Thinking | 90.3 ± 6.2% | 43 | 47 | 1 | 3 |
| 17 | Qwen/Qwen3-4b:free | 89.1 ± 10.5% | 5 | 5 | 0 | 0 |
| 18 | Qwen/Qwen3-Vl-235b-A22b-Thinking | 86.3 ± 8.2% | 35 | 40 | 2 | 3 |
| 19 | Qwen/Qwen3-Coder | 85.3 ± 7.8% | 44 | 51 | 4 | 3 |
| 20 | Qwen/Qwen2.5-Vl-32b-Instruct | 81.1 ± 10.4% | 28 | 34 | 4 | 2 |
| 21 | Qwen/Qwen-Vl-Plus | 78.9 ± 11.0% | 28 | 35 | 6 | 1 |
| 21 | Qwen/Qwen2.5-Vl-72b-Instruct | 78.9 ± 11.0% | 28 | 35 | 6 | 1 |
| 21 | Qwen/Qwen3-30b-A3b | 78.9 ± 11.0% | 28 | 35 | 6 | 1 |
| 22 | Qwen/Qwen3-235b-A22b | 77.3 ± 12.4% | 22 | 28 | 2 | 4 |
| 23 | Qwen/Qwen-2.5-72b-Instruct | 76.3 ± 12.3% | 24 | 31 | 6 | 1 |
| 24 | Qwen/Qwen3-32b | 74.7 ± 13.8% | 19 | 25 | 6 | 0 |
| 25 | Qwen/Qwen3-Coder-Flash | 71.9 ± 14.3% | 19 | 26 | 6 | 1 |
| 26 | Qwen/Qwen-2.5-7b-Instruct | 69.7 ± 15.3% | 17 | 24 | 5 | 2 |
| 26 | Qwen/Qwen-Turbo | 69.7 ± 15.3% | 17 | 24 | 5 | 2 |
| 26 | Qwen/Qwen3-Coder-30b-A3b-Instruct | 69.7 ± 15.3% | 17 | 24 | 6 | 1 |
| 27 | Qwen/Qwen-2.5-Vl-7b-Instruct | 32.1 ± 33.0% | 2 | 7 | 3 | 2 |
| 28 | Qwen/Qwen2.5-Coder-7b-Instruct | 22.8 ± 35.0% | 1 | 6 | 3 | 2 |