Microsoft Model Performance

7 Microsoft models evaluated

Model Performance

| Rank | Model | Accuracy | Correct | Total | Incorrect | Errors |
|------|-------|----------|---------|-------|-----------|--------|
| 1 | Microsoft/Phi-4-Reasoning-Plus | 90.4 ± 5.7% | 53 | 58 | 4 | 1 |
| 2 | Microsoft/Phi-4-Multimodal-Instruct | 73.9 ± 13.4% | 21 | 28 | 7 | 0 |
| 3 | Microsoft/Phi-4 | 65.5 ± 18.2% | 12 | 18 | 5 | 1 |
| 4 | Microsoft/Wizardlm-2-8x22b | 46.0 ± 26.4% | 5 | 11 | 2 | 4 |
| 5 | Microsoft/Phi-3-Medium-128k-Instruct | 22.8 ± 35.0% | 1 | 6 | 2 | 3 |
| 6 | Microsoft/Phi-3-Mini-128k-Instruct | 12.9 ± 39.2% | 0 | 4 | 2 | 2 |
| 6 | Microsoft/Phi-3.5-Mini-128k-Instruct | 12.9 ± 39.2% | 0 | 4 | 1 | 3 |
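
The count columns can be cross-checked with a short script. This is a minimal sketch, assuming that Total = Correct + Incorrect + Errors (which holds for every row above) and that a raw accuracy is simply Correct / Total. The page does not state how its Accuracy and ± values are derived, and the raw ratios differ from the listed figures (for example, 53/58 is about 91.4%, against a listed 90.4%), so the listed values appear to be an adjusted estimate rather than the plain ratio. The names `ROWS`, `raw_accuracy`, and `counts_consistent` are illustrative only.

```python
# Counts copied from the table: (model, correct, total, incorrect, errors).
ROWS = [
    ("Microsoft/Phi-4-Reasoning-Plus", 53, 58, 4, 1),
    ("Microsoft/Phi-4-Multimodal-Instruct", 21, 28, 7, 0),
    ("Microsoft/Phi-4", 12, 18, 5, 1),
    ("Microsoft/Wizardlm-2-8x22b", 5, 11, 2, 4),
    ("Microsoft/Phi-3-Medium-128k-Instruct", 1, 6, 2, 3),
    ("Microsoft/Phi-3-Mini-128k-Instruct", 0, 4, 2, 2),
    ("Microsoft/Phi-3.5-Mini-128k-Instruct", 0, 4, 1, 3),
]


def raw_accuracy(correct: int, total: int) -> float:
    """Plain ratio of correct answers to all attempts, in percent.

    Assumed definition; the page's own Accuracy column uses an unstated method.
    """
    return 100.0 * correct / total


def counts_consistent(correct: int, total: int, incorrect: int, errors: int) -> bool:
    """Check the assumed identity Total = Correct + Incorrect + Errors."""
    return correct + incorrect + errors == total


if __name__ == "__main__":
    for model, correct, total, incorrect, errors in ROWS:
        ok = counts_consistent(correct, total, incorrect, errors)
        print(f"{model}: raw {raw_accuracy(correct, total):.1f}% (counts consistent: {ok})")
```

With as few as 4 to 11 attempts for the lower-ranked models, the wide ± ranges in the table are expected, and the ordering among those models should be read with caution.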