Meta-llama Model Performance

17 Meta-llama models evaluated

Model Performance

Rank	Model	Accuracy	Correct	Total	Incorrect	Errors
1	`Meta-llama/Llama-4-Maverick:free`	95.6 ± 3.3%	58	60	1	1
2	`Meta-llama/Llama-4-Maverick`	92.1 ± 5.1%	54	58	2	2
3	`Meta-llama/Llama-3.3-70b-Instruct`	84.1 ± 8.4%	40	47	5	2
4	`Meta-llama/Llama-3.3-70b-Instruct:free`	78.9 ± 11.0%	28	35	3	4
5	`Meta-llama/Llama-3.2-90b-Vision-Instruct`	74.7 ± 13.0%	22	29	4	3
6	`Meta-llama/Llama-4-Scout`	70.9 ± 14.8%	18	25	1	6
7	`Meta-llama/Llama-3-70b-Instruct`	63.6 ± 19.1%	11	17	5	1
8	`Meta-llama/Llama-4-Scout:free`	60.3 ± 19.4%	11	18	3	4
9	`Meta-llama/Llama-3.1-405b-Instruct`	56.5 ± 22.2%	8	14	3	3
9	`Meta-llama/Llama-3.1-70b-Instruct`	56.5 ± 22.2%	8	14	4	2
10	`Meta-llama/Llama-3.2-3b-Instruct`	41.2 ± 28.0%	4	10	5	1
11	`Meta-llama/Llama-3.3-8b-Instruct:free`	39.3 ± 30.8%	3	8	1	4
12	`Meta-llama/Llama-3-8b-Instruct`	32.1 ± 33.0%	2	7	4	1
13	`Meta-llama/Llama-3.1-405b`	29.3 ± 54.9%	0	1	1	0
14	`Meta-llama/Llama-3.1-8b-Instruct`	26.4 ± 37.7%	1	5	3	1
15	`Meta-llama/Llama-3.2-11b-Vision-Instruct`	12.9 ± 39.2%	0	4	1	3
15	`Meta-llama/Llama-3.2-3b-Instruct:free`	12.9 ± 39.2%	0	4	3	1
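
The count columns are internally consistent: in every row, Correct + Incorrect + Errors equals Total. The Accuracy column, however, is not the raw Correct / Total ratio (58/60 is about 96.7%, while the table lists 95.6%), so it appears to be an adjusted estimate whose method is not stated here. The following is a minimal Python sketch, using a few rows copied from the table, that checks the counts and computes the raw ratio for comparison. The names `ROWS`, `counts_consistent`, and `raw_accuracy` are illustrative and not part of any evaluation tool.

```python
# Rows copied from the table above: (model, correct, total, incorrect, errors).
ROWS = [
    ("Meta-llama/Llama-4-Maverick:free", 58, 60, 1, 1),
    ("Meta-llama/Llama-4-Maverick", 54, 58, 2, 2),
    ("Meta-llama/Llama-3.3-70b-Instruct", 40, 47, 5, 2),
    ("Meta-llama/Llama-3.1-405b", 0, 1, 1, 0),
]


def counts_consistent(correct, total, incorrect, errors):
    """Return True when the three outcome counts add up to Total."""
    return correct + incorrect + errors == total


def raw_accuracy(correct, total):
    """Raw fraction correct as a percentage, rounded to one decimal.

    This is NOT the table's Accuracy column, which uses an unstated
    adjustment; it is only a baseline for comparison.
    """
    return round(100.0 * correct / total, 1)


if __name__ == "__main__":
    for model, correct, total, incorrect, errors in ROWS:
        ok = counts_consistent(correct, total, incorrect, errors)
        print(f"{model}: raw {raw_accuracy(correct, total)}%, counts consistent: {ok}")
```

Comparing the raw ratio with the listed Accuracy shows how much the adjustment matters at small sample sizes: for `Meta-llama/Llama-3.1-405b`, the raw ratio is 0% on a single question, while the table lists 29.3 ± 54.9%.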