EDified Strategies

AI Tutoring Works: What Harvard's Randomized Controlled Trial Reveals About AI in Learning

March 21, 2026

A Harvard RCT showed AI tutoring outperforming classroom active learning with effect sizes of 0.73-1.3 SD. What this means for Christian schools, and why it does not mean replacing teachers.

A rigorous randomized controlled trial published in Nature's Scientific Reports (2025) demonstrated an AI tutor outperforming classroom-based active learning. The findings merit attention: they are methodologically sound, counterintuitive, and carefully qualified. Each of those qualities requires careful interpretation.

## The Study

Harvard investigators (Kestin, Miller, McCarty, Kastner, Ruch, and Stubbs) ran a randomized controlled trial comparing AI-tutored learning against classroom-based active learning, itself a well-established and robust pedagogical method. Participants were Harvard undergraduates in introductory physics, and the outcome measure was performance on course assessments. The finding: AI-tutored students substantially outperformed their active-learning peers.

## The Effect Size

Effect sizes of 0.73 to 1.3 standard deviations represent a substantial educational impact. For context: 0.73 SD is large by the standards of education research, and 1.3 SD is extraordinary. Concretely, if standard classroom active learning produced 65% content mastery, a 0.73 SD gain would shift that to roughly 78-80%, and the 1.3 SD result would move it toward 85-90%. Effects of this magnitude would justify significant changes to instructional practice, if they prove replicable across contexts.

## The Efficiency Gain

The learning also happened faster. AI-tutored students needed a median of 49 minutes, while the classroom active-learning sessions ran 60 minutes. The AI tutor produced better learning in less time.

## The Boundary Conditions

Every credible study identifies its limits, and this one is no exception.

The population was Harvard undergraduates, among the most academically prepared students available for study. Whether similar effects would appear with typical K-12 students, students with learning differences, or students lacking foundational skills remains an open question.

The domain was introductory physics, a well-structured subject with clear right answers and logical progressions. AI tutoring may perform differently in subjects requiring interpretation, creativity, or values-based reasoning.

The study measured short-term performance on course assessments. It did not measure long-term retention, transfer to new problems, or the kinds of deeper learning that emerge over months and years of instruction.

## What This Means for Christian Schools

The temptation will be to read this as evidence that AI can replace teachers. It cannot, and the study does not claim that it can. What it demonstrates is that AI tutoring can be an effective supplement to instruction, particularly in well-structured content domains where students benefit from immediate, personalized feedback.

For Christian schools, the implications are practical. AI tutoring could help students who are struggling with foundational content catch up more efficiently, freeing teacher time for the relational and formational work that AI cannot do. It could provide practice and reinforcement in subjects like math, science, and foreign language, where repetition and immediate feedback accelerate learning.

The key is deployment context. An AI tutor used to supplement a teacher who knows and cares about the student is redemptive. An AI tutor deployed as a replacement for human instruction is not, regardless of its effect size.

**Citation:** Kestin, G., Miller, K., McCarty, L. S., Kastner, K., Ruch, G., & Stubbs, C. (2025). AI Tutoring Outperforms In-Class Active Learning: An RCT. Scientific Reports (Nature).

About Sean Riley

Sean A. Riley, Ph.D. helps Christian school leaders navigate AI with wisdom, clarity, and practical strategy. He serves as Chief Strategy Officer at The Stony Brook School and Executive Director of Gravitas.
