Urban school districts are in crisis. Student and teacher absenteeism, special education referrals, mental health complications and violence within and outside schools are all on the rise as student enrollment and state funding are in free fall. Morale is low for teachers, principals and district leaders.
Compounding these challenges, federal pandemic relief education funding (known as ESSER) ends in September 2024. Recent in-depth case studies of Chicago and Baltimore City Public Schools and my own research, including candid conversations with current and former big-city superintendents, have convinced me of a stark reality: States and cities must either empower bold leaders to make dramatic changes or step in to make those changes themselves.
It was impossible not to be moved by the courage the school leaders I spoke with displayed. Yet it was also obvious that the powers these district leaders possess are narrower than the challenges they face — and that they will need support from governors, state school chiefs, mayors and other leaders.
One superintendent lamented the incessant political scrutiny and media criticism he’s encountered, noting, “You can’t make an error without it being spread all over social media.”
Meanwhile, principals are also under pressure; many are now serving not only as instructional leaders but also as food bank organizers and mental health crisis counselors. “This job is becoming unsustainable for people to be able to have a healthy life,” one superintendent said.
Another superintendent emphasized the challenge of finding math teachers proficient enough to teach their subject, a problem exacerbated by state hiring regulations and union rules that prevent the assessment of candidates’ knowledge. “Most teachers are not even two grade levels above students in their math content knowledge,” she said.
Related: Become a lifelong learner. Subscribe to our free weekly newsletter to receive our comprehensive reporting directly in your inbox.
The best big-city district leaders know that their jobs now include resetting how public education operates. “What’s happening in schools is not just incompatible with what we want kids to do but also with the outside workforce,” a former superintendent said. “Everything outside of schools is getting more modern, hybrid, etc. Yet schools are still the same.”
These district leaders believe that learning must now be a 12-month enterprise, especially for the kids who fell behind during the pandemic.
Several leaders pointed to data showing that advances in teaching strategies are starting to work and noted that innovations in generative AI and team-based staffing could make teachers’ jobs easier, and partnerships with community services could help students with mental health challenges.
But superintendents cannot make these changes alone: Their only route to survival is with support from their cities and states.
When the fiscal cliff collides with enrollment declines, many states may be forced to put urban districts into receivership. Here are five ways state and city leaders can help urban superintendents and students now:
1. Provide political protection and regulatory relief for bold leaders.
States should provide financial relief, political cover and regulatory flexibility for districts that demonstrate solid plans and strong leadership. Superintendents must not be hamstrung by local rules preventing them from, for example, screening new teachers for math knowledge or insisting that teachers use evidence-based instructional materials.
2. Update old policies to meet new challenges.
States can help by updating their assessment and accountability systems so they better measure and incentivize career-linked skills and credentials. As one leader said, “I do see a lot of potential” for more “paid apprenticeships, etc., but none of them fit in the state and federal accountability systems.”
3. Stay in the game.
State leaders cannot expect to intervene briefly and then return to serene detachment. Improving urban districts takes fortitude, vision and a willingness to persist through objections from entrenched interest groups. New York City and New Orleans demonstrated significant gains under state and city intervention, but status quo forces and flagging state support upended their progress.
4. Help districts forge new alliances to adopt new strategies.
States can facilitate partnerships with employers, social services and higher education institutions by providing tax incentives and grants. They can encourage new, more sustainable staffing models, such as working in teams, and the use of AI to ease teacher workloads. They can bring in nonprofit transformation experts.
5. Have a Plan B.
Not all urban school districts have bold leadership that can help them overcome the odds, even with strong state-level support. State leaders must be willing to make alternative provisions for students, such as authorizing the establishment of high-performing public charter schools, mandating tutoring and supporting community-led initiatives to address student needs.
Millions of young people are leaving high school without being ready for college. Generational poverty and its accompanying social ills are being hardwired into our cities. Inaction is not an option. State and city leaders must recognize that urban districts can and must be transformed — and it will not happen without their help.
Governors, mayors, state legislators and state school chiefs must back courageous urban district leadership. And they must prepare to intervene when urban district leaders cannot overcome the overwhelming odds stacked against them.
Robin J. Lake is director of the Center on Reinventing Public Education, a nonpartisan research and policy center at Arizona State University’s Mary Lou Fulton Teachers College.
This story about urban school districts was produced by The Hechinger Report, a nonprofit, independent news organization focused on inequality and innovation in education. Sign up for Hechinger’s weekly newsletter.
The Hechinger Report provides in-depth, fact-based, unbiased reporting on education that is free to all readers. But that doesn’t mean it’s free to produce. Our work keeps educators and the public informed about pressing issues at schools and on campuses throughout the country. We tell the whole story, even when the details are inconvenient. Help us keep doing that.
As the use of artificial intelligence grows, teachers are trying to protect the integrity of their educational practices and systems. When we see what AI can do in the hands of our students, it's hard to stay neutral about whether and how to use it.
Of course, we worry about cheating; AI can be used to write essays and solve math problems.
But we also have deeper concerns regarding learning. When our students use AI, they may not be engaging as deeply with our assignments and coursework.
They have discovered ways AI can be used to create essay outlines and help with project organization and other such tasks that are key components of the learning process.
Some of this could be good. AI is a fabulous tool for getting started or unstuck. AI puts together old ideas in new ways and can do this at scale: It will make creativity easier for everyone.
But this very ease has teachers wondering how we can keep our students motivated to do the hard work when there are so many new shortcuts. Learning goals, curriculums, courses and the way we grade assignments will all need to be reevaluated.
The new realities of work must also be considered. Employers' job postings increasingly reward candidates with AI skills, and many companies report that they have already adopted generative AI tools or expect to incorporate them into their workflows soon.
A core tension has emerged: Many teachers want to keep AI out of our classrooms, but also know that future workplaces may demand AI literacy.
What we call cheating, business could see as efficiency and progress.
The complexities, opportunities and decisions that lie between banning AI and teaching AI are significant.
It is increasingly likely that using AI will become an essential skill for students, regardless of their career ambitions, and that educational institutions must act as a result.
Integrating AI into the curriculum will require change. The best starting point is a better understanding of what AI literacy looks like in our current landscape.
In our new book, we make it clear that the specifics of AI literacy will vary somewhat from one subject to the next, but there are some AI capacities that everyone will now need.
Before even writing a prompt, the AI user should develop an understanding of the following:
the role of human / AI collaborations
how to navigate the ethical implications of using AI for a given purpose
which AI tool to use (when and why)
how to use their selected AI tool fully and successfully
the limitations of generative AI systems and how to work around them
prompt engineering and all of its nuances
This knowledge will help our students write successful prompts, but additional skills and AI literacy will be required once AI returns a response. These include the abilities to:
review and evaluate AI-produced content, including how to determine its accuracy and recognize bias
edit AI content for its intended audience and purpose
follow up with AI to refine the output
take responsibility for the quality of the final work
The development of AI literacy mirrors the development of other key skills, such as critical thinking. Teaching AI literacy begins by teaching the capacities above, as well as others specific to your own subject.
While the inclination may be to start teaching AI literacy by opening a browser, faculty should begin by providing an ethical and environmental context regarding the use of AI and the responsibilities each of us has when working with AI.
Amazon Web Services recently surveyed employers from all business sectors about what skills employees need to use AI well. In ranked order, their answers included the following:
critical thinking and problem solving
creative thinking and design competence
technical proficiency
ethics and risk management
communication
math
teamwork
management
writing
Higher education is quite adept at teaching such skills, and many of those noted are among the American Association of Colleges and Universities’ (AAC&U) list of “essential learning outcomes” for higher education.
Faculty will need to improve their own AI literacy and explore the most advanced generative AI tools (currently GPT-4o, Gemini 1.5 and Claude 3.5). A good way to begin is to ask AI to perform assignments and projects that you typically ask your students to complete — and then try to improve the AI's response.
Understanding what AI can and cannot do well within the context of your course will be key as you contemplate revising your assignments and teaching.
Faculty should also find out if their college has an advisory board made up of past students and/or employers. Reach out to them for firsthand insight on how AI is shifting the landscape — and keep that conversation going over time. That information will be essential as you think about AI literacy within your subjects and courses.
These actions will ultimately position you to be able to navigate the complexities and decisions that lie between ban and teach.
C. Edward Watson is vice president for digital innovation with the American Association of Colleges and Universities (AAC&U). José Antonio Bowen is a former president of Goucher College and co-author with Watson of “Teaching with AI: A Practical Guide to a New Era of Human Learning.”
Ever since the pandemic shut down schools in the spring of 2020, education researchers have pointed to tutoring as the most promising way to help kids catch up academically. Evidence from almost 100 studies overwhelmingly supported a particular kind of tutoring, called high-dosage tutoring, in which students focus on either reading or math three to five times a week.
But until recently, there has been little good evidence for the effectiveness of online tutoring, where students and tutors interact via video, text chat and whiteboards. The virtual version has boomed since the federal government handed schools nearly $190 billion of pandemic recovery aid and specifically encouraged them to spend it on tutoring. Now, some new U.S. studies could offer useful guidance to educators.
Online attendance is a struggle
In the spring of 2023, almost 1,000 Northern California elementary school children in grades 1 to 4 were randomly assigned to receive online reading tutoring during the school day. Students were supposed to get 20 to 30 sessions each, but only one in five received that much. The 80 percent who got fewer sessions didn't do much better than the 800 students in the comparison group who received no tutoring, according to a draft paper by researchers from Teachers College, Columbia University, which was posted to the Annenberg Institute website at Brown University in April 2024. (The Hechinger Report is an independent news organization based at Teachers College, Columbia University.)
Researchers have previously found that it is important to schedule in-person tutoring sessions during the school day, when attendance is mandatory. The lesson here with online tutoring is that attendance can be rocky even during the school day. Often, students end up with a low dose of tutoring instead of the high dose that schools have paid for.
However, online tutoring can be effective when students participate regularly. In this Northern California study, reading achievement increased substantially, in line with in-person tutoring, for the roughly 200 students who got at least 20 sessions across 10 weeks.
The students who logged in regularly might have been more motivated in the first place, the researchers warned, suggesting that such large academic benefits could be hard to reproduce for all students. During the periods when children were supposed to receive tutoring, researchers observed that some children, often slightly higher-achieving ones, regularly logged on as scheduled while others didn't. The researchers didn't explain this difference in behavior or what the absent students were doing instead. Students also seemed to log in more frequently when certain staff members were overseeing the tutoring and less frequently with others.
Small group tutoring doesn’t work as well online
The large math and reading gains that researchers documented in small groups of students with in-person tutors aren’t always translating to the virtual world.
Another study of more than 2,000 elementary school children in Texas tested the difference between one-to-one and two-to-one online tutoring during the 2022-23 school year. These were young, low-income children, in kindergarten through 2nd grade, who were just learning to read. Children who were randomly assigned to get one-to-one tutoring four times a week posted small gains on one test, but not on another, compared to students in a comparison group who didn’t get tutoring. First graders assigned to one-to-one tutoring gained the equivalent of 30 additional days of school. By contrast, children who had been tutored in pairs were statistically no different in reading than the comparison group of untutored children. A draft paper about this study, led by researchers from Stanford University, was posted to the Annenberg website in May 2024.
Another small study, in Grand Forks, North Dakota, confirmed the downside of larger groups with online tutoring. Researchers from Brown University directly compared the math progress of middle school students when they received one-to-one tutoring versus small groups of three students. The study was too small, only 180 students, to produce statistically strong results, but the half who were randomly assigned to receive individual tutoring appeared to gain eight extra percentile points compared to the students assigned to small group tutoring. The researchers estimated that students in the small groups may have learned only a third as much math, and possibly much less. A draft of this paper was posted to the Annenberg website in June 2024.
In surveys, tutors said it was hard to keep all three kids engaged online at once. Students were more frequently distracted and off-task, they said. Shy students were less likely to speak up and participate. With one student at a time, tutors said they could move at a faster pace and students “weren’t afraid to ask questions” or “afraid of being wrong.” (On the plus side, tutors said groups of three allowed them to organize group activities or encourage a student to help a peer.)
Behavior problems happen in person too. However, when I have observed in-person small group tutoring in schools, each student is often working independently with the tutor, almost like three simultaneous sessions of one-to-one help. In-person tutors can encourage a student to keep practicing through a silent glance, a smile or hand signal even as they are explaining something to another student. Online, each child’s work and mistakes are publicly exposed on the screen to the whole group. Private asides aren’t as easy; some platforms allow the tutor to text a child privately in a chat window, but that takes time. Tutors have told me that many teens don’t like seeing their face on screen, but turning the camera off makes it harder for them to sense if a student is following along or confused.
Matt Kraft, one of the Brown researchers on the Grand Forks study, suggests that bigger changes need to be made to online tutoring lessons in order to expand from one-to-one to small group tutoring, and he notes that school staff are needed in the classroom to keep students on-task.
School leaders have until March 2026 to spend the remainder of their $190 billion in pandemic recovery funds, but contracts with tutoring vendors must be signed by September 2024. Both options — in person and virtual — involve tradeoffs. New research evidence is showing that virtual tutoring can work well, especially when motivated students want the tutoring and log in regularly. But many of the students who are significantly behind grade level and in need of extra help may not be so motivated. Keeping the online tutoring small, ideally one-to-one, improves the chances that it will be effective. But that means serving many fewer students, leaving millions of children behind. It’s a tough choice.
If we are to believe the current rapturous cheerleading around artificial intelligence, education is about to be transformed. Digital educators, alert and available at all times, will soon replace their human counterparts and feed students with concentrated personalized content.
It’s reminiscent of a troubling experiment from the 1960s, immortalized in one touching image: an infant monkey, clearly scared, clutching a crude cloth replica of the real mother it has been deprived of. Next to it is a roll of metal mesh with a feeding bottle attached. The metal mom supplies milk, while the cloth mom sits inert. And yet, in moments of stress, it is the latter the infant seeks succor from.
Notwithstanding its distressing provenance, this image has bearing on a topical question: What role should AI play in our children’s education? And in school counseling? Here’s one way to think about these questions.
With its detached efficiency, an AI system is like the metal mesh mother — capable of delivering information, but little else. Human educators — the teachers and the school counselors with whom students build emotional bonds and relationships of trust — are like the cloth mom.
It would be folly to replace these educators with digital counterparts. We don't need to look very far back to validate this claim. Just over a decade ago, we were gripped by the euphoria around MOOCs — educational videos accessible to all via the Internet.
“The end of classroom education!” “An inflection point!” screamed breathless headlines. The reality turned out to be a lot less impressive.
MOOCs wound up playing a helpful supporting role in education, but the stars of the show remained the human teachers; in-person learning environments turned out to be essential. The failures of remote learning during Covid support the same conclusion. A similar narrative likely will (and we argue, ought to) play out in the context of AI and school counseling.
Guidance for our children must keep caring adults at its core. Counselors play an indispensable role in helping students find their paths through the school maze. Their effectiveness is driven by their expertise, empathy and ability to be confidants to students in moments of doubt and stress.
At least, that is how counseling is supposed to work. In reality, the counseling system is under severe stress.
The American School Counselor Association recommends a student-to-counselor ratio of 250-to-1, yet the actual average was 385-to-1 for the 2022–23 school year, the most recent year for which data is available. In many schools the ratio is far higher.
Even for the most dedicated counselor, such a ratio makes it impossible to spend much time getting to know any one student; the counselor has to focus on administrative work like schedule changes and urgent issues like mental health. This constraint on availability has cascading effects, limiting the counselor’s ability to personalize advice and recommendations.
Students sense that their counselors are rushed or occupied with other crises and feel hesitant to ask for more advice and support from these caring adults. Meanwhile, the counselors are assigned extraneous tasks like lunch duty and attendance support, further scattering their attention.
Against this dispiriting backdrop, it is tempting to turn to AI as a savior. Can’t generative AI systems be deployed as virtual counselors that students can interact with and get recommendations from? As often as they want? On any topic? Costing a fraction of the $60,000 annual salary of a typical human school counselor?
Given the fantastic recent leaps in the capabilities of AI systems, answers to all these questions appear to be a resounding yes: There is a compelling case to be made for having AI play a role in school counseling. But it is not one of replacement.
AI’s ability to process vast amounts of data and offer personalized recommendations makes it well-suited for enhancing the counseling experience. By analyzing data on a student’s personality and interests, AI can facilitate more meaningful interactions between the student and their counselor and lay the groundwork for effective goal setting.
AI also excels at breaking down complex tasks into manageable steps, turning goals into action plans. This work is often time-consuming for human counselors, but it’s easy for AI, making it an invaluable ally in counseling sessions.
By leveraging AI to augment traditional approaches, counselors can allocate more time to providing critical social and emotional support and fostering stronger mentorship relationships with students.
Incorporating AI into counseling services also brings long-term benefits: AI systems can track recommendations and student outcomes, and thus continuously improve system performance over time. Additionally, AI can stay abreast of emerging trends in the job market so that counselors can offer students cutting-edge guidance on future opportunities.
And AI add-ons are well-suited to provide context-specific suggestions and information — such as for courses and local internships — on an as-needed basis and to adapt to a student’s changing interests and goals over time.
As schools grapple with declining budgets and chronic absenteeism, the integration of AI into counseling services offers a remarkable opportunity to optimize counseling sessions and establish support systems beyond traditional methods.
Still, it is an opportunity we must approach with caution. Human counselors serve an essential and irreplaceable role in helping students learn about themselves and explore college and career options. By harnessing the power of AI alongside human strengths, counseling services can evolve to meet the diverse needs of students in a highly personalized, engaging and goal-oriented manner.
Izzat Jarudi is co-founder and CEO of Edifii, a startup offering digital guidance assistance for high school students and counselors supported by the U.S. Department of Education's SBIR program. Pawan Sinha is a professor of neuroscience and AI at MIT and Edifii's co-founder and chief scientist. Carolyn Stone, past president of the American School Counselor Association, contributed to this piece.
When ChatGPT was released to the public in November 2022, advocates and watchdogs warned about the potential for racial bias. The new large language model was created by harvesting 300 billion words from books, articles and online writing, which include racist falsehoods and reflect writers’ implicit biases. Biased training data is likely to generate biased advice, answers and essays. Garbage in, garbage out.
Researchers are starting to document how AI bias manifests in unexpected ways. Inside the research and development arm of the giant testing organization ETS, which administers the SAT, a pair of investigators pitted man against machine in evaluating more than 13,000 essays written by students in grades 8 to 12. They discovered that the AI model that powers ChatGPT penalized Asian American students more than other races and ethnicities in grading the essays. This was purely a research exercise and these essays and machine scores weren’t used in any of ETS’s assessments. But the organization shared its analysis with me to warn schools and teachers about the potential for racial bias when using ChatGPT or other AI apps in the classroom.
AI and humans scored essays differently by race and ethnicity
“Diff” is the difference between the average score given by humans and GPT-4o in this experiment. “Adj. Diff” adjusts this raw number for the randomness of human ratings. Source: Table from Matt Johnson & Mo Zhang “Using GPT-4o to Score Persuade 2.0 Independent Items” ETS (June 2024 draft)
“Take a little bit of caution and do some evaluation of the scores before presenting them to students,” said Mo Zhang, one of the ETS researchers who conducted the analysis. “There are methods for doing this and you don’t want to take people who specialize in educational measurement out of the equation.”
That might sound self-serving for an employee of a company that specializes in educational measurement. But Zhang’s advice is worth heeding in the excitement to try new AI technology. There are potential dangers as teachers save time by offloading grading work to a robot.
In ETS’s analysis, Zhang and her colleague Matt Johnson fed 13,121 essays into one of the latest versions of the AI model that powers ChatGPT, called GPT 4 Omni or simply GPT-4o. (This version was added to ChatGPT in May 2024, but when the researchers conducted this experiment they used the latest AI model through a different portal.)
A little background about this large bundle of essays: students across the nation had originally written these essays between 2015 and 2019 as part of state standardized exams or classroom assessments. Their assignment had been to write an argumentative essay, such as “Should students be allowed to use cell phones in school?” The essays were collected to help scientists develop and test automated writing evaluation.
Each of the essays had been graded by expert raters of writing on a 1-to-6 point scale with 6 being the highest score. ETS asked GPT-4o to score them on the same six-point scale using the same scoring guide that the humans used. Neither man nor machine was told the race or ethnicity of the student, but researchers could see students’ demographic information in the datasets that accompany these essays.
GPT-4o marked the essays almost a point lower than the humans did. The average score across the 13,121 essays was 2.8 for GPT-4o and 3.7 for the humans. But Asian Americans were docked by an additional quarter point. Human evaluators gave Asian Americans a 4.3, on average, while GPT-4o gave them only a 3.2 – roughly a 1.1 point deduction. By contrast, the score difference between humans and GPT-4o was only about 0.9 points for white, Black and Hispanic students. Imagine an ice cream truck that kept shaving off an extra quarter scoop only from the cones of Asian American kids.
“Clearly, this doesn’t seem fair,” wrote Johnson and Zhang in an unpublished report they shared with me. Though the extra penalty for Asian Americans wasn’t terribly large, they said, it’s substantial enough that it shouldn’t be ignored.
The researchers don’t know why GPT-4o issued lower grades than humans, and why it gave an extra penalty to Asian Americans. Zhang and Johnson described the AI system as a “huge black box” of algorithms that operate in ways “not fully understood by their own developers.” That inability to explain a student’s grade on a writing assignment makes the systems especially frustrating to use in schools.
This table compares GPT-4o scores with human scores on the same batch of 13,121 student essays, which were scored on a 1-to-6 scale. Numbers highlighted in green show exact score matches between GPT-4o and humans. Unhighlighted numbers show discrepancies. For example, there were 1,221 essays where humans awarded a 5 and GPT awarded 3. Data source: Matt Johnson & Mo Zhang “Using GPT-4o to Score Persuade 2.0 Independent Items” ETS (June 2024 draft)
This one study isn't proof that AI consistently underrates essays or is biased against Asian Americans. Other versions of AI sometimes produce different results. A separate analysis of essay scoring by researchers from the University of California, Irvine, and Arizona State University found that AI essay grades were just as frequently too high as too low. That study, which used the 3.5 version of ChatGPT, did not break down results by race and ethnicity.
I wondered if AI bias against Asian Americans was somehow connected to high achievement. Just as Asian Americans tend to score high on math and reading tests, Asian Americans, on average, were the strongest writers in this bundle of 13,000 essays. Even with the penalty, Asian Americans still had the highest essay scores, well above those of white, Black, Hispanic, Native American or multi-racial students.
In both the ETS and UC-ASU essay studies, AI awarded far fewer perfect scores than humans did. For example, in this ETS study, humans awarded 732 perfect 6s, while GPT-4o gave out a grand total of only three. GPT’s stinginess with perfect scores might have affected a lot of Asian Americans who had received 6s from human raters.
ETS’s researchers had asked GPT-4o to score the essays cold, without showing the chatbot any graded examples to calibrate its scores. It’s possible that a few sample essays or small tweaks to the grading instructions, or prompts, given to ChatGPT could reduce or eliminate the bias against Asian Americans. Perhaps the robot would be fairer to Asian Americans if it were explicitly prompted to “give out more perfect 6s.”
The ETS researchers told me this wasn’t the first time that they’ve noticed Asian students treated differently by a robo-grader. Older automated essay graders, which used different algorithms, have sometimes done the opposite, giving Asians higher marks than human raters did. For example, an ETS automated scoring system developed more than a decade ago, called e-rater, tended to inflate scores for students from Korea, China, Taiwan and Hong Kong on their essays for the Test of English as a Foreign Language (TOEFL), according to a study published in 2012. That may have been because some Asian students had memorized well-structured paragraphs, while humans easily noticed that the essays were off-topic. (The ETS website says it relies on the e-rater score alone only for practice tests, and uses it in conjunction with human scores for actual exams.)
It’s also unclear why these AI systems sometimes treat Asian American students differently, but the finding illustrates how important it is to test such systems before we unleash them in schools. Based on educator enthusiasm, however, I fear this train has already left the station. In recent webinars, I’ve seen many teachers post in the chat window that they’re already using ChatGPT, Claude and other AI-powered apps to grade writing. That might be a time saver for teachers, but it could also be harming students.
This story about AI bias was written by Jill Barshay and produced by The Hechinger Report, a nonprofit, independent news organization focused on inequality and innovation in education.
The Hechinger Report provides in-depth, fact-based, unbiased reporting on education that is free to all readers. But that doesn’t mean it’s free to produce. Our work keeps educators and the public informed about pressing issues at schools and on campuses throughout the country. We tell the whole story, even when the details are inconvenient. Help us keep doing that.
When I first started teaching middle school, I did everything my university prep program told me to do in what’s known as the “workshop model.”
I let kids choose their books. I determined their independent reading levels and organized my classroom library according to reading difficulty.
I then modeled various reading skills, like noticing the details of the imagery in a text, and asked my students to practice doing likewise during independent reading time.
It was an utter failure.
Kids slipped their phones between the pages of the books they selected. Reading scores stagnated. I’m pretty sure my students learned nothing that year.
Yet one aspect of this model worked seamlessly: the moments when I sat on a desk at the front of the room and read aloud from a shared classroom novel.
Kids listened, discussions arose naturally and everything seemed to click.
Slowly, the reason for these episodic successes became clear to me: Shared experiences and teacher direction are necessary for high-quality instruction and a well-run classroom.
Over time, I pieced together the idea that my students would benefit most from a teaching model that emphasized shared readings of challenging works of literature; memorization of poetry; explicit grammar instruction; contextual knowledge, including history; and teacher direction — not time practicing skills.
But even as I made changes and saw improvements, doubts nagged at me. By abandoning student choice, and asking kids to dust off Chaucer, would I snuff out their joy of reading? Is Shakespearean English simply too difficult for middle schoolers?
To set my doubts aside, I surveyed the relevant research and found that many of the assumptions upon which the workshop model was founded are simply false — starting with the assumption that reading comprehension depends on “reading comprehension skills.”
There is evidence that teaching such skills has some benefit, but what students really need in order to read with understanding is knowledge about history, geography, science, music, the arts and the world more broadly.
Perhaps the most famous piece of evidence for this knowledge-centered theory of reading comprehension is the “baseball study,” in which researchers gave children an excerpt about baseball and then tested their comprehension. At the outset of the study, researchers noted the children’s reading levels and baseball knowledge; they varied considerably.
Ultimately, the researchers found that it was each child’s prior baseball knowledge and not their predetermined reading ability that predicted their comprehension and recall of the passage.
That shouldn’t be surprising. Embedded within any newspaper article or novel is a vast amount of assumed knowledge that authors take for granted — from the fall of the Soviet Union to the importance of 1776.
Just about any student can decode the words “Berlin Wall,” but they need a knowledge of basic geography (where is Berlin?), history (why was the Berlin Wall built?) and political philosophy (what qualities of the Communist regime caused people to flee from East to West?) to grasp the full meaning of an essay or story involving the Berlin Wall.
Of course, students aren’t born with this knowledge, which is why effective teachers build students’ capacity for reading comprehension by relentlessly exposing them to content-rich texts.
My research confirmed what I had concluded from my classroom experiences: The workshop model’s text-leveling and independent reading have a weak evidence base.
Rather than obsessing over the difficulty of texts, educators would better serve students by asking themselves other questions, such as: Does our curriculum expose children to topics they might not encounter outside of school? Does it offer opportunities to discuss related historical events? Does it include significant works of literature or nonfiction that are important for understanding modern society?
In my classroom, I began to choose many books simply because of their historical significance or instructional opportunities. Reading the memoirs of Frederick Douglass with my students allowed me to discuss supplementary nonfiction texts about chattel slavery, fugitive slave laws and the Emancipation Proclamation.
Reading “The Magician’s Nephew” by C. S. Lewis prompted teaching about allusions to the Christian creation story and the myth of Narcissus, knowledge they could use to analyze future stories and characters.
Proponents of the workshop model claim that letting students choose the books they read will make them more motivated readers, increase the amount of time they spend reading and improve their literacy. The claim is widely believed.
However, it’s unclear to me why choice would necessarily foster a love of reading. To me, it seems more likely that a shared reading of a classic work, with an impassioned teacher, engaged classmates and a thoughtfully designed final project, is more motivating than reading a self-selected book in a lonely corner. That was certainly my experience.
After my classes acted out “Romeo and Juliet,” with rulers trimmed and painted to resemble swords, and read “To Kill a Mockingbird” aloud, countless students (and their parents) told me it was the first time they’d ever enjoyed reading.
They said these classics were the first books that made them think — and the first ones that they’d ever connected with.
Students don’t need hours wasted on finding a text’s main idea or noticing details. They don’t need time cloistered off with another book about basketball.
They need to experience art, literature and history that might not immediately interest them but will expand their perspective and knowledge of the world.
They need a teacher to guide them through and inspire a love and interest in this content. The workshop model doesn’t offer students what they need, but teachers still can.
Daniel Buck is an editorial and policy associate at the Thomas B. Fordham Institute and the author of “What Is Wrong with Our Schools?”
Higher education has finally come around to the idea that college should better help prepare students for careers.
It’s about time: Students’ failure to see the connection between their coursework and potential careers is a long-standing problem that must be addressed.
Over 20 years ago, I co-authored the best-selling “Quarterlife Crisis,” one of the first books to explore the transition from college to the workforce. We found, anecdotally, that recent college graduates felt inadequately prepared to choose a career or transition to life in the workforce. At that time, liberal arts institutions in particular did not view career preparation as part of their role.
While some progress has been made since then, institutions can still do a better job connecting their educational and economic mobility missions; recent research indicates that college graduates are having a hard time putting their degrees to work.
Importantly, improving career preparation can help not only with employment but also with student retention and completion.
I believe that if students have a career plan in mind, and if they better understand how coursework will help them succeed in the workforce, they will be more likely to complete that coursework, persist, graduate and succeed in their job search.
First-generation students in particular, whose parents lack college experience, may not understand why they need to take a course such as calculus, which, on the surface, does not appear to help prepare them for most jobs in the workforce.
They will benefit deeply from a clearer understanding of how such required courses connect to their career choices and skills.
Acknowledging the need for higher education to better demonstrate course-to-career linkages — and its role in workforce preparation — is an important first step.
Taking action to improve these connections will better position students and institutions. Better preparing students for the workforce will increase their success rates and, in turn, will improve college rankings on student success measures.
This might require a cultural shift in some cases, but given the soaring cost of tuition, it is necessary for institutions to think about return on investment for students and their parents, not only in intellectual terms but also monetarily.
Such a shift could help facilitate much-needed social and economic mobility, particularly for students who borrow money to attend college.
Recent articles and research about low job placement rates for college graduates often posit that internships provide the needed connection between college and careers. Real-world experience is important, but there are other ways to make a college degree more career relevant.
1. Spell out the connections for students. The class syllabus is one opportunity to make this connection for students. Faculty can explain how different coursework topics and texts translate to career skills and provide real-life examples of those skills at work. In some cases, however, this might be a tough sell for faculty who have spent their careers in the academy and do not see career counseling as part of their job.
But providing this additional information for students does not need to be a big lift and can be done in partnership with campus staff, such as career services counselors. These connections can also be made in course catalogs, on department websites and through student seminars.
2. Raise awareness of realistic careers. Many students start college with the goal of entering a commonly known profession — doctor, lawyer or teacher, to name a few. However, there are hundreds of jobs, such as public policy research and advocacy, with which students may not be as familiar. Colleges should provide more detailed information on a wide range of careers that students may never have thought of — and how coursework can help them enter those fields. Experiential learning can provide good opportunities to sample careers that match students’ interests, to help further determine the right fit.
Increased awareness of job options can also serve as motivation for students as they formulate their goals and plans. Jobs can be described through the same information avenues as the career-coursework connections listed above, along with examples of how coursework is used in each job.
3. Make coursework-career connections a campuswide priority. College leaders must stress to faculty the importance of better preparing students for careers. Economic mobility is of increasing importance to institutions and the general public, and consumers now rely on information about employment outcomes when selecting colleges (e.g., see College Scorecard).
Faculty can be assured that adding career preparation to a college degree does not diminish its educational value — quite the contrary; critical thinking and analytical skills, for example, are of utmost importance to liberal arts programs and prospective employers. Simply demonstrating those links does not change coursework content or objectives.
4. Help students translate their coursework for the job market. Beyond understanding the coursework-to-career linkages, students must know how to articulate them. Job interviews are unnatural for anyone, especially for students new to the workforce — and even more so for those who are the first in their families to graduate from college.
Career centers often provide interview tips to students — again, if the students seek out that help — but special emphasis should be placed on helping students reflect on their coursework and translate the skills and knowledge they have gained for employers.
A portfolio can help them accomplish this, and it can be developed at regular intervals throughout a student’s time on campus, since reflecting on several years of coursework all at once can be challenging. A Senior Year Seminar can further promote workforce readiness and tie together the career skills gained throughout one’s time on campus.
By making these simple changes, institutions can take the lead in making students and the public more aware of the benefits of higher education.
Abby Miller, founding partner at ASA Research, has been researching higher education and workforce development for over 20 years.
Reports about schools squandering their $190 billion in federal pandemic recovery money have been troubling. Many districts spent that money on things that had nothing to do with academics, particularly building renovations. Less common but more eye-popping were stories about new football fields, swimming pool passes, hotel rooms at Caesars Palace in Las Vegas and even the purchase of an ice cream truck.
So I was surprised that two independent academic analyses released in June 2024 found that some of the money actually trickled down to students and helped them catch up academically. Though the two studies used different methods, they arrived at strikingly similar numbers for the average growth in math and reading scores during the 2022-23 school year that could be attributed to each dollar of federal aid.
One of the research teams, which includes Harvard University economist Tom Kane and Stanford University sociologist Sean Reardon, likened the gains to six days of learning in math and three days of learning in reading for every $1,000 in federal pandemic aid per student. Though that gain might seem small, high-poverty districts received an average of $7,700 per student, and those extra “days” of learning for low-income students added up. Still, these neediest children were projected to remain one third of a grade level behind where low-income students stood in 2019, before the pandemic disrupted education.
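To see how those “days” accumulate, here is a rough back-of-the-envelope sketch. The per-$1,000 rates and the $7,700 average come from the figures reported above; the helper function itself is purely illustrative, not part of the researchers’ model.

```python
# Back-of-the-envelope arithmetic implied by the Harvard-Stanford estimate:
# roughly 6 days of math learning and 3 days of reading learning per $1,000
# of federal aid per student.

def days_of_learning(aid_per_student: float, days_per_thousand: float) -> float:
    """Convert aid per student into an estimated number of days of learning."""
    return aid_per_student / 1000 * days_per_thousand

avg_aid_high_poverty = 7700  # average aid per student in high-poverty districts

math_days = days_of_learning(avg_aid_high_poverty, 6)
reading_days = days_of_learning(avg_aid_high_poverty, 3)
print(round(math_days, 1), round(reading_days, 1))  # 46.2 23.1
```

In other words, the average high-poverty district’s allotment would translate to roughly nine weeks of extra math instruction, which is why the researchers describe the aid as meaningful despite the small per-dollar effect.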
“Federal funding helped and it helped kids most in need,” wrote Robin Lake, director of the Center on Reinventing Public Education, on X in response to the two studies. Lake was not involved in either report, but has been closely tracking pandemic recovery. “And the spending was worth the gains,” Lake added. “But it will not be enough to do all that is needed.”
The academic gains per aid dollar were close to what previous researchers had found for increases in school spending. In other words, federal pandemic aid for schools has been just as effective (or ineffective) as other infusions of money for schools. The Harvard-Stanford analysis calculated that the seemingly small academic gains per $1,000 could boost a student’s lifetime earnings by $1,238 – not a dramatic payoff, but not a public policy bust either. And that payoff doesn’t include other societal benefits from higher academic achievement, such as lower rates of arrests and teen motherhood.
The most interesting nuggets from the two reports, however, were how the academic gains varied wildly across the nation. That’s not only because some schools used the money more effectively than others but also because some schools got much more aid per student.
The poorest districts in the nation, where 80 percent or more of the students live in families whose income is low enough to qualify for the federally funded school lunch program, demonstrated meaningful recovery because they received the most aid. About 6 percent of the 26 million public schoolchildren that the researchers studied are educated in districts this poor. These children had recovered almost half of their pandemic learning losses by the spring of 2023. The very poorest districts, representing 1 percent of the children, were potentially on track for an almost complete recovery in 2024 because they tended to receive the most aid per student. However, these students were far below grade level before the pandemic, so their recovery brings them back to a very low rung.
Some high-poverty school districts received much more aid per student than others. At the top end of the range, students in Detroit received about $26,000 each – $1.3 billion spread among fewer than 49,000 students. One in 10 high-poverty districts received more than $10,700 for each student. An equal number of high-poverty districts received less than $3,700 per student. These surprising differences for places with similar poverty levels occurred because pandemic aid was allocated according to the same byzantine rules that govern federal Title I funding to low-income schools. Those formulas give large minimum grants to small states, and more money to states that spend more per student.
On the other end of the income spectrum are wealthier districts, where 30 percent or fewer students qualify for the lunch program, representing about a quarter of U.S. children. The Harvard-Stanford researchers expect these students to make an almost complete recovery. That’s not because of federal recovery funds; these districts received less than $1,000 per student, on average. Researchers explained that these students are on track to approach 2019 achievement levels because they didn’t suffer as much learning loss. Wealthier families also had the means to hire tutors or time to help their children at home.
Middle-income districts, where between 30 percent and 80 percent of students are eligible for the lunch program, were caught in between. Roughly seven out of 10 children in this study fall into this category. Their learning losses were sometimes large, but their pandemic aid wasn’t. They tended to receive between $1,000 and $5,000 per student. Many of these students are still struggling to catch up.
In the second study, researchers Dan Goldhaber of the American Institutes for Research and Grace Falken of the University of Washington estimated that schools around the country, on average, would need an additional $13,000 per student for full recovery in reading and math. That’s more than Congress appropriated.
There were signs that schools targeted interventions to their neediest students. In school districts that separately reported performance for low-income students, these students tended to post greater recovery per dollar of aid than wealthier students, the Goldhaber-Falken analysis shows.
Impact differed more by race, location and school spending. Districts with larger shares of white students tended to make greater achievement gains per dollar of federal aid than districts with larger shares of Black or Hispanic students. Small towns tended to produce more academic gains per dollar of aid than large cities. And school districts that spend less on education per pupil tended to see more academic gains per dollar of aid than high spenders. The latter makes sense: an extra dollar to a small budget makes a bigger difference than an extra dollar to a large budget.
The most frustrating part of both reports is that we have no idea what schools did to help students catch up. Researchers weren’t able to connect the academic gains to tutoring, summer school or any of the other interventions that schools have been trying. Schools still have until September to decide how to spend their remaining pandemic recovery funds, and, unfortunately, these analyses provide zero guidance.
And maybe some of the non-academic things that schools spent money on weren’t so frivolous after all. A draft paper circulated by the National Bureau of Economic Research in January 2024 calculated that school spending on basic infrastructure, such as air conditioning and heating systems, raised test scores. Spending on athletic facilities did not.
Meanwhile, the final score on pandemic recovery for students is still to come. I’ll be looking out for it.
One brain study, published in May 2024, detected different electrical activity in the brain after students had read a passage on paper, compared with screens.
Studies show that students of all ages, from elementary school to college, tend to absorb more when they’re reading on paper rather than screens. The advantage for paper is a small one, but it’s been replicated in dozens of laboratory experiments, particularly when students are reading about science or other nonfiction texts.
Experts debate why comprehension is worse on screens. Some think the glare and flicker of screens tax the brain more than ink on paper. Others conjecture that students have a tendency to skim online but read with more attention and effort on paper. Digital distraction is an obvious downside to screens. But internet browsing, texting or TikTok breaks aren’t allowed in the controlled conditions of these laboratory studies.
Neuroscientists around the world are trying to peer inside the brain to solve the mystery. Recent studies have begun to document salient differences in brain activity when reading on paper versus screens. None of the studies I discuss below is definitive or perfect, but together they raise interesting questions for future researchers to explore.
One Korean research team documented that young adults had lower concentrations of oxygenated hemoglobin in a section of the brain called the prefrontal cortex when reading on paper compared with screens. The prefrontal cortex is associated with working memory, and that could mean the brain is more efficient at absorbing and memorizing new information on paper, according to a study published in January 2024 in the journal Brain Sciences. An experiment in Japan, published in 2020, also noticed less blood flow in the prefrontal cortex when readers were recalling words in a passage that they had read on paper, and more blood flow with screens.
But it’s not clear what that increased blood flow means. The brain needs to be activated in order to learn and one could also argue that the extra brain activation during screen reading could be good for learning.
Instead of looking at blood flow, a team of Israeli scientists analyzed electrical activity in the brains of 6- to 8-year-olds. When the children read on paper, there was more power in high-frequency brainwaves. When the children read from screens, there was more energy in low-frequency bands.
The Israeli scientists interpreted these frequency differences as a sign of better concentration and attention when reading on paper. In their 2023 paper, they noted that attention difficulties and mind wandering have been associated with lower frequency bands – exactly the bands that were elevated during screen reading. However, it was a tiny study of 15 children and the researchers could not confirm whether the children’s minds were actually wandering when they were reading on screens.
Another group of neuroscientists in New York City has also been looking at electrical activity in the brain. But instead of documenting what happens inside the brain while reading, they looked at what happens in the brain just after reading, when students are responding to questions about a text.
The study, published in the peer-reviewed journal PLOS ONE in May 2024, was conducted by neuroscientists at Teachers College, Columbia University, where The Hechinger Report is also based. My news organization is an independent unit of the college, but I am covering this study just like I cover other educational research.
In the study, 59 children, aged 10 to 12, read short passages, half on screens and half on paper. After reading the passage, the children were shown new words, one at a time, and asked whether they were related to the passage they had just read. The children wore stretchy hair nets embedded with electrodes. More than a hundred sensors measured electrical currents inside their brains a split second after each new word was revealed.
For most words, there was no difference in brain activity between screens and paper. There was more positive voltage when the word was obviously related to the text, such as the word “flow” after reading a passage about volcanoes. There was more negative voltage with an unrelated word like “bucket,” which the researchers said was an indication of surprise and additional brain processing. These brainwaves were similar regardless of whether the child had read the passage on paper or on screens.
However, there were stark differences between paper and screens when it came to ambiguous words, ones where you could make a creative argument that the word was tangentially related to the reading passage or just as easily explain why it was unrelated. Take, for example, the word “roar” after reading about volcanoes. Children who had read the passage on paper showed more positive voltage, just as they had for clearly related words like “flow.” Yet, those who had read the passage on screens showed more negative activity, just as they had for unrelated words like “bucket.”
For the researchers, the brainwave difference for ambiguous words was a sign that students were engaging in “deeper” reading on paper. According to this theory, the more deeply information is processed, the more associations the brain makes. The electrical activity the neuroscientists detected reveals the traces of these associations and connections.
Despite this indication of deeper reading, the researchers didn’t detect any differences in basic comprehension. The children in this experiment did just as well on a simple comprehension test after reading a passage on paper as they did on screens. The neuroscientists told me that the comprehension test they administered was only to verify that the children had actually read the passage and wasn’t designed to detect deeper reading. I wish, however, the children had been asked to do something involving more analysis to buttress their argument that students had engaged in deeper reading on paper.
Virginia Clinton-Lisell, a reading researcher at the University of North Dakota who was not involved in this study, said she was “skeptical” of its conclusions, in part because the word-association exercise the neuroscientists created hasn’t been validated by outside researchers. Brain activation during a word association exercise may not be proof that we process language more thoroughly or deeply on paper.
One noteworthy result from this experiment concerns reading speed. Many reading experts have believed that comprehension is often worse on screens because students are skimming rather than reading. But in the controlled conditions of this laboratory experiment, there were no differences in reading speed: 57 seconds on the laptop versus 58 seconds on paper, statistically equivalent in a small experiment like this. That raises more questions about why the brain acts differently between the two media.
“I’m not sure why one would process some visual images more deeply than others if the subjects spent similar amounts of time looking at them,” said Timothy Shanahan, a reading research expert and a professor emeritus at the University of Illinois at Chicago.
None of this work settles the debate over reading on screens versus paper. These studies also ignore the promise of interactive features, such as glossaries and games, which can swing the advantage to electronic texts. Early research can be messy, and that’s a normal part of the scientific process. But so far, the evidence seems to corroborate conventional reading research: something different is going on when kids log in rather than turn a page.
Rates of chronic absenteeism are at record-high levels. More than 1 in 4 students missed 10 percent or more of the 2021-22 school year. That means millions of students missed out on regular instruction, not to mention the social and emotional benefits of interacting with peers and trusted adults.
Moreover, two-thirds of the nation’s students attended a school where chronic absence rates reached at least 20 percent. Such levels disrupt entire school communities, including the students who are regularly attending.
The scope and scale of this absenteeism crisis demand the next generation of student support.
Fortunately, a recent study suggests a promising path for getting students back in school and back on track to graduation. A group of nearly 50 middle and high schools saw reductions in chronic absenteeism and course failure rates after one year of harnessing the twin powers of data and relationships.
From the 2021-22 to 2022-23 school years, the schools’ chronic absenteeism rates dropped by 5.4 percentage points, and the share of students failing one or more courses went from 25.5 percent to 20.5 percent. In the crucial ninth grade, course failure rates declined by 9.2 percentage points.
These encouraging results come from the first cohort of rural and urban schools and communities partnering with the GRAD Partnership, a collective of nine organizations, to grow the use of “student success systems” into a common practice.
Student success systems take an evidence-based approach to organizing school communities to better support the academic progress and well-being of all students.
They were developed with input from hundreds of educators and build on the successes of earlier student support efforts — like early warning systems and on-track initiatives — to meet students’ post-pandemic needs.
Importantly, student success systems offer schools a way to identify school, grade-level and classroom factors that impact attendance; they then deliver timely supports to meet individual students’ needs. They do this, in part, by explicitly valuing supportive relationships and responding to the insights that students and the adults who know them bring to the table.
Valuable relationships include not only those between students and teachers, and schools and families, but also those among peer groups and within the entire school community. Schools cannot address the attendance crisis without rebuilding and fostering these relationships.
When students feel a sense of connection to school, they are more likely to show up.
For some students, this connection comes through extracurricular activities like athletics, robotics or band. For others, it takes a different form.
Schools haven’t always focused on connections in a concrete way, partly because relationships can feel fuzzy and hard to track. We’re much better at tracking things like grades and attendance.
Still, schools in the GRAD Partnership cohort show that it can be done.
These schools established “student success teams” of teachers, counselors and others. The teams meet regularly to look at up-to-date student data and identify and address the root causes of absenteeism with insight and input from families and communities, as well as the students themselves.
The teams often use low-tech relationship-mapping tools to help identify students who are disconnected from activities or mentors. One school’s student success team used these tools to ensure that all students were connected to at least one activity — and even created new clubs for students with unique interests. Their method was one that any school could replicate: collaborating on a Google spreadsheet.
Another school identified students who would benefit from a new student mentoring program focused on building trusting relationships.
Some schools have used surveys of student well-being to gain insight on how students feel about school, themselves and life in general — and have then used the information to develop supports.
And in an example of building supportive community relationships, one of the GRAD Partnership schools worked with local community organizations to host a resource night event at which families were connected on the spot to local providers who could help them overcome obstacles to regular attendance — such as medical and food needs, transportation and housing issues and unemployment.
There is no one-and-done solution to the current absenteeism crisis. Turning the tide will require ongoing collaborative efforts guided by data and grounded in relationships that take time to build.
Without these efforts, the consequences will be severe both for individual students and our country as a whole.
Robert Balfanz is a research professor at the Center for Social Organization of Schools at Johns Hopkins University School of Education, where he is the director of the Everyone Graduates Center.
Two new surveys, both released this month, show how high school and college-age students are embracing artificial intelligence. There are some inconsistencies and many unanswered questions, but what stands out is how much teens are turning to AI for information and to ask questions, not just to do their homework for them. And they’re using it for personal reasons as well as for school. Another big takeaway is that there are different patterns by race and ethnicity, with Black, Hispanic and Asian American students often adopting AI faster than white students.
Emily Weinstein, executive director for the Center for Digital Thriving, a research center that investigates how youth are interacting with technology, said that more teens are “certainly” using AI now that these tools are embedded in more apps and websites, such as Google Search. Last October and November, when this survey was conducted, teens typically had to take the initiative to navigate to an AI site and create an account. An exception was Snapchat, a social media app that had already added an AI chatbot for its users.
More than half of the early adopters said they had used AI for getting information and for brainstorming, the first and second most popular uses. This survey didn’t ask teens if they were using AI for cheating, such as prompting ChatGPT to write their papers for them. However, among the half of respondents who were already using AI, fewer than half – 46 percent – said they were using it for help with school work. The fourth most common use was for generating pictures.
The survey also asked teens a couple of open-response questions. Some teens told researchers that they are asking AI private questions that they were too embarrassed to ask their parents or their friends. “Teens are telling us I have questions that are easier to ask robots than people,” said Weinstein.
Weinstein wants to know more about the quality and the accuracy of the answers that AI is giving teens, especially those with mental health struggles, and how privacy is being protected when students share personal information with chatbots.
The second report, released on June 11, was conducted by Impact Research and commissioned by the Walton Family Foundation. In May 2024, Impact Research surveyed 1,003 teachers, 1,001 students aged 12-18, 1,003 college students, and 1,000 parents about their use and views of AI.
This survey, which took place six months after the Hopelab-Common Sense survey, demonstrated how quickly usage is growing. It found that 49 percent of students, aged 12-18, said they used ChatGPT at least once a week for school, up 26 percentage points since 2023. Forty-nine percent of college undergraduates also said they were using ChatGPT every week for school, but there was no comparison data from 2023.
Among 12- to 18-year-olds and college students who had used AI chatbots for school, 56 percent said they had used it for help in writing essays and other writing assignments. Undergraduate students were more than twice as likely as 12- to 18-year-olds to say using AI felt like cheating, 22 percent versus 8 percent. Earlier 2023 surveys of student cheating by scholars at Stanford University did not detect an increase in cheating with ChatGPT and other generative AI tools. But as students use AI more, students’ understanding of what constitutes cheating may also be evolving.
More than 60 percent of college students who used AI said they were using it to study for tests and quizzes. Half of the college students who used AI said they were using it to deepen their subject knowledge, perhaps treating it as an online encyclopedia. There was no indication from this survey whether students were checking the accuracy of the information.
Both surveys noticed differences by race and ethnicity. The first Hopelab-Common Sense survey found that 7 percent of Black students, aged 14-22, were using AI every day, compared with 5 percent of Hispanic students and 3 percent of white students. In the open-ended questions, one Black teen girl wrote that, with AI, “we can change who we are and become someone else that we want to become.”
The Walton Foundation survey found that Hispanic and Asian American students were sometimes more likely to use AI than white and Black students, especially for personal purposes.
These are all early snapshots that are likely to keep shifting. OpenAI’s technology is expected to be built into Apple’s iPhones, iPads and computers this fall. “These numbers are going to go up and they’re going to go up really fast,” said Weinstein. “Imagine that we could go back 15 years in time when social media use was just starting with teens. This feels like an opportunity for adults to pay attention.”
Researchers from the University of California, Irvine, and Arizona State University found that human feedback was generally a bit better than AI feedback, but AI was surprisingly good.
This week I challenged my editor to face off against a machine. Barbara Kantrowitz gamely accepted, under one condition: “You have to file early.” Ever since ChatGPT arrived in 2022, many journalists have made a public stunt out of asking the new generation of artificial intelligence to write their stories. Those AI stories were often bland and sprinkled with errors. I wanted to understand how well ChatGPT handled a different aspect of writing: giving feedback.
My curiosity was piqued by a new study, published in the June 2024 issue of the peer-reviewed journal Learning and Instruction, that evaluated the quality of ChatGPT’s feedback on students’ writing. A team of researchers compared AI with human feedback on 200 history essays written by students in grades 6 through 12 and they determined that human feedback was generally a bit better. Humans had a particular advantage in advising students on something to work on that would be appropriate for where they are in their development as a writer.
But ChatGPT came close. On a five-point scale that the researchers used to rate feedback quality, with a 5 being the highest quality feedback, ChatGPT averaged a 3.6 compared with a 4.0 average from a team of 16 expert human evaluators. It was a tough challenge. Most of these humans had taught writing for more than 15 years or they had considerable experience in writing instruction. All received three hours of training for this exercise plus extra pay for providing the feedback.
ChatGPT even beat these experts in one aspect: it was slightly better at giving feedback on students’ reasoning, argumentation and use of evidence from source materials – the features that the researchers had wanted the writing evaluators to focus on.
“It was better than I thought it was going to be because I didn’t have a lot of hope that it was going to be that good,” said Steve Graham, a well-regarded expert on writing instruction at Arizona State University, and a member of the study’s research team. “It wasn’t always accurate. But sometimes it was right on the money. And I think we’ll learn how to make it better.”
Average ratings for the quality of ChatGPT and human feedback on 200 student essays
Researchers rated the quality of the feedback on a five-point scale across five categories. Criteria-based refers to whether the feedback addressed the main goals of the writing assignment, in this case, to produce a well-reasoned argument about history using evidence from the reading source materials that the students were given. Clear directions refers to whether the feedback included specific examples of something the student did well and clear directions for improvement. Accuracy refers to whether the feedback advice was correct and free of errors. Essential features refers to whether the suggestion on what the student should work on next is appropriate for where the student is in their development as a writer and is an important element of this genre of writing. Supportive tone refers to whether the feedback is delivered in language that is affirming, respectful and supportive, as opposed to condescending, impolite or authoritarian. (Source: Fig. 1 of Steiss et al, “Comparing the quality of human and ChatGPT feedback of students’ writing,” Learning and Instruction, June 2024.)
Exactly how ChatGPT is able to give good feedback is something of a black box even to the writing researchers who conducted this study. Artificial intelligence doesn’t comprehend things in the same way that humans do. But somehow, through the neural networks that ChatGPT’s programmers built, it is picking up on patterns from all the writing it has previously digested, and it is able to apply those patterns to a new text.
The surprising “relatively high quality” of ChatGPT’s feedback is important because it means that the new artificial intelligence of large language models, also known as generative AI, could potentially help students improve their writing. One of the biggest problems in writing instruction in U.S. schools is that teachers assign too little writing, Graham said, often because teachers feel that they don’t have the time to give personalized feedback to each student. That leaves students without sufficient practice to become good writers. In theory, teachers might be willing to assign more writing or insist on revisions for each paper if students (or teachers) could use ChatGPT to provide feedback between drafts.
Despite the potential, Graham isn’t an enthusiastic cheerleader for AI. “My biggest fear is that it becomes the writer,” he said. He worries that students will not limit their use of ChatGPT to helpful feedback, but ask it to do their thinking, analyzing and writing for them. That’s not good for learning. The research team also worries that writing instruction will suffer if teachers delegate too much feedback to ChatGPT. Seeing students’ incremental progress and common mistakes remains important for deciding what to teach next, the researchers said. For example, seeing loads of run-on sentences in your students’ papers might prompt a lesson on how to break them up. But if you don’t see them, you might not think to teach it. Another common concern among writing instructors is that AI feedback will steer everyone to write in the same homogenized way. A young writer’s unique voice could be flattened out before it even has the chance to develop.
There’s also the risk that students may not be interested in heeding AI feedback. Students often ignore the painstaking feedback that their teachers already give on their essays. Why should we think students will pay attention to feedback if they start getting more of it from a machine?
Still, Graham and his research colleagues at the University of California, Irvine, are continuing to study how AI could be used effectively and whether it ultimately improves students’ writing. “You can’t ignore it,” said Graham. “We either learn to live with it in useful ways, or we’re going to be very unhappy with it.”
Example of feedback from a human and ChatGPT on the same essay
Source: Steiss et al, “Comparing the quality of human and ChatGPT feedback of students’ writing,” Learning and Instruction, June 2024.
In the current study, the researchers didn’t track whether students understood or employed the feedback, but only sought to measure its quality. Judging the quality of feedback is a rather subjective exercise, just as feedback itself is a bundle of subjective judgment calls. Smart people can disagree on what good writing looks like and how to revise bad writing.
In this case, the research team came up with its own criteria for what constitutes good feedback on a history essay. They instructed the humans to focus on the student’s reasoning and argumentation, rather than, say, grammar and punctuation. They also told the human raters to adopt a “glow and grow strategy” for delivering the feedback by first finding something to praise, then identifying a particular area for improvement.
The human raters provided this kind of feedback on hundreds of history essays from 2021 to 2023, as part of an unrelated study of an initiative to boost writing at school. The researchers randomly grabbed 200 of these essays and fed the raw student writing – without the human feedback – to version 3.5 of ChatGPT and asked it to give feedback, too.
At first, the AI feedback was terrible, but as the researchers tinkered with the instructions, or the “prompt,” they typed into ChatGPT, the feedback improved. The researchers eventually settled upon this wording: “Pretend you are a secondary school teacher. Provide 2-3 pieces of specific, actionable feedback on each of the following essays…. Use a friendly and encouraging tone.” The researchers also fed the assignment that the students were given, for example, “Why did the Montgomery Bus Boycott succeed?” along with the reading source material that the students were provided. (More details about how the researchers prompted ChatGPT are explained in Appendix C of the study.)
The humans took about 20 to 25 minutes per essay. ChatGPT’s feedback came back instantly. The humans sometimes marked up sentences by, for example, showing a place where the student could have cited a source to buttress an argument. ChatGPT didn’t write any in-line comments and only wrote a note to the student.
Researchers then read through both sets of feedback – human and machine – for each essay, comparing and rating them. (It was supposed to be a blind comparison test and the feedback raters were not told who authored each one. However, the language and tone of ChatGPT were distinct giveaways, and the in-line comments were a tell of human feedback.)
Humans appeared to have a clear edge with the very strongest and the very weakest writers, the researchers found. They were better at pushing a strong writer a little bit further, for example, by suggesting that the student consider and address a counterargument. ChatGPT struggled to come up with ideas for a student who was already meeting the objectives of a well-argued essay with evidence from the reading source materials. ChatGPT also struggled with the weakest writers. The researchers had to drop two of the essays from the study because they were so short that ChatGPT didn’t have any feedback for the student. The human rater was able to parse out some meaning from a brief, incomplete sentence and offer a suggestion.
In one student essay about the Montgomery Bus Boycott, reprinted above, the human feedback seemed too generic to me: “Next time, I would love to see some evidence from the sources to help back up your claim.” ChatGPT, by contrast, specifically suggested that the student could have mentioned how much revenue the bus company lost during the boycott – an idea that was mentioned in the student’s essay. ChatGPT also suggested that the student could have mentioned specific actions that the NAACP and other organizations took. But the student had actually mentioned a few of these specific actions in his essay. That part of ChatGPT’s feedback was plainly inaccurate.
In another student writing example, also reprinted below, the human straightforwardly pointed out that the student had gotten a historical fact wrong. ChatGPT appeared to affirm that the student’s mistaken version of events was correct.
Another example of feedback from a human and ChatGPT on the same essay
Source: Steiss et al, “Comparing the quality of human and ChatGPT feedback of students’ writing,” Learning and Instruction, June 2024.
So how did ChatGPT’s review of my first draft stack up against my editor’s? One of the researchers on the study team suggested a prompt that I could paste into ChatGPT. After a few back and forth questions with the chatbot about my grade level and intended audience, it initially spit out some generic advice that had little connection to the ideas and words of my story. It seemed more interested in format and presentation, suggesting a summary at the top and subheads to organize the body. One suggestion would have made my piece too long-winded. Its advice to add examples of how AI feedback might be beneficial was something that I had already done. I then asked for specific things to change in my draft, and ChatGPT came back with some great subhead ideas. I plan to use them in my newsletter, which you can see if you sign up for it here. (And if you want to see my prompt and dialogue with ChatGPT, here is the link.)
My human editor, Barbara, was the clear winner in this round. She tightened up my writing, fixed style errors and helped me brainstorm this ending. Barbara’s job is safe – for now.
Schools spend billions of dollars a year on products and services, including everything from staplers and textbooks to teacher coaching and training. Does any of it help students learn more? Some educational materials end up mothballed in closets. Much software goes unused. Yet central-office bureaucrats frequently renew their contracts with outside vendors regardless of usage or efficacy.
One idea for smarter education spending is for schools to sign smarter contracts, where part of the payment is contingent upon whether students use the services and learn more. It’s called outcomes-based contracting and is a way of sharing risk between buyer (the school) and seller (the vendor). Outcomes-based contracting is most common in healthcare. For example, a health insurer might pay a pharmaceutical company more for a drug if it actually improves people’s health, and less if it doesn’t.
Although the idea is relatively new in education, many schools tried a different version of it – evaluating and paying teachers based on how much their students’ test scores improved – in the 2010s. Teachers didn’t like it, and enthusiasm for these teacher accountability schemes waned. Then, in 2020, Harvard University’s Center for Education Policy Research announced that it was going to test the feasibility of paying tutoring companies by how much students’ test scores improved.
The initiative was particularly timely in the wake of the pandemic. The federal government would eventually give schools almost $190 billion to reopen and to help students who fell behind when schools were closed. Tutoring became a leading solution for academic recovery and schools contracted with outside companies to provide tutors. Many educators worried that billions could be wasted on low-quality tutors who didn’t help anyone. Could schools insist that tutoring companies make part of their payment contingent upon whether student achievement increased?
The Harvard center recruited a handful of school districts that wanted to try an outcomes-based contract. The researchers and districts shared ideas on how to set performance targets. How much should they expect student achievement to grow from a few months of tutoring? How much of the contract should be guaranteed to the vendor for delivering tutors, and how much should be contingent on student performance?
The first hurdle was whether tutoring companies would be willing to offer services without knowing exactly how much they would be paid. School districts sent out requests for proposals from online tutoring companies. Tutoring companies bid and the terms varied. One online tutoring company agreed that 40 percent of a $1.2 million contract with the Duval County Public Schools in Jacksonville, Florida, would be contingent upon student performance. Another online tutoring company signed a contract with Ector County schools in the Odessa, Texas, region that specified that the company had to accept a penalty if kids’ scores declined.
In the middle of the pilot, the outcomes-based contracting initiative moved from the Harvard center to the Southern Education Foundation, another nonprofit, and I recently learned how the first group of contracts panned out from Jasmine Walker, a senior manager there. Walker had a first-hand view because until the fall of 2023, she was the director of mathematics in Florida’s Duval County schools, where she oversaw the outcomes-based contract on tutoring.
Here are some lessons she learned:
Planning is time-consuming
Drawing up an outcomes-based contract requires analyzing years of historical testing data and documenting how much achievement has typically grown for the students who need tutoring. Then, educators have to decide – based on the research evidence for tutoring – how much they could reasonably expect student achievement to grow after 12 weeks or more.
Incomplete data was a common problem
The first school district in the pilot group launched its outcome-based contract in the fall of 2021. In the middle of the pilot, school leadership changed, layoffs hit, and the leaders of the tutoring initiative left the district. With no one in the district’s central office left to track it, there was no data on whether tutoring helped the 1,000 students who received it. Half the students attended 70 percent of the tutoring sessions. Half didn’t. Test scores for almost two-thirds of the tutored students increased between the start and the end of the tutoring program. But these students also had regular math classes each day and they likely would have posted some achievement gains anyway.
Delays in settling contracts led to fewer tutored students
Walker said two school districts weren’t able to start tutoring children until January 2023, instead of the fall of 2022 as originally planned, because it took so long to iron out contract details and obtain approvals inside the districts. Many schools didn’t want to wait and launched other interventions to help needy students sooner. Understandably, schools didn’t want to yank these students away from those other interventions midyear.
That delay had big consequences in Duval County. Only 451 students received tutoring instead of a projected 1,200. Fewer students forced Walker to recalculate Duval’s outcomes-based contract. Instead of a $1.2 million contract with $480,000 of it contingent on student outcomes, she downsized it to $464,533 with $162,363 contingent. The tutored students hit 53 percent of the district’s growth and proficiency goals, leading to a total payout of $393,220 to the tutoring company – far less than the company had originally anticipated. But the average per-student payout of $872 was in line with the original terms of between $600 and $1,000 per student.
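The reported Duval County figures can be sanity-checked with simple arithmetic. This is only an illustrative sketch: the variable names and the assumption of a flat guaranteed-plus-contingent split are mine, and real outcomes-based contracts often use stepped rate cards rather than a strictly proportional formula.

```python
# Figures reported for the downsized Duval County tutoring contract
contract_total = 464_533   # total contract value, dollars
contingent = 162_363       # portion contingent on student outcomes
total_payout = 393_220     # what the tutoring company was actually paid
students = 451             # students who received tutoring

guaranteed = contract_total - contingent           # paid regardless of outcomes
contingent_earned = total_payout - guaranteed      # outcome-based portion earned
share_of_contingent = contingent_earned / contingent
per_student = total_payout / students

print(guaranteed, round(share_of_contingent, 2), round(per_student))
```

The per-student figure works out to about $872, matching the article’s number. The earned share of the contingent pool (roughly 56 percent) is not identical to the 53 percent goal-attainment rate, which would be consistent with a stepped payout schedule; that is an inference, not a detail the contract terms above spell out.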
The bottom line is still uncertain
What we don’t know from any of these case studies is whether similar students who didn’t receive tutoring also made similar growth and proficiency gains. Maybe it’s all the other things that teachers were doing that made the difference. In Duval County, for example, proficiency rates in math rose from 28 percent of students to 46 percent of students. Walker believes that outcomes-based contracting for tutoring was “one lever” of many.
It’s unclear if outcomes-based contracting is a way for schools to save money. This kind of intensive tutoring – three times a week or more during the school day – is new and the school districts didn’t have previous pre-pandemic tutoring contracts for comparison. But generally, if all the student goals are met, companies stand to earn more in an outcomes-based contract than they would have otherwise, Walker said.
“It’s not really about saving money,” said Walker. “What we want is for students to achieve. I don’t care if I spent the whole contract amount if the students actually met the outcomes, because in the past, let’s face it, I was still paying and they were not achieving outcomes.”
The biggest change with outcomes-based contracting, Walker said, was the partnership with the provider. One contractor monitored student attendance during tutoring sessions, called her when attendance slipped and asked her to investigate. Students were given rewards for attending their tutoring sessions and the tutoring company even chipped in to pay for them. “Kids love Takis,” said Walker.
Advice for schools
Walker has two pieces of advice for schools considering outcomes-based contracts. One, she says, is to make the contingency amount at least 40 percent of the contract. Smaller incentives may not motivate the vendor. For her second outcomes-based contract in Duval County, Walker boosted the contingency amount to half the contract. To earn it, the tutoring company needs the students it is tutoring to hit growth and proficiency goals. That tutoring took place during the current 2023-24 school year. Based on mid-year results, students exceeded expectations, but full-year results are not yet in.
More importantly, Walker says the biggest lesson she learned was to include teachers, parents and students earlier in the contract negotiation process. She says “buy in” from teachers is critical because classroom teachers are actually making sure the tutoring happens. Otherwise, an outcomes-based contract can feel like yet “another thing” that the central office is adding to a teacher’s workload.
Walker also said she wished she had spent more time educating parents and students on the importance of attending school and their tutoring sessions. “It’s important that everyone understands the mission,” said Walker.
Innovation can be rocky, especially at the beginning. Now the Southern Education Foundation is working to expand its outcomes-based contracting initiative nationwide. A second group of four school districts launched outcomes-based contracts for tutoring this 2023-24 school year. Walker says that the rate cards and recordkeeping are improving from the first pilot round, which took place during the stress and chaos of the pandemic.
The foundation is also seeking to expand the use of outcomes-based contracts beyond tutoring to education technology and software. Nine districts are slated to launch outcomes-based contracts for ed tech this fall. Walker’s next dream is to design outcomes-based contracts around curriculum and teacher training. I’ll be watching.
Grading papers is hard work. “I hate it,” a teacher friend confessed to me. And that’s a major reason why middle and high school teachers don’t assign more writing to their students. Even an efficient high school English teacher who can read and evaluate an essay in 20 minutes would spend 3,000 minutes, or 50 hours, grading if she’s teaching six classes of 25 students each. There aren’t enough hours in the day.
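The arithmetic behind that 50-hour estimate is straightforward to check:

```python
classes = 6
students_per_class = 25
minutes_per_essay = 20

# 6 classes x 25 students x 20 minutes per essay
total_minutes = classes * students_per_class * minutes_per_essay
total_hours = total_minutes / 60

print(total_minutes, total_hours)  # 3000 minutes, 50.0 hours
```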
Could ChatGPT relieve teachers of some of the burden of grading papers? Early research is finding that the new artificial intelligence of large language models, also known as generative AI, is approaching the accuracy of a human in scoring essays and is likely to become even better soon. But we still don’t know whether offloading essay grading to ChatGPT will ultimately improve or harm student writing.
Tamara Tate, a researcher at the University of California, Irvine, and an associate director of her university’s Digital Learning Lab, is studying how teachers might use ChatGPT to improve writing instruction. Most recently, Tate and her seven-member research team, which includes writing expert Steve Graham at Arizona State University, compared how ChatGPT stacked up against humans in scoring 1,800 history and English essays written by middle and high school students.
Tate said ChatGPT was “roughly speaking, probably as good as an average busy teacher” and “certainly as good as an overburdened below-average teacher.” But, she said, ChatGPT isn’t yet accurate enough to be used on a high-stakes test or on an essay that would affect a final grade in a class.
Tate presented her study on ChatGPT essay scoring at the 2024 annual meeting of the American Educational Research Association in Philadelphia in April. (The paper is under peer review for publication and is still undergoing revision.)
Most remarkably, the researchers obtained these fairly decent essay scores from ChatGPT without training it first with sample essays. That means it is possible for any teacher to use it to grade any essay instantly with minimal expense and effort. “Teachers might have more bandwidth to assign more writing,” said Tate. “You have to be careful how you say that because you never want to take teachers out of the loop.”
Writing instruction could ultimately suffer, Tate warned, if teachers delegate too much grading to ChatGPT. Seeing students’ incremental progress and common mistakes remains important for deciding what to teach next, she said. For example, seeing loads of run-on sentences in your students’ papers might prompt a lesson on how to break them up. But if you don’t see them, you might not think to teach it.
In the study, Tate and her research team calculated that ChatGPT’s essay scores were in “fair” to “moderate” agreement with those of well-trained human evaluators. In one batch of 943 essays, ChatGPT was within a point of the human grader 89 percent of the time. On a six-point grading scale that researchers used in the study, ChatGPT often gave an essay a 2 when an expert human evaluator thought it was really a 1. But this level of agreement – within one point – dropped to 83 percent of the time in another batch of 344 English papers and slid even farther to 76 percent of the time in a third batch of 493 history essays. That means there were more instances where ChatGPT gave an essay a 4, for example, when a teacher marked it a 6. And that’s why Tate says these ChatGPT grades should only be used for low-stakes purposes in a classroom, such as a preliminary grade on a first draft.
ChatGPT scored an essay within one point of a human grader 89 percent of the time in one batch of essays
Corpus 3 refers to one batch of 943 essays, which represents more than half of the 1,800 essays that were scored in this study. Numbers highlighted in green show exact score matches between ChatGPT and a human. Yellow highlights scores in which ChatGPT was within one point of the human score. Source: Tamara Tate, University of California, Irvine (2024).
Still, this level of accuracy was impressive because even teachers disagree on how to score an essay and one-point discrepancies are common. Exact agreement, which only happens half the time between human raters, was worse for AI, which matched the human score exactly only about 40 percent of the time. Humans were far more likely to give a top grade of a 6 or a bottom grade of a 1. ChatGPT tended to cluster grades more in the middle, between 2 and 5.
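The two agreement rates discussed here, exact matches and matches within one point, are simple to compute from paired scores. Here is a minimal Python sketch; the scores are made up for illustration, not drawn from the study’s data:

```python
def agreement_rates(human_scores, ai_scores):
    """Return (exact, within_one): the share of essays where the two
    raters match exactly, and where they differ by at most one point."""
    pairs = list(zip(human_scores, ai_scores))
    exact = sum(h == a for h, a in pairs) / len(pairs)
    within_one = sum(abs(h - a) <= 1 for h, a in pairs) / len(pairs)
    return exact, within_one

# Hypothetical 1-6 scores for eight essays
human = [1, 2, 3, 4, 5, 6, 3, 4]
ai    = [2, 2, 3, 5, 4, 5, 3, 6]
exact, within_one = agreement_rates(human, ai)
print(f"exact: {exact:.0%}, within one point: {within_one:.0%}")
```

Note how within-one agreement is always at least as high as exact agreement, which is why vendors quoting only the more forgiving number can make a grader look better than it is.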
Tate set up ChatGPT for a tough challenge, competing against teachers and experts with PhDs who had received three hours of training in how to properly evaluate essays. “Teachers generally receive very little training in secondary school writing and they’re not going to be this accurate,” said Tate. “This is a gold-standard human evaluator we have here.”
The raters had been paid to score these 1,800 essays as part of three earlier studies on student writing. Researchers fed these same student essays – ungraded – into ChatGPT and asked ChatGPT to score them cold. ChatGPT hadn’t been given any graded examples to calibrate its scores. All the researchers did was copy and paste an excerpt of the same scoring guidelines that the humans used, called a grading rubric, into ChatGPT and told it to “pretend” it was a teacher and score the essays on a scale of 1 to 6.
Older robo graders
Earlier versions of automated essay graders have had higher rates of accuracy. But they were expensive and time-consuming to create because scientists had to train the computer with hundreds of human-graded essays for each essay question. That’s economically feasible only in limited situations, such as for a standardized test, where thousands of students answer the same essay question.
Earlier robo graders could also be gamed, once a student understood the features that the computer system was grading for. In some cases, nonsense essays received high marks if fancy vocabulary words were sprinkled in them. ChatGPT isn’t grading for particular hallmarks, but is analyzing patterns in massive datasets of language. Tate says she hasn’t yet seen ChatGPT give a high score to a nonsense essay.
Tate expects ChatGPT’s grading accuracy to improve rapidly as new versions are released. Already, the research team has detected that the newer 4.0 version, which requires a paid subscription, is scoring more accurately than the free 3.5 version. Tate suspects that small tweaks to the grading instructions, or prompts, given to ChatGPT could improve existing versions. She is interested in testing whether ChatGPT’s scoring could become more reliable if a teacher trained it with just a few, perhaps five, sample essays that she has already graded. “Your average teacher might be willing to do that,” said Tate.
Many ed tech startups, and even well-known vendors of educational materials, are now marketing new AI essay robo graders to schools. Many of them are powered under the hood by ChatGPT or another large language model, and I learned from this study that accuracy rates can be reported in ways that make the new AI graders seem more accurate than they are. Tate’s team calculated that, on a population level, there was no difference between human and AI scores. ChatGPT can already reliably tell you the average essay score in a school or, say, in the state of California.
Questions for AI vendors
At this point, ChatGPT is not as accurate in scoring an individual student. And a teacher wants to know exactly how each student is doing. Tate advises teachers and school leaders who are considering using an AI essay grader to ask specific questions about accuracy rates at the student level: What is the rate of exact agreement between the AI grader and a human rater on each essay? How often are they within one point of each other?
The next step in Tate’s research is to study whether student writing improves after having an essay graded by ChatGPT. She’d like teachers to try using ChatGPT to score a first draft and then see if it encourages revisions, which are critical for improving writing. Tate thinks teachers could make it “almost like a game: how do I get my score up?”
Of course, it’s unclear if grades alone, without concrete feedback or suggestions for improvement, will motivate students to make revisions. Students may be discouraged by a low score from ChatGPT and give up. Many students might ignore a machine grade and only want to deal with a human they know. Still, Tate says some students are too scared to show their writing to a teacher until it’s in decent shape, and seeing their score improve on ChatGPT might be just the kind of positive feedback they need.
“We know that a lot of students aren’t doing any revision,” said Tate. “If we can get them to look at their paper again, that is already a win.”
That does give me hope, but I’m also worried that kids will just ask ChatGPT to write the whole essay for them in the first place.
Linda Brown was a third grader in Topeka, Kansas, when her father, Oliver Brown, tried to enroll her in the white public school four blocks from her home. Otherwise, she would have had to walk across railroad tracks to take a bus to attend the nearest all-Black one.
When she was denied admission, Oliver Brown sued.
The case was combined with four others, from Delaware, the District of Columbia, South Carolina and Virginia, and together they made their way to the Supreme Court. All of them involved schoolchildren required to attend all-Black schools that were of lower quality than schools for white children.
While the Supreme Court found in 1954 in Oliver Brown’s favor, years would pass before desegregation of American schools began in earnest. And for many Black students now, 70 years since the nation’s highest court held unanimously that separate is inherently unequal, educational resources and access remain woefully uneven.
Here are some of the racial realities of American public education today:
25: That’s the percentage increase in Black-white school segregation between 1991 and 2019, according to an analysis of 533 districts by sociologists Sean Reardon at Stanford University and Ann Owens at the University of Southern California. While school segregation fell dramatically beginning in 1968 with a series of court orders, it began to tick up in the early 1990s because of the expiration of court orders mandating integration, school choice policies, and other factors. Still, schools remain significantly less segregated than they were before and immediately after the Brown decision.
10: That’s the percentage of Black students learning in a school where more than 90 percent of their classmates were also Black, according to 2022 Department of Education data. That figure is down from 23 percent in 2000. Even as Black-white school segregation has increased slightly since the early 1990s, the number of extremely segregated schools has shrunk, in part because of an increase in the Hispanic student population. Meanwhile, from 2000 to 2022, the percentage of white students attending a school that is 90 percent or more white fell from 44 percent to 14 percent.
6: This is the percentage of teachers in American public schools who are Black. By comparison, Black students make up about 15 percent of public school enrollment. One legacy of Brown v. Board is the dearth of Black teachers: More than 38,000 Black educators lost their jobs after the decision came down, as white administrators of integrating schools refused to hire Black professionals for teaching roles or pushed them out. Yet research suggests that more Black teachers in the classroom can help boost Black student outcomes such as college enrollment.
2014: That’s the year that Wilcox County High School, in rural Georgia, held its first school-sponsored, racially integrated prom. After desegregation, parents in the community, like many across the South, began organizing private, off-site proms to keep the events exclusively white. That practice persisted in Wilcox County until 2013, when high schoolers organized a prom for both white and Black students. The next year, the school made it official, finally holding an integrated event.
$14,385: This is the average amount spent per Black pupil in public school, compared with $14,263 per white student, according to a 2022 analysis of 2017-18 data by the Federal Reserve Bank of St. Louis. The researchers found that while school district spending was very similar for Black and white students, the sources of funding differed somewhat, with Black students receiving more federal funding and white students receiving more local funding. The amount of money spent on instruction per pupil, meanwhile, was slightly lower for Black students – $7,169 – than for white students ($7,329). The researchers attributed that to a number of small, predominantly white districts that spent far above average on their students.
7: That’s the share of incoming students at the University of Mississippi who were Black in 2022 — even though nearly half the state’s public high school graduates, 48 percent, were Black that year. That gap between Black students graduating from high school in Mississippi and those enrolling at the state flagship university has grown over the past decade, according to a Hechinger analysis. Similar trends are playing out elsewhere in the country: In 2022, 16 state flagship universities had a gap of 10 percentage points or more between Black high school graduates and incoming freshmen. And at two dozen flagships, the gap for Black students stayed the same or grew between 2019 and 2022. Yet public flagships were created to educate the residents of their states, and most make that explicit.
700: That’s roughly how many high schools are offering the College Board’s Advanced Placement African American Studies course this school year, more than 10 times as many as offered it a year earlier, when it debuted. The course was created in part in response to longstanding concerns that African American history has been downplayed or left out of K-12 curriculum. But the A.P. course, an elective, became ensnared in politics. The content has evolved after criticism that it introduced students to “divisive concepts,” among other reasons; it has been banned or restricted in some states. Nevertheless, about 13,000 students are enrolled in this second year of the pilot course, which took more than 10 years to develop. Forty-five percent of students taking the class had never previously taken another AP course, which can earn them college credit.
It was one of the most significant days in the history of the U.S. Supreme Court. On May 17, 1954, the nine justices unanimously ruled in Brown v. Board of Education that schools segregated by race did not provide an equal education. Students could no longer be barred from a school because of the color of their skin. To commemorate the 70th anniversary of the Brown decision, I wanted to look at how far we’ve come in integrating our schools and how far we still have to go.
Two sociologists, Sean Reardon at Stanford University and Ann Owens at the University of Southern California, have teamed up to analyze both historical and recent trends. Reardon and Owens were slated to present their analysis at a Stanford University conference on May 6, and they shared their presentation with me in advance. They also expect to launch a new website to display segregation trends for individual school districts around the country.
Here are five takeaways from their work:
The long view shows progress but a worrying uptick, especially in big cities
Source: Owens and Reardon, “The state of segregation: 70 years after Brown,” 2024 presentation at Stanford University.
Not much changed for almost 15 years after the Brown decision. Although Black students had the right to attend another school, the onus was on their families to demand a seat and figure out how to get their child to the school. Many schools remained entirely Black or entirely white.
Desegregation began in earnest in 1968 with a series of court orders, beginning with Virginia’s New Kent County schools. That year, the Supreme Court required the county to abolish its separate Black and white schools and students were reassigned to different schools to integrate them.
This graph above, produced by Reardon and Owens, shows how segregation plummeted across the country between 1968 and 1973. The researchers focused on roughly 500 larger school districts where there were at least 2,500 Black students. That captures nearly two-thirds of all Black students in the nation and avoids clouding the analysis with thousands of small districts of mostly white residents.
Reardon’s and Owens’s measurement of segregation compares the classmates of the average white student with the classmates of the average Black student. For example, in North Carolina’s Charlotte-Mecklenburg district, the average white student in 1968 attended a school where 90 percent of his peers were white and only 10 percent were Black. The average Black student attended a school where 76 percent of his peers were Black and 24 percent were white. Reardon and Owens then calculated the gap in exposure to each race. White students had 90 percent white classmates while Black students had 24 percent white classmates. The difference was 66 percentage points. On the flip side, Black students had 76 percent Black classmates while white students had 10 percent Black classmates. Again, the difference was 66 percentage points, which translates to 0.66 on the segregation index.
But in 1973, after court-ordered desegregation went into effect, the average white student attended a school that was 69 percent white and 31 percent Black. The average Black student attended a school that was 34 percent Black and 66 percent white. In five short years, the racial exposure gap fell from 66 percentage points to 3 percentage points. Schools reflected Charlotte-Mecklenburg’s demographics. In the graph above, Reardon and Owens averaged the segregation index figures for all 533 districts with substantial Black populations. That’s what each dot represents.
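For readers who want the mechanics, this exposure-gap index can be sketched in a few lines of Python. The function name and the toy enrollment figures below are illustrative, not the researchers’ own code or data:

```python
def white_black_exposure_gap(schools):
    """schools: list of (n_white, n_black) enrollments per school.

    Returns the gap between the average white student's share of white
    schoolmates and the average Black student's share of white schoolmates:
    0.0 means every school mirrors the district, 1.0 means total separation.
    """
    total_white = sum(w for w, b in schools)
    total_black = sum(b for w, b in schools)
    # Exposure of the average white student to white schoolmates
    white_exposure = sum(w * (w / (w + b)) for w, b in schools) / total_white
    # Exposure of the average Black student to white schoolmates
    black_exposure = sum(b * (w / (w + b)) for w, b in schools) / total_black
    return white_exposure - black_exposure

# Two fully segregated schools -> index of 1.0
print(white_black_exposure_gap([(100, 0), (0, 100)]))
# Two schools that each mirror the district -> index of 0.0
print(white_black_exposure_gap([(50, 50), (150, 150)]))
```

The second example shows why the index stays at zero even when schools differ in size, so long as each school’s racial mix matches the district’s.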
In the early 1990s, this measure of segregation began to creep up again, as depicted by the red tail in the graph above. Owens calls it a “slow and steady uptick” in contrast to the drastic decline in segregation after 1968. Segregation has not bounced back or returned to pre-Brown levels. “There’s a misconception that segregation is worse than ever,” Reardon said.
Although the red line from 1990 to the present looks nearly flat, when you zoom in on it, you can see that Black-white segregation grew by 25 percent between 1991 and 2019. During the pandemic, segregation declined slightly again.
Detailed view of the red line segment in the chart above, “Average White-Black Segregation, 1968-2022.” Source: Owens and Reardon, “The state of segregation: 70 years after Brown,” 2024 presentation at Stanford University.
It’s important to emphasize that these Black-white segregation levels are tiny compared with the degree of segregation in the late 1960s. A 25 percent increase can seem like a lot, but it’s less than 4 percentage points.
“It’s big enough that it makes me worried,” said Owens. “Now is the moment to keep an eye on this. If it continues in this direction, it would take a long time to get back up to Brown. But let’s not let it keep going up.”
Even more troubling is the fact that segregation increased substantially if you zero in on the nation’s biggest cities. White-Black segregation in the largest 100 school districts increased by 64 percent from 1988 to 2019, Owens and Reardon calculated.
Source: Owens and Reardon, “The state of segregation: 70 years after Brown,” 2024 presentation at Stanford University.
School choice plays a role in recent segregation
Why is segregation creeping back up again?
The expiration of court orders that mandated school integration and the expansion of school choice policies, including the rapid growth of charter schools, explains all of the increase in segregation from 2000 onward, said Reardon. Over 200 medium-sized and large districts were released from desegregation court orders from 1991 to 2009, and racial school segregation in these districts gradually increased in the years afterward.
School choice, however, appears to be the dominant force. More than half of the increase in segregation in the 2000s can be attributed to the rise of charter schools, whose numbers began to increase rapidly in the late 1990s. In many cases, either white or Black families flocked to different charter schools, leaving behind a less diverse student body in traditional public schools.
The reason for the rise in segregation in the 1990s before the number of charter schools soared is harder to understand. Owens speculates that other school choice policies, such as the option to attend any public school within a district or the creation of new magnet schools, may have played a role, but she doesn’t have the data to prove that. White gentrification of cities in the 1990s could also be a factor, she said, as the white newcomers favored a small set of schools or sent their children to private schools.
“We might just be catching a moment where there’s been an influx of one group before the other group leaves,” said Owens. “It’s hard to say how the numbers will look 10 years from now.”
It’s important to disentangle demographic shifts from segregation increases
There’s a popular narrative that segregation has increased because Black students are more likely to attend school with other students who are not white, especially Hispanic students. But Reardon and Owens say this analysis conflates demographic shifts in the U.S. population with segregation. The share of Hispanic students in U.S. schools now approaches 30 percent and everyone is attending schools with more Hispanic classmates. White students, who used to represent 85 percent of the U.S. student population in 1970, now make up less than half.
Source: Owens and Reardon, “The state of segregation: 70 years after Brown,” 2024 presentation at Stanford University.
The blue line in the graph above shows how the classmates of the average Black, Hispanic or Native American student have increased from about 55 percent Black, Hispanic and Native American students in the early 1970s to nearly 80 percent Black, Hispanic and Native American students today. That means that the average student who is not white is attending a school that is overwhelmingly made up of students who are not white.
But look at how the red line, which depicts white students, is following the same path. The average white student is attending a school that moved from 35 percent students who are not white in the 1970s to nearly 70 percent students who are not white today. “It’s entirely driven by Hispanic students,” said Owens. “Even the ‘white’ schools in L.A. are 40 percent Hispanic.”
I dug into U.S. Department of Education data to show how extremely segregated schools have become less common. The percentage of Black students attending a school that is 90 percent or more Black fell from 23 percent in 2000 to 10 percent in 2022. Only 1 in 10 Black students attends an all-Black or a nearly all-Black school. Meanwhile, the percentage of white students attending a school that is 90 percent or more white fell from 44 percent to 14 percent during this same time period. That’s 1 in 7. Far fewer Black or white students are learning in schools that are almost entirely made up of students of their same race.
At the same time, the percentage of Black students attending a school where 90 percent of students are not white grew from 37 percent in 2000 to 40 percent in 2022. But notice the sharp growth of Hispanic students during this period. They went from 7.6 million (fewer than the number of Black students) to more than 13.9 million (almost double the number of Black students).
Most segregation falls across school district boundaries
Source: Owens and Reardon, “The state of segregation: 70 years after Brown,” 2024 presentation at Stanford University.
This bar chart shows how schools are segregated for two reasons. One is that people of different races live on opposite sides of school district lines. Detroit is an extreme example. The city schools are dominated by Black students. Meanwhile, the Detroit suburbs, which operate independent school systems, are dominated by white students. Almost all the segregation is because people of different races live in different districts. Meanwhile, in the Charlotte, North Carolina, metropolitan area, over half of the segregation reflects the uneven distribution of students within school districts.
Nationally, 60 percent of the segregation occurs because of the Detroit scenario: people of different races living in different districts, Reardon and Owens calculated. Still, 40 percent of current segregation falls within administrative borders that policymakers can control.
Residential segregation is decreasing
People often say there’s little that can be done about school segregation until we integrate neighborhoods. I was surprised to learn that residential segregation has been declining over the past 30 years, according to Reardon’s and Owens’s analysis of census tracts. More Black and white people live in proximity to each other. And yet, at the same time, school segregation is getting worse.
All this matters, Reardon said, because kids are learning at different rates in more segregated systems. “We know that more integrated schools provide more equal educational opportunities,” he said. “The things we’re doing with our school systems are making segregation worse.”
Reardon recommends more reforms to housing policy to integrate neighborhoods and more “guard rails” on school choice systems so that they do not produce highly segregated schools.
Student-parents disproportionately give up before they reach the finish line. Fewer than 4 in 10 graduate with a degree within six years, compared with more than 6 in 10 other students.
For the first 57 minutes of the basketball game between two Bend, Oregon, high school rivals, Kyra Rice stood at the edges of the court taking yearbook photos. With just minutes before the end of the game, she was told she had to move.
Kyra pushed back: She had permission to stand near the court. The athletic director got involved, Kyra recalled, and in the exchange Kyra let a swear word or two slip.
Kyra has anxiety as well as ADHD, which can make her impulsive. Following years of poor experiences at school, she sometimes became defensive when she felt overwhelmed, said her mom, Jules Rice.
But at the game, Kyra said she kept her cool overall. Both she and her mother were shocked to learn the next day that she’d been suspended from school.
“OK, maybe she said some bad words, but it’s not enough to suspend her,” Rice said.
The incident’s discipline record, provided by Rice, lists a series of categories to explain the suspension: insubordination, disobedience, disrespectful/minor disruption, inappropriate language, non-compliance.
Broad and subjective categories like these are cited hundreds of thousands of times a year to justify removing students from school, a Hechinger Report investigation found. The data show that students with disabilities, like Kyra, are more likely than their peers to be punished for such violations. In fact, they’re often more likely to be suspended for these reasons than for other infractions.
For example, between 2017-18 and 2021-22, Rhode Island students with disabilities were, on average, two and a half times more likely than their peers to be suspended for any reason, but nearly three times more likely to be suspended for insubordination and almost four times more likely to be suspended for disorderly conduct. Similar patterns played out in other states with available data including Massachusetts, Montana and Vermont.
Federal law is supposed to protect students from being suspended for behavior that results from their disability, even if they are being disruptive or insubordinate. But those protections have significant limitations. At the same time, these subjective categories are almost tailor-made to trap students with disabilities, who might have trouble expressing or regulating themselves appropriately.
Districts have wide discretion in setting their own rules and many students with disabilities quickly earn reputations at school as troublemakers. “Unfortunately, who gets caught up in a lot of the vagueness in the codes of conduct are students with disabilities,” said attorney Robert Tudisco, an expert with Understood.org, a nonprofit that provides resources and support to people with learning and attention disabilities.
Students on the autism spectrum often have a hard time communicating with words and might yell or become aggressive if something upsets them. A student with oppositional defiant disorder is likely to be openly insubordinate to authority, while one with dyslexia might act out when frustrated with schoolwork. Students with ADHD typically have a hard time controlling their impulses.
Kyra’s disability created challenges throughout her school career in the Bend-La Pine School District. “Nobody really understood her,” Rice said. “She’s a big personality and she’s very impulsive. And impulsivity is what gets kids in trouble and gets kids suspended.”
Suspended for…what?
Students miss hundreds of thousands of school days each year for subjective infractions like defiance and disorderly conduct, a Hechinger investigation revealed.
Kyra, now 17, said that too few teachers cared about her individualized education program, or IEP, a document that details the accommodations a student in special education is granted. She’d regularly butt heads with teachers or skip class altogether to avoid them. Her favorite teacher was her special ed teacher.
“She understood my ADHD and my other special needs,” Kyra said. “My other teachers didn’t.”
Scott Maben, district spokesperson, said in an email he could not comment on specific disciplinary matters because of privacy concerns, but that the district had a range of responses to deal with student misconduct and that administrators “carefully consider a response that is commensurate with the violation.”
In Oregon, “disruptive conduct” accounted for more than half of all suspensions from 2017-18 to 2021-22. The state department of education includes in that category insubordination and disorderly conduct, as well as harassment, obscene behavior, minor physical altercations, and “other” rule violations.
Disruptive behavior is the leading cause of suspensions because of its “inherently subjective nature,” the state department of education’s spokesperson, Marc Siegal, said in an email. He added that the department monitors discipline data for special education disparities and works with school districts on the issue.
The primary protections for students with disabilities come from the federal government, through the Individuals with Disabilities Education Act, or IDEA. But that law only requires districts to examine whether a student’s behavior stems from their disability after they have missed 10 total days of school through suspension.
At that point, districts are required to hold a manifestation hearing, in which officials must determine whether a student’s behavior was the result of their disability. “That’s where it gets very gray,” Tudisco said. “What happens in the determination of manifestation is very subjective.”
In his experience, he added, the behavior is almost always connected to a student’s disability, but school districts often don’t see it that way.
“Manifestation is not about giving Johnny or Susie a free pass because they have a disability,” Tudisco said. “It’s a process to understand why this behavior occurred so we can do something to prevent it tomorrow.”
The connections are often much clearer to parents.
A Rhode Island mother, Pearl, said her daughter was easily overwhelmed in her elementary school classroom in the Bristol Warren Regional School District. (Pearl is being referred to by her middle name because she is still a district parent and fears retaliation.)
Her child has autism and easily experiences a sensory overload. If the classroom was too loud or someone new walked in, she might start screaming and get out of her seat, Pearl said. Teachers struggled to calm her down, as other students were escorted out of the room.
Sometimes, Pearl was called to pick up her daughter early, in an informal removal that went unrecorded. A few times, though, the girl was formally suspended for disorderly conduct, Pearl recalled.
Between 2017-18 and 2020-21, students with disabilities in the Bristol Warren Regional School District made up about 13 percent of the student body, but accounted for 21 percent of suspensions for insubordination and 30 percent of all disorderly conduct suspensions.
The district did not respond to repeated requests for comment.
The Rhode Island Department of Education collects data on school discipline from districts, but special education and discipline reform advocates in the state say that the agency rarely acts on these numbers.
Department spokesperson Victor Morente said in an email that the agency monitors discipline data and is “very clear that suspension should be the last option considered.” He added that the department has published resources about alternatives to suspension and discipline specifically for students with disabilities.
A 2016 state law that limits the overall use of out-of-school suspensions also requires that districts examine their data for inequities. Districts that find such disparities are supposed to submit a report to the department of education, said Hannah Stern, a policy associate at the Rhode Island American Civil Liberties Union.
Her group submits public records requests for copies of those reports every year, but has never received one, she said, “even though almost every single school district exhibits disparities.”
Pearl said that her daughter needed one-on-one support in the classroom instead of punishment. “She’s autistic. She’s not going to learn her lesson by suspending her,” Pearl said. “She actually got more scared to go back. She actually felt very unwelcome and very sad.”
Students with autism often have a hard time connecting their actions to the punishment, said Joanne Quinn, executive director of The Autism Project, a Rhode Island-based group that offers support to family members of people with autism. With suspension, “there’s no learning going on and they’re going to do the same thing incorrectly.”
Quinn’s group provides training for schools throughout Rhode Island and beyond, aimed at helping teachers understand how the brain functions in people with autism and offering strategies on how to effectively respond to behavior challenges that could easily be labeled disobedient or disorderly.
Federal law provides a road map for schools to improve how they respond to misconduct related to a student’s disability. Schools should identify a student’s triggers and create a behavior intervention plan aimed at preventing problems before they start, it says.
But doing these things well requires time, resources and training that can be in short supply, leaving teachers feeling alone and struggling to maintain order in their classrooms, said Christine Levy, a former special education teacher and administrator who works as an advocate for individual special education students in the Northeast, including Rhode Island.
Levy recently worked with a student with disabilities who was suspended after he tickled a peer at a locker on five straight days. But, she said, the situation should have never reached the point of suspension: Educators should have quickly identified what the boy was struggling with and set a plan in motion to help him, including modeling appropriate locker conduct.
Had this boy’s teachers done that, the suspension could have been avoided. “The repair of that is so much longer and so much harder to do versus, let’s catch it right away,” she said.
Cranston Public School officials would regularly call Michelle Gomes and tell her to come get her daughter for misbehaving in class, she said. Credit: Sarah Butrymowicz/The Hechinger Report
Many parents, though, described similar situations in which a child routinely got in trouble for the same behavior. When Michelle Gomes’s daughter became upset in her kindergarten classroom, she’d often run out and refuse to come back in. Sometimes, she’d tear things off the walls.
“Whenever she gets like that, it’s hard to see,” Gomes said. “I hurt for her. It’s like she’s not in control.”
Gomes received regular calls from Cranston Public School officials to come pick her daughter up. A couple of times, the child was formally suspended, Gomes said. The school described her as a safety risk, Gomes recalled.
“She obviously doesn’t feel safe herself,” she said.
Cranston Public Schools did not respond to requests for comment.
Gomes’s daughter had a speech delay and anxiety and qualified for special education services. A private neurological evaluation concluded that she was compensating for that delay with her physical responses, Gomes said.
This can be a common cause of behavior challenges for students with disabilities, experts say.
“Behavior is communication,” said Julian Saavedra, an assistant principal and an expert at Understood.org. “The behavior is trying to tell us something. We as the IEP team, the school team, have to dig deeper.”
On her own, Gomes found strategies that helped. Her daughter struggled with transitions, so they’d go over the day in advance to prepare her for what to expect. A play therapist taught both mother and daughter breathing exercises.
Her daughter was switched to another district school where a social worker would sometimes walk the girl to class. When the child got worked up, she’d sometimes be allowed to sit with that social worker or in the nurse’s office to calm down. That helped, but sometimes, those staff members weren’t available.
In the end, Gomes moved her daughter to a school outside the district that was better equipped to help the girl de-escalate. Her behavior problems lessened and she started enjoying going to school, Gomes said.
But Gomes still can’t understand why more teachers weren’t able to help her child regulate herself. “Do we need retraining or do we need new training?” she said. “Because this is mind-blowing to me, not one of you can do that.”
Note: The Hechinger Report’s Fazil Khan had nearly completed the data analysis and reporting for this project when he died in a fire in his apartment building. USA TODAY Senior Data Editor Doug Caruso completed data visualizations for this project based on Khan’s work.
The Hechinger Report provides in-depth, fact-based, unbiased reporting on education that is free to all readers. But that doesn’t mean it’s free to produce. Our work keeps educators and the public informed about pressing issues at schools and on campuses throughout the country. We tell the whole story, even when the details are inconvenient. Help us keep doing that.
A Rhode Island student smashed a ketchup packet with his fist, splattering an administrator. Another ripped up his school work. The district called it “destruction of school property.” A Washington student turned cartwheels while a PE teacher attempted to give instructions.
A pair of Colorado students slid down a dirt path despite a warning. An Ohio 12th grader refused to work while assigned to the in-school suspension room. Then there was the Maryland sixth grader who swore when his computer shut off and responded “my bad” when his teacher addressed his language.
Their transgressions all ended the same way: The students were suspended.
Discipline records state the justification for their removals: These students were disorderly. Insubordinate. Disruptive. Disobedient. Defiant. Disrespectful.
At most U.S. public schools, students can be suspended, even expelled, for these ambiguous and highly subjective reasons. This type of punishment is pervasive nationwide, leading to hundreds of thousands of missed days of school every year, and is often doled out for misbehavior that doesn’t seriously hurt anyone or threaten school safety, a Hechinger Report investigation found.
Districts cited one of these vague violations as a reason for suspending or expelling students more than 2.8 million times from 2017-18 to 2021-22 across the 20 states that collect this data. That amounted to nearly a third of all punishments recorded by those states. Black students and students with disabilities were more likely than their peers to be disciplined for these reasons.
Many discipline reform advocates say that suspensions should be reserved for only the most serious, dangerous behaviors. Those, the analysis found, were much less common. Violations of rules involving alcohol, tobacco or drugs were cited as reasons for ejecting students from classes about 759,000 times, and incidents involving a weapon were cited 131,000 times. Even infractions involving physical violence — such as fighting, assault and battery — were less common, with about 2.3 million instances. (Learn more about the data and how we did our analysis.)
Because categories like defiance and disorderly conduct are often defined broadly at the state level, teachers and administrators have wide latitude in interpreting them, according to interviews with dozens of researchers, educators, lawyers and discipline reform advocates. That opens the door to suspensions for low-level infractions.
“Those are citations you can drive a truck through,” said Jennifer Wood, executive director for the Rhode Island Center for Justice.
The Hechinger Report also obtained more than 7,000 discipline records from a dozen school districts across eight states through public records requests. They show a wide range of behavior that led to suspensions for things like disruptive conduct and insubordination. Much of the conduct posed little threat to safety. For instance, students were regularly suspended for being tardy, using a phone during class or swearing.
Decades of research have found that students who are suspended from school tend to perform worse academically and drop out at higher rates. Researchers have linked suspensions to lower college enrollment rates and increased involvement with the criminal justice system.
These findings have spurred some policymakers to try to curtail suspensions by limiting their use to severe misbehavior that could harm others. Last year, California banned all suspensions for willful defiance. Other places, including Philadelphia and New York City, have similarly eliminated suspensions for low-level misconduct.
Elsewhere, though, as student behavior has worsened following the pandemic, legislators are calling for stricter discipline policies, concerned for educators who struggle to maintain order and students whose lessons are disrupted. These legislative proposals come despite warnings from experts and even classroom teachers who say more suspensions — particularly for minor, subjective offenses — are not the answer.
Roberto J. Rodríguez, assistant U.S. education secretary, said he was concerned by The Hechinger Report’s findings. “We need more tools in the toolkit for our educators and for our principals to be able to respond to some of the social and emotional needs,” he said. “Suspension and expulsion shouldn’t be the only tool that we pull out when we see behavioral issues.”
In Rhode Island, insubordination was the most common reason for a student to be suspended in the years analyzed. Disorderly conduct was third.
In the Cranston Public Schools, these two categories accounted for half of the Rhode Island district’s suspensions in 2021-22. Disorderly conduct alone made up about 38 percent.
Behavior that led to such a suspension there in recent years included:
Getting a haircut in the bathroom;
Putting a finger through the middle of another student’s hamburger at lunch;
Writing swear words in an email exchange with another student;
Throwing cut-up pieces of paper in the air;
Stabbing a juice bottle with a pencil and getting juice all over a table and peers; and
Leapfrogging over a peer and “almost” knocking down others.
Cranston school officials did not respond to repeated requests for comment.
Rhode Island Department of Education spokesperson Victor Morente said in an email that the agency could not comment on specific causes for suspension, but that the department “continues to underscore that all options need to be exhausted before schools move to suspension.”
The department defines disorderly conduct as “Any act which substantially disrupts the orderly conduct of a school function, [or] behavior which substantially disrupts the orderly learning environment or poses a threat to the health, safety, and/or welfare of students, staff, or others.”
Many states use similarly unspecific language in their discipline codes, if they provide any guidance at all, a review of state policies found.
For education departments that do provide definitions to districts, subjectivity is frequently built in. In Louisiana’s state guidance, for instance, “treats authority with disrespect” includes “any act which demonstrates a disregard or interference with authority.”
Ted Beasley, spokesperson for the Louisiana Department of Education, said in an email that discipline codes are not defined in state statutes and that “school discipline is a local school system issue.”
Officials in several other states said the same.
The result, as demonstrated by a review of discipline records from eight states, is a broad interpretation of the categories: Students were suspended for shoving, yelling at peers, throwing objects, and violating dress codes. Some students were suspended for a single infraction; others broke several rules.
In fewer than 15 percent of cases, students got in trouble for using profanity, according to a Hechinger analysis of the records; the rate was similar for yelling at or talking back to administrators. In at least 20 percent of cases, students refused a direct order, and in 6 percent they were punished for misusing technology, such as being on their cellphones during class or using school computers inappropriately.
“What is defiance to one is not defiance to all, and that becomes confusing, not just for the students, but also the adults,” said Harry Lawson, human and civil rights director for the National Education Association, the country’s largest teachers union. “Those terms that are littered throughout a lot of codes of conduct, depending on the relationship between people, can mean very different things.”
But giving teachers discretion in how to assign discipline isn’t necessarily a problem, said Adam Tyner, national research director at the Thomas B. Fordham Institute. “The whole point of trusting, in this case, teachers, or anyone, to do their job is to be able to let them have responsibility and make some judgment calls,” he said.
Tyner added that it’s important to think about all students when considering school discipline policies. “If a student is disrupting the class, it may not help them all that much to take them and put them in a different environment, but it sure might help the other students who are trying to learn,” he said.
Johanna Lacoe spent years trying to measure exactly that — the effect of discipline reforms on all students in Philadelphia, including those who hadn’t previously been suspended. The district banned out-of-school suspensions for many nonviolent offenses in 2012.
Critics of the policy shift warned that it would harm students who do behave in class; they’d learn less or even come to school less often. Lacoe’s research found that schools faithfully following the new rules saw no decrease in academic achievement or attendance for non-suspended students.
But the policy wasn’t implemented consistently, the researchers found. The schools that complied had already issued the fewest suspensions; it was easier for them to make the policy shift, Lacoe said. In schools that kept suspending students despite the ban, test scores and student attendance fell slightly.
Overall, though, students who had been previously suspended showed improvements. Lacoe called eliminating out-of-school suspensions for minor infractions a “no brainer.”
“We know suspensions aren’t good for kids,” said Lacoe, the director of the California Policy Lab, a group that partners with government agencies to research the impact of policies. “Kicking kids out of school and providing them no services and no support and then returning them to the environment where nothing has changed is not a good solution.”
This fall, two high schoolers in Providence, Rhode Island, walked out of a classroom. They later learned they were being suspended because the walkout was deemed disrespectful to a teacher.
On her first day back after the suspension, one of the students, Sara, said she went to her teacher to talk through the incident. It was something she wished she’d had the chance to do without missing a couple days of school.
“Suspending someone, not talking to someone, that’s not helping,” said Sara, whose last name is being withheld to protect her privacy. “You’re not helping them to succeed. You’re making it worse.”
In 2021-22, disorderly conduct and insubordination made up a third of all Providence Public School suspensions.
District spokesperson Jay Wegimont said in an email that the district uses many alternatives to suspension and out-of-school suspensions are only given to respond to “persistent conduct which substantially impedes the ability of other students to learn.”
Some parents and students interviewed asked not to have their full names published, fearing retaliation from their school districts. But nearly all parents and students who have dealt with suspension for violations such as disrespect and disorderly conduct also said that the punishment often did nothing but leave the student frustrated with the school and damage the student’s relationships with teachers.
Following a suspension, Yousuf Munir founded the Young Activists Coalition, which advocated for fair discipline and restorative practices at Cincinnati Public Schools. Credit: Albert Cesare/Cincinnati Enquirer
At a Cincinnati high school in 2019, Yousuf Munir led a peaceful protest about the impact of climate change, with about 50 fellow students. Munir, then a junior, planned to leave school and join a larger protest at City Hall. The principal said Munir couldn’t go and threatened to assign detention.
Munir left anyway.
That detention morphed into suspension for disobeying the principal, said Munir, who remembers thinking: “The only thing you’re doing is literally keeping me out of class.”
The district told The Hechinger Report that Munir was suspended for leaving campus without written permission, a decision in line with the district’s code of conduct.
The whole incident left Munir feeling “so angry I didn’t know what to do with it.” They went on to start the Young Activists Coalition, which advocated for fair discipline and restorative practices at Cincinnati Public Schools.
Now in college, Munir is a mentor to high school kids. “I can’t imagine ever treating a kid that way,” they said.
In 2021-22, 38 percent of suspensions and expulsions in Maryland’s Dorchester County Public Schools were assigned for disrespect and disruption. Credit: Sarah Butrymowicz/The Hechinger Report
Parents and students around the country described underlying reasons for behavior problems that a suspension would do little to address: Struggles with anxiety. Frustration with not understanding classwork. Distraction by events in their personal lives.
Discipline records are also dotted with examples that indicate a deeper cause for the misbehavior.
In one case, a student in Rhode Island was suspended for talking back to her teachers; the discipline record notes that her mother had recently died and the student might need counseling. A student in Minnesota “lost his cool” after having “his buttons pushed by a couple peers.” He cursed and argued back. A Maryland student who went to the main office to report being harassed cursed at administrators when asked to formally document it.
To be sure, discipline records disclose only part of a school’s response, and many places may simultaneously be working to address root causes. Even as they retain — and exercise — the right to suspend, many districts across the country have adopted alternative strategies aimed at building relationships and repairing harm caused by misconduct.
“There needs to be some kind of consequence for acting out, but 9 out of 10 times, it doesn’t need to be suspension,” said Judy Brown, a social worker in Minneapolis Public Schools.
Some educators who have embraced alternatives say in the long run they’re more effective. Suspension temporarily removes kids; it rarely changes behavior when they return.
“It’s really about having the compassion and the time and patience to be able to have these conversations with students to see what the antecedent of the behavior is,” Brown said. “It’s often not personal; they’re overwhelmed.”
In some cases, students act out because they don’t want to be at school at all and know the quickest escape is misbehavior.
Records from Maryland’s Dorchester County Public Schools show that the main goal for some students who were suspended for defiance and disruption was getting sent home. Credit: Sarah Butrymowicz/The Hechinger Report
On Valentine’s Day 2022, a Maryland seventh grader showed up to school late. She then refused to go to class or leave the hallway and, according to her Dorchester County discipline record, was disrespectful towards an educator. “These are the behaviors [the student] typically displays when she does not want to go to class,” her record reads.
By 8:30 she was suspended and sent home for three days.
Dorchester County school officials declined to comment. In 2021-22, 38 percent of suspensions and expulsions in the district were assigned for disrespect and disruption.
Last year, administrators in Minnesota’s Monticello School District spent the summer overhauling their discipline procedures and consequences, out of concern that students of color were being disproportionately disciplined. They developed clearer definitions for violation categories and instituted non-exclusionary tools to deal with isolated minor misbehaviors.
Previously, the district suspended students for telling an “inappropriate joke” in class or cursing, records show. Those types of behavior will now be dealt with in schools, Superintendent Eric Olsen said, but repeated refusals and noncompliance could still lead to a suspension.
“Would I ever want to see a school where we can’t suspend? I would not,” he said. “Life is always about balance.”
Olsen wants his students — all students — to feel valued and be successful. But they’re not his only consideration. “You also have to think of your employees,” he said. “There’s also that fine line of making sure your staff feels safe.”
Monticello, like most school districts across the country, has seen an increase in student misconduct since schools reopened after pandemic closures. A 2023 survey found that more than 40 percent of educators felt less safe in their schools compared with 2019 and, in some instances, teachers have been injured in violent incidents, including shootings.
And even before 2020, educators nationwide were warning that they lacked the appropriate mental health and social service supports to adequately deal with behavior challenges. Some nonviolent problems, like refusal to put phones away or stay in one’s seat, can make it difficult for teachers to effectively do their jobs.
And the discipline records reviewed by The Hechinger Report do capture a sampling of more severe misbehavior. In some cases, students were labeled defiant or disorderly for fighting, throwing chairs or even hitting a teacher.
Shatara Clark taught for 10 years in Alabama before feeling too disrespected and overextended to keep going. She recalled regular disobedience from students.
“Sometimes I look back like, ‘How did I make it?’” Clark said. “My blood pressure got high and everything.”
She became so familiar with the protocol for discipline referrals that she can still remember every step two years after leaving the classroom. In her schools, students were suspended for major incidents like fighting or threatening a teacher but also for repeated nonviolent behavior like interrupting or speaking out in class.
Clark said discipline records often don’t show the full context. “Say for instance, a boy got suspended for talking out of turn. Well, you’re not going to know that he’s done that five times, and I’ve called his parents,” she said. “Then you see someone that’s been suspended for fighting, and it looks like the same punishment for a lesser thing.”
In many states, reform advocates and student activists pushing to ban harsh discipline policies have found a receptive audience in lawmakers. Many teachers are also sympathetic to their arguments; the National Education Association and American Federation of Teachers support discipline reform and alternatives to suspension.
In some instances, though, teachers have resisted efforts to curtail suspensions, saying they need to have the option to remove kids from school.
Many experts say the largest hurdle to getting teachers to embrace discipline reforms is that new policies are often rolled out without training or adequate staffing and support.
Without those things, “the policy change is somewhat of a paper tiger,” said Richard Welsh, an associate professor of education and public policy at Vanderbilt University. “If we don’t think about the accompanying support, it’s almost as if some of these are unfunded mandates.”
In Monticello, Olsen has focused on professional development for teachers to promote alternatives to suspension. The district has created space for students to talk about their actions and how they can rebuild relationships.
It’s still a work in progress. Teacher training, Olsen says, is key.
“You can’t just do a policy change and expect everyone to magically do it.”
Reporting contributed by Hadley Hitson of the Montgomery Advertiser and Madeline Mitchell of the Cincinnati Enquirer, members of the USA TODAY Network; and Amanda Chen, Tazbia Fatima, Sara Hutchinson, Tara García Mathewson, and Nirvi Shah, The Hechinger Report.
The federal government’s financial aid application, known as the FAFSA, has been plagued with problems since its new version launched December 30, three months late. This is a major problem for the more than 70 percent of undergraduates who rely on some type of financial aid to pay for their education, because they’ll have less time than ever to make a decision about one of the biggest expenses of their lives.
What can parents do? The best first step is one that’s often the hardest for parents: Start a conversation about what you can afford. Research has shown that middle-class families rarely discuss the trade-offs and uncertainties related to paying for college, even though an honest conversation may prevent future financial headaches and relational heartache. The biggest reason? Parents may not want to burden their children with financial worries.
As a researcher at uAspire, a nonprofit that tries to help students learn about and access financial aid, I find that concerning. But I know how hard these discussions can be.
My own family didn’t talk about how we’d pay for college more than 25 years ago. I remember when the promissory notes arrived at my house, on green postcards, printed in a tiny font. I didn’t ask a single person what they meant, and no one in my family explained them to me — I just signed and mailed them back. Loans appeared to offer a bridge from my high school reality to an independent, adult life far from home. What I didn’t realize is how many of my future choices would be limited for the next 21 years, until those loans were finally paid off. Making room in my postcollege budget for loan payments affected where I could afford to live, how many hours I had to work, how often I could eat out, whether I could afford to travel to a friend’s wedding and whether I could donate to charities, among other choices.
Of course, the amount of financial damage I could do to myself back then was more limited than it would be now. Tuition charges alone have more than tripled at my alma mater, Northwestern University, since I was a student, rising from less than $20,000 a year in 1998 to nearly $65,000 this past fall.
FAFSA Fiasco
This op-ed is part of a package of opinion pieces The Hechinger Report is running that focus on solutions to the new FAFSA’s troubled rollout.
To muster the bravery for a financial talk, it may help parents to know that this process is complicated for every family. The FAFSA — the first step in a lengthy process to unlock grants, loans, work-study and other forms of financial aid — has been imperfect since its inception in 1992. This new version promises to be simpler and award Pell Grants to over 600,000 more students from low-income families — major policy wins. Yet families largely have not found FAFSA to be simpler. It’s improving, but the growing pains are being felt by students and parents everywhere.
That’s why it is so imperative for families to talk now, while there is still time to listen, share and make a plan, before placing a deposit somewhere.
Once you do start talking, the conversation with your child should cover a few things: What can our family afford to pay up front to start college? What sources — savings, or a part-time job, for example — can your child rely on for day-to-day expenses during college? And what can they comfortably pay back later based on their expected employment earnings?
There are other things you can do, too. First, complete the FAFSA as soon as possible. Second, review the financial aid offers once they arrive — even though they will likely arrive later than usual this year — and make sure you understand the different types of aid being offered.
My organization offers a free tool — a college cost calculator — to compare notoriously confusing aid offers. Since fewer than half of the students who begin a bachelor’s degree will graduate within four years, choose an institution with the most sustainable financing plan, one you could manage for up to six years. Browse government websites like Federal Student Aid and the Consumer Financial Protection Bureau, or industry sites like NerdWallet, to learn about the pros and cons of different types of education loans before accepting any. The Institute of Student Loan Advisors can offer advice if you have questions about loan repayment, including forgiveness and consolidation. Appeal your aid offer if your financial situation has changed dramatically from what was captured on your 2022 tax return; resources on the SwiftStudent website can help you get started.
Of course, these are all individual actions to mitigate the effects of our broken system. Until there’s true change in how we pay for college, students and their families must be vigilant and proactive — starting now.
Jonathan Lewis is the senior director of research at uAspire, a nonprofit group that works to ensure students have the necessary financial information and resources to complete college.