Why Data Science Education Must Begin in High School

In today’s data-saturated world, the ability to make sense of information is no longer optional; it is foundational. Our social, economic, and civic lives are shaped by the constant flow of data that surrounds us, from the metrics we see in the news to the algorithms underpinning the technologies we use daily. Data literacy and data science have therefore emerged as essential competencies that all students should begin developing early in their education. Yet, even as data increasingly drives decisions across industries and governments, the United States lacks a universal definition of “data literacy,” and data science remains optional or inaccessible in much of the country’s K–12 system. Strengthening data science education in high school, and ideally earlier, is both an educational necessity and a national imperative.
At its core, data literacy refers to the ability to understand, evaluate, and communicate claims derived from data. The Organisation for Economic Co-operation and Development (OECD) defines it as the capacity to “analyze, explore, read, and argue with data,” underscoring that it is not merely a technical skill but a cognitive and communicative one. Echoing this, the statistician Fred Mosteller described statistics as “the art and science of gathering, analyzing, and making inferences from data,” a discipline that requires reasoning, interpretation, and judgment. These intellectual habits do not emerge suddenly in adulthood. They develop gradually, through repeated exposure to meaningful problems, diverse data sources, and authentic decision-making.
Research supports this developmental perspective. Children learn to read well only when they encounter rich literacy environments early and often; inequities in these environments create lasting disparities in reading proficiency. The same is true of data literacy. It is not a skill that can be fully taught in a single high school statistics course. Instead, as the GAISE II report argues, students need ongoing, spiraled exposure beginning in elementary school and reinforced throughout middle and high school. Early experiences with variability, questioning, uncertainty, and evidence-based reasoning help students build the foundations for later mastery in both statistics and data science.
Rob Gould’s criteria for statistical literacy further illustrate why this long-term developmental approach matters. Students must learn to ask investigative questions; understand who collects data and why; create meaningful visualizations; distinguish between random and non-random samples; and appreciate the uncertainty inherent in all statistical processes. These are not isolated skills. They reflect a way of thinking—skeptical, curious, and analytical—that takes years to nurture and is increasingly vital for navigating both personal and professional life.
As data has become ubiquitous, it has also become economically consequential. According to Forbes, data literacy is now one of the most in-demand skills globally. The U.S. Bureau of Labor Statistics projects employment for data scientists to grow by 35% and for statisticians by 32% from 2022 to 2032, among the highest growth rates for any occupation. These roles extend far beyond the technology sector. A simple search for “data science jobs in Indiana,” for example, returns hundreds of listings from employers as varied as Eli Lilly, the State of Indiana, and Caterpillar. Many positions are remote, expanding opportunities for workers far from traditional tech hubs. This is not a niche workforce trend; it reflects a national shift in how industries operate: healthcare, agriculture, finance, manufacturing, national security, and more increasingly rely on employees who can analyze and interpret data.
The challenge is that our education system is not yet prepared to meet this demand. Teachers themselves need stronger preparation in statistics, computer science, and programming concepts. A 2022 study on AI teacher professional development found stark differences in teacher outcomes depending on teachers’ prior foundational knowledge in mathematics and statistics. Without systemic support, training, and high-quality resources, teachers will struggle to give students the skills that modern society requires.
Other countries have already recognized the strategic importance of data literacy. Statistics Canada provides national resources to support data education for learners of all ages. International networks such as GIST are building global capacity in statistical training. Programs like Census at School and Experiments at School give students hands-on experiences with authentic data, and New Zealand’s recent curriculum reform emphasizes numeracy and real-world data use. While approaches vary, one theme is universal: early exposure and strong teacher professional development are key.
The consequences of inaction for the United States are substantial. Low data literacy hinders productivity, reduces a company’s ability to understand consumer behavior, inhibits innovation, and leads to poor decision-making. At the national level, insufficient data literacy undermines global competitiveness and weakens national security. Intelligence agencies rely on experts who can distinguish signal from noise, interpret complex datasets, and identify emerging threats. Cybersecurity professionals need to detect anomalies and patterns long before they escalate. A workforce lacking these skills leaves the country more vulnerable.
If the United States wants to maintain its global leadership in innovation, economic strength, and national security, data literacy cannot be an optional enrichment course. It must be a core component of K–12 education, beginning early and growing more sophisticated across the years. High school is a pivotal moment: students are ready for deeper analysis, modeling, and real-world problem-solving. By equipping them with data science skills now, we position the next generation not just to participate in the modern world but to lead it.
About the Author
Bonnie Ghosh Dastidar is a senior statistician at RAND, specializing in randomized trials, large-scale surveys, and health policy. She is the 119th President of the American Statistical Association and an Elected Fellow of the ASA. She received her PhD from Penn State University in 1998. Her daughter, Maya Brazil, was inspired to co-write this blog post after taking a data science course at Carlmont High School in California. After high school, Maya intends to study criminal justice and forensic science.