Chapter 1.2: What is Data Science and Why It Matters
This chapter examines the fundamental nature of data science as an interdisciplinary field and explores its critical importance in contemporary decision-making. Key concepts include the four-component framework of data science, the progression from descriptive to prescriptive analytics, and the practical applications that demonstrate data science’s value across industries and societal challenges.
Data Science in Crisis Response: The COVID-19 Dashboard
Dr. Lauren Gardner at Johns Hopkins University created the COVID-19 Dashboard that became the world’s most trusted source for pandemic data (Johns Hopkins Engineering, 2022). Her team combined epidemiological expertise, statistical modeling, computer programming, and data visualization to transform scattered reports into actionable intelligence. Within months, their dashboard influenced policy decisions affecting billions of people worldwide. It received over 1 billion hits per day and has garnered 1.2 billion page views since 2020.
The dashboard exemplifies data science implementation through systematic integration of multiple data sources, cleaning of inconsistent reporting formats, construction of predictive models, creation of compelling visualizations, and communication of findings to diverse audiences from scientists to world leaders. This real-world application demonstrates the transformation from raw information to informed decisions that characterizes effective data science practice.
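The integration and cleaning steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two agencies, their field names, date formats, and case counts are all invented, and the code is not the dashboard's actual pipeline.

```python
from datetime import datetime

# Hypothetical raw reports (invented names, formats, and values).
# The two agencies are assumed to cover different districts of one region,
# so their counts can be added without double counting.
RAW_REPORTS = [
    {"source": "agency_a", "region": "Region X", "date": "2020-03-01", "cases": "120"},
    {"source": "agency_b", "Area": "region x ", "reported": "01/03/2020", "confirmed": 95},
]

# Assumed per-source mapping from raw field names to one unified schema.
FIELD_MAPS = {
    "agency_a": {"region": "region", "date": "date", "cases": "cases", "fmt": "%Y-%m-%d"},
    "agency_b": {"region": "Area", "date": "reported", "cases": "confirmed", "fmt": "%d/%m/%Y"},
}


def normalize(report):
    """Convert one raw report into the unified schema."""
    spec = FIELD_MAPS[report["source"]]
    parsed_date = datetime.strptime(report[spec["date"]], spec["fmt"]).date()
    return {
        "region": report[spec["region"]].strip().title(),  # fix case and stray spaces
        "date": parsed_date.isoformat(),                    # one date format for all
        "cases": int(report[spec["cases"]]),                # counts as integers
    }


def integrate(reports):
    """Normalize every report and total the cases per (region, date)."""
    totals = {}
    for report in reports:
        row = normalize(report)
        key = (row["region"], row["date"])
        totals[key] = totals.get(key, 0) + row["cases"]
    return totals


unified = integrate(RAW_REPORTS)
print(unified)
```

Keeping the per-source differences in a mapping table, rather than in the normalization logic itself, means a new source can be added by describing its format instead of rewriting code.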
Defining Data Science
Data science is an interdisciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge and insights from structured and unstructured data.
Data science serves as the bridge between raw information and informed decisions. The COVID-19 dashboard exemplifies this transformation, converting inconsistent case reports from thousands of health departments worldwide into a unified, real-time picture of global pandemic spread that guided trillion-dollar policy decisions.
The Data Science Framework
Data Science = Domain Expertise + Mathematical/Statistical Knowledge + Programming Skills + Communication Abilities
Each component is essential for successful data science implementation. Domain expertise helps practitioners understand what data actually means in context. Mathematical and statistical knowledge enables accurate modeling and pattern recognition. Programming skills automate complex data collection and processing workflows. Communication abilities make complex findings accessible to diverse audiences and decision-makers.
Figure 1.2.1: The four essential components of data science working in integration. Domain expertise, mathematical knowledge, programming skills, and communication abilities connect to form the comprehensive data science approach, with each component contributing unique capabilities for transforming raw data into actionable insights.
The COVID-19 example demonstrates this integration clearly. Domain expertise helped epidemiologists understand disease transmission patterns. Mathematical and statistical knowledge enabled accurate modeling of virus spread and prediction of future trends. Programming skills automated the complex data collection needed to handle thousands of sources. Communication abilities made findings accessible to both scientists and world leaders making critical decisions.
Contemporary Relevance of Data Science
The COVID-19 pandemic highlighted the inadequacy of traditional data analysis approaches for contemporary challenges. Three key factors have created an urgent need for data science capabilities across all sectors of society.
Scale Challenge
Global data generation reaches 2.5 quintillion bytes daily, with 90% of the world’s data created in the last two years (Edgedelta, 2025). This explosion includes real-time health data from millions of patients, mobility patterns from billions of smartphones, social media sentiment from global populations, and economic indicators updating continuously. Traditional analysis methods, designed for smaller datasets and manual processing, cannot handle this volume, velocity, and variety of information.
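To put the daily figure cited above in more familiar units, a quick back-of-the-envelope conversion helps. The sketch below takes the 2.5 quintillion bytes per day at face value and assumes decimal (SI) units, where one terabyte is 10^12 bytes and one exabyte is 10^18 bytes.

```python
# Daily volume as cited in the text: 2.5 quintillion bytes.
BYTES_PER_DAY = 2.5e18
SECONDS_PER_DAY = 24 * 60 * 60

# Decimal (SI) unit sizes, an assumption of this sketch.
BYTES_PER_TERABYTE = 1e12
BYTES_PER_EXABYTE = 1e18

exabytes_per_day = BYTES_PER_DAY / BYTES_PER_EXABYTE
terabytes_per_second = BYTES_PER_DAY / SECONDS_PER_DAY / BYTES_PER_TERABYTE

print(f"{exabytes_per_day:.1f} exabytes per day")
print(f"{terabytes_per_second:.1f} terabytes per second")
```

Under these assumptions the cited volume works out to roughly 29 terabytes every second, which is why manual processing is not a realistic option.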
Competitive and Social Advantage
Organizations successfully harnessing data science consistently outperform those relying on intuition or traditional methods. Data-driven companies are 23 times more likely to acquire customers, 6 times more likely to retain them, and 19 times more likely to be profitable than their non-data-driven counterparts (McKinsey Global Institute, 2014). In critical areas like healthcare and public policy, data science applications often determine the difference between effective and ineffective interventions.
Complex, Interconnected Problems
Modern challenges exceed human cognitive capacity without systematic, computational approaches. Climate change involves atmospheric data, economic models, social behavior patterns, and political dynamics interacting simultaneously. Healthcare demands integration of genetic information, treatment outcomes, lifestyle factors, and population health trends. Urban planning requires coordination of transportation data, demographic shifts, environmental monitoring, and economic development patterns.
The Data Science Impact Framework
Data science creates value through four progressive levels of analysis, each building on the previous level to deliver increasingly sophisticated insights.
Descriptive Analytics
Descriptive analytics answers “What happened?” providing situational awareness through summary statistics, data visualization, and reporting. The COVID-19 dashboard showed current case counts, death rates, and geographic distributions, giving decision-makers clear pictures of pandemic status at any moment.
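A minimal sketch of descriptive analytics, using invented daily case counts for two hypothetical regions, might compute a few summary statistics per region. The data and the choice of statistics are assumptions made for illustration.

```python
from statistics import mean

# Hypothetical daily new-case counts per region (invented values).
DAILY_CASES = {
    "Region A": [10, 12, 15, 20, 18],
    "Region B": [3, 4, 4, 6, 8],
}


def describe(series):
    """Summarize one series: total, daily mean, and the peak day."""
    peak_value = max(series)
    return {
        "total": sum(series),
        "mean": mean(series),
        "peak_day": series.index(peak_value) + 1,  # 1-based day number
        "peak_value": peak_value,
    }


summary = {region: describe(series) for region, series in DAILY_CASES.items()}
for region, stats in summary.items():
    print(region, stats)
```

Nothing here explains or predicts anything. The output only states what the recorded data show, which is exactly the scope of the descriptive level.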
Diagnostic Analytics
Diagnostic analytics answers “Why did it happen?” moving beyond description to identify causes and relationships. Researchers analyzed why some regions had higher transmission rates than others, identifying critical factors like population density, mobility patterns, healthcare capacity, and policy interventions.
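One simple diagnostic tool is a correlation coefficient, which measures how closely two quantities move together. The sketch below computes the Pearson correlation between population density and a transmission index for five hypothetical regions. All values are invented, and a strong correlation alone does not establish a cause, so real diagnostic work would also need to account for other factors.

```python
from math import sqrt

# Hypothetical per-region observations (invented values).
DENSITY = [100, 200, 300, 400, 500]        # people per square kilometer
TRANSMISSION = [1.0, 1.3, 1.5, 1.9, 2.1]   # illustrative transmission index


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


r = pearson(DENSITY, TRANSMISSION)
print(f"correlation between density and transmission: {r:.3f}")
```

A value near 1 indicates that the two quantities rise together in this toy data set, which would mark density as a factor worth investigating further.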
Predictive Analytics
Predictive analytics answers “What will happen?” using historical patterns to forecast future conditions, enabling proactive decision-making. Models forecast infection peaks weeks in advance, helping hospitals prepare capacity, governments time interventions, and supply chains anticipate demand.
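As a deliberately simple stand-in for the forecasting models mentioned above, the following sketch fits a straight line to invented daily case counts by least squares and extrapolates two days ahead. Real epidemic forecasts rely on far more sophisticated models. The linear trend and the data are assumptions chosen only to show the idea of learning from past values to estimate future ones.

```python
# Hypothetical case counts for days 1 to 5 (invented values).
DAYS = [1, 2, 3, 4, 5]
CASES = [10, 14, 18, 22, 26]


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(DAYS, CASES)
forecast_day_7 = slope * 7 + intercept
print(f"estimated cases on day 7: {forecast_day_7:.1f}")
```

The forecast is only as good as the assumption that the past trend continues, which is why predictive models are re-fitted as new data arrive.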
Prescriptive Analytics
Prescriptive analytics answers “What should we do?” providing specific recommendations for optimal outcomes. Advanced models recommended optimal vaccination strategies, resource allocation policies, and intervention timing that would minimize both health impacts and economic disruption.
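Prescriptive analytics can be illustrated as a small optimization problem: given a limited budget, which combination of interventions is expected to avert the most cases? The sketch below checks every combination by brute force. The interventions, costs, and expected benefits are invented, and real prescriptive models handle far larger option sets and uncertain outcomes.

```python
from itertools import combinations

# Hypothetical options: (name, cost in budget units, expected cases averted).
OPTIONS = [
    ("mobile clinics", 4, 300),
    ("testing sites", 3, 200),
    ("public campaign", 2, 120),
    ("extra ICU beds", 5, 350),
]
BUDGET = 7


def best_plan(options, budget):
    """Return the affordable combination with the highest total benefit."""
    best_combo, best_benefit = (), 0
    for size in range(1, len(options) + 1):
        for combo in combinations(options, size):
            cost = sum(option[1] for option in combo)
            benefit = sum(option[2] for option in combo)
            if cost <= budget and benefit > best_benefit:
                best_combo, best_benefit = combo, benefit
    return best_combo, best_benefit


plan, benefit = best_plan(OPTIONS, BUDGET)
plan_names = [option[0] for option in plan]
print(plan_names, benefit)
```

Unlike the earlier levels, the output here is a recommendation for action rather than a description or a forecast.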
This progression from description to prescription represents data science’s unique value proposition: not just understanding what happened, but actively guiding what should happen next to achieve desired outcomes.
Figure 1.2.2: The four-level analytics progression demonstrating how data science creates value through descriptive, diagnostic, predictive, and prescriptive approaches. Each level builds upon the previous to deliver increasingly sophisticated insights that enable proactive decision-making.
Cross-Industry Applications
The principles demonstrated in pandemic response apply across virtually every industry and challenge. Netflix uses this same progression to recommend content: describing viewing patterns across millions of users, diagnosing preference drivers and viewing behaviors, predicting what individual users want to watch next, and prescribing an optimal content mix for personalized interfaces.
Financial institutions follow similar steps for fraud detection, moving from transaction monitoring through risk pattern analysis to predictive threat assessment and automatic intervention decisions. Manufacturing companies apply data science to predict equipment failures before they occur, optimizing maintenance schedules and preventing costly downtime.
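The fraud-detection progression described above can be sketched with a very simple rule: flag any new transaction whose amount lies far from a customer's usual spending. The amounts and the threshold of three standard deviations are assumptions made for illustration. Production systems combine many more signals than the amount alone.

```python
from statistics import mean, stdev

# Hypothetical past transaction amounts for one customer (invented values).
HISTORY = [42.0, 38.5, 51.0, 45.2, 40.3, 47.8]
# Hypothetical incoming transactions to screen.
NEW_TRANSACTIONS = [44.0, 49.5, 980.0]


def flag_unusual(history, new_amounts, threshold=3.0):
    """Return amounts more than `threshold` standard deviations from the mean."""
    typical = mean(history)
    spread = stdev(history)
    return [
        amount
        for amount in new_amounts
        if abs(amount - typical) / spread > threshold
    ]


flagged = flag_unusual(HISTORY, NEW_TRANSACTIONS)
print(flagged)
```

In this toy example only the 980.0 transaction is flagged. An automated system could then hold that transaction for review, which corresponds to the intervention step.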
Retailers analyze purchasing patterns to optimize inventory, ensuring popular items stay in stock while minimizing waste. Educational institutions personalize learning experiences based on student performance data, adapting content delivery to individual learning styles and pace. Urban planners use traffic flow data, demographic trends, and environmental monitoring to design more livable cities.
In each case, data science transforms raw information into intelligent action, creating value that wouldn’t be possible through traditional analytical approaches alone.
Key Concepts Summary
Data science represents a systematic approach to extracting knowledge from data through the integration of domain expertise, mathematical rigor, programming capability, and communication effectiveness. Its four-level analytical framework (descriptive, diagnostic, predictive, and prescriptive) enables organizations to move beyond simply understanding what happened to actively guiding optimal future outcomes.
The contemporary relevance of data science stems from unprecedented data scale, competitive advantages for data-driven organizations, and the complexity of modern interconnected problems that exceed traditional analytical approaches. Real-world applications across industries demonstrate data science’s capacity to transform raw information into intelligent action, creating value impossible through conventional methods.
References
Adhikari, A., DeNero, J., & Wagner, D. (2022). Computational and inferential thinking: The foundations of data science (2nd ed.). University of California, Berkeley. https://inferentialthinking.com/chapters/intro.html
Edgedelta. (2025, March 24). Data creation in 2024: Daily breakdown. https://edgedelta.com/company/blog/how-much-data-is-created-per-day
Johns Hopkins Engineering. (2022, September 29). COVID-19 Dashboard Creator Lauren Gardner wins Lasker-Bloomberg Public Service Award. Johns Hopkins Whiting School of Engineering. https://engineering.jhu.edu/news/covid-19-dashbaord-creator-lauren-gardner-wins-lasker-bloomberg-public-service-award/
McKinsey Global Institute. (2014). Five facts: How customer analytics boosts corporate performance. McKinsey & Company. https://www.mckinsey.com/capabilities/growth-marketing-and-sales/our-insights/five-facts-how-customer-analytics-boosts-corporate-performance
Timbers, T., Campbell, T., & Lee, M. (2024). Data science: A first introduction. University of British Columbia. https://datasciencebook.ca/intro.html