Foundations of Data Science
Dr. Josh Vandenbrink brings his interactive, skills-based, fun approach to this Praxis Foundations of Data Science course. Using RStudio, students will discover several data visualization and graphing techniques, learn how to explore and analyze data using various methods, mine Twitter data for sentiment analysis, and gain practice using interactive Shiny apps. At the end of this course, each student will be much better prepared for a life awash in data.
Students will explore how R impacts data science and learn to create stunning visual representations of data. Using AI learning resources and labs, they will learn to mine their own data from social media networks such as Twitter and Facebook to perform sentiment analysis of public health topics. In addition, students will learn advanced techniques for reproducible research, which will allow them to build apps from the work they have conducted throughout the class. These apps will allow end users to manipulate data and change how it is presented. Qualifies for the Foundations of Data Science digital badge.
The online program is available 24x7x365 via any web browser or mobile device and includes five (5) learning paths, 50+ video lectures, and over fifty (50) hours of learning material. Below is a list of the main topics:
- Data Visualization
- Exploratory Data Analysis
- Sentiment Analysis
- Final Project
This class endeavors to answer the question, “What good is science if you can’t convey results to the public?” During the pandemic, citizens were inundated with public health data related to the virus, including maps, public health statistics, articles, and news reports. The quality of the data presentation ranged from outstanding to downright awful and confusing. This course starts with the creation of static graphs using the ggplot2 and plotly packages, which create fully customizable and professional-looking charts. Students will also learn to “clean” their data through outlier detection and culling of missing data points.
In addition to being inundated with public health data during the pandemic, we were also inundated with people’s opinions and reactions to the crisis. People went to social media in droves (since many were in quarantine) to tout the vaccine, suggest alternative medicines, and debate the efficacy of lockdowns, among many other topics. Understanding the public’s response to public health data is a valuable component of data presentation. Thus, students will leverage the ability of R to perform sentiment analysis of social media posts on various topics related to the COVID-19 pandemic. Lastly, students will integrate all of these data presentation techniques to create interactive apps with R Shiny. These apps allow the presentation of complex or large datasets and allow end-user interaction.
Skills and Resources
Earners of the Foundations of Data Science credential have successfully demonstrated experiential skills in the creation of polished data products through advanced data visualization techniques. The Foundations of Data Science badge requires 50+ hours of hands-on activities and labs across 10+ skills in data science. The Foundations of Data Science credential was built in collaboration with data science expert Dr. Josh Vandenbrink.
The following is a summary of the earning criteria for the Foundations of Data Science digital credential:
- Complete 15+ hands-on data science labs using RStudio and live computing systems
- AND – Complete all required learning resources in the Foundations of Data Science online journey (80+ lessons), including videos, articles, activities, and discussion posts
- AND – Pass short assessments (80% or better) in all lessons
- AND – Participate in weekly virtual collaboration sessions with instructor(s), mentor(s), and peers
- AND – Conduct original data science research using the skills, resources, and tools within the Foundations of Data Science journey