Data Science Projects

Introduction

Welcome to my data science projects page! Practice is an important part of my ongoing growth as a data scientist. My projects ask interesting questions of real open datasets in order to derive, visualise and present potentially useful insights and possible solutions.

This section of the site lets me explore a variety of interesting and useful data science techniques and apply them to realistic scenarios. Please check out this overview post if you want a bit more background. Otherwise, feel free to browse the index below, which summarises my growing list of projects, to get a feel for what I have been exploring to date.

Thanks for stopping by... I hope that you will find inspiration and enjoyment here :smile:!

Project Index

Title: Data Science Mini-project: Interactive App with Shiny

Description:

This mini-project was an assignment for the Developing Data Products MOOC on Coursera. The aim was to create a complex and useful interactive Shiny app that illustrates the fundamental principles of assessing regression models, using a simple dataset. The inspiration came from the concepts I had studied in the Regression Models MOOC, which I had taken shortly beforehand.

Title: Data Science Mini-project: Machine Learning

Description:

This mini-project was an assignment for the Practical Machine Learning MOOC on Coursera. The focus of this assignment was on the application of machine learning processes and techniques to a reasonably challenging dataset. The challenge was to predict the type and quality of an exercise activity using sensor data.

Title: Healthy Air

Description:

This is a project that aims to explore respiratory health, focusing on asthma, in the context of potential factors that may explain how it changes over time. Specifically, I am interested in how the prevalence and severity of asthma are potentially affected by factors such as:

  • pollution sources and emission types
  • demographic characteristics
  • meteorological factors

This is an end-to-end project, which means that its scope covers the entire data science process, including important activities such as:

  • Data wrangling: Raw data acquisition, processing and management
  • Data exploration and feature selection
  • Data modelling and statistical inference
  • Predictive analytics via machine learning and/or data modelling to leverage significant patterns and trends identified within the datasets
  • Creation of Data Products: reproducible reports and data visualisations