The Value of Experience

Experience counts, especially in a fast changing world

Covid-19 Analysis with Python

Using publicly available data along with Python, Plotly, Plotly Dash and the Heroku cloud hosting service, I built a website where, with one click, I can see the information that is pertinent to me. The website can be accessed at https://arkletonanalysis.herokuapp.com/. It may take up to twenty seconds for the service to wake up, so be patient.

I enjoy learning new things and analyzing trends to better understand what is behind the story. When I first retired, I decided to learn the Python programming language and a bit about Data Science and Machine Learning. I did get a handle on Python and a better understanding of Data Science. However, after a few small projects, I felt that I needed a bigger project on which to test my knowledge.

When the “shelter in place” protocol was put in place, whether self-imposed or mandated, I had a good deal of free time to pick up Python again and analyze the available Covid-19 data for my county and state. My ultimate goal was to create a “system” where I could get the information that I wanted with just one click.

I started out in small steps, learning different tools and platforms along the way. The very first tool was Google’s Colaboratory, which provides a hosted web environment for Jupyter Notebook. This is a great platform, as you do not have to configure your own environment and you can access it from any device. I did most of my prototyping here and was able to access and run my notebooks with my iPad and a web browser.

I wanted to visualize the trend of the Covid-19 data for my county and state. Here is where Plotly and Plotly Dash came into the picture. In my first prototypes I was using Matplotlib, but I wanted a bit more interactivity. I found several videos explaining the Plotly ecosystem with some great examples. The videos were good learning tools and helped me dig deeper and deeper into the Python/Plotly experience. Finally, I was able to build my simple app and host it for free on Heroku. Now that I had my app hosted, I could share my work with friends and neighbors, who were also interested in a simple way to access the local data.

Resources

The benefit of my approach is that I used mostly open source tools which included:

  • Anaconda Python as the main Python distribution
  • Jupyter and Jupyter Lab, used for prototyping
  • Jupyter Lab – Plotly Chart Editor for tweaking the Plotly charts
  • Google’s Colaboratory, also for initial prototyping
  • Plotly and Plotly Dash for data visualization
  • Heroku as the hosting environment
  • PyCharm as my primary IDE
  • Git and GitHub for version control

Data

I used the publicly available datasets from Johns Hopkins and the daily CSV files from The Covid Tracking Project. The Python code goes out and reads the most current data every time it runs. The data is updated nightly. I verified the data against the official data published by our state, so I knew that the data was good and accurate.
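The "read the latest file on every run" step can be sketched with nothing but the standard library. This is an illustration under stated assumptions, not the site's actual code: the URL is a placeholder, the column names `date`, `state` and `positive` are assumed, dates are assumed to be in `YYYYMMDD` form, and the sample values are made up.

```python
import csv
import io
from urllib.request import urlopen

# Hypothetical placeholder; substitute the real address of the daily CSV file.
DAILY_CSV_URL = "https://example.com/states/daily.csv"


def fetch_csv_text(url=DAILY_CSV_URL):
    """Download the CSV file and return its contents as text."""
    with urlopen(url, timeout=30) as response:
        return response.read().decode("utf-8")


def rows_for_state(csv_text, state):
    """Parse CSV text and return one state's rows, oldest first.

    Assumes 'state' and 'date' columns, with dates written as YYYYMMDD
    so that sorting the strings also sorts them chronologically.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [row for row in reader if row["state"] == state]
    return sorted(rows, key=lambda row: row["date"])


# Small inline sample with made-up values, so the parsing step can be
# tried without a network call.
SAMPLE = """date,state,positive
20200721,GA,100
20200720,GA,90
20200721,FL,500
"""

georgia = rows_for_state(SAMPLE, "GA")
print([row["date"] for row in georgia])
```

In a real run, `rows_for_state(fetch_csv_text(), "GA")` would replace the inline sample, so each run picks up whatever the nightly update published.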

YouTube as a Learning Environment

I used YouTube extensively to learn all of the new tools and environments. Many of the video creators provide examples and let you view their code on GitHub. One of the best content creators is Adam Schroeder. Adam has a great style and explains everything that you need to use Plotly and Plotly Dash. His YouTube channel is called Charming Data.

Conclusion

I now feel that I have the rudimentary skills to access and analyze data using Python and to generate some interesting insights through data visualization using Plotly and Plotly Dash. However, I am still a hack and my code is not elegant by any means. One of my biggest hurdles was learning or re-learning some of the different platforms, be it the PyCharm IDE, GitHub or Google Colaboratory. This learning experience was a lot of fun and kept me out of my wife’s hair as we remained mostly in our house for the last several months. If there is interest, I could share my code on GitHub once I clean it up a bit.

I have also done some additional analyses that let me look at the Covid-19 data by any state or any country. With Adam’s help, I used Plotly Dash tables to select the data that I would subsequently analyze.

The next step may be to do some statistical analysis of the data to see if there is a correlation between certain values. I would love to have more detailed, anonymized patient data to look at how the disease manifests itself.
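As a starting point for that kind of analysis, a Pearson correlation coefficient between two columns can be computed in a few lines. This is only a sketch: the function name and the two series (daily tests and daily positives) are hypothetical, and the numbers are made up.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of the same length, at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of cross-products and squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Made-up values for illustration only.
daily_tests = [100, 200, 300, 400]
daily_positives = [10, 20, 30, 40]
r = pearson_r(daily_tests, daily_positives)
print(round(r, 3))
```

A value near 1 or -1 indicates a strong linear relationship and a value near 0 a weak one. A high value by itself does not show that one quantity drives the other.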

A Plotly chart showing the data for Georgia.
Forsyth County Data

This entry was posted on July 22, 2020 in Data Science, Python.