The Pandemic, Cardiology Training and Data Science
The Pandemic Endangers Training Goals For FITs
During the early months of the COVID-19 pandemic in the U.S., cardiology fellows in training (FITs) nationally faced competing pressures to continue their clinical training; "get their numbers" for COCATS certification; use their unique skills in procedural, diagnostic and critical care to help care for COVID-19 patients; and hone their skills as clinician investigators.
For many of us, it felt as though one or more of these goals could not be achieved in the short time allotted to us – balancing educational priorities, clinical service and a future career as an investigator.
However, for many FITs interested in a career in biomedical informatics, the pandemic in some ways created the perfect confluence of circumstances to learn data science – incredible examples of data used for good, free access to high quality data, and a perceived urgency to track and analyze data in real time.
Data Availability and Demand For Insights
March to June 2020 was a difficult time to be a doctor, but a fascinating time to be an investigator in training. As the press reported on the uncertainty surrounding the virus, its death rate, its mode of transmission, its spread in different countries, and the efficacy of various treatments, there was an urgent need to acquire the skills and data required to help in any way possible. The COVID Tracking Project began releasing data on testing domestically, and the Johns Hopkins COVID-19 Dashboard provided data internationally – building transparency and knowledge for press and researchers alike.
Data Journalism and Data Visualization
One source of inspiration for developing data science skills was the incredible work of data journalists, who demonstrated the power of data visualization for science communication. "Flatten the Curve" became an easy way for the public to understand incidence rate and its effect on resource utilization. The Financial Times Visual and Data Journalism Team (led by @jburnmurdoch) built a suite of interactive charts that served as inspiration for developing new visualization skills.
3 Inspiring Examples of Data Journalism:
- Harry Stevens at the Washington Post explains "Flatten The Curve"
- Our World In Data's Coronavirus Data Explorer
- The COVID Tracking Project's Racial Data Tracker
5 Considerations and Resources to Begin:
After four to six years of continuous clinical training, transitioning to research can be challenging, and figuring out where to begin is the first hurdle.
1. What is your goal? It is important to set both short- and long-term goals. For me, the short-term goal was to use descriptive statistics to show how the early surge in COVID-19 cases in the U.S. paralleled the early surge in cases in Italy. My longer-term goal was to gain enough familiarity with statistical programming to perform genomic analyses for academic publication.
2. Which software is best for you? There is a plethora of options, but among the most commonly used are SAS, Stata and R. For me, the most helpful thing was to ask my lab members. If you have a community of resources and fellow learners, you'll go much further. Also, sharing code in the same language within the same lab is extremely helpful! Personally, I use R, which has the added advantage of being free and easily accessible.
3. What courses or static resources are helpful? I will include a list of resources that I used below, but my favorite is the Harvard EdX Professional Certificate in R Data Science taught by Rafael Irizarry, PhD. He's a gifted teacher and communicator, and gets you coding quickly. Once you complete the second course on data visualization, you'll be up and running plotting charts with R's ggplot2 package!
4. Who will be your mentor or community? This is very important. Learning alone can be frustrating and difficult. Whether it's a message board in your online class, your lab mates, or students from a summer course, it is very important to find people to struggle with and pull you along. The great thing about fellowship is that likely more than one of your colleagues is in the same boat trying to learn to code. Hold each other accountable.
5. You never "know how to code." What I have learned from my more advanced colleagues is that there is never a state of "knowing how to code," because languages, packages and applications are updated constantly. There are basic structures, frameworks and syntaxes you learn early on, but much of data science after that comes down to a desire to keep learning – even when things seem a little too complicated!
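As a rough illustration of the short-term goal described in item 1 (comparing early surges in two countries), one common approach is to line up each country's cumulative case counts from the day they first cross a shared threshold, then compare simple descriptive statistics. The sketch below is only one way to do this and is not the author's actual analysis. It is written in Python rather than R purely for illustration; the function names, the threshold of 100 and the numbers are all hypothetical placeholders, not real case counts.

```python
from typing import Dict, List


def align_from_threshold(cumulative: List[int], threshold: int = 100) -> List[int]:
    """Return the series starting at the first day the count reaches `threshold`."""
    for day, count in enumerate(cumulative):
        if count >= threshold:
            return cumulative[day:]
    return []  # the threshold was never reached


def describe(aligned: List[int]) -> Dict[str, float]:
    """Number of days observed and the mean day-over-day growth ratio."""
    if len(aligned) < 2:
        return {"days": len(aligned), "mean_daily_growth": float("nan")}
    ratios = [later / earlier for earlier, later in zip(aligned, aligned[1:])]
    return {"days": len(aligned), "mean_daily_growth": sum(ratios) / len(ratios)}


# Placeholder values for illustration only; these are NOT real case counts.
toy = {
    "country_a": [10, 40, 120, 240, 480, 960],
    "country_b": [5, 20, 60, 110, 220, 440, 880],
}

# Align both series to "days since the 100th case" so the curves are comparable.
aligned = {name: align_from_threshold(series) for name, series in toy.items()}
for name, series in aligned.items():
    print(name, series, describe(series))
```

Aligning on a threshold instead of on calendar dates lets two outbreaks that started weeks apart be compared at the same stage. With real data, the same idea carries over to R with a grouped filter and a ggplot2 line chart.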
My Two Favorite Courses:
Other Interesting Courses:
- Harvard Catalyst Course in Applied Biostatistics
- Harvard EdX Course in Data Analysis for Genomics (I have not taken this)
Free Videos or References:
- Rafael Irizarry's Open Courseware for HarvardX Biomedical Data Sciences
- Broad Institute Primer on Medical and Population Genomics Video Series
Free Sources of High-Quality Data to Use For Your First Analysis:
- COVID Tracking Project
- Johns Hopkins GitHub with Dashboard Data
- NY Times COVID Data GitHub
- FiveThirtyEight Data Repository
- Pew Research Center Datasets
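Many of these sources publish cumulative counts by date, so a reasonable first exercise is converting cumulative totals into daily new cases. The sketch below shows one way that might look, written in Python for illustration (the same steps are short in R as well). The column names `date` and `cases`, the function name and the inline rows are assumptions and placeholders, so check the header of whichever file you actually download.

```python
import csv
import io

# Placeholder rows for illustration only; these are NOT real data.
# The column names `date` and `cases` are assumptions about the file layout.
SAMPLE_CSV = """date,cases
2020-03-01,10
2020-03-02,25
2020-03-03,45
2020-03-04,80
"""


def daily_new_cases(csv_text):
    """Convert cumulative counts into (date, new cases) pairs."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    rows.sort(key=lambda row: row["date"])  # ISO dates sort correctly as text
    result = []
    previous = None
    for row in rows:
        total = int(row["cases"])
        if previous is not None:
            # New cases are the change in the cumulative total since the prior day.
            result.append((row["date"], total - previous))
        previous = total
    return result


new_cases = daily_new_cases(SAMPLE_CSV)
for date, count in new_cases:
    print(date, count)
```

The first date is dropped because there is no earlier total to subtract from. Real files sometimes contain downward corrections, so negative daily values are worth checking for before plotting.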
Some Free Online Textbooks:
- Rafalab Textbook (From the Harvard EdX Course)
- R For Data Science
This content was developed independently from the content developed for ACC.org. This content was not reviewed by the American College of Cardiology (ACC) for medical accuracy and the content is provided on an "as is" basis. Inclusion on ACC.org does not constitute a guarantee or endorsement by the ACC and ACC makes no warranty that the content is accurate, complete or error-free. The content is not a substitute for personalized medical advice and is not intended to be used as the sole basis for making individualized medical or health-related decisions. Statements or opinions expressed in this content reflect the views of the authors and do not reflect the official policy of ACC.