Virtual Applied Data Science Training Institute (VADSTI)

Data Science Approaches to Better Understand Clinical and Genomic Informatics



With recent advances in technology and computational tools, healthcare services and the clinical and genomic sciences can now store large amounts of data. There is therefore increased demand for researchers to use data analytics to examine recent trends, predict outcomes, and make better clinical and health policy decisions. Skill sets in data science are critical for advancing the science of minority health and health disparities. The Howard University Research Centers in Minority Institutions (RCMI) Program, with funding from NIH, created VADSTI to meet the growing demand for data science and its application to problems of minority health and health disparities.

The mission of VADSTI is to advance education and research by providing training in the foundations of programming and the critical data analytic skills needed to plan and conduct research that involves big data. Our aim is to attract and engage underrepresented students and researchers in the application of data science to biomedical, clinical, and genomic research, with a focus on diseases common to minority populations. VADSTI draws faculty with complementary expertise in the conduct and application of data science from across different institutions, in partnership with the NIH Office of Data Science Strategy, to deliver a comprehensive 8-week training in a virtual environment. VADSTI 2021 was an 8-week training series that ran every other week.

Program Objectives & Competencies

The primary objective of the 2021 VADSTI program was to provide training in the foundations of data science and advanced analytic skills, and to introduce tools for clinical and genomic research. Over the course of the 8-week training program, participants:

  • Were introduced to the principles of data science.
  • Gained practical, hands-on experience with Python and related libraries for accessing data from multiple sources and applying analytic methods.
  • Learned about the underlying concepts of probability and statistics.
  • Were introduced to advanced statistical analytic techniques used in biomedical, clinical, and genomic research.
  • Learned the concepts of data partitioning and the practice behind supervised and unsupervised learning.
  • Were introduced to advanced algorithmic techniques, including machine learning and deep learning.
  • Were introduced to tools for applied data science using cloud-based platforms for clinical and genomic research.
  • Learned from experts about current research topics in biomedical, clinical, and genomic applications.