VADSTI22

Virtual Applied Data Science Training Institute (VADSTI)

Data Science Approaches to Better Understand Clinical and Genomic Informatics
February 24 – April 22, 2022 | An 8-week data science training series in a virtual setting

About VADSTI

Advances in technology and computational tools have made it possible to generate and store large amounts of data in many disciplines, including public health, clinical and biomedical research, genomics, business, and economics. There is therefore increased demand for data analytics capabilities to examine trends, predict outcomes, and make better clinical, health policy, and economic decisions. Skill sets in data science are particularly critical for advancing the science of minority health and health disparities.

The Howard University Research Centers in Minority Institutions (RCMI) Program, with funding from NIH, is pleased to announce VADSTI 2.0 to the Howard University community. Our aim is to enhance data science capability and application by providing training in the foundations of programming and the critical data analytic skills for planning and conducting research that involves big data and is pertinent to minority issues. This free training series will cover topics including Foundations of Data Science, Introduction to Python, Statistical Concepts, Data Exploration and Visualization, and Predictive Analytics. Participants will receive a verified digital Certificate of Completion from VADSTI.

To register click the following link – https://vadsti22.eventbrite.com

For questions, contact John Kwagyan, PhD at jkwagyan@howard.edu or Stacey Gerald, MSc. at
stacey.gerald@howard.edu

Program Objectives & Competencies

The primary objective of the 2022 VADSTI program is to provide training in the foundations of
data science, advance analytic skills, and introduce tools for clinical and genomic research.
Over the course of the 8-week training program you will:

  • Be introduced to the principles of data science.
  • Be introduced to Python programming skills.
  • Gain practical, hands-on experience with Python and related libraries for accessing data from
    multiple sources.
  • Gain insight into the use of predictive analytics.
  • Learn about the underlying concepts of probability and statistics for data analytics.
  • Be introduced to advanced statistical analytic techniques utilized in biomedical, clinical, and
    genomic research.
  • Understand the concept of data partitioning and the practice behind supervised and
    unsupervised learning.
  • Be introduced to advanced algorithmic techniques including machine learning and deep
    learning.

Certificate of Completion: Participants who complete 6 of the 8 modules will receive a verified
Certificate of Completion from VADSTI.

Evaluation: At the end of each training module, you will be requested to complete electronic
feedback forms on the extent to which expectations and objectives were met.

Registration & Fees: No fees for participation, but registration is required to attend.

VADSTI Training Program Schedule

No prerequisites are required for research knowledge topics. Basic undergraduate knowledge of
algebra and probability is recommended for content knowledge topics. The training series
consists of the following modules.

Module 1
Foundations of Data Science


Thursday, February 24, & Friday, February 25, 2022
11:00 AM – 2:00 PM EST

Module 2a
Introduction to Python I


Thursday, March 3, & Friday, March 4, 2022
11:00 AM – 2:00 PM EST

Module 2b
Introduction to Python II


Thursday, March 17, & Friday, March 18, 2022
11:00 AM – 2:00 PM EST

Module 3a 
Probability, Random Variables and Statistical Inference

Thursday, March 24, & Friday, March 25, 2022
11:00 AM – 1:30 PM EST

 

Module 3b
Correlation and Regression Models

Thursday, March 31, & Friday, April 1, 2022
11:00 AM – 2:00 PM EST

 

Module 4
Data Exploration and Visualization

Thursday, April 7, & Friday, April 8, 2022
11:00 AM – 2:00 PM EST

 

Module 5a 
Predictive Analytics I


Thursday, April 14, & Friday, April 15, 2022
11:00 AM – 2:00 PM EST

Module 5b 
Predictive Analytics II 


Thursday, April 21, & Friday, April 22, 2022
11:00 AM – 2:00 PM EST

VADSTI Training Program Curriculum

Here are details for each of the modules.

Thursday, February 24, & Friday, February 25, 2022
11:00 AM – 2:00 PM EST
INSTRUCTOR – Prem Saggar, MS


This module will introduce you to the core principles of data science, Python programming, and associated libraries. You will learn how to use Jupyter notebooks and gain an understanding of what data science and AI can currently do. An overview of state-of-the-art methods will be presented, and real-life examples from clinical and healthcare data will be used for illustration.

 

Thursday, March 3, & Friday, March 4, 2022
11:00 AM – 2:00 PM EST
INSTRUCTOR – Moussa Doumbia, PhD


This introductory course will be your guide to learning how to set up the working environment and use the power of Python to analyze data, create data visualizations, and apply machine learning algorithms. In this module, you will be introduced to Python programming skills and the related libraries for accessing data from multiple sources. Topical areas will include:

  • Setting the working environment
  • Programming with Python
  • NumPy with Python
  • Using pandas DataFrames to solve complex tasks
  • Using pandas to handle Excel files

 

Thursday, March 17, & Friday, March 18, 2022
11:00 AM – 2:00 PM EST
INSTRUCTOR – Moussa Doumbia, PhD


Introduction to Python II builds upon Introduction to Python I. Topics will include:

  • Web scraping with Python
  • Using Matplotlib and seaborn for data visualizations
  • Using Plotly for interactive visualizations

 

Thursday, March 24, & Friday, March 25, 2022
11:00 AM – 2:00 PM EST
INSTRUCTOR – Paul Kohn, PhD

This module will introduce you to basic probability and statistical concepts. You will learn about different descriptive and inferential statistical techniques that are utilized in data science. You will learn about commonly used statistical distribution functions, including the Binomial, Poisson, and Normal distributions. You will learn the appropriate way to formulate hypothesis statements and select appropriate statistical techniques for testing.
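
As a small illustration of the three distributions named above, the following sketch evaluates the Binomial and Poisson probability mass functions and the Normal density directly from their textbook formulas. It uses only the Python standard library; the function names and the example parameters are illustrative assumptions, not material from the course.

```python
import math


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of Normal(mu, sigma^2) evaluated at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


if __name__ == "__main__":
    # Illustrative parameters only (assumed, not taken from the course).
    print("P(2 successes in 4 trials, p=0.5):", binomial_pmf(2, 4, 0.5))
    print("P(0 events, rate=2):", poisson_pmf(0, 2.0))
    print("Standard normal density at 0:", normal_pdf(0.0))
```

In practice, a library such as SciPy offers these distributions ready-made; the hand-written versions are shown only to make the formulas concrete.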

 

Thursday, March 31, & Friday, April 1, 2022
11:00 AM – 2:00 PM EST
INSTRUCTOR – Paul Kohn, PhD

This module will introduce you to statistical concepts. You will learn about different metrics for assessing correlation and regression models, including linear regression and logistic regression. You will learn the appropriate way to formulate hypothesis statements and select appropriate statistical techniques for testing.

 

Thursday, April 7, & Friday, April 8, 2022
11:00 AM – 2:00 PM EST
INSTRUCTOR – John Kwagyan, PhD

This module provides recipes for exploratory data analysis and data visualization, which are critical steps in any data science project. The goal of this module is to learn how to visualize and perform initial investigations of the data so as to discover patterns, spot anomalies, test hypotheses, and check assumptions with the help of summary statistics and graphical representations. We will use Python to explore, filter, and manipulate the UCI diabetes dataset; identify data anomalies and missingness; learn how to impute missing data; and identify highly correlated variables. We will also explore the Johns Hopkins University COVID-19 data repository, importing and wrangling the data to examine the number of reported confirmed cases by country and region and to plot the number of reported confirmed cases and deaths by country. In addition, we will use the COVID-19 tracking project dataset to explore racial disparities in COVID-19 mortality and infections in the US.
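
To give a flavor of the missing-data and correlation steps mentioned above, here is a minimal sketch using only the Python standard library. The two small lists are made-up values for illustration (they are not drawn from the UCI diabetes dataset), mean imputation is just one simple strategy among several, and the helper names are hypothetical. The module itself may well use pandas for these tasks.

```python
import math
from statistics import mean

# Hypothetical toy measurements; None marks a missing value.
glucose = [148, 85, None, 89, 137, None, 78]
bmi = [33.6, 26.6, 23.3, 28.1, 43.1, 25.6, 31.0]


def count_missing(values):
    """Number of missing (None) entries in a column."""
    return sum(v is None for v in values)


def impute_mean(values):
    """Replace each missing entry with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length columns."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


if __name__ == "__main__":
    print("Missing glucose values:", count_missing(glucose))
    glucose_filled = impute_mean(glucose)
    print("Correlation (glucose, BMI):", pearson_r(glucose_filled, bmi))
```

Note that mean imputation tends to shrink the variance of the imputed column, which can distort correlations; this is one reason the choice of imputation method deserves care.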
 

Thursday, April 14, & Friday, April 15, 2022
11:00 AM – 2:00 PM EST
INSTRUCTOR – Prem Saggar, MS

In this module, you will be introduced to predictive analytics using the latest technologies in Data Science and AI. Areas of discussion will include:

  • What are analytics?
  • How do we make predictions?
  • Examples of predictive analytics
  • Latest developments in Predictive Analytics

Thursday, April 21, & Friday, April 22, 2022
11:00 AM – 2:00 PM EST
INSTRUCTOR – Prem Saggar, MS

Predictive Analytics II builds upon Predictive Analytics I and has participants build a predictive analytical model. Participants will learn best practices and gain hands-on experience building predictive analytics models. Topical areas will include:

  • Working on a dataset to develop Predictive Analytics
  • Using the Foundations of Data Science to ensure accurate results
  • Interpreting results
  • Ethical considerations
  • Reaching your audience