Data Science

Overview

Data Science is an interdisciplinary field that employs methods and theories drawn from statistics, computer science (computer programming, databases, machine learning) and mathematics (calculus, probability, linear algebra) to extract insights from data, with special emphasis on big data. Machine learning is a branch of artificial intelligence concerned with the study and development of algorithms and statistical models that allow computers to learn patterns from data automatically, without being explicitly programmed. The field of Data Science encompasses topics such as exploratory data analysis, statistical inference, regression analysis, machine learning, cluster analysis, data wrangling, data mining, and data visualization.

Big data can be found in almost every sector of society, from business and industry to healthcare, education, and government. This creates a need to train individuals who can work with and analyze massive amounts of data to help organizations make informed decisions. The Data Science program at Saint Joseph’s University is an interdisciplinary program jointly administered by the Departments of Mathematics, Computer Science and Decision & System Sciences. The program offers a Major and a Minor in Data Science, which include electives not just in Mathematics, Computer Science and Decision & System Sciences but also in other departments such as Economics and Biology.

Program Faculty

The required courses in the Data Science major and minor are taught by faculty from the Departments of Mathematics, Computer Science and Decision & System Sciences. Courses that count for the Data Science electives are taught not just by faculty in the three departments that administer the program but also by faculty in other departments such as Economics and Biology.

Director

Dr. Rommel Regis

Advisory Board

Dr. Babak Forouraghi

Dr. Wei Chang

Dr. Ginny Miori

Dr. Abolfazl Saghafi

Dr. Baha Taoufik

Undergraduate Major

Data Science

Undergraduate Minor

Data Science

DSC 223 Intro Math of Data Science (3 credits)

This course provides an introduction to the basic mathematical topics needed to understand modern areas of applied and theoretical mathematics, including the rapidly growing field of data science. It includes elementary set theory and counting techniques, discrete probability, descriptive statistics, simple linear regression, basic inferential statistics, and an introduction to linear algebra. This course will also cover some basic proof techniques in elementary set theory, combinatorics, discrete probability and linear algebra.

Prerequisites: MAT 155 and MAT 161

Attributes: Math Beauty, Undergraduate

DSC 325 Essentials of Data Science (3 credits)

This course covers the basic topics in data science. It includes descriptive and inferential statistics, introduction to simple and multiple regression, data visualization, and data cleaning or scrubbing. It also includes an introduction to machine learning topics such as decision trees, k-nearest neighbors, neural networks and clustering. The R software or the Python programming language will be used to visualize and analyze datasets.

Prerequisites: MAT 223 or DSC 223

Attributes: Math Beauty, Undergraduate

DSC 326 Advanced Data Science (3 credits)

This course covers some advanced topics in data science, including recent tools for performing predictive analytics, data visualization, data wrangling, statistical inference, deep machine learning, and software engineering. Various software packages, including TensorFlow, will be used to build predictive models. Whenever appropriate, the mathematical background of predictive models will be covered. One of the main goals is to introduce students to the most important aspects of data science while reinforcing the practice of writing efficient code, testing, and debugging when working with large software systems. The course includes several programming projects in Python and/or R.

Prerequisites: DSC 325 or CSC 346

Attributes: Math Beauty, Undergraduate

DSC 425 Machine Learning/Data Science (3 credits)

This course provides an introduction to the fields of Machine Learning, Data Science and Predictive Analytics. It includes linear regression, logistic regression, nearest neighbor methods, decision trees, neural networks, clustering, principal components analysis, and resampling methods such as cross-validation and bootstrapping. If time permits, it will also include support vector machines, deep learning methods, and machine learning methods for numerical optimization such as genetic and evolutionary algorithms and swarm intelligence algorithms. The R software will be used to apply statistical and machine learning methods to real data sets. Whenever appropriate, the mathematical background of machine learning methods will be covered. Students will be required to work on a final data analysis project and present their findings in class. This course and MAT 424 (Regression and Time Series) together cover the topics in the SOA (Society of Actuaries) exam in SRM (Statistics for Risk Modeling) and provide an introduction to the PA (Predictive Analytics) exam. In addition, this course and MAT 424 cover several topics in the CAS (Casualty Actuarial Society) exams in MAS (Modern Actuarial Statistics) I and II.

Prerequisites: MAT 223 or DSC 223

Attributes: Math Beauty, Undergraduate