Data Science Short Courses
Short Courses are events sponsored by the Data Science Initiative that enable people from across campus to learn about data analysis software.
11/09/18 An Introduction to Julia
This workshop aims to introduce both users of scripting languages and advanced programmers to the Julia ecosystem, and to explore details of the Julia v1.0 language that can help produce efficient and readable code.
The goal of the workshop is for students to understand where Julia can be applied and be well-equipped to start using Julia in their own research. Students will learn about the current state of Julia development (IDEs, documentation, where to get help), write efficient code by understanding some of Julia’s internals via small projects, solve problems using advanced Julia features (metaprogramming, multiple dispatch, etc.), and learn workarounds to common issues newcomers face (scoping problems, type conversions, etc.).
11/02/18 Topics in R
Prerequisites: 1) familiarity with basic statistical concepts, and 2) intermediate R programming knowledge. For the tutorial, bring a WiFi-capable laptop with R downloaded and installed.
10/26/18 Intro to Linux on the HPC
This course covers how to best exploit the bash shell for both interactive work and batch jobs, moving and simple manipulation of data, as well as very short introductions to programming in bash, Perl, and R. This is not computer science; this is a driver’s license.
Intro to R and Data Visualization in R with ggplot
Intro to R:
In this session, students are introduced to R: data types, functions, and basic data manipulation, including some exploratory data analysis and how to perform statistical tests.
Data Visualization in R with ggplot:
In this part of the workshop, students will learn the basic commands to create statistical plots, understand the grammar of graphics behind ggplot, and master how to create more sophisticated data visualizations through hands-on exercises on real data sets.
Predictive Modeling with Python
Python is a popular language for scientific computing and machine learning. This course introduces general modeling concepts alongside concrete examples based on the scikit-learn library. Example usage of scikit-learn illustrates how to fit and evaluate predictive models. Both regression and classification settings will be considered. The course is taught mostly through IPython notebooks.
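As a rough illustration of the fit-and-evaluate workflow this course describes, here is a minimal regression sketch. It is not course material and it deliberately uses only the Python standard library rather than scikit-learn; the helper names (fit_line, predict, mean_squared_error) and the synthetic data are assumptions made for this example.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for the model y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, xs):
    """Apply a fitted (slope, intercept) pair to new inputs."""
    slope, intercept = model
    return [slope * x + intercept for x in xs]


def mean_squared_error(y_true, y_pred):
    """Average squared difference between observed and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Synthetic data (an assumption for illustration): y = 2x + 1 plus Gaussian noise.
rng = random.Random(0)
xs = [i / 10 for i in range(100)]
ys = [2.0 * x + 1.0 + rng.gauss(0, 0.5) for x in xs]

# Hold out 20% of the points so the model is evaluated on data it never saw.
indices = list(range(len(xs)))
rng.shuffle(indices)
split = int(0.8 * len(indices))
train_idx, test_idx = indices[:split], indices[split:]

x_train = [xs[i] for i in train_idx]
y_train = [ys[i] for i in train_idx]
x_test = [xs[i] for i in test_idx]
y_test = [ys[i] for i in test_idx]

model = fit_line(x_train, y_train)
test_mse = mean_squared_error(y_test, predict(model, x_test))

print("slope, intercept:", model)
print("held-out mean squared error:", test_mse)
```

The same three steps (split the data, fit on the training part, score on the held-out part) carry over to library-based workflows, where the fitting and scoring are handled by the library's own estimators and metrics.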
Introduction to Linux Short Course
This course is for researchers who have never used Linux and/or a compute cluster, and it introduces concepts and best practices for both. This course covers how to best exploit the bash shell for both interactive work and batch jobs, moving and simple manipulation of data, as well as very short introductions to programming in bash, Perl, and R. This is not computer science; this is a driver’s license.
Analyzing Data/BigData on Linux
This course covers using foreign data formats on Linux, stream processing, using efficient and appropriate file formats, considerations for simple parallel processing, an introduction to different families of applications, and dealing with Big Data sets.
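To give a sense of what stream processing means here, the following is a small sketch, not taken from the course. It reads tab-separated records one line at a time and keeps only running totals, so memory use does not grow with the size of the input. The function name stream_totals and the two-column key/value layout are assumptions made for this example.

```python
import io
from collections import defaultdict


def stream_totals(lines):
    """Sum the numeric second column per key, reading one line at a time."""
    totals = defaultdict(float)
    for line in lines:
        line = line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        key, value = line.split("\t")
        totals[key] += float(value)
    return dict(totals)


# An in-memory stand-in for a large file; any iterable of lines works,
# including an open file object or sys.stdin in a shell pipeline.
sample = io.StringIO("# key\tvalue\na\t1.5\nb\t2.0\na\t0.5\n")
totals = stream_totals(sample)
print(totals)
```

Because the function only iterates over its input, the same code handles a file far larger than available memory when given an open file handle instead of the in-memory sample.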
Introduction to R
This course provides an introduction to the fundamentals of the R language.
In this course, students learn how to program in R and how to use R effectively to analyze data. The course covers an introduction to data/object types in R, reading data into R, creating data graphics, accessing and installing R packages, writing R functions, fitting statistical models including regression models, and performing statistical tests such as the t-test and ANOVA. Practical examples are provided during the course.
Software Carpentry Workshop
This hands-on workshop developed by the Software Carpentry Foundation covers basic concepts and tools, including program design, programming in Python, version control and task automation in the Unix shell. Software Carpentry’s mission is to help graduate students get more research done in less time and with less pain by teaching them basic lab skills for scientific computing.