MSDS 436 Analytics Systems Engineering

Course Description

This course introduces design principles and best practices for implementing large-scale systems for data ingestion, processing, storage, and analytics. Students learn about cloud-based computing, including infrastructure-, platform-, software-, and database-as-a-service systems for data science. They evaluate system performance and resource utilization in batch, interactive, and streaming environments. They create and run performance benchmarks comparing browser-based and desktop applications. They evaluate key-value stores, relational, document, graph, and graph-relational databases.

Recommended prior course: MSDS 430-DL Python for Data Science or MSDS 431-DL Data Engineering with Go.

Prerequisites: (1) MSDS 420-DL Database Systems or CIS 417 Database Systems Design and Implementation and (2) MSDS 422-DL Practical Machine Learning or CIS 435 Practical Data Science Using Machine Learning.

Students benefit by taking the Go Learning Studio and MSDS 431 Data Engineering with Go before taking this course.

What is required of students? Students participate in weekly discussion forums and programming assignments. They work extensively with the Data Science Computing Cluster, a group of Linux-based systems at Northwestern University.

Primary Textbooks

  • Gerardi, Ricardo. 2021. Powerful Command-Line Applications in Go: Build Fast and Maintainable Tools. Raleigh, NC: The Pragmatic Bookshelf. [ISBN-13: 978-1-68050-696-9]

  • Huyen, Chip. 2023. Designing Machine Learning Systems: An Iterative Process for Production-Ready Applications. Sebastopol, CA: O’Reilly. [ISBN-13: 978-1098107963]

  • Shotts, William. 2019. The Linux Command Line: A Complete Introduction (second edition). San Francisco: No Starch Press. [ISBN-13: 978-1593279523]