Database Systems (SQL)

MSDS 420-DL Database Systems.

This course introduces data management and data preparation, with a focus on applications in large-scale analytics projects that use relational, document, graph, and graph-relational databases. Students learn about the relational model, the normalization process, and Structured Query Language (SQL). They learn about data cleaning and integration, and about database programming for extract, transform, and load (ETL) operations. Students work with unstructured data, indexing and scoring documents for effective and relevant responses to user queries. They learn about graph data models and query processing. Students write programs for data preparation and extraction using various data sources and file formats. Recommended: prior programming experience or 430-DL Python for Data Science. Prerequisites: None.

Database systems and query languages in this course:

  • Relational databases: PostgreSQL with Structured Query Language (SQL)
  • Document stores: Elasticsearch, indexing, and natural language queries
  • Vector databases: Milvus for scalable similarity search
  • Graph-relational databases: EdgeDB with EdgeQL and GraphQL

Primary Textbook

  • DeBarros, Anthony. 2022. Practical SQL: A Beginner’s Guide to Storytelling with Data (second edition). San Francisco: No Starch Press. [ISBN-13: 978-1-7185-0106-5]
