Introduction to SQL

A course covering SQL coding queries using Azure Data Studio. It includes reference to the IDI and useful coding, tying in well with the IDI intro course; SQL is also applicable in many other areas.

Dates

Monday 25 – Thursday 28 November (4 days ONLINE)

Fee structure

Full-time students
$800

Everyone else
$1,600

If you require an invoice for enrolment, or an internal journalling within the University of Auckland, please email us at nzssncourses@auckland.ac.nz. For other payment options, please click below as appropriate.

Instructor

Daniel Fryer
Daniel Fryer

Course outline

This course is a gentle, fast-paced introduction to SQL. Our objective is to build a strong foundation and intuition for SQL programming, with an emphasis on building and testing reliable SQL queries. The course is suitable for beginners.

We use Microsoft's flavour of SQL (called Transact-SQL), but all of the fundamentals, which are core SQL standards, are transferrable to all other flavours of SQL, such as MySQL, PostgreSQL, Oracle SQL, and so on. Having said that, we will learn a number of things that are unique to Transact-SQL.

The course is accompanied by an extensive set of notes and exercises. Day four is an optional extension for those wishing to become acquainted with the R programming language, its functionality for working with SQL, and some of the R libraries that mimic the SQL way of thinking about data.

Day one

Day one starts with a short introduction to database technologies, covering the basic structure of a database and the usefulness of the SQL relational database model in general. We learn about SQL tables, relationships between tables, and primary/foreign key pairs.

We then connect to a remote SQL database in the Azure Data Studio development environment, and dive into SQL coding, beginning with easy examples and the most useful data manipulation queries. After lunch, we learn more of the fundamentals: search conditions, subqueries, and joining tables. The day ends with many exercises for practice and revision.

Day two

Day two is a chance to apply, and build on, our day 1 knowledge. We start by learning how to read and interpret the Transact-SQL documentation. We then cover the syntax and intuition for filtering and aggregating on grouped data.

The queries we can build become increasingly complex, so it's time to learn some skills for iterative development: we will use the SQL data definition language to create our own test tables, allowing us to design our own databases, and to use them safely test complicated queries as we write them. The day ends with many more exercises, to apply what we have learned and deal with common challenges.

Day three

Day three gets our hands dirty with a large realistic dataset, split between multiple tables and schemas, with a data dictionary, and having a few of the complications that come with real-world datasets. We will use the New Zealand Integrated Data Infrastructure (IDI) in our examples, gaining some tips and familiarity for working with the IDI.

We'll get experience making sense of, and simplifying, large messy SQL queries written by other people. After updating these queries, we'll need to test them to make sure they return what we expect. We will employ our day two skills to build effective testing strategies for writing robust queries. Along the way, we’ll practice investigating new clauses and functions, found in the T-SQL documentation.

Day four

Day four suits those wanting to learn the basics of the R programming language, and its SQL ecosystem. We run through a brief introduction to programming in R, and then learn how to make SQL requests from R to a remote SQL server. These requests allow us to retrieve very large datasets one small piece at a time, and to process the data sequentially.

We will also run a brief introduction to the Tidyverse R metapackage, which includes a range of R functions that work very similarly to the SQL functions that we learnt about on days one and two.

We’ll walk away with the realisation that SQL has introduced us to a way of thinking about data that is used in many programming languages, not just SQL. Time permitting, we will then look at using R with a different flavour of SQL, called SQLite, which is the most widely deployed database engine in the world today!