Study Guide 2023+

sql

Warning: These notes are partial, ongoing, incomplete, and may contain typos/inaccuracies. (They are kept factually accurate, time permitting.)

They are being united from many disparate notes created in the past and the layout/organization will gradually improve with time!

Please view them on a computer as they are not optimized for mobile (although you can still view them on Mobile along with the Flashcards at your own risk)!

Topics and code examples are lazy-loaded and may require two-clicks from the TOC to correctly calculate the updated x,y coordinates (after rendering). Thanks!

SQL: General Concepts

NOTE: A variety of SQL dialects are used in the examples below (MYSQL, Postgres, MSSQL, and SQLite)

CRUD

  1. CREATE - a resource is persisted and saved.
  2. READ - a persisted resource is scanned, read, or returned in an unmodified way.
  3. UPDATE - a persisted resource is updated.
  4. DELETE - a persisted resource is permanently or softly removed.

Corresponding Basic Operations

CREATE

CREATE statements are typified by the INSERT (Row) or CREATE (Table) keywords.

INSERT INTO PaySchedule (date, name) VALUES ('2023-02-15', 'frank');
DROP TABLE IF EXISTS "Rules";

CREATE TABLE IF NOT EXISTS "Rules" (
    "id"    INTEGER NOT NULL UNIQUE,
    "description"    TEXT,
    "category"    TEXT,
    "rule"    TEXT,
    PRIMARY KEY("id" AUTOINCREMENT)
);

READ

READ statements are typified by the presence of the SELECT keyword.

SELECT * FROM Accounts WHERE active =  1 AND status = 'Active';

SELECT * FROM Hobbies;

SELECT x.newhome + y.newhome + z.newhome + w.newhome + q.newhome
FROM 
  (SELECT SUM(unitcost * quantity) as newhome FROM House WHERE quantity > 0 AND fha = 1) AS x,
  (SELECT SUM(unitcost * quantity) as newhome FROM UpkeepHobbies WHERE quantity > 0 AND fha = 1) AS y,
  (SELECT SUM(unitcost * quantity) as newhome FROM UpkeepClothing WHERE quantity > 0 AND fha = 1) AS z,
  (SELECT SUM(unitcost * quantity) as newhome FROM UpkeepElectronics WHERE quantity > 0 AND fha = 1) AS w,
  (SELECT SUM(unitcost * quantity) as newhome FROM UpkeepKitchen WHERE quantity > 0 AND fha = 1) AS q;

SELECT * FROM MonthlyCosts WHERE choicegroup LIKE "%ALL%";

SELECT COUNT(*) FROM (SELECT grouping FROM MonthlyCostsGroceries GROUP BY grouping);

SELECT y.total / (x.mtbf * 12) AS upkeep FROM 
  (SELECT AVG(mtbf) AS mtbf FROM Upkeep WHERE quantity > 0 AND choicegroup LIKE "%ALL%") AS x, 
  (SELECT SUM(unitcost * quantity) AS total FROM Upkeep WHERE choicegroup LIKE "%ALL%") AS y;

UPDATE

UPDATE statements are typified by the presence of the UPDATE (ROW) or ALTER (TABLE) keyword.

UPDATE Assets SET value = 400 WHERE id = 4 AND account = 8;

UPDATE Demographics SET personal = 750, success = 1, updated = '2020-01-01' WHERE id = 100;
ALTER TABLE example ADD more_text VARCHAR(45);

DELETE

DELETE statements are typified by the presence of the DELETE keyword.

DELETE FROM PaySchedule WHERE date < '2023-02-15' OR date = '2024-02-15';

Code samples:

  1. https://github.com/Thoughtscript/pg_sql_2025
  2. https://github.com/Thoughtscript/dotnet_2025
  3. https://leetcode.com/problem-list/database/
  4. https://www.hackerrank.com/domains/sql?badge_type=sql
  5. https://github.com/Thoughtscript/cockroachdb-kotlin-client
  6. https://github.com/Thoughtscript/java_stuff/tree/master/Spring-MVC-Security
  7. https://github.com/Thoughtscript/java_spring_boot_2024/blob/main/springboot/src/main/java/io/thoughtscript/bootexample/repositories/ExampleJpaRepository.java
  8. https://github.com/Thoughtscript/mysql_2025

SQL: ACID

Atomicity guarantees that operations are completely contained in a single unit (Transaction). It either succeeds completely or fails (such that all operations fail – it is not partially successful).

Consistency guarantees that each Transaction moves the database state from one valid, legal, state to another valid, legal, state.

Isolation guarantees that concurrent Transactions are disjoint and separate, and that they don’t overwrite each other.

Durability guarantees that persisted data remains persisted across time and system failure/disaster.

SQL: Joins

Retrieving data that's associated by field.

OUTER - matches and NULL values where absent depending on the specifics of the JOIN.

INNER - only what's matched in both Tables.

LEFT, RIGHT, INNER, FULL

  1. LEFT (OUTER) JOIN: Returns all records from the left Table, the matched records from the right Table, and NULL in any Row absent from the right Table .
  2. RIGHT (OUTER) JOIN: Returns all records from the right Table, the matched records from the left Table, and NULL in any Row absent from the left Table.
  3. (INNER) JOIN: Returns records that have matching values in both Tables. Equivalent to: SELECT * FROM table_one, table_two WHERE table_one.id = table_two.id
  4. FULL (OUTER) JOIN: Returns all records when there is a match in either left or right Table, NULL in any Row absent from one of the two Tables.
-- Multi JOIN
SELECT a.name as aname, b.description as name, a.description as adesc, s.name as description, b.value 
FROM Balances AS b 
LEFT JOIN Accounts AS a ON b.account = a.id 
JOIN SummaryTypes AS s ON b.purpose = s.id 
WHERE a.active = 1 AND a.status='Active' AND b.active = 1 
ORDER BY value DESC;

-- INNER JOIN
SELECT * 
FROM example AS e
INNER JOIN employee AS em ON e.id = em.id;

CROSS

  1. Produces the Cartesian Product of two Tables and their Rows.
  2. Every Row in Table A is combined with every Row in Table B.

Self Joins

A JOIN between a Table and itself.

  1. https://leetcode.com/submissions/detail/816099738/
  2. https://leetcode.com/submissions/detail/816856989/

Explicit vs Implicit

Consider the following:

--- Simple JOIN
SELECT * FROM A, B WHERE A.id = B.a_id;

--- Explicit JOIN
SELECT * FROM A JOIN B ON A.id = B.a_id;
  1. The two statements are very nearly functionally identical.
  2. Although there are some differences in terms of the leeway that the Execution Planner has when computing an optimal SQL Plan.

JOIN Order

The order of JOIN statements is typically relevant for any Explicitly Joined statement

  1. SQL optimizers will refine most simple (Implicit) JOINS and have the most leeway in computing an optimal SQL Plan.
  2. SQL optimizers will refine most INNER JOINS through a computed Execution Plan.
  3. However, OUTER JOINS will typically preserve the order specified by the statement.

Generally speaking, remember to keep the leftmost tables to the least number of expected rows.

For instance, if one is joining three Tables A, B, C such that:

SQL: Techniques

Some common SQL techniques.

DROP TABLE Check

To create a fresh, empty, new Table drop any existing such Table first OR to prevent accidently overriding an existing Table:

DROP TABLE IF EXISTS "Rules";

CREATE TABLE IF NOT EXISTS "Rules" (
    "id"    INTEGER NOT NULL UNIQUE,
    "description"    TEXT,
    "category"    TEXT,
    "rule"    TEXT,
    PRIMARY KEY("id" AUTOINCREMENT)
);

SELF JOIN

Occassionally, a TABLE will represent multiple entity-kinds (say a PLANT TABLE with ROWS representing both seedlings and their parent plants):

SELECT e1.name AS Plants
FROM Plants AS e1
JOIN Plants AS e2
ON e1.parent = e2.id
AND e1.sproutdate > e2.sproutdate;

Refer to: https://leetcode.com/problems/employees-earning-more-than-their-managers/description/

GROUP BY

GROUP BY is used to condense multiple ROWS with the same field into a single result in the RECORD SET. GROUP BY can be further refined by an accompanying HAVING clause (which acts like a WHERE clause for the specified GROUP). Importantly, in many dialects of SQL, GROUP BY can only be used with Aggregate Functions (ROUND, AVG, COUNT, SUM, MIN, MAX, etc.).

NOTE: HAVING conditions are deferred and apply to the results of a GROUP BY clause and are therefore often faster than their equivalent WHERE clause (which applies the filtering condition against the entire TABLE).

SELECT Name FROM STUDENTS GROUP BY Name, Marks, ID HAVING Marks > 75 ORDER BY Right(Name, 3) ASC, ID ASC;

SELECT e.id, ROUND(AVG(e.account), 2) AS averageAccountBalance
FROM Example AS e
JOIN Examples AS ee
ON e.id = ee.id
GROUP BY e.id, ee.name
HAVING e.account > 750
ORDER BY e.id ASC, averageAccountBalance DESC

NOTE: a double GROUP BY (e.g. - GROUP BY x.id, x.name) can be used to deduplicate associated multi-row RECORD SETS (in say a complex multi-JOIN).

Refer to: https://www.hackerrank.com/challenges/more-than-75-marks/problem

Chained SELECT Statements

SELECT DISTINCT CITY FROM STATION 
    WHERE (CITY LIKE 'A%'
    OR  CITY LIKE 'E%'
    OR  CITY LIKE 'I%'
    OR  CITY LIKE 'O%'
    OR  CITY LIKE 'U%')
    AND CITY IN (

SELECT DISTINCT CITY FROM STATION 
    WHERE CITY LIKE '%a'
    OR  CITY LIKE '%e'
    OR  CITY LIKE '%i'
    OR  CITY LIKE '%o'
    OR  CITY LIKE '%u')

Refer to: https://www.hackerrank.com/challenges/more-than-75-marks/problem

WITH

WITH x AS (
  SELECT months * salary AS earnings, id
  FROM employee
);

SELECT TOP 1 earnings, Count(id)
FROM x
GROUP BY earnings
ORDER BY earnings DESC;
  1. https://leetcode.com/problems/employees-earning-more-than-their-managers/description/
  2. https://www.hackerrank.com/challenges/more-than-75-marks/problem
  3. https://www.hackerrank.com/challenges/more-than-75-marks/problem
  4. https://www.geeksforgeeks.org/difference-between-where-and-group-by/
  5. https://stackoverflow.com/questions/49758446/where-vs-having-performance-with-group-by

Code samples:

  1. https://github.com/Thoughtscript/pg_sql_2025
  2. https://github.com/Thoughtscript/dotnet_2025
  3. https://leetcode.com/problem-list/database/
  4. https://www.hackerrank.com/domains/sql?badge_type=sql
  5. https://github.com/Thoughtscript/cockroachdb-kotlin-client
  6. https://github.com/Thoughtscript/java_stuff/tree/master/Spring-MVC-Security
  7. https://github.com/Thoughtscript/java_spring_boot_2024/blob/main/springboot/src/main/java/io/thoughtscript/bootexample/repositories/ExampleJpaRepository.java

SQL: NULL

COALESCE

  1. SUM returns NULL not 0 if no values exist that meet the query conditions.
  2. Use COALESCE(SUM(table.column),0)
  3. Also try a different JOIN if a zero summed value fails to appear.
  4. For example, a LEFT JOIN with COALESCE(SUM(table.column),0)

NULL

  1. Use IS NOT NULL vs <> NULL

SQL: Transact SQL

Transact SQL was originally introduced to extend core Structured Query Language (SQL) syntax with additional features to support Microsoft SQL Server (MSSQL). Like SQL itself, T-SQL has gone to be adopted by most SQL variants/dialects.

Key Features

  1. Variables
    • DECLARE, SET
    • @my_var_name
  2. Batch execution
    • GO
    • BEGIN, END
  3. Call and run a Stored Procedure or Function
    • EXECUTE

Microsoft SQL Server Entities

  1. Stored Procedures
  2. Functions
    • In-Built Functions include Aggregate Functions, basic Math operations, Regular Expression (RegEx), etc.
    • User-Defined Functions are specific operations to compute a returned value that are customized and written by the User.
  3. Triggers

Transact SQL also provides support for CREATE, ALTER, DELETE the above (they use syntax similar to Tables and Views).

It's generally good practice to carefully determine if a Stored Procedure or Trigger is really needed - we should be careful to observe proper Seperation of Concerns (most such operations should probably live in a different part of the stack, be handled programmatically, databases are the most expensive and resource consumptive, etc.).

  1. https://learn.microsoft.com/en-us/sql/t-sql/language-reference?view=sql-server-ver16
  2. https://learn.microsoft.com/en-us/sql/relational-databases/user-defined-functions/user-defined-functions?view=sql-server-ver16
  3. https://learn.microsoft.com/en-us/sql/relational-databases/triggers/ddl-triggers?view=sql-server-ver16
  4. https://learn.microsoft.com/en-us/sql/relational-databases/stored-procedures/stored-procedures-database-engine?view=sql-server-ver16