SQL: General Concepts

NOTE: A variety of SQL dialects are used in the examples below (MYSQL, Postgres, MSSQL, and SQLite)

CRUD

CREATE - a resource is persisted and saved.
READ - a persisted resource is scanned, read, or returned in an unmodified way.
UPDATE - a persisted resource is updated.
DELETE - a persisted resource is permanently or softly removed.

Corresponding Basic Operations

CREATE

CREATE statements are typified by the INSERT (Row) or CREATE (Table) keywords.

INSERT INTO PaySchedule (date, name) VALUES ('2023-02-15', 'frank');

DROP TABLE IF EXISTS "Rules";

CREATE TABLE IF NOT EXISTS "Rules" (
    "id"    INTEGER NOT NULL UNIQUE,
    "description"    TEXT,
    "category"    TEXT,
    "rule"    TEXT,
    PRIMARY KEY("id" AUTOINCREMENT)
);

READ

READ statements are typified by the presence of the SELECT keyword.

SELECT * FROM Accounts WHERE active =  1 AND status = 'Active';

SELECT * FROM Hobbies;

SELECT x.newhome + y.newhome + z.newhome + w.newhome + q.newhome
FROM 
  (SELECT SUM(unitcost * quantity) as newhome FROM House WHERE quantity > 0 AND fha = 1) AS x,
  (SELECT SUM(unitcost * quantity) as newhome FROM UpkeepHobbies WHERE quantity > 0 AND fha = 1) AS y,
  (SELECT SUM(unitcost * quantity) as newhome FROM UpkeepClothing WHERE quantity > 0 AND fha = 1) AS z,
  (SELECT SUM(unitcost * quantity) as newhome FROM UpkeepElectronics WHERE quantity > 0 AND fha = 1) AS w,
  (SELECT SUM(unitcost * quantity) as newhome FROM UpkeepKitchen WHERE quantity > 0 AND fha = 1) AS q;

SELECT * FROM MonthlyCosts WHERE choicegroup LIKE "%ALL%";

SELECT COUNT(*) FROM (SELECT grouping FROM MonthlyCostsGroceries GROUP BY grouping);

SELECT y.total / (x.mtbf * 12) AS upkeep FROM 
  (SELECT AVG(mtbf) AS mtbf FROM Upkeep WHERE quantity > 0 AND choicegroup LIKE "%ALL%") AS x, 
  (SELECT SUM(unitcost * quantity) AS total FROM Upkeep WHERE choicegroup LIKE "%ALL%") AS y;

UPDATE

UPDATE statements are typified by the presence of the UPDATE (ROW) or ALTER (TABLE) keyword.

UPDATE Assets SET value = 400 WHERE id = 4 AND account = 8;

UPDATE Demographics SET personal = 750, success = 1, updated = '2020-01-01' WHERE id = 100;

ALTER TABLE example ADD more_text VARCHAR(45);

DELETE

DELETE statements are typified by the presence of the DELETE keyword.

DELETE FROM PaySchedule WHERE date < '2023-02-15' OR date = '2024-02-15';

Resources and Links

Code samples:

SQL: ACID

Atomicity guarantees that operations are completely contained in a single unit (Transaction). It either succeeds completely or fails (such that all operations fail – it is not partially successful).

Rollback, database Commit

Consistency guarantees that each Transaction moves the database state from one valid, legal, state to another valid, legal, state.

FK constraints being enforced across two Transactions

Isolation guarantees that concurrent Transactions are disjoint and separate, and that they don’t overwrite each other.

Database READ/WRITE locks

Durability guarantees that persisted data remains persisted across time and system failure/disaster.

Caching systems often lack durability since they often exist only in memory.

SQL: Joins

Retrieving data that's associated by field.

OUTER - matches and NULL values where absent depending on the specifics of the JOIN.

INNER - only what's matched in both Tables.

LEFT, RIGHT, INNER, FULL

LEFT (OUTER) JOIN: Returns all records from the left Table, the matched records from the right Table, and NULL in any Row absent from the right Table .
RIGHT (OUTER) JOIN: Returns all records from the right Table, the matched records from the left Table, and NULL in any Row absent from the left Table.
(INNER) JOIN: Returns records that have matching values in both Tables. Equivalent to: SELECT * FROM table_one, table_two WHERE table_one.id = table_two.id
FULL (OUTER) JOIN: Returns all records when there is a match in either left or right Table, NULL in any Row absent from one of the two Tables.

-- Multi JOIN
SELECT a.name as aname, b.description as name, a.description as adesc, s.name as description, b.value 
FROM Balances AS b 
LEFT JOIN Accounts AS a ON b.account = a.id 
JOIN SummaryTypes AS s ON b.purpose = s.id 
WHERE a.active = 1 AND a.status='Active' AND b.active = 1 
ORDER BY value DESC;

-- INNER JOIN
SELECT * 
FROM example AS e
INNER JOIN employee AS em ON e.id = em.id;

CROSS

Produces the Cartesian Product of two Tables and their Rows.
Every Row in Table A is combined with every Row in Table B.

Self Joins

A JOIN between a Table and itself.

Explicit vs Implicit

Consider the following:

--- Simple JOIN
SELECT * FROM A, B WHERE A.id = B.a_id;

--- Explicit JOIN
SELECT * FROM A JOIN B ON A.id = B.a_id;

The two statements are very nearly functionally identical.
Although there are some differences in terms of the leeway that the Execution Planner has when computing an optimal SQL Plan.

JOIN Order

The order of JOIN statements is typically relevant for any Explicitly Joined statement

SQL optimizers will refine most simple (Implicit) JOINS and have the most leeway in computing an optimal SQL Plan.
SQL optimizers will refine most INNER JOINS through a computed Execution Plan.
However, OUTER JOINS will typically preserve the order specified by the statement.

Generally speaking, remember to keep the leftmost tables to the least number of expected rows.

For instance, if one is joining three Tables A, B, C such that:

The Rows in C are guaranteed to be unique.
The number of Rows likely to be returned by B meeting some condition is far less than the number that'd initially be scanned in A meeting the same.
Then, it's likely far better to join C, B, then A (in that order) to reduce the total number of Rows that need to be compared (in toto) and at each step in the overall JOIN statement.

SQL: Techniques

Some common SQL techniques.

DROP TABLE Check

To create a fresh, empty, new Table drop any existing such Table first OR to prevent accidently overriding an existing Table:

DROP TABLE IF EXISTS "Rules";

CREATE TABLE IF NOT EXISTS "Rules" (
    "id"    INTEGER NOT NULL UNIQUE,
    "description"    TEXT,
    "category"    TEXT,
    "rule"    TEXT,
    PRIMARY KEY("id" AUTOINCREMENT)
);

SELF JOIN

Occassionally, a TABLE will represent multiple entity-kinds (say a PLANT TABLE with ROWS representing both seedlings and their parent plants):

SELECT e1.name AS Plants
FROM Plants AS e1
JOIN Plants AS e2
ON e1.parent = e2.id
AND e1.sproutdate > e2.sproutdate;

Refer to: https://leetcode.com/problems/employees-earning-more-than-their-managers/description/

GROUP BY

GROUP BY is used to condense multiple ROWS with the same field into a single result in the RECORD SET. GROUP BY can be further refined by an accompanying HAVING clause (which acts like a WHERE clause for the specified GROUP). Importantly, in many dialects of SQL, GROUP BY can only be used with Aggregate Functions (ROUND, AVG, COUNT, SUM, MIN, MAX, etc.).

NOTE: HAVING conditions are deferred and apply to the results of a GROUP BY clause and are therefore often faster than their equivalent WHERE clause (which applies the filtering condition against the entire TABLE).

SELECT Name FROM STUDENTS GROUP BY Name, Marks, ID HAVING Marks > 75 ORDER BY Right(Name, 3) ASC, ID ASC;

SELECT e.id, ROUND(AVG(e.account), 2) AS averageAccountBalance
FROM Example AS e
JOIN Examples AS ee
ON e.id = ee.id
GROUP BY e.id, ee.name
HAVING e.account > 750
ORDER BY e.id ASC, averageAccountBalance DESC

NOTE: a double GROUP BY (e.g. - GROUP BY x.id, x.name) can be used to deduplicate associated multi-row RECORD SETS (in say a complex multi-JOIN).

Refer to: https://www.hackerrank.com/challenges/more-than-75-marks/problem

Chained SELECT Statements

SELECT DISTINCT CITY FROM STATION 
    WHERE (CITY LIKE 'A%'
    OR  CITY LIKE 'E%'
    OR  CITY LIKE 'I%'
    OR  CITY LIKE 'O%'
    OR  CITY LIKE 'U%')
    AND CITY IN (

SELECT DISTINCT CITY FROM STATION 
    WHERE CITY LIKE '%a'
    OR  CITY LIKE '%e'
    OR  CITY LIKE '%i'
    OR  CITY LIKE '%o'
    OR  CITY LIKE '%u')

Refer to: https://www.hackerrank.com/challenges/more-than-75-marks/problem

WITH

WITH x AS (
  SELECT months * salary AS earnings, id
  FROM employee
);

SELECT TOP 1 earnings, Count(id)
FROM x
GROUP BY earnings
ORDER BY earnings DESC;

Resources and Links

Code samples:

SQL: NULL

COALESCE

SUM returns NULL not 0 if no values exist that meet the query conditions.
Use COALESCE(SUM(table.column),0)
Also try a different JOIN if a zero summed value fails to appear.
For example, a LEFT JOIN with COALESCE(SUM(table.column),0)

NULL

Use IS NOT NULL vs <> NULL

SQL: Transact SQL

Transact SQL was originally introduced to extend core Structured Query Language (SQL) syntax with additional features to support Microsoft SQL Server (MSSQL). Like SQL itself, T-SQL has gone to be adopted by most SQL variants/dialects.

Key Features

Variables
- DECLARE, SET
- @my_var_name
Batch execution
- GO
- BEGIN, END
Call and run a Stored Procedure or Function
- EXECUTE

Microsoft SQL Server Entities

Stored Procedures
- One or more Queries, Functions, or operations that can be called and invoked.
- https://learn.microsoft.com/en-us/sql/relational-databases/stored-procedures/stored-procedures-database-engine?view=sql-server-ver16
Functions
- In-Built Functions include Aggregate Functions, basic Math operations, Regular Expression (RegEx), etc.
- User-Defined Functions are specific operations to compute a returned value that are customized and written by the User.
Triggers
- Associate operations with specified Database events, conditions.
- https://learn.microsoft.com/en-us/sql/relational-databases/triggers/ddl-triggers?view=sql-server-ver16

Transact SQL also provides support for CREATE, ALTER, DELETE the above (they use syntax similar to Tables and Views).

It's generally good practice to carefully determine if a Stored Procedure or Trigger is really needed - we should be careful to observe proper Seperation of Concerns (most such operations should probably live in a different part of the stack, be handled programmatically, databases are the most expensive and resource consumptive, etc.).