Study Guide 2023+

postgres

Warning: These notes are partial, ongoing, incomplete, and may contain typos/inaccuracies. (They are kept factually accurate, time permitting.)

They are being united from many disparate notes created in the past and the layout/organization will gradually improve with time!

Please view them on a computer as they are not optimized for mobile (although you can still view them on Mobile along with the Flashcards at your own risk)!

Topics and code examples are lazy-loaded and may require two-clicks from the TOC to correctly calculate the updated x,y coordinates (after rendering). Thanks!

Postgres: General Concepts

Basic Types

Most commonly encountered:

  1. Numeric

    • bigint and int are probably the two most commonly encountered.

    • I've foudn that monetary quantities are typically converted into int or bigint Cents in order to avoid decimal/precision rounding effects.

    • Often used in leiu of Booleans IMO.

      For example, instead of two Columns (Soft Delete or "enablement" column A and some Boolean representing some target State B) one can often use one Column with three values: 0,1,2 to compress logic and validation checking.

      If A then check state B (True or False) in two ops per Row.

      If 2 ignore, otherwise use 0 or 1 in one op per Row.

      Consider a table with 1 Billion Rows - two Columns is 2 Billion and the same data can be halved.

      Consider further that most backends will perform extensive validation on most fields anyway (in Java a Boolean type can be Nullable for instance and so must be checked and/or some default value supplied within the SQL) so checking for valid Integer types (Enums) is not necessarily a new cost programmatically.

  2. JSON

    • Postgres supports JSON serialization, querying, and interpolation (->).
    • JSONB
  3. Text (Character)

    • varchar
    • Many ISO compliant Date or Time formats are represented as text strings to avoid conversion issues.

A complete list: https://www.postgresql.org/docs/current/datatype.html

Advantages

  1. Plugin support.
  2. Full support for NoSQL-like JSON in queries, rows, and columns.
  3. Less vulnerable to data corruption (preference for data Consistency/integrity over performance).
  4. dblink now provides remote-database querying/connections.
  5. Non-blocking index creation (CONCURRENTLY).
  6. Postgres Multi-Version Concurrency Control (MVCC) reading never block writing and vice-versa. Also, see the article on ACID.

Disadvantages

  1. Deprecated: No cross-database querying (a decisive factor for many database systems at scale: MySQL was a top choice for that reason) prior to 8.2.
  2. Deprecated: Slightly slower than MySQL (using the older MyISAM engine - a decisive factor for many database systems at scale: MySQL was a top choice for that reason) for READ and transaction-heavy workloads.
  1. https://developer.okta.com/blog/2019/07/19/mysql-vs-postgres
  2. https://www.cybertec-postgresql.com/en/joining-data-from-multiple-postgres-databases/
  3. https://www.postgresql.org/docs/current/dblink.html
  4. https://www.postgresql.org/docs/7.1/mvcc.html
  5. https://www.postgresql.org/docs/12/sql-reindex.html
  6. https://www.postgresql.org/docs/current/datatype.html

Code samples:

  1. https://github.com/Thoughtscript/pg_sql_2025

Postgres: Indexes and Views

  1. Indexes - used to improve query performance within a Table - Indexes a Table by one or more Columns.
  2. Views - a logical representation of a Table or Tables.

Indexes

Postgres supports:

  1. Hash indexes and B-Tree indexes
  2. Partial Indexes support conditioned indexing: CREATE INDEX CONCURRENTLY my_index ON my_table (column1_name) WHERE amount > 0;
  3. Concurrent non-blocking indexing: CREATE INDEX CONCURRENTLY my_index ON my_table (column1_name, column2_name);

Implicit Indexes are automatically created for any Primary Key on a Table by default.

Materialized Views

Essentially a cached Table that stores the results of a query:

DROP MATERIALIZED VIEW my_view_name;

CREATE MATERIALIZED VIEW my_view_name
AS
    SELECT * FROM example;
    --- Assume column 'name'

That can be queried itself:

SELECT * FROM my_view_name;

And that can be refreshed:

REFRESH MATERIALIZED VIEW my_view_name;

Concurrent refresh:

CREATE UNIQUE INDEX my_index ON my_view_name (name);

REFRESH MATERIALIZED VIEW CONCURRENTLY my_view_name;
  1. https://www.postgresql.org/docs/current/indexes-partial.html
  2. https://www.postgresqltutorial.com/postgresql-views/postgresql-materialized-views/
  3. https://www.postgresql.org/docs/current/rules-materializedviews.html

Postgres: JSON

Postgres supports both JSON and JSONB data types:

  1. JSON is stored in a standard String format.
  2. JSONB is a more performant format that uses binary (hence, the "b") to improve indexing and querying at the expense of more complex Serialization.

Operators

Given a table:

id json_col json_array_col jsonb_col jsonb_array_col
1 "[1,2,3]" "[{"id": 0, "name": "a"},{"id": 1, "name": "a"},{"id": 2, "name": "c"}]" [1,2,3]b [{"id": 0, "name": "a"},{"id": 1, "name": "a"},{"id": 2, "name": "c"}]b
  1. -> - allows one to query into a JSON field shared by all rows in a column (say json_col). E.g. - jsonb_col -> 'name'
  2. ->> - extracts the JSON value at the specified index (numeric) or the value at the specified key. E.g. - json_col ->> 2
  3. ::int, ::json, etc. - since JSON fields lack a Postgres type, use :: to cast the value to a type.

Examples

To initialize an example Postgres table:

DROP TABLE IF EXISTS example;

CREATE TABLE example (
  id INT,
  json_col JSON,
  json_array_col JSON,
  jsonb_col JSONB,
  jsonb_array_col JSONB
);

-- Insert values into table.

INSERT INTO example VALUES (1,
  '[1,2,3]'::json,
  '[{"id": 0, "name": "a"},{"id": 1, "name": "a"},{"id": 2, "name": "c"}]'::json,
  '[1,2,3]'::jsonb,
  '[{"id": 0, "name": "a"},{"id": 1, "name": "a"},{"id": 2, "name": "c"}]'::jsonb
);

Use the following queries to retrieve the desired JSON data:

-- queries

SELECT * FROM example;

-- insert via json

INSERT INTO example VALUES (2,
  '[1,2,3]'::json,
  '[{"id": 0, "name": "a"},{"id": 1, "name": "a"},{"id": 2, "name": "c"}]'::json,
  '[1,2,3]'::jsonb,
  '[{"id": 0, "name": "a"},{"id": 1, "name": "a"},{"id": 2, "name": "c"}]'::jsonb
);

INSERT INTO example
SELECT id, json_col, json_array_col, jsonb_col, jsonb_array_col
FROM json_populate_record (NULL::example,
    '{
      "id": 3,
      "json_col": {"name": "bob", "age": 111},
      "json_array_col": [{"id": 0, "name": "a"},{"id": 1, "name": "a"},{"id": 2, "name": "c"}],
      "jsonb_col": {"name": "sarah", "age": 2222},
      "jsonb_array_col": [{"id": 0, "name": "a"},{"id": 1, "name": "a"},{"id": 2, "name": "c"}]
    }'
);

-- query into json array

SELECT arr -> 'id' AS json_id, arr -> 'name' AS json_name
FROM example e, json_array_elements(e.json_array_col) arr
WHERE (arr ->> 'id')::int > -1;

-- query json column

SELECT json_col::json ->> 2 FROM example;

SELECT json_col -> 'age' FROM example;

SELECT json_col -> 'age' AS json_age FROM example WHERE (json_col ->> 'age')::int = 111;

-- query into jsonb array

SELECT arr -> 'id' AS json_id, arr -> 'name' AS json_name
FROM example e, jsonb_array_elements(e.jsonb_array_col) arr
WHERE (arr ->> 'id')::int > -1;

-- query jsonb column

SELECT jsonb_col::json ->> 2 FROM example;

SELECT jsonb_col -> 'age' FROM example;

SELECT jsonb_col -> 'name' AS jsonb_name, jsonb_col -> 'age' AS jsonb_age
FROM example WHERE (jsonb_col ->> 'name') = 'sarah';
  1. https://github.com/Thoughtscript/postgres_json_practice
  2. https://www.postgresql.org/docs/current/functions-json.html

Code samples:

  1. https://github.com/Thoughtscript/postgres_json_practice