Schema Design: Tables, Types, and Constraints
So far, we've been working with a schema that was handed to you. You ran the seed SQL in Article 1 and started querying. But understanding how to design a schema — choosing the right types, adding the right constraints, thinking through what the database should enforce — is one of the most important skills in backend development.
A good schema prevents bad data from entering the database in the first place. A poorly designed schema relies on application code to catch every mistake — and application code has bugs.
Let's build the ShopFlow schema from scratch, and understand every decision along the way.
Quick Reference
Create a table:
CREATE TABLE products (
id SERIAL PRIMARY KEY,
name TEXT NOT NULL,
price NUMERIC(10, 2) NOT NULL CHECK (price >= 0),
stock_quantity INTEGER NOT NULL DEFAULT 0,
category TEXT NOT NULL,
created_at TIMESTAMP NOT NULL DEFAULT NOW()
);
Common types:
| Use case | Type |
|---|---|
| Auto-incrementing ID | SERIAL / BIGSERIAL |
| Whole numbers | INTEGER / BIGINT |
| Decimal money values | NUMERIC(10, 2) |
| Short text | TEXT or VARCHAR(n) |
| Long text | TEXT |
| True/false | BOOLEAN |
| Date and time | TIMESTAMP / TIMESTAMPTZ |
| Date only | DATE |
Gotchas:
- ⚠️ Use
NUMERIC(notFLOAT) for money — floating-point arithmetic introduces rounding errors - ⚠️ Use
TIMESTAMPTZ(with timezone) for production apps — bareTIMESTAMPstores no timezone info - ⚠️
VARCHAR(255)has no meaningful performance advantage overTEXTin PostgreSQL
Version Information
Tested with:
- PostgreSQL: 15.x, 16.x
- Node.js: v18.x, v20.x, v22.x
What You Need to Know First
Required reading:
- Relational Databases: How Structured Data Works — tables, columns, primary keys
- SQL Writes: INSERT, UPDATE, DELETE — constraints affect what you can insert and how errors surface
What We'll Cover in This Article
By the end of this guide, you'll understand:
- How to create tables with
CREATE TABLE - Which PostgreSQL data type to use for each kind of data
- How
NOT NULL,UNIQUE,DEFAULT,CHECK, andPRIMARY KEYconstraints work - Why
NUMERICbeatsFLOATfor money - Why
TIMESTAMPTZbeatsTIMESTAMPfor production - How to drop and modify tables safely
- How to build the full ShopFlow schema from scratch
What We'll Explain Along the Way
We'll introduce these concepts with full explanations:
- Why
VARCHAR(255)is a historical artifact in PostgreSQL - What
SERIALactually is under the hood (a sequence) - The difference between a table constraint and a column constraint
- What happens when a constraint is violated
Part 1: CREATE TABLE — The Basic Structure
A CREATE TABLE statement defines the table name, its columns, and the rules each column must follow.
CREATE TABLE table_name (
column_name data_type [constraints],
column_name data_type [constraints],
...
);
Let's start simple and build up. Here's the most minimal valid products table:
-- Minimal: no constraints, no defaults
CREATE TABLE products (
id INTEGER,
name TEXT,
price NUMERIC
);
This creates the table, but it's almost useless — nothing prevents inserting a product with no name, a negative price, or a duplicate ID. Constraints are what give the schema its teeth.
Dropping Tables
If you need to start over during development:
-- Drop a table (irreversible — all data is deleted)
DROP TABLE products;
-- Drop only if it exists (no error if it doesn't)
DROP TABLE IF EXISTS products;
-- Drop a table along with anything that references it
DROP TABLE products CASCADE;
In production, you never DROP TABLE — you use migrations to modify schema incrementally. We'll cover migrations in the Drizzle and Prisma articles.
Part 2: Choosing the Right Data Type
Every column has a data type that defines what values it can hold and how the database stores them. Choosing the right type matters for correctness, storage efficiency, and query performance.
Integer Types
| Type | Range | Storage | When to Use |
|---|---|---|---|
SMALLINT | -32,768 to 32,767 | 2 bytes | Rarely — only for very constrained values |
INTEGER | -2.1B to 2.1B | 4 bytes | Default for counts, quantities, foreign keys |
BIGINT | -9.2 quintillion to 9.2 quintillion | 8 bytes | IDs for very large tables, financial amounts in cents |
For ShopFlow, INTEGER is fine for all IDs and quantities. If you expected millions of users or billions of orders, BIGINT would be the safer choice.
Auto-Incrementing IDs: SERIAL
SERIAL is PostgreSQL shorthand for "create an integer column with an auto-incrementing sequence attached":
-- These two are equivalent
id SERIAL PRIMARY KEY
-- Under the hood, SERIAL does this:
id INTEGER NOT NULL DEFAULT nextval('products_id_seq') PRIMARY KEY
-- (PostgreSQL creates the sequence automatically)
BIGSERIAL is the same for BIGINT. For most tables, SERIAL is sufficient. If you're building a system that might eventually have billions of rows, use BIGSERIAL from the start — retrofitting it later requires a migration.
Text Types
PostgreSQL has two main text types: TEXT and VARCHAR(n).
name TEXT -- unlimited length
short_code VARCHAR(10) -- maximum 10 characters
description TEXT -- unlimited, fine for long content
The VARCHAR(255) myth: In MySQL, VARCHAR(255) has performance implications. In PostgreSQL, it doesn't — TEXT and VARCHAR are stored identically. The (n) in VARCHAR(n) adds a length check constraint and nothing else. Using TEXT for most columns is the PostgreSQL idiom. Reserve VARCHAR(n) only when you genuinely want the database to enforce a character limit (like a two-character country code).
Numeric Types for Decimals
This is where many developers make a critical mistake.
-- ❌ WRONG for money: float arithmetic is imprecise
price FLOAT -- or DOUBLE PRECISION, REAL
-- ✅ CORRECT for money: exact decimal arithmetic
price NUMERIC(10, 2) -- 10 total digits, 2 after decimal point
Why does FLOAT fail for money?
-- Float arithmetic surprises:
SELECT 0.1 + 0.2; -- Returns 0.30000000000000004 in many contexts
SELECT 10.00::FLOAT - 9.99::FLOAT; -- Might return 0.009999999999999... not 0.01
Floats use binary representation that can't precisely represent most decimal fractions. For prices, use NUMERIC(precision, scale):
NUMERIC(10, 2)— up to 10 total digits, with exactly 2 after the decimal pointNUMERIC(15, 4)— for exchange rates or other high-precision decimalsNUMERICwith no arguments — arbitrary precision (rarely needed)
Boolean
is_active BOOLEAN NOT NULL DEFAULT true
is_featured BOOLEAN NOT NULL DEFAULT false
PostgreSQL accepts TRUE/FALSE, 't'/'f', 1/0, 'yes'/'no' as boolean values. Using TRUE and FALSE in queries is the clearest.
Date and Time Types
This is another area where the wrong choice causes production headaches.
-- ❌ Stores date+time with NO timezone information
created_at TIMESTAMP
-- ✅ Stores date+time WITH timezone (UTC-normalized internally)
created_at TIMESTAMPTZ -- same as TIMESTAMP WITH TIME ZONE
With bare TIMESTAMP, the database stores whatever you give it and gives it back unchanged. If your server is in UTC and your app inserts 2024-03-15 14:00:00, that's what's stored. If you later deploy the app in a different timezone, the stored values become ambiguous.
TIMESTAMPTZ normalizes everything to UTC internally and converts on retrieval. Always use TIMESTAMPTZ for production. The only exception is when you genuinely want to record a "wall clock time" with no timezone (rare — think "alarm that fires at 09:00 regardless of timezone").
-- Other time types
appointment_date DATE -- date only, no time: 2024-03-15
duration INTERVAL -- a span of time: '2 hours', '30 minutes'
meeting_time TIME -- time only, no date: 14:30:00
Part 3: Constraints — Enforcing Business Rules
Constraints are rules the database enforces on every insert and update. They're the difference between a database that keeps itself consistent and one that relies on application code to catch every mistake.
NOT NULL — Required Fields
NOT NULL prevents a column from ever containing NULL:
name TEXT NOT NULL
Without NOT NULL, you can insert a product with no name — the database will happily accept NULL. With NOT NULL, the insert fails with a clear error.
Rule of thumb: every column should be NOT NULL unless you have a specific reason to allow null. Nullable columns make queries harder (you always have to account for IS NULL), and usually indicate a schema design that can be improved.
-- ❌ Too permissive: name and price can be NULL
CREATE TABLE products (
id SERIAL PRIMARY KEY,
name TEXT,
price NUMERIC(10, 2)
);
-- ✅ Correct: required fields are NOT NULL
CREATE TABLE products (
id SERIAL PRIMARY KEY,
name TEXT NOT NULL,
price NUMERIC(10, 2) NOT NULL
);
DEFAULT — Automatic Values
DEFAULT provides a fallback value when a column is omitted from an INSERT:
stock_quantity INTEGER NOT NULL DEFAULT 0
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
status TEXT NOT NULL DEFAULT 'pending'
With DEFAULT 0 on stock_quantity, you can insert a product without specifying stock:
-- stock_quantity will be 0 automatically
INSERT INTO products (name, price, category)
VALUES ('New Item', 29.99, 'Accessories');
NOW() is a PostgreSQL function that returns the current timestamp — perfect for created_at columns.
UNIQUE — No Duplicates
UNIQUE ensures no two rows have the same value in that column:
email TEXT NOT NULL UNIQUE
Trying to insert two users with the same email fails:
-- ❌ Error: duplicate key value violates unique constraint
INSERT INTO users (name, email) VALUES ('Another Aisha', 'aisha@example.com');
-- ERROR: duplicate key value violates unique constraint "users_email_key"
Note: NULL is special with UNIQUE — multiple rows can have NULL in a unique column (because NULL != NULL in SQL's logic). This is usually the desired behavior for optional unique fields.
Multi-column UNIQUE — require a unique combination of values:
-- An order can only have one row per product
-- (quantity handles multiple units, not multiple rows)
UNIQUE (order_id, product_id)
This is written as a table constraint (after the column definitions) rather than a column constraint:
CREATE TABLE order_items (
id SERIAL PRIMARY KEY,
order_id INTEGER NOT NULL,
product_id INTEGER NOT NULL,
quantity INTEGER NOT NULL,
UNIQUE (order_id, product_id) -- table-level constraint
);
CHECK — Custom Validation Rules
CHECK lets you write any boolean expression as a constraint:
-- Price must be non-negative
price NUMERIC(10, 2) NOT NULL CHECK (price >= 0)
-- Quantity must be at least 1
quantity INTEGER NOT NULL CHECK (quantity > 0)
-- Status must be one of the valid values
status TEXT NOT NULL CHECK (status IN ('pending', 'shipped', 'cancelled', 'refunded'))
CHECK violations produce clear errors:
-- ❌ Error: violates check constraint
INSERT INTO products (name, price, category, stock_quantity)
VALUES ('Bad Product', -10.00, 'Electronics', 5);
-- ERROR: new row for relation "products" violates check constraint "products_price_check"
Multi-column CHECK constraints go at the table level:
-- Discount price must be less than regular price
CREATE TABLE promotions (
id SERIAL PRIMARY KEY,
product_id INTEGER NOT NULL,
original_price NUMERIC(10, 2) NOT NULL,
sale_price NUMERIC(10, 2) NOT NULL,
CHECK (sale_price < original_price)
);
PRIMARY KEY — The Row's Identity
PRIMARY KEY combines NOT NULL and UNIQUE and designates the column as the table's unique row identifier. Every table should have one.
id SERIAL PRIMARY KEY
For most tables, a single auto-incrementing integer is the right primary key. Composite primary keys (combining two columns) are sometimes used for junction tables:
-- Composite primary key on the junction table
-- (instead of a surrogate id column)
CREATE TABLE order_items (
order_id INTEGER NOT NULL REFERENCES orders(id),
product_id INTEGER NOT NULL REFERENCES products(id),
quantity INTEGER NOT NULL CHECK (quantity > 0),
price_at_purchase NUMERIC(10, 2) NOT NULL,
PRIMARY KEY (order_id, product_id)
);
Part 4: Building the Complete ShopFlow Schema
Let's put everything together. Here's the full ShopFlow schema, built from scratch with proper types and constraints.
Drop any existing tables first if you ran the seed from Article 1:
-- Drop in reverse dependency order (children before parents)
DROP TABLE IF EXISTS order_items;
DROP TABLE IF EXISTS orders;
DROP TABLE IF EXISTS products;
DROP TABLE IF EXISTS users;
Now build the schema:
-- ─────────────────────────────────────────────────
-- USERS
-- ─────────────────────────────────────────────────
CREATE TABLE users (
id SERIAL PRIMARY KEY,
name TEXT NOT NULL,
email TEXT NOT NULL UNIQUE,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
-- Why these choices:
-- SERIAL: auto-incrementing id, no manual management needed
-- TEXT NOT NULL: name is required
-- TEXT NOT NULL UNIQUE: email is required and must be unique across all users
-- TIMESTAMPTZ: stores timezone-aware timestamp, defaults to current time
-- ─────────────────────────────────────────────────
-- PRODUCTS
-- ─────────────────────────────────────────────────
CREATE TABLE products (
id SERIAL PRIMARY KEY,
name TEXT NOT NULL,
description TEXT, -- optional
price NUMERIC(10, 2) NOT NULL CHECK (price >= 0),
stock_quantity INTEGER NOT NULL DEFAULT 0 CHECK (stock_quantity >= 0),
category TEXT NOT NULL,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
-- Why these choices:
-- description TEXT (no NOT NULL): products can have no description
-- NUMERIC(10,2): exact decimal for money — no floating-point errors
-- CHECK (price >= 0): database enforces non-negative prices
-- DEFAULT 0: new products start with no stock
-- ─────────────────────────────────────────────────
-- ORDERS
-- ─────────────────────────────────────────────────
CREATE TABLE orders (
id SERIAL PRIMARY KEY,
user_id INTEGER NOT NULL REFERENCES users(id),
status TEXT NOT NULL DEFAULT 'pending'
CHECK (status IN ('pending', 'shipped', 'cancelled', 'refunded')),
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
-- Why these choices:
-- REFERENCES users(id): foreign key — orders must belong to a real user
-- CHECK on status: only valid status values are allowed
-- DEFAULT 'pending': new orders start in pending state
-- ─────────────────────────────────────────────────
-- ORDER ITEMS
-- ─────────────────────────────────────────────────
CREATE TABLE order_items (
id SERIAL PRIMARY KEY,
order_id INTEGER NOT NULL REFERENCES orders(id),
product_id INTEGER NOT NULL REFERENCES products(id),
quantity INTEGER NOT NULL CHECK (quantity > 0),
price_at_purchase NUMERIC(10, 2) NOT NULL CHECK (price_at_purchase >= 0),
UNIQUE (order_id, product_id)
);
-- Why these choices:
-- REFERENCES orders(id): must belong to a real order
-- REFERENCES products(id): must reference a real product
-- CHECK (quantity > 0): can't order zero or negative units
-- price_at_purchase: snapshot of price at time of order (products.price can change)
-- UNIQUE (order_id, product_id): each product appears at most once per order
What the Constraints Enforce
Let's trace what the database now automatically rejects — without any application code:
| Attempted action | Constraint that blocks it |
|---|---|
| Insert product with no name | NOT NULL on products.name |
| Insert product with negative price | CHECK (price >= 0) |
| Insert two users with same email | UNIQUE on users.email |
| Insert order for non-existent user | REFERENCES users(id) |
Set order status to 'delivered' | CHECK (status IN (...)) |
| Insert order item with quantity 0 | CHECK (quantity > 0) |
| Add same product twice to an order | UNIQUE (order_id, product_id) |
Every one of these would be a data integrity bug if it reached the database unchecked. The schema catches all of them.
Part 5: Modifying Tables with ALTER TABLE
After a table is created, you can modify it without dropping and recreating it:
-- Add a new column
ALTER TABLE products
ADD COLUMN sku TEXT UNIQUE;
-- Make an existing column NOT NULL
-- (requires no existing NULLs in that column, or a DEFAULT)
ALTER TABLE products
ALTER COLUMN description SET NOT NULL;
-- Add a default to an existing column
ALTER TABLE orders
ALTER COLUMN status SET DEFAULT 'pending';
-- Add a constraint to an existing table
ALTER TABLE products
ADD CONSTRAINT check_positive_price CHECK (price >= 0);
-- Remove a constraint
ALTER TABLE products
DROP CONSTRAINT check_positive_price;
-- Rename a column
ALTER TABLE users
RENAME COLUMN name TO full_name;
-- Change a column's type (risky — existing data must be compatible)
ALTER TABLE products
ALTER COLUMN stock_quantity TYPE BIGINT;
In production applications, schema changes are managed through migrations — versioned SQL files that record every change in order. This ensures all environments (development, staging, production) have identical schemas. We'll cover migration workflows in a future Drizzle ORM article.
Common Misconceptions
❌ Misconception: VARCHAR(255) is better than TEXT for performance
Reality: In PostgreSQL, TEXT and VARCHAR use identical internal storage. The only difference is that VARCHAR(n) adds a length-check constraint. There's no performance advantage to VARCHAR(255) — it's a habit carried over from MySQL where the distinction matters.
Fix: Use TEXT by default. Use VARCHAR(n) only when you genuinely want to enforce a maximum length.
❌ Misconception: FLOAT is fine for money because the rounding errors are tiny
Reality: Financial calculations accumulate rounding errors. Summing thousands of transactions with float arithmetic can produce results that are off by cents, which is unacceptable for accounting.
-- Illustration of float imprecision
SELECT 0.1::FLOAT + 0.2::FLOAT;
-- Result: 0.30000000000000004 (not 0.3)
-- NUMERIC is exact
SELECT 0.1::NUMERIC + 0.2::NUMERIC;
-- Result: 0.3
Fix: Always use NUMERIC(precision, scale) for money. Many financial systems also store amounts as integer cents (INTEGER) to avoid decimal math entirely.
❌ Misconception: More constraints mean slower writes
Reality: The performance cost of constraint checks on individual row inserts is negligible. Constraints save far more in debugging time and data cleanup than they cost in write performance. The one constraint type with a meaningful performance cost is foreign keys (they require an index lookup on the referenced table) — and even that cost is usually worth the integrity guarantee.
Troubleshooting Common Issues
Problem: ERROR: column cannot be cast to type integer
Symptoms: ALTER TABLE ... ALTER COLUMN TYPE fails with a cast error.
Cause: Existing data in the column can't be automatically converted to the new type. For example, changing TEXT to INTEGER when some rows contain non-numeric text.
Solution:
-- Check what values exist before changing type
SELECT DISTINCT column_name FROM table_name;
-- If needed, provide an explicit USING clause for the conversion
ALTER TABLE products
ALTER COLUMN stock_quantity TYPE BIGINT
USING stock_quantity::BIGINT;
Problem: ERROR: null value in column violates not-null constraint
Symptoms: An INSERT fails because a required column was omitted.
Cause: A NOT NULL column has no DEFAULT and no value was provided in the INSERT.
Solution:
-- Either provide a value in every INSERT
INSERT INTO products (name, price, category, stock_quantity)
VALUES ('Item', 9.99, 'Accessories', 0); -- stock_quantity explicitly included
-- Or add a DEFAULT to the column
ALTER TABLE products ALTER COLUMN stock_quantity SET DEFAULT 0;
Problem: CHECK constraint name is unclear in error messages
Symptoms: ERROR: violates check constraint "products_price_check" — you're not sure which constraint this refers to.
Solution: Name your constraints explicitly for clearer error messages:
-- ❌ Auto-generated name: "products_price_check"
price NUMERIC(10, 2) CHECK (price >= 0)
-- ✅ Named constraint: "price_must_be_non_negative"
price NUMERIC(10, 2) CONSTRAINT price_must_be_non_negative CHECK (price >= 0)
Check Your Understanding
Quick Quiz
-
Why should you use
NUMERIC(10, 2)instead ofFLOATfor price columns?Show Answer
FLOATuses binary floating-point representation, which cannot precisely represent most decimal fractions. Financial calculations accumulate these rounding errors.NUMERICuses exact decimal arithmetic —0.1 + 0.2is exactly0.3, not0.30000000000000004. -
What's the difference between a column constraint and a table constraint?
Show Answer
A column constraint is written inline with the column definition and applies to that column only (
price NUMERIC NOT NULL). A table constraint is written after all column definitions and can reference multiple columns (UNIQUE (order_id, product_id),CHECK (sale_price < original_price)). Some constraints (likePRIMARY KEYandUNIQUE) can be written either way — use the table form when multiple columns are involved. -
Why does the
order_itemstable storeprice_at_purchaseinstead of just referencingproducts.price?Show Answer
Product prices change over time. If
order_itemsreferencedproducts.pricedirectly, old orders would show the current price instead of the price the customer actually paid. By storingprice_at_purchaseas a snapshot at the time of the order, historical order totals remain correct even when prices change later.
Hands-On Challenge
Task: Design a reviews table for ShopFlow. A review belongs to a product and a user, has a rating (1–5), optional text content, and a timestamp. Add appropriate constraints: a user can only review each product once; rating must be between 1 and 5; the review must belong to a real user and real product.
Show Solution
CREATE TABLE reviews (
id SERIAL PRIMARY KEY,
user_id INTEGER NOT NULL REFERENCES users(id),
product_id INTEGER NOT NULL REFERENCES products(id),
rating SMALLINT NOT NULL CHECK (rating BETWEEN 1 AND 5),
content TEXT, -- optional review text
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
UNIQUE (user_id, product_id) -- one review per user per product
);
Design decisions explained:
SMALLINTfor rating — values 1–5 fit in 2 bytes;CHECK (rating BETWEEN 1 AND 5)enforces the valid rangecontent TEXT(noNOT NULL) — review text is optional; a user might just want to leave a star ratingUNIQUE (user_id, product_id)— table-level constraint preventing duplicate reviewsREFERENCES users(id)andREFERENCES products(id)— orphaned reviews (pointing to deleted users or products) are prevented
Summary: Key Takeaways
CREATE TABLEdefines columns with data types and constraints — the schema is the database's rulebook- Use
SERIALfor auto-incrementing IDs;BIGSERIALfor very large tables - Use
NUMERIC(10, 2)for money — neverFLOATorDOUBLE PRECISION - Use
TIMESTAMPTZfor timestamps in production — it preserves timezone information TEXTandVARCHAR(n)perform identically in PostgreSQL — useTEXTby default- Constraints enforce business rules at the database level:
NOT NULL,UNIQUE,DEFAULT,CHECK,PRIMARY KEY - Column constraints apply to one column; table constraints can span multiple columns
ALTER TABLEmodifies existing tables without dropping them — in production, this is done through migrations- A well-constrained schema rejects bad data automatically, before it ever reaches application code
What's Next?
You've designed the ShopFlow schema with proper types and constraints. But the schema still refers to REFERENCES users(id) without explaining how foreign keys work in detail — what happens on delete, how cascades work, and how to design the right relationship for each scenario.
The next step is Foreign Keys and Relationships: Linking Tables — where you'll learn the full mechanics of foreign key constraints, the difference between ON DELETE CASCADE, RESTRICT, and SET NULL, and how to design one-to-many and many-to-many relationships correctly.