Skip to main content

SQL User-Defined Functions (UDFs)

SQL comes with a rich library of built-in functions like UPPER(), ROUND(), NOW(), and COALESCE(). But eventually you will encounter calculations or transformations that no built-in function covers. Maybe you need to classify customers into loyalty tiers based on a custom formula, compute a tax rate that varies by region, or format phone numbers in a specific way across dozens of queries. This is where SQL user-defined functions come in.

A user-defined function (UDF) lets you encapsulate reusable logic inside the database, then call it in your SELECT, WHERE, JOIN, and virtually anywhere you would use a built-in function. Unlike stored procedures, functions return values and can be embedded directly inside SQL statements, making them feel like natural extensions of the SQL language.

This guide covers the two main types of UDFs (scalar and table-valued), the CREATE FUNCTION syntax across PostgreSQL, MySQL, and SQL Server, practical examples of using functions in queries, and a clear comparison of functions vs procedures so you know when to use each.

What Is a User-Defined Function?

A user-defined function is a named block of code stored in the database that:

  1. Accepts zero or more input parameters
  2. Performs a computation or data transformation
  3. Returns a value (a single scalar value or an entire table of rows)
  4. Can be used inside SQL statements just like a built-in function

The key distinction from stored procedures is that functions are designed to return data and be used within queries, while procedures are designed to perform actions and are called as standalone commands.

-- Built-in function
SELECT UPPER('hello'); -- Returns: HELLO

-- User-defined function (same usage pattern)
SELECT calculate_tax(100.00, 'CA'); -- Returns: 7.25

Why Create Functions?

BenefitExplanation
Eliminate code duplicationDefine complex logic once, use it in any query
Improve readabilityget_loyalty_tier(customer_id) is clearer than a 15-line CASE expression
Ensure consistencyEvery query uses the same calculation, no copy-paste divergence
ComposabilityFunctions can be used in SELECT, WHERE, JOIN, ORDER BY, GROUP BY
EncapsulationHide complex formulas behind a simple, named interface

The Sample Data

We will use these tables throughout the guide:

employees table:

idnamedepartmentsalaryhire_date
1AliceEngineering950002019-03-15
2BobEngineering880002021-06-01
3CharlieSales720002018-11-20
4DianaSales680002022-01-10
5EveMarketing740002020-08-05
6FrankMarketing700002023-02-14
CREATE TABLE employees (
id INT PRIMARY KEY,
name VARCHAR(50) NOT NULL,
department VARCHAR(50) NOT NULL,
salary DECIMAL(10,2) NOT NULL,
hire_date DATE NOT NULL
);
INSERT INTO employees (id, name, department, salary, hire_date) VALUES
(1, 'Alice', 'Engineering', 95000, '2019-03-15'),
(2, 'Bob', 'Engineering', 88000, '2021-06-01'),
(3, 'Charlie', 'Sales', 72000, '2018-11-20'),
(4, 'Diana', 'Sales', 68000, '2022-01-10'),
(5, 'Eve', 'Marketing', 74000, '2020-08-05'),
(6, 'Frank', 'Marketing', 70000, '2023-02-14');

customers table:

idnamecountrysignup_datelifetime_spent
101Acme CorpUS2022-01-154250.00
102Globex IncCA2022-06-202180.00
103InitechUK2023-03-101520.00
104UmbrellaUS2024-01-05320.00
CREATE TABLE customers (
id INT PRIMARY KEY,
name VARCHAR(100) NOT NULL,
country VARCHAR(50) NOT NULL,
signup_date DATE NOT NULL,
lifetime_spent DECIMAL(12,2) NOT NULL
);
INSERT INTO customers (id, name, country, signup_date, lifetime_spent) VALUES
(101, 'Acme Corp', 'US', '2022-01-15', 4250.00),
(102, 'Globex Inc', 'CA', '2022-06-20', 2180.00),
(103, 'Initech', 'UK', '2023-03-10', 1520.00),
(104, 'Umbrella', 'US', '2024-01-05', 320.00);

orders table:

idcustomer_idorder_dateamountcountry
11012024-01-05250.00US
21022024-01-12430.00CA
31012024-02-08180.00US
41032024-02-15520.00UK
51022024-03-01310.00CA
61012024-03-18640.00US
CREATE TABLE orders (
id INT PRIMARY KEY,
customer_id INT NOT NULL,
order_date DATE NOT NULL,
amount DECIMAL(12,2) NOT NULL,
country VARCHAR(50) NOT NULL,
CONSTRAINT fk_orders_customer
FOREIGN KEY (customer_id)
REFERENCES customers(id)
);
INSERT INTO orders (id, customer_id, order_date, amount, country) VALUES
(1, 101, '2024-01-05', 250.00, 'US'),
(2, 102, '2024-01-12', 430.00, 'CA'),
(3, 101, '2024-02-08', 180.00, 'US'),
(4, 103, '2024-02-15', 520.00, 'UK'),
(5, 102, '2024-03-01', 310.00, 'CA'),
(6, 101, '2024-03-18', 640.00, 'US');

Scalar Functions vs Table-Valued Functions

User-defined functions come in two primary flavors, and understanding the difference is essential for using them correctly.

Scalar Functions

A scalar function returns a single value (one number, one string, one date, etc.). You use it anywhere a single value expression is valid: in SELECT columns, WHERE conditions, ORDER BY clauses, and more.

-- Scalar function returns one value per call
SELECT calculate_tax(250.00, 'US') AS tax;
-- Returns: 20.00

Table-Valued Functions

A table-valued function returns an entire result set (multiple rows and columns). You use it in the FROM clause, just like a table or a view.

-- Table-valued function returns a set of rows
SELECT * FROM get_recent_orders(101, 30);
-- Returns multiple rows:
-- | order_id | order_date | amount |
-- |----------|------------|--------|
-- | 6 | 2024-03-18 | 640.00 |
-- | 3 | 2024-02-08 | 180.00 |

Quick Comparison

AspectScalar FunctionTable-Valued Function
ReturnsSingle valueSet of rows and columns
Used inSELECT, WHERE, ORDER BY, expressionsFROM clause (like a table)
AnalogyBuilt-in UPPER(), ROUND()Built-in generate_series(), view
SupportAll databasesPostgreSQL, SQL Server; limited in MySQL

CREATE FUNCTION: Scalar Functions

PostgreSQL

CREATE OR REPLACE FUNCTION function_name (
parameter1 datatype,
parameter2 datatype
)
RETURNS return_datatype
LANGUAGE plpgsql -- or LANGUAGE sql for simple functions
AS $$
BEGIN
-- logic here
RETURN value;
END;
$$;

PostgreSQL offers two language options for functions:

LanguageBest ForFeatures
LANGUAGE sqlSimple queries, single expressionsNo variables, no control flow; fastest
LANGUAGE plpgsqlComplex logic with variables, IF/ELSE, loopsFull procedural language

MySQL

DELIMITER //

CREATE FUNCTION function_name (
parameter1 datatype,
parameter2 datatype
)
RETURNS return_datatype
DETERMINISTIC -- or NOT DETERMINISTIC
BEGIN
-- logic here
RETURN value;
END //

DELIMITER ;
MySQL DETERMINISTIC Declaration

MySQL requires you to declare whether a function is DETERMINISTIC (same inputs always produce the same output) or NOT DETERMINISTIC (output can vary, e.g., uses NOW() or RAND()). This affects query optimization and replication.

-- DETERMINISTIC: calculate_tax(100, 'US') always returns the same result
-- NOT DETERMINISTIC: get_current_user_count() returns different values over time

If binary logging is enabled and you do not specify determinism, MySQL may reject the function creation.

SQL Server

CREATE FUNCTION function_name (
@parameter1 datatype,
@parameter2 datatype
)
RETURNS return_datatype
AS
BEGIN
DECLARE @result return_datatype;
-- logic here
RETURN @result;
END;

Your First Scalar Function

Let's create a function that calculates years of service for an employee.

PostgreSQL

CREATE OR REPLACE FUNCTION years_of_service (p_hire_date DATE)
RETURNS INT
LANGUAGE sql
AS $$
SELECT EXTRACT(YEAR FROM AGE(CURRENT_DATE, p_hire_date))::INT;
$$;

MySQL

DELIMITER //

CREATE FUNCTION years_of_service (p_hire_date DATE)
RETURNS INT
DETERMINISTIC
BEGIN
RETURN TIMESTAMPDIFF(YEAR, p_hire_date, CURDATE());
END //

DELIMITER ;

SQL Server

CREATE FUNCTION years_of_service (@p_hire_date DATE)
RETURNS INT
AS
BEGIN
RETURN DATEDIFF(YEAR, @p_hire_date, GETDATE());
END;

Using the Function

-- PostgreSQL / MySQL
SELECT
name,
hire_date,
years_of_service(hire_date) AS tenure
FROM employees;

-- SQL Server (schema-qualified call required)
SELECT
name,
hire_date,
dbo.years_of_service(hire_date) AS tenure
FROM employees;

Output:

namehire_datetenure
Alice2019-03-156
Bob2021-06-014
Charlie2018-11-207
Diana2022-01-104
Eve2020-08-055
Frank2023-02-143

The function works exactly like a built-in function. It receives a value per row and returns a computed result.

SQL Server Requires Schema Prefix

In SQL Server, user-defined functions must be called with their schema prefix (typically dbo.). Calling without it produces a "not a recognized built-in function" error:

-- WRONG (SQL Server)
SELECT years_of_service(hire_date) FROM employees;
-- ERROR: 'years_of_service' is not a recognized built-in function name

-- CORRECT (SQL Server)
SELECT dbo.years_of_service(hire_date) FROM employees;

PostgreSQL and MySQL do not require a schema prefix for functions in the default schema.

Practical Scalar Function Examples

Example 1: Tax Calculator

A function that returns the tax amount based on country:

PostgreSQL

CREATE OR REPLACE FUNCTION calculate_tax (
p_amount DECIMAL(10,2),
p_country VARCHAR(2)
)
RETURNS DECIMAL(10,2)
LANGUAGE plpgsql
AS $$
DECLARE
v_rate DECIMAL(5,4);
BEGIN
v_rate := CASE p_country
WHEN 'US' THEN 0.08
WHEN 'CA' THEN 0.13
WHEN 'UK' THEN 0.20
WHEN 'DE' THEN 0.19
ELSE 0.10
END;

RETURN ROUND(p_amount * v_rate, 2);
END;
$$;

MySQL

DELIMITER //

CREATE FUNCTION calculate_tax (
p_amount DECIMAL(10,2),
p_country VARCHAR(2)
)
RETURNS DECIMAL(10,2)
DETERMINISTIC
BEGIN
DECLARE v_rate DECIMAL(5,4);

SET v_rate = CASE p_country
WHEN 'US' THEN 0.08
WHEN 'CA' THEN 0.13
WHEN 'UK' THEN 0.20
WHEN 'DE' THEN 0.19
ELSE 0.10
END;

RETURN ROUND(p_amount * v_rate, 2);
END //

DELIMITER ;

Using in a Query

SELECT
id,
amount,
country,
calculate_tax(amount, country) AS tax,
amount + calculate_tax(amount, country) AS total_with_tax
FROM orders;

Output:

idamountcountrytaxtotal_with_tax
1250.00US20.00270.00
2430.00CA55.90485.90
3180.00US14.40194.40
4520.00UK104.00624.00
5310.00CA40.30350.30
6640.00US51.20691.20

The function is called once per row, just like ROUND() or UPPER().

Example 2: Customer Loyalty Tier

A function that classifies customers into tiers based on their lifetime spending:

PostgreSQL

CREATE OR REPLACE FUNCTION get_loyalty_tier (p_lifetime_spent DECIMAL(10,2))
RETURNS VARCHAR(20)
LANGUAGE sql
AS $$
SELECT CASE
WHEN p_lifetime_spent >= 5000 THEN 'Platinum'
WHEN p_lifetime_spent >= 2000 THEN 'Gold'
WHEN p_lifetime_spent >= 500 THEN 'Silver'
ELSE 'Bronze'
END;
$$;

Using in a Query

SELECT
name,
lifetime_spent,
get_loyalty_tier(lifetime_spent) AS tier
FROM customers
ORDER BY lifetime_spent DESC;

Output:

namelifetime_spenttier
Acme Corp4250.00Gold
Globex Inc2180.00Gold
Initech1520.00Silver
Umbrella320.00Bronze

Example 3: Salary Band Classifier

CREATE OR REPLACE FUNCTION salary_band (p_salary DECIMAL(10,2))
RETURNS VARCHAR(30)
LANGUAGE sql
AS $$
SELECT CASE
WHEN p_salary >= 90000 THEN 'Senior Band'
WHEN p_salary >= 75000 THEN 'Mid Band'
WHEN p_salary >= 60000 THEN 'Junior Band'
ELSE 'Entry Band'
END;
$$;

Using in WHERE and GROUP BY

Functions can appear anywhere an expression is valid:

-- In WHERE: filter by computed tier
SELECT name, salary
FROM employees
WHERE salary_band(salary) = 'Senior Band';

Output:

namesalary
Alice95000
-- In GROUP BY: aggregate by computed band
SELECT
salary_band(salary) AS band,
COUNT(*) AS employee_count,
ROUND(AVG(salary), 2) AS avg_salary
FROM employees
GROUP BY salary_band(salary)
ORDER BY avg_salary DESC;

Output:

bandemployee_countavg_salary
Senior Band195000.00
Mid Band188000.00
Junior Band471000.00
-- In ORDER BY
SELECT name, salary, salary_band(salary) AS band
FROM employees
ORDER BY salary_band(salary), salary DESC;

Example 4: Formatting Function

A function that formats a number as a currency string:

PostgreSQL

CREATE OR REPLACE FUNCTION format_currency (
p_amount DECIMAL(10,2),
p_currency VARCHAR(3) DEFAULT 'USD'
)
RETURNS VARCHAR(20)
LANGUAGE sql
AS $$
SELECT CASE p_currency
WHEN 'USD' THEN '$' || TO_CHAR(p_amount, 'FM999,999,990.00')
WHEN 'EUR' THEN '€' || TO_CHAR(p_amount, 'FM999,999,990.00')
WHEN 'GBP' THEN '£' || TO_CHAR(p_amount, 'FM999,999,990.00')
ELSE TO_CHAR(p_amount, 'FM999,999,990.00') || ' ' || p_currency
END;
$$;
SELECT
name,
format_currency(salary) AS annual_salary,
format_currency(salary / 12.0) AS monthly_salary
FROM employees;

Output:

nameannual_salarymonthly_salary
Alice$95,000.00$7,916.67
Bob$88,000.00$7,333.33
Charlie$72,000.00$6,000.00
Diana$68,000.00$5,666.67
Eve$74,000.00$6,166.67
Frank$70,000.00$5,833.33

Notice the default parameter (p_currency VARCHAR(3) DEFAULT 'USD'). If you omit the second argument, it defaults to USD.

Table-Valued Functions

Table-valued functions return an entire result set rather than a single value. They act like parameterized views: you call them in the FROM clause and they return rows and columns.

PostgreSQL: RETURNS TABLE

CREATE OR REPLACE FUNCTION get_department_roster (p_department VARCHAR(50))
RETURNS TABLE (
employee_id INT,
employee_name VARCHAR(100),
salary DECIMAL(10,2),
tenure INT
)
LANGUAGE plpgsql
AS $$
BEGIN
RETURN QUERY
SELECT
e.id,
e.name,
e.salary,
EXTRACT(YEAR FROM AGE(CURRENT_DATE, e.hire_date))::INT
FROM employees e
WHERE e.department = p_department
ORDER BY e.salary DESC;
END;
$$;

Calling:

SELECT * FROM get_department_roster('Engineering');

Output:

employee_idemployee_namesalarytenure
1Alice95000.006
2Bob88000.004
-- Use it like a table: join, filter, aggregate
SELECT
dr.employee_name,
dr.salary,
dr.tenure
FROM get_department_roster('Sales') dr
WHERE dr.tenure >= 2;

Output:

employee_namesalarytenure
Charlie72000.005
Diana68000.002

PostgreSQL: RETURNS SETOF

An alternative syntax that returns rows of an existing table type:

CREATE OR REPLACE FUNCTION get_high_earners (p_min_salary DECIMAL(10,2))
RETURNS SETOF employees
LANGUAGE sql
AS $$
SELECT * FROM employees
WHERE salary >= p_min_salary
ORDER BY salary DESC;
$$;
SELECT name, salary FROM get_high_earners(75000);

Output:

namesalary
Alice95000.00
Bob88000.00

SQL Server: Inline Table-Valued Function

SQL Server has two types of table-valued functions. Inline functions contain a single RETURN SELECT statement and are highly optimized:

CREATE FUNCTION get_department_roster (@p_department VARCHAR(50))
RETURNS TABLE
AS
RETURN (
SELECT
id AS employee_id,
name AS employee_name,
salary,
DATEDIFF(YEAR, hire_date, GETDATE()) AS tenure
FROM employees
WHERE department = @p_department
);
SELECT * FROM dbo.get_department_roster('Engineering');

SQL Server: Multi-Statement Table-Valued Function

For more complex logic requiring variables and control flow:

CREATE FUNCTION get_customer_order_summary (@p_customer_id INT)
RETURNS @result TABLE (
order_count INT,
total_spent DECIMAL(10,2),
avg_order DECIMAL(10,2),
last_order_date DATE
)
AS
BEGIN
INSERT INTO @result
SELECT
COUNT(*),
SUM(amount),
AVG(amount),
MAX(order_date)
FROM orders
WHERE customer_id = @p_customer_id;

RETURN;
END;
SELECT * FROM dbo.get_customer_order_summary(101);

Output:

order_counttotal_spentavg_orderlast_order_date
31070.00356.672024-03-18
MySQL and Table-Valued Functions

MySQL does not support table-valued functions. To achieve similar functionality in MySQL, you have three alternatives:

  1. Views: For static query encapsulation without parameters
  2. Stored procedures with result sets: Procedures can return result sets, but they cannot be used in FROM clauses
  3. Temporary tables inside procedures: Create a temp table, populate it, then query it
-- MySQL workaround: stored procedure that returns a result set
DELIMITER //
CREATE PROCEDURE get_department_roster (IN p_department VARCHAR(50))
BEGIN
SELECT id, name, salary,
TIMESTAMPDIFF(YEAR, hire_date, CURDATE()) AS tenure
FROM employees
WHERE department = p_department
ORDER BY salary DESC;
END //
DELIMITER ;

-- Call it (returns a result set, but CANNOT be used in FROM)
CALL get_department_roster('Engineering');

Using Functions in Complex Queries

The real power of functions emerges when you embed them in complex queries alongside joins, window functions, and aggregations.

Functions in JOIN Conditions

-- Join customers with their loyalty tier
SELECT
c.name,
c.lifetime_spent,
get_loyalty_tier(c.lifetime_spent) AS tier,
COUNT(o.id) AS order_count
FROM customers c
LEFT JOIN orders o ON c.id = o.customer_id
GROUP BY c.id, c.name, c.lifetime_spent
ORDER BY c.lifetime_spent DESC;

Output:

namelifetime_spenttierorder_count
Acme Corp4250.00Gold3
Globex Inc2180.00Gold2
Initech1520.00Silver1
Umbrella320.00Bronze0

Functions with Window Functions

SELECT
name,
department,
salary,
salary_band(salary) AS band,
RANK() OVER (
PARTITION BY salary_band(salary)
ORDER BY salary DESC
) AS rank_within_band
FROM employees;

Output:

namedepartmentsalarybandrank_within_band
AliceEngineering95000Senior Band1
BobEngineering88000Mid Band1
EveMarketing74000Junior Band1
CharlieSales72000Junior Band2
FrankMarketing70000Junior Band3
DianaSales68000Junior Band4

Table-Valued Functions with JOINs (PostgreSQL / SQL Server)

-- PostgreSQL: LATERAL join with a table-valued function
SELECT
d.dept_name,
r.employee_name,
r.salary,
r.tenure
FROM (VALUES ('Engineering'), ('Sales'), ('Marketing')) AS d(dept_name)
CROSS JOIN LATERAL get_department_roster(d.dept_name) r;
-- SQL Server: CROSS APPLY with a table-valued function
SELECT
d.dept_name,
r.employee_name,
r.salary,
r.tenure
FROM (VALUES ('Engineering'), ('Sales'), ('Marketing')) AS d(dept_name)
CROSS APPLY dbo.get_department_roster(d.dept_name) r;

Functions in CHECK Constraints

You can use immutable functions in CHECK constraints to enforce complex validation rules:

-- PostgreSQL: Function-based constraint
CREATE OR REPLACE FUNCTION is_valid_email (p_email VARCHAR)
RETURNS BOOLEAN
LANGUAGE sql
IMMUTABLE
AS $$
SELECT p_email ~ '^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$';
$$;

ALTER TABLE customers ADD CONSTRAINT chk_valid_email
CHECK (is_valid_email(email));

The IMMUTABLE keyword tells PostgreSQL that the function always returns the same output for the same input, which is required for use in constraints and indexes.

Functions vs Procedures

This is one of the most common points of confusion. Both encapsulate reusable logic, but they serve fundamentally different purposes.

Key Differences

AspectFunctionProcedure
Returns a value?Yes (required)No (uses OUT params or result sets)
Used inside SQL?Yes (SELECT, WHERE, JOIN, etc.)No (called with CALL/EXEC only)
Can modify data?Depends on database (see below)Yes
Transaction control?No (COMMIT/ROLLBACK not allowed)Yes
Called withAs part of an expressionCALL (MySQL/PG) or EXEC (SQL Server)
Multiple result sets?No (one return value or one table)Yes (some databases)
Primary purposeCompute and return a valuePerform an action (insert, update, delete)

Data Modification Rules

DatabaseCan Functions Modify Data?
PostgreSQLYes (functions can contain INSERT, UPDATE, DELETE)
MySQLYes, but with restrictions (no dynamic SQL, limited in certain contexts)
SQL ServerNo (scalar and inline TVFs cannot modify data; multi-statement TVFs can only modify @table variables)
SQL Server Restriction

SQL Server is the strictest: functions cannot contain INSERT, UPDATE, or DELETE statements on permanent tables. They are purely computational. If you need to modify data, use a stored procedure.

-- SQL Server: This will FAIL
CREATE FUNCTION bad_function (@id INT)
RETURNS INT
AS
BEGIN
UPDATE accounts SET balance = 0 WHERE id = @id; -- NOT ALLOWED
RETURN 0;
END;
-- ERROR: Invalid use of a side-effecting operator 'UPDATE' within a function

When to Use a Function

-- GOOD use of a function: computation used in a SELECT
SELECT
name,
salary,
calculate_tax(salary, 'US') AS annual_tax,
salary_band(salary) AS band
FROM employees;

-- GOOD use of a function: reusable logic in WHERE
SELECT * FROM customers
WHERE get_loyalty_tier(lifetime_spent) = 'Platinum';

-- GOOD use of a function: table-valued function as a data source
SELECT * FROM get_department_roster('Sales');

When to Use a Procedure

-- GOOD use of a procedure: performs multiple data modifications
CALL transfer_funds(1, 2, 500.00);

-- GOOD use of a procedure: complex workflow with transaction control
CALL process_order(101, 501, 3);

-- GOOD use of a procedure: administrative operations
CALL archive_old_records('2023-01-01');

Decision Flowchart: Function vs Procedure Decision Guide

Do you need to USE the result inside a query?

  • YES: Use a FUNCTION

    • Returns a single value? → Scalar Function
    • Returns rows? → Table-Valued Function
  • NO: Does it modify data or manage transactions?

    • YES → Use a PROCEDURE
    • NO → Either works; function is simpler for computation

Function Volatility (PostgreSQL)

PostgreSQL classifies functions by their volatility, which tells the optimizer how the function behaves. Declaring the correct volatility is crucial for performance.

VolatilityMeaningExampleCan Be Used In
IMMUTABLESame inputs always produce same output. Never changes.calculate_tax(100, 'US')Indexes, constraints, any context
STABLESame inputs produce same output within a single query. May vary between queries.Function that reads a config tableMost contexts except indexes
VOLATILEOutput can change at any time, even within the same query.NOW(), RANDOM(), functions that modify dataAll contexts, but optimizer cannot cache results
-- IMMUTABLE: Pure computation, no external dependencies
CREATE OR REPLACE FUNCTION celsius_to_fahrenheit (p_celsius DECIMAL)
RETURNS DECIMAL
LANGUAGE sql
IMMUTABLE
AS $$
SELECT ROUND(p_celsius * 9.0 / 5.0 + 32, 2);
$$;

-- STABLE: Reads database but result is consistent within one query
CREATE OR REPLACE FUNCTION get_setting (p_key VARCHAR)
RETURNS VARCHAR
LANGUAGE sql
STABLE
AS $$
SELECT value FROM app_settings WHERE key = p_key;
$$;

-- VOLATILE: May return different results each time (default)
CREATE OR REPLACE FUNCTION generate_code ()
RETURNS VARCHAR
LANGUAGE sql
VOLATILE
AS $$
SELECT 'ORD-' || LPAD(FLOOR(RANDOM() * 1000000)::TEXT, 6, '0');
$$;
tip

Declaring a function as IMMUTABLE when it truly is allows PostgreSQL to:

  • Cache results for repeated calls with the same arguments
  • Use the function in index expressions (CREATE INDEX ... ON table (my_function(column)))
  • Use the function in materialized view definitions and constraint checks
  • Optimize queries by evaluating the function once instead of per-row

Declaring IMMUTABLE on a function that is not truly immutable (e.g., it reads from a table) will produce incorrect cached results. Always be honest about volatility.

Performance Considerations

Scalar Function Performance Trap (SQL Server)

In SQL Server, scalar UDFs have a well-known performance problem: they are executed row by row, preventing the optimizer from using parallelism and often causing dramatic slowdowns on large tables.

-- SQL Server: This can be extremely slow on large tables
SELECT
name,
salary,
dbo.salary_band(salary) AS band -- Called once per row, no parallelism
FROM employees; -- 1 million rows = 1 million function calls

Alternatives in SQL Server:

  1. Inline Table-Valued Functions (no performance penalty):
-- SQL Server: Inline TVF is "unfolded" into the query plan
CREATE FUNCTION get_salary_bands ()
RETURNS TABLE
AS
RETURN (
SELECT
name,
salary,
CASE
WHEN salary >= 90000 THEN 'Senior Band'
WHEN salary >= 75000 THEN 'Mid Band'
WHEN salary >= 60000 THEN 'Junior Band'
ELSE 'Entry Band'
END AS band
FROM employees
);

SELECT * FROM dbo.get_salary_bands();
  1. Scalar UDF Inlining (SQL Server 2019+): The optimizer can automatically inline simple scalar functions. Ensure your function qualifies by keeping it simple and marking it deterministic.

PostgreSQL and MySQL: Functions Are Generally Efficient

PostgreSQL and MySQL handle scalar functions more efficiently than SQL Server in most cases. However, complex functions with multiple queries inside them can still cause performance issues when called per-row on large tables.

-- POTENTIALLY SLOW: Function executes a query per row
CREATE OR REPLACE FUNCTION get_order_count (p_customer_id INT)
RETURNS INT
LANGUAGE sql
STABLE
AS $$
SELECT COUNT(*) FROM orders WHERE customer_id = p_customer_id;
$$;

-- This runs the COUNT query once per customer row
SELECT name, get_order_count(id) FROM customers;
-- 100,000 customers = 100,000 separate COUNT queries!

-- BETTER: Use a JOIN instead
SELECT c.name, COUNT(o.id) AS order_count
FROM customers c
LEFT JOIN orders o ON c.id = o.customer_id
GROUP BY c.id, c.name;
caution

Avoid putting SELECT queries inside scalar functions that will be called per-row on large tables. The function call hides the fact that you are executing N separate queries. A JOIN or subquery is almost always faster.

Rule of thumb: If a function contains a SELECT from another table, reconsider whether a JOIN would be more efficient.

Inlining Simple Functions (PostgreSQL)

PostgreSQL can inline simple SQL-language functions directly into the calling query, eliminating the function call overhead entirely:

-- This simple SQL function will be inlined
CREATE OR REPLACE FUNCTION double_value (p_val INT)
RETURNS INT
LANGUAGE sql
IMMUTABLE
AS $$
SELECT p_val * 2;
$$;

-- PostgreSQL replaces this:
SELECT double_value(salary) FROM employees;
-- With this (internally):
SELECT salary * 2 FROM employees;

Inlining happens automatically for LANGUAGE sql functions that meet certain criteria (no side effects, simple body). LANGUAGE plpgsql functions are never inlined.

Modifying and Dropping Functions

Modifying

-- PostgreSQL: CREATE OR REPLACE
CREATE OR REPLACE FUNCTION calculate_tax (...)
RETURNS DECIMAL(10,2)
LANGUAGE plpgsql
AS $$ ... $$;

-- SQL Server: ALTER FUNCTION
ALTER FUNCTION dbo.calculate_tax (...)
RETURNS DECIMAL(10,2)
AS
BEGIN ... END;

-- MySQL: Must DROP and recreate
DROP FUNCTION IF EXISTS calculate_tax;
DELIMITER //
CREATE FUNCTION calculate_tax (...) ...
END //
DELIMITER ;

Dropping

-- All databases
DROP FUNCTION function_name;

-- Safe version
DROP FUNCTION IF EXISTS function_name;

-- PostgreSQL: May need to specify parameter types if overloaded
DROP FUNCTION IF EXISTS calculate_tax(DECIMAL, VARCHAR);

Listing Functions

-- PostgreSQL
SELECT routine_name, data_type AS return_type
FROM information_schema.routines
WHERE routine_schema = 'public' AND routine_type = 'FUNCTION';

-- MySQL
SHOW FUNCTION STATUS WHERE Db = 'your_database';

-- SQL Server
SELECT name, type_desc FROM sys.objects
WHERE type IN ('FN', 'IF', 'TF');
-- FN = Scalar, IF = Inline TVF, TF = Multi-statement TVF

Common Mistakes to Avoid

Mistake 1: Using Functions for Data Modification (SQL Server)

-- SQL Server: This will NOT compile
CREATE FUNCTION update_balance (@id INT, @amount DECIMAL)
RETURNS DECIMAL
AS
BEGIN
UPDATE accounts SET balance = balance + @amount WHERE id = @id;
RETURN @amount;
END;
-- ERROR: Invalid use of a side-effecting operator

Fix: Use a stored procedure for data modifications in SQL Server.

Mistake 2: Hidden N+1 Queries Inside Functions

-- BAD: Scalar function that queries another table
CREATE FUNCTION get_customer_name (p_id INT)
RETURNS VARCHAR
LANGUAGE sql STABLE
AS $$
SELECT name FROM customers WHERE id = p_id;
$$;

-- Calling it per-row on a large orders table:
SELECT id, amount, get_customer_name(customer_id) AS customer
FROM orders;
-- If orders has 500,000 rows, this runs 500,000 separate lookups!

Fix: Use a JOIN:

SELECT o.id, o.amount, c.name AS customer
FROM orders o
JOIN customers c ON o.customer_id = c.id;

Mistake 3: Wrong Volatility Declaration (PostgreSQL)

-- BAD: Function reads a table but is declared IMMUTABLE
CREATE OR REPLACE FUNCTION get_tax_rate (p_country VARCHAR)
RETURNS DECIMAL
LANGUAGE sql
IMMUTABLE -- WRONG! This reads from a table
AS $$
SELECT rate FROM tax_rates WHERE country = p_country;
$$;
-- PostgreSQL may cache the result and return stale data
-- when the tax_rates table is updated!

Fix: Use STABLE for functions that read from tables:

CREATE OR REPLACE FUNCTION get_tax_rate (p_country VARCHAR)
RETURNS DECIMAL
LANGUAGE sql
STABLE -- Correct: consistent within a query, may change between queries
AS $$
SELECT rate FROM tax_rates WHERE country = p_country;
$$;

Mistake 4: Forgetting the Schema Prefix in SQL Server

-- SQL Server: This fails
SELECT calculate_tax(100.00, 'US');
-- ERROR: 'calculate_tax' is not a recognized built-in function name

-- Fix: Include the schema
SELECT dbo.calculate_tax(100.00, 'US');

Mistake 5: Overly Complex Functions

-- BAD: 200-line function with multiple queries, loops, and conditionals
-- This should probably be a stored procedure or application code
CREATE FUNCTION process_entire_order (...)
RETURNS ...

Fix: Keep functions focused and simple. If a function grows beyond a single, clear responsibility, split it into multiple smaller functions or move the logic to a stored procedure.

Summary

SQL user-defined functions let you extend the database's built-in capabilities with your own reusable, encapsulated logic:

  • Scalar functions return a single value and can be used anywhere an expression is valid: SELECT, WHERE, JOIN, ORDER BY, GROUP BY, and more.
  • Table-valued functions return a result set and are used in the FROM clause like a parameterized view. Supported in PostgreSQL and SQL Server; not available in MySQL.
  • CREATE FUNCTION syntax varies across databases. PostgreSQL uses LANGUAGE plpgsql/sql with $$ ... $$ blocks. MySQL requires DELIMITER and a DETERMINISTIC declaration. SQL Server uses @ parameter prefixes and requires the dbo. schema prefix when calling.
  • Functions vs procedures: Functions return values and are used inside SQL statements. Procedures perform actions and are called as standalone commands. Use functions for computation; use procedures for data modification and transaction management.
  • Volatility (PostgreSQL) tells the optimizer how to cache function results. IMMUTABLE for pure computations, STABLE for functions that read data, VOLATILE for everything else.
  • Avoid putting SELECT queries inside scalar functions that will be called per-row on large tables. A JOIN is almost always faster.
  • In SQL Server, scalar UDFs have significant performance overhead. Prefer inline table-valued functions or leverage scalar UDF inlining in SQL Server 2019+.
When to Create a Function

Create a function when you find yourself:

  • Repeating the same CASE expression or calculation in multiple queries
  • Wishing you could filter or sort by a computed value without repeating the logic
  • Needing a parameterized view (table-valued function)
  • Wanting to enforce consistent business logic across all queries

If the logic modifies data or manages transactions, use a stored procedure instead.

Table of Contents