SQL User-Defined Functions (UDFs)
SQL comes with a rich library of built-in functions like UPPER(), ROUND(), NOW(), and COALESCE(). But eventually you will encounter calculations or transformations that no built-in function covers. Maybe you need to classify customers into loyalty tiers based on a custom formula, compute a tax rate that varies by region, or format phone numbers in a specific way across dozens of queries. This is where SQL user-defined functions come in.
A user-defined function (UDF) lets you encapsulate reusable logic inside the database, then call it in your SELECT, WHERE, JOIN, and virtually anywhere you would use a built-in function. Unlike stored procedures, functions return values and can be embedded directly inside SQL statements, making them feel like natural extensions of the SQL language.
This guide covers the two main types of UDFs (scalar and table-valued), the CREATE FUNCTION syntax across PostgreSQL, MySQL, and SQL Server, practical examples of using functions in queries, and a clear comparison of functions vs procedures so you know when to use each.
What Is a User-Defined Function?
A user-defined function is a named block of code stored in the database that:
- Accepts zero or more input parameters
- Performs a computation or data transformation
- Returns a value (a single scalar value or an entire table of rows)
- Can be used inside SQL statements just like a built-in function
The key distinction from stored procedures is that functions are designed to return data and be used within queries, while procedures are designed to perform actions and are called as standalone commands.
-- Built-in function
SELECT UPPER('hello'); -- Returns: HELLO
-- User-defined function (same usage pattern)
SELECT calculate_tax(100.00, 'CA'); -- Returns: 7.25
Why Create Functions?
| Benefit | Explanation |
|---|---|
| Eliminate code duplication | Define complex logic once, use it in any query |
| Improve readability | get_loyalty_tier(customer_id) is clearer than a 15-line CASE expression |
| Ensure consistency | Every query uses the same calculation, no copy-paste divergence |
| Composability | Functions can be used in SELECT, WHERE, JOIN, ORDER BY, GROUP BY |
| Encapsulation | Hide complex formulas behind a simple, named interface |
The Sample Data
We will use these tables throughout the guide:
employees table:
| id | name | department | salary | hire_date |
|---|---|---|---|---|
| 1 | Alice | Engineering | 95000 | 2019-03-15 |
| 2 | Bob | Engineering | 88000 | 2021-06-01 |
| 3 | Charlie | Sales | 72000 | 2018-11-20 |
| 4 | Diana | Sales | 68000 | 2022-01-10 |
| 5 | Eve | Marketing | 74000 | 2020-08-05 |
| 6 | Frank | Marketing | 70000 | 2023-02-14 |
CREATE TABLE employees (
id INT PRIMARY KEY,
name VARCHAR(50) NOT NULL,
department VARCHAR(50) NOT NULL,
salary DECIMAL(10,2) NOT NULL,
hire_date DATE NOT NULL
);
INSERT INTO employees (id, name, department, salary, hire_date) VALUES
(1, 'Alice', 'Engineering', 95000, '2019-03-15'),
(2, 'Bob', 'Engineering', 88000, '2021-06-01'),
(3, 'Charlie', 'Sales', 72000, '2018-11-20'),
(4, 'Diana', 'Sales', 68000, '2022-01-10'),
(5, 'Eve', 'Marketing', 74000, '2020-08-05'),
(6, 'Frank', 'Marketing', 70000, '2023-02-14');
customers table:
| id | name | country | signup_date | lifetime_spent |
|---|---|---|---|---|
| 101 | Acme Corp | US | 2022-01-15 | 4250.00 |
| 102 | Globex Inc | CA | 2022-06-20 | 2180.00 |
| 103 | Initech | UK | 2023-03-10 | 1520.00 |
| 104 | Umbrella | US | 2024-01-05 | 320.00 |
CREATE TABLE customers (
id INT PRIMARY KEY,
name VARCHAR(100) NOT NULL,
country VARCHAR(50) NOT NULL,
signup_date DATE NOT NULL,
lifetime_spent DECIMAL(12,2) NOT NULL
);
INSERT INTO customers (id, name, country, signup_date, lifetime_spent) VALUES
(101, 'Acme Corp', 'US', '2022-01-15', 4250.00),
(102, 'Globex Inc', 'CA', '2022-06-20', 2180.00),
(103, 'Initech', 'UK', '2023-03-10', 1520.00),
(104, 'Umbrella', 'US', '2024-01-05', 320.00);
orders table:
| id | customer_id | order_date | amount | country |
|---|---|---|---|---|
| 1 | 101 | 2024-01-05 | 250.00 | US |
| 2 | 102 | 2024-01-12 | 430.00 | CA |
| 3 | 101 | 2024-02-08 | 180.00 | US |
| 4 | 103 | 2024-02-15 | 520.00 | UK |
| 5 | 102 | 2024-03-01 | 310.00 | CA |
| 6 | 101 | 2024-03-18 | 640.00 | US |
CREATE TABLE orders (
id INT PRIMARY KEY,
customer_id INT NOT NULL,
order_date DATE NOT NULL,
amount DECIMAL(12,2) NOT NULL,
country VARCHAR(50) NOT NULL,
CONSTRAINT fk_orders_customer
FOREIGN KEY (customer_id)
REFERENCES customers(id)
);
INSERT INTO orders (id, customer_id, order_date, amount, country) VALUES
(1, 101, '2024-01-05', 250.00, 'US'),
(2, 102, '2024-01-12', 430.00, 'CA'),
(3, 101, '2024-02-08', 180.00, 'US'),
(4, 103, '2024-02-15', 520.00, 'UK'),
(5, 102, '2024-03-01', 310.00, 'CA'),
(6, 101, '2024-03-18', 640.00, 'US');
Scalar Functions vs Table-Valued Functions
User-defined functions come in two primary flavors, and understanding the difference is essential for using them correctly.
Scalar Functions
A scalar function returns a single value (one number, one string, one date, etc.). You use it anywhere a single value expression is valid: in SELECT columns, WHERE conditions, ORDER BY clauses, and more.
-- Scalar function returns one value per call
SELECT calculate_tax(250.00, 'US') AS tax;
-- Returns: 20.00
Table-Valued Functions
A table-valued function returns an entire result set (multiple rows and columns). You use it in the FROM clause, just like a table or a view.
-- Table-valued function returns a set of rows
SELECT * FROM get_recent_orders(101, 30);
-- Returns multiple rows:
-- | order_id | order_date | amount |
-- |----------|------------|--------|
-- | 6 | 2024-03-18 | 640.00 |
-- | 3 | 2024-02-08 | 180.00 |
Quick Comparison
| Aspect | Scalar Function | Table-Valued Function |
|---|---|---|
| Returns | Single value | Set of rows and columns |
| Used in | SELECT, WHERE, ORDER BY, expressions | FROM clause (like a table) |
| Analogy | Built-in UPPER(), ROUND() | Built-in generate_series(), view |
| Support | All databases | PostgreSQL, SQL Server; limited in MySQL |
CREATE FUNCTION: Scalar Functions
PostgreSQL
CREATE OR REPLACE FUNCTION function_name (
parameter1 datatype,
parameter2 datatype
)
RETURNS return_datatype
LANGUAGE plpgsql -- or LANGUAGE sql for simple functions
AS $$
BEGIN
-- logic here
RETURN value;
END;
$$;
PostgreSQL offers two language options for functions:
| Language | Best For | Features |
|---|---|---|
LANGUAGE sql | Simple queries, single expressions | No variables, no control flow; fastest |
LANGUAGE plpgsql | Complex logic with variables, IF/ELSE, loops | Full procedural language |
MySQL
DELIMITER //
CREATE FUNCTION function_name (
parameter1 datatype,
parameter2 datatype
)
RETURNS return_datatype
DETERMINISTIC -- or NOT DETERMINISTIC
BEGIN
-- logic here
RETURN value;
END //
DELIMITER ;
MySQL requires you to declare whether a function is DETERMINISTIC (same inputs always produce the same output) or NOT DETERMINISTIC (output can vary, e.g., uses NOW() or RAND()). This affects query optimization and replication.
-- DETERMINISTIC: calculate_tax(100, 'US') always returns the same result
-- NOT DETERMINISTIC: get_current_user_count() returns different values over time
If binary logging is enabled and you do not specify determinism, MySQL may reject the function creation.
SQL Server
CREATE FUNCTION function_name (
@parameter1 datatype,
@parameter2 datatype
)
RETURNS return_datatype
AS
BEGIN
DECLARE @result return_datatype;
-- logic here
RETURN @result;
END;
Your First Scalar Function
Let's create a function that calculates years of service for an employee.
PostgreSQL
CREATE OR REPLACE FUNCTION years_of_service (p_hire_date DATE)
RETURNS INT
LANGUAGE sql
AS $$
SELECT EXTRACT(YEAR FROM AGE(CURRENT_DATE, p_hire_date))::INT;
$$;
MySQL
DELIMITER //
CREATE FUNCTION years_of_service (p_hire_date DATE)
RETURNS INT
DETERMINISTIC
BEGIN
RETURN TIMESTAMPDIFF(YEAR, p_hire_date, CURDATE());
END //
DELIMITER ;
SQL Server
CREATE FUNCTION years_of_service (@p_hire_date DATE)
RETURNS INT
AS
BEGIN
RETURN DATEDIFF(YEAR, @p_hire_date, GETDATE());
END;
Using the Function
-- PostgreSQL / MySQL
SELECT
name,
hire_date,
years_of_service(hire_date) AS tenure
FROM employees;
-- SQL Server (schema-qualified call required)
SELECT
name,
hire_date,
dbo.years_of_service(hire_date) AS tenure
FROM employees;
Output:
| name | hire_date | tenure |
|---|---|---|
| Alice | 2019-03-15 | 6 |
| Bob | 2021-06-01 | 4 |
| Charlie | 2018-11-20 | 7 |
| Diana | 2022-01-10 | 4 |
| Eve | 2020-08-05 | 5 |
| Frank | 2023-02-14 | 3 |
The function works exactly like a built-in function. It receives a value per row and returns a computed result.
In SQL Server, user-defined functions must be called with their schema prefix (typically dbo.). Calling without it produces a "not a recognized built-in function" error:
-- WRONG (SQL Server)
SELECT years_of_service(hire_date) FROM employees;
-- ERROR: 'years_of_service' is not a recognized built-in function name
-- CORRECT (SQL Server)
SELECT dbo.years_of_service(hire_date) FROM employees;
PostgreSQL and MySQL do not require a schema prefix for functions in the default schema.
Practical Scalar Function Examples
Example 1: Tax Calculator
A function that returns the tax amount based on country:
PostgreSQL
CREATE OR REPLACE FUNCTION calculate_tax (
p_amount DECIMAL(10,2),
p_country VARCHAR(2)
)
RETURNS DECIMAL(10,2)
LANGUAGE plpgsql
AS $$
DECLARE
v_rate DECIMAL(5,4);
BEGIN
v_rate := CASE p_country
WHEN 'US' THEN 0.08
WHEN 'CA' THEN 0.13
WHEN 'UK' THEN 0.20
WHEN 'DE' THEN 0.19
ELSE 0.10
END;
RETURN ROUND(p_amount * v_rate, 2);
END;
$$;
MySQL
DELIMITER //
CREATE FUNCTION calculate_tax (
p_amount DECIMAL(10,2),
p_country VARCHAR(2)
)
RETURNS DECIMAL(10,2)
DETERMINISTIC
BEGIN
DECLARE v_rate DECIMAL(5,4);
SET v_rate = CASE p_country
WHEN 'US' THEN 0.08
WHEN 'CA' THEN 0.13
WHEN 'UK' THEN 0.20
WHEN 'DE' THEN 0.19
ELSE 0.10
END;
RETURN ROUND(p_amount * v_rate, 2);
END //
DELIMITER ;
Using in a Query
SELECT
id,
amount,
country,
calculate_tax(amount, country) AS tax,
amount + calculate_tax(amount, country) AS total_with_tax
FROM orders;
Output:
| id | amount | country | tax | total_with_tax |
|---|---|---|---|---|
| 1 | 250.00 | US | 20.00 | 270.00 |
| 2 | 430.00 | CA | 55.90 | 485.90 |
| 3 | 180.00 | US | 14.40 | 194.40 |
| 4 | 520.00 | UK | 104.00 | 624.00 |
| 5 | 310.00 | CA | 40.30 | 350.30 |
| 6 | 640.00 | US | 51.20 | 691.20 |
The function is called once per row, just like ROUND() or UPPER().
Example 2: Customer Loyalty Tier
A function that classifies customers into tiers based on their lifetime spending:
PostgreSQL
CREATE OR REPLACE FUNCTION get_loyalty_tier (p_lifetime_spent DECIMAL(10,2))
RETURNS VARCHAR(20)
LANGUAGE sql
AS $$
SELECT CASE
WHEN p_lifetime_spent >= 5000 THEN 'Platinum'
WHEN p_lifetime_spent >= 2000 THEN 'Gold'
WHEN p_lifetime_spent >= 500 THEN 'Silver'
ELSE 'Bronze'
END;
$$;
Using in a Query
SELECT
name,
lifetime_spent,
get_loyalty_tier(lifetime_spent) AS tier
FROM customers
ORDER BY lifetime_spent DESC;
Output:
| name | lifetime_spent | tier |
|---|---|---|
| Acme Corp | 4250.00 | Gold |
| Globex Inc | 2180.00 | Gold |
| Initech | 1520.00 | Silver |
| Umbrella | 320.00 | Bronze |
Example 3: Salary Band Classifier
CREATE OR REPLACE FUNCTION salary_band (p_salary DECIMAL(10,2))
RETURNS VARCHAR(30)
LANGUAGE sql
AS $$
SELECT CASE
WHEN p_salary >= 90000 THEN 'Senior Band'
WHEN p_salary >= 75000 THEN 'Mid Band'
WHEN p_salary >= 60000 THEN 'Junior Band'
ELSE 'Entry Band'
END;
$$;
Using in WHERE and GROUP BY
Functions can appear anywhere an expression is valid:
-- In WHERE: filter by computed tier
SELECT name, salary
FROM employees
WHERE salary_band(salary) = 'Senior Band';
Output:
| name | salary |
|---|---|
| Alice | 95000 |
-- In GROUP BY: aggregate by computed band
SELECT
salary_band(salary) AS band,
COUNT(*) AS employee_count,
ROUND(AVG(salary), 2) AS avg_salary
FROM employees
GROUP BY salary_band(salary)
ORDER BY avg_salary DESC;
Output:
| band | employee_count | avg_salary |
|---|---|---|
| Senior Band | 1 | 95000.00 |
| Mid Band | 1 | 88000.00 |
| Junior Band | 4 | 71000.00 |
-- In ORDER BY
SELECT name, salary, salary_band(salary) AS band
FROM employees
ORDER BY salary_band(salary), salary DESC;
Example 4: Formatting Function
A function that formats a number as a currency string:
PostgreSQL
CREATE OR REPLACE FUNCTION format_currency (
p_amount DECIMAL(10,2),
p_currency VARCHAR(3) DEFAULT 'USD'
)
RETURNS VARCHAR(20)
LANGUAGE sql
AS $$
SELECT CASE p_currency
WHEN 'USD' THEN '$' || TO_CHAR(p_amount, 'FM999,999,990.00')
WHEN 'EUR' THEN '€' || TO_CHAR(p_amount, 'FM999,999,990.00')
WHEN 'GBP' THEN '£' || TO_CHAR(p_amount, 'FM999,999,990.00')
ELSE TO_CHAR(p_amount, 'FM999,999,990.00') || ' ' || p_currency
END;
$$;
SELECT
name,
format_currency(salary) AS annual_salary,
format_currency(salary / 12.0) AS monthly_salary
FROM employees;
Output:
| name | annual_salary | monthly_salary |
|---|---|---|
| Alice | $95,000.00 | $7,916.67 |
| Bob | $88,000.00 | $7,333.33 |
| Charlie | $72,000.00 | $6,000.00 |
| Diana | $68,000.00 | $5,666.67 |
| Eve | $74,000.00 | $6,166.67 |
| Frank | $70,000.00 | $5,833.33 |
Notice the default parameter (p_currency VARCHAR(3) DEFAULT 'USD'). If you omit the second argument, it defaults to USD.
Table-Valued Functions
Table-valued functions return an entire result set rather than a single value. They act like parameterized views: you call them in the FROM clause and they return rows and columns.
PostgreSQL: RETURNS TABLE
CREATE OR REPLACE FUNCTION get_department_roster (p_department VARCHAR(50))
RETURNS TABLE (
employee_id INT,
employee_name VARCHAR(100),
salary DECIMAL(10,2),
tenure INT
)
LANGUAGE plpgsql
AS $$
BEGIN
RETURN QUERY
SELECT
e.id,
e.name,
e.salary,
EXTRACT(YEAR FROM AGE(CURRENT_DATE, e.hire_date))::INT
FROM employees e
WHERE e.department = p_department
ORDER BY e.salary DESC;
END;
$$;
Calling:
SELECT * FROM get_department_roster('Engineering');
Output:
| employee_id | employee_name | salary | tenure |
|---|---|---|---|
| 1 | Alice | 95000.00 | 6 |
| 2 | Bob | 88000.00 | 4 |
-- Use it like a table: join, filter, aggregate
SELECT
dr.employee_name,
dr.salary,
dr.tenure
FROM get_department_roster('Sales') dr
WHERE dr.tenure >= 2;
Output:
| employee_name | salary | tenure |
|---|---|---|
| Charlie | 72000.00 | 5 |
| Diana | 68000.00 | 2 |
PostgreSQL: RETURNS SETOF
An alternative syntax that returns rows of an existing table type:
CREATE OR REPLACE FUNCTION get_high_earners (p_min_salary DECIMAL(10,2))
RETURNS SETOF employees
LANGUAGE sql
AS $$
SELECT * FROM employees
WHERE salary >= p_min_salary
ORDER BY salary DESC;
$$;
SELECT name, salary FROM get_high_earners(75000);
Output:
| name | salary |
|---|---|
| Alice | 95000.00 |
| Bob | 88000.00 |
SQL Server: Inline Table-Valued Function
SQL Server has two types of table-valued functions. Inline functions contain a single RETURN SELECT statement and are highly optimized:
CREATE FUNCTION get_department_roster (@p_department VARCHAR(50))
RETURNS TABLE
AS
RETURN (
SELECT
id AS employee_id,
name AS employee_name,
salary,
DATEDIFF(YEAR, hire_date, GETDATE()) AS tenure
FROM employees
WHERE department = @p_department
);
SELECT * FROM dbo.get_department_roster('Engineering');
SQL Server: Multi-Statement Table-Valued Function
For more complex logic requiring variables and control flow:
CREATE FUNCTION get_customer_order_summary (@p_customer_id INT)
RETURNS @result TABLE (
order_count INT,
total_spent DECIMAL(10,2),
avg_order DECIMAL(10,2),
last_order_date DATE
)
AS
BEGIN
INSERT INTO @result
SELECT
COUNT(*),
SUM(amount),
AVG(amount),
MAX(order_date)
FROM orders
WHERE customer_id = @p_customer_id;
RETURN;
END;
SELECT * FROM dbo.get_customer_order_summary(101);
Output:
| order_count | total_spent | avg_order | last_order_date |
|---|---|---|---|
| 3 | 1070.00 | 356.67 | 2024-03-18 |
MySQL does not support table-valued functions. To achieve similar functionality in MySQL, you have three alternatives:
- Views: For static query encapsulation without parameters
- Stored procedures with result sets: Procedures can return result sets, but they cannot be used in
FROMclauses - Temporary tables inside procedures: Create a temp table, populate it, then query it
-- MySQL workaround: stored procedure that returns a result set
DELIMITER //
CREATE PROCEDURE get_department_roster (IN p_department VARCHAR(50))
BEGIN
SELECT id, name, salary,
TIMESTAMPDIFF(YEAR, hire_date, CURDATE()) AS tenure
FROM employees
WHERE department = p_department
ORDER BY salary DESC;
END //
DELIMITER ;
-- Call it (returns a result set, but CANNOT be used in FROM)
CALL get_department_roster('Engineering');
Using Functions in Complex Queries
The real power of functions emerges when you embed them in complex queries alongside joins, window functions, and aggregations.
Functions in JOIN Conditions
-- Join customers with their loyalty tier
SELECT
c.name,
c.lifetime_spent,
get_loyalty_tier(c.lifetime_spent) AS tier,
COUNT(o.id) AS order_count
FROM customers c
LEFT JOIN orders o ON c.id = o.customer_id
GROUP BY c.id, c.name, c.lifetime_spent
ORDER BY c.lifetime_spent DESC;
Output:
| name | lifetime_spent | tier | order_count |
|---|---|---|---|
| Acme Corp | 4250.00 | Gold | 3 |
| Globex Inc | 2180.00 | Gold | 2 |
| Initech | 1520.00 | Silver | 1 |
| Umbrella | 320.00 | Bronze | 0 |
Functions with Window Functions
SELECT
name,
department,
salary,
salary_band(salary) AS band,
RANK() OVER (
PARTITION BY salary_band(salary)
ORDER BY salary DESC
) AS rank_within_band
FROM employees;
Output:
| name | department | salary | band | rank_within_band |
|---|---|---|---|---|
| Alice | Engineering | 95000 | Senior Band | 1 |
| Bob | Engineering | 88000 | Mid Band | 1 |
| Eve | Marketing | 74000 | Junior Band | 1 |
| Charlie | Sales | 72000 | Junior Band | 2 |
| Frank | Marketing | 70000 | Junior Band | 3 |
| Diana | Sales | 68000 | Junior Band | 4 |
Table-Valued Functions with JOINs (PostgreSQL / SQL Server)
-- PostgreSQL: LATERAL join with a table-valued function
SELECT
d.dept_name,
r.employee_name,
r.salary,
r.tenure
FROM (VALUES ('Engineering'), ('Sales'), ('Marketing')) AS d(dept_name)
CROSS JOIN LATERAL get_department_roster(d.dept_name) r;
-- SQL Server: CROSS APPLY with a table-valued function
SELECT
d.dept_name,
r.employee_name,
r.salary,
r.tenure
FROM (VALUES ('Engineering'), ('Sales'), ('Marketing')) AS d(dept_name)
CROSS APPLY dbo.get_department_roster(d.dept_name) r;
Functions in CHECK Constraints
You can use immutable functions in CHECK constraints to enforce complex validation rules:
-- PostgreSQL: Function-based constraint
CREATE OR REPLACE FUNCTION is_valid_email (p_email VARCHAR)
RETURNS BOOLEAN
LANGUAGE sql
IMMUTABLE
AS $$
SELECT p_email ~ '^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$';
$$;
ALTER TABLE customers ADD CONSTRAINT chk_valid_email
CHECK (is_valid_email(email));
The IMMUTABLE keyword tells PostgreSQL that the function always returns the same output for the same input, which is required for use in constraints and indexes.
Functions vs Procedures
This is one of the most common points of confusion. Both encapsulate reusable logic, but they serve fundamentally different purposes.
Key Differences
| Aspect | Function | Procedure |
|---|---|---|
| Returns a value? | Yes (required) | No (uses OUT params or result sets) |
| Used inside SQL? | Yes (SELECT, WHERE, JOIN, etc.) | No (called with CALL/EXEC only) |
| Can modify data? | Depends on database (see below) | Yes |
| Transaction control? | No (COMMIT/ROLLBACK not allowed) | Yes |
| Called with | As part of an expression | CALL (MySQL/PG) or EXEC (SQL Server) |
| Multiple result sets? | No (one return value or one table) | Yes (some databases) |
| Primary purpose | Compute and return a value | Perform an action (insert, update, delete) |
Data Modification Rules
| Database | Can Functions Modify Data? |
|---|---|
| PostgreSQL | Yes (functions can contain INSERT, UPDATE, DELETE) |
| MySQL | Yes, but with restrictions (no dynamic SQL, limited in certain contexts) |
| SQL Server | No (scalar and inline TVFs cannot modify data; multi-statement TVFs can only modify @table variables) |
SQL Server is the strictest: functions cannot contain INSERT, UPDATE, or DELETE statements on permanent tables. They are purely computational. If you need to modify data, use a stored procedure.
-- SQL Server: This will FAIL
CREATE FUNCTION bad_function (@id INT)
RETURNS INT
AS
BEGIN
UPDATE accounts SET balance = 0 WHERE id = @id; -- NOT ALLOWED
RETURN 0;
END;
-- ERROR: Invalid use of a side-effecting operator 'UPDATE' within a function
When to Use a Function
-- GOOD use of a function: computation used in a SELECT
SELECT
name,
salary,
calculate_tax(salary, 'US') AS annual_tax,
salary_band(salary) AS band
FROM employees;
-- GOOD use of a function: reusable logic in WHERE
SELECT * FROM customers
WHERE get_loyalty_tier(lifetime_spent) = 'Platinum';
-- GOOD use of a function: table-valued function as a data source
SELECT * FROM get_department_roster('Sales');
When to Use a Procedure
-- GOOD use of a procedure: performs multiple data modifications
CALL transfer_funds(1, 2, 500.00);
-- GOOD use of a procedure: complex workflow with transaction control
CALL process_order(101, 501, 3);
-- GOOD use of a procedure: administrative operations
CALL archive_old_records('2023-01-01');
Decision Flowchart: Function vs Procedure Decision Guide
Do you need to USE the result inside a query?
-
YES: Use a FUNCTION
- Returns a single value? → Scalar Function
- Returns rows? → Table-Valued Function
-
NO: Does it modify data or manage transactions?
- YES → Use a PROCEDURE
- NO → Either works; function is simpler for computation
Function Volatility (PostgreSQL)
PostgreSQL classifies functions by their volatility, which tells the optimizer how the function behaves. Declaring the correct volatility is crucial for performance.
| Volatility | Meaning | Example | Can Be Used In |
|---|---|---|---|
| IMMUTABLE | Same inputs always produce same output. Never changes. | calculate_tax(100, 'US') | Indexes, constraints, any context |
| STABLE | Same inputs produce same output within a single query. May vary between queries. | Function that reads a config table | Most contexts except indexes |
| VOLATILE | Output can change at any time, even within the same query. | NOW(), RANDOM(), functions that modify data | All contexts, but optimizer cannot cache results |
-- IMMUTABLE: Pure computation, no external dependencies
CREATE OR REPLACE FUNCTION celsius_to_fahrenheit (p_celsius DECIMAL)
RETURNS DECIMAL
LANGUAGE sql
IMMUTABLE
AS $$
SELECT ROUND(p_celsius * 9.0 / 5.0 + 32, 2);
$$;
-- STABLE: Reads database but result is consistent within one query
CREATE OR REPLACE FUNCTION get_setting (p_key VARCHAR)
RETURNS VARCHAR
LANGUAGE sql
STABLE
AS $$
SELECT value FROM app_settings WHERE key = p_key;
$$;
-- VOLATILE: May return different results each time (default)
CREATE OR REPLACE FUNCTION generate_code ()
RETURNS VARCHAR
LANGUAGE sql
VOLATILE
AS $$
SELECT 'ORD-' || LPAD(FLOOR(RANDOM() * 1000000)::TEXT, 6, '0');
$$;
Declaring a function as IMMUTABLE when it truly is allows PostgreSQL to:
- Cache results for repeated calls with the same arguments
- Use the function in index expressions (
CREATE INDEX ... ON table (my_function(column))) - Use the function in materialized view definitions and constraint checks
- Optimize queries by evaluating the function once instead of per-row
Declaring IMMUTABLE on a function that is not truly immutable (e.g., it reads from a table) will produce incorrect cached results. Always be honest about volatility.
Performance Considerations
Scalar Function Performance Trap (SQL Server)
In SQL Server, scalar UDFs have a well-known performance problem: they are executed row by row, preventing the optimizer from using parallelism and often causing dramatic slowdowns on large tables.
-- SQL Server: This can be extremely slow on large tables
SELECT
name,
salary,
dbo.salary_band(salary) AS band -- Called once per row, no parallelism
FROM employees; -- 1 million rows = 1 million function calls
Alternatives in SQL Server:
- Inline Table-Valued Functions (no performance penalty):
-- SQL Server: Inline TVF is "unfolded" into the query plan
CREATE FUNCTION get_salary_bands ()
RETURNS TABLE
AS
RETURN (
SELECT
name,
salary,
CASE
WHEN salary >= 90000 THEN 'Senior Band'
WHEN salary >= 75000 THEN 'Mid Band'
WHEN salary >= 60000 THEN 'Junior Band'
ELSE 'Entry Band'
END AS band
FROM employees
);
SELECT * FROM dbo.get_salary_bands();
- Scalar UDF Inlining (SQL Server 2019+): The optimizer can automatically inline simple scalar functions. Ensure your function qualifies by keeping it simple and marking it deterministic.
PostgreSQL and MySQL: Functions Are Generally Efficient
PostgreSQL and MySQL handle scalar functions more efficiently than SQL Server in most cases. However, complex functions with multiple queries inside them can still cause performance issues when called per-row on large tables.
-- POTENTIALLY SLOW: Function executes a query per row
CREATE OR REPLACE FUNCTION get_order_count (p_customer_id INT)
RETURNS INT
LANGUAGE sql
STABLE
AS $$
SELECT COUNT(*) FROM orders WHERE customer_id = p_customer_id;
$$;
-- This runs the COUNT query once per customer row
SELECT name, get_order_count(id) FROM customers;
-- 100,000 customers = 100,000 separate COUNT queries!
-- BETTER: Use a JOIN instead
SELECT c.name, COUNT(o.id) AS order_count
FROM customers c
LEFT JOIN orders o ON c.id = o.customer_id
GROUP BY c.id, c.name;
Avoid putting SELECT queries inside scalar functions that will be called per-row on large tables. The function call hides the fact that you are executing N separate queries. A JOIN or subquery is almost always faster.
Rule of thumb: If a function contains a SELECT from another table, reconsider whether a JOIN would be more efficient.
Inlining Simple Functions (PostgreSQL)
PostgreSQL can inline simple SQL-language functions directly into the calling query, eliminating the function call overhead entirely:
-- This simple SQL function will be inlined
CREATE OR REPLACE FUNCTION double_value (p_val INT)
RETURNS INT
LANGUAGE sql
IMMUTABLE
AS $$
SELECT p_val * 2;
$$;
-- PostgreSQL replaces this:
SELECT double_value(salary) FROM employees;
-- With this (internally):
SELECT salary * 2 FROM employees;
Inlining happens automatically for LANGUAGE sql functions that meet certain criteria (no side effects, simple body). LANGUAGE plpgsql functions are never inlined.
Modifying and Dropping Functions
Modifying
-- PostgreSQL: CREATE OR REPLACE
CREATE OR REPLACE FUNCTION calculate_tax (...)
RETURNS DECIMAL(10,2)
LANGUAGE plpgsql
AS $$ ... $$;
-- SQL Server: ALTER FUNCTION
ALTER FUNCTION dbo.calculate_tax (...)
RETURNS DECIMAL(10,2)
AS
BEGIN ... END;
-- MySQL: Must DROP and recreate
DROP FUNCTION IF EXISTS calculate_tax;
DELIMITER //
CREATE FUNCTION calculate_tax (...) ...
END //
DELIMITER ;
Dropping
-- All databases
DROP FUNCTION function_name;
-- Safe version
DROP FUNCTION IF EXISTS function_name;
-- PostgreSQL: May need to specify parameter types if overloaded
DROP FUNCTION IF EXISTS calculate_tax(DECIMAL, VARCHAR);
Listing Functions
-- PostgreSQL
SELECT routine_name, data_type AS return_type
FROM information_schema.routines
WHERE routine_schema = 'public' AND routine_type = 'FUNCTION';
-- MySQL
SHOW FUNCTION STATUS WHERE Db = 'your_database';
-- SQL Server
SELECT name, type_desc FROM sys.objects
WHERE type IN ('FN', 'IF', 'TF');
-- FN = Scalar, IF = Inline TVF, TF = Multi-statement TVF
Common Mistakes to Avoid
Mistake 1: Using Functions for Data Modification (SQL Server)
-- SQL Server: This will NOT compile
CREATE FUNCTION update_balance (@id INT, @amount DECIMAL)
RETURNS DECIMAL
AS
BEGIN
UPDATE accounts SET balance = balance + @amount WHERE id = @id;
RETURN @amount;
END;
-- ERROR: Invalid use of a side-effecting operator
Fix: Use a stored procedure for data modifications in SQL Server.
Mistake 2: Hidden N+1 Queries Inside Functions
-- BAD: Scalar function that queries another table
CREATE FUNCTION get_customer_name (p_id INT)
RETURNS VARCHAR
LANGUAGE sql STABLE
AS $$
SELECT name FROM customers WHERE id = p_id;
$$;
-- Calling it per-row on a large orders table:
SELECT id, amount, get_customer_name(customer_id) AS customer
FROM orders;
-- If orders has 500,000 rows, this runs 500,000 separate lookups!
Fix: Use a JOIN:
SELECT o.id, o.amount, c.name AS customer
FROM orders o
JOIN customers c ON o.customer_id = c.id;
Mistake 3: Wrong Volatility Declaration (PostgreSQL)
-- BAD: Function reads a table but is declared IMMUTABLE
CREATE OR REPLACE FUNCTION get_tax_rate (p_country VARCHAR)
RETURNS DECIMAL
LANGUAGE sql
IMMUTABLE -- WRONG! This reads from a table
AS $$
SELECT rate FROM tax_rates WHERE country = p_country;
$$;
-- PostgreSQL may cache the result and return stale data
-- when the tax_rates table is updated!
Fix: Use STABLE for functions that read from tables:
CREATE OR REPLACE FUNCTION get_tax_rate (p_country VARCHAR)
RETURNS DECIMAL
LANGUAGE sql
STABLE -- Correct: consistent within a query, may change between queries
AS $$
SELECT rate FROM tax_rates WHERE country = p_country;
$$;
Mistake 4: Forgetting the Schema Prefix in SQL Server
-- SQL Server: This fails
SELECT calculate_tax(100.00, 'US');
-- ERROR: 'calculate_tax' is not a recognized built-in function name
-- Fix: Include the schema
SELECT dbo.calculate_tax(100.00, 'US');
Mistake 5: Overly Complex Functions
-- BAD: 200-line function with multiple queries, loops, and conditionals
-- This should probably be a stored procedure or application code
CREATE FUNCTION process_entire_order (...)
RETURNS ...
Fix: Keep functions focused and simple. If a function grows beyond a single, clear responsibility, split it into multiple smaller functions or move the logic to a stored procedure.
Summary
SQL user-defined functions let you extend the database's built-in capabilities with your own reusable, encapsulated logic:
- Scalar functions return a single value and can be used anywhere an expression is valid:
SELECT,WHERE,JOIN,ORDER BY,GROUP BY, and more. - Table-valued functions return a result set and are used in the
FROMclause like a parameterized view. Supported in PostgreSQL and SQL Server; not available in MySQL. CREATE FUNCTIONsyntax varies across databases. PostgreSQL usesLANGUAGE plpgsql/sqlwith$$ ... $$blocks. MySQL requiresDELIMITERand aDETERMINISTICdeclaration. SQL Server uses@parameter prefixes and requires thedbo.schema prefix when calling.- Functions vs procedures: Functions return values and are used inside SQL statements. Procedures perform actions and are called as standalone commands. Use functions for computation; use procedures for data modification and transaction management.
- Volatility (PostgreSQL) tells the optimizer how to cache function results.
IMMUTABLEfor pure computations,STABLEfor functions that read data,VOLATILEfor everything else. - Avoid putting
SELECTqueries inside scalar functions that will be called per-row on large tables. AJOINis almost always faster. - In SQL Server, scalar UDFs have significant performance overhead. Prefer inline table-valued functions or leverage scalar UDF inlining in SQL Server 2019+.
Create a function when you find yourself:
- Repeating the same
CASEexpression or calculation in multiple queries - Wishing you could filter or sort by a computed value without repeating the logic
- Needing a parameterized view (table-valued function)
- Wanting to enforce consistent business logic across all queries
If the logic modifies data or manages transactions, use a stored procedure instead.