How to Represent Missing Values with NaN and None

Python has two distinct ways to represent missing or absent values: NaN (Not a Number) for numerical contexts and None for general-purpose absence. Understanding their differences is essential for data handling and avoiding subtle bugs.

Core Differences

Feature	NaN	None
Type	`float`	`NoneType`
Purpose	Missing numeric data	Absence of any value
Equality	`nan == nan` is always `False`	`None == None` is `True`
Math operations	Propagates (`nan + 1 = nan`)	Raises `TypeError`
Check method	`math.isnan(x)`	`x is None`
Boolean value	`True` (truthy)	`False` (falsy)

Working with None

None is Python's built-in singleton representing the absence of a value. It's used for optional parameters, uninitialized variables, and functions without explicit returns.

# None is falsy and has its own type
x = None

print(type(x))       # <class 'NoneType'>
print(bool(x))       # False
print(x is None)     # True
print(x == None)     # True (but 'is' preferred)

Checking for None

Always use the identity operator `is` rather than the equality operator `==`:

value = None

# Correct approach
if value is None:
    print("No value provided")

# Also correct for checking existence
if value is not None:
    print(f"Value exists: {value}")

Tip: Use `is None` instead of `== None` because `is` checks identity (faster and more explicit), while `==` can be overridden by custom classes to return unexpected results.
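To make the risk concrete, here is a minimal sketch using a hypothetical class, `AlwaysEqual`, whose `__eq__` claims equality with everything. The class is invented purely for illustration; real libraries can overload `==` in other ways (NumPy arrays, for instance, compare element by element).

```python
class AlwaysEqual:
    """Hypothetical class whose __eq__ reports equality with any object."""

    def __eq__(self, other):
        return True


obj = AlwaysEqual()

# Equality is delegated to the class, so the answer is misleading
print(obj == None)  # True

# Identity cannot be overridden, so the answer is reliable
print(obj is None)  # False
```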

None in Functions

def find_user(user_id):
    users = {1: "Alice", 2: "Bob"}
    return users.get(user_id)  # Returns None if not found

result = find_user(999)

if result is None:
    print("User not found")
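None is also the usual default for optional parameters. The sketch below assumes a hypothetical helper, `add_tag`, and shows the common idiom of using `None` to mean "no argument was passed", so that a fresh list is created on each call.

```python
def add_tag(tag, tags=None):
    """Append tag to tags, creating a new list when none is supplied."""
    if tags is None:  # None signals that the caller passed no list
        tags = []
    tags.append(tag)
    return tags


first = add_tag("a")
second = add_tag("b")

print(first)   # ['a']
print(second)  # ['b']
```

Using `tags=[]` as the default instead would share one list across calls, which is why `None` is preferred here.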

Working with NaN

NaN represents undefined or unrepresentable numerical results. It's part of the IEEE 754 floating-point standard and behaves uniquely in comparisons.

import math

# Creating NaN
nan1 = float('nan')
nan2 = math.nan

print(type(nan1))  # <class 'float'>
print(type(nan2))  # <class 'float'>

The Unusual Equality Behavior

NaN is unusual among Python's built-in values: it never compares equal to anything, including itself:

import math

val = math.nan

# NaN is never equal to anything, including itself
print(val == val)          # False
print(val == float('nan')) # False
print(val != val)          # True

Output:

False
False
True
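One consequence worth knowing, sketched below, is that membership tests on lists can appear to contradict this rule. In CPython, `in` checks object identity before falling back to `==`, so the result depends on whether the very same NaN object is stored in the list.

```python
import math

nan = math.nan
values = [nan]

# Same object: the identity check succeeds before == is tried
print(nan in values)           # True

# A different NaN object: identity fails and == is False
print(float('nan') in values)  # False
```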

Checking for NaN

Because `nan == nan` is always `False`, an equality test cannot detect NaN. Use a dedicated function such as `math.isnan()` instead:

import math

val = float('nan')

# Wrong approach: always False
if val == float('nan'):
    print("This never executes")

# Correct approach
if math.isnan(val):
    print("Value is NaN")

Output:

Value is NaN

Warning: Never use `== float('nan')` or `== math.nan` to check for NaN. This comparison always returns `False`, even when comparing NaN to itself.
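The same check extends naturally to collections. As a small sketch with made-up sample data (`readings` is illustrative), `math.isnan` can be used to test whether a list contains any NaN and to locate where they are.

```python
import math

# Illustrative data: two of the four readings are missing
readings = [21.5, float('nan'), 19.0, math.nan]

has_nan = any(math.isnan(r) for r in readings)
nan_positions = [i for i, r in enumerate(readings) if math.isnan(r)]

print(has_nan)        # True
print(nan_positions)  # [1, 3]
```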

NaN Propagation in Math

NaN propagates through calculations, contaminating results:

import math

nan = math.nan

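print(nan + 100)    # nan
print(nan * 0)      # nan
print(nan / nan)    # nan

# max() and min() rely on comparisons, which are always False for NaN,
# so the result depends on argument order rather than propagating
print(max(1, nan))  # 1
print(max(nan, 1))  # nan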
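Propagation matters most when aggregating. The sketch below, using illustrative sample values, shows a single NaN turning a sum into NaN, and one possible way to compute a mean over only the valid entries.

```python
import math

samples = [2.0, math.nan, 4.0]

# One NaN is enough to contaminate the whole sum
total = sum(samples)
print(total)  # nan

# Drop NaN values before aggregating; fall back to NaN if nothing is left
valid = [s for s in samples if not math.isnan(s)]
mean = sum(valid) / len(valid) if valid else math.nan
print(mean)  # 3.0
```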

None vs NaN in Practice

Type Behavior

import math

# None breaks numeric operations
try:
    result = None + 1
except TypeError as e:
    print(f"None error: {e}")  # unsupported operand type(s)

# NaN propagates silently
result = math.nan + 1
print(f"NaN result: {result}")  # nan

Boolean Context

import math

# None is falsy
if not None:
    print("None is falsy")  # Prints

# NaN is truthy!
if math.nan:
    print("NaN is truthy")  # Prints

Output:

None is falsy
NaN is truthy

Note: This truthy behavior of NaN can cause bugs. A condition like `if value:` will pass for NaN but fail for None, even though both represent missing data.
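The following sketch contrasts a truthiness-based check with an explicit one. Both function names (`describe_naive`, `describe`) are hypothetical, chosen only for this illustration.

```python
import math


def describe_naive(value):
    """Relies on truthiness: lets NaN through and rejects a valid 0."""
    return f"value={value}" if value else "missing"


def describe(value):
    """Checks explicitly for None and for float NaN."""
    if value is None or (isinstance(value, float) and math.isnan(value)):
        return "missing"
    return f"value={value}"


print(describe_naive(math.nan))  # value=nan
print(describe_naive(0))         # missing
print(describe(math.nan))        # missing
print(describe(0))               # value=0
```

The explicit version treats only None and NaN as missing, so legitimate falsy values such as `0` are kept.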

Handling Both in Pandas

Pandas treats None and NaN similarly in DataFrames, converting None to NaN in numeric columns:

import pandas as pd
import numpy as np

# Mixed missing values
df = pd.DataFrame({
    'numbers': [1.0, None, float('nan'), 4.0],
    'strings': ['a', None, 'c', None]
})

print(df)

# isna() detects both None and NaN
print(df.isna())

Output:

   numbers strings
0      1.0       a
1      NaN    None
2      NaN       c
3      4.0    None
   numbers  strings
0    False    False
1     True     True
2     True    False
3    False     True
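The conversion of None to NaN can be observed directly on a Series. This is a small sketch assuming default pandas settings: a list of integers containing None is stored as `float64`, because NaN is a float and the default integer dtype has no slot for a missing value.

```python
import pandas as pd

numeric = pd.Series([1, None, 3])

# The integers were upcast to float so that None could become NaN
print(numeric.dtype)            # float64
print(numeric.isna().tolist())  # [False, True, False]
```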

Checking and Filling Missing Values

import pandas as pd
import numpy as np

df = pd.DataFrame({
    'value': [1, None, np.nan, 4],
    'category': ['A', None, 'B', 'C']
})

# Count missing values
print(df.isna().sum())
# value       2
# category    1

# Fill missing values
df_filled = df.fillna({'value': 0, 'category': 'Unknown'})
print(df_filled)

Output:

value       2
category    1
dtype: int64
   value category
0    1.0        A
1    0.0  Unknown
2    0.0        B
3    4.0        C
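Filling with a constant is not the only option. As a sketch on the same sample data, `dropna()` removes rows that contain any missing value, and a column can be filled with its own mean, which pandas computes while skipping NaN by default.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'value': [1, None, np.nan, 4],
    'category': ['A', None, 'B', 'C']
})

# Keep only rows with no missing values in any column
complete_rows = df.dropna()
print(len(complete_rows))  # 2

# Fill missing numbers with the mean of the valid ones: (1 + 4) / 2
mean_filled = df['value'].fillna(df['value'].mean())
print(mean_filled.tolist())  # [1.0, 2.5, 2.5, 4.0]
```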

Distinguishing None from NaN in Pandas

When you need to differentiate between them:

import pandas as pd
import numpy as np

# Use object dtype to preserve None
data = pd.Series([1, None, np.nan], dtype=object)

def check_missing(val):
    if val is None:
        return 'None'
    try:
        if np.isnan(val):
            return 'NaN'
    except (TypeError, ValueError):
        pass
    return 'Value'

print(data.apply(check_missing))

Output:

0    Value
1     None
2      NaN
dtype: object

Comprehensive Missing Value Check

When you need to handle both types robustly:

import math

def is_missing(value):
    """Check if value is None or NaN."""
    if value is None:
        return True
    try:
        return math.isnan(value)
    except (TypeError, ValueError):
        return False

# Test cases
test_values = [None, float('nan'), 0, '', [], 42, 'hello']

for val in test_values:
    print(f"{str(val):10} -> missing: {is_missing(val)}")

Output:

None       -> missing: True
nan        -> missing: True
0          -> missing: False
           -> missing: False
[]         -> missing: False
42         -> missing: False
hello      -> missing: False
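A helper like this is handy when validating records. The sketch below repeats `is_missing` so that it runs on its own, and applies it to a made-up record (`record` and its field names are illustrative) to list the fields that need attention.

```python
import math


def is_missing(value):
    """Check if value is None or NaN."""
    if value is None:
        return True
    try:
        return math.isnan(value)
    except (TypeError, ValueError):
        return False


# Illustrative record: 'age' is NaN, 'email' is None, 'score' is a valid 0
record = {'name': 'Alice', 'age': float('nan'), 'email': None, 'score': 0}

missing_fields = [key for key, value in record.items() if is_missing(value)]
print(missing_fields)  # ['age', 'email']
```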

When to Use Each

Scenario	Recommended Type	Reason
Function returns no result	`None`	Pythonic convention
Optional parameter default	`None`	Standard practice
Missing sensor reading	`NaN`	Preserves float dtype
Database NULL in numeric column	`NaN`	Allows vectorized operations
Database NULL in object column	`None`	Natural representation
Uninitialized variable	`None`	Explicit absence
Mathematical undefined result	`NaN`	IEEE 754 standard
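The "preserves float dtype" reasoning in the table can be seen with NumPy. In this sketch with made-up sensor readings, storing None forces the array to the generic `object` dtype, while NaN keeps it as `float64` so that NaN-aware functions such as `np.nanmean` still work.

```python
import numpy as np

# Illustrative sensor readings with one missing entry
with_none = np.array([20.5, None, 21.0])
with_nan = np.array([20.5, np.nan, 21.0])

print(with_none.dtype)  # object
print(with_nan.dtype)   # float64

# NaN-aware aggregation skips the missing reading: (20.5 + 21.0) / 2
print(np.nanmean(with_nan))  # 20.75
```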

Summary

Use None for general absence of values: function returns, optional parameters, and non-numeric missing data.
Use NaN for missing numerical data where you need to maintain float dtype and perform vectorized operations.
In Pandas, both are treated as missing values and detected by isna(), but understanding their distinct behaviors prevents subtle bugs in your data processing pipelines.
