How to Resolve Python "ValueError: Sample larger than population or is negative"
The ValueError: Sample larger than population or is negative is a common error encountered when using the random.sample() function in Python. It arises when you attempt to select more unique random elements from a sequence than are actually available.
This guide explains the cause of this error and provides effective solutions, including using random.choices() and error handling.
Understanding the Error: random.sample() and Uniqueness
The random.sample(population, k) function returns a list of k unique elements chosen randomly from the population sequence (like a list or tuple). Because it guarantees uniqueness (no element is chosen more than once), the requested sample size (k) cannot be larger than the number of elements available in the population. As the message suggests, the same error is raised if k is negative.
import random
a_list = ['tutorial', 'reference', 'com'] # Population size is 3
# This causes the error because we ask for 4 *unique* elements from a list of 3
# random_elements = random.sample(a_list, 4)
# ⛔️ ValueError: Sample larger than population or is negative
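To see both conditions named in the message, the following minimal sketch wraps the call in a try...except block. The helper name try_sample is hypothetical and used only for illustration; it is not part of the random module.

```python
import random

a_list = ['tutorial', 'reference', 'com']

def try_sample(population, k):
    """Return a sample, or None if k is out of range (hypothetical helper)."""
    try:
        return random.sample(population, k)
    except ValueError as exc:
        # Both "too large" and "negative" end up here with the same message
        print(f'k={k}: {exc}')
        return None

too_large = try_sample(a_list, 4)   # k exceeds len(a_list)
negative = try_sample(a_list, -1)   # k is negative
valid = try_sample(a_list, 3)       # k == len(a_list) is allowed
```

Note that asking for exactly len(a_list) elements is valid and returns every element in a random order.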
Solution 1: Use random.choices() for Sampling with Replacement (Recommended)
If you need to select k random elements and don't require them to be unique (meaning elements can be picked multiple times), use random.choices():
import random
a_list = ['tutorial', 'reference', 'com']
# Select 4 elements, allowing duplicates (sampling with replacement)
random_elements = random.choices(a_list, k=4)
print(random_elements) # Output (Example): ['tutorial', 'com', 'tutorial', 'reference']
random.choices(population, k=N) returns a list of k elements chosen from the population with replacement. This means the same element can appear multiple times in the result, and k can be larger than the population size.
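The difference between the two functions can be checked with a short comparison. This is only a sketch; the variable names are illustrative, and the actual elements returned vary from run to run.

```python
import random

a_list = ['tutorial', 'reference', 'com']

# With replacement: k may exceed the population size, duplicates are expected
with_replacement = random.choices(a_list, k=10)

# Without replacement: k is capped at len(a_list), every element is distinct
without_replacement = random.sample(a_list, k=3)

print(len(with_replacement))            # 10
print(len(set(with_replacement)) <= 3)  # True (only 3 distinct values exist)
print(len(set(without_replacement)))    # 3
```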
Solution 2: Use min() to Limit Sample Size
If you must use random.sample() (e.g., you strictly need unique elements) but want to avoid the error if k might exceed the population size, you can limit k using the min() function:
import random
a_list = ['tutorial', 'reference', 'com']
requested_sample_size = 4
# Ensure the sample size doesn't exceed the list length
actual_sample_size = min(requested_sample_size, len(a_list))
random_elements = random.sample(a_list, actual_sample_size)
print(random_elements) # Output (Example, unique): ['com', 'tutorial', 'reference']
min(requested_sample_size, len(a_list)) ensures that the second argument passed to random.sample() is never larger than the number of elements in a_list.
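If this pattern is needed in several places, it could be wrapped in a small function. The sketch below assumes a hypothetical helper named sample_up_to (not part of the standard library) and additionally clamps negative values to zero, which min() alone does not do.

```python
import random

def sample_up_to(population, k):
    """Return up to k unique elements (hypothetical helper, not in random)."""
    # Clamp k into the valid range 0..len(population)
    k = max(0, min(k, len(population)))
    return random.sample(population, k)

a_list = ['tutorial', 'reference', 'com']

print(len(sample_up_to(a_list, 4)))  # 3
print(sample_up_to(a_list, -2))      # []
print(sample_up_to([], 5))           # []
```

Be aware that clamping silently returns fewer elements than requested, so callers that depend on an exact count should check the length of the result.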
Getting a Single Random Element with random.choice()
If you only need one random element from a sequence, use the simpler random.choice():
import random
a_list = ['tutorial', 'reference', 'com']
random_element = random.choice(a_list)
print(random_element) # Output (Example): reference
Handling Empty Sequences with random.choice()
random.choice() raises an IndexError if the sequence is empty. Use a try...except block to handle this:
import random
a_list = [] # Empty list
try:
    random_element = random.choice(a_list)
    print(random_element)
except IndexError:
    print('The sequence is empty') # Output: The sequence is empty
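As an alternative to catching the exception, the sequence can be checked for emptiness before calling random.choice(). The following is a minimal sketch; choice_or_default is a hypothetical helper name, and the truthiness check assumes a built-in sequence such as a list or tuple.

```python
import random

def choice_or_default(seq, default=None):
    """Return a random element, or default if seq is empty (hypothetical helper)."""
    # Empty lists and tuples are falsy, so random.choice() is never called on them
    return random.choice(seq) if seq else default

print(choice_or_default([]))                   # None
print(choice_or_default([], default='n/a'))    # n/a
print(choice_or_default(['only']))             # only
```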
Conclusion
The ValueError: Sample larger than population or is negative specifically occurs with random.sample() when requesting more unique items than available.
- The best solution is often to use random.choices() if you allow duplicate selections (sampling with replacement).
- If unique items are required, ensure your requested sample size (k) doesn't exceed the population size, potentially by using min(k, len(population)).
- For selecting a single random item, random.choice() is the appropriate function.