Python NumPy: How to Find the K Smallest Values in a NumPy Array in Python
When analyzing numerical data, a common task is finding the k smallest values in an array - whether it's identifying the lowest temperatures, cheapest products, or smallest distances. While you could sort the entire array and slice it, NumPy and Python offer more efficient approaches that avoid the overhead of a full sort.
This guide covers four methods for finding the k smallest elements, ranging from the most performant to the most straightforward.
Using np.partition() (Recommended)
np.partition() is the fastest method for finding k smallest values because it performs a partial sort. It rearranges the array so that the element at the k-th position is where it would be in a fully sorted array, all smaller elements are moved before it, and all larger elements are moved after it - without sorting either group internally.
import numpy as np
arr = np.array([23, 12, 1, 3, 4, 5, 6])
k = 4
result = np.partition(arr, k)[:k]
print("Array:", arr)
print(f"{k} smallest elements:", result)
Output:
Array: [23 12 1 3 4 5 6]
4 smallest elements: [1 3 4 5]
np.partition() guarantees that the first k elements are the k smallest, but their internal order is not guaranteed. If you need them sorted, add an extra sort on the result:
result = np.sort(np.partition(arr, k)[:k])
print("Sorted k smallest:", result)
This is still faster than sorting the entire array since you're only sorting k elements.
Why np.partition() Is Faster
| Operation | Time Complexity |
|---|---|
np.sort() (full sort) | O(n log n) |
np.partition() (partial sort) | O(n) |
For large arrays where you only need a small number of elements, the difference can be significant.
Using np.argpartition() for Indices
np.argpartition() works like np.partition(), but instead of returning the values, it returns the indices of the k smallest elements. This is useful when you need to know where the smallest values are located in the original array.
import numpy as np
arr = np.array([23, 12, 1, 3, 4, 5, 6])
k = 4
indices = np.argpartition(arr, k)[:k]
values = arr[indices]
print("Array:", arr)
print("Indices of k smallest:", indices)
print(f"{k} smallest values:", values)
Output:
Array: [23 12 1 3 4 5 6]
Indices of k smallest: [2 3 4 5]
4 smallest values: [1 3 4 5]
Getting Sorted Results with Indices
If you need both the indices and values in sorted order:
import numpy as np
arr = np.array([23, 12, 1, 3, 4, 5, 6])
k = 4
indices = np.argpartition(arr, k)[:k]
# Sort the k indices by their corresponding values
sorted_order = indices[np.argsort(arr[indices])]
print("Sorted indices:", sorted_order)
print("Sorted values:", arr[sorted_order])
Output:
Sorted indices: [2 3 4 5]
Sorted values: [1 3 4 5]
Using np.sort() (Simple but Slower)
The most straightforward approach is to fully sort the array and then slice the first k elements. While less efficient than np.partition(), it's easy to read and guarantees the result is in sorted order.
import numpy as np
arr = np.array([23, 12, 1, 3, 4, 5, 6])
k = 4
result = np.sort(arr)[:k]
print("Array:", arr)
print(f"{k} smallest elements (sorted):", result)
Output:
Array: [23 12 1 3 4 5 6]
4 smallest elements (sorted): [1 3 4 5]
np.sort() for large arrays when k is smallSorting the entire array takes O(n log n) time, even if you only need a handful of values. For an array with millions of elements where k is small (e.g., k=10), np.partition() will be dramatically faster:
import numpy as np
# Large array with 10 million elements
large_arr = np.random.randint(0, 1_000_000, size=10_000_000)
k = 5 # Number of smallest elements we want
# Fast approach: O(n) time using np.partition
# np.partition rearranges the array so that the first k elements are the smallest (not fully sorted)
result_fast = np.partition(large_arr, k)[:k]
# Slow approach: O(n log n) time using np.sort
# Sorts all 10 million elements, then takes the first k
result_slow = np.sort(large_arr)[:k]
print(result_fast)
print(result_slow)
Use np.sort() only when the array is small or when you need the entire sorted array anyway.
Using heapq.nsmallest() (Python Standard Library)
Python's built-in heapq module provides nsmallest(), which uses a heap data structure to efficiently find the k smallest elements. This works on any iterable and doesn't require NumPy.
import numpy as np
import heapq
arr = np.array([23, 12, 1, 3, 4, 5, 6])
k = 4
result = heapq.nsmallest(k, arr.tolist())
print("Array:", arr)
print(f"{k} smallest elements:", result)
Output:
Array: [23 12 1 3 4 5 6]
4 smallest elements: [1, 3, 4, 5]
heapq.nsmallest() returns a sorted Python list, not a NumPy array. It requires converting the NumPy array to a list first (.tolist()), which adds overhead. For NumPy workflows, prefer np.partition() or np.argpartition().
Practical Example: Finding K Smallest in a 2D Array
In real-world scenarios, you might need the k smallest values from a multi-dimensional array. Here's how to handle that:
import numpy as np
# 2D array of sensor readings
data = np.array([
[45, 12, 67],
[3, 89, 23],
[56, 7, 41]
])
k = 4
# Flatten, find k smallest, then sort
flat = data.ravel()
result = np.sort(np.partition(flat, k)[:k])
print("2D Array:")
print(data)
print(f"\n{k} smallest values across entire array:", result)
Output:
2D Array:
[[45 12 67]
[ 3 89 23]
[56 7 41]]
4 smallest values across entire array: [ 3 7 12 23]
Comparison of Approaches
| Method | Time Complexity | Returns Sorted? | Returns Indices? | Best For |
|---|---|---|---|---|
np.partition() | O(n) | ❌ | ❌ | Large arrays (recommended) |
np.argpartition() | O(n) | ❌ | ✅ | When you need positions |
np.sort() | O(n log n) | ✅ | ❌ | Small arrays or full sort needed |
heapq.nsmallest() | O(n log k) | ✅ | ❌ | Non-NumPy workflows |
Finding the K Largest Values
All these methods work for finding the k largest values too - simply adjust the approach:
import numpy as np
arr = np.array([23, 12, 1, 3, 4, 5, 6])
k = 3
# K largest using np.partition()
result = np.partition(arr, -k)[-k:]
print(f"{k} largest elements:", np.sort(result))
Output:
3 largest elements: [ 6 12 23]
Using -k as the partition index places the k largest elements at the end of the partially sorted array.
Conclusion
For finding the k smallest values in a NumPy array, np.partition() is the best choice for most scenarios - it runs in linear time and scales efficiently to large datasets.
- Use
np.argpartition()when you also need the indices of those elements. - Fall back to
np.sort()for simplicity on small arrays. - Use
heapq.nsmallest()when working outside the NumPy ecosystem.
Remember that np.partition() doesn't guarantee sorted order within the k elements, so add a secondary sort on the result if ordering matters.