How to Extract Integers from User Input using Regex in Python
Processing user input often involves cleaning unstructured text to extract specific data types, such as integers. While you could iterate through characters manually, Python's re (regular expression) module provides a more concise and readable solution.
This guide explains how to use regular expressions to find, extract, and format integers from any given string.
Understanding the Regex Pattern
To extract numbers from mixed text (e.g., "Item 12 costs $50"), we need a pattern that identifies digits.
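- \d: Matches any decimal digit (0-9; in Python 3 string patterns it also matches digits from other scripts unless the re.ASCII flag is used).
- +: Quantifier meaning "one or more" of the preceding element.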
Therefore, the pattern r"\d+" matches sequences of one or more digits.
Using \d alone would match 1 and 2 separately in the number 12. Using \d+ ensures 12 is treated as a single integer.
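The difference between the two patterns can be seen in a minimal sketch like the one below. The variable names (sample, single_digits, whole_numbers) are illustrative only and are not part of find_int.py.

```python
import re

sample = "Item 12 costs $50"

# \d matches one digit at a time, so multi-digit numbers are split apart.
single_digits = re.findall(r"\d", sample)

# \d+ matches each run of consecutive digits as a single unit.
whole_numbers = re.findall(r"\d+", sample)

print(single_digits)  # ['1', '2', '5', '0']
print(whole_numbers)  # ['12', '50']
```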
Step 1: Setting Up the Script
Regular expressions are not loaded by default. You must import the standard re module.
Create a file named find_int.py and add the import statement.
import re
# No external installation required; 're' is part of the standard library.
Step 2: Implementing Extraction Logic
We use re.findall() to locate all non-overlapping matches of the pattern in the string. It returns a list of strings.
import re
text = "Order ID: 5592, Qty: 3"
# ⛔️ Naive approach: iterating and checking isdigit()
# This is verbose, and multi-digit numbers (like 5592) need manual buffering
digits = []
temp = ""
for char in text:
    if char.isdigit():
        temp += char
    else:
        if temp:
            digits.append(temp)
        temp = ""
if temp:
    digits.append(temp)  # Flush a number that ends the string (here, "3")
# ✅ Correct: Using Regex
# Returns a list: ['5592', '3']
integers = re.findall(r"\d+", text)
Step 3: Handling Input and Output
The goal is to return a space-separated string of numbers found in the input.
def extract_integers(user_input):
    # Find all sequences of digits
    integers_list = re.findall(r"\d+", user_input)
    # Join list elements into a single string separated by spaces
    return " ".join(integers_list)
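As a quick check of this function, the sketch below redefines it in compact form so the snippet runs on its own, then calls it on a few sample strings. The input "no digits here" is an assumed example showing that an input without digits yields an empty string.

```python
import re

def extract_integers(user_input):
    # Same logic as the function above: find digit runs, join with spaces.
    return " ".join(re.findall(r"\d+", user_input))

print(extract_integers("a1b2c3d4"))              # 1 2 3 4
print(extract_integers("12 3ad5"))               # 12 3 5
print(repr(extract_integers("no digits here")))  # ''
```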
Complete Code Solution
Here is the complete find_int.py script. It reads one line from standard input and prints the extracted numbers.
import re

def extract_integers(user_input):
    """
    Finds all integers in a string and returns them as a space-separated string.
    """
    # Pattern explanation:
    # \d -> Matches a decimal digit
    # +  -> Matches 1 or more repetitions
    integers = re.findall(r"\d+", user_input)
    # Join the list ['1', '2'] into "1 2"
    return " ".join(integers)

if __name__ == "__main__":
    try:
        # Get input from user (waits for typing in console)
        user_text = input()
        # Process and print
        result = extract_integers(user_text)
        print(result)
    except EOFError:
        pass  # Input stream was closed before any line was read
Testing the Script
Test Case 1
Input: a1b2c3d4
$ python find_int.py
a1b2c3d4
Output:
1 2 3 4
Test Case 2
Input: 12 3ad5
$ python find_int.py
12 3ad5
Output:
12 3 5
re.findall returns strings (['12', '3', '5']). If you need to perform mathematical operations on these results later, remember to convert them using int() or float().
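A minimal sketch of that conversion, assuming the input from Test Case 2. The names matches and numbers are illustrative and do not appear in find_int.py.

```python
import re

matches = re.findall(r"\d+", "12 3ad5")  # list of strings: ['12', '3', '5']

# Convert each matched string to an int before doing arithmetic.
numbers = [int(m) for m in matches]

print(numbers)       # [12, 3, 5]
print(sum(numbers))  # 20
```

Note that int() drops leading zeros, so a match such as "007" becomes 7.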
Conclusion
Regular expressions are a concise and reliable way to extract simple patterns, such as integers, from text in Python.
- Import re to access regex tools.
- Use r"\d+" to match whole integers (sequences of digits).
- Use re.findall() to extract all occurrences into a list.
- Use join() to format the output list back into a string.