How to Check if a Python String Contains a Substring
by Martin Breuss
If you’re new to programming or come from a programming language other than Python, you may be looking for the best way to check whether a string contains another string in Python. Identifying such substrings comes in handy when you’re working with text content from a file or after you’ve received user input. You may want to perform different actions in your program depending on whether a substring is present or not.
In this tutorial, we’ll focus on the most Pythonic way to tackle this task: the in membership operator. Additionally, we’ll learn how to identify the right string methods for related, but different, use cases. Finally, we’ll also learn how to find substrings in pandas columns, which is helpful if you need to search through data from a CSV file.
How to Confirm That a Python String Contains Another String
If you need to check whether a string contains a substring, use Python’s in membership operator. This is the recommended way to confirm that a substring exists in a string, and it gives you a quick and readable check.
For example, if we have a string raw_file_content and we want to check whether it contains the substring “secret”, we can do the following:
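As a minimal sketch, assuming raw_file_content holds some made-up text (the content below is invented for illustration):

```python
# Hypothetical file content; any string works here.
raw_file_content = "Hi there and welcome. This is a hidden file with a secret inside."

# The in operator evaluates to a Boolean.
contains_secret = "secret" in raw_file_content
print(contains_secret)  # True
```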
The in operator returns True if Python finds the substring “secret” in raw_file_content, and False otherwise. We can use this intuitive syntax in conditional statements to make decisions in our code.
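One way this might look inside a conditional is sketched below. The function name classify and the sample strings are hypothetical and only serve the illustration:

```python
def classify(text: str) -> str:
    """Return a label depending on whether text mentions 'secret'."""
    if "secret" in text:
        return "found"
    return "not found"


print(classify("This file holds a secret."))  # found
print(classify("Nothing to see here."))       # not found
```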
If we want to check whether the substring is not in the string, we can use the not in operator:
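A short sketch of the negated check, again with made-up content for raw_file_content:

```python
# Hypothetical content that does not mention the word we look for.
raw_file_content = "Hi there and welcome."

# not in is the negation of in: True when the substring is absent.
secret_missing = "secret" not in raw_file_content
print(secret_missing)  # True
```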
Generalize Your Check by Removing Case Sensitivity
By default, the in operator is case sensitive. If the case of the substring doesn’t match the string exactly, the check returns False. To make the check case insensitive, convert both the string and the substring to lowercase before comparing them:
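The following sketch contrasts the two checks. The strings are invented for illustration:

```python
# Hypothetical content with inconsistent capitalization.
raw_file_content = "This file holds a SECRET value."
substring = "Secret"

# Case-sensitive check: the capitalization differs, so this is False.
strict_match = substring in raw_file_content
print(strict_match)  # False

# Lowercase both sides to ignore case.
relaxed_match = substring.lower() in raw_file_content.lower()
print(relaxed_match)  # True
```

For text that goes beyond ASCII, str.casefold() is a more aggressive alternative to lower() that is designed for caseless matching.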
Because the lower() method is applied to both the string and the substring, differences in capitalization no longer affect the result.
Learn More About the Substring
If you want to know not only whether a string contains a substring but also where the substring is located, you can use the find() method. It returns the index of the first occurrence of the substring in the string, or -1 if the substring is not found. Here’s an example:
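A minimal sketch follows. The value of sentence is an assumption, chosen so that “fox” starts at index 10:

```python
# Assumed example sentence; "fox" starts at index 10 (counting from 0).
sentence = "The quick fox jumps over the lazy dog"

fox_index = sentence.find("fox")
print(fox_index)  # 10

# A substring that does not occur yields -1 rather than raising an error.
cat_index = sentence.find("cat")
print(cat_index)  # -1
```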
In this example, the find() method returns 10 because “fox” starts at index 10 of the string sentence. If the substring is not found, the method returns -1.
Find a Substring With Conditions Using Regex
If you need to find a substring that matches a specific pattern or condition, regular expressions (regex) can be a powerful tool. Python’s re module provides functions for working with regular expressions. Here’s an example that demonstrates how to find all substrings that start with a capital letter and end with a period:
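A sketch using the pattern r"\b[A-Z][^.]*\." is shown below. The sample text is made up for illustration:

```python
import re

# Hypothetical text: the middle sentence contains no uppercase letter.
text = "Python is fun. it has many uses. Regex can help."

# Word boundary, one uppercase letter, any non-period characters, a period.
pattern = r"\b[A-Z][^.]*\."

matches = re.findall(pattern, text)
print(matches)  # ['Python is fun.', 'Regex can help.']
```

Note that the pattern keys on any uppercase letter at the start of a word, so a capitalized word in the middle of a sentence would also start a match.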
In this example, the findall() function from the re module finds all non-overlapping matches of the pattern r"\b[A-Z][^.]*\.". The pattern matches a word boundary (\b), followed by an uppercase letter ([A-Z]), any number of non-period characters ([^.]*), and a literal period (\.).
Find a Substring in a pandas DataFrame Column
If you’re working with tabular data and want to find substrings in one or more columns of a pandas DataFrame, you can use the str.contains() method. This method returns a Boolean Series indicating whether each element of the column contains the substring. Here’s an example:
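The sketch below assumes pandas is installed and uses a small made-up DataFrame with a “Name” column. The names and ages are invented for illustration:

```python
import pandas as pd

# Hypothetical data; in practice this might come from pd.read_csv().
df = pd.DataFrame(
    {
        "Name": ["John", "Joanna", "Maria", "joseph"],
        "Age": [28, 34, 45, 19],
    }
)

# Boolean Series: True where the name contains "Jo" (case sensitive).
# regex=False treats the search term as plain text, not as a pattern.
mask = df["Name"].str.contains("Jo", regex=False)
print(mask.tolist())  # [True, True, False, False]

# Boolean indexing keeps only the matching rows.
matching_rows = df[mask]
print(matching_rows["Name"].tolist())  # ['John', 'Joanna']
```

By default, str.contains() interprets its argument as a regular expression and is case sensitive; passing case=False makes the match ignore case.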
In this example, the str.contains() method creates a Boolean Series indicating whether each element of the “Name” column contains the substring “Jo”. The resulting Boolean Series is then used to filter the DataFrame through Boolean indexing.
Key Takeaways
To check whether a Python string contains a substring, use the in membership operator. The in operator is case sensitive by default, but you can make the check case insensitive by converting both the string and the substring to lowercase. If you need more information about the substring, such as its position within the string, use the find() method. If you need to find substrings that match specific patterns or conditions, regular expressions (regex) can be a powerful tool. And if you’re working with tabular data in a pandas DataFrame, you can use the str.contains() method to find substrings in specific columns.
Now that you have a good understanding of how to check if a Python string contains a substring, you can use this knowledge to perform various string operations and make your code more powerful and flexible.