How to Check if a Python String Contains a Substring
If you’re new to programming or come from a programming language other than Python, you may be looking for the best way to check whether a string contains another string in Python. Identifying such substrings comes in handy when you’re working with text content from a file or after you’ve received user input. You may want to perform different actions in your program depending on whether a substring is present or not.
In this tutorial, you’ll focus on the most Pythonic way to tackle this task, using the in membership operator. Additionally, you’ll learn how to identify the right string methods for related, but different, use cases. Finally, you’ll also learn how to find substrings in pandas columns. This is helpful if you need to search through data from a CSV file. You could use the approach that you’ll learn in the next section, but if you’re working with tabular data, it’s best to load the data into a pandas DataFrame and search for substrings in pandas.
How to Confirm That a Python String Contains Another String
If you need to check whether a string contains a substring, use Python’s in membership operator. In Python, this is the recommended way to confirm the existence of a substring in a string:
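As a minimal sketch of this check: the variable name raw_file_content is the one the tutorial uses, but the text assigned to it here is an assumption made up for illustration.

```python
# Hypothetical file content; only the variable name comes from the tutorial.
raw_file_content = """Hi there and welcome.
This is a special hidden file with a SECRET secret.
I don't want to tell you The Secret,
but I do want to secretly tell you that I have one."""

# The expression is True because "secret" appears somewhere in the string.
contains_secret = "secret" in raw_file_content
print(contains_secret)  # True
```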
The in membership operator gives you a quick and readable way to check whether a substring is present in a string. You may notice that the line of code almost reads like English.
Note: If you want to check whether the substring is not in the string, then you can use not in:
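A short sketch of the negated check, assuming a made-up value for raw_file_content that contains the word “secret”:

```python
# Hypothetical content for illustration.
raw_file_content = "This is a special hidden file with a secret."

# `not in` is the negation of `in`.
secret_is_missing = "secret" not in raw_file_content
print(secret_is_missing)  # False
```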
Because the substring “secret” is present in raw_file_content, the not in operator returns False.
When you use in, the expression returns a Boolean value:
True if Python found the substring
False if Python didn’t find the substring
You can use this intuitive syntax in conditional statements to make decisions in your code:
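For instance, a conditional along these lines would work. The string content and the message text are assumptions chosen for illustration:

```python
# Hypothetical content for illustration.
raw_file_content = "This is a special hidden file with a secret."

message = "Nothing to see here."
if "secret" in raw_file_content:
    # This branch runs only when the substring is present.
    message = "Found!"

print(message)  # Found!
```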
In this code snippet, you use the membership operator to check whether “secret” is a substring of raw_file_content. If it is, then you’ll print a message to the terminal.
Generalize Your Check by Removing Case Sensitivity
By default, Python’s in operator performs a case-sensitive search. This means that it will treat uppercase and lowercase letters as different characters. If you want to perform a case-insensitive search, you can convert both the string and the substring to lowercase using the lower() string method:
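A minimal sketch of the lowercase comparison, again with made-up content in raw_file_content. The name substring is a hypothetical helper variable:

```python
# Hypothetical content: "secret" never appears in all-lowercase form.
raw_file_content = "This file holds a SECRET and The Secret."

# Case-sensitive check fails because the casing differs.
print("secret" in raw_file_content)  # False

# Lowercasing both sides makes the comparison ignore case.
substring = "Secret"
found = substring.lower() in raw_file_content.lower()
print(found)  # True
```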
Here, raw_file_content.lower() converts the entire string to lowercase. As a result, the in membership operator will now perform a case-insensitive search for the substring “secret”.
Learn More About the Substring
Now that you know how to confirm that a Python string contains another string, it’s helpful to know more about the substring itself. In Python, a substring is a contiguous sequence of characters within a string. You can extract substrings from strings using slicing.
For example, let’s say you have a string and want to extract a substring that consists of its first five characters using slicing:
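The original string isn’t shown, so this sketch assumes a hypothetical value, text, whose first five characters spell “Hello”:

```python
# Hypothetical example string.
text = "Hello, World!"

# Indices 0 through 4; the stop index 5 is excluded.
substring = text[0:5]
print(substring)  # Hello
```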
In this case, substring will contain “Hello”. Slicing in Python follows a [start:stop:step] syntax, where:
start is the index where the slice starts (inclusive)
stop is the index where the slice stops (exclusive)
step is the stride between selected indices, so a step of 2 takes every second character
You can omit the start and step values to use the default values of 0 and 1, respectively:
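As a quick illustration, using the same hypothetical string as before:

```python
# Hypothetical example string.
text = "Hello, World!"

# Omitting start and step is equivalent to text[0:5:1].
first_five = text[:5]
print(first_five)  # Hello
```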
You can also omit the stop value to include all characters until the end of the string:
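A sketch with the same hypothetical string, where the start index 7 is chosen only for illustration:

```python
# Hypothetical example string.
text = "Hello, World!"

# Everything from index 7 to the end of the string.
rest = text[7:]
print(rest)  # World!
```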
The length of a string can be obtained using the len() function:
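For example, with the hypothetical string used above:

```python
# Hypothetical example string.
text = "Hello, World!"

# len() counts every character, including punctuation and spaces.
length = len(text)
print(length)  # 13
```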
Find a Substring With Conditions Using Regex
In some cases, you may need to find a substring that matches specific conditions. This is where regular expressions, or regex, come in handy. Python provides the re module for working with regular expressions.
For example, let’s say you want to check whether the word “Python” appears in a longer string. You can use the re.search() function, which finds the first occurrence:
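Here is a minimal sketch using the pattern discussed in this section. The sample sentence in text is an assumption made up for illustration:

```python
import re

# Hypothetical sample text.
text = "Python is popular, and many Pythonistas enjoy writing Python."

# A raw string keeps \b as a regex word boundary instead of a backspace.
match = re.search(r"\bPython\b", text)
if match:
    print(match.group())  # Python
    print(match.span())   # (0, 6)
```

Note that re.search() stops at the first match, so it answers whether the word occurs and where it first occurs, not how many times it appears.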
Here, the regex pattern \bPython\b specifies that you want to find the word “Python” as a whole word. The \b sequences represent word boundaries. The re.search() function returns a match object if the pattern is found, or None otherwise.
You can also use the re.findall() function to find all occurrences of a pattern in a string:
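A short sketch with the same whole-word pattern, using made-up sample text:

```python
import re

# Hypothetical sample text.
text = "Python is popular, and many Pythonistas enjoy writing Python."

# "Pythonistas" is skipped because no word boundary follows "Python" there.
matches = re.findall(r"\bPython\b", text)
print(matches)  # ['Python', 'Python']
```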
This returns a list of all matches found in the string.
Find a Substring in a pandas DataFrame Column
If you’re working with tabular data in Python, particularly with the pandas library, you may want to find substrings in a specific column of a DataFrame. This is helpful if you need to search through data from a CSV file.
To find a substring in a pandas DataFrame column, you can use the str.contains() method. This method allows you to search for a substring in each value of a specified column.
For example, let’s say you have a DataFrame with an “occupation” column and want to check whether a substring is present in each value of that column:
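The original DataFrame isn’t shown, so this sketch assumes a small hypothetical one. Only the “occupation” column name and the substring “dev” come from the tutorial. The names and job titles are invented:

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame(
    {
        "name": ["Ana", "Ben", "Caro"],
        "occupation": ["web developer", "data analyst", "devops engineer"],
    }
)

# One Boolean per row: does the occupation contain "dev"?
contains_dev = df["occupation"].str.contains("dev")
print(contains_dev.tolist())  # [True, False, True]
```

By default, str.contains() interprets its argument as a regular expression and is case-sensitive. Passing regex=False or case=False changes that behavior if you need a literal or case-insensitive match.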
This will return a Series of Boolean values, where True indicates that the substring was found in the corresponding value of the column, and False indicates that it wasn’t.
You can use this Series to filter the DataFrame and only keep the rows where the substring was found:
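A minimal sketch of that filtering step, using a hypothetical DataFrame with invented rows:

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame(
    {
        "name": ["Ana", "Ben", "Caro"],
        "occupation": ["web developer", "data analyst", "devops engineer"],
    }
)

# Use the Boolean Series as a mask to keep only the matching rows.
filtered_df = df[df["occupation"].str.contains("dev")]
print(filtered_df["name"].tolist())  # ['Ana', 'Caro']
```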
In this example, filtered_df will only contain rows where the “occupation” column contains the substring “dev”.
Key Takeaways
In this tutorial, you learned how to check if a Python string contains a substring using the in membership operator. You also learned how to generalize your check by removing case sensitivity. Additionally, you learned more about substrings, slicing, regular expressions, and finding substrings in pandas DataFrame columns.
By mastering these techniques, you’ll be able to identify and work with substrings in a wide range of use cases in Python.