Effortlessly Explore String Containment in Python
How to Check if a Python String Contains a Substring
by Martin Breuss
If you’re new to programming or come from a programming language other than Python, you may be looking for the best way to check whether a string contains another string in Python. Identifying such substrings comes in handy when you’re working with text content from a file or after you’ve received user input. You may want to perform different actions in your program depending on whether a substring is present or not.
In this tutorial, you’ll focus on the most Pythonic way to tackle this task, using the membership operator `in`. Additionally, you’ll learn how to identify the right string methods for related, but different, use cases. Finally, you’ll also learn how to find substrings in pandas columns. This is helpful if you need to search through data from a CSV file. You could use the approach that you’ll learn in the next section, but if you’re working with tabular data, it’s best to load the data into a pandas DataFrame and search for substrings in pandas.
How to Confirm That a Python String Contains Another String
If you need to check whether a string contains a substring, use Python’s membership operator `in`. In Python, this is the recommended way to confirm the existence of a substring in a string:
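A minimal sketch of such a check, assuming a short made-up sample text stored in `raw_file_content` (the variable name this tutorial uses further down):

```python
# Sample text is an assumption, made up for illustration.
raw_file_content = "This is a special hidden file with a secret."

# The expression evaluates to a Boolean.
contains_secret = "secret" in raw_file_content
print(contains_secret)  # True
```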
The `in` membership operator gives you a quick and readable way to check whether a substring is present in a string.
Note: If you want to check whether the substring is not in the string, then you can use `not in`:
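As a small sketch, again with a made-up sample text, the negated check could look like this:

```python
# Sample text is an assumption, made up for illustration.
raw_file_content = "This is a special hidden file with a secret."

# "secret" is present, so the negated check is False.
secret_missing = "secret" not in raw_file_content
print(secret_missing)  # False

# "password" is absent, so the negated check is True.
password_missing = "password" not in raw_file_content
print(password_missing)  # True
```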
When you use `in`, the expression returns a Boolean value: `True` if Python found the substring, and `False` if Python didn’t find the substring.
You can use this intuitive syntax in conditional statements to make decisions in your code:
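One way such a conditional might look, assuming a made-up sample text (the `message` variable and the "Not found." branch are added here only for illustration):

```python
# Sample text is an assumption, made up for illustration.
raw_file_content = "This is a special hidden file with a secret."

# Branch on the result of the membership check.
if "secret" in raw_file_content:
    message = "Found!"
else:
    message = "Not found."

print(message)  # Found!
```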
In this code snippet, you use the membership operator to check whether “secret” is a substring of `raw_file_content`. If it is, then you’ll print the message “Found!”.
Generalize Your Check by Removing Case Sensitivity
By default, the membership operator `in` performs a case-sensitive search. This means that it matches only when the substring appears in the searched string with exactly the same casing.
However, sometimes you may want to perform a case-insensitive search, for example when you’re checking user input whose capitalization you can’t predict. In this case, you can convert both strings to the same case, either lowercase or uppercase, before performing the membership check.
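A minimal sketch of both checks, assuming a made-up sample text that contains “secret” only in lowercase:

```python
# Sample text is an assumption, made up for illustration.
raw_file_content = "This is a special hidden file with a secret."

# Case-sensitive check: the uppercase form does not appear in the text.
first_check = "SECRET" in raw_file_content
print(first_check)  # False

# Normalize both strings to lowercase, then check again.
second_check = "SECRET".lower() in raw_file_content.lower()
print(second_check)  # True
```

For text that goes beyond ASCII, `str.casefold()` is a more aggressive alternative to `.lower()` for this kind of comparison.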
In the code snippet above, the first membership check returns `False` because “SECRET” is not present in `raw_file_content`. However, after converting both strings to lowercase using the `.lower()` method, the second membership check returns `True`. The operator itself is still case-sensitive; the check succeeds because both strings now share the same casing.
Learn More About the Substring
In addition to checking if a Python string contains a substring, you may also want to find its index or count how many times it appears. Python provides several string methods that can help you achieve this.
Finding the Index of a Substring
If you want to find the starting index of a substring in a string, you can use the `.index()` method:
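A short sketch, assuming the sentence is the familiar pangram below (the sentence is an assumption, chosen so that “jumps” starts at index 20):

```python
# The sentence is an assumption, made up for illustration.
sentence = "The quick brown fox jumps over the lazy dog"

# .index() returns the position of the first character of the match.
index = sentence.index("jumps")
print(index)  # 20

# .index() raises ValueError when the substring is missing.
try:
    sentence.index("cat")
except ValueError:
    print("'cat' is not in the sentence")

# .find() is the non-raising alternative: it returns -1 instead.
missing = sentence.find("cat")
print(missing)  # -1
```

Because `.index()` raises an exception for a missing substring, `.find()` can be the more convenient choice when absence is an expected outcome.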
In the code snippet above, `.index()` returns the index at which the substring “jumps” starts in the sentence. In this case, the value of `index` will be `20`.
Counting the Occurrences of a Substring
To count how many times a substring appears in a string, Python provides the `.count()` method:
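A minimal sketch, assuming the same made-up sentence and lowercasing both strings before counting:

```python
# The sentence is an assumption, made up for illustration.
sentence = "The quick brown fox jumps over the lazy dog"
substring = "the"

# Lowercase both strings so that "The" and "the" are both counted.
count = sentence.lower().count(substring.lower())
print(count)  # 2

# Without the conversion, only the lowercase "the" is counted.
case_sensitive_count = sentence.count(substring)
print(case_sensitive_count)  # 1
```

Keep in mind that `.count()` counts non-overlapping occurrences anywhere in the string, so a match inside a longer word such as “other” would also be counted.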
In the code snippet above, `.count()` returns the number of times the substring “the” appears in the sentence. Note that the search is case-insensitive because both the sentence and substring have been converted to lowercase using the `.lower()` method. In this case, the value of `count` will be `2`.
By leveraging these additional string methods, you can gain more control and flexibility in working with substrings in Python.
Find a Substring With Conditions Using Regex
Another powerful tool for finding substrings in Python is regular expressions, or regex. Regex allows you to define complex search patterns using special characters and syntax. It provides a more advanced and flexible way to find substrings based on various conditions.
To use regex for substring matching, import Python’s `re` module. Here’s an example of how to use regex to search for substrings that start with a specific prefix:
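A minimal sketch, assuming a made-up sample sentence and a pattern that matches whole words beginning with “s” (both the sentence and the pattern are assumptions for illustration):

```python
import re

# Sample sentence is an assumption, made up for illustration.
string = "The sun is shining and the birds are singing"

# \b anchors the match at a word boundary, so the "s" must start a word;
# \w* then consumes the rest of that word.
matches = re.findall(r"\bs\w*", string)
print(matches)  # ['sun', 'shining', 'singing']
```

The word boundary is what keeps “is” and “birds” out of the result, even though both contain the letter “s”.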
In the code snippet above, `re.findall()` returns a list of all words in `string` that start with the letter “s”. In this case, the value of `matches` will be `['sun', 'shining', 'singing']`.
Regex provides a powerful and flexible way to search for substrings based on various conditions, such as prefixes, suffixes, patterns, or even complex rules. It’s a valuable tool to have in your Python toolkit when you need to perform more advanced substring matching.
Find a Substring in a pandas DataFrame Column
If you need to search for substrings in a pandas DataFrame column, the process is slightly different. First, you’ll need to load your data into a DataFrame using pandas. Once you have your DataFrame, you can use the `.str.contains()` method to check whether each element in the column contains a specific substring.
Here’s an example using a pandas DataFrame with a column named “text”:
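A minimal sketch, assuming pandas is installed and using four made-up rows (the row contents are assumptions, built in memory rather than loaded from a CSV file):

```python
import pandas as pd

# Row contents are assumptions, made up for illustration.
df = pd.DataFrame(
    {
        "text": [
            "Hello world",
            "Python strings",
            "Substring search",
            "This is an EXAMPLE",
        ]
    }
)

# case=False makes the match ignore casing; the result is a Boolean Series.
matches = df["text"].str.contains("example", case=False)
print(matches.tolist())  # [False, False, False, True]

# The Boolean Series can be used to filter the DataFrame.
matching_rows = df[matches]
print(len(matching_rows))  # 1
```

Note that `.str.contains()` interprets the pattern as a regular expression by default; passing `regex=False` makes it match the text literally.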
In the code snippet above, `df["text"].str.contains()` returns a Boolean Series that indicates whether each element in the “text” column contains the substring “example”. The `case=False` argument makes the search case-insensitive. In this case, `matches` will hold the values `[False, False, False, True]`, indicating that only the last row contains the substring “example”.
By using the `.str.contains()` method in pandas, you can easily search for substrings within specific columns of your DataFrame.
Key Takeaways
- Python’s membership operator `in` is the recommended way to check whether a string contains a substring.
- You can remove case sensitivity by converting both strings to lowercase or uppercase before performing the membership check.
- Python provides additional string methods, such as `.index()` and `.count()`, to find the index or count the occurrences of a substring.
- Regular expressions (regex) provide a more advanced and flexible way to find substrings based on various conditions.
- In pandas, you can use the `.str.contains()` method to search for substrings within specific columns of a DataFrame.
With these techniques, you can confidently check whether a Python string contains a substring and perform various operations based on the result.