Effortlessly Check if Python String Contains Substring

How to Check if a Python String Contains a Substring

If you’re new to programming or come from a programming language other than Python, you may be looking for the best way to check whether a string contains another string in Python. Identifying such substrings comes in handy when you’re working with text content from a file or after you’ve received user input. You may want to perform different actions in your program depending on whether a substring is present or not.

In this tutorial, you’ll focus on the most Pythonic way to tackle this task, using the membership operator in. Additionally, you’ll learn how to identify the right string methods for related, but different, use cases. Finally, you’ll also learn how to find substrings in pandas columns. This is helpful if you need to search through data from a CSV file. You could use the approach that you’ll learn in the next section, but if you’re working with tabular data, it’s best to load the data into a pandas DataFrame and search for substrings in pandas.

How to Confirm That a Python String Contains Another String

If you need to check whether a string contains a substring, use Python’s membership operator in. In Python, this is the recommended way to confirm the existence of a substring in a string:

raw_file_content = """
Hi there and welcome.
This is a special hidden file with a SECRET secret.
I don't want to tell you The Secret,
but I do want to secretly tell you that I have one.
"""
"secret" in raw_file_content

The in membership operator gives you a quick and readable way to check whether a substring is present in a string. You may notice that the line of code almost reads like English.

If you want to check whether the substring is not in the string, then you can use not in:

"secret" not in raw_file_content

Because the substring “secret” is present in raw_file_content, the not in operator returns False.

When you use in, the expression returns a Boolean value:

  • True if Python found the substring
  • False if Python didn’t find the substring
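To see these Boolean results side by side, here's a minimal sketch. The shortened text string and the variable names are made up for illustration and aren't part of the example above:

```python
# Hypothetical sample text, shortened from the example file content.
text = "This is a special hidden file with a SECRET secret."

has_secret = "secret" in text            # True: the lowercase word is present
has_password = "password" in text        # False: no such substring
lacks_password = "password" not in text  # True: not in negates the check
has_title_case = "Secret" in text        # False: the check is case-sensitive

print(has_secret, has_password, lacks_password, has_title_case)
```

The last line shows that in compares characters exactly, so a difference in capitalization is enough to make the check fail.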

You can use this intuitive syntax in conditional statements to make decisions in your code:

if "secret" in raw_file_content:
    print("Found!")

In this code snippet, you use the membership operator to check whether “secret” is a substring of raw_file_content. If it is, then you’ll print a message to the terminal.
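If you also want to handle the case where the substring is missing, you could wrap the check in a small function. The following is only a sketch: describe() and its messages are hypothetical names chosen for this illustration:

```python
raw_file_content = """
Hi there and welcome.
This is a special hidden file with a SECRET secret.
I don't want to tell you The Secret,
but I do want to secretly tell you that I have one.
"""


def describe(text, substring):
    """Return a short message saying whether substring occurs in text."""
    if substring in text:
        return f"Found {substring!r}"
    return f"No {substring!r} here"


found_message = describe(raw_file_content, "secret")
missing_message = describe(raw_file_content, "password")
print(found_message)
print(missing_message)
```

Returning the message instead of printing it inside the function keeps the check easy to reuse and test.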

Generalize Your Check by Removing Case Sensitivity

If you want to check whether a string contains a substring regardless of the case, you can convert both strings to lowercase or uppercase using the lower() or upper() method. Here’s an example:

raw_file_content = """
Hi there and welcome.
This is a special hidden file with a SECRET secret.
I don't want to tell you The Secret,
but I do want to secretly tell you that I have one.
"""
"secret" in raw_file_content.lower()

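By converting raw_file_content to lowercase, you make the search case-insensitive. In the above code, the in operator checks whether “secret” is present in the lowercase version of raw_file_content. The substring is already lowercase here, so only the text needs converting. If the substring could contain uppercase letters, then you'd need to convert it as well.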
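When the substring comes from somewhere you don't control, such as user input, it's safer to normalize both sides. The sketch below uses a hypothetical helper, contains_ignoring_case(), and swaps lower() for str.casefold(), a more aggressive variant meant for caseless comparisons:

```python
def contains_ignoring_case(text, substring):
    """Check for substring in text, ignoring case on both sides."""
    return substring.casefold() in text.casefold()


mixed_case = contains_ignoring_case("I don't want to tell you The Secret,", "SECRET")

# casefold() maps the German "ß" to "ss", which lower() leaves unchanged.
with_casefold = contains_ignoring_case("Straße", "STRASSE")
with_lower = "STRASSE".lower() in "Straße".lower()

print(mixed_case, with_casefold, with_lower)
```

For plain ASCII text, lower() and casefold() behave the same, so the difference only matters if your data may contain characters like “ß”.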

Learn More About the Substring

If you want to find the position of a substring within a string, you can use the find() method. This method returns the index of the first occurrence of the substring, or -1 if it’s not present. Here’s an example:

s = "Hello, World!"
substring = "o"
s.find(substring)

The find() method returns 4 because the first occurrence of the substring “o” in the string s is at index 4.
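Because find() signals a miss with -1, it's worth comparing the result against -1 explicitly instead of relying on truthiness. The sketch below also uses the optional start argument of find() to locate a later occurrence. The variable names are illustrative only:

```python
s = "Hello, World!"

first = s.find("o")              # 4: first "o", in "Hello"
second = s.find("o", first + 1)  # 8: search again, starting after the first hit
missing = s.find("z")            # -1: "z" doesn't occur in s

# A match at index 0 is falsy and -1 is truthy, so test against -1 explicitly.
starts_with_hello = s.find("Hello") != -1

print(first, second, missing, starts_with_hello)
```

If you only need to know whether the substring exists and not where it is, the in operator remains the more readable choice.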

If you want to count the number of occurrences of a substring in a string, you can use the count() method. This method returns the number of times the substring appears in the string. Here’s an example:

s = "Hello, World!"
substring = "o"
s.count(substring)

The count() method returns 2 because the substring “o” appears twice in the string s.
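Two details of count() are easy to miss: it's case-sensitive, and it counts non-overlapping occurrences. Here's a small sketch that reuses the raw_file_content example from earlier in this tutorial. The variable names are made up for illustration:

```python
raw_file_content = """
Hi there and welcome.
This is a special hidden file with a SECRET secret.
I don't want to tell you The Secret,
but I do want to secretly tell you that I have one.
"""

# Only exact lowercase matches: "secret" and the start of "secretly".
exact_count = raw_file_content.count("secret")

# Lowercasing the text first also picks up "SECRET" and "Secret".
any_case_count = raw_file_content.lower().count("secret")

# Occurrences may not overlap, so "aaaa" contains "aa" twice, not three times.
non_overlapping = "aaaa".count("aa")

print(exact_count, any_case_count, non_overlapping)
```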

Find a Substring With Conditions Using Regex

If you need to find a substring based on specific conditions, such as starting or ending with certain characters or containing a certain pattern, you can use regular expressions (regex). Python’s re module provides functions for working with regular expressions.

Here’s an example that uses regex to find all numbers within a string:

import re
s = "There are 47 apples and 3 oranges in the basket."
numbers = re.findall(r'\d+', s)

The re.findall() function searches for all occurrences of one or more digits (\d+) in the string s and returns a list of all matching substrings. In this case, the numbers list will contain ["47", "3"].

Regular expressions provide a powerful way to search for and manipulate substrings based on patterns. However, they can be complex and might require a deeper understanding of regex syntax.
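As an illustration of matching with conditions, the sketch below applies two patterns to the raw_file_content example from earlier. The patterns and variable names are assumptions chosen for this example, not the only way to write them:

```python
import re

raw_file_content = """
Hi there and welcome.
This is a special hidden file with a SECRET secret.
I don't want to tell you The Secret,
but I do want to secretly tell you that I have one.
"""

# \b marks a word boundary, so "secretly" is excluded.
# re.IGNORECASE makes the match case-insensitive.
whole_words = re.findall(r"\bsecret\b", raw_file_content, flags=re.IGNORECASE)

# "secret" followed by at least one more word character.
longer_words = re.findall(r"secret\w+", raw_file_content)

print(whole_words)   # ['SECRET', 'secret', 'Secret']
print(longer_words)  # ['secretly']
```

The first pattern answers a question that the in operator can't: whether “secret” appears as a word of its own and not just as part of a longer word.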

Find a Substring in a pandas DataFrame Column

If you’re working with tabular data in pandas and need to search for substrings in a specific column, you can use the str.contains() method. Here’s an example:

import pandas as pd
data = {
    "Name": ["John", "Alice", "Bob", "Charlie"],
    "City": ["New York", "Los Angeles", "San Francisco", "Chicago"],
}
df = pd.DataFrame(data)
substring = "an"
df[df["City"].str.contains(substring)]

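The str.contains() method checks each element of a pandas Series, such as a DataFrame column, and returns a Boolean Series. When you use that Series to index df, you get a new DataFrame that only includes the rows where the substring was found. In the above example, the code searches the “City” column for the substring “an”. Because the match is case-sensitive by default, only the “San Francisco” row qualifies, while the “An” in “Los Angeles” doesn't count. Keep in mind that str.contains() also treats the pattern as a regular expression by default.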

This approach is especially useful when working with large datasets and allows you to filter rows based on substring criteria.
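The case and regex parameters of str.contains() let you adjust how the match works. The following sketch rebuilds the same DataFrame and compares a literal, case-sensitive search with a case-insensitive one. The variable names are illustrative, and the example assumes that pandas is installed:

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["John", "Alice", "Bob", "Charlie"],
    "City": ["New York", "Los Angeles", "San Francisco", "Chicago"],
})

# regex=False treats "an" as plain text instead of a regular expression.
case_sensitive = df[df["City"].str.contains("an", regex=False)]

# case=False additionally ignores capitalization, so "Los Angeles" matches too.
case_insensitive = df[df["City"].str.contains("an", case=False, regex=False)]

print(case_sensitive["Name"].tolist())
print(case_insensitive["Name"].tolist())
```

Passing regex=False is a reasonable default when the substring is literal text, because characters such as “.” or “(” would otherwise be interpreted as regex syntax.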

Key Takeaways

  • You can use the membership operator in to check if a Python string contains a substring.
  • By converting both strings to the same case with the lower() or upper() method, you can make the search case-insensitive.
  • The find() method can be used to find the index at which a substring is found within a string.
  • The count() method returns the number of occurrences of a substring in a string.
  • Regular expressions provide a powerful way to search for substrings based on specific patterns using the re module.
  • In pandas, the str.contains() method allows you to find substrings in specific columns of a DataFrame.

With these techniques, you can check for substrings, locate them, count them, and filter text data in Python.