
How to Check if a Python String Contains a Substring

CodeMDD.io

If you’re new to programming or come from a programming language other than Python, you may be looking for the best way to check whether a string contains another string in Python. Identifying such substrings comes in handy when you’re working with text content from a file or after you’ve received user input. You may want to perform different actions in your program depending on whether a substring is present or not.

In this tutorial, you’ll focus on the most Pythonic way to tackle this task, using the membership operator in. Additionally, you’ll learn how to identify the right string methods for related, but different, use cases. Finally, you’ll also learn how to find substrings in pandas columns. This is helpful if you need to search through data from a CSV file. You could use the approach that you’ll learn in the next section, but if you’re working with tabular data, it’s best to load the data into a pandas DataFrame and search for substrings in pandas.

How to Confirm That a Python String Contains Another String

If you need to check whether a string contains a substring, use Python’s membership operator in. In Python, this is the recommended way to confirm the existence of a substring in a string:

raw_file_content = """Hi there and welcome.
This is a special hidden file with a SECRET secret.
I don't want to tell you The Secret,
but I do want to secretly tell you that I have one."""
"secret" in raw_file_content

The in membership operator gives you a quick and readable way to check whether a substring is present in a string. You may notice that the line of code almost reads like English.

Note: If you want to check whether the substring is not in the string, then you can use not in:

"secret" not in raw_file_content

Because the substring “secret” is present in raw_file_content, the not in operator returns False.

When you use in, the expression returns a Boolean value:

  • True if Python found the substring
  • False if Python didn’t find the substring

You can use this intuitive syntax in conditional statements to make decisions in your code:

if "secret" in raw_file_content:
    print("Found!")

In this code snippet, you use the membership operator to check whether “secret” is a substring of raw_file_content. If it is, then you’ll print a message to the terminal.
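The same check can be wrapped in a small function so that both outcomes are handled. This is only a sketch: the function name report_secret and its keyword parameter are made up for illustration and don't come from any library.

```python
def report_secret(text: str, keyword: str = "secret") -> str:
    """Return a short message saying whether `keyword` occurs in `text`.

    Hypothetical helper for illustration; the check itself is just `in`.
    """
    if keyword in text:
        return f"Found {keyword!r}!"
    # Reached only when the substring is absent, i.e. `keyword not in text`.
    return f"No {keyword!r} here."


raw_file_content = "This is a special hidden file with a SECRET secret."
print(report_secret(raw_file_content))              # Found 'secret'!
print(report_secret(raw_file_content, "password"))  # No 'password' here.
```

Note that the check is still case-sensitive: a text that contains only “SECRET” wouldn’t count as a match.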

Generalize Your Check by Removing Case Sensitivity

By default, Python’s in operator performs a case-sensitive search. This means that it will treat uppercase and lowercase letters as different characters. If you want to perform a case-insensitive search, you can convert both the string and the substring to lowercase using the lower() string method:

raw_file_content = """Hi there and welcome.
This is a special hidden file with a SECRET secret.
I don't want to tell you The Secret,
but I do want to secretly tell you that I have one."""
"secret" in raw_file_content.lower()

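Here, raw_file_content.lower() returns a lowercase copy of the entire string and leaves the original unchanged. Because the search term “secret” is already written in lowercase, the check now also matches “SECRET” and “Secret”. If the search term comes from user input or another variable, call lower() on it as well so that both sides of the in operator are lowercase.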
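If you need this check in several places, one option is to normalize both sides inside a helper. The sketch below uses str.casefold(), which is a more aggressive variant of lower() intended for caseless comparisons. The name contains_ignore_case is an assumption made for this example.

```python
def contains_ignore_case(text: str, substring: str) -> bool:
    """Case-insensitive substring check (hypothetical helper).

    casefold() normalizes a few characters that lower() leaves alone,
    for example the German "ß", which becomes "ss".
    """
    return substring.casefold() in text.casefold()


raw_file_content = "I don't want to tell you The Secret,"

print(contains_ignore_case(raw_file_content, "SECRET"))  # True
print(contains_ignore_case(raw_file_content, "hidden"))  # False

# lower() and casefold() differ on some non-ASCII text:
print("strasse" in "Straße".lower())             # False
print(contains_ignore_case("Straße", "STRASSE"))  # True
```

For plain ASCII text, lower() and casefold() give the same result, so the lower() approach shown above works just as well there.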

Learn More About the Substring

Now that you know how to confirm that a Python string contains another string, it’s helpful to know more about the substring itself. In Python, a substring is a contiguous sequence of characters within a string. You can extract substrings from strings using slicing.

For example, let’s say you have the following string:

my_string = "Hello, world!"

You can extract a substring that consists of the first five characters using slicing:

substring = my_string[0:5]

In this case, substring will contain “Hello”. Slicing in Python follows a [start:stop:step] syntax, where:

  • start is the index where the slice starts (inclusive)
  • stop is the index where the slice stops (exclusive)
  • step is the distance between the indices of consecutive characters in the slice (1 takes every character, 2 takes every second character)

You can omit the start and step values to use the default values of 0 and 1, respectively:

substring = my_string[:5] # Equivalent to my_string[0:5]

You can also omit the stop value to include all characters until the end of the string:

substring = my_string[7:] # Equivalent to my_string[7:len(my_string)]

The length of a string can be obtained using the len() function:

string_length = len(my_string)
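The slicing rules above can be tried out together in one short script. The variable names are chosen for illustration only. The last lines tie slicing back to the membership check: a slice taken with the default step of 1 is a contiguous run of characters, so it’s always a substring of the original string.

```python
my_string = "Hello, world!"

first_word = my_string[:5]       # start defaults to 0
second_word = my_string[7:12]    # indices 7 through 11; 12 is excluded
rest = my_string[7:]             # stop defaults to the end of the string
every_second = my_string[::2]    # step of 2 takes indices 0, 2, 4, ...

print(first_word)      # Hello
print(second_word)     # world
print(rest)            # world!
print(every_second)    # Hlo ol!
print(len(my_string))  # 13

# A slice with step 1 is contiguous, so it is a substring of the original.
print(second_word in my_string)   # True
# A slice with a larger step usually is not.
print(every_second in my_string)  # False
```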

Find a Substring With Conditions Using Regex

In some cases, you may need to find a substring that matches specific conditions. This is where regular expressions, or regex, come in handy. Python provides the re module for working with regular expressions.

For example, let’s say you want to check whether “Python” appears as a whole word in a longer string, so that words such as “Pythonic” don’t count. You can use the re.search() function, which scans the string and stops at the first match:

import re
paragraph = "I love programming in Python. Python is great!"
regex_pattern = r"\bPython\b"
match = re.search(regex_pattern, paragraph)

Here, the regex pattern \bPython\b specifies that you want to find the word “Python” as a whole word. The \b characters represent word boundaries. The re.search() function returns a match object if the pattern is found, or None otherwise.
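Because re.search() may return None, it’s common to test the result before using it. The following sketch reuses the pattern from above and shows what the match object offers. The second string, other_text, is invented for this example to show a case where the word boundary prevents a match.

```python
import re

regex_pattern = r"\bPython\b"
paragraph = "I love programming in Python. Python is great!"

match = re.search(regex_pattern, paragraph)
if match:
    # group() is the matched text, span() is its (start, end) position.
    print(match.group())  # Python
    print(match.span())   # (22, 28)

# "Pythonic" contains "Python", but not as a whole word.
other_text = "That solution is very Pythonic."
print("Python" in other_text)                   # True
print(re.search(regex_pattern, other_text))     # None
```

The plain in check and the regex check answer different questions here: in reports any occurrence, while the pattern with \b reports only whole-word occurrences.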

You can also use the re.findall() function to find all occurrences of a pattern in a string:

matches = re.findall(regex_pattern, paragraph)

This returns a list of all matches found in the string.
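When you also need to know where each match sits, re.finditer() yields one match object per occurrence. This is a small sketch that uses the same pattern and paragraph as above.

```python
import re

regex_pattern = r"\bPython\b"
paragraph = "I love programming in Python. Python is great!"

# findall() returns the matched strings only.
matches = re.findall(regex_pattern, paragraph)
print(matches)       # ['Python', 'Python']
print(len(matches))  # 2

# finditer() yields match objects, so positions are available too.
positions = [m.start() for m in re.finditer(regex_pattern, paragraph)]
print(positions)     # [22, 30]
```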

Find a Substring in a pandas DataFrame Column

If you’re working with tabular data in Python, particularly with the pandas library, you may want to find substrings in a specific column of a DataFrame. This is helpful if you need to search through data from a CSV file.

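To find a substring in a pandas DataFrame column, you can use the str.contains() method. This method searches for a substring in each value of the specified column. By default, it interprets the search term as a regular expression, so pass regex=False if you want a literal match.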

For example, let’s say you have the following DataFrame:

import pandas as pd
data = {
    "name": ["John", "Jane", "Alice", "Bob"],
    "occupation": ["developer", "manager", "teacher", "lawyer"],
}
df = pd.DataFrame(data)

To check if a substring is present in the “occupation” column, you can use the following code:

substring = "dev"
df["occupation"].str.contains(substring)

This will return a Series of Boolean values, where True indicates that the substring was found in the corresponding value of the column, and False indicates that it wasn’t.

You can use this Series to filter the DataFrame and only keep the rows where the substring was found:

filtered_df = df[df["occupation"].str.contains(substring)]

In this example, filtered_df will only contain rows where the “occupation” column contains the substring “dev”.
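Real CSV data often has mixed capitalization and missing values. The sketch below uses a modified version of the example data (the “Web Developer” entry and the missing value are invented for illustration) together with the case, regex, and na parameters of str.contains().

```python
import pandas as pd

# Hypothetical data: mixed case and one missing occupation.
df = pd.DataFrame(
    {
        "name": ["John", "Jane", "Alice", "Bob"],
        "occupation": ["developer", "manager", "Web Developer", None],
    }
)

mask = df["occupation"].str.contains(
    "dev",
    case=False,   # ignore capitalization
    regex=False,  # treat "dev" as literal text, not as a regex pattern
    na=False,     # count missing values as "not found"
)
filtered_df = df[mask]

print(mask.tolist())                 # [True, False, True, False]
print(filtered_df["name"].tolist())  # ['John', 'Alice']
```

Without na=False, the missing value would produce a missing entry in the Boolean Series, and using that Series to filter the DataFrame would raise an error.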

Key Takeaways

In this tutorial, you learned how to check if a Python string contains a substring using the membership operator in. You also learned how to generalize your check by removing case sensitivity. Additionally, you learned more about substrings, slicing, regular expressions, and finding substrings in pandas DataFrame columns.

By mastering these techniques, you’ll be able to identify and work with substrings in a wide range of use cases in Python.
