Beginner's Guide to Using Python 3 Effortlessly

CodeMDD.io

How to Check if a Python String Contains a Substring

If you’re new to programming or come from a programming language other than Python, you may be looking for the best way to check whether a string contains another string in Python. Identifying such substrings comes in handy when you’re working with text content from a file or after you’ve received user input. You may want to perform different actions in your program depending on whether a substring is present or not.

In this tutorial, we will focus on the most Pythonic way to tackle this task, using the membership operator in. Additionally, we’ll learn how to identify the right string methods for related, but different, use cases. Finally, we’ll also learn how to find substrings in pandas columns, which is helpful if you need to search through data from a CSV file.

How to Confirm That a Python String Contains Another String

If you need to check whether a string contains a substring, use Python’s membership operator in. This is the recommended way to confirm that a substring exists in a string because it is quick to write and easy to read. Keep in mind that the check is case-sensitive. For example:

raw_file_content = """Hi there and welcome.
This is a special hidden file with a SECRET secret.
I don't want to tell you The Secret,
but I do want to secretly tell you that I have one."""
"secret" in raw_file_content

The expression "secret" in raw_file_content returns True because the substring “secret” is present in raw_file_content.

If you want to check whether the substring is not in the string, then you can use not in:

"secret" not in raw_file_content

In this case, the expression returns False because the substring “secret” is present in raw_file_content.

When you use in, the expression returns a Boolean value: True if Python found the substring, and False if Python didn’t find the substring. You can use this intuitive syntax in conditional statements to make decisions in your code:

if "secret" in raw_file_content:
    print("Found!")

In this code snippet, the membership operator is used to check whether “secret” is a substring of raw_file_content. If it is, then the message “Found!” will be printed to the terminal.
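As a small sketch of how this check might drive both branches of a decision, the hypothetical helper below returns a different message depending on whether the substring is present. The name report_secret and its messages are illustrative assumptions, not part of the tutorial's sample code.

```python
def report_secret(text: str, term: str = "secret") -> str:
    """Return a message that depends on whether term occurs in text."""
    # The membership check is case-sensitive and matches anywhere in text.
    if term in text:
        return "Found!"
    return "Not found."


print(report_secret("I have a secret."))  # Found!
print(report_secret("I have a SECRET."))  # Not found.
```

The second call reports no match because the operator compares characters exactly, including their case.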

Generalize Your Check by Removing Case Sensitivity

If you want to check whether a string contains a substring, but you don’t want to consider the case of the letters, you can convert both strings to lowercase using the lower() method:

file_content = "This is a Sample PYTHON string"
search_term = "pythON"
search_term.lower() in file_content.lower()

The expression search_term.lower() in file_content.lower() returns True because both the search term “pythON” and the file content “This is a Sample PYTHON string” are converted to lowercase before performing the check. This approach allows you to make your check case-insensitive, which can be useful in scenarios where the user input or the content you’re searching through may have varying cases.
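For text that may contain non-English characters, str.casefold() is a more aggressive alternative to lower() that is designed for caseless matching. The sketch below wraps it in a hypothetical helper, contains_caseless, and uses an illustrative German street name to show a case where lower() alone misses the match.

```python
def contains_caseless(text: str, term: str) -> bool:
    """Case-insensitive substring check using Unicode case folding."""
    return term.casefold() in text.casefold()


street = "Hauptstraße 5"  # illustrative input, not from the tutorial
print("STRASSE".lower() in street.lower())   # False: "ß" stays "ß" under lower()
print(contains_caseless(street, "STRASSE"))  # True: casefold() maps "ß" to "ss"
```

For plain ASCII text, lower() and casefold() give the same result, so the choice only matters once other alphabets are involved.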

Learn More About the Substring

If you need to know more about the substring, such as its index or occurrence count, you can use string methods like find() and count(). Here’s an example:

file_content = "This is a sample string"
search_term = "is"
first_index = file_content.find(search_term)
occurrence_count = file_content.count(search_term)
print(f"The first occurrence of '{search_term}' starts at index: {first_index}")
print(f"The term '{search_term}' occurred {occurrence_count} times in the string")

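This code snippet finds the first occurrence of the search term “is” in the file content, and also counts the total number of occurrences of that term. Note that the first match sits inside the word “This”, because find() and count() match raw character sequences rather than whole words. The output will be: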

The first occurrence of 'is' starts at index: 2
The term 'is' occurred 2 times in the string
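If you need every position rather than only the first, one possible approach is to call find() repeatedly with its optional start argument. The helper name find_all below is hypothetical, and returning an empty list for an empty search term is an assumption made for this sketch.

```python
def find_all(text: str, term: str) -> list[int]:
    """Return the start index of every non-overlapping occurrence of term."""
    if not term:
        return []  # assumption: treat an empty search term as "no matches"
    positions = []
    start = text.find(term)
    while start != -1:
        positions.append(start)
        # Resume after the current match, mirroring how count() skips overlaps.
        start = text.find(term, start + len(term))
    return positions


file_content = "This is a sample string"
print(find_all(file_content, "is"))  # [2, 5]
print(file_content.find("python"))   # -1, because the term is absent
```

Because find() signals a missing substring with -1 instead of raising an exception, compare its result against -1 rather than relying on truthiness; index 0 is a valid match position.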

Find a Substring With Conditions Using Regex

If you need to find a substring that matches certain conditions or patterns, you can use regular expressions. The re module in Python provides powerful tools for pattern matching. Here’s an example:

import re
file_content = "This is a sample string"
search_pattern = r"\b\w{3}\b" # Word with length 3
matches = re.findall(search_pattern, file_content)
print(f"The matching substrings found: {matches}")

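In this code snippet, the pattern \b\w{3}\b matches whole words that are exactly three characters long: \b marks a word boundary and \w{3} matches three word characters. None of the words in the sample string (“This”, “is”, “a”, “sample”, “string”) has exactly three characters, so the output will be an empty list:

The matching substrings found: []

If you widen the quantifier to \w{4,}, the same call returns ['This', 'sample', 'string'].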
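Word boundaries are also useful when a plain substring check would match too much. The sketch below, built around a hypothetical helper named find_whole_word, looks for a term only where it stands alone as a word and reports the position of each match. It assumes the term starts and ends with a word character, since \b behaves differently next to punctuation.

```python
import re


def find_whole_word(text: str, word: str) -> list[tuple[int, int]]:
    """Return (start, end) spans where word appears as a whole word."""
    # re.escape() keeps characters such as "." or "+" in word literal.
    pattern = rf"\b{re.escape(word)}\b"
    return [match.span() for match in re.finditer(pattern, text, re.IGNORECASE)]


file_content = "This is a sample string"
print(find_whole_word(file_content, "is"))    # [(5, 7)]
print(find_whole_word(file_content, "THIS"))  # [(0, 4)]
```

Unlike find(), this skips the “is” inside “This” and reports only the standalone word at index 5.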

Find a Substring in a pandas DataFrame Column

If you’re working with data stored in a pandas DataFrame, you can use the str.contains() method to search for substrings in a specific column. Here’s an example:

import pandas as pd
data = {
    'Name': ['John Doe', 'Jane Smith', 'Alice Johnson'],
    'Age': [25, 30, 27],
}
df = pd.DataFrame(data)
search_term = 'ohn'
subset = df[df['Name'].str.contains(search_term)]
print(subset)

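In this code snippet, we’re searching for all rows in the DataFrame where the ‘Name’ column contains the substring ‘ohn’. The output will be a subset of the original DataFrame containing the rows for John Doe and Alice Johnson. Keep in mind that str.contains() treats the search term as a regular expression by default and that the match is case-sensitive.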
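Real CSV data often has missing values and inconsistent capitalization. The sketch below shows how the case, regex, and na parameters of str.contains() might be combined to handle both. The data is illustrative: it reuses the names from the example and adds one missing value as an assumption.

```python
import pandas as pd

# Illustrative data: the tutorial's names plus one missing value.
df = pd.DataFrame({"Name": ["John Doe", "Jane Smith", "Alice Johnson", None]})

mask = df["Name"].str.contains(
    "OHN",
    case=False,   # ignore letter case
    regex=False,  # treat the term as plain text, not a regular expression
    na=False,     # count missing names as "no match"
)
print(df[mask]["Name"].tolist())  # ['John Doe', 'Alice Johnson']
```

Setting regex=False is a reasonable default when the search term comes from user input, because characters such as “.” or “(” would otherwise be interpreted as pattern syntax.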

Key Takeaways

Checking whether a string contains a substring is a common task in programming. In Python, the membership operator in provides a quick and readable way to check for the existence of a substring. You can also generalize your check by removing case sensitivity using the lower() method. If you need more information about the substring, you can use string methods like find() and count(). Additionally, regular expressions and pandas DataFrame methods provide powerful tools for finding substrings that match specific conditions or patterns.

Remember to download the sample code provided in the tutorial to practice these concepts and deepen your understanding of how to check for substrings in Python.