Effortlessly Mastering Python with py contains

How to Check if a Python String Contains a Substring

If you’re new to programming or come from a programming language other than Python, you may be looking for the best way to check whether a string contains another string in Python. Identifying such substrings comes in handy when you’re working with text content from a file or after you’ve received user input. You may want to perform different actions in your program depending on whether a substring is present or not.

In this tutorial, you’ll focus on the most Pythonic way to tackle this task, using the membership operator in. Additionally, you’ll learn how to identify the right string methods for related, but different, use cases. Finally, you’ll also learn how to find substrings in pandas columns. This is helpful if you need to search through data from a CSV file. You could use the approach that you’ll learn in the next section, but if you’re working with tabular data, it’s best to load the data into a pandas DataFrame and search for substrings in pandas.

How to Confirm That a Python String Contains Another String

If you need to check whether a string contains a substring, use Python’s membership operator in. In Python, this is the recommended way to confirm the existence of a substring in a string:

raw_file_content = """Hi there and welcome.
This is a special hidden file with a SECRET secret.
I don't want to tell you The Secret,
but I do want to secretly tell you that I have one."""
"secret" in raw_file_content

The in membership operator gives you a quick and readable way to check whether a substring is present in a string. You may notice that the line of code almost reads like English.

Note: If you want to check whether the substring is not in the string, then you can use not in:

"secret" not in raw_file_content

When you use in, the expression returns a Boolean value:

  • True if Python found the substring
  • False if Python didn’t find the substring
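To make those return values concrete, here is a minimal sketch. The sample string is invented for illustration and isn't part of the file content above. Note that the check is case sensitive and matches anywhere in the string, including inside longer words:

```python
# Illustrative sample text (an assumption, not the tutorial's file content).
message = "I have a SECRET, and I will secretly keep it."

has_lower = "secret" in message         # matches inside "secretly"
has_title = "Secret" in message         # no title-case occurrence exists
is_absent = "password" not in message   # "password" never appears

print(has_lower, has_title, is_absent)
```

Each expression evaluates to a plain bool, so you can store it in a variable or pass it straight to an if statement.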

You can use this intuitive syntax in conditional statements to make decisions in your code:

if "secret" in raw_file_content:
    print("Found!")

In this code snippet, you use the membership operator to check whether "secret" is a substring of raw_file_content. If it is, then you’ll print a message to the terminal.
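If you also need to handle the case where the substring is missing, one option is to wrap the check in a small helper. The function name report_substring and the sample sentence are hypothetical, chosen only for this sketch:

```python
def report_substring(text, substring):
    """Return a short message saying whether substring occurs in text."""
    if substring in text:
        return f"Found {substring!r}"
    return f"{substring!r} is not present"


print(report_substring("The quick brown fox", "quick"))
print(report_substring("The quick brown fox", "slow"))
```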

Generalize Your Check by Removing Case Sensitivity

If you want to check whether a string contains a substring regardless of case, convert both the string and the substring to lowercase (or both to uppercase) before performing the check. Here’s an example:

text = "Python is amazing"
substring = "python"
if substring.lower() in text.lower():
    print("Found (case insensitive)!")

By converting both the text and substring to lowercase using the lower() method, you can perform a case-insensitive check for the existence of the substring. If the condition is met, the message “Found (case insensitive)!” will be printed to the terminal.
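For text that may contain non-ASCII characters, lower() isn't always enough. Python also provides str.casefold(), which applies more aggressive normalization intended for caseless matching. The German sample sentence below is an assumption made up for this sketch:

```python
text = "Die Straße ist lang"
substring = "STRASSE"

# lower() leaves "ß" unchanged, so the lowercase forms do not line up.
found_with_lower = substring.lower() in text.lower()

# casefold() maps "ß" to "ss", so both sides normalize to "strasse".
found_with_casefold = substring.casefold() in text.casefold()

print(found_with_lower, found_with_casefold)
```

For plain English text the two approaches behave the same, so lower() remains a reasonable default there.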

Learn More About the Substring

If you want to locate or extract the occurrences of the substring within a string, you can use the built-in string method find() or the findall() function from the re (regular expression) module. The find() method returns the index of the first occurrence of the substring within the string, or -1 if the substring isn’t found, while re.findall() returns a list of all non-overlapping matches.

Here’s an example:

import re

text = "Python is a powerful and flexible programming language. Python is used for web development, data analysis, machine learning, and much more."
substring = "Python"

# str.find() is a built-in string method: index of the first match, or -1
position = text.find(substring)
print(position)  # Output: 0

# re.findall() treats its first argument as a regex pattern,
# so escape the substring to match it literally
occurrences = re.findall(re.escape(substring), text)
print(occurrences)  # Output: ['Python', 'Python']

In this example, the string method find() returns the starting position of the first occurrence of the substring “Python” within the text string, which is 0. The re.findall() function returns a list of all non-overlapping occurrences of the substring within the string.
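A few related tools are worth knowing when you need more than the first position. The sketch below uses a shortened sample sentence (an assumption for illustration) to show what find() returns for a missing substring, how str.count() tallies occurrences, and how re.finditer() can give you the start index of every match:

```python
import re

text = "Python is flexible. Python is popular."
substring = "Python"

first_index = text.find(substring)    # index of the first occurrence
missing_index = text.find("Ruby")     # find() returns -1 when nothing matches
total = text.count(substring)         # number of non-overlapping occurrences

# re.finditer() yields match objects; start() gives each match's index.
starts = [m.start() for m in re.finditer(re.escape(substring), text)]

print(first_index, missing_index, total, starts)
```

Because -1 is a valid negative index in Python, it's safer to compare the result of find() against -1 explicitly before using it to slice the string.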

Find a Substring in a pandas DataFrame Column

If you’re working with tabular data and need to find substrings in a specific column of a pandas DataFrame, you can use the str.contains() method. This method returns a Boolean Series that indicates whether a substring is present in each element of the specified column.

Here’s an example:

import pandas as pd
data = {
    'Name': ['John', 'Jane', 'Michael', 'Jessica'],
    'Age': [26, 30, 35, 28],
    'Hobby': ['Playing guitar', 'Reading books', 'Watching movies', 'Painting']
}
df = pd.DataFrame(data)
# Find rows where the 'Hobby' column contains the substring 'guitar'
filtered_df = df[df['Hobby'].str.contains('guitar')]
print(filtered_df)

In this example, the str.contains() method is used to check whether the substring “guitar” is present in each element of the ‘Hobby’ column. The resulting DataFrame filtered_df only contains the rows where the substring is found in the column.
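By default, str.contains() is case sensitive and interprets the pattern as a regular expression, and missing values in the column produce missing values in the result. The following sketch shows the case, na, and regex parameters that adjust this behavior. The DataFrame is a modified, hypothetical variant of the one above, with a missing hobby and a mixed-case entry added for illustration:

```python
import pandas as pd

# Hypothetical data: one missing hobby and one capitalized "Guitar".
df = pd.DataFrame({
    "Name": ["John", "Jane", "Michael", "Jessica"],
    "Hobby": ["Playing guitar", "Reading books", None, "Guitar repair (amps)"],
})

mask = df["Hobby"].str.contains(
    "guitar",
    case=False,   # ignore case
    na=False,     # treat missing values as "no match"
    regex=False,  # match the text literally instead of as a regex
)
matches = df[mask]
print(matches)
```

Setting na=False matters when you use the result as a filter, because a mask containing missing values can't be used reliably for Boolean indexing.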

Key Takeaways

  • To check whether a string contains a substring, use the in membership operator. It returns True if the substring is present and False otherwise.
  • If you want to perform a case-insensitive check, convert both the string and the substring to lowercase or uppercase before using in.
  • To locate the first occurrence of a substring, use the string method find(). To extract all occurrences, use the findall() function from the re module.
  • When working with pandas DataFrames, use the str.contains() method to find substrings in specific columns.

Now that you know the different ways to check if a Python string contains a substring, you can confidently handle these tasks in your Python programs. Remember to choose the approach that best fits your specific use case and enjoy exploring the world of string manipulation in Python!

Practice and experiment with the code to solidify your understanding. Happy coding!