Easily Retrieve a List of Files in Directory with Python

How to Get a List of All Files in a Directory With Python

Many file-related operations in Python start with getting a list of all the files and folders in a directory. In this tutorial, you will learn several ways to do this with the pathlib module.

Getting a List of All Files and Folders in a Directory in Python

To start, you need to create a Path object using the pathlib module. The type of object returned depends on the operating system you are using. For Windows, you will get a WindowsPath object, while for Linux and macOS, you will get a PosixPath object.

import pathlib
# On Windows, Path() returns a WindowsPath
desktop = pathlib.Path("C:/Users/RealPython/Desktop")
print(repr(desktop))  # WindowsPath('C:/Users/RealPython/Desktop')
# On Linux and macOS, Path() returns a PosixPath
desktop = pathlib.Path("/home/RealPython/Desktop")
print(repr(desktop))  # PosixPath('/home/RealPython/Desktop')

With the Path object, you can access various methods and properties to manipulate and retrieve information about the directory.
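The simplest of those methods is .iterdir(), which lists only the direct children of a directory. The following is a minimal sketch; the temporary directory and the file names in it are made up for illustration so that the example runs on any machine.

```python
import pathlib
import tempfile

# Build a small throwaway directory so the example runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    root = pathlib.Path(tmp)
    (root / "notes.txt").write_text("hello")
    (root / "photos").mkdir()
    (root / "photos" / "cat.jpg").write_text("")

    # .iterdir() yields only the direct children, in arbitrary order,
    # so the names are sorted here to get a stable result.
    top_level = sorted(item.name for item in root.iterdir())

print(top_level)  # ['notes.txt', 'photos']
```

Note that cat.jpg does not appear, because .iterdir() does not descend into the photos folder.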

Recursively Listing With .rglob()

The .rglob() method allows you to recursively list all the files and folders in a directory. It searches for items in the directory and its subdirectories.

import pathlib
path = pathlib.Path("/path/to/directory")
for item in path.rglob("*"):
    print(item)

This will print the path of all the files and directories in /path/to/directory and its subdirectories.
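Because .rglob("*") returns files and folders mixed together, you may want to separate them with .is_file() and .is_dir(). The sketch below assumes a small, made-up directory tree created in a temporary folder.

```python
import pathlib
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    root = pathlib.Path(tmp)
    (root / "docs" / "drafts").mkdir(parents=True)
    (root / "readme.md").write_text("")
    (root / "docs" / "guide.txt").write_text("")
    (root / "docs" / "drafts" / "todo.txt").write_text("")

    everything = list(root.rglob("*"))
    # Paths are made relative to root and sorted for stable output.
    files = sorted(p.relative_to(root).as_posix() for p in everything if p.is_file())
    folders = sorted(p.relative_to(root).as_posix() for p in everything if p.is_dir())

print(files)    # ['docs/drafts/todo.txt', 'docs/guide.txt', 'readme.md']
print(folders)  # ['docs', 'docs/drafts']
```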

Using a Python Glob Pattern for Conditional Listing

You can use glob patterns to filter the items you want to list. The * character represents any number of characters, and the ? character represents a single character.
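To see how the two wildcards behave without touching the file system, you can test names against a pattern with PurePosixPath.match(). This is only a small illustration, and the file names are hypothetical.

```python
from pathlib import PurePosixPath

names = ["report1.txt", "report10.txt", "report.txt", "notes.md"]

# "*" matches any run of characters, including none.
star = [n for n in names if PurePosixPath(n).match("*.txt")]
# "?" matches exactly one character.
question = [n for n in names if PurePosixPath(n).match("report?.txt")]

print(star)      # ['report1.txt', 'report10.txt', 'report.txt']
print(question)  # ['report1.txt']
```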

Conditional Listing Using .glob()

The .glob() method allows you to list items that match a specific glob pattern.

import pathlib
path = pathlib.Path("/path/to/directory")
for item in path.glob("*.txt"):
    print(item)

This will print the path of all the .txt files directly inside /path/to/directory, without descending into its subdirectories.

Conditional Listing Using .rglob()

The .rglob() method also supports glob patterns for recursive listing.

import pathlib
path = pathlib.Path("/path/to/directory")
for item in path.rglob("*.txt"):
    print(item)

This will print the path of all the .txt files in /path/to/directory and its subdirectories.
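The difference between the two methods is easiest to see side by side. The sketch below uses a made-up temporary directory tree, and it also shows that .rglob(pattern) gives the same result as .glob() with "**/" added in front of the pattern.

```python
import pathlib
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    root = pathlib.Path(tmp)
    (root / "sub").mkdir()
    (root / "top.txt").write_text("")
    (root / "sub" / "nested.txt").write_text("")
    (root / "sub" / "image.png").write_text("")

    def names(paths):
        """Return sorted paths relative to root, for stable output."""
        return sorted(p.relative_to(root).as_posix() for p in paths)

    shallow = names(root.glob("*.txt"))          # top level only
    deep = names(root.rglob("*.txt"))            # all levels
    deep_via_glob = names(root.glob("**/*.txt")) # same as .rglob("*.txt")

print(shallow)                # ['top.txt']
print(deep)                   # ['sub/nested.txt', 'top.txt']
print(deep == deep_via_glob)  # True
```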

Advanced Matching With the Glob Methods

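The glob methods in pathlib also support character classes in square brackets [], such as [abc] or a range like [a-c]. Brace expansion such as {txt,py} is a shell feature that pathlib's glob methods do not support, so to match several extensions you can filter on the suffix instead.

import pathlib
path = pathlib.Path("/path/to/directory")
# Braces are not expanded by .glob(), so check the extension separately
for item in path.glob("[abc]*"):
    if item.suffix in {".txt", ".py"}:
        print(item)

This will print the path of every item whose name starts with a, b, or c and whose extension is .txt or .py.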

Opting Out of Listing Junk Directories

Sometimes, you may want to skip listing certain directories that are not relevant to your operation. You can achieve this by filtering the directories using the .rglob() method.

Using .rglob() to Filter Whole Directories

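import pathlib
path = pathlib.Path("/path/to/directory")
for item in path.rglob("*"):
    relative = item.relative_to(path)
    # Folder names leading to this item, plus the item itself if it is a folder
    folders = relative.parts if item.is_dir() else relative.parent.parts
    if any("junk" in name for name in folders):
        continue
    print(item)

This code skips any directory whose name contains “junk”, along with everything inside it. Note that .rglob() still walks through those directories; the filter only keeps their contents out of the output.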

Creating a Recursive .iterdir() Function

You can also create a recursive function using the .iterdir() method to filter and list directories.

import pathlib
def list_directory(path):
    for item in path.iterdir():
        if item.is_dir() and "junk" in item.name:
            continue
        print(item)
        if item.is_dir():
            list_directory(item)  # Recursively list subdirectories
path = pathlib.Path("/path/to/directory")
list_directory(path)

This function recursively lists all items in the directory and its subdirectories, skipping any directory containing the word “junk” in its name.
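If you need the paths as a list rather than printed output, one possible variation is to turn the function into a generator. The sketch below is not part of the original function: the name iter_tree, the skip_word parameter, and the temporary directory layout are all hypothetical choices made for illustration.

```python
import pathlib
import tempfile

def iter_tree(path, skip_word="junk"):
    """Yield every item under path, never entering skipped folders."""
    for item in path.iterdir():
        if item.is_dir():
            if skip_word in item.name:
                continue  # pruned: neither listed nor traversed
            yield item
            yield from iter_tree(item, skip_word)
        else:
            yield item

with tempfile.TemporaryDirectory() as tmp:
    root = pathlib.Path(tmp)
    (root / "src").mkdir()
    (root / "src" / "main.py").write_text("")
    (root / "junk_cache").mkdir()
    (root / "junk_cache" / "blob.bin").write_text("")

    found = sorted(p.relative_to(root).as_posix() for p in iter_tree(root))

print(found)  # ['src', 'src/main.py']
```

Unlike filtering the results of .rglob(), this approach never opens the skipped folders at all, which can save time when they contain many files.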

Conclusion

In this tutorial, you have learned different methods to get a list of all the files and folders in a directory using the pathlib module in Python. You have also explored advanced matching techniques using glob patterns and filtering out irrelevant directories. These techniques will help you efficiently handle file-related operations in your Python projects.