Effortlessly Understand pprint: Python's Pretty Printer

Prettify Your Data Structures With Pretty Print in Python

Dealing with data is essential for any Pythonista, but sometimes that data is just not very pretty. Computers don’t care about formatting, but without good formatting, humans may find data hard to read. When you use print() on large dictionaries or long lists, the output is efficient, but not pretty.

The pprint module in Python is a utility module that you can use to print data structures in a readable, pretty way. It’s a part of the standard library that’s especially useful for debugging code dealing with API requests, large JSON files, and data in general.

By the end of this tutorial, you’ll:

Understand why the pprint module is necessary
Learn how to use pprint(), PrettyPrinter, and their parameters
Be able to create your own instance of PrettyPrinter
Save formatted string output instead of printing it
Print and recognize recursive data structures

Along the way, you’ll also see an HTTP request to a public API and JSON parsing in action.

Understanding the Need for Python’s Pretty Print

The Python pprint module is helpful in many situations. It comes in handy when making API requests, dealing with JSON files, or handling complicated and nested data. You’ll probably find that using the normal print() function isn’t adequate to efficiently explore your data and debug your application. When you use print() with dictionaries and lists, the output doesn’t contain any newlines.

Before you start exploring pprint, you’ll first use urllib to make a request to get some data. You’ll make a request to {JSON} Placeholder for some mock user information. The first thing to do is to make the HTTP GET request and parse the response into a Python object:

import json
from urllib import request

# Fetch mock user data from a public test API
response = request.urlopen("https://jsonplaceholder.typicode.com/users")
json_response = response.read()

# json.loads() accepts bytes and returns the decoded Python object
users = json.loads(json_response)

Here, you make a basic GET request and then parse the response with json.loads(). Because this endpoint returns a JSON array, users is a list of dictionaries, one per user. With the data now in a variable, a common next step is to print the contents with print():

print(users)

However, if you run this code, you’ll find that the output is not very readable.
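To see the problem without a network connection, here is a minimal sketch that uses a hand-written record shaped roughly like one entry of the API response. The sample_users name and all of its values are made up for illustration and are not the real payload.

```python
# Hypothetical stand-in for the API data: a list with one user record.
sample_users = [
    {
        "id": 1,
        "name": "Example User",
        "address": {
            "street": "1 Example Street",
            "city": "Exampleville",
            "geo": {"lat": "0.0", "lng": "0.0"},
        },
        "company": {"name": "Example Co", "catchPhrase": "Just an example"},
    }
]

# print() writes str(sample_users), which for a list is one long line.
one_line = str(sample_users)
print(one_line)
print("contains a newline:", "\n" in one_line)
```

Even for a single record, the nesting is hard to follow when everything sits on one line.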

Working with pprint

To make your output more readable, you can use the pprint module. To use it, you need to import the module:

import pprint

Now, you can use the pprint() function to print your data structure. Replace the previous print() statement with the following code:

pprint.pprint(users)

When you run this code, you’ll notice that the output is now much easier to read: each user appears as its own block, with its top-level keys on separate lines and sorted alphabetically.
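If you can’t reach the API, a small sketch with made-up data shows the same effect. The sample dictionary below is hypothetical; only pprint.pprint() comes from the tutorial.

```python
import pprint

# Hypothetical record; the keys and values are invented for illustration.
sample = {
    "id": 1,
    "name": "Example User",
    "address": {"street": "1 Example Street", "city": "Exampleville"},
    "company": {"name": "Example Co", "catchPhrase": "Just an example"},
}

# The dictionary is too long for one 80-character line, so pprint() puts
# each top-level key on its own line and sorts the keys.
pprint.pprint(sample)
```

Nested dictionaries that are short enough still stay on a single line, so the output only grows where it has to.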

Exploring Optional Parameters of pprint()

The pprint() function has several optional parameters that allow you to customize the output even further. Let’s explore some of the most commonly used parameters:

Summarizing Your Data: depth

By default, pprint() will display the entire data structure. However, if you want to limit the depth of the output, you can use the depth parameter. This parameter specifies the maximum depth to which the data structure is printed.

To demonstrate this, let’s limit the depth of the output to 1:

pprint.pprint(users, depth=1)

When you run this code, you’ll see only the top level of the data structure: the list itself is printed, and each user dictionary inside it is collapsed to {...}.
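The effect of depth is easiest to see on a small nested object. The sketch below uses a made-up record (the record name and its values are assumptions for illustration) and prints it at two depths.

```python
import pprint

# Hypothetical nested record with three levels of dictionaries.
record = {
    "name": "Example",
    "address": {"city": "Exampleville", "geo": {"lat": "0.0"}},
}

# depth=1 keeps only the outer dictionary and collapses everything nested.
pprint.pprint(record, depth=1)
# {'address': {...}, 'name': 'Example'}

# depth=2 also expands the address, but still collapses the innermost level.
pprint.pprint(record, depth=2)
# {'address': {'city': 'Exampleville', 'geo': {...}}, 'name': 'Example'}
```

The {...} placeholder tells you that something was omitted, which makes depth a handy way to get an overview before digging into one branch.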

Giving Your Data Space: indent

The indent parameter specifies the number of spaces used for indentation in the output. By default, it is set to 1. You can increase or decrease this value to achieve the desired formatting.

Let’s increase the indentation to 4 spaces:

pprint.pprint(users, indent=4)

When you run this code, you’ll see that the output is now indented with 4 spaces, making it easier to distinguish nested elements.
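As a rough illustration of what indent changes, the following sketch formats the same made-up structure with the default and with indent=4. It uses pprint.pformat(), which returns the text instead of printing it, so the two results can be compared; the nested name and its contents are hypothetical.

```python
import pprint

# Hypothetical data: three long strings so the list cannot fit on one line.
nested = {"letters": ["a" * 30, "b" * 30, "c" * 30]}

narrow = pprint.pformat(nested, indent=1)
wide = pprint.pformat(nested, indent=4)

print(narrow)
print(wide)
```

With indent=4, pprint also pads the space after each opening bracket, so the first item of every container lines up with the items below it.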

Limiting Your Line Lengths: width

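The width parameter specifies the maximum number of characters per line in the output. By default, it is set to 80 characters. If a data structure doesn’t fit on one line within this limit, pprint() breaks it across several lines. The limit is a best effort: an element that can’t be broken up, such as a long string without spaces, may still run past it.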

To demonstrate this, let’s set the width to 50 characters:

pprint.pprint(users, width=50)

When you run this code, you’ll see that the output is now wrapped at 50 characters per line.
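A short list makes the wrapping rule visible. In this sketch, the colors list is a made-up example: it fits within the default 80 characters, so it stays on one line, but it gets one item per line once width drops to 30.

```python
import pprint

# Hypothetical list whose one-line form is a little over 50 characters.
colors = ["red", "green", "blue", "yellow", "purple", "orange"]

# Fits within the default width of 80: printed on a single line.
pprint.pprint(colors)

# Does not fit within 30 characters: printed with one item per line.
pprint.pprint(colors, width=30)
```

Note that pprint doesn’t wrap at an arbitrary character position. It decides per container whether the whole thing fits, and splits between items if it doesn’t.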

Squeezing Your Long Sequences: compact

The compact parameter controls how long sequences, such as lists, tuples, and sets, are laid out when they don’t fit on one line. By default, it is set to False, which means that each item in such a sequence is placed on its own line. If you set it to True, pprint() fits as many items as it can on each line within the width limit. The parameter doesn’t affect how dictionaries are laid out.

To demonstrate this, let’s set the compact parameter to True:

pprint.pprint(users, compact=True)

When you run this code, you’ll see little or no difference for users, because its items are large dictionaries that each need several lines anyway. The effect is clearer on a long list of short items, where the output takes far fewer lines.
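Since compact only matters for sequences that are too long for one line, a list of numbers shows it better than the user data. This is a minimal sketch; the numbers list and the width of 40 are arbitrary choices for the demonstration.

```python
import pprint

# A list of short items that is too long for a 40-character line.
numbers = list(range(20))

# Default (compact=False): one number per line, 20 lines in total.
pprint.pprint(numbers, width=40)

# compact=True: as many numbers as fit are packed onto each line.
pprint.pprint(numbers, width=40, compact=True)
```

The compact form still respects width, so it is a reasonable choice when you care about vertical space more than about scanning items one by one.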

Directing Your Output: stream

By default, pprint() sends the output to sys.stdout. However, you can specify a different file-like object as the stream parameter to redirect the output.

To demonstrate this, let’s create a file called output.txt and redirect the output to it:

with open("output.txt", "w") as f:
    pprint.pprint(users, stream=f)

When you run this code, you’ll see that the output is saved to the output.txt file instead of being displayed in the console.
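The stream argument accepts any object with a write() method, not just a file on disk. As a small sketch of that idea, the example below writes into an in-memory io.StringIO buffer; the buffer and captured names are just illustrative.

```python
import io
import pprint

# An in-memory text buffer stands in for a real file.
buffer = io.StringIO()
pprint.pprint({"b": 2, "a": 1}, stream=buffer)

# Everything pprint() wrote, including the trailing newline.
captured = buffer.getvalue()
print(repr(captured))
```

This can be useful in tests, where you want to check the formatted output without touching the file system.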

Preventing Dictionary Sorting: sort_dicts

By default, pprint() sorts dictionaries by key before printing them. If you want to preserve insertion order instead, you can set the sort_dicts parameter to False. This parameter is available in Python 3.8 and later.

To demonstrate this, let’s create a dictionary whose keys aren’t in alphabetical order and use pprint() to print it:

data = {"d": 4, "a": 1, "c": 3, "b": 2}
pprint.pprint(data)

When you run this code, you’ll see that the dictionary is displayed in alphabetical order. Now, let’s disable dictionary sorting and see the difference:

pprint.pprint(data, sort_dicts=False)

When you run this code, you’ll see that the dictionary is now displayed in the order in which it was defined.
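For reference, this sketch captures both results as strings with pprint.pformat() so that the two orderings can be compared side by side. It assumes Python 3.8 or later, where sort_dicts is available.

```python
import pprint

data = {"d": 4, "a": 1, "c": 3, "b": 2}

# Default: keys are sorted before formatting.
sorted_text = pprint.pformat(data)
print(sorted_text)  # {'a': 1, 'b': 2, 'c': 3, 'd': 4}

# sort_dicts=False: keys keep their insertion order.
insertion_text = pprint.pformat(data, sort_dicts=False)
print(insertion_text)  # {'d': 4, 'a': 1, 'c': 3, 'b': 2}
```

Keeping insertion order is often the better choice for API responses, where the server’s key order may carry meaning for a human reader.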

Prettifying Your Numbers: underscore_numbers

The underscore_numbers parameter controls whether integers are formatted with underscores as thousands separators. By default, it is set to False. If you set it to True, integers are displayed with an underscore between each group of three digits to improve readability. This parameter is available in Python 3.10 and later.

To demonstrate this, let’s create a dictionary with long numbers and use pprint() to print it:

data = {"number1": 1000000, "number2": 1000000000}
pprint.pprint(data)

When you run this code, you’ll see that the numbers are displayed without underscores. Now, let’s enable the underscore formatting and see the difference:

pprint.pprint(data, underscore_numbers=True)

When you run this code, you’ll see that the numbers are now displayed with underscores, making it easier to read them.
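One detail worth checking is which values are affected. The sketch below, which uses a made-up measurements dictionary, suggests that only integers get underscores while floats are left alone. Because underscore_numbers needs Python 3.10 or later, the example falls back to plain formatting on older versions.

```python
import pprint
import sys

# Hypothetical data mixing an integer and a float.
measurements = {"count": 1000000, "ratio": 1234567.5}

if sys.version_info >= (3, 10):
    # Only the integer is rewritten; the float keeps its normal repr.
    text = pprint.pformat(measurements, underscore_numbers=True)
else:
    # Older Python versions do not accept underscore_numbers.
    text = pprint.pformat(measurements)

print(text)
```

On Python 3.10 or later, this prints {'count': 1_000_000, 'ratio': 1234567.5}.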

Creating a Custom PrettyPrinter Object

In addition to using the pprint() function, you can also create your own instance of the PrettyPrinter class. This lets you set your preferred options once and reuse them for every call, instead of passing the same arguments to pprint() each time.

To demonstrate this, let’s create a custom pretty printer that indents with 2 spaces and limits the line width to 60 characters:

custom_pprinter = pprint.PrettyPrinter(indent=2, width=60)
custom_pprinter.pprint(users)

When you run this code, you’ll see that the output is indented with 2 spaces and wrapped at 60 characters per line according to the custom settings.
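To illustrate the reuse, here is a minimal sketch that prints two unrelated, made-up objects with one configured printer and then checks that the instance’s pformat() method gives the same text as the module-level pformat() called with the same options.

```python
import pprint

# One configured printer, reused for several objects.
printer = pprint.PrettyPrinter(indent=2, width=60)

# Hypothetical objects to print.
first = {"name": "Example User", "tags": ["a", "b"]}
second = list(range(5))

for obj in (first, second):
    printer.pprint(obj)

# The instance methods mirror the module-level functions.
same = printer.pformat(first) == pprint.pformat(first, indent=2, width=60)
print(same)  # True
```

Creating the instance once keeps the formatting choices in a single place, which helps when the same settings are used throughout a script.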

Getting a Pretty String With pformat()

If you don’t want to print the output directly, but instead store it in a variable, you can use the pformat() function. This function returns a string representation of the data structure in a pretty format.

To demonstrate this, let’s store the pretty formatted output in a variable:

pretty_string = pprint.pformat(users)
print(pretty_string)

When you run this code, you’ll see that the output is stored in the pretty_string variable and then printed.
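A common reason to want the string rather than the printed output is logging. The following sketch is one possible way to do that with the standard logging module; the logger name pprint_demo and the payload dictionary are assumptions made for the example.

```python
import logging
import pprint

logging.basicConfig(level=logging.DEBUG)
logger = logging.getLogger("pprint_demo")  # hypothetical logger name

# Hypothetical payload, long enough to wrap at width=40.
payload = {"status": "ok", "items": ["alpha", "beta", "gamma"], "count": 3}

# pformat() returns the formatted text without a trailing newline.
formatted = pprint.pformat(payload, width=40)

# Start the payload on its own line so the indentation stays aligned.
logger.debug("payload:\n%s", formatted)
```

Unlike pprint(), pformat() doesn’t append a newline, so you decide how the text is embedded in the surrounding message.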

Handling Recursive Data Structures

Recursive data structures are data structures that contain references to themselves, either directly or through a nested object. Neither print() nor pprint() falls into infinite recursion on them. print() shows the repeated reference as {...} or [...], while pprint() replaces it with a marker that names the type and the id() of the object, which makes the cycle easier to recognize.

To demonstrate this, let’s create a recursive data structure:

data = {"name": "John", "children": []}
data["children"].append(data)
pprint.pprint(data)  # No RecursionError

When you run this code, you’ll see output like {'children': [<Recursion on dict with id=X>], 'name': 'John'}, where X is the id() of data and varies between runs.

If you only want an overview, you can also create a custom pretty printer and set the depth parameter to 1, which collapses the nested list before the recursive reference is reached:

custom_pprinter = pprint.PrettyPrinter(depth=1)
custom_pprinter.pprint(data)  # {'children': [...], 'name': 'John'}

When you run this code, you’ll see that the nested list is displayed as [...] instead of the recursion marker.
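As a quick check of how pprint treats a self-referencing object, the sketch below uses pprint.isrecursive(), pprint.isreadable(), and pprint.pformat() on the same data dictionary. The id shown in the marker differs on every run, so the example only looks for the fixed part of the text.

```python
import pprint

data = {"name": "John", "children": []}
data["children"].append(data)  # data now contains a reference to itself

# isrecursive() reports whether the object contains a recursive reference.
print(pprint.isrecursive(data))  # True

# isreadable() is False because the output cannot be fed back to eval().
print(pprint.isreadable(data))  # False

# Formatting finishes normally and marks the cycle instead of following it.
text = pprint.pformat(data)
print(text)
print("<Recursion on dict with id=" in text)  # True
```

Comparing the id in the marker with id(data) tells you which object the cycle points back to.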

Conclusion

The pprint module in Python is a powerful tool for printing data structures in a readable and pretty format. By using the pprint() function or creating your own instance of the PrettyPrinter class, you can customize the output to meet your needs. Whether you’re debugging code, exploring API responses, or working with large JSON files, pprint can help you make sense of your data.

Now that you’ve learned how to use pprint, go ahead and apply it to your own projects. Experiment with the optional parameters and create custom pretty printers to format your data exactly the way you want. Happy coding!