Python Pretty Printer: Effortlessly Formatting Your Data for Clarity

Prettify Your Data Structures With Pretty Print in Python

By Ian Currie

Dealing with data is essential for any Pythonista, but sometimes that data is just not very pretty. Computers don’t care about formatting, but without good formatting, humans may find the data hard to read. The output isn’t pretty when you use print() on large dictionaries or long lists. It’s efficient, but not pretty.

The pprint module in Python is a utility module that you can use to print data structures in a readable, pretty way. It’s a part of the standard library that’s especially useful for debugging code dealing with API requests, large JSON files, and data in general.

Understanding the Need for Python’s Pretty Print

The Python pprint module is helpful in many situations. It comes in handy when making API requests, dealing with JSON files, or handling complicated and nested data. You’ll probably find that using the normal print() function isn’t adequate to efficiently explore your data and debug your application. When you use print() with dictionaries and lists, the output doesn’t contain any newlines.

Before you start exploring pprint, you’ll first use urllib to make a request to get some data. You’ll make a request to JSON Placeholder for some mock user information. The first thing to do is to make the HTTP GET request and parse the response into a list of dictionaries:

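import json
from urllib import request

# Fetch mock user data from JSON Placeholder.
response = request.urlopen("https://jsonplaceholder.typicode.com/users")
json_response = response.read()

# Parse the JSON payload into a list of dictionaries, one per user.
users = json.loads(json_response)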

Here, you make a basic GET request and then parse the response with json.loads(). Because the endpoint returns a JSON array, the result is a list of dictionaries, one per user. With the data now in a variable, a common next step is to print the contents with print():

print(users)

However, when you use print() on large data structures like this, the output might not be very readable. That’s where the pprint module comes in.

Working With pprint

To start using the pprint module, you’ll first need to import it:

import pprint

Now, instead of using print(), you can use the pprint.pprint() function to print your data structures in a pretty way. Let’s see an example with the users list:

pprint.pprint(users)

When you run this code, you’ll see that the output is much more readable with appropriate indentation and line breaks. The pprint module automatically formats the data structure to improve its readability.
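To compare the two functions without making a network request, here is a minimal sketch that uses a small hand-written record. The sample_user name and its values are assumptions for illustration, shaped loosely like one entry of the response:

```python
import pprint

# Hypothetical sample record (hand-written, not fetched from the API).
sample_user = {
    "id": 1,
    "name": "Leanne Graham",
    "address": {"street": "Kulas Light", "city": "Gwenborough"},
    "company": {"name": "Romaguera-Crona", "bs": "harness real-time e-markets"},
}

print(sample_user)          # the whole dictionary on one long line
pprint.pprint(sample_user)  # the same data split across several lines
```

Because the one-line representation is longer than the default width of 80 characters, pprint breaks it up, while print() leaves it on a single line.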

Exploring Optional Parameters of pprint()

The pprint module provides several optional parameters that you can use to customize the output. These parameters allow you to control the depth, indentation, line lengths, and more. Let’s explore these optional parameters one by one:

Summarizing Your Data: depth

The depth parameter allows you to specify how deep the pprint module should go when formatting nested data structures. By default, the depth is set to None, which means that the entire data structure will be explored. However, you can set a specific value to limit the depth. Let’s see an example:

pprint.pprint(users, depth=2)

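In this example, the pprint module will only go two levels deep when formatting the users list: the list itself and each user dictionary inside it. Any container nested deeper than that is replaced with an ellipsis (...), so you get a summary of the structure instead of all of its contents.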
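The effect of depth is easier to see on a tiny structure. The following sketch uses a hypothetical nested dictionary (the name and keys are made up for illustration):

```python
import pprint

# Hypothetical four-level nested dictionary.
nested = {"level1": {"level2": {"level3": {"level4": "deep"}}}}

# Only the first two levels of containers are shown in full.
pprint.pprint(nested, depth=2)
# → {'level1': {'level2': {...}}}
```

The third-level dictionary is collapsed to {...}, which tells you something is there without printing it.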

Giving Your Data Space: indent

The indent parameter allows you to specify the number of spaces used for indentation in the output. By default, the indent is set to 1. However, you can increase or decrease the number of spaces as needed. Let’s see an example with an indent of 4:

pprint.pprint(users, indent=4)

In this example, the output will be indented by four spaces instead of the default one space. This can help improve the readability of the formatted data structure when dealing with deeply nested data.
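As a small sketch of what indent changes, the example below formats a hypothetical dictionary of lists (the names and values are assumptions) with a narrow width so that wrapping is forced:

```python
import pprint

# Hypothetical data, chosen so it doesn't fit on a single 40-character line.
groceries = {
    "fruits": ["apple", "banana", "cherry"],
    "vegetables": ["asparagus", "broccoli", "carrot"],
}

# Each nesting level is now offset by four columns instead of one.
pprint.pprint(groceries, indent=4, width=40)
```

With indent=4, pprint also pads the space after each opening bracket, so the first key starts in the same column as the keys on the following lines.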

Limiting Your Line Lengths: width

The width parameter allows you to specify the maximum number of characters per line in the output. By default, the width is set to 80. However, you can increase or decrease the number as needed. Let’s see an example with a width of 40:

pprint.pprint(users, width=40)

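In this example, pprint tries to keep every output line within 40 characters. This can be useful when you have limited space or when you want to keep the output within a certain width for readability. The width is a target rather than a hard limit: if a single item, such as a long string without spaces, can’t be broken up, then its line will still exceed the width.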
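The following sketch illustrates both behaviors on hypothetical data: a list that wraps to respect the width, and a single unbreakable string that overflows it:

```python
import pprint

# Hypothetical list that is too long for one 30-character line.
colors = ["red", "orange", "yellow", "green", "blue", "indigo", "violet"]
pprint.pprint(colors, width=30)  # one item per line, each within 30 characters

# A single string without whitespace can't be split, so the line overflows.
links = ["https://jsonplaceholder.typicode.com/users"]
pprint.pprint(links, width=20)
```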

Squeezing Your Long Sequences: compact

The compact parameter controls how pprint lays out long sequences that don’t fit within the width. By default, compact is set to False, which means that each item of such a sequence is printed on its own line. If you set compact to True, then pprint instead puts as many items on each line as will fit within the width. Let’s see an example:

pprint.pprint(users, compact=True)

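In this example, each output line holds as many sequence items as fit within the width, rather than one item per line. This can be helpful when you have long lists or tuples and want to see them in a more condensed form. Note that compact affects sequences, not dictionaries, so data that consists mostly of dictionaries will look much the same.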
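A list of numbers makes the difference easy to see. This sketch uses a hypothetical list of thirty integers and a width of 40:

```python
import pprint

# Hypothetical long sequence.
numbers = list(range(30))

# Default: the list doesn't fit in 40 characters, so each item gets its own line.
pprint.pprint(numbers, width=40)

# compact=True: items are packed onto each line up to the width.
pprint.pprint(numbers, width=40, compact=True)
```

The first call produces thirty lines, while the second produces only a handful, each still within 40 characters.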

Directing Your Output: stream

The stream parameter allows you to specify the file-like object where the output should be written. By default, stream is set to None, which means that the output is written to the standard output (usually the console). However, you can direct the output to a specific file or a custom file-like object. Let’s see an example with a custom file:

with open("output.txt", "w") as f:
    pprint.pprint(users, stream=f)

In this example, the output will be written to a file named “output.txt” in write mode. You can replace “output.txt” with the path to your desired file.
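Any object with a write() method works as a stream. As a minimal sketch, the example below captures the output in an in-memory io.StringIO buffer instead of a file on disk; the record is hypothetical sample data:

```python
import io
import pprint

# Hypothetical record to print.
record = {"id": 1, "name": "Leanne Graham"}

# Write the formatted output into an in-memory text buffer.
buffer = io.StringIO()
pprint.pprint(record, stream=buffer)

captured = buffer.getvalue()
print(captured, end="")
# → {'id': 1, 'name': 'Leanne Graham'}
```

The same approach works with sys.stderr if you want debugging output kept separate from your program’s regular output.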

Preventing Dictionary Sorting: sort_dicts

The sort_dicts parameter, available since Python 3.8, controls whether pprint sorts dictionaries by key when formatting the output. By default, sort_dicts is set to True, which means that dictionary keys are displayed in sorted order. If you set sort_dicts to False, then pprint preserves the insertion order of each dictionary. Let’s see an example:

pprint.pprint(users, sort_dicts=False)

In this example, the output will maintain the original order of the dictionary instead of sorting it alphabetically.
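Here is a short sketch of the difference, using a hypothetical dictionary whose keys were deliberately inserted out of alphabetical order:

```python
import pprint

# Hypothetical record with keys inserted in a non-alphabetical order.
user = {"name": "Leanne Graham", "id": 1, "email": "Sincere@april.biz"}

pprint.pprint(user)
# → {'email': 'Sincere@april.biz', 'id': 1, 'name': 'Leanne Graham'}

pprint.pprint(user, sort_dicts=False)
# → {'name': 'Leanne Graham', 'id': 1, 'email': 'Sincere@april.biz'}
```

Keeping the insertion order is often preferable for API responses, where related fields tend to be grouped together.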

Prettifying Your Numbers: underscore_numbers

The underscore_numbers parameter, available since Python 3.10, controls whether pprint writes integers with underscores as thousands separators. By default, underscore_numbers is set to False, which means that integers are printed as plain digits. If you set underscore_numbers to True, then pprint inserts an underscore every three digits for better readability. Let’s see an example:

pprint.pprint(users, underscore_numbers=True)

In this example, any large integers in the output are written with underscores as thousands separators, such as 1_000_000. This can help improve the readability of long numbers.

Creating a Custom PrettyPrinter Object

In addition to using the pprint.pprint() function, you can also create your own instance of the PrettyPrinter class from the pprint module. This allows you to have more control over the formatting options and reuse the same settings multiple times. Here’s how you can create a custom PrettyPrinter object:

# Open the file in a with block so it's closed once printing is done.
with open("output.txt", "w") as f:
    custom_pprint = pprint.PrettyPrinter(
        indent=4, width=40, compact=True, stream=f
    )
    custom_pprint.pprint(users)

In this example, a custom PrettyPrinter object is created with an indent of 4, a width of 40, a compact format, and the output directed to a file named “output.txt”. You can modify the settings and the destination file as needed.
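The main benefit of a PrettyPrinter object is reuse: you configure it once and call its methods on many objects. The sketch below assumes a hypothetical summary_printer that always collapses nested containers; the name and the sample data are made up for illustration:

```python
import pprint

# Hypothetical reusable printer that shows only the top level of any structure.
summary_printer = pprint.PrettyPrinter(depth=1, width=60)

first = {"a": {"b": 1}}
second = [["x", "y"], ["z"]]

# The same settings apply to every call.
summary_printer.pprint(first)   # → {'a': {...}}
summary_printer.pprint(second)  # → [[...], [...]]

# A PrettyPrinter also has its own pformat() method.
summary = summary_printer.pformat(first)
```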

Getting a Pretty String With pformat()

Sometimes, instead of printing the formatted data structure, you may want to store it as a string for further processing or saving. The pprint module provides the pformat() function for this purpose. Here’s an example:

pretty_string = pprint.pformat(users)
print(pretty_string)

In this example, the pformat() function is used to convert the users list into a formatted string. The string is then printed to the console. You can use the pretty_string variable just like any other string in your code.
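One common use, sketched here with a hypothetical config dictionary, is to embed the formatted string in a larger message, for example before handing it to a logger. pformat() accepts the same formatting parameters as pprint():

```python
import pprint

# Hypothetical configuration data.
config = {
    "host": "localhost",
    "port": 8080,
    "debug": True,
    "plugins": ["auth", "cache", "metrics"],
}

# Build the pretty string first, then combine it with other text.
pretty = pprint.pformat(config, width=40)
message = f"Loaded config:\n{pretty}"
print(message)
```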

Handling Recursive Data Structures

Recursive data structures are data structures that contain references to themselves. Expanding such a structure naively would never finish, so Python marks the cycle instead: print() shows a recursive list reference as [...]. The pprint module detects these cycles as well, with no extra parameter needed, and replaces each recursive reference with a more descriptive marker. Let’s see an example:

data = [1, [2, [3, [4]]]]
# Append the outer list to the innermost list to create a cycle.
data[1][1][1].append(data)
pprint.pprint(data)

In this example, the innermost list contains a reference back to data. You can’t write that reference directly in the list literal, because the name data doesn’t exist until the assignment has finished, so you append it afterward. Instead of raising a RecursionError, pprint prints the recursive reference as <Recursion on list with id=...>, where the number is the id() of the list. This allows you to visualize the data structure without causing an infinite loop.
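If you need to check for this situation in code rather than by eye, the pprint module also provides the isrecursive() and isreadable() helper functions. The following is a minimal sketch that builds a self-referencing list and inspects it:

```python
import pprint

# Build a list whose innermost list refers back to the outer list.
data = [1, [2, [3, [4]]]]
data[1][1][1].append(data)

print(pprint.isrecursive(data))  # → True

# A recursive structure can't be reconstructed from its printed form.
print(pprint.isreadable(data))   # → False

# The formatted string marks the cycle instead of looping forever.
formatted = pprint.pformat(data)
print(formatted)
```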

Conclusion

The pprint module in Python is a powerful tool for pretty-printing data structures. It allows you to format your output in a readable and aesthetically pleasing way. By exploring the optional parameters, you can customize the formatting according to your needs. Whether you’re working with API requests, JSON files, or any other data, pprint can make your debugging and data exploration process much easier.

Remember to import the pprint module at the beginning of your code and use the pprint.pprint() function to print your data structures. Experiment with the optional parameters and create a custom PrettyPrinter object for more advanced formatting. And don’t forget that you can also use pprint.pformat() to get a pretty string representation of your data structures.

Now you have the tools to prettify your data structures in Python and make them more readable for humans. Happy coding!