Effortlessly Sort Python Dicts: A Step-by-Step Tutorial

CodeMDD.io

Sorting a Python Dictionary: Values, Keys, and More

by Ian Currie

You’ve got a dictionary, but you’d like to sort the key-value pairs. Perhaps you’ve tried passing a dictionary to the sorted() function but haven’t gotten the results you expected. In this tutorial, you’ll go over everything you need to know if you want to sort dictionaries in Python.

In this tutorial, you’ll:

  • Review how to use the sorted() function
  • Learn how to get dictionary views to iterate over
  • Understand how dictionaries are cast to lists during sorting
  • Learn how to specify a sort key to sort a dictionary by value, key, or nested attribute
  • Review dictionary comprehensions and the dict() constructor to rebuild your dictionaries
  • Consider alternative data structures for your key-value data

Along the way, you’ll also use the timeit module to time your code and get tangible results for comparing the different methods of sorting key-value data. You’ll also consider whether a sorted dictionary is really your best option, as it’s not a particularly common pattern.

To get the most out of this tutorial, you should know about dictionaries, lists, tuples, and functions. With that knowledge, you’ll be able to sort dictionaries by the end of this tutorial. Some exposure to lambda functions and higher-order functions will also come in handy but isn’t a requirement.

Rediscovering Dictionary Order in Python

Before Python 3.6, dictionaries were inherently unordered. A Python dictionary is an implementation of the hash table, which is traditionally an unordered data structure.

As a side effect of the compact dictionary implementation in Python 3.6, dictionaries started to conserve insertion order. From 3.7, that insertion order has been guaranteed.

If you wanted to keep an ordered dictionary as a data structure before 3.7, you had to use the collections.OrderedDict class. From 3.7 on, regular dictionaries maintain insertion order as a language guarantee.
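As a small illustration of what insertion order means in practice, here is a minimal sketch, assuming Python 3.7 or later. The dictionary name scores and its contents are made up for the example.

```python
# Assumes Python 3.7+, where insertion order is part of the language guarantee.
scores = {}
scores["cherry"] = 10
scores["apple"] = 20
scores["banana"] = 5

# Iteration follows the order in which keys were inserted, not sorted order.
insertion_order = list(scores)
print(insertion_order)  # ['cherry', 'apple', 'banana']

# Updating an existing key keeps its original position.
scores["cherry"] = 99
order_after_update = list(scores)
print(order_after_update)  # ['cherry', 'apple', 'banana']
```

Note that the keys come back in insertion order, which is not the same thing as sorted order.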

Understanding What Sorting A Dictionary Really Means

When you say you want to sort a dictionary, what does that actually mean?

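Because dictionaries are built on hash tables, they are optimized for fast lookups by key rather than for keeping their contents in sorted order. A dictionary preserves the order in which its items were inserted, but it never reorders itself, and it has no in-place sort method the way a list does. Sorting a dictionary therefore really means sorting its keys, values, or key-value pairs and, if needed, building a new dictionary from the sorted result.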

However, what you can do with Python dictionaries is:

  • Retrieve the keys, values, or key-value pairs in a sorted order
  • Sort the dictionaries based on their keys or values
  • Sort the dictionaries based on a defined sort key

Sorting Dictionaries in Python

Using the sorted() Function

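The simplest way to sort a dictionary is to use the sorted() function. Passing the dictionary itself to sorted() returns a sorted list of its keys only. To sort the key-value pairs, pass the result of the items() method instead. In that case, sorted() returns a list of (key, value) tuples, ordered by key in ascending order by default.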

Here’s an example:

>>> d = {'apple': 20, 'banana': 5, 'cherry': 10}
>>> sorted(d.items())
[('apple', 20), ('banana', 5), ('cherry', 10)]
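To make the difference between sorting the dictionary and sorting its items concrete, here is a short sketch. The dictionary is deliberately created out of alphabetical order, and reverse=True is the standard sorted() argument for descending order.

```python
d = {"cherry": 10, "apple": 20, "banana": 5}

# Iterating over a dict yields its keys, so sorted(d) returns keys only.
keys_only = sorted(d)

# items() yields (key, value) tuples, which sort by key first.
pairs = sorted(d.items())

# reverse=True flips the result into descending order.
pairs_desc = sorted(d.items(), reverse=True)

print(keys_only)   # ['apple', 'banana', 'cherry']
print(pairs)       # [('apple', 20), ('banana', 5), ('cherry', 10)]
print(pairs_desc)  # [('cherry', 10), ('banana', 5), ('apple', 20)]
```

In every case the result is a new list, and the original dictionary is left untouched.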

Getting Keys, Values, or Both From a Dictionary

Sometimes, you don’t need the key-value pairs in sorted order. You might only be interested in the keys or the values.

To retrieve the keys in sorted order, you can use the sorted() function with the keys() method of the dictionary:

>>> d = {'apple': 20, 'banana': 5, 'cherry': 10}
>>> sorted(d.keys())
['apple', 'banana', 'cherry']

To retrieve the values in sorted order, you can use the sorted() function with the values() method of the dictionary:

>>> d = {'apple': 20, 'banana': 5, 'cherry': 10}
>>> sorted(d.values())
[5, 10, 20]

If you want both the keys and values in sorted order, you can sort the view returned by the items() method of the dictionary:

>>> d = {'apple': 20, 'banana': 5, 'cherry': 10}
>>> sorted(d.items())
[('apple', 20), ('banana', 5), ('cherry', 10)]

Understanding How Python Sorts Tuples

By default, Python sorts tuples using the lexicographic order, which means it compares the elements of the tuples from left to right. If the leftmost elements are equal, Python compares the next elements, and so on.

For example, if you sort the list of tuples [('apple', 20), ('banana', 5), ('cherry', 10)], Python compares the first elements ('apple', 'banana', 'cherry') and sorts the tuples in ascending order. Because dictionary keys are unique, the first elements of tuples produced by items() never tie, so the second elements (20, 5, 10) don’t affect the result here. They would only be compared if two tuples had equal first elements.

>>> data = [('apple', 20), ('banana', 5), ('cherry', 10)]
>>> sorted(data)
[('apple', 20), ('banana', 5), ('cherry', 10)]
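The tie-breaking behavior is easier to see with a list that does contain equal first elements. The following sketch uses made-up data. It also shows that tied tuples need comparable second elements, since Python otherwise raises a TypeError.

```python
# Two tuples share the first element 'apple', so the second element decides.
data = [("banana", 5), ("apple", 20), ("apple", 3), ("cherry", 10)]
result = sorted(data)
print(result)  # [('apple', 3), ('apple', 20), ('banana', 5), ('cherry', 10)]

# When first elements tie, the second elements must support "<".
mixed_error = None
try:
    sorted([("apple", 1), ("apple", "one")])
except TypeError as exc:
    mixed_error = exc  # int and str cannot be ordered against each other

print(type(mixed_error).__name__)  # TypeError
```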

Using the key Parameter and Lambda Functions

Python provides the key parameter in the sorted() function to specify a custom sort key. The key parameter takes a function that will be used to extract a comparison key from each element before sorting.

For example, to sort the dictionary by its values rather than by its keys, you can pass a lambda function as the key parameter:

>>> d = {'apple': 20, 'banana': 5, 'cherry': 10}
>>> sorted(d.items(), key=lambda x: x[1])
[('banana', 5), ('cherry', 10), ('apple', 20)]

In this example, the lambda function lambda x: x[1] is used to extract the second elements (values) of each tuple, and the list is sorted based on these values.
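Building on the same idea, here is a hedged sketch of two common variations: sorting by value in descending order, and breaking ties between equal values by key. The extra 'date' entry is invented so that two values tie.

```python
d = {"date": 20, "banana": 5, "cherry": 10, "apple": 20}

# Highest value first. sorted() is stable, so tied items keep their
# original relative order ('date' was inserted before 'apple').
by_value_desc = sorted(d.items(), key=lambda item: item[1], reverse=True)
print(by_value_desc)
# [('date', 20), ('apple', 20), ('cherry', 10), ('banana', 5)]

# Highest value first, ties broken alphabetically by key.
# Negating the value works here because the values are numbers.
by_value_then_key = sorted(d.items(), key=lambda item: (-item[1], item[0]))
print(by_value_then_key)
# [('apple', 20), ('date', 20), ('cherry', 10), ('banana', 5)]
```

The tuple-returning key relies on the lexicographic tuple comparison described earlier.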

You can also use the key parameter to sort a list of dictionaries by one of their values. For example, if you have a list of dictionaries representing people, and you want to sort them by age, you can provide a lambda function that looks up the 'age' key:

>>> people = [{'name': 'John', 'age': 28}, {'name': 'Mary', 'age': 25}, {'name': 'Jane', 'age': 30}]
>>> sorted(people, key=lambda x: x['age'])
[{'name': 'Mary', 'age': 25}, {'name': 'John', 'age': 28}, {'name': 'Jane', 'age': 30}]

Selecting a Nested Value With a Sort Key

In some cases, you might want to sort a dictionary based on a value that is nested within each key-value pair. To do this, you can combine the key parameter with multiple levels of indexing or attribute access.

For example, if you have a dictionary where the values are dictionaries themselves, and you want to sort the outer dictionary based on an attribute of the inner dictionaries, you can create a lambda function that uses multiple indexing or attribute access operations:

>>> d = {'apple': {'price': 2.0}, 'banana': {'price': 1.0}, 'cherry': {'price': 3.0}}
>>> sorted(d.items(), key=lambda x: x[1]['price'])
[('banana', {'price': 1.0}), ('apple', {'price': 2.0}), ('cherry', {'price': 3.0})]

In this example, the lambda function lambda x: x[1]['price'] is used to extract the 'price' attribute from the inner dictionary for sorting.
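If some inner dictionaries might lack the key you sort on, indexing with ['price'] raises a KeyError. One possible way to handle that, sketched below, is a named key function that falls back to a default. The helper name price_of and the choice of float('inf') as the fallback (which pushes missing prices to the end) are assumptions for illustration.

```python
d = {
    "apple": {"price": 2.0},
    "banana": {"price": 1.0},
    "cherry": {},  # no price recorded
}

def price_of(item):
    """Return the nested price of a (key, value) pair, or infinity if missing."""
    _, details = item
    return details.get("price", float("inf"))

by_price = sorted(d.items(), key=price_of)
print(by_price)
# [('banana', {'price': 1.0}), ('apple', {'price': 2.0}), ('cherry', {})]
```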

Converting Back to a Dictionary

After sorting a dictionary, you may want to convert the sorted list of key-value pairs back into a dictionary. To do this, you can pass the sorted list to the dict() constructor:

>>> d = {'apple': 20, 'banana': 5, 'cherry': 10}
>>> sorted_items = sorted(d.items())
>>> sorted_dict = dict(sorted_items)
>>> sorted_dict
{'apple': 20, 'banana': 5, 'cherry': 10}

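This creates a new dictionary whose insertion order matches the sorted order of the key-value pairs. The original dictionary is left unchanged, and the new one stays sorted only until you insert more keys.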
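A dictionary comprehension is another way to rebuild the dictionary, and it lets you transform or filter items along the way. The sketch below sorts by value and rebuilds with both approaches. It assumes Python 3.7 or later so that the new dictionaries keep their insertion order.

```python
d = {"apple": 20, "banana": 5, "cherry": 10}

sorted_pairs = sorted(d.items(), key=lambda item: item[1])

# Option 1: the dict() constructor accepts an iterable of (key, value) pairs.
by_value = dict(sorted_pairs)

# Option 2: a dictionary comprehension over the same sorted pairs.
by_value_comp = {key: value for key, value in sorted_pairs}

print(by_value)        # {'banana': 5, 'cherry': 10, 'apple': 20}
print(list(by_value))  # ['banana', 'cherry', 'apple']

# Dictionary equality ignores order, so the sorted copy still equals the original.
print(by_value == d)   # True
```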

Considering Strategic and Performance Issues

Before deciding to sort a dictionary, it’s important to consider strategic and performance issues. Sorting a dictionary has some trade-offs that may or may not be worth it depending on your specific use case.

Using Special Getter Functions to Increase Performance and Readability

If your sort key only needs to pull out an item by key or index, you can use a special getter function instead of a lambda to improve performance and readability.

The operator.itemgetter() function provides a convenient way to create a getter function for a specific key. This can be useful when sorting dictionaries based on certain values.

For example, to sort a list of dictionaries representing people by their ages, you can use the operator.itemgetter() function to create a getter function for the 'age' key:

>>> from operator import itemgetter
>>> people = [{'name': 'John', 'age': 28}, {'name': 'Mary', 'age': 25}, {'name': 'Jane', 'age': 30}]
>>> sorted(people, key=itemgetter('age'))
[{'name': 'Mary', 'age': 25}, {'name': 'John', 'age': 28}, {'name': 'Jane', 'age': 30}]

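Using itemgetter() can be more efficient and readable than using a lambda function. It can also fetch several items at once and return them as a tuple, which is handy for sorting on more than one field. It can’t reach into nested structures, though, so multiple levels of indexing still call for a lambda or a named function.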
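When itemgetter() is given more than one argument, it returns a tuple of the looked-up items, which works as a multi-field sort key. The following sketch sorts by age and then by name. The extra person 'Adam' is invented to create a tie on age.

```python
from operator import itemgetter

people = [
    {"name": "John", "age": 28},
    {"name": "Mary", "age": 25},
    {"name": "Adam", "age": 28},
]

# itemgetter("age", "name") returns (age, name) for each dictionary,
# so ties on age are broken alphabetically by name.
by_age_then_name = sorted(people, key=itemgetter("age", "name"))
names = [person["name"] for person in by_age_then_name]
print(names)  # ['Mary', 'Adam', 'John']
```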

Measuring Performance When Using itemgetter()

If performance is a concern when sorting dictionaries, you can use the timeit module to measure the execution time of your code.

Here’s an example of how to use the timeit module to compare the performance of sorting a dictionary using a lambda function versus using operator.itemgetter():

>>> import timeit
>>> from operator import itemgetter
>>> d = {'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5}
>>> lambda_time = timeit.timeit(lambda: sorted(d.items(), key=lambda x: x[1]), number=100000)
>>> itemgetter_time = timeit.timeit(lambda: sorted(d.items(), key=itemgetter(1)), number=100000)
>>> print(f"Lambda function time: {lambda_time}")
Lambda function time: 0.318201399999799
>>> print(f"Itemgetter time: {itemgetter_time}")
Itemgetter time: 0.190251800000174

In this example, the timeit module is used to measure the execution time of sorting the dictionary 100,000 times using both a lambda function and operator.itemgetter(). The results show that using itemgetter() is faster in this case. Exact timings depend on your machine and Python version, so treat the numbers as relative rather than absolute.
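Single timing runs can be noisy. One common way to reduce the noise, sketched here, is to repeat the measurement with timeit.repeat() and keep the minimum. The helper name best_of and the repeat and number values are arbitrary choices for illustration, and no particular timings are implied.

```python
import timeit
from operator import itemgetter

d = {"a": 1, "b": 2, "c": 3, "d": 4, "e": 5}
get_value = itemgetter(1)  # build the getter once, outside the timed call

def best_of(func, repeat=5, number=10_000):
    """Return the fastest total time across several timing runs."""
    return min(timeit.repeat(func, repeat=repeat, number=number))

lambda_best = best_of(lambda: sorted(d.items(), key=lambda item: item[1]))
itemgetter_best = best_of(lambda: sorted(d.items(), key=get_value))

print(f"Lambda function best time: {lambda_best:.4f} s")
print(f"Itemgetter best time: {itemgetter_best:.4f} s")
```

Taking the minimum filters out runs that were slowed down by other activity on the machine.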

Judging Whether You Want to Use a Sorted Dictionary

Keep in mind that sorting a dictionary is not a common pattern. The main reason you might want to sort a dictionary is for human-readable output or to maintain a specific order of key-value pairs.

If you’re frequently accessing the values in a dictionary by key, it’s generally more efficient to use a regular dictionary and access the values directly.

However, if you’re frequently iterating over the key-value pairs in a dictionary and order is important, using collections.OrderedDict or sorting the dictionary as needed can be valid approaches.

Comparing the Performance of Different Data Structures

If you’re dealing with large amounts of data and performance is crucial, it’s worth comparing the performance of different data structures.

In terms of lookup time, dictionaries have an average time complexity of O(1), which makes them very efficient. However, dictionaries don’t keep themselves sorted. They only preserve insertion order. If you need to maintain a particular order, you might consider a different data structure like collections.OrderedDict, which adds order-aware methods such as move_to_end(), or a list of tuples.

Here’s an example that compares the performance of looking up values in a dictionary versus an OrderedDict and a list of tuples:

>>> import timeit
>>> from collections import OrderedDict
>>> d = {i: i for i in range(10000)}
>>> od = OrderedDict(d.items())
>>> lt = list(d.items())
>>> dict_time = timeit.timeit(lambda: d[5000], number=100000)
>>> od_time = timeit.timeit(lambda: od[5000], number=100000)
>>> list_time = timeit.timeit(lambda: lt[5000][1], number=100000)
>>> print(f"Dictionary time: {dict_time}")
Dictionary time: 0.0005143999994746604
>>> print(f"OrderedDict time: {od_time}")
OrderedDict time: 0.003043999999225646
>>> print(f"List of tuples time: {list_time}")
List of tuples time: 0.0029188999998269396

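In this example, the timeit module is used to measure the lookup time for a specific key in a dictionary, an OrderedDict, and a list of tuples, all containing 10,000 key-value pairs. The results show that dictionary lookups are the fastest, followed by list lookups, and finally OrderedDict lookups. Note that lt[5000][1] indexes the list by position rather than searching for a key. Finding a pair by key in a list of tuples requires a linear scan, which takes O(n) time.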
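To show what a key search in a list of tuples actually involves, here is a minimal sketch. The helper find_in_pairs is hypothetical, and the values are doubled so that keys and values are easy to tell apart.

```python
d = {i: i * 2 for i in range(10_000)}
pairs = list(d.items())

def find_in_pairs(pairs, key):
    """Scan the list front to back until the key matches: O(n) in the worst case."""
    for k, v in pairs:
        if k == key:
            return v
    raise KeyError(key)

# Both lookups return the same value, but the dictionary hashes the key
# directly while the list has to walk past 5,000 pairs first.
from_dict = d[5000]
from_pairs = find_in_pairs(pairs, 5000)
print(from_dict, from_pairs)  # 10000 10000
```

You could time both lookups with timeit to see how the gap grows with the size of the data.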

Comparing the Performance of Sorting

In addition to lookup time, it’s also important to consider the performance of sorting different data structures.

Here’s an example that compares the performance of sorting a dictionary, an OrderedDict, and a list of tuples:

>>> import timeit
>>> from collections import OrderedDict
>>> d = {i: i for i in range(10000)}
>>> od = OrderedDict(d.items())
>>> lt = list(d.items())
>>> dict_sort_time = timeit.timeit(lambda: sorted(d.items(), key=lambda x: x[1]), number=100)
>>> od_sort_time = timeit.timeit(lambda: sorted(od.items(), key=lambda x: x[1]), number=100)
>>> list_sort_time = timeit.timeit(lambda: sorted(lt, key=lambda x: x[1]), number=100)
>>> print(f"Dictionary sort time: {dict_sort_time}")
Dictionary sort time: 0.11550150000023188
>>> print(f"OrderedDict sort time: {od_sort_time}")
OrderedDict sort time: 0.5365233999997942
>>> print(f"List sort time: {list_sort_time}")
List sort time: 0.5305816999994944

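In this example, the timeit module is used to measure the sort time for a dictionary, an OrderedDict, and a list of tuples, all containing 10,000 key-value pairs. In this run, sorting the dictionary’s items was faster than sorting the OrderedDict or the list. All three calls sort the same 10,000 tuples with the same key function, so differences like these can vary between runs, machines, and Python versions. Rerun the comparison on your own setup before drawing conclusions.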

Comparing the Performance of Lookups

Finally, it’s important to compare the performance of different data structures when it comes to lookups. While dictionaries are generally efficient for lookup operations, it’s worth measuring their performance to ensure that they meet your requirements.

Here’s an example that compares the lookup time for a specific key in a dictionary and an OrderedDict:

>>> import timeit
>>> from collections import OrderedDict
>>> d = {i: i for i in range(10000)}
>>> od = OrderedDict(d.items())
>>> dict_lookup_time = timeit.timeit(lambda: d[5000], number=1000)
>>> od_lookup_time = timeit.timeit(lambda: od[5000], number=1000)
>>> print(f"Dictionary lookup time: {dict_lookup_time}")
Dictionary lookup time: 3.9000000000395346e-06
>>> print(f"OrderedDict lookup time: {od_lookup_time}")
OrderedDict lookup time: 2.7000000000406884e-06

In this example, the timeit module is used to measure the lookup time for a specific key in a dictionary and an OrderedDict, both containing 10,000 key-value pairs. The results show that the lookup time for both data structures is very fast and comparable.

Conclusion

In this tutorial, you learned how to sort dictionaries in Python using various methods. You saw how to use the sorted() function to sort dictionaries based on their keys, values, or both. You also learned how to specify a sort key using the key parameter and lambda functions. Additionally, you explored how to select a nested value with a sort key and how to convert the sorted list of key-value pairs back into a dictionary.

You also considered strategic and performance issues when sorting dictionaries. You saw how to use special getter functions like operator.itemgetter() to increase performance and readability. You learned how to measure the performance of different sorting methods and lookups using the timeit module. Finally, you determined whether sorting a dictionary is a suitable choice for your specific use case.

Remember that sorting a dictionary is not a common pattern, but it can be valuable in certain scenarios for maintaining order or generating human-readable output.