Staying pythonic - letting python do the heavy lifting

2020-11-16

This was inspired by a number of talks by Raymond Hettinger.

Pythonic: Coding beautifully in harmony with the language to get the maximum benefits from python - Raymond Hettinger

Oftentimes you need to write code in several different languages. Even though Python is often taught as an introductory language, its syntax and built-in functions stretch far beyond what many realize. This becomes even more apparent when switching between traditional statically typed languages and Python. It can be hard to remember the most pythonic way of doing something. This is a non-exhaustive list of common non-pythonic code I find myself writing as a result of not always remembering the pythonic way.

## 1. Looping with (and without) range

In Python the for-loop works a bit differently than in most other languages. You don't need to define an index variable to loop over a range. Instead, you loop over an iterable. In most other languages you would do something like:

    for (int i = 0; i < 100; i++) {
        // Do something
    }

In Python you define an iterable to iterate over:

    for x in range(100):
        print(x)

As of Python 3, range replaces xrange. The benefit of a lazy object like this is that you don't need to generate all possible values before you actually need them.
When the interpreter sees range(10000) it doesn't compute all the values from 0 to 9999. Instead it just stores the range object itself. Later, when the for-loop asks the range's iterator for the next value (via next()), that value is computed on the spot.
This means that range(1) and range(1, 100000, 2) take up exactly the same amount of memory.
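
A quick way to check this claim is to compare object sizes with sys.getsizeof. The sketch below assumes CPython, where a range object stores only its start, stop and step; exact byte counts vary between versions, so none are shown.

```python
import sys

small = range(1)
large = range(1, 100000, 2)

# A range stores only start, stop and step, so both objects have the same size.
print(sys.getsizeof(small) == sys.getsizeof(large))

# Values are produced one at a time by an iterator created from the range.
numbers = iter(large)
first = next(numbers)   # 1
second = next(numbers)  # 3
print(first, second)

# Materializing the values as a list is what actually costs memory.
print(sys.getsizeof(list(large)) > sys.getsizeof(large))
```

The last line is the interesting one: the cost only appears once the values are turned into a real list.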

The range function is incredibly useful, but that doesn't mean it's always the right choice. A big difference between Python and a lot of other languages is that looping over collections comes for free. You should avoid accessing objects in collections by index and instead use the power of the for-loop.

    # Don't do this:
    prefixes = ['nano', 'piko', 'kilo']
    for i in range(len(prefixes)):
        print(prefixes[i])

    # Do this:
    for prefix in prefixes:
        print(prefix)

This might seem like a no-brainer to many people, but it's really easy to forget in more complicated scenarios. Every time you see something being accessed by index, consider whether it could be refactored to improve readability.

## 2. Built-in iterators

The power of the iterator is used thoroughly by the built-in functions and methods of the standard library. Many old functions have a newer iterator-based version and it's almost always a good idea to use the iterator. This becomes apparent when using the reversed() function.

reversed

The reversed() function accepts a sequence and gives you its elements in reverse order. This might not seem at all related to iterators, but they play an important role in making reversed() performant. The naïve way of writing a reverse function might be to loop over the list backwards, saving each element along the way. This, however, turns out to be inefficient: all elements must be copied and saved to memory, resulting in twice the memory usage! Iterators allow us to create an object which provides each element on demand, in reverse order, without copying the data.

    # This will only create a single iterator which uses a fraction
    # of the memory a traditional 'reverse()' function might use
    for x in reversed(range(10000)):
        print(x)
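
To make the difference concrete, the sketch below contrasts reversed() with slicing, which does build a full copy. The variable names are only illustrative.

```python
numbers = list(range(10000))

# Slicing with a negative step builds a second list of the same length.
copied = numbers[::-1]

# reversed() returns a small iterator that walks the original list backwards.
lazy = reversed(numbers)

print(isinstance(copied, list))  # True
print(isinstance(lazy, list))    # False

# Elements are only produced when asked for.
first_value = next(lazy)
print(first_value)  # 9999
```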

enumerate

Another great built-in iterator is the enumerate function. In certain scenarios you need access to the index of an object. Should you sacrifice readability and replace for item in items with for i in range(len(items))? This is the problem enumerate sets out to solve. On each iteration it yields a tuple of the current index and the current element.

    prefixes = ['nano', 'piko', 'kilo']
    for i, prefix in enumerate(prefixes):
        print(i, '->', prefix)

zip

Previously available as itertools.izip, the zip function in Python 3 returns an iterator that combines multiple iterables into one. Every programming course will teach you how to loop over two lists at once: find the shorter of the two lists and loop over each index, using the index to access the elements of each list. Something like:

    prefixes = ['nano', 'piko', 'kilo']
    power_of_10 = [-9, -12, 3]
    for i in range(min(len(prefixes), len(power_of_10))):
        print(f'{prefixes[i]}: 10^{power_of_10[i]}')

With zip, this can be written as:

    prefixes = ['nano', 'piko', 'kilo']
    power_of_10 = [-9, -12, 3]
    for prefix, power in zip(prefixes, power_of_10):
        print(f'{prefix}: 10^{power}')
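
zip stops as soon as the shortest input is exhausted, which is why the explicit min(len(...)) bookkeeping disappears. A small sketch follows, with an extra 'mega' entry added here only to make the lists unequal. itertools.zip_longest is the standard-library alternative when the leftover items should be kept.

```python
from itertools import zip_longest

prefixes = ['nano', 'piko', 'kilo', 'mega']  # 'mega' has no matching power
power_of_10 = [-9, -12, 3]

# zip silently stops at the shortest input.
pairs = list(zip(prefixes, power_of_10))
print(pairs)  # [('nano', -9), ('piko', -12), ('kilo', 3)]

# zip_longest pads the shorter input with a fill value instead.
padded = list(zip_longest(prefixes, power_of_10, fillvalue=None))
print(padded[-1])  # ('mega', None)

# Pairs from zip can be fed straight into dict().
lookup = dict(zip(prefixes, power_of_10))
print(lookup['kilo'])  # 3
```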

iter

## 4. Sorting keys

When sorting in most other programming languages you define how you want your list to be sorted by specifying a comparison function.
In Python such a function might look like this (Python 3 dropped the cmp argument of sorted, so the comparison function has to be wrapped with functools.cmp_to_key):

    from functools import cmp_to_key

    prefixes = ['nano', 'piko', 'kilo']

    def cmp_length(a, b):
        if len(a) > len(b):
            return 1
        if len(a) < len(b):
            return -1
        return 0

    print(sorted(prefixes, key=cmp_to_key(cmp_length)))

A more compact way of writing this using more python-specific features would be:

    prefixes = ['nano', 'piko', 'kilo']
    print(sorted(prefixes, key=len))

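Or a more complicated example, where the key is a function of your own (substr here is only an illustrative helper that sorts by everything after the first letter):

    prefixes = ['nano', 'piko', 'kilo']

    def substr(prefix):
        # Ignore the first letter when comparing
        return prefix[1:]

    print(sorted(prefixes, key=substr))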

## 5. For else

A quite interesting feature in Python is the else clause of loops. It acts as the else clause of the loop's internal if statement. Every for/while loop has an internal condition which determines whether the loop should continue. When this condition becomes false, the loop ends and the else clause runs. On the other hand, if the loop is terminated with a break, the else clause is skipped.

The else clause can be used to check if a loop exhausted all items and didn't finish early.

    import math

    def is_prime(num):
        # Assumes num >= 2; candidate factors run from 2 up to and
        # including the integer square root of num
        for factor in range(2, math.isqrt(num) + 1):
            if num % factor == 0:
                print(f'{num} is divisible by {factor}')
                break
        else:
            # Exhausted factors -> prime number
            print(f'{num} is prime')
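
The same pattern works for any search loop. As a sketch, the hypothetical helper below (not taken from the talks) looks for the first negative number and uses else to handle the case where the loop ran to completion without a break.

```python
def first_negative(numbers):
    """Return a short description of the first negative number, if any."""
    for number in numbers:
        if number < 0:
            result = f'found {number}'
            break
    else:
        # Only reached when the loop was not ended by break,
        # which includes the case of an empty input.
        result = 'no negative numbers'
    return result


print(first_negative([3, 1, -4, -1]))  # found -4
print(first_negative([3, 1, 4]))       # no negative numbers
print(first_negative([]))              # no negative numbers
```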

## 6. dict setdefault

The collections module implements a bunch of really useful datatypes that provide extended capabilities compared to the built-in types. One of the more useful datatypes is defaultdict.
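
As a rough sketch of what that looks like in practice, the example below groups words by length in two ways: with the built-in dict.setdefault method and with collections.defaultdict. The word list and variable names are made up for illustration.

```python
from collections import defaultdict

words = ['nano', 'piko', 'kilo', 'mega', 'deci', 'yotta']

# dict.setdefault inserts the default only if the key is missing,
# then returns the stored value.
by_length = {}
for word in words:
    by_length.setdefault(len(word), []).append(word)

# defaultdict calls the factory (here: list) for missing keys automatically.
by_length_dd = defaultdict(list)
for word in words:
    by_length_dd[len(word)].append(word)

print(by_length)  # {4: ['nano', 'piko', 'kilo', 'mega', 'deci'], 5: ['yotta']}
print(by_length == dict(by_length_dd))  # True
```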

A dictionary holds

## 7. Keyword arguments

Function arguments in Python have three

## 8. Named tuples

## 9. Unpacking Sequences

Sequence unpacking is incredibly useful when working with sequences. You can access the elements of a sequence by unpacking it into variables. When manipulating continuously changing variables, sequence unpacking saves you from writing a lot of temporary variables.

    # With a temporary variable
    x = 0
    y = 1
    for _ in range(10):
        next_value = x + y
        y = x
        x = next_value
    print(x)

    # With sequence unpacking
    x = 0
    y = 1
    for _ in range(10):
        x, y = y, x + y
    print(x)

## 10. With as

## 12. Design patterns - letting python do the work for you

Following small pieces of practical advice can be helpful in the short run, but one of the most important things to keep in mind is not to forget the bigger picture. Make a habit of always looking at the patterns you are creating in your code. Even if you don't follow a specific design pattern, it's usually a good idea to be aware of how you structure your code.

Implement dunder methods

'Dunder methods', or double underscore methods, are special methods used by a lot of underlying functions in python to give your code special 'magic' powers. Using them will make your code more usable and the users of your code (including yourself) will find it easier, and more obvious, to figure out what it does.

The __repr__ method is a great example of a dunder method that's easy to implement but can give your code a huge boost in usability.
When you print an instance of a class you get the default string representation of the object. This includes the name of the class and the instance's memory address. Creating your own __repr__ lets you provide a more appropriate representation of your type.

    # Without __repr__
    class Color:
        def __init__(self, color_hex, color_name):
            self.color_hex = color_hex
            self.color_name = color_name

    green = Color('#7fff00', 'Chartreuse')
    print(green)
    # <__main__.Color object at 0x109877580>

    # With __repr__
    class Color:
        def __init__(self, color_hex, color_name):
            self.color_hex = color_hex
            self.color_name = color_name

        def __repr__(self):
            return f'{self.__class__.__name__}({self.color_name}: {self.color_hex})'

    green = Color('#7fff00', 'Chartreuse')
    print(green)
    # Color(Chartreuse: #7fff00)

Sequence

If you ever have a class or datatype that has some concept of a size and is accessible via an index, chances are it could conform to the sequence protocol, which comes with a ton of benefits.
Simply adding the two dunder methods __len__ and __getitem__ tells Python all it needs to know to treat it as an iterable. As previously mentioned, iterables are extremely useful and can be used in a variety of situations.
This is especially useful when wrapping an API or other external code that might not conform to the iterable protocol.

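    from legacy_package import get_nbr_of_users, get_user_by_index

    class Users:
        def __len__(self):
            return get_nbr_of_users()

        def __getitem__(self, index):
            # IndexError past the end tells Python when to stop iterating
            if not 0 <= index < len(self):
                raise IndexError(index)
            return get_user_by_index(index)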
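
Here is a self-contained sketch of the same idea that can be run as-is. Since legacy_package is not available here, two stand-in functions play its role; the user data is an assumption made for illustration. The method name Python looks for is __getitem__, and it must raise IndexError past the end, since that is how a for-loop knows when to stop.

```python
_USERS = ['ada', 'grace', 'linus']  # stand-in data for the legacy package


def get_nbr_of_users():
    return len(_USERS)


def get_user_by_index(index):
    return _USERS[index]


class Users:
    def __len__(self):
        return get_nbr_of_users()

    def __getitem__(self, index):
        # Raising IndexError past the end is what lets a for-loop terminate.
        if not 0 <= index < len(self):
            raise IndexError(index)
        return get_user_by_index(index)


users = Users()
print(len(users))             # 3
print(users[1])               # grace
print(list(users))            # ['ada', 'grace', 'linus']
print(list(reversed(users)))  # ['linus', 'grace', 'ada']
print('ada' in users)         # True
```

With just these two methods the class works with for-loops, list(), reversed() and the in operator, without any explicit iterator code.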

Create exceptions

In some languages, like C, a common way to indicate error state is by using a sentinel value such as -1. Python makes it easy to create custom exceptions which can be more descriptive than generic errors.

    class ValueRangeException(Exception):
        pass
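
A short sketch of how such an exception might be used in place of a sentinel value. The set_volume function and its 0 to 100 range are invented for illustration.

```python
class ValueRangeException(Exception):
    """Raised when a value falls outside the allowed range."""


def set_volume(level):
    # Raise instead of returning a sentinel such as -1.
    if not 0 <= level <= 100:
        raise ValueRangeException(f'volume must be between 0 and 100, got {level}')
    return level


try:
    set_volume(150)
except ValueRangeException as error:
    message = str(error)
    print(message)  # volume must be between 0 and 100, got 150
```

Unlike a returned -1, the exception cannot be silently mistaken for a valid result, and the message says what went wrong.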

@property

Getters and setters are rare in Python. A better way of writing