EngineeringFantasy

Python 3 May Become Relevant Now

Monday, 03 August 2015

Python 3 has been mocked for a pretty long time for not having any big-ticket features. For the first four minor releases of Python 3 (3.1 to 3.4), this did not change. Python 3.5, however, brings two features worth noting: type hints[1] and the async/await keywords.[2] The question is, how relevant do they make Python 3?

Type Hints

This PEP aims to provide a standard syntax for type annotations, opening up Python code to easier static analysis and refactoring, potential runtime type checking, and performance optimizations utilizing type information.

PEP 0484

Python now has a standard way to declare types, and with it comes static type checking through external tools. This does not mean that Python is going to magically become a statically typed language: the hints are not enforced at runtime. In other words, the type checker is merely a tool for a better development experience. For example, if your IDE or editor knows what type you're working with, it can provide better code completion. Some IDEs already provide rudimentary type checking.[3] However, this will standardize the way in which types are declared, and that raises the bar for all the Python tools at our disposal.

Think of this as the Python equivalent of Flow, which is a static type checker for JavaScript. Hints can be written inline as annotations or kept in separate stub files, a lot like DefinitelyTyped's TypeScript stubs.[4]
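As a rough illustration of what PEP 484 hints look like when written inline, here is a minimal sketch. The function name mean and its body are made up for this example; only the typing module and the annotation syntax come from the PEP.

```python
from typing import List


def mean(values: List[float]) -> float:
    """Return the arithmetic mean of a non-empty list (hypothetical example)."""
    return sum(values) / len(values)


# The hints are stored as plain metadata on the function object.
hints = mean.__annotations__

# Python itself does not enforce them: a list of ints runs without complaint.
result = mean([1, 2])
```

An external checker or an IDE reads the same annotations to flag a call such as mean("abc") before the code ever runs.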

Other than better tooling, this also means that type-induced runtime errors can be caught before they happen, which makes developing large code bases with Python a lot easier. You will hear no end of war stories about how a function that sometimes returned None deep in the source brought about a pyapocalypse. Type problems deep within your source code are by no means uncommon.

A friend of mine had to deal with this kind of problem once. The problem was with a function (buried deep in the source code as one of many decorators) that usually returned a list but, under certain circumstances, returned None. This wasn't noticed before because most of the time the result would be checked in a simple if statement.

In other words, bool(None) and bool([]) return the same thing, which is False. The problem surfaced because another branch still assumed that the returned value was a list and tried to append to it. As you might expect, the traceback wasn't pretty.
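The failure mode can be reproduced in a few lines. This is only a sketch of the kind of bug described, not the actual code involved; find_tags and its behaviour are invented for illustration.

```python
from typing import List, Optional


def find_tags(text: str) -> Optional[List[str]]:
    """Hypothetical helper: a list of hashtags, or None for empty input."""
    if not text:
        return None
    return [word[1:] for word in text.split() if word.startswith("#")]


empty_result = find_tags("")        # None
no_tags = find_tags("plain words")  # an empty list

# A simple truthiness check cannot tell the two apart.
both_falsy = (not empty_result) and (not no_tags)

# Appending works on the empty list but blows up on None.
no_tags.append("python")
caught = None
try:
    empty_result.append("python")
except AttributeError as exc:
    caught = exc
```

With the return type declared as Optional[List[str]], a checker such as mypy can point at the unguarded append without the code having to fail in production first.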

I'm not saying that the person who originally wrote the code is a bad programmer. I'll leave that up to you. What I am saying is that Python allows you to make such silly mistakes. This is especially problematic when you're working with really large code bases.

However, while this is all good in terms of development tools, it does not mean a large boost in performance. PyPy's FAQ talks about this issue and addresses some of the key concerns.[5] In a nutshell, performance will not increase simply because Python's types are objects and do not correspond directly to machine-level representations. For example, Python's int type isn't necessarily a 64-bit number; it can grow to whatever size you need.

... annotations are at the wrong level (e.g. a PEP 484 “int” corresponds to Python 3’s int type, which does not necessarily fits inside one machine word; even worse, an “int” annotation allows arbitrary int subclasses).

PyPy FAQ (Would type annotations help PyPy’s performance?)
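Both halves of that objection are easy to check from Python itself. The sketch below is illustrative only; UserId and double are hypothetical names.

```python
big = 2 ** 100
bits_needed = big.bit_length()  # far more than a 64-bit machine word holds


class UserId(int):
    """Hypothetical int subclass; it still satisfies an `int` annotation."""


def double(n: int) -> int:
    return n * 2


# An int subclass is a perfectly valid argument for a parameter hinted as int.
doubled = double(UserId(21))
```

So an int hint tells a compiler neither how wide the value is nor which concrete class it will see, which is why the annotation alone cannot be turned into a fixed-width machine integer.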

Async/Await

This proposal makes coroutines a native Python language feature, and clearly separates them from generators. This removes generator/coroutine ambiguity, and makes it possible to reliably define coroutines without reliance on a specific library. This also enables linters and IDEs to improve static code analysis and refactoring.

PEP 0492

async and await are very good ideas. In fact, these two keywords have been an integral part of C# for quite some time now and make programming with tasks much easier. They make Python feel a lot more abstract, whereas turning generators into coroutines (even through the use of a decorator) and then using them like generators feels rather clumsy. In simple terms, async def defines a coroutine for you. For example, the following is a coroutine written in the older, generator-based style:

import asyncio
import datetime

@asyncio.coroutine
def display_date(loop):
    end_time = loop.time() + 5.0
    while True:
        print(datetime.datetime.now())
        if (loop.time() + 1.0) >= end_time:
            break
        yield from asyncio.sleep(1)

loop = asyncio.get_event_loop()
# Blocking call which returns when the display_date() coroutine is done
loop.run_until_complete(display_date(loop))
loop.close()

The above will turn into the following under PEP-492:

import asyncio
import datetime

async def display_date(loop):
    end_time = loop.time() + 5.0
    while True:
        print(datetime.datetime.now())
        if (loop.time() + 1.0) >= end_time:
            break
        await asyncio.sleep(1)

loop = asyncio.get_event_loop()
# Blocking call which returns when the display_date() coroutine is done
loop.run_until_complete(display_date(loop))
loop.close()

However, what is being introduced is largely syntactic sugar, meaning that it does not change Python on a fundamental level. For example, introducing these native constructs will not magically make Python able to compete with Node.js (V8).
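One way to see how thin the new layer is: a native coroutine can still be driven by hand with the same send() protocol that generators use. This is a minimal sketch for illustration (answer is a made-up coroutine); real code would hand the coroutine to an event loop instead.

```python
async def answer():
    """Hypothetical coroutine that finishes without awaiting anything."""
    return 42


coro = answer()  # calling it only creates a coroutine object; nothing runs yet

result = None
try:
    coro.send(None)  # drive it manually, exactly as one would a generator
except StopIteration as stop:
    result = stop.value  # the return value travels on StopIteration
```

The machinery underneath is the familiar one; what the keywords add is a clearer way to spell it.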

Summary

The new features being added are subtle but will go a long way toward making Python 3 the abstract language that it needs to be, as well as providing the tooling required to make it a potential choice for huge code bases. However, this does not mean that Python has become a better choice than other languages, such as Go, for high-performance applications. So, although these new features will definitely persuade people to move from Python 2 to Python 3 (at least for new projects), it does not necessarily mean that people will stop leaving Python in search of abstract yet high-performance alternatives.


[1] PEP-0484. The proposal is based on the mypy type checker created by one of the authors of PEP-0484, Jukka Lehtosalo.
[2] PEP-0492. This is a pretty radical change since it means Python will get new keywords.
[3] PyCharm provides type checking through its skeletons. These are a lot like the stubs that PEP-0484 is proposing.
[4] DefinitelyTyped is a collection of stub files for a language called TypeScript, which is basically a typed superset of JavaScript. These stub files help IDEs and editors provide better code completion.
[5] PyPy's FAQ talks about this issue, and its lack of effect on performance, under "Would type annotations help PyPy’s performance?". Look under the "PEP-484 Type Hints" subsection.