What I don’t like about Python

At work I’ve had the pleasure of programming a preprocessor for our code base. It just takes an XML specification of structures and generates Lua accessors, serializers, etc. I did it in Python, mostly because our developers already have to have Python installed to use the test suite. This has given me a chance to get more familiar with Python as a programming language for doing real things (previously I knew enough to make minor modifications to code; basic block syntax, variable model).

First, I love the “usual” syntax. That is, as long as you’re doing fairly straightforward things, the syntax is beautiful, readable, and clean. I don’t mind the common complaint that variable declaration is implicit with assignment (I also don’t mind having a short declarator, either; as long as it catches name typos, I’m happy). I was happy to learn that Python had real closures via nested function definitions. Previously I believed that it did not support closures, which is why I did not port Glop to Python.
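
As a small illustration of what a closure via a nested function definition looks like, here is a minimal sketch. The names make_adder and add are made up for this example and do not come from the preprocessor.

```python
def make_adder(n):
    # add is a nested function; n is a free variable it captures
    # from the enclosing call to make_adder.
    def add(x):
        return x + n
    return add


# Each call to make_adder produces an independent closure.
add_five = make_adder(5)
add_ten = make_adder(10)
```

Calling add_five(10) looks up n in the environment captured when make_adder(5) ran, so the two closures do not interfere with each other.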

Where I think Python excels, and where other languages are mediocre, is in communicating to the compiler and the user simultaneously, and clearly, what is going on. The language design essentially enforces that through statement-only assignment, significant indentation, and an overall preference for function names over syntactic constructs; it’s difficult to write code that looks like it’s doing something different from what it actually does.

However, that strength of Python is also a weakness. After enough abstraction and indirection, you want to communicate different things to the computer and to the user. If you communicate to the user what you communicate to the computer, you will just confuse the heck out of the user. At this level, I want a language which allows me to communicate precisely to the compiler, while communicating a good way to think about the code to the user. I would never be able to write a Language::AttributeGrammar for Python, simply because the level of abstraction in that module is too high. It takes a lot of my brainpower to think about its control flow path (driven by thunk evaluation), and if I were forced to communicate that path to the user, my module would be useless.

I think one of the biggest reasons why it is not possible to express code at this level is the lack of a multi-statement lambda. It might seem irrelevant whether lambda is single-statement or multi-statement, since you can get the equivalent of a multi-statement lambda by using a nested function and naming the code block. However, that scatters the control flow around when it should be linear (or when it should look linear even though it actually isn’t). It also forces the user to think about functions and the concept of passing functions (which many programmers are not comfortable with), when it could just look like another block construct.

For example, I wrote a function which atomizes output; that is, it performs a whole series of output operations in sequence, and if an exception is thrown at any time during the process, nothing is output. This was important because I was using exceptions to signal when an accessor couldn’t be generated, and I didn’t want my program to write the accessor declaration and opening brace if I couldn’t put something inside.

My use case code could look like this (using Ruby-style parenless block passing):

atomicio(handle) lambda handle:
    handle.write("foo")
    if blah:
        raise Exception, "baz"
    handle.write("bar")

The idea is that if blah ended up being true, then neither "foo" nor "bar" would be written. This abstraction has been important for the clarity of my script. Instead, it looks like this:

def writeFooBar(handle):
    handle.write("foo")
    if blah:
        raise Exception, "baz"
    handle.write("bar")
atomicio(handle, writeFooBar)

Okay, that's not that bad, but your eyes jump around a bit more as you're reading this. First you skip past the def because it's a def, then you read the atomicio line and notice that it's passing the writeFooBar function, so you scan for the function and find it above you. It doesn't read like a book, it reads like an academic paper :-).
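
For readers wondering what sits behind atomicio, here is one way such a function could be written. This is only a sketch, not the code from my script: it assumes Python 3, it assumes the handle only needs a write method, and it buffers everything in an in-memory io.StringIO. The demo function and its blah argument are made up for illustration.

```python
import io


def atomicio(handle, body):
    """Run body(buffer); copy the buffer to handle only if body succeeds."""
    buffer = io.StringIO()   # collects the output in memory
    body(buffer)             # an exception here propagates before any write
    handle.write(buffer.getvalue())


def demo(blah):
    out = io.StringIO()      # stands in for a real file handle

    def write_foo_bar(handle):
        handle.write("foo")
        if blah:
            raise Exception("baz")
        handle.write("bar")

    try:
        atomicio(out, write_foo_bar)
    except Exception:
        pass                 # the failed block left nothing in out
    return out.getvalue()
```

With blah false, out ends up holding "foobar"; with blah true, the exception escapes before the final write, so out stays empty.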

It gets worse when you want composability, though. Say I wanted to do atomic I/O on two handles at once:

atomicio(fh1) lambda fh1:
    atomicio(fh2) lambda fh2:
        ...

As opposed to:

def writeFooBar(fh1):
    def writeFooBarHelper(fh2):
        ...
    atomicio(fh2, writeFooBarHelper)
atomicio(fh1, writeFooBar)

Now do you see the difference? The structure of the first example is pretty obvious, but the second example is significantly more awkward. writeFooBar logically takes two arguments, but that clarity has been lost by the explicit currying. Additionally, the order ends up appearing backwards in the source file (fh2 before fh1).

Of course, I don't think this is something that should be fixed in Python. If Python decided to abandon its very explicit vision of being clear about what you're telling the compiler, it would fail at being a more expressive language. The languages that really excel at this kind of thing are Perl, because you can play syntactic tricks on the programmer's eyes, and Haskell, because the entire language is designed around functional abstraction (a Monad, for example, is just an extension of the idea I presented here). Python would have to change too much to compete with them, and it would almost certainly lose every value it has that puts it in front of other languages.


15 thoughts on “What I don’t like about Python”

  1. I tend to agree. I think Guido has said that multi-statement anonymous functions are syntactically impossible, but I more or less buy your syntax for them. Encapsulating the block statements within a new indentation level looks parseable to me. I do think this proposed syntax would run into problems when multiple anonymous functions are passed, though.

  2. Luke, it’s probably not ideal, but as far as your example is concerned you should be able to use with for that purpose in Python 2.5, e.g.

    with atomicio as handle:
        handle.write("foo")
        if blah:
            raise Exception, "baz"
        handle.write("bar")

    And multiple handles, of course, work pretty much the same way:

    with atomicio as fh1:
        with atomicio as fh2:
            #code

    Not as flexible as true lambdas or blocks, or Lisp macros, but functional enough for your precise example, I think.

  3. Even in Python 2.4, you can use a decorator:
    @atomicio
    def writeFooBar(fh1):
        @atomicio
        def writeFooBarHelper(fh2):
            ...
        writeFooBarHelper(fh2)
    writeFooBar(fh1)

  4. I agree with previous commenters that “with” seems like the right tool for this situation. I’ll try and help out by getting formatting to work.

    with atomicio(handle) as handle:
        handle.write("foo")
        if blah:
            raise Exception, "baz"
        handle.write("bar")

  5. You can go one better with context managers: you can use contextlib.nested to clump together a series of context managers. It’s quite nice.

    Note: in Python 2.5, use “from __future__ import with_statement”.

  6. >I was happy to learn that Python had real closures via nested function definitions. Previously I believed that it did not support closures, which is why I did not port Glop to Python.

    Not exactly. I’m not an expert on the terminology so I’m not even going to try, but there are a couple of things to worry about when using closures in Python. 1) You can’t assign to free variables. 2) Free variables must be in certain positions. I’m not sure how to word this last one, but you can’t define the variables locally and then close over them. I believe you can only correctly close over arguments to a function. (I’m a little sketchy on some of this, so don’t take my word for it.)

    Some examples:

    >>> funcs = [(lambda: i**2) for i in range(10)]
    >>> [f() for f in funcs]
    [81, 81, 81, 81, 81, 81, 81, 81, 81, 81]
    >>> def bank_account(amount):
    ...     def deposit(n):
    ...         amount += n
    ...     return deposit
    ...
    >>> my_account = bank_account(5)
    >>> my_account(15)
    ---------------------------------------------------------------------------
    UnboundLocalError Traceback (most recent call last)
    /home/iclark/ in ()
    /home/iclark/ in deposit(n)
    UnboundLocalError: local variable 'amount' referenced before assignment

  7. My main disappointment with Python is that it is a dynamic language where types are objects and runtime instantiable but I’ve yet to see any guide to how this feature can be really utilized. The object system’s internal behavior is relatively poorly explained; e.g. if you look hard enough you’ll find that the rules for attribute lookup depend on data descriptors vs. non-data descriptors but not, say, why the rules depend on this or how this dependency can or should be used. Letting your code construct classes for you is pretty powerful.

    The lack of good guides is, I suppose, something of an opportunity as well.

  8. I agree that you should look into decorators. The pattern in the second code snippet is very similar to what a decorator does:

    def F():
        pass
    F = G(F)

    is equivalent to

    @G
    def F():
        pass

    The function G returns a function F’ that takes the same parameters as F and does whatever it likes before and after calling F – if it even calls F at all. So that could include exception handling.

    def G(func):
        def funcwrapper(h):
            ...
            func(h)
            ...
        return funcwrapper

  9. Ian: here is how closures work when you upgrade to Python 3 (presuming that’s feasible for you yet)

    def bank_account(amount):
        def deposit(n):
            nonlocal amount
            amount += n
        def balance():
            nonlocal amount
            return amount
        return deposit, balance

    deposit, balance = bank_account(5)
    deposit(15)
    print(balance()) # => 20

  10. this blog comment should support indent :(

    anyway, i think this all seems to be a little bit complicated… is it really the case that lambda simplifies stuff?

  11. I started coding in Python for my personal website, but soon I had to stop the work and rethink my language choice, for one simple reason.

    It’s nothing but Python’s indentation enforcement. I couldn’t stop wondering why such a modern language enforces indentation. I can foresee the danger if my codebase is in Python: I can imagine the bugs that are going to be injected, and how many programmer hours are going to be wasted for just this reason.

    Let me give you a scenario I have faced.

    a = ['cat', 'window', 'defenestrate']
    for x in a:
        print x, len(x)
        a.remove(a)
        if len(a) < 1:
            a.append("cat")
            a.append("two")
            a.append("three")
            a.append("four")
            a.append("five")
            a.append("six")

    The above code is pretty simple. However, assume that while editing, a programmer mistakenly deletes a character (a tab), as below:

    a = ['cat', 'window', 'defenestrate']
    for x in a:
        print x, len(x)
        a.remove(a)
        if len(a) < 1:
            a.append("one")
            a.append("two")
            a.append("three")
            a.append("four")
            a.append("five")
        a.append("six")

    In other modern languages this is not a problem at all; they are going to produce the same output. In this case Python is neither going to complain nor going to produce the expected output. In fact the output may be very scary, because the last statement of the if block got shifted into the for loop.

    Scenarios like the above lead to the fact that "there is no automated tool to indent Python code".

    That means the programmer has to spend his time keeping the code indented. That will be an overhead when I am working in a big source file that needs scrolling. I guess you understand what I mean; in fact you might have experienced the same.

    For example, I have this C code:

    for(i=0; i<10;i++)
    {
        if(a[i]>500)
        {
            //Some biggggggggggggggggggggggggggggg logic of multiple lines
            //Some biggggggggggggggggggggggggggggg logic of multiple lines
            //Some biggggggggggggggggggggggggggggg logic of multiple lines
            //Some biggggggggggggggggggggggggggggg logic of multiple lines
            //Some biggggggggggggggggggggggggggggg logic of multiple lines
            //Some biggggggggggggggggggggggggggggg logic of multiple lines
        }
    }

    Later my boss asks me to relax/remove the if(a[i]>500) condition and make the logic the default….

    If I am using C I can simply comment out the if(a[i]>500) line, whereas Python will surely complain, because the big logic is doubly indented; or, worse, it will append the whole logic to a previous if(a[i]<500) block, which is a HORROR.

    So I have to shift the whole three pages of source code one tab to the left. Using editors like vi, such a task is cumbersome. As I mentioned earlier, Python indentation cannot be automated, so I have to indent those three pages of code manually. There may be a lot of wrapped code and nested code in there, which is going to be tedious. So just removing an if block is tedious. In C, however, I have editors to do that, and even without indentation the language will produce correct output.

    So my primary worry is this: this is the era of computers, and we make computers to do or ease our jobs, but sadly Python pushes us into a situation where the computer asks us to do the indentation, which it could do by itself.

  12. @Nehemiah, I’m sorry, your arguments make no sense. I think the most efficient way to quantify how many times this complaint has been made and discussed is to use the Ackermann function. So I’ll hold my tongue, and just say the only useful thing I would have said anyhow.

    The “cumbersome” operation you describe of changing the indentation of a whole bunch of lines is something I do many times per day, whether or not I am working in an indentation-sensitive syntax (I keep my code properly indented in all languages). In vim, go to the top line of the block, type capital V (visual line mode), go to the bottom line of the block, type <. If you do this between the deletions of the if statement and the closing brace, then this adds only two keystrokes to the whole operation.

    Why so much suffering over such small problems?

  13. Pal, I do get your Vim hack. However, imagine the time needed to do it: 'dd' to delete the if line, press V, scroll down manually (there is no block matching like '%' on a '{'), and after reaching the bottom press '<'. The cost is not in the keystrokes but in the time to scroll and the effort of the eyes. I also need to double-check by eye whether I have done it right, which counts as one more scroll; if I don't check, I may inject a bug. So Python makes me spend more time editing rather than programming, and the swifter I want to be, the more bugs I will inject (I am talking about a large codebase)! Yep, Python indentation cannot be automated, because it is not a block-structured language.

    In editors for other languages, like NetBeans, Eclipse etc., just one keystroke, Alt+Shift+F, will do the job. IMO indentation is the computer's job, not ours.
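
Several comments above suggest the with statement for the atomic output example. A minimal sketch of that idea follows, under some loud assumptions: it targets Python 3, this atomicio is a hypothetical context manager built with contextlib.contextmanager rather than the function from the post, and the body writes to an in-memory buffer instead of the real handle. It uses the multiple-manager form of with rather than contextlib.nested, which is not available in current Python 3. The demo function and its blah argument are made up for illustration.

```python
import contextlib
import io


@contextlib.contextmanager
def atomicio(handle):
    buffer = io.StringIO()
    yield buffer             # the with-body writes to the buffer
    # Reached only if the with-body finished without raising.
    handle.write(buffer.getvalue())


def demo(blah):
    out1, out2 = io.StringIO(), io.StringIO()  # stand-ins for real files
    try:
        with atomicio(out1) as fh1, atomicio(out2) as fh2:
            fh1.write("foo")
            fh2.write("spam")
            if blah:
                raise Exception("baz")
            fh1.write("bar")
    except Exception:
        pass                 # neither handle received any output
    return out1.getvalue(), out2.getvalue()
```

Both handles appear on one line and in source order, which addresses the composability complaint in the post, although the body is still limited to what a with block can express.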
