
“Least Astonishment” and the Mutable Default Argument

3158

Anyone tinkering with Python long enough has been bitten (or torn to pieces) by the following issue:

def foo(a=[]):
    a.append(5)
    return a

Python novices would expect this function called with no parameter to always return a list with only one element: [5]. The result is instead very different, and very astonishing (for a novice):

>>> foo()
[5]
>>> foo()
[5, 5]
>>> foo()
[5, 5, 5]
>>> foo()
[5, 5, 5, 5]
>>> foo()
[5, 5, 5, 5, 5]

A manager of mine once had his first encounter with this feature and called it “a dramatic design flaw” of the language. I replied that the behavior has an underlying explanation, and that it is indeed very puzzling and unexpected if you don’t understand the internals. However, I was not able to answer (to myself) the following question: what is the reason for binding the default argument at function definition, and not at function execution? I doubt the experienced behavior has a practical use (who really used static variables in C without breeding bugs?).

Edit:

Baczek made an interesting example. Together with most of your comments and Utaal’s in particular, I elaborated further:

>>> def a():
...     print("a executed")
...     return []
... 
>>>
>>> def b(x=a()):
...     x.append(5)
...     print(x)
... 
a executed
>>> b()
[5]
>>> b()
[5, 5]

To me, it seems that the design decision was about where to put the scope of parameters: inside the function, or “together” with it?

Doing the binding inside the function would mean that x is effectively bound to the specified default when the function is called, not defined, something that would present a deep flaw: the def line would be “hybrid” in the sense that part of the binding (of the function object) would happen at definition, and part (assignment of default parameters) at function invocation time.

The actual behavior is more consistent: everything of that line gets evaluated when that line is executed, meaning at function definition.
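One way to check that the default expression runs whenever the def statement itself runs (and not on each call) is to execute the same def more than once. This is only a minimal sketch; make_appender and append_five are hypothetical names introduced for illustration:

```python
def make_appender():
    # Each call to make_appender() executes the inner def statement again,
    # so the default expression [] is evaluated again and yields a new list.
    def append_five(x=[]):
        x.append(5)
        return x
    return append_five

first = make_appender()
second = make_appender()

print(first())   # [5]
print(first())   # [5, 5]  (same default list as the previous call)
print(second())  # [5]     (a different function object with its own default)
```

Each function object keeps the list that was created when its own def line ran, which matches the "a executed" output appearing once, at definition time, in the session above.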

21

  • 69

    Complementary question – Good uses for mutable default arguments

    Feb 6, 2012 at 20:54


  • 7

    I have no doubt mutable arguments violate the least astonishment principle for an average person, and I have seen beginners stepping there, then heroically replacing mailing lists with mailing tuples. Nevertheless, mutable arguments are still in line with the Zen of Python (PEP 20) and fall under the “obvious for Dutch” (understood/exploited by hard-core Python programmers) clause. The recommended workaround with a doc string is the best, yet resistance to doc strings and any (written) docs is not so uncommon nowadays. Personally, I would prefer a decorator (say @fixed_defaults).

    – Serge

    Apr 6, 2017 at 16:04

  • 5

    My argument when I come across this is: “Why do you need to create a function that returns a mutable that could optionally be a mutable you would pass to the function? Either it alters a mutable or creates a new one. Why do you need to do both with one function? And why should the interpreter be rewritten to allow you to do that without adding three lines to your code?” Because we are talking about rewriting the way the interpreter handles function definitions and invocations here. That’s a lot to do for a barely necessary use case.

    Jun 1, 2017 at 21:22


  • 26

    “Python novices would expect this function to always return a list with only one element: [5].” I’m a Python novice, and I wouldn’t expect this, because obviously foo([1]) will return [1, 5], not [5]. What you meant to say is that a novice would expect the function, called with no parameter, to always return [5].

    Jul 6, 2017 at 16:08


  • 6

    This question asks “Why did this [the wrong way] get implemented so?” It doesn’t ask “What’s the right way?”, which is covered by “Why does using arg=None fix Python’s mutable default argument issue?” (stackoverflow.com/questions/10676729/…). New users are almost always less interested in the former and much more in the latter, so that’s sometimes a very useful link/dupe to cite.

    – smci

    Apr 21, 2019 at 8:48


1831

Actually, this is not a design flaw, and it is not because of internals or performance. It comes simply from the fact that functions in Python are first-class objects, and not only a piece of code.

As soon as you think of it this way, then it completely makes sense: a function is an object being evaluated on its definition; default parameters are kind of “member data” and therefore their state may change from one call to the other – exactly as in any other object.

In any case, the effbot (Fredrik Lundh) has a very nice explanation of the reasons for this behavior in Default Parameter Values in Python. I found it very clear, and I really suggest reading it for a better knowledge of how function objects work.
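The “member data” view can be checked directly, since a function object exposes its positional defaults through the `__defaults__` attribute. A small sketch reusing the foo function from the question:

```python
def foo(a=[]):
    a.append(5)
    return a

# The default list is stored on the function object itself.
print(foo.__defaults__)   # ([],)

foo()
foo()
print(foo.__defaults__)   # ([5, 5],)

# The list returned by a no-argument call is that very same object.
print(foo() is foo.__defaults__[0])   # True
```

Seen this way, every call without an argument mutates state that belongs to the function object, much like a method mutating an attribute of its instance.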

19

  • 91

    To anyone reading the above answer, I strongly recommend you take the time to read through the linked Effbot article. As well as all the other useful info, the part on how this language feature can be used for result caching/memoisation is very handy to know!

    Oct 14, 2011 at 0:05

  • 135

    Even if it’s a first-class object, one might still envision a design where the code for each default value is stored along with the object and re-evaluated each time the function is called. I’m not saying that would be better, just that functions being first-class objects does not fully preclude it.

    – gerrit

    Jan 11, 2013 at 10:55

  • 474

    Sorry, but anything considered “The biggest WTF in Python” is most definitely a design flaw. This is a source of bugs for everyone at some point, because no one expects that behavior at first – which means it should not have been designed that way to begin with. I don’t care what hoops they had to jump through, they should have designed Python so that default arguments are non-static.

    Jun 7, 2013 at 21:28


  • 283

    Whether or not it’s a design flaw, your answer seems to imply that this behaviour is somehow necessary, natural and obvious given that functions are first-class objects, and that simply isn’t the case. Python has closures. If you replace the default argument with an assignment on the first line of the function, it evaluates the expression each call (potentially using names declared in an enclosing scope). There is no reason at all that it wouldn’t be possible or reasonable to have default arguments evaluated each time the function is called in exactly the same way.

    Jan 8, 2014 at 22:16


  • 52

    The design doesn’t directly follow from the fact that functions are objects. In your paradigm, the proposal would be to implement functions’ default values as properties rather than attributes.

    – bukzor

    May 3, 2014 at 20:46

317

Suppose you have the following code:

fruits = ("apples", "bananas", "loganberries")

def eat(food=fruits):
    ...

When I see the declaration of eat, the least astonishing thing is to think that if the first parameter is not given, it will be equal to the tuple ("apples", "bananas", "loganberries").

However, suppose later on in the code, I do something like:

def some_random_function():
    global fruits
    fruits = ("blueberries", "mangos")

then if default parameters were bound at function execution rather than function declaration, I would be astonished (in a very bad way) to discover that fruits had been changed. This would be more astonishing IMO than discovering that your foo function above was mutating the list.
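To make the two alternatives concrete, here is a runnable sketch of the snippets above, meant to be run at module level. The body of eat and the eat_late function are assumptions added for illustration: eat_late emulates what call-time binding would do by looking up the global on every call.

```python
fruits = ("apples", "bananas", "loganberries")

def eat(food=fruits):
    # Actual Python semantics: the default was captured when def ran.
    return food

def eat_late(food=None):
    # Hypothetical emulation of call-time binding: re-read the global each call.
    if food is None:
        food = fruits
    return food

def some_random_function():
    global fruits
    fruits = ("blueberries", "mangos")

some_random_function()

print(eat())       # ('apples', 'bananas', 'loganberries')
print(eat_late())  # ('blueberries', 'mangos')
```

The first result is the behavior this answer considers least astonishing; the second is what a call-time rule would produce after the global is rebound.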

The real problem lies with mutable variables, and all languages have this problem to some extent. Here’s a question: suppose in Java I have the following code:

StringBuffer s = new StringBuffer("Hello World!");
Map<StringBuffer,Integer> counts = new HashMap<StringBuffer,Integer>();
counts.put(s, 5);
s.append("!!!!");
System.out.println( counts.get(s) );  // does this work?

Now, does my map use the value of the StringBuffer key when it was placed into the map, or does it store the key by reference? Either way, someone is astonished; either the person who tried to get the object out of the Map using a value identical to the one they put it in with, or the person who can’t seem to retrieve their object even though the key they’re using is literally the same object that was used to put it into the map (this is actually why Python doesn’t allow its mutable built-in data types to be used as dictionary keys).
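The Python side of that parenthetical can be sketched as follows: a list is rejected as a dictionary key with a TypeError, while an immutable tuple with the same contents is accepted. The counts name simply mirrors the Java snippet above.

```python
counts = {}

try:
    counts[["Hello", "World"]] = 5   # a list is mutable, hence unhashable
except TypeError as exc:
    print("rejected:", exc)          # the message mentions an unhashable type

# An immutable tuple with the same contents is accepted as a key.
counts[("Hello", "World")] = 5
print(counts[("Hello", "World")])    # 5
```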

Your example is a good one of a case where Python newcomers will be surprised and bitten. But I’d argue that if we “fixed” this, then that would only create a different situation where they’d be bitten instead, and that one would be even less intuitive. Moreover, this is always the case when dealing with mutable variables; you always run into cases where someone could intuitively expect one or the opposite behavior depending on what code they’re writing.

I personally like Python’s current approach: default function arguments are evaluated when the function is defined and that object is always the default. I suppose they could special-case using an empty list, but that kind of special casing would cause even more astonishment, not to mention be backwards incompatible.

16

  • 49

    I think it’s a matter of debate. You are acting on a global variable. Any evaluation performed anywhere in your code involving your global variable will now (correctly) refer to ("blueberries", "mangos"). The default parameter could just be like any other case.

    Jul 15, 2009 at 18:16

  • 68

    Actually, I don’t think I agree with your first example. I’m not sure I like the idea of modifying an initializer like that in the first place, but if I did, I’d expect it to behave exactly as you describe — changing the default value to ("blueberries", "mangos").

    – Ben Blank

    Jul 15, 2009 at 18:26

  • 14

    The default parameter is like any other case. What is unexpected is that the parameter is a global variable, and not a local one. Which in turn is because the code is executed at function definition, not call. Once you get that, and that the same goes for classes, it’s perfectly clear.

    Jul 15, 2009 at 18:59

  • 25

    I find the example misleading rather than brilliant. If some_random_function() appends to fruits instead of assigning to it, the behaviour of eat() will change. So much for the current wonderful design. If you use a default argument that’s referenced elsewhere and then modify the reference from outside the function, you are asking for trouble. The real WTF is when people define a fresh default argument (a list literal or a call to a constructor), and still get bit.

    – alexis

    Oct 9, 2014 at 15:37


  • 25

    You just explicitly declared global and reassigned the tuple – there is absolutely nothing surprising if eat works differently after that.

    Jan 26, 2015 at 16:07

290

The relevant part of the documentation:

Default parameter values are evaluated from left to right when the function definition is executed. This means that the expression is evaluated once, when the function is defined, and that the same “pre-computed” value is used for each call. This is especially important to understand when a default parameter is a mutable object, such as a list or a dictionary: if the function modifies the object (e.g. by appending an item to a list), the default value is in effect modified. This is generally not what was intended. A way around this is to use None as the default, and explicitly test for it in the body of the function, e.g.:

def whats_on_the_telly(penguin=None):
    if penguin is None:
        penguin = []
    penguin.append("property of the zoo")
    return penguin
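A short usage sketch of the documented workaround, repeating the function so the snippet runs on its own; the mine variable is a hypothetical name used only for this illustration:

```python
def whats_on_the_telly(penguin=None):
    if penguin is None:
        penguin = []   # a new list on every call that omits the argument
    penguin.append("property of the zoo")
    return penguin

print(whats_on_the_telly())   # ['property of the zoo']
print(whats_on_the_telly())   # ['property of the zoo'] again, not two items

# A list passed in explicitly is still modified in place.
mine = ["my own list"]
whats_on_the_telly(mine)
print(mine)                   # ['my own list', 'property of the zoo']
```

Note that the workaround only changes what happens when the argument is omitted; a caller who passes their own list still sees it mutated.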

9

  • 256

    The phrases “this is not generally what was intended” and “a way around this is” smell like they’re documenting a design flaw.

    – bukzor

    May 3, 2014 at 20:53

  • 14

    @bukzor: Pitfalls need to be noted and documented, which is why this question is good and has received so many upvotes. At the same time, pitfalls don’t necessarily need to be removed. How many Python beginners have passed a list to a function that modified it, and were shocked to see the changes show up in the original variable? Yet mutable object types are wonderful, when you understand how to use them. I guess it just boils down to opinion on this particular pitfall.

    – Matthew

    Jun 19, 2014 at 17:54

  • 41

    The phrase “this is not generally what was intended” means “not what the programmer actually wanted to happen,” not “not what Python is supposed to do.”

    – holdenweb

    Dec 19, 2014 at 11:48

  • 17

    @holdenweb Wow, I’m mega-late to the party. Given the context, bukzor is completely right: they’re documenting behavior/consequence that was not “intended” when they decided the language should exec the function’s definition. Since it’s an unintended consequence of their design choice, it’s a design flaw. If it were not a design flaw, there’d be no need to even offer “a way around this”.

    Oct 3, 2017 at 7:35

  • 8

    We could take it to chat and discuss how else it could be, but the semantics have been thoroughly debated and nobody could come up with a sensible mechanism for create-default-value-on-call. One serious issue is that the scope on call is often entirely different from that on definition, making name resolution uncertain if defaults were evaluated at call time. A “way around” means “you can achieve your desired end in the following way,” not “this is a mistake in Python’s design.”

    – holdenweb

    Oct 3, 2017 at 16:03