Suppose we need to take a value and change it inside a function.

Way-1

def change_b(obj):
    obj['b'] = 4


result = {'a': 1, 'b': 2}
change_b(obj=result)
print(result)

As you can see, change_b() changes the value of result['b'] directly inside the function.

Way-2

from copy import deepcopy


def change_b(obj):
    temp = deepcopy(obj)
    temp['b'] = 4
    return temp


result = {'a': 1, 'b': 2}
result = change_b(obj=result)
print(result)

Way-2, on the other hand, copies the object and changes the value on the copy.

So the original object is not affected (no side effects).

Way-2 may be safer because it doesn't change the original object.

I wonder which one is the more general and Pythonic way.

Thanks.

4
  • Why do you do a deep copy instead of a shallow copy? Commented Oct 24, 2019 at 3:54
  • @wjandrea It's just an example. It doesn't matter whether it's a deep copy or a shallow copy. Commented Oct 24, 2019 at 3:59
  • Some functions in pandas work both ways: they create new data, or they update the existing data if you pass inplace=True. That is a nice approach. Commented Oct 24, 2019 at 4:28
  • Both ways are OK. The worst case is when a function takes no arguments but uses global to change data. Commented Oct 24, 2019 at 4:29

3 Answers

4

Summary

If the API is explicit that it is updating its input, Way-1 is fine and desirable: add_route(route_map, new_route).

If the API is primarily about doing something else, then Way-2 avoids unintended side-effects.
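To make the two cases in the summary concrete, here is a minimal sketch. The body of add_route is an assumption (the answer only names the call), and with_route is a hypothetical non-mutating counterpart invented for illustration.

```python
def add_route(route_map, new_route):
    """Hypothetical in-place API: the name says it updates route_map."""
    name, path = new_route
    route_map[name] = path  # mutates the caller's dict, returns None


def with_route(route_map, new_route):
    """Hypothetical non-mutating counterpart: returns an updated copy."""
    name, path = new_route
    updated = dict(route_map)  # a shallow copy is enough for a flat dict
    updated[name] = path
    return updated


routes = {'home': '/'}
add_route(routes, ('about', '/about'))
print(routes)            # {'home': '/', 'about': '/about'}

extended = with_route(routes, ('help', '/help'))
print('help' in routes)  # False
print(extended['help'])  # /help
```

The names carry the contract here: add_route reads as a command on route_map, while with_route reads as an expression that produces a new value.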

Examples within Python

Way-1: The dict.update() and list.sort() methods do in-place updates because that is their primary job.

Way-2: The builtin sorted() function produces a new sorted list from its inputs which it takes care not to alter. Roughly, it does this:

def sorted(iterable, *, key=None, reverse=False):
    result = list(iterable)                # copy the data
    result.sort(key=key, reverse=reverse)  # in-place sort
    return result
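A quick check of that difference, using only the builtins mentioned above (the variable names are arbitrary):

```python
data = [3, 1, 2]

ordered = sorted(data)   # builds a new list; data is left alone
print(ordered)           # [1, 2, 3]
print(data)              # [3, 1, 2]

outcome = data.sort()    # sorts data in place
print(outcome)           # None
print(data)              # [1, 2, 3]
```

list.sort() returning None is deliberate: it signals that the list itself was changed rather than a new one being produced.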

Hope that clarifies when to copy and when to mutate in-place :-)


2 Comments

In other words, there is no general way because it depends on the situation?
@Hide: Yes, that's true. However, I think of it as a single rule, "APIs should focus on their primary task and not have incidental side-effects".
4

"Explicit is better than implicit"
...
"In the face of ambiguity, refuse the temptation to guess."

- PEP 20


Modifying a parameter within a function is not necessarily a bad thing. What is bad is to do so with no good reason for doing so. If you're clear with your function name and documentation that the parameter will be modified within the function, then that's fine. If the function modifies the parameter with no indication it's trying to do so, that's less fine.

In this case, your Way-1 is simpler and more explicit. It's obvious that the variable is going to be changed, and the way in which it's going to be changed can be determined easily by looking at the code.

Way-2 is worse, because the name change_b implies that the parameter is going to be modified, and it's not. Returning a modified version of a parameter without modifying the original is a standard design pattern in Python, but it's best to be explicit about it.

For example, Python's built-in set data structure has counterpart methods: set.difference(other) and set.difference_update(other). Both compute the same thing: the difference between this set and the given set. In the former case, that result is returned without modifying the original set. In the latter case, the original set is modified and nothing is returned. It's very straightforward to figure out which does what.
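A short illustration of the two set methods side by side (the sets a and b are made-up sample data):

```python
a = {1, 2, 3, 4}
b = {3, 4, 5}

diff = a.difference(b)        # returns a new set
print(diff == {1, 2})         # True
print(a == {1, 2, 3, 4})      # True: a is untouched

ret = a.difference_update(b)  # modifies a in place
print(ret)                    # None
print(a == {1, 2})            # True
```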

In general, you should probably avoid updating a value and returning that same value, because that's more ambiguous. Note how most Python methods do one or the other, but not both (and those that do both, like list.pop(), do so sensibly, with the returned object not being the object that was modified).
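As a sketch of that ambiguity: change_b_and_return below is a hypothetical variant of the question's function that both mutates and returns its argument, shown next to list.pop() for contrast.

```python
def change_b_and_return(obj):
    """Hypothetical: mutates obj AND returns it, which invites confusion."""
    obj['b'] = 4
    return obj


original = {'a': 1, 'b': 2}
changed = change_b_and_return(original)
print(changed is original)  # True: the "new" value is just an alias
print(original)             # {'a': 1, 'b': 4}

stack = ['a', 'b', 'c']
last = stack.pop()          # mutates the list, returns the removed item
print(last)                 # c
print(stack)                # ['a', 'b']
```

A caller who reads `changed = change_b_and_return(original)` could reasonably assume original is still intact, which is exactly the guess the answer warns against.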


1

As I understand Python, the most Pythonic way to approach this problem is to make it very clear what is happening. As long as you do that, I don't believe it matters.

my_dict = {'a': 3, 'b': 4}
double_values_in_dict(my_dict)

# Some other code

This is a contrived example, but it's pretty clear what's intended to happen here, even without the method definition included. What would be unclear is if you assigned the return value of double_values_in_dict to a new variable; I wouldn't know then what you may have done to the original dict object, and I'd have to start digging through that method to figure out how it actually works.
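For completeness, one possible definition of double_values_in_dict. The answer does not show its body, so this is an assumed in-place implementation that matches the call above.

```python
def double_values_in_dict(d):
    """Assumed implementation: double every value of d in place, return None."""
    for key in d:
        d[key] *= 2  # reassigning existing keys is safe while iterating


my_dict = {'a': 3, 'b': 4}
double_values_in_dict(my_dict)
print(my_dict)  # {'a': 6, 'b': 8}
```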

