Pythonic ways to avoid Pointers - python

Assume we have an object a and we want to modify data that is structured like this
a.substructure1.subsubstructure1.name_of_the_data1
and this
a.substructure2.subsubstructure2.name_of_the_data2
To access this structure we call an external method get_the_data_shortcut(a), which is heavily parameterized (for example, the parameter substructure specifies which substructure to return). This seems very redundant, but there are very good default settings for all these parameters, which makes sense. Also, this function will return another branch of data if the default branch is not available.
How do I modify get_the_data_shortcut(a) ?
b = get_the_data_shortcut(a)
b = b + 1
Then, get_the_data_shortcut(a) is unchanged because, well, Python is not Java.
Do I need a setter? Most of this is not my code; it was written by people who write Pythonic code, and I am trying to keep up with those standards.

As you discovered, rebinding b to a new value won't modify the a object (or its substructures). If you want to do this, you will need a method similar to your get_the_data_shortcut(a), namely a
set_the_data_shortcut(a, newvalue)
Alternatively, you could have a method that returns the substructure the value is stored in, and manipulate that:
# returns a.substructure2.subsubstructure2
# or a.substructure1.subsubstructure1 based on the value of kind
substruct = get_the_substructure(a, kind)
substruct.name_of_the_data1 += 1
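Both options above can be sketched as follows. This is only an illustration: the object a is faked with types.SimpleNamespace, and the bodies of get_the_substructure and set_the_data_shortcut are assumptions, not the real code from the question.

```python
from types import SimpleNamespace

# Hypothetical stand-in for the object `a` from the question.
a = SimpleNamespace(
    substructure1=SimpleNamespace(
        subsubstructure1=SimpleNamespace(name_of_the_data1=10)
    ),
    substructure2=SimpleNamespace(
        subsubstructure2=SimpleNamespace(name_of_the_data2=20)
    ),
)


def get_the_substructure(obj, kind=1):
    """Return the container that holds the value, not the value itself."""
    if kind == 1:
        return obj.substructure1.subsubstructure1
    return obj.substructure2.subsubstructure2


def set_the_data_shortcut(obj, new_value):
    """Assumed setter counterpart: writes through to the nested attribute."""
    get_the_substructure(obj, kind=1).name_of_the_data1 = new_value


substruct = get_the_substructure(a, kind=1)
substruct.name_of_the_data1 += 1   # changes the attribute on the shared container
after_increment = a.substructure1.subsubstructure1.name_of_the_data1

set_the_data_shortcut(a, 42)
after_setter = a.substructure1.subsubstructure1.name_of_the_data1

print(after_increment, after_setter)   # 11 42
```

The increment works here because the assignment targets an attribute of an object that a also references, rather than a bare local name.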

Python uses reference types, just like Java.
However, when you do
b = b + 1
you are not updating the object you have. Instead, you are creating a new object and assigning it to the variable b.
If you want to update the value of b in the data structure, you should follow your suggestion and write a setter for the data structure.
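A minimal sketch of that difference, assuming a made-up Holder class in place of one level of the nested structure:

```python
class Holder:
    """Hypothetical container standing in for one level of the nested data."""

    def __init__(self, value):
        self.value = value


holder = Holder(1)

b = holder.value   # b and holder.value now name the same int object
b = b + 1          # builds a new int and rebinds only the name b

unchanged = holder.value   # the structure still holds the old value
holder.value = b           # an explicit write is what updates the structure

print(unchanged, holder.value)   # 1 2
```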

Related

Are numbers considered objects in python?

I am aware that numeric values are immutable in python. I have also read how everything is an object in python. I just want to know if numeric types are also objects in python. Because if they are objects, then the variables are actually reference variables right? Does it mean that if I pass a number to a function and modify it inside a function, then two number objects with two references are created? Is there a concept of primitive data types in python?
Note: I too was thinking of them as objects. But visualizing in Python Tutor says different:
http://www.pythontutor.com/visualize.html#mode=edit
def test(a):
    a+=10
b=100
test(b)
Or is it a defect in the visualization tool?
Are numeric types objects?
>>> isinstance(1, object)
True
Apparently they are. :-).
Note that you might need to adjust your mental model of an object a little. It seems to me that you're thinking of object as something that is "mutable" -- that isn't the case. In reality, we need to think of python names as a reference to an object. That object may hold references to other objects.
name = something
Here, the right hand side is evaluated -- All the names are resolved into objects and the result of the expression (an object) is referenced by "name".
Ok, now let's consider what happens when you pass something to a function.
def foo(x):
    x = 2
z = 3
foo(z)
print(z)
What do we expect to happen here? Well, first we create the function foo. Next, we create the object 3 and reference it by the name z. After that, we look up the value that z references and pass that value to foo. Upon entering foo, that value gets referenced by the (local) name x. We then create the object 2 and reference it by the local name x. Note, x has nothing to do with the global z -- they're independent references. Just because they were referencing the same object when you entered the function doesn't mean that they have to reference the same object for all time. We can change what a name references at any point by using an assignment statement.
Note, your example with += may seem to complicate things, but you can think of a += 10 as a = a + 10 if it helps in this context. For more information on += check out: When is "i += x" different from "i = i + x" in Python?
Everything in Python is an object, and that includes the numbers. There are no "primitive" types, only built-in types.
Numbers, however, are immutable. When you perform an operation with a number, you are creating a new number object.
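Here is a small sketch contrasting += on an immutable int with += on a mutable list. The function names rebind and mutate are made up for the example; id() is used only to show whether the name still refers to the same object.

```python
def rebind(x):
    before = id(x)
    x += 10              # for an int this acts like x = x + 10: a new object
    return before, id(x)


def mutate(items):
    before = id(items)
    items += [10]        # for a list, += extends the same object in place
    return before, id(items)


n = 100
int_before, int_after = rebind(n)

values = [1, 2]
list_before, list_after = mutate(values)

print(n)                            # 100
print(int_before == int_after)      # False
print(values)                       # [1, 2, 10]
print(list_before == list_after)    # True
```

The caller's n is untouched because the function only rebound its own local name, while the caller's list changed because the object itself was modified.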

How to assign the same value to multiple variables in Python

I have a question about Python, which I am kinda new to. Let's assume I want to assign a 5x5 matrix to 10 different variables. I searched across the board, and what I found was this:
a, b, c, d, e = myMatrix
That is all good, but in Python, this means that when I change a, I also change the values of the other variables, because they all come down to the same memory address, if I understand this correctly.
My question: Is there a fast way of assigning myMatrix to multiple variables and giving each of them a unique memory address, so that I can change myMatrix without changing a, b or c? I am explicitly looking for some kind of multi-assignment.
Thanks in advance!
Use the copy module:
>>> import copy
>>> new_matrix = copy.deepcopy(myMatrix)
As Burhan Khalid and juanchopanza have pointed out, what happens in your example will be different in, for example,
1. the case where "myMatrix" is actually an array of 5 values (in which case "a" will get the first value and "e" will get the last value), and
2. the case where "myMatrix" is an instance of an Object (in which case "a" through "e" will each refer to the same object).
It sounds like you're thinking of case 2, and hoping for something like a macro which will automatically expand your single assignment statement (with a single Right Hand Side Value, whether Deep Copied or not) into 5 assignment statements, each with its own Left Hand Side, Right Hand Side, and Deep Copy.
I don't know of any way to do this, and I would point out that:
When most OO languages encounter an assignment operation like yours with an Object on the Right Hand Side, the compiler/interpreter looks for a "copy constructor" for the class of the RHS Object, and uses it (if found) to generate the value (an Object reference) which is actually assigned to the LHS. Can you even imagine what the syntax could look like for what you're describing, where the copy constructor is supposed to be called 5 times to yield 5 different Objects on the RHS, references to which are then assigned to five different variables on the LHS? What could you possibly write in a single assignment statement that would make this intent clear?
If you're writing code where Deep vs. Shallow copies will actually have an effect on behavior then IMHO you owe it to yourself and anyone else who has to read and maintain your code to make this obvious and explicit - like the answer from wong2, repeated 5 times (once for each of the 5 variables).
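For what it's worth, one way to keep the copies explicit and still use a single statement is to unpack a generator expression that calls copy.deepcopy once per target. This is just a sketch; it assumes the matrix is a plain nested list, and my_matrix is a made-up stand-in for myMatrix.

```python
import copy

# Hypothetical 5x5 matrix built from nested lists.
my_matrix = [[0] * 5 for _ in range(5)]

# One statement, three independent deep copies (the count must match the targets).
a, b, c = (copy.deepcopy(my_matrix) for _ in range(3))

a[0][0] = 99                        # only a changes

print(my_matrix[0][0], b[0][0])     # 0 0
print(a is b)                       # False
```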

memory management with objects and lists in python

I am trying to understand how exactly assignment operators, constructors and parameters passed in functions work in python specifically with lists and objects. I have a class with a list as a parameter. I want to initialize it to an empty list and then want to populate it using the constructor. I am not quite sure how to do it.
Let's say my class is --
class A:
    List = [] # Point 1
    def __init1__(self, begin=[]): # Point 2
        for item in begin:
            self.List.append(item)
    def __init2__(self, begin): # Point 3
        List = begin
    def __init3__(self, begin=[]): # Point 4
        List = list()
        for item in begin:
            self.List.append(item)
listObj = A()
del(listObj)
b = listObj
I have the following questions. It will be awesome if someone could clarify what happens in each case --
1. Is declaring an empty list like in Point 1 valid? What is created? A variable pointing to NULL?
2. Which of Point 2 and Point 3 are valid constructors? In Point 3 I am guessing that a new copy of the list passed in (begin) is not made and instead the variable List will be pointing to the pointer "begin". Is a new copy of the list made if I use the constructor as in Point 2?
3. What happens when I delete the object using del? Is the list deleted as well or do I have to call del on the List before calling del on the containing object? I know Python uses GC but if I am concerned about cleaning unused memory even before GC kicks in is it worth it?
4. Also, assigning an object of type A to another only makes the second one point to the first, right? If so, how do I do a deep copy? Is there a feature to overload operators? I know Python is probably much simpler than this and hence the question.
EDIT:
5. I just realized that using Point 2 and Point 3 does not make a difference. The items from the list begin are only copied by reference and a new copy is not made. To do that I have to create a new list using list(). This makes sense after I see it I guess.
Thanks!
In order:
1. Using this form is simply syntactic sugar for calling the list constructor, i.e. you are creating a new (empty) list. This will be bound to the class itself (it is a class attribute, comparable to a static field) and will be the same for all instances.
2. Apart from the constructor name, which must always be __init__, both are valid forms, but they mean different things.
The first constructor can be called with a list as argument or without. If it is called without arguments, the empty list passed as default is used within (this empty list is created once during class definition, and not once per constructor call), so no items are added to the static list.
The second must be called with a list parameter, or Python will complain with an error. However, because it is used without the self. prefix, as you are doing, it just creates a new local variable named List, accessible only within the constructor, and leaves the static A.List variable unchanged.
3. Deleting will only unlink a reference to the object, without actually deleting anything. Once all references are removed, however, the garbage collector is free to clear the memory as needed.
It is usually a bad idea to try to control the garbage collector. Instead, just make sure you don't hold references to objects you no longer need and let it do its work.
4. Assigning an object to a variable will only create a new reference to the same object, yes. To create a deep copy, use copy.deepcopy from the copy module or write your own.
Operator overloading (use with care, it can make things more confusing instead of clearer if misused) can be done by overriding some special methods in the class definition.
About your edit: as I pointed out above, when writing List=list() inside the constructor, without the self. (or better, since the variable is static, A.) prefix, you are just creating a new local variable bound to an empty list, and not overriding the one you defined in the class body.
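The points about del and copying can be sketched like this. The Box class is hypothetical and exists only for the illustration.

```python
import copy


class Box:
    """Hypothetical class holding a list, used only for illustration."""

    def __init__(self, items):
        self.items = list(items)   # own copy of the incoming list


first = Box([1, 2, 3])
alias = first                  # a second reference to the same object
clone = copy.deepcopy(first)   # an independent object with its own list

del first                      # removes the name only; the object survives via alias
alias.items.append(4)

print(alias.items)             # [1, 2, 3, 4]
print(clone.items)             # [1, 2, 3]
```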
For reference, the usual way to handle a list as default argument is by using a None placeholder:
class A(object):
    def __init__(self, arg=None):
        self.startvalue = list(arg) if arg is not None else list()
        # making a defensive copy of arg to keep the original intact
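To see why the None placeholder matters, here is a short sketch of the two sharing effects discussed above: a list stored on the class and a list used as a default argument. The names Shared and append_default are invented for the example.

```python
class Shared:
    items = []                     # one list, stored on the class itself

    def add(self, item):
        self.items.append(item)    # mutates the class-level list


def append_default(item, bucket=[]):   # the default list is created only once
    bucket.append(item)
    return bucket


x, y = Shared(), Shared()
x.add(1)
print(y.items)                     # [1]: both instances see the same list

append_default("a")
second = append_default("b")
print(second)                      # ['a', 'b']: the default kept its state
```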
As an aside, do take a look at the python tutorial. It is very well written and easy to follow and understand.
"It will be awesome if someone could clarify what happens in each case" isn't that the purpose of the dis module ?
http://docs.python.org/2/library/dis.html

Modifying variables in Python function is affecting variables with different names outside the function

I have a nested dictionary containing a bunch of data on a number of different objects (where I mean object in the non-programming sense of the word). The format of the dictionary is allData[i][someDataType], where i is a number designation of the object that I have data on, and someDataType is a specific data array associated with the object in question.
Now, I have a function that I have defined that requires a particular data array for a calculation to be performed for each object. The data array is called cleanFDF. So I feed this to my function, along with a bunch of other things it requires to work. I call it like this:
rm.analyze4complexity(allData[i]['cleanFDF'], other data, other data, other data)
Inside the function itself, I straight away re-assign the cleanFDF data to another variable name, namely clFDF. I.e. The end result is:
clFDF = allData[i]['cleanFDF']
I then have to zero out all of the data that lies below a certain threshold, as such:
clFDF[ clFDF < threshold ] = 0
OK - the function works as it is supposed to. But now when I try to plot the original cleanFDF data back in the main script, the entries that got zeroed out in clFDF are also zeroed out in allData[i]['cleanFDF']. WTF? Obviously something is happening here that I do not understand.
To make matters even weirder (from my point of view), I've tried to do a bodgy kludge to get around this by 'saving' the array to another variable before calling the function. I.e. I do
saveFDF = allData[i]['cleanFDF']
then run the function, then update the cleanFDF entry with the 'saved' data:
allData[i].update( {'cleanFDF':saveFDF} )
but somehow, simply performing clFDF[ clFDF < threshold ] = 0 within the function modifies clFDF, saveFDF and allData[i]['cleanFDF'] in the main friggin' script, zeroing out all the entries at the same array indexes! It is like they are all associated global variables somehow, but I've made no such declarations anywhere...
I am a hopeless Python newbie, so no doubt I'm not understanding something about how it works. Any help would be greatly appreciated!
You are passing the value at allData[i]['cleanFDF'] by reference (decent explanation at https://stackoverflow.com/a/430958/337678). Any changes made to it will be made to the object it refers to, which is still the same object as the original, just assigned to a different variable.
Making a deep copy of the data will likely fix your issue (the deepcopy function in Python's copy module should do the trick ;)).
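A rough sketch of the aliasing and the fix, using plain lists so that it runs without NumPy. The names all_data and zero_below are stand-ins, not the code from the question. With a real NumPy array, calling the array's own .copy() method before the function call should play the same role as deepcopy here.

```python
import copy

# Hypothetical stand-in for allData; plain lists replace the NumPy arrays.
all_data = {0: {"cleanFDF": [0.1, 5.0, 0.2, 7.0]}}


def zero_below(values, threshold):
    """Zero small entries in place, similar to clFDF[clFDF < threshold] = 0."""
    for i, v in enumerate(values):
        if v < threshold:
            values[i] = 0
    return values


saved = copy.deepcopy(all_data[0]["cleanFDF"])   # a real copy, taken before the call
alias = all_data[0]["cleanFDF"]                  # just another name, not a copy

zero_below(all_data[0]["cleanFDF"], 1.0)

print(alias)   # [0, 5.0, 0, 7.0]
print(saved)   # [0.1, 5.0, 0.2, 7.0]
```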
Everything is a reference in Python.
def function(y):
    y.append('yes')
    return y
example = list()
function(example)
print(example)
It would print ['yes'] even though I am not directly changing the variable 'example'.
See Why does list.append evaluate to false?, Python append() vs. + operator on lists, why do these give different results?, Python lists append return value.
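The difference between append and + that those questions discuss can be sketched like this (the function names are made up for the example):

```python
def append_in_place(y):
    y.append("yes")      # mutates the caller's list
    return y


def concat_new(y):
    y = y + ["yes"]      # builds a new list and rebinds the local name only
    return y


first = []
append_in_place(first)

second = []
result = concat_new(second)

returned = [].append("x")     # append itself returns None

print(first, second, result)  # ['yes'] [] ['yes']
print(returned)               # None
```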

cache in python function

This appeared as some test question.
If you consider this function which uses a cache argument as the 1st argument
def f(cache, key, val):
    cache[key] = val
    # insert some insanely complicated operation on the cache
    print(cache)
and now create a dictionary and use the function like so:
c = {}
f(c,"one",1)
f(c,"two",2)
This seems to work as expected (i.e., adding to the c dictionary), but is it actually passing that reference or is it doing some inefficient copy?
The dictionary passed to cache is not copied. As long as the cache variable is not rebound inside the function, it stays the same object, and modifications to the dictionary it refers to will affect the dictionary outside.
There is not even any need to return cache in this case (and indeed the sample code does not).
It might be better if f was a method on a dictionary-like object, to make this more conceptually clear.
If you use the id() function (built-in, does not need to be imported) you can get an identifier for any object that is unique among all objects alive at the same time. You can use that to confirm that you are really and truly dealing with the same object and not any sort of copy.
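A quick sketch of that check. Here f is a trimmed version of the function from the question that returns the id it sees, and rebinding_f is an invented counterexample that rebinds the name to a copy.

```python
def f(cache, key, val):
    cache[key] = val
    return id(cache)        # identity of the object seen inside the function


def rebinding_f(cache, key, val):
    cache = dict(cache)     # rebinds the local name to a copy
    cache[key] = val
    return id(cache)


c = {}
same = f(c, "one", 1) == id(c)
copied = rebinding_f(c, "two", 2) == id(c)

print(same)     # True: the function worked on the caller's dict
print(copied)   # False: the function worked on a separate copy
print(c)        # {'one': 1}
```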
