Is indexed slice a view - python

In torch, slicing creates a view, i.e. the data is not copied into a new tensor; the result acts as an alias:
b = a[3:10, 2:5 ]
My understanding is that this is not the case for an indexed slice, e.g.
b = a[[1,2,3] : [5,11]]
Is this correct?
Second, is there a module that mimics a view, i.e. one that internally holds the indexes but accesses the original tensor, acting as a sort of proxy?
Something like this, but more general:
class IXView:
    def __init__(self, ixs, ten):
        self.ixs = ixs
        self.ten = ten
    def __getitem__(self, rows):
        return self.ten[self.ixs[rows], :]

You are correct: indexing a tensor with a list or tensor of indices (advanced indexing) does not create a view; it creates a new copy in memory. Basic slicing can return a view because the result is fully described by an offset and strides into the original storage, even when that result is non-contiguous. An arbitrary set of indices generally cannot be described that way, so the selected elements are gathered into a fresh tensor. You can see this for yourself by comparing <tensor>.storage().data_ptr(), the memory address of the underlying storage. Note that is_contiguous() only describes the memory layout; it does not tell you whether a tensor is a view.
import torch
a = torch.rand([10, 10, 10])
a.is_contiguous()        # True
a.storage().data_ptr()   # 93837543268480 (will be different for you)
### normal torch slicing
b = a[3:4, 2:8, 5:6]
b.is_contiguous()        # False
b.storage().data_ptr()   # 93837543268480, same as for a, because this is a view of the data in a
### list indexing of a tensor
c = a[[1, 2, 3], [2, 3, 4], :]
c.is_contiguous()        # True
c.storage().data_ptr()   # 93839531853056, different from a
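On the second part of the question (a proxy that holds the indexes but reads and writes the original data), here is a minimal sketch of the idea. It is not an existing torch module: the class name IndexView is hypothetical, and a plain Python list stands in for the tensor so the example runs without torch. With a tensor as the base, the same pattern would forward reads and writes to the tensor instead.

```python
class IndexView:
    """Proxy that exposes selected positions of a base sequence without copying it."""

    def __init__(self, base, indices):
        self.base = base              # the original container, shared, never copied
        self.indices = list(indices)  # positions of base that this view exposes

    def __len__(self):
        return len(self.indices)

    def __getitem__(self, i):
        # Translate the view position into a position in the base, then read.
        return self.base[self.indices[i]]

    def __setitem__(self, i, value):
        # Writes go straight through to the base, as they would with a real view.
        self.base[self.indices[i]] = value


data = [10, 20, 30, 40, 50]
view = IndexView(data, [4, 0, 2])

first = view[0]   # reads data[4]
view[1] = -1      # writes data[0]
data[2] = 99      # a change made to the base is visible through the view
print(first, view[2], data)
```

The trade-off is that every access pays for an extra index lookup, which is the price of not materializing a copy.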

Related

Understanding Mutability and Multiple Variable Assignment to Class Objects in Python

I'm looking for some clarification regarding mutability and class objects. From what I understand, variables in Python are about assigning a variable name to an object.
If that object is immutable then when we set two variables to the same object, it'll be two separate copies (e.g. a = b = 3 so a changing to 4 will not affect b because 3 is a number, an example of an immutable object).
However, if an object is mutable, then changing the value in one variable assignment will naturally change the value in the other (e.g. a = b = [] -> a.append(1) so now both a and b will refer to "[1]")
Working with classes, it seems even more fluid than I believed. I wrote a quick example below to show the differences. The first class is a typical Node class with a next pointer and a value. Setting two variables, "slow" and "fast", to the same instance of the Node object ("head"), and then changing the values of both "slow" and "fast" won't affect the other. That is, "slow", "fast", and "head" all refer to different objects (verified by checking their id() as well).
The second example class doesn't have a next pointer and only has a self.val attribute. This time changing one of the two variables, "p1" and "p2", both of which are set to the same instance, "start", will affect the other. This is despite that self.val in the "start" instance is an immutable number.
'''
The below will have two variable names (slow, fast) assigned to a head Node.
Changing one of them will NOT change the other reference as well.
'''
class Node:
    def __init__(self, x, next=None):
        self.x = x
        self.next = next
    def __str__(self):
        return str(self.x)
n3 = Node(3)
n2 = Node(2, n3)
n1 = Node(1, n2)
head = n1
slow = fast = head
print(f"Printing before moving...{head}, {slow}, {fast}") # 1, 1, 1
while fast and fast.next:
    fast = fast.next.next
    slow = slow.next
print(f"Printing after moving...{head}, {slow}, {fast}") # 1, 2, 3
print(f"Checking the ids of each variable {id(head)}, {id(slow)}, {id(fast)}") # all different
'''
The below will have two variable names (p1, p2) assigned to a start Dummy.
Changing one of them will change the other reference as well.
'''
class Dummy:
    def __init__(self, val):
        self.val = val
    def __str__(self):
        return str(self.val)
start = Dummy(100)
p1 = p2 = start
print(f"Printing before changing {p1}, {p2}") # 100, 100
p1.val = 42
print(f"Printing after changing {p1}, {p2}") # 42, 42
This is a bit murky for me to understand what is actually going on under the hood and I'm seeking clarification so I can feel confident in setting multiple variable assignments to the same object expecting a true copy (without resorting to "import copy; copy.deepcopy(x);")
Thank you for your help
This isn't a matter of immutability vs mutability. This is a matter of mutating an object vs reassigning a reference.
If that object is immutable then when we set two variables to the same object, it'll be two separate copies
This isn't true. A copy won't be made. If you have:
a = 1
b = a
You have two references to the same object, not a copy of the object. This is fine though because integers are immutable. You can't mutate 1, so the fact that a and b are pointing to the same object won't hurt anything.
Python will never make implicit copies for you. If you want a copy, you need to copy it yourself explicitly (using copy.copy, or some other method like slicing on lists). If you write this:
a = b = some_obj
a and b will point to the same object, regardless of the type of some_obj and whether or not it's mutable.
So what's the difference between your examples?
In your first Node example, you never actually alter any Node objects. They may as well be immutable.
slow = fast = head
That initial assignment makes both slow and fast point to the same object: head. Right after that though, you do:
fast = fast.next.next
This reassigns the fast reference, but never actually mutates the object fast is looking at. All you've done is change what object the fast reference is looking at.
In your second example however, you directly mutate the object:
p1.val = 42
While this looks like reassignment, it isn't. This is actually:
p1.__setattr__("val", 42)
And __setattr__ alters the internal state of the object.
So, reassignment changes what object is being looked at. It will always take the form:
a = b # Maybe chained as well.
Contrast with these that look like reassignment, but are actually calls to mutating methods of the object:
l = [0]
l[0] = 5 # Actually l.__setitem__(0, 5)
d = Dummy(100)
d.val = 42 # Actually d.__setattr__("val", 42)
You overcomplicate things. The fundamental, simple rule is: each time you use = to assign an object to a variable, you make the variable name refer to that object, that's all. The object being mutable or not makes no difference.
With a = b = 3, you make the names a and b refer to the object 3. If you then make a = 4, you make the name a refer to the object 4, and the name b still refers to 3.
With a = b = [], you've created two names a and b that refer to the same list object. When doing a.append(1), you append 1 to this list. You haven't assigned anything to a or b in the process (you didn't write any a = ... or b = ...). So, whether you access the list through the name a or b, it's still the same list that you manipulate. It can just be called by two different names.
The same happens in your example with classes: when you write fast = fast.next.next, you make the name fast refer to a new object.
When you do p1.val = 42, you don't make p1 refer to a new, different instance; you change the val attribute of the existing instance. p1 and p2 are still two names for this one instance, so using either name lets you refer to the same instance.
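The difference between mutating an object and rebinding a name can be checked directly with the is operator. The following is a small sketch that reuses the Dummy class from the question; the helper variable names are made up for illustration.

```python
class Dummy:
    def __init__(self, val):
        self.val = val


start = Dummy(100)
p1 = p2 = start

# Mutation: changes the one shared object, so p2 sees the new value too.
p1.val = 42
same_after_mutation = p1 is p2    # still one object under two names

# Rebinding: only points the name p1 at a different object.
p1 = Dummy(7)
same_after_rebinding = p1 is p2   # p1 and p2 now name different objects

print(same_after_mutation, same_after_rebinding)
print(p2.val, start.val, p1.val)
```

Only the attribute assignment affected what p2 sees; the plain assignment to p1 left p2 and start untouched.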
Mutable and Immutable Objects
When a program is run, data objects in the program are stored in the computer’s
memory for processing. While some of these objects can be modified at that memory
location, other data objects can’t be modified once they are stored in the memory. The
property of whether or not data objects can be modified in the same memory location
where they are stored is called mutability. We can check the mutability of an object by checking its memory location before and
after it is modified. If the memory location remains the same when the data object is
modified, it means it is mutable. To check the memory location of where a data object is stored, we use the function, id(). Consider the following example
a = [5, 10, 15]   # Assigning values to the list a.
id(a)             # The ID of the memory location where a is stored.
# 1906292064
a[1] = 20         # Replacing the second item in the list, 10, with a new item, 20.
print(a)          # Using the print() function to verify the new value of a.
# [5, 20, 15]
id(a)             # Using the function id() to get the memory location of a.
# 1906292064
The memory location has not changed: the ID (1906292064) is the same before and after the list is modified. This indicates that the list is mutable, i.e., it can be modified at the same memory location where it is stored.

Python Appending objects to a list

Beginner with Python, need some help to understand how to manage list of objects.
I built a list of simple objects representing a coordinate
class Point:
    def __init__(self, x, y):
        self.x=0
        self.y=0
I create an empty list to store different points:
combs = []
point = Point(0, 0)
Then I build different points using point, appending it to the list combs each time.
For instance:
point.x=2
point.y=2
combs.append(point)
point.x=4
point.y=4
combs.append(point)
I expect combs to be something like [.. 2,2 4,4], but instead it is [....4,4 4,4].
It means that every time I change the point instance, I change all the points stored in the list to the latest value.
How can I store distinct points instead?
The thing is, when you change the values of x and y you expect to get a new object (a new point with different values), but you don't. Setting point.x = 4 and point.y = 4 just changes the attributes x and y of the same object, which the list already refers to.
I suggest using the copy module. This article helped me a lot when I ran into a similar problem:
https://www.programiz.com/python-programming/shallow-deep-copy
You are appending the same object to the list twice. You need to create a new Point object and initialize it with the new values.
combine = []
p = Point(2,2)
combine.append(p)
p = Point(4,4)
combine.append(p)
This is because a list stores references to objects, not copies of them.
In your example you create point, which is one object.
Then you append it to the list, which stores a reference to that object. Then you change its values and append the same variable again, so the list now holds two references to the same object.
Think of it as passing a reference or memory pointer of the point variable to the list. You gave the list the same pointer twice.
So you need to create different objects or make copies: https://docs.python.org/3.8/library/copy.html
Custom classes, unless they are built on an immutable type, are mutable. This means that you have appended a reference to your list, and that changing the object through one reference is visible through every other reference to it. Consider this:
class Test():
    def __init__(self, a):
        self.a = a
>>> t1 = Test(1)
>>> t1.a
1
>>> t2 = t1
>>> t2.a
1
>>> t2.a = 2
>>> t2.a
2
>>> t1.a
2
See how changing t2 also changed t1?
So you have a few options. You could create new points instead of reusing old ones, you could use a copy() method, or you could write a method into your Point class that exports something immutable, like a tuple of the (x, y) values and append that to your list, instead of appending the entire object to your list.
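The three options above can be sketched as follows. This is an illustration, not the asker's code: it assumes a Point whose __init__ actually stores its arguments, and the list names copies and tuples are made up for the example.

```python
import copy


class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y


# Option 1: build a new Point for every entry.
combs = [Point(2, 2), Point(4, 4)]

# Option 2: reuse one working object, but append a copy of it each time.
point = Point(0, 0)
copies = []
for value in (2, 4):
    point.x = value
    point.y = value
    copies.append(copy.copy(point))   # snapshot of the current state

# Option 3: append an immutable (x, y) tuple instead of the object itself.
tuples = []
for value in (2, 4):
    point.x = value
    point.y = value
    tuples.append((point.x, point.y))

print([(p.x, p.y) for p in combs])
print([(p.x, p.y) for p in copies])
print(tuples)
```

All three lists end up holding the distinct values 2,2 and 4,4, because none of them stores the same object twice.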
You are working with only a single Point. Construct a second one. See commented line below.
point = Point(0, 0)
point.x=2
point.y=2
combs.append(point)
point = Point(0, 0) # add this
point.x=4
point.y=4
combs.append(point)
By the way, your __init__ ignores its parameters and throws them away. A better version is below. We assign self.x=x to make use of the parameter (likewise for y).
def __init__(self, x, y):
    self.x=x
    self.y=y
You need to pass one value into point
How to add an item to your list
combs = []
point = 1 # For example
combs.append(point)
Use command lines to study
Try to use BASH or CMD... Use command lines... They will have instant feedback of your code
Good place to find basic stuff
Try to see the examples on W3Schools. It is a great place. Here is the link for W3Schools - Python - List
Understand basics first
Before you jump into classes, try to understand lists very well! You will learn more and build a solid knowledge if you take a step by step growing! Keep pushing!!!

How to include an array as a parameter of an object in python?

So I know how to create objects in python with things like strings and numbers as parameters, but how would I include an array as one of the parameters of an object I want to create?
It's the same process as with all the other attributes. Assign the attribute in the __init__() method and that's it. For example:
class EvenNumbers:
    def __init__(self, array):
        self.evenNumbers = array
Create an object:
array = [2, 6, 4] #NOTE: this structure is type list not array
en = EvenNumbers(array)
And then use it for whatever you need.
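One thing to keep in mind is that the object stores a reference to the list that was passed in, not a copy. The sketch below shows the consequence, and one way to keep a private copy instead; the copy_input flag is a hypothetical addition for this illustration and is not part of the answer above.

```python
class EvenNumbers:
    def __init__(self, array, copy_input=False):
        # copy_input is a hypothetical flag: when True, store a shallow copy
        # so later changes to the caller's list do not show up here.
        self.evenNumbers = list(array) if copy_input else array


numbers = [2, 6, 4]
shared = EvenNumbers(numbers)                     # keeps a reference to numbers
private = EvenNumbers(numbers, copy_input=True)   # keeps its own shallow copy

numbers.append(8)   # mutate the caller's list after construction

print(shared.evenNumbers)
print(private.evenNumbers)
```

Whether sharing or copying is the right choice depends on whether the object is meant to track later changes to the caller's list.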

How do I copy a class and its list members in Python 2.7 and not copy the references?

I read this about Python classes (link) and it seems to be the issue I am having.
Here is an excerpt from my class and other code:
class s_board:
    def __init__(self):
        self.__board = [[n for n in range(1, 10)] for m in range(81)]
        self.__solved = [False for m in range(81)]
    def copy(self):
        b = s_board()
        b.__board = self.__board[:]
        b.__solved = self.__solved[:]
        return b
if __name__ == '__main__':
    A = s_board()
    B = A.copy()
    B.do_some_operation_on_lists()
When I call B's method that does something to the list, A's lists seem to be affected as well.
So my questions:
Am I not copying the class or the lists correctly?
Is there another issue here?
How do I fix it so that I get a new copy of the class?
self.__board[:] creates a new list containing references to all the same objects that were in self.__board. Since self.__board contains lists, and lists are mutable, you end up with the two s_board instances with partially aliased data, and changing one affects the other.
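The aliasing described here can be reproduced on a small nested list. This is a reduced sketch with made-up data rather than the asker's 81-cell board.

```python
import copy

board = [[1, 2, 3], [4, 5, 6]]

shallow = board[:]            # new outer list, but the same inner lists
deep = copy.deepcopy(board)   # new outer list and new inner lists

board[0].remove(2)            # mutate an inner list of the original

print(shallow[0])             # follows the change, the inner list is shared
print(deep[0])                # unaffected, the inner list was copied
print(shallow[0] is board[0], deep[0] is board[0])
```

The slice copy only protects the outer list: appending to board would not affect shallow, but changing any inner list does.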
As Raymond Hettinger suggested, you can use the copy.deepcopy to (mostly) guarantee that you take a true copy of an object and don't share any data. I say mostly, as I believe there are some strange objects that deepcopy will not work on, but for normal things like lists and straightforward classes it will work fine.
I have an additional suggestion though. You call b = s_board(), which goes to all the effort of constructing the lists for the new blank board, and then you throw them away by assigning to b.__board and b.__solved. It seems to be like it would be better to do something like the following:
import copy
class s_board:
    def __init__(self, board=None, solved=None):
        if board is None:
            self.__board = [[n for n in range(1, 10)] for m in range(81)]
        else:
            self.__board = copy.deepcopy(board)
        if solved is None:
            self.__solved = [False for m in range(81)]
        else:
            self.__solved = copy.deepcopy(solved)
    def copy(self):
        b = s_board(self.__board, self.__solved)
        return b
Now if you call A = s_board() you get a new blank board, and if you call A.copy() you get a distinct copy of A, without having had to allocate and then discard a new blank board.
try using deepcopy() instead of copy()
copy() inserts references if it is able, deepcopy() should copy all of the members without using references.
The inner lists are being shared. Here's an article that explains what is happening: http://www.python-course.eu/deep_copy.php
To fix the code, you can use copy.deepcopy to make sure there is no shared data:
def copy(self):
    b = s_board()
    b.__board = copy.deepcopy(self.__board)
    b.__solved = copy.deepcopy(self.__solved)
    return b

Is there a way in Python to return a value via an output parameter?

Some languages, such as C#, can also return values through parameters.
Let’s take a look at an example:
class OutClass
{
    static void OutMethod(out int age)
    {
        age = 26;
    }
    static void Main()
    {
        int value;
        OutMethod(out value);
        // value is now 26
    }
}
So is there anything similar in Python to get a value using parameter, too?
Python can return a tuple of multiple items:
def func():
    return 1,2,3
a,b,c = func()
But you can also pass a mutable parameter, and return values via mutation of the object as well:
def func(a):
    a.append(1)
    a.append(2)
    a.append(3)
L=[]
func(L)
print(L) # [1,2,3]
You mean like passing by reference?
Python passes references to objects, and the reference itself is passed by value. A function can therefore mutate the caller's object, but rebinding the parameter name inside the function does not affect the caller's variable.
For example:
def addToList(theList): # yes, the caller's list can be appended
    theList.append(3)
    theList.append(4)
def addToNewList(theList): # no, the caller's list cannot be reassigned
    theList = list()
    theList.append(5)
    theList.append(6)
myList = list()
myList.append(1)
myList.append(2)
addToList(myList)
print(myList) # [1, 2, 3, 4]
addToNewList(myList)
print(myList) # [1, 2, 3, 4]
Pass a list or something like that and put the return value in there.
In addition, if you feel like reading some code, I think that pywin32 has a way to handle output parameters.
In the Windows API it's common practice to rely heavily on output parameters, so I figure they must have dealt with it in some way.
You can do that with mutable objects, but in most cases it does not make sense because you can return multiple values (or a dictionary if you want to change a function's return value without breaking existing calls to it).
I can only think of one case where you might need it - that is threading, or more exactly, passing a value between threads.
import threading
def outer():
    class ReturnValue:
        val = None
    ret = ReturnValue()
    def t():
        # ret = 5 won't work obviously because that will set
        # the local name "ret" in the "t" function. But you
        # can change the attributes of "ret":
        ret.val = 5
    threading.Thread(target = t).start()
    # Later, you can get the return value out of "ret.val" in the outer function
Adding to Tark-Tolonen's answer:
Avoid rebinding the output argument inside your function; otherwise the caller will not see the result. For instance, suppose I want to pass an ndarray into a function my_fun and modify it:
import numpy as np
def my_fun(out_arr):
    out_arr = np.ones_like(out_arr)
    print(out_arr) # prints 1, 1, 1, ......
    print(id(out_arr))
a = np.zeros(100)
my_fun(a)
print(a) # prints 0, 0, 0, ....
print(id(a))
After calling my_fun, array a still contains all zeros, because np.ones_like returns a new array full of ones and the assignment binds the local name out_arr to it instead of modifying the array that was passed in. Running this code, you will find that the two print(id(...)) calls give different values.
Also, beware of numpy's array operators: they usually return a new array, so the same problem occurs if you write something like this:
def my_fun(arr_a, arr_b, out_arr):
    out_arr = arr_a - arr_b
Using the - and = operators causes the same problem. To keep out_arr referring to the caller's array, use the numpy function that performs the same operation but has an out parameter built in. The preceding code should be rewritten as
def my_fun(arr_a, arr_b, out_arr):
    np.subtract(arr_a, arr_b, out = out_arr)
The memory location of out_arr then remains the same before and after calling my_fun, while its values are modified successfully.
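The same distinction can be shown without numpy. The sketch below uses a plain list as a stand-in for the array, with slice assignment playing the role of an in-place write; the function names rebind and fill_in_place are made up for this illustration.

```python
def rebind(out):
    # Binds the local name to a brand new list; the caller's list is untouched.
    out = [1] * len(out)


def fill_in_place(out):
    # Slice assignment overwrites the contents of the caller's list.
    out[:] = [1] * len(out)


a = [0, 0, 0]
id_before = id(a)

rebind(a)
after_rebind = list(a)    # snapshot taken after the rebinding attempt

fill_in_place(a)
after_fill = list(a)      # snapshot taken after the in-place write

print(after_rebind, after_fill, id(a) == id_before)
```

Only the in-place version changes what the caller sees, and the identity of a stays the same throughout.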
