Understanding slicing in python [duplicate] - python

This question already has answers here:
Slicing a list in Python without generating a copy
(5 answers)
Closed 5 months ago.
import numpy as np
x = np.arange(10**5)
for i in range(x.size):
    foo(x[3000:40000])  # the slice is created again on every iteration
and here is another version of the code above:
import numpy as np
x = np.arange(10**5)
slice_of_x = x[3000:40000]  # the slice is created once, before the loop
for i in range(x.size):
    foo(slice_of_x)
Will the second version be faster or not? I am inclined to think in terms of pointers in C (so that in fact the first may be faster), but I understand Python is not C.
O.k. This post
Slicing a list in Python without generating a copy
answers my question.

Read this http://scipy-lectures.github.io/advanced/advanced_numpy/#life-of-ndarray, in particular the section Slicing with integers.
From the page [on slicing]
Everything can be represented by changing only shape, strides, and possibly adjusting the data pointer!
Never makes copies of the data
So the overhead of creating a slice is small: constructing a Python object which stores a few tuples and a pointer to the data.
In your first code sample, you create the slice in the loop each time. In the second example you create the slice once. The difference will probably be negligible for most functions foo that you would choose to write.
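A minimal sketch of how to check this yourself, assuming NumPy is installed. It only inspects the slice from the question and does not time anything:

```python
import numpy as np

x = np.arange(10**5)
slice_of_x = x[3000:40000]

# A basic slice is a view: it has its own shape and strides
# but points into the buffer owned by x.
print(slice_of_x.base is x)             # True
print(np.shares_memory(x, slice_of_x))  # True

# Writing through the view changes x, so no data was copied.
slice_of_x[0] = -1
print(x[3000])                          # -1
```

If the per-iteration cost matters for a particular foo, timing both versions with the standard timeit module is the reliable way to find out.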

Related

Is there a different way to automatically adjust the length of a jupyter list? [duplicate]

This question already has answers here:
Efficient 2d list initialization [python]
(2 answers)
Closed 9 months ago.
I am using for loops to shorten my code for grabbing specific sections of a dataframe and finding the mean and standard deviation. I am then saving those values into a list, since there are different data types.
I wanted to simplify initializing the list by writing list=[[0,0,0]]*(len(cities)) to automatically create an x by 3 list, where x is however many cities there are. The problem with this, as shown in the picture, is that setting a single index to a value, such as list[0][1]=1, doesn't only affect that index, but instead changes that column in every row.
I was hoping for a more elegant solution than simply manually counting the number of cities and copy/pasting [0,0,0],.... that many times. I realize a solution is not to use for loops as I did and avoid lists altogether, but that's not the point for this question.
Simple image of the issue described above. Value being applied to all rows of list.
https://i.stack.imgur.com/hruFz.png
One way is a list comprehension, which builds a separate inner list for every row:
zeros = [[0,0,0] for _ in range(50)]
If a NumPy array will suffice, it is probably better to do
zeros = numpy.zeros((50,3))
If you have mixed data types (e.g. strings and numbers), you will need an object array:
zeros = numpy.zeros((50,3),dtype=object)
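The difference between repeating a list and using a comprehension can be seen in a short sketch. The name n_cities is a hypothetical stand-in for len(cities), which the question does not show:

```python
n_cities = 4  # hypothetical stand-in for len(cities)

# Repetition copies references: every row is the same inner list object.
aliased = [[0, 0, 0]] * n_cities
aliased[0][1] = 1
print(aliased)      # [[0, 1, 0], [0, 1, 0], [0, 1, 0], [0, 1, 0]]

# The comprehension evaluates [0, 0, 0] once per row, so rows are independent.
independent = [[0, 0, 0] for _ in range(n_cities)]
independent[0][1] = 1
print(independent)  # [[0, 1, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0]]
```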

"in" operation on sets and lists in Python [duplicate]

This question already has answers here:
What makes sets faster than lists?
(8 answers)
Closed 1 year ago.
I have a book on Python which says:
in is a very fast operation on sets:
stopwords_list = ["a", "an"] + hundreds_of_other_words + ["yet", "you"]
"zip" in stopwords_list # False, but have to check every element
stopwords_set = set(stopwords_list)
"zip" in stopwords_set # Very fast to check
I have two questions:
Why is in faster on sets than on lists?
If the in operator really is faster on sets, then why don't the makers of Python just rewrite the in method for lists to do x in set(list)? Why can't the idea in this book just be made part of the language?
A set uses a hash table to get an average O(1) lookup instead of the O(n) lookup of a list. That is why it is much faster, provided the elements are hashable (read more: What makes sets faster than lists?).
Converting a list into a set every time in is called would require the whole list to be traversed and every element hashed, so it would be even slower than searching the list (even if the element is at the start of the list, the conversion is done on all the elements, so there is no short-circuiting).
There's no magic. Converting a list into a set is useful only when it is done once, at initialization, and the set is then used for many lookups during processing.
But in that case, creating a set directly is the best way.
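The lack of short-circuiting can be made visible by counting operations rather than timing them. This is only a sketch: Counted is a hypothetical wrapper class written for this illustration, not part of any library.

```python
class Counted:
    """Hypothetical wrapper that counts hash and equality calls."""
    hash_calls = 0
    eq_calls = 0

    def __init__(self, value):
        self.value = value

    def __hash__(self):
        Counted.hash_calls += 1
        return hash(self.value)

    def __eq__(self, other):
        Counted.eq_calls += 1
        return isinstance(other, Counted) and self.value == other.value


items = [Counted(i) for i in range(1000)]
target = Counted(0)  # equal to the very first element of the list

# List membership stops at the first match.
found_in_list = target in items
list_eq_calls = Counted.eq_calls

# Converting first has to hash every element before the lookup can start.
Counted.hash_calls = 0
found_in_set = target in set(items)
set_hash_calls = Counted.hash_calls

print(found_in_list, list_eq_calls)
print(found_in_set, set_hash_calls)
```

The list search needs a single comparison here, while the conversion hashes all 1000 elements first. Building the set once and reusing it avoids paying that cost on every lookup.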

Shuffle a copy of a list [duplicate]

This question already has an answer here:
How to properly copy a list in python
(1 answer)
Closed 2 years ago.
It seems that the copy of a list is shuffled too?
So if I do for 2 lists:
data = examples
np.random.shuffle(examples)
then the list data is also shuffled? Why?
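Assignment in Python never copies an object: data = examples only binds a second name to the same list, so a shuffle through either name is visible through both.
In your case you can use the built-in copy module to make an independent copy: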
import copy
import numpy as np
data = copy.deepcopy(examples)  # independent copy, unaffected by the shuffle
np.random.shuffle(examples)
You have to make a copy, e.g. examples = np.copy(data), because otherwise the two variables reference the same object in memory, so a change made through one shows up in the other.
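A small sketch of the aliasing described above, assuming NumPy is installed. The names alias and snapshot are illustrative and do not come from the question:

```python
import numpy as np

examples = list(range(10))
alias = examples             # same list object under a second name
snapshot = list(examples)    # shallow copy: a new list with the same elements

np.random.shuffle(examples)  # shuffles the list in place

print(alias is examples)     # True: the alias sees the shuffled order
print(snapshot)              # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```

A shallow copy is enough when the elements are immutable, as here. copy.deepcopy is only needed when the elements themselves are mutable and must not be shared.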

finding the position of an element within a numpy array in python [duplicate]

This question already has answers here:
Is there a NumPy function to return the first index of something in an array?
(20 answers)
Closed 3 years ago.
I want to ask a question about finding the position of an element within an array in Python's numpy package.
I am using Jupyter Notebook for Python 3 and have the following code illustrated below:
concentration_list = array([172.95, 173.97, 208.95])
and I want to write a block of code that would be able to return the position of an element within the array.
For this purpose, I wanted to use 172.95 to demonstrate.
Initially, I attempted to use .index(), passing in 172.95 inside the parentheses but this did not work as numpy does not recognise the .index() method -
concentration_position = concentration_list.index(172.95)
AttributeError: 'numpy.ndarray' object has no attribute 'index'
The SciPy documentation did not mention any such method when I checked the site.
Is there any function available (that I may not have discovered) to solve the problem?
You can use the where function from the NumPy library:
import numpy as np
concentration_list = np.array([172.95, 173.97, 208.95])
number = 172.95
print(np.where(concentration_list == number)[0])
Output : [0]
Alternatively, call nonzero() on the boolean mask, which is what np.where(...) does when given a single argument, e.g.
import numpy as np
concentration_list = np.array([172.95, 173.97, 208.95])
index=np.ravel(np.asarray(concentration_list==172.95).nonzero())
print(index)
# Output (array of all indexes matching the condition):
[0]
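If only the first position is wanted, as with list.index(), the result can be wrapped in a small helper. This is a sketch under assumptions: first_index is a hypothetical name, and np.isclose is swapped in for == because exact equality on floats can miss values that differ only by rounding.

```python
import numpy as np

def first_index(values, target):
    """Hypothetical helper: index of the first element close to target, or None."""
    matches = np.flatnonzero(np.isclose(values, target))
    return int(matches[0]) if matches.size else None

concentration_list = np.array([172.95, 173.97, 208.95])
print(first_index(concentration_list, 172.95))  # 0
print(first_index(concentration_list, 100.0))   # None
```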

How to access the elements in numpy array of sets? [duplicate]

This question already has answers here:
Dictionary in a numpy array?
(2 answers)
Closed 5 years ago.
By mistake I wrapped a set in a NumPy array to save my result. The run was time-consuming and it is hard to repeat it. It seems something is stored in the variable, but how can I retrieve the elements inside it?
To simplify, I did something like
import numpy as np
result = np.array({1,2,3,4})
Now how can I access the elements (i.e., 1,2,3,4) in result?
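The only "problem" here is that your array is zero-dimensional: it is a scalar object array holding the whole set as its single element (otherwise you could use result[0]). In this case you can simply use result.item(), which returns the original set.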
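A short sketch of what the array from the question looks like and how the set comes back out, assuming NumPy is installed. The name recovered is illustrative:

```python
import numpy as np

result = np.array({1, 2, 3, 4})

# NumPy does not unpack a set, so it stores it as one Python object
# inside a zero-dimensional array.
print(result.shape)   # ()
print(result.dtype)   # object

recovered = result.item()  # the original set object
print(sorted(recovered))   # [1, 2, 3, 4]
```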
