I am trying to insert values, one at a time, from several Python lists of lists (i.e. 2D lists) into another 2D list. (I know numpy is better at this, but I am trying to compare the performance of lists to numpy, so please don't just suggest numpy.) I want to insert the values at specific locations, hence the indexing on the left hand side.
resampled_pix_spot_list is a 240 by 240 list of lists, and pix_spot_list is a 225 by 225 list of lists.
The error I am getting, from the final four lines in the example, is "TypeError: 'float' object is not subscriptable". I get that pix_prod_bl[0][0], for example, is a float, but I don't understand why I can't insert it into a particular set of indices in resampled_pix_spot_list.
Edit 1- added minimal working example.
Edit 2- in adding the working example, I found that I accidentally had the line commented where I convert the lists back to numpy, and somehow I misinterpreted the Spyder console about where the error was originating. Anyway it works now, thank you very much for the quick feedback. I guess I'll leave this here in case it's helpful to anyone else.
Edit 3- pix_spot_values is an array of data, so just a random array of floats between 0 and 1 will suffice.
import numpy as np
xc=57
yc=189
rebin=15
# fraction pixel offset requiring interpolation
dx=xc*rebin-int(np.floor(xc*rebin)) # positive value between 0 and 1
dy=yc*rebin-int(np.floor(yc*rebin)) # positive value between 0 and 1
# weights for interpolation
w00=(1-dy)*(1-dx)
w10=dy*(1-dx)
w01=(1-dy)*dx
w11=dy*dx
# now the rest of the offset is an integer shift
dx=int(np.floor(xc*rebin))-int(np.floor(xc))*rebin # positive integer between 0 and 14
dy=int(np.floor(yc*rebin))-int(np.floor(yc))*rebin # positive integer between 0 and 14
def new_pix_spot(w00, w10, w01, w11, pix_spot_values, ny_spot, nx_spot, rebin, dy, dx):
    #first change numpy array to list
    pix_spot_list=pix_spot_values.tolist()
    #preallocate array of zeros
    resampled_pix_spot_list=[[0 for x in range (ny_spot + rebin)] for y in range(nx_spot+rebin)]
    #create 2D lists
    pix_prod_bl = [[x*w00 for x in y] for y in pix_spot_list]#bottom left
    pix_prod_br = [[x*w10 for x in y] for y in pix_spot_list]#bottom right
    pix_prod_tl = [[x*w01 for x in y] for y in pix_spot_list]#top left
    pix_prod_tr = [[x*w11 for x in y] for y in pix_spot_list]#top right
    for i in range (len(pix_spot_list)):
        for j in range (len(pix_spot_list)):
            k=dy + i
            m=dx + j
            n=dy + 1 + i
            p=dx + 1 + j #column offset follows j, like m
            resampled_pix_spot_list[k][m] += pix_prod_bl[i][j] #bottom left
            resampled_pix_spot_list[n][m] += pix_prod_br[i][j] #bottom right
            resampled_pix_spot_list[k][p] += pix_prod_tl[i][j] #top left
            resampled_pix_spot_list[n][p] += pix_prod_tr[i][j] #top right
    resampled_pix_spot_values = np.array(resampled_pix_spot_list)
    return resampled_pix_spot_values
Inserting and Replacing
To insert values into a list in Python, you must work with the list object (for example, resampled_pix_spot_list[0]) rather than the elements within it (resampled_pix_spot_list[0][0], as you tried).
In both Python 2 and 3, you can insert into a list with your_list.insert(<index>, <element>) (list insertion docs here).
So to insert a number to the left of your chosen coordinate, the code would be:
resampled_pix_spot_list[k].insert(m, pix_prod_bl[i][j])
If you wanted to replace the pixel at that position, you would write:
resampled_pix_spot_list[k][m] = pix_prod_bl[i][j]
(Notice the [k] vs [k][m].) In short: To insert, talk to the list; to replace, talk to the element.
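A minimal sketch of that difference, using two small hypothetical 2x3 grids (the names grid_insert and grid_replace are illustrative, not from the question). Inserting lengthens the row, while assignment keeps the shape:

```python
# Hypothetical 2x3 grids, used only for illustration.
grid_insert = [[0, 0, 0], [0, 0, 0]]
grid_replace = [[0, 0, 0], [0, 0, 0]]

# Insert: call .insert() on the row (a list); that row grows by one element.
grid_insert[0].insert(1, 9.5)

# Replace: assign to the element; the row keeps its length.
grid_replace[0][1] = 9.5

print(grid_insert)   # [[0, 9.5, 0, 0], [0, 0, 0]]
print(grid_replace)  # [[0, 9.5, 0], [0, 0, 0]]
```

For a fixed-size pixel grid like the one in the question, replacement (or +=) is probably what you want, since insertion would make the rows ragged.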
Pitfalls of Repeated Inserts
Just a tip: if you're planning on repeatedly inserting values into specific places in a list, try to iterate from the end of the list, backwards. If you don't, you'll have to adjust your indices, since each .insert() call will shift part of your list to the right.
To see what I mean, let's imagine I have the list [1, 2, 3] and want to end up with [1, 88, 2, 99, 3] via insertions. The order we insert matters. Compare the wrong order (iterating forwards):
>>> data = [1, 2, 3]
>>> data.insert(1, 88)
>>> print(data)
[1, 88, 2, 3] # so far so good
>>> data.insert(2, 99)
>>> print(data)
[1, 88, 99, 2, 3] # oops! the first insert changed my indices, so index "2" was wrong!
with the right order (iterating backwards):
>>> data = [1, 2, 3]
>>> data.insert(2, 99)
>>> print(data)
[1, 2, 99, 3] # so far so good
>>> data.insert(1, 88)
>>> print(data)
[1, 88, 2, 99, 3] # same insertions + different order = different results!
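The backwards rule can be wrapped in a small helper. This is only a sketch: insert_many is a hypothetical name, and it assumes the positions are given relative to the original, unmodified list.

```python
def insert_many(data, inserts):
    """Insert values at positions given relative to the ORIGINAL list.

    `inserts` maps an original index to the value to place before it.
    """
    result = list(data)  # work on a copy so the input is left alone
    # Walk the indices from largest to smallest so that earlier
    # insertions never shift the positions still to be processed.
    for index in sorted(inserts, reverse=True):
        result.insert(index, inserts[index])
    return result


result = insert_many([1, 2, 3], {1: 88, 2: 99})
print(result)  # [1, 88, 2, 99, 3]
```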
Slices
Some food for thought: Python 2.7 and 3 both allow you to replace whole "slices" of lists with a very clean syntax, which would also help you avoid "off-by-one" errors (slice notation docs here). For example:
>>> data = [1, 2, 3]
>>> data[1:2] = [88, data[1], 99] # safer, shorter, faster, and clearer
>>> print(data)
[1, 88, 2, 99, 3]
Working with slices might be a bit more declarative and clear. Hope this helps!
Edit: I fixed y so that x,y have the same length
I don't understand much about programming but I have a giant mass of data to analyze and it has to be done in Python.
Say I have two arrays:
import numpy as np
x=np.array([1,2,3,4,5,6,7,8,9,10])
y=np.array([25,18,16,19,30,5,9,20,80,45])
and say I want to choose the values in y which are greater than 17, and keep only the values in x which have the same index as the remaining values in y. For example, I want to erase the third value of y (16) and accordingly the matching value in x (3).
I tried this:
filter=np.where(y>17, 0, y)
but I don't know how to filter the x values accordingly (the actual data are much longer arrays so doing it "by hand" is basically impossible)
Solution: using @mozway's tip, now that x,y have the same length the needed code is:
import numpy as np
x=np.array([1,2,3,4,5,6,7,8,9,10])
y=np.array([25,18,16,19,30,5,9,20,80,45])
x_filtered=x[y>17]
As your question is not fully clear and you did not provide the expected output, here are two possibilities:
filtering
NumPy arrays can be sliced by an array (iterable) of booleans.
If the two arrays were the same length you could do:
x[y>17]
Here, x is longer than y so we first need to make it the same length:
import numpy as np
x=np.array([1,2,3,4,5,6,7,8,9,10])
y=np.array([25,18,16,19,30,5,9,20])
x[:len(y)][y>17]
Output: array([1, 2, 4, 5, 8])
replacement
To select between x and y based on a condition, use where:
np.where(y>17, x[:len(y)], y)
Output:
array([ 1, 2, 16, 4, 5, 5, 9, 8])
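If both arrays need to be filtered in step, the boolean mask can be stored once and reused. A minimal sketch, assuming the equal-length arrays from the edited question (mask, x_kept and y_kept are illustrative names):

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
y = np.array([25, 18, 16, 19, 30, 5, 9, 20, 80, 45])

# One boolean per element: True where the y value passes the test.
mask = y > 17

# Indexing both arrays with the same mask keeps matching positions.
x_kept = x[mask]
y_kept = y[mask]

print(x_kept.tolist())  # [1, 2, 4, 5, 8, 9, 10]
print(y_kept.tolist())  # [25, 18, 19, 30, 20, 80, 45]
```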
As someone with little experience in Numpy specifically, I wrote this answer before seeing @mozway's excellent answer for filtering. My answer works on more generic containers than Numpy's arrays, though it uses more concepts as a result. I'll attempt to explain each concept in enough detail for the answer to make sense.
TL;DR:
Please, definitely read the rest of the answer, it'll help you understand what's going on.
import numpy as np
x = np.array([1,2,3,4,5,6,7,8,9,10])
y = np.array([25,18,16,19,30,5,9,20])
filtered_x_list = []
filtered_y_list = []
for i in range(min(len(x), len(y))):
    if y[i] > 17:
        filtered_y_list.append(y[i])
        filtered_x_list.append(x[i])
filtered_x = np.array(filtered_x_list)
filtered_y = np.array(filtered_y_list)
# These lines are just for us to see what happened
print(filtered_x) # prints [1 2 4 5 8]
print(filtered_y) # prints [25 18 19 30 20]
Pre-requisite Knowledge
Python containers (lists, arrays, and a bunch of other stuff I won't get into)
Let's take a look at the line:
x = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
What's Python doing?
The first thing it's doing is creating a list:
[1, 2, 3] # and so on
Lists in Python have a few features that are useful for us in this solution:
Accessing elements:
x_list = [ 1, 2, 3 ]
print(x_list[0]) # prints 1
print(x_list[1]) # prints 2, and so on
Adding elements to the end:
x_list = [ 1, 2, 3 ]
x_list.append(4)
print(x_list) # prints [1, 2, 3, 4]
Iteration:
x_list = [ 1, 2, 3 ]
for x in x_list:
    print(x)
# prints:
# 1
# 2
# 3
Numpy arrays are slightly different: we can still access and iterate over their elements, and we can even change a value in place (x[0] = 5), but their size is fixed once they're created. They have no .append method, and operations that grow or shrink them (such as np.append or np.delete) return a new array instead of modifying the existing one.
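A quick check of which modifications a NumPy array supports may help here. This is a small sketch with made-up values: element assignment works in place, while np.append and np.delete hand back new arrays.

```python
import numpy as np

arr = np.array([1, 2, 3])

arr[0] = 10                   # changing one value in place works
longer = np.append(arr, 4)    # returns a NEW array; arr keeps its size
shorter = np.delete(arr, 1)   # also returns a new array

print(arr.tolist())            # [10, 2, 3]
print(longer.tolist())         # [10, 2, 3, 4]
print(shorter.tolist())        # [10, 3]
print(hasattr(arr, "append"))  # False: arrays have no .append method
```

Because every np.append call copies the whole array, collecting values in a list first and converting once at the end tends to be the simpler choice.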
So the filtered_x_list and the filtered_y_list are empty lists we're creating, but we're going to modify them by adding the values we care about to the end.
The second thing Python is doing is creating a numpy array, using the list to define its contents. The array constructor can take a list expressed as [...], or a list defined by x_list = [...], which we're going to take advantage of later.
A little more on iteration
In your question, for every x element, there is a corresponding y element. We want to test something for each y element, then act on the corresponding x element, too.
Since we can access the same element in both arrays using an index - x[0], for instance - instead of iterating over one list or the other, we can iterate over all indices needed to access the lists.
First, we need to figure out how many indices we're going to need, which is just the length of the lists. len(x) lets us do that - in this case, it returns 10.
What if x and y are different lengths? In this case, I chose the smallest of the two - first, do len(x) and len(y), then pass those to the min() function, which is what min(len(x), len(y)) in the code above means.
Finally, we want to actually iterate through the indices, starting at 0 and ending at len(x) - 1 or len(y) - 1, whichever is smallest. The range sequence lets us do exactly that:
for i in range(10):
    print(i)
# prints:
# 0
# 1
# 2
# 3
# 4
# 5
# 6
# 7
# 8
# 9
So range(min(len(x), len(y))) gets us the indices to iterate over, and finally this line makes sense:
for i in range(min(len(x), len(y))):
Inside this for loop, i now gives us an index we can use for both x and y.
Now, we can do the comparison in our for loop:
for i in range(min(len(x), len(y))):
    if y[i] > 17:
        filtered_y_list.append(y[i])
Then, including xs for the corresponding ys is a simple case of just appending the same x value to the x list:
for i in range(min(len(x), len(y))):
    if y[i] > 17:
        filtered_y_list.append(y[i])
        filtered_x_list.append(x[i])
The filtered lists now contain the numbers you're after. The last two lines, outside the for loop, just create numpy arrays from the results:
filtered_x = np.array(filtered_x_list)
filtered_y = np.array(filtered_y_list)
Which you might want to do, if certain numpy functions expect arrays.
While there are, in my opinion, better ways to do this (I would probably write custom iterators that produce the intended results without creating new lists), they require a somewhat more advanced understanding of programming, so I opted for something simpler.
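For the curious, here is one possible sketch of the iterator-based idea, written as a generator. The function name filter_pairs and its threshold parameter are my own illustrative choices, not something from the question. zip stops at the shorter input, which replaces the min(len(x), len(y)) bookkeeping.

```python
import numpy as np


def filter_pairs(xs, ys, threshold):
    """Yield (x, y) pairs whose y value exceeds threshold.

    zip() stops at the shorter input, so unequal lengths are fine.
    """
    for x_val, y_val in zip(xs, ys):
        if y_val > threshold:
            yield x_val, y_val


x = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
y = np.array([25, 18, 16, 19, 30, 5, 9, 20])

pairs = list(filter_pairs(x, y, 17))
filtered_x = np.array([pair[0] for pair in pairs])
filtered_y = np.array([pair[1] for pair in pairs])

print(filtered_x.tolist())  # [1, 2, 4, 5, 8]
print(filtered_y.tolist())  # [25, 18, 19, 30, 20]
```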
Given the below examples:
array = [1,2,3,4,0]
In: array[0] += 2
Out: 3
In: array[1:3] += 2
Out: TypeError: 'int' object is not iterable
In: array[1:3] += [100, 100]
Out: [3, 2, 3, 100, 100, 4, 0]
Can someone explain me why the two last examples wont return something like [1,102,103,4,0] AND if it is possible doing this with a simple slice, not using a for loop...
The slice operator refers to a sub-part of the list, so operating on it requires a list too (the "add" operator works on a list with another list, not with an int, unlike in some other languages).
Therefore the following:
array[1:3] += 2
Throws:
TypeError: 'int' object is not iterable
Because 2 is not a list (actually an iterable, which is more general than list).
But:
array[1:3] += [100, 100]
Works: it inserts the two elements into the middle of array (at index 3), right after the slice:
[3, 2, 3, 100, 100, 4, 0]
Without using a for loop, as requested
If you want to add to the values in the slice:
array = [1,2,3,4,0]
array.__setitem__(slice(1, 3), [x+2 for x in array[1:3]])
print(array)
# [1, 4, 5, 4, 0]
Which can be written also as:
array = [1,2,3,4,0]
def apply_on_slice(lst, start, end, func):
    # operate on the list that was passed in, not on the global array
    lst.__setitem__(slice(start, end), [func(x) for x in lst[start:end]])
apply_on_slice(array, 1, 3, lambda x : x + 100)
print(array)
# [1, 102, 103, 4, 0]
Using a for loop
Here are some other options to do so, elegantly:
array[1:3] = (x+2 for x in array[1:3])
Or of course, using a regular for loop, which is more efficient than using slicing twice:
for i in range(1, 3):
    array[i] += 2
You are clearly expecting the operation to be applied element-wise, as in R and other languages (and within Python, in numpy arrays and the like). E.g., adding 2 to a list would add 2 to each of the list's elements. This is not how Python lists work: each of the statements you ask about constructs one object on each side of the operator (a list or list slice, a list element, an integer), then applies the operation (just once) to these two objects. So if you "add" two lists you get concatenation, if you try to add a list and an int you get a TypeError, etc. You can read the details in @Aviv's answer.
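A short side-by-side sketch of the two behaviours, using the list from the question (the variable names are illustrative):

```python
import numpy as np

values = [1, 2, 3, 4, 0]

# Plain lists: + means concatenation, applied once to the two objects.
concatenated = values + [100, 100]

# NumPy arrays: + is applied element-wise.
elementwise = np.array(values) + 2

# Element-wise in-place update of a slice, which is what the question wanted.
arr = np.array(values)
arr[1:3] += 100

print(concatenated)          # [1, 2, 3, 4, 0, 100, 100]
print(elementwise.tolist())  # [3, 4, 5, 6, 2]
print(arr.tolist())          # [1, 102, 103, 4, 0]
```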
I'm trying to make a script, where the input is an array with random numbers. I try to delete the lowest number in the array which is no problem. But if there are several occurrences of this number in the array, how do I make sure that it is only the first occurrence of this number that gets deleted?
Let's say we have the following array:
a = np.array([2,6,2,1,6,1,9])
Here the lowest number is 1, but since it occurs two times, I only want to remove the first occurrence so I get the following array as a result:
a = np.array([2,6,2,6,1,9])
Since you're using NumPy, not native Python lists:
a = np.array([2,6,2,1,6,1,9])
a = np.delete(a, a.argmin())
print(a)
# [2 6 2 6 1 9]
np.delete: Return a new array with sub-arrays along an axis deleted.
np.argmin: Returns the indices of the minimum values along an axis.
With a NumPy array, you cannot delete elements with del as you can in a list.
A simple way to do this with a native Python list is:
>>> a = [1,2,3,4,1,2,1]
>>> del a[a.index(min(a))]
>>> a
[2, 3, 4, 1, 2, 1]
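If the original list should stay untouched, the same idea can be wrapped in a function that returns a copy. A minimal sketch; without_first_min is a hypothetical helper name:

```python
def without_first_min(values):
    """Return a copy of values with the first occurrence of the minimum removed."""
    if not values:
        return []  # nothing to remove from an empty list
    # list.index returns the position of the FIRST match only.
    idx = values.index(min(values))
    return values[:idx] + values[idx + 1:]


original = [2, 6, 2, 1, 6, 1, 9]
result = without_first_min(original)
print(result)    # [2, 6, 2, 6, 1, 9]
print(original)  # [2, 6, 2, 1, 6, 1, 9]
```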
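You can simply do two things: first sort, then shift the array. Note that this example is JavaScript rather than Python, and that sorting discards the original order of the elements. For example
var list = [2, 1, 4, 5, 1];
list.sort(); // sorts in place, result would be like this [1,1,2,4,5]
list.shift(); // removes the first element, result would be [1,2,4,5]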
I'm looking at getting values in a list with an increment.
l = [0,1,2,3,4,5,6,7]
and I want something like:
[0,4,6,7]
At the moment I am using l[0::2] but I would like sampling to be sparse at the beginning and increase towards the end of the list.
The reason I want this is because the list represents the points along a line from the center of a circle to a point on its circumference. At the moment I iterate every 10 points along the lines and draw a circle with a small radius on each. Therefore, my circles close to the center tend to overlap and I have gaps as I get close to the circle edge. I hope this provides a bit of context.
Thank you !
This can be more complicated than it sounds... You need a list of indices starting at zero and ending at the final element position in your list, presumably with no duplication (i.e. you don't want to get the same points twice). A generic way to do this would be to define the number of points you want first and then use a generator (scaled_series) that produces the required number of indices based on a function. We need a second generator (unique_ints) to ensure we get integer indices and no duplication.
def scaled_series(length, end, func):
    """ Generate a scaled series based on y = func(i), for an increasing
    function func, starting at 0, of the specified length, and ending at end
    """
    scale = float(end) / (func(float(length)) - func(1.0))
    intercept = -scale * func(1.0)
    print('scale', scale, 'intercept', intercept)
    for i in range(1, length + 1):
        yield scale * func(float(i)) + intercept

def unique_ints(series):
    # round each value to an integer index and skip consecutive duplicates
    last_n = None
    for n in series:
        if last_n is None or round(n) != round(last_n):
            yield int(round(n))
        last_n = n

L = [0, 1, 2, 3, 4, 5, 6, 7]
print([L[i] for i in unique_ints(scaled_series(4, 7, lambda x: 1 - 1 / (2 * x)))])
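In this case, the function is 1 - 1/(2x). With a length of 4 the scaled values are 0, 4.67, 6.22 and 7, which round to the indices [0, 5, 6, 7], close to the series you want ([0, 4, 6, 7]). You can play with the length (4) and the function to get the kind of spacing between the circles you are looking for.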
I am not sure what exact algorithm you want to use, but if it is non-constant, as your example appears to be, then you should consider creating a generator function to yield values:
https://wiki.python.org/moin/Generators
Depending on what your desire here is, you may want to consider a built in interpolator like scipy: https://docs.scipy.org/doc/scipy/reference/generated/scipy.interpolate.interp1d.html#scipy.interpolate.interp1d
Basically, given your question, you can't do it with the basic slice operator. Without more information this is the best answer I can give you :-)
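As one possible sketch of such a generator: the function below is my own illustration, not an algorithm from the question. It assumes a quadratic spacing rule, 1 - (1 - t)^2, which climbs quickly at first and slowly near the end, and it assumes count is at least 2.

```python
def dense_toward_end(n_items, count):
    """Yield up to `count` unique indices into a sequence of n_items elements.

    The indices are sparse near the start and dense near the end.
    Assumes count >= 2.
    """
    last = None
    for i in range(count):
        t = i / (count - 1)          # evenly spaced from 0.0 to 1.0
        frac = 1 - (1 - t) ** 2      # concave: big steps first, small steps last
        index = round(frac * (n_items - 1))
        if index != last:            # indices never decrease, so this removes duplicates
            yield index
            last = index


l = [0, 1, 2, 3, 4, 5, 6, 7]
picked = [l[i] for i in dense_toward_end(len(l), 4)]
print(picked)  # [0, 4, 6, 7]
```

Swapping the quadratic for another increasing function of t changes how quickly the samples bunch up toward the end.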
Use the slice function to create a range of indices. You can then extend your sliced list with other slices.
k = [0,1,2,3,4,5,6,7]
r = slice(0,len(k)//2,4)
t = slice(r.stop,None,1)
j = k[r]
j.extend(k[t])
print(j) #outputs: [0,4,5,6,7]
What I would do is just use list comprehension to retrieve the values. It is not possible to do it just by indexing. This is what I came up with:
l = [0, 1, 2, 3, 4, 5, 6, 7]
m = [l[0]] + [l[1+sum(range(3, s-1, -1))] for s in [x for x in range(3, 0, -1)]]
and here is a breakdown of the code into loops:
# Start the list with the first value of l (the loop does not include it)
m = [l[0]]
# Descend from 3 to 1 ([3, 2, 1])
for s in range(3, 0, -1):
    # append 1 + sum of [3], [3, 2] and [3, 2, 1]
    m.append(l[ 1 + sum(range(3, s-1, -1)) ])
Both will give you the same answer:
>>> m
[0, 4, 6, 7]
I made this graphic, which I hope will help you understand the process:
I would like to ask what the following does in Python.
It was taken from http://danieljlewis.org/files/2010/06/Jenks.pdf
I have entered comments telling what I think is happening there.
# Seems to be a function that returns a float vector
# dataList seems to be a vector of float.
# numClass seems to be an int
def getJenksBreaks( dataList, numClass ):
    # dataList seems to be a vector of float. "Sort" seems to sort it ascendingly
    dataList.sort()
    # create a 1-dimensional vector
    mat1 = []
    # "in range" seems to be something like "for i = 0 to len(dataList)+1"
    for i in range(0,len(dataList)+1):
        # create a 1-dimensional-vector?
        temp = []
        for j in range(0,numClass+1):
            # append a zero to the vector?
            temp.append(0)
        # append the vector to a vector??
        mat1.append(temp)
(...)
I am a little confused because in the pdf there are no explicit variable declarations. However I think and hope I could guess the variables.
Yes, the method append() adds elements to the end of the list. I think your interpretation of the code is correct.
But note the following:
x =[1,2,3,4]
x.append(5)
print(x)
[1, 2, 3, 4, 5]
while
x.append([6,7])
print(x)
[1, 2, 3, 4, 5, [6, 7]]
If you want something like
[1, 2, 3, 4, 5, 6, 7]
you may use extend()
x.extend([6,7])
print(x)
[1, 2, 3, 4, 5, 6, 7]
Python doesn't have explicit variable declarations. It's dynamically typed, variables are whatever type they get assigned to.
Your assessment of the code is pretty much correct.
One detail: the range function goes up to, but does not include, its second argument. So the +1 in the second argument causes the last iterated value to be len(dataList) and numClass, respectively. Because range starts at zero, the loops perform len(dataList) + 1 and numClass + 1 iterations in total, which looks suspicious.
Presumably dataList.sort() modifies the original value of dataList, which is the traditional behavior of the .sort() method.
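A small sketch confirming the in-place behaviour, with made-up data. list.sort() reorders the existing list and returns None, whereas the built-in sorted() leaves its argument alone and returns a new list:

```python
data = [3.0, 1.0, 2.0]

copy_sorted = sorted(data)  # new sorted list; data is untouched so far
snapshot = list(data)       # remember the order before sorting in place

returned = data.sort()      # sorts data itself and returns None

print(copy_sorted)  # [1.0, 2.0, 3.0]
print(snapshot)     # [3.0, 1.0, 2.0]
print(data)         # [1.0, 2.0, 3.0]
print(returned)     # None
```

So a caller of getJenksBreaks would see its own list reordered afterwards; passing a copy avoids that.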
It is indeed appending the new vector to the initial one. If you look at the full source code, there are several blocks that continue to concatenate more vectors to mat1.
append is a list method used to add a value at the end of the list.
mat1 and temp together create a 2D array (e.g. [[], [], []]), or matrix, of size m x n,
where m = len(dataList)+1 and n = numClass+1.
The resulting matrix is a zero matrix, as all of its values are 0.
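The nested append loops can also be written as a list comprehension. A minimal sketch with hypothetical sizes (rows and cols stand in for the two loop counts), together with a common pitfall of the shorter [[0] * cols] * rows spelling, which repeats one and the same row object:

```python
rows, cols = 3, 2  # hypothetical sizes, for illustration only

# Each row is a separate list, just like a fresh temp = [] per iteration.
matrix = [[0 for _ in range(cols)] for _ in range(rows)]
matrix[0][0] = 5

# Pitfall: this repeats a reference to ONE row, so a write shows up in every row.
aliased = [[0] * cols] * rows
aliased[0][0] = 5

print(matrix)   # [[5, 0], [0, 0], [0, 0]]
print(aliased)  # [[5, 0], [5, 0], [5, 0]]
```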
In Python, variables are implicitly declared. When you type this:
i = 1
i is set to a value of 1, which happens to be an integer. So we will talk of i as being an integer, although i is only a reference to an integer value. The consequence of that is that you don't need type declarations as in C++ or Java.
Your understanding is mostly correct, as for the comments. [] refers to a list. You can think of it as a linked-list (although its actual implementation is closer to std::vectors for instance).
As Python variables are only references to objects in general, lists are effectively lists of references, and can potentially hold any kind of values. This is valid Python:
# A vector of numbers
vect = [1.0, 2.0, 3.0, 4.0]
But this is perfectly valid code as well:
# The list of my objects:
list = [1, [2,"a"], True, 'foo', object()]
This list contains an integer, another list, a boolean... In Python, you usually rely on duck typing for your variable types, so this is not a problem.
Finally, one of the methods of list is sort, which sorts it in-place, as you correctly guessed, and the range function generates a range of numbers.
The syntax for x in L: ... iterates over the content of L (assuming it is iterable) and sets the variable x to each of the successive values in that context. For example:
>>> for x in ['a', 'b', 'c']:
...     print(x)
a
b
c
Since range generates a range of numbers, this is effectively the idiomatic way to generate a for i = 0; i < N; i += 1 type of loop:
>>> for i in range(4): # range(4) yields 0, 1, 2, 3
...     print(i)
0
1
2
3