I am trying to run through a list and delete elements that do not meet a certain threshold, but I am receiving the error 'float' object does not support item deletion when I try to delete.
Why am I getting this error? Is there any way to delete items from lists like this for floats?
Relevant Code:
def remove_abnormal_min_max(distances, avgDistance):
    # Define cut off for abnormal roots
    cutOff = 0.20 * avgDistance  # 20 percent of avg distance
    for indx, distance in enumerate(distances):  # for all the distances
        if(distance <= cutOff):  # if the distance between min and max is less than or equal to cutOff point
            del distance[indx]  # delete this distance from the list
    return distances
Your list of float values is called distances (plural); each individual float value from that sequence is called distance (singular).
You are trying to use the latter rather than the former: del distance[indx] fails because distance is a single float value, not the list object.
All you need to do is add the missing s:
del distances[indx]
# ^
However, now you are modifying the list in place, shortening it as you loop. This'll cause you to miss elements; items that were once at position i + 1 are now at i while the iterator happily continues at i + 1.
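A minimal demonstration of the skipping, with made-up values:

vals = [1, 1, 5]
for i, v in enumerate(vals):
    if v == 1:
        del vals[i]  # the second 1 shifts into the slot the iterator already passed
print(vals)  # [1, 5] -- one of the 1s survived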
The work-around to that is to build a new list object with everything you wanted to keep instead:
distances = [d for d in distances if d > cutOff]
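With that fix, the asker's whole function reduces to (same names as in the question):

def remove_abnormal_min_max(distances, avgDistance):
    cutOff = 0.20 * avgDistance  # 20 percent of avg distance
    return [d for d in distances if d > cutOff]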
You mentioned in your comment that you need to reuse the index of the deleted distance. You can build a list of all the indxs you need at once using a list comprehension:
indxs = [k for k,d in enumerate(distances) if d <= cutOff]
And then you can iterate over this new list to do the other work you need. Iterate in reverse order, so that each deletion does not shift the positions of the indexes you still have to process:
for indx in reversed(indxs):
    del distances[indx]
    del otherlist[2*indx:2*indx+2]  # or whatever
You may also be able to massage your other work into another list comprehension:
indxs = [k for k, d in enumerate(distances) if d > cutOff]  # note reversed logic
distances = [distances[indx] for indx in indxs]  # one statement, so it doesn't fall into the modify-as-you-iterate trap
otherlist = [otherlist[i] for indx in indxs for i in (2*indx, 2*indx + 1)]
As an aside, if you are using NumPy, a numerical and scientific computing package for Python, you can take advantage of boolean arrays and what NumPy calls boolean (mask) indexing, and filter your array directly:
import numpy as np

distances = np.array(distances)  # convert to a NumPy array so we can use boolean indexing
keep = distances > cutOff        # boolean mask: True for every value to keep
distances = distances[keep]      # this won't work on a regular Python list
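If otherlist really does hold two entries per distance (an assumption based on the 2*indx pattern above), the same boolean mask can filter it as well; a rough sketch:

otherlist = np.array(otherlist).reshape(-1, 2)  # assumed layout: one pair of entries per distance
otherlist = otherlist[keep].ravel()             # keep the pairs for surviving distances, flatten back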
I have a rather complex list of dictionaries with nested dictionaries and arrays. I am trying to figure out a way to either:
- make the list of data less complicated and then loop through the raster points, or
- find a way to loop through the array of raster points as is.
What I am ultimately trying to do is loop through all raster points within each polygon and perform a simple greater-than or less-than comparison on the value assigned to that raster point (the values are elevation values): if greater than a given value, assign 1; if less than the given value, assign 0. I would then create a separate array of these 1s and 0s, from which I can get an average value.
I have found all these points (allpoints within pts), but they are in arrays within a dictionary within another dictionary within a list (of all polygons), at least I think; I could be wrong about the organization, as dictionaries are rather new to me.
The following is my code:
from rasterstats import zonal_stats  # presumably; the answer below assumes the rasterstats package
import numpy as np

def mystat(x):
    mystat = dict()
    mystat['allpoints'] = x
    return mystat

stats = zonal_stats('acp.shp', 'myGeoTIFF.tif')
pts = zonal_stats('acp.shp', 'myGeoTIFF.tif', add_stats={'mystat': mystat})
Link to my documents. Any help or direction would be greatly appreciated!
I assume you are using the rasterstats package. You could try something like this:
threshold_value = 15  # You may change this threshold value to yours

for o_idx in range(len(pts)):
    data = pts[o_idx]['mystat']['allpoints'].data
    for d_idx in range(len(data)):
        for p_idx in range(len(data[d_idx])):
            # You may change the conditions below as you want
            if data[d_idx][p_idx] > threshold_value:
                data[d_idx][p_idx] = 1
            elif data[d_idx][p_idx] <= threshold_value:
                data[d_idx][p_idx] = 0
This is going to update the data in place within the pts list.
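Since the allpoints value is the (masked) NumPy array that rasterstats hands to the custom stat, the comparison can also be vectorized instead of looping point by point; a minimal sketch, assuming that layout:

import numpy as np

fractions = []
for entry in pts:
    arr = entry['mystat']['allpoints']
    ones_and_zeros = (arr > threshold_value).astype(int)  # True/False -> 1/0
    entry['mystat']['allpoints'] = ones_and_zeros
    fractions.append(ones_and_zeros.mean())  # the per-polygon average of the 1s and 0s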
I am trying to build a sort of battery meter, where one program collects voltage samples and appends them to an array. My idea is to collect a lot of data while the battery is full, and then build a function that compares this data with the average of the last 100 or so voltage readings; new readings are added every few seconds for as long as I don't interrupt the process.
I am using matplotlib to show the voltage output, and so far it is working fine; I posted an answer here on live changing graphs.
The voltage function looks like this:
pullData = open("dynamicgraph.txt", "r").read()  # values are stored here by another function
dataArray = pullData.split('\n')
xar = []
yar = []
averagevoltage = 0
for eachLine in dataArray:
    if len(eachLine) >= 19:
        x, y = eachLine.split(',')
        xar.append(np.int64(x))  # a datetime value
        yar.append(float(y))     # the reading
ax1.clear()
ax1.plot(xar, yar)
ax1.set_ylim(ymin=25, ymax=29)
if len(yar) > 1:
    plt.title("Voltage: " + str(yar[-1]) + " Average voltage: " + str(np.mean(yar)))
I am just wondering what the syntax for getting the average of the last x numbers of the array should look like:
if len(yar) > 100:
    # get average of last 100 values only
It's a rather simple problem, assuming you're using NumPy, which provides easy functions for averaging:
import numpy as np

array = np.random.rand(200, 1)  # 200 random values standing in for the readings
last100 = array[-100:]          # Retrieve last 100 entries
print(np.average(last100))      # Get the average of them
If you want to convert a plain Python list to a NumPy array, you can do it with:
np.array(<your-array-goes-here>)
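Applied to the question's own variables, the title line could then become (a sketch reusing yar and plt from the question):

if len(yar) > 100:
    plt.title("Voltage: " + str(yar[-1]) + " Average voltage: " + str(np.mean(yar[-100:])))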
Use slice notation with a negative index to get the n last items in a list.
yar[-100:]
If the slice is larger than the list, the entire list will be returned.
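A quick demonstration of that edge case:

short = [1, 2, 3]
print(short[-100:])  # [1, 2, 3] -- slicing past the start never raises an IndexError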
I don't think you even need to use numpy. You can access the last 100 elements by slicing your array as follows:
l = yar[-100:]
This returns all elements at indices from -100 (the 100th-to-last element) to -1 (the last element). Then, you can just use native Python functions as follows.
mean = sum(l) / len(l)
sum(l) returns the sum of all values in the list, and len(l) returns the length of the list.
You could use the statistics module from the Python standard library:
import statistics

statistics.mean(your_data_list[-n:])  # n = the n newest numbers
I'm looking for a better, faster way to center a couple of lists. Right now I have the following:
import random

m = list(range(2000))
sm = sorted(random.sample(range(100000), 16000))
si = random.sample(range(16005), 16000)

# Centered array.
smm = []

print(sm)
print(si)

for i in m:
    if i in sm:
        smm.append(si[sm.index(i)])
    else:
        smm.append(None)

print(m)
print(smm)
This in effect creates a list (m) containing a range of numbers to center against, another list (sm) against which m is centered, and a list of values (si) to append.
This sample runs fairly quickly, but when I run a larger task with many more variables, performance slows to a standstill.
Your main loop contains this infamous line:
if i in sm:
It looks harmless, but since sm is the result of sorted() it is a list, so the in test is an O(n) lookup, which explains why it's slow with a big dataset.
Moreover, you're using the even more infamous si[sm.index(i)], which makes your algorithm O(n**2).
Since you need the indexes, using a set is not so easy, but there is a better option: since sm is sorted, you can use bisect to find the index in O(log n), like this:
import bisect

for i in m:
    j = bisect.bisect_left(sm, i)
    smm.append(si[j] if (j < len(sm) and sm[j] == i) else None)
A small explanation: bisect gives you the insertion point of i in sm. That doesn't mean the value is actually in the list, so we have to check (first that the returned index lies within the list's range, then that the value at that index is the value we searched for); if so, we append si[j], else we append None.
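If you'd rather avoid bisect, a plain dict built in one O(n) pass gives O(1) lookups and still keeps the indexes; a sketch using the same variable names:

pos = {v: k for k, v in enumerate(sm)}  # value -> its index in sm
smm = [si[pos[i]] if i in pos else None for i in m]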
Regardless of whether this is the most efficient way to structure this sorting algorithm in Python (it's not), my understanding of indexing requirements and of the built-in min function fails to account for the following error in the following code:
Error:
builtins.IndexError: list index out of range
Here's the code:
# Create function to sort arrays with numeric entries in increasing order
def selection_sort(arr):
    arruns = arr  # pool of unsorted array values, initially the same as 'arr'
    indmin = 0    # initialize arbitrary value for indmin.
                  # indmin is the index of the minimum value of the entries in arruns
    for i in range(0, len(arr)):
        if i > 0:  # after the first looping cycle
            del arruns[indmin]  # remove the entry that has been properly sorted
                                # from the pool of unsorted values
        while arr[i] != min(arruns):
            indmin = arruns.index(min(arruns))  # get index of min value in arruns
            arr[i] = arruns[indmin]

# Example case
x = [1, 0, 5, 4]  # simple array to be sorted
selection_sort(x)
print(x)  # The expectation is: [0, 1, 4, 5]
I've looked at a couple other index error examples and have not been able to attribute my problem to anything occurring while entering/exiting my while loop. I thought that my mapping of the sorting process was sound, but my code even fails on the simple array assigned to x above. Please help if able.
arr and arruns are the same list. You are removing items from the list, decreasing its size, while the loop's range still runs up to the original length of arr.
Fix:
arruns = [] + arr
This creates a new list for arruns (arr[:] or list(arr) would do the same), so deleting from arruns no longer shrinks arr.
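For completeness, here is the asker's function with only that one line changed; it now produces the expected output:

def selection_sort(arr):
    arruns = [] + arr  # a copy, so deleting from arruns no longer mutates arr
    indmin = 0
    for i in range(0, len(arr)):
        if i > 0:
            del arruns[indmin]
        while arr[i] != min(arruns):
            indmin = arruns.index(min(arruns))
            arr[i] = arruns[indmin]

x = [1, 0, 5, 4]
selection_sort(x)
print(x)  # [0, 1, 4, 5]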
This is what my code looks like when simplified:
# This function returns some value depending on the index (integer)
# with which it is called.
def funct(index):
    value = some_process[index]  # placeholder for however the value is actually produced
    # Return value for this index.
    return value
where the indexes allowed are stored in a list:
# List of indexes.
x = [0,1,2,3,...,1000]
I need to find the x index that returns the minimum value for funct. I could just apply a brute-force approach and loop through the full x list, storing all values in a new list, and then simply find the minimum with np.argmin():
import numpy as np

list_of_values = []
for indx in x:
    f_x = funct(indx)
    list_of_values.append(f_x)
min_value = np.argmin(list_of_values)  # note: argmin returns the index of the minimum, not the value
I've tried this and it works, but it becomes quite expensive when x has too many elements. I'm looking for a way to optimize this process.
I know that scipy.optimize has some optimization functions to find a global minimum like anneal and basin-hopping but I've failed to correctly apply them to my code.
Can these optimization tools be used when I can only call a function with an integer (the index) or do they require the function to be continuous?
The Python builtin min accepts a key function:
min_idx = min(x, key=funct)
min_val = funct(min_idx)
This gives you an O(n) solution implemented about as well as you're going to get in Python.
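A quick usage sketch with a made-up funct whose minimum is known in advance:

def funct(index):
    return (index - 700) ** 2  # hypothetical stand-in for the real computation

x = list(range(1001))
min_idx = min(x, key=funct)
print(min_idx, funct(min_idx))  # 700 0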