Python 3 - assignments within a loop

So, I have a large number of arrays that I need a mean value for. To try and simplify I'm attempting to use a for-loop, but I'm not sure if this is possible or if it is, how I can name the left-hand of the assignment line. I've tried:
for data in data_list:
    data + _mean = np.mean(data)
This gives a syntax error; can't assign to operator. I feel like that's a lame try, but I'm very new to coding and I'm not sure what I can do on the left-hand side to make each assignment have a unique name.
Thanks for your time!

Don't try to give each value a unique name; put them in a list instead. You can use a list comprehension for this.
means = [np.mean(data) for data in data_list]
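If the reason for wanting unique names is to look each mean up later, a dictionary keyed by a label is one option. This is a minimal sketch, assuming made-up sample data and labels ("run_a", "run_b"), and using the standard library's statistics.mean in place of np.mean so that it runs without NumPy:

```python
from statistics import mean

# Hypothetical sample data; the labels and values are made up for illustration.
data_by_name = {
    "run_a": [1.0, 2.0, 3.0],
    "run_b": [10.0, 20.0],
}

# One mean per label, stored under that label instead of in a separate variable.
means = {name: mean(values) for name, values in data_by_name.items()}

print(means["run_a"])
```

With NumPy arrays as the values, np.mean(values) could be swapped in for mean(values).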

Related

How to iterate over a list while applying a regex to it in Python?

I'm trying to run this block of code.
castResultType = list(filterResultSet[1:32:5])
cleanList = []
for i in castResultType:
    cleanList.append(re.sub('[^\d\.]', '',castResultType[i]))
print(cleanList)
I'm hoping to take each item in the list castResultType (which holds specific values pulled from the list filterResultSet), apply the regex above to it, and insert the result into the empty list cleanList. I'm sure I'm doing something wrong, and would appreciate any assistance. I'm also very new to Python and programming in general, so I apologize if I'm asking something stupid.
The problem you're (probably) facing is in this line: for i in castResultType.
This line loops over the values in the list castResultType and assigns each element, not its position, to i. So when you then call castResultType[i], you're using an element as if it were an index, which fails (with a TypeError if the elements are strings).
What you probably meant to do is:
for i in range(len(castResultType)):
    # raw string so the backslash reaches the regex engine unchanged
    cleanList.append(re.sub(r'[^\d\.]', '', castResultType[i]))
# etc
What you could also do is:
for i in castResultType:
    # i is the element itself here, so no indexing is needed
    cleanList.append(re.sub(r'[^\d\.]', '', i))
# etc
There are a lot of other ways of coding what you want to do, but these are probably the most familiar for a beginner.
In addition to the bug in your looping, the problem could be in your use of regular expressions. I recommend experimenting with your regular expression on a single element from castResultType before looping through.
You can use the built-in map function to map a function to items in an iterable:
clean_list = list(map(lambda elem: re.sub(r'[^\d\.]', '', elem), castResultType))
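Another way to write the same thing is a list comprehension with a precompiled pattern. A minimal sketch, assuming the values are strings; the sample values and the name NON_NUMERIC are made up for illustration:

```python
import re

# Matches any character that is not a digit or a dot.
NON_NUMERIC = re.compile(r"[^\d.]")

# Hypothetical stand-in for castResultType.
raw_values = ["$12.50", "3 items", "v1.2"]

# Iterate over the elements directly and strip the unwanted characters.
clean_list = [NON_NUMERIC.sub("", value) for value in raw_values]

print(clean_list)  # ['12.50', '3', '1.2']
```

Testing the pattern on one element first, as suggested above, is as simple as calling NON_NUMERIC.sub("", raw_values[0]).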

Python list.remove items present in second list

I've searched around and most of the errors I see are when people are trying to iterate over a list and modify it at the same time. In my case, I am trying to take one list, and remove items from that list that are present in a second list.
import pymysql
schemaOnly = ["table1", "table2", "table6", "table9"]
db = pymysql.connect(my connection stuff)
tables = db.cursor()
tables.execute("SHOW TABLES")
tablesTuple = tables.fetchall()
tablesList = []
# I do this because there is no way to remove items from a tuple
# which is what I get back from tables.fetchall
for item in tablesTuple:
    tablesList.append(item)
for schemaTable in schemaOnly:
    tablesList.remove(schemaTable)
When I put various print statements in the code, everything looks proper, as if it is going to work. But when it gets to the actual tablesList.remove(schemaTable) I get the dreaded ValueError: list.remove(x): x not in list.
If there is a better way to do this I am open to ideas. It just seemed logical to me to iterate through the list and remove items.
Thanks in advance!
** Edit **
Everyone in the comments and the first answer is correct. The reason this is failing is because the conversion from a Tuple to a list is creating a very badly formatted list. Hence there is nothing that matches when trying to remove items in the next loop. The solution to this issue was to take the first item from each Tuple and put those into a list like so: tablesList = [x[0] for x in tablesTuple] . Once I did this the second loop worked and the table names were correctly removed.
Thanks for pointing me in the right direction!
I assume that fetchall returns tuples, one for each database row matched.
Now the problem is that the elements in tablesList are tuples, whereas schemaTable contains strings. Python does not consider these to be equal.
Thus when you attempt to call remove on tablesList with a string from schemaTable, Python cannot find any such value.
You need to inspect the values in tablesList and find a way to convert them to strings. I suspect it would be by simply taking the first element out of each tuple, but I do not have a MySQL database at hand so I cannot test that.
Regarding your question, if there is a better way to do this: Yes.
Instead of adding items to the list, and then removing them, you can append only the items that you want. For example:
for item in tablesTuple:
    # each row is a 1-tuple, so compare and store its first element
    if item[0] not in schemaOnly:
        tablesList.append(item[0])
Also, schemaOnly can be written as a set, so that each membership test takes O(1) time on average instead of O(n):
schemaOnly = {"table1", "table2", "table6", "table9"}
This will only be meaningful with big lists, but in my experience it's useful semantically.
And finally, you can write the whole thing in one list comprehension:
tablesList = [item[0] for item in tablesTuple if item[0] not in schemaOnly]
And if you don't need to keep repetitions or the original order (or if there aren't any repetitions in the first place), you can also do this:
tablesSet = {item[0] for item in tablesTuple} - schemaOnly
This also has the best big-O complexity of all these variations.
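To see the tuple-versus-string mismatch without a database, here is a minimal sketch. It assumes fetchall returns one 1-tuple per row, as the question's edit reports; the rows value is a hand-written stand-in, not real query output:

```python
# Hypothetical stand-in for tables.fetchall(): one 1-tuple per row.
rows = (("table1",), ("table2",), ("table3",), ("table6",), ("table7",))
schema_only = {"table1", "table2", "table6", "table9"}

# A 1-tuple never equals the string it contains, which is why remove() failed.
print(("table1",) == "table1")  # False

# Unwrap each row first, then filter against the set.
table_names = [row[0] for row in rows]
remaining = [name for name in table_names if name not in schema_only]

print(remaining)  # ['table3', 'table7']
```

Note that "table9" is in schema_only but not in rows; filtering ignores it silently, whereas list.remove would raise ValueError for it.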

Failing to understand simple error message

I've defined a function as below to try to interpolate between two sets of data. When I run it, I get the message:
for i, j in range(0, len(wavelength)):
TypeError: 'int' object is not iterable
I'm not sure what I'm doing wrong. Admittedly, I'm not very good at this.
def accountforfilter(wavelength, flux, filterwavelength, throughput):
    filteredwavelength=[]
    filteredflux=[]
    for i in range(0, len(wavelength)):
        if wavelength[i] in filterwavelength[j]:
            j=filterwavelength.index(wavelength[i])
            filteredwavelength.append(wavelength[i])
            filteredflux.append(flux[i]*throughput[j])
        elif wavelength[i]<filterwavelength[j]<wavelength[i+1]:
            m=(throughput[j+1]-throughput[j])/(filterwavelength[j+1]-filterwavelength[j])
            c=throughput[j]-(m*(wavelength[i]))
            filteredwavelength.append(wavelength[i])
            filteredflux.append(flux[i]*(m*wavelength[i]+c))
    return filteredwavelength, filteredflux
range() produces integers one at a time (as a list in Python 2, as a range object in Python 3). By writing for i, j in range(...) you are telling Python to unpack each item from range() into two values. But each item is a single integer, which is not iterable, so you get the error message.
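The unpacking failure can be reproduced in isolation. A minimal sketch, with made-up names; the exact wording of the error message varies between Python versions, so only the exception type is recorded here:

```python
# range(3) yields 0, 1, 2; each is a single int, so it cannot be split in two.
try:
    for i, j in range(3):
        pass
    error_type = None
except TypeError as exc:
    error_type = type(exc).__name__

print(error_type)  # TypeError

# A sequence of pairs, by contrast, unpacks cleanly into two names.
pairs = [(0, "a"), (1, "b")]
unpacked = [(i, j) for i, j in pairs]
```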
Your code also looks a bit strange.
At first it seems like you want to loop over all combinations of wavelength/filterwavelength, which would be the same as
for i in range(len(wavelength)):
    for j in range(len(filterwavelength)):
        do_stuff()
but then you are modifying the j parameter inside the loop body, which I don't understand.
Regardless, there is probably a much easier and clearer way to write the code you want. But from the current code it is hard to know what is expected (and that should probably go in a separate question).
The problem is that iterating over a range gives you one variable per iteration, like so:
for i in range(0, len(wavelength)):
You try to use two variables at once, so Python tries to unpack an integer, which is impossible. You should use the above. If you need two independent indices, use nested loops:
for i in range(0, len(...)):
    for j in range(0, len(...)):
Btw, range starts at zero by default, so you can save yourself some typing and use range(len(...)) instead.
You can use zip if you have two sets of data and want to walk through them in parallel (see: Better way to iterate over two or multiple lists at once).
for i, j in zip(set1, set2):
    print(i, j)
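For two-name loops that do work, zip and enumerate are the usual tools. A minimal sketch, assuming two equal-length lists with made-up values:

```python
# Hypothetical sample data of equal length.
wavelength = [400.0, 500.0, 600.0]
throughput = [0.1, 0.5, 0.9]

# zip pairs up the items at the same position in each list.
paired = [(w, t) for w, t in zip(wavelength, throughput)]

# enumerate yields (index, item) tuples, so unpacking into two names works.
indexed = [(i, w) for i, w in enumerate(wavelength)]

print(paired[0])   # (400.0, 0.1)
print(indexed[2])  # (2, 600.0)
```

zip stops at the shorter input, so lists of different lengths are truncated silently.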

python finding high value among slices of array

Below are snippets of code that are giving me some problems. What I am trying to do is find every occurrence of a 365-day high. To do this I am trying code similar to the one below, but I am getting an exception on the "for i" line: 'builtin_function_or_method' object has no attribute '__getitem__'
Quote = namedtuple("Quote", "Date Close Volume")
quotes = GetData() # array
newHighs = []
for i,q in range[365, len(quotes)]: #<--Exception
    max = max(xrange[i-365, i].Close) #<--i know this won't work, will fix when i get here
    if (q.Close > max):
        newHighs.append(i,q)
Any help on fixing this would be appreciated. Also any tips on implementing this in an efficient manner (since quotes array currently has 17K elements) would also be nice.
range is a function that returns a lazy range object (or a list in Python 2). Thus, it must be called as a function, range(365, len(quotes)), which will produce all the numbers from 365 up to, but not including, len(quotes).
Square brackets imply indexing, like accessing items in a list. Since range is a function, not a list, it throws an exception when you try to index it.
"range" is a function. This means you call it with parentheses, not square brackets. The same goes for "xrange" in the line below. I understand why you'd think to use the square brackets, but what "range" does is create the list from those arguments. So it's not the same as slicing elements m to n out of a list.
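Putting both points together, one way to write the loop is to call range with parentheses and use slicing for the look-back window. A minimal sketch in Python 3, assuming quotes is a list of Quote tuples; the function name find_new_highs, the window parameter, and the sample data are made up for illustration:

```python
from collections import namedtuple

Quote = namedtuple("Quote", "Date Close Volume")


def find_new_highs(quotes, window=365):
    """Return (index, quote) pairs whose Close beats the previous `window` closes."""
    new_highs = []
    for i in range(window, len(quotes)):
        # Slice out the preceding window and take its highest close.
        previous_max = max(q.Close for q in quotes[i - window:i])
        if quotes[i].Close > previous_max:
            new_highs.append((i, quotes[i]))
    return new_highs


# Tiny hypothetical data set with a 2-day window so the result is easy to check.
quotes = [Quote(day, close, 0) for day, close in enumerate([1, 3, 2, 5, 4, 6])]
high_indices = [i for i, _ in find_new_highs(quotes, window=2)]
print(high_indices)  # [3, 5]
```

This recomputes the maximum for every position, so it does roughly len(quotes) * window comparisons. For 17K quotes and a 365-day window that is a few million operations, which is usually acceptable; a sliding-window maximum would be the next step if it is not. Note also that the result of max is stored under a different name, so the built-in max is not overwritten.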

Python conciseness confuses me

I have been looking at Pandas: run length of NaN holes, and this code fragment from the comments in particular:
Series([len(list(g)) for k, g in groupby(a.isnull()) if k])
As a python newbie, I am very impressed by the conciseness but not sure how to read this. Is it short for something along the lines of
myList = []
for k, g in groupby(a.isnull()) :
    if k:
        myList.append(len(list(g)))
Series(myList)
In order to understand what is going on I was trying to play around with it but get an error:
list object is not callable
so not much luck there.
It would be lovely if someone could shed some light on this.
Thanks,
Anne
You've got the translation correct. However, the code you give cannot be run because a is a free variable.
My guess is that you are getting the error because you have assigned a list object to the name list. Don't do that, because list is the built-in name for the list type.
Also, in future please always provide a full stack trace, not just one part of it. Please also provide sufficient code that at least there are no free variables.
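The shadowing guess is easy to reproduce. A minimal sketch; the function name shadowing_demo is made up, and the shadowing is done inside a function so the built-in list stays usable afterwards:

```python
def shadowing_demo():
    # Assigning to the name "list" hides the built-in type within this scope.
    list = [1, 2, 3]
    try:
        list("abc")  # now tries to call a list object, not the list type
    except TypeError as exc:
        return str(exc)


message = shadowing_demo()
print(message)  # 'list' object is not callable
```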
If that is all of your code, then you have only a few possibilities:
myList.append is really a list
len is really a list
list is really a list
isnull is really a list
groupby is really a list
Series is really a list
The error exists somewhere behind groupby.
I'm going to go ahead and strike out myList.append (because that is impossible unless you are using your own groupby function for some reason) and Series. Unless you are importing Series from somewhere strange, or you are re-assigning the variable, we know Series can't be a list. A similar argument can be made for a.isnull.
So that leaves us with two real possibilities. Either you have re-assigned something somewhere in your script to be a list where it shouldn't be, or the error is behind groupby.
I think you're using the wrong groupby. itertools.groupby takes an array or list as its argument, whereas groupby in pandas may evaluate the first argument as a function. I especially think this because isnull() returns an array-like object.
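To see what the one-liner does without pandas, here is a minimal sketch using itertools.groupby on a plain list of booleans. The list is_missing is a made-up stand-in for a.isnull():

```python
from itertools import groupby

# Hypothetical stand-in for a.isnull(): True marks a missing value.
is_missing = [False, True, True, False, True, False, False, True, True, True]

# groupby yields (key, group) for each run of consecutive equal values.
# Keeping only the runs whose key is True gives the length of each hole.
run_lengths = [len(list(group)) for key, group in groupby(is_missing) if key]

print(run_lengths)  # [2, 1, 3]
```

Each group is an iterator, which is why it is turned into a list before len is applied.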
