I'm trying to calculate the alcohol by volume (ABV) of some beers using values from two separate lists (which I took from a dictionary). I'm having trouble getting the values from both lists into my ABV equation (and it's probably not possible to write a for loop with an and statement like the one I have below). Is it possible to substitute values from two separate lists into the same equation in one for loop?
Right now I get a TypeError: 'bool' object is not iterable. Here's what I've tried so far:
beers = {"SG": [1.050, 1.031, 1.077, 1.032, 1.042, 1.055, 1.019, 1.089, 1.100, 1.032],
"FG": [1.010, 1.001, 1.044, 1.003, 1.003, 1.013, 1.002, 1.020, 1.056, 1.000],
"grad student 1": [5.264, 3.983, 4.101, 7.216, 2.313, 4.876, 2.255, 8.991, 5.537, 4.251],
"grad student 2": [5.211, 3.008, 4.117, 3.843, 5.168, 5.511, 3.110, 8.903, 5.538, 4.255]}
#separating the SG and FG values from the dictionary entry
SG_val = beers["SG"]
FG_val = beers['FG']
def find_abv(SG = SG_val, FG = FG_val):
abv_list = []
i = 0.0
j = 0.0
for i in SG_val and j in FG_val:
abv = (((1.05/0.79)*((i - j)/j))*100)
abv_list.append(abv)
return abv_list
find_abv()
print(abv_list)
You cannot use and to iterate over two lists in a single for loop. Use the zip function instead, which pairs up the items of both lists:
def find_abv(SG = SG_val, FG = FG_val):
    abv_list = []
    i = 0.0
    j = 0.0
    for i, j in zip(SG,FG):
        abv = (((1.05/0.79)*((i - j)/j))*100)
        abv_list.append(abv)
    return abv_list
abv_list = find_abv()
print(abv_list)
You also need to assign the result of find_abv() to a variable in order to print it; your code calls the function but discards the return value.
Also, using SG_val and FG_val inside the loop of find_abv defeats the purpose of the SG and FG parameters of your function; iterate over the parameters instead.
You can't use a for loop with and to iterate through multiple lists directly. Currently, your loop tries to iterate over the expression (SG_val and j in FG_val), which evaluates to a boolean and therefore cannot be iterated over.
If the two lists will always have the same number of items, then you could simply iterate through the indexes:
# len(SG_val) returns the length of SG_val
for i in range(len(SG_val)):
    abv = (((1.05/0.79)*((SG_val[i] - FG_val[i])/FG_val[i]))*100)
    abv_list.append(abv)
# put the return outside of the for loop so that it can finish iterating before returning the value
return abv_list
If the lists aren't always going to be the same length, write for i in range(min(len(SG_val), len(FG_val))): instead of for i in range(len(SG_val)): so that the loop stops when it reaches the end of the shorter list.
Also, to output the value returned by the function you have to assign it to something and then print it or just print it directly:
abv_list = find_abv()
print(abv_list)
# or
print(find_abv())
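As a compact variant of the zip approach, the loop can be folded into a list comprehension. This is only a sketch: passing the two lists as explicit arguments is my choice rather than part of the original code, and the sample values are the first three SG/FG pairs from the question. Because zip stops at the end of the shorter list, unequal lengths need no special handling here.

```python
def find_abv(sg_values, fg_values):
    """Return the ABV (in percent) for each (SG, FG) pair."""
    # zip pairs the items position by position and stops at the shorter list
    return [(1.05 / 0.79) * ((sg - fg) / fg) * 100
            for sg, fg in zip(sg_values, fg_values)]


sg = [1.050, 1.031, 1.077]
fg = [1.010, 1.001, 1.044]
abv_list = find_abv(sg, fg)
print([round(abv, 3) for abv in abv_list])
```

If silently dropping the unmatched tail of the longer list is not acceptable, checking len(sg) == len(fg) before the call is one way to surface the mismatch.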
I have a list of lists called new_order_list, and I am iterating through it. I would like to create sub-batches of 20 unique ids from these lists. The same id may appear in a later list, so I keep track of the ids already seen in the order_chk_lst list. If an id is repeated, I would like to skip that element and check the next one. I am assigning a unique ID to each sub-batch (of 20 elements). I have tried the following code, but I am not getting more than 20 ids. I would really appreciate your feedback. Thank you.
new_order_list
[5029339601, 5029339775, 5029338374, 5029338219, 5029339927, 5029338917, 5029338917, 5029338219, 5029339601, 5029338905, 5029339320, 5029338282, 5029338374, 5029339109, 5029339320, 5029369758, 5029338282, 5029369758, 5029368075, 5029368652, 5029339941, 5029368652, 5029369810, 5029339584, 5029339584, 5029339775, 5029369810, 5029338531, 5029368003, 5029339536, 5029340252, 5029338531, 5029339137, 5029340252, 5029368003, 5029339137, 5029339536, 5029338531, 5029367966, 5029339109, 5029338390, 5029368075, 5029339576, 5029368083, 5029338209, 5029338417, 5029338905, 5029339576, 5029339941, 5029368075, 5029339895, 5029340051, 5029368075, 5029338390, 5029370218, 5029370218, 5029338209, 5029340051, 5029339895, 5029367966, 5029338417]
[5029370469, 5029368482, 5029370383, 5029340357, 5029340357, 5029370563, 5029370469, 5029340412, 5029339528, 5029370121, 5029370121, 5029370121, 5029368482, 5029368535, 5029370563, 5029339528, 5029370328, 5029368866, 5029369260, 5029369260, 5029369326, 5029370469, 5029338175, 5029338175, 5029368535, 5029368866, 5029368248, 5029340270, 5029339842, 5029339528, 5029340287, 5029338230, 5029368248, 5029368535, 5029368866, 5029340270, 5029339513, 5029369326, 5029368528, 5029340412, 5029339842, 5029338230, 5029370469, 5029370328, 5029369961, 5029340287, 5029370563, 5029370383, 5029340476, 5029340476]
implementation
MAX_ORDER = 20
batch_id = 10000000
sub_batch_id = 10000000
for i, order in enumerate(new_order_list):
# Increment batch_id if the order reaches every MAX_ORDER
if order in order_chk_lst:
# if the id is repeated then go to the next ( I think I am making a mistake here as the value of `i` will change.
continue
order_chk_lst.append(order)
if i % MAX_ORDER == 0:
batch_id = 1
# assign sub_batch_id for each zone (i == 0 will be the first assign within the batch)
# This is my function which will assgn the batch id (I have added this for a reference)
sub_batch_assign, sub_batch_id = assign_sub_batch(zones, sub_batch_id)
# e.g. sub_batch_assign = {"1A": 10000000, "1B": 10000001, "1D": 10000002}
def assign_sub_batch(zones: list, sub_batch_id: int) -> (dict, int):
    sub_batch_assign = {}
    for zone in zones:
        sub_batch_assign[zone] = sub_batch_id
        sub_batch_id += 1
    return (sub_batch_assign, sub_batch_id)
If you want unique items, collect the ids from new_order_list into a set, which removes all the duplicates. Iterate over the list of lists and use update to add the items to order_chk_lst
order_chk_lst = set()
for lst in new_order_list:
    order_chk_lst.update(lst)
You can also change it back to a list if you really need it to be a list
order_chk_lst = list(order_chk_lst)
If the order is important, you can use the fact that dicts preserve insertion order (guaranteed since Python 3.7, and an implementation detail of CPython 3.6)
order_chk_dict = {}
for lst in new_order_list:
    order_chk_dict.update(dict.fromkeys(lst))
order_chk_lst = list(order_chk_dict.keys())
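Once the ids are de-duplicated in order, splitting them into sub-batches is a slicing step. The following is a minimal sketch, not the asker's full pipeline: the helper names unique_in_order and make_batches and the short sample data are made up for illustration, and a batch size of 3 stands in for 20 so the result is easy to check by eye.

```python
def unique_in_order(list_of_lists):
    """Flatten the lists, keeping only the first occurrence of each id."""
    seen = {}
    for lst in list_of_lists:
        seen.update(dict.fromkeys(lst))
    return list(seen)


def make_batches(ids, batch_size, first_batch_id):
    """Map consecutive batch ids to slices of at most batch_size ids."""
    batches = {}
    for offset, start in enumerate(range(0, len(ids), batch_size)):
        batches[first_batch_id + offset] = ids[start:start + batch_size]
    return batches


# Hypothetical sample data with duplicates within and across lists
new_order_list = [[11, 12, 11, 13, 14], [14, 15, 12, 16, 17]]
unique_ids = unique_in_order(new_order_list)
batches = make_batches(unique_ids, batch_size=3, first_batch_id=10000000)
for batch_id, ids in batches.items():
    print(batch_id, ids)
```

Separating de-duplication from batching avoids the problem noted in the question, where skipping a repeated id with continue throws off the i % MAX_ORDER check.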
Background: I have two catalogues containing the positions of spatial objects. My aim is to find the objects that appear in both catalogues, allowing a maximum angular distance of a given value between them. One catalogue is called bss and the other is called super.
Here is the full code I wrote:
import numpy as np
def crossmatch(bss_cat, super_cat, max_dist):
matches=[]
no_matches=[]
def find_closest(bss_cat,super_cat):
dist_list=[]
def angular_dist(ra1, dec1, ra2, dec2):
r1 = np.radians(ra1)
d1 = np.radians(dec1)
r2 = np.radians(ra2)
d2 = np.radians(dec2)
a = np.sin(np.abs(d1-d2)/2)**2
b = np.cos(d1)*np.cos(d2)*np.sin(np.abs(r1 - r2)/2)**2
rad = 2*np.arcsin(np.sqrt(a + b))
d = np.degrees(rad)
return d
for i in range(len(bss_cat)): #The problem arises here
for j in range(len(super_cat)):
distance = angular_dist(bss_cat[i][1], bss_cat[i][2], super_cat[j][1], super_cat[j][2]) #While this is supposed to produce single floating point values, it produces numpy.ndarray consisting of three entries
dist_list.append(distance) #This list now contains numpy.ndarrays instead of numpy.float values
for k in range(len(dist_list)):
if dist_list[k] < max_dist:
element = (bss_cat[i], super_cat[j], dist_list[k])
matches.append(element)
else:
element = bss_cat[i]
no_matches.append(element)
return (matches,no_matches)
When called on its own, the function angular_dist(ra1, dec1, ra2, dec2) produces a single numpy.float value as expected. But when used inside the for loop in the crossmatch(bss_cat, super_cat, max_dist) function, it produces numpy.ndarrays instead of numpy.float, as I've noted in the code comments. I don't know where the code goes wrong. Please help.
How do you get the very next list within a nested list in python?
I have a few lists:
charLimit = [101100,114502,124602]
conditionalNextQ = [101101, 101200, 114503, 114504, 124603, 124604]
response = [[100100,4]
,[100300,99]
,[1100500,6]
,[1100501,04]
,[100700,12]
,[100800,67]
,[100100,64]
,[100300,26]
,[100500,2]
,[100501,035]
,[100700,9]
,[100800,8]
,[101100,"hello"]
,[101101,"twenty"] ... ]
for question in charLimit:
for limitQuestion in response:
limitNumber = limitQuestion[0]
if question == limitNumber:
print(limitQuestion)
The above code is doing what I want, i.e. printing each list in response whose first element is one of the numbers in charLimit. However, I also want it to print the immediately following list in response.
For example, the second-to-last list in response contains 101100 (a value that's in charLimit), so I want it to print not only
101100,"hello"
(as the code does at the moment)
but the very next list also (and only the next)
101100,"hello"
101101,"twenty"
Thanks in advance for any help here. Please note that response is a very long list, so I'm looking to make things fairly efficient if possible, although it's not crucial in the context of this work. I'm probably missing something very simple, but I can't find examples of anyone doing this without using specific indexes in very small lists.
You can use enumerate to get each item's index alongside the item, which lets you look up the next item.
Ex:
charLimit = [101100,114502,124602]
conditionalNextQ = [101101, 101200, 114503, 114504, 124603, 124604]
response = [[100100,4]
,[100300,99]
,[1100500,6]
,[1100501,04]
,[100700,12]
,[100800,67]
,[100100,64]
,[100300,26]
,[100500,2]
,[100501,035]
,[100700,9]
,[100800,8]
,[101100,"hello"]
,[101101,"twenty"]]
l = len(response) - 1
for question in charLimit:
    for i, limitQuestion in enumerate(response):
        limitNumber = limitQuestion[0]
        if question == limitNumber:
            print(limitQuestion)
            if (i+1) <= l:
                print(response[i+1])
Output:
[101100, 'hello']
[101101, 'twenty']
I would eliminate the loop over charLimit and loop over response instead. Using enumerate in this loop allows us to access the next element by index, in the case that we want to print it:
for i, limitQuestion in enumerate(response, 1):
    limitNumber = limitQuestion[0]
    # use the `in` operator to check if `limitNumber` equals any
    # of the numbers in `charLimit`
    if limitNumber in charLimit:
        print(limitQuestion)
        # if this isn't the last element in the list, also
        # print the next one
        if i < len(response):
            print(response[i])
If charLimit is very long, you should consider defining it as a set instead, because sets have faster membership tests than lists:
charLimit = {101100,114502,124602}
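Putting the set lookup and the enumerate loop together, a self-contained sketch might look like the following. The function name matches_with_next and the shortened response list (including the final [124602, "last"] row) are made up for illustration. The function collects the rows instead of printing them so the result can be reused.

```python
char_limit = {101100, 114502, 124602}
response = [
    [100100, 4],
    [100300, 99],
    [101100, "hello"],
    [101101, "twenty"],
    [124602, "last"],
]


def matches_with_next(rows, wanted):
    """Collect every row whose first element is in wanted, plus the row after it."""
    found = []
    for i, row in enumerate(rows):
        if row[0] in wanted:
            found.append(row)
            # only look ahead if this is not the final row
            if i + 1 < len(rows):
                found.append(rows[i + 1])
    return found


for row in matches_with_next(response, char_limit):
    print(row)
```

This makes a single pass over response, so the cost grows with the length of response rather than with len(response) times len(charLimit).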
I'm fairly new to Pig/Python and in need of help. Trying to write a Pig Script that reconciles financial data. The parameters used follow a syntax like (grand_tot, x1, x2,... xn), meaning that the first value should equal the sum of remaining values.
I don't know of a way to accomplish this using Pig alone, so I've been trying to write a Python UDF. Pig passes a tuple to Python; if the sum of x1:xn equals grand_tot, then Python should return a "1" to Pig to show that the numbers match, otherwise it returns a "0".
Here is what I have so far:
register 'myudf.py' using jython as myfuncs;
A = LOAD '$file_nm' USING PigStorage(',') AS (grand_tot,west_region,east_region,prod_line_a,prod_line_b, prod_line_c, prod_line_d);
A1 = GROUP A ALL;
B = FOREACH A1 GENERATE TOTUPLE($recon1) as flds;
C = FOREACH B GENERATE myfuncs.isReconciled(flds) AS res;
DUMP C;
$recon1 is passed as a parameter, and defined as:
grand_tot, west_region, east_region
I will later pass $recon2 as:
grand_tot, prod_line_a, prod_line_b, prod_line_c, prod_line_d
Sample row of data (in $file_nm) looks like:
grand_tot,west_region,east_region,prod_line_a,prod_line_b, prod_line_c, prod_line_d
10000,4500,5500,900,2200,450,3700,2750
12500,7500,5000,3180,2770,300,3950,2300
9900,7425,2475,1320,460,3070,4630,1740
Lastly... here is what I'm trying to do with Python UDF code:
#outputSchema("result")
def isReconciled(arrTuple):
arrTemp = []
arrNew = []
string1 = ""
result = 0
## the first element of the Tuple should be the sum of remaining values
varGrandTot = arrTuple[0]
## create a new array with the remaining Tuple values
arrTemp = arrTuple[1:]
for item in arrTuple:
arrNew.append(item)
## sum the second to the nth values
varSum = sum(arrNew)
## if the first value in the tuple equals the sum of all remaining values
if varGrandTot = varSum then:
#reconciled to the penny
result = 1
else:
result = 0
return result
The error message I receive is:
unsupported operand type(s) for +: 'int' and 'array.array'
I've tried numerous things attempting to convert the array values into numeric and convert to float so that I can sum, but with no success.
Any ideas??? Thanks for looking!
You can do this in PIG itself.
First, specify the data types in the schema. PigStorage uses bytearray as the default data type, which is why your Python script throws the error. It looks like your sample data contains ints, but in your question you mention float.
Second, add the fields starting from the second field or the fields of your choice.
Third, use the bincond operator to check the first field value with the sum.
A = LOAD '$file_nm' USING PigStorage(',') AS (grand_tot:float,west_region:float,east_region:float,prod_line_a:float,prod_line_b:float, prod_line_c:float, prod_line_d:float);
A1 = FOREACH A GENERATE grand_tot,SUM(TOBAG(prod_line_a,prod_line_b,prod_line_c,prod_line_d)) as SUM_ALL;
B = FOREACH A1 GENERATE (grand_tot == SUM_ALL ? 1 : 0);
DUMP B;
It is very likely that your arrTuple is not a flat sequence of numbers and that at least one item is itself an array.
To check this, add some assertions to your code:
#outputSchema("result")
def isReconciled(arrTuple):
    # some checks (`long` exists in Jython / Python 2)
    tmpl = "Item # {i} shall be a number (has value {itm} of type {tp})"
    for i, itm in enumerate(arrTuple):
        msg = tmpl.format(i=i, itm=itm, tp=type(itm))
        assert isinstance(itm, (int, long, float)), msg
    # end of checks
    arrTemp = []
    arrNew = []
    string1 = ""
    result = 0
    ## the first element of the Tuple should be the sum of remaining values
    varGrandTot = arrTuple[0]
    ## create a new array with the remaining Tuple values
    arrTemp = arrTuple[1:]
    for item in arrTemp:
        arrNew.append(item)
    ## sum the second to the nth values
    varSum = sum(arrNew)
    ## if the first value in the tuple equals the sum of all remaining values
    if varGrandTot == varSum:
        #reconciled to the penny
        result = 1
    else:
        result = 0
    return result
It is very likely that this will raise an AssertionError on one of the items. Read the assertion message to learn which item is causing the trouble.
Anyway, if you want to return 1 when the first number equals the sum of the rest of the array and 0 otherwise, the following would work too:
#outputSchema("result")
def isReconciled(arrTuple):
    if arrTuple[0] == sum(arrTuple[1:]):
        return 1
    else:
        return 0
and in case you would be happy with getting True instead of 1 and False instead of 0:
#outputSchema("result")
def isReconciled(arrTuple):
    return arrTuple[0] == sum(arrTuple[1:])
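If the values may arrive untyped, or if the amounts are floats, one defensive option is to convert each item and compare with a small tolerance instead of exact equality. The sketch below is plain Python that runs outside Pig. The name is_reconciled and the tolerance parameter are my own additions, and whether float() accepts the exact object Pig hands to a Jython UDF is an assumption I have not verified. The first two tuples come from the sample rows in the question; the third is deliberately altered so that it does not reconcile.

```python
def is_reconciled(values, tolerance=0.005):
    """Return 1 if the first value equals the sum of the rest, else 0."""
    # float() accepts ints, floats and numeric strings such as "4500"
    numbers = [float(v) for v in values]
    grand_total = numbers[0]
    parts_total = sum(numbers[1:])
    # compare within a tolerance to avoid floating-point rounding surprises
    return 1 if abs(grand_total - parts_total) <= tolerance else 0


print(is_reconciled((10000, 4500, 5500)))
print(is_reconciled(("12500", "7500", "5000")))
print(is_reconciled((9900, 7425, 2400)))
```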
I am receiving the error
TypeError: 'filter' object is not subscriptable
When trying to run the following block of code
bonds_unique = {}
for bond in bonds_new:
if bond[0] < 0:
ghost_atom = -(bond[0]) - 1
bond_index = 0
elif bond[1] < 0:
ghost_atom = -(bond[1]) - 1
bond_index = 1
else:
bonds_unique[repr(bond)] = bond
continue
if sheet[ghost_atom][1] > r_length or sheet[ghost_atom][1] < 0:
ghost_x = sheet[ghost_atom][0]
ghost_y = sheet[ghost_atom][1] % r_length
image = filter(lambda i: abs(i[0] - ghost_x) < 1e-2 and
abs(i[1] - ghost_y) < 1e-2, sheet)
bond[bond_index] = old_to_new[sheet.index(image[0]) + 1 ]
bond.sort()
#print >> stderr, ghost_atom +1, bond[bond_index], image
bonds_unique[repr(bond)] = bond
# Removing duplicate bonds
bonds_unique = sorted(bonds_unique.values())
And
sheet_new = []
bonds_new = []
old_to_new = {}
sheet=[]
bonds=[]
The error occurs at the line
bond[bond_index] = old_to_new[sheet.index(image[0]) + 1 ]
I apologise that this type of question has been posted on SO many times, but I am fairly new to Python and do not fully understand dictionaries. Am I trying to use a dictionary in a way in which it should not be used, or should I be using a dictionary where I am not using it?
I know that the fix is probably very simple (albeit not to me), and I will be very grateful if someone could point me in the right direction.
Once again, I apologise if this question has been answered already
Thanks,
Chris.
I am using Python IDLE 3.3.1 on Windows 7 64-bit.
filter() in Python 3 does not return a list, but a lazy filter object (an iterator). Use the next() function on it to get the first filtered item:
bond[bond_index] = old_to_new[sheet.index(next(image)) + 1 ]
There is no need to convert it to a list, as you only use the first value.
Iterators like the filter object produce results on demand rather than all in one go. If your sheet list is very large, it might take a long time and a lot of memory to put all the filtered results into a list, but filter() only needs to evaluate your lambda condition until one of the values from sheet produces a True result to produce one output. You tell the filter object to scan through sheet for that first value by passing it to the next() function. You could do so multiple times to get multiple values, or use other tools that take iterables to do more complex things; the itertools library is full of such tools. The Python for loop is another such tool; it too takes values from an iterable one by one.
If you must have access to all filtered results together, because you have to, say, index into the results at will (e.g. because this time your algorithm needed to access index 223, index 17 then index 42) only then convert the iterable object to a list, by using list():
image = list(filter(lambda i: ..., sheet))
The ability to access any of the values of an ordered sequence of values is called random access; a list is such a sequence, and so is a tuple or a numpy array. Iterators such as the filter object do not provide random access.
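The two options can be compared side by side in a small, self-contained sketch. The sheet contents and the helper is_close are made-up stand-ins for the data in the question. Passing a default to next() is an addition of mine: without it, next() raises StopIteration when nothing matches.

```python
# Hypothetical coordinates standing in for the question's sheet list
sheet = [(0.0, 0.0), (1.0, 2.0), (1.0, 2.005), (3.0, 4.0)]
ghost_x, ghost_y = 1.0, 2.0


def is_close(point):
    """True if point lies within 1e-2 of (ghost_x, ghost_y) on both axes."""
    return abs(point[0] - ghost_x) < 1e-2 and abs(point[1] - ghost_y) < 1e-2


# next() pulls only the first match; the default (None) avoids StopIteration
first = next(filter(is_close, sheet), None)

# list() materialises every match, which allows indexing
all_matches = list(filter(is_close, sheet))

print(first)
print(all_matches[1])
```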
Wrap the filter call in list() and it works fine. That resolved the issue for me.
For example
list(filter(lambda x: x%2!=0, mylist))
instead of
filter(lambda x: x%2!=0, mylist)
image = list(filter(lambda i: abs(i[0] - ghost_x) < 1e-2 and abs(i[1] - ghost_y) < 1e-2, sheet))