I'm working with lists of lists that contain a string followed by numeric values, e.g. LofL = [["string", 4.0, 1.1, -3.0, -7.2], ["string", 2.0, -1.0, 3.3], ["string", 4.4, 5.5, -6.6, 1.1]]. For each inner list I want to average the values up to the first one that is below 0. For example, the first would be 5.1/2, since the third value is negative. In the end the list of lists should look like: LofL = [["string", 5.1/2], ["string", 2/1], ["string", 9.9/2]]. I've tried this so far:
LofL = *see above example*
avgLofL = LofL
for sublist in LofL:
    while sublist in range(1,len(sublist)) > 0.0:
        rowavg = [sum(sublist) / range(1,len(sublist)) for sublist in LofL]
for sublist in avgLofL:
    for sublist in range(1,len(sublist)):
        avgLofL.append(rowavg)
return avgLofL
It says rowavg is referenced before assignment, but when I initialize it as rowavg = 0 my list has no length. I'm unsure where I'm making a mistake.
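This is a possible solution. Note that it averages every non-negative value in a row, not only the values before the first negative one:
from statistics import mean
avgLofL = [[next(x for x in lst if isinstance(x, str)),
            mean(x for x in lst if not isinstance(x, str) and x >= 0)]
           for lst in LofL]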
Ok, I think this is what you actually asked for:
LofL = [["string", 4.0, 1.1, -3.0, -7.2],
        ["string", 2.0, -1.0, 3.3],
        ["string", 4.4, 5.5, -6.6, 1.1]]
avgLofL = []
for row in LofL:
    sublist = []
    for x in row[1:]:
        if x>=0:
            sublist.append(x)
        else:
            break
    avgLofL.append([row[0], sum(sublist)/float(len(sublist))])
print(avgLofL)
Its result seems to match the example:
[['string', 2.55], ['string', 2.0], ['string', 4.95]]
It processes one row at a time, completely. It assumes that the first element is a string which should be kept, then collects the other elements in sublist until it finds a negative one. It then calculates the average of the collection, builds and stores a "[string,average]" pair, and continues with the next row.
In its current form it will die if a negative number comes right at the start (division by zero). You can either add an explicit if somewhere, or use some dirty hack, like sum(sublist)/max(1,float(len(sublist))).
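A sketch of the explicit check mentioned above, assuming each row starts with a label followed by numbers. The function name average_leading_nonnegative is made up for this illustration, and itertools.takewhile stands in for the manual loop with break. Returning only the label when there is nothing to average is just one possible choice.

```python
from itertools import takewhile

def average_leading_nonnegative(row):
    """Return [label, mean] over the values before the first negative one."""
    label, values = row[0], row[1:]
    # takewhile stops at the first value that fails the test
    kept = list(takewhile(lambda x: x >= 0, values))
    if not kept:
        return [label]  # nothing to average, so avoid dividing by zero
    return [label, sum(kept) / len(kept)]

rows = [["string", 4.0, 1.1, -3.0, -7.2],
        ["string", -2.0, 1.0]]
print([average_leading_nonnegative(r) for r in rows])
```

The second row starts with a negative number, so it comes back as just ["string"] instead of raising ZeroDivisionError.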
def avgofList(LofL):
    avgLofL = []
    for sublist in LofL:
        total = 0
        count = 0
        for item in sublist:
            if isinstance(item, float) and item > 0:
                total += item
                count += 1
        sl = [x for x in sublist if not isinstance(x, float)]
        sl.append(total / count)
        avgLofL.append(sl)
    return avgLofL
Very similar to the answer provided by @tevemadar, except that no intermediate lists are used and negative values are skipped rather than ending the row, so a negative number at index 1 of a sublist is handled as long as a positive value follows. It does, however, assume that the sublists are not empty and that the first element is to be retained.
LofL = [["string", 4.0, 1.1, -3.0, -7.2],
        ["string", 2.0, -1.0, 3.3],
        ["string", 4.4, 5.5, -6.6, 1.1]]
def process(e):
    t, n = 0, 0
    for v in e[1:]:
        if v >= 0:
            t += v
            n += 1
        else:
            break
    return [e[0], t / n] if n > 0 else [e[0]]
result = [process(e) for e in LofL]
print(result)
Output:
[['string', 2.55], ['string', 2.0], ['string', 4.95]]
I am still a beginner in Python. I have a list of tuples to be filtered, merged and sorted.
The list looks like this:
# id, ts, val
tup = [(213,5,10.0),
(214,5,20.0),
(215,5,30.0),
(313,5,60.0),
(314,5,70.0),
(315,5,80.0),
(213,10,11.0),
(214,10,21.0),
(215,10,31.0),
(313,10,61.0),
(314,10,71.0),
(315,10,81.0),
(315,15,12.0),
(314,15,22.0),
(215,15,32.0),
(313,15,62.0),
(214,15,72.0),
(213,15,82.0)]  # and so on
Description of the list: the first column (id) can only take these 6 values: 213, 214, 215, 313, 314, 315, but in any order. The second column (ts) has the same value for every 6 rows. The third column (val) holds arbitrary floating-point values.
Now my final result should be something like this:
result = [(5,10.0,20.0,30.0,60.0,70.0,80.0),
(10,11.0,21.0,31.0,61.0,71.0,81.0),
(15,82.0,72.0,32.0,62.0,22.0,12.0)]
That is, the first column in each row is to be dropped, and there should be exactly one row for each unique value in the second column. The order within each result row should be:
(ts, val for id 213, val for id 214, val for id 215, val for id 313, val for id 314, val for id 315)
Note: I am restricted to the standard Python library, so pandas and numpy cannot be used.
I tried a lot of possibilities but couldn't solve it. Please help me do this. Thanks in advance.
You can use itertools.groupby:
from itertools import groupby
result = []
for ts, group in groupby(tup, key=lambda x: x[1]):
    # sort each group by id, then keep only the values
    vals = [x[2] for x in sorted(group, key=lambda x: x[0])]
    result.append(tuple([ts] + vals))
print(result)
Output:
[(5, 10.0, 20.0, 30.0, 60.0, 70.0, 80.0),
(10, 11.0, 21.0, 31.0, 61.0, 71.0, 81.0),
(15, 82.0, 72.0, 32.0, 62.0, 22.0, 12.0)]
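One caveat: itertools.groupby only merges consecutive items that share a key, so input that is not already grouped by ts should be sorted first. A minimal sketch with made-up sample data (the names merge_by_ts and rows are hypothetical, not from the question):

```python
from itertools import groupby

def merge_by_ts(rows):
    """Collapse (id, ts, val) rows into (ts, val, val, ...) ordered by id."""
    # Sort by ts, then id, so groupby sees each ts as one consecutive run
    ordered = sorted(rows, key=lambda r: (r[1], r[0]))
    return [(ts, *(val for _, _, val in group))
            for ts, group in groupby(ordered, key=lambda r: r[1])]

# Deliberately shuffled input: ts values are interleaved
rows = [(214, 10, 21.0), (213, 5, 10.0), (213, 10, 11.0), (214, 5, 20.0)]
print(merge_by_ts(rows))
```

Without the sort, the interleaved ts values in this sample would produce four separate groups instead of two.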
You can fix your code with a slight change. If you change i[1] in ssd[cnt] to i[1] == ssd[cnt][0], your code may work. In the else part you should also add another list to ssd, because you are starting another set of data. And if the values should come in the order of their ids, you should sort the data by (ts, id). After applying the changes:
tup.sort( key = lambda x: (x[1],x[0]) )
ssd = [[]]
cnt = 0
ssd[0].append(tup[0][1])
for i in tup:
    if i[1] == ssd[cnt][0]:
        ssd[cnt].append(i[2])
    else:
        cnt = cnt + 1
        ssd.append([])
        ssd[cnt].append(i[1])
        ssd[cnt].append(i[2])
Output
[[5, 10.0, 20.0, 30.0, 60.0, 70.0, 80.0],
[10, 11.0, 21.0, 31.0, 61.0, 71.0, 81.0],
[15, 82.0, 72.0, 32.0, 62.0, 22.0, 12.0]]
Here's a vanilla Python solution, although I do think that using groupby is more Pythonic. It has the disadvantage of building the dicts in memory, so it won't scale to a very large tup list.
It does, however, obey the ordering requirement.
from collections import defaultdict
tup = ...
tup_dict = defaultdict(dict)
# map each ts to a dict of {id: val}
for id, ts, val in tup:
    tup_dict[ts][id] = val
for tup_key in sorted(tup_dict):
    id_dict = tup_dict[tup_key]
    print(tuple([tup_key] + [id_dict[id_key] for id_key in sorted(id_dict)]))
We want to iterate on a sorted instance of your tup, unpacking the items as we go, but first we need an auxiliary variable to store the keys and a variable to store our results
keys, res = [], []
for t0, t1, t2 in sorted(tup, key=lambda x:(x[1],x[0])):
The key argument is a lambda function that instructs the sorted function to sort on the second and then the first item of each tuple. Here is the start of the loop body:
    if t1 not in keys:
        keys.append(t1)
        res.append([t1])
That is, if the second integer in the tuple has not been processed yet, we record that it is now being processed and add a new list to our result variable, starting with the value of that integer.
To finish the work on an individual tuple: we are sure there is a list in res that starts with t1, and by indexing the auxiliary variable we get the position of that list, so we can append the float to it:
    i = keys.index(t1)
    res[i].append(t2)
Putting it all together:
keys, res = [], []
for t0, t1, t2 in sorted(tup, key=lambda x:(x[1],x[0])):
    if t1 not in keys:
        keys.append(t1)
        res.append([t1])
    i = keys.index(t1)
    res[i].append(t2)
Now, in res you have a list of lists, if you really need a list of tuples you can convert with a list comprehension
res = [tuple(elt) for elt in res]
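keys.index(t1) scans the keys list on every iteration. A variant that keeps the same idea but looks rows up in a dict might look like the sketch below. The function name collect_rows and the sample data are invented for the example, and it relies on dicts preserving insertion order (Python 3.7+).

```python
def collect_rows(tup):
    """Group values by ts, ordered by id within each ts."""
    rows = {}  # ts -> [ts, val, val, ...]
    for _id, ts, val in sorted(tup, key=lambda x: (x[1], x[0])):
        # setdefault creates the row the first time a ts is seen
        rows.setdefault(ts, [ts]).append(val)
    return [tuple(row) for row in rows.values()]

sample = [(214, 5, 20.0), (213, 10, 11.0), (213, 5, 10.0), (214, 10, 21.0)]
print(collect_rows(sample))
```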
Adding to the answer of @Ahsanul Haque: the values also need to be in id order, so instead of list(g) use sorted(g, key=lambda y: y[0]). You can also build the tuple from the start:
result = []
for i, g in groupby(tup, lambda x: x[1]):
    gro = (i,) + tuple(map(lambda x: x[-1], sorted(g, key=lambda y: y[0])))
    result.append(gro)
I have a dictionary
k = {'a':[7,2,3],'b':[7,2,7], 'c': [8,9,10]}
where each value is a list. I want to delete the ith term (depending on a condition) from every value without going out of range. This is the code for it:
for i in range(len(k['a'])):
    if k['a'][i] == k['b'][i]:
        pass
    else:
        for key in k:
            del [k[key][i]]
This works and leaves a dictionary equivalent to this:
{'a':[7,2],'b':[7,2], 'c': [8,9]}
However if the dictionary was this
k = {'a':[6,2,3],'b':[7,2,7], 'c': [8,9,10]}
I get this error:
list index out of range
How do I delete the values so that I don't get this error?
The issue is that when you delete one item from each array, the size of these arrays decreases by one. Thus, the main loop becomes one iteration too long.
An illustration of the problem:
>>> a = [1, 2, 3]
>>> i = 2
>>> a[i]
3
>>> len(a)
3
>>> del [a[1]]
>>> a
[1, 3]
>>> len(a)
2
>>> a[i] # used to work
IndexError: list index out of range
In order for the index and the loop duration to work out you need to do something like this:
i = 0
while i < len(k['a']):
    if k['a'][i] == k['b'][i]:
        i += 1
    else:
        for key in k:
            del k[key][i]
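An alternative that avoids deleting while indexing is to work out which positions to keep first and then build new lists. This is only a sketch: the function name keep_matching and its first/second parameters are made up for illustration, and it assumes all lists in the dict have the same length.

```python
def keep_matching(k, first="a", second="b"):
    """Keep only the positions where k[first] and k[second] agree."""
    keep = [i for i, (x, y) in enumerate(zip(k[first], k[second])) if x == y]
    # Build new lists instead of deleting from the originals
    return {key: [vals[i] for i in keep] for key, vals in k.items()}

k = {'a': [6, 2, 3], 'b': [7, 2, 7], 'c': [8, 9, 10]}
print(keep_matching(k))
```

Because nothing is removed from the original lists, the indexes never shift and no IndexError can occur.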
I would like to store each value of price in an array, getting it from a dict. I am new to Python and have spent hours trying to figure this out...
for item in world_dict:
    if item[1] == 'House':
        price = float(item[2])
        print(price)
The output is like:
200.5
100.7
300.9
...
n+100
However, I want to store it in this format: [200.5, 100.7, 300.9, ..., n+100]
Define a list and append to it:
prices = []
for item in world_dict:
    if item[1] == 'House':
        price = float(item[2])
        prices.append(price)
print(prices)
or, you can write it in a shorter way by using list comprehension:
prices = [float(item[2]) for item in world_dict if item[1] == 'House']
print(prices)
This is a simple example of how to store values in a list (Python's built-in array-like type) using a for loop.
Hopefully it helps newcomers who search for the question above.
players = []  # define the list
# the function is defined here
def gameConfiguration():
    n = int(input("Enter Number of Players : "))
    for i in range(0, n):
        ele = input("Name : ")
        players.append(ele)  # add the element to the list
    print(players)  # print all stored names
gameConfiguration()  # starts the execution of the function
The simplest way to store the values from a for loop in a list:
ar = []
for i in range(0, 5, 1):
    x = 3.0*i+1
    ar.append(x)
print(ar)
Output:
[1.0, 4.0, 7.0, 10.0, 13.0]
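The loop above stores its values in a plain list. If a typed array from the standard array module is actually wanted, a sketch could look like this (the 'd' type code means double-precision float; the variable name values is arbitrary):

```python
from array import array

values = array('d')  # typed array that holds C doubles only
for i in range(5):
    values.append(3.0 * i + 1)

print(values)           # array object
print(values.tolist())  # converted back to a plain list
```

For most purposes a plain list is enough. array.array mainly helps when a compact, single-type buffer is needed.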
Ids = [ i for i, v in heapq.nlargest(RN, enumerate(score_real_test), key=operator.itemgetter(1))]
This gives the indexes of the RN largest values in the list score_real_test. Is it possible instead to get the indexes of the RN largest values that are in score_real_test and also satisfy a boolean condition COND?
Ids should contain RN indexes.
EDIT: for now I use this solution, but it is not the best one:
score_real_test_2 = np.sort( [ v for i,v in enumerate(score_real_test) if pred_real_test[i] == NOVEL ] )
score_real_test_2 = score_real_test_2[len(score_real_test_2)-RN:]
large_dist_ids = [i for i in range(len(score_real_test)) if score_real_test[i] in score_real_test_2]
If you want to keep this as a one-liner, you can simply replace enumerate with a list comprehension (or generator expression) that pre-filters your data based on the condition.
Like so:
Ids = [i for i, v in heapq.nlargest(RN, [(j, v) for j, v in enumerate(score_real_test) if pred_real_test[j] == NOVEL], key=operator.itemgetter(1))]
Or if you want to keep it clearer, you can add a filtering step before the selection(still no need for pre-sorting):
tmp = [item for item in enumerate(score_real_test) if pred_real_test[item[0]] == NOVEL]
Ids = [i for i, v in heapq.nlargest(RN, tmp, key=operator.itemgetter(1))]
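A self-contained run of the filtered selection, using made-up data. The value of NOVEL and the contents of score_real_test and pred_real_test are placeholders for this example, not taken from the question.

```python
import heapq
import operator

NOVEL = 1  # placeholder label, assumed for this example
score_real_test = [0.2, 0.9, 0.5, 0.7, 0.1]
pred_real_test = [1, 0, 1, 1, 1]
RN = 2

# Keep (index, score) pairs whose prediction matches, then take the top RN by score
candidates = ((i, v) for i, v in enumerate(score_real_test)
              if pred_real_test[i] == NOVEL)
Ids = [i for i, _ in heapq.nlargest(RN, candidates, key=operator.itemgetter(1))]
print(Ids)
```

The highest score (0.9 at index 1) is skipped because its prediction is not NOVEL, so the result holds the indexes of 0.7 and 0.5. If fewer than RN items satisfy the condition, heapq.nlargest simply returns fewer indexes.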