values = [[3.5689651969162908, 4.664618442892583, 3.338666695570425],
[6.293153787450157, 1.1285723419142026, 10.923859694586376],
[2.052506259736077, 3.5496423448584924, 9.995488620338277],
[9.41858935127928, 10.034233496516803, 7.070345442417161]]
def flatten(values):
    new_values = []
    for i in range(len(values)):
        for v in range(len(values[0])):
            new_values.append(values[i][v])
    return new_values
v = flatten(values)
print("A 2D list contains:")
print("{}".format(values))
print("The flattened version of the list is:")
print("{}".format(v))
I am flattening the 2D list to 1D, but I can't format it. I know that v is a list, and I tried using a for loop to print it, but I still can't get the result I want. Is there any way to format the list? I want to print v with two decimal places, like this:
[3.57, 4.66, 3.34, 6.29, 1.13, 10.92, 2.05, 3.55, 10.00, 9.42, 10.03, 7.07]
I am using Eclipse and Python 3.0+.
You could use:
print(["{:.2f}".format(val) for val in v])
Note that you can flatten your list using itertools.chain:
import itertools
v = list(itertools.chain(*values))
I would use the built-in function round(), and while I was about it I would simplify your for loops:
def flatten(values):
    new_values = []
    for i in values:
        for v in i:
            new_values.append(round(v, 2))
    return new_values
How to flatten and transform the list in one line:
[round(x, 2) for b in values for x in b]
It returns a list of floats rounded to two decimal places.
Once you have v you can use a list comprehension like:
formattedList = ["%.2f" % member for member in v]
output was as follows:
['3.57', '4.66', '3.34', '6.29', '1.13', '10.92', '2.05', '3.55', '10.00', '9.42', '10.03', '7.07']
Hope that helps!
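The comprehensions above produce a list of strings, so print() shows quotes around each number. If the goal is the exact output from the question (no quotes, two decimals), one option is to join the formatted strings yourself. A minimal sketch, assuming v is the flattened list from the question; the name formatted is only illustrative:

```python
# Flattened values from the question
v = [3.5689651969162908, 4.664618442892583, 3.338666695570425,
     6.293153787450157, 1.1285723419142026, 10.923859694586376,
     2.052506259736077, 3.5496423448584924, 9.995488620338277,
     9.41858935127928, 10.034233496516803, 7.070345442417161]

# Format each float with two decimals, then build the bracketed string by hand
formatted = "[" + ", ".join("{:.2f}".format(x) for x in v) + "]"
print(formatted)
```

Unlike round(), which returns floats and therefore prints 10.0, string formatting keeps the trailing zero (10.00).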
You can first flatten the list (as described here) and then use round to solve this:
flat_list = [number for sublist in values for number in sublist]
# All numbers are in the same list now
print(flat_list)
[3.5689651969162908, 4.664618442892583, 3.338666695570425, 6.293153787450157, ..., 7.070345442417161]
rounded_list = [round(number, 2) for number in flat_list]
# The numbers are rounded to two decimals (but still floats)
print(rounded_list)
[3.57, 4.66, 3.34, 6.29, 1.13, 10.92, 2.05, 3.55, 10.0, 9.42, 10.03, 7.07]
This can be written shorter if we put the rounding directly into the list comprehension:
print([round(number, 2) for sublist in values for number in sublist])
Related
I want to turn my array, which consists of 2 lists, into a ranked list.
Currently my code produces :
[['txt1.txt' 'txt2.txt' 'txt3.txt' 'txt4.txt' 'txt5.txt' 'txt6.txt'
'txt7.txt' 'txt8.txt']
['0.13794219565502694' '0.024652340886571225' '0.09806335128916213'
'0.07663118536707426' '0.09118273488073968' '0.06278926571143634'
'0.05114729750522118' '0.02961812647701087']]
I want to make it so that txt1.txt goes with the first value, txt2 goes with the second value etc.
So something like this
[['txt1.txt', '0.13794219565502694'], ['txt2.txt', '0.024652340886571225']... etc ]]
I do not want it to become tuples by using zip.
My current code:
def rankedmatrix():
    matrix = numpy.array([names,x])
    ranked_matrix = sorted(matrix.tolist(), key=lambda score: score[1], reverse=True)
    print(ranked_matrix)
Names being :
names = ['txt1.txt', 'txt2.txt', 'txt3.txt', 'txt4.txt', 'txt5.txt', 'txt6.txt', 'txt7.txt', 'txt8.txt']
x being:
x = [0.1379422 0.01540234 0.09806335 0.07663119 0.09118273 0.06278927
0.0511473 0.02961813]
Any help is appreciated.
You can get the list of lists with zip as well:
x = [['txt1.txt', 'txt2.txt', 'txt3.txt', 'txt4.txt', 'txt5.txt', 'txt6.txt',
'txt7.txt', 'txt8.txt'], ['0.13794219565502694', '0.024652340886571225', '0.09806335128916213',
'0.07663118536707426', '0.09118273488073968', '0.06278926571143634',
'0.05114729750522118', '0.02961812647701087']]
res = [[e1, e2] for e1, e2 in zip(x[0], x[1])]
print(res)
Output:
[['txt1.txt', '0.13794219565502694'], ['txt2.txt', '0.024652340886571225'], ['txt3.txt', '0.09806335128916213'], ['txt4.txt', '0.07663118536707426'], ['txt5.txt', '0.09118273488073968'], ['txt6.txt', '0.06278926571143634'], ['txt7.txt', '0.05114729750522118'], ['txt8.txt', '0.02961812647701087']]
You can use map to convert each tuple to a list.
list(map(list, zip(names, x)))
[['txt1.txt', 0.1379422],
['txt2.txt', 0.01540234],
['txt3.txt', 0.09806335],
['txt4.txt', 0.07663119],
['txt5.txt', 0.09118273],
['txt6.txt', 0.06278927],
['txt7.txt', 0.0511473],
['txt8.txt', 0.02961813]]
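Neither answer above sorts the pairs, which the question also asks for. As a rough sketch (assuming x holds plain floats as in the second answer; pairs and ranked are illustrative names), the pairs can be built first and then sorted by score:

```python
names = ['txt1.txt', 'txt2.txt', 'txt3.txt', 'txt4.txt',
         'txt5.txt', 'txt6.txt', 'txt7.txt', 'txt8.txt']
x = [0.1379422, 0.01540234, 0.09806335, 0.07663119,
     0.09118273, 0.06278927, 0.0511473, 0.02961813]

# Pair each name with its score as a list (not a tuple)
pairs = [[name, score] for name, score in zip(names, x)]

# Sort by the score (second element), highest first
ranked = sorted(pairs, key=lambda pair: pair[1], reverse=True)
print(ranked)
```

Keeping the scores as floats matters here: numpy.array([names, x]) converts every element to a string, and strings are compared character by character rather than numerically.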
I have a sample list of data like this:
list_ = [
(['0.640', '0.630', '0.64'], ['0.61', '0.65', '0.53']),
(['20.00', '21.00', '21.00'], ['21.00', '22.00', '22.00']),
(['0.025', '0.025', '0.026'], ['0.150', '0.150', '0.130'])
]
I'm trying to merge all the lists inside each tuple into a single tuple, so the result is a list of tuples.
I would like to get a merged list as follows:
output = [
('0.640', '0.630', '0.64', '0.61', '0.65', '0.53'),
('20.00', '21.00', '21.00', '21.00', '22.00', '22.00'),
('0.025', '0.025', '0.026', '0.150', '0.150', '0.130')
]
# or
output = [
['0.640', '0.630', '0.64', '0.61', '0.65', '0.53'],
['20.00', '21.00', '21.00', '21.00', '22.00', '22.00'],
['0.025', '0.025', '0.026', '0.150', '0.150', '0.130']
]
Any help appreciated. Thanks in advance!
from itertools import chain
output = [tuple(chain.from_iterable(t)) for t in list_]
Use chain from itertools.
list comprehension
[[item for internal_list_ in tuple_ for item in internal_list_] for tuple_ in list_]
numpy
import numpy as np
np.array(list_).reshape((len(list_), -1))
output = [x[0]+x[1] for x in list_]
If you want a general solution you don't have to import itertools in this case as others have suggested. This works for n-tuples:
output = [sum([*x], []) for x in list_]
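This solution is fine when each tuple holds only a few lists. Because sum() builds a new list at every step, it becomes quadratic, and slower than itertools.chain, once a tuple holds thousands of lists.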
Question:
I have a list in the following format:
x = [["hello",0,5], ["hi",0,6], ["hello",0,8], ["hello",1,1]]
The algorithm:
Combine all inner lists with the same starting 2 values, the third value doesn't have to be the same to combine them
e.g. "hello",0,5 is combined with "hello",0,8
But not combined with "hello",1,1
The 3rd value becomes the average of the third values: sum(all 3rd vals) / len(all 3rd vals)
Note: by all 3rd vals I am referring to the 3rd value of each inner list of duplicates
e.g. "hello",0,5 and "hello",0,8 becomes hello,0,6.5
Desired output: (Order of list doesn't matter)
x = [["hello",0,6.5], ["hi",0,6], ["hello",1,1]]
Question:
How can I implement this algorithm in Python?
Ideally it would be efficient as this will be used on very large lists.
If anything is unclear let me know and I will explain.
Edit: I have tried to change the list to a set to remove duplicates, however this doesn't account for the third variable in the inner lists and therefore doesn't work.
Solution Performance:
Thanks to everyone who has provided a solution to this problem! Here
are the results based on a speed test of all the functions:
Update using running sum and count
I figured out how to improve my previous code (see original below). You can keep running totals and counts, then compute the averages at the end, which avoids recording all the individual numbers.
from collections import defaultdict
class RunningAverage:
    def __init__(self):
        self.total = 0
        self.count = 0
    def add(self, value):
        self.total += value
        self.count += 1
    def calculate(self):
        return self.total / self.count
def func(lst):
    thirds = defaultdict(RunningAverage)
    for sub in lst:
        k = tuple(sub[:2])
        thirds[k].add(sub[2])
    lst_out = [[*k, v.calculate()] for k, v in thirds.items()]
    return lst_out
print(func(x)) # -> [['hello', 0, 6.5], ['hi', 0, 6.0], ['hello', 1, 1.0]]
Original answer
This probably won't be very efficient since it has to accumulate all the values to average them. I think you could get around that by having a running average with a weighting factored in, but I'm not quite sure how to do that.
from collections import defaultdict
def avg(nums):
    return sum(nums) / len(nums)
def func(lst):
    thirds = defaultdict(list)
    for sub in lst:
        k = tuple(sub[:2])
        thirds[k].append(sub[2])
    lst_out = [[*k, avg(v)] for k, v in thirds.items()]
    return lst_out
print(func(x)) # -> [['hello', 0, 6.5], ['hi', 0, 6.0], ['hello', 1, 1.0]]
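The weighted running average mentioned in the original answer can be sketched as follows. This is only an illustration, not the answer author's code: func_incremental and stats are made-up names, and the update rule used is the standard incremental mean, new_mean = old_mean + (value - old_mean) / count.

```python
from collections import defaultdict

def func_incremental(lst):
    # key -> [count, mean]; the mean is updated in place for each new value
    stats = defaultdict(lambda: [0, 0.0])
    for first, second, value in lst:
        entry = stats[(first, second)]
        entry[0] += 1
        # Incremental mean: new_mean = old_mean + (value - old_mean) / count
        entry[1] += (value - entry[1]) / entry[0]
    return [[*key, mean] for key, (count, mean) in stats.items()]

x = [["hello", 0, 5], ["hi", 0, 6], ["hello", 0, 8], ["hello", 1, 1]]
result = func_incremental(x)
print(result)
```

Like the running sum and count version, this stores only two numbers per key. It also avoids a large running total, at the cost of one division per element.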
You can try using groupby.
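m = [["hello",0,5], ["hi",0,6], ["hello",0,8], ["hello",1,1]]
from itertools import groupby
# Group on a tuple of the first two values; a concatenated string key
# such as x[0]+str(x[1]) could collide (e.g. "a1",0 and "a",10)
key = lambda x: (x[0], x[1])
m.sort(key=key)
for i, j in groupby(m, key):
    ss = 0
    c = 0.0
    for k in j:
        ss += k[2]
        c += 1.0
    print([k[0], k[1], ss/c])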
This should be O(N), someone correct me if I'm wrong:
def my_algorithm(input_list):
    """
    :param input_list: list of lists in format [string, int, int]
    :return: list
    """
    # Dict in format (string, int): [int, count_int]
    # So our list is in this format, example:
    # [["hello",0,5], ["hi",0,6], ["hello",0,8], ["hello",1,1]]
    # so for our dict we will make keys a tuple of the first 2 values of each sublist (since that needs to be unique)
    # while values are a list of third element from our sublist + counter (which counts every time we have a duplicate
    # key, so we can divide it and get average).
    my_dict = {}
    for element in input_list:
        # key is a tuple of the first 2 values of each sublist
        key = (element[0], element[1])
        if key not in my_dict:
            # If the key does not exist, add it.
            # Value is in form of third element from our sublist + counter. Since this is first value set counter to 1
            my_dict[key] = [element[2], 1]
        else:
            # If key does exist then increment our value and increment counter by 1
            my_dict[key][0] += element[2]
            my_dict[key][1] += 1
    # we have a dict so we will need to convert it to list (and on the way calculate averages)
    return _convert_my_dict_to_list(my_dict)
def _convert_my_dict_to_list(my_dict):
    """
    :param my_dict: dict, key is in form of tuple (string, int) and values are in form of list [int, int_counter]
    :return: list
    """
    my_list = []
    for key, value in my_dict.items():
        sublist = [key[0], key[1], value[0]/value[1]]
        my_list.append(sublist)
    return my_list
my_algorithm(x)
This will return:
[['hello', 0, 6.5], ['hi', 0, 6.0], ['hello', 1, 1.0]]
While your expected return is:
[["hello", 0, 6.5], ["hi", 0, 6], ["hello", 1, 1]]
If you really need ints then you can modify the _convert_my_dict_to_list function.
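One possible way to do that modification, as a sketch only: convert_keeping_ints is a hypothetical replacement for _convert_my_dict_to_list that turns an average into an int whenever it is a whole number, and sample is a hand-built dict in the same (string, int): [sum, count] format.

```python
def convert_keeping_ints(my_dict):
    """Hypothetical variant: whole-number averages are returned as ints."""
    my_list = []
    for (name, number), (total, count) in my_dict.items():
        average = total / count  # true division always gives a float
        if average.is_integer():
            average = int(average)
        my_list.append([name, number, average])
    return my_list

# Same shape as the dict built inside my_algorithm for the example input
sample = {("hello", 0): [13, 2], ("hi", 0): [6, 1], ("hello", 1): [1, 1]}
converted = convert_keeping_ints(sample)
print(converted)
```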
Here's my variation on this theme: a groupby sans the expensive sort. I also changed the problem to make the input and output a list of tuples as these are fixed-size records:
from itertools import groupby
from operator import itemgetter
from collections import defaultdict
data = [("hello", 0, 5), ("hi", 0, 6), ("hello", 0, 8), ("hello", 1, 1)]
dictionary = defaultdict(complex)
for key, group in groupby(data, itemgetter(slice(2))):
    total = sum(value for (string, number, value) in group)
    dictionary[key] += total + 1j
array = [(*key, value.real / value.imag) for key, value in dictionary.items()]
print(array)
OUTPUT
> python3 test.py
[('hello', 0, 6.5), ('hi', 0, 6.0), ('hello', 1, 1.0)]
>
Thanks to @wjandrea for the itemgetter replacement for lambda. (And yes, I am using complex numbers in passing for the average to track the total and count.)
I have two lists and they are lists of tuples.
For example
List1 = [('zaidan', 0.0013568521031207597),('zimmerman', 0.0013568521031207597), ('ypa', 0.004070556309362279)]
List2 = [('zimmerman', 0.0013568521031207597), ('ypa', 0.004070556309362279), ('zaidan', 0.0013568521031207597)]
If the items were in the same order I could use the following code to multiply the two values:
val = [(t1, v1*v2) for (t1, v1), (t2, v2) in zip(tf,idf)]
But my issue is that the order of one of the lists is random, so the code doesn't work. Essentially, I need to check whether the word in one list matches the word in the other and then multiply the values, producing a list of tuples in the same format.
This question excellently demonstrates the advantages of the dictionary data structure and how your problem could benefit from it. So first, we convert your list of tuples to dictionaries (dict-calls) and then you "combine" the two dicts as per your requirement to get the desired result.
lst1 = [('zaidan', 0.0013568521031207597),('zimmerman', 0.0013568521031207597), ('ypa', 0.004070556309362279)]
lst2 = [('zimmerman', 0.0013568521031207597), ('ypa', 0.004070556309362279), ('zaidan', 0.0013568521031207597)]
dct1 = dict(lst1)
dct2 = dict(lst2)
res = {k: v * dct2.get(k, 1) for k, v in dct1.items()}.items()
which produces:
dict_items([('zaidan', 1.8410476297432288e-06), ('zimmerman', 1.8410476297432288e-06), ('ypa', 1.656942866768906e-05)])
And if the dict_items data type is confusing, you can always cast it to a plain list.
res = list(res)
print(res)
# [('zaidan', 1.8410476297432288e-06), ('zimmerman', 1.8410476297432288e-06), ('ypa', 1.656942866768906e-05)]
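Note that dct2.get(k, 1) silently multiplies by 1 when a word is missing from the second list. If unmatched words should be dropped instead, a variation along these lines might work. This is a sketch under that assumption; the 'extra' entry is made up purely to show the filtering.

```python
lst1 = [('zaidan', 0.0013568521031207597), ('zimmerman', 0.0013568521031207597),
        ('ypa', 0.004070556309362279), ('extra', 0.5)]  # 'extra' is a made-up entry
lst2 = [('zimmerman', 0.0013568521031207597), ('ypa', 0.004070556309362279),
        ('zaidan', 0.0013568521031207597)]

dct1 = dict(lst1)
dct2 = dict(lst2)

# Keep only words present in both dictionaries, in the order of the first list
res = [(word, value * dct2[word]) for word, value in dct1.items() if word in dct2]
print(res)
```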
The easiest solution, if both lists contain the same words, is to sort them first:
ls1 = sorted(ls1, key=lambda tup: tup[0])
ls2 = sorted(ls2, key=lambda tup: tup[0])
val = [(t1, v1*v2) for (t1, v1), (t2, v2) in zip(ls1,ls2)]
If, for any reason, you do not want to use dictionary (although it is a superior solution) but want to do this with lists and tuples, what you are looking for is looping through the lists and checking for equality:
x = [('zaidan', 0.0013568521031207597),('zimmerman', 0.0013568521031207597), ('ypa', 0.004070556309362279)]
y = [('zimmerman', 0.0013568521031207597), ('ypa', 0.004070556309362279), ('zaidan', 0.0013568521031207597)]
z = []
for item in x:
    for _item in y:
        if item[0] == _item[0]:
            z.append((item[0], item[1]*_item[1]))
At the end, z will be a list of tuples with the original string at the 0 index and the result of multiplication at the 1 index.
I am having two lists as follows:
list_1
['A-1','A-1','A-1','A-2','A-2','A-3']
list_2
['iPad','iPod','iPhone','Windows','X-box','Kindle']
I would like to split the list_2 based on the index values in list_1. For instance,
list_a1
['iPad','iPod','iPhone']
list_a2
['Windows','X-box']
list_a3
['Kindle']
I know the index method, but it requires passing the value to match. In this case, I would like to dynamically find the indexes of the elements in list_1 that share the same value. Is this possible? Any tips/hints would be deeply appreciated.
Thanks.
There are a few ways to do this.
I'd do it by using zip and groupby.
First:
>>> list(zip(list_1, list_2))
[('A-1', 'iPad'),
('A-1', 'iPod'),
('A-1', 'iPhone'),
('A-2', 'Windows'),
('A-2', 'X-box'),
('A-3', 'Kindle')]
Now:
>>> import itertools, operator
>>> [(key, list(group)) for key, group in
... itertools.groupby(zip(list_1, list_2), operator.itemgetter(0))]
[('A-1', [('A-1', 'iPad'), ('A-1', 'iPod'), ('A-1', 'iPhone')]),
('A-2', [('A-2', 'Windows'), ('A-2', 'X-box')]),
('A-3', [('A-3', 'Kindle')])]
So, you just want each group, ignoring the key, and you only want the second element of each pair in the group. You can get those second elements with another comprehension, or just by unzipping:
>>> [list(zip(*group))[1] for key, group in
... itertools.groupby(zip(list_1, list_2), operator.itemgetter(0))]
[('iPad', 'iPod', 'iPhone'), ('Windows', 'X-box'), ('Kindle',)]
I would personally find this more readable as a sequence of separate iterator transformations than as one long expression. Taken to the extreme:
>>> ziplists = zip(list_1, list_2)
>>> pairs = itertools.groupby(ziplists, operator.itemgetter(0))
>>> groups = (group for key, group in pairs)
>>> values = (list(zip(*group))[1] for group in groups)
>>> [list(value) for value in values]
… but a happy medium of maybe 2 or 3 lines is usually better than either extreme.
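As an illustration of that middle ground, here is a version in two working lines. It uses a nested comprehension instead of unzipping, which is a small departure from the code above; grouped and result are illustrative names.

```python
import itertools
import operator

list_1 = ['A-1', 'A-1', 'A-1', 'A-2', 'A-2', 'A-3']
list_2 = ['iPad', 'iPod', 'iPhone', 'Windows', 'X-box', 'Kindle']

# Group the (key, value) pairs by key; groupby only merges adjacent equal keys
grouped = itertools.groupby(zip(list_1, list_2), key=operator.itemgetter(0))
# Keep only the value from each pair within each group
result = [[value for _, value in group] for _, group in grouped]
print(result)
```

Because groupby only groups consecutive items, this relies on list_1 already having equal keys next to each other, as in the question.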
Usually I'm the one rushing to a groupby solution ;^) but here I'll go the other way and manually insert into an OrderedDict:
list_1 = ['A-1','A-1','A-1','A-2','A-2','A-3']
list_2 = ['iPad','iPod','iPhone','Windows','X-box','Kindle']
from collections import OrderedDict
d = OrderedDict()
for code, product in zip(list_1, list_2):
    d.setdefault(code, []).append(product)
produces a d looking like
>>> d
OrderedDict([('A-1', ['iPad', 'iPod', 'iPhone']),
('A-2', ['Windows', 'X-box']), ('A-3', ['Kindle'])])
with easy access:
>>> d["A-2"]
['Windows', 'X-box']
and we can get the list-of-lists in list_1 order using .values():
>>> list(d.values())
[['iPad', 'iPod', 'iPhone'], ['Windows', 'X-box'], ['Kindle']]
If you've noticed that no one is telling you how to make a bunch of independent lists with names like list_a1 and so on-- that's because that's a bad idea. You want to keep the data together in something which you can (at a minimum) iterate over easily, and both dictionaries and list of lists qualify.
Maybe something like this?
#!/usr/local/cpython-3.3/bin/python
import pprint
import collections
def main():
    list_1 = ['A-1','A-1','A-1','A-2','A-2','A-3']
    list_2 = ['iPad','iPod','iPhone','Windows','X-box','Kindle']
    result = collections.defaultdict(list)
    for list_1_element, list_2_element in zip(list_1, list_2):
        result[list_1_element].append(list_2_element)
    pprint.pprint(result)
main()
Using itertools.zip_longest (izip_longest in Python 2) and itertools.groupby:
>>> from itertools import groupby, zip_longest
First group items of list_1 and find the starting index of each group:
>>> inds = [next(g)[0] for k, g in groupby(enumerate(list_1), key=lambda x:x[1])]
>>> inds
[0, 3, 5]
Now use slicing and zip_longest as we need pairs list_2[0:3], list_2[3:5], list_2[5:]:
>>> [list_2[x:y] for x, y in zip_longest(inds, inds[1:])]
[['iPad', 'iPod', 'iPhone'], ['Windows', 'X-box'], ['Kindle']]
To get a dict of lists you can do something like:
>>> inds = [next(g) for k, g in groupby(enumerate(list_1), key=lambda x:x[1])]
>>> {k: list_2[ind1: ind2[0]] for (ind1, k), ind2 in
...  zip_longest(inds, inds[1:], fillvalue=[None])}
{'A-1': ['iPad', 'iPod', 'iPhone'], 'A-2': ['Windows', 'X-box'], 'A-3': ['Kindle']}
You could do this if you want simple code. It's not pretty, but it gets the job done.
list_1 = ['A-1','A-1','A-1','A-2','A-2','A-3']
list_2 = ['iPad','iPod','iPhone','Windows','X-box','Kindle']
list_1a = []
list_1b = []
list_1c = []
place = 0
for i in list_1[::1]:
    if list_1[place] == 'A-1':
        list_1a.append(list_2[place])
    elif list_1[place] == 'A-2':
        list_1b.append(list_2[place])
    else:
        list_1c.append(list_2[place])
    place += 1