I want to make one large list for entering into a database with values from 4 different lists. I want it to be like
[[list1[0], list2[0], list3[0], list4[0]], [list1[1], list2[1], list3[1], list4[1]], etc.....]
Another issue is that currently the data is received like this:
[ [ [list1[0], list1[1], [list1[3]]], [[list2[0]]], etc.....]
I've tried looping through each list using indexes and adding the items to a new list based on those, but it hasn't worked. I'm pretty sure it didn't work because some of the lists are different lengths (they're not meant to be, but it's automated data, so sometimes there's a mistake).
Anyone know what's the best way to go about this? Thanks.
The first list can be constructed using the zip function as follows (for 4 lists):
list1 = [1,2,3,4]
list2 = [5,6,7,8]
list3 = [9,10,11,12]
list4 = [13,14,15,16]
res = list(zip(list1,list2,list3,list4))
For an arbitrary number of lists stored in another list, you can use the * notation to unpack the outer list:
lists = [...]
res = list(zip(*lists))
To build the list of lists to zip from the data in your second issue, flatten each sublist first and then zip:
def flatten(l):
    res = []
    for el in l:
        if isinstance(el, list):
            # nested list: flatten it recursively and splice in the result
            res += flatten(el)
        else:
            res.append(el)
    return res
auto_data = [...]
res = list(zip(*[flatten(el) for el in auto_data]))
Some clarification at the end:
zip produces a result only as long as the shortest of its inputs, so to avoid losing data you need to extend the flattened lists in the list comprehension on the last line of code to a common length before zipping.
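One way to keep the trailing values, sketched here under the assumption that a missing entry may be stored as None (the sample lists and the fill value are made up for illustration), is itertools.zip_longest from the standard library (Python 3; Python 2 calls it izip_longest):

```python
from itertools import zip_longest

# Hypothetical sample data: list2 is one item short, as can happen
# with the automated data described in the question.
list1 = [1, 2, 3]
list2 = [4, 5]
list3 = [7, 8, 9]

# zip() stops at the shortest input, so the last row is dropped.
truncated = [list(row) for row in zip(list1, list2, list3)]

# zip_longest() runs to the longest input and fills gaps with fillvalue.
padded = [list(row) for row in zip_longest(list1, list2, list3, fillvalue=None)]

print(truncated)
print(padded)
```

Whether None is an acceptable placeholder depends on what the database column allows.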
So if I understand correctly, this is your input:
l = [[1.1,1.2,1.3,1.4],[2.1,2.2,2.3,2.4],[3.1,3.2,3.3,3.4],[4.1,4.2,4.3,4.4]]
and you would like to have this output
[[1.1,2.1,3.1,4.1],...]
If so, this could be done by using zip
zip(*l)
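Note that zip yields tuples (and in Python 3 it is a lazy iterator), while the output in the question is a list of lists. A minimal sketch of the conversion, assuming Python 3 and reusing the input above:

```python
l = [[1.1, 1.2, 1.3, 1.4],
     [2.1, 2.2, 2.3, 2.4],
     [3.1, 3.2, 3.3, 3.4],
     [4.1, 4.2, 4.3, 4.4]]

# zip(*l) yields one tuple per position; list(...) turns each into a list.
rows = [list(column) for column in zip(*l)]

print(rows[0])
```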
Make a for loop that gives you only a counter variable, and use that variable to index the lists. Make a temporary list, fill it with the values from the other lists, and add it to the final list. With this you will get the desired structure.
nestedlist = []
for counter in range(0, x):  # x is the number of rows to build
    temporarylist = []
    temporarylist.append(firstlist[counter])
    temporarylist.append(secondlist[counter])
    temporarylist.append(thirdlist[counter])
    temporarylist.append(fourthlist[counter])
    nestedlist.append(temporarylist)
If all 4 lists are the same length, you can use this code to make it even nicer.
nestedlist = []
for counter in range(0, len(firstlist)):  # changed line
    temporarylist = []
    temporarylist.append(firstlist[counter])
    temporarylist.append(secondlist[counter])
    temporarylist.append(thirdlist[counter])
    temporarylist.append(fourthlist[counter])
    nestedlist.append(temporarylist)
This comprehension should work, with a little help from zip:
mylist = [i for i in zip(list1, list2, list3, list4)]
But this assumes all the lists are of the same length. If that's not the case (or you're not sure of that), you can "pad" them first so they are the same length.
def padlist(some_list, desired_length, pad_with):
    # note: this appends to some_list in place
    while len(some_list) < desired_length:
        some_list.append(pad_with)
    return some_list
list_of_lists = [list1, list2, list3, list4]
maxlength = len(max(list_of_lists, key=len))
list_of_lists = [padlist(l, maxlength, 0) for l in list_of_lists]
Now run the comprehension from above on the padded lists; it worked well in my testing:
mylist = [i for i in zip(*list_of_lists)]
If the flatten concept doesn't work, try this out:
import numpy as np
myArray = np.array([[list1[0], list2[0], list3[0], list4[0]], [list1[1], list2[1], list3[1], list4[1]]])
np.hstack(myArray)
This should also work:
# iterating over the 2-D array yields 1-D rows, which only have axis 0
np.concatenate(myArray, axis=0)
Just for those who are searching for a solution to this problem when the lists are of the same length:
def flatten(lists):
    results = []
    for numbers in lists:
        for output in numbers:
            results.append(output)
    return results
print(flatten(n))  # n is your list of lists
I want to join multiple lists inside a list into a single comma-separated string. I am doing it with a loop, but no comma is inserted between the last item of one sublist and the first item of the next.
myList = [['a','b','c','d'],['a','b','c','d']]
myString = ''
for x in myList:
    myString += ",".join(x)
print myString
output:
a,b,c,da,b,c,d
desired output:
a,b,c,d,a,b,c,d
This can be done using a list comprehension where you will "flatten" your list of lists in to a single list, and then use the "join" method to make your list a string. The ',' portion indicates to separate each part by a comma.
','.join([item for sub_list in myList for item in sub_list])
Note: see my analysis below for which of the solutions proposed here tested fastest.
Demo:
myList = [['a','b','c','d'],['a','b','c','d']]
result = ','.join([item for sub_list in myList for item in sub_list])
output of result -> a,b,c,d,a,b,c,d
To break this down into parts and explain how it works, consider the following example:
# create a new list called my_new_list
my_new_list = []
# Next we want to iterate over the outer list
for sub_list in myList:
    # Now go over each item of the sublist
    for item in sub_list:
        # append it to our new list
        my_new_list.append(item)
So at this point, outputting my_new_list will yield:
['a', 'b', 'c', 'd', 'a', 'b', 'c', 'd']
So, now all we have to do with this is make it a string. This is where the ','.join() comes in to play. We simply make this call:
myString = ','.join(my_new_list)
Outputting that will give us:
a,b,c,d,a,b,c,d
Further Analysis
So, looking at this further, it really piqued my interest. I suspect that in fact the other solutions are possibly faster. Therefore, why not test it!
I took each of the solutions proposed and ran a timer against them with a much bigger sample set to see what would happen. Running the code yielded the following results (in seconds, fastest first):
map: 3.8023074030061252
chain: 7.675725881999824
comprehension: 8.73164687899407
So, the clear winner here is in fact the map implementation. If anyone is interested, here is the code used to time the results:
from timeit import Timer
def comprehension(l):
    return ','.join([i for sub_list in l for i in sub_list])
def chain(l):
    from itertools import chain
    return ','.join(chain.from_iterable(l))
def a_map(l):
    return ','.join(map(','.join, l))
myList = [[str(i) for i in range(10)] for j in range(10)]
print(Timer(lambda: comprehension(myList)).timeit())
print(Timer(lambda: chain(myList)).timeit())
print(Timer(lambda: a_map(myList)).timeit())
from itertools import chain
myList = [['a','b','c','d'],['a','b','c','d']]
print(','.join(chain.from_iterable(myList)))
a,b,c,d,a,b,c,d
You could also just join at both levels:
>>> ','.join(map(','.join, myList))
'a,b,c,d,a,b,c,d'
It's shorter and significantly faster than the other solutions:
>>> myList = [['a'] * 1000] * 1000
>>> from timeit import timeit
>>> timeit(lambda: ','.join(map(','.join, myList)), number=10)
0.18380278121490046
>>> from itertools import chain
>>> timeit(lambda: ','.join(chain.from_iterable(myList)), number=10)
0.6535200733309843
>>> timeit(lambda: ','.join([item for sub_list in myList for item in sub_list]), number=10)
1.0301431917067738
I also tried [['a'] * 10] * 10, [['a'] * 10] * 100000 and [['a'] * 100000] * 10 and it was always the same picture.
myList = [['a','b','c','d'],['a','b','c','d']]
smyList = myList[0] + myList[1]
str1 = ','.join(str(x) for x in smyList)
print str1
output
a,b,c,d,a,b,c,d
I have a list of tuples, and I would like to return only the unique values from the second column of data.
mytuple = [('Andrew','Andrew#gmail.com','20'),('Jim',"Jim#gmail.com",'12'),("Sarah","Sarah#gmail.com",'43'),("Jim","Jim#gmail.com",'15'),("Andrew","Andrew#gmail.com",'56')]
Desired output:
['Andrew#gmail.com','Jim#gmail.com','Sarah#gmail.com']
My idea would be to iterate through the list and append the item from the second column into a new list then use the following code. Before I go down that path too far I know there is a better way to do this.
from collections import Counter
cnt = Counter(mytuple_new)
unique_mytuple_new = [k for k, v in cnt.iteritems() if v > 1]
You can use the zip function (this form relies on Python 2, where zip returns a list):
>>> set(zip(*mytuple)[1])
set(['Sarah#gmail.com', 'Jim#gmail.com', 'Andrew#gmail.com'])
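In Python 3, zip returns an iterator and cannot be indexed directly. A sketch of the same idea for Python 3 (materialising the columns with list() is my adaptation, not part of the answer above) might look like this:

```python
mytuple = [('Andrew', 'Andrew#gmail.com', '20'),
           ('Jim', 'Jim#gmail.com', '12'),
           ('Sarah', 'Sarah#gmail.com', '43'),
           ('Jim', 'Jim#gmail.com', '15'),
           ('Andrew', 'Andrew#gmail.com', '56')]

# zip(*mytuple) transposes rows into columns; list() makes it indexable.
columns = list(zip(*mytuple))

# Column 1 holds the e-mail addresses; set() drops the duplicates.
unique_emails = set(columns[1])

print(sorted(unique_emails))
```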
Or, as a less performant alternative, you can use map with operator.itemgetter and use set to get the unique values:
>>> from operator import itemgetter
>>> tuple(set(map(itemgetter(1), mytuple)))
('Sarah#gmail.com', 'Jim#gmail.com', 'Andrew#gmail.com')
A benchmark of some of the answers:
My answer:
import timeit
s = """\
mytuple = [('Andrew','Andrew#gmail.com','20'),('Jim',"Jim#gmail.com",'12'),("Sarah","Sarah#gmail.com",'43'),("Jim","Jim#gmail.com",'15'),("Andrew","Andrew#gmail.com",'56')]
set(zip(*mytuple)[1])
"""
print timeit.timeit(stmt=s, number=100000)
0.0740020275116
icodez answer :
s = """\
mytuple = [('Andrew','Andrew#gmail.com','20'),('Jim',"Jim#gmail.com",'12'),("Sarah","Sarah#gmail.com",'43'),("Jim","Jim#gmail.com",'15'),("Andrew","Andrew#gmail.com",'56')]
seen = set()
[x[1] for x in mytuple if x[1] not in seen and not seen.add(x[1])]
"""
print timeit.timeit(stmt=s, number=100000)
0.0938332080841
Hasan's answer :
s = """\
mytuple = [('Andrew','Andrew#gmail.com','20'),('Jim',"Jim#gmail.com",'12'),("Sarah","Sarah#gmail.com",'43'),("Jim","Jim#gmail.com",'15'),("Andrew","Andrew#gmail.com",'56')]
set([k[1] for k in mytuple])
"""
print timeit.timeit(stmt=s, number=100000)
0.0699651241302
Adem's answer :
s = """
from itertools import izip
mytuple = [('Andrew','Andrew#gmail.com','20'),('Jim',"Jim#gmail.com",'12'),("Sarah","Sarah#gmail.com",'43'),("Jim","Jim#gmail.com",'15'),("Andrew","Andrew#gmail.com",'56')]
set(map(lambda x: x[1], mytuple))
"""
print timeit.timeit(stmt=s, number=100000)
0.237300872803 !!!
unique_emails = set(item[1] for item in mytuple)
The generator expression produces only the second-column values, and passing it to set() removes the duplicated values.
Try:
>>> unique_mytuple_new = set([k[1] for k in mytuple])
>>> unique_mytuple_new
set(['Sarah#gmail.com', 'Jim#gmail.com', 'Andrew#gmail.com'])
You can use a list comprehension and a set to keep track of seen values:
>>> mytuple = [('Andrew','Andrew#gmail.com','20'),('Jim',"Jim#gmail.com",'12'),("Sarah","Sarah#gmail.com",'43'),("Jim","Jim#gmail.com",'15'),("Andrew","Andrew#gmail.com",'56')]
>>> seen = set()
>>> [x[1] for x in mytuple if x[1] not in seen and not seen.add(x[1])]
['Andrew#gmail.com', 'Jim#gmail.com', 'Sarah#gmail.com']
>>>
The most important part of this solution is that order is preserved like in your example. Doing just set(x[1] for x in mytuple) or something similar will get you the unique items, but their order will be lost.
Also, the if x[1] not in seen and not seen.add(x[1]) may seem a little strange, but it is actually a neat trick that allows you to add items to the set inside the list comprehension (otherwise, we would need to use a for-loop).
Because and performs short-circuit evaluation in Python, not seen.add(x[1]) will only be evaluated if x[1] not in seen returns True. So, the condition sees if x[1] is in the set and adds it if not.
The not operator is placed before seen.add(x[1]) so that the condition evaluates to True if x[1] needed to be added to the set (set.add returns None, which is treated as False. not False is True).
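For comparison, here is a sketch of the plain for-loop that the comprehension trick replaces. It should behave the same way; the names ordered_unique and email are mine, chosen for illustration:

```python
mytuple = [('Andrew', 'Andrew#gmail.com', '20'),
           ('Jim', 'Jim#gmail.com', '12'),
           ('Sarah', 'Sarah#gmail.com', '43'),
           ('Jim', 'Jim#gmail.com', '15'),
           ('Andrew', 'Andrew#gmail.com', '56')]

seen = set()          # fast membership test
ordered_unique = []   # keeps first-seen order

for row in mytuple:
    email = row[1]
    if email not in seen:
        # Only reached for values not seen before, which is the branch
        # where the comprehension evaluates "not seen.add(x[1])".
        seen.add(email)
        ordered_unique.append(email)

print(ordered_unique)
```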
How about the obvious and simple loop? There is no need to create a list and then convert it to a set; just don't add duplicates.
mytuple = [('Andrew','Andrew#gmail.com','20'),('Jim',"Jim#gmail.com",'12'),("Sarah","Sarah#gmail.com",'43'),("Jim","Jim#gmail.com",'15'),("Andrew","Andrew#gmail.com",'56')]
result = []
for item in mytuple:
    if item[1] not in result:
        result.append(item[1])
print result
Output:
['Andrew#gmail.com', 'Jim#gmail.com', 'Sarah#gmail.com']
Is the order of the items important? A lot of the proposed answers use set to unique-ify the list. That's good, proper, and performant if the order is unimportant. If order does matter, you can use an OrderedDict to perform set-like unique-ification while preserving order.
# test data
mytuple = [('Andrew','Andrew#gmail.com','20'),('Jim',"Jim#gmail.com",'12'),("Sarah","Sarah#gmail.com",'43'),("Jim","Jim#gmail.com",'15'),("Andrew","Andrew#gmail.com",'56')]
from collections import OrderedDict
emails = list(OrderedDict((t[1], 1) for t in mytuple).keys())
print emails
Yielding:
['Andrew#gmail.com', 'Jim#gmail.com', 'Sarah#gmail.com']
Update
Based on iCodez's suggestion, restating answer to:
from collections import OrderedDict
emails = list(OrderedDict.fromkeys(t[1] for t in mytuple).keys())
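On Python 3.7 and later, a plain dict also preserves insertion order (a language guarantee from 3.7 on), so a sketch of the same idea without the import could be:

```python
mytuple = [('Andrew', 'Andrew#gmail.com', '20'),
           ('Jim', 'Jim#gmail.com', '12'),
           ('Sarah', 'Sarah#gmail.com', '43'),
           ('Jim', 'Jim#gmail.com', '15'),
           ('Andrew', 'Andrew#gmail.com', '56')]

# dict.fromkeys keeps the first occurrence of each key, in order;
# list() over a dict returns its keys.
emails = list(dict.fromkeys(t[1] for t in mytuple))

print(emails)
```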
I want part of a script I am writing to do something like this.
x=0
y=0
list=[["cat","dog","mouse",1],["cat","dog","mouse",2],["cat","dog","mouse",3]]
row=list[y]
item=row[x]
print list.count(item)
The problem is that this will print 0 because it isn't searching the individual lists. How can I make it return the total number of instances instead?
Search per sublist, adding up results per contained list with sum():
sum(sub.count(item) for sub in lst)
Demo:
>>> lst = [["cat","dog","mouse",1],["cat","dog","mouse",2],["cat","dog","mouse",3]]
>>> item = 'cat'
>>> sum(sub.count(item) for sub in lst)
3
sum() is a built-in function that adds up the items of an iterable.
The expression x.count(item) for x in list is a "generator expression" (similar to a list comprehension): it produces the per-sublist counts one at a time, without building an intermediate list.
item_count = sum(x.count(item) for x in list)
That should do it
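To see why the original attempt printed 0, here is a small sketch contrasting the two calls. It uses lst rather than list as the variable name so the built-in is not shadowed:

```python
lst = [["cat", "dog", "mouse", 1],
       ["cat", "dog", "mouse", 2],
       ["cat", "dog", "mouse", 3]]
item = "cat"

# count() on the outer list compares item against each element of lst,
# and those elements are the sublists themselves, so nothing matches.
outer_count = lst.count(item)

# Counting inside each sublist and summing gives the overall total.
total_count = sum(sub.count(item) for sub in lst)

print(outer_count, total_count)
```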
Using collections.Counter and itertools.chain.from_iterable:
>>> from collections import Counter
>>> from itertools import chain
>>> lst = [["cat","dog","mouse",1],["cat","dog","mouse",2],["cat","dog","mouse",3]]
>>> count = Counter(item for item in chain.from_iterable(lst) if not isinstance(item, int))
>>> count
Counter({'mouse': 3, 'dog': 3, 'cat': 3})
>>> count['cat']
3
I filtered out the ints because I didn't see why you had them in the first place.
I'm learning python. I have a list of simple entries and I want to convert it in a dictionary where the first element of list is the key of the second element, the third is the key of the fourth, and so on. How can I do it?
list = ['first_key', 'first_value', 'second_key', 'second_value']
Thanks in advance!
The most concise way is
some_list = ['first_key', 'first_value', 'second_key', 'second_value']
d = dict(zip(*[iter(some_list)] * 2))
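The one-liner is dense, so here is a sketch that spells out what it appears to rely on: the list [iter(some_list)] * 2 holds two references to the same iterator, so each tuple zip builds consumes two consecutive items. The names it and copies are mine:

```python
some_list = ['first_key', 'first_value', 'second_key', 'second_value']

it = iter(some_list)
copies = [it] * 2          # two references to the SAME iterator object

# zip takes one item from copies[0], then one from copies[1]; because both
# are the same iterator, each pair is (item n, item n + 1).
pairs = list(zip(*copies))

d = dict(pairs)
print(pairs)
print(d)
```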
myDict = dict(zip(myList[::2], myList[1::2]))
Please do not use 'list' as a variable name, as it prevents you from accessing the list() function.
If there is much data involved, we can do it more efficiently using iterator functions:
from itertools import izip, islice
myList = ['first_key', 'first_value', 'second_key', 'second_value']
myDict = dict(izip(islice(myList,0,None,2), islice(myList,1,None,2)))
If the list is large, you end up wasting memory by building slices or eager zips. One way to convert the list more lazily is to (ab)use the list iterator and izip.
from itertools import izip
lst = ['first_key', 'first_value', 'second_key', 'second_value']
i = iter(lst)
d = dict(izip(i,i))
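itertools.izip exists only in Python 2; in Python 3 the built-in zip is already lazy. The sketch below shows the Python 3 form and, as an extra assumption of mine, what happens with an odd-length list (the 'orphan_key' entry is made up for illustration):

```python
from itertools import zip_longest

lst = ['first_key', 'first_value', 'second_key', 'second_value', 'orphan_key']

# Python 3: zip is lazy, so no izip is needed.
i = iter(lst)
d = dict(zip(i, i))        # the trailing key with no value is dropped

# zip_longest keeps the trailing key and pairs it with the fill value (None).
j = iter(lst)
d_keep = dict(zip_longest(j, j))

print(d)
print(d_keep)
```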
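The KISS way:
Use an iterator and stop on StopIteration:
myDict = {}
it = iter(myList)
while True:
    try:
        # Read the key first: in myDict[next(it)] = next(it) the right-hand
        # side is evaluated first, which would swap keys and values.
        key = next(it)
        myDict[key] = next(it)
    except StopIteration:
        break
myDict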