Converting a simple list to a dictionary (in Python)

I'm learning Python. I have a flat list of entries and I want to convert it into a dictionary where the first element of the list is the key for the second element, the third is the key for the fourth, and so on. How can I do it?
list = ['first_key', 'first_value', 'second_key', 'second_value']
Thanks in advance!

The most concise way is
some_list = ['first_key', 'first_value', 'second_key', 'second_value']
d = dict(zip(*[iter(some_list)] * 2))

myDict = dict(zip(myList[::2], myList[1::2]))
Please do not use 'list' as a variable name, as it prevents you from accessing the list() function.
If there is a lot of data involved, we can do it more efficiently using iterator functions (in Python 3 the built-in zip is already lazy; in Python 2 use itertools.izip instead):
from itertools import islice
myList = ['first_key', 'first_value', 'second_key', 'second_value']
myDict = dict(zip(islice(myList, 0, None, 2), islice(myList, 1, None, 2)))

If the list is large, you end up wasting memory by building slices or eager zips. One way to convert the list more lazily is to (ab)use the list iterator and zip (izip from itertools in Python 2).
lst = ['first_key', 'first_value', 'second_key', 'second_value']
i = iter(lst)
d = dict(zip(i, i))
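To make the iterator trick a bit more concrete, here is a minimal sketch wrapped in a helper. The name pairs_to_dict is hypothetical (it is not from any of the answers), and the code assumes Python 3, where zip is lazy.

```python
def pairs_to_dict(flat):
    """Build a dict from a flat [key, value, key, value, ...] sequence."""
    it = iter(flat)
    # zip pulls from the same iterator twice per step:
    # the first pull is the key, the second pull is the value.
    return dict(zip(it, it))


example = ['first_key', 'first_value', 'second_key', 'second_value']
result = pairs_to_dict(example)
print(result)

# With an odd number of items the trailing key has no value and is dropped.
odd_result = pairs_to_dict(['a', 1, 'b'])
print(odd_result)
```

Note that a dangling last key is silently discarded, so check the length first if that would hide a data problem.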

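The KISS way:
Use exceptions and iterators
myDict = {}
it = iter(lst)  # lst is the flat [key, value, key, value, ...] list
while True:
    try:
        # Fetch the key first: in myDict[next(it)] = next(it) the
        # right-hand side is evaluated first, which swaps keys and values.
        key = next(it)
        myDict[key] = next(it)
    except StopIteration:
        break
myDict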

Related

Create list based on index of existing list

I know how to create a new list based on the values of an existing list, eg casting
numspec = [float(x) for x in textspec]
Now I have a list of numbers where I need to subtract a value based on the index of a list. I have calculated an a and b value and ended up doing
peakadj = []
for i in range(len(peakvalues)):
    val = peakvalues[i] - (i*a + b)
    peakadj.append(val)
This works, but I don't like the feel of it. Is there a more Pythonic way of doing this?
Use the builtin enumerate function and a list comprehension.
peakadj = [val-(i*a+b) for i, val in enumerate(peakvalues)]
Perhaps faster:
from itertools import count
peakadj = [val-iab for val, iab in zip(peakvalues, count(b, a))]
Or:
from itertools import count
from operator import sub
peakadj = [*map(sub, peakvalues, count(b, a))]
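A quick way to convince yourself that the variants agree is to run them side by side. This is only a sketch: the sample values for peakvalues, a and b are made up for illustration.

```python
from itertools import count
from operator import sub

# Hypothetical sample data, not from the question.
peakvalues = [10.0, 12.0, 15.0]
a, b = 2.0, 1.0

# enumerate supplies the index i explicitly.
with_enumerate = [val - (i * a + b) for i, val in enumerate(peakvalues)]

# count(b, a) yields b, b + a, b + 2*a, ... which is i*a + b for i = 0, 1, 2, ...
with_count = [val - offset for val, offset in zip(peakvalues, count(b, a))]

# map stops when the shorter input (peakvalues) is exhausted.
with_map = list(map(sub, peakvalues, count(b, a)))

print(with_enumerate, with_count, with_map)
```

With floating-point steps, count accumulates by repeated addition, so for long lists the enumerate version may be slightly more accurate.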

How to merge n lists together item by item for each list

I want to make one large list for entering into a database with values from 4 different lists. I want it to be like
[[list1[0], list2[0], list3[0], list4[0]], [list1[1], list2[1], list3[1], list4[1]], etc.....]
Another issue is that currently the data is received like this:
[ [ [list1[0], list1[1], [list1[3]]], [[list2[0]]], etc.....]
I've tried looping through each list using indexes and adding the items to a new list, but it hasn't worked. I'm pretty sure that's because some of the lists are different lengths (they're not meant to be, but it's automated data so sometimes there's a mistake).
Anyone know what's the best way to go about this? Thanks.
The first list can be built with the zip function as follows (for 4 lists):
list1 = [1,2,3,4]
list2 = [5,6,7,8]
list3 = [9,10,11,12]
list4 = [13,14,15,16]
res = list(zip(list1,list2,list3,list4))
For an arbitrary number of lists stored in another list, you can use *-notation to unpack the outer list:
lists = [...]
res = list(zip(*lists))
To build the list of lists for zipping from your data in the second issue, flatten each element first and then zip:
def flatten(l):
    res = []
    for el in l:
        if isinstance(el, list):
            res += flatten(el)  # recurse into nested lists
        else:
            res.append(el)
    return res
auto_data = [...]
res = list(zip(*[flatten(el) for el in auto_data]))
Some clarification at the end:
zip produces only as many tuples as the shortest input has elements, so pad the flattened lists to a common length in that last list comprehension if you don't want to lose data.
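As one possible way to avoid losing data, the standard library's itertools.zip_longest (izip_longest in Python 2) pads the shorter inputs instead of truncating. A minimal sketch; the sample lists and the None fill value are assumptions for illustration:

```python
from itertools import zip_longest

# Hypothetical sample data: list2 is missing its last value.
list1 = [1, 2, 3, 4]
list2 = [5, 6, 7]
list3 = [9, 10, 11, 12]

# zip stops at the shortest input, so the fourth row is lost.
truncated = list(zip(list1, list2, list3))

# zip_longest keeps going and fills the gaps with fillvalue.
padded = list(zip_longest(list1, list2, list3, fillvalue=None))

print(len(truncated), len(padded))
print(padded[-1])
```

Whether None is a sensible placeholder depends on what the database column accepts.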
So if I understand correctly, this is your input:
l = [[1.1,1.2,1.3,1.4],[2.1,2.2,2.3,2.4],[3.1,3.2,3.3,3.4],[4.1,4.2,4.3,4.4]]
and you would like to have this output
[[1.1,2.1,3.1,4.1],...]
If so, this could be done by using zip
zip(*l)
Make a for loop which only gives you the counter variable. Use that variable to index the lists. Make a temporary list and fill it with the values from the other lists. Add that list to the final one. With this you will get the desired structure.
nestedlist = []
for counter in range(0, x):  # x is the number of rows to build
    temporarylist = []
    temporarylist.append(firstlist[counter])
    temporarylist.append(secondlist[counter])
    temporarylist.append(thirdlist[counter])
    temporarylist.append(fourthlist[counter])
    nestedlist.append(temporarylist)
If all 4 lists are the same length you can use this code to make it even nicer.
nestedlist = []
for counter in range(0, len(firstlist)):  # changed line
    temporarylist = []
    temporarylist.append(firstlist[counter])
    temporarylist.append(secondlist[counter])
    temporarylist.append(thirdlist[counter])
    temporarylist.append(fourthlist[counter])
    nestedlist.append(temporarylist)
This comprehension should work, with a little help from zip:
mylist = [i for i in zip(list1, list2, list3, list4)]
But this assumes all the lists are the same length. If that's not the case (or you're not sure), you can "pad" them first so that they are.
def padlist(some_list, desired_length, pad_with):
    # Appends in place until the list reaches the desired length.
    while len(some_list) < desired_length:
        some_list.append(pad_with)
    return some_list
list_of_lists = [list1, list2, list3, list4]
maxlength = len(max(list_of_lists, key=len))
list_of_lists = [padlist(l, maxlength, 0) for l in list_of_lists]
Now apply the comprehension from above; it works well in my testing:
mylist = [i for i in zip(*list_of_lists)]
If the flatten concept doesn't work, try this out:
import numpy as np
myArray = np.array([[list1[0], list2[0], list3[0], list4[0]], [list1[1], list2[1], list3[1], list4[1]]])
np.hstack(myArray)
This one should also work (the rows of myArray are 1-D, so they have to be joined along axis 0):
np.concatenate(myArray, axis=0)
For anyone searching for a solution to this problem when the lists are the same length:
def flatten(lists):
    results = []
    for numbers in lists:
        for output in numbers:
            results.append(output)
    return results
print(flatten(n))  # n is your list of lists

Dropping values from a list of tuples

I have a list of tuples from which I would like to return only the second column of data, and only the unique values.
mytuple = [('Andrew','Andrew#gmail.com','20'),('Jim',"Jim#gmail.com",'12'),("Sarah","Sarah#gmail.com",'43'),("Jim","Jim#gmail.com",'15'),("Andrew","Andrew#gmail.com",'56')]
Desired output:
['Andrew#gmail.com','Jim#gmail.com','Sarah#gmail.com']
My idea would be to iterate through the list and append the item from the second column into a new list then use the following code. Before I go down that path too far I know there is a better way to do this.
from collections import Counter
cnt = Counter(mytuple_new)
unique_mytuple_new = [k for k, v in cnt.iteritems() if v > 1]
You can use the zip function (Python 2 session shown; in Python 3, wrap zip(...) in list() before indexing):
>>> set(zip(*mytuple)[1])
set(['Sarah#gmail.com', 'Jim#gmail.com', 'Andrew#gmail.com'])
Or, as a less performant way, you can use map with operator.itemgetter and a set to get the unique values as a tuple:
>>> from operator import itemgetter
>>> tuple(set(map(itemgetter(1), mytuple)))
('Sarah#gmail.com', 'Jim#gmail.com', 'Andrew#gmail.com')
A benchmark of some of the answers (Python 2; assumes timeit has been imported):
My answer:
s = """\
mytuple = [('Andrew','Andrew#gmail.com','20'),('Jim',"Jim#gmail.com",'12'),("Sarah","Sarah#gmail.com",'43'),("Jim","Jim#gmail.com",'15'),("Andrew","Andrew#gmail.com",'56')]
set(zip(*mytuple)[1])
"""
print timeit.timeit(stmt=s, number=100000)
0.0740020275116
icodez answer :
s = """\
mytuple = [('Andrew','Andrew#gmail.com','20'),('Jim',"Jim#gmail.com",'12'),("Sarah","Sarah#gmail.com",'43'),("Jim","Jim#gmail.com",'15'),("Andrew","Andrew#gmail.com",'56')]
seen = set()
[x[1] for x in mytuple if x[1] not in seen and not seen.add(x[1])]
"""
print timeit.timeit(stmt=s, number=100000)
0.0938332080841
Hasan's answer :
s = """\
mytuple = [('Andrew','Andrew#gmail.com','20'),('Jim',"Jim#gmail.com",'12'),("Sarah","Sarah#gmail.com",'43'),("Jim","Jim#gmail.com",'15'),("Andrew","Andrew#gmail.com",'56')]
set([k[1] for k in mytuple])
"""
print timeit.timeit(stmt=s, number=100000)
0.0699651241302
Adem's answer :
s = """
from itertools import izip
mytuple = [('Andrew','Andrew#gmail.com','20'),('Jim',"Jim#gmail.com",'12'),("Sarah","Sarah#gmail.com",'43'),("Jim","Jim#gmail.com",'15'),("Andrew","Andrew#gmail.com",'56')]
set(map(lambda x: x[1], mytuple))
"""
print timeit.timeit(stmt=s, number=100000)
0.237300872803 !!!
unique_emails = set(item[1] for item in mytuple)
The list comprehension will help you generate a list containing only the second column data, and converting that list to set() removes duplicated values.
Try:
>>> unique_mytuple_new = set([k[1] for k in mytuple])
>>> unique_mytuple_new
set(['Sarah#gmail.com', 'Jim#gmail.com', 'Andrew#gmail.com'])
You can use a list comprehension and a set to keep track of seen values:
>>> mytuple = [('Andrew','Andrew#gmail.com','20'),('Jim',"Jim#gmail.com",'12'),("Sarah","Sarah#gmail.com",'43'),("Jim","Jim#gmail.com",'15'),("Andrew","Andrew#gmail.com",'56')]
>>> seen = set()
>>> [x[1] for x in mytuple if x[1] not in seen and not seen.add(x[1])]
['Andrew#gmail.com', 'Jim#gmail.com', 'Sarah#gmail.com']
>>>
The most important part of this solution is that order is preserved like in your example. Doing just set(x[1] for x in mytuple) or something similar will get you the unique items, but their order will be lost.
Also, the if x[1] not in seen and not seen.add(x[1]) may seem a little strange, but it is actually a neat trick that allows you to add items to the set inside the list comprehension (otherwise, we would need to use a for-loop).
Because and performs short-circuit evaluation in Python, not seen.add(x[1]) will only be evaluated if x[1] not in seen returns True. So, the condition sees if x[1] is in the set and adds it if not.
The not operator is placed before seen.add(x[1]) so that the condition evaluates to True if x[1] needed to be added to the set (set.add returns None, which is treated as False. not False is True).
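If the one-line condition feels too clever, the same logic can be unrolled into an explicit loop. This is only a sketch; the helper name unique_in_order is made up for illustration.

```python
def unique_in_order(items):
    """Return the items with duplicates removed, keeping first-seen order."""
    seen = set()
    result = []
    for item in items:
        if item not in seen:   # same test as "x[1] not in seen"
            seen.add(item)     # same side effect as "not seen.add(x[1])"
            result.append(item)
    return result


mytuple = [('Andrew', 'Andrew#gmail.com', '20'), ('Jim', 'Jim#gmail.com', '12'),
           ('Sarah', 'Sarah#gmail.com', '43'), ('Jim', 'Jim#gmail.com', '15'),
           ('Andrew', 'Andrew#gmail.com', '56')]
emails = unique_in_order(t[1] for t in mytuple)
print(emails)
```

Membership tests against the set stay fast as the data grows, which is the main reason to keep seen separate from the result list.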
How about the obvious and simple loop? There is no need to create a list and then convert it to a set; just don't add duplicates.
mytuple = [('Andrew','Andrew#gmail.com','20'),('Jim',"Jim#gmail.com",'12'),("Sarah","Sarah#gmail.com",'43'),("Jim","Jim#gmail.com",'15'),("Andrew","Andrew#gmail.com",'56')]
result = []
for item in mytuple:
    if item[1] not in result:
        result.append(item[1])
print(result)
Output:
['Andrew#gmail.com', 'Jim#gmail.com', 'Sarah#gmail.com']
Is the order of the items important? A lot of the proposed answers use set to unique-ify the list. That's good, proper, and performant if the order is unimportant. If order does matter, you can use an OrderedDict to perform set-like unique-ification while preserving order.
# test data
mytuple = [('Andrew','Andrew#gmail.com','20'),('Jim',"Jim#gmail.com",'12'),("Sarah","Sarah#gmail.com",'43'),("Jim","Jim#gmail.com",'15'),("Andrew","Andrew#gmail.com",'56')]
from collections import OrderedDict
emails = list(OrderedDict((t[1], 1) for t in mytuple).keys())
print(emails)
Yielding:
['Andrew#gmail.com', 'Jim#gmail.com', 'Sarah#gmail.com']
Update
Based on iCodez's suggestion, restating answer to:
from collections import OrderedDict
emails = list(OrderedDict.fromkeys(t[1] for t in mytuple).keys())
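On Python 3.7 and later, plain dicts also preserve insertion order, so the same idea should work without the import. A minimal sketch under that version assumption:

```python
mytuple = [('Andrew', 'Andrew#gmail.com', '20'), ('Jim', 'Jim#gmail.com', '12'),
           ('Sarah', 'Sarah#gmail.com', '43'), ('Jim', 'Jim#gmail.com', '15'),
           ('Andrew', 'Andrew#gmail.com', '56')]

# dict.fromkeys keeps the first occurrence of each key, in insertion order.
emails = list(dict.fromkeys(t[1] for t in mytuple))
print(emails)
```

On older interpreters, stick with OrderedDict as shown above.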

Comparing Multiple Lists Python

I'm trying to compare multiple lists. However, the lists aren't labelled in the usual way. I'm using a while loop to make a new list each time and label them accordingly. So for example, if the while loop runs 3 times it will make a list1, a list2 and a list3. Here is the snippet of the code that creates the lists.
for link in links:
    print('*', link.text)
    locals()['list{}'.format(str(i))].append(link.text)
So I want to compare each list for the strings that are in them but I want to compare all the lists at once then print out the common strings.
I feel like I'll be using something like this, but I'm not 100% sure.
lists = [list1, list2, list3, list4, list5, list6, list7, list8, list9, list10]
common = list(set().union(*lists).intersection(Keyword))
Rather than directly modifying locals() (generally not a good idea), use a defaultdict as a container. This data structure allows you to create new key-value pairs on the fly rather than relying on a method which is sure to lead to a NameError at some point.
from collections import defaultdict
i = ...
link_lists = defaultdict(list)
for link in links:
    print('*', link.text)
    link_lists[i].append(link.text)
To find the intersection of all of the lists:
all_lists = list(link_lists.values())
common_links = set(all_lists[0]).intersection(*all_lists[1:])
In Python 2.6+, you can pass multiple iterables to set.intersection. This is what the star-args do here.
Here's an example of how the intersection will work:
>>> from collections import defaultdict
>>> c = defaultdict(list)
>>> c[9].append("a")
>>> c[0].append("b")
>>> all = list(c.values())
>>> set(all[0]).intersection(*all[1:])
set()
>>> c[0].append("a")
>>> all = list(c.values())
>>> set(all[0]).intersection(*all[1:])
{'a'}
You have several options.
option a)
Use itertools to get a Cartesian product. This is quite nice because it's an iterator:
import itertools
a = ["A", "B", "C"]
b = ["A", "C"]
c = ["C", "D", "E"]
for aval, bval, cval in itertools.product(a, b, c):
    if aval == bval and bval == cval:
        print(aval)
option b)
Use sets (recommended):
all_lists = []
# insert your while loop X times
for lst in lists:  # This is my guess of your loop running.
    currentList = [link.text for link in links]
    all_lists.append(currentList)  # O(1) operation
result_set = set()
if len(all_lists) > 1:
    result_set = set(all_lists[0]).intersection(*all_lists[1:])
elif all_lists:
    result_set = set(all_lists[0])
Using sets will be faster than the Cartesian product.
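The set approach can be sketched as a small helper that also copes with zero or one list. The name common_items and the sample data are assumptions for illustration, and the star-unpacking in the assignment needs Python 3.

```python
def common_items(lists):
    """Return the items present in every list (empty set if there are no lists)."""
    if not lists:
        return set()
    first, *rest = lists
    # intersection accepts any number of iterables; with none it returns a copy.
    return set(first).intersection(*rest)


lists = [["A", "B", "C"], ["A", "C"], ["C", "D", "E"]]
common = common_items(lists)
print(common)
```

Unlike the product loop, this does roughly one pass over each list instead of visiting every combination of elements.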

How to get unique sorted list in one statement?

The return value of sort() is None so the following code doesn't work:
def sorted_unique_items(a):
    return list(set(a)).sort()
Any ideas for a better solution?
Use sorted():
sorted(set(a))
and you can omit the list() call entirely.
It's simple: just apply the sorted() function to a set of the data:
data = [10,5,46,4]
sorted_data = sorted(set(data))
print("Sorted Data ::::", sorted_data)
The above solution only works for hashable types like strings, integers and tuples; it does not work for unhashable types like lists.
For example, if you have a list data = [[2],[1],[2],[4]]
For unhashable types, the best solution is:
from itertools import islice
values = [[2],[1],[2],[4]]
def sort_unique_unhashable(values):
    values = sorted(values)
    if not values:
        return []
    # Pair each element with its successor (zip is lazy in Python 3; use itertools.izip in Python 2).
    consecutive_pairs = zip(values, islice(values, 1, len(values)))
    # Keep an element only when the next one differs, then add the last one.
    result = [a for (a, b) in consecutive_pairs if a != b]
    result.append(values[-1])
    return result
print(sort_unique_unhashable(values))
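An alternative sketch of the same sort-then-collapse idea uses itertools.groupby, which groups consecutive equal elements. The function name sort_unique is hypothetical, and the elements only need to be sortable and comparable, not hashable.

```python
from itertools import groupby


def sort_unique(values):
    """Sort the values and keep one representative of each run of equal items."""
    # After sorting, duplicates are adjacent, so each groupby key appears once.
    return [key for key, _group in groupby(sorted(values))]


values = [[2], [1], [2], [4]]
unique_sorted = sort_unique(values)
print(unique_sorted)
```

This handles the empty list without a special case, because groupby simply yields nothing.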
There are two options:
a = [2,34,55,1,22,11,22,55,1]
# First option: sorted and set give a sorted list of the unique values.
sorted(set(a))
# Second option: list and set give the unique values, but in no guaranteed order.
list(set(a))
