Adding two asynchronous lists, into a dictionary - python

I've always found dictionaries to be an odd thing in Python. I'm sure it's just me, but I can't work out how to take two lists and add them to a dict. If both lists mapped one-to-one it wouldn't be a problem; something like dictionary = dict(zip(list1, list2)) would suffice. However, during each run list1 will always have one item, and list2 could have a single item or multiple items that I'd like as values.
How could I approach adding the key and potentially multiple values to it?
After some deliberation, Kasramvd's second option seems to work well for this scenario:
dictionary.setdefault(list1[0], []).append(list2)

Based on your comment, all you need is to assign the second list as the value for the only item of the first list.
d = {}
d[list1[0]] = list2
And if you want to preserve the values for duplicate keys, you can use dict.setdefault() to build a list of lists for each key.
d = {}
d.setdefault(list1[0], []).append(list2)
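A minimal sketch of how the two approaches differ once the same key turns up in more than one run. The sample keys and values below are made up for illustration; only the two dictionary operations come from the answer.

```python
# Hypothetical sample data: each run yields a one-item list1 and a list2.
runs = [
    (["host-a"], ["10.0.0.1", "10.0.0.2"]),
    (["host-b"], ["10.0.0.3"]),
    (["host-a"], ["10.0.0.4"]),
]

overwritten = {}
grouped = {}
for list1, list2 in runs:
    # Plain assignment: a repeated key replaces the earlier value.
    overwritten[list1[0]] = list2
    # setdefault: a repeated key accumulates a list of lists.
    grouped.setdefault(list1[0], []).append(list2)

print(overwritten)
print(grouped)
```

With plain assignment only the last run for "host-a" survives, while setdefault keeps one inner list per run.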

Related

How to add elements of a list to existing dictionary based on key?

I have an iterator for creating multiple lists. I need to keep adding the generated list to a dictionary dict1 based on the key value k:
some value here = k
for a in jsoncontent:
    list1.append(a["Value"])
dict1.setdefault(k, []).append(list1)
Right now I get:
{k:[[10,11],[12,32,6],[7,4]]}
But I need:
{k:[10,11,12,32,6,7,4]}
How do I merge these lists?
It sounds like you want extend rather than append. extend adds each element of its argument to the end of the list, while append inserts its argument itself (a list, in this case) as a single element. See https://docs.python.org/3/library/stdtypes.html#mutable-sequence-types
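A small sketch of the difference, using the numbers from the question. The name generated is a hypothetical stand-in for the lists produced on each iteration, and k is given an arbitrary value.

```python
# Hypothetical stand-ins for the lists produced on each iteration.
generated = [[10, 11], [12, 32, 6], [7, 4]]
k = "k"

nested = {}
flat = {}
for list1 in generated:
    nested.setdefault(k, []).append(list1)  # adds the list itself
    flat.setdefault(k, []).extend(list1)    # adds each element of the list

print(nested)  # {'k': [[10, 11], [12, 32, 6], [7, 4]]}
print(flat)    # {'k': [10, 11, 12, 32, 6, 7, 4]}
```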

Dedupe python list based on multiple criteria

I have a list:
mylist = [('Item A','CA','10'),('Item B','CT','12'),('Item A','CA','14'),('Item A','NH','10')]
I would like to remove duplicates based on column 1 and 2. So my desired output would be:
[('Item A','CA','10'),('Item B','CT','12'),('Item A','NH','10')]
I'm not really sure how to go about this, so I haven't posted any code, but am just looking for some help :)
Use a dict. The other answer is good. For variety, here's a single expression that will give you the uniq'd list (though the order of elements is not preserved).
{ tuple(item[0:2]):item for item in mylist[::-1] }.values()
This creates a dict from the elements of mylist using elements 0 and 1 as the key (implicitly removing duplicates). Because mylist is iterated in reverse order and later assignments overwrite earlier ones, the item that remains for each key is the one that appears first in the original list.
Dict keys can be of any hashable type. Create a dict with the first two columns of each item as the key, and only add to unique if those columns haven't been seen before.
unique = {}
for item in mylist:
    # item[0:2] is a tuple of the first two columns, which is hashable
    if item[0:2] not in unique:
        unique[item[0:2]] = item
print(unique.values())
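A possible variant of the same idea, assuming Python 3.7 or later, where dicts keep insertion order, so the result comes back in the order of first appearance. Using setdefault here is a substitution for the explicit membership test above, not part of the original answer.

```python
mylist = [('Item A', 'CA', '10'), ('Item B', 'CT', '12'),
          ('Item A', 'CA', '14'), ('Item A', 'NH', '10')]

unique = {}
for item in mylist:
    # setdefault stores the item only if the (column 1, column 2) key is new.
    unique.setdefault(item[:2], item)

result = list(unique.values())
print(result)
```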

List Comprehension of Lists Nested in Dictionaries

I have a dictionary where each value is a list, like so:
dictA = {1:['a','b','c'],2:['d','e']}
Unfortunately, I cannot change this structure to get around my problem
I want to gather all of the entries of the lists into one single list, as follows:
['a','b','c','d','e']
Additionally, I want to do this only once within an if-block. Since I only want to do it once, I do not want to store it to an intermediate variable, so naturally, a list comprehension is the way to go. But how? My first guess,
[dictA[key] for key in dictA.keys()]
yields,
[['a','b','c'],['d','e']]
which does not work because
'a' in [['a','b','c'],['d','e']]
yields False. Everything else I've tried has used some sort of illegal syntax.
How might I perform such a comprehension?
Loop over the returned list too (looping directly over a dictionary gives you keys as well):
[value for key in dictA for value in dictA[key]]
or more directly using dictA.itervalues():
[value for lst in dictA.itervalues() for value in lst]
List comprehensions let you nest loops; read the above loops as if they are nested in the same order:
for lst in dictA.itervalues():
    for value in lst:
        # append value to the output list
Or use itertools.chain.from_iterable():
from itertools import chain
list(chain.from_iterable(dictA.itervalues()))
The latter takes a sequence of sequences and lets you loop over them as if they were one big list. dictA.itervalues() gives you a sequence of lists, and chain() puts them together for list() to iterate over and build one big list out of them.
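For readers on Python 3, where itervalues() no longer exists and values() plays the same role, both approaches might look like the sketch below. The variable names flat_comprehension and flat_chain are only for illustration.

```python
from itertools import chain

dictA = {1: ['a', 'b', 'c'], 2: ['d', 'e']}

# Nested comprehension: outer loop over the lists, inner loop over their items.
flat_comprehension = [value for lst in dictA.values() for value in lst]

# chain.from_iterable lazily walks each list in turn; list() collects the items.
flat_chain = list(chain.from_iterable(dictA.values()))

print(flat_comprehension)
print(flat_chain)
```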
If all you are doing is testing for membership among all the values, then what you really want is a simple way to loop over all the values, testing your value against each until you find a match. The any() function together with a suitable generator expression does just that:
any('a' in lst for lst in dictA.itervalues())
This will return True as soon as any value in dictA has 'a' listed, and stop looping over .itervalues() early.
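To see the early exit in action, here is a rough sketch written for Python 3.7+ (values() instead of itervalues(), and dicts in insertion order). The helper contains and the list inspected are hypothetical, added only to record which lists were actually examined.

```python
dictA = {1: ['a', 'b', 'c'], 2: ['d', 'e']}
inspected = []

def contains(lst, wanted):
    # Record each list that any() actually asks about.
    inspected.append(lst)
    return wanted in lst

found = any(contains(lst, 'a') for lst in dictA.values())
print(found, len(inspected))
```

Because the first list already contains 'a', the second list is never examined.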
If you're actually checking for membership (your a in... example), you could rewrite it as:
if any('a' in val for val in dictA.itervalues()):
    # do something
This saves having to flatten the list if that's not actually required.
In this particular case, you can just use a nested comprehension:
[value for key in dictA.keys() for value in dictA[key]]
But in general, if you've already figured out how to turn something into a nested list, you can flatten any nested iterable with chain.from_iterable:
itertools.chain.from_iterable(dictA[key] for key in dictA.keys())
This returns an iterator, not a list; if you need a list, just do it explicitly:
list(itertools.chain.from_iterable(dictA[key] for key in dictA.keys()))
As a side note, for key in dictA.keys() does the same thing as for key in dictA, except that in older versions of Python, it will waste time and memory making an extra list of the keys. As the documentation says, iter on a dict is the same as iterkeys.
So, in all of the versions above, it's better to just use in dictA instead.
In simple code, just for understanding, this might be helpful:
ListA=[]
dictA = {1:['a','b','c'],2:['d','e']}
for keys in dictA:
    for values in dictA[keys]:
        ListA.append(values)
You can do something like this:
output_list = []
[ output_list.extend(x) for x in {1:['a','b','c'],2:['d','e']}.values()]
output_list will be ['a', 'b', 'c', 'd', 'e']

Sorting dictionary list-values based on time

I'm pretty new to python (couple weeks into it) and I'm having some trouble wrapping my head around data structures. What I've done so far is extract text line-by-line from a .txt file and store them into a dictionary with the key as animal, for example.
database = {
    'dog': ['apple', 'dog', '2012-06-12-08-12-59'],
    'cat': [
        ['orange', 'cat', '2012-06-11-18-33-12'],
        ['blue', 'cat', '2012-06-13-03-23-48']
    ],
    'frog': ['kiwi', 'frog', '2012-06-12-17-12-44'],
    'cow': [
        ['pear', 'ant', '2012-06-12-14-02-30'],
        ['plum', 'cow', '2012-06-12-23-27-14']
    ]
}
# year-month-day-hour-min-sec
That way, when I print my dictionary out, it prints out by animal types, and the newest dates first.
What's the best way to go about sorting this data by time? I'm on Python 2.7. What I'm thinking is
for each key:
grab the list (or list of lists) --> get the 3rd entry --> '-'.split it, --> then maybe try the sorted(parameters)
I'm just not really sure how to go about this...
Walk through the elements of your dictionary. For each value, run sorted on your list of lists, and tell the sorting algorithm to use the third field of the list as the "key" element. This key element is what is used to compare values to other elements in the list in order to ascertain sort order. To tell sorted which element of your lists to sort with, use operator.itemgetter to specify the third element.
Since your timestamps are rigidly structured and each character in the timestamp is more temporally significant than the next one, you can sort them naturally, like strings - you don't need to convert them to times.
# Dictionary stored in d
from operator import itemgetter
# Iterate over the elements of the dictionary; below, by
# calling items(), k gets the key value of an entry and
# v gets the value of that entry
for k, v in d.items():
    if v and isinstance(v[0], list):
        v.sort(key=itemgetter(2))  # Start with 0, so third element is 2
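The loop above leaves single entries such as database['dog'] untouched and sorts oldest first. One way to handle both shapes and put the newest dates first, as the question asks, is sketched below on a shortened copy of the data. The helper as_lines and the name normalised are hypothetical, not part of the answer.

```python
from operator import itemgetter

database = {
    'dog': ['apple', 'dog', '2012-06-12-08-12-59'],
    'cat': [
        ['orange', 'cat', '2012-06-11-18-33-12'],
        ['blue', 'cat', '2012-06-13-03-23-48'],
    ],
}

def as_lines(value):
    """Return value as a list of lines, wrapping a single line if needed."""
    if not value:
        return []
    if isinstance(value[0], list):
        return value
    return [value]

normalised = {}
for animal, value in database.items():
    # The timestamps sort correctly as strings; reverse=True puts newest first.
    normalised[animal] = sorted(as_lines(value), key=itemgetter(2), reverse=True)

print(normalised)
```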
If your dates are all in the format year-month-day-hour-min-sec (for example 2012-06-12-23-27-14), I think the split step is not necessary; just compare them as strings.
>>> '2012-06-12-23-27-14' > '2012-06-12-14-02-30'
True
Firstly, you'll probably want each key,value item in the dict to be of a similar type. At the moment some of them (eg: database['dog'] ) are a list of strings (a line) and some (eg: database['cat']) are a list of lines. If you get them all into list of lines format (even if there's only one item in the list of lines) it will be much easier.
Then, one (old) way would be to make a comparison function for those lines. This will be easy since your dates are already in a format that's directly (string) comparable. To compare two lines, you want to compare the 3rd (2nd index) item in them:
def compare_line_by_date(x, y):
    return cmp(x[2], y[2])
Finally you can get the lines for a particular key sorted by telling the sorted builtin to use your compare_line_by_date function:
sorted(database['cat'],compare_line_by_date)
The above is suitable (but slow, and will disappear in python 3) for arbitrarily complex comparison/sorting functions. There are other ways to do your particular sort, for example by using the key parameter of sorted:
def key_for_line(line):
    return line[2]
sorted(database['cat'], key=key_for_line)
Using keys for sorting is much faster than cmp because the key function only needs to be run once per item in the list to be sorted, instead of every time items in the list are compared (which is usually much more often than the number of items in the list). The idea of a key is to basically boil each list item down into something that can be compared naturally, like a string or a number. In the example above we boiled the line down into just the date, which is then compared.
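In Python 3 both the cmp builtin and the cmp argument of sorted are gone. If an old-style comparison function has to be kept, functools.cmp_to_key can adapt it; the sketch below shows that route next to the plain key function. The sample list cat_lines copies the 'cat' entries from the question, and the subtraction expression is a common stand-in for cmp.

```python
from functools import cmp_to_key

cat_lines = [
    ['blue', 'cat', '2012-06-13-03-23-48'],
    ['orange', 'cat', '2012-06-11-18-33-12'],
]

def compare_line_by_date(x, y):
    # Yields -1, 0 or 1, as cmp(x[2], y[2]) did in Python 2.
    return (x[2] > y[2]) - (x[2] < y[2])

def key_for_line(line):
    return line[2]

by_cmp = sorted(cat_lines, key=cmp_to_key(compare_line_by_date))
by_key = sorted(cat_lines, key=key_for_line)
print(by_cmp == by_key)
```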
Disclaimer: I haven't tested any of the code in this answer... but it should work!

How to append and uniqify a tuple

d1 = ({'x':1, 'y':2}, {'x':3, 'y':4})
d2 = ({'x':1, 'y':2}, {'x':5, 'y':6}, {'x':1, 'y':6, 'z':7})
I have two tuples, d1 and d2. I know tuples are immutable, so I have to go through a list to append one tuple to another. Is there a better solution?
The next question is how to uniquify a tuple on a key, say 'x'. If 'x': 1 appears twice among the dicts, it is a duplicate.
append_tuple = ({'x':1, 'y':2}, {'x':5, 'y':6}, {'x':1, 'y':6, 'z':7}, {'x':1, 'y':2}, {'x':3, 'y':4})
unique_tuple = ({'x':1, 'y':2}, {'x':3, 'y':4}, {'x':5, 'y':6})
Note:
I want to remove duplicate elements from a tuple of dicts: if the value for a key, say 'x', is the same in two dicts, those dicts are duplicates.
There is no better way to extend a tuple. I would say if you are constantly doing this, you need to migrate away from a tuple, or change your design.
Yet again it sounds like you are using the wrong type of collection. But you could remove the duplicated keys from the dictionary using the existing SO answer How to uniqufy the tuple element?
Tuples are generally for data where the number of items is fixed and each place has its own "meaning", so things like sorting, appending, and removing duplicates will never be very natural on tuples and weren't designed to be. If you're stuck with tuples, converting to a list, doing these operations, then converting back is perfectly reasonable.
For appending, you'd do:
d1 += (ITEM,)
and to extend, you'd just do:
d1 += d2
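A short sketch of what these two statements do, using the tuples from the question and a made-up item {'x': 9}. Since tuples are immutable, += builds a new tuple and rebinds the name rather than changing the tuple in place.

```python
d1 = ({'x': 1, 'y': 2}, {'x': 3, 'y': 4})
d2 = ({'x': 1, 'y': 2}, {'x': 5, 'y': 6}, {'x': 1, 'y': 6, 'z': 7})

original = d1        # keep a second reference to the starting tuple
d1 += ({'x': 9},)    # append one item; the trailing comma makes a 1-tuple
d1 += d2             # extend with every item of d2

print(len(original), len(d1))
```

The tuple still referenced by original keeps its two items; only the name d1 now points at the longer tuple.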
For uniquifying:
unique_list = []
for i1 in append_tuple:
    if not any((k,v) in i2.items() for (k,v) in i1.items() for i2 in unique_list):
        unique_list.append(i1)
unique_tuple = tuple(unique_list)
It might seem like there's a more concise/elegant solution, but what you are trying to do is fairly specific, and for stuff like this it's better to be explicit and build up the list in a for loop than to try to force it into a comprehension or similar.
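If the rule really is "same value for 'x' means duplicate", an alternative is to track the values already seen for that one key. This is a sketch under the assumptions that every dict contains the key and that its values are hashable; the function name unique_by_key is hypothetical.

```python
append_tuple = ({'x': 1, 'y': 2}, {'x': 5, 'y': 6}, {'x': 1, 'y': 6, 'z': 7},
                {'x': 1, 'y': 2}, {'x': 3, 'y': 4})

def unique_by_key(items, key):
    """Keep the first dict seen for each distinct value of items[i][key]."""
    seen = set()
    result = []
    for item in items:
        value = item[key]  # assumes every dict has this key
        if value not in seen:
            seen.add(value)
            result.append(item)
    return tuple(result)

unique_tuple = unique_by_key(append_tuple, 'x')
print(unique_tuple)
```

Unlike the loop above, this compares only the chosen key, so two dicts that share some other key-value pair are not treated as duplicates.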
