Python 2.7: efficiently reformat list of tuples - python

I would like to reformat the following list containing tuples with integers (shared between some tuples) and strings (idiosyncratic to each tuple)
mylist = [(8, 'dddd'), (8, '33333'), (8, 'fdsss'), (9, 'fsfjs'),(10, 'dddd'), (10, '33333'), (12, 'fdsss'), (12, 'fsfjs')]
so that each tuple contains an integer and a concatenated string of all strings belonging to it, like so:
mynewlist = [(8, 'dddd, 33333, fdsss'), (9, 'fsfjs'), (10, 'dddd, 33333'), (12, 'fdsss, fsfjs')]
After some deliberation, the most parsimonious solution I've come up with is to simply loop across all tuples and concatenate strings until the integer doesn't match the next one:
mynewlist = []
label = ''
for i in range(len(mylist)-1):
    if mylist[i][0] != mylist[i+1][0]:
        mynewlist.append(tuple([mylist[i][0], label + mylist[i][1]]))
        label = ''
    else:
        label = label + mylist[i][1] + ','
This works fine. However, I'd like to know if there's a more efficient/Pythonic way of producing the list. I considered using a list comprehension, but this wouldn't allow me to select the strings without going through the whole list many times over; the list comprehension would need to be run for each unique integer, which seems wasteful. I also considered pre-selecting the strings associated with a unique integer through indexing, but this appears quite un-Pythonic to me.
Advice is very appreciated. Thanks!

You could use itertools.groupby() to do the grouping here:
from itertools import groupby
from operator import itemgetter
mynewlist = [
    (key, ', '.join([s for num, s in group]))
    for key, group in groupby(mylist, itemgetter(0))]
This uses list comprehensions to process each group and extract the strings from the grouped tuples for concatenation. The operator.itemgetter() object tells groupby() to group the input on the first element:
>>> from itertools import groupby
>>> from operator import itemgetter
>>> mylist = [(8, 'dddd'), (8, '33333'), (8, 'fdsss'), (9, 'fsfjs'),(10, 'dddd'), (10, '33333'), (12, 'fdsss'), (12, 'fsfjs')]
>>> [(key, ', '.join([s for num, s in group])) for key, group in groupby(mylist, itemgetter(0))]
[(8, 'dddd, 33333, fdsss'), (9, 'fsfjs'), (10, 'dddd, 33333'), (12, 'fdsss, fsfjs')]
Note that the groupby() iterator groups only consecutive matching elements. That means if your input is not sorted, then tuples with the same initial element are not necessarily going to always be put together either. If your input is not sorted and you need all tuples with the same starting element to be grouped regardless of where they are in the input sequence, use a dictionary to group the elements first:
grouped = {}
for key, string in mylist:
    grouped.setdefault(key, []).append(string)
# each value in grouped is already a list of strings, so join it directly
mynewlist = [(key, ', '.join(strings)) for key, strings in grouped.items()]
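The consecutive-grouping caveat above can be seen in a small sketch. It is written for Python 3, and the `shuffled` list and the `join_groups` helper are made up for illustration; they are not from the question.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical unsorted input: the key 8 appears in two separate runs.
shuffled = [(8, 'dddd'), (9, 'fsfjs'), (8, '33333')]

def join_groups(pairs):
    """Join the strings of each run of consecutive equal keys."""
    return [(key, ', '.join(s for _, s in group))
            for key, group in groupby(pairs, key=itemgetter(0))]

# Without sorting, groupby() emits key 8 twice, once per run.
unsorted_result = join_groups(shuffled)

# Sorting on the key first makes equal keys adjacent. sorted() is stable,
# so strings keep their original relative order within each key.
sorted_result = join_groups(sorted(shuffled, key=itemgetter(0)))

print(unsorted_result)
print(sorted_result)
```

Sorting costs O(n log n), so the dictionary approach is likely the better fit when the input is large and unsorted.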

A defaultdict would do the trick:
from collections import defaultdict
dct = defaultdict(list)
for k, v in mylist:
    dct[k].append(v)
mynewlist = [(k, ','.join(v)) for k, v in dct.iteritems()]
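The snippet above targets Python 2.7; dict.iteritems() was removed in Python 3. A sketch of the same idea for Python 3, which iterates over sorted keys so the output order does not depend on dictionary ordering and uses ', ' as the separator to match the output requested in the question:

```python
from collections import defaultdict

mylist = [(8, 'dddd'), (8, '33333'), (8, 'fdsss'), (9, 'fsfjs'),
          (10, 'dddd'), (10, '33333'), (12, 'fdsss'), (12, 'fsfjs')]

# Collect every string under its integer key in one pass.
dct = defaultdict(list)
for k, v in mylist:
    dct[k].append(v)

# sorted(dct) yields the keys in ascending order.
mynewlist = [(k, ', '.join(dct[k])) for k in sorted(dct)]
print(mynewlist)
```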

You can do it with a custom dict subclass:
class mydict(dict):
    def __setitem__(self, key, val):
        self.setdefault(key,[]).append(val)
>>> mylist = [(8, 'dddd'), (8, '33333'), (8, 'fdsss'),
... (9, 'fsfjs'),(10, 'dddd'), (10, '33333'),
... (12, 'fdsss'), (12, 'fsfjs')]
>>> d = mydict()
>>> for key, val in mylist:
...     d[key] = val
Now d contains something like
{8: ['dddd', '33333', 'fdsss'], 9: ['fsfjs'], 10: ['dddd', '33333'], 12: ['fdsss', 'fsfjs']}
(to within order of items), and you can easily massage this into the form you want:
result = [(key, ', '.join(value)) for key, value in d.items()]

Related

Python: How to select and then compare and filter certain values in a dict?

I have a dict which looks like this:
{
'Amathus': [datetime.date(2022, 8, 10), datetime.time(1, 30), 'Wroclaw', Decimal('3.75'), 33.91],
'Falesia': [datetime.date(2022, 8, 10), datetime.time(1, 30), 'Wroclaw', Decimal('4.00'), 21.46],
'Diamond': [datetime.date(2022, 8, 10), datetime.time(1, 30), 'Posznan', Decimal('4.50'), 40.24],
'Kid': [datetime.date(2022, 8, 10), datetime.time(1, 30), 'Posznan', Decimal('4.50'), 42.24]
}
and so on.
I now want to select every key whose value contains "Wroclaw", sort these keys by the last value in the list (74,14 for example), and return only the top 3 keys with all their values attached.
My try so far:
I get all the keys with this:
getkeys = [k for k, v in mydict.items() if city in v] #city is a variable, containing "Wroclaw"
newdict = {}
for k in getkeys:
    aupdate = {k: mydict[k]}
    newdict.update(aupdate)
sorteddict = sorted(newdict, key=newdict.get, reverse=True)
So far, so good: I now have the keys in sorted order in a list. Now I could use something like this to print the 3 highest values:
counting = 0
while counting <= 2:
    testvalue = newdict[sorteddict[counting]]
    print(sorteddict[counting],testvalue)
    counting += 1
But this feels clunky and not like the best solution; it is as far as I have come right now.
How can I improve this approach further? Any advice is appreciated :D
You don't need the sorted list of keys.
result = sorted(newdict.items(), key=lambda x: x[1], reverse=True)[:3]
This returns a list of (key, value) tuples.
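Note that sorting on x[1] compares the whole value lists element by element. If the ranking should use only the last number in each list, as the question describes, one possible sketch uses heapq.nlargest with a key on that last element. It assumes the city is always stored at index 2; the variable names `top3`, `name`, and `values` are introduced here for illustration.

```python
import datetime
from decimal import Decimal
from heapq import nlargest

mydict = {
    'Amathus': [datetime.date(2022, 8, 10), datetime.time(1, 30), 'Wroclaw', Decimal('3.75'), 33.91],
    'Falesia': [datetime.date(2022, 8, 10), datetime.time(1, 30), 'Wroclaw', Decimal('4.00'), 21.46],
    'Diamond': [datetime.date(2022, 8, 10), datetime.time(1, 30), 'Posznan', Decimal('4.50'), 40.24],
    'Kid': [datetime.date(2022, 8, 10), datetime.time(1, 30), 'Posznan', Decimal('4.50'), 42.24],
}
city = 'Wroclaw'

# Keep only entries for the requested city (assumed to sit at index 2),
# then take up to three entries with the largest last element.
top3 = nlargest(
    3,
    ((name, values) for name, values in mydict.items() if values[2] == city),
    key=lambda item: item[1][-1],
)

for name, values in top3:
    print(name, values)
```

nlargest returns the entries in descending order and simply returns fewer than three when fewer match, as happens with this sample data.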

Insert values to a dictionary in an ascending order in Python?

I have a class that takes a key and a value and adds them to a dictionary. I am trying to insert into the dict while keeping the values in ascending order. I know that OrderedDict remembers the order in which keys were first inserted, but I am wondering what to use if I want to keep the values of the dict sorted as well. Here is an example:
from collections import OrderedDict
from random import randint
class MyDict():
    def __init__(self):
        self.d = OrderedDict()
    def add(self, key, val):
        self.d[key] = val
    def show(self):
        return self.d
i = 1
obj = MyDict()
for _ in range(5):
    obj.add(i, randint(1,50))
    i += 1
print(obj.show())
OrderedDict([(1, 8), (2, 6), (3, 10), (4, 32), (5, 15)])
However, I am looking to get something like this:
OrderedDict([(2, 6), (1, 8), (3, 10), (5, 15), (4, 32)])
Since it is apparent from your comments that you only need the dict sorted upon output but not during a series of insertions, you don't actually need to incur the overhead of sorting upon each insertion, but rather you can add sorting to methods where the order matters, which can be done by subclassing OrderedDict and overriding all such methods with ones that would sort the items first.
The example below overrides the __repr__ method just so that printing the object would produce your desired output, but you should override all the other relevant methods for completeness:
class MyDict(OrderedDict):
    def __sorted(self):
        sorted_items = sorted(super().items(), key=lambda t: t[::-1])
        self.clear()
        self.update(sorted_items)
    def __repr__(self):
        self.__sorted()
        return super().__repr__()
    # do the same to methods __str__, __iter__, items, keys, values, popitem, etc.
Demo: https://replit.com/#blhsing/RudeMurkyIntegrationtesting
Since you are looking to sort the dictionary based on its values, you can sort its items with sorted():
>>> from collections import OrderedDict
>>> from random import randint
>>>
>>> d = OrderedDict()
>>>
>>> i=1
>>> for _ in range(5):
...     d[i] = randint(1,50)
...     i+=1
...
>>>
>>> sorted(d.items(),key=lambda x:(x[1],x[0]))
[(2, 2), (5, 20), (3, 35), (1, 36), (4, 47)]
>>>
The key argument of sorted() is used here to sort by value, with the dictionary key as a tie-breaker.
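sorted() returns a plain list of pairs. If an OrderedDict is wanted as the result, a minimal sketch is to rebuild one from the sorted pairs. The fixed values below are the ones printed in the question rather than random ones, and `by_value` is a name introduced here.

```python
from collections import OrderedDict

# Fixed sample data taken from the question's printed output.
d = OrderedDict([(1, 8), (2, 6), (3, 10), (4, 32), (5, 15)])

# Sort the (key, value) pairs by value, breaking ties by key,
# and build a new OrderedDict that remembers that order.
by_value = OrderedDict(sorted(d.items(), key=lambda kv: (kv[1], kv[0])))
print(by_value)
```

The new dictionary is a snapshot: later insertions are appended at the end, so it has to be rebuilt whenever the sorted view is needed again.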

iterating over list-valued dictionary

Suppose I have a dictionary that has tuples as keys and lists of tuples as values, for example:
d = {(0,1): [(1,1)],
     (0,2): [(1,1),(1,2)],
     (0,3): [(1,1),(1,2),(1,3)]}
I would like to remove all entries such that their value is contained in the value of another key, for example:
from d I would like to remove the entry with key (0,1) because the (1,1) is contained in [(1,1),(1,2)]
and remove the entry with key (0,2) because [(1,1),(1,2)] is contained in [(1,1),(1,2),(1,3)].
Order of tuples in the lists matters.
I can solve this using a bunch of for loops, like this:
to_del = []
for key, val in d.items():
    for k,v in d.items():
        for i in range(0, len(val)):
            if val[i] in v and len(v) - len(val) == 1:
                del_me = True
            else:
                del_me = False
                break
        if del_me:
            to_del.append(key)
for key in set(to_del):
    del d[key]
edit: (further explanation)
Keys are not important here but will be important later.
In other words:
let a,b,c,d denote unique tuples
let k1,k2,..., denote keys
let's take the entries:
k1:[a],
k2:[d],
k3:[a,b],
k4:[b,a],
k5:[a,b,c],
I want to end up with:
k2,k4,k5
Removed entries will be:
k1 because a is in k3
k3 because a,b is in k5
It hurts my eyes when I'm looking at this, sorry.
What would be a pythonic way to do this?
Let's say that you have a dictionary that looks as follows:
d={(0,1):[(1,1)],(0,2):[(1,1),(1,2)],(0,3):[(1,1),(1,2),(1,3)],(0,4):[(1,1),(1,2),(1,4)]}
Now you want to compare the values against all the keys with one another. I suggest you place all the values in this dictionary in a list.
mylist=[]
for key in d.keys():
    mylist.append(d[key])
print(mylist)
mylist now holds all the values of the dictionary:
[[(1, 1)], [(1, 1), (1, 2)], [(1, 1), (1, 2), (1, 3)], [(1, 1), (1, 2), (1, 4)]]
Now you want to compare all these values with one another and remove those values that are the subsets of any of those values. Like in this example, [(1,1)] is the subset of [(1,1),(1,2)] and so [(1,1)] will be removed. Similarly, [(1,1),(1,2)] is the subset of [(1,1),(1,2),(1,3)] and so it will also be removed. We can accomplish this as follows:
out = []
for k, elm in enumerate(mylist):
    for _,elm2 in enumerate(mylist[:k] + mylist[k + 1:]):
        if frozenset(elm).issubset(elm2):
            break
    else:
        out.append(elm)
print(out)
The out list then contains only the values that are not a subset of any other value in mylist:
[[(1, 1), (1, 2), (1, 3)], [(1, 1), (1, 2), (1, 4)]]
In the code above, the outer loop walks over the list and tracks the current index. The inner loop walks over the same list with the element at the current index removed. The outer element is converted to a frozenset, and issubset checks whether it is a subset of the inner element. If no inner element contains it, the else branch of the for loop appends it to out.
Finally, we compare the values of dictionary d with the out list and delete every key whose value is not in out:
for key in list(d):
    if d[key] not in out:
        del d[key]
print(d)
The output dictionary d will be:
{(0, 3): [(1, 1), (1, 2), (1, 3)], (0, 4): [(1, 1), (1, 2), (1, 4)]}
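The set-based check above ignores the order of tuples, while the question states that order matters (k4: [b, a] should survive even though k5 is [a, b, c]). A possible order-aware sketch follows, assuming "contained" means that the shorter list appears as a contiguous, in-order run inside a longer one. The helper names `is_contiguous_sublist` and `drop_contained` are hypothetical, and the strings 'a' to 'd' stand in for the unique tuples of the question.

```python
def is_contiguous_sublist(short, long):
    """Return True if `short` occurs in `long` as a contiguous run, in order."""
    n = len(short)
    return any(long[i:i + n] == short for i in range(len(long) - n + 1))

def drop_contained(d):
    """Keep entries whose value is not contained in a strictly longer value."""
    return {
        key: val
        for key, val in d.items()
        if not any(
            len(val) < len(other) and is_contiguous_sublist(val, other)
            for other_key, other in d.items()
            if other_key != key
        )
    }

# Stand-in data mirroring the k1..k5 example from the question.
entries = {
    'k1': ['a'],
    'k2': ['d'],
    'k3': ['a', 'b'],
    'k4': ['b', 'a'],
    'k5': ['a', 'b', 'c'],
}
result = drop_contained(entries)
print(sorted(result))
```

This builds a new dictionary instead of deleting while iterating, and it compares every pair of values, so it is quadratic in the number of entries.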

Adding the values of a dictionary to a list

Hi, I am having a bit of difficulty collecting tuples that are the values of a dictionary. I have extracted the tuples and need to add them to an iterable, say an empty list, i.e.
path = [1,2,3,4]
pos = {1:(3,7), 2(3,0),3(2,0),4(5,8)}
h = []
for key in path:
    if key in pos:
        print pos[key]
        h.append(pos[Key])#Gives an error
How can I append the values in pos[key] to h? Thanks
You can use list comprehension:
h = [pos[key] for key in path if key in pos]
Demo:
print h
>>> [(3, 7), (3, 0), (2, 0), (5, 8)]
Notes:
A dictionary should be declared like pairs of key:value. Your syntax is incorrect.
Also, Python is case-sensitive so key is different than Key.
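Putting both notes together, a corrected version of the loop from the question might look like the sketch below. It is written for Python 3, so print is called as a function, whereas the snippets above use the Python 2 print statement.

```python
path = [1, 2, 3, 4]
# Every entry needs a colon between key and value.
pos = {1: (3, 7), 2: (3, 0), 3: (2, 0), 4: (5, 8)}

h = []
for key in path:
    if key in pos:
        # Use the lowercase loop variable `key`, not `Key`.
        h.append(pos[key])

print(h)
```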

Python convert pairs list to dictionary

I have a list of about 50 strings with an integer representing how frequently they occur in a text document. I have already formatted it like shown below, and am trying to create a dictionary of this information, with the first word being the value and the key is the number beside it.
string = [('limited', 1), ('all', 16), ('concept', 1), ('secondly', 1)]
The code I have so far:
my_dict = {}
for pairs in string:
    for int in pairs:
        my_dict[pairs] = int
Like this: Python's dict() function is designed for converting a list of key-value tuples, which is exactly what you have:
>>> string = [('limited', 1), ('all', 16), ('concept', 1), ('secondly', 1)]
>>> my_dict = dict(string)
>>> my_dict
{'all': 16, 'secondly': 1, 'concept': 1, 'limited': 1}
Just call dict():
>>> string = [('limited', 1), ('all', 16), ('concept', 1), ('secondly', 1)]
>>> dict(string)
{'limited': 1, 'all': 16, 'concept': 1, 'secondly': 1}
The string variable is a list of pairs, which means you can do something similar to this:
string = [...]
my_dict = {}
for k, v in string:
    my_dict[k] = v
Alternatively, zip two lists into pairs and pass them to dict():
list_1 = [1,2,3,4,5]
list_2 = [6,7,8,9,10]
your_dict = dict(zip(list_1, list_2))
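The question also mentions using the number as the key. Counts repeat in the sample data (three words occur once), so a plain dict keyed by count would keep only the last word for each count. A sketch that groups the words by count instead; `pairs` and `words_by_count` are names introduced here for illustration:

```python
from collections import defaultdict

pairs = [('limited', 1), ('all', 16), ('concept', 1), ('secondly', 1)]

# Map each count to the list of words that occur that many times.
words_by_count = defaultdict(list)
for word, count in pairs:
    words_by_count[count].append(word)

print(dict(words_by_count))
```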
