Python group and sum with string as keys [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 3 years ago.
Beginners question...
If possible without pandas, I'd like to sum up groups within a list or an array.
Input:
Input = [["A",0.2],["B",0.5],["A",0.6],["C",0.1],["B",0.9]]
Desired Output:
Output = [["A",0.8],["B",1.4],["C",0.1]]
Thanks!

You can sum over equal keys with a dictionary. If you really need a list of lists, you can rebuild it from the dictionary with a list comprehension:
lst = [["A",0.2],["B",0.5],["A",0.6],["C",0.1],["B",0.9]]
d = dict()
for sl in lst:
    d[sl[0]] = d.get(sl[0], 0) + sl[1]
res = [[k, v] for k, v in d.items()]

You can do it like this:
from collections import defaultdict
sums = defaultdict(lambda: 0)
for arr in Input:
    sums[arr[0]] += arr[1]
output = [[key, value] for key, value in sums.items()]
This seems the most idiomatic way to me. By Python convention, variable names should be lowercase with underscores, so Input in the question would normally be named something like input_list (plain input would shadow the built-in function). You can learn more about defaultdict here: https://docs.python.org/3.6/library/collections.html
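As a self-contained sketch of the defaultdict approach above, the loop can be wrapped in a small function. The name group_sum and the sample data are made up for illustration and are not from the answer; defaultdict(float) is used in place of lambda: 0, which should behave the same way for numeric sums.

```python
from collections import defaultdict

def group_sum(pairs):
    """Sum the second item of each [key, value] pair, grouped by key."""
    sums = defaultdict(float)  # missing keys start at 0.0
    for key, value in pairs:
        sums[key] += value
    # dicts keep insertion order (Python 3.7+), so keys come out in first-seen order
    return [[key, total] for key, total in sums.items()]

# Hypothetical sample data with integer values, to keep the arithmetic exact
result = group_sum([["A", 1], ["B", 2], ["A", 3], ["C", 4], ["B", 5]])
print(result)
```

Unpacking each pair as key, value in the loop header avoids the arr[0] and arr[1] indexing, at the cost of failing if a sublist does not have exactly two items.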

You can use this:
Dict = {group[0]: 0 for group in Input}
for group in Input:
    Dict[group[0]] += group[1]
Output = [[group, value] for group, value in Dict.items()]
A dict can only have unique keys, which solves half of the problem. To start, the value of every key is 0.
Next, we iterate through Input and add each value to its corresponding key in Dict. Now the job is almost done.
We only have to convert the dict to the form you want, using a list comprehension.

I would recommend using a dict, as shown below:
inp = [["A",0.2],["B",0.5],["A",0.6],["C",0.1],["B",0.9]]
# create an empty dict
out = {}
# for each element of the inp list
for a in inp:
    # if the key already exists in the dict, add the current value
    # to the sum stored so far
    if a[0] in out:
        out.update({a[0]: a[1] + out.get(a[0])})
    # otherwise create a new pair (new key, new value) taken from inp
    else:
        out.update({a[0]: a[1]})
print(out)
# if you really need a nested list as output, you can convert the dict
# back into a list of lists
res = [[key, val] for key, val in out.items()]
print(res)
Output (Python 3.7+, where dicts keep insertion order):
{'A': 0.8, 'B': 1.4, 'C': 0.1}
[['A', 0.8], ['B', 1.4], ['C', 0.1]]

This should work:
Output = [ [k,sum(a[1] for a in Input if a[0] == k)] for k in set(a[0] for a in Input) ]
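One caveat with the one-liner above: a set has no guaranteed order, so the groups may come out in any order. A possible variant, shown only as a sketch with made-up integer data, uses dict.fromkeys to keep the keys in first-seen order. It still rescans the input once per key, so the dictionary-based loops in the other answers are likely a better fit for large inputs.

```python
pairs = [["A", 1], ["B", 2], ["A", 3], ["C", 4], ["B", 5]]

# dict.fromkeys drops duplicate keys while keeping first-seen order
keys = list(dict.fromkeys(key for key, _ in pairs))

# Same idea as the one-liner: one sum per key (pairs is rescanned for every key)
output = [[k, sum(value for key, value in pairs if key == k)] for k in keys]
print(output)
```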

Related

How to loop between a list and dictionary to get frequencies in separate list (Python) [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 1 year ago.
I am trying to loop over a list of strings and match them against the keys of a dictionary, in order to collect their frequencies in another list. For example:
a = ['NIEP','LOAS','IJFE','NIEP'] #list of characters
b = [] #empty list that would ultimately contain all the matching values
di = {'NIEP':0,'IJFE':0} #dictionary values with frequency counts set to 0
What b would ideally look like:
b = ['NIEP':2,'IJFE':1] #the numbers are the amount of times they repeat in list a
I have tried:
b = []
for x in di.keys():
    if x in a:
        b.append(di[x])
This returns only zeros, like [0, 0]. I also tried:
b = [di[x] for x in a if x in di]
This is the same idea in a different form; it still returns zeros, just three of them.

Your b should be a dictionary, because you are counting how often each key of di occurs in a. Iterate over the list a, and count an item only if it exists in the dictionary di.
a = ['NIEP','LOAS','IJFE','NIEP'] #list of characters
di = {'NIEP':0,'IJFE':0}
b = {}
for x in a:
    if x in di.keys():
        if x in b:
            b[x] += 1
        else:
            b[x] = 1
print(b)
OUTPUT
{'NIEP': 2, 'IJFE': 1}

Since you are starting with a dictionary containing keys and zero values, it seems like you want to maintain these zero values for keys even if they are not in a. One way to do that is to take a count of the values in a if the key is in di and then merge them with the original di. For example:
from collections import Counter
a = ['NIEP','LOAS','IJFE','NIEP'] #list of characters
di = {'NIEP':0,'IJFE':0}
b = {**di, **Counter(s for s in a if s in di)}
# {'NIEP': 2, 'IJFE': 1}
If di has additional keys, those will be preserved with the 0 value:
a = ['NIEP','LOAS','IJFE','NIEP'] #list of characters
di = {'NIEP':0,'IJFE':0, 'NULL': 0}
b = {**di, **Counter(s for s in a if s in di)}
# {'NIEP': 2, 'IJFE': 1, 'NULL': 0}
A further alternative, if you want to keep the counter rather than a dict is to make the counter, then update it to get the zero values:
b = Counter(s for s in a if s in di)
b.update(di)
# b will be Counter({'NIEP': 2, 'IJFE': 1, 'NULL': 0})

Just use collections.Counter from the standard library:
from collections import Counter
a = ['NIEP','LOAS','IJFE','NIEP']
di = {'NIEP':0,'IJFE':0}
b = Counter(value for value in a if value in di)
print(b)
Output
Counter({'NIEP': 2, 'IJFE': 1})
But if you want a list of results
...
b = [{key: value} for key, value in b.items()]
print(b)
Output
[{'NIEP': 2}, {'IJFE': 1}]

You can use Counter from collections:
from collections import Counter
a = ['NIEP','LOAS','IJFE','NIEP']
b = dict(Counter(a))
print(b)
Result:
{'NIEP': 2, 'LOAS': 1, 'IJFE': 1}
Or, if you do not want to use a module, try a dict comprehension:
b = {ele: a.count(ele) for ele in a}
print(b)
Result:
{'NIEP': 2, 'LOAS': 1, 'IJFE': 1}
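The two approaches give the same dictionary but do different amounts of work: a.count(ele) rescans the whole list for every element, while Counter walks the list once. A short sketch comparing them (the names by_count and by_counter are chosen here for illustration):

```python
from collections import Counter

a = ['NIEP', 'LOAS', 'IJFE', 'NIEP']

# list.count rescans the list for every element, so this is quadratic in len(a)
by_count = {ele: a.count(ele) for ele in a}

# Counter makes a single pass over the list
by_counter = dict(Counter(a))

print(by_count == by_counter)
```

For a four-element list the difference is irrelevant; it only starts to matter as the list grows.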

Or try:
>>> a = ['NIEP','LOAS','IJFE','NIEP']
>>> {k: a.count(k) for k in a}
{'NIEP': 2, 'LOAS': 1, 'IJFE': 1}
>>>

Create a dictionary with specific pairs from other dictionaries

This list:
data=[[{'t1':'text1.txt','date1':'class1'}],[{'t2':'text2.txt','date2':'class2'}]]
data
gives
[[{'t1': 'text1.txt', 'date1': 'class1'}],
[{'t2': 'text2.txt', 'date2': 'class2'}]]
and I want to turn it into this:
EDIT brackets added
[[{'text1.txt': 'class1'}], [{'text2.txt': 'class2'}]]
which means:
to create a list in which each sublist contains a single dictionary, whose key is the first value of the original dictionary in that sublist and whose value is its second value, and so on for the following sublists.
My attempt was this:
se=[]
for i in data:
    for j in i:
        jk=j.values()
        se.append(jk)
se

Iterate over each dictionary in the nested list and build a tuple from its values with tuple(d.values()). Wrapping that tuple in a list and passing it to dict() creates the new dictionary: dict([tuple(d.values())]).
Note: this works only if each dictionary has exactly two keys.
res = [[dict([tuple(d.values())]) for d in lst] for lst in data]
print(res)
Output:
[[{'text1.txt': 'class1'}], [{'text2.txt': 'class2'}]]
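Because the approach depends on every dictionary holding exactly two items, it may help to make that assumption explicit. The sketch below does the same conversion through a helper that raises a clear error otherwise; pair_values is a hypothetical name, not part of the answer.

```python
def pair_values(d):
    """Turn a two-item dict into {first_value: second_value}."""
    values = list(d.values())
    if len(values) != 2:
        raise ValueError("expected exactly two items, got %d" % len(values))
    key, value = values
    return {key: value}

data = [[{'t1': 'text1.txt', 'date1': 'class1'}],
        [{'t2': 'text2.txt', 'date2': 'class2'}]]

# Keep the nested list-of-lists shape of the input
res = [[pair_values(d) for d in sublist] for sublist in data]
print(res)
```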

Your code does most of the job. You can add another line to get the desired results:
In [108]: se
Out[108]: [dict_values(['text1.txt', 'class1']), dict_values(['text2.txt', 'class2'])]
In [109]: [[{list(x)[0]:list(x)[1]} for x in se]]
Out[109]: [[{'text1.txt': 'class1'}, {'text2.txt': 'class2'}]]

Try this:
data=[[{'t1':'text1.txt','date1':'class1'}],[{'t2':'text2.txt','date2':'class2'}]]
# collect every value, in order
all_vals = []
for i in data:
    for j in i:
        for key in j:
            all_vals.append(j[key])
# pair up consecutive values: each even index is a key, the next index is its value
new_list = []
for i in range(0, len(all_vals) - 1, 2):
    new_list.append({all_vals[i]: all_vals[i+1]})
print(new_list)
Output:
[{'text1.txt': 'class1'}, {'text2.txt': 'class2'}]
This code works regardless of the length of your list, as long as the values come in key/value pairs. Note that it produces a flat list of dictionaries rather than the nested list in the question.

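The following function should convert the inputs to the outputs you requested.
def convert_to_list_of_list_of_dictionaries(input_dictionaries):
    ret_dictionaries = []
    for inp in input_dictionaries:
        # each inner list wraps a single two-item dict
        k, v = inp[0].values()
        # {k: v} builds a dict ({k, v} would build a set); the extra list matches the requested output
        ret_dictionaries.append([{k: v}])
    return ret_dictionaries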
However, a few things about the input and output formats are a little concerning and make the data harder to work with.
On the input side, the data is wrapped in an extra list that serves no purpose in this context and forces you to index the first element of the inner list to reach the dict: k, v = inp[0].values(). On the output side we are doing the same thing, which makes the outputs harder to iterate over. It would look something like:
# pseudocode
for kv in reformatted_outputs:
    unwrapped_dict = kv[0]
    key, value = next(iter(unwrapped_dict.items()))
    # process key and value
Instead, if you had an output format like `{'text1.txt': 'class1', 'text2.txt': 'class2'}`, you could process the data like:
for key, value in reformatted_outputs.items():
    # process key and value
If you have the ability to modify the inputs and outputs of what you're working on, this could save you some trouble and save anyone you work with some head-scratching.
If you wanted to modify the output format, your function could look something like:
def convert_to_single_dictionary(input_dictionaries):
    ret = {}
    for inp in input_dictionaries:
        # it looks like the input data is being wrapped in an extra list
        k, v = inp[0].values()
        ret[k] = v
    return ret
Hope this is helpful and thanks for asking the question!

Combining two multi-dimensional arrays [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 3 years ago.
Given these two arrays in Python:
a = [ ['a',1], ['b',2], ['c',3] ]
b = [ ['a',10], ['b',20], ['c',30], ['d',40] ]
Is it possible to combine them into a single array like this?
output = [ ['a',1,10], ['b',2,20], ['c',3,30], ['d',0,40] ]
Since 'd' doesn't exist in the first array, its first integer should be 0.
Thank you.

As stated in the comments, if missing values can be at any place, you cannot use zip or zip_longest. Here I use dicts for joining values:
a = [ ['a',1], ['b',2], ['c',3] ]
b = [ ['a',10], ['b',20], ['c',30], ['d',40] ]
d1, d2 = dict(a), dict(b)
d = {k: [d1.get(k, 0), d2.get(k, 0)] for k in d1.keys() | d2.keys()}
print( sorted([k, *v] for k, v in d.items()) ) # use custom key= to sort them to right order (or don't use sort if you don't need it)
Prints:
[['a', 1, 10], ['b', 2, 20], ['c', 3, 30], ['d', 0, 40]]

It seems to me that a dictionary would make more sense than lists as your data type, but if you need to use lists, this is one approach:
- the lists in the original lists are really just tuples matching a key to a value; a dictionary would make more sense here
- you want one list in the result for each key in the original lists
- the lists in the result list are really just 3-tuples matching a key to a first and a second value
- you want the value from the first list for each key (or 0 if none) as the first value
- you want the value from the second list for each key (or 0 if none) as the second value
da = {l[0]: l[1] for l in a}
db = {l[0]: l[1] for l in b}
result = [
    [k, da[k] if k in da else 0, db[k] if k in db else 0]
    for k in sorted(set(list(da.keys()) + list(db.keys())))
]
print(result)
Note that sorted() is there to give the keys a predictable (alphabetical) order, because a set has none. If you would rather keep the order in which the keys first appear, you can combine the keys and dedupe them using something like this:
def dedupe(seq):
    seen = set()
    seen_add = seen.add
    return [x for x in seq if not (x in seen or seen_add(x))]
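To show how a first-seen key order fits into the merge, here is a sketch on made-up, deliberately unsorted data. It uses dict.fromkeys for the dedupe step, which is an alternative to the helper above rather than the answer's own method, and dict.get for the 0 default.

```python
# Hypothetical inputs, unsorted on purpose
a = [['b', 2], ['a', 1], ['c', 3]]
b = [['a', 10], ['d', 40], ['b', 20], ['c', 30]]

da, db = dict(a), dict(b)

# Keys in first-seen order: all of a's keys, then any new keys from b
keys = list(dict.fromkeys(list(da) + list(db)))

# A key missing from either side contributes 0
result = [[k, da.get(k, 0), db.get(k, 0)] for k in keys]
print(result)
```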

Create dictionary using split() operation and list comprehension [closed]

Closed. This question needs debugging details. It is not currently accepting answers.
Closed 6 years ago.
I want to create a dictionary using a list comprehension:
a_list = ['1, Lastoŭski', '2, Kupala', '3, Kolas']
What I have tried so far is:
d = {key: value for (key, value) in s.split(',') for s in a}
>>> NameError: name 's' is not defined
But this is completely wrong. Could you help me?

As pointed out by @Delgan, it can be done directly via
d1 = dict(keyval.split(", ") for keyval in a_list)
without the inner nesting :)
Older approaches, which were not quite right (splitting on ',' alone leaves a leading space in each value):
d = [a.split(',') for a in a_list]
d1 = {key: val for key, val in d}
or
d1 = {key: val for key, val in (a.split(',') for a in a_list)}

No need for a dictionary comprehension. You're making something more complex than it needs to be ;).
a_list = ['1, Lastoŭski', '2, Kupala', '3, Kolas']
d = {}
for i in a_list:
    temp = i.split(', ')
    d[temp[0]] = temp[1]
print(d)
# prints: {'1': 'Lastoŭski', '2': 'Kupala', '3': 'Kolas'}
If you need a list comprehension, then this will suffice:
d = dict((key, value) for key, value in [i.split(', ') for i in a_list])
You were close, but you were missing brackets.

Try this:
d = dict(map(str.strip, x.split(',')) for x in a_list)
Here str.strip removes the space left after each comma, so the values come out without leading whitespace.
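The answers in this thread split on ',' or ', ', which works for the sample list but breaks if a value itself contains a comma. As a hedged sketch, a small helper can split on the first comma only and trim whitespace; parse_entry is a hypothetical name introduced here.

```python
a_list = ['1, Lastoŭski', '2, Kupala', '3, Kolas']

def parse_entry(entry):
    """Split 'key, value' on the first comma and trim surrounding whitespace."""
    key, value = entry.split(',', 1)  # maxsplit=1: later commas stay in the value
    return key.strip(), value.strip()

# dict() accepts any iterable of (key, value) pairs
d = dict(parse_entry(entry) for entry in a_list)
print(d)
```

An entry without any comma would raise a ValueError at the unpacking step, which is probably the behaviour you want for malformed input.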

Get the name of the dictionary that contains the given key in python [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 6 years ago.
a = {'name1':'tom'}
b = {'name2':'harry'}
c = {'name3':'peter'}
For a given key, I would like to get the name of the dictionary that contains it.
Example:
If I give 'name2', I would like to get b as the result.
Thanks

You can use next with a generator expression to get the first dictionary that meets a condition:
key = 'name2'
found = next(d for d in (a, b, c) if key in d)
This raises StopIteration if no dictionary contains the key.
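next also accepts a second argument that is returned when the iterator is exhausted. A minimal sketch of the same lookup with a None fallback (find_dict is a hypothetical helper name):

```python
a = {'name1': 'tom'}
b = {'name2': 'harry'}
c = {'name3': 'peter'}

def find_dict(key, dicts):
    """Return the first dict containing key, or None if there is none."""
    # The generator needs its own parentheses when a default is passed
    return next((d for d in dicts if key in d), None)

found = find_dict('name2', (a, b, c))
missing = find_dict('name9', (a, b, c))
print(found, missing)
```

This returns the dictionary object itself, not the variable name b; the name is only recoverable if you store it somewhere, for example as a key in a dict of dicts.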

a = {'name1':'tom'}
b = {'name2':'harry'}
c = {'name3':'peter'}
name = 'name2'
for k, v in locals().copy().items():
    if isinstance(v, dict):
        if name in v:
            print('"{}" is contained in dict "{}"'.format(name, k))
# "name2" is contained in dict "b"
But usually you shouldn't do this. Create a dict of dicts and iterate through it:
ds = {
    'a': {'name1':'tom'},
    'b': {'name2':'harry'},
    'c': {'name3':'peter'},
}
name = 'name2'
for k, v in ds.items():
    if name in v:
        print('"{}" is contained in dict "{}"'.format(name, k))
# "name2" is contained in dict "b"

Let's say that you put references to your dictionaries in an iterable
l = [da, db, dc, ...]
and that you initialize the reference to the matching dictionary to an invalid value
d = None
Finding which dictionary has word as a key is then simply
for _ in l:
    if word in _:
        d = _
        break
At this point, either you didn't find word in any of your dicts and d is still None, or you've found the dictionary containing word
and you can do whatever you want with it
if d is not None:
    ...

Since you said you would like to get b back as a variable, one way I can think of is to make a list of your dictionaries and loop through them.
a = {'name1':'tom'}
b = {'name2':'harry'}
c = {'name3':'peter'}
name = 'name2'
def findDict(dicts, name):
    # return the first dictionary that has name as a key (None if there is none)
    for obj in dicts:
        if name in obj:
            return obj
dicts = [a, b, c]
required_dic = findDict(dicts, name)
But as @germn said, a better idea would be to create a nested dictionary.

You may create a mapping dict, which maps the key to variable binding:
{"name1" : a, "name2" : b, "name3" : c}
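Writing such a mapping by hand gets tedious as the dictionaries grow. Assuming the dictionaries are kept in a dict of dicts keyed by name (the names dicts_by_name and owner_of are made up for this sketch), the mapping can be built once with a comprehension:

```python
dicts_by_name = {
    'a': {'name1': 'tom'},
    'b': {'name2': 'harry'},
    'c': {'name3': 'peter'},
}

# Invert once: each inner key points at the name of the dict that holds it
owner_of = {key: name for name, d in dicts_by_name.items() for key in d}

print(owner_of['name2'])
```

If the same key appears in more than one dictionary, the last one wins in this sketch; whether that is acceptable depends on your data.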
