Getting an empty list when appending values with multiprocessing

I am using Python 3.5 to edit a list, which in this case is predictions_dict['D'], included in the dictionary predictions_dict. This is the code that I use:
import multiprocessing as multip
predictions_dict = {'A': [],
                    'B': [],
                    'C': [],
                    'D': [],
                    'E': [],
                    'F': [],
                    'Def': []}
data = [{'index': 1, 'rank': 'A'}, {'index': 2, 'rank': 'D'}, {'index': 3, 'rank': 'E'}]
prediction = [(1, 'C'), (2, 'D'), (3, 'D')]
def create_predictions_dict(index, rank):
    for j in data:
        if j['index'] == index:
            predictions_dict[rank].append((index, j['rank'], rank))
            break
np = multip.cpu_count()
p = multip.Pool(processes=np)
_ = p.starmap(create_predictions_dict, prediction)
p.close()
p.join()
print('final list:', predictions_dict['D'])
When I execute this code, the output I get is:
final list: []
And I don't understand why, as I would expect to get:
final list: [(2, 'D', 'D'), (3, 'E', 'D')]

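I worked out a solution after the comments identified the problem: processes don't share state, so each worker appends to its own copy of predictions_dict and the parent's copy stays empty. Returning the values from the workers and appending them in the parent process fixes it: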
import multiprocessing as multip
predictions_dict = {'A': [],
                    'B': [],
                    'C': [],
                    'D': [],
                    'E': [],
                    'F': [],
                    'Def': []}
data = [{'index': 1, 'rank': 'A'}, {'index': 2, 'rank': 'D'}, {'index': 3, 'rank': 'E'}]
prediction = [(1, 'C'), (2, 'D'), (3, 'D')]
def create_predictions_dict(index, rank):
    for j in data:
        if j['index'] == index:
            return index, j['rank'], rank
np = multip.cpu_count()
p = multip.Pool(processes=np)
sk = p.starmap(create_predictions_dict, prediction)
p.close()
p.join()
for elem in sk:
    predictions_dict[elem[2]].append(elem)
print('final list:', predictions_dict['D'])
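
The return-and-aggregate pattern can also be sketched with a defaultdict, so the ranks need not be declared up front. This is only a minimal sketch, not the asker's code: lookup_prediction and group_by_predicted_rank are hypothetical names, and itertools.starmap is used as a serial stand-in for Pool.starmap so the logic can be checked without starting any processes.

```python
from collections import defaultdict
from itertools import starmap

data = [{'index': 1, 'rank': 'A'}, {'index': 2, 'rank': 'D'}, {'index': 3, 'rank': 'E'}]
prediction = [(1, 'C'), (2, 'D'), (3, 'D')]

# Index the records once so each lookup is a dict access, not a linear scan.
rank_by_index = {row['index']: row['rank'] for row in data}

def lookup_prediction(index, rank):
    """Return (index, actual_rank, predicted_rank) without touching shared state."""
    return index, rank_by_index[index], rank

def group_by_predicted_rank(results):
    """Aggregate the returned tuples; this part runs in the parent process."""
    grouped = defaultdict(list)
    for index, actual, predicted in results:
        grouped[predicted].append((index, actual, predicted))
    return dict(grouped)

# Serial stand-in: with a pool this line would be pool.starmap(lookup_prediction, prediction).
results = list(starmap(lookup_prediction, prediction))
predictions_dict = group_by_predicted_rank(results)
print('final list:', predictions_dict['D'])
```

When swapping in a real pool, the pool creation should sit under an `if __name__ == '__main__':` guard, which is required on platforms that start workers with the spawn method.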

Related

Python - How to convert a List into an Adjacency List for graphs structure

My problem is that I can't convert a graph constructed through a list into a graph constructed as a dictionary, which must act as an adjacency list.
I have already constructed a randomly generated graph by giving each edge a start node (string), an end node (string) and a weight (int).
But now I need to convert it to a graph like this (a dictionary) that represents an adjacency list:
example_graph = {
'A': {'B': 2, 'C': 3},
'B': {'A': 2, 'C': 1, 'D': 1, 'E': 4},
'C': {'A': 3, 'B': 1, 'F': 5},
'D': {'B': 1, 'E': 1},
'E': {'B': 4, 'D': 1, 'F': 1},
'F': {'C': 5, 'E': 1, 'G': 1},
'G': {'F': 1},
}
These graphs must be the same, which is why I need to convert the first one.
So what I did then is to put those three initial values (start node, end node, weight) into a list called graphConvert like this:
while i < graph.numberOfNodes():
    graphConvert.insert(i, list(zip(graph.edges[i].node1.printSingleNode(), graph.edges[i].node2.printSingleNode(), [graph.edges[i].weight])))
    deleteIntegers.append(graph.edges[i].weight)
    i += 1
deleteIntegers = list(set(deleteIntegers))
Here is an example of the result: [[('C', 'B', 4)], [('A', 'D', 2)], [('D', 'C', 3)], [('A', 'C', 4)]]
Then I added this code to convert the list into a dictionary:
adj_list = {}
for edge_list in graphConvert:
    for edge in edge_list:
        for vertex in edge:
            adj_list.setdefault(vertex, set()).update((set(edge) - {vertex}))
for i in range(deleteIntegers.__len__()):
    adj_list.__delitem__(deleteIntegers[i])
That's the result: {'C': {'B', 3, 4, 'D', 'A'}, 'B': {'C', 4}, 'A': {'C', 2, 'D', 4}, 'D': {3, 'C', 2, 'A'}}
I was hoping to obtain something like this: {'C': {'B': 4, 'D': 3, 'A': 4}, 'B': {'C': 4}, 'A': {'D': 2, 'C': 4}, etc. etc.
But as you can see, the result is wrong, and I can't figure out how to fix it. For example, I don't see how to stop the for loop before it reaches the edge's weight and treats it as if it were a vertex, while still keeping the weight so that I can store it as the distance between the start and end nodes.
That is just one of the things I don't understand about what is wrong with the program.
I've been banging my head about it for a while now and I'm not getting the hang of it, maybe I need a rest!
I haven't been using python that long, so I still have a lot to learn.
Thank you so much in advance to anyone who answers me!

You can use a defaultdict:
from collections import defaultdict
graph_to_convert = [[('C', 'B', 4)], [('A', 'D', 2)], [('D', 'C', 3)], [('A', 'C', 4)]]
g = defaultdict(dict)
for edge in graph_to_convert:
    a, b, w = edge[0]
    g[a][b] = w
print(g)
# defaultdict(<class 'dict'>, {'C': {'B': 4}, 'A': {'D': 2, 'C': 4}, 'D': {'C': 3}})
If you aren't happy with having a defaultdict as the final product, you can add the line g = dict(g) to cast the result to a straight dict.
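
The output above stores each edge in one direction only, while the example_graph in the question lists every edge from both endpoints. Assuming the graph is meant to be undirected, a minimal sketch that records both directions could look like this (to_undirected_adjacency is a hypothetical helper name, not part of the original answer):

```python
from collections import defaultdict

graph_to_convert = [[('C', 'B', 4)], [('A', 'D', 2)], [('D', 'C', 3)], [('A', 'C', 4)]]

def to_undirected_adjacency(edge_lists):
    """Build {node: {neighbour: weight}} and store every edge in both directions."""
    adjacency = defaultdict(dict)
    for edge_list in edge_lists:
        for start, end, weight in edge_list:
            adjacency[start][end] = weight
            adjacency[end][start] = weight  # assumption: the graph is undirected
    return dict(adjacency)

adj = to_undirected_adjacency(graph_to_convert)
print(adj)
```

Unpacking each edge as (start, end, weight) keeps the weight out of the vertex set, which is what went wrong in the set-based attempt.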

how to group a list of dictionaries to get a list of their corresponding indices?

I have a list of dictionaries, something like this:
l = [{'a':25}, {'a':25}, {'b':30}, {'c':200}, {'b':30}]
I want to find the distinct elements and their corresponding indices, something like this:
[
    ({'a':25}, [0,1]),
    ({'b':30}, [2,4]),
    ({'c':200}, [3]),
]
I tried itertools.groupby but couldn't make it work; perhaps I'm missing something. Any other directions are welcome too.

Consider this list of dictionaries:
>>> dicts
[{'a': 3},
{'d': 4, 'a': 3, 'c': 1},
{'d': 8, 'c': 0, 'b': 9},
{'c': 3, 'a': 9},
{'a': 5, 'd': 8},
{'d': 5, 'b': 5, 'a': 0},
{'b': 7, 'c': 7},
{'d': 6, 'b': 7, 'a': 6},
{'a': 4, 'c': 1, 'd': 5, 'b': 2},
{'d': 7}]
Assuming you want the indices of every dictionary that contains each key-value pair:
idxs = {}
for i, d in enumerate(dicts):
    for pair in d.items():
        idxs.setdefault(pair, []).append(i)
This produces what I would consider more useful output, as it allows you to look up the indices of any specific key-value pair:
{('a', 3): [0, 1],
('d', 4): [1],
('c', 1): [1, 8],
('d', 8): [2, 4],
('c', 0): [2],
('b', 9): [2],
('c', 3): [3],
('a', 9): [3],
('a', 5): [4],
('d', 5): [5, 8],
('b', 5): [5],
('a', 0): [5],
('b', 7): [6, 7],
('c', 7): [6],
('d', 6): [7],
('a', 6): [7],
('a', 4): [8],
('b', 2): [8],
('d', 7): [9]}
However, if you must convert to List[Tuple[Dict[str, int], List[int]]], you can produce it very easily from the previous output:
>>> [(dict((p,)), l) for p, l in idxs.items()]
[({'a': 3}, [0, 1]),
({'d': 4}, [1]),
({'c': 1}, [1, 8]),
({'d': 8}, [2, 4]),
({'c': 0}, [2]),
({'b': 9}, [2]),
({'c': 3}, [3]),
({'a': 9}, [3]),
({'a': 5}, [4]),
({'d': 5}, [5, 8]),
({'b': 5}, [5]),
({'a': 0}, [5]),
({'b': 7}, [6, 7]),
({'c': 7}, [6]),
({'d': 6}, [7]),
({'a': 6}, [7]),
({'a': 4}, [8]),
({'b': 2}, [8]),
({'d': 7}, [9])]

Turn the dictionaries into tuples so you can use them as keys in a dictionary. Then iterate over the list, adding the indexes to this dictionary.
locations_dict = {}
for i, d in enumerate(l):
    dtuple = tuple(d.items())
    locations_dict.setdefault(dtuple, []).append(i)
locations = [(dict(key), value) for key, value in locations_dict.items()]
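
One caveat with tuple(d.items()) is that two equal dictionaries built in a different key order produce different tuples. A small variation, sketched here under the assumption that all values are hashable, uses frozenset(d.items()) as the key instead (group_indices is a hypothetical name):

```python
l = [{'a': 25}, {'a': 25}, {'b': 30}, {'c': 200}, {'b': 30}]

def group_indices(dicts):
    """Group equal dictionaries and collect the indices where each one occurs."""
    locations = {}
    for i, d in enumerate(dicts):
        # frozenset ignores insertion order, so equal dicts always share one key
        locations.setdefault(frozenset(d.items()), []).append(i)
    return [(dict(key), idxs) for key, idxs in locations.items()]

print(group_indices(l))

# Same pairs, different insertion order: still one group.
print(group_indices([{'x': 1, 'y': 2}, {'y': 2, 'x': 1}]))
```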

from collections import defaultdict
indices = defaultdict(list)
for idx, val in enumerate(l):
    # tuple(*val.items()) unpacks the single (key, value) pair, so this only works for one-item dicts
    indices[tuple(*val.items())].append(idx)
print(indices)
# output
defaultdict(list, {('a', 25): [0, 1], ('b', 30): [2, 4], ('c', 200): [3]})

Another way of doing it:
import ast
l = [{'a':25}, {'a':25}, {'b':30}, {'c':200}, {'b':30}]
n_dict = {}
for a, b in enumerate(l):
    n_dict[str(b)] = n_dict.get(str(b), []) + [a]
print(list(zip([ast.literal_eval(i) for i in n_dict.keys()], n_dict.values())))

Great idea with the dicts/defaultdicts. This also seems to work:
import itertools
l = [{'a':25}, {'a':25}, {'b':30}, {'c':200}, {'b':30}, {'a': 25}]
sorted_values = sorted(enumerate(l), key=lambda x: str(x[1]))
grouped = itertools.groupby(sorted_values, lambda x: x[1])
grouped_indices = [(k, [x[0] for x in g]) for k, g in grouped]
print(grouped_indices)
The idea is that once the list is sorted (keeping the original indices alongside each element), itertools/Linux groupby is pretty similar to SQL/pandas groupby.

How to count frequency of such list using basic libraries?

The list looks like this, pairing an ASCII character with a number value. I want to count the occurrences of each ASCII character for the values 0, 1 and 2.
So for A the result would be something like {0: 10, 1: 2, 2: 12}, and likewise for the other characters.
[('P', 0),
('S', 2),
('R', 1),
('O', 1),
('J', 1),
('E', 1),
('C', 1),
('T', 1),
('G', 1),
('U', 1),
('T', 1),
('E', 1),
('N', 1)]
I have tried
char_freq = {c:[0,0,0] for c in string.ascii_uppercase}
and also
for i in range(3):
    for x,i in a:
        print(x,i)
I want to count, for each character X in [A-Z], how often it occurs with each value i.
It should give me a result like
Character | 0  | 1 | 2
A         | 10 | 5 | 4

Although you don't supply enough example data to actually reproduce your desired output, I think this is what you're looking for:
from collections import Counter
import pandas as pd
l = [('P', 0),
('S', 2),
('R', 1),
('O', 1),
('J', 1),
('E', 1),
('C', 1),
('T', 1),
('G', 1),
('U', 1),
('T', 1),
('E', 1),
('N', 1)]
df = pd.DataFrame(l)
counts = df.groupby(0)[1].agg(Counter)
returns:
C {1: 1}
E {1: 2}
G {1: 1}
J {1: 1}
N {1: 1}
O {1: 1}
P {0: 1}
R {1: 1}
S {2: 1}
T {1: 2}
U {1: 1}
This gives you each ASCII character, along with each unique number and how many times that number occurs for the character.

from collections import Counter
l = [('A', 1),
('A', 1),
('A', 2),
('A', 2),
('B', 1),
('B', 2),
('B', 3),
('B', 4)]
data = {}
for k, v in l:
    data[k] = [v] if k not in data else data[k] + [v]
char_freq = {k: dict(Counter(v)) for k, v in data.items()}
print(char_freq)
Outputs:
{'A': {1: 2, 2: 2}, 'B': {1: 1, 2: 1, 3: 1, 4: 1}}

Your code looks fine; you just need to make a small change to the char_freq variable to get the expected result:
import string
char_freq = {c: {0: 0, 1: 0, 2: 0} for c in string.ascii_uppercase}
for x, i in a:
    char_freq[x][i] += 1
To avoid having the whole alphabet in char_freq, you could use only the characters that actually occur:
char_freq = {c: {0: 0, 1: 0, 2: 0} for c in {t[0] for t in a}}
for x, i in a:
    char_freq[x][i] += 1
Output:
{'O': {0: 0, 1: 1, 2: 0},
'T': {0: 0, 1: 2, 2: 0},
'N': {0: 0, 1: 1, 2: 0},
'G': {0: 0, 1: 1, 2: 0},
'U': {0: 0, 1: 1, 2: 0},
'E': {0: 0, 1: 2, 2: 0},
'J': {0: 0, 1: 1, 2: 0},
'R': {0: 0, 1: 1, 2: 0},
'C': {0: 0, 1: 1, 2: 0},
'S': {0: 0, 1: 0, 2: 1},
'P': {0: 1, 1: 0, 2: 0}}
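
Since the question asks for basic libraries, the same count can be sketched with only the standard library by combining defaultdict and Counter. This is a minimal sketch rather than any answerer's code; it assumes the list is named a, as in the question, and prints a table shaped like the one requested.

```python
from collections import Counter, defaultdict

a = [('P', 0), ('S', 2), ('R', 1), ('O', 1), ('J', 1), ('E', 1), ('C', 1),
     ('T', 1), ('G', 1), ('U', 1), ('T', 1), ('E', 1), ('N', 1)]

# One Counter per character; both levels are created on first use.
char_freq = defaultdict(Counter)
for char, value in a:
    char_freq[char][value] += 1

print('Character | 0 | 1 | 2')
for char in sorted(char_freq):
    counts = char_freq[char]
    # A Counter returns 0 for values it has never seen, so no pre-filling is needed.
    print(f'{char:<9} | {counts[0]} | {counts[1]} | {counts[2]}')
```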

how to reconfigure this dictionary to change its keys

Let's say I have this dictionary:
>>> dic = {('a', 'l'):3, ('a', 'p'):2, ('b', 'l'):4, ('b', 'p'):1}
How can I edit it so I can have it like this:
>>> dic_new = {'a':{'l':3, 'p':2}, 'b':{'l':4, 'p':1}}
Whenever I change the keys I get an error. I am confused.

In each case, you want to set d2[k1][k2]=v wherever you have d1[k1,k2]=v. The simplest way to do this is to start with a defaultdict.
>>> from collections import defaultdict
>>> d1 = {('a', 'l'):3, ('a', 'p'):2, ('b', 'l'):4, ('b', 'p'):1}
>>> d2 = defaultdict(dict)
>>> for k1, k2 in d1:
...     d2[k1][k2] = d1[k1,k2]
...
>>> d2
defaultdict(<class 'dict'>, {'a': {'l': 3, 'p': 2}, 'b': {'l': 4, 'p': 1}})
>>> dict(d2)
{'a': {'l': 3, 'p': 2}, 'b': {'l': 4, 'p': 1}}
If you don't want to use a defaultdict, use the setdefault method.
d2 = {}
for k1, k2 in d1:
    d2.setdefault(k1, {})[k2] = d1[k1,k2]

You can iterate through the original dictionary and create a new one as you find keys:
dic = {('a', 'l'):3, ('a', 'p'):2, ('b', 'l'):4, ('b', 'p'):1}
dic_new = {}
for (new_key, new_sub_key), value in dic.items():
    if new_key not in dic_new:
        dic_new[new_key] = {}
    dic_new[new_key][new_sub_key] = value
print(dic_new)
Output
{'a': {'l': 3, 'p': 2}, 'b': {'l': 4, 'p': 1}}

You can use groupby + OrderedDict:
from itertools import groupby
from collections import OrderedDict
dic = {('a', 'l'):3, ('a', 'p'):2, ('b', 'l'):4, ('b', 'p'):1}
dic = OrderedDict(dic)
new_d = {}
for k, g in groupby(dic, lambda x: x[0]):
    for x in g:
        if k in new_d:
            new_d[k].update({x[1]: dic[x]})
        else:
            new_d[k] = {x[1]: dic[x]}
print(new_d)
# {'a': {'l': 3, 'p': 2}, 'b': {'l': 4, 'p': 1}}
If you can guarantee that the dictionary's keys are already ordered by the first element of each tuple key, you can skip the OrderedDict conversion altogether.
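
Because groupby only merges adjacent items, another option is to sort by the outer key first, so each group arrives as a single run. The following is a minimal sketch of that idea, with a deliberately interleaved input; the variable names ordered and nested are illustrative only.

```python
from itertools import groupby

# Keys are interleaved on purpose to show why sorting matters.
dic = {('a', 'l'): 3, ('b', 'l'): 4, ('a', 'p'): 2, ('b', 'p'): 1}

def outer_key(item):
    """Return the first element of the tuple key of a (key, value) item."""
    return item[0][0]

# sorted() is stable, so items with the same outer key keep their relative order.
ordered = sorted(dic.items(), key=outer_key)

nested = {
    outer: {inner: value for (_, inner), value in group}
    for outer, group in groupby(ordered, key=outer_key)
}
print(nested)
```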

all possible combinations of dicts based on values inside dicts

I want to generate all possible ways of using dicts, based on the values in them. To explain in code, I have:
a = {'name' : 'a', 'items': 3}
b = {'name' : 'b', 'items': 4}
c = {'name' : 'c', 'items': 5}
I want to be able to pick (say) exactly 7 items from these dicts and list all the possible ways of doing so.
So:
x = itertools.product(range(a['items']), range(b['items']), range(c['items']))
y = itertools.ifilter(lambda i: sum(i)==7, x)
would give me:
(0, 3, 4)
(1, 2, 4)
(1, 3, 3)
...
What I'd really like is:
({'name' : 'a', 'picked': 0}, {'name': 'b', 'picked': 3}, {'name': 'c', 'picked': 4})
({'name' : 'a', 'picked': 1}, {'name': 'b', 'picked': 2}, {'name': 'c', 'picked': 4})
({'name' : 'a', 'picked': 1}, {'name': 'b', 'picked': 3}, {'name': 'c', 'picked': 3})
....
Any ideas on how to do this, cleanly?

Here it is (Python 2, as it relies on itertools.ifilter and the print statement):
import itertools
import operator
a = {'name' : 'a', 'items': 3}
b = {'name' : 'b', 'items': 4}
c = {'name' : 'c', 'items': 5}
dcts = [a,b,c]
x = itertools.product(range(a['items']), range(b['items']), range(c['items']))
y = itertools.ifilter(lambda i: sum(i)==7, x)
z = (tuple([[dct, operator.setitem(dct, 'picked', vval)][0] \
            for dct,vval in zip(dcts, val)]) for val in y)
for zz in z:
    print zz
This version updates the original dicts in place, so every tuple refers to the same three dict objects. If you need a new dict instance on every iteration, change the z line to
z = (tuple([[dct, operator.setitem(dct, 'picked', vval)][0] \
            for dct,vval in zip(map(dict,dcts), val)]) for val in y)

An easy way is to generate new dicts (y is the filtered product from the question):
names = [x['name'] for x in [a,b,c]]
ziped = map(lambda x: zip(names, x), y)
maped = map(lambda el: [{'name': name, 'picked': count} for name, count in el],
            ziped)
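
On Python 3, itertools.ifilter no longer exists and print is a function, so the snippets above need adjusting. A minimal Python 3 sketch of the same idea follows; picks_summing_to is a hypothetical name, and it keeps the question's range(d['items']) bounds, so each count runs from 0 to items - 1.

```python
import itertools

a = {'name': 'a', 'items': 3}
b = {'name': 'b', 'items': 4}
c = {'name': 'c', 'items': 5}
dicts = [a, b, c]

def picks_summing_to(dicts, total):
    """Yield one tuple of fresh {'name', 'picked'} dicts per combination summing to total."""
    ranges = (range(d['items']) for d in dicts)
    for combo in itertools.product(*ranges):
        if sum(combo) == total:
            # New dicts are built each time, so the inputs are never mutated.
            yield tuple({'name': d['name'], 'picked': n} for d, n in zip(dicts, combo))

for pick in picks_summing_to(dicts, 7):
    print(pick)
```

Building new dicts for every combination avoids the shared-instance issue of updating the originals in place.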
