python while id is the same do something - python

I feel like it is something basic but somehow I don't get it. I would like to loop over a list and append all the persons to the same id. Or write to the file, it doesn't matter.
Input:
[1, 'Smith']
[1, 'Black']
[1, 'Mueller']
[2, 'Green']
[2, 'Adams']
Desired output:
[1; 'Smith', 'Black', 'Mueller']
[2; 'Green', 'Adams']
First I created a list of all ids, and then I had two for-loops like this:
final_doc = []
for id in all_ids:
    persons = []
    for line in doc:
        if line[0] == id:
            persons.append(line[1])
    final_doc.append((id, persons))  # append() takes one argument, so pass a tuple
It takes ages. I tried to create a dictionary keyed by id and then combine it somehow, but the dictionary kept each id only once (maybe I did something wrong there). Now I am thinking about using a while-loop: while the id is still the same, append persons. It is easy to see how to do that when the condition is, for example, "while id is less than 25", but for "while it is the same" I am not sure what to do. Any ideas are much appreciated.

You can group them together in a dictionary.
Given
lists = [[1, 'Smith'],
         [1, 'Black'],
         [1, 'Mueller'],
         [2, 'Green'],
         [2, 'Adams']]
do
d = {}
for person_id, name in lists:
    d.setdefault(person_id, []).append(name)
d now contains
{1: ['Smith', 'Black', 'Mueller'], 2: ['Green', 'Adams']}
Note:
d.setdefault(person_id, []).append(name)
is a shortcut for
if person_id not in d:
    d[person_id] = []
d[person_id].append(name)
If you prefer your answer to be a list of lists with the person_id as the first item in each list (as implied in your question), change the code to
d = {}
for person_id, name in lists:
    d.setdefault(person_id, [person_id]).append(name)  # note [person_id] default
result = list(d.values())  # omit call to list if on Python 2.x
result contains
[[1, 'Smith', 'Black', 'Mueller'], [2, 'Green', 'Adams']]
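The "while the id is the same" idea from the question can also be expressed with itertools.groupby from the standard library. This is only a sketch, and it assumes that rows with the same id are adjacent, as in the sample data; the name rows is made up for the illustration.

```python
from itertools import groupby
from operator import itemgetter

# Sample data from the question; "rows" is a hypothetical name.
rows = [[1, 'Smith'], [1, 'Black'], [1, 'Mueller'], [2, 'Green'], [2, 'Adams']]

# groupby starts a new group each time the key changes, so rows with the
# same id must be adjacent (sort by id first if they are not).
result = [
    [person_id] + [name for _, name in group]
    for person_id, group in groupby(rows, key=itemgetter(0))
]
print(result)
# [[1, 'Smith', 'Black', 'Mueller'], [2, 'Green', 'Adams']]
```

Unlike the dictionary approach, this does not merge ids that reappear later in the list, so the dictionary is the safer choice when the input is not sorted.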

Related

How to change a value inside a list which is a value for a key in a dictionary?

I try to change one value inside the list but all values change for all keys.
vertex_list = ['a','b']
distance_dict = dict.fromkeys(vertex_list, [10, 'None'])
distance_dict['a'][0] = 1
print(distance_dict)
output:
{'a': [1, 'None'], 'b': [1, 'None']}
When I build the dictionary with a traditional { } literal it works fine. I guess the problem is the list passed to dict.fromkeys, which creates one list shared by all dictionary keys. Is there a way to fix this without using ''for'' and { } to build the dictionary?
Quoting from the official documentation,
fromkeys() is a class method that returns a new dictionary. value
defaults to None. All of the values refer to just a single instance,
so it generally doesn’t make sense for value to be a mutable object
such as an empty list. To get distinct values, use a dict
comprehension instead.
This means that dict.fromkeys is mainly useful when the second argument is an immutable value, such as the singleton None. So the solution is to use a dict comprehension.
distance_dict = {vertex: [10, None] for vertex in vertex_list}
If you print the id of the values of a distance_dict created by dict.fromkeys:
vertex_list = ['a','b']
distance_dict = dict.fromkeys(vertex_list, [10, 'None'])
print(id(distance_dict['a']), id(distance_dict['b']))
print(id(distance_dict['a']) == id(distance_dict['b']))
Output:
2384426495040 2384426495040
True
distance_dict['a'] and distance_dict['b'] are actually the same list object, occupying the same memory space.
If you use the dict comprehension:
distance_dict = {vertex: [10, None] for vertex in vertex_list}
print(id(distance_dict['a']) == id(distance_dict['b']))
Output:
False
They are different list objects with the same elements.
You can solve with:
vertex_list = ['a','b']
distance_dict = {x: [10, 'None'] for x in vertex_list}
distance_dict['a'][0] = 1
print(distance_dict)
Output:
{'a': [1, 'None'], 'b': [10, 'None']}
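The question asked for a way that avoids both ''for'' and { }. One possible sketch, not necessarily better than the comprehension, passes a sequence of (key, value) pairs produced by map() to dict(). The lambda is called once per key, so every key gets its own list.

```python
vertex_list = ['a', 'b']

# The lambda runs once per vertex, so each key gets a freshly built list
# instead of sharing a single list object as with dict.fromkeys.
distance_dict = dict(map(lambda vertex: (vertex, [10, 'None']), vertex_list))

distance_dict['a'][0] = 1
print(distance_dict)
# {'a': [1, 'None'], 'b': [10, 'None']}
```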

Looking for easiest way to perform average of common dictionaries in a list

I have a list of dictionaries like,
list1 = [{'a':[10,2],'b':[20,4]}, {'a':[60,6],'b':[40,8]}]
Trying to get the final output as
list1 = [{'a':[35,4],'b':[30,6]}]
I was trying to get the list of values for each key in each dictionary and average them based on the length of the list and put into a new dictionary.
Not sure of the best / most pythonic way of doing this.
Any help highly appreciated.
There are many different ways to do this. A simple one iterates over each key, over each inner index, and calculates the average to store to a new dictionary:
from pprint import pprint
list1 = [
    {'a': [10, 2], 'b': [20, 4]},
    {'a': [60, 6], 'b': [40, 8]},
]
means = {}
for key in list1[0]:
    key_means = []
    means[key] = key_means
    for index in range(len(list1[0][key])):
        key_means.append(
            sum(
                ab_dict[key][index]
                for ab_dict in list1
            ) / len(list1)
        )
pprint(means)
This implementation assumes that keys appearing in the first row are uniformly represented in all other rows.
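The same averaging can be written more compactly with zip(), which pairs up the values that share a position. This is a sketch under the same assumption as above (every dictionary has the keys of the first one, with lists of equal length); the function name average_dicts is made up for the illustration.

```python
list1 = [
    {'a': [10, 2], 'b': [20, 4]},
    {'a': [60, 6], 'b': [40, 8]},
]

def average_dicts(dicts):
    """Average same-position values for every key of the first dict."""
    return {
        # zip(*...) turns [10, 2] and [60, 6] into the columns (10, 60), (2, 6)
        key: [sum(column) / len(column)
              for column in zip(*(d[key] for d in dicts))]
        for key in dicts[0]
    }

means = average_dicts(list1)
print(means)
# {'a': [35.0, 4.0], 'b': [30.0, 6.0]}
```

Under Python 3 the division yields floats; wrap the result in a list if the original list-of-one-dictionary shape is needed.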

Creating a dictionary with nested arrays

How do you take a pre-existing dictionary and essentially add an item from a list into the dictionary as a tuple using a for loop? I made this example below. I want to take color_dict and reformat it so that each item would be in the format 'R':['red',1].
I got as far as below, but then couldn't figure out how to do the last part.
lista = {'red':'R', 'orange':'O', 'yellow':'Y', 'green':'G',
'blue':'B', 'indigo':'I', 'violet':'V'}
color_dict = {'R':1, 'O':2, 'Y':3, 'G':4, 'B':5, 'I':6, 'V':7}
a = color_dict.keys()
color_keys = []
color_vals = []
for x in lista[0::2]:
    color_keys.append(x)
for x in lista[1::2]:
    color_vals.append(x)
new = zip(color_keys, color_vals)
new_dict = dict(new)
print new_dict
If anyone has any other suggestions that would be great, I'm not understanding how to use dict comprehension.
Basically what you want to do is to loop through the items in lista and for each pair color: colkey find the respective value in color_dict (indexed by colkey). And then you just need to stitch everything together: colkey: [color, color_dict[colkey]] is the new item in the new dict for each item in the lista dict.
You can use a dict comprehension to build this:
>>> new_dict = {colkey: [color, color_dict[colkey]] for color, colkey in lista.items()}
>>> new_dict
{'O': ['orange', 2], 'Y': ['yellow', 3], 'V': ['violet', 7], 'R': ['red', 1], 'G': ['green', 4], 'B': ['blue', 5], 'I': ['indigo', 6]}
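If the comprehension syntax is unfamiliar, it may help to see the equivalent explicit loop. The sketch below builds the same dictionary one item at a time, using the lista and color_dict from the question.

```python
lista = {'red': 'R', 'orange': 'O', 'yellow': 'Y', 'green': 'G',
         'blue': 'B', 'indigo': 'I', 'violet': 'V'}
color_dict = {'R': 1, 'O': 2, 'Y': 3, 'G': 4, 'B': 5, 'I': 6, 'V': 7}

new_dict = {}
for color, colkey in lista.items():
    # colkey is the one-letter code; look up its number in color_dict
    new_dict[colkey] = [color, color_dict[colkey]]

print(new_dict['R'])
# ['red', 1]
```

The comprehension is this loop folded into a single expression: the part before "for" is what gets stored for each item.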

Use list of strings elements as arrays

I have a list of strings:
list1 = ['array1', 'array2', 'array3']
whose elements I would like to use as names of other lists, like (as conceptual example):
list1[0] = [1, 2, 3]
I know this assignment does not make any sense, but it is only to show what I need.
I have searched for this a lot but didn't find a handy example. If I understood properly, this is not the right way to do it, and it is better to use dictionaries. But I have never used dictionaries, so I would need some examples.
aDict = { name: 42 for name in list1 }
This gives you:
{'array3': 42, 'array2': 42, 'array1': 42}
If you wanted it ordered:
from collections import OrderedDict
aDict = OrderedDict((name, 42) for name in list1)
Use a dictionary:
list1 = {'array1': [0, 1, 2], 'array2': [0, 3], 'array3': [1]}
and then access it like this:
list1['array1'] # output is [0, 1, 2]
To dynamically populate your dictionary:
list1 = {'array1': []}
list1['array1'].append(1)
You can do what you want with exec, like this:
list1 = ['array1', 'array2', 'array3']
x = list1[1]
exec("%s = %d" % (x, 2))
print(array2)
The result is:
2
But never use exec when you can use something much safer like a dictionary; it can be dangerous!
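Putting the dictionary suggestions together, a minimal sketch for the original question could look like the following. It assumes each name should start out with its own empty list; the variable name arrays is made up for the illustration.

```python
list1 = ['array1', 'array2', 'array3']

# One independent empty list per name; "arrays" is a hypothetical name.
arrays = {name: [] for name in list1}

# The strings in list1 now act as the "variable names".
arrays[list1[0]].extend([1, 2, 3])
arrays['array2'].append(42)

print(arrays)
# {'array1': [1, 2, 3], 'array2': [42], 'array3': []}
```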

Python Most Efficient Way to Search a List

Bear with me as I am very new to Python. Basically I am looking for the most efficient way to search through a multi-dimensional list. So say I have the following list:
fruit = [
    [banana, 6],
    [apple, 5],
    [banana, 9],
    [apple, 10],
    [pear, 2],
]
And I wanted the outcome of my function to produce: Apple: 15, Banana: 15, Pear 2. What would be the most efficient way to do this?
That is not in any way a search...
What you want is
import collections
def count(items):
    data = collections.defaultdict(int)
    for kind, count in items:
        data[kind] += count
    return data
fruit = [['banana', 6], ['apple',5], ['banana',9],['apple',10],['pear',2]]
f = {}
def fruit_count():
    for x in fruit:
        if x[0] not in f.keys():
            f.update({x[0]:x[1]})
        else:
            t = f.get(x[0])
            t = t + x[1]
            f.update({x[0]:t})
    return f
f = {'apple': 15, 'banana': 15, 'pear': 2}
Use a collections.defaultdict to accumulate, and iterate through the list.
import collections
accum = collections.defaultdict(int)
for e in fruit:
    accum[e[0]] += e[1]
myHash = {}
fruit = [
    ['banana', 6],
    ['apple', 5],
    ['banana', 9],
    ['apple', 10],
    ['pear', 2],
]
for i in fruit:
    if not i[0] in myHash.keys():
        myHash[i[0]] = 0
    myHash[i[0]] += i[1]
for i in myHash:
    print i, myHash[i]
would return
apple 15
banana 15
pear 2
Edit
I didn't know about defaultdict in python. That is a much better way.
I'm unsure what type apple and banana are, so I just made them empty classes and used their class names for identification. One approach to this problem is to use the dictionary method setdefault(), which first checks whether a given key is already in the dictionary: if it is, it simply returns the associated value; if it's not, it inserts the key with a default value before returning that value.
To use it more efficiently for this problem by avoiding multiple dictionary key lookups, the count associated with each key needs to be stored in something mutable, since plain integers in Python are immutable. The trick is to store the numeric count in a one-element list, which can be changed in place. The first function in the code below shows how this can be done.
Note that the collections module in the Python standard library has a dictionary subclass called defaultdict that could have been used instead; it effectively does the setdefault() operation for you whenever a non-existent key is first accessed. It also makes storing the count in a list for efficiency unnecessary, and makes updating it slightly simpler.
In Python 2.7 another dictionary subclass called Counter was added to the collections module. Using it would probably be the best solution, since it was designed for exactly this kind of application. The code below shows how to do it all three ways (and sorts the list of totals created).
class apple: pass
class banana: pass
class pear: pass
fruit = [
    [banana, 6],
    [apple, 5],
    [banana, 9],
    [apple, 10],
    [pear, 2],
]
# ---- using regular dictionary
def tally(items):
    totals = dict()
    for kind, count in items:
        totals.setdefault(kind, [0])[0] += count
    return sorted([key.__name__,total[0]] for key, total in totals.iteritems())
print tally(fruit)
# [['apple', 15], ['banana', 15], ['pear', 2]]
import collections
# ---- using collections.defaultdict dict subclass
def tally(items):
    totals = collections.defaultdict(int) # requires Python 2.5+
    for kind, count in items:
        totals[kind] += count
    return sorted([key.__name__, total] for key, total in totals.iteritems())
print tally(fruit)
# [['apple', 15], ['banana', 15], ['pear', 2]]
# ---- using collections.Counter dict subclass
def tally(items):
    totals = collections.Counter() # requires Python 2.7+
    for kind, count in items:
        totals[kind] += count
    return sorted([key.__name__, total] for key, total in totals.iteritems())
print tally(fruit)
# [['apple', 15], ['banana', 15], ['pear', 2]]
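The code above targets Python 2 (print statement, iteritems()). For Python 3, a Counter-based version might look like the sketch below. It assumes the fruit names are plain strings rather than classes, which is a guess about the asker's data.

```python
from collections import Counter

# Assumption: names are strings, not the empty classes used above.
fruit = [['banana', 6], ['apple', 5], ['banana', 9], ['apple', 10], ['pear', 2]]

def tally(items):
    totals = Counter()
    for kind, count in items:
        totals[kind] += count  # missing keys start at 0
    return sorted(totals.items())  # items() replaces Python 2's iteritems()

print(tally(fruit))
# [('apple', 15), ('banana', 15), ('pear', 2)]
```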
