Assign values to dictionary by order they were added - python

Pretty much what the title says: I want to create a dictionary with phone numbers as keys, and every time a new number is added I want its value to increment by one.
Like this: {'7806969': 1, '78708708': 2} and so on.
nodes=[1,2,3,4,5,6,7,8,9]
customers=open('customers.txt','r')
calls=open('calls.txt.','r')
sorted_no={}
for line in customers:
    rows=line.split(";")
    if rows[0] not in sorted_no:
        sorted_no[rows[0]]=nodes[0]
    else:
        sorted_no[rows[0]]=
print(sorted_no)
That is the code I have so far. I tried creating a list for my problem, but that plan quickly fell apart.

Use a defaultdict, and just sort the output if you actually want it sorted from least to most frequent:
from collections import defaultdict
sorted_no = defaultdict(int)
for line in customers:
    rows = line.split(";")
    sorted_no[rows[0]] += 1
Or just use a Counter dict:
from collections import Counter
with open('customers.txt') as customers:
    c = Counter(line.split(";")[0] for line in customers)
print(c.most_common())
To simply number each entry in the order it appears, and because you have no duplicates, use enumerate:
with open('customers.txt') as customers:
    sorted_no = {}
    for ind, line in enumerate(customers,1):
        rows=line.split(";")
        sorted_no[rows[0]] = ind
Or as a dict comprehension:
with open('customers.txt') as customers:
    sorted_no = {line.split(";")[0]:ind for ind, line in enumerate(customers,1)}
If order is important (plain dicts only guarantee insertion order from Python 3.7 onward), simply use:
from collections import OrderedDict
with open('customers.txt') as customers:
    sorted_no = OrderedDict((line.split(";")[0], ind) for ind, line in enumerate(customers,1))
enumerate(customers,1) pairs each line in customers with a running index; passing 1 as the start value makes the count begin at 1 instead of 0.
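As a quick check of the enumerate(..., 1) behaviour described above, here is a minimal sketch that uses an in-memory io.StringIO object in place of customers.txt. The sample lines and the name field after the semicolon are made up for illustration.

```python
import io

# Hypothetical stand-in for customers.txt: phone number first, then ';'-separated fields.
customers = io.StringIO("7806969;Alice\n78708708;Bob\n7801234;Carol\n")

# enumerate(customers, 1) yields (1, first line), (2, second line), ...
sorted_no = {line.split(";")[0]: ind for ind, line in enumerate(customers, 1)}

print(sorted_no)  # {'7806969': 1, '78708708': 2, '7801234': 3}
```

Note that this numbers lines, not unique keys: if a number appeared twice, its value would be the index of its last occurrence.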

If I understand you, all you need to do is increase the number you're using as you go:
sorted_no = {}
with open("customers.txt") as fp:
    for line in fp:
        number = line.split(";")[0]
        if number not in sorted_no:
            sorted_no[number] = len(sorted_no) + 1
This produces something like
{'7801234567': 4,
'7801236789': 6,
'7803214567': 9,
'7804321098': 7,
'7804922860': 3,
'7807890123': 1,
'7808765432': 2,
'7808907654': 5,
'7809876543': 8}
where the first unique phone number seen gets 1, and the second 2, etc.
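To see how repeated numbers are handled, here is a small sketch of the same idea on made-up in-memory data (the sample lines are hypothetical). It uses dict.setdefault instead of the explicit membership test, which is only a stylistic variation.

```python
import io

# Hypothetical data: the first number appears twice.
customers = io.StringIO(
    "7807890123;a\n7808765432;b\n7807890123;c\n7804922860;d\n"
)

sorted_no = {}
for line in customers:
    number = line.split(";")[0]
    # setdefault stores the value only when the key is not present yet,
    # so a repeated number keeps the rank it got the first time.
    sorted_no.setdefault(number, len(sorted_no) + 1)

print(sorted_no)  # {'7807890123': 1, '7808765432': 2, '7804922860': 3}
```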

This is probably one of the shorter ways to do it (thanks to Jon Clements in the comments):
#!/usr/bin/env python3.4
from collections import defaultdict
import itertools
sorted_no = defaultdict(itertools.count(1).__next__)
for line in customers:
    rows=line.split(";")
    # no need to put anything,
    # just use the key and it increments automagically.
    sorted_no[rows[0]]
itertools.count(1) returns an iterator, which is roughly equivalent to this generator function:
def lazy():
    counter = 0
    while True:
        counter += 1
        yield counter
I left my original answer so people can learn about the default-binding gotcha, or maybe even use it if they want:
#!/usr/bin/env python3.4
from collections import defaultdict
def lazy_gen(current=[0]):
    current[0] += 1
    return current[0]
sorted_no = defaultdict(lazy_gen)
for line in customers:
    rows=line.split(";")
    # no need to put anything,
    # just use the key and it increments automagically.
    sorted_no[rows[0]]
It works because Python evaluates default argument values only once, when the function is defined. With a mutable default (a list in this case), changes made inside the function persist between calls, so the return value changes each time.
It's a little weird though :)
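The default-binding behaviour can be observed directly. This is a minimal sketch that calls the same lazy_gen function on its own, without the defaultdict, and inspects the stored default through the function's __defaults__ attribute.

```python
def lazy_gen(current=[0]):
    # 'current' is the same list object on every call, because the
    # default value was created once when the function was defined.
    current[0] += 1
    return current[0]

first = lazy_gen()
second = lazy_gen()
print(first, second)          # 1 2
print(lazy_gen.__defaults__)  # ([2],)
```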

Related

How can I iterate over a list of integers, and reference the largest key in my dictionary (all integers) so that it is smaller than current value?

I have a dictionary with positive integers as keys, the values don't matter for my question.
Separately, I am iterating through a list of integers, and I want to reference the largest key in my dictionary, that is smaller than the current integer that I am iterating over in my list (if it exists!).
For example:
from collections import defaultdict
def Loep(obstacles):
    my_dict = defaultdict(int)
    output = []
    for i in range(len(obstacles)):
        if max(j for j in my_dict.keys() if j<= obstacles[i]):
            temp = max(j for j in my_dict.keys() if j<= obstacles[i])
            my_dict[obstacles[i]] = temp + 1
            output.append(my_dict[obstacles[i]])
        else:
            my_dict[obstacles[i]] = 1
            output.append(my_dict[obstacles[i]])
print(Loep([3,1,5,6,4,2]))
I am getting an error for the 'if' statement above- I believe it is because I have one too many arguments in max(), any ideas how to amend the code?
The error is: ValueError: max() arg is an empty sequence
I've tried separating it, but I can't quite do it.
Something like this:
from collections import defaultdict
def Loep(obstacles):
    my_dict = defaultdict(int)
    my_dict.update({
        1: 0,
        2: 0,
        3: 0,
        4: 0,
        5: 0,
        6: 0,
    })
    output = []
    for obstacle in obstacles:
        keys = [j for j in my_dict.keys() if j <= obstacle]
        if keys:
            # there is at least one qualifying key
            key = max(keys)
            my_dict[obstacle] = key + 1
            output.append(my_dict[obstacle])
        else:
            my_dict[obstacle] = 1
            output.append(my_dict[obstacle])
    return output
print(Loep([3, 1, 5, 6, 4, 2]))
In response to your comment about doing it in one line: yes, you could condense it like so (the default argument of max() needs Python 3.4 or later; in Python 3, None cannot be compared with integers, so it cannot simply be added to the list of candidates):
for obstacle in obstacles:
    key = max((j for j in my_dict.keys() if j <= obstacle), default=None)
    if key is not None:
        # etc
There are definitely other ways to do it, such as using filter, but at the end of the day you are trying to get not just the max, but the max that does not exceed a specific value. Unless you're working with a very large amount of data or need extreme speed, this is the easiest way.
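For the large-data case mentioned above, one alternative (not part of the original answer) is to keep the keys in a sorted list and use the standard bisect module, so each lookup is a binary search instead of a full scan. The helper name largest_key_at_most is made up for this sketch.

```python
from bisect import bisect_right, insort

def largest_key_at_most(sorted_keys, value):
    """Return the largest element of sorted_keys that is <= value, or None."""
    idx = bisect_right(sorted_keys, value)  # index of the first key > value
    return sorted_keys[idx - 1] if idx else None

keys = []
for k in [3, 1, 5]:
    insort(keys, k)  # insert while keeping the list sorted

print(keys)                          # [1, 3, 5]
print(largest_key_at_most(keys, 4))  # 3
print(largest_key_at_most(keys, 0))  # None
```

The sorted list would be maintained alongside the dictionary; whether that extra bookkeeping pays off depends on how many keys there are.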
Try this. Is it what you want?
from collections import defaultdict
def Loep(obstacles):
    my_dict = defaultdict(int)
    output = []
    for i in range(len(obstacles)):
        founds = [j for j in my_dict.keys() if j <= obstacles[i]]
        if founds:
            max_val = max(founds)
            my_dict[obstacles[i]] = max_val + 1
        else:
            my_dict[obstacles[i]] = 1
        output.append(my_dict[obstacles[i]])
    return output
print(Loep([3, 1, 5, 6, 4, 2]))

How do you fix 'IndexError: list index out of range' in Python

This is supposed to return a list of contestants ordered from most to least tasks completed (how many times they show up in the input list). If two contestants have the same number of tasks, they should then be sorted by their times (least to greatest).
For example, when given this
["tyson 0:11", "usain 0:12", "carl 0:30", "carl 0:20", "usain 0:40", "carl 1:00", "usain 0:57"]
as input, it's supposed to return this:
["usain", "carl", "tyson"]
However, I can't seem to figure out how to sort by time after sorting by tasks.
The code:
from more_itertools import unique_everseen
def winners(data):
    names = []
    times = []
    taskcount = []
    ndict = {}
    for i in data:
        name = i.split()[0]
        time = i.split()[1]
        numMin, numSec = time.split(':')
        nmin = int(numMin)
        nsec = int(numSec)
        total = (60 * nmin) + nsec
        names.append(name)
        times.append(total)
    index = 0
    for name in names:
        count = names.count(name)
        taskcount.append(count)
    for name in names:
        taskcount.pop(0)
    taskcount = list(unique_everseen(taskcount))
    for name in names:
        if name not in ndict:
            ndict[name] = [taskcount[index], times[index]]
        else:
            ndict[name][1] += times[index]
        index += 1
    sortedDict = sorted(ndict.items(),reverse = True , key=lambda kv: kv[1])
    R = [t[0] for t in sortedDict]
    return R
On top of that, it seems to work fine & dandy whenever I input a certain list but whenever I input others, it blows up:
Traceback (most recent call last):
File "<ipython-input-69-420jihgfedcc>", line 1, in <module>
runfile('C:/Users/User/Folder/contestWinner.py', wdir='C:/Users/User/Folder')
File "C:\Users\User\Anaconda3\lib\site-packages\spyder_kernels\customize\spydercustomize.py", line 704, in runfile
execfile(filename, namespace)
File "C:\Users\User\Anaconda3\lib\site-packages\spyder_kernels\customize\spydercustomize.py", line 108, in execfile
exec(compile(f.read(), filename, 'exec'), namespace)
File "C:/Users/User/Folder/contestWinner.py", line 42, in <module>
print(winners(data))
File "C:/Users/User/Folder/contestWinner.py", line 33, in winners
ndict[name] = [taskcount[index], times[index]]
IndexError: list index out of range
Is there any way to fix the error and sort by time? Sorry if this is really stupid, I'm a beginner at Python.
Another answer has dealt with the problem, but I'd like to suggest a more functional approach:
data = ["tyson 0:11", "usain 0:12", "carl 0:30", "carl 0:20", "usain 0:40", "carl 1:00", "usain 0:57"]
from itertools import groupby
from operator import itemgetter
def process_time(t):
    minutes, seconds = map(int, t.split(':'))
    return 60 * minutes + seconds
def sort_key(pair):
    return (-len(pair[1]), min(pair[1]))
grouped = groupby(sorted(task.split() for task in data),
                  key=itemgetter(0))
processed = {key: [process_time(time) for name, time in group]
             for key, group in grouped}
print(processed)
print([name for name, time in sorted(processed.items(), key=sort_key)])
Output:
{'carl': [20, 30, 60], 'tyson': [11], 'usain': [12, 40, 57]}
['usain', 'carl', 'tyson']
First, we sort and group each entry in the input data by the first element, using sorted and itertools.groupby. This allows us to get our data in the more structured form of a dict, where the keys are names and the values are lists of times.
Along the way, we also process the strings representing times into an integer in seconds.
Next, we want to sort the dict's keys by their values, first in decreasing order of length (because the length of the value is the number of tasks), then in increasing order of minimum time.
This is done by specifying a key function, sort_key, which, here, returns a tuple. The effect of the key function is that the input will be sorted as if the key function was applied to it.
Tuples are sorted by their first element, then the second, and so on until all ties are broken or the last element is reached. In this case, we have a 2-tuple, where the first element is the negative length of the input and the second is the minimum value.
Note that the former is negative because sorted, by default, sorts in ascending order; by negating the length, we reverse the sort order. You can pass reverse=True in cases of sorting on only one element, but here we have two sorts in different orders.
The effect of all this is that we perform the required sort to get our answer.
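To make the tuple ordering concrete, this short sketch prints the key that sort_key produces for each name in the processed dict shown above, then sorts by it. The intermediate keys dict is introduced only for illustration.

```python
processed = {'carl': [20, 30, 60], 'tyson': [11], 'usain': [12, 40, 57]}

def sort_key(pair):
    # (negative task count, fastest time): more tasks first, then lower time first
    return (-len(pair[1]), min(pair[1]))

keys = {name: sort_key((name, times)) for name, times in processed.items()}
print(keys)  # {'carl': (-3, 20), 'tyson': (-1, 11), 'usain': (-3, 12)}

ranking = sorted(processed, key=lambda name: keys[name])
print(ranking)  # ['usain', 'carl', 'tyson']
```

usain and carl tie on the first element (-3), so the second element decides, and 12 sorts before 20.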
You're getting this error because of these lines:
for name in names:
    taskcount.pop(0)
taskcount = list(unique_everseen(taskcount))
Removing these also removes the error.
However, your code still won't return the order you expect, because it puts the greater time first. You will get [('carl', [3, 110]), ('usain', [3, 109]), ('tyson', [1, 11])] at the end, or ['carl', 'usain', 'tyson'] when returned, because carl has a longer total time than usain.
It should just be a matter of tweaking the line sortedDict = sorted(ndict.items(),reverse = True , key=lambda kv: kv[1]) so it sorts by time in the opposite direction.
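One possible tweak, sketched here under the assumption that ndict holds [task_count, total_time] per name as in the result quoted above: drop reverse=True and negate the count inside the key, so that counts sort descending while times sort ascending.

```python
# ndict as reported above once the pop/unique_everseen lines are removed:
# name -> [task_count, total_time_in_seconds]
ndict = {'tyson': [1, 11], 'usain': [3, 109], 'carl': [3, 110]}

# Negating the count sorts it descending; the total time stays ascending.
sorted_items = sorted(ndict.items(), key=lambda kv: (-kv[1][0], kv[1][1]))
result = [name for name, _ in sorted_items]
print(result)  # ['usain', 'carl', 'tyson']
```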
from collections import Counter
data_release = ["tyson 0:11", "usain 0:12", "carl 0:30", "carl 0:20", "usain 0:40", "carl 1:00", "usain 0:57"]
def sorted_data(data):
    temp_data = []
    for item in data:
        item = item.split()[0]
        temp_data.append(item)
    temp_dict = dict(Counter(temp_data))
    sorted_list = sorted(temp_dict.items(), key=lambda d: - d[1])
    result = []
    for temp_tuple in sorted_list:
        result.append(temp_tuple[0])
    return result
print(sorted_data(data_release))
Make full use of the Python collections library and the problem becomes simpler!
I solved this problem with a slight tweak of gmds's code, so I'm going to accept his/hers. I used a simple nested for loop that replaced each value of processed with the list of the combined times in the original value. Thanks gmds!
Code:
from itertools import groupby
from operator import itemgetter
def winners(data):
    def process_time(t):
        minutes, seconds = map(int, t.split(':'))
        return 60 * minutes + seconds
    def sort_key(pair):
        return (-len(pair[1]), min(pair[1]))
    grouped = groupby(sorted(task.split() for task in data), key=itemgetter(0))
    processed = {key: [process_time(time) for name, time in group]
                 for key, group in grouped}
    for name in list(processed.keys()):
        length = len(processed.get(name))
        value = []
        for i in range(length):
            value.append(sum(processed.get(name)))
        processed[name] = value
    sortedL = [name for name, time in sorted(processed.items(), key = sort_key)]
    return sortedL
if __name__ == "__main__":
    data = ["owen 2:00", "jeff 1:29", "owen 1:00", "jeff 1:30", "robert 0:21"]
    print(winners(data))
Thanks for the help guys!

Python - sorting a list of numbers based on indexes

I need to create a program with a class "Food" that creates food objects, and a list called "fridge" that holds the objects created by that class.
class Food:
    def __init__(self, name, expiration):
        self.name = name
        self.expiration = expiration
fridge = [Food("beer",4), Food("steak",1), Food("hamburger",1), Food("donut",3),]
This was not hard. Then I created a function that returns the highest expiration number in the fridge.
def exp(fridge):
    expList=[]
    xen = 0
    for i in range(0,len(fridge)):
        expList.append(fridge[xen].expiration)
        xen += 1
    print(expList)
    sortedList = sorted(expList)
    return sortedList.pop()
exp(fridge)
This one works too. Now I have to create a function that returns a list where each index is an expiration date and the value at that index is the number of foods with that expiration date.
The output should look like: [0,2,1,1] - the value 0 at index 0 means that there is no food with expiration date "0". Index 1 means that there are 2 pieces of food with 1 day left until expiration. And so on. I got stuck with too many if lines and I can't get this one to work at all. How should I approach this? Thanks for the help.
In order to return it as a list, you will first need to figure out the maximum expiration date in the fridge.
max_expiration = max(food.expiration for food in fridge) + 1  # need +1 since 0 is also a possible expiration
exp_list = [0] * max_expiration
for food in fridge:
    exp_list[food.expiration] += 1
print(exp_list)
prints [0, 2, 0, 1, 1]
You can iterate on the list of Food objects and update a dictionary keyed on expiration, with the values as number of items having that expiration. Avoid redundancy such as keeping zero counts in a list by using a collections.Counter object (a subclass of dict):
from collections import Counter
d = Counter(food.expiration for food in fridge)
# fetch number of food with expiration 0
print(d[0]) # -> 0
# fetch number of food with expiration 1
print(d[1]) # -> 2
You can use itertools.groupby to create a dict where each key is a food expiration date and each value is the number of times it occurs in the list:
>>> from itertools import groupby
>>> fridge = [Food("beer",4), Food("steak",1), Food("hamburger",1), Food("donut",3),]
>>> d = dict((k,len(list(v))) for k,v in groupby(sorted(fridge,key=lambda x: x.expiration), key=lambda x: x.expiration))
Here we tell groupby to group all elements of the list that have the same expiration (note the key argument in groupby). The list is sorted by the same key first, because groupby only groups adjacent elements. The output of the groupby operation is roughly equivalent to (k,[v]), where k is the group key and [v] is the list of values belonging to that particular group.
This will produce output like this:
>>> d
{1: 2, 3: 1, 4: 1}
At this point we have each expiration and the number of times it occurs in the list, stored in the dict d.
Next we need to create a list that holds the count from d for every expiration that is present, and 0 otherwise. We need to iterate from 0 up to the largest key in d. To do this we can do:
>>> [d.get(x, 0) for x in range(0, max(d.keys())+1)]
This will yield your required output:
[0, 2, 0, 1, 1]
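The two steps above can be put together as a standalone script. This is a sketch that redefines the Food class from the question so it runs on its own; the helper by_expiration is introduced here only to avoid repeating the lambda.

```python
from itertools import groupby

class Food:
    def __init__(self, name, expiration):
        self.name = name
        self.expiration = expiration

fridge = [Food("beer", 4), Food("steak", 1), Food("hamburger", 1), Food("donut", 3)]

def by_expiration(food):
    return food.expiration

# groupby only merges adjacent items, so sort by the same key first.
d = {k: len(list(v))
     for k, v in groupby(sorted(fridge, key=by_expiration), key=by_expiration)}

# Fill in 0 for every expiration date that has no food.
counts = [d.get(x, 0) for x in range(max(d) + 1)]

print(d)       # {1: 2, 3: 1, 4: 1}
print(counts)  # [0, 2, 0, 1, 1]
```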
Here is a flexible method using collections.defaultdict:
from collections import defaultdict
def ReverseDictionary(input_dict):
    reversed_dict = defaultdict(set)
    for k, v in input_dict.items():
        reversed_dict[v].add(k)
    return reversed_dict
fridge_dict = {f.name: f.expiration for f in fridge}
exp_food = ReverseDictionary(fridge_dict)
# defaultdict(set, {1: {'hamburger', 'steak'}, 3: {'donut'}, 4: {'beer'}})
exp_count = {k: len(exp_food.get(k, set())) for k in range(max(exp_food)+1)}
# {0: 0, 1: 2, 2: 0, 3: 1, 4: 1}
Modify yours with count().
def exp(fridge):
    output = []
    exp_list = [i.expiration for i in fridge]
    for i in range(0, max(exp_list)+1):
        output.append(exp_list.count(i))
    return output

how can I read from a file and append each word to a dictionary?

what I want to do is read from a file, and then for each word, append it to a dictionary along with its number of occurrences.
example:
'today is sunday. tomorrow is not sunday.'
my dictionary would then be this:
{'today': 1, 'is': 2, 'sunday': 2, 'tomorrow': 1, 'not': 1}
the way I'm going about it is to use readline and split to create a list, and then append each element and its value to an empty dictionary, but it's not really working so far. here's what I have so far, although it's incomplete:
file = open('any_file,txt', 'r')
for line in file.readline().split():
    for i in range(len(line)):
        new_dict[i] = line.count(i)  # I'm getting an error here as well, saying that
return new_dict                      # I can't convert int to str implicitly
the problem with this is that when my dictionary updates when each line is read, the value of a word won't accumulate. so if in another line 'sunday' occurred 3 times, my dictionary would contain {'sunday': 3} instead of {'sunday': 5}. any help? I have no idea where to go from here and I'm new to all of this.
You are looking for collections.Counter.
e.g.:
from collections import Counter
from itertools import chain
with open("file.txt") as file:
    counts = Counter(chain.from_iterable(line.split() for line in file))
(This passes a generator expression to itertools.chain.from_iterable() to flatten the per-line word lists into one stream of words.)
Note that your example only works on the first line. I presume this wasn't intentional, so this solution counts across the whole file (obviously it's trivial to swap that around).
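To show that the counts accumulate across lines, which is the problem described in the question, here is a minimal sketch that uses io.StringIO as a stand-in for the file. The two sample lines are made up and carry no punctuation.

```python
import io
from collections import Counter
from itertools import chain

# Hypothetical two-line file; 'is' and 'sunday' each appear once per line.
file = io.StringIO("today is sunday\ntomorrow is not sunday\n")

# chain.from_iterable flattens the per-line word lists into one stream,
# so Counter sees every word of every line.
counts = Counter(chain.from_iterable(line.split() for line in file))

print(counts["sunday"])  # 2
print(counts["today"])   # 1
```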
Here is a simple version that doesn't deal with punctuation:
from collections import Counter
counter = Counter()
with open('any_file.txt', 'r') as file:
    for line in file:
        for word in line.split():
            counter[word] += 1
It can also be written like this:
from collections import Counter
with open('any_file.txt', 'r') as file:
    counter = Counter(word for line in file for word in line.split())
Here's one way to solve the problem using a dict:
counter = {}
with open('any_file.txt', 'r') as file:
    for line in file:
        for word in line.split():
            if word not in counter:
                counter[word] = 1
            else:
                counter[word] += 1
try this
file = open('any_file.txt', 'r')
myDict = {}
for line in file:
    lineSplit = line.split()
    for x in range(len(lineSplit)):
        if lineSplit[x] in myDict: myDict[lineSplit[x]] += 1
        else: myDict[lineSplit[x]] = 1
file.close()
print(myDict)
Do you use Python 3 or Python 2.7?
If yes, use Counter from the collections library:
import re
from collections import Counter
words = re.findall(r'\w+', open('any_file.txt').read().lower())
Counter(words).most_common(10)
Note that most_common returns a list of tuples, though. It should be easy to turn that list of tuples into a dictionary.
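The conversion mentioned above is a single call to dict(). This sketch runs the same regex on the example sentence from the question instead of reading a file.

```python
import re
from collections import Counter

text = "today is sunday. tomorrow is not sunday."
words = re.findall(r"\w+", text.lower())

top = Counter(words).most_common(10)  # list of (word, count) tuples
as_dict = dict(top)                   # back to a plain word -> count mapping

print(as_dict)
```

Because the regex matches only word characters, the trailing periods are dropped and "sunday." counts as "sunday".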

how I fill a list with many variables python

I have some variables and I need to compare each of them and fill three lists according to the comparison: if the var == 1, add a 1 to lista_a; if var == 2, add a 1 to lista_b; and so on. Like this:
inx0=2 inx1=1 inx2=1 inx3=1 inx4=4 inx5=3 inx6=1 inx7=1 inx8=3 inx9=1
inx10=2 inx11=1 inx12=1 inx13=1 inx14=4 inx15=3 inx16=1 inx17=1 inx18=3 inx19=1
inx20=2 inx21=1 inx22=1 inx23=1 inx24=2 inx25=3 inx26=1 inx27=1 inx28=3 inx29=1
lista_a=[]
lista_b=[]
lista_c=[]
#this example is the comparison for the first variable inx0
#and the same for inx1, inx2, etc...
for k in range(1,30):
    if inx0==1:
        lista_a.append(1)
    elif inx0==2:
        lista_b.append(1)
    elif inx0==3:
        lista_c.append(1)
I need to get:
#lista_a = [1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1]
#lista_b = [1,1,1]
#lista_c = [1]
Your inx* variables should almost certainly be a list to begin with:
inx = [2,1,1,1,4,3,1,1,3,1,2,1,1,1,4,3,1,1,3,1,2,1,1,1,2,3,1,1,3,1]
Then, to find out how many 2's it has:
inx.count(2)
If you must, you can build a new list out of that:
list_a = [1]*inx.count(1)
list_b = [1]*inx.count(2)
list_c = [1]*inx.count(3)
but it seems silly to keep a list of ones. Really the only data you need to keep is a single integer (the count), so why bother carrying around a list?
An alternate approach to get the lists of ones would be to use a defaultdict:
from collections import defaultdict
d = defaultdict(list)
for item in inx:
    d[item].append(1)
In this case, what you want as list_a could be accessed by d[1], list_b could be accessed as d[2], etc.
Or, as stated in the comments, you could get the counts using a collections.Counter:
from collections import Counter #python2.7+
counts = Counter(inx)
list_a = [1]*counts[1]
list_b = [1]*counts[2]
...
