I have a list and a function that calculates the scores of my teams. I then put each team name and its score in a separate dictionary, but the problem is that a few teams appear more than once in the list. Each entry also has a second item that says whether the team's response was valid. If an entry looks like "team1 - score 100 - validresponse 0", I just want to get rid of it, even if the team is a duplicate. However, if there are two entries for the SAME team and both submissions were valid, I want to add their scores together and store the sum as one entry in the dictionary. The only problem is that when I do this, the dictionary automatically disregards the other duplicates.
Here's my code:
import numpy as np
import pandas as pd

mylist = []
with open("input1.txt", "r") as input:
    for line in input:
        items = line.split()
        mylist.append([int(item) for item in items[0:]])

amountOfTestCases = mylist[0][0]
amountOfTeams = mylist[1][0]
amountOfLogs = mylist[1][1]
count = 1
count2 = 1
mydict = {}
teamlist = []
for i in mylist[2:]:
    count2 += 1
    teamlist.append(mylist[count2][1])

def find_repeating(lst, count=2):
    ret = []
    counts = [None] * len(lst)
    for i in lst:
        if counts[i] is None:
            counts[i] = i
        elif i == counts[i]:
            ret += [i]
            if len(ret) == count:
                return ret

rep_indexes = np.where(pd.DataFrame(teamlist).duplicated(keep=False))
print(teamlist)
print(rep_indexes)
duplicate = find_repeating(teamlist)

def calculate_points(row):
    points = mylist[row][3] * 100
    points -= mylist[row][0]
    return points

for i in teamlist:
    count += 1
    mydict['team%s' % mylist[count][1]] = calculate_points(count)
print(mydict)
With my input, teamlist is [5, 4, 1, 2, 5, 4].
"validresponse 0 i just want to get rid of the team even if its a duplicate"
    check if the response is valid
    if invalid, continue without doing anything else
"duplicates of the SAME team and both their submissions were valid then i want to add their scores together"
    check if the key/team already exists (a duplicate)
    if it exists
        get its value
        add the new value
        assign the result to that dictionary key
    if it is not a duplicate
        make a new key with that value
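The steps above could be sketched roughly as follows. This is only an illustration: the function name total_valid_scores, the (team_id, points, is_valid) tuple layout, and the sample numbers are all assumptions, since the question does not say which column of the input holds the validity flag.

```python
def total_valid_scores(entries):
    """Sum points per team, skipping invalid entries.

    entries: iterable of (team_id, points, is_valid) tuples.
    This layout is an assumption made for the sketch.
    """
    totals = {}
    for team_id, points, is_valid in entries:
        if not is_valid:
            # invalid response: skip this entry entirely
            continue
        key = 'team%s' % team_id
        # dict.get returns 0 the first time a team is seen,
        # so duplicates are added instead of overwritten
        totals[key] = totals.get(key, 0) + points
    return totals


# hypothetical sample data: team 5 appears twice with valid responses,
# team 4 appears once invalid and once valid
entries = [(5, 100, 1), (4, 80, 0), (1, 50, 1), (5, 40, 1), (4, 30, 1)]
print(total_valid_scores(entries))
```

The key point is that a plain assignment such as mydict[key] = value replaces any earlier value for that key, which is why the duplicates seemed to disappear; reading the old value first and adding to it keeps them.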
I made a surname dict containing surnames like this:
(The file contains 200,000 words; this is only a sample of surname_dict.)
['KRISTIANSEN', 'OLDERVIK', 'GJERSTAD', 'VESTLY SKIVIK', 'NYMANN', 'ØSTBY', 'LINNERUD', 'REMLO', 'SKARSHAUG', 'ELI', 'ADOLFSEN']
I want to print the 10 most common surnames. I am not allowed to use the Counter class or numpy, just native Python.
My idea was to use a for loop to sort through the dictionary, but I hit some walls. Please help with some advice.
Thanks.
surname_dict = []
count = 0
for index in data_list:
    if index["lastname"] not in surname_dict:
        count = count + 1
        surname_dict.append(index["lastname"])
for k, v in sorted(surname_dict.items(), key=lambda item: item[1]):
    if count < 10: # Print only the top 10 surnames
        print(k)
        count += 1
    else:
        break
As mentioned in a comment, your dict is actually a list.
Try using the Counter object from the collections library. In the below example, I have edited your list so that it contains a few duplicates.
from collections import Counter

surnames = ['KRISTIANSEN', 'OLDERVIK', 'GJERSTAD', 'VESTLY SKIVIK', 'NYMANN', 'ØSTBY', 'LINNERUD', 'REMLO', 'SKARSHAUG', 'ELI', 'ADOLFSEN', 'OLDERVIK', 'ØSTBY', 'ØSTBY']
counter = Counter(surnames)
for name in counter.most_common(3):
    print(name)
The result becomes:
('ØSTBY', 3)
('OLDERVIK', 2)
('KRISTIANSEN', 1)
Change the integer argument to most_common to 10 for your use case.
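Since the question rules out Counter, here is a rough native-Python equivalent of the same idea, using only a dict and sorted. The function name top_surnames is made up for this sketch; the sample list is the one from the example above.

```python
def top_surnames(names, n):
    """Return the n most frequent names as (name, count) pairs."""
    counts = {}
    for name in names:
        counts[name] = counts.get(name, 0) + 1
    # sort by count, highest first; sorted() is stable, so names with
    # equal counts keep the order in which they were first seen
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    return ranked[:n]


surnames = ['KRISTIANSEN', 'OLDERVIK', 'GJERSTAD', 'VESTLY SKIVIK', 'NYMANN',
            'ØSTBY', 'LINNERUD', 'REMLO', 'SKARSHAUG', 'ELI', 'ADOLFSEN',
            'OLDERVIK', 'ØSTBY', 'ØSTBY']
for pair in top_surnames(surnames, 3):
    print(pair)
```

Calling top_surnames(surnames, 10) would give the top ten in the same way.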
One way to answer your question is to consider the top ten categories, where a category is a count: for example, the category of names that are used 9 times, the category of names that are used 200 times, and so on. This matters because many different names can share the same count, and in that case all of them belong in the top ten. Here is a script that implements this approach:
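def counter(file: list):
    # Count how often each name occurs
    counts = {}
    for name in file:
        counts[name] = counts.get(name, 0) + 1
    # Group the names by count: each distinct count is one category
    categories = {}
    for name, n in counts.items():
        categories.setdefault(n, []).append(name)
    # Keep only the ten highest counts
    top = sorted(categories, reverse=True)[:10]
    return {n: categories[n] for n in top}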
Note: my script calculates the top ten categories.
You could place all the values in a dictionary where the value is the number of times the name appears in the dataset, then filter your newly created dictionary and push any name with a count of at least 10 to your final list.
Edit: your surname_dict was initialized as a list, not a dictionary.
surname_dict = {}
top_ten = []
# Count how many times each surname occurs
for index in data_list:
    if index['lastname'] not in surname_dict:
        surname_dict[index['lastname']] = 1
    else:
        surname_dict[index['lastname']] += 1
# Keep every surname that occurs at least 10 times
for k, v in sorted(surname_dict.items()):
    if v >= 10:
        top_ten.append(k)
print(top_ten)
Just use a standard dictionary. I've added some duplicates to your data, and am using a threshold value to grab any names with at least 2 occurrences. Use threshold = 10 for your actual code.
names = ['KRISTIANSEN', 'OLDERVIK', 'GJERSTAD', 'VESTLY SKIVIK', 'NYMANN', 'ØSTBY','ØSTBY','ØSTBY','REMLO', 'LINNERUD', 'REMLO', 'SKARSHAUG', 'ELI', 'ADOLFSEN']
# you need 10 in your code, but I've only added a few dups to your sample data
threshold = 2
di = {}
for name in names:
    # grab name count, initialize to zero first time
    count = di.get(name, 0)
    di[name] = count + 1

# basic filtering, no sorting
unsorted = {name:count for name, count in di.items() if count >= threshold}
print(f"{unsorted=}")

# sorting by frequency: filter out the ones you don't want
bigenough = [(count, name) for name, count in di.items() if count >= threshold]
tops = sorted(bigenough, reverse=True)
print(f"{tops=}")

# or as another dict
tops_dict = {name:count for count, name in tops}
print(f"{tops_dict=}")
Output:
unsorted={'ØSTBY': 3, 'REMLO': 2}
tops=[(3, 'ØSTBY'), (2, 'REMLO')]
tops_dict={'ØSTBY': 3, 'REMLO': 2}
Update.
Wanted to share what code I made in the end. Thank you guys so much. The feedback really helped.
Code:
etternavn_dict = {}
for index in data_list:
    if index['etternavn'] not in etternavn_dict.keys():
        etternavn_dict[index['etternavn']] = 1
    else:
        etternavn_dict[index['etternavn']] += 1
print("\nTopp 10 etternavn:")
count = 0
# reverse=True sorts by count descending, so the most common surnames come first
for k, v in sorted(etternavn_dict.items(), key=lambda item: item[1], reverse=True):
    if count < 10:
        print(k)
        count += 1
    else:
        break
I have multiple lists whose positions are related: the first item of each list belongs together, as does the second, and so forth. I need a way of linking the order of these lists together. I have a list of teams (some are duplicates), and I need an if statement that says: if there is a duplicate of this team, compare the related values in the other list and choose the better one.
import sys
import itertools
from itertools import islice

fileLocation = input("Input the file location of ScoreBoard: ")
T = []
N = []
L = []
timestamps = []
teamids = []
problemids = []
inputids = []
scores = []
dictionary = {}
amountOfLines = len(open('input1.txt').readlines())
with open('input1.txt') as input1:
    for line in islice(input1, 2, amountOfLines):
        parsed = line.strip().split()
        timestamps.append(parsed[0])
        teamids.append(parsed[1])
        problemids.append(parsed[2])
        inputids.append(parsed[3])
        scores.append(parsed[4])

def checkIfDuplicates(teamids):
    ''' Check if given list contains any duplicates '''
    if len(teamids) == len(set(teamids)):
        return False
    else:
        return True

for i in teamids:
    if checkIfDuplicates(i):
        dictionary['team%s' % i] = {}
    if dictionary < amountOfTeams:
        dictionary['team%s' %]
for i in score:
    dictionary[teamid][]
print(dictionary)
loop through each list item
delete item if duplicate
for i in list1[:]:  # iterate over a copy so removing items is safe
    if i in list2:
        list1.remove(i)
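If the goal is to keep the better value for each duplicated team rather than delete entries, one option is to walk both lists in step with zip and keep the best value per team in a dictionary. This is a sketch under assumptions: "better" is taken to mean a higher score, and the function name and sample numbers are invented for illustration.

```python
def best_score_per_team(teamids, scores):
    """Keep the highest score seen for each team.

    teamids and scores are parallel lists: scores[i] belongs to teamids[i].
    """
    best = {}
    for team, score in zip(teamids, scores):
        key = 'team%s' % team
        # first time we see the team, or a better score for a duplicate
        if key not in best or score > best[key]:
            best[key] = score
    return best


# hypothetical sample data; teams 5 and 4 each appear twice
teamids = ['5', '4', '1', '2', '5', '4']
scores = [10, 30, 20, 5, 25, 15]
print(best_score_per_team(teamids, scores))
```

Note that values read from a file with split() are strings, so they would need int(...) before comparing them as numbers.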
I am trying to sort a list of students and rank each student according to their marks using insertion sort. The data for each student includes Name, Roll no, Address, and Mark.
Here, I store the Mark of students in one list - Marklist and other data of students in a second list - stdData.
I sorted the student Mark List using Insertion sort. But right now I have 2 separate lists. How can I merge and print the sorted list of each student with their marks?
import csv

stdData = [] # store RollNum,student name last name,address
Marklist = [] # store the final mark of each student
#generallist=[]
with open("studentlist.csv", "r") as f1:
    recordReader = csv.DictReader(f1)
    for row in recordReader:
        #generallist.append(row)
        row['Mark']=int(row['Mark'])
        Marklist.append(row['Mark'])
        stdData.append(row['RollNo'])
        stdData.append(row['Name'])
        stdData.append(row['LastName'])
        stdData.append(row['Address'])
print(Marklist)
print(stdData)
for i in range(1, len(Marklist)):
    key = Marklist[i]
    j = i - 1
    while j >= 0 and key < Marklist[j]:
        Marklist[j + 1] = Marklist[j]
        j -= 1
    Marklist[j + 1] = key
print("Sorted List: ",Marklist)
Thanks.
You are very near to the correct solution. The answer lies in:
1. Storing the student details as a list of lists, e.g. [ [student1 details], [student2 details], [student3 details] ].
2. Sorting stdData using the indices of Marklist.
Below is the code modified to address the above points:
import csv

stdData = [] # store RollNum,student name last name,address
Marklist = [] # store the final mark of each student
generallist=[]
with open("studentlist.csv", "r") as f1:
    recordReader = csv.DictReader(f1)
    for row in recordReader:
        #generallist.append(row)
        row['Mark']=int(row['Mark'])
        Marklist.append(row['Mark'])
        tmp_data = []
        tmp_data.append(row['RollNo'])
        tmp_data.append(row['Name'])
        tmp_data.append(row['LastName'])
        tmp_data.append(row['Address'])
        stdData.append(tmp_data) # Storing student details as list of lists
print(Marklist)
print(stdData)
for i in range(1, len(Marklist)):
    key = Marklist[i]
    data = stdData[i] # Sort the elements in stdData using indices of MarkList
    j = i - 1
    while j >= 0 and key < Marklist[j]:
        Marklist[j + 1] = Marklist[j]
        stdData[j+1] = stdData[j]
        j -= 1
    Marklist[j + 1] = key
    stdData[j+1] = data
print("Sorted List: ",Marklist)
for student_data in stdData:
    print(student_data)
Even though the above solution gives the correct answer, it uses two lists.
We can instead sort a single list by a key, which does not have to be the list element itself. The code below does this and is a better solution.
import csv

stdData = [] # store RollNum,student name last name,address
with open("studentlist.csv", "r") as f1:
    recordReader = csv.DictReader(f1)
    for row in recordReader:
        tmp_data = []
        tmp_data.append(row['RollNo'])
        tmp_data.append(row['Name'])
        tmp_data.append(int(row['Mark']))
        tmp_data.append(row['LastName'])
        tmp_data.append(row['Address'])
        stdData.append(tmp_data) # Storing student details as list of lists
print(stdData)
for i in range(1, len(stdData)):
    key = stdData[i][2] # here the key is the mark
    data = stdData[i] # we will copy the data to correct index
    j = i - 1
    while j >= 0 and key < stdData[j][2]:
        stdData[j+1] = stdData[j]
        j -= 1
    stdData[j+1] = data
print("Sorted List:")
for rollno, name, mark, lastname, address in stdData:
    print(rollno, name, mark, lastname, address)
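The same key-based idea can be pulled out into a small reusable function. The sketch below is one possible shape, not the only one: the function name insertion_sort, its key parameter, and the sample student records are all made up for illustration, and the records are dicts rather than lists so the key can be looked up by name.

```python
def insertion_sort(items, key=lambda item: item):
    """Sort items in place, ascending by key(item), using insertion sort."""
    for i in range(1, len(items)):
        current = items[i]
        current_key = key(current)
        j = i - 1
        # shift every element with a larger key one slot to the right
        while j >= 0 and current_key < key(items[j]):
            items[j + 1] = items[j]
            j -= 1
        items[j + 1] = current
    return items


# hypothetical records standing in for rows of studentlist.csv
students = [
    {'RollNo': 1, 'Name': 'Asha', 'Mark': 72},
    {'RollNo': 2, 'Name': 'Ben', 'Mark': 91},
    {'RollNo': 3, 'Name': 'Chen', 'Mark': 58},
]
insertion_sort(students, key=lambda s: s['Mark'])

# rank 1 is the highest mark, so walk the ascending list backwards
for rank, s in enumerate(reversed(students), start=1):
    print(rank, s['Name'], s['Mark'])
```

Because the whole record moves together, the mark can never get separated from the student it belongs to.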
Happy coding.
I have an input list like [1,2,2,1,6], and the task at hand is to sort its distinct values by frequency. I have solved this and am getting the output [1,2,6].
The caveat is what happens when two numbers have the same count, e.g. count(1) == count(2): in the output array the larger number must come first, so 2 must come before 1 because 2 > 1. The desired output is therefore [2,1,6].
So for the input [1,1,2,2,3,3] the output should be [3,2,1]. The counts are all the same, so the numbers are sorted by their actual values.
This is what I did
input format:
number of Test cases
The list input.
def fun(l):
    d = {}
    for i in l:
        if i in d:
            d[i] += 1
        else:
            d[i] = 1
    d1 = sorted(d,key = lambda k: d[k], reverse=True)
    return d1

try:
    test = int(input())
    ans = []
    while test:
        l = [int(x) for x in input().split()]
        ans.append(fun(l))
        test -= 1
    for i in ans:
        for j in i:
            print(j, end = " ")
        print()
except:
    pass
I think this can help you. I added a reverse parameter that defaults to True, because that gives the required order, and I marked in the code where you can change it if you need to.
Here is the code:
from collections import defaultdict # To use a dictionary, but initialized with a default value

def fun(l, reverse = True):
    d = defaultdict(int)
    # Add count
    for i in l:
        d[i] += 1
    # Create a dictionary where keys are values
    new_d = defaultdict(list)
    for key,value in d.items():
        new_d[value].append(key)
    # Get frequencies
    list_freq = list(new_d.keys())
    list_freq.sort(reverse = reverse) #YOU CAN CHANGE THIS
    # Add numbers in decreasing order by frequency
    # If two integers have the same frequency, the greater number goes first
    ordered_list = []
    for number in list_freq:
        values_number = new_d[number]
        values_number.sort(reverse = reverse) # YOU CAN CHANGE THIS
        ordered_list.extend(values_number)
    return ordered_list
Examples:
l = [1,2,2,1,6]
fun(l)
#Output [2,1,6]
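A shorter variant of the same idea, offered only as a sketch, is to sort once on a (count, value) tuple: Python compares tuples element by element, so the count decides first and the value breaks ties. The function name sort_by_frequency is made up here.

```python
def sort_by_frequency(numbers):
    """Distinct values, most frequent first; ties go to the larger value."""
    counts = {}
    for n in numbers:
        counts[n] = counts.get(n, 0) + 1
    # key is (count, value); reverse=True puts higher counts first and,
    # among equal counts, larger values first
    return sorted(counts, key=lambda n: (counts[n], n), reverse=True)


print(sort_by_frequency([1, 2, 2, 1, 6]))     # [2, 1, 6]
print(sort_by_frequency([1, 1, 2, 2, 3, 3]))  # [3, 2, 1]
```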
I hope this can help you!
Problem is to return the name of the event that has the highest number of participants in this text file:
#Beyond the Imposter Syndrome
32 students
4 faculty
10 industries
#Diversifying Computing Panel
15 students
20 faculty
#Movie Night
52 students
So I figured I had to split it into a dictionary with the event names as keys and, as values, the sum of the integers at the beginning of the other lines. I'm having a lot of trouble, and I think I'm making it more complicated than it is.
This is what I have so far:
def most_attended(fname):
    '''(str: filename, )'''
    d = {}
    f = open(fname)
    lines = f.read().split(' \n')
    print lines
    indexes = []
    count = 0
    for i in range(len(lines)):
        if lines[i].startswith('#'):
            event = lines[i].strip('#').strip()
            if event not in d:
                d[event] = []
            print d
            indexes.append(i)
            print indexes
        if not lines[i].startswith('#') and indexes !=0:
            num = lines[i].strip().split()[0]
            print num
            if num not in d[len(d)-1]:
                d[len(d)-1] += [num]
    print d
    f.close()
import sys
from collections import defaultdict
from operator import itemgetter

def load_data(file_name):
    # Map each event name to its total number of participants
    events = defaultdict(int)
    current_event = None
    with open(file_name) as f:
        for line in f:
            if line.startswith('#'):
                current_event = line[1:].strip()
            elif line.strip():  # skip blank lines
                participants_count = int(line.split()[0])
                events[current_event] += participants_count
    return events

if __name__ == '__main__':
    if len(sys.argv) < 2:
        print('Usage:\n\t{} <file>\n'.format(sys.argv[0]))
    else:
        events = load_data(sys.argv[1])
        # Print the event with the largest total as "name: total"
        print('{}: {}'.format(*max(events.items(), key=itemgetter(1))))
Here's how I would do it.
with open("test.txt", "r") as f:
    docText = f.read()
eventsList = []
# start at one because we don't want what's before the first #
for item in docText.split("#")[1:]:
    individualLines = item.split("\n")
    # get the sum by finding everything after the name, name is the first line here
    sumPeople = 0
    # we don't want the title
    for line in individualLines[1:]:
        if not line == "":
            sumPeople += int(line.split(" ")[0]) # add everything before the first space to the sum
    # add to the list a tuple with (eventname, numpeopleatevent)
    eventsList.append((individualLines[0], sumPeople))
# get the item in the list with the max number of people
print(max(eventsList, key=lambda x: x[1]))
Essentially, you first want to split up the document by #, ignoring the first item because that is always going to be empty. Now you have a list of events. For each event, you go through every additional line in that event (all except the first, which is the name) and add that line's value to the sum. Then you create a list of tuples like (eventname, numPeopleAtEvent). Finally, you use max() to get the item with the maximum number of people.
With the sample file shown in the question, this code prints ('Movie Night', 52); obviously you can format the output however you like.
Similar answers to the ones above.
# lines is assumed to hold the lines of the file, e.g. from f.readlines()
result = {} # store the results
current_key = None # placeholder to hold the current_key
for line in lines:
    # find what event we are currently stripping data for
    # if this line doesnt start with '#', we can assume that its going to be info for the last seen event
    if line.startswith("#"):
        current_key = line[1:].strip()
        result[current_key] = 0
    elif current_key:
        # pull the number out of the string
        number = [int(s) for s in line.split() if s.isdigit()]
        # make sure we actually got a number in the line
        if len(number) > 0:
            result[current_key] = result[current_key] + number[0]
# pick the event name whose total is the largest
print(max(result, key=result.get))
This will print "Movie Night".
Your problem description says that you want to find the event with the highest number of participants. I tried a solution that does not use a list or a dictionary.
P.S. I am new to Python.
import re

# lines is assumed to hold the whole file as one string, e.g. from f.read()
bigEventName = ""
participants = 0
curEventName = ""
curEventParticipants = 0
# Use RegEx to split the file by lines
itr = re.finditer(r"^([#\w+].*)$", lines, flags = re.MULTILINE)
for m in itr:
    if m.group(1).startswith("#"):
        # Whenever a new group is encountered, check if the previous sum of
        # participants is more than the recent event. If so, save the results.
        if curEventParticipants > participants:
            participants = curEventParticipants
            bigEventName = curEventName
        # Reset the current event name and sum as 0
        curEventName = m.group(1)[1:]
        curEventParticipants = 0
    elif re.match(r"(\d+) .*", m.group(1)):
        # If it is line which starts with number, extract the number and sum it
        curEventParticipants += int(re.search(r"(\d+) .*", m.group(1)).group(1))
# This nasty code is needed to take care of the last event
bigEventName = curEventName if curEventParticipants > participants else bigEventName
# Here is the answer
print("Event: ", bigEventName)
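You can do it without a dictionary, and maybe a little more simply, using just lists:
with open('myfile.txt', 'r') as f:
    lines = [l.strip() for l in f if l.strip()]  # drop '\n' and blank lines
events = []  # one [name, total] pair per event
for l in lines:
    if l[0] == '#':
        # a line starting with '#' is an event name and starts a new event
        events.append([l[1:].strip(), 0])
    elif events:
        # add the leading number to the current (last) event
        events[-1][1] += int(l.split()[0])
highest = 0
event = ""
for name, total in events:
    if total > highest:
        highest = total
        event = name
print(event)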