I'm trying to create a dict that holds the name, position and number of each player for each team. But when I build the final dictionary with players[team_name] = dict(zip(number, name, position)), it throws an error (see below). I can't seem to get it right; any thoughts on what I'm doing wrong would be highly appreciated. Many thanks,
from bs4 import BeautifulSoup as soup
import requests
from lxml import html
clubs_url = 'https://www.premierleague.com/clubs'
parent_url = clubs_url.rsplit('/', 1)[0]
data = requests.get(clubs_url).text
html = soup(data, 'html.parser')
team_name = []
team_link = []
for ul in html.find_all('ul', {'class': 'block-list-5 block-list-3-m block-list-1-s block-list-1-xs block-list-padding dataContainer'}):
    for a in ul.find_all('a'):
        team_name.append(str(a.h4).split('>', 1)[1].split('<')[0])
        team_link.append(parent_url+a['href'])
team_link = [item.replace('overview', 'squad') for item in team_link]
team = dict(zip(team_name, team_link))
data = {}
players = {}
for team_name, team_link in team.items():
    player_page = requests.get(team_link)
    cont = soup(player_page.content, 'lxml')
    clud_ele = cont.find_all('span', attrs={'class' : 'playerCardInfo'})
    for i in clud_ele:
        v_number = [100 if v == "-" else v.get_text(strip=True) for v in i.select('span.number')]
        v_name = [v.get_text(strip=True) for v in i.select('h4.name')]
        v_position = [v.get_text(strip=True) for v in i.select('span.position')]
        key_number = [key for element in i.select('span.number') for key in element['class']]
        key_name = [key for element in i.select('h4.name') for key in element['class']]
        key_position = [key for element in i.select('span.position') for key in element['class']]
        number = dict(zip(key_number,v_number))
        name = dict(zip(key_name,v_name))
        position = dict(zip(key_position,v_name))
        players[team_name] = dict(zip(number,name,position))
---> 21 players[team_name] = dict(zip(number,name,position))
22
23
ValueError: dictionary update sequence element #0 has length 3; 2 is required
There are several problems in your code. The one causing the error is that you are passing 3-item tuples to dict(), which only accepts 2-item (key, value) pairs. See the dict documentation for details.
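To make the error concrete, here is a minimal sketch. The three lists are illustrative sample values rather than scraped data, and by_number is just one possible way to nest the fields:

```python
numbers = ['1', '26']
names = ['Bernd Leno', 'Emiliano Martínez']
positions = ['Goalkeeper', 'Goalkeeper']

# zip() over three iterables yields 3-item tuples, which dict() rejects.
try:
    dict(zip(numbers, names, positions))
except ValueError as exc:
    error_message = str(exc)

# dict() needs (key, value) pairs, so keep one field as the key
# and nest the remaining fields in an inner dict.
by_number = {
    number: {'name': name, 'position': position}
    for number, name, position in zip(numbers, names, positions)
}

print(error_message)
print(by_number)
```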
That said, I would suggest reworking the whole nested loop.
First, clud_ele is a list of player info elements; each one concerns a single player and provides exactly one position, one name and one number. So there is no need to store that information in lists; simple variables will do:
for player_info in clud_ele:
    number = player_info.select('span.number')[0].get_text(strip=True)
    if number == '-':
        number = 100
    name = player_info.select('h4.name')[0].get_text(strip=True)
    position = player_info.select('span.position')[0].get_text(strip=True)
Here, select returns a list, but since you know the list contains only one item, it is fine to take that item and call get_text on it. If you want to be sure, you could check that the length of player_info.select('span.number') is actually 1 before continuing.
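One way to sketch that length check is a small helper. The name only is hypothetical (it is not part of BeautifulSoup); it works on any list, so it could wrap a call such as player_info.select('span.number'):

```python
def only(items):
    """Return the single element of `items`, or raise if the count is not 1."""
    if len(items) != 1:
        raise ValueError(f'expected exactly 1 item, got {len(items)}')
    return items[0]


# With BeautifulSoup this would be used as:
#   number_tag = only(player_info.select('span.number'))
print(only(['Bernd Leno']))
```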
This way you get scalar values, which are much easier to manipulate.
Also note that I renamed i to player_info, which is much more explicit.
Then you can easily add your player data to your players dict:
players[team_name].append({'name': name,
                           'position': position,
                           'number': number})
This assumes that you create players[team_name] before the nested loop with players[team_name] = [].
Edit: as stated in @kederrac's answer, a defaultdict is a smart and convenient way to avoid manually creating each players[team_name] list.
Finally, this will give you:
a dictionary containing values for name, position and number keys for each player
a team list containing player dictionaries for each team
a players dictionary associating a team list for each team_name
This is the data structure you seem to want, but other structures are possible. Remember to think about your data structure so that it is logical AND easy to manipulate.
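As a rough illustration of why this shape is convenient, the sketch below builds a small players dictionary by hand (sample values only, no scraping) and queries it; goalkeepers and by_number are illustrative names:

```python
players = {
    'Arsenal': [
        {'number': '1', 'position': 'Goalkeeper', 'name': 'Bernd Leno'},
        {'number': '26', 'position': 'Goalkeeper', 'name': 'Emiliano Martínez'},
        {'number': '2', 'position': 'Defender', 'name': 'Héctor Bellerín'},
    ],
}

# Filter one team's players by position.
goalkeepers = [p['name'] for p in players['Arsenal'] if p['position'] == 'Goalkeeper']

# Re-index one team's players by shirt number for direct lookup.
by_number = {p['number']: p for p in players['Arsenal']}

print(goalkeepers)
print(by_number['2']['name'])
```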
You can't instantiate a dict from 3-item tuples. The problem is that you pass three variables to zip, zip(number, name, position), and then try to build a dict from the result; dict needs exactly two items at a time, the key and the value.
I've rewritten the last part of your code:
from collections import defaultdict
data = {}
players = defaultdict(list)
for team_name, team_link in team.items():
    player_page = requests.get(team_link)
    cont = soup(player_page.text, 'lxml')
    clud_ele = cont.find_all('span', attrs={'class' : 'playerCardInfo'})
    for i in clud_ele:
        num = i.select('span.number')[0].get_text(strip=True)
        number = 100 if num == '-' else num
        name = i.select('h4.name')[0].get_text(strip=True)
        position = i.select('span.position')[0].get_text(strip=True)
        players[team_name].append({'number': number, 'position': position, 'name': name})
output:
defaultdict(list,
{'Arsenal': [{'number': '1',
'position': 'Goalkeeper',
'name': 'Bernd Leno'},
{'number': '26',
'position': 'Goalkeeper',
'name': 'Emiliano Martínez'},
{'number': '33', 'position': 'Goalkeeper', 'name': 'Matt Macey'},
{'number': '2',
'position': 'Defender',
'name': 'Héctor Bellerín'},
.......................
Related
I make a bunch of matrices that I want to store in python dictionaries and I always find myself typing the same thing for every state that I want to build, i.e.
Ne21_1st_state = {}
Ne21_2nd_state = {}
Ne21_3rd_state = {}
Ne21_4th_state = {}
Ne21_5th_state = {}
Ne21_6th_state = {}
...
Ne21_29th_state = {}
Ne21_30th_state = {}
Can somebody help me automate this using python for loops?
Thanks in advance!
I want something like this:
for i in range(3, 11):
    states = f'Ar36_{i}th_state'
    print(states)
where the output would be:
Ar36_3th_state
Ar36_4th_state
Ar36_5th_state
Ar36_6th_state
Ar36_7th_state
Ar36_8th_state
Ar36_9th_state
Ar36_10th_state
but instead of printing the names, it would create individual dictionaries named Ar36_3th_state, Ar36_4th_state, Ar36_5th_state, ...
Couldn't we make a list of dictionaries instead?
That is, a list of 30 (or any N) elements, where each element is a dictionary with the key "Ar36_{i}th_state" and whatever value you want.
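A closely related option is a single dictionary keyed by the generated names, which avoids creating variables dynamically. A minimal sketch, where the 'matrix' key and its value are placeholders:

```python
# One outer dict; each generated name maps to its own fresh, empty dict.
states = {f'Ar36_{i}th_state': {} for i in range(3, 11)}

# Fill in one state; the others stay untouched because each value is a separate dict.
states['Ar36_3th_state']['matrix'] = [[1, 0], [0, 1]]

print(list(states))
print(states['Ar36_3th_state'])
```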
You can build the name of a pseudo-variable and use it as a key in a dictionary, like this:
my_dic = {1: 'a', 2: 'b', 3: 'c', 4: 'd', 5: 'e'}
solution = {}
for i in range(1, 31):
    name = 'Ne21_'+str(i)+'st_state'
    #solution[name] = dict(my_dic)
    solution[name] = {}  # a fresh dict per key, so entries do not share one object
for pseudo_variable in solution:
    print(pseudo_variable, solution[pseudo_variable])
print(solution['Ne21_16st_state'])
for pseudo_variable in solution:
    if '_16st' in pseudo_variable:
        print(pseudo_variable, solution[pseudo_variable])
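The loop above appends 'st' to every number. If the keys should read 1st, 2nd, 3rd, ... 30th as in the question, a small helper can pick the suffix. This is only a sketch, and ordinal is a hypothetical helper name:

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st', 12 -> '12th'."""
    if 10 <= n % 100 <= 20:
        suffix = 'th'  # 11th, 12th, 13th and the rest of the teens
    else:
        suffix = {1: 'st', 2: 'nd', 3: 'rd'}.get(n % 10, 'th')
    return f'{n}{suffix}'


states = {f'Ne21_{ordinal(i)}_state': {} for i in range(1, 31)}
print(list(states)[:3])
```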
One way I've done this is using list comprehension.
key = list(
str(input(f"Please enter a Key for value {x + 1}: "))
if x == 0
else str(input(f"\nPlease enter a Key for value {x + 1}: "))
for x in range(3))
value = list(str(input(f"\nPlease enter a Bool for value {x + 1}: "))
for x in range(3))
BoolValues = dict(zip(key, value))
I first create a list of keys, followed by a list of the values to be stored under those keys. Then I just zip them together into a dictionary. The conditional expression in the first list is only there for a slightly better user experience: \n is added to every prompt after the first one.
Actually now that I look back on the question it may be slightly different to what I was thinking, are you trying to create new dictionaries for every matrix? If that is the case, is it something similar to this?: How do you create different variable names while in a loop?
I'm a total beginner at coding and just need help with a few things.
My dream was to write a smart shopping list that automatically detects duplicates and increases the weight of duplicate products.
I get the shopping list from an external file which has the following form:
weight\n
ingredient\n
eg.
60
eggs
120
beef meat
25
pasta
120
eggs
etc...
After converting these files to dictionaries with this code:
final_list = []
def get_list(day_list):
    for day in range(len(day_list)):
        day += 1
        day_to_open = f'Days/day{str(day)}.txt'
        with open(day_to_open, 'r') as file:
            day1 = file.readlines()
            day1 = [item.rstrip() for item in day1]
            x = 0
            y = 1
            list = []
            for item in range(0, len(day1), 2):
                dictio = {day1[y]: day1[x]}
                x += 2
                y += 2
                list.append(dictio)
            final_list.append(list)
    list = []
    for item in final_list:
        list += item
    return list
days = [1, 2, 3]
list = get_list(day_list=days)
Finally, I get a list of dictionaries like this:
[{'eggs': '60'}, {'beef meat': '120'}, {'pasta': '25'}, {'eggs': '120'}]
How can I iterate through this list of dictionaries to check whether any products repeat and, if so, keep a single entry with the combined weight?
For three weeks I have been trying to solve it, unfortunately to no avail.
Thank you very much for all your help!
#Edit
my goal is to make it look like this:
[{'eggs': 180}, {'beef meat': 120}, {'pasta': 25}]
#egg weight added (120 + 60)#
lis = [{'eggs': '60'}, {'beef meat': '120'}, {'pasta': '25'}, {'eggs': '120'}]
# make 1 dict from list of dicts and update max value
new = {}
for d in lis:
    for k, v in d.items():
        if (k not in new) or (int(v) > int(new[k])):
            new[k] = v
# rebuild list of dicts
lis = [{k:v} for k, v in new.items()]
print(lis)
# [{'eggs': '120'}, {'beef meat': '120'}, {'pasta': '25'}]
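The snippet above keeps the larger of the duplicate weights. If the weights should be added together instead, as the question's edit asks (eggs: 120 + 60 = 180), a minimal sketch could look like this; totals and merged are illustrative names:

```python
lis = [{'eggs': '60'}, {'beef meat': '120'}, {'pasta': '25'}, {'eggs': '120'}]

# Accumulate the integer weight per product in a single dict.
totals = {}
for d in lis:
    for product, weight in d.items():
        totals[product] = totals.get(product, 0) + int(weight)

# Rebuild the list of one-item dicts, in first-seen order.
merged = [{product: weight} for product, weight in totals.items()]
print(merged)
# [{'eggs': 180}, {'beef meat': 120}, {'pasta': 25}]
```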
As ShadowRanger has pointed out, it's not common practice to have a list of multiple dictionaries as you have done. Dictionaries are very useful if used correctly.
I'm not entirely sure of the structure of the files you are reading, so I will just explain a way forward and leave it up to you to implement it. What I would suggest is that you first initialize a dictionary with all the necessary keys (ingredients in your case), with each of the values set to 0 (as an integer or float, rather than a string), so you would get a dictionary like this:
shopping_list = {'eggs': 0, 'beef meat': 0, 'pasta': 0}
Then, you will be able to access each of the values by calling the shopping_list dictionary and specifying the key of interest. For example, if you wanted to print the value of eggs, you would write:
print(shopping_list['eggs']) # this would return 0
You can then easily increase or decrease a value of interest; for example, to add 10 to pasta, you would write:
shopping_list['pasta'] += 10
Using this method, you can then iterate through each of your items, select the ingredient of interest and add the weight. So if you have duplicates, the weight is simply added to the same ingredient. Again, I'm not sure of the structure of the files you are reading, but it would be something along the lines of:
for ingredient, weight in file:
    shopping_list[ingredient] += weight
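For the file layout described in the question (a weight line followed by an ingredient line), the idea might be sketched as follows. It assumes the lines strictly alternate; sum_weights is a hypothetical helper, sample stands in for file.readlines(), and defaultdict(int) is used here so the keys do not have to be initialized to 0 by hand:

```python
from collections import defaultdict


def sum_weights(lines):
    """Sum the weights per ingredient from alternating weight/ingredient lines."""
    totals = defaultdict(int)
    cleaned = [line.strip() for line in lines if line.strip()]
    # Even positions hold weights, odd positions hold ingredient names.
    for weight, ingredient in zip(cleaned[0::2], cleaned[1::2]):
        totals[ingredient] += int(weight)
    return dict(totals)


sample = ['60\n', 'eggs\n', '120\n', 'beef meat\n', '25\n', 'pasta\n', '120\n', 'eggs\n']
shopping_list = sum_weights(sample)
print(shopping_list)
```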
Good luck for your dream - all the best!
I'm parsing through a response of XML using xpath from lxml library.
I'm getting the results and creating lists out of them like below:
object_name = [o.text for o in response.xpath('//*[name()="objectName"]')]
object_size_KB = [o.text for o in response.xpath('//*[name()="objectSize"]')]
I want to use the lists to create a dictionary per element in list and then add them to a final list like this:
[{'object_name': 'file1234', 'object_size_KB': 9347627},
{'object_name': 'file5671', 'object_size_KB': 9406875}]
I wanted a generator because I might need to search for more metadata from the response in the future so I want my code to be future proof and reduce repetition:
meta_names = {
'object_name': '//*[name()="objectName"]',
'object_size_KB': '//*[name()="objectSize"]'
}
def parse_response(response, meta_names):
    """
    input: response: api xml response text from lxml xpath
    input: meta_names: key names used to generate dictionary per object
    return: list of objects dictionary
    """
    mylist = []
    # create list of each xpath match assign them to variables
    for key, value in meta_names.items():
        mylist.append({key: [o.text for o in response.xpath(value)]})
    return mylist
However the function gives me this:
[{'object_name': ['file1234', 'file5671']}, {'object_size_KB': ['9347627', '9406875']}]
I've been searching for a similar case in the forums but couldn't find something to match my needs.
Appreciate your help.
UPDATE: Renney's answer was what I wanted. I just adjusted the length used in range, since I don't always get the same number of xpath results per object key; because my lists have identical lengths every time, I picked the first one (index [0]).
The function now looks like this:
def create_entries(root, keys):
    tmp = []
    for key in keys:
        tmp.append([o.text for o in root.xpath('//*[name()="' + key + '"]')])
    ret = []
    # print(len(tmp[0]))
    for i in range(len(tmp[0])):
        add = {}
        for j in range(len(keys)):
            add[keys[j]] = tmp[j][i]
        ret.append(add)
    return ret
Use a two-dimensional list:
def createEntries(root, keys):
    tmp = []
    for key in keys:
        tmp.append([o.text for o in root.xpath('//*[name()="' + key + '"]')])
    ret = []
    # note: len(tmp) is the number of keys; use len(tmp[0]) to cover every object
    for i in range(len(tmp)):
        add = {}
        for j in range(len(keys)):
            add[keys[j]] = tmp[j][i]
        ret.append(add)
    return ret
I think this is what you are looking for.
You can use zip to combine your two lists into a list of value pairs.
Then, you can use a list comprehension or a generator expression to pair your value pairs with your desired keys.
import pprint
object_name = ['file1234', 'file5671']
object_size = [9347627, 9406875]
# Desired output:
# [{'object_name': 'file1234', 'object_size_KB': 9347627},
#  {'object_name': 'file5671', 'object_size_KB': 9406875}]
# Output of the original function:
# [{'object_name': ['file1234', 'file5671']}, {'object_size_KB': ['9347627', '9406875']}]
# List Comprehension
obj_list = [{'object_name': name, 'object_size': size} for name,size in zip(object_name,object_size)]
pprint.pprint(obj_list)
print('\n')
# Generator Expression
generator = ({'object_name': name, 'object_size': size} for name,size in zip(object_name,object_size))
for obj in generator:
    print(obj)
Live Code Example -> https://onlinegdb.com/SyNSwd7jU
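The hard-coded keys in the comprehension above can be generalized so that adding another metadata field does not require touching the loop. The sketch below assumes the per-key value lists have already been extracted (for example with xpath); build_entries and columns are illustrative names. Note that zip stops at the shortest list, so lists of unequal length are silently truncated:

```python
def build_entries(columns):
    """Turn {key: [v1, v2, ...]} into [{key: v1, ...}, {key: v2, ...}, ...]."""
    keys = list(columns)
    # zip(*values) transposes the per-key lists into per-object rows.
    return [dict(zip(keys, row)) for row in zip(*columns.values())]


columns = {
    'object_name': ['file1234', 'file5671'],
    'object_size_KB': ['9347627', '9406875'],
}
entries = build_entries(columns)
print(entries)
```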
I think the accepted answer is more efficient, but here's an example of how list comprehensions could be used.
meta_names = {
'object_name': ['file1234', 'file5671'],
'object_size_KB': ['9347627', '9406875'],
'object_text': ['Bob', 'Ross']
}
import pprint
def parse_response(meta_names):
    """
    input: meta_names: dict mapping each key name to its list of values
    return: None; pretty-prints one dictionary per object
    """
    # List comprehensions
    to_dict = lambda l: [{key:val for key,val in pairs} for pairs in l]
    objs = list(zip(*list([[key,val] for val in vals] for key,vals in meta_names.items())))
    pprint.pprint(to_dict(objs))
parse_response(meta_names)
Live Code -> https://onlinegdb.com/ryLq4PVjL
I'm calculating the average score of people from a two-dimensional list, and I want to know how to return two people with the same score joined by "and", e.g. name and name.
My code:
def bestAverage(inputDict):
    dic = {}
    for i in inputDict:
        if i[0] in dic.keys():
            dic[i[0]].append(int(i[1]))
        else:
            dic[i[0]] = [int(i[1])]
    totle_score = 0
    print(dic)
    for key, value, in dic.items():
        for c in value:
            totle_score += int(c)
        Q = len(value)
        avrage = totle_score / Q
        dic[key]= [avrage]
    print(dic)
My input:
inputDict = [ ["Diane", 20],["Bion",25],["Jack","30"],["Diane","50"] ]
result = bestAverage(inputDict)
OUTCOME:
{'Diane': [35.0], 'Bion': [95.0], 'Jack': [125.0]}
Using the sorted dictionary, you can get the dictionary you want.
Sorry, I think my code is a bit complicated.
dic = {'Diane': [35.0],
'Bion': [95.0],
'Jack': [125.0],
'Diane_2': [35.0],
'Bion_2':[95],
'Diane_3':[35.0],
'John':[10]}
import operator
sorted_dic = sorted(dic.items(), key=operator.itemgetter(0))
new_dic = dict()
preKey = sorted_dic[0][0]
preValue = sorted_dic[0][1]
nms = preKey
for key,value in sorted_dic[1:]:
    if(value == preValue):
        nms += ' and ' + key
    else:
        new_dic[nms] = preValue
        preKey = key
        preValue = value
        nms = preKey
new_dic[nms] = preValue
print(new_dic)
OUTCOME:
{'Jack': [125.0], 'John': [10], 'Diane and Diane_2 and Diane_3':
[35.0], 'Bion and Bion_2': [95.0]}
Per the OP's question in the comments, this example now produces a final structure containing entries only for those scores shared by more than one person.
data = {'Diane': [35.0], 'Bion': [95.0], 'Jack': [125.0], 'Sam': [95.0]}
# Here, we create a dict of lists, where the keys are the scores, and the values
# are the names of each person who has that score. This will produce:
#
# {
# 35.0: ['Diane'],
# 95.0: ['Bion', 'Sam'],
# 125.0: ['Jack']
# }
collected = {}
# For each key (name) in the input dict...
for name in data:
    # Get the score value out of the array for this name
    val = data[name][0]
    # If we don't have an entry in our new dict for this score (no key in the dict of that
    # score value) then add that entry as the score for the key and an empty array for the value
    if val not in collected:
        collected[val] = []
    # Now that we're sure we have an entry for the score of the name we're processing, add
    # the name to the array for that score in the new dict
    collected[val].append(name)
# Now we just "flip" each entry in the 'collected' map to create a new dict. We create
# one entry in this dict for each entry in the 'collected' map, where each key is a
# single string where we've combined all of the names with the same score, separated
# by 'and', and each value is the score that those names had.
result = {}
# Now iterate over each of our keys, the unique scores, in our new 'collected' dict...
for val in collected:
    # We only want to create an entry in the new dict if the entry we're processing has more than
    # just one name in the list of names. So here, we check for that, and skip adding an entry to
    # the new dict if there is only one name in the list
    if len(collected[val]) == 1:
        continue
    # Combine the value of this entry, the list of names with a particular score, into a single string
    combinedNames = " and ".join(collected[val])
    # Add an entry to our 'result' dict with this combined name as the key and the score as the value
    result[combinedNames] = val
# Print each combined name string from the resulting structure
for names in result:
    print(names)
Output:
Bion and Sam
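For the original question (averaging each person's scores, then joining tied names with "and"), a minimal end-to-end sketch might look like the following. It assumes that "best" means the highest average; best_average is an illustrative name, and the 'Sam' record is a hypothetical addition to the question's input so that a tie actually occurs. Note that it computes each average from that person's own scores, whereas the code in the question keeps one running total across all people:

```python
from collections import defaultdict


def best_average(records):
    """Return (names joined by ' and ', average) for the highest average score."""
    scores = defaultdict(list)
    for name, score in records:
        scores[name].append(int(score))
    # Each average uses only that person's scores.
    averages = {name: sum(vals) / len(vals) for name, vals in scores.items()}
    best = max(averages.values())
    winners = [name for name, avg in averages.items() if avg == best]
    return ' and '.join(winners), best


# 'Sam' is a hypothetical extra record, added to produce a tie with Diane.
records = [["Diane", 20], ["Bion", 25], ["Jack", "30"], ["Diane", "50"], ["Sam", 35]]
names, average = best_average(records)
print(names, average)
```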
I have an application that creates a list of lists. The second element of each inner list needs to be assigned using a lookup list, which is also a list of lists.
I have used the "all" method to match the values in the list. If the list value exists in the lookup list, it should update the second position element in the new list. However this is not the case. The == comparative yields a False match for all elements, even though they all exist in both lists.
I have also tried various combinations of index finding commands but they are not able to unpack the values of each list.
My code is below. The goal is to replace the "xxx" values in the newData with the numbers in the lookupList.
lookupList= [['Garry','34'],['Simon', '24'] ,['Louise','13'] ]
newData = [['Louise','xxx'],['Garry', 'xxx'] ,['Simon','xxx'] ]
#Matching values
for i in newData:
    if (all(i[0] == elem[0] for elem in lookupList)):
        i[1] = elem[1]
You can't do what you want with all(), because elem is not accessible outside the generator expression. In addition, all() is only true when every element of lookupList matches, which is never the case here.
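A small sketch of the difference, using the question's lookupList; matches_all, matches_any and match are illustrative names. any() answers "is there a match?", and next() with a default returns the matching entry itself:

```python
lookupList = [['Garry', '34'], ['Simon', '24'], ['Louise', '13']]
name = 'Louise'

# all() requires every entry to match, so it is False for any single name.
matches_all = all(name == elem[0] for elem in lookupList)

# any() is True as soon as one entry matches.
matches_any = any(name == elem[0] for elem in lookupList)

# next() returns the first matching entry, or None when nothing matches.
match = next((elem for elem in lookupList if elem[0] == name), None)

print(matches_all, matches_any, match)
```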
Instead of using a list, use a dictionary to store the lookupList:
lookupDict = dict(lookupList)
Looking up a match is then a simple constant-time (fast) operation:
for entry in newData:
    if entry[0] in lookupDict:
        entry[1] = lookupDict[entry[0]]
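Put together as a runnable sketch, with dict.get used as a compact variant of the membership test (it falls back to the current value when a name is missing). The 'Peter' entry is included only to show an unmatched name:

```python
lookupList = [['Garry', '34'], ['Simon', '24'], ['Louise', '13']]
newData = [['Louise', 'xxx'], ['Garry', 'xxx'], ['Peter', 'xxx'], ['Simon', 'xxx']]

# dict() accepts a list of 2-item lists and builds a name -> number mapping.
lookupDict = dict(lookupList)

for entry in newData:
    # Keep the existing value when the name is not in the lookup.
    entry[1] = lookupDict.get(entry[0], entry[1])

print(newData)
```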
You should use dictionaries instead, like this:
lookupList = {}
newData = {}  # two separate dicts; `lookupList = newData = {}` would alias one object
old_lookupList = [['Garry','34'],['Simon', '24'] ,['Louise','13'] ]
old_newData = [['Louise','xxx'],['Garry', 'xxx'] ,['Simon','xxx'] ]
#convert into dictionary
for e in old_newData: newData[e[0]] = e[1]
for e in old_lookupList: lookupList[e[0]] = e[1]
#Matching values
for key in lookupList:
    if key in newData:
        newData[key]=lookupList[key]
#convert into list
output_list = []
for x in newData:
    output_list.append([x, newData[x]])
I like the following code since it can be tweaked and used in different ways:
lookupList= [ ['Garry', '34'],['Simon', '24'] ,['Louise', '13'] ]
newData = [ ['Louise', 'xxx'],['Garry', 'xxx'], ['Peter', 'xxx'] ,['Simon', 'xxx'] ]
#Matching values
for R in newData:
    for name, value in lookupList:
        if name == R[0]:
            R[1] = value
            break
    else:
        # the inner loop finished without break: no match in lookupList
        print('Lookup fail on record:', R)
print(newData)