I'm getting a KeyError exception when I input a player name that is not in the records list. I can search for any valid name and get it back, but if I input anything else, I get a KeyError. I'm not really sure how to go about handling this, since it's already kind of confusing dealing with something like three sets of data created from parsing my file.
I know this code is bad. I'm new to Python, so please excuse the mess. Also note that this is a sort of test file to get this functionality working, which I will then write into functions in my real main file. Kind of a testbed here, if that makes any sense.
This is what my data file, stats4.txt, has in it:
[00000] Cho'Gath - 12/16/3 - Loss - 2012-11-22
[00001] Fizz - 12/5/16 - Win - 2012-11-22
[00002] Caitlyn - 13/4/6 - Win - 2012-11-22
[00003] Sona - 4/5/9 - Loss - 2012-11-23
[00004] Sona - 2/1/20 - Win - 2012-11-23
[00005] Sona - 6/3/17 - Loss - 2012-11-23
[00006] Caitlyn - 14/2/16 - Win - 2012-11-24
[00007] Lux - 10/2/14 - Win - 2012-11-24
[00008] Sona - 8/1/22 - Win - 2012-11-27
Here's my code:
import re
info = {}
records = []
search = []
with open('stats4.txt') as data:
    for line in data:
        gameid = [item.strip('[') for item in line.split(']')]
        del gameid[-1]
        gameidstr = ''.join(gameid)
        gameid = gameidstr
        line = line[7:]
        player, stats, outcome, date = [item.strip() for item in line.split('-', 3)]
        stats = dict(zip(('kills', 'deaths', 'assists'), map(int, stats.split('/'))))
        date = tuple(map(int, date.split('-')))
        info[player] = dict(zip(('gameid', 'player', 'stats', 'outcome', 'date'), (gameid, player, stats, outcome, date)))
        records.append(tuple((gameid, info[player])))
print "\n\n", info, "\n\n" #print the info dictionary just to see
champ = raw_input() #get champion name
#print info[champ].get('stats').get('kills'), "\n\n"
#print "[%s] %s - %s/%s/%s - %s-%s-%s" % (info[champ].get('gameid'), champ, info[champ].get('stats').get('kills'), info[champ].get('stats').get('deaths'), info[champ].get('stats').get('assists'), info[champ].get('date')[0], info[champ].get('date')[1], info[champ].get('date')[2])
#print "\n\n"
#print info[champ].values()
i = 0
for item in records: #this prints out all records
    print "\n", "[%s] %s - %s/%s/%s - %s - %s-%s-%s" % (records[i][0], records[i][1]['player'], records[i][1]['stats']['kills'], records[i][1]['stats']['deaths'], records[i][1]['stats']['assists'], records[i][1]['outcome'], records[i][1]['date'][0], records[i][1]['date'][1], records[i][1]['date'][2])
    i = i + 1
print "\n" + "*" * 50
i = 0
for item in records:
    if champ in records[i][1]['player']:
        search.append(records[i][1])
    else:
        pass
    i = i + 1
s = 0
if not search:
    print "no available records" #how can I get this to print even if nothing is inputted in raw_input above for champ?
print "****"
for item in search:
    print "\n[%s] %s - %s/%s/%s - %s - %s-%s-%s" % (search[s]['gameid'], search[s]['player'], search[s]['stats']['kills'], search[s]['stats']['deaths'], search[s]['stats']['assists'], search[s]['outcome'], search[s]['date'][0], search[s]['date'][1], search[s]['date'][2])
    s = s + 1
I tried setting up a try/except sort of thing, but I couldn't get any different result when entering an invalid player name. I think I could probably set something up with a function that returns different things depending on whether the name is present, but I think I've just gotten myself a bit confused. Also notice that "no match" does indeed print for the 8 records that aren't matches, though that's not quite how I want it to work. Basically I need to get something like that for any invalid input name, not just a valid input that happens not to be in a given record in the loop.
Valid input names for this data are:
Cho'Gath, Fizz, Caitlyn, Sona, or Lux. Anything else gives a KeyError; that's what I need to handle so it doesn't raise an error and instead just prints something like "no records available for that champion" (and prints that only once, rather than 8 times).
Thanks for any help!
[edit] I was finally able to update this code in the post (thank you martineau for getting it added in; for some reason backticks weren't working to block code and it was showing up as bold normal text when I pasted). Anyway, look at if not search: how can I get that to print even if nothing is entered at all? When I just press return at raw_input, it currently prints all records after **** even though I didn't give it any search champ.
Where exactly is your error occurring?
I'm just assuming it is when you do champ = raw_input() #get champion name
and then info[champ].
You can either check whether the key exists first:
if champ not in info:
    print 'no records available'
or use get:
if info.get(champ):
    # do stuff
or you can just try to access the key:
try:
    info[champ]
    # do stuff
except KeyError:
    print 'no records available'
The more specific you can be in your question the better. Although you explained your problem, you really didn't include any specifics. Please always include a traceback if available, and post the relevant code IN your post, not behind a link.
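To make the three options above concrete, here is a minimal sketch. It uses Python 3 syntax (the question's code is Python 2), and the `info` contents and the three `describe...` helper names are made up for illustration; only the lookup patterns matter.

```python
# Hypothetical data shaped like the question's `info` dictionary.
info = {
    "Sona": {"gameid": "00008", "outcome": "Win"},
    "Lux": {"gameid": "00007", "outcome": "Win"},
}

MISSING = "no records available for that champion"


def describe_with_in(champ):
    # Option 1: test membership before indexing.
    if champ not in info:
        return MISSING
    record = info[champ]
    return "[%s] %s - %s" % (record["gameid"], champ, record["outcome"])


def describe_with_get(champ):
    # Option 2: dict.get returns None instead of raising KeyError.
    record = info.get(champ)
    if record is None:
        return MISSING
    return "[%s] %s - %s" % (record["gameid"], champ, record["outcome"])


def describe_with_try(champ):
    # Option 3: index directly and catch the KeyError.
    try:
        record = info[champ]
    except KeyError:
        return MISSING
    return "[%s] %s - %s" % (record["gameid"], champ, record["outcome"])


for name in ("Sona", "Teemo"):
    print(describe_with_in(name))
```

All three helpers behave the same; which one reads best is mostly a matter of taste.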
Here are some modifications that I think address your problem. I also reformatted the code to make it a little more readable. In Python it's possible to continue a long line onto the next one either by ending it with a \ or by just going to the next line if there's an unpaired '(' or '[' on the previous line.
Also, the way I put code in my questions or answer here is by cutting it out of my text editor and then pasting it into the edit window, after that I make sure it's all selected and then just use the {} tool at the top of edit window to format it all.
import re
from pprint import pprint
info = {}
records = []
with open('stats4.txt') as data:
    for line in data:
        gameid = [item.strip('[') for item in line.split(']')]
        del gameid[-1]
        gameidstr = ''.join(gameid)
        gameid = gameidstr
        line = line[7:]
        player, stats, outcome, date = [item.strip() for item in line.split('-', 3)]
        stats = dict(zip(('kills', 'deaths', 'assists'), map(int, stats.split('/'))))
        date = tuple(map(int, date.split('-')))
        info[player] = dict(zip(('gameid', 'player', 'stats', 'outcome', 'date'),
                                (gameid, player, stats, outcome, date)))
        records.append(tuple((gameid, info[player])))
#print "\n\n", info, "\n\n" #print the info dictionary just to see
pprint(info)
champ = raw_input("Champ's name: ") #get champion name
#print info[champ].get('stats').get('kills'), "\n\n"
#print "[%s] %s - %s/%s/%s - %s-%s-%s" % (
# info[champ].get('gameid'), champ, info[champ].get('stats').get('kills'),
# info[champ].get('stats').get('deaths'), info[champ].get('stats').get('assists'),
# info[champ].get('date')[0], info[champ].get('date')[1],
# info[champ].get('date')[2])
#print "\n\n"
#print info[champ].values()
i = 0
for item in records: #this prints out all records
    print "\n", "[%s] %s - %s/%s/%s - %s - %s-%s-%s" % (
        records[i][0], records[i][1]['player'], records[i][1]['stats']['kills'],
        records[i][1]['stats']['deaths'], records[i][1]['stats']['assists'],
        records[i][1]['outcome'], records[i][1]['date'][0],
        records[i][1]['date'][1], records[i][1]['date'][2])
    i = i + 1
print "\n" + "*" * 50
i = 0
search = []
for item in records:
    if champ in records[i][1]['player']:
        search.append(records[i][1])
    i = i + 1
if not search:
    print "no match"
    exit()
s = 0
for item in search:
    print "\n[%s] %s - %s/%s/%s - %s - %s-%s-%s" % (search[s]['gameid'],
        search[s]['player'], search[s]['stats']['kills'],
        search[s]['stats']['deaths'], search[s]['stats']['assists'],
        search[s]['outcome'], search[s]['date'][0], search[s]['date'][1],
        search[s]['date'][2])
    s = s + 1
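On the follow-up about pressing return with no input: the search uses a substring test, and in Python the empty string is a substring of every string ('' in 'Sona' is True), so an empty champ matches every record. A minimal sketch of one way around it, in Python 3 syntax, is below. The `find_records` helper and the sample `records` are hypothetical; it assumes the same (gameid, record) tuple layout as the code above and switches to an exact name comparison.

```python
def find_records(records, champ):
    """Return the record dicts whose 'player' equals champ exactly.

    `records` is assumed to be a list of (gameid, record_dict) tuples.
    """
    champ = champ.strip()
    if not champ:
        # '' is a substring of everything, so reject empty input explicitly.
        return []
    return [rec for _, rec in records if rec["player"] == champ]


# Hypothetical sample in the same shape as the parsed file.
records = [
    ("00003", {"player": "Sona", "outcome": "Loss"}),
    ("00007", {"player": "Lux", "outcome": "Win"}),
    ("00008", {"player": "Sona", "outcome": "Win"}),
]

for query in ("Sona", "", "Teemo"):
    matches = find_records(records, query)
    if not matches:
        print("no records available for %r" % query)
    for rec in matches:
        print(rec["player"], rec["outcome"])
```

If partial matches are wanted, the comparison can go back to `champ in rec["player"]`; the empty-input check is the part that stops everything from matching.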
I have a file of notes that I'm trying to convert to a dictionary. I got the script working, but it fails to output the data I'm looking for when the first column has repeated values.
In short, I take the file of commands or comments, which are separated by # as shown below. I split each line on #: the first column is the "key" and the rest is the comment or definition. Then I take the magic word I'm looking for, match it against the keys, and print the result.
The flashcards file is as follows:
> car # automobile 4 wheels and run
> washington dc # the capital of United States
> fedora # an operating distro
> cat file # reads the file
> car nissan # altima
> car nissan # altima ## first car
> car nissan # maxima
> car nissan # rougue
flashcard_dict = dict()
flashcard_file = open('FlashCards','r')
enter = input("Searching nemo: ")
firstcolumn_str_list = list()
for x in flashcard_file:
    flashcard_sprint = x.strip()
    flascard_clean = flashcard_sprint.split("#",1)
    firstcolumn_str = flascard_clean[0]
    firstcolumn = firstcolumn_str.strip()
    firstcolumn_str_list.append(firstcolumn)
    secondcolumn = flascard_clean[1]
    flashcard_dict[firstcolumn] = secondcolumn
print
print ("###" * 3)
lista = list()
# this is version 4 - where lambda works but fails as it matches the string in all words.
# so if the word is "es" all patterns are matched that has "es" AND NOT the specific word
filter_object = filter(lambda a: enter in a, firstcolumn_str_list)
for x in filter_object:
    lista.append(x)
print (lista)
cc = 0
if cc < len(lista):
    for lambdatodictmatch in lista:
        if lambdatodictmatch in flashcard_dict:
            print (flashcard_dict[lambdatodictmatch])
        else:
            print ("NONEsense... nothing here")
else:
    print ("NONEsense... nothing here")
Again, it works, but when I search for car nissan I should get four responses; instead I only get the last one, "rougue", or I get "rougue" repeated four times.
What's the best way to accomplish this?
If keys may be repeated, then you should always use lists to hold the values, even for a single value:
if firstcolumn not in flashcard_dict:
    flashcard_dict[firstcolumn] = []
flashcard_dict[firstcolumn].append(secondcolumn)
instead of
flashcard_dict[firstcolumn] = secondcolumn
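As a side note, the same list-per-key idea can be written with `collections.defaultdict`, which creates the empty list on first access. This is only a sketch: the `lines` list below is sample data copied from the flashcards file, not read from disk.

```python
from collections import defaultdict

# Sample lines taken from the flashcards file in the question.
lines = [
    "car # automobile 4 wheels and run",
    "car nissan # altima",
    "car nissan # maxima",
    "car nissan # rougue",
]

flashcard_dict = defaultdict(list)  # a missing key starts out as []
for line in lines:
    key, value = line.split("#", 1)
    flashcard_dict[key.strip()].append(value.strip())

print(flashcard_dict["car nissan"])  # ['altima', 'maxima', 'rougue']
```

One thing to keep in mind: merely indexing a `defaultdict` with an unknown key inserts it, so use `in` when you only want to test for membership.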
EDIT:
Full working code with other changes
first I used shorter and more readable names for variables,
I read file at start and later use loop to ask for different cards.
I added command !keys to display all keys, and !exit to exit loop and finish program,
list(sorted(flashcards.keys())) gives all keys from dictionary without repeating values (and sorted)
I used io only to simulate file in memory - so everyone can simply copy and run this code (without creating file FlashCards) but you should use open(...)
text = '''car # automobile 4 wheels and run
washington dc # the capital of United States
fedora # an operating distro
cat file # reads the file
car nissan # altima
car nissan # altima ## first car
car nissan # maxima
car nissan # rougue
'''
import io
# --- constants ---
DEBUG = True
# --- functions ---
def read_data(filename='FlashCards'):
    if DEBUG:
        print('[DEBUG] reading file')
    flashcards = dict() # with `s` at the end because it keeps many flashcards
    #file_handler = open(filename)
    file_handler = io.StringIO(text)
    for line in file_handler:
        line = line.strip()
        parts = line.split("#", 1)
        key = parts[0].strip()
        value = parts[1].strip()
        if key not in flashcards:
            flashcards[key] = []
        flashcards[key].append(value)
    all_keys = list(sorted(flashcards.keys()))
    return flashcards, all_keys
# --- main ---
# - before loop -
# because words `key` and `keys` are very similar and it is easy to make mistake in code - so I added prefix `all_`
flashcards, all_keys = read_data()
print("#########")
# - loop -
while True:
    print() # empty line to make output more readable
    enter = input("Searching nemo (or command: !keys, !exit): ").strip().lower()
    print() # empty line to make output more readable
    if enter == '!exit':
        break
    elif enter == '!keys':
        #print( "\n".join(all_keys) )
        for key in all_keys:
            print('key>', key)
    elif enter.startswith('!'):
        print('unknown command:', enter)
    else:
        # keys which have `enter` only at the beginning
        #selected_keys = list(filter(lambda text: text.startswith(enter), all_keys))
        # keys which have `enter` in any place (at the beginning, in the middle, at the end)
        selected_keys = list(filter(lambda text: enter in text, all_keys))
        print('selected_keys:', selected_keys)
        if selected_keys: # instead of `0 < len(selected_keys)`
            for key in selected_keys:
                # `selected_keys` has to exist in `flashcards` so there is no need to check if `key` exists in `flashcards`
                print(key, '=>', flashcards[key])
        else:
            print("NONEsense... nothing here")
# - after loop -
print('bye')
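The question's own comment also mentions that a short search string such as "es" matches every key containing those letters. If whole-word matching is what's wanted, one possible approach is a regular expression with word boundaries. This is a sketch only: `keys_matching_whole_words` is a hypothetical helper, and `all_keys` is a hand-written sample of the keys from the file.

```python
import re


def keys_matching_whole_words(query, keys):
    """Return the keys in which `query` appears as whole word(s)."""
    query = query.strip()
    if not query:
        return []
    # \b is a word boundary; re.escape keeps the query literal.
    pattern = re.compile(r"\b" + re.escape(query) + r"\b")
    return [key for key in keys if pattern.search(key)]


all_keys = ["car", "car nissan", "cat file", "fedora", "washington dc"]

print(keys_matching_whole_words("car", all_keys))  # ['car', 'car nissan']
print(keys_matching_whole_words("ca", all_keys))   # []
```

Word boundaries only behave as expected when the query starts and ends with a letter, digit, or underscore, which holds for the keys shown here.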
I am trying to connect with elements that carry the contact numbers on each site. I was able to create the routine to get the numbers, extract the contact number with available formats and regex and the following code snippet to get the element
contact_elem = browser.find_elements_by_xpath("//*[contains(text(), '" + phone_num + "')]")
Considering the example of https://www.cssfirm.com/, the contact number appears in 2 locations, the top header and the bottom footer
The element texts accompanying the contact number are as follows :
<h3>CALL US TODAY AT (855) 910-7824</h3> - Footer
<span>Call Us<br>Today</span> (855) 910-7824 - Header
The extracted phone number matches perfectly while printing it out. For some reason, the element from the header part is not being detected.
I tried by searching for elements and even by deleting the footer element from the browser before executing the rest of the code.
What could be the reason for it to go undetected?
P.S: Below is the amateurish,uncorrected code. Efficiency edits/suggestions are welcome. The same code has been tested with various sites and works fine.
url = 'http://www.cssfirm.com/'
browser.get(url)
parsed = browser.find_element_by_tag_name('html').get_attribute('innerHTML')
s = BeautifulSoup(parsed, 'html.parser')
s = s.decode('utf-8')
phoneNumberRegex = '(\s*(?:\+?(\d{1,4}))?[-. (]*(\d{1,})[-. )]*(\d{3}|[A-Z0-9]+)[-. \/]*(\d{4}|[A-Z0-9]+)[-. \/]?(\d{4}|[A-Z0-9]+)?(?: *x(\d+))?\s*)'
custom_re = ['([0-9]{4,4} )([0-9]{3,3} )([0-9]{4,4})',
'([0-9]{3,3} )([0-9]{4,4} )([0-9]{4,4})',
'(\+[0-9]{2,2}-)([0-9]{4,4}-)([0-9]{4,4}-)(0)',
'(\([0-9]{3,3}\) )([0-9]{3,3}-)([0-9]{4,4})',
'(\+[0-9]{2,2} )(\(0\)[0-9]{4,4} )([0-9]{4,6})',
'([0-9]{5,5} )([0-9]{6,6})',
'(\+[0-9]{2,2}\(0\))([0-9]{4,4} )([0-9]{4,4})',
'(\+[0-9]{2,2} )([0-9]{3,3} )([0-9]{4,4} )([0-9]{3,3})',
'([0-9]{3,3}-)([0-9]{3,3}-)([0-9]{4,4})']
phones = []
phones = re.findall(phoneNumberRegex, s)
phone_num_list = ()
phone_num = ''
matched = 0
for phoneHeader in phones:
    #phoneHeader = phoneHeader.decode('utf-8')
    for ph_cnd in phoneHeader:
        for pttrn in custom_re:
            phones = re.findall(pttrn,ph_cnd)
            if(phones):
                phone_num_list = phones
                for x in phone_num_list:
                    phone_num = ''.join(x)
                    try:
                        contact_elem = browser.find_element_by_xpath("//*[contains(text(), '" + phone_num + "')]")
                        phone_num_txt = contact_elem.text
                        if(phone_num_txt):
                            matched = 1
                            break
                    except NoSuchElementException:
                        pass
            if(matched == 1):
                break
        if(matched == 1):
            break
    if(matched == 1):
        break
print("Phone number :",phone_num)  # <-- Perfect output
contact_elem  # <-- empty for header or just the footer element
EDIT
Code updated. Forgot an important piece. Moreover, there is sleep time given in between to give time for the page to load. Considering it trivial, I haven't included them for a quick read.
I found a temporary solution by searching for the partial link text, as the number also comes on the link.
contact_elem2 = browser.find_element_by_partial_link_text(phone_num)
However, this does not answer the generic question as to why that text was ignored within the element.
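One possible explanation, which is not confirmed anywhere in this thread: in XPath 1.0, contains(text(), '...') converts the node-set text() to a string using only its first node. In the header, the number comes after the <span>, so it sits in a later text node of the parent, while the first text node may be just whitespace. The sketch below illustrates that layout with the standard library's ElementTree on made-up markup modelled on the header snippet; the <a> wrapper and the whitespace are assumptions about the real page.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup modelled on the header snippet in the question.
header = ET.fromstring(
    "<a>\n  <span>Call Us<br/>Today</span> (855) 910-7824</a>"
)
phone_num = "(855) 910-7824"

first_text_node = header.text    # text before <span>: only whitespace here
span = header.find("span")
later_text_node = span.tail      # text after </span>: holds the number

print(repr(first_text_node))
print(repr(later_text_node))

# An XPath 1.0 expression that tests every text node of an element instead
# of only the first one. It is meant for the find_element_by_xpath call used
# in the question and is not executed in this sketch.
xpath_any_text_node = "//*[text()[contains(., '%s')]]" % phone_num
print(xpath_any_text_node)
```

The footer <h3> has a single text node containing the number, which would explain why that one is found by the original expression.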
I am using a Python script to get the followers of a specific user. The script runs fine and returns the IDs of the followers, but when I use the user lookup API it only returns a few results. The script is like this:
#!/usr/bin/python
from twitter import *
import sys
import csv
import json
config = {}
execfile("/home/oracle/Desktop/twitter-1.17.1/config.py", config)
twitter = Twitter(
    auth = OAuth(config["access_key"], config["access_secret"],config["consumer_key"], config["consumer_secret"]))
username = "#####"
query = twitter.followers.ids(screen_name = username)
print "found %d followers" % (len(query["ids"]))
for n in range(0, len(query["ids"]), 100):
    ids = query["ids"][n:n+100]
subquery = twitter.users.lookup(user_id = ids)
for user in subquery:
    print " [%s] %s" % ("*" if user["verified"] else " ", user["screen_name"])
# print json.dumps(user)
And it returns the output like this:
{u'next_cursor_str': u'0', u'previous_cursor': 0, u'ids': [2938672765, 1913345678, 132150958, 2469504797, 2162312397, 737550671029764097, 743699723786158082, 743503916885737473, 742612685632770048, 742487358826811392, 742384945121878020, 741959985127665664, 1541162424, 739102973830254592, 740198523724038144, 542050890, 739971273934176256, 2887662768, 738922874011013120, 738354749045669888, 737638395711791104, 737191937061584896, 329618583, 3331556957, 729645523515396096, 2220176421, 162387597, 727099914635874304, 726665274737475584, 725406360406470657, 938760691, 715260034335305729, 723912842320158720, 538208881, 2188791158, 723558257541828608, 1263571466, 720182865275842564, 719947801598259200, 636067084, 719412219168038912, 719199478260043776, 715921761158574080........ ], u'next_cursor': 0, u'previous_cursor_str': u'0'}
When i use the user look up API it only returns 4 screen names like this:
found 1106 followers
[ ] In_tRu_dEr
[ ] amanhaider3
[ ] SaaddObaid
[ ] Soerwer
I want the screen names of all the IDs present, but it returns only 4. Can anyone help?
Your issue is in these 2 lines
(I assumed the second line is indented, although it is not in the question):
for n in range(0, len(query["ids"]), 100):
    ids = query["ids"][n:n+100]
Those lines create several ids slices, and each one overwrites the previous one.
In the first iteration ids holds the ids from 0 to 100,
then you overwrite it with the ids from 100 to 200, and so on,
until you reach the last iteration, from 1100 to 1106.
So after the loop ids only holds the last 6 ids,
and apparently only 4 of those 6 are returned by twitter.users.lookup.
To fix it you need to keep everything inside the for n loop,
like this:
for n in range(0, len(query["ids"]), 100):
    ids = query["ids"][n:n+100]
    subquery = twitter.users.lookup(user_id = ids)
    for user in subquery:
        print " [%s] %s" % ("*" if user["verified"] else " ", user["screen_name"])
This will work.
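The batching itself can be tried without touching the Twitter API. Here is a minimal sketch in Python 3 syntax: `chunks` is a small helper, and `fake_lookup` is a made-up stand-in for twitter.users.lookup that returns one dict per id. Neither is part of the twitter library.

```python
def chunks(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def fake_lookup(user_ids):
    # Stand-in for twitter.users.lookup: one result per requested id.
    return [{"screen_name": "user%d" % uid, "verified": False} for uid in user_ids]


follower_ids = list(range(1106))  # same count as in the question

screen_names = []
for batch in chunks(follower_ids, 100):
    # The lookup happens inside the loop, once per batch of up to 100 ids.
    for user in fake_lookup(batch):
        screen_names.append(user["screen_name"])

print("looked up %d of %d followers" % (len(screen_names), len(follower_ids)))
```

With 1106 ids this makes 12 lookup calls: eleven full batches of 100 and a final batch of 6.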
I was recently helped with scraping the scores from a Yahoo NHL page and printing each team with its scores. Here is my code:
from bs4 import BeautifulSoup
from urllib.request import urlopen
url = urlopen("http://sports.yahoo.com/nhl/scoreboard?d=2013-01-19")
content = url.read()
soup = BeautifulSoup(content)
def yahooscores():
    results = {}
    for table in soup.find_all('table', class_='scores'):
        for row in table.find_all('tr'):
            scores = []
            name = None
            for cell in row.find_all('td', class_='yspscores'):
                link = cell.find('a')
                if link:
                    name = link.text
                elif cell.text.isdigit():
                    scores.append(cell.text)
            if name is not None:
                results[name] = scores
    for name, scores in results.items():
        print ('%s: %s' % (name, ', '.join(scores)) + '.')
yahooscores()
Now, first of all: I am associating this stuff in a function because I am going to have to change the url constantly to get all the values of every day of the January month.
The issue here is that while I can print the scores and team text fine, I am trying to accomplish this:
Ottawa: 1, 1, 2.
Winnipeg: 1, 0, 0.

Pittsburgh: 2, 0, 1.
Philadelphia: 0, 1, 0.
See, my code doesn't do that. I was in the process of trying to get that to happen, but what is complicating the process is that the tables are all under the same class of "scores" and seemingly, I can't find anything different amongst them.
In a nutshell, associate teams correctly with each other and have a space in-between for organization.
The trouble is, you're putting the results for each team into a dict, but there's no order in a dict and so you lose track of which scores came from which table on the page (i.e. which game).
To get around this, you could just print the results directly instead of storing them, and add an extra newline in the outer for loop:
def yahooscores():
    results = {}
    for table in soup.find_all('table', class_='scores'):
        for row in table.find_all('tr'):
            scores = []
            name = None
            for cell in row.find_all('td', class_='yspscores'):
                link = cell.find('a')
                if link:
                    name = link.text
                elif cell.text.isdigit():
                    scores.append(cell.text)
            if name is not None:
                print ('%s: %s' % (name, ', '.join(scores)) + '.')
        print()
yahooscores()
Or, if you want to store the scores and show them later, you can store the teams for each game as well and use them to group the results:
def yahooscores():
    results = {}
    games = []
    for table in soup.find_all('table', class_='scores'):
        teams = []
        for row in table.find_all('tr'):
            scores = []
            name = None
            for cell in row.find_all('td', class_='yspscores'):
                link = cell.find('a')
                if link:
                    name = link.text
                elif cell.text.isdigit():
                    scores.append(cell.text)
            if name is not None:
                results[name] = scores
                teams.append(name)
        games.append(teams)
    for teams in games:
        for name in teams:
            scores = results[name]
            print ('%s: %s' % (name, ', '.join(scores)) + '.')
        print()
yahooscores()
The problem is that you're treating the table as a flat list of teams, rather than as a list of games, each of which has two teams in it.
The clean way to fix that is to change the way you parse the page so you loop over the games, then, for each game, store something like a pair of names-and-scores.
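A rough sketch of that cleaner structure, with the scraping left out: `rows` below is invented sample data standing in for what the BeautifulSoup loops would produce (table index, team name, scores), and `format_games` is a hypothetical helper. Each game becomes a list of (name, scores) pairs, so the grouping is never lost.

```python
# Hypothetical parsed rows: (table_index, team_name, period_scores).
rows = [
    (0, "Ottawa", ["1", "1", "2"]),
    (0, "Winnipeg", ["1", "0", "0"]),
    (1, "Pittsburgh", ["2", "0", "1"]),
    (1, "Philadelphia", ["0", "1", "0"]),
]

# One inner list per game; assumes table indexes arrive in order 0, 1, 2, ...
games = []
for table_index, name, scores in rows:
    if table_index == len(games):
        games.append([])
    games[table_index].append((name, scores))


def format_games(games):
    """Return one text block per game, separated by a blank line."""
    blocks = []
    for game in games:
        lines = ["%s: %s." % (name, ", ".join(scores)) for name, scores in game]
        blocks.append("\n".join(lines))
    return "\n\n".join(blocks)


print(format_games(games))
```

In the real function, the `games.append([])` step would happen once per table in the outer loop rather than by comparing indexes.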
But there's also a quick&dirty solution: If you kept the teams in order, you could just pair them up after the fact. A plain dict has no guaranteed order (before Python 3.7), but an OrderedDict preserves the order of insertion. So, just import collections and change results = {} to results = collections.OrderedDict().
(Although if the only thing you ever do with this dict is iterate its items(), I'm not sure why you want a dictionary at all. Just do results = [], replace results[name] = scores with results.append((name, scores)), and then iterate over results instead of results.items().)
And now, if you want to print them out in pairs… well, you can make an iterator over pairs from any iterable very easily. For example:
def pairs(iterable):
    return zip(*[iter(iterable)]*2)
for (name1, scores1), (name2, scores2) in pairs(results.items()):
    print ('%s: %s' % (name1, ', '.join(scores1)) + '.')
    print ('%s: %s' % (name2, ', '.join(scores2)) + '.')
    print()
Or, if you can't figure out what that means, something hacky like this works fine too:
pair_done = False
for name, scores in results.items():
    print ('%s: %s' % (name, ', '.join(scores)) + '.')
    if pair_done:
        print()
    pair_done = not pair_done
… or:
for i, (name, scores) in enumerate(results.items()):
    print ('%s: %s' % (name, ', '.join(scores)) + '.')
    if i % 2:
        print()
I am using Django creating a site for records for football teams, I have a "pretty" display with CSS, etc, but as a backup / old school version I am trying to have the code write the information to a basic .html file that is using rjust, ljust, etc to format text. In the code below if I remove the link code, and just display the string for the team's name everything lines up properly. Once I add the HTML for the link though the columns do not line up and are completely out of whack. What have I done wrong?
standings = Team.objects.filter(active=True).order_by('-wp')
output += '<pre>\n'
output += '%s %s %s %s\n' % (str('Rk').rjust(3), str('Team').ljust(50), str('W').rjust(2), str('L').rjust(2))
output += '%s %s %s %s\n' % (str('--').rjust(3), str('----').ljust(50), str('-').rjust(2), str('-').rjust(2))
for row in mpi:
    the_team = '<a href="%s">%s</a>' % (row.slug, row.name)
    output += '%s %s %s %s\n' % (str(row.rank).rjust(3), str(the_team).ljust(50), str(row.won).rjust(2), str(row.lost).rjust(2))
output += '</pre>'
The string '<a href="%s">%s</a>' contains characters that aren't rendered by the browser: you're padding the source code, not the rendered output.
Replace str(the_team).ljust(50) with str(the_team).ljust(50+len(row.slug)+15), because there are 15 invisible chars (i.e. <a href=""></a>) plus the slug.
Update: You may want to remove some of the str calls. If a value is already a string, you don't need to convert it to a string again. You may also split the long lines into shorter ones.
output = '<pre>\n'
output += '%s %s%s%s\n' % ('Rk'.rjust(3), 'Team'.ljust(50), 'W'.rjust(2), 'L'.rjust(2))
output += '%s %s%s%s\n' % ('--'.rjust(3), '----'.ljust(50), '-'.rjust(2), '-'.rjust(2))
for team in teams:
    link = '<a href="%s">%s</a>' % (team.slug, team.name)
    link = link.ljust(50 + len(team.slug) + 15)
    rank, won, lost = str(team.rank).rjust(3), str(team.won).rjust(2), str(team.lost).rjust(2)
    output += '%s %s%s%s\n' % (rank, link, won, lost)
output += '</pre>'
print output
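An alternative that avoids counting the invisible characters is to compute the padding from the visible name and append it after the closing tag. The sketch below is in Python 3 syntax; `padded_link` and `visible_length` are hypothetical helpers, the sample `teams` tuples are invented, and the exact href format is an assumption.

```python
import re


def padded_link(slug, name, width):
    """Return a link followed by spaces so the visible text fills `width`."""
    padding = " " * max(0, width - len(name))
    return '<a href="%s">%s</a>%s' % (slug, name, padding)


def visible_length(html):
    """Length of the text once tags are stripped (rough check only)."""
    return len(re.sub(r"<[^>]+>", "", html))


# Invented sample rows: (rank, slug, name, won, lost).
teams = [
    (1, "ottawa", "Ottawa", 10, 2),
    (2, "winnipeg", "Winnipeg", 8, 4),
]

lines = []
for rank, slug, name, won, lost in teams:
    lines.append("%s %s %s %s" % (
        str(rank).rjust(3), padded_link(slug, name, 50),
        str(won).rjust(2), str(lost).rjust(2)))

print("<pre>\n" + "\n".join(lines) + "\n</pre>")
```

Because the padding depends only on the visible name, the columns stay aligned no matter how long each slug is.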