I have a data frame with text entries in dataframe['text'], as well as a list of features to compute for each entry. Not all features work for all text entries, so I was trying to compute everything possible without manually checking which feature works for which entry. I wanted the loop to continue past the point where it errors:
with Processor('config.yaml', 'en') as doc_proc:
    try:
        for j in range (0,len(features)):
            for i in range (0, len(dataframe['text'])) :
                doc = doc_proc.analyze(dataframe['text'][i], 'string')
                result = (doc.compute_features([features[j]]))
                dataframe.loc[dataframe.index[i], [features[j]]] = list(result.values())
    except:
        continue
but I got SyntaxError: unexpected EOF while parsing. The loop works without the try, so I understand the try is the cause, but I can't find the correct syntax.
Put the try/except inside the loop. Then it will resume with the next iteration.
with Processor('config.yaml', 'en') as doc_proc:
    for feature in features:
        for i in range (0, len(dataframe['text'])):
            try:
                doc = doc_proc.analyze(dataframe['text'][i], 'string')
                result = (doc.compute_features([feature]))
                dataframe.loc[dataframe.index[i], [feature]] = list(result.values())
            except Exception:
                # skip entries this feature cannot handle and move on
                pass
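The same pattern can be sketched without Processor. In the sketch below, compute_feature and apply_features are hypothetical stand-ins (they are not part of the library in the question), and the "features" are just small functions that fail on some inputs. It only illustrates that a try/except placed inside the inner loop skips the failing pair and lets both loops carry on.

```python
def compute_feature(text, feature):
    # Hypothetical stand-in for doc.compute_features: fails on some inputs.
    if feature == "first_char":
        return text[0]    # IndexError on an empty string
    if feature == "as_int":
        return int(text)  # ValueError on non-numeric text
    raise KeyError(feature)


def apply_features(texts, features):
    results = {}
    for feature in features:
        for i, text in enumerate(texts):
            try:
                results[(i, feature)] = compute_feature(text, feature)
            except Exception:
                # Only this (entry, feature) pair is skipped; the loops go on.
                continue
    return results


results = apply_features(["12", "", "abc"], ["first_char", "as_int"])
```

Here `continue` is legal because it sits inside a loop. In the question it was attached to a try that wrapped the loops from the outside.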
I'm scraping Tripadvisor with Scrapy ( https://www.tripadvisor.com/Hotel_Review-g189541-d15051151-Reviews-CitizenM_Copenhagen_Radhuspladsen-Copenhagen_Zealand.html ).
One of the items I scrape is the count and radius of nearby attractions, as well as the count and radius of nearby restaurants. This information is not always present ( https://www.tripadvisor.com/Hotel_Review-g189541-d292667-Reviews-Strandmotellet_Greve-Copenhagen_Zealand.html ). If it is not present, I get this error message: "IndexError: list index out of range" ( https://pastebin.com/pphM8FSM )
I tried to write a try/except construction without any success:
try:
    nearby_restaurants0_attractions1_distance = response.css("._1aFljvmJ::text").extract()
except IndexError:
    nearby_restaurants0_attractions1_distance = [None,None]
items["hotel_nearby_restaurants_distance"] = nearby_restaurants0_attractions1_distance[1]
items["hotel_nearby_attractions_distance"] = nearby_restaurants0_attractions1_distance[2]
Thanks a lot for your help!
List indices are zero-based, not one-based. If you are expecting a two-item list, you need to modify your last two lines to use [0] and [1] instead of [1] and [2]:
items["hotel_nearby_restaurants_distance"] = nearby_restaurants0_attractions1_distance[0]
items["hotel_nearby_attractions_distance"] = nearby_restaurants0_attractions1_distance[1]
I am not sure the IndexError was coming from when the data was missing, either. It might have just been hitting this bug even when the data was present. You may need to catch a different exception if the data is missing.
Answer for everybody who is interested:
Scrapy looks for matches to fill nearby_restaurants0_attractions1_distance, and if nothing is found, extract() returns an empty list rather than raising. So there is no IndexError at that stage.
The IndexError occurs later, when items reads individual elements of the list, which are not present when Scrapy returned an empty list. [The pastebin also shows, in the line above the IndexError, that the problem was with items.]
nearby_restaurants0_attractions1_distance = response.css("._1aFljvmJ::text").extract()
try:
    items["hotel_nearby_restaurants_distance"] = nearby_restaurants0_attractions1_distance[1]
except IndexError:
    items["hotel_nearby_restaurants_distance"] = None
try:
    items["hotel_nearby_attractions_distance"] = nearby_restaurants0_attractions1_distance[2]
except IndexError:
    items["hotel_nearby_attractions_distance"] = None
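The two try/except blocks repeat the same idea, so they could be folded into a small helper. This is only a sketch: safe_get is a hypothetical name, not a Scrapy function, and the empty list stands in for what extract() gives back when the selector matches nothing.

```python
def safe_get(values, index, default=None):
    """Return values[index], or default when the index is out of range."""
    try:
        return values[index]
    except IndexError:
        return default


# Stand-in for response.css(...).extract() on a page without the data.
distances = []

items = {
    "hotel_nearby_restaurants_distance": safe_get(distances, 1),
    "hotel_nearby_attractions_distance": safe_get(distances, 2),
}
```

With a populated list, the helper returns the element as usual, so the same lines work for both kinds of page.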
I am trying to get an integer input from tkinter, but I keep getting an error and I'm not sure why.
The error is :
value = int(enter_box.get())
ValueError: invalid literal for int() with base 10:
My code:
enter_box = Entry(win,bd = 5)
enter_box.pack(side = TOP)
value = int(enter_box.get()) # this is the line that keeps having the error
value = (int(value))
value = ((value) -1)
results = (results[value])
print (results)
It should just get an integer from the user's input that I can add to and subtract from.
You're calling the get method about one millisecond after creating the widget. – Bryan Oakley
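The statements need to be rearranged to avoid this error. At the point where get is called, the entry is still empty, so int() receives an empty string and raises ValueError. Read the entry in a callback that runs after the user has typed something, for example one bound to a button.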
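A minimal sketch of that rearrangement follows. The names parse_index, build_window and on_submit are hypothetical, and the Submit button and output label are assumptions added for illustration, since the question does not show how the value should be triggered or displayed. The parsing is kept in a plain function so it can be checked without opening a window.

```python
def parse_index(text, size):
    """Convert 1-based user input to a 0-based index, or None if invalid."""
    try:
        value = int(text.strip())
    except ValueError:
        return None  # empty or non-numeric input
    index = value - 1
    return index if 0 <= index < size else None


def build_window(results):
    # Imported here so parse_index can be used without a display.
    import tkinter as tk

    win = tk.Tk()
    enter_box = tk.Entry(win, bd=5)
    enter_box.pack(side=tk.TOP)
    output = tk.Label(win, text="")

    def on_submit():
        # Runs only when the button is clicked, after the user has typed.
        index = parse_index(enter_box.get(), len(results))
        output.config(text="invalid input" if index is None else str(results[index]))

    tk.Button(win, text="Submit", command=on_submit).pack(side=tk.TOP)
    output.pack(side=tk.TOP)
    return win


# To try the GUI: build_window(["a", "b", "c"]).mainloop()
```

The key difference from the question is that enter_box.get() is called inside on_submit, not right after the widget is created.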
I have Python code, using the pandas module, that reads data from a .csv file into a pandas dataframe. I then have to compare values from a list with values in the dataframe. Because some of the indexes defined in the list don't exist in the dataframe, I get this error:
for i in sorted(thresholds.keys()):
    current=acme_current_data.loc[i, 'Recent-Server']
KeyError: u'the label [422] is not in the [index]
I need help finding out how to check whether the index exists before continuing, to avoid the error. Re-indexing the dataframe or checking its length are not useful solutions in my case.
I tried things like these but they don't work:
for i in sorted(thresholds.keys()):
    if acme_current_data.loc[i, 'Recent-Server']:
        current=acme_current_data.loc[i, 'Recent-Server']
Or:
for i in sorted(thresholds.keys()):
    try:
        current=acme_current_data.loc[i, 'Recent-Server']
    except INDEX_ERROR:
        print "Error"
Thanks in advance.
Here is a complete (simplified) example that reproduces the error. First, create a source.csv file with this content to be processed:
INVITE,Requests,60,77340232,13674,59,74062475,13504
Retransmissions,0,5387,34,0,114838,2474
100,Trying,57,77039746,13590,59,73752071,13420
180,Ringing,47,37411523,7067,41,36984407,6982
486,Busy Here,2,3689189,819,2,3689238,819
487,Terminated,13,21531195,3687,13,21531766,3687
488,Not Acceptable,0,39326,24,0,30665,22
491,Req Pending,0,121,4,0,118,4
4xx,Client Error,0,1,1,0,1,1
Then, create a test.py with the code below. If I can find a way to check whether current_data.loc[i, 'Recent-Server'] exists before assigning it with current=current_data.loc[i, 'Recent-Server'], my problem will be solved. Any suggestions?
import os, sys
import pandas as pd
def compare(name,current_data,thresholds):
    reference=current_data.loc['INVITE','Recent-Server']
    # Check if we have INVITES events
    if reference == '0':
        print "{}: critical status".format(name)
        return
    for i in sorted(thresholds.keys()):
        try:
            current=current_data.loc[i, 'Recent-Server']
            if current != '0':
                valor=thresholds[i]
        except IndexError:
            print "Index Error"
clear="source.csv"
current = pd.read_csv(clear, names=['Message','Event','Recent-Server','Total-Server','PerMax-Server','Recent-Client','Total-Client','PerMax-Client'])
current.set_index("Message", inplace=True)
responses_all=("100", "180", "181", "182", "183", "200", "5xx")
# Thresholds for each event type
thresholds_mia={
    responses_all[0]: ["value1"], #100 Trying
    responses_all[1]: ["value2"], #180 Ringing
    responses_all[2]: ["value3"], #181 Forwarded
    responses_all[3]: ["value4"], #182 Queued
    responses_all[4]: ["value5"], #183 Progress
    responses_all[5]: ["value6"], #200 OK
    responses_all[6]: ["value7"] #5xx Server Error
}
# Main
compare("Name",current,thresholds_mia)
Thanks for putting a complete code example, that is very helpful. Both suggestions made in my comment work:
Option 1: use the right exception
If you replace except IndexError in your code with except KeyError, your code will print "Index Error" five times. Snippet in question:
for i in sorted(thresholds.keys()):
    try:
        current = current_data.loc[i, 'Recent-Server']
        if current != '0':
            valor = thresholds[i]
    except KeyError: # <------------------------ use the right exception
        print("Index Error")
Option 2: check the index for membership before accessing
Alternatively, you can check the dataframe's index before accessing, like so:
for i in sorted(thresholds.keys()):
    if i in current_data.index:
        current = current_data.loc[i, 'Recent-Server']
        if current != '0':
            valor = thresholds[i]
Note that I check the dataframe's index with i in current_data.index. What you tried (i in current_data.loc.index) raises an AttributeError, since it's not loc that has the index but the dataframe current_data itself.
Both these techniques work. I prefer #2.
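For a version that runs without source.csv, here is a small sketch of option 2. The dataframe contents and the three thresholds are made-up sample values, trimmed down from the question, and the found and missing containers are hypothetical names used only to make the outcome visible.

```python
import pandas as pd

# Made-up sample data with string labels, mirroring the question's layout.
current_data = pd.DataFrame(
    {"Recent-Server": ["60", "57", "47"]},
    index=["INVITE", "100", "180"],
)
thresholds = {"100": ["value1"], "180": ["value2"], "181": ["value3"]}

found, missing = {}, []
for label in sorted(thresholds):
    if label in current_data.index:
        # Safe: the label is known to exist, so .loc cannot raise KeyError.
        found[label] = current_data.loc[label, "Recent-Server"]
    else:
        missing.append(label)
```

After the loop, found holds the values for the labels that exist and missing lists the ones that were skipped.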
I've googled around a bit and it seems like nobody has had this problem before so here we go:
Function:
def split_time(time_string):
    time = time_string.split('T')
    time_array = time[-1]
    return time_array
Call of Function:
class Entry():
    def __init__(self,start,end,summary,description):
        self.start_date = split_time(start)
        self.end_date = split_time(end)
        self.summary = summary
        self.description = description
My function receives a string containing a date-time in this format: 2018-03-17T09:00:00+01:00
I want to cut it at 'T', so I used time = time_string.split('T'), which worked just fine!
The output of time is ['2018-05-08', '12:00:00+02:00'].
So now I wanted to split it some more and ran into the following error:
While I can access time[0], which delivers the output 2018-05-08, I can't access time[1]; I just get an index out of range error.
To me it seems like time does contain an array with two strings inside, judging by its output, so I'm really at a loss right now.
Any help would be appreciated =)
(and an explanation too!)
Use item[-1] to access the last item in the list.
Still unsure why item[1] would throw an error for a list with two items in it.
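One possible explanation, offered as an assumption because the question does not show the failing input: if any of the strings passed to the function contains no 'T' (a date without a time, for instance), split('T') returns a one-element list. In that case index 1 is out of range, while index -1 still works and returns the whole string. A short sketch of that behaviour, using a made-up date-only value:

```python
def split_time(time_string):
    parts = time_string.split('T')
    # [-1] is always valid, because split returns at least one element.
    return parts[-1]


with_time = "2018-03-17T09:00:00+01:00".split('T')
date_only = "2018-03-17".split('T')  # hypothetical input without a 'T'

try:
    date_only[1]
    index_one_failed = False
except IndexError:
    index_one_failed = True
```

Printing the length of the list for every input, not just the first one, would show whether this is what happens in the original program.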