Feedparser Python Error : KeyError : 'title' - python

I have seen a lot of KeyCount Errors online but none of them quite match the troubles that I'm having. I am using feed parser to try and create a one run application that accesses all the URLs in a text file and outputs all the entries in each URL. When I run this code :
import feedparser as f
with open('addresses.rtf', 'r') as addresses:
for line in addresses:
d = f.parse(line)
print d["feed"]["title"]
print ""
print d.feed.subtitle
print ""
for post in d.entries:
print post.title
print post.link
print ""
I get this error message :
Traceback (most recent call last):
File "/Users/Josh/Desktop/Feed Parser Python Project/init.py", line 7, in <module>
print d["feed"]["title"]
File "build/bdist.macosx-10.6-intel/egg/feedparser.py", line 375, in __getitem__
return dict.__getitem__(self, key)
KeyError: 'title'
My text file is just a .rtf file that has a URL on each line (3 lines).
If someone could give us a hand please let me know and if you need any extra info please don't hesitate to ask. Any help is welcome. Thank you!

It's hard to tell exactly what is wrong here, but in the general case, any KeyError is because the data you are trying to access is not exactly what you expected. It's best to throw your assumptions out the window and take a close look at the actual data that your code is working with.
For debugging, I would recommend taking a close look at what happens before the error. What is the value of line as you read the file? Is it correct? What is the value of d? Did the call to f.parse(line) result in a valid object?

Related

IndexError: list index out of range (while using Googletrans API)

I'm trying to translate a yml file (which exists over 4000 lines) to dutch using the Googletrans API.
This is my python code:
from googletrans import Translator
import re
translator = Translator()
with open("Results2/ValuesfileNotTranslated.yml") as a_file: # Not translated File
for object in a_file:
stripped_object = object.rstrip()
found = False
file = open("Results2/ValuesfileTranslated.yml", "a") #Translated file
if not stripped_object.strip():
file.writelines("\n")
elif "# Do not translate" in stripped_object: #Skips lines with "# Do Not Translates"
counter_DoNotTranslate += 1
file.writelines(stripped_object + "\n")
else: #Translates english to dutch
counter_Translate += 1
results = translator.translate(stripped_object, src='en', dest='nl')
translatedText = results.text
file.writelines(re.split('|=', translatedText, maxsplit=1)[-1].strip() + "\n" )
But when I try to run the code it works until line 276.
This is the YAML file I want to translate.
...
E-mail
E-Mail oder Name
Password
Toggle dropdown
Are you sure?
Yes, sure
Cancel
Back
Download
Submit
Add
Edit
Delete
Beta
"This new functionality is not yet optimized for all browsers. Please create a helpdesk ticket to inform us about errors, providing details including the browser and browser version you are using. As a workaround, we ask you to try another browser to continue (Chrome or Firefox)"
Please enable Javascript in your browser!! # This is line 276
http://www.enable-javascript.com/
Read more
Read less
Close
...
After line 276 I get this error:
Traceback (most recent call last):
File "/Users/AndreB/Library/Mobile Documents/com~apple~CloudDocs/Work/Freelance/ProjectYML/Programma/python/vertaling.py", line 29, in <module>
results = translator.translate(stripped_object, src='en', dest='nl') #Choose a source and a destination
File "/Users/AndreB/Library/Python/3.9/lib/python/site-packages/googletrans/client.py", line 222, in translate
translated_parts = list(map(lambda part: TranslatedPart(part[0], part[1] if len(part) >= 2 else []), parsed[1][0][0][5]))
IndexError: list index out of range
I can't figure out what the problem is with my code.
Does anyone have an idea how I can fix this?
I know this is an old post but I just had this problem and found a solution. The issue was that Googletrans didn't know what to do with empty lines. I was translating by line so I just added:
if line != "\n":
before
translator.translate(line)
That's it - it worked!
You may need to modify if you're not translating by line, but that's the idea.
Good luck.

Don't understand this ConfigParser.InterpolationSyntaxError

So I have tried to write a small config file for my script, which should specify an IP address, a port and a URL which should be created via interpolation using the former two variables. My config.ini looks like this:
[Client]
recv_url : http://%(recv_host):%(recv_port)/rpm_list/api/
recv_host = 172.28.128.5
recv_port = 5000
column_list = Name,Version,Build_Date,Host,Release,Architecture,Install_Date,Group,Size,License,Signature,Source_RPM,Build_Host,Relocations,Packager,Vendor,URL,Summary
In my script I parse this config file as follows:
config = SafeConfigParser()
config.read('config.ini')
column_list = config.get('Client', 'column_list').split(',')
URL = config.get('Client', 'recv_url')
If I run my script, this results in:
Traceback (most recent call last):
File "server_side_agent.py", line 56, in <module>
URL = config.get('Client', 'recv_url')
File "/usr/lib64/python2.7/ConfigParser.py", line 623, in get
return self._interpolate(section, option, value, d)
File "/usr/lib64/python2.7/ConfigParser.py", line 691, in _interpolate
self._interpolate_some(option, L, rawval, section, vars, 1)
File "/usr/lib64/python2.7/ConfigParser.py", line 716, in _interpolate_some
"bad interpolation variable reference %r" % rest)
ConfigParser.InterpolationSyntaxError: bad interpolation variable reference '%(recv_host):%(recv_port)/rpm_list/api/'
I have tried debugging, which resulted in giving me one more line of error code:
...
ConfigParser.InterpolationSyntaxError: bad interpolation variable reference '%(recv_host):%(recv_port)/rpm_list/api/'
Exception AttributeError: "'NoneType' object has no attribute 'path'" in <function _remove at 0x7fc4d32c46e0> ignored
Here I am stuck. I don't know where this _remove function is supposed to be... I tried searching for what the message is supposed to tell me, but quite frankly I have no idea. So...
Is there something wrong with my code?
What does '< function _remove at ... >' mean?
There was indeed a mistake in my config.ini file. I did not regard the s at the end of %(...)s as a necessary syntax element. I suppose it refers to "string" but I couldn't really confirm this.
My .ini file for starting the Python Pyramid server had a similar problem.
And to use the variable from the .env file, I needed to add the following: %%(VARIEBLE_FOR_EXAMPLE)s
But I got other problems, and I solved them with this: How can I use a system environment variable inside a pyramid ini file?

Not clear on why my function is returning none

I have very limited coding background except for some Ruby, so if there's a better way of doing this, please let me know!
Essentially I have a .txt file full of words. I want to import the .txt file and turn it into a list. Then, I want to take the first item in the list, assign it to a variable, and use that variable in an external request that sends off to get the definition of the word. The definition is returned, and tucked into a different .txt file. Once that's done, I want the code to grab the next item in the list and do it all again until the list is exhausted.
Below is my code in progress to give an idea of where I'm at. I'm still trying to figure out how to iterate through the list correctly, and I'm having a hard time interpreting the documentation.
Sorry in advance if this was already asked! I searched, but couldn't find anything that specifically answered my issue.
from __future__ import print_function
import requests
import urllib
from bs4 import BeautifulSoup
def get_definition(x):
url = 'http://services.aonaware.com/DictService/Default.aspx?action=define&dict=wn&query={0}'.format(x)
html = urllib.urlopen(url).read()
soup = BeautifulSoup(html, "lxml")
return soup.find('pre', text=True)[0]
lines = []
with open('vocab.txt') as f:
lines = f.readlines()
lines = [line.strip() for line in lines]
definitions = []
for line in lines:
definitions.append(get_definition(line))
out_str = '\n'.join(definitions)
with open('definitions.txt', 'w') as f:
f.write(out_str)
the problem I'm having is
Traceback (most recent call last):
File "WIP.py", line 20, in <module>
definitions.append(get_definition(line))
File "WIP.py", line 11, in get_definition
return soup.find('pre', text=True)[0]
File "/Library/Python/2.7/site-packages/bs4/element.py", line 958, in __getitem__
return self.attrs[key]
KeyError: 0
I understand that soup.find('pre', text=True) is returning None, but not why or how to fix it.
your problem is that find() returns a single result not a list. The result is a dict-like object so it tries to find the key 0 which it cannot.
just remove the [0] and you should be fine
Also soup.find(...) is not returning None. It is returning an answer! If it were returning None you would get the error
NoneType has no attribute __getitem__
Beautiful soup documentation for find()

TypeError: in Python

I have an issue, where a function returns a number. When I then try to assemble a URL that includes that number I am met with failure.
Specifically the error I get is
TypeError: cannot concatenate 'str' and 'NoneType' objects
Not sure where to go from here.
Here is the relevant piece of code:
# Get the raw ID number of the current configuration
configurationID = generate_configurationID()
# Update config name at in Cloud
updateConfigLog = open(logBase+'change_config_name_log.xml', 'w')
# Redirect stdout to file
sys.stdout = updateConfigLog
rest.rest(('put', baseURL+'configurations/'+configurationID+'?name=this_is_a_test_', user, token))
sys.stdout = sys.__stdout__
It works perfectly if I manually type the following into rest.rest()
rest.rest(('put', http://myurl.com/configurations/123456?name=this_is_a_test_, myusername, mypassword))
I have tried str(configurationID) and it spits back a number, but I no longer get the rest of the URL...
Ideas? Help?
OK... In an attempt to show my baseURL and my configurationID here is what I did.
print 'baseURL: '+baseURL
print 'configurationID: '+configurationID
and here is what I got back
it-tone:trunk USER$ ./skynet.py fresh
baseURL: https://myurl.com/
369596
Traceback (most recent call last):
File "./skynet.py", line 173, in <module>
main()
File "./skynet.py", line 30, in main
fresh()
File "./skynet.py", line 162, in fresh
updateConfiguration()
File "./skynet.py", line 78, in updateConfiguration
print 'configurationID: '+configurationID
TypeError: cannot concatenate 'str' and 'NoneType' objects
it-tone:trunk USER$
What is interesting to me is that the 369596 is the config ID, but like before it seems to clobber everything called up around it.
As kindall pointed out below, my generate_configurationID was not returning the value, but rather it was printing it.
# from generate_configurationID
def generate_configurationID():
dom = parse(logBase+'provision_template_log.xml')
name = dom.getElementsByTagName('id')
p = name[0].firstChild.nodeValue
print p
return p
Your configurationID is None. This likely means that generate_configurationID() is not returning a value. There is no way in Python for a variable name to "lose" its value. The only way, in the code you posted, for configurationID to be None is for generate_configurationID() to return None which is what will happen if you don't explicitly return any value.
"But it prints the configurationID right on the screen!" you may object. Sure, but that's probably in generate_configurationID() where you are printing it to make sure it's right but forgetting to return it.
You may prove me wrong by posting generate_configurationID() in its entirety, and I will admit that your program is magic.

GetAuthSubToken returns None

Hey guys, I am a little lost on how to get the auth token. Here is the code I am using on the return from authorizing my app:
client = gdata.service.GDataService()
gdata.alt.appengine.run_on_appengine(client)
sessionToken = gdata.auth.extract_auth_sub_token_from_url(self.request.uri)
client.UpgradeToSessionToken(sessionToken)
logging.info(client.GetAuthSubToken())
what gets logged is "None" so that does seem right :-(
if I use this:
temp = client.upgrade_to_session_token(sessionToken)
logging.info(dump(temp))
I get this:
{'scopes': ['http://www.google.com/calendar/feeds/'], 'auth_header': 'AuthSub token=CNKe7drpFRDzp8uVARjD-s-wAg'}
so I can see that I am getting a AuthSub Token and I guess I could just parse that and grab the token but that doesn't seem like the way things should work.
If I try to use AuthSubTokenInfo I get this:
Traceback (most recent call last):
File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/ext/webapp/__init__.py", line 507, in __call__
handler.get(*groups)
File "controllers/indexController.py", line 47, in get
logging.info(client.AuthSubTokenInfo())
File "/Users/matthusby/Dropbox/appengine/projects/FBCal/gdata/service.py", line 938, in AuthSubTokenInfo
token = self.token_store.find_token(scopes[0])
TypeError: 'NoneType' object is unsubscriptable
so it looks like my token_store is not getting filled in correctly, is that something I should be doing?
Also I am using gdata 2.0.9
Thanks
Matt
To answer my own question:
When you get the Token just call:
client.token_store.add_token(sessionToken)
and App Engine will store it in a new entity type for you. Then when making calls to the calendar service just dont set the authsubtoken as it will take care of that for you also.

Categories