How can I parse a YAML file from the web using PyYAML?

I need to get a YAML file from the web and parse it using PyYAML, but I can't seem to find a way to do it.
import urllib
import yaml
fileToBeParsed = urllib.urlopen("http://website.com/file.yml")
pythonObject = yaml.open(fileToBeParsed)
print pythonObject
The error produced when running this is:
AttributeError: 'module' object has no attribute 'open'
If it helps, I am using Python 2. Sorry if this is a silly question.

I believe you want yaml.load(fileToBeParsed) and I would suggest looking at urllib2.urlopen if not the requests module.
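As an illustration of that suggestion, here is a minimal Python 2 sketch, assuming http://website.com/file.yml (the asker's placeholder URL) serves valid YAML; it uses yaml.safe_load, which behaves like yaml.load but won't construct arbitrary Python objects from untrusted input:
import urllib2
import yaml

# Fetch the document, then hand the text to PyYAML.
response = urllib2.urlopen("http://website.com/file.yml")
pythonObject = yaml.safe_load(response.read())
print pythonObject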

Related

Getting AttributeError: object has no attribute 'PE' when trying to parse a PE file using the pefile reader

I am new to Python and have an assignment to create a Python script that reads any PE file, whether .exe or .dll. The problem occurs when I use the pefile reader located at https://github.com/erocarrera/pefile
I am following the usage exactly as below:
import pefile
pe = pefile.PE('/path/to/pefile.exe')
pe.OPTIONAL_HEADER.AddressOfEntryPoint
the first and second statement succeeded but the third one failed with:
AttributeError: 'PE' object has no attribute OPTIONAL_HEADER
Can someone help with this, please?

Python - JSON Load from file not working

So I am writing a basic multipurpose script which uses json to import a dictionary from a file, but for some reason it doesn't load properly. I've looked all over and can't find anything relating to my exact problem.
Here is my code:
import json
dicti = json.loads(open('database.db'))
print(str(dicti))
But then I get this error:
TypeError: JSON object must be str, not TextIOWrapper.
So does anyone have any ideas on what the problem is? Thanks in advance.
Note: Currently the file contains only:
{}
You want json.load for loading a file. json.loads is for loading from a string.
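A minimal sketch of that fix, assuming database.db is in the working directory and contains valid JSON (here just {}):
import json

# json.load reads from an open file object; json.loads expects a string.
with open('database.db') as f:
    dicti = json.load(f)
print(dicti)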

urllib2 download HTML file

Using urllib2 in Python 2.7.4, I can readily download an Excel file:
import urllib2

output_file = 'excel.xls'
url = 'http://www.nbmg.unr.edu/geothermal/GEOTHERM-30Jun11.xls'
file(output_file, 'wb').write(urllib2.urlopen(url).read())
This results in the expected file that I can use as I wish.
However, trying to download just an HTML file gives me an empty file:
output_file = 'webpage.html'
url = 'http://www.nbmg.unr.edu/geothermal/mapfiles/nvgeowel.html'
file(output_file, 'wb').write(urllib2.urlopen(url).read())
I had the same results using urllib. There must be something simple I'm missing or don't understand. How do I download an HTML file from a URL? Why doesn't my code work?
If you want to download files or simply save a webpage, you can use urlretrieve (from the urllib library) instead of using read and write.
import urllib
urllib.urlretrieve("http://www.nbmg.unr.edu/geothermal/mapfiles/nvgeowel.html","doc.html")
#urllib.urlretrieve("url","save as..")
If you need to set a timeout, put it at the start of your file:
import socket
socket.setdefaulttimeout(25)  # seconds
It's also Python 2.7.4 on my OS X 10.9, and the code works fine there.
So I think there may be another problem preventing it from working. Can you open "http://www.nbmg.unr.edu/geothermal/GEOTHERM-30Jun11.xls" in your browser?
This may not directly answer the question, but if you're working with HTTP and have sufficient privileges to install Python packages, I'd really recommend doing this with 'requests'. There's a related answer here: https://stackoverflow.com/a/13137873/45698
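For illustration, a minimal requests sketch under that assumption (requests installed, e.g. via pip install requests), using the asker's HTML URL:
import requests

url = 'http://www.nbmg.unr.edu/geothermal/mapfiles/nvgeowel.html'
response = requests.get(url)
# response.content holds the raw bytes of the body, suitable for binary-mode writing.
with open('webpage.html', 'wb') as f:
    f.write(response.content)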

python ultrajson: how to use?

I've just installed ultrajson (ujson) to see if I can't get the json decoding to go faster (string to object). However, I'm not seeing any examples of how to use it.
with regular json it's just
import json
my_object = json.loads(my_string)
Change the import statement to import ujson as json.
Then you can leave the other parts of your program as they are.
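For example, a minimal sketch assuming ujson is installed (pip install ujson); the JSON string here is just an illustrative placeholder:
import ujson as json

# ujson exposes the same loads/dumps interface as the standard library json module.
my_string = '{"name": "example", "value": 1}'
my_object = json.loads(my_string)
print(my_object)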

Get webpage contents with Python?

I'm using Python 3.1, if that helps.
Anyway, I'm trying to get the contents of this webpage. I Googled for a bit and tried different things, but they didn't work. I'm guessing this should be an easy task, but... I can't get it. :/
Results of urllib, urllib2:
>>> import urllib2
Traceback (most recent call last):
  File "<pyshell#0>", line 1, in <module>
    import urllib2
ImportError: No module named urllib2
>>> import urllib
>>> urllib.urlopen("http://www.python.org")
Traceback (most recent call last):
  File "<pyshell#2>", line 1, in <module>
    urllib.urlopen("http://www.python.org")
AttributeError: 'module' object has no attribute 'urlopen'
>>>
Python 3 solution
Thank you, Jason. :D.
import urllib.request
page = urllib.request.urlopen('http://services.runescape.com/m=hiscore/ranking?table=0&category_type=0&time_filter=0&date=1519066080774&user=zezima')
print(page.read())
If you're writing a project which installs packages from PyPI, then the best and most common library to do this is requests. It provides lots of convenient but powerful features. Use it like this:
import requests
response = requests.get('http://hiscore.runescape.com/index_lite.ws?player=zezima')
print(response.status_code)
print(response.content)
But if your project does not install its own dependencies, i.e. is limited to things built-in to the standard library, then you should consult one of the other answers.
Because you're using Python 3.1, you need to use the new Python 3.1 APIs.
Try:
urllib.request.urlopen('http://www.python.org/')
Alternatively, it looks like you're working from Python 2 examples. Write it in Python 2, then use the 2to3 tool to convert it. On Windows, 2to3.py is in \python31\tools\scripts. Can someone else point out where to find 2to3.py on other platforms?
Edit
These days, I write Python 2 and 3 compatible code by using six.
from six.moves import urllib
urllib.request.urlopen('http://www.python.org')
Assuming you have six installed, that runs on both Python 2 and Python 3.
If you ask me, try this one:
import urllib2
resp = urllib2.urlopen('http://hiscore.runescape.com/index_lite.ws?player=zezima')
and read it the normal way, i.e.
page = resp.read()
Good luck though
Mechanize is a great package for "acting like a browser", if you want to handle cookie state, etc.
http://wwwsearch.sourceforge.net/mechanize/
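As a rough sketch of what that looks like (my own illustration, not from the original answer), assuming mechanize is installed:
import mechanize

# Browser keeps cookie state across requests, much like a real browser.
br = mechanize.Browser()
br.set_handle_robots(False)  # optional: don't refuse pages disallowed by robots.txt
response = br.open('http://www.python.org')
print(response.read())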
You can use urllib2 and parse the HTML yourself.
Or try Beautiful Soup to do some of the parsing for you.
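A minimal Beautiful Soup sketch, assuming beautifulsoup4 is installed (pip install beautifulsoup4) and using Python 3's urllib.request in place of urllib2:
from urllib.request import urlopen
from bs4 import BeautifulSoup

html = urlopen('http://www.python.org').read()
soup = BeautifulSoup(html, 'html.parser')  # the built-in parser, no extra install needed
print(soup.title.string)                   # for example, print the page's <title> text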
You can also use the faster_than_requests package. It's very fast and simple:
import faster_than_requests as r
content = r.get2str("http://test.com/")
A solution that works with Python 2.x and Python 3.x:
try:
    # For Python 3.0 and later
    from urllib.request import urlopen
except ImportError:
    # Fall back to Python 2's urllib2
    from urllib2 import urlopen

url = 'http://hiscore.runescape.com/index_lite.ws?player=zezima'
response = urlopen(url)
data = str(response.read())
Suppose you want to GET a webpage's content. The following code does it:
# -*- coding: utf-8 -*-
# python
# example of getting a web page
from urllib import urlopen
print urlopen("http://xahlee.info/python/python_index.html").read()
