I made this Python library, and it has a function which uses urllib and urllib2, but when I execute the library's function from the Python shell I get this error:
>>> from sabermanlib import geturl
>>> geturl("roblox.com","ggg.html")
Traceback (most recent call last):
File "<pyshell#11>", line 1, in <module>
geturl("roblox.com","ggg.html")
File "sabermanlib.py", line 21, in geturl
urllib.urlretrieve(Address,File)
File "C:\Users\Andres\Desktop\ddd\Portable Python 2.7.5.1\App\lib\urllib.py", line 94, in urlretrieve
return _urlopener.retrieve(url, filename, reporthook, data)
File "C:\Users\Andres\Desktop\ddd\Portable Python 2.7.5.1\App\lib\urllib.py", line 240, in retrieve
fp = self.open(url, data)
File "C:\Users\Andres\Desktop\ddd\Portable Python 2.7.5.1\App\lib\urllib.py", line 208, in open
return getattr(self, name)(url)
File "C:\Users\Andres\Desktop\ddd\Portable Python 2.7.5.1\App\lib\urllib.py", line 463, in open_file
return self.open_local_file(url)
File "C:\Users\Andres\Desktop\ddd\Portable Python 2.7.5.1\App\lib\urllib.py", line 477, in open_local_file
raise IOError(e.errno, e.strerror, e.filename)
IOError: [Errno 2] The system cannot find the file specified: 'roblox.com'
>>>
Here's the code for the library I made:
import urllib
import urllib2
def geturl(Address,File):
    urllib.urlretrieve(Address,File)
EDIT 2
I can't understand why I get this error when executing the following in the Python shell:
geturl(Address,File)
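urllib.urlretrieve takes a URL and a target filename, not a file-like object, so the call itself is fine. The error comes from the URL: without a scheme such as http://, urllib treats 'roblox.com' as a local file path, which is why the traceback ends in open_local_file. If you want to read the data yourself, urllib.urlopen returns a file-like object: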
>>> help(urllib.urlopen)
urlopen(url, data=None, proxies=None)
Create a file-like object for the specified URL to read from.
Additionally, if you want to download and save a document, you'll need a more robust geturl function:
import urllib
def geturl(Address, FileName):
    html_data = urllib.urlopen(Address).read()  # Open the URL and read its contents
    with open(FileName, 'wb') as f:  # Open the output file in binary mode
        f.write(html_data)  # Write data from URL to file
geturl(u'http://roblox.com', 'ggg.html')  # URLs must contain the full URI, including http://
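The traceback in the question ends in open_local_file, which suggests the scheme-less address was treated as a local path. Here is a minimal sketch of a geturl that guards against this. ensure_scheme is a hypothetical helper, and the import fallback is an assumption so the snippet runs on both Python 2 and Python 3.

```python
try:
    from urllib.request import urlretrieve  # Python 3 location
except ImportError:
    from urllib import urlretrieve  # Python 2 location


def ensure_scheme(address, default="http"):
    """Return address unchanged if it has a scheme, else prepend default (hypothetical helper)."""
    if "://" in address:
        return address
    return default + "://" + address


def geturl(address, filename):
    """Download address to filename, adding http:// when the scheme is missing."""
    urlretrieve(ensure_scheme(address), filename)
```

With this in place, geturl("roblox.com", "ggg.html") requests http://roblox.com instead of looking for a local file named roblox.com.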
Related
I have tried the library called pdftotree, but I did not get any result.
This is the code:
import pdftotree
file= open('C:/Users/chaitanya.naidu/Downloads/test.pdf', 'rb')
f = pdftotree.parse(file)
I am getting this error
Traceback (most recent call last):
File "<ipython-input-4-4a9a6b72801d>", line 1, in <module>
f = pdftotree.parse(file)
File "C:\Users\chaitanya.naidu\AppData\Local\Continuum\Anaconda3\lib\site-packages\pdftotree\core.py", line 63, in parse
if not extractor.is_scanned():
File "C:\Users\chaitanya.naidu\AppData\Local\Continuum\Anaconda3\lib\site-packages\pdftotree\TreeExtract.py", line 121, in is_scanned
self.parse()
File "C:\Users\chaitanya.naidu\AppData\Local\Continuum\Anaconda3\lib\site-packages\pdftotree\TreeExtract.py", line 91, in parse
for page_num, layout in enumerate(analyze_pages(self.pdf_file)):
File "C:\Users\chaitanya.naidu\AppData\Local\Continuum\Anaconda3\lib\site-packages\pdftotree\utils\pdf\pdf_utils.py", line 117, in analyze_pages
with open(os.path.realpath(file_name), "rb") as fp:
File "C:\Users\chaitanya.naidu\AppData\Local\Continuum\Anaconda3\lib\ntpath.py", line 542, in abspath
path = os.fspath(path)
TypeError: expected str, bytes or os.PathLike object, not _io.BufferedReader
You can use pdfkit, for example:
import pdfkit
pdfkit.from_url('http://google.com', 'out.pdf')
pdfkit.from_file('test.html', 'out.pdf')
pdfkit.from_string('Hello!', 'out.pdf')
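Separately from switching libraries, the last frames of the traceback show pdftotree calling os.path.realpath on its argument, so it appears to expect a file path rather than an open file object. Here is a minimal sketch under that assumption. as_path is a hypothetical helper, and the pdftotree call is left as a comment because it is only inferred from the traceback.

```python
import os


def as_path(file_or_path):
    """Return a filesystem path for a path string or an open file object (hypothetical helper)."""
    if isinstance(file_or_path, (str, bytes)) or hasattr(file_or_path, "__fspath__"):
        return os.fspath(file_or_path)
    # File objects created by open(path) expose that path as .name
    return os.fspath(file_or_path.name)


# Assumed usage, inferred from the traceback rather than from pdftotree's documentation:
# tree = pdftotree.parse(as_path('C:/Users/chaitanya.naidu/Downloads/test.pdf'))
```

The simplest fix is probably to pass the path string directly and skip the open() call altogether.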
Please correct my code. I am trying to save the result of this web page, which is in JSON format, to a variable in Python.
Error:
Traceback (most recent call last):
File "C:/Users/Varen/Desktop/json_v1.py", line 5, in <module>
json.dump(link, f)
File "C:\Python27\lib\json\__init__.py", line 189, in dump
for chunk in iterable:
File "C:\Python27\lib\json\encoder.py", line 442, in _iterencode
o = _default(o)
File "C:\Python27\lib\json\encoder.py", line 184, in default
raise TypeError(repr(o) + " is not JSON serializable")
TypeError: <addinfourl at 53244992 whose fp = <socket._fileobject object at 0x032B4AF0>> is not JSON serializable
Code:
import urllib
import json
link = urllib.urlopen("http://www.saferproducts.gov/RestWebServices/Recall?RecallDateStart=2015-01-01&RecallDateEnd=2015-12-31&format=json")
with open('link.json', 'w') as f:
    json.dump(link, f)
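You need to read the data from the file-like object returned by urlopen() and parse it, rather than passing the object itself to json.dump():
import urllib
import json
link = urllib.urlopen("http://www.saferproducts.gov/RestWebServices/Recall?RecallDateStart=2015-01-01&RecallDateEnd=2015-12-31&format=json")
data = json.load(link)  # Parse the response body into Python objects
with open('link.json', 'w') as f:
    json.dump(data, f)  # Write the parsed data back out as JSON
will do the trick.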
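The read-then-parse step can be tried without a network connection. The sketch below uses io.BytesIO as a stand-in for the object returned by urlopen(); save_json is a hypothetical helper, and the payload is made up for illustration, not taken from the real service.

```python
import io
import json
import os
import tempfile


def save_json(response, path):
    """Parse JSON from a file-like response, write it to path, and return the data."""
    data = json.loads(response.read().decode("utf-8"))
    with open(path, "w") as f:
        json.dump(data, f)
    return data


# Stand-in for the urlopen() result; the payload is illustrative only
fake_response = io.BytesIO(b'[{"id": 1}]')
out_path = os.path.join(tempfile.mkdtemp(), "link.json")
recalls = save_json(fake_response, out_path)
```

After this runs, recalls holds ordinary Python lists and dicts, and link.json contains the same data as JSON text.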
My goal is to extract a file out of a .tar.gz file without also extracting the subdirectories that precede the desired file. I am trying to model my method on this question. I already asked a question of my own, but the answer I thought would work didn't work fully.
In short, shutil.copyfileobj isn't copying the contents of my file.
My code is now:
import os
import shutil
import tarfile
import gzip
with tarfile.open('RTLog_20150425T152948.gz', 'r:*') as tar:
    for member in tar.getmembers():
        filename = os.path.basename(member.name)
        if not filename:
            continue
        source = tar.fileobj
        target = open('out', "wb")
        shutil.copyfileobj(source, target)
Upon running this code, the file out was successfully created; however, the file was empty. I know that the file I wanted to extract does, in fact, have lots of information (approximately 450 KB). A print(member.size) returns 1564197.
My attempts to solve this were unsuccessful. A print(type(tar.fileobj)) told me that tar.fileobj is a <gzip _io.BufferedReader name='RTLog_20150425T152948.gz' 0x3669710>.
Therefore I tried changing source to: source = gzip.open(tar.fileobj) but this raised the following error:
Traceback (most recent call last):
File "C:\Users\dzhao\Desktop\123456\444444\blah.py", line 15, in <module>
shutil.copyfileobj(source, target)
File "C:\Python34\lib\shutil.py", line 67, in copyfileobj
buf = fsrc.read(length)
File "C:\Python34\lib\gzip.py", line 365, in read
if not self._read(readsize):
File "C:\Python34\lib\gzip.py", line 433, in _read
if not self._read_gzip_header():
File "C:\Python34\lib\gzip.py", line 297, in _read_gzip_header
raise OSError('Not a gzipped file')
OSError: Not a gzipped file
Why isn't shutil.copyfileobj actually copying the contents of the file in the .tar.gz?
fileobj isn't a documented property of TarFile. It's probably an internal object used to represent the whole tar file, not something specific to the current file.
Use TarFile.extractfile() to get a file-like object for a specific member:
…
source = tar.extractfile(member)
target = open("out", "wb")
shutil.copyfileobj(source, target)
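To see extractfile() and copyfileobj() working together end to end, here is a self-contained sketch. It builds a small throwaway archive first, so the archive name, member path, and payload are illustrative assumptions rather than the asker's real data.

```python
import io
import os
import shutil
import tarfile
import tempfile

workdir = tempfile.mkdtemp()
archive = os.path.join(workdir, "sample.tar.gz")

# Build a small archive with a file nested inside subdirectories (illustrative data)
payload = b"log line 1\nlog line 2\n"
with tarfile.open(archive, "w:gz") as tar:
    info = tarfile.TarInfo(name="a/b/RTLog.txt")
    info.size = len(payload)
    tar.addfile(info, io.BytesIO(payload))

# Extract each regular file into workdir under its base name, without the directories
with tarfile.open(archive, "r:*") as tar:
    for member in tar.getmembers():
        if not member.isfile():
            continue
        filename = os.path.basename(member.name)
        source = tar.extractfile(member)
        with open(os.path.join(workdir, filename), "wb") as target:
            shutil.copyfileobj(source, target)

with open(os.path.join(workdir, "RTLog.txt"), "rb") as f:
    extracted = f.read()
```

Checking member.isfile() skips directory entries, for which extractfile() returns None.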
I have a Tornado web application where I want to read an uploaded file. The file is received from the client, and I try to read it like this:
def post(self):
    file = self.request.files['images'][0]
    dataOpen = open(file['filename'],'r');
    dataRead = dataOpen.read()
But it gives an IOError:
Traceback (most recent call last):
File "C:\Python27\lib\site-packages\tornado\web.py", line 1332, in _execute
result = method(*self.path_args, **self.path_kwargs)
File "C:\Users\rsaxdsxc\workspace\pi\src\Server.py", line 4100, in post
dataOpen = open(file['filename'],'r');
IOError: [Errno 2] No such file or directory: u'000c02c55024aeaa96e6c79bfa2de3926dbd3767.jpg'
Why isn't it able to see the file?
The value of file['filename'] is just the name of the uploaded file; it is not a path on your filesystem. The content of the file is in file['body']. You can use the StringIO module to emulate a file interface if you want, or just iterate over file['body'] directly.
Very good example you could use is here
So, your post request handler could look like:
def post(self):
    file = self.request.files['images'][0]
    dataRead = file['body']
    store_file_somewhere(file['filename'], dataRead)
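The answer leaves store_file_somewhere undefined. One possible sketch is shown below. The extra upload_dir parameter is an assumption added for illustration, and a plain dict stands in for the Tornado upload entry so the snippet runs without Tornado installed.

```python
import os
import tempfile


def store_file_somewhere(filename, body, upload_dir):
    """Write uploaded bytes to upload_dir and return the path that was written."""
    # Keep only the base name so a crafted filename cannot escape upload_dir
    safe_name = os.path.basename(filename)
    path = os.path.join(upload_dir, safe_name)
    with open(path, "wb") as f:
        f.write(body)
    return path


# Stand-in for self.request.files['images'][0]; the values are illustrative
upload = {"filename": "../photo.jpg", "body": b"\xff\xd8\xff"}
saved_path = store_file_somewhere(upload["filename"], upload["body"], tempfile.mkdtemp())
```

The file is opened in binary mode because file['body'] holds raw bytes, which matters for images such as the .jpg in the question.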
I am trying to write to the blobstore using the method described here:
http://code.google.com/appengine/docs/python/blobstore/overview.html#Writing_Files_to_the_Blobstore
I tried using the remote_api to execute the following code:
file_name = files.blobstore.create(mime_type='text/html',_blobinfo_uploaded_filename='sample.txt')
with files.open(file_name, 'a') as f:
    f.write('sample text for the sample blob')
files.finalize(file_name)
This invariably raises the following error (at the third line above):
Traceback (most recent call last):
File "<console>", line 1, in <module>
File "C:\Program Files (x86)\Google\google_appengine\google\appengine\api\files\file.py", line 310, in write
self._make_rpc_call_with_retry('Append', request, response)
File "C:\Program Files (x86)\Google\google_appengine\google\appengine\api\files\file.py", line 388, in _make_rpc_call_with_retry
_make_call(method, request, response)
File "C:\Program Files (x86)\Google\google_appengine\google\appengine\api\files\file.py", line 236, in _make_call
_raise_app_error(e)
File "C:\Program Files (x86)\Google\google_appengine\google\appengine\api\files\file.py", line 179, in _raise_app_error
raise FileNotOpenedError()
FileNotOpenedError
The file I am trying to write is very small (< 20KB), so it's not a quota issue. Are there additional steps I am missing?
You may need to add the imports below if you have not already added them, in particular the files import from google.appengine.api:
from __future__ import with_statement
from google.appengine.api import files
from google.appengine.ext import blobstore
from google.appengine.ext.webapp import blobstore_handlers
file_name = files.blobstore.create(mime_type='text/plain',_blobinfo_uploaded_filename='sample.txt')
with files.open(file_name, 'a') as f:
    f.write('sample text for the sample blob')
files.finalize(file_name)