Unable to print the contents in the File using "urllib2" - python

I'm trying to print the lines of a file fetched from a URL; however, I'm getting an error that says:
with html as ins:
AttributeError: __exit__
My code is posted below:
import urllib2
response = urllib2.urlopen('------------------')
html = response.read()
counter = 0;
with html as ins:
array = []
for line in ins:
counter = counter+1
print "cluster number is:", counter
print line

If you want to write the bytes from the url as is (no decoding/encoding):
#!/usr/bin/env python2
import urllib2
import shutil
import sys
from contextlib import closing
with closing(urllib2.urlopen(url)) as response:
    shutil.copyfileobj(response, sys.stdout)
This assumes that the response uses the same character encoding as your terminal; otherwise you'll see mojibake. See "A good way to get the charset/encoding of an HTTP response in Python".
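One way to avoid guessing the encoding is to read the charset from the Content-Type header. The following is only a Python 3 sketch, not the Python 2 code above: it assumes the header object behaves like email.message.Message (which the headers returned by urllib.request.urlopen do), and the header value, the body bytes, and the utf-8 fallback are made-up examples.

```python
from email.message import Message

# Stand-in for response.headers; a real response would supply this object.
headers = Message()
headers["Content-Type"] = "text/html; charset=ISO-8859-1"

# Fall back to utf-8 (an assumption) when the server declares no charset.
charset = headers.get_content_charset(failobj="utf-8")

# Example body, encoded the way the header claims.
body = "caf\u00e9".encode("iso-8859-1")
text = body.decode(charset)
print(charset)
```

With a real response, body would come from response.read() and headers from response.headers.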
Your code in the question contains multiple errors, e.g.:
wrong indentation
it tries to use a str object as a context manager, which raises AttributeError (the __exit__ method is not defined) because str does not implement the context manager protocol
for line in ins is misleading: iterating over a string yields characters, not lines.
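The difference between iterating over a string and iterating over a file-like object can be sketched as follows. This is a Python 3 illustration under stated assumptions: io.BytesIO stands in for the object returned by urlopen(), number_lines is a hypothetical helper, and the sample data is made up.

```python
import io

def number_lines(fileobj):
    """Yield (counter, line) pairs for each line of a binary file-like object."""
    for counter, line in enumerate(fileobj, start=1):
        yield counter, line.rstrip(b"\r\n")

# Iterating over a string yields single characters, not lines.
chars = list("ab\ncd")

# io.BytesIO stands in for the response object; it yields whole lines.
fake_response = io.BytesIO(b"first\nsecond\nthird\n")
numbered = list(number_lines(fake_response))

for counter, line in numbered:
    print("cluster number is:", counter)
    print(line.decode("utf-8"))
```

Because the response is iterated directly, there is no need to call read() first or to wrap a string in a with statement.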

Related

How to Replace \n, b and single quotes from Raw File from GitHub?

I am trying to download a file from GitHub (the raw file) and then run it as a .sql file.
import snowflake.connector
from codecs import open
import logging
import requests
from os import getcwd
import os
import sys
#logging
logging.basicConfig(
    filename='C:/Users/abc/Documents/Test.log',
    level=logging.INFO
)
url = "https://github.com/raw/abc/master/file_name?token=Anvn3lJXDks5ciVaPwA%3D%3D"
directory = getcwd()
filename = os.path.join(getcwd(),'VIEWS.SQL')
r = requests.get(url)
filename.decode("utf-8")
f = open(filename,'w')
f.write(str(r.content))
with open(filename,'r') as theFile, open(filename,'w') as outFile:
    data = theFile.read().split('\n')
    data = theFile.read().replace('\n','')
    data = theFile.read().replace("b'","")
    data = theFile.read()
    outFile.write(data)
However I get this error
syntax error line 1 at position 0 unexpected 'b'
My converted SQL file has a b at the beginning and a bunch of newline \n characters in it. Also, the entire output file is wrapped in single quotes, like 'text'. Can anyone help me get rid of these? It looks like replace isn't working.
OS: Windows
Python Version: 3.7.0
You introduced a b'.. prefix by converting the response.content bytes value to a string with str():
>>> import requests
>>> r = requests.get("https://github.com/raw/abc/master/file_name?token=Anvn3lJXDks5ciVaPwA%3D%3D")
>>> r.content
b'Not Found'
>>> str(r.content)
"b'Not Found'"
Of course, the specific dummy URL you gave in your question produces a 404 Not Found response, hence the Not Found content of the response body:
>>> r.status_code
404
so the contents in this demonstration are not actually all that useful. However, even for your real URL you probably want to test for a 200 status code before moving to write the data to a file!
What is going wrong in the above is that str(bytesvalue) converts a bytes object to its representation. You'd normally want to decode a bytes value with a text codec, using the bytes.decode() method. But because you are writing the data to a file here, you should instead just open the file in binary mode and write the bytes object without decoding:
r = requests.get(url)
if r.status_code == 200:
    with open(filename, 'wb') as f:
        f.write(r.content)
The 'wb' mode opens the file for writing in binary mode. Writing binary content to a binary file is the most efficient; decoding it first then writing to a text file requires that it is encoded again. Better to avoid doing double work.
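The difference between str(), decode(), and a plain binary write can be seen without any network access. This is a minimal sketch: the bytes literal is a made-up stand-in for r.content, and the temporary directory is only there to keep the example self-contained.

```python
import os
import tempfile

content = b"SELECT 1;\n"  # made-up stand-in for r.content

as_repr = str(content)             # the repr, including the b'...' wrapper
as_text = content.decode("utf-8")  # the actual text

# Write the bytes unchanged in binary mode, then read them back.
path = os.path.join(tempfile.mkdtemp(), "VIEWS.SQL")
with open(path, "wb") as f:
    f.write(content)
with open(path, "rb") as f:
    round_trip = f.read()

print(as_repr)
print(round_trip == content)
```

The repr is what ends up in the file when str() is used, which is why the SQL parser complains about an unexpected 'b' at position 0.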
As a side note: there is no need to join a local filename with getcwd(); relative paths are always resolved against the current working directory, and if you do need an absolute path it's better to use os.path.abspath(filename).
You could also trust that GitHub sets the correct character set in the Content-Type headers and have response decode the value to str for you in the form of the response.text attribute:
r = requests.get(url)
if r.status_code == 200:
    with open(filename, 'w') as f:
        f.write(r.text)
but again, that's really doing extra work for nothing, first decoding the binary content from the request, then encoding again when writing to a text file.
Finally, for larger file responses it is better to stream the data and copy it directly to a file. The shutil.copyfileobj() function can take a raw response fileobject directly, provided you enable transparent transport decompression:
import shutil
r = requests.get(url, stream=True)
if r.status_code == 200:
    with open(filename, 'wb') as f:
        # enable transparent transport decompression handling
        r.raw.decode_content = True
        shutil.copyfileobj(r.raw, f)
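To see what shutil.copyfileobj() does with any pair of file-like objects, here is a small offline sketch. The io.BytesIO objects are stand-ins (assumptions) for r.raw and the output file, and the 16 KiB chunk size is an arbitrary choice for illustration.

```python
import io
import shutil

source = io.BytesIO(b"x" * 100000)  # stand-in for r.raw
dest = io.BytesIO()                 # stand-in for the output file

# Copy in 16 KiB chunks instead of loading everything into memory at once.
shutil.copyfileobj(source, dest, 16 * 1024)

copied = dest.getvalue()
print(len(copied))
```

The data is moved chunk by chunk, so memory use stays bounded by the chunk size regardless of how large the response is.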
Depending on your version of Python/OS it could be as simple as changing the file to read/write in binary (and, if the stray characters are still there, changing where you apply the replaces):
# write the raw bytes and close the file before reading it back
with open(filename, 'wb') as outFile:
    outFile.write(r.content)
with open(filename, 'rb') as theFile:
    data = theFile.read()
# only needed if the unwanted characters are still present
data = data.replace(b'\n', b'')
data = data.replace(b"b'", b"")
with open(filename, 'wb') as outFile:
    outFile.write(data)
It would help to have a copy of the file and the line the error is occurring on.

Open an XML file through URL and save it

With Python 3, I want to read an XML web page and save it to my local drive.
Also, if the file already exists, it must be overwritten.
I tested a script like this:
import urllib.request
xml = urllib.request.urlopen('URL')
data = xml.read()
file = open("file.xml","wb")
file.writelines(data)
file.close()
But I get an error:
TypeError: a bytes-like object is required, not 'int'
First suggestion: do what even the official urllib docs say and use requests instead of urllib.
Your problem is that .writelines() expects an iterable of lines, not a single bytes object; iterating over bytes yields integers, hence the error (for once in Python, the error message is not very helpful). Use .write() instead:
import requests
resp = requests.get('URL')
with open('file.xml', 'wb') as foutput:
    foutput.write(resp.content)
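Why writelines() fails here can be reproduced offline. In this sketch, io.BytesIO is an assumed stand-in for the file opened with "wb", and the XML bytes are a made-up example.

```python
import io

data = b"<root/>"  # made-up stand-in for the downloaded XML

# Iterating over bytes yields integers, one per byte.
first_item = next(iter(data))

buf = io.BytesIO()  # stand-in for open("file.xml", "wb")
raised = False
try:
    buf.writelines(data)  # tries to write each int as if it were a line
except TypeError:
    raised = True

buf.write(data)  # write() accepts the whole bytes object
saved = buf.getvalue()
print(first_item, raised, saved)
```

writelines() would be the right call only for an iterable of bytes objects, such as the result of data.splitlines(True).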
I found a solution :
from urllib.request import urlopen
xml = open("import.xml", "w")  # "w" creates or overwrites the file; "r+" fails if it does not exist
xml.write(urlopen('URL').read().decode('utf-8'))
xml.close()
Thanks for your help.

Getting error while creating multiple file in python

I'm creating two files using a Python script: the first is a JSON file and the second is an HTML file. My code below creates the JSON file, but I get an error while creating the HTML file. Could someone help me resolve the issue? I'm new to Python, so I'd really appreciate any suggestions.
#!/usr/bin/python
# -*- coding: utf-8 -*-
import sys
import json
JsonResponse = '[{"status": "active", "due_date": null, "group": "later", "task_id": 73286}]'
def create(JsonResponse):
    print JsonResponse
    print 'creating new file'
    try:
        jsonFile = 'testFile.json'
        file = open(jsonFile, 'w')
        file.write(JsonResponse)
        file.close()
        with open('testFile.json') as json_data:
            infoFromJson = json.load(json_data)
            print infoFromJson
        htmlReportFile = 'Report.html'
        htmlfile = open(htmlReportFile, 'w')
        htmlfile.write(infoFromJson)
        htmlfile.close()
    except:
        print 'error occured'
        sys.exit(0)
create(JsonResponse)
I used below online Python editor to execute my code:
https://www.tutorialspoint.com/execute_python_online.php
infoFromJson = json.load(json_data)
Here, json.load() expects json_data to contain valid JSON. But the data you provided is not valid JSON; it's a plain string (Hello World!). So you get this error:
ValueError: No JSON object could be decoded
Update:
In your code you should get the error:
TypeError: expected a character buffer object
That's because the content you write to the file needs to be a string, but you have a list of dictionaries instead.
There are two ways to solve this. Replace the line:
htmlfile.write(infoFromJson)
with either this, which makes infoFromJson a string:
htmlfile.write(str(infoFromJson))
or use the dump utility of the json module, passing the output file:
json.dump(infoFromJson, htmlfile)
If you delete the try...except statement, you will see the errors below:
Traceback (most recent call last):
File "/Volumes/Ithink/wechatProjects/django_wx_joyme/app/test.py", line 26, in <module>
create(JsonResponse)
File "/Volumes/Ithink/wechatProjects/django_wx_joyme/app/test.py", line 22, in create
htmlfile.write(infoFromJson)
TypeError: expected a string or other character buffer object
The error occurs because htmlfile.write needs a string, but infoFromJson is a list.
So changing htmlfile.write(infoFromJson) to htmlfile.write(str(infoFromJson)) will avoid the error.
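Note that str() and the json module produce different text for the same list. A small Python 3 sketch, using a shortened copy of the JSON from the question as sample data:

```python
import json

# Shortened copy of the JSON string from the question.
info = json.loads('[{"status": "active", "due_date": null, "task_id": 73286}]')

as_repr = str(info)         # Python repr: single quotes, None
as_json = json.dumps(info)  # valid JSON: double quotes, null

print(as_repr)
print(as_json)
```

Both are strings and can be passed to write(), but only the json.dumps() result can be parsed back with json.loads().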

json2html python lib is not working

I'm trying to create a new JSON file from my custom JSON input, convert the JSON to HTML, and save it to an .html file. But I'm getting an error while generating the JSON and HTML files. Please find my code below; I'm not sure what I'm doing wrong here:
#!/usr/bin/python
# -*- coding: utf-8 -*-
from json2html import *
import sys
import json
JsonResponse = {
    "name": "json2html",
    "description": "Converts JSON to HTML tabular representation"
}
def create(JsonResponse):
    #print JsonResponse
    print 'creating new file'
    try:
        jsonFile = 'testFile.json'
        file = open(jsonFile, 'w')
        file.write(JsonResponse)
        file.close()
        with open('testFile.json') as json_data:
            infoFromJson = json.load(json_data)
        scanOutput = json2html.convert(json=infoFromJson)
        print scanOutput
        htmlReportFile = 'Report.html'
        htmlfile = open(htmlReportFile, 'w')
        htmlfile.write(str(scanOutput))
        htmlfile.close()
    except:
        print 'error occured'
        sys.exit(0)
create(JsonResponse)
Can someone please help me resolve this issue.
Thanks!
First, get rid of your try / except. Using except without a type expression is almost always a bad idea. In this particular case, it prevented you from knowing what was actually wrong.
After we remove the bare except:, we get this useful error message:
Traceback (most recent call last):
File "x.py", line 31, in <module>
create(JsonResponse)
File "x.py", line 18, in create
file.write(JsonResponse)
TypeError: expected a character buffer object
Sure enough, JsonResponse isn't a character string (str), but is a dictionary. This is easy enough to fix:
file.write(json.dumps(JsonResponse))
Here is a create() subroutine with some other fixes I recommend. Note that dumping the JSON and then immediately loading it again is usually silly. I left it in, assuming that your actual program does something slightly different.
def create(JsonResponse):
    jsonFile = 'testFile.json'
    with open(jsonFile, 'w') as json_data:
        json.dump(JsonResponse, json_data)
    with open('testFile.json') as json_data:
        infoFromJson = json.load(json_data)
    scanOutput = json2html.convert(json=infoFromJson)
    htmlReportFile = 'Report.html'
    with open(htmlReportFile, 'w') as htmlfile:
        htmlfile.write(str(scanOutput))
The error is while writing to the JSON file. Instead of file.write(JsonResponse) you should use json.dump(JsonResponse,file). It will work.
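If some error handling is still wanted, naming the exception types keeps the real message visible instead of hiding it behind a generic one. The following is only a Python 3 sketch: save_json is a hypothetical helper, the choice of TypeError and OSError is an assumption about which failures are expected, and json2html is left out so the example has no third-party dependency.

```python
import json
import os
import tempfile

def save_json(obj, path):
    """Write obj to path as JSON; report only the errors we expect."""
    try:
        with open(path, "w") as f:
            json.dump(obj, f)
    except (TypeError, OSError) as exc:
        # Naming the exception types keeps the real message visible.
        return "failed: {}: {}".format(type(exc).__name__, exc)
    return "ok"

path = os.path.join(tempfile.mkdtemp(), "testFile.json")
good = save_json({"name": "json2html"}, path)
bad = save_json({"name": {1, 2}}, path)  # a set is not JSON serializable
print(good)
print(bad)
```

Any exception that is not listed still propagates with its full traceback, which is usually what you want while debugging.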

file does not have buffer interface

I have a function in my views.py. It looks like this:
from django.core.files.uploadedfile import SimpleUploadedFile
def get_file(self, url):
    # pdb.set_trace()
    result = urllib.urlretrieve(url)
    fi = open(result[0])
    fi_name = os.path.basename(url)
    suf = SimpleUploadedFile(fi_name, fi)
    return suf
While creating the SimpleUploadedFile object, I get an error saying:
TypeError: file doesnot have buffer interface
I tried changing the open mode to 'rb', but I still get the same error.
Please help me out.
SimpleUploadedFile needs the actual file content, rather than a file object. So you could fix your code like this:
suf = SimpleUploadedFile(fi_name, fi.read())
I must say though I don't know why you are using urlretrieve, which saves the content to a local temp file which you then must open and read. Better to use urlopen and pass it directly:
result = urllib.urlopen(url)
fi_name = os.path.basename(url)
suf = SimpleUploadedFile(fi_name, result.read())
Had this problem on the dreaded El Capitan, in the requests lib.
It seems that passing unicode as HTTP content breaks stuff when converted to memoryview() on the socket layer.
Just passing everything as plain strings fixed it for me.
