[File IO Error in porting from python2 to python 3] - python

I am porting my project from Python 2.7 to Python 3.6.
This is what I was doing in Python 2.7:
1) Decode from Base64
2) Decompress using gzip
3) Read line by line and write to a file
bytes_array = base64.b64decode(encryptedData)
fio = StringIO.StringIO(bytes_array)
f = gzip.GzipFile(fileobj=fio)
decoded_data = f.read()
f.close()
f = file("DecodedData.log",'w')
for item in decoded_data:
    f.write(item)
f.close()
I tried the same thing with the Python 3 changes, but it does not work and gives one error or another.
I am not able to use StringIO; it gives this error:
#initial_value must be str or None, not bytes
So I tried this:
bytes_array = base64.b64decode(encryptedData)
#initial_value must be str or None, not bytes
fio = io.BytesIO(bytes_array)
f = gzip.GzipFile(fileobj=fio)
decoded_data = f.read()
f = open("DecodedData.log",'w')
for item in decoded_data:
    f.write(item)
f.close()
This gives an error on the line f.write(item):
write() argument must be str, not int
To my surprise, item actually contains an integer when I print it (83, 83, 61, 62).
I think that, because I have not given a limit, it is reading as much as it can.
So I tried to read the file line by line:
f = open("DecodedData.log",'w')
with open(decoded_data) as l:
    for line in l:
        f.write(line)
But it is still not working, and \n is also being written to the file.
Can someone suggest what I am missing?

decoded_data = f.read()
will result in decoded_data being a bytes object. bytes objects are iterable; when you iterate over them, they yield each byte value from the data as an integer (0-255). That means when you do
for item in decoded_data:
f.write(item)
then item will be each integer byte value from your raw data.
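A quick way to see this behaviour in an interpreter. This is only a minimal sketch; the four-byte sample value is an assumption, chosen because it matches the integers printed in the question:

```python
data = b"SS=>"

# Iterating over a bytes object yields integers, not one-character strings.
items = [item for item in data]
print(items)            # [83, 83, 61, 62]

# Indexing behaves the same way; slicing keeps the bytes type.
first = data[0]         # 83
head = data[:1]         # b'S'
```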
To write the data as-is, write the whole object in one call instead:
f.write(decoded_data)
You've opened f in text mode, so you'll need to open it in binary mode if you want to write raw binary data into it. But the fact that you've called the file DecodedData.log suggests you want it to be a (human-readable?) text file.
So I think overall this will be more readable:
gzipped_data = base64.b64decode(encryptedData)
data = gzip.decompress(gzipped_data)
with open("DecodedData.log",'wb') as f:
    f.write(data)
There's no need for the intermediate BytesIO at all: gzip has a decompress function (https://docs.python.org/3/library/gzip.html#gzip.decompress).
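For anyone who wants to try this end to end, here is a self-contained sketch. The sample payload, the temporary directory and the UTF-8 assumption in the second option are mine, not something the question states:

```python
import base64
import gzip
import os
import tempfile

# Build a stand-in for encryptedData: gzip some text, then Base64-encode it.
original = b"first line\nsecond line\n"
encrypted_data = base64.b64encode(gzip.compress(original))

# Reverse the two steps.
gzipped_data = base64.b64decode(encrypted_data)
data = gzip.decompress(gzipped_data)

# Option 1: write the raw bytes unchanged (binary mode).
out_dir = tempfile.mkdtemp()
binary_path = os.path.join(out_dir, "DecodedData.log")
with open(binary_path, "wb") as f:
    f.write(data)

# Option 2: decode to str and write in text mode.
# UTF-8 is an assumption about the payload; newline="" avoids newline translation.
text_path = os.path.join(out_dir, "DecodedData.txt")
with open(text_path, "w", encoding="utf-8", newline="") as f:
    f.write(data.decode("utf-8"))
```

Option 1 is the safer default when you do not know the encoding of the decompressed data.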

Related

file = open(ad, 'wb') TypeError: expected str, bytes or os.PathLike object, not NoneType

@bot.on_message(filters.command('song'))
def songs(_,message):
    msg = message.text.replace(message.text.split(' ')[0], '')
    videosSearch = VideosSearch(msg , limit = 1)
    f = videosSearch.result()
    nani = f['result']
    for link in nani:
        url = link['link']
        video = pafy.new(url)
        audiostreams = video.audiostreams
        best = video.getbestaudio()
        ad = best.download()
        file = open(ad, 'wb')
        bot.send_document(message.chat.id, file)
        file.close()
I can't find the error. Please help.
file = open(ad, 'wb')
TypeError: expected str, bytes or os.PathLike object, not NoneType
Since the other two replies aren't really getting the issue at hand, I'm going to chip in here.
On these lines:
ad = best.download()
file = open(ad, 'wb')
bot.send_document(message.chat.id, file)
file.close()
You assign the result of best.download() to ad, which you then want to send with Pyrogram. Since open() raises a TypeError, it is clear that open() didn't get a type it expected: it received None (NoneType) instead of a str, bytes or os.PathLike path. You have to make sure that something is actually downloaded, and that the method returns a value you can use.
Maybe try the caveman approach and use print(ad) before trying to open() it.
In addition to all that: Pyrogram does not support sending a file representation (open()). You can open a file in bytes mode, read the bytes and use that with BytesIO, but when you have an actual file on your system, you can just use the path to the file: app.send_document(chat_id, "my_file.webm").
See the Documentation for app.send_document().
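A small guard along those lines might look like this. It is only a sketch: open_download_for_sending is a hypothetical helper name of mine, not part of pafy or Pyrogram, and the temporary file merely stands in for a finished download:

```python
import os
import tempfile


def open_download_for_sending(path):
    """Open a downloaded file for reading, failing loudly if no path came back."""
    if path is None:
        raise ValueError("download step returned None, so there is no file to open")
    # 'rb' reads the existing bytes; 'wb' would truncate the file to zero length.
    return open(path, "rb")


# Stand-in for a finished download: a small temporary file.
fd, demo_path = tempfile.mkstemp(suffix=".webm")
with os.fdopen(fd, "wb") as tmp:
    tmp.write(b"fake audio bytes")

with open_download_for_sending(demo_path) as fh:
    payload = fh.read()

os.remove(demo_path)
```

Note the mode: opening an existing download with 'wb' empties it, so even with a valid path the original code would have sent a zero-byte file.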
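Replace this:
file = open(ad, 'wb')
With this:
file = open('ad', 'wb')
Note that this opens a new, empty file literally named ad rather than the downloaded file, so it only silences the TypeError.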

GZip and output file

I'm having difficulty with the following code (which is simplified from a larger application I'm working on in Python).
from io import StringIO
import gzip
jsonString = 'JSON encoded string here created by a previous process in the application'
out = StringIO()
with gzip.GzipFile(fileobj=out, mode="w") as f:
    f.write(str.encode(jsonString))
# Write the file once finished rather than streaming it - uncomment the next line to see file locally.
with open("out_" + currenttimestamp + ".json.gz", "a", encoding="utf-8") as f:
    f.write(out.getvalue())
When this runs I get the following error:
  File "d:\Development\AWS\TwitterCompetitionsStreaming.py", line 61, in on_status
    with gzip.GzipFile(fileobj=out, mode="w") as f:
  File "C:\Python38\lib\gzip.py", line 204, in __init__
    self._write_gzip_header(compresslevel)
  File "C:\Python38\lib\gzip.py", line 232, in _write_gzip_header
    self.fileobj.write(b'\037\213') # magic header
TypeError: string argument expected, got 'bytes'
PS ignore the rubbish indenting here...I know it doesn't look right.
What I want to do is create a JSON file and gzip it in memory before saving the gzipped file to the filesystem (Windows). I know I've gone about this the wrong way and could do with a pointer. Many thanks in advance.
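You have to use bytes everywhere when working with gzip, instead of strings and text. First, use BytesIO instead of StringIO. Second, open the output file in binary mode: 'wb' instead of 'w' (likewise 'ab' instead of 'a' when appending), and drop the encoding argument, which only applies to text mode. The 'b' character here means "binary". gzip.GzipFile itself always works in binary mode, so 'w' and 'wb' are equivalent there. Full corrected code below: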
from io import BytesIO
import gzip
jsonString = 'JSON encoded string here created by a previous process in the application'
out = BytesIO()
with gzip.GzipFile(fileobj = out, mode = 'wb') as f:
    f.write(str.encode(jsonString))
currenttimestamp = '2021-01-29'
# Write the file once finished rather than streaming it - uncomment the next line to see file locally.
with open("out_" + currenttimestamp + ".json.gz", "wb") as f:
    f.write(out.getvalue())
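If the whole payload already sits in memory, gzip.compress may be a shorter route than wrapping a BytesIO. The following is a sketch under that assumption; the record contents and the output path are placeholders of mine:

```python
import gzip
import json
import os
import tempfile

record = {"id": 1, "text": "example"}            # placeholder payload
json_string = json.dumps(record)

# Compress entirely in memory; the result is a bytes object.
compressed = gzip.compress(json_string.encode("utf-8"))

# Write the bytes in binary mode (no encoding argument).
out_path = os.path.join(tempfile.mkdtemp(), "out_example.json.gz")
with open(out_path, "wb") as f:
    f.write(compressed)

# Read it back to confirm the round trip.
with gzip.open(out_path, "rt", encoding="utf-8") as f:
    restored = json.load(f)
```

The BytesIO version above is still the better fit when the data is produced incrementally and written to the GzipFile in several calls.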

convert the content of a file in hex to base64 and print out the result

So I'm trying to create a very simple program that opens a file, reads it, and converts its contents from hex to Base64 using Python 3.
I tried this:
file = open("test.txt", "r")
contenu = file.read()
encoded = contenu.decode("hex").encode("base64")
print (encoded)
but I get the error:
AttributeError: 'str' object has no attribute 'decode'
I tried multiple other things but always get the same error.
The content of test.txt is:
4B
If you could explain what I am doing wrong, that would be awesome.
Thank you
EDIT:
I should get Sw== as output.
This should do the trick. Your code works for Python <= 2.7 but needs updating in later versions.
import base64
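file = open("test.txt", "r")
contenu = file.read().strip()  # drop the trailing newline before parsing the hex digits
raw = bytearray.fromhex(contenu)  # named raw so the built-in bytes type is not shadowed
encoded = base64.b64encode(raw).decode('ascii')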
print(encoded)
You need to convert the hex string from test.txt to a bytes object using bytes.fromhex() before encoding it to Base64.
import base64
with open("test.txt", "r") as file:
    content = file.read()
    encoded = base64.b64encode(bytes.fromhex(content))
print(encoded)
You should always use a with statement to open your file, so that it is closed automatically when you are finished.
In IDLE:
>>> import base64
>>>
>>> with open('test.txt', 'r') as file:
...     content = file.read()
...     encoded = base64.b64encode(bytes.fromhex(content))
...
>>> encoded
b'Sw=='
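The two answers can be folded into one small function. This is a sketch; hex_to_base64 is a name I made up, and the strip() call is there because older Python 3 releases only skip spaces, not newlines, in bytes.fromhex():

```python
import base64


def hex_to_base64(hex_text):
    """Convert a string of hex digits to a Base64 string."""
    raw = bytes.fromhex(hex_text.strip())   # strip() removes a trailing newline
    return base64.b64encode(raw).decode("ascii")


print(hex_to_base64("4B\n"))   # Sw==
```

Decoding the result with 'ascii' gives a str, so print shows Sw== rather than b'Sw=='.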

Python throwing error in reading JSON file

I am writing a function in a Python script that reads a JSON file and prints it.
The script reads as:
def main(conn):
    global link, link_ID
    with open('ad_link.json', 'r') as statusFile:
        status = json.loads(statusFile.read())
        statusFile.close()
    print(status)
    link_data = json.load[status]
    link = link_data["link"]
    link_ID = link_data["link_id"]
    print(link)
    print(link_ID)
I am getting this error:
link_data = json.load[status]
TypeError: 'function' object is not subscriptable
What is the issue?
This is the content of ad_link.json. The file I am receiving is saved in this manner:
"{\"link\": \"https://res.cloudinary.com/dnq9phiao/video/upload/v1534157695/Adidas-Break-Free_nfadrz.mp4\", \"link_id\": \"ad_Bprise_ID_Adidas_0000\"}"
The function to receive and write JSON file
def on_message2(client, userdata, message):
    print("New MQTT message received. File %s line %d" % (filename, cf.f_lineno))
    print("message received?/'/'/' ", str(message.payload.decode("utf-8")), \
          "topic", message.topic, "retained ", message.retain)
    global links
    links = str(message.payload.decode("utf-8"))
    logging.debug("Got new mqtt message as %s" % message.payload.decode("utf-8"))
    status_data = str(message.payload.decode("utf-8"))
    print(status_data)
    print("in function on_message2")
    with open("ad_link.json", "w") as outFile:
        json.dump(status_data, outFile)
    time.sleep(3)
The output of this function
New MQTT message received. File C:/Users/arunav.sahay/PycharmProjects/MediaPlayer/venv/Include/mediaplayer_db_mqtt.py line 358
message received?/'/'/' {"link": "https://res.cloudinary.com/dnq9phiao/video/upload/v1534157695/Adidas-Break-Free_nfadrz.mp4", "link_id": "ad_Bprise_ID_Adidas_0000"} topic ios_push retained 1
{"link": "https://res.cloudinary.com/dnq9phiao/video/upload/v1534157695/Adidas-Break-Free_nfadrz.mp4", "link_id": "ad_Bprise_ID_Adidas_0000"}
EDIT
I found out that the error is in the JSON format: I am receiving the JSON data in the wrong format. How do I correct that?
There are two major errors here:
1) You are trying to use the json.load function as a sequence or dictionary mapping. It's a function, so you can only call it; you'd use json.load(file_object). Since status is actually a string, you'd have to use json.loads(status) to decode a JSON document stored in a string.
2) In on_message2, you encoded JSON data to JSON again. Now you have to decode it twice. That's an unfortunate waste of computer resources.
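The double encoding is easy to reproduce in isolation. This is a minimal sketch; the payload text is a shortened placeholder of mine, not the real message:

```python
import json

payload_text = '{"link": "https://example.com/video.mp4", "link_id": "ad_0000"}'

# json.dumps on a str wraps it in quotes and escapes the inner quotes.
double_encoded = json.dumps(payload_text)

once = json.loads(double_encoded)   # still a str, not a dict
twice = json.loads(once)            # now a dict

print(type(once).__name__, type(twice).__name__)   # str dict
```

This is why the saved ad_link.json starts with a quote character and is full of \" escapes.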
In the on_message2 function, the message.payload object is a bytes value containing a UTF-8 encoded JSON document. If you want to write that to a file, don't decode it to text, and don't encode the text to JSON again. Just write those bytes directly to a file:
def on_message2(client, userdata, message):
    logging.debug("Got new mqtt message as %s" % message.payload.decode("utf-8"))
    with open("ad_link.json", "wb") as out:
        out.write(message.payload)
Note the 'wb' mode; that opens a file in binary mode for writing, at which point you can write the bytes object to that file.
When you open a file without a b in the mode, you open a file in text mode, and when you write a text string to that file object, Python encodes that text to bytes for you. The default encoding depends on your OS settings, so without an explicit encoding argument to open() you can't even be certain that you end up with UTF-8 JSON bytes again! Since you already have a bytes value, there is no need to manually decode then have Python encode again, so use a binary file object and avoid that decode / encode dance here too.
You can now load the file contents with json.load() without having to decode again:
def main(conn):
    with open('ad_link.json', 'rb') as status_file:
        status = json.load(status_file)
    link = status["link"]
    link_id = status["link_id"]
Note that I opened the file as binary again. As of Python 3.6, the json.load() function can work both with binary files and text files, and for binary files it can auto-detect if the JSON data was encoded as UTF-8, UTF-16 or UTF-32.
If you are using Python 3.5 or earlier, open the file as text, but do explicitly set the encoding to UTF-8:
def main(conn):
    with open('ad_link.json', 'r', encoding='utf-8') as status_file:
        status = json.load(status_file)
    link = status["link"]
    link_id = status["link_id"]
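To check the write and read halves together without an MQTT broker, something like the following can be run on Python 3.6 or later. It is a sketch: the payload bytes stand in for message.payload, and the URL and temporary path are placeholders of mine:

```python
import json
import os
import tempfile

# Stand-in for message.payload: UTF-8 encoded JSON bytes.
payload = '{"link": "https://example.com/video.mp4", "link_id": "ad_0000"}'.encode("utf-8")

path = os.path.join(tempfile.mkdtemp(), "ad_link.json")

# Write the bytes as received, with no decode or json.dump step.
with open(path, "wb") as out:
    out.write(payload)

# json.load accepts a binary file object on Python 3.6+.
with open(path, "rb") as status_file:
    status = json.load(status_file)

link = status["link"]
link_id = status["link_id"]
```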
def main(conn):
    global link, link_ID
    with open('ad_link.json', 'r') as statusFile:
        link_data = json.loads(statusFile.read())
    link = link_data["link"]
    link_ID = link_data["link_id"]
    print(link)
    print(link_ID)
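Use json.load instead of json.loads when reading from a file object (anything that supports read()). Because the file content is double-encoded, the resulting string is then decoded once more with json.loads: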
def main(conn):
    global link, link_ID
    with open('ad_link.json', 'r') as statusFile:
        status = json.load(statusFile)
    status = json.loads(status)
    link = status["link"]
    link_ID = status["link_id"]
    print(link)
    print(link_ID)

Open binary file in zip archive as ZipExtFile

I'm trying to access a binary stream (via a ZipExtFile object) from a data file contained in a Zip archive. To incrementally read in a text file object from the archive, this would be fairly straightforward:
with ZipFile("myziparchive.zip", 'r') as ziparchive:
    with ziparchive.open("mybigtextfile.txt", 'r') as txtfile:
        for line in txtfile:
            ....
Ideally the byte stream equivalent would be something like:
with ZipFile("myziparchive.zip", 'r') as ziparchive:
    with ziparchive.open("mybigbinary.bin", 'rb') as binfile:
        while notEOF:
            binchunk = binfile.read(MYCHUNKSIZE)
            ....
Unfortunately, ZipFile.open doesn't seem to support reading binary data to a ZipExtFile object. From the docs:
The mode parameter, if included, must be one of the following: 'r' (the default), 'U', or 'rU'.
Given this constraint, how best to incrementally read in the binary file directly from the archive? Since the uncompressed file is quite large I'd like to avoid extracting it first.
I managed to solve the issue that I described in my comment to the OP. I have adapted it here for your purpose, but I think there is probably a way to just change the encoding of chunk_str to avoid using BytesIO.
Anyway - here's my code if it helps:
from io import BytesIO
from zipfile import ZipFile
MYCHUNKSIZE = 10
archive_file = r"test_resources\0000232514_bom.zip"
src_file = r"0000232514_bom.xls"
no_of_chunks_to_read = 10
with ZipFile(archive_file,'r') as zf:
    with zf.open(src_file) as src_f:
        while no_of_chunks_to_read > 0:
            chunk_str = src_f.read(MYCHUNKSIZE)
            chunk_stream = BytesIO(chunk_str)
            chunk_bytes = chunk_stream.read()
            print(type(chunk_bytes), len(chunk_bytes), chunk_bytes)
            if len(chunk_str) < MYCHUNKSIZE:
                # End of file
                break
            no_of_chunks_to_read -= 1
For line by line reading:
with ZipFile("myziparchive.zip", 'r') as ziparchive:
    with ziparchive.open("mybigtextfile.txt", 'r') as binfile:
        for line in binfile:
            line = line.decode() # bytes to str
            ...
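On Python 3, the object returned by ZipFile.open() already yields bytes from read(), so a plain chunk loop should be enough and no BytesIO wrapper is needed. A self-contained sketch, in which the in-memory archive, the member contents and the tiny chunk size are all my own assumptions for illustration:

```python
import io
from zipfile import ZipFile

CHUNK_SIZE = 4   # tiny on purpose so the loop runs several times

# Build a small archive in memory so the example is self-contained.
archive = io.BytesIO()
with ZipFile(archive, "w") as zf:
    zf.writestr("mybigbinary.bin", bytes(range(10)))
archive.seek(0)

chunks = []
with ZipFile(archive, "r") as zf:
    with zf.open("mybigbinary.bin") as member:   # mode defaults to 'r'; reads return bytes
        while True:
            chunk = member.read(CHUNK_SIZE)
            if not chunk:        # b'' signals end of file
                break
            chunks.append(chunk)

print([len(c) for c in chunks])
```

Testing for an empty chunk avoids the fixed chunk counter used above, and the member is decompressed as it is read rather than extracted first.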
