how to write the output of iostream to buffer, python3 - python

I have a program that reads data from the CLI (sys.argv) and then writes it to a file.
I would also like to write the output to an in-memory buffer.
The manual says to use getvalue(), but all I get are errors.
Python3 manual
import io
import sys
label = sys.argv[1]
domain = sys.argv[2]
ipv4 = sys.argv[3]
ipv6 = sys.argv[4]
fd = open( domain+".external", 'w+')
fd.write(label+"."+domain+". IN AAAA "+ipv6+"\n")
output = io.StringIO()
output.write('First line.\n')
print('Second line.', file=output)
# Retrieve file contents -- this will be
# 'First line.\nSecond line.\n'
contents = output.getvalue()
# Close object and discard memory buffer --
# .getvalue() will now raise an exception.
output.close()
print(fd)
fd.getvalue()
error:
# python3.4 makecustdomain.py bubba domain.com 1.2.3.4 '2001::1'
<_io.TextIOWrapper name='domain.com.external' mode='w' encoding='US-ASCII'>
Traceback (most recent call last):
File "makecustdomain.py", line 84, in <module>
fd.getvalue()
AttributeError: '_io.TextIOWrapper' object has no attribute 'getvalue'
How do I output the data from the io stream's write() function to a buffer as well as to a file?

You use open() to open the file, so fd is a regular file object, not a StringIO object, and file objects don't have a getvalue() method. To get the contents of the file after you write to it, open the file with mode 'w+' (as you already do) and, instead of fd.getvalue(), do:
fd.seek(0)
var = fd.read()
This will put the contents of the file into var. Note that seek(0) moves the file position to the beginning of the file, so be careful when doing further writes.
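If the goal is to have the text in an in-memory buffer as well as in the file, another option is to write each line to both a StringIO object and the file, and then call getvalue() on the StringIO object. A minimal sketch along the lines of the question's code (argument handling kept as in the question):
import io
import sys

label = sys.argv[1]
domain = sys.argv[2]
ipv6 = sys.argv[4]

record = label + "." + domain + ". IN AAAA " + ipv6 + "\n"
buffer = io.StringIO()                    # in-memory buffer

with open(domain + ".external", "w") as fd:
    fd.write(record)                      # goes to the file
    buffer.write(record)                  # and to the buffer

print(buffer.getvalue())                  # getvalue() works on StringIO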

Related

Is it possible to open a binary file with os.open and os.fdopen on Unix?

I'm trying to stream a file to clients with Python, and I need to add the HTTP header fields in the response, namely Content-Length and Last-Modified. I found that I can access these fields from the file using os.fstat, which returns a stat_result object, giving me st_size and st_mtime that I can use in the response header.
Now this os.fstat takes a file descriptor, which is provided by os.open. This works:
import os
file_name = "file.cab"
fd = os.open(file_name, os.O_RDONLY)
stats = os.fstat(fd)
print("Content-Length", stats.st_size) # Content-Length 27544
print("Last-Modified", stats.st_mtime) # Last-Modified 1650348549.6016183
Now to actually open this file and have a file object (so I can read and stream it), I can use os.fdopen, which takes the file descriptor provided by os.open.
f = os.fdopen(fd)
print(f) # <_io.TextIOWrapper name=3 mode='r' encoding='UTF-8'>
We can see that the return object has encoding set to UTF-8. However, when I try to read the file, it gives an error:
print(f.read())
Traceback (most recent call last):
File "{redacted}/stream.py", line 10, in <module>
print(f.read())
File "/usr/lib/python3.9/codecs.py", line 322, in decode
(result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x82 in position 60: invalid start byte
Now there's this flag called os.O_BINARY, but the documentation mentions that
The above constants are only available on Windows.
And sure enough, since I'm running on a Unix machine, if I execute os.open with this flag, it gives an AttributeError:
fd = os.open(file_name, os.O_RDONLY | os.O_BINARY)
Traceback (most recent call last):
File "{redacted}/stream.py", line 5, in <module>
fd = os.open(file_name, os.O_RDONLY | os.O_BINARY)
AttributeError: module 'os' has no attribute 'O_BINARY'
So is it possible to open a binary file with os.open and os.fdopen on Unix?
Note that this problem doesn't occur if I just use the built-in open function:
file_name = "file.cab"
f = open(file_name, 'rb')
print(f) # <_io.BufferedReader name='file.cab'>
print(f.read()) # throws up the file in my terminal
But I have to open it with the os module, because I need to provide those HTTP header fields I mentioned.
Edit: As mentioned by tripleee, this is an example of an XY problem. I can get the result I want by using os.stat, which doesn't necessarily take a file descriptor and can be used with just the file path. So I can do something like this:
import os
file_name = "file.cab"
f = open(file_name, 'rb')
stats = os.stat(file_name)
print(f) # <_io.BufferedReader name='file.cab'>
print(stats) # os.stat_result(...)
So at this point, I'm only wondering how, or if, it's possible to do the same with os.open and os.fdopen.
Just tell os.fdopen() to open in binary mode:
f = os.fdopen(fd, 'rb')
Notice the hint in the os.fdopen documentation ...
This is an alias of the open() built-in function and accepts the same arguments.
... and, in the open() documentation, the relevant characters of the mode parameter:
'r' open for reading (default)
'b' binary mode
Here's a full program to illustrate the difference:
#!/usr/bin/env python3
import os
filepath = "utf8.txt"
fd = os.open(filepath, os.O_CREAT | os.O_WRONLY )
fo1 = os.fdopen(fd)
fo2 = os.fdopen(fd, 'wb')  # binary mode ('wb', since the descriptor was opened write-only)
print(fo1)
print(fo2)
Result:
<_io.TextIOWrapper name=3 mode='r' encoding='UTF-8'>
<_io.BufferedWriter name=3>
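Applied back to the question's streaming scenario, a minimal sketch (reusing the question's example file name) could look like this:
import os

file_name = "file.cab"
fd = os.open(file_name, os.O_RDONLY)
stats = os.fstat(fd)                      # header fields from the descriptor
print("Content-Length", stats.st_size)
print("Last-Modified", stats.st_mtime)
with os.fdopen(fd, 'rb') as f:            # binary mode avoids the UTF-8 decode error
    data = f.read()                       # raw bytes, ready to stream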
PS: I ran into this problem when trying to save an image using PIL. The Image.save() method also accepts a file object / file descriptor. This one too has to be opened in binary mode.
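The same applies there; a rough sketch of saving a PIL image through a file descriptor (the output path and the blank image are only illustrative):
import os
from PIL import Image

fd = os.open("out.png", os.O_CREAT | os.O_WRONLY)
with os.fdopen(fd, 'wb') as f:            # Image.save() needs a binary-mode file object
    Image.new("RGB", (10, 10)).save(f, format="PNG")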

GZip and output file

I'm having difficulty with the following code (which is simplified from a larger application I'm working on in Python).
from io import StringIO
import gzip
jsonString = 'JSON encoded string here created by a previous process in the application'
out = StringIO()
with gzip.GzipFile(fileobj=out, mode="w") as f:
    f.write(str.encode(jsonString))
# Write the file once finished rather than streaming it - uncomment the next line to see file locally.
with open("out_" + currenttimestamp + ".json.gz", "a", encoding="utf-8") as f:
    f.write(out.getvalue())
When this runs I get the following error:
File "d:\Development\AWS\TwitterCompetitionsStreaming.py", line 61, in on_status
with gzip.GzipFile(fileobj=out, mode="w") as f:
File "C:\Python38\lib\gzip.py", line 204, in __init__
self._write_gzip_header(compresslevel)
File "C:\Python38\lib\gzip.py", line 232, in _write_gzip_header
self.fileobj.write(b'\037\213') # magic header
TypeError: string argument expected, got 'bytes'
What I want to do is create a JSON file and gzip it in place in memory before saving the gzipped file to the filesystem (Windows). I know I've gone about this the wrong way and could do with a pointer. Many thanks in advance.
You have to work with bytes everywhere when using gzip, not strings and text. First, use BytesIO instead of StringIO. Second, the mode should be 'wb' (bytes) instead of 'w' (text), and likewise 'ab' instead of 'a' when appending; the 'b' stands for "bytes". Full corrected code below:
from io import BytesIO
import gzip
jsonString = 'JSON encoded string here created by a previous process in the application'
out = BytesIO()
with gzip.GzipFile(fileobj=out, mode='wb') as f:
    f.write(str.encode(jsonString))
currenttimestamp = '2021-01-29'
# Write the file once finished rather than streaming it - uncomment the next line to see file locally.
with open("out_" + currenttimestamp + ".json.gz", "wb") as f:
    f.write(out.getvalue())
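As a side note, for a one-shot compression of a string the gzip module also has gzip.compress(), which returns the compressed bytes directly; a small sketch of that variant (the output file name here is just an example):
import gzip

jsonString = 'JSON encoded string here created by a previous process in the application'
compressed = gzip.compress(jsonString.encode('utf-8'))   # bytes in, bytes out

with open("out.json.gz", "wb") as f:
    f.write(compressed)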

How do I write to a file and then copy its contents to another file?

I am trying to write to a .txt file and then copy it into a second .txt file.
from sys import argv
script, send_file, get_file = argv
in_file = open(send_file, "r+")
in_file.write("I'm sending information to the receiver file.")
open(get_file, "w")
get_file.write(f"{in_file}")
But I keep getting the same error:
Traceback (most recent call last):
File "ex15_test.py", line 11, in <module>
get_file.write(f"{in_file}")
AttributeError: 'str' object has no attribute 'write'
Then I assigned the result of open(get_file, "w") to a variable and called write() on that, and I get no error whatsoever.
out_file = open(get_file, "w")
out_file.write(f"{in_file}")
But then this is what ends up being written into the second file:
<_io.TextIOWrapper name='sender.txt' mode='r+' encoding='cp1252'>
Do you know what I'm doing wrong?
Why did it work when I used the variables in the second code?
In open(get_file, "w"), get_file is the name of the file; it's a string, not a file object.
You need to write to a file object, as you did to read in the first part of the code. So, it would be:
f = open(get_file, "w")
f.write(f"{in_file}")
f.close()
Note that you forgot to close both of your files in your code.
Good practice, though, is to use a context manager, which takes care of closing the file for you whatever happens in your code (an exception, etc.).
So, the best way to do it would be:
with open(get_file, "w") as f:
    f.write(f"{in_file}")
Note that f"{in_file}" writes the string representation of the file object (which is exactly the <_io.TextIOWrapper ...> line you see in the second file), not its contents; to copy the contents you have to read them from the first file, as the answer below does.
Sorry for the messy code, but this should do what you want, I think:
from sys import argv
script, send_file, get_file = argv
in_file = open(send_file, "r+")
in_file.write("I'm sending information to the receiver file.")
in_file.close()
in_file_2 = open(send_file, "r")
in_file_text = in_file_2.read()
in_file_2.close()
secondFile = open(get_file, "w")
secondFile.write(f"{in_file_text}")
secondFile.close()
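For completeness, a more compact sketch of the same write-then-copy using context managers (same command-line arguments as in the question), so both files are closed automatically:
from sys import argv

script, send_file, get_file = argv

# Write the text to the sender file...
with open(send_file, "w") as src:
    src.write("I'm sending information to the receiver file.")

# ...then copy its contents to the receiver file.
with open(send_file, "r") as src, open(get_file, "w") as dst:
    dst.write(src.read())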

How can I export binary data from Python subprocess command through STDOUT?

I am trying to re-save a PDF with Ghostscript (to correct errors that PyPDF2 can't handle). I'm calling Ghostscript with subprocess.check_output, and I want to pass the original PDF in via STDIN and export the new one via STDOUT.
When I save the PDF to a file and read it back in, it works fine. When I try to take the new PDF from STDOUT instead, it doesn't work. I think maybe this could be an encoding issue, but I don't want to encode anything as text, I just want binary data. Maybe there's something about encodings I don't understand.
How can I make the STDOUT data work like the file data?
import subprocess
from PyPDF2 import PdfFileReader
from io import BytesIO
import traceback
input_file_name = "SKMBT_42116071215160 (1).pdf"
output_file_name = 'saved2.pdf'
# input_file = open(input_file_name, "rb") # Moved below.
# Write to a file, then read the file back in. This works.
try:
    ps1 = subprocess.check_output(
        ('gs', '-o', output_file_name, '-sDEVICE=pdfwrite', '-dPDFSETTINGS=/prepress', input_file_name),
        # stdin=input_file # [edit] We pass in the file name, so this only confuses things.
    )
    # I use BytesIO() in this example only to make the examples parallel.
    # In the other example, I use BytesIO() because I can't pass a string to PdfFileReader().
    fakeFile1 = BytesIO()
    fakeFile1.write(open(output_file_name, "rb").read())
    inputpdf = PdfFileReader(fakeFile1)
    print inputpdf
except:
    traceback.print_exc()
print "---------"
# input_file.seek(0) # Added to address one comment. Removed while addressing another.
input_file = open(input_file_name, "rb")
# Export to STDOUT. This doesn't work.
try:
    ps2 = subprocess.check_output(
        ('gs', '-o', '-', '-sDEVICE=pdfwrite', '-dPDFSETTINGS=/prepress', '-'),
        stdin=input_file,
        # shell=True # Using shell produces the same error.
    )
    fakeFile2 = BytesIO()
    fakeFile2.write(ps2)
    inputpdf = PdfFileReader(fakeFile2)
    print inputpdf
except:
    traceback.print_exc()
Output:
**** The file was produced by:
**** >>>> KONICA MINOLTA bizhub 421 <<<<
<PyPDF2.pdf.PdfFileReader object at 0x101d1d550>
---------
**** The file was produced by:
**** >>>> KONICA MINOLTA bizhub 421 <<<<
Traceback (most recent call last):
File "pdf_file_reader_test2.py", line 34, in <module>
inputpdf = PdfFileReader(fakeFile2)
File "/Library/Python/2.7/site-packages/PyPDF2/pdf.py", line 1065, in __init__
self.read(stream)
File "/Library/Python/2.7/site-packages/PyPDF2/pdf.py", line 1774, in read
idnum, generation = self.readObjectHeader(stream)
File "/Library/Python/2.7/site-packages/PyPDF2/pdf.py", line 1638, in readObjectHeader
return int(idnum), int(generation)
ValueError: invalid literal for int() with base 10: "7-8138-11f1-0000-59be60c931e0'"
Turns out, this has nothing to do with Python; it's a Ghostscript issue. As pointed out in this post: Prevent Ghostscript from writing errors to standard output, Ghostscript writes errors and other messages to stdout, which corrupts files that are piped out.
Thanks to @Jean-François Fabre, who suggested I look inside the binary files.
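For reference, one commonly suggested mitigation is to run Ghostscript quietly and redirect its PostScript-level stdout to stderr so that messages don't end up in the piped PDF; a sketch of the second call with those flags (check them against your Ghostscript version and the linked post before relying on this):
ps2 = subprocess.check_output(
    ('gs', '-q', '-sstdout=%stderr', '-o', '-', '-sDEVICE=pdfwrite', '-dPDFSETTINGS=/prepress', '-'),
    stdin=input_file,
)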

Send file contents over ftp python

I have this Python Script
import os
import random
import ftplib
from tkinter import Tk
# now, we will grab all Windows clipboard data, and put to var
clipboard = Tk().clipboard_get()
# print(clipboard)
# this feature will only work if a string is in the clipboard. not files.
# so if "hello, world" is copied to the clipboard, then it would work. however, if the target has copied a file or something
# then it would come back an error, and the rest of the script would come back false (therefore shutdown)
random_num = random.randrange(100, 1000, 2)
random_num_2 = random.randrange(1, 9999, 5)
filename = "capture_clip" + str(random_num) + str(random_num_2) + ".txt"
file = open(filename, 'w') # clears file, or create if not exist
file.write(clipboard) # write all contents of var "clipboard" to file
file.close() # close file after printing
# let's send this file over ftp
session = ftplib.FTP('ftp.example.com','ftp_user','ftp_password')
session.cwd('//logs//') # move to correct directory
f = open(filename, 'r')
session.storbinary('STOR ' + filename, f)
f.close()
session.quit()
The script sends the file it creates (under the variable filename, e.g. "capture_clip5704061.txt") to my FTP server, but the contents of the file on the local system do not match the file on the FTP server. As you can see, I use the ftplib module. Here is my error:
Traceback (most recent call last):
File "script.py", line 33, in<module>
session.storbinary('STOR ' + filename, f)
File "C:\Users\willi\AppData\Local\Programs\Python\Python36\lib\ftplib.py", line 507, in storbinary
conn.sendall(buf)
TypeError: a bytes-like object is required, not 'str'
It appears the library expects the file to be opened in binary mode. Try the following:
f = open(filename, 'rb')
This ensures that the data read from the file is a bytes object rather than str (for text).
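Putting it together, a short sketch of the upload with a context manager (session and filename are those from the question), so the file is closed even if the transfer fails:
with open(filename, 'rb') as f:
    session.storbinary('STOR ' + filename, f)
session.quit()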
