Python stdout and stdout.buffer capture

Hey, I want to prevent any output to stdout from anywhere and capture it in a variable instead.
Now the problem:
I have two methods of printing something to stdout.
Method 1:
print("Normal Print")
Method 2:
fds: List[BinaryIO] = [sys.stdin.buffer, sys.stdout.buffer, sys.stderr.buffer]
fds[1].write(b"%d\n" % a)
fds[1].flush()
Now I tried something like this:
from io import BufferedWriter, BytesIO, StringIO
mystdout = StringIO()
mystdout.buffer = BufferedWriter(raw=BytesIO())
sys.stdout = mystdout
But with this I get no output at all.
What is the best way to achieve this?

What do you mean, you get no output at all? It's in the variable:
mystdout = StringIO()
mystdout.buffer = BufferedRandom(raw=BytesIO()) # You can read from BufferedRandom
sys.stdout = mystdout
sys.stdout.buffer.write(b"BUFFER")
print("PRINT")
sys.stdout = sys.__stdout__ # Restore original stdout
print(mystdout.getvalue()) # PRINT
mystdout.buffer.seek(0)
print(mystdout.buffer.read())  # b"BUFFER"
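A minimal sketch (not from the answer above) that wraps the same idea in a context manager, so sys.stdout is always restored even if the captured code raises:

import sys
from contextlib import contextmanager
from io import BufferedRandom, BytesIO, StringIO

@contextmanager
def capture_stdout():
    fake = StringIO()
    fake.buffer = BufferedRandom(raw=BytesIO())  # catches sys.stdout.buffer writes too
    old = sys.stdout
    sys.stdout = fake
    try:
        yield fake
    finally:
        sys.stdout = old  # always restored, even on exceptions

with capture_stdout() as out:
    print("PRINT")
    sys.stdout.buffer.write(b"BUFFER\n")

text = out.getvalue()      # "PRINT\n"
out.buffer.seek(0)
raw = out.buffer.read()    # b"BUFFER\n"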

Related

Is there a way to get python to output to the shell and simultaneously output to str in a variable?

The only ways I can find in Python to redirect console output to a string variable also seem to turn off live display of that output in the console, the way Python would normally show it.
I am currently changing sys.stdout, but again, it seems to be one or the other.
If I redefine it, remote error checking works perfectly: in the event of an exception I can save the output variable to a cloud-based spreadsheet, which sends me notifications wherever I am.
However, redefining it means I don't get to come and locally check on the output of the program while it is "running smoothly".
EDIT: some of your answers have helped me refine my question. Here is a fresh rewording:
What is the best way to concisely store and record outputs with a single variable as the elements are printed to the console without overwriting sys.stdout?
old_stdout = sys.stdout
new_stdout = io.StringIO()
sys.stdout = new_stdout

def update_error_log_ss(traceback_, summary, output=""):
    print("Connecting to Smart Sheet to Update Error Log")
    token = 'token'
    error_log_sheetid = 00000000
    ss_client = ss.Smartsheet(token)
    now = datetime.datetime.now()
    sheet = ss_client.Sheets.get_sheet(error_log_sheetid)
    colid = []
    for col in sheet.columns:
        colid.append(col.id)
        print(str(col.id) + " - " + col.title)
    totalcols = len(colid)
    row_add = ss.models.Row()
    row_add.to_top = True
    row_add.cells.append({
        'column_id': colid[0],
        'value': str(now)
    })
    row_add.cells.append({
        'column_id': colid[1],
        'value': summary
    })
    row_add.cells.append({
        'column_id': colid[2],
        'value': traceback_
    })
    row_add.cells.append({
        'column_id': colid[3],
        'value': output
    })
    response = ss_client.Sheets.add_rows(
        error_log_sheetid,
        [row_add]
    )
    return

except Exception:
    if debug_ == False:
        synch.update_error_log_ss(traceback.format_exc(), 'initialization failure', new_stdout.getvalue())
        main_()
    else:
        synch.update_error_log_ss(traceback.format_exc(), 'initialization failure')
        main_()
The only issue I have with the above solution is that, in order for new_stdout.getvalue() to be defined, sys.stdout has to be overwritten.
According to the print docs:
The file argument must be an object with a write(string) method
So you can just create one for your needs and assign it to sys.stdout, which is the default file argument for print:
import io
import sys

class my_stdout:
    def __init__(self):
        self.real_stdout = sys.stdout
        self.new_stdout = io.StringIO()

    def write(self, s):
        self.real_stdout.write(s)
        self.new_stdout.write(s)

    def getvalue(self):
        return self.new_stdout.getvalue()

new_stdout = my_stdout()
sys.stdout = new_stdout

print("Original: Hello world!")

# for debug:
sys.stdout = new_stdout.real_stdout
print("From stringio:", new_stdout.getvalue())
Output:
Original: Hello world!
From stringio: Original: Hello world!
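One caveat: print(..., flush=True) and some libraries call sys.stdout.flush(), which this class does not define and would raise an AttributeError. A small forwarding method on my_stdout (a sketch, not part of the answer above) avoids that:

    def flush(self):
        self.real_stdout.flush()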
Maybe you want this:
def get(x):
    print(x)
    return x

myvar = get('myvalue')
print(myvar)
Output:
myvalue
myvalue
So basically just write your own function to print the value and return it to the variable.

Python write both commands and their output to a file

Is there any way to write both commands and their output to an external file?
Let's say I have a script outtest.py:
import random
from statistics import median, mean
d = [random.random()**2 for _ in range(1000)]
d_mean = round(mean(d), 2)
print(f'mean: {d_mean}')
d_median = round(median(d), 2)
print(f'median: {d_median}')
Now, if I want to capture its output only, I know I can just do:
python3 outtest.py > outtest.txt
However, this will only give me an outtest.txt file with e.g.:
mean: 0.34
median: 0.27
What I'm looking for is a way to get an output file like:
import random
from statistics import median, mean
d = [random.random()**2 for _ in range(1000)]
d_mean = round(mean(d), 2)
print(f'mean: {d_mean}')
>> mean: 0.34
d_median = round(median(d), 2)
print(f'median: {d_median}')
>> median: 0.27
Or some other format (markdown, whatever). Essentially, something like a Jupyter notebook or R Markdown, but using standard .py files.
Is there any easy way to achieve this?
Here's a script I just wrote which quite comprehensively captures printed output and prints it alongside the code, no matter how it's printed or how much is printed in one go. It uses the ast module to parse the Python source, executes the program one statement at a time (kind of as if it were fed to the REPL), then prints the output from each statement. Python 3.6+ (but easily modified for e.g. Python 2.x):
import ast
import sys

if len(sys.argv) < 2:
    print(f"Usage: {sys.argv[0]} <script.py> [args...]")
    exit(1)

# Replace stdout so we can mix program output and source code cleanly
real_stdout = sys.stdout

class FakeStdout:
    ''' A replacement for stdout that prefixes # to every line of output, so it can be mixed with code. '''
    def __init__(self, file):
        self.file = file
        self.curline = ''

    def _writerow(self, row):
        self.file.write('# ')
        self.file.write(row)
        self.file.write('\n')

    def write(self, text):
        if not text:
            return
        rows = text.split('\n')
        self.curline += rows.pop(0)
        if not rows:
            return
        for row in rows:
            self._writerow(self.curline)
            self.curline = row

    def flush(self):
        if self.curline:
            self._writerow(self.curline)
            self.curline = ''

sys.stdout = FakeStdout(real_stdout)

class EndLineFinder(ast.NodeVisitor):
    ''' This class functions as a replacement for the somewhat unreliable end_lineno attribute.
        It simply finds the largest line number among all child nodes. '''
    def __init__(self):
        self.max_lineno = 0

    def generic_visit(self, node):
        if hasattr(node, 'lineno'):
            self.max_lineno = max(self.max_lineno, node.lineno)
        ast.NodeVisitor.generic_visit(self, node)

# Pretend the script was called directly
del sys.argv[0]

# We'll walk each statement of the file and execute it separately.
# This way, we can place the output for each statement right after the statement itself.
filename = sys.argv[0]
source = open(filename, 'r').read()
lines = source.split('\n')
module = ast.parse(source, filename)
env = {'__name__': '__main__'}
prevline = 0
endfinder = EndLineFinder()

for stmt in module.body:
    # note: end_lineno will be 1-indexed (but it's always used as an endpoint, so no off-by-one errors here)
    endfinder.visit(stmt)
    end_lineno = endfinder.max_lineno
    for line in range(prevline, end_lineno):
        print(lines[line], file=real_stdout)
    prevline = end_lineno
    # run a one-line "module" containing only this statement
    exec(compile(ast.Module([stmt]), filename, 'exec'), env)  # on Python 3.8+, ast.Module also needs type_ignores=[]
    # flush any incomplete output (FakeStdout is "line-buffered")
    sys.stdout.flush()
Here's a test script:
print(3); print(4)
print(5)
if 1:
    print(6)
x = 3
for i in range(6):
    print(x + i)
import sys
sys.stdout.write('I love Python')
import pprint
pprint.pprint({'a': 'b', 'c': 'd'}, width=5)
and the result:
print(3); print(4)
# 3
# 4
print(5)
# 5
if 1:
    print(6)
# 6
x = 3
for i in range(6):
    print(x + i)
# 3
# 4
# 5
# 6
# 7
# 8
import sys
sys.stdout.write('I love Python')
# I love Python
import pprint
pprint.pprint({'a': 'b', 'c': 'd'}, width=5)
# {'a': 'b',
#  'c': 'd'}
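Saved as, say, annotate.py (the file name here is just for illustration), the script can be pointed at the question's outtest.py with its stdout redirected to produce the annotated file:

python3 annotate.py outtest.py > outtest_annotated.py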
You can call the script and inspect its output. You'll have to make some assumptions, though: the output comes only from stdout, only lines containing the string "print" produce output, and each print produces exactly one line of output. That being the case, an example command to run it:
> python writer.py script.py
And the script would look like this:
from sys import argv
from subprocess import run, PIPE

script = argv[1]
r = run(['python', script], stdout=PIPE)  # a list of arguments works without a shell on any platform
out = r.stdout.decode().split('\n')

with open(script, 'r') as f:
    lines = f.readlines()

with open('out.txt', 'w') as f:
    i = 0
    for line in lines:
        f.write(line)
        if 'print' in line:
            f.write('>>> ' + out[i] + '\n')  # out[i] has no trailing newline after split
            i += 1
And the output:
import random
from statistics import median, mean
d = [random.random()**2 for _ in range(1000)]
d_mean = round(mean(d), 2)
print(f'mean: {d_mean}')
>>> mean: 0.33
d_median = round(median(d), 2)
print(f'median: {d_median}')
>>> median: 0.24
For more complex cases with multiline output, or with other statements producing output, this won't work. I guess it would require some in-depth introspection.
My answer might be similar to @Felix's answer, but I'm trying to do it in a more Pythonic way.
Just grab the source code (mycode), execute it while capturing the results, and finally write both back to the output file.
Remember that I've used exec to demonstrate the solution; you shouldn't use it in a production environment.
I also used this answer to capture stdout from exec.
import mycode
import inspect
import sys
from io import StringIO
import contextlib

source_code = inspect.getsource(mycode)
output_file = "output.py"

@contextlib.contextmanager
def stdoutIO(stdout=None):
    old = sys.stdout
    if stdout is None:
        stdout = StringIO()
    sys.stdout = stdout
    yield stdout
    sys.stdout = old

# execute code
with stdoutIO() as s:
    exec(source_code)

# capture stdout
stdout = s.getvalue().splitlines()[::-1]

# write to file
with open(output_file, "w") as f:
    for line in source_code.splitlines():
        f.write(line)
        f.write('\n')
        if 'print' in line:
            f.write(">> {}".format(stdout.pop()))
            f.write('\n')
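As a side note, on Python 3.4+ the hand-rolled stdoutIO() helper could be replaced with contextlib.redirect_stdout; a rough sketch of the equivalent capture step:

from contextlib import redirect_stdout
from io import StringIO

captured = StringIO()
with redirect_stdout(captured):
    exec(source_code)
stdout = captured.getvalue().splitlines()[::-1]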
You can assign the result to a variable before printing it, and later write those values to a file.
Here is an example:
import random
from statistics import median, mean
d = [random.random()**2 for _ in range(1000)]
d_mean = round(mean(d), 2)
foo = f'mean: {d_mean}'
print(foo)
d_median = round(median(d), 2)
bar = f'median: {d_median}'
print(bar)
# save to file
with open("outtest.txt", "a+") as file:
file.write(foo + "\n")
file.write(bar + "\n")
file.close()
argument "a+" mean open file and append content to file instead of overwrite if file was exist.

How to fix mock_open differences in calls but not in end result

Using mock_open, I can capture the data from writes using the with [...] as construct. However, testing that what I have is correct is a little tricky. For example, I can do this:
>>> from mock import mock_open
>>> m = mock_open()
>>> with patch('__main__.open', m, create=True):
...     with open('foo', 'w') as h:
...         h.write('some stuff')
...
>>> m.mock_calls
[call('foo', 'w'),
 call().__enter__(),
 call().write('some stuff'),
 call().__exit__(None, None, None)]
>>> m.assert_called_once_with('foo', 'w')
>>> handle = m()
>>> handle.write.assert_called_once_with('some stuff')
But I want to compare what I think should have been written with what actually was. In effect, something like this:
>>> expected = 'some stuff'
>>> assert(expected == m.all_that_was_written)
The problem I am facing with call is that different versions of json (2.0.9 vs 1.9) seem to print things differently. No, I cannot just update to the latest json.
The actual error I am getting is this:
E AssertionError: [call('Tool_000.json', 'w'),
call().__enter__(),
call().write('['),
call().write('\n '),
call().write('"1.0.0"'),
call().write(', \n '),
call().write('"2014-02-27 08:58:02"'),
call().write(', \n '),
call().write('"ook"'),
call().write('\n'),
call().write(']'),
call().__exit__(None, None, None)]
!=
[call('Tool_000.json', 'w'),
call().__enter__(),
call().write('[\n "1.0.0"'),
call().write(', \n "2014-02-27 08:58:02"'),
call().write(', \n "ook"'),
call().write('\n'),
call().write(']'),
call().__exit__(None, None, None)]
In effect, the calls are different but the end result is the same.
The code I am testing is fairly simple:
with open(get_new_file_name(), 'w') as fp:
    json.dump(lst, fp)
So, creating another method that passes the file pointer seems overkill.
You can patch open() to return a StringIO object and then check the contents.
with mock.patch('module_under_test.open', create=True) as mock_open:
    stream = io.StringIO()
    # patching to make getvalue() work after close() or __exit__()
    stream.close = mock.Mock(return_value=None)
    mock_open.return_value = stream
    module_under_test.do_something()  # this calls open()
    contents = stream.getvalue()
    assert(contents == expected)
Edit: added patch for stream.close to avoid exception on stream.getvalue().
mock_open is not fully featured yet. It works well if you are mocking files to be read but it does not yet have enough features for testing written files. The question clearly shows this deficiency.
My solution is to not use mock_open if you are testing the written content. Here is the alternative:
import six
import mock
import unittest
class GenTest(unittest.TestCase):
    def test_open_mock(self):
        io = six.BytesIO()
        io_mock = mock.MagicMock(wraps=io)
        io_mock.__enter__.return_value = io_mock
        io_mock.close = mock.Mock()  # optional
        with mock.patch.object(six.moves.builtins, 'open', create=True, return_value=io_mock):
            # test using with
            with open('foo', 'w') as h:
                expected = 'some stuff'
                h.write(expected)
            self.assertEquals(expected, io.getvalue())

            # test using file handle directly
            io.seek(0); io.truncate()  # reset io
            expected = 'other stuff'
            open('bar', 'w').write(expected)
            self.assertEquals(expected, io.getvalue())

            # test getvalue after close
            io.seek(0); io.truncate()  # reset io
            expected = 'closing stuff'
            f = open('baz', 'w')
            f.write(expected)
            f.close()
            self.assertEquals(expected, io.getvalue())

if __name__ == '__main__':
    unittest.main()
Here's what I would do: write a method that returns the complete string from all calls to the write method.
class FileIOTestCase(unittest.TestCase):
    """ For testing code involving file io operations """

    def setUp(self):
        """ patches the open function with a mock, to be undone after test. """
        self.mo = mock_open()
        patcher = patch("builtins.open", self.mo)
        patcher.start()
        self.addCleanup(patcher.stop)

    def get_written_string(self):
        return ''.join(c[0][0] for c in self.mo.return_value.write.call_args_list)
An example of how to use it:
class TestWriteFile(FileIOTestCase):
    def test_write_file__csv(self):
        save.write_file("a,b\n1,2", "directory", "C6L")
        self.mo.assert_called_once_with(os.path.join("directory", "C6L.csv"), 'w')
        self.assertEqual(self.get_written_string(), "a,b\n1,2")
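Applied to the json.dump code from the question, a sketch of how a test could look with this helper (the file name and data below are illustrative; comparing parsed JSON rather than individual write calls sidesteps the formatting differences between json versions):

import json

class TestToolDump(FileIOTestCase):
    def test_dump_written_content(self):
        data = ["1.0.0", "2014-02-27 08:58:02", "ook"]
        with open('Tool_000.json', 'w') as fp:  # open() is patched by FileIOTestCase.setUp
            json.dump(data, fp, indent=1)
        self.mo.assert_called_once_with('Tool_000.json', 'w')
        self.assertEqual(json.loads(self.get_written_string()), data)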

Creating a set from a variable instead of a file

I have a piece of code to read two files, convert them to sets, and then subtract one set from the other. I would like to use a string variable (installedPackages) for "a" instead of a file. I would also like to write to a variable for "c".
a = open("/home/user/packages1.txt")
b = open("/home/user/packages.txt")
c = open("/home/user/unique.txt", "w")
for line in set(a) - set(b):
    c.write(line)
a.close()
b.close()
c.close()
I have tried the following and it does not work:
for line in set(installedPackages) - set(b):
I have tried to use StringIO, but I think I am using it improperly.
Here, finally, is how I have created installedPackages:
stdout, stderr = p.communicate()
installedPackages = re.sub('\n$', '', re.sub('install$', '', re.sub('\t', '', stdout), 0, re.MULTILINE))
Sample of packages.txt:
humanity-icon-theme
hunspell-en-us
hwdata
hyphen-en-us
ibus
ibus-gtk
ibus-gtk3
ibus-pinyin
ibus-pinyin-db-android
ibus-table
If you want to write to a file-like string buffer, use StringIO:
>>> from StringIO import StringIO
>>> installed_packages = StringIO()
>>> installed_packages.write('test')
>>> installed_packages.getvalue()
'test'
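On Python 3 the same class lives in the io module (from io import StringIO); the usage is otherwise identical.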
Something like the following?
Edit: after several iterations:
from subprocess import Popen, PIPE

DEBUG = True

if DEBUG:
    def log(msg, data):
        print(msg)
        print(repr(data))
else:
    def log(msg, data):
        pass

def setFromFile(fname):
    with open(fname) as inf:
        return set(ln.strip() for ln in inf)

def setFromString(s):
    return set(ln.strip() for ln in s.split("\n"))

def main():
    # get list of installed packages
    p = Popen(['dpkg', '--get-selections'], stdout=PIPE, stderr=PIPE)
    stdout, stderr = p.communicate()
    installed_packages = setFromString(stdout)

    # get list of expected packages
    known_packages = setFromFile('/home/john/packages.txt')

    # calculate the difference
    unknown_packages = installed_packages - known_packages
    unknown_packages_string = "\n".join(unknown_packages)

    log("Installed packages:", installed_packages)
    log("Known packages:", known_packages)
    log("Unknown packages:", unknown_packages)

if __name__ == "__main__":
    main()
The set data type takes an iterable as a parameter; therefore, if installedPackages is a string with multiple items, you need to split it by the delimiter. For example, the following code would split the string on commas:
for line in set(installedPackages.split(',')) - set(b):
    c.write(line)
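For completeness, a minimal sketch of the whole round trip without any temporary files, assuming installedPackages is newline-separated as in the question (the variable names below are only illustrative):

# build one set from the string variable and one from the file
installed = set(line.strip() for line in installedPackages.splitlines())
with open("/home/user/packages.txt") as f:
    known = set(line.strip() for line in f)

# "write to a variable for c": join the difference into a single string
unique = "\n".join(sorted(installed - known))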

Pytesser inaccurate

Simple question. When I run this image through pytesser, I get $+s. How can I fix that?
EDIT
So... my code generates images similar to the image linked above, just with different numbers, and is supposed to solve the simple math problem, which is obviously impossible if all I can get out of the picture is $+s.
Here's the code I'm currently using:
import time
from PIL import ImageGrab  # assuming Pillow/PIL provides ImageGrab here
from pytesser import *

time.sleep(2)
i = 0
operator = "+"
while i < 100:
    time.sleep(.1)
    img = ImageGrab.grab((349, 197, 349 + 452, 197 + 180))
    equation = image_to_string(img)
Then I'm going to go on to parse equation... as soon as I get pytesser working.
Try my little function. I'm running tesseract from the svn repo, so my results might be more accurate.
I'm on Linux, so on Windows, I'd imagine that you'll have to replace tesseract with tesseract.exe to make it work.
import tempfile, subprocess

def ocr(image):
    tempFile = tempfile.NamedTemporaryFile(delete=False)
    process = subprocess.Popen(['tesseract', image, tempFile.name], stdout=subprocess.PIPE, stdin=subprocess.PIPE, stderr=subprocess.STDOUT)
    process.communicate()
    handle = open(tempFile.name + '.txt', 'r').read()
    return handle
And a sample Python session:
>>> import tempfile, subprocess
>>> def ocr(image):
...     tempFile = tempfile.NamedTemporaryFile(delete=False)
...     process = subprocess.Popen(['tesseract', image, tempFile.name], stdout=subprocess.PIPE, stdin=subprocess.PIPE, stderr=subprocess.STDOUT)
...     process.communicate()
...     handle = open(tempFile.name + '.txt', 'r').read()
...     return handle
...
>>> print ocr('326_fail.jpg')
0+1
If you're on Linux, gocr is more accurate. You can use it through:
os.system("/usr/bin/gocr %s" % sample_image)
and use readlines on its stdout to manipulate the result into whatever you want (e.g. capturing gocr's output in a specific variable).
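Note that os.system only returns the exit status, so to actually capture gocr's text in a variable a pipe is needed; a minimal sketch, assuming gocr is installed at /usr/bin/gocr and sample_image holds the image path:

import subprocess

p = subprocess.Popen(["/usr/bin/gocr", sample_image], stdout=subprocess.PIPE)
out, _ = p.communicate()
lines = out.decode().splitlines()  # one recognized text line per entry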
