gnupg - decrypt into Python BytesIO stream

How can I select a stream as the output of a decrypt_file operation in gnupg?
The docs and the code seem to suggest this is not possible. If I am correct (see below), what workarounds are possible?
~~~
The documentation seems to suggest it is not possible:
decrypt_file(filename, always_trust=False, passphrase=None, output=None)
with "output (str) – A filename to write the decrypted output to."
~~~
Opening up the code, I see:
def decrypt_file(self, file, always_trust=False, passphrase=None,
                 output=None, extra_args=None):
    args = ["--decrypt"]
    if output:  # write the output to a file with the specified name
        self.set_output_without_confirmation(args, output)
    if always_trust:  # pragma: no cover
        args.append("--always-trust")
    if extra_args:
        args.extend(extra_args)
    result = self.result_map['crypt'](self)
    self._handle_io(args, file, result, passphrase, binary=True)
    logger.debug('decrypt result: %r', result.data)
    return result
which points to set_output_without_confirmation, confirming that the idea is to pass a string filename:
def set_output_without_confirmation(self, args, output):
    "If writing to a file which exists, avoid a confirmation message."
    if os.path.exists(output):
        # We need to avoid an overwrite confirmation message
        args.extend(['--yes'])
    args.extend(['--output', no_quote(output)])

To output the decrypted data to a variable, use decrypt instead of decrypt_file, as shown in the "Decrypt a string" paragraph of the documentation.
So the original code:
status = gpg.decrypt_file(input_file, passphrase='my_passphrase', output='my_output_file')
is replaced by:
import io

decrypted_data = gpg.decrypt(input_file.read(), passphrase='my_passphrase')
# decrypted_data.data contains the decrypted bytes
decrypted_stream = io.BytesIO(decrypted_data.data)
# io.BytesIO works on Python 3 and on Python 2.6+ alike
As an example, for the specific use case of CSV data, and building on this SO post, you could then do:
my_df = pandas.read_csv(decrypted_stream)
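If you do need decrypt_file itself, a possible workaround (an untested sketch of mine; the input filename is hypothetical) is to let it write to a temporary path and then read that path back into a BytesIO:
import io
import os
import tempfile

import gnupg

gpg = gnupg.GPG()

fd, tmp_path = tempfile.mkstemp()
os.close(fd)
try:
    with open('my_input_file.gpg', 'rb') as encrypted:
        # decrypt_file only accepts a filename for output, so write there...
        status = gpg.decrypt_file(encrypted, passphrase='my_passphrase',
                                  output=tmp_path)
    # ...then pull the plaintext back into an in-memory stream
    with open(tmp_path, 'rb') as decrypted:
        decrypted_stream = io.BytesIO(decrypted.read())
finally:
    os.remove(tmp_path)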

Related

S3 InvalidDigest when calling the PutObject operation [duplicate]

I have tried to upload an XML file to S3 using boto3. As recommended by Amazon, I would like to send a base64-encoded MD5-128 bit digest (Content-MD5) of the data.
https://docs.aws.amazon.com/AmazonS3/latest/API/RESTObjectPUT.html
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/s3.html#S3.Object.put
My Code:
with open(file, 'rb') as tempfile:
    body = tempfile.read()
hash_object = hashlib.md5(body)
base64_md5 = base64.encodebytes(hash_object.digest())
response = s3.Object(self.bucket, self.key + file).put(
    Body=body.decode(self.encoding),
    ACL='private',
    Metadata=metadata,
    ContentType=self.content_type,
    ContentEncoding=self.encoding,
    ContentMD5=str(base64_md5)
)
When I try this, str(base64_md5) creates a string like 'b'ZpL06Osuws3qFQJ8ktdBOw==\n''.
In this case, I get this error message:
An error occurred (InvalidDigest) when calling the PutObject operation: The Content-MD5 you specified was invalid.
For test purposes I copied only the value without the b prefix: 'ZpL06Osuws3qFQJ8ktdBOw==\n'
Then I get this error message:
botocore.exceptions.HTTPClientError: An HTTP Client raised and unhandled exception: Invalid header value b'hvUe19qHj7rMbwOWVPEv6Q==\n'
Can anyone tell me how to safely upload a file to S3?
Thanks,
Oliver
Starting with @Isaac Fife's example, stripping it down to identify what's required vs not, and adding imports and such to make it a fully reproducible example:
(the only change you need to make is to use your own bucket name)
import base64
import hashlib

import boto3

contents = "hello world!"
md = hashlib.md5(contents.encode('utf-8')).digest()
contents_md5 = base64.b64encode(md).decode('utf-8')

boto3.client('s3').put_object(
    Bucket="mybucket",
    Key="test",
    Body=contents,
    ContentMD5=contents_md5
)
Learnings: first, the MD5 you need will NOT look like what hexdigest() returns. S3 wants a base64 encoding of the raw 16-byte digest, while md.hexdigest() gives base16 (hex), which is not base64.
(Python 3.7)
Took me hours to figure this out because the only error you get is "The Content-MD5 you specified was invalid." Super useful for debugging... Anyway, here is the code I used to actually get the file to upload correctly before refactoring.
from botocore.client import Config
import boto3

# json_converter and md5 are the author's own helper modules
json_results = json_converter.convert_to_json(result)
json_results_utf8 = json_results.encode('utf-8')

content_md5 = md5.get_content_md5(json_results_utf8)
content_md5_string = content_md5.decode('utf-8')

metadata = {
    "md5chksum": content_md5_string
}

s3 = boto3.resource('s3', config=Config(signature_version='s3v4'))
obj = s3.Object(bucket, 'filename.json')

obj.put(
    Body=json_results_utf8,
    ContentMD5=content_md5_string,
    ServerSideEncryption='aws:kms',
    Metadata=metadata,
    SSEKMSKeyId=key_id)
and the hashing
import base64
import hashlib

def get_content_md5(data):
    digest = hashlib.md5(data).digest()
    return base64.b64encode(digest)
The hard part for me was figuring out what encoding you need at each step in the process, not being very familiar at the time with how strings are stored in Python.
get_content_md5 takes a UTF-8 bytes-like object only, and returns the same. But to pass the MD5 hash to AWS, it needs to be a string. You have to decode it before you give it to ContentMD5.
Pro-tip - Body, on the other hand, needs to be given bytes or a seekable object. Make sure, if you pass a seekable object, that you seek(0) to the beginning of the file before you pass it to AWS, or the MD5 will not match. For that reason, using bytes is less error-prone, imo.
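To illustrate that last point, here is a minimal sketch (bucket and key names are made up): compute the digest from the file, then rewind before handing the handle to boto3, so the bytes S3 reads match the MD5 you declared.
import base64
import hashlib

import boto3

with open('report.json', 'rb') as f:
    content_md5 = base64.b64encode(hashlib.md5(f.read()).digest()).decode('utf-8')
    f.seek(0)  # rewind, or S3 reads zero bytes and the MD5 will not match
    boto3.client('s3').put_object(
        Bucket='mybucket',
        Key='report.json',
        Body=f,
        ContentMD5=content_md5,
    )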

Read header of .blf CAN file

I have a .blf file containing multiple CAN bus messages, which I can read using python-can like so:
import can

can_log = can.BLFReader("./test.blf")
for msg in can_log:
    print(msg)
According to the python-can docs, the header of a standard .blf file is 144 bytes and contains the start and end timestamps of the whole record.
I would like to directly read this start and end timestamp, is this possible?
I know I could also read the timestamp from the first message using msg.timestamp, but it slightly differs from the start timestamp that I would like to extract.
From the source code of python-can:
[...]
class BLFReader(object):
    """
    Iterator of CAN messages from a Binary Logging File.

    Only CAN messages and error frames are supported. Other object types are
    silently ignored.
    """

    def __init__(self, filename):
        self.fp = open(filename, "rb")
        data = self.fp.read(FILE_HEADER_STRUCT.size)
        header = FILE_HEADER_STRUCT.unpack(data)
        #print(header)
        assert header[0] == b"LOGG", "Unknown file format"
        self.file_size = header[10]
        self.uncompressed_size = header[11]
        self.object_count = header[12]
        self.start_timestamp = systemtime_to_timestamp(header[14:22])
        self.stop_timestamp = systemtime_to_timestamp(header[22:30])
[...]
You can use start_timestamp and stop_timestamp this way:
can_log.start_timestamp
can_log.stop_timestamp
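If you need them as human-readable dates, here is a small sketch (this assumes, as the helper's name systemtime_to_timestamp suggests, that the attributes hold Unix epoch timestamps):
import datetime

import can

can_log = can.BLFReader("./test.blf")
start = datetime.datetime.fromtimestamp(can_log.start_timestamp)
stop = datetime.datetime.fromtimestamp(can_log.stop_timestamp)
print("record spans {} .. {}".format(start, stop))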

When working with a named pipe is there a way to do something like readlines()

Overall goal: I am trying to read some progress data from a Python exe, in order to display that exe's progress in another application.
I have a Python exe that is going to do some stuff, and I want to be able to communicate its progress to another program. Based on several other Q&As here, I have been able to have my running application send progress data to a named pipe using the following code:
import win32pipe
import win32file
import glob

test_files = glob.glob('J:\\someDirectory\\*.htm')
# test_files has two items a.htm and b.htm

p = win32pipe.CreateNamedPipe(r'\\.\pipe\wfsr_pipe',
                              win32pipe.PIPE_ACCESS_DUPLEX,
                              win32pipe.PIPE_TYPE_MESSAGE | win32pipe.PIPE_WAIT,
                              1, 65536, 65536, 300, None)
# the following line is the server-side function for accepting a connection;
# see http://stackoverflow.com/questions/1749001/named-pipes-between-c-sharp-and-python
win32pipe.ConnectNamedPipe(p, None)

for each in test_files:
    win32file.WriteFile(p, each + '\n')
# send final message
win32file.WriteFile(p, 'Process Complete')
# close the connection
p.close()
In short, the example code writes the path of each globbed file to the named pipe - this is useful and can easily be extended to more logging-type events. However, the problem is figuring out how to read the contents of the named pipe without knowing the size of each possible message. For example, the first file could be named J:\someDirectory\a.htm, but the second could have 300 characters in its name.
So far the code I am using to read the contents of the pipe requires that I specify a buffer size
First establish the connection:
file_handle = win32file.CreateFile("\\\\.\\pipe\\wfsr_pipe",
                                   win32file.GENERIC_READ | win32file.GENERIC_WRITE,
                                   0, None,
                                   win32file.OPEN_EXISTING,
                                   0, None)
and then I have been playing around with reading from the file
data = win32file.ReadFile(file_handle,128)
This generally works, but I really want to read until I hit a newline character, do something with that line's content, and then repeat the process until I get to a line containing Process Complete.
I have been struggling with how to read only until I find a newline character (\n). I basically want to read the file by lines and, based on the content of each line, do something (either display the line or shift the application focus).
Based on the suggestion provided by @meuh, I am updating this because I think there is a dearth of examples and guidance on how to use pipes.
My server code
import win32pipe
import win32file
import glob
import os

p = win32pipe.CreateNamedPipe(r'\\.\pipe\wfsr_pipe',
                              win32pipe.PIPE_ACCESS_DUPLEX,
                              win32pipe.PIPE_TYPE_MESSAGE | win32pipe.PIPE_WAIT,
                              1, 65536, 65536, 300, None)
# the following line is the server-side function for accepting a connection;
# see http://stackoverflow.com/questions/1749001/named-pipes-between-c-sharp-and-python
win32pipe.ConnectNamedPipe(p, None)

for file_id in glob.glob('J:\\level1\\level2\\level3\\*'):
    for filer_id in glob.glob(file_id + os.sep + '*'):
        win32file.WriteFile(p, filer_id)
# send final message
win32file.WriteFile(p, 'Process Complete')
# close the connection
p.close()  # still not sure if this should be here, I need more testing;
           # I think the client can close p
The client code:
import win32pipe
import win32file

file_handle = win32file.CreateFile("\\\\.\\pipe\\wfsr_pipe",
                                   win32file.GENERIC_READ |
                                   win32file.GENERIC_WRITE,
                                   0, None, win32file.OPEN_EXISTING, 0, None)
# this is the key: setting the read mode to MESSAGE
win32pipe.SetNamedPipeHandleState(file_handle,
                                  win32pipe.PIPE_READMODE_MESSAGE, None, None)

# for testing purposes I am just going to write the messages to a file
out_ref = open('e:\\testpipe.txt', 'w')
dstring = ''  # need some way to know that the messages are complete
while dstring != 'Process Complete':
    # setting the block size at 4096 to make sure it can handle any message I
    # might anticipate
    data = win32file.ReadFile(file_handle, 4096)
    # data is a tuple; the first position seems to always be 0, but I need to
    # find the docs to understand what determines the value; the second is
    # the message
    dstring = data[1]
    out_ref.write(dstring + '\n')

out_ref.close()  # got here, so close my test file
file_handle.close()  # close the file_handle
I don't have Windows, but looking through the API it seems you should convert your client to message mode by adding, after the CreateFile() call:
win32pipe.SetNamedPipeHandleState(file_handle,
                                  win32pipe.PIPE_READMODE_MESSAGE, None, None)
Then each sufficiently long read will return a single message, i.e. what the other end wrote in a single write. You already set PIPE_TYPE_MESSAGE when you created the pipe.
You could simply use an implementation of io.RawIOBase that wraps the named pipe:
import io

import win32file

class PipeIO(io.RawIOBase):
    def __init__(self, handle):
        self.handle = handle

    def readable(self):
        # needed so io.BufferedReader and friends accept this stream
        return True

    def read(self, n):
        if n == 0:
            return b""
        elif n == -1:
            return self.readall()
        # win32file.ReadFile returns an (error_code, data) tuple
        _, data = win32file.ReadFile(self.handle, n)
        return data

    def readinto(self, b):
        data = self.read(len(b))
        b[:len(data)] = data
        return len(data)

    def readall(self):
        data = b""
        while True:
            _, chunk = win32file.ReadFile(self.handle, 10240)
            if len(chunk) == 0:
                return data
            data += chunk
BEWARE: untested, but it should work after fixing any remaining typos.
You could then do:
with PipeIO(file_handle) as fd:
    for line in fd:
        # process the line
        pass
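One usage note (an addition of mine, not from the original answer): RawIOBase's default readline() pulls a byte at a time, so wrapping the stream in io.BufferedReader should give you more efficient readline()/readlines():
import io

reader = io.BufferedReader(PipeIO(file_handle))
for raw_line in reader:
    line = raw_line.decode().rstrip('\n')
    if line == 'Process Complete':
        break
    print(line)  # or update a progress bar, shift application focus, etc.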
You could use the msvcrt module and open to turn the pipe into a file object.
Sending code
import win32pipe
import os
import msvcrt
from io import open
pipe = win32pipe.CreateNamedPipe(r'\\.\pipe\wfsr_pipe',
                                 win32pipe.PIPE_ACCESS_OUTBOUND,
                                 win32pipe.PIPE_TYPE_MESSAGE | win32pipe.PIPE_WAIT,
                                 1, 65536, 65536, 300, None)
# wait for another process to connect
win32pipe.ConnectNamedPipe(pipe, None)
# get a file descriptor to write to
write_fd = msvcrt.open_osfhandle(pipe, os.O_WRONLY)
with open(write_fd, "w") as writer:
    # now we have a file object that we can write to in a standard way
    for i in range(0, 10):
        # create "a\n" in the first iteration, "bb\n" in the second and so on
        text = chr(ord("a") + i) * (i + 1) + "\n"
        writer.write(text)
Receiving code
import win32file
import os
import msvcrt
from io import open
handle = win32file.CreateFile(r"\\.\pipe\wfsr_pipe",
                              win32file.GENERIC_READ,
                              0, None,
                              win32file.OPEN_EXISTING,
                              0, None)
read_fd = msvcrt.open_osfhandle(handle, os.O_RDONLY)
with open(read_fd, "r") as reader:
    # now we have a file object with readlines and the other file API methods
    lines = reader.readlines()
    print(lines)
Some notes:
- I've only tested this with Python 3.4, but I believe you may be using Python 2.x.
- Python seems to get weird if you try to close both the file object and the pipe, so I've only used the file object (via the with block).
- I've only created the file objects to read on one end and write on the other. You can of course make the file objects duplex (see the sketch below) by:
  - creating the file descriptors (read_fd and write_fd) with the os.O_RDWR flag
  - creating the file objects in "r+" mode rather than "r" or "w"
  - going back to creating the pipe with the win32pipe.PIPE_ACCESS_DUPLEX flag
  - going back to creating the file handle object with the win32file.GENERIC_READ | win32file.GENERIC_WRITE flags.
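A hedged sketch of that duplex variant, combining the flags from the list above. One caveat of mine: a buffered text-mode "r+" wrapper may reject a non-seekable pipe, so this sketch opens the descriptor unbuffered in binary mode instead (untested):
import os
import msvcrt
import win32pipe
from io import open

pipe = win32pipe.CreateNamedPipe(r'\\.\pipe\wfsr_pipe',
                                 win32pipe.PIPE_ACCESS_DUPLEX,
                                 win32pipe.PIPE_TYPE_MESSAGE | win32pipe.PIPE_WAIT,
                                 1, 65536, 65536, 300, None)
win32pipe.ConnectNamedPipe(pipe, None)
duplex_fd = msvcrt.open_osfhandle(pipe, os.O_RDWR)
with open(duplex_fd, "r+b", buffering=0) as duplex:
    duplex.write(b"hello\n")
    reply = duplex.readline()  # reads up to and including the next b"\n"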

python gnupg sign, and verify

I am trying to see if I can get the python-gnupg module working to sign and verify a file using a Python script. I have the following code, which does not report any errors when run.
However, the code prints "Unverified" at the end, when I thought that I had signed the file (example.txt).
I must be missing something in the documentation, but after reading it, this is what I came up with for signing and verifying. Any help, please?
import gnupg
gpg = gnupg.GPG(gnupghome="/home/myname")
stream = open("example.txt", "rb")
signed_data = gpg.sign_file(stream)
verified = gpg.verify_file(stream)
print "Verified" if verified else "Unverified"
There are a few issues with your code:
1.) gpg = gnupg.GPG(gnupghome="/home/myname") needs to be gpg = gnupg.GPG(gnupghome="/home/myname/.gnupg")
2.) You are attempting to verify the stream using verify_file(stream); however, the stream is still a handle to the original, unsigned file. You would first need to either write the signed data to a new file and call verify_file() against a handle to that file, or verify the result of sign_file.
Below is a working example of your demo, using the result of sign_file. But before we get to that: the way to troubleshoot what is happening in your script is to review the stderr output on the object returned by the gnupg methods. For example, you can inspect the result of the signing by printing signed_data.stderr; likewise for the return value of the verify_file method.
On to the code -
import gnupg
gpg = gnupg.GPG(gnupghome="/home/myname/.gnupg")
stream = open("example.txt", "rb")
signed_data = gpg.sign_file(stream)
verified = gpg.verify(signed_data.data)
print "Verified" if verified else "Unverified"
I hope this helps!

Getting an empty file when served by django

I have the following code for managing file download through django.
from django.http import HttpResponse
from wsgiref.util import FileWrapper  # in old Django versions: from django.core.servers.basehttp import FileWrapper
import mimetypes

def serve_file(request, id):
    file = models.X.objects.get(id=id).file  # FileField
    file.open('rb')
    wrapper = FileWrapper(file)
    mt = mimetypes.guess_type(file.name)[0]
    response = HttpResponse(wrapper, content_type=mt)
    import unicodedata, os.path
    filename = unicodedata.normalize('NFKD', os.path.basename(file.name)).encode("utf8", 'ignore')
    filename = filename.replace(' ', '-')  # avoid browsers ignoring anything after a space
    response['Content-Length'] = file.size
    response['Content-Disposition'] = 'attachment; filename={0}'.format(filename)
    #print response
    return response
Unfortunately, my browser gets an empty file when downloading.
The printed response seems correct:
Content-Length: 3906
Content-Type: text/plain
Content-Disposition: attachment; filename=toto.txt
blah blah ....
I have similar code running OK elsewhere. I don't see what the problem could be. Any ideas?
PS: I have tested the solution proposed here and got the same behavior.
Update:
Replacing wrapper = FileWrapper(file) by wrapper = file.read() seems to fix the problem
Update: if I comment out the print response line, I get a similar issue: the file is empty. The only difference: FF reports a 20-byte size (the file is bigger than that).
A file object is an iterable and behaves like a generator: it can be read only once before being exhausted. Then you have to make a new one, or use a method to go back to the beginning of the object (e.g. seek()).
read() returns a string, which can be read multiple times without any problem; this is why it solves your issue.
So just make sure that if you use a file-like object, you don't read it twice in a row, e.g. don't print it and then return it.
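A minimal sketch of that fix applied to the view above (my own illustration, not from the original answer): rewind the file before wrapping it, and don't consume the response beforehand.
file.open('rb')
file.seek(0)  # go back to the start in case the file object was already iterated
wrapper = FileWrapper(file)
response = HttpResponse(wrapper, content_type=mt)
# do not print the response here: that would consume the wrapper's iterator
return response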
From the Django documentation:
FieldFile.open(mode='rb')
Behaves like the standard Python open() method and opens the file associated with this instance in the mode specified by mode.
If it works like Python's open, then it should return a file object, and it should be used like this:
f = file.open('rb')
wrapper = FileWrapper(f)
