How to encode and decode docx file using base64 in python - python

I am trying to encode docx file and decode/pass it on frontend/UI in streamlit. As of now i knew how to encode/decode strings using base64 but not with docx file.
If any of you guys have any code on how to achieve it. Please do share here.
import base64
import streamlit as st
data = open('/home/lungsang/Desktop/streamlit-practice/content/A0/A0.02-vocab.docx', 'rb').read()
encoded = base64.b64encode(data)
decoded = base64.b64decode(encoded)
st.download_button('Download Here', decoded)
I used the above code but not getting the desired result.
Instead I got collection of .xml file. As shown in the below screenshot
The supposed decoded document should look like this..
If you guys need the docx file that i am trying to encode/decode, here is the link https://docs.google.com/document/d/10zkg1HLDHhZNh83i2tbJqBVMfIsdqW-3/edit

You need to add filename argument to download_button function
import base64
import streamlit as st
data = open("test.docx", "rb").read()
encoded = base64.b64encode(data)
decoded = base64.b64decode(encoded)
st.download_button('Download Here', decoded, "decoded_file.docx")

This is just encoding You Have to Decode
with open('YOUR DATA FILE', 'rb') as binary_file:
binary_file_data = binary_file.read()
base64_encoded_data = base64.b64encode(binary_file_data)
base64_message = base64_encoded_data.decode('utf-8')
print(base64_message)
open the file using open('Your Data file', 'rb'). Note how we passed the 'rb' argument along with the file path - this tells Python that we are reading a binary file. Without using 'rb', Python would assume we are reading a text file.
use the read() method to get all the data in the file into the binary_file_data variable. Similar to how we treated strings, we Base64 encoded the bytes with base64.b64encode and then used the decode('utf-8') on base64_encoded_data to get the Base64 encoded data using human-readable characters.
Executing the code will produce similar output to:
python3 encoding_binary.py

Related

Issue trying to decode utf-8 encoded json in python

I do have a json file I am trying to read.
The Json includes the following text in the file, which I am trying to decode with the code following this text.
Silikon-Ersatzpl\u00ef\u00bf\u00bdttchen Regenlichtsensor
with open("file_name", encoding = "utf-8") as file:
pdf_labels = json.loads(file.read())
When I try to load it with the json module and specify utf-8 encoding I get some weird results.
"\u00ef\u00bf\u00bd" will become "�" instead of a desired "ä"
The desired ouput should look like the following.
Silikon-Ersatzplättchen Regenlichtsensor
Please don´t be harsh, this is my first question :)

Cannot properly decode Base64 string from Power Apps to audio file

I am trying to properly decode a Base64 string from Power Apps to an audio file. The point is: I do decode it and I can play it. But as soon as I try to convert it using ffmpeg or any website, all kind of errors are thrown. I have tried changing the formats too (aac, weba, m4a, wav, mp3, ogg, 3gp, caf), but none of them could be converted to another format.
PS: If I decode the string (which is too big to post here) directly using a website, then the audio file can finally be converted, indicating that the issue is in the code or even in the Python library.
===============
CODE ===============
import os
import base64
mainDir = os.path.dirname(__file__)
audioFileOGG = os.path.join(mainDir, "myAudio.ogg")
audioFile3GP = os.path.join(mainDir, "myAudio.3gp")
audioFileAAC = os.path.join(mainDir, "myAudio.aac")
binaryFileTXT = os.path.join(mainDir, 'binaryData.txt')
with open(binaryFileTXT, 'rb') as f:
audioData = f.readlines()
audioData = audioData[0]
with open(audioFileAAC, "wb") as f:
f.write(base64.b64decode(audioData))
Result: the audio file is playable, but it cannot be converted to any other format (I need *.wav).
What am I missing here?
I found the issue myself: in order to decode the Base64 string, one must remove the header first (eg.: "data:audio/webm;base64,"). Then it works!

Saving uploaded binary to local file

I'm trying to upload files from a javascript webpage, to a python-based server, with websockets.
In the JS, this is how I'm transmitting the package of data over the websocket:
var json = JSON.stringify({
'name': name,
'iData': image
});
in the python, I'm decoding it like this:
noJson = json.loads(message)
fName = noJson["name"]
fData = noJson["iData"]
I know fData is in unicode format, but when I try to save the file locally is when the problems begin. Say, I'm trying to upload/save a JPG file. Looking at that file after upload I see at the beginning:
ÿØÿà^#^PJFIF
the original code should be:
<FF><D8><FF><E0>^#^PJFIF
So how do I get it to save with the codes, instead of the interpreted unicode characters?
fd = codecs.open( fName, encoding='utf-8', mode='wb' ) ## On Unix, so the 'b' might be ignored
fd.write( fData)
fd.close()
(if I don't use the "encoding=" bit, it throws a UnicodeDecodeError exception)
Use 'latin-1' encoding to save the file.
The fData that you are getting already has the characters encoded, i.e. you get the string u'\xff\xd8\xff\xe0^#^PJFIF'. The latin-1 encoding will literally convert all codepoints between U+00 and U+FF to a single char, and fail to convert any codepoint above U+FF.

Reading a raw RAM data in python

I took the dump of RAM data using a freeware called DumpIt(http://www.downloadcrew.com/article/23854-dumpit). The software saved the RAM data as a raw file which can be read using a Hex Editor(http://www.downloadcrew.com/article/10814-hxd).
How do I get the string data as visible in the Hex Editor(see image) in python?
For eg: I want to get the string "http://www.downloadcrew.com/article/23854-dumpit" in red box in image in python by reading the raw file generated by DumpIt.
EDIT
I tried using this code but it just gets stalled and nothing happens
#!/usr/bin/python
import binascii
filename = "LEMARC-20140401-181003.raw"
g = open("out","w")
str=""
with open(filename,"rb") as f:
for lines in f:
str+=lines
str = binascii.unhexlify(str)
f.close()
g.write(str)
g.close
In Python2
"437c2123".decode('hex')
'C|!#'
In Python3 (also works in Python2, for <2.6 you can't have the b prefixing the string)
import binascii
binascii.unhexlify(b"437c2123")
b'C|!#'
So in your case decode the entire hex string to get the ascii, and then you can extract the url with a regex or your own parsing function

Decode base64 encoded file and print results to the console

So I have a file that I have already encoded with base64, now I want to decode it back but instead of creating another file, I want to decode it in the console and print the results to screen. How to do that?
encoded file string = MUhRRy1ITVRELU0zWDItNlcxSA==
FYI:
this would mean first opening the file in console, then decoding the give string
Thanks
Unless I'm overlooking something, this is as simple as reading in your encoded string and then calling the standard library's base64.b64decode function on it.
Something like:
with open(path_to_encoded_file) as encoded_file:
print base64.b64decode(encoded_file.read().strip())
Using base64.decode, set sys.stdout (sys.stdout.buffer.raw in python 3.x) as output.
import sys
import base64
with open('filepath') as f:
base64.decode(f, sys.stdout)

Categories