I was looking for a way to extract a specific image from an .eml file, but I always get only the text data, without the images.
Here is the code I used; it returns just the text:
from email.parser import BytesParser
from email import policy

with open(em, 'rb') as fp:
    name = fp.name  # Get file name
    msg = BytesParser(policy=policy.default).parse(fp)

# preferencelist is meant to be a tuple, hence the trailing comma
data = msg.get_body(preferencelist=('plain',)).get_content()
print(data)
Is there a way to solve this? I'm eager to know the method.
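One possible approach, sketched here with only the standard library, is to walk over every MIME part and keep those whose main content type is `image`, since `get_body()` only ever returns the text body. The helper names `extract_images` and `build_sample_eml` and the sample message are made up for illustration; a real .eml file would be read from disk in binary mode instead.

```python
from email import policy
from email.message import EmailMessage
from email.parser import BytesParser


def extract_images(raw_bytes):
    """Return (filename, image_bytes) pairs for every image/* part."""
    msg = BytesParser(policy=policy.default).parsebytes(raw_bytes)
    images = []
    for part in msg.walk():
        if part.get_content_maintype() == "image":
            # get_filename() can be None, e.g. for unnamed inline images
            name = part.get_filename() or f"image_{len(images)}"
            # With policy.default, get_content() returns bytes for non-text parts
            images.append((name, part.get_content()))
    return images


def build_sample_eml():
    """Build a small hypothetical message so the sketch runs without a real file."""
    msg = EmailMessage()
    msg["Subject"] = "demo"
    msg.set_content("plain text body")
    msg.add_attachment(
        b"\x89PNG\r\n\x1a\n fake image data",  # stand-in bytes, not a valid PNG
        maintype="image",
        subtype="png",
        filename="logo.png",
    )
    return msg.as_bytes()


found = extract_images(build_sample_eml())
for name, payload in found:
    print(name, len(payload), "bytes")
```

To pick one specific image, the returned list can be filtered by filename, and each payload can be written to disk with `open(name, 'wb')`.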
Related
I'm looking for a quick way to get the source code of a smart contract. I tried using the following Python code:
import requests
import json
address = "0xBB9bc244D798123fDe783fCc1C72d3Bb8C189413"
api_key = my_api_key
request_string = f'''https://api.etherscan.io/api?module=contract&action=getsourcecode&address={address}&apikey={api_key}'''
response = requests.get(request_string)
print(response.text)
data = json.loads(response.text)['result'][0]['SourceCode']
# The with-statement closes the file even if the write fails
with open("contract.sol", "w") as file:
    file.write(data)
So while this works for the given address, it doesn't work if the source code consists of multiple files (as with this address: 0xED5AF388653567Af2F388E6224dC7C4b3241C544). You can see there are 13 individual files. Is there a quick and easy way to save all of them into one file? Or do I have to create a separate file for each of them?
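A minimal sketch of one way to handle both cases follows. It rests on an assumption that should be checked against an actual response: that for multi-file contracts the `SourceCode` field holds JSON, either a mapping of file names to `{"content": ...}` or a standard-JSON-input object with a `sources` key, sometimes wrapped in an extra pair of braces. The function names and the sample string are hypothetical.

```python
import json


def split_sources(source_code):
    """Return {filename: content} from a SourceCode string."""
    text = source_code.strip()
    # ASSUMPTION: multi-file results may be wrapped in an extra pair of braces
    if text.startswith("{{") and text.endswith("}}"):
        text = text[1:-1]
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError:
        # Not JSON: treat the string as one plain Solidity file
        return {"contract.sol": source_code}
    if not isinstance(parsed, dict):
        return {"contract.sol": source_code}
    # Standard JSON input keeps the files under "sources"
    sources = parsed.get("sources", parsed)
    return {name: entry["content"] for name, entry in sources.items()}


def merge_sources(files):
    """Concatenate all files, each preceded by a comment naming its origin."""
    return "\n".join(
        f"// File: {name}\n{content}\n" for name, content in files.items()
    )


# Made-up sample in the assumed multi-file shape
sample = (
    '{{"language": "Solidity", "sources": {'
    '"contracts/A.sol": {"content": "contract A {}"}, '
    '"contracts/B.sol": {"content": "contract B {}"}}}}'
)
files = split_sources(sample)
merged = merge_sources(files)
print(merged)
```

The merged file is mainly useful for reading. It will not necessarily compile, because the individual files may import each other by path and repeat pragma lines.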
Good Morning,
I have downloaded my *.eml from my Gmail and wanted to extract the content of the email as text.
I used the following code:
import email
from email import policy
from email.parser import BytesParser

# Raw string so the backslashes in the Windows path are not read as escapes
filepath = r'Project\Data\Your GrabPay Wallet Statement for 15 Feb 2022.eml'
with open(filepath, 'rb') as fp:
    msg = BytesParser(policy=policy.default).parse(fp)
text = msg.get_body(preferencelist=('plain',)).get_content()
I am unable to extract the content of the email. The length of text is 0.
When I attempted to open the *.eml using Word/Outlook, I could see the content.
When I use a normal file handler to open it:
fhandle = open(filepath)
print(fhandle)
print(fhandle.read())
I get
<_io.TextIOWrapper name='Project\Data\Your GrabPay Wallet Statement for 15 Feb 2022.eml' mode='r' encoding='cp1252'>
And the contents look something like the one below:
Content-Transfer-Encoding: base64
Content-Type: text/html; charset=UTF-8
PCFET0NUWVBFIGh0bWwgUFVCTElDICItLy9XM0MvL0RURCBYSFRNTCAxLjAgVHJhbnNpdGlvbmFs
Ly9FTiIgImh0dHA6Ly93d3cudzMub3JnL1RSL3hodG1sMS9EVEQveGh0bWwxLXRyYW5zaXRpb25h
bC5kdGQiPgo8aHRtbCB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMTk5OS94aHRtbCI+CjxoZWFk
I might have underestimated the amount of code needed to extract the email body content from an *.eml file in Python.
I do not have access to your email, but I was able to extract the text from an email that I downloaded myself as a .eml file from Google.
import email

with open('email.eml') as email_file:
    email_message = email.message_from_file(email_file)
print(email_message.get_payload())
When working with files, it is worth using a context manager, as in my example, because it ensures that the file handle is closed when it is no longer needed, even if an error occurs.
I briefly read over https://docs.python.org/3/library/email.parser.html for additional information on how to achieve the intended goal.
I realised the email is multipart, so I needed to get to the specific part and decode it. Doing so returns a chunk of HTML. To strip the HTML markup and get plain text, I used html2text.
import email
import html2text

filepath = r'Project\Data\Your GrabPay Wallet Statement for 15 Feb 2022.eml'
with open(filepath) as email_file:
    email_message = email.message_from_file(email_file)

# walk() also yields the message itself, so non-multipart mail is covered too
for part in email_message.walk():
    if part.get_content_type() != 'text/html':
        continue  # skip multipart containers and non-HTML parts
    # get_payload(decode=True) returns bytes; decode them with the part's charset
    charset = part.get_content_charset() or 'utf-8'
    message = part.get_payload(decode=True).decode(charset, errors='replace')
    plain_message = html2text.html2text(message)
    print(plain_message)
    print()
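The same idea can be sketched with the newer `policy.default` API from the question: ask `get_body()` for a plain-text part first and fall back to the HTML part when the plain one is missing or empty. This is only an illustration under stated assumptions. `body_as_text` and `_TextOnly` are hypothetical helpers, the sample message is made up, and the standard-library `HTMLParser` stands in for html2text so the example has no third-party dependency.

```python
from email import policy
from email.message import EmailMessage
from email.parser import BytesParser
from html.parser import HTMLParser


class _TextOnly(HTMLParser):
    """Collects the text nodes of an HTML document."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        if data.strip():
            self.chunks.append(data.strip())


def body_as_text(msg):
    """Prefer a non-empty text/plain body; otherwise fall back to text/html."""
    plain = msg.get_body(preferencelist=("plain",))
    if plain is not None and plain.get_content().strip():
        return plain.get_content()
    html_part = msg.get_body(preferencelist=("html",))
    if html_part is None:
        return ""
    parser = _TextOnly()
    parser.feed(html_part.get_content())  # get_content() undoes the base64 encoding
    parser.close()
    return "\n".join(parser.chunks)


# Hypothetical HTML-only, base64-encoded message, similar in shape to the one above
sample = EmailMessage()
sample["Subject"] = "statement"
sample.set_content(
    "<html><body><h1>Wallet</h1><p>Balance: 10</p></body></html>",
    subtype="html",
    cte="base64",
)
parsed = BytesParser(policy=policy.default).parsebytes(sample.as_bytes())
print(body_as_text(parsed))
```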
I have a .blf file that I need to convert to a .asc file so that my ASCREADER can read the data.
from can.io.blf import BLFReader

blf_file = "/home/ranjeet/Downloads/CAN/BLF_Files/input.blf"
with BLFReader(blf_file) as can_log:
    for msg in can_log:
        print(msg)
This is what I've tried so far.
I am able to read the BLF file; now I need to write the data out in .asc format.
Very similar to my other answer: you should read your BLF file in binary mode, then write the messages to the ASC file:
import can

with open(blf_file, 'rb') as f_in:
    log_in = can.io.BLFReader(f_in)

    with open("file_out.asc", 'w') as f_out:
        log_out = can.io.ASCWriter(f_out)
        for msg in log_in:
            log_out.on_message_received(msg)
        log_out.stop()
Hi everyone, this is my first post here. I want to know how I can write image files that I scraped from a website to a CSV file, or, if that is not possible, how I can write the header, description, time info and image to something like a Word file. Here is the code.
Everything works perfectly; I just want to know how to write the images that I downloaded to disk to a CSV or Word file.
Thanks for your help.
import csv
import requests
from bs4 import BeautifulSoup

site_link = requests.get("websitenamehere").text
soup = BeautifulSoup(site_link, "lxml")

read_file = open("blogger.csv", "w", encoding="UTF-8")
csv_writer = csv.writer(read_file)
csv_writer.writerow(["Header", "links", "Publish Time"])

counter = 0
for article in soup.find_all("article"):
    ### Counting lines
    counter += 1
    print(counter)

    # Article Headers
    headers = article.find("a")["title"]
    print(headers)

    #### Links
    links = article.find("a")["href"]
    print(links)

    #### Publish time
    publish_time = article.find("div", class_="mkdf-post-info-date entry-date published updated")
    publish_time = publish_time.a.text.strip()
    print(publish_time)

    ### Image links
    images = article.find("img", class_="attachment-full size-full wp-post-image nitro-lazy")["nitro-lazy-src"]
    print(images)

    ### Download Article Pictures to disk
    pic_name = f"{counter}.jpg"
    with open(pic_name, 'wb') as handle:
        response = requests.get(images, stream=True)
        for block in response.iter_content(1024):
            handle.write(block)

    ### CSV Rows
    csv_writer.writerow([headers, links, publish_time])
    print()

read_file.close()
You could convert the image to Base64 and write it to the file as needed:
import base64

with open("image.png", "rb") as image_file:
    # Read from the same handle that the with-statement opened
    encoded_string = base64.b64encode(image_file.read())
print(encoded_string.decode('utf-8'))
A csv file is supposed to only contain text fields. Even if the csv module does its best to quote fields to allow almost any character in them, including the separator or a new line, it is not able to process NULL characters that could exist in an image file.
That means that you will have to encode the image bytes if you want to store them in a csv file. Base64 is a well-known format natively supported by the Python Standard Library. So you could change your code to:
import base64
...
    ### Download Article Pictures
    response = requests.get(images, stream=True)
    image = b''.join(block for block in response.iter_content(1024))  # raw image bytes
    # b64encode returns bytes; decode to str so csv does not write a b'...' repr
    image = base64.b64encode(image).decode('ascii')

    ### CSV Rows
    csv_writer.writerow([headers, links, publish_time, image])
The image will simply have to be decoded before being used.
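As a rough illustration of the full round trip, the sketch below writes a Base64 field to an in-memory CSV and decodes it again on reading. The image bytes and the column names are made-up stand-ins; a real script would use the downloaded bytes and a file on disk.

```python
import base64
import csv
import io

# Made-up stand-in for downloaded image bytes (note the NUL bytes)
image_bytes = b"\x89PNG\r\n\x1a\n\x00\x00 stand-in image data"

# Write: store the image as Base64 text in an ordinary CSV field
buffer = io.StringIO(newline="")
writer = csv.writer(buffer)
writer.writerow(["Header", "Image (base64)"])
writer.writerow(["First article", base64.b64encode(image_bytes).decode("ascii")])

# Read: decode the field back into the original bytes
buffer.seek(0)
reader = csv.reader(buffer)
next(reader)  # skip the header row
title, encoded = next(reader)
restored = base64.b64decode(encoded)
print(title, restored == image_bytes)
```

For real images, the reader's default field size limit (131072 characters) is likely to be too small; it can be raised with `csv.field_size_limit()` before reading.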
I have some data, for example:
data = [
    ['header_1', 'header_2'],
    ['row_1_1', 'row_1_2'],
    ['row_2_1', 'row_2_2'],
]
I need to send that data as a .csv file attachment in an email message.
I cannot save it as a .csv and then attach the existing file: the application runs in the Google App Engine sandbox environment, so no files can be saved.
As I understand it, an email attachment consists of a file name and the file content encoded as Base64.
I tried to build the attachment body in the following way:
import base64
import csv
import sys

if sys.version_info >= (3, 0):
    from io import StringIO
else:
    from StringIO import StringIO

in_memory_data = StringIO()
csv.writer(in_memory_data).writerows(data)
encoded = base64.b64encode(in_memory_data.getvalue())
But the file I received by email was not a valid file with 2 columns and 3 rows; it contained just a single string (see the picture).
csv_screen
What am I doing wrong?
I've found the mistake. I should have converted it to a bytearray instead of encoding it to Base64:
encoded = bytearray(in_memory_data.getvalue(), "utf-8")
It worked fine that way.
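For comparison, here is a minimal Python 3 sketch of the same idea using only the standard library `email` package rather than the App Engine mail API from the question, which is not shown here. The CSV is built in memory, encoded to bytes, and handed to `add_attachment()`, which applies the Base64 transfer encoding itself. The addresses, subject and file name are placeholders.

```python
import csv
import io
from email import policy
from email.message import EmailMessage
from email.parser import BytesParser

data = [
    ["header_1", "header_2"],
    ["row_1_1", "row_1_2"],
    ["row_2_1", "row_2_2"],
]

# Build the CSV text in memory; nothing is written to disk
text_buffer = io.StringIO(newline="")
csv.writer(text_buffer).writerows(data)
csv_bytes = text_buffer.getvalue().encode("utf-8")

message = EmailMessage()
message["Subject"] = "Report"            # placeholder values
message["From"] = "sender@example.com"
message["To"] = "recipient@example.com"
message.set_content("The report is attached.")
# Pass raw bytes; the library handles the transfer encoding
message.add_attachment(csv_bytes, maintype="text", subtype="csv", filename="data.csv")

# Round trip: parse the serialized message and read the attachment back
parsed = BytesParser(policy=policy.default).parsebytes(message.as_bytes())
attachment = next(parsed.iter_attachments())
rows = list(csv.reader(io.StringIO(attachment.get_content(), newline="")))
print(rows)
```

Encoding the CSV text to Base64 by hand and then passing it on would encode it twice, which would be one plausible explanation for the single long string seen in the received file.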