I have the following code in Python to send an email with an attached pdf.
result_url = '%s%s?analysis_id=%s' % (
    constants.HOST_URL, reverse('results'), analysis.id)
pdf_filename = ''.join(['report', str(analysis_id), '.pdf'])
utils.convert_to_pdf(result_url, pdf_filename)
And here is my utils.convert_to_pdf:
def convert_to_pdf(url, filename):
    command = "phantomjs export.js %s %s" % (url, filename)
    execute_command(command).communicate()
And here is how I am sending the email:
email_ids = []
if analysis.user is not None:
    email_ids.append(analysis.user.email)
if email is not None:
    email_ids.append(email)
body = ANALYSIS_EMAIL_BODY % (result_url)
try:
    message = EmailMultiAlternatives(ANALYSIS_EMAIL_SUBJECT, body, settings.EMAIL_SENDER, email_ids)
    message.attach('Report.pdf', pdf_filename, 'application/pdf')
    message.send()
except Exception as ex:
    logging.error('Send mail failed: %s', ex)
Now I see that the PDF file is generated properly in my current folder and is attached to the mail, but its size is 0 KB in the mail, and when I try to open the file it says:
It may be damaged or use a file format that Preview doesn't recognize
What is going wrong here?
You are merely attaching the filename as the contents of the file.
You need to attach the actual contents, not a filename.
message.attach('Report.pdf', read_pdf_contents(), 'application/pdf')
It is your job to determine how to get the raw data from the filename.
It would be something like myfile_like_object.read()
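A minimal sketch of getting the raw data, assuming the PDF has already been written to pdf_filename as in the question. read_pdf_contents is the placeholder name used above, written here to take the path as an argument:

```python
def read_pdf_contents(path):
    """Return the raw bytes of the file at `path`."""
    # Open in binary mode: a PDF is not text, so it must not be decoded.
    with open(path, 'rb') as pdf_file:
        return pdf_file.read()


# Hypothetical usage with the message object from the question:
# message.attach('Report.pdf', read_pdf_contents(pdf_filename), 'application/pdf')
```

Django's EmailMessage also has an attach_file(path, mimetype) method that reads the file itself, which may be simpler if the attachment can keep its on-disk name.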
I am trying to iterate through the contents of a subfolder, and if the message contains an .xlsx attachment, download the attachment to a local directory. I have confirmed all other parts of this program work until that line, which throws an exception each time.
I am running the following code in a Jupyter notebook through VSCode:
# import libraries
import win32com.client
import re
import os

# set up connection to outlook
path = os.path.expanduser("~\\Desktop\\SBD_DB")
print(path)
outlook = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")
inbox = outlook.GetDefaultFolder(6)
target_folder = inbox.Folders['SBD - Productivity'].Folders['Productivity Data Request']
target_folder.Name
messages = target_folder.Items
message = messages.GetLast()

# while True:
x = 0
while x < 100:
    try:
        # print(message.subject) # get the subject of the email
        for attachment in message.attachments:
            if 'xlsx' in attachment.FileName:
                # print("reached")
                attachment.SaveAsFile(os.path.join(path, str(attachment.FileName)))
                print("found excel:", attachment.FileName)
        message = messages.GetPrevious()
        x += 1
    except:
        print("exception")
        message = messages.GetPrevious()
        x += 1
Looks like the following line of code throws an exception at runtime:
attachment.SaveAsFile(os.path.join(path, str(attachment.FileName)))
First, make sure that you are dealing with an attached file, not a link to the actual file. The Attachment.Type property returns an OlAttachmentType constant indicating the type of the specified object. You are interested in olByValue, which means the attachment is a copy of the original file and can be accessed even if the original file is removed.
Second, make sure that the file path (especially the FileName property) doesn't contain forbidden symbols; see "What characters are forbidden in Windows and Linux directory names?" for more information.
Third, make sure that a target folder exists on the disk and points to the local folder. According to the exception message:
'Cannot save the attachment. Path does not exist. Verify the path is correct.'
That is it. According to the error message, the path doesn't exist, so try to open the folder manually first. Before calling the SaveAsFile method, you need to create the target folder or make sure it already exists.
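The second and third checks can be sketched as a small helper. This is only an illustration: safe_target_path is a hypothetical name, and the set of replaced characters is an assumption based on what Windows rejects in file names.

```python
import os
import re


def safe_target_path(folder, file_name):
    """Create `folder` if needed and return a safe path for `file_name` in it."""
    # SaveAsFile reports "Path does not exist" when the folder is missing.
    os.makedirs(folder, exist_ok=True)
    # Replace characters that Windows forbids in file names.
    cleaned = re.sub(r'[<>:"/\\|?*\x00-\x1f]', '_', file_name)
    return os.path.join(folder, cleaned)


# Hypothetical usage inside the loop from the question; 1 is the value of
# olByValue in the OlAttachmentType enumeration:
# if attachment.Type == 1 and attachment.FileName.lower().endswith('.xlsx'):
#     attachment.SaveAsFile(safe_target_path(path, attachment.FileName))
```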
I am trying to make a downloadable text file on the fly. I think I have achieved this, but when I run the code I get a permission denied error.
Also, when I open this text file, does it get created anywhere in the file system? I don't want to store these files, just create them and have them downloaded to the user's machine.
IOError at /networks/configs/STR-RTR-01/7
[Errno 13] Permission denied: u'STR-CARD-RTR-01.txt'
config:
def configs(request, device, site_id):
    site = get_object_or_404(ShowroomConfigData, pk=site_id)
    config_template = get_object_or_404(ConfigTemplates, device_name=device)
    file_name = device[:4] + site.location.upper()[:4] + "-" + device[4:] + ".txt"
    device_config = None
    with open(file_name, 'w') as config_file:
        device_config = env.from_string(config_template.config)
        device_config.stream(
            STR = site.location.upper()[:4],
            IP = site.subnet,
            BGPASNO = site.bgp_as,
            LOIP = site.r1_loopback_ip,
            Location = site.location,
            Date = site.opening_date,
        ).dump(config_file)
    return render(request, file_name, {
    })
If the goal is to provide a link where the user can download a file which is automatically generated, there's no need to write anything to disk.
You can just build the desired content in a Python string, and use the Content-Disposition header to suggest that a user's browser should download the file rather than display it, and its filename parameter to specify a default filename for the user to save the file as.
A slightly simpler example of a view function which does this...
from django.http import HttpResponse

def get_config_file(request):
    filename = 'config.txt'
    content = 'This is the content of my config file'
    content_type = 'text/plain'
    # Quote the filename explicitly; %r would emit Python repr quoting instead.
    content_disposition = 'attachment; filename="%s"' % filename
    response = HttpResponse(content, content_type=content_type)
    response['Content-Disposition'] = content_disposition
    return response
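For the templated config in the question, the same idea can be sketched without touching the disk. This is a framework-free illustration under stated assumptions: string.Template stands in for the Jinja environment, build_download is a hypothetical helper, and the template text and values are made-up sample data.

```python
from string import Template


def build_download(file_name, template_text, values):
    """Render a config in memory and return (content, headers) for a download."""
    # Render to a plain string; nothing is written to disk.
    content = Template(template_text).substitute(values)
    headers = {
        'Content-Type': 'text/plain',
        # "attachment" asks the browser to download rather than display.
        'Content-Disposition': 'attachment; filename="%s"' % file_name,
    }
    return content, headers


content, headers = build_download(
    'STR-CARD-RTR-01.txt',
    'hostname $STR\nrouter bgp $BGPASNO\n',
    {'STR': 'CARD', 'BGPASNO': '65001'},
)
print(content)
```

In a Django view, the rendered string would go into HttpResponse and the Content-Disposition value would be set on the response. With Jinja, the template's render() method returns such a string, so stream(...).dump(file) is not needed.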
I have just written a small function to download and save some images to my hard disk. Some URLs redirect and/or have bad file extensions, so I added some validation; however, it causes the script to stop as soon as it hits a bad URL. I would like to modify the script so that the loop continues, discarding any bad URLs, and breaks once an image has been downloaded successfully (I need just one successful download). Can you please take a look at my code and share some tips? Thank you.
from pattern.web import URL, DOM, plaintext, extension
import requests, re, os, sys, datetime, time, re, random

def download_single_image(query, folder, image_options=None):
    download_fault = 0
    url_link = None
    valid_image_ext_list = ['.png', '.jpg', '.gif', '.bmp', '.tiff', 'jpeg'] # not comprehensive
    pic_links = scrape_links(query, image_options) # pic_links contains an array of urls
    for url in pic_links:
        url = URL(url)
        print "checking re-direction"
        if url.redirect:
            print "redirected, returning"
            return # if there is a redirect, return
        file_ext = extension(url.page)
        print "checking file extension", file_ext
        if file_ext.lower() not in valid_image_ext_list:
            print "not a valid extension, returning"
            return # return if not valid image extension found
        # Download the image.
        print('Downloading image %s... ' % (pic))
        res = requests.get(pic)
        try:
            res.raise_for_status()
        except Exception as exc:
            print('There was a problem: %s' % (exc))
        print ('Saving image to %s...'% (folder))
        if not os.path.exists(folder + '/' + os.path.basename(pic)):
            imageFile = open(os.path.join(folder, os.path.basename(pic)), mode='wb')
            for chunk in res.iter_content(100000):
                imageFile.write(chunk)
            imageFile.close()
            print('pic saved %s' % os.path.basename(pic))
        else:
            print('File already exists!')
        return os.path.basename(pic)
Change this:
return # return if not valid image extension found
to this:
continue # skip this URL if no valid image extension is found
The first exits the function, and with it the loop; the second skips to the next URL. The return in the redirect check needs the same change.
PS. File extensions on the Internet mean nothing. I would rather send a HEAD request (for example with curl) and check whether it is an image by the Content-Type the server returns.
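A minimal sketch of the skip-and-continue pattern combined with the Content-Type check. The names first_image_url and get_content_type are hypothetical; the lookup is passed in as a function so the loop logic stays independent of the HTTP library.

```python
def first_image_url(urls, get_content_type):
    """Return the first URL whose Content-Type is an image, or None."""
    for url in urls:
        try:
            content_type = get_content_type(url)
        except Exception as exc:
            # A failing URL is skipped instead of ending the search.
            print('Skipping %s: %s' % (url, exc))
            continue
        if not content_type.lower().startswith('image/'):
            print('Skipping %s: not an image (%s)' % (url, content_type))
            continue
        return url  # the first good candidate ends the loop
    return None


# With requests, the lookup could be a HEAD request (assumption: the server
# answers HEAD with an accurate Content-Type header):
# import requests
# def head_content_type(url):
#     resp = requests.head(url, allow_redirects=True, timeout=10)
#     resp.raise_for_status()
#     return resp.headers.get('Content-Type', '')
```

The returned URL can then be downloaded and saved as in the question; returning None lets the caller tell "no usable image" apart from a successful download.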
OK, I'm trying to scrape a JPG image from the Gucci website. Take this one as an example.
http://www.gucci.com/images/ecommerce/styles_new/201501/web_full/277520_F4CYG_4080_001_web_full_new_theme.jpg
I tried urllib.urlretrieve, which doesn't work because Gucci blocks it. So I wanted to use requests to fetch the source data for the image and then write it into a .jpg file.
image = requests.get("http://www.gucci.com/images/ecommerce/styles_new/201501/web_full/277520_F4CYG_4080_001_web_full_new_theme.jpg").text.encode('utf-8')
I encoded it because if I don't, it keeps telling me that gbk cannot encode the string.
Then:
with open('1.jpg', 'wb') as f:
    f.write(image)
Looks good, right? But the result is -- the jpg file cannot be opened. There's no image! Windows tells me the jpg file is damaged.
What could be the problem?
I'm thinking that maybe when I scraped the image, I lost some information, or some characters are wrongly scraped. But how can I find out which?
I'm thinking that maybe some information is lost via encoding. But if I don't encode, I cannot even print it, not to mention writing it into a file.
What could go wrong?
I am not sure about the purpose of your use of encode. You're not working with text, you're working with an image. You need to access the response as binary data, not as text, and use image manipulation functions rather than text ones. Try this:
from PIL import Image
from io import BytesIO
import requests
response = requests.get("http://www.gucci.com/images/ecommerce/styles_new/201501/web_full/277520_F4CYG_4080_001_web_full_new_theme.jpg")
buffer = BytesIO(response.content)  # avoid shadowing the built-in name bytes
image = Image.open(buffer)
image.save("1.jpg")
Note the use of response.content instead of response.text. You will need to have PIL or Pillow installed to use the Image module. BytesIO is part of the standard io module.
Or you can just save the data straight to disk without looking at what's inside:
import requests
response = requests.get("http://www.gucci.com/images/ecommerce/styles_new/201501/web_full/277520_F4CYG_4080_001_web_full_new_theme.jpg")
with open('1.jpg','wb') as f:
    f.write(response.content)
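To see why going through text damages the file, here is a small sketch with assumed sample data (the first bytes of a typical JPEG). It only approximates what happens when a library decodes binary data as text and the result is encoded again; the encoding actually guessed for a response may differ.

```python
# First bytes of a typical JPEG file (assumed sample data).
raw = b'\xff\xd8\xff\xe0\x00\x10JFIF'

# Bytes that are not valid in the chosen encoding are replaced with U+FFFD
# when binary data is decoded as text.
as_text = raw.decode('utf-8', errors='replace')
round_trip = as_text.encode('utf-8')

print(raw == round_trip)          # False: the original bytes are gone
print(len(raw), len(round_trip))
```

Once a byte has been replaced during decoding, no later encode call can bring it back, which is why the saved file no longer opens as a JPEG.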
A JPEG file is not text, it's binary data. So you need to use the response.content attribute to access it.
The code below also includes a get_headers() function, which can be handy when you're exploring a Web site.
import requests

def get_headers(url):
    resp = requests.head(url)
    print("Status: %d" % resp.status_code)
    resp.raise_for_status()
    for t in resp.headers.items():
        print('%-16s : %s' % t)

def download(url, fname):
    ''' Download url to fname '''
    print("Downloading '%s' to '%s'" % (url, fname))
    resp = requests.get(url)
    resp.raise_for_status()
    with open(fname, 'wb') as f:
        f.write(resp.content)

def main():
    site = 'http://www.gucci.com/images/ecommerce/styles_new/201501/web_full/'
    basename = '277520_F4CYG_4080_001_web_full_new_theme.jpg'
    url = site + basename
    fname = 'qtest.jpg'
    try:
        #get_headers(url)
        download(url, fname)
    except requests.exceptions.HTTPError as e:
        print("%s '%s'" % (e, url))

if __name__ == '__main__':
    main()
We call the .raise_for_status() method so that get_headers() and download() raise an Exception if something goes wrong; we catch the Exception in main() and print the relevant info.
I have written a function which receives a URL and copies the image to all servers.
The servers' remote paths are stored in the DB.
def copy_image_to_server(image_url):
    server_list = ServerData.objects.values_list('remote_path', flat=True).filter(active=1)
    file = cStringIO.StringIO(urllib.urlopen(image_url).read())
    image_file = Image.open(file)
    image_file.seek(0)
    for remote_path in server_list:
        os.system("scp -i ~/.ssh/haptik %s %s " % (image_file, remote_path))
I am getting this error at the last line: cannot open PIL.JpegImagePlugin.JpegImageFile: No such file
Please suggest what's wrong in the code. I have checked that the URL is not broken.
The problem is that image_file is not a path (string), it's an object. Your os.system call is building up a string that expects a path.
You need to write the file to disk (perhaps using the tempfile module) before you can pass it to scp in this manner.
In fact, there's no need for you (at least in what you're doing in the code snippet) to convert it to a PIL Image object at all, you can just write it to disk once you've retrieved it, and then pass it to scp to move it:
file = cStringIO.StringIO(urllib.urlopen(image_url).read())
diskfile = tempfile.NamedTemporaryFile(delete=False)
diskfile.write(file.getvalue())
path = diskfile.name
diskfile.close()
for remote_path in server_list:
    os.system("scp -i ~/.ssh/haptik %s %s " % (path, remote_path))
You should delete the file after you're done using it.
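One way to make sure the temporary file is always removed is a try/finally around the copies. The sketch below is written for Python 3 and is an illustration only: copy_bytes_to_servers is a hypothetical helper, and the run parameter exists just so the scp call can be swapped out.

```python
import os
import subprocess
import tempfile


def copy_bytes_to_servers(data, remote_paths, key_path, run=subprocess.run):
    """Write `data` to a temp file, scp it to each remote path, then delete it."""
    with tempfile.NamedTemporaryFile(delete=False) as disk_file:
        disk_file.write(data)
        local_path = disk_file.name
    try:
        for remote_path in remote_paths:
            # An argument list avoids shell quoting problems.
            run(['scp', '-i', key_path, local_path, remote_path], check=True)
    finally:
        # Remove the temp file even if one of the copies fails.
        os.remove(local_path)


# Hypothetical usage; "~" is not expanded without a shell, so expand it first:
# copy_bytes_to_servers(image_bytes, server_list,
#                       os.path.expanduser('~/.ssh/haptik'))
```

With check=True, a failed scp raises subprocess.CalledProcessError, so a broken copy is not silently ignored as it would be with os.system.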