How to properly send file through selenium in python - python

I'm working with Selenium 3.141.0 and ChromeDriver 83.0.4103.
All the Selenium libraries are properly imported, and my script was working fine until I got this error.
I'm currently trying to upload a JSON file to an input:
<input type="file" class="file" id="ext-gen1563">
upload = self.driver.find_element(By.XPATH, '//input[@type="file"]')
upload.send_keys("‪C:\\absolutepathtofile.json")
I get the same error every time:
selenium.common.exceptions.InvalidArgumentException: Message: invalid argument:
I tried clicking the "Choose File" button on the form, and that works until I need to pass the desired file. I understood that this is not the right way to do it, so I switched to sending the path to the input.
I cannot test with geckodriver or the Edge driver because my organisation does not allow me to use them.
Here is the complete HTML of the element:
<div class="uploader"><div class="import-file-form"><input type="file" class="file" id="ext-gen1563"></div><div class="filename">No file chosen.</div><div class="clickable btn" id="ext-gen1564">Choose File</div></div>
Can you give me some nudges to solve this problem ?
Regards.

I've dealt with the same issue myself recently and it's possible that the solution I found for the website I was dealing with might work for yours.
I first identified the box where the file name ends up after you choose your file. This box never shows the full file path nor can you type in it as a human in the browser. Once I did this, I simply sent the keys of the full path to the file to this box... and it worked.
I was then able to just identify the 'submit' button and click it.
Here is the code I used but you will obviously have to identify the elements of your website.
The CSV variable is simply a CSV file I'm passing in.
import os
# Send the absolute path straight to the file input instead of clicking it
ChooseFile = browser.find_element_by_name('files[field_import_file_und_0]')
ChooseFile.send_keys(os.path.abspath(CSV))
# Click the site's own upload button, then submit the form
Upload = browser.find_element_by_xpath('/html/body/div[3]/div[1]/div[2]/div/div/div/div/div/div[2]/div/div/div/form/div/div[2]/div/div/div[1]/input[2]')
Upload.click()
Submit = browser.find_element_by_name('op')
Submit.click()
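One thing worth ruling out for the InvalidArgumentException in the question: a path pasted from a Windows dialog can carry an invisible Unicode format character (such as U+202A) in front of the drive letter, in which case the driver may be handed a path that does not exist. The sketch below is not from either post. `clean_upload_path` is a hypothetical helper that strips such characters and checks that the file is really there before the path goes to send_keys.

```python
import os
import unicodedata


def clean_upload_path(raw_path):
    """Return an absolute path to an existing file, without invisible characters.

    Hypothetical helper for illustration only; it is not part of Selenium.
    """
    # Drop Unicode "format" characters (category Cf), e.g. U+202A, which are
    # invisible in most editors but still part of the string.
    visible = "".join(ch for ch in raw_path if unicodedata.category(ch) != "Cf")
    absolute = os.path.abspath(visible.strip())
    # Fail early with a clear error instead of a vague driver exception.
    if not os.path.isfile(absolute):
        raise FileNotFoundError(absolute)
    return absolute


# Possible usage with the element from the question (assumes `upload` exists):
# upload.send_keys(clean_upload_path("C:\\absolutepathtofile.json"))
```

If the helper raises FileNotFoundError, the problem is the path itself rather than the way the element is located.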

def base(request):
    if request.method == "GET":
        return render(request, 'base.html')
    else:
        title = request.POST.get('title')
        file = request.POST.get('file')
        data = models.base(title=title, file=file)
        data.save()
        return render(request, 'base.html')

Related

"non-zero exit status 1" due to pdf file not found when using pypdftk to fill pdf forms in Django project in virtual env on dev server in Windows

The following python code successfully fills out a pdf form:
import pypdftk
data_dict = {key:value pairs}
PDF_PATH = 'form.pdf' #form to be filled out in same folder as the file executing this code
out_file = 'out_file.pdf' #completed pdf file
generated_pdf = pypdftk.fill_form(
    pdf_path=PDF_PATH,
    datas=data_dict,
    out_file=out_file,
)
However, the same code used in my django project results in the following error message:
Error: Unable to find file.
Error: Failed to open PDF file:
form.pdf
Errors encountered. No output created.
Done. Input errors, so no output created.
... REMAINDER OF TRACEBACK EXCLUDED FOR BREVITY IF YOU WANT TO SEE IT I'LL POST...
raise subprocess.CalledProcessError(retcode, cmd, output=output)
subprocess.CalledProcessError: Command 'pdftk l_a_r.pdf fill_form C:\Users\Home\AppData\Local\Temp\tmpbqq_0 87495_7c4 output out_file.pdf flatten'
returned non-zero exit status 1.
pypdftk is installed in the virtual environment the project is running in.
The pdftk server is added as a windows path variable.
In the above example, and every other time this has happened the temp file referenced at the end of the error message contains all of the expected data in XML.
I've tried the following combinations of code to try to make this work:
Running the exact above code within a view function, with the pdf form to be filled in the same folder as the views.py file:
import pypdftk
def filler_view(request):
    form = MyForm()
    if request.method == 'POST':
        # code to successfully populate dictionary data_dict with form data
        PDF_PATH = 'form.pdf'  # form to be filled out, in the same folder as the file executing this code
        out_file = 'out_file.pdf'  # completed pdf file
        generated_pdf = pypdftk.fill_form(
            pdf_path=PDF_PATH,
            datas=data_dict,
            out_file=out_file,
        )
        return render(request, 'success.html')
Storing the code and file in a folder and importing to call the relevant function within the view:
-appFolder
    -pdf_filler_folder
        -form.pdf
        -form_filler.py
    -views.py
views.py:
from appFolder.pdf_filler_folder import form_filler as f
def filler_view(request):
    form = MyForm()
    if request.method == 'POST':
        # code to successfully populate dictionary data_dict with form data
        f.fill_form(data_dict, 'output.pdf')
form_filler.py:
import pypdftk
def fill_form(data_dict, out_file):
    PDF_PATH = 'form.pdf'
    generated_pdf = pypdftk.fill_form(
        pdf_path=PDF_PATH,
        datas=data_dict,
        out_file=out_file,
    )
Running both of the above with the full path from c:\... of the form.pdf file.
I've also verified that I can successfully fill a form with the executing .py file and the form.pdf file in the same folder, on two storage drives and from within the Django project folder itself, as long as the code is not being executed by the Django project. pdftk finds form.pdf with no problems at all in this circumstance.
I believe that the file-not-found error message is key, as it seems to refer to the pdf form I'm trying to fill out. I've spent from 1500 till 1800 researching this and haven't managed to get it to work, although I am led to believe that my error message indicates a missing parameter in the command-line execution command. I'm not sure what this would be, as all parameters seem present and correct.
Interestingly enough, a friend of mine is experiencing the same error message just in windows. I'm aware that pdftk can sometimes be touchy in windows, and I think there's probably a nuance I'm missing here.
The outcome I'd like is to fill out a pdf form from within my django project, with data obtained from a form through a post request.
I'd welcome someone enlightening me as to why pdftk is struggling to either see or use the form file whilst being used from within my Django project, and pointing me in the right direction.
I'm aware that there are alternatives to using pdftk, but pdftk is the simplest, and honestly pypdftk is the only library I've found to reliably work with python to fill out pdf forms so far in Windows. I don't want to go down the route of generating my own replica form and populating it with data, but I'm aware that that is also an option.
Question answered just now on Reddit:
When in Django, it is either wsgi.py or manage.py which is ultimately responsible for what goes on. On that basis, placing the form.pdf file in the same folder as wsgi.py solved the problem and the code now runs as intended, with an unbound form POSTing data back to a view, and a pdf form being filled out and a duplicate saved with said data. Hope that helps anyone else who comes up against this!
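A relative path such as 'form.pdf' is resolved against the working directory of the running process, not against the module that contains the code, which is consistent with the fix described above. As a minimal sketch, assuming a hypothetical helper named `resolve_beside`, the path can instead be anchored to the folder of the module that uses it. The question reports that a full path also failed, so treat this as something to try rather than a confirmed fix.

```python
import os
from pathlib import Path


def resolve_beside(module_file, filename):
    """Return the absolute path of `filename` in the same folder as `module_file`.

    Hypothetical helper for illustration; pass `__file__` as `module_file`.
    """
    return Path(module_file).resolve().parent / filename


# Printing the working directory shows where a bare 'form.pdf' would be looked up.
print("working directory:", os.getcwd())

# Possible usage inside form_filler.py (pypdftk call as in the question):
# PDF_PATH = str(resolve_beside(__file__, 'form.pdf'))
```

With this approach the form can stay next to form_filler.py regardless of which folder the server was started from.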

Python + selenium download image without extension

I'm using Python 3 with Selenium, and I have to download an image.
HTML:
<img id="labelImage" name="labelImage" border="0" width="672" height="456" alt="labelImage" src="/shipping/labelAction.handle?method=doGetLabelFromCache&isDecompressRequired=false&utype=null&cacheKey=774242409034SHIPPING_L">
Python code:
found = browser.find_element_by_css_selector('img[alt="labelImage"]')
src = found.get_attribute('src')
urllib.request.urlretrieve(src, 'image.png')
The resulting image file is empty. If I change the extension to .html, it shows me the message below:
"We're sorry, we can't process your request right now. It appears you don't have permission to view this webpage"
The error you receive when you attempt the download comes from the fact that the urllib call is a brand new session for their server: it does not have the cookies and authentication your browser does. It is the same as if you opened incognito mode in the browser and pasted the src attribute into the address bar. For the server you are a new client that hasn't filled in the form, logged in, etc.
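One way to act on that explanation is to hand the browser's cookies to urllib. The sketch below is an illustration, not part of the original answer: `cookie_header` and `build_authenticated_request` are hypothetical helpers, and it assumes the server identifies the session by cookies alone, which may not hold for every site.

```python
import urllib.request


def cookie_header(selenium_cookies):
    """Turn the list of dicts returned by driver.get_cookies() into a Cookie header value."""
    return "; ".join(
        "{}={}".format(cookie["name"], cookie["value"]) for cookie in selenium_cookies
    )


def build_authenticated_request(url, selenium_cookies):
    """Build a urllib request that carries the browser session's cookies."""
    request = urllib.request.Request(url)
    request.add_header("Cookie", cookie_header(selenium_cookies))
    return request


# Possible usage with the objects from the question (assumes `browser` and `src` exist):
# req = build_authenticated_request(src, browser.get_cookies())
# with urllib.request.urlopen(req) as response, open('image.png', 'wb') as out:
#     out.write(response.read())
```

If the server also checks other headers or ties the session to something besides cookies, this will still be refused.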
You may want to try something else: taking a screenshot of just the <img> element within the Selenium browser session. That operation has variable success; Chrome, for instance, added support for it only recently, and in some situations it fails:
found = browser.find_element_by_css_selector('img[alt="labelImage"]')
try:
    found.screenshot('element.png')
except Exception as ex:  # FIXME: anti-pattern - I don't recall the exact exception - when you run the code, change it to the proper one
    print('The correct exception is {}'.format(ex))
    browser.get_screenshot_as_file('page.png')
If taking the element's screenshot fails, you'll get one of the whole page - which you can then trim to the element.
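The trimming step can be sketched as a small calculation on the element's `location` and `size` dictionaries, which Selenium exposes on every element. `crop_box` is a hypothetical helper, and the `scale` parameter is an assumption for screens where the screenshot has more pixels than the page has CSS pixels. The sketch also assumes the element is inside the captured area and the page is not scrolled.

```python
def crop_box(location, size, scale=1.0):
    """Return (left, top, right, bottom) pixel bounds of an element in a page screenshot.

    `location` is like element.location ({'x': ..., 'y': ...}) and
    `size` is like element.size ({'width': ..., 'height': ...}).
    """
    left = int(location["x"] * scale)
    top = int(location["y"] * scale)
    right = int((location["x"] + size["width"]) * scale)
    bottom = int((location["y"] + size["height"]) * scale)
    return (left, top, right, bottom)


# Possible usage with Pillow (an assumption, it is not used in the answer):
# from PIL import Image
# box = crop_box(found.location, found.size)
# Image.open('page.png').crop(box).save('element.png')
```

The resulting tuple is in the order that Pillow's Image.crop expects.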

Python flask application not displaying generated html file for second time

I have a Python Flask application which takes input IDs and dynamically generates data into an HTML file. Below is my app.py file.
@app.route('/execute', methods=['GET', 'POST'])
def execute():
    if request.method == 'POST':
        id = request.form['item_ids']
        list = [id]
        script_output = subprocess.Popen(["python", "Search_Script.py"] + list)
        # script_output = subprocess.call("python Search_Script.py "+id, shell=True)
        # render_template('running.html')
        script_output.communicate()
        #driver = webdriver.Chrome()
        #driver.get("home.html")
        #driver.execute_script("document.getElementById('Executed').style.display = '';")
        return render_template('execute.html')
@app.route('/output')
def output():
    return render_template('output.html')
output.html file has below code at the bottom.
<div class="container" style="text-align: center;">
{% include 'itemSearchDetails.html' %}
</div>
itemSearchDetails.html is generated dynamically every time, based on the input. I checked with different inputs and it is generated correctly. When I run it with some input values (say 2) for the first time, it runs perfectly and shows the output correctly. But when I next run it for different values (say 4), the file 'itemSearchDetails.html' is generated for those 4 values, yet the browser only shows the output for the first 2 values. No matter how many times I run it, the browser only shows the output for the first run's values.
I am not sure whether it is a browser cache issue, since I tried "disabling cache" in Chrome and it still didn't work. Please let me know if there is something I am missing.
Try the solution from this answer, which uses the TEMPLATES_AUTO_RELOAD parameter:
Whether to check for modifications of the template source and reload it automatically. By default the value is None, which means that Flask checks the original file only in debug mode.
The original text can be found in the Flask documentation.
Looks like Jinja is caching the included template.
If you don't need to interpret the HTML as a Jinja template, but instead just include its contents as-is, read the file first and pass the contents into the template:
with open('itemSearchDetails.html', 'r') as infp:
    data = infp.read()
return render_template('execute.html', data=data)
...
{{ data|safe }}
(If you do need to interpret the HTML page as Jinja (as include will), you can parse a Jinja Template out of data, then use the include tag with that dynamically compiled template.)

Provide tab title with reportlab generated pdf

This question is really simple, but I can't find any data on it.
When I generate a pdf with reportlab, passing the httpresponse as a file, browsers that are configured to show files display the pdf correctly. However, the title of the tab remains "(Anonymous) 127.0.0.1/whatnot", which is kinda ugly for the user.
Since most sites are able to somehow display an appropriate title, I think it's doable... Is there some sort of title parameter that I can pass to the pdf? Or some header for the response? This is my code:
def render_pdf_report(self, context, file_name):
    response = HttpResponse(content_type='application/pdf')
    response['Content-Disposition'] = 'filename="{}"'.format(file_name)
    document = BaseDocTemplate(response, **self.get_create_document_kwargs())
    # pdf generation code
    document.build(story)
    return response
Seems that Google Chrome doesn't display the PDF titles at all.
I tested the link in your comment (biblioteca.org.ar) and it displays in Firefox as " - 211756.pdf", seems there's an empty title and Firefox then just displays the filename instead of the full URL path.
I reproduced the same behaviour using this piece of code:
from reportlab.pdfgen import canvas
c = canvas.Canvas("hello.pdf")
c.setTitle("hello stackoverflow")
c.drawString(100, 750, "Welcome to Reportlab!")
c.save()
Opening it in Firefox yields the needed result.
I found out about setTitle in ReportLab's User Guide. It has it listed on page 16. :)
I was also looking for this and I found this in the source code.
reportlab/src/reportlab/platypus/doctemplate.py
# line - 467
We can set the document's title by
document.title = 'Sample Title'
I realise this is an old question, but I'm dropping in an answer for anyone using SimpleDocTemplate. The title property can be set in the constructor of SimpleDocTemplate as a kwarg, e.g.
doc = SimpleDocTemplate(pdf_bytes, title="my_pdf_title")
If you are using trml2pdf, you will need to add the "title" attribute to the template tag, i.e. <template title="Invoices" ...
In addition to what others have said, you can use
Canvas.setTitle("yourtitle")
which shows up fine in chrome.

python webdriver os window

I need to upload a file using Python and Selenium. When I click the upload HTML element a "File Upload" window is opened and the click() method does not return since it waits to fully load the page. Therefore I cannot continue using pywinauto code to control the window.
The first method clicks the HTML element (an img) to upload a new file:
def add_file(self):
    return self.selenium.find_element(By.ID, "add_file").click()
and the second method is using pywinauto to type the path to the file and then click open
def upload(self):
    from pywinauto import application
    app = application.Application()
    app.connect_(title_re = "File Upload")
    app.file_upload.TypeKeys("C:\\Path\\To\\FIle")
    app.file_upload.Open.Click()
How can I force add_file method to return and to be able to run the upload method?
Solved it. There was an iframe handling the upload, but it was hidden and I didn't see it at first. The iframe contains an input of type file, which is also hidden. To solve it, make the iframe visible using JavaScript:
selenium.execute_script("document.getElementById('iframe_id').style.display = 'block';")
then switch to the iframe and make the input visible also:
selenium.switch_to_frame(0)
selenium.execute_script("document.getElementById('input_field_id').type = 'visible';")
and simply send the path to the input:
selenium.find_element(By.ID, 'input_field_id').send_keys("path\\\\to\\\\file")
On Windows, use four backslashes ('\\\\') as the path separator.
