Having issues while downloading a favicon using a Python script

I am using a script to crawl and download favicons from websites. Some sites gave me 2-3 favicon images of various sizes (16x16, 32x32, etc.) embedded in the same file. When I try to use this file, it does not display properly as a favicon. Is there anything I can do to make sure I download a proper image?

That's a feature of the ICO file format: a single .ico file can hold several images at different sizes. They're perfectly valid files, but you're going to need to process them with something that actually understands Windows Icon files.
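To see what a downloaded favicon actually contains, the directory at the start of the file can be read with the standard library alone. The sketch below is only an illustration, not a full icon decoder: it assumes the file follows the usual ICO layout (a 6-byte header followed by one 16-byte entry per image), and the function name list_ico_images is made up for this example.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def list_ico_images(data: bytes):
    """Return one dict per image stored in an ICO file (hypothetical helper)."""
    # ICONDIR header: reserved (must be 0), type (1 = icon), image count.
    reserved, ico_type, count = struct.unpack_from("<HHH", data, 0)
    if reserved != 0 or ico_type != 1:
        raise ValueError("not an ICO file")

    entries = []
    for i in range(count):
        # Each ICONDIRENTRY is 16 bytes and starts right after the header.
        width, height, _colors, _reserved, _planes, bpp, size, offset = (
            struct.unpack_from("<BBBBHHII", data, 6 + 16 * i)
        )
        payload = data[offset:offset + size]
        entries.append({
            # A stored width or height of 0 means 256 pixels.
            "width": width or 256,
            "height": height or 256,
            "bits_per_pixel": bpp,
            "offset": offset,
            "size": size,
            # Payloads are either a complete PNG or a headerless BMP.
            "is_png": payload.startswith(PNG_SIGNATURE),
        })
    return entries
```

Calling list_ico_images(open("favicon.ico", "rb").read()) (the file name is just an example) shows which sizes are present, so the crawler can pick one entry and hand it to an image library that understands ICO instead of treating the whole file as a single picture.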

Related

TIFF File Download Automation in EarthExplorer

I'm working on a project to download cropped tiff files for the following area from this website https://earthexplorer.usgs.gov/.
The crop will generate 417 datasets (tiff files) which are divided into 10 pages and ready to be downloaded by clicking the download icon in the following image.
Then a choice of file formats to download will appear and I choose GeoTIFF 1 Arc-second Download.
The website requires login to download the files.
Because there are too many files to download by hand, I want to automate the download process. I have used Selenium and the Chrome web driver in Python for this task, but due to limited information I have not been able to complete it.
Any suggestions on how to finish this job? Or is there a similar project reference? Your information means a lot to me. Thank you

Pytube direct video downloading without any folder

I finished a Streamlit app that downloads video and audio using pytube, and I now want to deploy it so that others and I can use it on phones, computers, and so on. However, when I deploy it and use it from another device, the app itself works but the file is not downloaded to that device. Therefore I want the file to be downloaded directly through Chrome (the way pictures and other files are downloaded in Chrome). Is there any particular method or trick for that? Thanks
You need to post the code of your app. To specify a download directory in pytube, pass it to download():
directory = '/home/user/Documents'
file.download(directory)  # file is the pytube stream you selected

Convert PDF to HTML without losing any format

I'm developing a Python Flask webapp and I'm trying to convert some user uploaded pdfs to nicely formatted HTML, like the HTML that is being produced when you display a pdf inside an iframe.
I have tried several things so far:
the pdfminer.six library, which produced messy HTML;
grabbing the HTML produced when rendering a PDF with pdf.js, which is apparently hidden in a Shadow DOM with no access to its inner HTML;
finally, pdf2htmlEX (https://github.com/pdf2htmlEX/pdf2htmlEX), which produced exactly what I wanted.
Locally, this solution worked great, but in production (Heroku) I was unable to install it correctly. The project is deprecated and its documentation is limited and poor. The problem has something to do with broken dependencies.
So, how can I convert PDFs to HTML effectively without losing any formatting, using Python or any other tool?
Thanks a lot.
If anyone is willing to help me get pdf2htmlEX to work on Heroku, leave a comment and I will post more details in a separate post.
This is not going to be trivial, but here are some pointers.
You need an app.json in which you define your buildpacks.
https://devcenter.heroku.com/articles/app-json-schema#buildpacks
If the project is available via apt, it is going to be easy: use Heroku's Apt buildpack and define an Aptfile that lists the packages to install.
The buildpack then installs them automatically and you are done.
If it is not available as a package, you will need to create your own buildpack.
https://devcenter.heroku.com/articles/buildpack-api
Another solution is to dockerize your project and execute it as a docker container.
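Whichever route gets the binary onto the dyno (Apt buildpack, custom buildpack, or Docker), the Flask app would then call it as an external program. Below is a minimal sketch of that step, assuming the executable is named pdf2htmlEX and accepts a --dest-dir option followed by the input PDF and an output file name; check pdf2htmlEX --help on your build before relying on this. The helper names build_command and convert_pdf_to_html are made up for illustration.

```python
import shutil
import subprocess
from pathlib import Path

def build_command(pdf_path, out_dir, binary="pdf2htmlEX"):
    """Build the argument list for one conversion (hypothetical helper)."""
    pdf_path = Path(pdf_path)
    # Assumed CLI shape: pdf2htmlEX --dest-dir <dir> <input.pdf> <output.html>
    return [binary, "--dest-dir", str(out_dir), str(pdf_path), pdf_path.stem + ".html"]

def convert_pdf_to_html(pdf_path, out_dir, binary="pdf2htmlEX"):
    """Run the converter and return the path of the generated HTML file."""
    # Fail early with a clear message if the buildpack did not install the tool.
    if shutil.which(binary) is None:
        raise RuntimeError(f"{binary} was not found on PATH")
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    # check=True raises CalledProcessError if the converter exits with an error.
    subprocess.run(build_command(pdf_path, out_dir, binary), check=True)
    return Path(out_dir) / (Path(pdf_path).stem + ".html")
```

The shutil.which check is also a quick way to confirm, from inside the deployed app, whether the installation step worked at all.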

Socket.IO works with Firefox and Edge but doesn't work with Chrome?

I'm fairly new to Python and I'm trying to build something on Flask/Socket.IO. Before starting, I decided to study this example. I tested it on Firefox and Edge and it works perfectly: I get the numbers in real time. On Chrome, however, I only get the page without the numbers. I opened the console and there is nothing in it; it looks like the socket won't connect in Chrome. I did some research and could not find any similar issue. Any advice is appreciated!
Which version of Google Chrome do you use? I just tested on my "Version 69.0.3497.100 (Official Build) (64 bit)" and it works very well.
So considering that you have the same (or higher) version of Chrome, try this:
Delete your browsing data (images and cached files) and refresh the page. There may be a conflict between cached files and files loaded from CDNs in your HTML page.
If it still does not work, try this:
In your templates/index.html file, replace the line <script src="static/js/application.js"></script> with <script src="{{url_for('static',filename='js/application.js')}}"></script> to load the JavaScript file. Finally, restart the server. See here for more details on how to load a static file in Flask.
I hope this will help.

Taking a screenshot of uploaded files using Python/Heroku

There are design files uploaded/downloaded by users on a website. For every uploaded file, I would like to show a screenshot of the file so people can see an image before they download them.
They are very esoteric files, though, that need to be opened in particular design tools. (I don't even have the software to open them on my local machine.)
My thinking is that I can run a virtual machine that has these programs installed, programmatically open each file, then take a screenshot of the opened file and save a thumbnail. I want this to happen when the user uploads the design file.
Someone told me PIL could do this, but I looked into it and could not find any documentation on how to go about it.
Has anyone ever done something like this before? What is the best approach?
