How can I download images with no extension on Linux?

In my Python script I am collecting the image links from a webpage. A few of the image links look like this:
image.php?u=155594&dateline=1182409179
The terminal output looks like this:
HTTP request sent, awaiting response... 200 OK
Length: 4159 (4.1K) [image/png]
But the image gets saved as image.php?blabla.
Is there any way to save it in the proper format, with an extension?

With wget, use the -O option to specify the output file. Quote the URL so the shell does not interpret characters such as ? and &. For example:
wget -O img.png 'http://example.com/image.php?foo=bar'
Here's a little Python script for when you don't know the type:
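import sys
import urllib2
# The URL is the first command-line argument (sys.argv[0] is the script name)
d = urllib2.urlopen(sys.argv[1])
# getsubtype() returns e.g. 'png' for a Content-Type of image/png
o = open('image.%s' % d.info().getsubtype(), 'wb')
o.write(d.read())
o.close()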
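
On Python 3, where urllib2 no longer exists, the same idea can be sketched with urllib.request and mimetypes. The helper names extension_for and download_image and the ".bin" fallback are assumptions made for illustration; they are not part of the original answer.

```python
import mimetypes
import sys
import urllib.request


def extension_for(content_type, fallback=".bin"):
    """Return a file extension for a Content-Type value such as 'image/png'."""
    # Drop parameters like '; charset=binary' and normalise the case.
    mime = content_type.split(";")[0].strip().lower()
    # guess_extension() returns None when the type is unknown.
    return mimetypes.guess_extension(mime) or fallback


def download_image(url, basename="image"):
    """Save the resource at `url` as basename + an extension from its Content-Type."""
    with urllib.request.urlopen(url) as response:
        filename = basename + extension_for(response.headers.get("Content-Type", ""))
        with open(filename, "wb") as out:  # binary mode: image data is not text
            out.write(response.read())
    return filename


# Only download when a URL is actually passed on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    print(download_image(sys.argv[1]))
```

Run it with the quoted image URL as the only argument; the file name is derived from the Content-Type the server reports, not from the URL.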

Related

Specify a download path using the wget module in Python

I'm trying to download files from a site using the wget module.
The code is really simple:
image = 'linkoftheimage'
wget.download(image)
This works fine, but it saves the file in the folder containing the Python script. My goal is to download it to a different folder, but I can't find a way to specify one.
I tried a different approach with the os module:
os.system(f'wget -O {directory} {image}')
This method gives me an error: sh: -c: line 0: syntax error near unexpected token `('
So I tried another method:
with open(f'{directory}/photo %s.jpg' %a,'wb') as handler:
    handler.write(image)
This also didn't work.
Does anyone have an idea how I could solve this?

The package you specified has not been updated since 2015 and its repository is gone, so it should probably be avoided. You can download files using the third-party requests package (installable with pip install requests) like so:
import requests
image_url = 'https://www.fillmurray.com/200/300'
file_destination = 'desired/destination/file.jpg'
res = requests.get(image_url)
if res.status_code == 200:  # HTTP 200 means success
    with open(file_destination, 'wb') as file_handle:  # 'wb' means write binary
        file_handle.write(res.content)
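
To pick the folder separately from the file name, one option is to build the full path first and create the folder if it is missing. The sketch below swaps in the standard-library urllib.request for requests so that it has no dependencies; target_path, download_to and the "download.bin" default are hypothetical names chosen for illustration.

```python
import os
from urllib.parse import urlparse
from urllib.request import urlopen


def target_path(directory, url, default_name="download.bin"):
    """Return the path inside `directory` where the file at `url` should be saved."""
    # Use the last path component of the URL, ignoring any ?query part.
    name = os.path.basename(urlparse(url).path) or default_name
    return os.path.join(directory, name)


def download_to(directory, url):
    """Download `url` into `directory`, creating the folder when needed."""
    os.makedirs(directory, exist_ok=True)
    path = target_path(directory, url)
    with urlopen(url) as response, open(path, "wb") as handle:
        handle.write(response.read())
    return path
```

Calling download_to('photos', image_url) would then write the file under photos/ regardless of where the script itself lives.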

How to download and silent install .exe file with given URL using Python 3

I have a URL which is a download link to a software .exe file.
The intended operation is to use Python 3 to download the said file and then do a silent installation.
import os
import re
import subprocess
import requests
# url is assumed to be defined earlier; convert it to a string
url = str(url)
# convert the ftp download path to https and remove chars to make it downloadable
httpurl = re.sub("ftp://", "https://", url)
print("url is: " + url)
# function to break the file name from the full path
def split(downloadurl):
    p, downloadexename = os.path.split(downloadurl)
    return [downloadexename]
# Remove all the unnecessary stuff you don't need
downloadurl = httpurl.replace("'","").replace(',','').replace(":2100/FTP Folders/Software","").replace(" ","%20").strip("(").strip(")")
print("Download URL is: " + downloadurl)
down_name = os.path.basename(downloadurl)
down_dir = r"C:\Desktop"
# Create the download folder if it doesn't exist
if not os.path.exists(down_dir):
    os.makedirs(down_dir)
full_path = os.path.join(down_dir, down_name)
# Download the installer to full_path
with open(full_path, "wb") as f:
    f.write(requests.get(downloadurl).content)
# Silent install
subprocess.call([full_path, '/Silent'], shell=True)

It depends on what kind of installer the .exe is, but there is a high chance it already has the ability to install silently. For example, InnoSetup installers (a very common type) use the parameter /SILENT. If you don't know the installer type, you may try running the .exe with different variants of /s, /S, /SILENT etc. -- or some variants of /?, --help to show command line usage.
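
As a minimal sketch of the install step, assuming the installer has already been downloaded and that it accepts the flag you pass (the /SILENT default only applies to installers that understand it, such as InnoSetup ones), subprocess.run can be called with an argument list. The function names and the timeout value are illustrative assumptions.

```python
import subprocess


def silent_command(installer_path, flag="/SILENT"):
    """Build the argument list for running an installer with one silent flag."""
    # A list (rather than one string) avoids quoting problems with spaces in paths.
    return [installer_path, flag]


def silent_install(installer_path, flag="/SILENT", timeout=600):
    """Run the installer and return its exit code (0 usually means success)."""
    completed = subprocess.run(silent_command(installer_path, flag), timeout=timeout)
    return completed.returncode
```

If the exit code is non-zero or a dialog still appears, the installer probably expects a different flag, such as /S.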

Internal Server Error in my Code?

#!/usr/bin/python3
import cgi
import cgitb
import urllib.request
import os
import sys
def enco_print(string="", encoding = "utf8"):
    sys.stdout.buffer.write(string.encode(encoding) + b"\n")
cgitb.enable()
form = cgi.FieldStorage(endcoding="utf8")
name_name= form.getvalue("name")
url_name = form.getvalue("url")
response = urllib.request.urlopen(str(url_name))
html = response.read().decode("utf8")
if not os.path.exists("gecrwalt"):
    os.mkdir("gecrwalt")
with open("/gecrwalt/" + str(url_name) + ".html", "w", endcoding="utf8")
as f:
f.write(str(html))
When I try to run this script, I get a 500 status error on my website. I can't see what's wrong with this code.
I'd be very thankful for any help.

There are a few typos where you wrote endcoding instead of encoding.
The last segment here
with open("/gecrwalt/" + str(url_name) + ".html", "w", endcoding="utf8")
as f:
f.write(str(html))
has broken indentation; I'm not sure whether this was due to a copy-paste error here on Stack Overflow.
Another issue is that (if I understood your code correctly) url_name will contain a complete URL like http://example.com, which will result in an error because that is not a valid filename. You will have to come up with some scheme to store these files safely, for example URL-encoding the URL or taking a hash of it. Your save path also starts with /, which means it starts at the file system root; I'm not sure whether that is intentional.
Changing the typo and the last bit has worked for me in a quick test:
with open("gecrwalt/something.html", "w", encoding="utf8") as f:
    f.write(str(html))
Debugging hint: I started a local Python web server process with this command:
python3 -m http.server --bind localhost --cgi 8000
Accessing http://localhost:8000/cgi-bin/filename.py will show you all errors that occur. I'm not sure which web server you are currently using, but it should have error logs somewhere.
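
The two naming schemes mentioned above (URL-encoding the URL or hashing it) could look roughly like this. The function names are made up for illustration; which scheme fits depends on whether you need to recover the URL from the filename.

```python
import hashlib
from urllib.parse import quote


def quoted_filename(url):
    """Percent-encode the whole URL so it contains no '/' or ':' characters."""
    # safe="" makes quote() encode '/' as well, which it leaves alone by default.
    return quote(url, safe="") + ".html"


def hashed_filename(url):
    """Use a fixed-length SHA-256 hex digest of the URL as the file name."""
    return hashlib.sha256(url.encode("utf8")).hexdigest() + ".html"


print(quoted_filename("http://example.com/a"))  # http%3A%2F%2Fexample.com%2Fa.html
```

The quoted form is reversible with urllib.parse.unquote but can get long; the hashed form always has the same length but cannot be turned back into the URL.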

Using BeautifulSoup in CGI without installing

I am trying to build a simple scraper in Python, which will run on a web server via CGI. Basically it will return a value determined by a parameter passed to it in a URL. I need BeautifulSoup to do the processing of HTML pages on the web server. However, I'm using HelioHost, which doesn't give me shell access or pip etc. I can only use FTP. On the BS website, it says you can directly extract it and use it without installing.
So I got the tarball on my Win7 machine, used 7-zip to remove the bz2 compression and then the tar archive layer, which gave me a bs4 folder and a setup.py file. I transferred the complete bs4 folder via FTP to my cgi-bin directory, where the Python script is located. My script is:
#!/usr/bin/python
import cgitb
cgitb.enable()
import urllib
import urllib2
from bs4 import *
print "Content-type: text/html\n\n"
print "<html><head><title>CGI Demo</title></head>"
print "<h1>Hello World</h1>"
print "</html>"
But it is giving me an error:
/home/poiasd/public_html/cgi-bin/lel.py
6 import urllib
7 import urllib2
8 from bs4 import *
9
10 print "Content-type: text/html\n\n"
bs4 undefined
SyntaxError: invalid syntax (__init__.py, line 29)
args = ('invalid syntax', ('/home/poiasd/public_html/cgi-bin/bs4/__init__.py', 29, 6, 'from .builder import builder_registry\n'))
filename = '/home/poiasd/public_html/cgi-bin/bs4/__init__.py'
lineno = 29
msg = 'invalid syntax'
offset = 6
print_file_and_line = None
text = 'from .builder import builder_registry\n'
How can I use the bs4 module via CGI? How can I install but not-install it? Can I convert the BeautifulSoup I have on my PC to a nice little BeautifulSoup4.py which will contain all the code?

You are using a version of Python that doesn't yet support PEP 328 Relative Imports; e.g. Python 2.4 or older. BeautifulSoup 4 requires Python 2.7 or newer.
Presumably you cannot upgrade to a newer Python version. In that case you can try using BeautifulSoup 3; it'll have a few bugs and you'll be missing some features, but at least you can get past the syntax error.
However, I note that HelioHost does list Python 2.7 as supported.
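
To find out which interpreter the server actually runs for a given shebang line, a small CGI script can print the version. This is only a sketch: version_report is a hypothetical helper, and the single-argument print(...) calls are used so the script also runs on Python 2.

```python
#!/usr/bin/python
import sys


def version_report():
    """Return the interpreter version and path as plain text."""
    return "Python %s\nExecutable: %s" % (sys.version.split()[0], sys.executable)


if __name__ == "__main__":
    # CGI response: header line, blank line, then the body.
    print("Content-Type: text/plain")
    print("")
    print(version_report())
```

If the reported version is older than 2.7, changing the shebang to the path of a newer interpreter (where the host provides one) is worth trying before falling back to BeautifulSoup 3.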

Python - How do you run a .py file?

I've looked all around Google and its archives. There are several good articles, but none seem to help me out. So I thought I'd come here for a more specific answer.
The Objective: I want to run this code on a website to get all the picture files at once. It'll save a lot of pointing and clicking.
I've got Python 2.3.5 on a Windows 7 x64 machine. It's installed in C:\Python23.
How do I get this script to "go", so to speak?
=====================================
WOW. 35k views. Seeing as how this is top result on Google, here's a useful link I found over the years:
http://learnpythonthehardway.org/book/ex1.html
For setup, see exercise 0.
=====================================
FYI: I've got zero experience with Python. Any advice would be appreciated.
As requested, here's the code I'm using:
"""
dumpimages.py
Downloads all the images on the supplied URL, and saves them to the
specified output file ("/test/" by default)
Usage:
python dumpimages.py http://example.com/ [output]
"""
from BeautifulSoup import BeautifulSoup as bs
import urlparse
from urllib2 import urlopen
from urllib import urlretrieve
import os
import sys
def main(url, out_folder="C:\asdf\"):
    """Downloads all the images at 'url' to /test/"""
    soup = bs(urlopen(url))
    parsed = list(urlparse.urlparse(url))
    for image in soup.findAll("img"):
        print "Image: %(src)s" % image
        filename = image["src"].split("/")[-1]
        parsed[2] = image["src"]
        outpath = os.path.join(out_folder, filename)
        if image["src"].lower().startswith("http"):
            urlretrieve(image["src"], outpath)
        else:
            urlretrieve(urlparse.urlunparse(parsed), outpath)
def _usage():
    print "usage: python dumpimages.py http://example.com [outpath]"
if __name__ == "__main__":
    url = sys.argv[-1]
    out_folder = "/test/"
    if not url.lower().startswith("http"):
        out_folder = sys.argv[-1]
        url = sys.argv[-2]
        if not url.lower().startswith("http"):
            _usage()
            sys.exit(-1)
    main(url, out_folder)

On Windows, you have two choices:
1. In a command-line terminal, type:
c:\python23\python xxxx.py
2. Open the Python editor IDLE from the menu, open xxxx.py, then press F5 to run it.
For your posted code, the error is at this line:
def main(url, out_folder="C:\asdf\"):
It should be:
def main(url, out_folder="C:\\asdf\\"):
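
The reason for the doubled backslashes is that a backslash starts an escape sequence inside an ordinary string literal: \" escapes the closing quote (hence the syntax error) and \a is the bell character. A short sketch of the alternatives, with variable names chosen only for illustration:

```python
# Each \\ in an ordinary string literal is one literal backslash.
escaped = "C:\\asdf\\"

# A raw string keeps backslashes as-is, but it cannot end in a single
# backslash, so the trailing one is appended separately.
raw = r"C:\asdf" + "\\"

print(escaped == raw)  # True
print(repr("\a"))      # '\x07' : the bell character, not backslash + a
```

Forward slashes ("C:/asdf/") are usually accepted by Python's file functions on Windows as well, which avoids the escaping question altogether.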

Usually you can double click the .py file in Windows explorer to run it. If this doesn't work, you can create a batch file in the same directory with the following contents:
C:\python23\python YOURSCRIPTNAME.py
Then double click that batch file. Or, you can simply run that line in the command prompt while your working directory is the location of your script.

Since you seem to be on Windows, you can run python <filename.py>. Check that Python's install folder is in your PATH, or use the full path: c:\python23\python <filename.py>. Python is an interpreted language, so you need the interpreter to run your file, much like you need the Java runtime to run a jar file.

Use the IDLE editor (you may already have it). It has an interactive shell for Python and will show you the execution and the result.

Your command should include the url parameter as stated in the script usage comments.
The main function has two parameters, url and out_folder (which has a default value):
C:\python23\python "C:\PathToYourScript\SCRIPT.py" http://yoururl.com "C:\OptionalOutput"
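
In a current Python 3 version of such a script, the same "URL plus optional output folder" command line could be declared with argparse instead of inspecting sys.argv by hand. This is a sketch only; parse_args and the "test" default folder are assumptions, not part of the posted script.

```python
import argparse


def parse_args(argv=None):
    """Parse a required URL and an optional output folder."""
    parser = argparse.ArgumentParser(description="Download all images found at a URL.")
    parser.add_argument("url", help="page to scan, e.g. http://example.com/")
    # nargs="?" makes the positional argument optional, with a default.
    parser.add_argument("out_folder", nargs="?", default="test",
                        help="folder to save images in (default: test)")
    return parser.parse_args(argv)
```

Calling parse_args() with no argument reads the real command line, and a missing URL produces a usage message automatically.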

If you want to run .py files on Windows, try installing Git Bash.
Then download the required version of Python from python.org and install it in a folder on the C: drive.
For me, it's:
"C:\Python38"
Then open Git Bash and go to the folder where your .py file is stored.
For me, it's:
File Location : "Downloads"
File Name : Train.py
So I changed my current working directory from "C:/Users/(username)/" to "C:/Users/(username)/Downloads".
Then I run the command below:
/c/Python38/python Train.py
and it runs successfully.
But if it gives the error below:
from sklearn.model_selection import train_test_split
ModuleNotFoundError: No module named 'sklearn'
then do not panic. Install the missing package (sklearn is distributed on PyPI as scikit-learn) with this command:
/c/Python38/Scripts/pip install scikit-learn
After it has installed, go back and run the previous command:
/c/Python38/python Train.py
and it will run successfully.
Happy learning!
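
A ModuleNotFoundError often means the package was installed for a different interpreter than the one running the script. A quick check, sketched here with a hypothetical is_installed helper, is to print the interpreter path and test whether a top-level module can be found:

```python
import importlib.util
import sys


def is_installed(module_name):
    """Return True if a top-level module can be imported by this interpreter."""
    return importlib.util.find_spec(module_name) is not None


print("Interpreter:", sys.executable)
print("json available:", is_installed("json"))
```

Running pip as python -m pip install scikit-learn with that same interpreter ensures the package lands where the script will look for it.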
