Copy text from website to text/excel file - python

I'm trying to create a simple (hopefully) Python script that copies the text from this address:
http://api.bitcoincharts.com/v1/trades.csv?symbol=mtgoxUSD
to either a simple text file or an excel spreadsheet.
I've tried using the urllib and requests libraries, but every time I ran even a very basic script, the shell didn't display anything.
For example,
import requests
data = requests.get('http://api.bitcoincharts.com/v1/trades.csv?symbol=mtgoxUSD')
data.text
Any help would be appreciated. Thank you.

You're almost there:
import requests
symbol = "mtgoxUSD"
url = 'http://api.bitcoincharts.com/v1/trades.csv?symbol={}'.format(symbol)
data = requests.get(url)
# dump resulting text to file
with open("trades_{}.csv".format(symbol), "w") as out_f:
    out_f.write(data.text)

Using urllib (Python 2):
import urllib
f = urllib.urlopen("http://api.bitcoincharts.com/v1/trades.csv?symbol=mtgoxUSD")
print f.read()
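A note on why the original script showed nothing: in a script, a bare expression such as data.text is evaluated and then discarded; only the interactive prompt echoes it, so the value has to be printed or written somewhere. As a minimal Python 3 sketch of what could follow the download, the helpers below split the CSV text into rows and save it unchanged. The names parse_csv_text, save_text and the sample string are hypothetical; the sample stands in for data.text.

```python
import csv
import io


def parse_csv_text(text):
    """Split CSV text into a list of rows (each row is a list of strings)."""
    return list(csv.reader(io.StringIO(text)))


def save_text(text, path):
    """Write the downloaded text to a local file unchanged."""
    with open(path, "w", newline="") as out_f:
        out_f.write(text)


# Hypothetical sample standing in for data.text from the answer above.
sample = "1,10.5,2\n2,11.0,1\n"
rows = parse_csv_text(sample)
print(rows[0])  # explicit print, so a script shows the first row
```

A file saved with a .csv extension can be opened directly in Excel, so no separate conversion step is needed for a simple spreadsheet.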

Related

How to get file from url in python?

I want to download text files using python, how can I do so?
I used the urllib.request module's urlopen(url).read(), but it gives me the bytes representation of the file.
For me, I had to do the following (Python 3):
from urllib.request import urlopen
data = urlopen("[your url goes here]").read().decode('utf-8')
# Do what you need to do with the data.
There are several options.
For the simplest solution, you can use this:
import urllib.request
file_url = 'https://someurl.com/text_file.txt'
for line in urllib.request.urlopen(file_url):
    print(line.decode('utf-8'))
For a solution using the requests API:
import requests
file_url = 'https://someurl.com/text_file.txt'
response = requests.get(file_url)
if response.ok:  # True for status codes below 400
    data = response.text
    for line in data.split('\n'):
        print(line)
When downloading text files with Python, I like to use the wget module:
import wget
remote_url = 'https://www.google.com/test.txt'
local_file = 'local_copy.txt'
wget.download(remote_url, local_file)
If that doesn't work, try using urllib:
from urllib import request
remote_url = 'https://www.google.com/test.txt'
file = 'copy.txt'
request.urlretrieve(remote_url, file)
When you read the response directly, you get the raw bytes, which is why the text appears in bytes form. Try writing the content to a file, then open that file to view it:
import requests
remote_url = 'http://test.com/test.txt'  # requests needs the scheme (http:// or https://)
local_file = 'local_file.txt'
data = requests.get(remote_url)
with open(local_file, 'wb') as file:
    file.write(data.content)
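To make the bytes-versus-text point concrete, here is a minimal Python 3 sketch using only the standard library. The function name fetch_text and the utf-8 fallback are assumptions for illustration; the charset is taken from the response headers when the server declares one.

```python
from urllib.request import urlopen


def fetch_text(url, default_encoding="utf-8"):
    """Download url and return its body decoded to str."""
    with urlopen(url) as response:
        raw = response.read()  # read() always returns bytes
        # Use the charset declared by the server, if any; otherwise
        # fall back to the assumed default.
        charset = response.headers.get_content_charset() or default_encoding
    return raw.decode(charset)
```

Calling fetch_text("[your url goes here]") returns a str, so printing it shows the text itself rather than a b'...' literal.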

Send Rust variable to Python?

I am building a Rust program in which the user types in a command, and then the program reads the command and responds accordingly. One of these commands is to download a file from a set site.
I have a .py file with the following code that I made a while ago that downloads files from a set site:
import urllib
import urllib2
import requests
url = 'http://www.blog.pythonlibrary.org/wpcontent/uploads/2012/06/wxDbViewer.zip'
print "downloading with urllib"
urllib.urlretrieve(url, "code.zip")
print "downloading with urllib2"
f = urllib2.urlopen(url)
data = f.read()
with open("code2.zip", "wb") as code:
    code.write(data)
print "downloading with requests"
r = requests.get(url)
with open("code3.zip", "wb") as code:
    code.write(r.content)
The URLs in the code are not ones that I will be using; they are examples.
If the Rust program stores the site it needs to go to in a variable, is there a way to send that variable to my Python program? I know you can pass data from Python to Rust:
Passing a list of strings from Python to Rust
http://www.joesacher.com/blog/2017/08/24/ptr-types/
Is there a way to do this in the other direction?
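One common approach, assuming the Rust program launches the Python script as a child process (for example with std::process::Command, passing the URL as an argument), is to have the script read the URL from its command line. The sketch below is in Python 3 and covers only the Python side; the function name parse_args and the -o/--output option are hypothetical.

```python
import argparse


def parse_args(argv=None):
    """Read the URL (and an optional output name) from the command line."""
    parser = argparse.ArgumentParser(description="Download a file from a URL.")
    parser.add_argument("url", help="address to download from")
    parser.add_argument("-o", "--output", default="code.zip",
                        help="local file name to save to")
    # argv=None makes argparse read sys.argv[1:], i.e. whatever the
    # parent process passed when it started this script.
    return parser.parse_args(argv)


# Simulate the argument the Rust side would pass; the URL is a placeholder.
args = parse_args(["http://example.com/file.zip"])
print(args.url, args.output)
```

In the real script, parse_args() would be called with no arguments, and args.url would replace the hard-coded url variable in the download code.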

How to read remote page content using Python

I need to read the content of a remote file using Python, but I am running into problems. My code is below:
import subprocess
path = 'http://securityxploded.com/remote-file-inclusion.php'
subprocess.Popen(["rsync", host-ip+path],stdout=subprocess.PIPE)
for line in ssh.stdout:
    line
Here I get the error NameError: name 'host' is not defined. I do not know what the host-ip value should be, because I am running my Python file from the terminal (python sub.py). I need to read the content of the remote file http://securityxploded.com/remote-file-inclusion.php.
You need the urllib library. Also, your code references variables (host, ip, ssh) that are never defined.
Try something like this:
import urllib.request
fp = urllib.request.urlopen("http://www.stackoverflow.com")
mybytes = fp.read()
mystr = mybytes.decode("utf8")
fp.close()
print(mystr)
Note: this is for Python 3.
For Python 2.7, use this:
import urllib
fp = urllib.urlopen("http://www.stackoverflow.com")
myfile = fp.read()
print myfile
If you want to read remote content via HTTP, requests and urllib2 are both good choices.
For Python 2, use requests:
import requests
resp = requests.get('http://example.com/')
print resp.text
This will work.
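The answers above assume the request succeeds. As a hedged Python 3 sketch, the version below adds a timeout and reports failures instead of raising; the function name read_remote_page, the 10-second timeout and the utf-8 fallback are assumptions, not part of the original answers.

```python
import urllib.error
import urllib.request


def read_remote_page(url, timeout=10):
    """Return the page content as str, or None if the URL cannot be opened."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as fp:
            charset = fp.headers.get_content_charset() or "utf-8"
            # errors="replace" keeps going if a few bytes do not decode.
            return fp.read().decode(charset, errors="replace")
    except urllib.error.URLError as exc:
        # HTTPError is a subclass of URLError, so HTTP error statuses
        # are reported here as well.
        print("could not read {}: {}".format(url, exc.reason))
        return None
```

A caller can then check the result, for example: page = read_remote_page("http://www.stackoverflow.com") followed by a test for None before using the text.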

Open a file from urlfetch in GAE

I'm trying to open a file in GAE that was retrieved using urlfetch().
Here's what I have so far:
from google.appengine.api import urlfetch
result = urlfetch.fetch('http://example.com/test.txt')
data = result.content
## f = open(...) <- what goes in here?
This might seem strange but there's a very similar function in the BlobStore that can write data to a blobfile:
f = files.blobstore.create(mime_type='txt', _blobinfo_uploaded_filename='test')
with files.open(f, 'a') as data:
    data.write(result.content)
How can I write data into an arbitrary file object?
Edit: Should've been more clear; I'm trying to urlfetch any file and open result.content in a file object. So it might be a .doc instead of a .txt
You can use the StringIO module to emulate a file object using the contents of your string.
from google.appengine.api import urlfetch
from StringIO import StringIO
result = urlfetch.fetch('http://example.com/test.txt')
f = StringIO(result.content)
You can then read() from the f object or use other file object methods like seek(), readline(), etc.
You do not have to open a file. You already have the text data in data = result.content.
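The StringIO module used above is Python 2 only. In Python 3 the same idea is expressed with io.BytesIO for bytes (or io.StringIO for str). A minimal sketch follows, in which the content variable is a hypothetical stand-in for result.content:

```python
import io

# Stand-in for result.content; in the real code this comes from urlfetch.
content = b"first line\nsecond line\n"

f = io.BytesIO(content)   # in-memory object with the usual file methods
first = f.readline()      # reads up to and including the first newline
f.seek(0)                 # rewind to the start, as with a real file
everything = f.read()     # reads the whole buffer

print(first)
print(everything == content)
```

Because the object lives in memory, this works for any payload type, such as a .doc file, as long as the consuming code only needs a file-like object.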

Saving a downloaded ZIP file w/Python

I'm working on a script that will automatically update an installed version of Calibre. Currently I have it downloading the latest portable version. I seem to be having trouble saving the zipfile. Currently my code is:
import urllib2
import re
import zipfile
#tell the user what is happening
print("Calibre is Updating")
#download the page
url = urllib2.urlopen ( "http://sourceforge.net/projects/calibre/files" ).read()
#determine current version
result = re.search('title="/[0-9.]*/([a-zA-Z\-]*-[0-9\.]*)', url).groups()[0][:-1]
#download file
download = "http://status.calibre-ebook.com/dist/portable/" + result
urllib2.urlopen( download )
#save
output = open('install.zip', 'w')
output.write(zipfile.ZipFile("install.zip", ""))
output.close()
You don't need to use zipfile.ZipFile for this (and the way you're using it, as well as urllib2.urlopen, has problems as well). Instead, you need to save the urlopen result in a variable, then read it and write that output to a .zip file. Try this code:
#download file
download = "http://status.calibre-ebook.com/dist/portable/" + result
request = urllib2.urlopen( download )
#save
output = open("install.zip", "wb")  # binary mode, since a ZIP file is not text
output.write(request.read())
output.close()
There is also a one-liner:
open('install.zip', 'wb').write(urllib.urlopen('http://status.calibre-ebook.com/dist/portable/' + result).read())
which is not memory-efficient, since it reads the whole download into memory before writing, but still works.
If you just want to download a file from the net, you can use urllib.urlretrieve:
Copy a network object denoted by a URL to a local file ...
Example using requests instead of urllib2:
import requests, re, urllib
print("Calibre is updating...")
content = requests.get("http://sourceforge.net/projects/calibre/files").content
# determine current version
v = re.search('title="/[0-9.]*/([a-zA-Z\-]*-[0-9\.]*)', content).groups()[0][:-1]
download_url = "http://status.calibre-ebook.com/dist/portable/{0}".format(v)
print("Downloading {0}".format(download_url))
urllib.urlretrieve(download_url, 'install.zip')
# file should be downloaded at this point
Have you tried
output = open('install.zip', 'wb')  # note the "b" flag, which means "binary file"
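Putting the binary-mode and memory-efficiency points together, here is a minimal Python 3 sketch that copies the download to disk in chunks and then checks that the result looks like a ZIP archive. The function name download_zip and the 64 KiB chunk size are assumptions for illustration.

```python
import shutil
import zipfile
from urllib.request import urlopen


def download_zip(url, path, chunk_size=64 * 1024):
    """Save url to path in binary mode; return True if it is a ZIP archive."""
    with urlopen(url) as response, open(path, "wb") as output:
        # copyfileobj reads and writes chunk_size bytes at a time, so the
        # whole archive is never held in memory at once.
        shutil.copyfileobj(response, output, chunk_size)
    return zipfile.is_zipfile(path)
```

With the variables from the question, the call would be download_zip(download, "install.zip"); a False result suggests the server returned something other than the archive, such as an HTML error page.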
