I would like to POST a .tgz file to a backend server with the Python urllib2 library. I can't use requests due to licensing issues. There are some examples of file upload on Stack Overflow, but they all relate to attaching a file in a form.
My code is the following, but it unfortunately fails:
stats["random"] = "data"
statsFile = "mydata.json"
headersFile = "header-data.txt"
tarFile = "body.tgz"
headers = {}
#Some custom headers
headers["X-confidential"] = "Confidential"
headers["X-version"] = "2"
headers["Content-Type"] = "application/x-gtar"
#Create the json and txt files
with open(statsFile, 'w') as a, open(headersFile, 'w') as b:
json.dump(stats, a, indent=4)
for k,v in headers.items():
b.write(k+":"+v+"\n")
#Create a compressed file to send
tar = tarfile.open(tarFile, 'w:gz' )
for name in [statsFile,headersFile]:
tar.add(name)
tar.close()
#Read the binary data from the file
with open(tarFile, 'rb') as f:
content = f.read()
url = "http://www.myurl.com"
req = urllib2.Request(url, data=content, headers=headers)
response = urllib2.urlopen(req, timeout=timeout)
If I use requests, it works like a charm:
r = requests.post(url, files={tarFile: open(tarFile, 'rb')}, headers=headers)
I essentially need the equivalent of the above for urllib2. Does anybody know how to do it? I have checked the docs as well, but I was not able to make it work. What am I missing?
Thanks!
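For context on the difference: requests' files= argument builds a multipart/form-data body, whereas the urllib2 code above posts the archive's raw bytes. If the server expects multipart, a hand-built body along these lines should be the urllib2 equivalent (a minimal sketch; the form field name "file" is an assumption, since the original doesn't specify one):

import urllib2
import uuid

# Build the multipart/form-data body by hand; the field name "file"
# is an assumption -- adjust it to whatever the backend expects.
boundary = uuid.uuid4().hex
with open(tarFile, 'rb') as f:
    file_data = f.read()
body = ('--' + boundary + '\r\n'
        'Content-Disposition: form-data; name="file"; filename="' + tarFile + '"\r\n'
        'Content-Type: application/x-gtar\r\n'
        '\r\n') + file_data + ('\r\n--' + boundary + '--\r\n')
headers["Content-Type"] = 'multipart/form-data; boundary=' + boundary
req = urllib2.Request(url, data=body, headers=headers)
response = urllib2.urlopen(req, timeout=30)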
Related
This works fine, but I want to upload multiple files (metadata) to IPFS under the same CID using Python.
import requests
import json
import os
import csv

header = ['image', 'IPFS']
images = os.listdir("./images/")
with open('ipfs.csv', 'w', encoding='UTF8') as f:
    writer = csv.writer(f)
    # write the header
    writer.writerow(header)
    for image in images:
        # write the data
        data = []
        name = image.replace(".png", "").replace(".jpg", "")
        data.append(name)
        url = "https://api.pinata.cloud/pinning/pinFileToIPFS"
        payload = {}
        files = [
            ('file', ('file', open("./images/" + image, 'rb'), 'application/octet-stream'))
        ]
        headers = {
            'pinata_api_key': 'APIKEY',
            'pinata_secret_api_key': 'SECRETAPIKEY'
        }
        response = requests.request("POST", url, headers=headers, data=payload, files=files)
        info = json.loads(response.text)
        data.append("ipfs://" + info['IpfsHash'])
        writer.writerow(data)
A solution using another API would also be fine.
One more thing: I'm running this code on Android with Pydroid 3.
You can't use the same CID for different files. A CID is the hash of a file's content, so every distinct file gets its own CID.
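What does get a single CID is a directory: IPFS hashes the folder as a whole, and each file is then addressed as CID/filename. With Pinata this is commonly done by sending several parts whose filenames share a folder prefix; the sketch below assumes that pattern and a hypothetical folder name metadata (check Pinata's pinFileToIPFS docs for the exact conventions):

import os
import requests

url = "https://api.pinata.cloud/pinning/pinFileToIPFS"
headers = {
    'pinata_api_key': 'APIKEY',
    'pinata_secret_api_key': 'SECRETAPIKEY'
}
# One part per file; the shared "metadata/" prefix in the part
# filenames is what signals that they belong to one directory.
files = [
    ('file', ('metadata/' + image, open("./images/" + image, 'rb'), 'application/octet-stream'))
    for image in os.listdir("./images/")
]
response = requests.post(url, headers=headers, files=files)
print(response.json())  # IpfsHash here would be the CID of the whole directory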
I have a jpg image that is stored at a url that I need to access and read the binary/byte data from.
I can get the file in Python by using:
import urllib3
http = urllib3.PoolManager()
url = 'link to jpg'
contents = http.request('GET', url)
Reading the data from this request with contents.data doesn't seem to provide the correct binary, but if I download the file and read it locally, I get the correct binary. And I cannot continue by reading the file contents like this:
with open(contents, "rb") as image:
    f = image.read()
Using the bytes from the request doesn't work either:
with open(contents.data, "rb") as image:
    f = image.read()
How can I treat the jpg from the url as if it were local so that I can read the binary correctly?
The result obtained in f when the file is read locally and the result of contents.data are exactly the same.
import urllib3

http = urllib3.PoolManager()
url = 'https://tinyjpg.com/images/social/website.jpg'
contents = http.request('GET', url)
with open('website.jpg', "rb") as image:
    f = image.read()
print(f == contents.data)
You can download the image from the link in the code and then run this code; you will receive the output True, which shows that the data read from the local image file is the same as the data read from the website.
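If the goal is to treat the downloaded bytes as if they were a local file (something you can call read() on), wrapping them in io.BytesIO is a minimal sketch using only the standard library:

import io
import urllib3

http = urllib3.PoolManager()
contents = http.request('GET', 'https://tinyjpg.com/images/social/website.jpg')
# io.BytesIO turns the raw bytes into a file-like object
image = io.BytesIO(contents.data)
f = image.read()  # same result as reading a downloaded copy from disk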
I am doing this for the first time and so far have set up a simple script to fetch 2 columns of data from an API. The data comes through and I can see it with a print command. Now I am trying to write it to CSV and have set up the code below, which creates the file, but I can't figure out how to:
1. Remove the blank lines in between each data row
2. Add delimiters to the data, which I want to be " "
3. If a value such as IP is blank, then just show " "
I searched and tried all sorts of examples but just got errors. My code snippet which writes the CSV successfully is:
import requests
import csv
import json
# Make an API call and store response
url = 'https://api-url-goes-here.com'
filename = "test.csv"
headers = {
    'accept': 'application/json',
}
r = requests.get(url, headers=headers, auth=('User','PWD'))
print(f"Status code: {r.status_code}")
#Store API response in a variable
response_dict = r.json()
#Open a File for Writing
f = csv.writer(open(filename, "w", encoding='utf8'))
# Write CSV Header
f.writerow(["Computer_Name", "IP_Addresses"])
for computer in response_dict["advanced_computer_search"]["computers"]:
    f.writerow([computer["Computer_Name"], computer["IP_Addresses"]])
CSV output I get looks like this:
Computer_Name,IP_Addresses
HYDM002543514,
HYDM002543513,10.93.96.144 - AirPort - en1
HYDM002544581,192.168.1.8 - AirPort - en1 / 10.93.224.177 - GlobalProtect - gpd0
HYDM002544580,10.93.80.101 - Ethernet - en0
HYDM002543515,192.168.0.6 - AirPort - en0 / 10.91.224.58 - GlobalProtect - gpd0
CHAM002369458,10.209.5.3 - Ethernet - en0
CHAM002370188,192.168.0.148 - AirPort - en0 / 10.125.91.23 - GlobalProtect - gpd0
MacBook-Pro,
I tried adding
csv.writer(f, delimiter=' ', quotechar=',', quoting=csv.QUOTE_MINIMAL)
after the f = csv.writer line, but that creates an error: TypeError: argument 1 must have a "write" method
I am sure it's something simple, but I just can't find the correct solution to implement in the code I have. Any help is appreciated.
Also, does the file get closed automatically? Some examples suggest using something like f.close(), but that causes errors. Do I need it? The file seems to get created fine as-is.
I suggest you use the pandas package to write the .csv file; it is one of the most widely used packages for data analysis.
For your problem:
import requests
import csv
import json
import pandas

# Make an API call and store response
url = 'https://api-url-goes-here.com'
filename = "test.csv"
headers = {
    'accept': 'application/json',
}
r = requests.get(url, headers=headers, auth=('User', 'PWD'))
print(f"Status code: {r.status_code}")
# Store API response in a variable
response_dict = r.json()
# Collect data to build a pandas.DataFrame
data = []
for computer in response_dict["advanced_computer_search"]["computers"]:
    # filter blank lines
    if computer["Computer_Name"] or computer["IP_Addresses"]:
        data.append({"Computer_Name": computer["Computer_Name"], "IP_Addresses": computer["IP_Addresses"]})
pandas.DataFrame(data=data).to_csv(filename, index=False)
If you want to use " " to separate values, you can set sep=" " in the last line, which outputs the .csv file. However, I recommend using , as the delimiter, since it is the common standard. Many more options can be set for the DataFrame.to_csv() method; you can check the official docs: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_csv.html
As you said in a comment, pandas is not a standard Python package. You can simply open a file and write lines to it, building each line manually. For example:
import requests
import json

# Make an API call and store response
url = 'https://api-url-goes-here.com'
filename = "test.csv"
headers = {
    'accept': 'application/json',
}
r = requests.get(url, headers=headers, auth=('User', 'PWD'))
print(f"Status code: {r.status_code}")
# Store API response in a variable
response_dict = r.json()
# Open a file for writing
with open(filename, mode='w') as f:
    # Write CSV header
    f.write("Computer_Name," + "IP_Addresses" + "\n")
    for computer in response_dict["advanced_computer_search"]["computers"]:
        # filter blank lines
        if computer["Computer_Name"] or computer["IP_Addresses"]:
            f.write("\"" + computer["Computer_Name"] + "\"," + "\"" + computer["IP_Addresses"] + "\"\n")
Note that the quotation marks around each value are built by appending \", and the \n starts a new line after each row.
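For completeness, the standard csv module can also cover all three points from the question; here is a minimal sketch reusing response_dict and filename from above. Passing newline='' to open() is what prevents the blank line between rows on Windows, and the with block closes the file automatically, so no explicit f.close() is needed:

import csv

with open(filename, "w", encoding="utf8", newline="") as fh:
    writer = csv.writer(fh, quoting=csv.QUOTE_ALL)  # quote every value
    writer.writerow(["Computer_Name", "IP_Addresses"])
    for computer in response_dict["advanced_computer_search"]["computers"]:
        # empty values become "" in the output
        writer.writerow([computer["Computer_Name"] or "", computer["IP_Addresses"] or ""])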
I'm using this code to download .torrent files:
import urllib2

torrent = urllib2.urlopen(torrent_url, timeout=30)  # torrent_url is the .torrent link
output = open('mytorrent.torrent', 'wb')
output.write(torrent.read())
output.close()
The resultant mytorrent.torrent file doesn't open in any BitTorrent client and throws up an "unable to parse meta file" error. The problem apparently is that although the torrent URL (e.g. http://torcache.com/torrent-file-1.torrent) ends with a '.torrent' suffix, the response is compressed with gzip and needs to be uncompressed and then saved as a torrent file. I've confirmed this by unzipping the file in the terminal: gunzip mytorrent.torrent > test.torrent, and opening the result in the BitTorrent client, which works fine.
How do I modify the Python code to look up the file's encoding, figure out how it is compressed, and use the right tool to uncompress it and save it as a .torrent file?
Gzipped data must be unzipped. You can detect this by looking at the Content-Encoding header.
import gzip, urllib2
from StringIO import StringIO

req = urllib2.Request(url)
opener = urllib2.build_opener()
response = opener.open(req)
data = response.read()
if response.info().get('content-encoding') == 'gzip':
    # wrap the compressed bytes in a file-like object for GzipFile
    gzipper = gzip.GzipFile(fileobj=StringIO(data))
    data = gzipper.read()
with open('mytorrent.torrent', 'wb') as output:
    output.write(data)
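If the server doesn't set Content-Encoding reliably, a fallback (a small sketch in the same Python 2 setting as above) is to check for the gzip magic number at the start of the payload:

# gzip streams always begin with the two magic bytes 0x1f 0x8b
if data[:2] == '\x1f\x8b':
    data = gzip.GzipFile(fileobj=StringIO(data)).read()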
I'm uploading potentially large files to a web server. Currently I'm doing this:
import urllib2
f = open('somelargefile.zip','rb')
request = urllib2.Request(url,f.read())
request.add_header("Content-Type", "application/zip")
response = urllib2.urlopen(request)
However, this reads the entire file's contents into memory before posting it. How can I have it stream the file to the server?
Reading through the mailing list thread linked to by systempuntoout, I found a clue towards the solution.
The mmap module allows you to open a file so that it acts like a string. Parts of the file are loaded into memory on demand.
Here's the code I'm using now:
import urllib2
import mmap
# Open the file as a memory mapped string. Looks like a string, but
# actually accesses the file behind the scenes.
f = open('somelargefile.zip','rb')
mmapped_file_as_string = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
# Do the request
request = urllib2.Request(url, mmapped_file_as_string)
request.add_header("Content-Type", "application/zip")
response = urllib2.urlopen(request)
#close everything
mmapped_file_as_string.close()
f.close()
The documentation doesn't say you can do this, but the code in urllib2 (and httplib) accepts any object with a read() method as data. So using an open file seems to do the trick.
You'll need to set the Content-Length header yourself. If it's not set, urllib2 will call len() on the data, which file objects don't support.
import os.path
import urllib2

data = open(filename, 'rb')
headers = {'Content-Length': str(os.path.getsize(filename))}
request = urllib2.Request(url, data, headers)
response = urllib2.urlopen(request)
This is the relevant code that handles the data you supply. It's from the HTTPConnection class in httplib.py in Python 2.7:
def send(self, data):
    """Send `data' to the server."""
    if self.sock is None:
        if self.auto_open:
            self.connect()
        else:
            raise NotConnected()

    if self.debuglevel > 0:
        print "send:", repr(data)
    blocksize = 8192
    if hasattr(data, 'read') and not isinstance(data, array):
        if self.debuglevel > 0: print "sendIng a read()able"
        datablock = data.read(blocksize)
        while datablock:
            self.sock.sendall(datablock)
            datablock = data.read(blocksize)
    else:
        self.sock.sendall(data)
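In other words, any object with a read() method is streamed in 8 KB blocks. As an illustration (a sketch, not part of the original answer), that makes it easy to wrap a file to track upload progress, as long as Content-Length is still set by hand:

import os
import urllib2

class ProgressFile(object):
    """Wraps a file object; send() above calls read() in 8 KB blocks."""
    def __init__(self, fp):
        self.fp = fp
        self.sent = 0
    def read(self, size=-1):
        block = self.fp.read(size)
        self.sent += len(block)  # bytes handed to the socket so far
        return block

filename = 'somelargefile.zip'
headers = {'Content-Length': str(os.path.getsize(filename))}
request = urllib2.Request(url, ProgressFile(open(filename, 'rb')), headers)
response = urllib2.urlopen(request)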
Have you tried Mechanize?
from mechanize import Browser

br = Browser()
br.open(url)
br.select_form(nr=0)  # a form must be selected before adding a file
br.form.add_file(open('largefile.zip', 'rb'), 'application/zip', 'largefile.zip')
br.submit()
or, if you don't want to use multipart/form-data, check this old post.
It suggests two options:
1. Use mmap, Memory Mapped file object
2. Patch httplib.HTTPConnection.send
Try pycurl. I don't have anything set up that will accept a large file outside of a multipart/form-data POST, but here's a simple example that reads the file as needed.
import os
import pycurl

class FileReader:
    def __init__(self, fp):
        self.fp = fp
    def read_callback(self, size):
        return self.fp.read(size)

c = pycurl.Curl()
c.setopt(pycurl.URL, url)
c.setopt(pycurl.UPLOAD, 1)
c.setopt(pycurl.READFUNCTION, FileReader(open(filename, 'rb')).read_callback)
filesize = os.path.getsize(filename)
c.setopt(pycurl.INFILESIZE, filesize)
c.perform()
c.close()
Using the requests library you can do
with open('massive-body', 'rb') as f:
requests.post('http://some.url/streamed', data=f)
as mentioned here in their docs
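A related trick from the same docs: if you pass a generator instead of a file object, requests switches to chunked transfer encoding, so the total size never needs to be known up front (a sketch):

import requests

def read_in_chunks(path, chunk_size=8192):
    # yield the file piece by piece; requests sends each chunk as it comes
    with open(path, 'rb') as fh:
        while True:
            chunk = fh.read(chunk_size)
            if not chunk:
                break
            yield chunk

requests.post('http://some.url/streamed', data=read_in_chunks('massive-body'))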
Below is a working example for both Python 2 and Python 3:
import os

try:
    from urllib2 import urlopen, Request  # Python 2
except ImportError:
    from urllib.request import urlopen, Request  # Python 3

headers = {'Content-length': str(os.path.getsize(filepath))}
with open(filepath, 'rb') as f:
    req = Request(url, data=f, headers=headers)
    result = urlopen(req).read().decode()
The requests module is great, but sometimes you cannot install any extra modules...