How can I download a database through an API with a GET request? - Python

I am trying to download a database through an API.
I extract the data names from a CSV file and iterate over URLs built from that list, but at the last step, the GET request, I get an error and none of the inputs are recognized.
This is my code:
#!/usr/bin/env python
import requests
import json
import urllib
import os
from pandas import *

data = read_csv("A3D_Human_Database.csv")
job_iden = data['job_id'].tolist()
for d in job_iden:
    p = d[0:14]
    print(f"downloading {p}")
    req = requests.get('http://biocomp.chem.uw.edu.pl/A3D2/RESTful/hproteome_job/{p}/')
    print(req.status_code)
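The likely culprit is the request URL: it is a plain string, so the literal text {p} is sent to the server instead of each job id (the print line above it already uses an f-string correctly). A minimal fix, assuming the endpoint otherwise accepts these fourteen-character ids:

for d in job_iden:
    p = d[0:14]
    print(f"downloading {p}")
    # f-string, so {p} is interpolated with the actual job id
    req = requests.get(f'http://biocomp.chem.uw.edu.pl/A3D2/RESTful/hproteome_job/{p}/')
    print(req.status_code)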

Related

Creating a tablib Dataset using a URL that points to a .csv file

I am trying to use the tablib library and create a Dataset from a .csv file. The following works:
import tablib
dataset = tablib.Dataset().load(open('data.csv').read())
However, in some cases, I'd like to load the .csv file from a URL.
Any ideas on how to do that?
You wrote
def get_ds(filename):
    return tablib.Dataset().load(open(filename).read())
You want
import os.path
import requests

def get_ds(src):
    if os.path.exists(src):
        # Local file: read it directly.
        txt = open(src).read()
    else:
        # Otherwise treat src as a URL and fetch it over HTTP.
        req = requests.get(src)
        req.raise_for_status()
        txt = req.text
    return tablib.Dataset().load(txt)
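Usage is then the same whether the source is a local path or a URL, for example (the URL here is a placeholder):

dataset = get_ds('data.csv')
dataset = get_ds('https://example.com/data.csv')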

Post Large File Using requests_toolbelt to vk

I am new to Python. I wrote a simple script for uploading a video from a URL to VK. I tested it with small files and it works, but with large files I run out of memory. I read that it is possible to POST large files using requests_toolbelt. How can I add this to my script?
import vk
import requests
from homura import download
import glob
import os
import json
url = raw_input("Enter URL: ")
download(url)
file_name = glob.glob('*.mp4')[0]
session = vk.Session(access_token='TOKEN')
vkapi = vk.API(session, v='5.80')
params = {'name': file_name, 'privacy_view': 'nobody', 'privacy_comment': 'nobody'}
param = vkapi.video.save(**params)
upload_url = param['upload_url']
print("Uploading ...")
request = requests.post(upload_url, files={'video_file': open(file_name, "rb")})
os.remove(file_name)
requests_toolbelt (https://github.com/requests/toolbelt) has an example that should work for you:
import requests
from requests_toolbelt import MultipartEncoder
...
...
m = MultipartEncoder(fields={'video_file': (file_name, open(file_name, "rb"))})
response = requests.post(upload_url, data=m, headers={'Content-Type': m.content_type})
If you know your video file's MIME type, you can add it as a third item in the tuple, like this:
m = MultipartEncoder(fields={
    'video_file': (file_name, open(file_name, "rb"), "video/mp4")})
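The reason this helps: MultipartEncoder streams the file from the open handle while the request body is being sent, whereas the plain files= upload builds the whole multipart body in memory first, which is what exhausts memory on large videos.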

python urllib alternative to requests.put command?

I want to send image data via the urllib library in Python 3.6.
I presently have a Python 2.7 implementation that uses the requests library.
Is there a way to replace requests with urllib in this code?
import argparse
import io
import os
import sys
import base64
import requests

def read_file_bytestream(image_path):
    data = open(image_path, 'rb').read()
    return data

if __name__ == "__main__":
    data = read_file_bytestream("testimg.png")
    requests.put("http://0.0.0.0:8080", files={'image': data})
Here is one way, pretty much taken from the docs:
import urllib.request

def read_file_bytestream(image_path):
    data = open(image_path, 'rb').read()
    return data

DATA = read_file_bytestream("file.jpg")
req = urllib.request.Request(url='http://httpbin.org/put', data=DATA, method='PUT')
with urllib.request.urlopen(req) as f:
    pass
print(f.status)
print(f.reason)
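One caveat: this sends the raw bytes as the request body, while the original requests call with files={'image': data} sends a multipart/form-data body. That is fine if the server accepts a raw body; otherwise you would have to build and encode the multipart payload yourself before handing it to urllib.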

How to send a request in a keep-alive "urlopen" session using python?

I'm trying to write an HTML-based grabber that can fetch all of a Twitter user's pictures.
I realized that a GET request to load more tweets/pictures is only sent when you scroll down to the bottom of the page,
but I have no idea how to simulate that in Python code.
This is my code, though it can only grab the "first page" of pictures.
import urllib2
import urllib
import re
import sys
import os
import urllib3

generalurl = 'https://twitter.com/'
INPUT_id = raw_input('Please input the target userid:')
targetpage = generalurl + INPUT_id + '/media'
page = urllib2.urlopen(targetpage)
fo = open('test0.html', 'w')
fo.write(page.read())
fo.close()

fo = open('test0.html', 'r')
pics = re.findall(r'(https://pbs\.twimg\.com/media/\S+\.jpg)', fo.read())
fo.close()

for everyid in pics:
    open_ = urllib.urlopen(everyid)
    filename = re.findall(r'https://pbs\.twimg\.com/media/(\S+\.jpg)', everyid)
    f = open(filename[0], 'wb')
    f.write(open_.read())
    f.close()
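The extra pictures are loaded by the page's JavaScript, which fires a paginated GET request each time you reach the bottom; you can replay that request yourself over a single keep-alive connection with requests.Session. A sketch of the pattern (the endpoint, cursor parameter, and response fields below are hypothetical placeholders; find the real ones in your browser's network tab):

import requests

session = requests.Session()  # one TCP connection is reused across requests (keep-alive)
cursor = None
while True:
    # Hypothetical pagination endpoint: replace with the XHR URL from the network tab.
    resp = session.get('https://twitter.com/example/media_timeline',
                       params={'max_position': cursor})
    resp.raise_for_status()
    chunk = resp.json()
    # ... pull the pbs.twimg.com picture URLs out of chunk here ...
    cursor = chunk.get('min_position')       # hypothetical cursor field
    if not chunk.get('has_more_items'):      # hypothetical "more pages" flag
        break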

BadZipFile while downloading from Kaggle

I am trying to download and unzip a Kaggle dataset with a Python script (Python 3.5), but I get an error.
import io
from zipfile import ZipFile
import csv
import urllib.request
url = 'https://www.kaggle.com/c/quora-question-pairs/download/test.csv.zip'
response = urllib.request.urlopen(url)
c = ZipFile(io.BytesIO(response.read()))
After running this code, I get the following error.
BadZipFile: File is not a zip file
How can I get rid of this error? What's the cause?
The cause is that, when you are not logged in, Kaggle returns an HTML login page instead of the zip archive, which is why ZipFile rejects the response. Using the requests module and a minor fix to http://ramhiser.com/2012/11/23/how-to-download-kaggle-data-with-python-and-requests-dot-py/, the solution is:
import io
from zipfile import ZipFile
import csv
import requests
# The direct link to the Kaggle data set
data_url = 'https://www.kaggle.com/c/quora-question-pairs/download/test.csv.zip'
# The local path where the data set is saved.
local_filename = "test.csv.zip"
# Kaggle Username and Password
kaggle_info = {'UserName': "my_username", 'Password': "my_password"}
# Attempts to download the CSV file. Gets rejected because we are not logged in.
r = requests.get(data_url)
# Login to Kaggle and retrieve the data.
r = requests.post(r.url, data=kaggle_info)
# Writes the data to a local file one chunk at a time.
f = open(local_filename, 'wb')
for chunk in r.iter_content(chunk_size=512 * 1024):  # reads 512 KB at a time into memory
    if chunk:  # filter out keep-alive new chunks
        f.write(chunk)
f.close()
c = ZipFile(local_filename)
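Once the archive downloads intact, extracting it is the usual zipfile call, for example:

with ZipFile(local_filename) as c:
    c.extractall()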
