I am not an expert in Python, but I used it to pull data from an API. I got a status code 200 and printed part of the data, but I am not able to export/save/write the output (to CSV). Can anyone assist?
This is my code:
import requests
headers = {
'Accept': 'text/csv',
'Authorization': 'Bearer ...'
}
response = requests.get('https://feeds.preqin.com/api/investor/pe', headers=headers)
print response
output = response.content
And here is what the data (should be CSV, correct?) looks like:
[image: screenshot of the returned data]
I managed to save it as txt but the output is not usable/ importable (e.g. to Excel). I used the following code:
text_file = open("output.txt", "w")
n = text_file.write(output)
text_file.close()
Thank you and best regards,
A
Your content uses pipes | as a separator. CSVs use , commas (that's why they're called Comma Separated Values).
You can simply replace your data's pipes with commas. However, this may be problematic if the data you have itself uses commas.
output = response.content.replace("|", ",")
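If the fields themselves may contain commas, a plain string replace would corrupt those rows. Below is a minimal sketch using the standard csv module, which quotes such fields automatically. The helper name pipe_to_csv and the sample text are made up for illustration, and it assumes the response body has already been decoded to text (for example via response.text).

```python
import csv
import io


def pipe_to_csv(pipe_text):
    """Convert pipe-separated text to comma-separated text (hypothetical helper)."""
    reader = csv.reader(io.StringIO(pipe_text), delimiter="|")
    out = io.StringIO()
    # lineterminator="\n" keeps the result free of "\r\n" so it is easy to inspect
    writer = csv.writer(out, lineterminator="\n")
    # Fields that contain a comma are wrapped in double quotes by the writer
    writer.writerows(reader)
    return out.getvalue()


sample = "name|city\nAcme, Inc.|Berlin\n"  # made-up data with an embedded comma
converted = pipe_to_csv(sample)
```

The returned string can then be written to a file opened with newline="" so that Excel sees exactly one line break per row.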
As comments have suggested, you could also use pandas:
import pandas as pd
from io import StringIO
# Get your output normally...
output = response.text  # decoded text; io.StringIO requires text rather than bytes
df = pd.read_csv(StringIO(output), sep="|")
# Saving to .CSV
df.to_csv(r"C:\output.csv")
# Saving to .XLSX
df.to_excel(r"C:\output.xlsx")
I am doing this for the first time and so far have set up a simple script to fetch 2 columns of data from an API. The data comes through and I can see it with the print command.
Now I am trying to write it to CSV and have set up the code below, which creates the file, but I can't figure out how to:
1. Remove the blank lines in between each data row
2. Add delimiters to the data, which I want to be " "
3. If a value such as IP is blank, then just show " "
I searched and tried all sorts of examples but am just getting errors.
My code snippet which writes the CSV successfully is:
import requests
import csv
import json
# Make an API call and store response
url = 'https://api-url-goes-here.com'
filename = "test.csv"
headers = {
'accept': 'application/json',
}
r = requests.get(url, headers=headers, auth=('User','PWD'))
print(f"Status code: {r.status_code}")
#Store API response in a variable
response_dict = r.json()
#Open a File for Writing
f = csv.writer(open(filename, "w", encoding='utf8'))
# Write CSV Header
f.writerow(["Computer_Name", "IP_Addresses"])
for computer in response_dict["advanced_computer_search"]["computers"]:
    f.writerow([computer["Computer_Name"],computer["IP_Addresses"]])
CSV output I get looks like this:
Computer_Name,IP_Addresses
HYDM002543514,
HYDM002543513,10.93.96.144 - AirPort - en1
HYDM002544581,192.168.1.8 - AirPort - en1 / 10.93.224.177 -
GlobalProtect - gpd0
HYDM002544580,10.93.80.101 - Ethernet - en0
HYDM002543515,192.168.0.6 - AirPort - en0 / 10.91.224.58 -
GlobalProtect - gpd0
CHAM002369458,10.209.5.3 - Ethernet - en0
CHAM002370188,192.168.0.148 - AirPort - en0 / 10.125.91.23 -
GlobalProtect - gpd0
MacBook-Pro,
I tried adding
csv.writer(f, delimiter =' ',quotechar =',',quoting=csv.QUOTE_MINIMAL)
after the f = csv.writer line but that creates an error: TypeError: argument 1 must have a "write" method
I am sure it's something simple but I just can't find the correct solution to implement in the code I have. Any help is appreciated.
Also, does the file get closed automatically? Some examples suggest to use something like f.close() but that causes errors. Do I need it? The file seems to get created fine as-is.
I suggest using the pandas package to write the .csv file; it is one of the most widely used packages for data analysis.
For your problem:
import requests
import csv
import json
import pandas
# Make an API call and store response
url = 'https://api-url-goes-here.com'
filename = "test.csv"
headers = {
'accept': 'application/json',
}
r = requests.get(url, headers=headers, auth=('User','PWD'))
print(f"Status code: {r.status_code}")
#Store API response in a variable
response_dict = r.json()
#collect data to build pandas.DataFrame
data = []
for computer in response_dict["advanced_computer_search"]["computers"]:
    # filter blank line
    if computer["Computer_Name"] or computer["IP_Addresses"]:
        data.append({"Computer_Name":computer["Computer_Name"],"IP_Addresses":computer["IP_Addresses"]})
pandas.DataFrame(data=data).to_csv(filename, index=False)
If you want to use " " to separate values, you can set sep=" " in the last line, which outputs the .csv file. However, I recommend using , as the delimiter because it is the common standard. Many more options can be set for the DataFrame.to_csv() method; see the official docs: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_csv.html
As you said in the comment, pandas is not a standard Python package. You can simply open a file and write lines to it, building the lines manually. For example:
import requests
import csv
import json
# Make an API call and store response
url = 'https://api-url-goes-here.com'
filename = "test.csv"
headers = {
'accept': 'application/json',
}
r = requests.get(url, headers=headers, auth=('User','PWD'))
print(f"Status code: {r.status_code}")
#Store API response in a variable
response_dict = r.json()
#Open a File for Writing
with open(filename, mode='w') as f:
    # Write CSV Header
    f.write("Computer_Name,"+"IP_Addresses"+"\n")
    for computer in response_dict["advanced_computer_search"]["computers"]:
        # filter blank line
        if computer["Computer_Name"] or computer["IP_Addresses"]:
            f.write("\""+computer["Computer_Name"]+"\","+"\""+computer["IP_Addresses"]+"\"\n")
Note that the " around each value is added by appending \", and the \n starts a new line after each row.
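For comparison, the standard csv module can produce the same quoted output without building lines by hand. The sketch below uses made-up sample records in place of the API response, and write_computers is a hypothetical helper name. It assumes the blank rows in the question come from newline translation on Windows, which opening the file with newline="" prevents.

```python
import csv
import io


def write_computers(computers, fileobj):
    """Write name/IP rows with every field quoted (hypothetical helper)."""
    # QUOTE_ALL wraps every value in double quotes, so empty values become ""
    writer = csv.writer(fileobj, quoting=csv.QUOTE_ALL)
    writer.writerow(["Computer_Name", "IP_Addresses"])
    for computer in computers:
        # "or ''" turns a missing or None value into an empty string
        writer.writerow([
            computer.get("Computer_Name") or "",
            computer.get("IP_Addresses") or "",
        ])


# Made-up records standing in for response_dict["advanced_computer_search"]["computers"]
sample = [
    {"Computer_Name": "HYDM002543514", "IP_Addresses": ""},
    {"Computer_Name": "HYDM002543513", "IP_Addresses": "10.93.96.144 - AirPort - en1"},
]

buffer = io.StringIO()
write_computers(sample, buffer)
csv_text = buffer.getvalue()
```

To write a real file, the same helper could be called inside with open(filename, "w", newline="", encoding="utf8") as f: so that the with block closes the file and no explicit f.close() is needed.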
I have an API endpoint to which I am uploading data using Python. The endpoint accepts
putHeaders = {
'Authorization': user,
'Content-Type': 'application/octet-stream' }
My current code does this:
1. Save a dictionary as a CSV file
2. Encode the CSV to UTF-8
dataFile = open(fileData['name'], 'r').read().encode('utf-8')
3. Upload the file to the API endpoint
fileUpload = requests.put(url,
                          headers=putHeaders,
                          data=dataFile)
What I am trying to achieve is loading the data without saving it to a file.
So far I have tried converting my dictionary to bytes using
data = json.dumps(payload).encode('utf-8')
and uploading that to the API endpoint. This works, but the output at the API endpoint is not correct.
Question
Does anyone know how to upload CSV-type data without actually saving the file?
EDIT: use io.StringIO() as your file-like object when you're writing your dict to CSV. Then call getvalue() and pass the result as your data param to requests.put().
See this question for more details: How do I write data into CSV format as string (not file)?.
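A minimal sketch of that in-memory approach, assuming the dictionary is flat (one value per column). The helper name dict_to_csv_bytes and the sample dict are made up for illustration.

```python
import csv
import io


def dict_to_csv_bytes(row):
    """Serialise one flat dict as a header line plus one data line (hypothetical helper)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(row), lineterminator="\n")
    writer.writeheader()
    writer.writerow(row)
    # getvalue() returns the whole buffer regardless of the current stream position
    return buffer.getvalue().encode("utf-8")


payload = {"col1": 1, "col2": 2}  # made-up example data
csv_bytes = dict_to_csv_bytes(payload)
```

The resulting bytes could then be passed as data=csv_bytes to requests.put() together with the headers from the question.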
Old answer:
If your dict is this:
my_dict = {'col1': 1, 'col2': 2}
then you could convert it to a csv format like so:
csv_data = ','.join(my_dict.keys()) + '\n'
csv_data += ','.join(str(value) for value in my_dict.values())
csv_data = csv_data.encode('utf8')
And then do your requests.put() call with data=csv_data.
Updated answer
I hadn't realized your input was a dictionary, you had mentioned the dictionary was being saved as a file. I assumed the dictionary lookup in your code was referencing a file. More work needs to be done if you want to go from a dict to a CSV file-like object.
Based on the I/O from your question, it appears that your input dictionary has this structure:
file_data = {"name": {"Col1": 1, "Col2": 2}}
Given that, I'd suggest trying the following using csv and io:
import csv
import io
import requests
session = requests.Session()
session.headers.update(
    {"Authorization": user, "Content-Type": "application/octet-stream"}
)
file_data = {"name": {"Col1": 1, "Col2": 2}}
with io.StringIO() as f:
    name = file_data["name"]
    writer = csv.DictWriter(f, fieldnames=name)
    writer.writeheader()
    writer.writerows([name])  # `name` is a dict but writerows expects a list of dicts
    # getvalue() returns the full contents regardless of the stream position,
    # which sits at the end of the buffer after writing
    response = session.put(url, data=f.getvalue().encode("utf-8"))
You may want to test using the correct MIME type passed in the request header. While the endpoint may not care, it's best practice to use the correct type for the data. CSV should be text/csv. Python also provides a MIME types module:
>>> import mimetypes
>>>
>>> mimetypes.types_map[".csv"]
'text/csv'
Original answer
Just open the file in bytes mode rather than worrying about encoding or reading it into memory.
Additionally, use a context manager to handle the file rather than assigning to a variable, and pass your header to a Session object so you don't have to repeatedly pass header data in your request calls.
Documentation on the PUT method:
https://requests.readthedocs.io/en/master/api/#requests.put
data – (optional) Dictionary, list of tuples, bytes, or file-like object to send in the body of the Request.
import requests
session = requests.Session()
session.headers.update(
    {"Authorization": user, "Content-Type": "application/octet-stream"}
)
with open(file_data["name"], "rb") as f:
    response = session.put(url, data=f)
Note: I modified your code to more closely follow python style guides.
import base64
import requests
USERNAME, PASSWORD = 'notworking', 'notworking'
def send_request():
    # Request
    try:
        response = requests.get(
            url="https://api.mysportsfeeds.com/v1.1/pull/nhl/2017-2018-regular/cumulative_player_stats.{format}",
            params={
                "fordate": "20171009"
            },
            headers={
                "Authorization": "Basic " +
                base64.b64encode('{}:{}'.format(USERNAME,PASSWORD)\
                                 .encode('utf-8')).decode('ascii')
            }
        )
        print('Response HTTP Status Code: {status_code}'.format(
            status_code=response.status_code))
        print('Response HTTP Response Body: {content}'.format(
            content=response.content))
    except requests.exceptions.RequestException:
        print('HTTP Request failed')
That code allows me to pull data from mysportsfeeds.com. Eventually, I will need to take the output of the send_request function and format it in a .xlsx file with the openpyxl library. I don't know which format will be the easiest to work with, i.e. the CSV or the JSON output.
That excellent website will show you how to get the output of cumulative_player_stats.
For instance,
https://api.mysportsfeeds.com/v1.1/pull/nhl/2016-2017-regular/cumulative_player_stats.{format}
where {format} is either csv or json
Questions :
Which is the better choice, CSV or JSON output, so that it will work well with the openpyxl lib? Would anyone be able to show me how it could work with CSV (with the csv library) and JSON (with the json library) using openpyxl?
Excel is a row-based file format. This would suggest CSV, which is also row-based. But CSV files are text-only, meaning that they contain no type information, and you have to guess whether "9/10/17" means the 9th of October (20)17, the 10th of September (20)17, or simply the string "9/10/17".
JSON is at least typed but will need to be read into memory all at once. Assuming that it is merely a list of lists, it would probably be the best option here: reading it all into memory is not a real constraint, because Excel worksheets cannot hold more than about a million rows anyway.
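To make the difference concrete, here is a rough sketch of both routes into openpyxl. It assumes the JSON payload is a flat array of objects with identical keys, which may well not match the real feed (a nested response would need flattening first). The function names and sample data are made up; only Workbook(), the active sheet, append() and save() come from openpyxl.

```python
import csv
import io
import json


def rows_from_csv(text):
    """Parse CSV text into a list of rows; every value stays a string."""
    return list(csv.reader(io.StringIO(text)))


def rows_from_json(text):
    """Parse a JSON array of flat objects into a header row plus data rows.

    Assumes every object has the same keys as the first one.
    """
    records = json.loads(text)
    header = list(records[0])
    return [header] + [[record[key] for key in header] for record in records]


def save_rows(rows, path):
    """Write rows to an .xlsx file with openpyxl (third-party package)."""
    from openpyxl import Workbook  # imported here so the parsers work without it

    workbook = Workbook()
    sheet = workbook.active
    for row in rows:
        sheet.append(row)  # append() adds one worksheet row per list
    workbook.save(path)


# Made-up sample data; the real feed's field names are not known here
csv_rows = rows_from_csv("player,goals\nSmith,3\n")
json_rows = rows_from_json('[{"player": "Smith", "goals": 3}]')
```

Note that the CSV route yields the string "3" while the JSON route yields the integer 3, so numeric cells from CSV would need an explicit conversion before being saved.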
[Image: JSON data output when printed in command line]
I am currently pulling data via an API and am attempting to write the data into a CSV in order to run calculations in SQL. I am able to pull the data and open the CSV; however, an error occurs when the data is written into the CSV: each individual character is separated by a comma.
I am new to working with JSON data so I am curious if I need to perform an intermediary step between pulling the JSON data and inserting it into a CSV. Any help would be greatly appreciated as I am completely stuck on this (even the data provider does not seem to know how to get around this).
Please see the code below:
import requests
import time
import pyodbc
import csv
import json
headers = {'Authorization': 'Token'}
Metric1 = ['Website1','Website2']
Metric2 = ['users','hours','responses','visits']
Metric3 = ['Country1','Country2','Country3']
obs_list = []
obs_file = r'TEST.csv'
with open(obs_file, 'w') as csvfile:
    f=csv.writer(csvfile)
    for elem1 in Metric1:
        for elem2 in Metric2:
            for elem3 in Metric3:
                URL = "www.data.com"
                r = requests.get(URL, headers=headers, verify=False)
                for elem in r:
                    f.writerow(elem)
Edit: When I print the data instead of writing it to a CSV, the data appears in the command window in the following format:
[timestamp, metric], [timestamp, metric], [timestamp, metric] ...
Timestamp = 12 digit character
Metric = decimal value
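One likely cause of the stray commas: iterating over a requests response yields chunks of the raw body, and writerow() treats whatever sequence it receives as a list of fields, so a string is split into single characters. Below is a minimal sketch of the intermediary step, assuming the body is a JSON array of [timestamp, metric] pairs as described in the edit. The helper name, the column names and the sample payload are made up.

```python
import csv
import io
import json


def write_observations(json_text, fileobj):
    """Parse a JSON array of [timestamp, metric] pairs and write one CSV row per pair."""
    observations = json.loads(json_text)
    writer = csv.writer(fileobj, lineterminator="\n")
    writer.writerow(["timestamp", "metric"])
    # Each element is already a two-item list, so writerows() maps it to two columns
    writer.writerows(observations)


# Made-up payload in the shape described in the question's edit
sample = '[["201801010000", 1.5], ["201801010100", 2.25]]'
buffer = io.StringIO()
write_observations(sample, buffer)
csv_text = buffer.getvalue()
```

With requests, r.json() performs the same parsing step as json.loads(r.text), so the inner loop could become a single writerows(r.json()) call if the payload really has this shape.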
I am trying to send data from a text file to a server, which looks for a match to the sent data and returns the matched data to me so that I can store it in an existing text file. If I send a list of names to the server from within the script, I am fine. However, I want to repeat the request and use a text file as the source of the names to be matched and returned. Here is my code so far:
import json
import urllib2
values = 'E:\names.txt'
url = 'https://myurl.com/get?name=values&key=##########'
response = json.load(urllib2.urlopen(url))
with open('E:\data.txt', 'w') as outfile:
    json.dump(response, outfile, sort_keys = True, indent = 4,ensure_ascii=False);
This code just sends back a one-line file showing that nothing has matched. I am assuming that it is just looking at the literal word values as the name instead of the data in the values text file.
Update Trial 1: I updated my code as suggested below to include the urllib.urlencode suggestion. Here is my updated code:
import json
import urllib
import urllib2
file = 'E:\names.txt'
url = 'https://myurl.com/get'
values = {'name' : file,
          'key' : '##########'}
data = urllib.urlencode(values)
req = urllib2.Request(url, data)
response = json.load(urllib2.urlopen(req))
with open('E:\data.txt', 'w') as outfile:
    json.dump(response, outfile, sort_keys = True, indent = 4,ensure_ascii=False);
fixed traceback errors by editing url. However it is just passing "e:\names.txt" as name in the JSON request. So it seems my issue now is just trying to send the data in the names.txt file to the tuple 'names' properly. Any thoughts?
Make sure when sending parameters to server, they're encoded -- see urllib.urlencode()
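Encoding alone will not read the file: the names have to be loaded from names.txt first and then encoded. Here is a minimal sketch in Python 3, where the function lives at urllib.parse.urlencode. It assumes the file holds one name per line and that the server accepts a repeated name parameter, neither of which is confirmed by the question. The helper names and sample values are made up.

```python
from urllib.parse import urlencode


def read_names(path):
    """Return the non-empty, stripped lines of a text file."""
    with open(path, encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]


def build_query(names, key):
    """Build an encoded query string with one name parameter per entry."""
    # doseq=True expands the list into repeated name=... pairs
    return urlencode({"name": names, "key": key}, doseq=True)


query = build_query(["Ann Lee", "Bob"], "KEY")  # made-up names and key
```

In real use the list would come from read_names(r"E:\names.txt"); the raw string matters because "\n" inside a normal string literal is a newline character.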