Python - How to add delimiter and remove line breaks in CSV output? - python

I am doing this for the first time and so far have set up a simple script to fetch 2 columns of data from an API. The data comes through and I can see it with the print command. Now I am trying to write it to CSV and have set up the code below, which creates the file, but I can't figure out how to:
1. Remove the blank lines in between each data row
2. Add delimiters to the data, which I want to be " "
3. If a value such as IP is blank, then just show " "
I searched and tried all sorts of examples but am just getting errors.
My code snippet, which writes the CSV successfully, is:
import requests
import csv
import json
# Make an API call and store response
url = 'https://api-url-goes-here.com'
filename = "test.csv"
headers = {
    'accept': 'application/json',
}
r = requests.get(url, headers=headers, auth=('User','PWD'))
print(f"Status code: {r.status_code}")
#Store API response in a variable
response_dict = r.json()
#Open a File for Writing
f = csv.writer(open(filename, "w", encoding='utf8'))
# Write CSV Header
f.writerow(["Computer_Name", "IP_Addresses"])
for computer in response_dict["advanced_computer_search"]["computers"]:
    f.writerow([computer["Computer_Name"],computer["IP_Addresses"]])
CSV output I get looks like this:
Computer_Name,IP_Addresses
HYDM002543514,
HYDM002543513,10.93.96.144 - AirPort - en1
HYDM002544581,192.168.1.8 - AirPort - en1 / 10.93.224.177 -
GlobalProtect - gpd0
HYDM002544580,10.93.80.101 - Ethernet - en0
HYDM002543515,192.168.0.6 - AirPort - en0 / 10.91.224.58 -
GlobalProtect - gpd0
CHAM002369458,10.209.5.3 - Ethernet - en0
CHAM002370188,192.168.0.148 - AirPort - en0 / 10.125.91.23 -
GlobalProtect - gpd0
MacBook-Pro,
I tried adding
csv.writer(f, delimiter =' ',quotechar =',',quoting=csv.QUOTE_MINIMAL)
after the f = csv.writer line, but that raises an error: TypeError: argument 1 must have a "write" method
I am sure it's something simple, but I just can't find the correct solution to implement in the code I have. Any help is appreciated.
Also, does the file get closed automatically? Some examples suggest using something like f.close(), but that causes errors. Do I need it? The file seems to get created fine as-is.

I suggest using the pandas package to write the .csv file; it is one of the most widely used packages for data analysis.
For your problem:
import requests
import pandas
# Make an API call and store response
url = 'https://api-url-goes-here.com'
filename = "test.csv"
headers = {
    'accept': 'application/json',
}
r = requests.get(url, headers=headers, auth=('User','PWD'))
print(f"Status code: {r.status_code}")
# Store API response in a variable
response_dict = r.json()
# Collect data to build a pandas.DataFrame
data = []
for computer in response_dict["advanced_computer_search"]["computers"]:
    # filter blank line
    if computer["Computer_Name"] or computer["IP_Addresses"]:
        data.append({"Computer_Name":computer["Computer_Name"],"IP_Addresses":computer["IP_Addresses"]})
pandas.DataFrame(data=data).to_csv(filename, index=False)
If you want to use " " to separate values, you can set sep=" " in the last line, which writes the .csv file. However, I recommend using , as the delimiter because it is the common standard. DataFrame.to_csv() accepts many more options; see the official docs: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_csv.html
As you said in the comments, pandas is not part of the Python standard library. You can instead open a file and write the lines yourself, building each one manually. For example:
import requests
# Make an API call and store response
url = 'https://api-url-goes-here.com'
filename = "test.csv"
headers = {
    'accept': 'application/json',
}
r = requests.get(url, headers=headers, auth=('User','PWD'))
print(f"Status code: {r.status_code}")
# Store API response in a variable
response_dict = r.json()
# Open a file for writing; the with block closes it automatically
with open(filename, mode='w') as f:
    # Write CSV header
    f.write("Computer_Name,"+"IP_Addresses"+"\n")
    for computer in response_dict["advanced_computer_search"]["computers"]:
        # filter blank line
        if computer["Computer_Name"] or computer["IP_Addresses"]:
            f.write("\""+computer["Computer_Name"]+"\","+"\""+computer["IP_Addresses"]+"\"\n")
Note that the " around each value is added by appending \", and \n starts a new line after each row.
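For completeness, the same result can probably be had with the standard csv module alone. The sketch below is not taken from the answer above. It assumes that the blank rows come from opening the file without newline='' (which the csv documentation asks for, and which typically produces blank rows on Windows), that the broken rows come from line breaks inside the IP_Addresses value, and that the wanted "delimiter" is a quote character around every field. The TypeError in the question arises because csv.writer() was handed the writer object f rather than a file object. sample_computers, clean and write_computers are made-up names, and the sample data stands in for the real API response.

```python
import csv
import os
import tempfile

# Hypothetical sample standing in for
# response_dict["advanced_computer_search"]["computers"].
sample_computers = [
    {"Computer_Name": "HYDM002543514", "IP_Addresses": ""},
    {"Computer_Name": "HYDM002544581",
     "IP_Addresses": "192.168.1.8 - AirPort - en1 / 10.93.224.177 -\nGlobalProtect - gpd0"},
]


def clean(value):
    """Collapse line breaks and repeated whitespace; blank or None becomes a single space."""
    text = " ".join(str(value or "").split())
    return text if text else " "


def write_computers(path, computers):
    # newline="" lets the csv module control line endings (no blank rows on Windows);
    # the with block closes the file, so no explicit close() is needed.
    with open(path, "w", encoding="utf8", newline="") as fh:
        writer = csv.writer(fh, quoting=csv.QUOTE_ALL)  # put " around every field
        writer.writerow(["Computer_Name", "IP_Addresses"])
        for computer in computers:
            writer.writerow([clean(computer.get("Computer_Name")),
                             clean(computer.get("IP_Addresses"))])


out_path = os.path.join(tempfile.mkdtemp(), "test.csv")
write_computers(out_path, sample_computers)
with open(out_path, encoding="utf8", newline="") as fh:
    written = fh.read()
print(written)
```

The file is opened once and handed to csv.writer() once, with all formatting options passed in that single call.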

Related

Exporting API output response to CSV - Python

I am not an expert in Python but I used it to call data from an API. I got a code 200 and printed part of the data but I am not able to export/ save/ write the output (to CSV). Can anyone assist?
This is my code:
import requests
headers = {
    'Accept': 'text/csv',
    'Authorization': 'Bearer ...'
}
response = requests.get('https://feeds.preqin.com/api/investor/pe', headers=headers)
print response
output = response.content
And here is what the data (which should be CSV, correct?) looks like:
[Image: screenshot of the response data]
I managed to save it as txt but the output is not usable/ importable (e.g. to Excel). I used the following code:
text_file = open("output.txt", "w")
n = text_file.write(output)
text_file.close()
Thank you and best regards,
A
Your content uses pipes | as a separator. CSVs use , commas (that's why they're called Comma Separated Values).
You can simply replace your data's pipes with commas. However, this may be problematic if the data you have itself uses commas.
output = response.content.replace("|", ",")
As comments have suggested, you could also use pandas:
import pandas as pd
from io import StringIO  # Python 3; on Python 2 this was "from StringIO import StringIO"
# Get your output normally, as decoded text...
output = response.text
df = pd.read_csv(StringIO(output), sep="|")
# Saving to .CSV
df.to_csv(r"C:\output.csv")
# Saving to .XLSX
df.to_excel(r"C:\output.xlsx")
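If pandas is not available, the standard csv module can do the same conversion, and it sidesteps the comma problem mentioned above because fields that contain commas are quoted on output. This is a minimal sketch that assumes the response body is pipe-separated text. The sample string is invented for illustration and is not real API output, and pipes_to_csv is a hypothetical helper name.

```python
import csv
import io

# Invented pipe-separated sample standing in for response.text.
raw = "name|city\nAcme, Inc.|Berlin\nGlobex|Paris\n"


def pipes_to_csv(text):
    """Re-encode pipe-separated text as comma-separated CSV text."""
    reader = csv.reader(io.StringIO(text, newline=""), delimiter="|")
    out = io.StringIO(newline="")
    writer = csv.writer(out)  # default dialect: comma delimiter, minimal quoting
    writer.writerows(reader)
    return out.getvalue()


converted = pipes_to_csv(raw)
print(converted)
```

To save the result, write converted to a file opened with newline='' so the line endings are not translated a second time.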

How to read excel using pandas row by row?

I'm trying to write a program that reads an Excel sheet and passes the values into a curl command. It should read the Excel file, pass each variable into the curl command, and repeat that for every row. I got the curl command to work.
However, when I try to read my Excel sheet, I get this error and am not sure how to fix it:
'charmap' codec can't decode byte 0x8f in position 114: character maps to <undefined>
So here is my code:
import requests
import json
import pprint
import urllib
import sys
import pandas as pd
turl='*'
headers={'authorization':'Basic *'}
data={
    'grant_type':'*',
    'username':'*',
    'password':'*'
}
token=requests.post(turl,data=data,headers=headers)
jtoken=token.json()
json_str=json.dumps(jtoken)
resp=json.loads(json_str)
jkk=resp['access_token']
print(jkk)
path='C:\\Users\\temp\\Desktop\\Test123.xlsx'
data = []
with open(path) as f:
    for line in f:
        data.append(line.strip().split(','))
data = data[1:] # get the data without the first row which is data headers
print(data)
for entry in data:
    name, path, Tname, formatG = entry
    url1='*'
    data={"name": "{}".format(name),
          "path": "{}".format(path) ,
          "Tname" : "{}".format(Tname),
          "formatG":"{}".format(formatG)
          }
    pprint.pprint(response.json())
    data_json = json.dumps(data)
    headers = {'Content-type': 'application/json','Authorization': 'Bearer {}'.format(jkk)}
    response = requests.post(url1, data=data_json, headers=headers)
    pprint.pprint(response.json())
In my code I have * for privacy reasons. I'm currently having problems reading the excel sheet row by row and passing the data into the curl command.

Downloading XML files from a web services URL in python

Please correct me if I am wrong as I am a beginner in python.
I have a web services URL which contains an XML file:
http://abc.tch.xyz.edu:000/patientlabtests/id/1345
I have a list of values. I want to append each value in that list to the URL, download the file for each value, and name each downloaded file after the value that was appended.
It is possible to download one file at a time but I have 1000's of values in the list and I was trying to write a function with a for loop and I am stuck.
x = [ 1345, 7890, 4729]
for i in x :
    url = http://abc.tch.xyz.edu:000/patientlabresults/id/{}.format(i)
    response = requests.get(url2)
    ****** Missing part of the code ********
    with open('.xml', 'wb') as file:
        file.write(response.content)
        file.close()
The files downloaded from URL should be like
"1345patientlabresults.xml"
"7890patientlabresults.xml"
"4729patientlabresults.xml"
I know there is a part of the code which is missing and I am unable to fill in that missing part. I would really appreciate if anyone can help me with this.
Your web service URL does not seem to be working. Check this:
import requests
x = [ 1345, 7890, 4729]
for i in x :
    url2 = "http://abc.tch.xyz.edu:000/patientlabresults/id/"
    response = requests.get(url2+str(i)) # i must be converted to a string
Note: When you use 'with' to open a file, you do not have to close the file, since it will be closed automatically.
with open(filename, mode) as file:
    file.write(data)
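A quick way to convince yourself of this is to check the file object's closed attribute inside and after the block. The file name below is arbitrary and the file is created in a temporary directory.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "example.txt")

with open(path, "w") as file:
    file.write("some data")
    inside = file.closed   # still open while inside the block

after = file.closed        # closed as soon as the block is left

with open(path) as file:
    content = file.read()

print(inside, after, content)
```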
Since the URL you provided is not working, I am going to use a different URL. I hope this gives you the idea of how to write to a file using a custom name.
import requests
categories = ['fruit', 'car', 'dog']
for category in categories :
    url = "https://icanhazdadjoke.com/search?term="
    file_name = category + "_JOKES_2018" #Files will be saved as fruit_JOKES_2018
    r = requests.get(url + category)
    data = r.status_code #Storing the status code in 'data' variable
    with open(file_name+".txt", 'w+') as f:
        f.write(str(data)) # Writing the status code of each url in the file
After running this code, the status code will be written to each of the files. The files will be named as follows:
car_JOKES_2018.txt
dog_JOKES_2018.txt
fruit_JOKES_2018.txt
I hope this gives you an understanding of how to name the files and write into the files.
I think you just want to create a path using str.format, as you (almost) are doing for the URL. Maybe something like the following:
import os.path
import requests
x = [ 1345, 7890, 4729]
for i in x:
    path = '{}patientlabresults.xml'.format(i)
    # ignore this file if we've already got it
    if os.path.exists(path):
        continue
    # try and get the file, throwing an exception on failure
    url = 'http://abc.tch.xyz.edu:000/patientlabresults/id/{}'.format(i)
    res = requests.get(url)
    res.raise_for_status()
    # write the successful file out; res.content is bytes, so open in binary mode
    with open(path, 'wb') as fd:
        fd.write(res.content)
I've added some error handling and better behaviour on retry
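To try the naming and writing logic without access to the service, the HTTP call can be passed in as a function. This is only a sketch: save_results and fake_fetch are hypothetical names, and fake_fetch stands in for something like lambda url: requests.get(url).content.

```python
import os
import tempfile

BASE_URL = "http://abc.tch.xyz.edu:000/patientlabresults/id/{}"


def save_results(ids, fetch, out_dir):
    """Download one document per id and save it as '<id>patientlabresults.xml'.

    fetch is any callable that takes a URL and returns the body as bytes.
    Returns the list of paths written (files that already exist are skipped).
    """
    written = []
    for i in ids:
        path = os.path.join(out_dir, "{}patientlabresults.xml".format(i))
        if os.path.exists(path):
            continue
        body = fetch(BASE_URL.format(i))
        with open(path, "wb") as fd:  # binary mode, because the body is bytes
            fd.write(body)
        written.append(path)
    return written


def fake_fetch(url):
    # Stand-in for the real request so the example runs offline.
    return "<result url='{}'/>".format(url).encode("utf-8")


out_dir = tempfile.mkdtemp()
paths = save_results([1345, 7890, 4729], fake_fetch, out_dir)
names = sorted(os.listdir(out_dir))
print(names)
```

Running save_results a second time with the same ids writes nothing, because the existing files are skipped.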

Looping through IDs with an API

I am trying to bulk download movie information from The Movie Database. The preferred method mentioned on their website is to loop through movie IDs from 1 up to the most recent movie ID. When I pull individual movies using their ID, I get the entire set of information. However, when I do the same in a loop, I receive error 34, "resource cannot be found". For my example, I specifically picked a movie ID that I have already fetched individually (Skyfall, 37724), and it still returns the resource-cannot-be-found error.
import requests
dataset = []
for i in range(37724, 37725):
    url = 'https://api.themoviedb.org/3/movie/x?api_key=*****&language=en-US'
    movieurl = url[:35] + str(i) + url[36:]
    payload = "{}"
    response = requests.request("GET", url, data=payload)
    data = response.json()
    dataset.append(data)
    print(movieurl)
dataset
[ANSWERED] 1) Is there a reason for why the loop cannot pull the information? Is this a programming question or specific to the API?
2) Is the way my code set up the best to pull the information and store it in bulk? My ultimate goal is to create a CSV file with the data.
Your request uses url, while your actual url is in the movieurl variable.
To write your data to csv, I would recommend the python csv DictWriter, as your data are dicts (response.json() produces a dict).
BONUS: If you want to format a string, use the str.format method:
url = 'https://api.themoviedb.org/3/movie/{id}?api_key=*****&language=en-US'.format(id=i)
This is much more robust than slicing the URL by character position.
The working, improved version of your code, with writing to csv would be:
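import csv
import requests
with open('output.csv', 'w', newline='') as csvfile:
    writer = None
    for i in range(37724, 37725):
        url = 'https://api.themoviedb.org/3/movie/{id}?api_key=*****&language=en-US'.format(id=i)
        payload = "{}"
        response = requests.request("GET", url, data=payload)
        movie = response.json()
        if writer is None:
            # DictWriter requires fieldnames; take them from the first response
            writer = csv.DictWriter(csvfile, fieldnames=list(movie), extrasaction='ignore')
            writer.writeheader()
        writer.writerow(movie)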
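csv.DictWriter needs to know the column names up front through its fieldnames argument. The sketch below shows that behaviour on two made-up dictionaries (not real API responses) and writes to an in-memory buffer. The restval and extrasaction arguments control what happens when a row has missing or extra keys.

```python
import csv
import io

# Made-up records standing in for response.json() results.
movies = [
    {"id": 1, "title": "First", "runtime": 100},
    {"id": 2, "title": "Second", "tagline": "extra key"},
]

buffer = io.StringIO(newline="")
writer = csv.DictWriter(
    buffer,
    fieldnames=["id", "title", "runtime"],
    restval="",             # value used when a row lacks a column
    extrasaction="ignore",  # silently drop keys that are not in fieldnames
)
writer.writeheader()
writer.writerows(movies)

csv_text = buffer.getvalue()
print(csv_text)
```

With the default extrasaction="raise", the second row would raise a ValueError because of its unexpected tagline key.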

How to POST a tgz file in Python using urllib2

I would like to POST a .tgz file with the Python urllib2 library to a backend server. I can't use requests due to some licensing issues. There are some examples of file upload on stackoverflow but all relate to attaching a file in a form.
My code is the following but it unfortunately fails:
stats["random"] = "data"
statsFile = "mydata.json"
headersFile = "header-data.txt"
tarFile = "body.tgz"
headers = {}
#Some custom headers
headers["X-confidential"] = "Confidential"
headers["X-version"] = "2"
headers["Content-Type"] = "application/x-gtar"
#Create the json and txt files
with open(statsFile, 'w') as a, open(headersFile, 'w') as b:
    json.dump(stats, a, indent=4)
    for k,v in headers.items():
        b.write(k+":"+v+"\n")
#Create a compressed file to send
tar = tarfile.open(tarFile, 'w:gz' )
for name in [statsFile,headersFile]:
    tar.add(name)
tar.close()
#Read the binary data from the file
with open(tarFile, 'rb') as f:
    content = f.read()
url = "http://www.myurl.com"
req = urllib2.Request(url, data=content, headers=headers)
response = urllib2.urlopen(req, timeout=timeout)
If I use requests, it works like a charm:
r = requests.post(url, files={tarFile: open(tarFile, 'rb')}, headers=headers)
I essentially need the equivalent of the above for urllib2. Does anybody know how to do it? I have checked the docs as well, but I was not able to make it work. What am I missing?
Thanks!
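One difference between the two snippets is worth checking: the files= argument of requests encodes the upload as multipart/form-data, while the urllib2 code sends the raw archive bytes as the request body. If the server expects the multipart form, the body has to be built by hand. The following is a sketch only, written for Python 3's urllib.request (urllib2.Request in Python 2 takes the same url, data and headers arguments). encode_multipart is a hypothetical helper, the field name mirrors the requests call, the payload is placeholder bytes, and no request is actually sent. Whether the server wants the multipart Content-Type header used here is an assumption.

```python
import uuid
import urllib.request


def encode_multipart(field_name, filename, payload, content_type="application/octet-stream"):
    """Return (body, content_type_header) for a single-file multipart/form-data upload."""
    boundary = uuid.uuid4().hex
    head = (
        "--{b}\r\n"
        'Content-Disposition: form-data; name="{n}"; filename="{f}"\r\n'
        "Content-Type: {t}\r\n"
        "\r\n"
    ).format(b=boundary, n=field_name, f=filename, t=content_type).encode("utf-8")
    tail = "\r\n--{b}--\r\n".format(b=boundary).encode("utf-8")
    return head + payload + tail, "multipart/form-data; boundary={}".format(boundary)


content = b"\x1f\x8b fake tgz bytes"  # placeholder for the bytes read from body.tgz
body, content_type = encode_multipart("body.tgz", "body.tgz", content, "application/x-gtar")

headers = {"X-confidential": "Confidential", "X-version": "2", "Content-Type": content_type}
req = urllib.request.Request("http://www.myurl.com", data=body, headers=headers)
# urllib.request.urlopen(req, timeout=10) would send it; omitted so the sketch runs offline.
print(req.get_method(), len(body))
```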
