I have JSON data which I am pulling in via an API.
Here's my code:
# list of each api url to use
link = []
# for every id in the accounts, create a new url in the link list
for id in accounts:
    link.append('https://example.ie:0000/v123/accounts/' + id + '/users')

accountReq = []
for i in link:
    accountReq.append(requests.get(i, headers=headers).json())

with open('masterSheet.txt', 'x') as f:
    for each in accountReq:
        account = each['data']
        for data in account:
            list = (data['username'] + " " + " ", data['first_name'], data['last_name'])
        f.write(str(list) + "\n")
This pulls in the data no problem.
If I do
print(data['username'] + " " + " ", data['first_name'], data['last_name'])
I get all of the data back, around 500 lines.
However, the problem I am having is that when I try to write to my file, it writes about 8 lines of data and then stops running with no errors.
I'm assuming it's due to the data size. How can I fix my issue of not writing all of the data to the .txt file?
Are you trying to write each data point to the file? Your f.write() call sits outside the inner for loop, so you are actually only writing the last list you build for each account to the file.
You should move the f.write() inside the inner for loop if you intend to write every single data point to the file:
for data in account:
    list = (data['username'] + " " + " ", data['first_name'], data['last_name'])
    f.write(str(list) + "\n")
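As a side note, naming a variable list shadows Python's built-in list type. A minimal sketch of the full corrected loop with a neutral name (row is just illustrative):

with open('masterSheet.txt', 'x') as f:  # 'x' mode fails if the file already exists, as in the question
    for each in accountReq:
        for data in each['data']:
            # build one tuple per user and write it on its own line
            row = (data['username'] + " " + " ", data['first_name'], data['last_name'])
            f.write(str(row) + "\n")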
I have a list of a million pins and one URL that has a pin within it.
Example:
https://www.example.com/api/index.php?pin=101010&key=113494
I have to swap the number "101010" for each pin in my list of a million values (like 093939, 493943, 344454) that I have in a csv file, and then save all of those new urls to a csv file.
Here's what I have tried so far, which has not worked:
def change(var_data):
    var = str(var_data)
    url = 'https://www.example.com/api/index.php?pin=101010&key=113494'
    url1 = url.split('=')
    url2 = ''.join(url1[:-2] + [var] + [url1[-1]])
    print(url2)

change('xxxxxxxxxx')
Also, this is for an api request that returns a json page. Would using python and then iterating through the urls I save in a csv file be the best way to do this? I want to collect some information for all of the pins that I have and save it to a BigQuery database, or somewhere I can connect to Google Data Studio, in order to be able to create a dashboard from all of this data.
Any ideas? What do you think the best way of getting this done would be?
Answering the first part of the question: the modified change function below builds each url with an f-string.
It can then be applied to every pin via a list comprehension.
The variable url_variables stands in for the list of integers you are reading in from the other file.
The url list is then written out as rows in a csv.
import csv

url_variables = [93939, 493943, 344454]

def change(var_data):
    var_data = str(var_data)
    url = 'https://www.example.com/api/index.php?pin='
    key = 'key=113494'
    new_url = f'{url}{var_data}&{key}'
    return new_url

url_list = [change(x) for x in url_variables]

with open('output.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    for val in url_list:
        writer.writerow([val])
Output in output.csv:
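For the three sample values in url_variables, each generated url lands on its own row:

https://www.example.com/api/index.php?pin=93939&key=113494
https://www.example.com/api/index.php?pin=493943&key=113494
https://www.example.com/api/index.php?pin=344454&key=113494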
1. First part of the question (replace the number between "pin=" and "&"): I will use an answer from the Change a text between two strings in Python with Regex post:
import re

def change(var_data):
    var = str(var_data)
    url = 'https://www.example.com/api/index.php?pin=101010&key=113494'
    url2 = re.sub("(?<=pin=).*?(?=&)", var, url)
    print(url2)

change('xxxxxxxxxx')
Here I use the sub method from the built-in package "re" and the RegEx lookaround syntax, where:

(?<=pin=)  # asserts that what immediately precedes the current position in the string is "pin="
.*?        # matches any sequence of characters, as few as possible
(?=&)      # asserts that what immediately follows the current position in the string is "&"
Here is a formal explanation about the Lookarounds syntax.
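To illustrate, substituting one of the sample pins from the question into the same pattern:

import re

url = 'https://www.example.com/api/index.php?pin=101010&key=113494'
print(re.sub("(?<=pin=).*?(?=&)", "093939", url))
# prints: https://www.example.com/api/index.php?pin=093939&key=113494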
2. Second part of the question: As another answer explains, you can write the urls to the csv file as rows, but I recommend you read this post about handling csv files with python so you can get an idea of the way you want to save them.
I am not very good at English, but I hope I have explained myself well.
tl;dr:
When writing to csv, how do I ensure the last element of a previous write is separated from the first element of a subsequent write by a comma and not a new line?
I am currently trying to collect followers data for a list of Twitter users using Tweepy. In the code below, you can see that I'm using pagination, as some users have a lot of followers. I'm trying to put all the followers into one csv file per user; however, when I test this code and inspect the csv, I can see there's only a new line between page writes, but no commas. I do not want improper csv format to come back and bite me later in this project.
for page in tweepy.Cursor(api.followers_ids, screen_name=username).pages():
    with open(f'output/{username}.csv', 'a') as outfile:
        writer = csv.writer(outfile)
        writer.writerow(page)
I've thought of enumerating the pages:

for i, page in enumerate(tweepy.Cursor(api.followers_ids, screen_name=username).pages()):

and doing something like: if i > 0, write a ',' to the file before the next page. That way feels inefficient, as I'd have to open the file, write a ',', and close it each time this happens, and I need every second I can save for this project.
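One way around this (a sketch, not from the original thread) is to collect all the pages first and write a single row, so the ids are joined by commas and only one newline is emitted at the end:

import csv
import tweepy

# assumes api and username are set up as in the question
ids = []
for page in tweepy.Cursor(api.followers_ids, screen_name=username).pages():
    ids.extend(page)  # each page is a list of follower ids

with open(f'output/{username}.csv', 'w', newline='') as outfile:
    writer = csv.writer(outfile)
    writer.writerow(ids)  # one row per user, ids separated by commas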
I am new to python. I have created a script which asks the user for dates and a file name and then generates a csv file. I want to run that script on our network so that everyone on the network can put in the dates and generate their report. Can anybody please suggest which module I should use, and how? Although my script generates two files, I only want everyone to download the revenue report, not the missing ids.
Here is the snippet from my program which calls all of the functions I made.
Thanks in advance.
print "Enter state date(eg:-2015-01-01):",
start_date = raw_input()
print "Enter the last date(eg:-2015-01-01):",
end_date=raw_input()
print "Please give a filename for this report(eg:-January_rev_report): ",
file_name=raw_input()
in_file = open(""+file_name+".csv", "w")
in_file2=open("missiong_ids.csv","w")
in_file2.write("Missing_ids\n")
in_file.write("Partner_id|Partner_name|Price_of_lead|Date|Osdial_Lead_id|Bob_lead_id|list_id|Phone_number|State|Postal_code|Status\n")
data_=getPidsForThisMonth(start_date,end_date)
for j in data_:
if getReport(j,start_date,end_date) is None:
missing_ids=""
missing_ids+=j
#print missing_ids + " is missing id, the whole list of missing id's will be added to missing_ids.csv file "
in_file2.write(missing_ids)
else:
data=""
details = getPartnerDetails(j)
pid = str(details[0])
name = str(details[1])
price = str(details[2])
report_data=getReport(j,start_date,end_date)
date=str(report_data[0])
lead_id=str(report_data[1])
bob_id=str(report_data[2])
list_id=str(report_data[3])
phone=str(report_data[4])
state=str(report_data[5])
postal_code=str(report_data[6])
status=str(report_data[7])
data+=pid+"|"+name+"|"+price+"|"+date +"|"+lead_id+"|"+bob_id+"|"+list_id+"|"+phone+"|"+state+"|"+postal_code+"|"+status
data+="\n"
in_file.write(data)
Flask would be well suited for turning this into a small web app: http://flask.pocoo.org/
I would have one controller that takes two parameters, the start and end date. Or, better, have a small page where the dates can be selected and passed via POST to a controller. This would run the script and return the file. If you set the response headers correctly, the csv file will start as a download.
You won't need to write the file to disk; just store the lines in a list and at the end generate the full content using '\n'.join(lines).
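A minimal sketch of that idea, assuming the report logic is wrapped in a hypothetical build_report_lines(start_date, end_date) helper that returns the pipe-delimited lines:

from flask import Flask, Response, request

app = Flask(__name__)

@app.route('/report', methods=['POST'])
def report():
    start_date = request.form['start_date']
    end_date = request.form['end_date']
    lines = build_report_lines(start_date, end_date)  # hypothetical wrapper around the existing script
    # Content-Disposition tells the browser to download rather than display the response
    return Response('\n'.join(lines),
                    mimetype='text/csv',
                    headers={'Content-Disposition': 'attachment; filename=revenue_report.csv'})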
I am writing code which creates several URLs, which are stored in a list.
The next step would be to open each URL, download the data (which is only text, formatted as XML or JSON) and save the downloaded data.
My code works fine, thanks to the online community here. It gets stuck at the point of opening the URLs and downloading the data. I want urllib.request to loop through the list of created urls, call each url separately, open it, display it and move on to the next. But it only does the loop that creates the urls, and then nothing. No feedback, nothing.
import urllib.request

.... some calculations for llong and llat ....

#create the URLs and store in list
urls = []
for lat, long, lat1, long1 in zip(llat, llong, llat[1:], llong[1:]):
    for pages in range(1, 17):
        print("https://api.flickr.com/services/rest/?method=flickr.photos.search&format=json&api_key=5.b&nojsoncallback=1&page={}&per_page=250&bbox={},{},{},{}&accuracy=1&has_geo=1&extras=geo,tags,views,description".format(pages, long, lat, long1, lat1))
print(urls)

#accessing the website
data = []
for amounts in urls:
    response = urllib.request.urlopen(urls)
    flickrapi = data.read()
    data.append(+flickrapi)
    data.close()
print(data)
What am I doing wrong?
The next step would be downloading the data and saving it to a file or somewhere else for further processing.
Since I will receive heaps of data, like a lot lot lot, I am not sure what would be the best way to store it for processing with R (or maybe Python? - I need to do some statistical work on it). Any suggestions?
You're not appending your generated urls to the urls list, you are only printing them:
print ("https://api.flickr.com/services/rest/?method=flickr.photos.search&format=json&api_key=5.b&nojsoncallback=1&page={}&per_page=250&bbox={},{},{},{}&accuracy=1&has_geo=1&extras=geo,tags,views,description".format(pages,long,lat,long1,lat1))
Should be:
urls.append("https://api.flickr.com/services/rest/?method=flickr.photos.search&format=json&api_key=5.b&nojsoncallback=1&page={}&per_page=250&bbox={},{},{},{}&accuracy=1&has_geo=1&extras=geo,tags,views,description".format(pages,long,lat,long1,lat1))
Then you can iterate over the urls as planned.
But then you'll run into an error on the following line:
response = urllib.request.urlopen(urls)
Here you are feeding the whole urls list into urlopen, when you should be passing in a single url from urls, which you have named amounts, like so:
response = urllib.request.urlopen(amounts)
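Putting both fixes together, the download loop could look like this (a sketch: the original also calls data.read() and data.close(), which would fail since data is a plain list, so this reads from the response instead and leaves out JSON parsing and error handling):

data = []
for amounts in urls:
    response = urllib.request.urlopen(amounts)
    flickrapi = response.read()  # raw bytes of the JSON reply
    data.append(flickrapi)
print(data)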
Ok, so I've been playing with Python and SPSS to achieve almost what I want. I am able to open the file and make the changes; however, I am having trouble saving the files (and those changes). What I have (using only one school in the schoollist):
begin program.
import spss, spssaux
import os

schoollist = ['brow']
for x in schoollist:
    school = 'brow'
    school2 = school + '06.sav'
    filename = os.path.join("Y:\...\Data", school2)  #In this instance, Y:\...\Data\brow06.sav
    spssaux.OpenDataFile(filename)

    #--This block contains the changes and is not particularly relevant to the question--#
    cur = spss.Cursor(accessType='w')
    cur.SetVarNameAndType(['name'], [8])
    cur.CommitDictionary()
    for i in range(cur.GetCaseCount()):
        cur.fetchone()
        cur.SetValueChar('name', school)
        cur.CommitCase()
    cur.close()

    #-- What am I doing wrong here? --#
    spss.Submit("save outfile = filename.")
end program.
Any suggestions on how to get the save outfile to work with the loop? Thanks. Cheers
In your save call, you are not resolving filename to its actual value. It should be something like this:
spss.Submit("""save outfile="%s".""" % filename)
I'm unfamiliar with spssaux.OpenDataFile and can't find any documentation on it (besides references to working with SPSS data files in unicode mode). But my guess is that the problem is that it grabs the SPSS data file for use in the Python program block, but the file isn't actually opened for further submitted commands.
Here I make a test case that, instead of using spssaux.OpenDataFile to grab the file, does it all with SPSS commands and just inserts the necessary parts via Python. So first, let's create some fake data to work with.
*Prepping the example data files.
FILE HANDLE save /NAME = 'C:\Users\andrew.wheeler\Desktop\TestPython'.
DATA LIST FREE / A .
BEGIN DATA
1
2
3
END DATA.
SAVE OUTFILE = "save\Test1.sav".
SAVE OUTFILE = "save\Test2.sav".
SAVE OUTFILE = "save\Test3.sav".
DATASET CLOSE ALL.
Now here is a pared-down version of what your code is doing. I have inserted the LIST ALL. command so you can check in the output that it is adding the variable of interest to the file.
*Sequential opening the data files and appending data name.
BEGIN PROGRAM.
import spss
import os
schoollist = ['1','2','3']
for x in schoollist:
    school2 = 'Test' + x + '.sav'
    filename = os.path.join("C:\\Users\\andrew.wheeler\\Desktop\\TestPython", school2)
    #opens the SPSS file and makes a new variable for the school name
    spss.Submit("""
    GET FILE = "%s".
    STRING Name (A20).
    COMPUTE Name = "%s".
    LIST ALL.
    SAVE OUTFILE = "%s".
    """ % (filename, x, filename))
END PROGRAM.