How can we hit a URL/service when a Google spreadsheet document is saved or modified? For example, let's say I have an example spreadsheet on Google Docs. I want to hit a URL each time a change is made in that spreadsheet. How can we do this in Python? Any help with this will be appreciated.
Thanks
I just wrote a script which reports when documents are created/edited. You should be able to adapt this to hit a URL (or do whatever) when changes are seen.
https://gist.github.com/1646532 -- code below
# Stuart Powers
# report when any google docs are created or changed
import os
import sys
import simplejson
import gdata.docs.service
"""
This script will report which google docs have been modified or created since
it was last run.
It compares the timestamps retrieved from Google with the timestamps saved in a JSON file that is updated each time the script runs: each document's last-updated timestamp is compared against its value from the previous run, using 'docs.json' to save state.
Inspired by the stackoverflow question:
"How to hit a URL when Google docs spreadsheet is changed"
http://stackoverflow.com/questions/8927164/
"""
docs = gdata.docs.service.DocsService()
docs.ClientLogin('stuart.powers@gmail.com', 'xxxxxxxx')

# create a dictionary of doc_id/timestamp key/values
mydict = {}
for e in docs.GetDocumentListFeed().entry:
    mydict[e.id.text] = e.updated.text

# if docs.json doesn't exist, create it with our dict's data and then exit
# because there's nothing to compare against
if not os.path.exists('docs.json'):
    with open('docs.json', 'w') as o:
        o.write(simplejson.JSONEncoder().encode(mydict))
    sys.exit(0)

# otherwise, load the previous data from docs.json
last_data = simplejson.load(open('docs.json'))

# and compare the timestamps (elif avoids a KeyError for brand-new ids)
for id in mydict.keys():
    if id not in last_data:
        print 'new: %s' % id
    elif mydict[id] != last_data[id]:
        print 'changed: %s' % id

# update docs.json for next time and then quit
with open('docs.json', 'w') as o:
    o.write(simplejson.JSONEncoder().encode(mydict))
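To actually hit a URL when a change is detected, you could replace the print statements with an HTTP request. A minimal sketch using urllib/urllib2 (notify_url and its query parameters are hypothetical placeholders, not part of the original script):

import urllib
import urllib2

notify_url = 'http://example.com/notify'  # hypothetical endpoint to ping on changes

def hit_url(doc_id, status):
    # send the document id and change type as query parameters
    params = urllib.urlencode({'doc_id': doc_id, 'status': status})
    urllib2.urlopen('%s?%s' % (notify_url, params))

Then call hit_url(id, 'new') or hit_url(id, 'changed') in place of the prints above.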
Related
I am new to Python and working with the Infusionsoft API, and I am hitting a snag here. I am writing a script that retrieves all of the contacts in our system and adds them to a Pandas DataFrame if they contain a given string. From what I can tell, my code for retrieving the contacts is correct, and it will even break the results down to just the ID number of the contact I want to retrieve. The issue comes when I try to pass that data into my delete method.
When I first started looking into this, I found a GitHub project (see here: https://github.com/GearPlug/infusionsoft-python) and planned to use the method delete_contact = Client.delete_contact('ID'), which takes the param 'ID' as a string. I have broken it down in my code so that the IDs are read into an array as strings, and my program iterates over them and prints out all of the strings like so:
1
2
3
What has me thrown off is that when I try to pass them into the method delete_contact = client.delete_contact('ID'), it comes back with:
File "C:\Users\Bryan\OneDrive\Desktop\Python_Scripts\site-packages\NEW_Infusion_Script.py", line 28, in <module>
delete_contact(infusion_id)
File "C:\Users\Bryan\OneDrive\Desktop\Python_Scripts\site-packages\NEW_Infusion_Script.py", line 26, in delete_contact
Client.delete_contact('id')
TypeError: Client.delete_contact() missing 1 required positional argument: 'id'
Here is my code with the obvious API keys removed:
import pandas as pd
import infusionsoft
from infusionsoft.client import Client
import xmlrpc.client
#Python has built-in support for xml-rpc. All you need to do is add the
#line above.
#Set up the API server variable
server = xmlrpc.client.ServerProxy("https://productname.infusionsoft.com:443/api/xmlrpc")
key = "#The encrypted API key"
test_rigor = []
var = server.DataService.findByField(key,"Contact",100,0,"Email","%testrigor-mail.com",["LastName","Id",'Email'] )
for result in var:
    server.DataService.update(key,"Contact",result["Id"],{"LastName":" "})
    test_rigor.append
##create a Data Frame from the info pull above
df = pd.DataFrame.from_dict(var)
print("Done")
print(var)
df
##Pull the data and put it into a separate array and feed that into the delete method
infusion_ids = []
for num in df['Id']:
    infusion_ids.append(num)

def delete_contact(x):
    Client.delete_contact('id')

for infusion_id in infusion_ids:
    infusion_id[0]
    delete_contact(infusion_id[0])
    infusion_id.pop(0)
##print(df)
Any suggestions or obvious missteps would be greatly appreciated thanks!
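For what it's worth, the traceback points at two missteps: delete_contact is being invoked on the Client class itself rather than on an instance, and the literal string 'id' is passed instead of the actual contact ID. A minimal sketch of the likely fix, assuming the GearPlug client (the constructor arguments below are placeholders; check the library's README for the exact signature):

from infusionsoft.client import Client

# instantiate the client rather than calling methods on the class itself
client = Client('CLIENT_ID', 'CLIENT_SECRET')  # placeholder credentials

for infusion_id in infusion_ids:
    # pass the real ID (as a string), not the literal string 'id'
    client.delete_contact(str(infusion_id))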
I have a DataFrame "budget" that I'm trying to upload to a heavy spreadsheet with 22 tabs, more than one of which has RawData in some form in its name: "Raw Data >>", "RawData", "RawData_TargetCompletion".
I have the following code:
import gspread
from oauth2client.service_account import ServiceAccountCredentials

class GoogleSheets():
    def __init__(self):
        google_service_account_path = 'some_path'
        scopes = ['https://spreadsheets.google.com/feeds', 'https://www.googleapis.com/auth/drive']
        self.credentials = ServiceAccountCredentials.from_json_keyfile_name(google_service_account_path, scopes)
        self.sheets_connection = gspread.authorize(self.credentials)

    def load_spreadsheet(self, spreadsheet_key):
        self.sheet = self.sheets_connection.open_by_key(spreadsheet_key)

    def load_worksheet(self, worksheet_name):
        self.worksheet = self.sheet.worksheet(worksheet_name)

    def clear_range(self, data_range):
        self.sheet.values_clear(data_range)
spreadsheet_key = "this is a spreadsheet key"
worksheet_name = "RawData"
cell_ref = 'A:AT'
google_sheets = sheets.GoogleSheets()
google_sheets.load_spreadsheet(spreadsheet_key)
google_sheets.load_worksheet(worksheet_name)
google_sheets.clear_range(cell_ref)
google_sheets.upload_dataframe(budget)
The problem is that in that heavy spreadsheet, it clears the first tab (not RawData) while updating the RawData sheet.
This exact same code, but with another spreadsheet_key, works fine and clears and updates the correct RawData tab regardless of the position of that tab.
But in this heavy one, RawData has to be the first tab in the document, because the clear step is not mapping correctly and always clears the first tab.
Is there a problem you see in the code that I'm not seeing, or have you encountered the same problem when updating heavy spreadsheets?
I believe your goal and situation are as follows:
You want to clear the range using gspread.
You have already been able to use the Sheets API.
Modification points:
Looking at values_clear(range) in the gspread documentation, it is a method of the class gspread.models.Spreadsheet (see Ref below), and the range argument of values_clear(range) is in A1 notation.
In your script, self.sheet.values_clear('A:AT') is run. In this case, the 1st tab is always used because the range contains no sheet name. I think this is the reason for your issue.
To resolve the issue, I propose including the sheet name in the A1 notation passed to values_clear(range).
When the above points are reflected in your script, it becomes as follows.
Modified script:
From:
google_sheets.clear_range(cell_ref)
To:
google_sheets.clear_range("'{0}'!{1}".format(worksheet_name, cell_ref))
References:
values_clear(range)
A1 notation
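As a quick sanity check, the corrected call (a sketch reusing the names from the script above) would be:

worksheet_name = "RawData"
cell_ref = 'A:AT'
# include the sheet name so the A1 range targets the RawData tab,
# not whichever tab happens to be first
google_sheets.clear_range("'{0}'!{1}".format(worksheet_name, cell_ref))  # clears 'RawData'!A:AT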
I'm using Streamlit to make a basic visualization app to compare two datasets. For that, I'm using the following example by Marc Skov from the Streamlit gallery:
from typing import Dict
import streamlit as st
@st.cache(allow_output_mutation=True)
def get_static_store() -> Dict:
    """This dictionary is initialized once and can be used to store the files uploaded"""
    return {}

def main():
    """Run this function to run the app"""
    static_store = get_static_store()
    st.info(__doc__)

    result = st.file_uploader("Upload", type="py")
    if result:
        # Process your file here
        value = result.getvalue()

        # And add it to the static_store if not already in
        if value not in static_store.values():
            static_store[result] = value
    else:
        static_store.clear()  # Hack to clear list if the user clears the cache and reloads the page
        st.info("Upload one or more `.py` files.")

    if st.button("Clear file list"):
        static_store.clear()

    if st.checkbox("Show file list?", True):
        st.write(list(static_store.keys()))

    if st.checkbox("Show content of files?"):
        for value in static_store.values():
            st.code(value)

main()
This does work, but it is odd to compare datasets without being able to display their names.
The code explicitly says that it is not possible to get the file names using this method. But this example is from 8 months ago; I wonder if there is another way to accomplish this now.
In a commit made on 9 July, a slight modification of file_uploader() was made. It now returns a dict that contains:
the name key, containing the uploaded file name
the data key, containing a BytesIO or StringIO object
So you should be able to get the filename using result.name and the data using result.data.
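For example, a minimal sketch assuming the post-commit behaviour described above (the .name/.data access pattern is taken from that description, so verify it against your installed version):

import streamlit as st

result = st.file_uploader("Upload", type="py")
if result:
    st.write("Filename:", result.name)  # uploaded file's name
    st.code(result.data.getvalue())     # contents from the BytesIO/StringIO object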
So my question is: can you use Python and/or Microsoft Graph to look into your Outlook email, pull data out, and put it into an Excel document?
Here's what I'm trying to do; if there's any way of doing it, please feel free to let me know:
I want to create a folder in my Outlook inbox that receives redirected emails. I'd like to make a script that looks at all the emails in that folder, extracts certain data from each email, and puts it into an Excel document.
For instance, you could set up a Python script that connects to the Outlook REST APIs. Get the access token by following the instructions on the website above and use it to log in. You could set time intervals to re-check your mailbox and process the data. There may be functions/parameters in the API which allow you to receive updates automatically every n seconds (I have not looked into the details yet). Write your own process function to mine the data for your own usage.
import time

def main():
    access_token = '<your_access_token>'  # obtained via the OAuth flow described above
    while True:  # re-check the mailbox on a fixed interval
        data = get_my_messages(access_token)
        process(data)
        time.sleep(5)

main()
Example Python code like the following can be found on the website above.
def get_my_messages(access_token):
    get_messages_url = graph_endpoint.format('/me/mailfolders/inbox/messages')
    # Use OData query parameters to control the results
    # - Only first 10 results returned
    # - Only return the receivedDateTime, subject, and from fields
    # - Sort the results by the receivedDateTime field in descending order
    query_parameters = {'$top': '10',
                        '$select': 'receivedDateTime,subject,from',
                        '$orderby': 'receivedDateTime DESC'}
    r = make_api_call('GET', get_messages_url, access_token, parameters=query_parameters)
    if r.status_code == requests.codes.ok:
        return r.json()
    else:
        return "{0}: {1}".format(r.status_code, r.text)
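The snippet relies on graph_endpoint and make_api_call, helpers defined elsewhere in that tutorial. A rough sketch of what they might look like using the requests library (the exact signatures here are assumptions, not the tutorial's verbatim code):

import requests

# Microsoft Graph base URL; format() fills in the resource path
graph_endpoint = 'https://graph.microsoft.com/v1.0{0}'

def make_api_call(method, url, token, parameters=None, payload=None):
    # attach the OAuth bearer token and request JSON responses
    headers = {'Authorization': 'Bearer {0}'.format(token),
               'Accept': 'application/json'}
    if method.upper() == 'GET':
        return requests.get(url, headers=headers, params=parameters)
    elif method.upper() == 'POST':
        headers['Content-Type'] = 'application/json'
        return requests.post(url, headers=headers, params=parameters, json=payload)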
I have been able to get the column to output its values as a separate list. However, I need to retain these values and use them one by one to perform an Amazon lookup with each. The Amazon lookup is not the problem; getting xlrd to give one value at a time has been the problem. Is there also an efficient way of setting a timer in Python? The only answer I have found to the timer issue is recording the time the process started and counting from there; I would prefer just a timer. This question is somewhat two parts; here is what I have done so far.
I load the spreadsheet with xlrd using argv[1] and copy it to a new spreadsheet named by argv[2]; argv[3] needs to be the timer value, but I am not that far yet.
I have tried:
import sys
import datetime
import os
import xlrd
from xlrd.book import colname
from xlrd.book import row
import xlwt
import xlutils
import shutil
import bottlenose
AMAZON_ACCESS_KEY_ID = "######"
AMAZON_SECRET_KEY = "####"
print "Executing ISBN Amazon Lookup Script -- Please be sure to execute it python amazon.py input.xls output.xls 60(seconds between database queries)"
print "Copying original XLS spreadsheet to new spreadsheet file specified as the second arguement on the command line."
print "Loading Amazon Account information . . "
amazon = bottlenose.Amazon(AMAZON_ACCESS_KEY_ID, AMAZON_SECRET_KEY)
response = amazon.ItemLookup(ItemId="row", ResponseGroup="Offer Summaries", SearchIndex="Books", IdType="ISBN")
shutil.copy2(sys.argv[1], sys.argv[2])
print "Opening copied spreadsheet and beginning ISBN extraction. . ."
wb = xlrd.open_workbook(sys.argv[2])
print "Beginning Amazon lookup for the first ISBN number."
for row in colname(colx=2):
    print amazon.ItemLookup(ItemId="row", ResponseGroup="Offer Summaries", SearchIndex="Books", IdType="ISBN")
I know this is a little vague. Should I perhaps try doing something like column = colname(colx=2) so that I could then do for row in column:? Any help or direction is greatly appreciated.
The use of colname() in your code is simply going to return the name of the column (e.g. 'C' by default in your case, unless you've overridden the name). Also, colname is being used outside the context of your workbook's contents. I would think you would want to work with a specific sheet from the workbook you are loading, and from within that sheet reference the values of a column (2 in the case of your example). Does this sound somewhat correct?
wb = xlrd.open_workbook(sys.argv[2])
sheet = wb.sheet_by_index(0)
for row in sheet.col(2):
    print amazon.ItemLookup(ItemId="row", ResponseGroup="Offer Summaries", SearchIndex="Books", IdType="ISBN")
Although, looking at the call to amazon.ItemLookup(), I think you probably want to refer to row and not "row", as the latter is simply a string and the former is the actual contents of the variable named row from your for loop.
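Putting both pieces together, here is a minimal sketch of the corrected loop with a simple timer between queries (reading argv[3] as the delay in seconds is a hypothetical interpretation of the question):

import sys
import time
import shutil
import xlrd
import bottlenose

amazon = bottlenose.Amazon("ACCESS_KEY_ID", "SECRET_KEY")  # placeholder credentials
shutil.copy2(sys.argv[1], sys.argv[2])  # copy input.xls to output.xls
delay = int(sys.argv[3])  # seconds to wait between Amazon queries

wb = xlrd.open_workbook(sys.argv[2])
sheet = wb.sheet_by_index(0)
for cell in sheet.col(2):
    # xlrd returns numeric cells as floats; normalize to a digit string
    value = cell.value
    isbn = str(int(value)) if isinstance(value, float) else str(value).strip()
    print amazon.ItemLookup(ItemId=isbn, ResponseGroup="Offer Summaries",
                            SearchIndex="Books", IdType="ISBN")
    time.sleep(delay)  # simple timer: pause between database queries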