I've got a script that grabs every event off of a Google Calendar, then searches through those events and prints the ones that contain a search term to a file.
The problem I'm having is that I need them to be put in order by date, and this doesn't seem to do that.
while True:
    events = calendar_acc.events().list(calendarId='myCal', pageToken=page_token).execute()
    for event in events['items']:
        if 'Search Term' in event['summary']:
            # assign names to the hit and its date
            find = event['summary']
            date = event['start'][u'dateTime']
            # only use the first ten digits of the date
            month = date[5:7]
            day = date[8:10]
            year = date[0:4]
            formatted_date = month + "/" + day + "/" + year
            # write a line
            messy.write(formatted_date + " " + event['summary'] + "\n\n")
I think there may be a way to do this with the time module, but I'm not sure. Any help is appreciated.
Just in case anyone else needs to do this (with the help of jedwards):
I ended up creating an empty list, hits,
and then appending the ['start']['dateTime'] (as a datetime.datetime object)
and the ['summary'] to the list for each event that contained my "Search Term". Like so:
import dateutil.parser

hits = []
page_token = None
while True:
    events = calendar_acc.events().list(calendarId='My_Calendar_ID', pageToken=page_token).execute()
    for event in events['items']:
        if "Search Term" in event['summary']:
            hits.append((dateutil.parser.parse(event['start']['dateTime']), event['summary']))
    page_token = events.get('nextPageToken')
    if not page_token:
        break
Then you just sort the list. In my case, I cut the datetime object down to just the date and wrote the whole line to a file, but this code just prints it to the console.
hits.sort()
for x in hits:
    d = x[0]
    date = "%d/%d/%d" % (d.month, d.day, d.year)
    final = date + "\t\t" + str(x[1])
    print(final)
Thanks again to jedwards in the comments!
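For reference, here is a self-contained sketch of the same sort-and-format approach. The timestamps and summaries below are invented for illustration; they just mimic the shape of the Calendar API's event['start']['dateTime'] strings (RFC 3339 with a UTC offset), parsed here with the standard library instead of dateutil:

```python
from datetime import datetime

# Invented sample data shaped like event['start']['dateTime'] strings.
raw_events = [
    ("2015-03-10T09:00:00-05:00", "Search Term: dentist"),
    ("2015-01-02T14:30:00-05:00", "Search Term: kickoff"),
    ("2015-02-20T11:15:00-05:00", "Search Term: review"),
]

# Parse each timestamp into a datetime so the tuples sort chronologically.
hits = [(datetime.fromisoformat(start), summary) for start, summary in raw_events]
hits.sort()  # tuples compare element-wise, so this sorts by datetime first

for d, summary in hits:
    print("%d/%d/%d\t\t%s" % (d.month, d.day, d.year, summary))
```

The sort works because Python compares tuples element by element, so the datetime in position 0 decides the order and the summary only breaks ties.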
You can have the API return a sorted list by using the "orderBy" parameter. Setting it to "updated" orders events by last-modification time; to order by event start date instead, set it to "startTime" (which requires singleEvents=True).
page_token = None
while True:
    events = service.events().list(calendarId=myID, orderBy='updated', pageToken=page_token).execute()
    for event in events['items']:
        print(event)
    page_token = events.get('nextPageToken')
    if not page_token:
        break
For more information see: https://developers.google.com/calendar/v3/reference/events/list
Hope this helps.
Related
I am trying to make a call to an API and grab event IDs from the response. I then want to use each of those event IDs as a variable in another request, parse that data, then loop back and make the next request using the next event ID, for all of the IDs.
So far I have the following:
def nba_odds():
    url = "https://xxxxx.com.au/sports/summary/basketball?api_key=xxxxx"
    response = requests.get(url)
    data = response.json()
    event_ids = []
    for event in data['Events']:
        if event['Country'] == 'USA' and event['League'] == 'NBA':
            event_ids.append(event['EventID'])
    # print(event_ids)
    game_url = f'https://xxxxx.com.au/sports/detail/{event_ids}?api_key=xxxxx'
    game_response = requests.get(game_url)
    game_data = game_response.json()
    print(game_url)
That gives me the result below in the terminal:
https://xxxxx.com.au/sports/detail/['dbx-1425135', 'dbx-1425133', 'dbx-1425134', 'dbx-1425136', 'dbx-1425137', 'dbx-1425138', 'dbx-1425139', 'dbx-1425140', 'anyvsany-nba01-1670043600000000000', 'dbx-1425141', 'dbx-1425142', 'dbx-1425143', 'dbx-1425144', 'dbx-1425145', 'dbx-1425148', 'dbx-1425149', 'dbx-1425147', 'dbx-1425146', 'dbx-1425150', 'e95270f6-661b-46dc-80b9-cd1af75d38fb', '0c989be7-0802-4683-8bb2-d26569e6dcf9']?api_key=779ac51a-2fff-4ad6-8a3e-6a245a0a4cbb
The URL above should instead look like:
https://xxxx.com.au/sports/detail/dbx-1425135
If anyone can point me in the right direction it would be appreciated.
Thanks.
You need to loop over the event IDs and call the API once per event_id, since the endpoint does not support multiple event IDs in a single request:
all_events_response = []
for event_id in event_ids:
    game_url = f'https://xxxxx.com.au/sports/detail/{event_id}?api_key=xxxxx'
    game_response = requests.get(game_url)
    game_data = game_response.json()
    all_events_response.append(game_data)
    print(game_url)
You can then find the list of JSON responses in all_events_response.
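To see the difference between interpolating the whole list and interpolating one ID at a time, here is a minimal sketch that only builds the URLs (no network calls; the IDs are invented in the same style as the question's output, and the host and api_key are the question's placeholders):

```python
# Hypothetical event IDs, shaped like the ones in the question.
event_ids = ["dbx-1425135", "dbx-1425133", "dbx-1425134"]
api_key = "xxxxx"  # placeholder, as in the question

# One URL per event ID, rather than one URL containing the whole list.
game_urls = [
    f"https://xxxxx.com.au/sports/detail/{event_id}?api_key={api_key}"
    for event_id in event_ids
]

for url in game_urls:
    print(url)
```

Each f-string now sees a single string, so no brackets or quotes leak into the path segment.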
event_ids is an entire list of event ids. You make a single URL with the full list converted to its string view (['dbx-1425135', 'dbx-1425133', ...]). But it looks like you want to get information on each event in turn. To do that, put the second request in the loop so that it runs for every event you find interesting.
def nba_odds():
    url = "https://xxxxx.com.au/sports/summary/basketball?api_key=xxxxx"
    response = requests.get(url)
    data = response.json()
    event_ids = []
    for event in data['Events']:
        if event['Country'] == 'USA' and event['League'] == 'NBA':
            event_id = event['EventID']
            # print(event_id)
            game_url = f'https://xxxxx.com.au/sports/detail/{event_id}?api_key=xxxxx'
            game_response = requests.get(game_url)
            game_data = game_response.json()
            # do something with game_data here - it will be overwritten
            # on the next round of the loop
            print(game_url)
I'm trying to pull team and player stats from match IDs. Everything looks fine to me, but when I run my for loops to call the functions that pull the stats I want, they just print the error from my try/except block. I'm still pretty new to Python and this is my first project, so I've tried everything I can think of over the past few days with no luck. I believe the problem is with my actual pull request, but I'm not sure, as I'm also using a GitHub library I found to help me with the Riot API while I change and update it to get the info I want.
def get_match_json(matchid):
    url_pull_match = "https://{}.api.riotgames.com/lol/match/v5/matches/{}/timeline?api_key={}".format(region, matchid, api_key)
    match_data_all = requests.get(url_pull_match).json()
    # Check to make sure the match is long enough
    try:
        length_match = match_data_all['frames'][15]
        return match_data_all
    except IndexError:
        return ['Match is too short. Skipping.']
And then this is a shortened version of the stat function:
def get_player_stats(match_data, player):
    # Get player information at the fifteenth minute of the game.
    player_query = match_data['frames'][15]['participantFrames'][player]
    player_team = player_query['teamId']
    player_total_gold = player_query['totalGold']
    player_level = player_query['level']
There are some other functions in the code as well, but I'm not sure whether they are faulty too, or whether they are needed to figure out the error. Here is the for loop that makes the request and defines the match_id variable:
for matchid_batch in all_batches:
    match_data = []
    for match_id in matchid_batch:
        time.sleep(1.5)
        if match_id == 'MatchId':
            pass
        else:
            try:
                match_entry = get_match_row(match_id)
                if match_entry[0] == 'Match is too short. Skipping.':
                    print('Match', match_id, "is too short.")
                else:
                    match_entry = get_match_row(match_id).reshape(1, -1)
                    match_data.append(np.array(match_entry))
            except KeyError:
                print('KeyError.')
    match_data = np.array(match_data)
    match_data.shape = -1, 17
    df = pd.DataFrame(match_data, columns=column_titles)
    df.to_csv('Match_data_Diamond.csv', mode='a')
    print('Done Batch!')
Since this is my first project, any help would be appreciated. I can't find any info on this particular subject, so I really don't know where to look to learn why it's not working.
I guess your issue was that the 'frames' array is nested under the 'info' key.
def get_match_json(matchid):
    url_pull_match = "https://{}.api.riotgames.com/lol/match/v5/matches/{}/timeline?api_key={}".format(region, matchid, api_key)
    match_data_all = requests.get(url_pull_match).json()
    try:
        length_match = match_data_all['info']['frames'][15]
        return match_data_all
    except IndexError:
        return ['Match is too short. Skipping.']
def get_player_stats(match_data, player):  # player has to be an int (1-10)
    # Get player information at the fifteenth minute of the game.
    player_query = match_data['info']['frames'][15]['participantFrames'][str(player)]
    # player_team = player_query['teamId'] - it is not possible to get the teamId with this endpoint
    player_total_gold = player_query['totalGold']
    player_level = player_query['level']
    return player_query
This example worked for me. Unfortunately it is not possible to get the teamId through your API endpoint alone. Usually players 1-5 are on team 100 (blue side) and players 6-10 are on team 200 (red side).
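As a rough illustration of that nesting, here is a runnable toy version. The dictionary below is a hand-built stand-in for the real timeline JSON (gold and level values are invented), but it keeps the two details that matter: 'frames' lives under 'info', and participantFrames keys are strings:

```python
# Minimal stand-in for the match/v5 timeline response.
match_data = {
    "info": {
        "frames": [
            {"participantFrames": {str(p): {"totalGold": 500 + p, "level": 1}
                                   for p in range(1, 11)}}
        ] * 16  # pretend the match lasted at least 16 frames
    }
}

def get_player_stats(match_data, player):  # player is an int 1-10
    # Note the 'info' level and the str() conversion of the player key.
    return match_data["info"]["frames"][15]["participantFrames"][str(player)]

stats = get_player_stats(match_data, 3)
print(stats["totalGold"])
```

Dropping either the 'info' level or the str() conversion raises a KeyError on this structure, which matches the symptom in the question.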
I am trying to open several URLs (because they contain data I want to append to a list). I have logic saying "if amount in icl_dollar_amount_l", then run the rest of the code. However, I want the script to run the rest of the code only on the specific amount in the amount variable.
Example:
Selenium opens up X links and sees ['144,827.95', '5,199,024.87', '130,710.67'] in icl_dollar_amount_l, but I want it to skip '144,827.95' and '5,199,024.87' and only get the information for '130,710.67', which is already in the amount variable.
Actual results:
It's getting web scraping information for amount '144,827.95' only, and not even going on to '5,199,024.87' or '130,710.67'. I only want it to get web scraping information for '130,710.67', because my amount variable has this as the only amount.
print(icl_dollar_amount_l)
['144,827.95', '5,199,024.87', '130,710.67']
print(amount)
'130,710.67'
file2.py
def scrapeBOAWebsite(url, fcg_subject_l, gp_subject_l):
    from ICL_Awk_Checker import rps_amount_l2
    icl_dollar_amount_l = []
    amount_ack_missing_l = []
    file_total_l = []
    body_l = []
    for link in url:
        print(link)
        browser = webdriver.Chrome(options=options,
                                   executable_path=r'\\TEST\user$\TEST\Documents\driver\chromedriver.exe')
        # if 'P2 Cust ID 908554 File' in fcg_subject:
        browser.get(link)
        username = browser.find_element_by_name("dialog:username").get_attribute('value')
        submit = browser.find_element_by_xpath("//*[@id='dialog:continueButton']").click()
        body = browser.find_element_by_xpath("//*[contains(text(), 'Total:')]").text
        body_l.append(body)
        icl_dollar_amount = re.findall('(?:[\£\$\€]{1}[,\d]+.?\d*)', body)[0].split('$', 1)[1]
        icl_dollar_amount_l.append(icl_dollar_amount)
    if not missing_amount:
        logging.info("List is empty")
        print("List is empty")
    count = 0
    for amount in missing_amount:
        if amount in icl_dollar_amount_l:
            body = body_l[count]
            get_file_total = re.findall('(?:[\£\$\€]{1}[,\d]+.?\d*)', body)[0].split('$', 1)[1]
            file_total_l.append(get_file_total)
    return icl_dollar_amount_l, file_date_l, company_id_l, client_id_l, customer_name_l, file_name_l, file_total_l, \
           item_count_l, file_status_l, amount_ack_missing_l
I don't know if I fully understand the problem, but this:
if amount in icl_dollar_amount_l:
doesn't tell you at which position '130,710.67' sits in icl_dollar_amount_l. You also need:
count = icl_dollar_amount_l.index(amount)
like so:
for amount in missing_amount:
    if amount in icl_dollar_amount_l:
        count = icl_dollar_amount_l.index(amount)
        body = body_l[count]
But that only works if you expect a single occurrence of the amount in icl_dollar_amount_l. For repeated elements you would have to use a for loop and check every element separately:
for amount in missing_amount:
    for count, item in enumerate(icl_dollar_amount_l):
        if amount == item:
            body = body_l[count]
But frankly, I don't know why you don't check this in the first loop (for link in url:), where you have direct access to icl_dollar_amount and body.
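A self-contained sketch of the enumerate approach, with invented bodies standing in for the scraped page text (the amounts are the ones from the question):

```python
# Stand-ins for the scraped data in the question.
icl_dollar_amount_l = ['144,827.95', '5,199,024.87', '130,710.67']
body_l = ['body A', 'body B', 'body C']
missing_amount = ['130,710.67']

matched_bodies = []
for amount in missing_amount:
    # enumerate yields (position, value) pairs, so the body at the
    # matching position is picked rather than always body_l[0].
    for count, item in enumerate(icl_dollar_amount_l):
        if amount == item:
            matched_bodies.append(body_l[count])

print(matched_bodies)
```

This also handles duplicate amounts: every matching position contributes its own body, instead of index() always returning the first one.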
I have a script to delete snapshots after a retention period. It works well and deletes the snapshots that have passed the retention period, but I need to filter it by tags: only the snapshots that have a particular tag should be deleted.
import boto3
from botocore.exceptions import ClientError
import datetime
# Set the global variables
globalVars = {}
globalVars['Owner'] = "Cloud"
globalVars['Environment'] = "Test"
globalVars['REGION_NAME'] = "ap-south-1"
globalVars['tagName'] = "Testing"
globalVars['findNeedle'] = "DeleteOn"
globalVars['RetentionDays'] = "1"
globalVars['tagsToExclude'] = "Do-Not-Delete"
ec2_client = boto3.client('ec2')
"""
This function looks at *all* snapshots that have a "DeleteOn" tag containing
the current day formatted as YYYY-MM-DD. This function should be run at least
daily.
"""
def janitor_for_snapshots():
    account_ids = list()
    account_ids.append(boto3.client('sts').get_caller_identity().get('Account'))
    snap_older_than_RetentionDays = (datetime.date.today() - datetime.timedelta(days=int(globalVars['RetentionDays']))).strftime('%Y-%m-%d')
    delete_today = datetime.date.today().strftime('%Y-%m-%d')
    tag_key = 'tag:' + globalVars['findNeedle']
    filters = [{'Name': tag_key, 'Values': [delete_today]}, ]
    # filters={ 'tag:' + config['tag_name']: config['tag_value'] }

    # Get list of snaps with tag globalVars['findNeedle']
    snaps_to_remove = ec2_client.describe_snapshots(OwnerIds=account_ids, Filters=filters)

    # Get the snaps that don't have the tag and are older than RetentionDays
    all_snaps = ec2_client.describe_snapshots(OwnerIds=account_ids)
    for snap in all_snaps['Snapshots']:
        if snap['StartTime'].strftime('%Y-%m-%d') <= snap_older_than_RetentionDays:
            snaps_to_remove['Snapshots'].append(snap)

    snapsDeleted = {'Snapshots': []}
    for snap in snaps_to_remove['Snapshots']:
        try:
            ec2_client.delete_snapshot(SnapshotId=snap['SnapshotId'])
            snapsDeleted['Snapshots'].append({'Description': snap['Description'], 'SnapshotId': snap['SnapshotId'], 'OwnerId': snap['OwnerId']})
        except ClientError as e:
            if "is currently in use by" in str(e):
                print("Snapshot {} is part of an AMI".format(snap.get('SnapshotId')))
    snapsDeleted['Status'] = '{} Snapshots were Deleted'.format(len(snaps_to_remove['Snapshots']))
    return snapsDeleted

def lambda_handler(event, context):
    return janitor_for_snapshots()

if __name__ == '__main__':
    lambda_handler(None, None)
I want to delete only the snapshots with the "DeleteOn" tag, but this script deletes everything that has passed the retention period; it isn't checking the tag part.
Please check and help with this.
Thank you.
If you are asking how to fix the code so that it only deletes snapshots that:
Have the given tag, AND
Have passed the retention period
then look closely at your code.
This part:
# Get list of Snaps with Tag 'globalVars['findNeedle']'
snaps_to_remove = ec2_client.describe_snapshots(OwnerIds=account_ids,Filters=filters)
is obtaining a list of snapshots by tag. Great!
Then this part:
# Get the snaps that doesn't have the tag and are older than Retention days
all_snaps = ec2_client.describe_snapshots(OwnerIds=account_ids)
for snap in all_snaps['Snapshots']:
    if snap['StartTime'].strftime('%Y-%m-%d') <= snap_older_than_RetentionDays:
        snaps_to_remove['Snapshots'].append(snap)
is getting a NEW list of snapshots and checking the retention.
Then, the resulting snaps_to_remove contains the results from BOTH of them.
You will need to combine your logic so it is only adding snaps that meet both criteria rather than compiling the list of snapshots separately.
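One way to combine the two criteria is a single pass that checks both the tag and the age. The snapshot dictionaries below are simplified hand-made stand-ins for describe_snapshots() entries (real entries carry datetime objects and more fields; here StartTime is an ISO date string so the example stays self-contained):

```python
import datetime

today = datetime.date.today()
cutoff = (today - datetime.timedelta(days=1)).strftime('%Y-%m-%d')

# Hand-made stand-ins for entries in describe_snapshots()['Snapshots'].
snapshots = [
    {'SnapshotId': 'snap-1', 'StartTime': '2020-01-01',
     'Tags': [{'Key': 'DeleteOn', 'Value': '2020-01-02'}]},
    {'SnapshotId': 'snap-2', 'StartTime': '2020-01-01', 'Tags': []},  # no tag: keep
]

def should_delete(snap, tag_key='DeleteOn'):
    has_tag = any(t['Key'] == tag_key for t in snap.get('Tags', []))
    old_enough = snap['StartTime'] <= cutoff  # ISO dates compare correctly as strings
    return has_tag and old_enough  # both criteria must hold

to_remove = [s['SnapshotId'] for s in snapshots if should_delete(s)]
print(to_remove)
```

Because the tag check and the age check happen in the same predicate, an untagged snapshot can never slip into the delete list just by being old.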
I created a web scraping program that opens several URLs, checks which of the URLs has information related to tomorrow's date, and then prints some specific information from that URL. My problem is that sometimes none of the URLs in the list has information concerning tomorrow. In such a case, I would like the program to print something else, like "no data found". How could I accomplish that? Another doubt I have: do I need the while loop at the beginning? Thanks.
My code is:
from datetime import datetime, timedelta

tomorrow = datetime.now() + timedelta(days=1)
tomorrow = tomorrow.strftime('%d-%m-%Y')
day = ""
while day != tomorrow:
    for url in list_urls:
        browser.get(url)
        time.sleep(1)
        dia_page = browser.find_element_by_xpath("//*[@id='item2']/b").text
        dia_page = dia_page[-10:]
        day_uns = datetime.strptime(dia_page, "%d-%m-%Y")
        day = day_uns.strftime('%d-%m-%Y')
        if day == tomorrow:
            meals = browser.find_elements_by_xpath("//*[@id='item2']/span")
            meal_reg = browser.find_element_by_xpath("//*[@id='item_frm']/span[1]").text
            sopa2 = meals[0].text
            refeicao2 = meals[1].text
            sobremesa2 = meals[2].text
            print(meal_reg)
            print(sopa2)
            print(refeicao2)
            print(sobremesa2)
            break
No need for a while loop, you can use the for-else Python construct for this:
for url in list_urls:
    # do stuff
    if day == tomorrow:
        # do and print stuff
        break
else:  # break never encountered
    print("no data found")
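A runnable toy version of the for-else pattern, with the browser lookups replaced by a plain list of invented date strings:

```python
# Invented stand-in data: each entry is the date string a page reports.
pages = ["01-05-2023", "02-05-2023", "03-05-2023"]
tomorrow = "04-05-2023"  # pretend value; deliberately not in any page

result = None
for day in pages:
    if day == tomorrow:
        result = day
        break
else:  # the loop finished without hitting break
    result = "no data found"

print(result)
```

The else clause of a for loop runs only when the loop exhausts its iterable without a break, which is exactly the "none of the URLs matched" case.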