I am trying to use the time module and strftime() to get the current time, then open an HTML file and write the date into it, but this is not working. Is there any way to display the time on an HTML page and update it every 10 minutes, for example?
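For what it's worth, one simple approach is to rewrite the HTML file with strftime() on a schedule and let a meta refresh tag make the browser reload the page. A minimal sketch, assuming the file is called index.html (the function name and file name are my own, not from your code):

```python
import time

def write_time_html(path='index.html'):
    """Rewrite a small HTML page that shows the current time."""
    stamp = time.strftime('%Y-%m-%d %H:%M:%S')
    # the meta refresh tag tells the browser to reload the page every 600 seconds
    html = ('<html><head><meta http-equiv="refresh" content="600"></head>'
            '<body><p>Last updated: ' + stamp + '</p></body></html>')
    with open(path, 'w') as f:
        f.write(html)

# regenerate the file every 10 minutes:
# while True:
#     write_time_html()
#     time.sleep(600)
```

The meta refresh handles the browser side; the Python loop only has to keep the file fresh.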
I am using win32 to pull Outlook calendar data, and I cannot seem to specify a start date, so it is pulling all of my data from the last ~5 years. In the example below, I'd like to pull events only if they are after 11/1/2020. It doesn't throw an error when I put it into a dataframe; it simply doesn't work and continues to include everything.
I could just filter the dates after they are in the dataframe; however, it's making the run time of the script quite long.
import win32com.client
outlook = win32com.client.Dispatch('Outlook.Application').GetNamespace('MAPI')
calendar = outlook.GetDefaultFolder(9)
appointments = calendar.Items
appointments.sort('[Start]')
appointments.IncludeRecurrences = True
appointments.restrict("[Start] > '11/1/2020 12:00 AM'")
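One thing to check: in pywin32, Items.Restrict returns a new, filtered collection rather than filtering in place, so its return value has to be assigned (e.g. appointments = appointments.Restrict(...)). A small sketch of building the filter string from a datetime; start_after_filter is a hypothetical helper, and I have not run this against a live Outlook profile:

```python
import datetime

def start_after_filter(dt):
    """Build the restriction string Outlook expects for [Start] comparisons."""
    return "[Start] > '{}'".format(dt.strftime('%m/%d/%Y %I:%M %p'))

# The key fix: Restrict returns a NEW collection, so assign its result:
# appointments = appointments.Restrict(start_after_filter(datetime.datetime(2020, 11, 1)))
```

Discarding the return value, as in the snippet above, leaves the original unfiltered collection in place, which matches the "it simply doesn't work" behavior.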
Ahoy! I've written a quick Python program that grabs the occupancy of a climbing gym every five minutes for later analysis. I'd like it to run non-stop, but I've noticed that after a couple of hours, one of two things will happen:
It will detect a keyboard interrupt (which I did not enter) and stop, or
It will simply stop writing to the .csv file without showing any failure in the shell.
Here is the code:
import os
os.chdir(os.path.expanduser('~/Documents/Other/g1_capacity')) # ensure program runs in correct directory if opened elsewhere; expanduser resolves the '~'

import requests
import time
from datetime import datetime
import csv

def get_count():
    url = 'https://portal.rockgympro.com/portal/public/b01ab221559163c5e9a73e078fe565aa/occupancy?&iframeid=occupancyCounter&fId='
    text = requests.get(url).text
    line = ""
    for item in text.split("\n"):
        if "'count'" in item:
            line = item.strip()
    count = int(line.split(":")[1][0:-1]) # really gross way to get count number for this specific source
    return count

while True: # run until manual stop
    with open('g1_occupancy.csv', mode='a') as occupancy:
        occupancy_writer = csv.writer(occupancy)
        occupancy_writer.writerow([datetime.now(), get_count()]) # append new line to .csv with timestamp and current count
    time.sleep(60 * 5) # wait five minutes before adding new line
I am new to web scraping (in fact, this is my first time) and I'm wondering if anyone might have a suggestion to help eliminate the issue I described above. Many thanks!
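One suspect worth naming: requests.get without a timeout can block forever if the server stops responding, which looks exactly like the loop silently stopping. Passing timeout= and wrapping the call in try/except keeps the loop alive through transient failures. A minimal sketch; fetch_with_retry is a hypothetical helper, not part of your script:

```python
import time

def fetch_with_retry(fetch, retries=3, delay=5):
    """Call fetch(); on an exception, wait and retry instead of letting the loop die."""
    for attempt in range(retries):
        try:
            return fetch()
        except Exception:
            if attempt == retries - 1:
                raise  # give up after the last attempt
            time.sleep(delay)

# usage inside the loop, with an explicit timeout on the request:
# text = fetch_with_retry(lambda: requests.get(url, timeout=10).text)
```

With the timeout in place, a stalled connection raises an exception after 10 seconds instead of hanging, and the retry wrapper absorbs occasional network hiccups.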
I'm trying to capture AND present data in a table format after the script is finished. The website I am using is http://en.wikipedia.org/wiki/List_of_all-time_NFL_win-loss_records, and the logic works as follows:
I run the command, it opens to the URL
I then go to the URL http://en.wikipedia.org/wiki/List_of_all-time_NFL_win-loss_records
I proceed to copy any selected rows/columns from the Table/chart
I then go back to my IDE (Jupyter Notebook) and it takes the captured data and spits it out
I can select the data on that particular webpage by highlighting it with my cursor and choosing "copy". The script will then spit out everything I have selected and copied to my clipboard.
So far, the script I wrote only captures the data and spits it back out as is (unformatted).
PROBLEM: I would like the captured data to be presented in a table format once I have finished selecting and copying it to my clipboard.
I realize I probably need to write logic for the captured data to be formatted. What would be the best approach for accomplishing this?
Below is my code that I have written so far:
import numpy as np
import pandas as pd
import webbrowser
from pandas import Series, DataFrame

website = 'http://en.wikipedia.org/wiki/List_of_all-time_NFL_win-loss_records'
webbrowser.open(website)

nfl_frame = pd.read_clipboard(sep='\t')
nfl_frame
You can read your data directly into a DataFrame with pandas.read_html:
import pandas as pd
WIKI_URL = 'http://en.wikipedia.org/wiki/List_of_all-time_NFL_win-loss_records'
df = pd.read_html(WIKI_URL,header=0)[1]
df.head() # in jupyter or print(df.head()) to show a table with first 5 rows
pd.read_html returns a list of all the tables found in that HTML/URL. I set header to the first row and selected the second element of the list, which is the table you are looking for.
I have a csv file on my computer that updates automatically every minute, e.g. after 08:01 it updates, after 08:02 it updates, and so on.
Importing this file into Python is easy:
import pandas as pd
myfile=pd.read_csv(r'C:\Users\HP\Desktop\levels.csv')
I want to update/re-import this file every minute, based on my PC clock. I want to use threading, since I want to run other cells while the import function keeps running in the background.
So the code might look something like this (other suggestions are welcome):
import pandas as pd
import threading
import datetime
import time
# code to import the csv file based on the pc clock automatically after every minute
I want this to run in a way that still lets me run functions in other cells (I tried using schedule, but I can't run other functions after that, since the cell shows the asterisk symbol (*) while it is busy).
Meaning, if I evaluate the variable myfile in another cell:
myfile
it shows a dataframe with updated values each time.
I have a Jupyter Notebook. Here is just a simplified example.
# Parsing the website
def parse_website_function(url):
    ...
    return value, value2

# Making some calculations (hypothesis)
def linear_model(value, value2):
    ...
    return calculations

# Jot down calculations to a csv file
pd.DataFrame(calculations).to_csv('output.csv')
I would like to know how to make this run every hour, and how to append new rows of time series data to the same output csv file rather than overwriting it. Thanks!
A really basic way to do this would be to just make the program sleep for 3600 seconds.
For example, this would make your program pause for 1 hour:
import time
time.sleep(3600)
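To add new rows to the same output file each hour rather than overwriting it, write the csv in append mode (mode='a') and emit the header only once. A minimal sketch; append_row and output.csv are my own names, not from your notebook:

```python
import os
import pandas as pd

def append_row(path, row):
    """Append one row (a dict) to a csv, writing the header only if the file is new."""
    pd.DataFrame([row]).to_csv(path, mode='a',
                               header=not os.path.exists(path), index=False)

# hourly loop, combining the sleep above with append-mode writes:
# while True:
#     value, value2 = parse_website_function(url)
#     append_row('output.csv', {'time': pd.Timestamp.now(),
#                               'result': linear_model(value, value2)})
#     time.sleep(3600)
```

The header check keeps the column names from being repeated in the middle of the file on later writes.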