The time difference in Python - python

I have a data set containing many columns, one of which holds date-times in the format '%y/%m/%d %H:%M:%S'.
I'm trying to find the difference between the date-times of consecutive rows, using this code:
df['difference_time'] = (df['timezone']-df['timezone'].shift()).fillna(0)
But the output is not right, and I'm not sure where the problem is in my code.
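Since the actual output is not shown, the cause can only be guessed at. A minimal sketch of one common fix follows, assuming the 'timezone' column has to be parsed into real datetimes first; the sample values are hypothetical. It also fills the first row with a zero Timedelta instead of the integer 0, so the new column keeps a single timedelta type.

```python
import pandas as pd

# Hypothetical sample data; only the column name 'timezone' comes from the question.
df = pd.DataFrame({
    "timezone": ["19/01/01 10:00:00", "19/01/01 10:05:30", "19/01/02 10:05:30"],
})

# Parse the strings explicitly; %y is the two-digit year.
df["timezone"] = pd.to_datetime(df["timezone"], format="%y/%m/%d %H:%M:%S")

# diff() is equivalent to subtracting the shifted column.
# The first row has no predecessor, so fill it with a zero Timedelta.
df["difference_time"] = df["timezone"].diff().fillna(pd.Timedelta(0))

# Optional: the same differences expressed in seconds.
df["difference_seconds"] = df["difference_time"].dt.total_seconds()
```

If the differences need to be plain numbers, the seconds column is usually easier to work with than the Timedelta column.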

Related

Problem with recursion in Python: Kernel Crash

I am trying to read data from a list, but there are some inconsistencies. I want to keep only the data that shows periodicity, assuming all the inconsistencies have the same format.
We expect every 12th item to be a datetime object, but some days have fewer data points, and I am not interested in those dates (for simplicity's sake). When a date has missing data, I think it only has 6 elements instead of 11. The days and all the data are items of one list. So I'm trying to store the index of each date that doesn't follow the described pattern (that is, the element where the next date should be is not a date).
I'm trying to do this using recursion, but every time I run the function I have created, the kernel restarts.
I cannot link the data of clean_values because AEMET opendata deletes the requested data after about five minutes.
import datetime as dt
tbe=[]
def recursive(x, clean_values):
    if x<len(clean_values) and x>=0:
        for i in range(0,len(clean_values),12):
            if type(clean_values[i]) == dt.datetime: #If it is not a datetime
                pass
            else:
                tbe.append(i-12) #We store the date before (that should be the one with the problem)
                break
        recursive(i-6, clean_values) # And restore the function but using the position in which we think the date is
    else:
        return
recursive(0, clean_values)
Sorry I cannot provide more information
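One likely cause of the crash is that the loop always starts again at index 0 and never uses x, so each call finds the same position and recurses again without making progress. A minimal iterative sketch follows, assuming each complete day is a datetime followed by 11 values; the function name incomplete_date_indices and the sample list are hypothetical, not taken from the AEMET data.

```python
import datetime as dt

def incomplete_date_indices(clean_values, block_size=12):
    """Return the index of every date whose block does not span exactly block_size items."""
    # Positions of every datetime in the flat list.
    date_positions = [i for i, v in enumerate(clean_values) if isinstance(v, dt.datetime)]
    # The end of the list closes the last block.
    boundaries = date_positions + [len(clean_values)]
    return [
        start
        for start, end in zip(date_positions, boundaries[1:])
        if end - start != block_size
    ]

# Hypothetical data: a full day, a short day (5 values), then another full day.
sample = (
    [dt.datetime(2020, 1, 1)] + [0.0] * 11
    + [dt.datetime(2020, 1, 2)] + [0.0] * 5
    + [dt.datetime(2020, 1, 3)] + [0.0] * 11
)
bad = incomplete_date_indices(sample)  # indices of the dates to drop
```

Because this scans the list once and measures the gap between consecutive dates, it needs no recursion and does not depend on guessing how many items a short day has.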

Get date format code from a string/datetime using python

Is there a way in Python to find out the date format code of a string?
My input would be, e.g.:
2020-09-11T17:42:33.040Z
What I am looking for is in this example to get this:
'%Y-%m-%dT%H:%M:%S.%fZ'
The point is that I have different time formats for different files, so I don't know in advance what my datetime format code will look like.
To process my data I need Unix time, but to calculate that I first need a solution to this problem.
data["time_unix"] = data.time.apply(lambda row: (datetime.datetime.strptime(row, '%Y-%m-%dT%H:%M:%S.%fZ').timestamp()*100))
Thank you for the support!
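The standard library has no function that returns the format code of a date string, as far as I know. One possible approach, sketched below, is to try a list of candidate formats and keep the first one that parses. The list CANDIDATE_FORMATS and the helper guess_format are hypothetical names, and the formats other than the one from the question are assumptions about what the other files might contain.

```python
from datetime import datetime, timezone

# Hypothetical list of formats expected across the files; extend as needed.
CANDIDATE_FORMATS = [
    "%Y-%m-%dT%H:%M:%S.%fZ",
    "%Y-%m-%dT%H:%M:%SZ",
    "%Y-%m-%d %H:%M:%S",
    "%d/%m/%Y %H:%M:%S",
]

def guess_format(text, formats=CANDIDATE_FORMATS):
    """Return the first format that parses text, or None if none match."""
    for fmt in formats:
        try:
            datetime.strptime(text, fmt)
        except ValueError:
            continue
        return fmt
    return None

fmt = guess_format("2020-09-11T17:42:33.040Z")

# The trailing Z means UTC; attach it explicitly so timestamp() does not use local time.
parsed = datetime.strptime("2020-09-11T17:42:33.040Z", fmt).replace(tzinfo=timezone.utc)
unix_seconds = parsed.timestamp()
```

Note that some strings are genuinely ambiguous (for example day/month versus month/day order), so the order of the candidate list decides which interpretation wins.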

Write a function that prints the number of days with sunshine >= 6 hours

I have a question from some homework that I am really struggling with.
I need to write a function in Python that prints the number of days with at least six hours of sunshine duration, given a large CSV file (picture of the first 25 lines attached).
I am not sure how to tell the function that it should only search in a given column (the column containing the sunshine duration, named sdk). I have tried the following code, but I can't even run it, because it reports "invalid syntax" at my if-statement (and I don't understand why):
def sunshine(file):
    data = np.genfromtxt('file', delimiter=",")
    count=0
    for value in data [data[:,9]: #because sdk=9th column
        if value>=6:
            count+=1
    print(count)
I would recommend using the pandas library for this. Read the CSV into a data frame, then filter it on the sdk value and take the length of the resulting data frame. Since you said this is a homework assignment, I'm not going to do all the work for you, but here is some pseudocode to get you started:
import pandas as pd
df = pd.read_csv(r'Path To File')
df1 = df.loc[df['column_name'] >= some_value]
total_number = len(df1)
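On the syntax error itself: the for line in the question opens two square brackets but closes only one, and Python reports an unclosed bracket at a later line, which is why the if-statement is flagged. The question's code also passes the string 'file' instead of the variable file. For readers who want to see the pandas idea end to end, here is a minimal sketch, assuming the CSV has a header row with a column named sdk; the sample data and the extra column and threshold parameters are hypothetical.

```python
import io
import pandas as pd

def sunshine(file, column="sdk", threshold=6):
    """Print and return the number of rows where column is at least threshold."""
    df = pd.read_csv(file)  # pass the variable, not the string 'file'
    # Comparing a column gives booleans; summing them counts the True values.
    count = int((df[column] >= threshold).sum())
    print(count)
    return count

# Hypothetical stand-in for the real CSV file; the last row has a missing value.
sample_csv = io.StringIO(
    "date,sdk\n"
    "2020-01-01,5.5\n"
    "2020-01-02,6.0\n"
    "2020-01-03,8.2\n"
    "2020-01-04,\n"
)
n_days = sunshine(sample_csv)
```

Selecting the column by name avoids counting positions by hand; with zero-based indexing, position 9 would be the tenth column, not the ninth.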

Apply a function to each row python

I am trying to convert from UTC time to LocaleTime in my dataframe. I have a dictionary where I store the number of hours I need to shift for each country code. So for example if I have df['CountryCode'][0]='AU' and I have a df['UTCTime'][0]=2016-08-12 08:01:00 I want to get df['LocaleTime'][0]=2016-08-12 19:01:00 which is
df['UTCTime'][0]+datetime.timedelta(hours=dateDic[df['CountryCode'][0]])
I have tried to do it with a for loop but since I have more than 1 million rows it's not efficient. I have looked into the apply function but I can't seem to be able to put it to take inputs from two different columns.
Can anyone help me?
Without a more concrete example it's difficult to say, but try this:
pd.to_timedelta(df.CountryCode.map(dateDic), 'h') + df.UTCTime
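The one-liner above can be sketched on a small frame as follows. The offsets in dateDic and the second row are hypothetical; only the 'AU' example (08:01 becoming 19:01) comes from the question.

```python
import pandas as pd

# Hypothetical offsets in hours per country code (dateDic in the question).
dateDic = {"AU": 11, "GB": 1}

df = pd.DataFrame({
    "CountryCode": ["AU", "GB"],
    "UTCTime": pd.to_datetime(["2016-08-12 08:01:00", "2016-08-12 08:01:00"]),
})

# Map each country code to its offset, convert all offsets to timedeltas
# in one vectorised step, then add them to the UTC timestamps.
offsets = pd.to_timedelta(df["CountryCode"].map(dateDic), unit="h")
df["LocaleTime"] = df["UTCTime"] + offsets
```

This avoids a Python-level loop, so it should scale to a million rows far better than apply. Fixed hour offsets do not account for daylight saving time, which may or may not matter for the data at hand.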

Parsing dates in Python using Pandas

When I ran this code for the first time, it gave me the correct results, i.e. dates in the format 2013-01-23.
But when I ran the same code again, I did not get the correct result (the output was 23/01/2013).
Why is it different the second time?
from pandas import *
fec1 = read_csv("/user_home/w_andalib_dvpy/sample_data/sample.csv")
def convert_date(val):
    d, m, y = val.split('/')
    return datetime(int(y),int(m),int(d))
# FECHA is the date column name in raw file. format: 23/01/2013
fec1.FECHA.map(convert_date)
fec1.FECHA
Dates can be parsed at the time you read the CSV by passing parse_dates=['yourdatecolumn'] and date_parser=convert_date to the pandas.read_csv method.
Doing it this way is much faster than loading the data and then parsing the dates.
The reason you get different outputs when you do the same operation twice is probably that, when you parse the dates, you take D/M/Y as input but produce Y/M/D as output. It basically flips the D and the Y every time.
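Another point that may explain the difference: Series.map returns a new Series and does not change fec1.FECHA in place, so unless the result is assigned back, the column still holds the original strings. A minimal sketch of parsing the column explicitly and keeping the result follows, using pd.to_datetime with a day/month/year format; the in-memory CSV is a hypothetical stand-in for sample.csv. As far as I know, recent pandas versions also deprecate date_parser in favour of a date_format argument to read_csv.

```python
import io
import pandas as pd

# Hypothetical stand-in for sample.csv; FECHA holds day/month/year strings.
raw = io.StringIO("FECHA,value\n23/01/2013,1\n05/02/2013,2\n")
fec1 = pd.read_csv(raw)

# to_datetime() returns a new Series, so assign it back to keep the parsed dates.
fec1["FECHA"] = pd.to_datetime(fec1["FECHA"], format="%d/%m/%Y")
```

Giving the format explicitly avoids any guessing about whether 05/02/2013 means 5 February or 2 May.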
