Change dates in list with multiple dictionaries in Python - python

I have a list with multiple dictionaries, like the following:
[{'Date': '6-1-2017', 'Rate':'0.3', 'Type':'A'},
{'Date': '6-1-2017', 'Rate':'0.4', 'Type':'B'},
{'Date': '6-1-2017', 'Rate':'0.6', 'Type':'F'},
{'Date': '6-1-2017', 'Rate':'0.1', 'Type':'B'}
]
I would now like to change the dates, because they need to be in the format 'yyymmdd', which starts at 1900-01-01. In other words, I would like to change the '6-1-2017' to '1170106'.
As this has to be done every week (with the then current date), I do not want to change this by hand. So next week, '13-1-2017' has to be transformed into '1170113'.
Does anyone have ideas on how to do this? I have tried several things, but I can't even get my code to select the date values of all dictionaries.
Many thanks!

You can use the datetime module, which provides a lot of functionality for manipulating datetime objects, including converting a datetime to a string and back, accessing the individual components of a datetime object, etc.:
from datetime import datetime
for l in lst:
    # Parse the 'd-m-yyyy' string into a datetime object
    l['Date'] = datetime.strptime(l['Date'], "%d-%m-%Y")
    # Years since 1900, followed by the zero-padded month and day
    l['Date'] = str(l['Date'].year - 1900) + l['Date'].strftime("%m%d")
lst
#[{'Date': '1170106', 'Rate': '0.3', 'Type': 'A'},
# {'Date': '1170106', 'Rate': '0.4', 'Type': 'B'},
# {'Date': '1170106', 'Rate': '0.6', 'Type': 'F'},
# {'Date': '1170106', 'Rate': '0.1', 'Type': 'B'}]
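The same conversion can be wrapped in a small helper, so the weekly run only has to call one function. This is a minimal sketch: the name to_yyymmdd is hypothetical, and zero-padding the year offset to three digits (so that dates before 2000 still come out seven characters long) is an assumption about the target format that the question does not spell out.

```python
from datetime import datetime

def to_yyymmdd(date_str):
    """Convert 'd-m-yyyy' to 'yyymmdd', where yyy counts years since 1900."""
    d = datetime.strptime(date_str, "%d-%m-%Y")
    # Assumption: the year offset is zero-padded to three digits
    return "{:03d}{}".format(d.year - 1900, d.strftime("%m%d"))

lst = [{'Date': '6-1-2017', 'Rate': '0.3', 'Type': 'A'},
       {'Date': '13-1-2017', 'Rate': '0.4', 'Type': 'B'}]

for row in lst:
    row['Date'] = to_yyymmdd(row['Date'])

print([row['Date'] for row in lst])  # ['1170106', '1170113']
```

Keeping the parsing in one place also means a malformed date raises a ValueError at a single, easy-to-find spot.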

Related

How to convert format date from MM/DD/YYYY to YYYY-MM-DDT00:00:00.000Z in Python

I used pandas to create a list of dictionaries. The following code is how I create the list:
sheetwork = client.open('RMA Daily Workload').sheet1
list_of_work = sheetwork.get_all_records()
dfr = pd.DataFrame(list_of_work, columns = ['date' , 'value'])
rnow = dfr.to_dict('records')
The following is the output of my list:
rnow =
[{'date': '01/02/2020', 'value': 13},
{'date': '01/03/2020', 'value': 2},
{'date': '01/06/2020', 'value': 5},
...
{'date': '01/07/2020', 'value': 6}]
I want to change the date format from MM/DD/YYYY to YYYY-MM-DDT00:00:00.000Z, so that my data will be compatible with my javascript file where I want to add my data.
I want my list to be shown as:
rnow =
[{'date': '2020-01-02T00:00:00.000Z', 'value': 13},
{'date': '2020-01-03T00:00:00.000Z', 'value': 2},
{'date': '2020-01-06T00:00:00.000Z', 'value': 5},
...
{'date': '2020-01-07T00:00:00.000Z', 'value': 6}]
I tried many methods but could only convert the dates to 2020-01-02 00:00:00, not 2020-01-02T00:00:00.000Z. Please advise what I should do.
If you need exactly the string T00:00:00.000Z after the date, put it into the format string when converting back to text,
e.g.,
import datetime
# '2020-01-07T00:00:00.000Z'
datetime.datetime.strptime("01/07/2020", '%m/%d/%Y').strftime('%Y-%m-%dT00:00:00.000Z')
How to apply to pandas:
def func(x):
    # x is one row of the dataframe; the input dates are MM/DD/YYYY
    myDate = x['date']
    return datetime.datetime.strptime(myDate, '%m/%d/%Y').strftime('%Y-%m-%dT00:00:00.000Z')
df['new_date'] = df.apply(func, axis=1)
Since you are using pandas, here is an easy way that also keeps UTC:
import pandas as pd
rnow = [{'date': '01/02/2020', 'value': 13},
{'date': '01/03/2020', 'value': 2},
{'date': '01/06/2020', 'value': 5},
{'date': '01/07/2020', 'value': 6}]
def get_isoformat(date):
    return pd.to_datetime(date, dayfirst=False, utc=True).isoformat()
for i in range(len(rnow)):
    rnow[i]['date'] = get_isoformat(rnow[i]['date'])
rnow
which outputs:
[{'date': '2020-01-02T00:00:00+00:00', 'value': 13},
{'date': '2020-01-03T00:00:00+00:00', 'value': 2},
{'date': '2020-01-06T00:00:00+00:00', 'value': 5},
{'date': '2020-01-07T00:00:00+00:00', 'value': 6}]
In fact, you probably want to apply get_isoformat() to your dataframe directly for simplicity. Also, leaving utc at its default instead of passing utc=True gets rid of the +00:00 part in case you don't want or need it.
Edit
To get specifically 2020-01-02T00:00:00Z, try:
pd.to_datetime(date, dayfirst=False, utc=False).isoformat()+'Z'
You can use the isoformat method of Python's built-in datetime module:
from datetime import datetime, timezone
# strptime() takes no tzinfo argument, so attach UTC afterwards with replace()
formatted = datetime.strptime('01/02/2020', '%m/%d/%Y').replace(tzinfo=timezone.utc).isoformat()
formatted
# Output: '2020-01-02T00:00:00+00:00'
Note that isoformat() does not emit the Z suffix for UTC; it writes +00:00 instead, which is also valid ISO 8601 and should parse fine in other code.
If this is a problem, you can omit the timezone and instead manually put a Z there:
from datetime import datetime
formatted = datetime.strptime('01/02/2020', '%m/%d/%Y').isoformat() + 'Z'
formatted
# Output: '2020-01-02T00:00:00Z'
Alternatively (in a more "manual" approach), you could format the date using strftime:
from datetime import datetime
formatted = datetime.strptime('01/02/2020', '%m/%d/%Y').strftime('%Y-%m-%dT00:00:00Z')
formatted
# Output: '2020-01-02T00:00:00Z'
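If the milliseconds from the question (.000Z) are required, isoformat() accepts a timespec argument on Python 3.6 and later. Below is a minimal sketch that applies it to the list of dictionaries. The helper name to_js_iso is hypothetical, and the input is assumed to hold MM/DD/YYYY dates that are meant as UTC midnight.

```python
from datetime import datetime

def to_js_iso(date_str):
    """Convert 'MM/DD/YYYY' to 'YYYY-MM-DDT00:00:00.000Z'."""
    d = datetime.strptime(date_str, '%m/%d/%Y')
    # timespec='milliseconds' forces the '.000' part; 'Z' is appended by hand
    return d.isoformat(timespec='milliseconds') + 'Z'

rnow = [{'date': '01/02/2020', 'value': 13},
        {'date': '01/03/2020', 'value': 2}]

# Build new dictionaries instead of mutating the originals
converted = [dict(row, date=to_js_iso(row['date'])) for row in rnow]
print(converted[0])  # {'date': '2020-01-02T00:00:00.000Z', 'value': 13}
```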

Best way to break apart long string using redshift SQL (included in question)?

Looking for the best way to break apart this blob of information into columns:
DATE
AMOUNT
TYPE
UNDISCLOSED
INVESTORS
INVESTORS WEBSITES
[{'date': 'Mon Aug 07 00:00:00 UTC 2004', 'amount': '1900000', 'type': 'Series D', 'undisclosed': 'false', 'investor': [{'name': 'Jobius Venture', 'website': 'jobiusvc.com'}]}, {'date': 'Tues July 06 00:00:00 UTC 2010', 'amount': '12000000000', 'type': 'Series A1', 'undisclosed': 'false', 'investor': [{'name': 'Fatthead Partners', 'website': 'fpartnazs.com'}, {'name': 'Jobius Venture', 'website': 'jobiusvc.com'}, {'name': 'Pista Pentures ', 'website': 'pisptavc.com'}]}, {'date': 'Sat Jun 01 00:00:00 UTC 2015', 'amount': '10000000000', 'type': 'Series X', 'undisclosed': 'false', 'investor': [{'name': 'Fatthead Partners', 'website': 'fpartnazs.com'}, {'name': 'Jobius Venture', 'website': 'jobiusvc.com'}, {'name': 'Pista Pentures', 'website': 'vistavc.com'}]}, {'date': 'Sun Aug 31 00:00:00 UTC 2015', 'amount': '3913000', 'type': 'Unknown', 'undisclosed': 'false'}, {'date': 'Mon Aug 12 00:00:00 UTC 2023', 'amount': '40000', 'type': 'Series D34', 'undisclosed': 'false', 'investor': [{'name': 'Fatthead Partners', 'website': 'fpartnazs.com'}, {'name': 'Jobius Venture', 'website': 'jobiusvc.com'}]}]
Your output is almost in JSON format.
For JSON, you could use: JSON_EXTRACT_PATH_TEXT Function - Amazon Redshift
However, it seems that the quotation marks are not standard JSON. It should use double-quotes (") in JSON, not single quotes (').
Also, the string starts with a list ([...]), which JSON_EXTRACT_PATH_TEXT cannot read directly; it expects a JSON object in {..} braces.
The output looks more like it came from a Python program. If so, and you have access to the Python program, it would be better to have it output in correct JSON format, so that you could use the above function. (Or just output the fields you actually want.)
You could write a Python User-Defined Function to do the conversion, such as:
create or replace function f_parse (str varchar(2000))
returns varchar
stable
as $$
    import ast
    # literal_eval only parses Python literals, so it is safer than eval
    return ast.literal_eval(str)[0]['date']
$$ language plpythonu;
Then:
select f_parse(s) from table
Results in: Mon Aug 07 00:00:00 UTC 2004
However, it appears that multiple records are in that one line, so I really suggest that you get a better version of the input data rather than trying to parse that line.
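If the blob can be pulled out of Redshift and processed in Python instead, the flattening could look roughly like this. The sketch assumes the text is a valid Python literal (as the sample appears to be) and emits one row per investor, with empty investor fields when a round lists none. The function name flatten_rounds and the shortened sample are for illustration only.

```python
import ast

def flatten_rounds(blob):
    """Return (date, amount, type, undisclosed, investor, website) tuples."""
    rows = []
    for rnd in ast.literal_eval(blob):
        # A round without an 'investor' key still produces one row
        investors = rnd.get('investor') or [{}]
        for inv in investors:
            rows.append((rnd.get('date'), rnd.get('amount'), rnd.get('type'),
                         rnd.get('undisclosed'), inv.get('name'),
                         inv.get('website')))
    return rows

sample = ("[{'date': 'Mon Aug 07 00:00:00 UTC 2004', 'amount': '1900000', "
          "'type': 'Series D', 'undisclosed': 'false', "
          "'investor': [{'name': 'Jobius Venture', 'website': 'jobiusvc.com'}]}, "
          "{'date': 'Sun Aug 31 00:00:00 UTC 2015', 'amount': '3913000', "
          "'type': 'Unknown', 'undisclosed': 'false'}]")

for row in flatten_rounds(sample):
    print(row)
```

The resulting tuples map directly onto the six columns listed in the question and can be loaded back as ordinary rows.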

Convert boto3 output to convenient format

I am getting the below output using describe_snapshots function in boto3
u'StartTime': datetime.datetime(2017, 4, 7, 4, 21, 42, tzinfo=tzutc())
I wish to convert it into proper date so that I can proceed with sorting the snapshots and removing the ones which are older than a particular number of days.
Is there a python functionality which can be used to attain this ?
This is almost certainly already the format you need. datetime objects are easily comparable / sortable. For example:
from datetime import datetime, timezone
import boto3
ec2 = boto3.client('ec2')
account_id = 'MY_ACCOUNT_ID'
response = ec2.describe_snapshots(OwnerIds=[account_id])
snapshots = response['Snapshots']
# January 1st, 2017 (timezone-aware, because StartTime is timezone-aware too)
target_date = datetime(2017, 1, 1, tzinfo=timezone.utc)
# Get the snapshots older than the target date
old_snapshots = [s for s in snapshots if s['StartTime'] < target_date]
# Sort the old snapshots
old_snapshots = sorted(old_snapshots, key=lambda s: s['StartTime'])
docs: https://docs.python.org/3.6/library/datetime.html
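For the "older than a particular number of days" part of the question, the cutoff can be computed with timedelta. Here is a sketch without any AWS calls: the snapshot dictionaries below are made-up stand-ins for the entries of response['Snapshots'], and older_than is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone

def older_than(snapshots, days, now=None):
    """Return snapshots started more than `days` days ago, oldest first."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=days)
    old = [s for s in snapshots if s['StartTime'] < cutoff]
    return sorted(old, key=lambda s: s['StartTime'])

# Made-up sample data shaped like boto3 snapshot entries
snapshots = [
    {'SnapshotId': 'snap-a',
     'StartTime': datetime(2017, 4, 7, 4, 21, 42, tzinfo=timezone.utc)},
    {'SnapshotId': 'snap-b',
     'StartTime': datetime(2017, 1, 2, tzinfo=timezone.utc)},
    {'SnapshotId': 'snap-c',
     'StartTime': datetime(2017, 5, 30, tzinfo=timezone.utc)},
]

# A fixed "now" keeps the example reproducible
fixed_now = datetime(2017, 6, 1, tzinfo=timezone.utc)
old = older_than(snapshots, 30, now=fixed_now)
print([s['SnapshotId'] for s in old])  # ['snap-b', 'snap-a']
```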
Really late to this post, but I recently ran into this. I am assuming you're comparing the dates by hand/eye rather than comparing the datetime objects programmatically, or you're debugging and just want to see the date/time in the JSON objects in a human-readable format.
I found that the converter in the AWS AHA samples works really well.
def myconverter(json_object):
    if isinstance(json_object, datetime.datetime):
        return json_object.__str__()
From there you can just pass your original event/message from boto to json.dumps and get the converted JSON string back:
In [34]: print(json_event)
{'arn': 'arn:aws:service:region::X', 'service': 'SERVICE', 'eventTypeCode': 'SOME_CODE', 'eventTypeCategory': 'CAT', 'eventScopeCode': 'SCOPE', 'region': 'us-east-1', 'startTime': datetime.datetime(YYYY, MM, DD, HH, MM, tzinfo=tzlocal()), 'endTime': datetime.datetime(YYYY, MM, DD, HH, MM, tzinfo=tzlocal()), 'lastUpdatedTime': datetime.datetime(YYYY, MM, DD, HH, MM, SS, tzinfo=tzlocal()), 'statusCode': 'CODE' }
In [35]: json_msg = json.dumps(json_event, default=myconverter)
In [36]: print(json_msg)
{'arn': 'arn:aws:service:region::X', 'service': 'SERVICE', 'eventTypeCode': 'SOME_CODE', 'eventTypeCategory': 'CAT', 'eventScopeCode': 'SCOPE', 'region': 'us-east-1', 'startTime': "YYYY-MM-DD HH:MM:SS-OH:OS", 'endTime': "YYYY-MM-DD HH:MM:SS-OH:OS", 'lastUpdatedTime': "YYYY-MM-DD HH:MM:SS-OH:OS" , 'statusCode': 'CODE' }
This probably needs more code from your end, but it almost seems like you're the one making it output that way (or something you're using does so by default).
AWS returns this format:
<startTime>YYYY-MM-DDTHH:MM:SS.SSSZ</startTime>
So I'm guessing that somewhere you are using datetime.datetime() on the date fields instead of something else? (https://docs.python.org/2/library/datetime.html)

Sort timestamp in python dictionary

mydict = [{'Counted number': '26', 'Timestamp': '8/10/2015 13:07:38'},{'Counted number': '14','Timestamp': '8/10/2015 11:51:14'},{'Counted number': '28','Timestamp': '8/10/2015 13:06:27'}, {'Counted number': '20','Timestamp': '8/10/2015 12:53:42'}]
How to sort this dict based on timestamp?
This should work
import time
mydict.sort(key=lambda x:time.mktime(time.strptime(x['Timestamp'], '%d/%m/%Y %H:%M:%S')))
mydict.sort(key=lambda x:x['Timestamp'])
This compares the elements of mydict by their timestamp strings, so the order is lexicographic rather than chronological. If you want to sort by the actual time, you have to convert each timestamp string to a time object of some sort and sort mydict on that. This question will likely help with that.
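A sketch of that conversion with datetime.strptime, sorting on the parsed value while leaving the stored strings untouched. The format string assumes day/month order, as in the answer above; swap %d and %m if the data is month-first. The helper name parse_ts is made up for this example.

```python
from datetime import datetime

mydict = [{'Counted number': '26', 'Timestamp': '8/10/2015 13:07:38'},
          {'Counted number': '14', 'Timestamp': '8/10/2015 11:51:14'},
          {'Counted number': '28', 'Timestamp': '8/10/2015 13:06:27'},
          {'Counted number': '20', 'Timestamp': '8/10/2015 12:53:42'}]

def parse_ts(entry):
    # Assumption: timestamps are day/month/year
    return datetime.strptime(entry['Timestamp'], '%d/%m/%Y %H:%M:%S')

ordered = sorted(mydict, key=parse_ts)
print([e['Counted number'] for e in ordered])  # ['14', '20', '28', '26']
```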

python parse java calendar to isodate

I have data like this.
startDateTime: {'timeZoneID': 'America/New_York', 'date': {'year': '2014', 'day': '29', 'month': '1'}, 'second': '0', 'hour': '12', 'minute': '0'}
This is just a representation of one attribute. I have 5 other attributes like this: LastModified, created, etc.
I want to derive the ISO date format yyyy-mm-dd hh:mi:ss from this. Is this the right way of doing it?
def parse_date(datecol):
    x = datecol
    y = str(x.get('date').get('year'))+'-'+str(x.get('date').get('month')).zfill(2)+'-'+str(x.get('date').get('day')).zfill(2)+' '+str(x.get('hour')).zfill(2)+':'+str(x.get('minute')).zfill(2)+':'+str(x.get('second')).zfill(2)
    print(y)
    return
That works, but I'd say it's cleaner to use the string formatting operator here:
def parse_date(c):
    d = c["date"]
    # %d needs integers, so convert the string fields with int, not str
    print("%04d-%02d-%02d %02d:%02d:%02d" % tuple(map(int, (d["year"], d["month"], d["day"], c["hour"], c["minute"], c["second"]))))
Alternatively, you can use the time module to convert your fields into a Python time value, and then format that using strftime. Remember the time zone, though.
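A sketch of that alternative using datetime rather than time, assuming every field arrives as a numeric string as in the sample. The timeZoneID is ignored here, so the result is the wall-clock time in whatever zone the record names. The month is taken as 1-based, which is worth checking against the producer, since java.util.Calendar counts months from 0. parse_calendar is a hypothetical name.

```python
from datetime import datetime

def parse_calendar(c):
    """Format a calendar-style dict as 'yyyy-mm-dd hh:mi:ss'."""
    d = c['date']
    # Assumption: 'month' is 1-based; add 1 first if it comes from a 0-based source
    dt = datetime(int(d['year']), int(d['month']), int(d['day']),
                  int(c['hour']), int(c['minute']), int(c['second']))
    return dt.strftime('%Y-%m-%d %H:%M:%S')

start = {'timeZoneID': 'America/New_York',
         'date': {'year': '2014', 'day': '29', 'month': '1'},
         'second': '0', 'hour': '12', 'minute': '0'}

print(parse_calendar(start))  # 2014-01-29 12:00:00
```

Building a real datetime also validates the fields, so an impossible date such as day 31 in a 30-day month raises a ValueError instead of producing a malformed string.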
