I have a list of dictionaries which is as follow-
VehicleList = [
{
'id': '1',
'VehicleType': 'Car',
'CreationDate': datetime.datetime(2021, 12, 10, 16, 9, 44, 872000)
},
{
'id': '2',
'VehicleType': 'Bike',
'CreationDate': datetime.datetime(2021, 12, 15, 11, 8, 21, 612000)
},
{
'id': '3',
'VehicleType': 'Truck',
'CreationDate': datetime.datetime(2021, 9, 13, 10, 1, 50, 350095)
},
{
'id': '4',
'VehicleType': 'Bike',
'CreationDate': datetime.datetime(2021, 12, 10, 21, 1, 00, 300012)
},
{
'id': '5',
'VehicleType': 'Car',
'CreationDate': datetime.datetime(2021, 12, 21, 10, 1, 50, 600095)
}
]
How can I get a list of the latest vehicles for each 'VehicleType' based on their 'CreationDate'?
I expect something like this-
latestVehicles = [
{
'id': '5',
'VehicleType': 'Car',
'CreationDate': datetime.datetime(2021, 12, 21, 10, 1, 50, 600095)
},
{
'id': '2',
'VehicleType': 'Bike',
'CreationDate': datetime.datetime(2021, 12, 15, 11, 8, 21, 612000)
},
{
'id': '3',
'VehicleType': 'Truck',
'CreationDate': datetime.datetime(2021, 9, 13, 10, 1, 50, 350095)
}
]
I tried separating out each dictionary based on their 'VehicleType' into different lists and then picking up the latest one.
I believe there might be a more optimal way to do this.
Use a dictionary mapping from VehicleType value to the dictionary you want in your final list. Compare the date of each item in the input list with the one your dict, and keep the later one.
latest_dict = {}
for vehicle in VehicleList:
t = vehicle['VehicleType']
if t not in latest_dict or vehicle['CreationDate'] > latest_dict[t]['CreationDate']:
latest_dict[t] = vehicle
latestVehicles = list(latest_dict.values())
Here is a solution using max and filter:
VehicleLatest = [
max(
filter(lambda _: _["VehicleType"] == t, VehicleList),
key=lambda _: _["CreationDate"]
) for t in {_["VehicleType"] for _ in VehicleList}
]
Result
print(VehicleLatest)
# [{'id': '2', 'VehicleType': 'Bike', 'CreationDate': datetime.datetime(2021, 12, 15, 11, 8, 21, 612000)}, {'id': '3', 'VehicleType': 'Truck', 'CreationDate': datetime.datetime(2021, 9, 13, 10, 1, 50, 350095)}, {'id': '5', 'VehicleType': 'Car', 'CreationDate': datetime.datetime(2021, 12, 21, 10, 1, 50, 600095)}]
I think you can acheive what you want using the groupby function from itertools.
from itertools import groupby
# entries sorted according to the key we wish to groupby: 'VehicleType'
VehicleList = sorted(VehicleList, key=lambda x: x["VehicleType"])
latestVehicles = []
# Then the elements are grouped.
for k, v in groupby(VehicleList, lambda x: x["VehicleType"]):
# We then append to latestVehicles the 0th entry of the
# grouped elements after sorting according to the 'CreationDate'
latestVehicles.append(sorted(list(v), key=lambda x: x["CreationDate"], reverse=True)[0])
Sort by 'VehicleType' and 'CreationDate', then create a dictionary from 'VehicleType' and vehicle to get the latest vehicle for each type:
VehicleList.sort(key=lambda x: (x.get('VehicleType'), x.get('CreationDate')))
out = list(dict(zip([item.get('VehicleType') for item in VehicleList], VehicleList)).values())
Output:
[{'id': '2',
'VehicleType': 'Bike',
'CreationDate': datetime.datetime(2021, 12, 15, 11, 8, 21, 612000)},
{'id': '5',
'VehicleType': 'Car',
'CreationDate': datetime.datetime(2021, 12, 21, 10, 1, 50, 600095)},
{'id': '3',
'VehicleType': 'Truck',
'CreationDate': datetime.datetime(2021, 9, 13, 10, 1, 50, 350095)}]
This is very straightforwards in pandas. First load the list of dicts as a pandas dataframe, then sort the values by date, take the top n items (3 in the example below), and export to dict.
import pandas as pd
df = pd.DataFrame(VehicleList)
df.sort_values('CreationDate', ascending=False).head(3).to_dict(orient='records')
You can use the operator to achieve that goal:
import operator
my_sorted_list_by_type_and_date = sorted(VehicleList, key=operator.itemgetter('VehicleType', 'CreationDate'))
A small plea for more readable code:
from operator import itemgetter
from itertools import groupby
vtkey = itemgetter('VehicleType')
cdkey = itemgetter('CreationDate')
latest = [
# Get latest from each group.
max(vs, key = cdkey)
# Sort and group by VehicleType.
for g, vs in groupby(sorted(vehicles, key = vtkey), vtkey)
]
A variation on Blckknght's answer using defaultdict to avoid the long if condition:
from collections import defaultdict
import datetime
from operator import itemgetter
latest_dict = defaultdict(lambda: {'CreationDate': datetime.datetime.min})
for vehicle in VehicleList:
t = vehicle['VehicleType']
latest_dict[t] = max(vehicle, latest_dict[t], key=itemgetter('CreationDate'))
latestVehicles = list(latest_dict.values())
latestVehicles:
[{'id': '5', 'VehicleType': 'Car', 'CreationDate': datetime.datetime(2021, 12, 21, 10, 1, 50, 600095)},
{'id': '2', 'VehicleType': 'Bike', 'CreationDate': datetime.datetime(2021, 12, 15, 11, 8, 21, 612000)},
{'id': '3', 'VehicleType': 'Truck', 'CreationDate': datetime.datetime(2021, 9, 13, 10, 1, 50, 350095)}]
When I query MySQL with Python and the query has datetime fields then I get this list as a result.
[{'_id': 1, 'name': 'index', '_cdate': datetime.datetime(2020, 10, 27, 9, 4, 34), 'title': 'DataExtract'}, {'_id': 2, 'name': 'topmenu', '_cdate': datetime.datetime(2020, 11, 4, 19, 52, 17), 'title': 'topmenu'}, {'_id': 3, 'name': 'functions_common', '_cdate': datetime.datetime(2020, 11, 4, 19, 52, 50), 'title': 'common functions'}, {'_id': 4, 'name': 'leftmenu', '_cdate': datetime.datetime(2020, 11, 4, 19, 53, 56), 'title': 'Left Menu'}, {'_id': 5, 'name': 'todo', '_cdate': datetime.datetime(2020, 11, 7, 8, 49, 38), 'title': 'Todo'}, {'_id': 6, 'name': 'cron_publish', '_cdate': datetime.datetime(2020, 12, 2, 19, 30, 11), 'title': 'Run Publish reports'}, {'_id': 7, 'name': 'test', '_cdate': datetime.datetime(2020, 12, 2, 22, 32, 54), 'title': 'test'}, {'_id': 8, 'name': 'help', '_cdate': datetime.datetime(2020, 12, 5, 7, 12, 44), 'title': 'Help'}, {'_id': 9, 'name': 'api', '_cdate': datetime.datetime(2020, 12, 5, 21, 22, 13), 'title': 'API'}, {'_id': 10, 'name': 'ben', '_cdate': datetime.datetime(2021, 10, 4, 11, 37, 3), 'title': 'List of Reports'}]
How do I either get the query to return the date fields in YYYY-MM-DD HH:MM:SS format? Or how do I convert them in the returned list. When I try to change them by enumerating over the results python throw as error that the dictionary has changed.
The datetime.datetime() objects you're getting are the standard representation of these objects - if you were expecting strings instead, you could simple convert them with datetime.strftime('%Y-%m-%d %H:%M:%S', value) but keep in mind that the datetime object is a more flexible way of keeping the data around. I'd recommend only formatting the date in a specific way if you're writing it to the screen or a file format that expects a string.
Example:
data = [{'_id': 1, 'name': 'index', '_cdate': datetime.datetime(2020, 10, 27, 9, 4, 34), 'title': 'DataExtract'}, {'_id': 2, 'name': 'topmenu', '_cdate': datetime.datetime(2020, 11, 4, 19, 52, 17), 'title': 'topmenu'}, {'_id': 3, 'name': 'functions_common', '_cdate': datetime.datetime(2020, 11, 4, 19, 52, 50), 'title': 'common functions'}, {'_id': 4, 'name': 'leftmenu', '_cdate': datetime.datetime(2020, 11, 4, 19, 53, 56), 'title': 'Left Menu'}, {'_id': 5, 'name': 'todo', '_cdate': datetime.datetime(2020, 11, 7, 8, 49, 38), 'title': 'Todo'}, {'_id': 6, 'name': 'cron_publish', '_cdate': datetime.datetime(2020, 12, 2, 19, 30, 11), 'title': 'Run Publish reports'}, {'_id': 7, 'name': 'test', '_cdate': datetime.datetime(2020, 12, 2, 22, 32, 54), 'title': 'test'}, {'_id': 8, 'name': 'help', '_cdate': datetime.datetime(2020, 12, 5, 7, 12, 44), 'title': 'Help'}, {'_id': 9, 'name': 'api', '_cdate': datetime.datetime(2020, 12, 5, 21, 22, 13), 'title': 'API'}, {'_id': 10, 'name': 'ben', '_cdate': datetime.datetime(2021, 10, 4, 11, 37, 3), 'title': 'List of Reports'}]
for rec in data:
rec['date_str'] = datetime.datetime.strftime('%Y-%m-%d %H:%M:%S', rec['_cdate'])
That would add 'date_str' field to every record with the format you require. Of course, you could also modify it to overwrite the original value.
rows is a list of dict from mysql.
rows example
[{'date': datetime.datetime(2017, 3, 21, 13, 27, 20), 'tid': 648605515L, 'price': Decimal('1080.04000000'), 'type': 1, 'amount': Decimal('10.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 27, 20), 'tid': 648605549L, 'price': Decimal('1081.55000000'), 'type': 1, 'amount': Decimal('16.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 27, 20), 'tid': 648605547L, 'price': Decimal('1081.33000000'), 'type': 1, 'amount': Decimal('20.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 27, 20), 'tid': 648605545L, 'price': Decimal('1081.30000000'), 'type': 1, 'amount': Decimal('16.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 27, 20), 'tid': 648605543L, 'price': Decimal('1081.29000000'), 'type': 1, 'amount': Decimal('20.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 27, 20), 'tid': 648605541L, 'price': Decimal('1080.46000000'), 'type': 1, 'amount': Decimal('26.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 27, 20), 'tid': 648605517L, 'price': Decimal('1080.04000000'), 'type': 1, 'amount': Decimal('8.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 27, 22), 'tid': 648605601L, 'price': Decimal('1079.69000000'), 'type': -1, 'amount': Decimal('70.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 27, 25), 'tid': 648605686L, 'price': Decimal('1079.72000000'), 'type': -1, 'amount': Decimal('4.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 27, 26), 'tid': 648605765L, 'price': Decimal('1079.45000000'), 'type': 1, 'amount': Decimal('6.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 27, 26), 'tid': 648605753L, 'price': Decimal('1079.60000000'), 'type': -1, 'amount': Decimal('106.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 27, 26), 'tid': 648605751L, 'price': Decimal('1079.60000000'), 'type': -1, 'amount': Decimal('80.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 27, 26), 'tid': 648605749L, 'price': Decimal('1079.67000000'), 'type': -1, 'amount': Decimal('430.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 27, 26), 'tid': 648605747L, 'price': Decimal('1079.70000000'), 'type': -1, 'amount': Decimal('66.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 27, 26), 'tid': 648605745L, 'price': Decimal('1079.74000000'), 'type': -1, 'amount': Decimal('12.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 27, 27), 'tid': 648605785L, 'price': Decimal('1079.45000000'), 'type': 1, 'amount': Decimal('2.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 27, 27), 'tid': 648605774L, 'price': Decimal('1079.45000000'), 'type': 1, 'amount': Decimal('6.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 27, 27), 'tid': 648605771L, 'price': Decimal('1079.45000000'), 'type': 1, 'amount': Decimal('14.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 27, 28), 'tid': 648605827L, 'price': Decimal('1079.45000000'), 'type': 1, 'amount': Decimal('42.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 27, 28), 'tid': 648605842L, 'price': Decimal('1079.45000000'), 'type': 1, 'amount': Decimal('10.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 27, 32), 'tid': 648605973L, 'price': Decimal('1079.45000000'), 'type': 1, 'amount': Decimal('2.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 27, 37), 'tid': 648606114L, 'price': Decimal('1079.44000000'), 'type': 1, 'amount': Decimal('24.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 27, 37), 'tid': 648606116L, 'price': Decimal('1079.45000000'), 'type': 1, 'amount': Decimal('40.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 27, 42), 'tid': 648606258L, 'price': Decimal('1079.45000000'), 'type': 1, 'amount': Decimal('56.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 27, 45), 'tid': 648606345L, 'price': Decimal('1079.46000000'), 'type': -1, 'amount': Decimal('10.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 27, 46), 'tid': 648606392L, 'price': Decimal('1079.69000000'), 'type': 1, 'amount': Decimal('44.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 27, 48), 'tid': 648606418L, 'price': Decimal('1079.60000000'), 'type': -1, 'amount': Decimal('40.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 27, 48), 'tid': 648606420L, 'price': Decimal('1079.46000000'), 'type': -1, 'amount': Decimal('36.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 27, 48), 'tid': 648606422L, 'price': Decimal('1079.46000000'), 'type': -1, 'amount': Decimal('94.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 27, 50), 'tid': 648606499L, 'price': Decimal('1079.31000000'), 'type': 1, 'amount': Decimal('80.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 27, 50), 'tid': 648606478L, 'price': Decimal('1079.31000000'), 'type': -1, 'amount': Decimal('6.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 27, 50), 'tid': 648606476L, 'price': Decimal('1079.31000000'), 'type': -1, 'amount': Decimal('34.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 27, 50), 'tid': 648606474L, 'price': Decimal('1079.55000000'), 'type': -1, 'amount': Decimal('8.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 27, 55), 'tid': 648606666L, 'price': Decimal('1079.31000000'), 'type': 1, 'amount': Decimal('44.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 27, 55), 'tid': 648606650L, 'price': Decimal('1079.17000000'), 'type': 1, 'amount': Decimal('8.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 27, 55), 'tid': 648606648L, 'price': Decimal('1079.17000000'), 'type': 1, 'amount': Decimal('8.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 28, 1), 'tid': 648606820L, 'price': Decimal('1079.03000000'), 'type': -1, 'amount': Decimal('28.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 28, 2), 'tid': 648606825L, 'price': Decimal('1079.03000000'), 'type': 1, 'amount': Decimal('30.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 28, 2), 'tid': 648606836L, 'price': Decimal('1079.02000000'), 'type': -1, 'amount': Decimal('22.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 28, 5), 'tid': 648606945L, 'price': Decimal('1078.58000000'), 'type': -1, 'amount': Decimal('2.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 28, 5), 'tid': 648606943L, 'price': Decimal('1078.61000000'), 'type': -1, 'amount': Decimal('2.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 28, 5), 'tid': 648606941L, 'price': Decimal('1078.63000000'), 'type': -1, 'amount': Decimal('2.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 28, 5), 'tid': 648606939L, 'price': Decimal('1078.88000000'), 'type': -1, 'amount': Decimal('2.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 28, 5), 'tid': 648606926L, 'price': Decimal('1078.88000000'), 'type': -1, 'amount': Decimal('428.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 28, 6), 'tid': 648606984L, 'price': Decimal('1078.58000000'), 'type': 1, 'amount': Decimal('2.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 28, 6), 'tid': 648606982L, 'price': Decimal('1078.05000000'), 'type': -1, 'amount': Decimal('10.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 28, 6), 'tid': 648606971L, 'price': Decimal('1078.58000000'), 'type': 1, 'amount': Decimal('2.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 28, 6), 'tid': 648606957L, 'price': Decimal('1078.05000000'), 'type': -1, 'amount': Decimal('74.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 28, 6), 'tid': 648606955L, 'price': Decimal('1078.15000000'), 'type': -1, 'amount': Decimal('6.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 28, 6), 'tid': 648606953L, 'price': Decimal('1078.15000000'), 'type': -1, 'amount': Decimal('14.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 28, 6), 'tid': 648606951L, 'price': Decimal('1078.42000000'), 'type': -1, 'amount': Decimal('16.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 28, 7), 'tid': 648606992L, 'price': Decimal('1078.05000000'), 'type': -1, 'amount': Decimal('2.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 28, 7), 'tid': 648606995L, 'price': Decimal('1078.58000000'), 'type': 1, 'amount': Decimal('2.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 28, 7), 'tid': 648607023L, 'price': Decimal('1078.06000000'), 'type': -1, 'amount': Decimal('4.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 28, 8), 'tid': 648607047L, 'price': Decimal('1078.86000000'), 'type': 1, 'amount': Decimal('2.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 28, 10), 'tid': 648607113L, 'price': Decimal('1078.06000000'), 'type': -1, 'amount': Decimal('2.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 28, 10), 'tid': 648607115L, 'price': Decimal('1078.03000000'), 'type': -1, 'amount': Decimal('148.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 28, 12), 'tid': 648607192L, 'price': Decimal('1079.00000000'), 'type': -1, 'amount': Decimal('10.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 28, 13), 'tid': 648607218L, 'price': Decimal('1078.99000000'), 'type': 1, 'amount': Decimal('98.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 28, 13), 'tid': 648607220L, 'price': Decimal('1079.00000000'), 'type': 1, 'amount': Decimal('42.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 28, 13), 'tid': 648607222L, 'price': Decimal('1079.03000000'), 'type': 1, 'amount': Decimal('342.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 28, 13), 'tid': 648607224L, 'price': Decimal('1079.16000000'), 'type': 1, 'amount': Decimal('512.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 28, 14), 'tid': 648607250L, 'price': Decimal('1078.98000000'), 'type': 1, 'amount': Decimal('44.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 28, 14), 'tid': 648607252L, 'price': Decimal('1078.98000000'), 'type': 1, 'amount': Decimal('12.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 28, 14), 'tid': 648607254L, 'price': Decimal('1079.00000000'), 'type': 1, 'amount': Decimal('106.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 28, 14), 'tid': 648607256L, 'price': Decimal('1079.16000000'), 'type': 1, 'amount': Decimal('40.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 28, 20), 'tid': 648607431L, 'price': Decimal('1079.16000000'), 'type': 1, 'amount': Decimal('28.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 28, 20), 'tid': 648607429L, 'price': Decimal('1079.01000000'), 'type': 1, 'amount': Decimal('2.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 28, 20), 'tid': 648607427L, 'price': Decimal('1079.01000000'), 'type': 1, 'amount': Decimal('2.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 28, 23), 'tid': 648607518L, 'price': Decimal('1079.16000000'), 'type': 1, 'amount': Decimal('8.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 28, 24), 'tid': 648607544L, 'price': Decimal('1079.16000000'), 'type': 1, 'amount': Decimal('344.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 28, 25), 'tid': 648607593L, 'price': Decimal('1078.79000000'), 'type': -1, 'amount': Decimal('6.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 28, 26), 'tid': 648607631L, 'price': Decimal('1079.16000000'), 'type': 1, 'amount': Decimal('430.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 28, 26), 'tid': 648607623L, 'price': Decimal('1079.16000000'), 'type': 1, 'amount': Decimal('18.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 28, 26), 'tid': 648607621L, 'price': Decimal('1078.79000000'), 'type': 1, 'amount': Decimal('14.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 28, 29), 'tid': 648607695L, 'price': Decimal('1079.16000000'), 'type': 1, 'amount': Decimal('776.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 28, 32), 'tid': 648607803L, 'price': Decimal('1079.16000000'), 'type': 1, 'amount': Decimal('2.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 28, 32), 'tid': 648607805L, 'price': Decimal('1079.16000000'), 'type': 1, 'amount': Decimal('10.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 28, 36), 'tid': 648607905L, 'price': Decimal('1079.16000000'), 'type': 1, 'amount': Decimal('2.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 28, 37), 'tid': 648607940L, 'price': Decimal('1079.31000000'), 'type': 1, 'amount': Decimal('2.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 28, 42), 'tid': 648608110L, 'price': Decimal('1079.46000000'), 'type': -1, 'amount': Decimal('12.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 28, 46), 'tid': 648608211L, 'price': Decimal('1079.88000000'), 'type': -1, 'amount': Decimal('12.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 28, 46), 'tid': 648608213L, 'price': Decimal('1079.88000000'), 'type': -1, 'amount': Decimal('6.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 28, 57), 'tid': 648608534L, 'price': Decimal('1080.29000000'), 'type': 1, 'amount': Decimal('14.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 28, 57), 'tid': 648608536L, 'price': Decimal('1080.30000000'), 'type': 1, 'amount': Decimal('2.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 29, 2), 'tid': 648608683L, 'price': Decimal('1080.59000000'), 'type': 1, 'amount': Decimal('40.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 29, 3), 'tid': 648608733L, 'price': Decimal('1080.59000000'), 'type': 1, 'amount': Decimal('360.00000000')}, {'date': datetime.datetime(2017, 3, 21, 13, 29, 7), 'tid': 648608838L, 'price': Decimal('1080.90000000'), 'type': 1, 'amount': Decimal('82.00000000')}]
if I didn't use set_index ,it will have an TypeError: Only valid with DatetimeIndex, TimedeltaIndex or PeriodIndex, but got an instance of 'RangeIndex'
if rows:
df = pd.DataFrame(rows)
print df.head()
# TypeError: Only valid with DatetimeIndex, TimedeltaIndex or PeriodIndex, but got an instance of 'RangeIndex'
df = df.set_index("date")
print df.head()
resample_data = df.resample("1min", how={"price": "ohlc", "amount": "sum"})
print resample_data
Result :
Connected to pydev debugger (build 162.1967.10)
amount date price tid type
0 2.00000000 2017-03-21 11:15:12 1075.83000000 648370156 -1
1 10.00000000 2017-03-21 11:15:15 1076.00000000 648370241 -1
2 10.00000000 2017-03-21 11:15:17 1075.83000000 648370297 -1
3 10.00000000 2017-03-21 11:15:17 1075.83000000 648370311 1
4 8.00000000 2017-03-21 11:15:19 1076.13000000 648370370 1
amount price tid type
date
2017-03-21 11:15:12 2.00000000 1075.83000000 648370156 -1
2017-03-21 11:15:15 10.00000000 1076.00000000 648370241 -1
2017-03-21 11:15:17 10.00000000 1075.83000000 648370297 -1
2017-03-21 11:15:17 10.00000000 1075.83000000 648370311 1
2017-03-21 11:15:19 8.00000000 1076.13000000 648370370 1
/Users/wyx/bitcoin_workspace/fibo-strategy/ticker.py:45: FutureWarning: how in .resample() is deprecated
the new syntax is .resample(...)..apply(<func>)
resample_data = df.resample("1min", how={"price": "ohlc", "amount": "sum"})
Traceback (most recent call last):
File "/Applications/PyCharm.app/Contents/helpers/pydev/pydevd.py", line 1580, in <module>
globals = debugger.run(setup['file'], None, None, is_module)
File "/Applications/PyCharm.app/Contents/helpers/pydev/pydevd.py", line 964, in run
pydev_imports.execfile(file, globals, locals) # execute the script
File "/Users/wyx/bitcoin_workspace/fibo-strategy/ticker.py", line 45, in <module>
resample_data = df.resample("1min", how={"price": "ohlc", "amount": "sum"})
File "/Users/wyx/bitcoin_workspace/fibo-strategy/.env/lib/python2.7/site-packages/pandas/core/generic.py", line 4216, in resample
limit=limit)
File "/Users/wyx/bitcoin_workspace/fibo-strategy/.env/lib/python2.7/site-packages/pandas/tseries/resample.py", line 582, in _maybe_process_deprecations
r = r.aggregate(how)
File "/Users/wyx/bitcoin_workspace/fibo-strategy/.env/lib/python2.7/site-packages/pandas/tseries/resample.py", line 320, in aggregate
result, how = self._aggregate(arg, *args, **kwargs)
File "/Users/wyx/bitcoin_workspace/fibo-strategy/.env/lib/python2.7/site-packages/pandas/core/base.py", line 549, in _aggregate
result = _agg(arg, _agg_1dim)
File "/Users/wyx/bitcoin_workspace/fibo-strategy/.env/lib/python2.7/site-packages/pandas/core/base.py", line 500, in _agg
result[fname] = func(fname, agg_how)
File "/Users/wyx/bitcoin_workspace/fibo-strategy/.env/lib/python2.7/site-packages/pandas/core/base.py", line 483, in _agg_1dim
return colg.aggregate(how, _level=(_level or 0) + 1)
File "/Users/wyx/bitcoin_workspace/fibo-strategy/.env/lib/python2.7/site-packages/pandas/core/groupby.py", line 2652, in aggregate
return getattr(self, func_or_funcs)(*args, **kwargs)
File "/Users/wyx/bitcoin_workspace/fibo-strategy/.env/lib/python2.7/site-packages/pandas/core/groupby.py", line 1128, in ohlc
lambda x: x._cython_agg_general('ohlc'))
File "/Users/wyx/bitcoin_workspace/fibo-strategy/.env/lib/python2.7/site-packages/pandas/core/groupby.py", line 3103, in _apply_to_column_groupbys
return func(self)
File "/Users/wyx/bitcoin_workspace/fibo-strategy/.env/lib/python2.7/site-packages/pandas/core/groupby.py", line 1128, in <lambda>
lambda x: x._cython_agg_general('ohlc'))
File "/Users/wyx/bitcoin_workspace/fibo-strategy/.env/lib/python2.7/site-packages/pandas/core/groupby.py", line 808, in _cython_agg_general
raise DataError('No numeric types to aggregate')
pandas.core.base.DataError: No numeric types to aggregate
Process finished with exit code 1
I am a rookie for pandas.
How solve the error?
And if I want to use the last close price to fill the NaN of next
min ohlc. How to do that?
You need to set an index using your dates.
Code:
from io import StringIO
df = pd.read_csv(StringIO(
u"""amount date price tid type
6.00000000 2017-03-21t10:46:32 1059.26000000 648313975 -1
4.00000000 2017-03-21t10:46:37 1059.42000000 648314094 -1
2.00000000 2017-03-21t10:46:37 1059.42000000 648314096 -1
2.00000000 2017-03-21t10:46:41 1059.26000000 648314176 -1
32.00000000 2017-03-21t10:46:41 1059.26000000 648314189 -1
"""), sep='\s+', parse_dates='date'.split())
print(df)
resample_data = df.set_index('date').resample(
"1min", how={"price": "ohlc", "amount": "sum"})
print(resample_data)
Results:
amount date price tid type
0 6.0 2017-03-21 10:46:32 1059.26 648313975 -1
1 4.0 2017-03-21 10:46:37 1059.42 648314094 -1
2 2.0 2017-03-21 10:46:37 1059.42 648314096 -1
3 2.0 2017-03-21 10:46:41 1059.26 648314176 -1
4 32.0 2017-03-21 10:46:41 1059.26 648314189 -1
price amount
open high low close amount
date
2017-03-21 10:46:00 1059.26 1059.42 1059.26 1059.26 46.0