I have a dictionary with some share related data:
share_data = {'2016-06-13': {'open': 2190, 'close': 2200}, '2015-09-10': {'open': 2870, 'close': 2450}}  # and so on, circa 1,500 entries
is there a way of iterating over the dictionary in order, so the oldest date is retrieved first, then the one soon after etc?
thanks!
Sure. Since your keys are ISO-formatted date strings (YYYY-MM-DD), their lexicographical order is the same as their chronological order, so it is very easy:
for date in sorted(share_data):
    prices = share_data[date]
    # do something with prices['open'] and prices['close']
This post has some nice examples of custom sorting on dictionaries.
The input is a dictionary, for example:
{'first_name':'Jane', 'occupation': 'astronaut', 'age':27, 'last_name':'Doe'}
The keys need to be rearranged to be in a specific order, given in a list, for example:
preferred_order = ['first_name', 'last_name', 'age', 'location']
The dictionary might not have all the keys in the preferred_order list, and might have keys that don't appear on the list.
In this specific case, the result of the sorting should be:
{'first_name':'Jane', 'last_name':'Doe', 'age':27, 'occupation': 'astronaut'}
Key location didn't get added to the dictionary, and the keys not in preferred_order are at the end.
Suggested algorithm:
1. Create a new, initially empty, dictionary.
2. Iterate through preferred_order: for every key in preferred_order, add key: value to the new dictionary if the key exists in the old dictionary, and remove it from the old dictionary.
3. Add every remaining key: value pair in the old dictionary to the new dictionary.
For step 3, you can use dict.update or |=; see the sketch below.
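A minimal sketch of that algorithm, assuming plain Python dicts (which keep insertion order on 3.7+); the function name reorder_keys is just illustrative, and new |= old needs Python 3.9+ (use new.update(old) on older versions). Note that, as described, it removes the moved keys from the old dictionary:

def reorder_keys(old, preferred_order):
    new = {}
    for key in preferred_order:
        if key in old:
            # Steps 1-2: move the key from the old dictionary to the new one
            new[key] = old.pop(key)
    # Step 3: append whatever is left, in its original order
    new |= old  # or: new.update(old)
    return new

data = {'first_name': 'Jane', 'occupation': 'astronaut', 'age': 27, 'last_name': 'Doe'}
print(reorder_keys(data, ['first_name', 'last_name', 'age', 'location']))
# {'first_name': 'Jane', 'last_name': 'Doe', 'age': 27, 'occupation': 'astronaut'}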
Further reading:
documentation on dict.update and |=;
more about |=;
How do I sort a dictionary by value?;
You can search for "sort dictionary by key" on stackoverflow, but be aware that most answers are outdated and recommend using collections.OrderedDict instead of dict, which is no longer necessary since Python 3.7.
I'm writing a script in Python 3, where I go through a file, and collect information about the duration of various tasks. I need to maintain a list of summations of these durations (in the form of datetime.timedelta objects), split by date and which task was done. Each task is identified by an ID string.
This means that while going through the file I build a list of records, where each record consists of a date, an ID string and a duration. When adding a new record I first check if the date and ID string combination is already present in the list. If it is, I add the new duration to the current duration in the list. If the date and ID string combination doesn't exist, I append the record to the list.
I don't know in advance how many different combinations of date and ID string there are, so I can't pre-allocate them.
At the end I would like to be able to sort the list on date and ID string before printing it to standard out.
I tried doing it in a list of tuples, but tuples are immutable, so I can't add a new duration to an existing duration I found.
If pressed I could create a new ID string by concatenating a string representation of the date and the ID string. But I would really prefer to keep those two values separate.
Is this possible? And if so: How?
I wouldn't use a list in this case, but rather a dict keyed by (date, ID) tuples. Here's a simple example:
import datetime

data = {}
with open("myfile.txt") as file:
    for line in file:
        # Parse the line for the following:
        # tid: the task ID we read
        # date: the date we read
        # duration: the duration we read, as a datetime.timedelta
        # Once the data has been parsed out, accumulate it under the (date, tid) key;
        # the default must be a zero timedelta, since 0 + timedelta raises a TypeError:
        data.setdefault((date, tid), datetime.timedelta(0))
        data[(date, tid)] += duration
After parsing the file you can get the keys of the dict (data.keys()), sort them, and print out the results.
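For example, the final step might look like this (the tuple keys sort by date first, then by task ID):

for date, tid in sorted(data):
    print(date, tid, data[(date, tid)])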
I want to interface with rocksdb in my python application and store arbitrary dicts in it. I gather that for that I can use something like pickle for serialisation. But I need to be able to filter the records based on values of their keys. What's the proper approach here?
So let's say you have a list of keys named dict_keys and a dict named big_dict, and you want to keep only the entries for the keys in dict_keys. You can write a dict comprehension that iterates through the list, grabbing each item from the dict, like this:
new_dict = {key: big_dict.get(key) for key in dict_keys}  # missing keys map to None; add an "if key in big_dict" filter to skip them instead
RocksDB is a key-value store, and both key and value are binary strings.
If you want to filter by given keys, just use the Get interface to search the DB.
If you want to filter by given key patterns, you have to use the Iterator interface to iterate over the whole DB and keep the records whose keys match the pattern.
If you want to filter by values or value patterns, you still need to iterate over the whole DB: for each key-value pair, deserialize the value and check whether it equals the given value or matches the given pattern.
For cases 1 and 2 you only need to deserialize the values whose keys match the given keys or pattern; for case 3, however, you have to deserialize every value.
Both case 2 and case 3 are inefficient, since they need to iterate over the whole key space.
RocksDB keeps its keys ordered (the comparator is configurable) and has good support for prefix indexing, so range queries and prefix queries by key are efficient. Check the documentation for details.
To filter or search by value efficiently, you have to build a value index yourself with RocksDB.
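A rough sketch of case 3 with the python-rocksdb binding (the package, database path, example record, and filter condition are assumptions; adapt them to whatever binding and schema you actually use):

import pickle
import rocksdb  # assumed binding: python-rocksdb

db = rocksdb.DB("records.db", rocksdb.Options(create_if_missing=True))
db.put(b"user:42", pickle.dumps({"name": "Jane", "score": 17}))  # store an arbitrary dict

# Case 3: iterate the whole DB, deserialize every value, keep the matches
matches = []
it = db.iteritems()
it.seek_to_first()
for key, raw in it:
    record = pickle.loads(raw)
    if record.get("score", 0) > 10:  # example filter condition
        matches.append((key, record))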
I have a list of IDs for objects that I need to grab, then I have to sort them by their timestamp. Here's how I was going to do it:
instances = []
for i in object_ids:
    instance = Model.objects.get(id=i)
    instances.append(instance)  # append instance to list of instances
# sort the instances list
instances.sort(key=lambda obj: obj.timestamp)
But there are two things that bother me:
Is there no way to grab a collection of entries by a list of their IDs - without having to loop?
Can I append arbitrary objects to the QuerySet just based on their IDs ?
Thanks,
This can be done using such a code:
objects = Model.objects.filter(id__in=object_ids).order_by('-timestamp')
order_by accepts 'timestamp' or '-timestamp', depending on whether you want ascending or descending order.
Try the following:
result = Model.objects.filter(id__in=object_ids)
This returns all Model objects that have their id in the given list of object_ids. This way, you also don't need to append additional models to the resulting QuerySet.
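If you also want the timestamp ordering from the question, you can chain order_by onto the same queryset, or sort in Python after it has been evaluated (the field name timestamp is taken from the question and the other answer):

result = Model.objects.filter(id__in=object_ids).order_by('timestamp')
# or, once the queryset has been evaluated:
instances = sorted(result, key=lambda obj: obj.timestamp)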
I have a nested tuple returned from a MySQL cursor.fetchall() containing some results in the form (datetime.date, float). I need to separate these out in to a nested dictionary of the form [month/year][day of month] - so I would like to have a dictionary (say) readings which I would reference like readings['12/2011'][13] to get the reading for 13th day of the month '12/2011'. This is with a view to producing graphs showing the daily readings for multiple months overlaid.
My difficulty is that (I believe) I need to set up the first dimension of the dictionary with the unique month/year identifiers. I am currently getting a list of these via:
list(set(["%02d/%04d" % (z[0].month, z[0].year) for z in raw]))
where raw is a list of tuples returned from the database.
Now I can easily do this as a two-stage process: set up the first dimension of the dictionary, then go through the data once more to set up the second. I wondered, though, if there is a readable way to do both steps at once, possibly with nested dictionary/list comprehensions.
I'd be grateful for any advice. Thank you.
It seems difficult to do both levels in one concise comprehension; I'd suggest using a defaultdict instead, like this:
from collections import defaultdict

res = defaultdict(dict)
for z in raw:
    # z[0] is the datetime.date, z[1] is the float reading
    res["%02d/%04d" % (z[0].month, z[0].year)][z[0].day] = z[1]