If I want to assign to an element of a list only one value I use always a dictionary. For example:
{'Monday':1, 'Tuesday':2,...'Friday':5,..}
But I want to assign to one element of a list many values, like for example:
Monday: Jogging, Swimming, Skating
Tuesday: School, Work, Dinner, Cinema
...
Friday: Doctor
Is any built-in structure or a simple way to make something like this in python?
My idea: I was thinking about something like: a dictionary which as a key holds a day and as a value holds a list, but maybe there is a better solution.
A dictionary whose values are lists is perfectly fine, and in fact very common.
In fact, you might want to consider an extension to that: a collections.defaultdict(list). This will create a new empty list the first time you access any key, so you can write code like this:
d[day].append(activity)
… instead of this:
if not day in d:
d[day] = []
d[day].append(activity)
The down-side of a defaultdict is that you no longer have a way to detect that a key is missing in your lookup code, because it will automatically create a new one. If that matters, use a regular dict together with the setdefault method:
d.setdefault(day, []).append(activity)
You could wrap either of these solutions up in a "MultiDict" class that encapsulates the fact that it's a dictionary of lists, but the dictionary-of-lists idea is such a common idiom that it really isn't necessary to hide it.
Related
This looks like a CS 101 style homework but it actually isn't. I am trying to learn more python so I took up this personal project to write a small app that keeps my grade-book for me.
I have a class semester which holds a dictionary of section objects.
A section is a class that I am teaching in which ever semester object I am manipulating (I didn't want to call them classes for obvious reasons). I originally had sections as a list not a dictionary, and when I wanted to add a roster of students to that semester I could do this.
for sec in working_semester.sections:
sec.addRosterFromFile(filename)
Now I have changed the sections member of semester to a dictionary so I can look up a specific one to work with, but I am having trouble when I want to loop over all of them to do something like when I first set up a new semester I want to add all the sections, then loop over them and add students to each. If I try the same code to loop over the dictionary it gives me the key, but I was hoping to get the value.
I have also tried to iterate over a dictionary like this, which I got out of an older stack over flow question
for sec in iter(sorted(working_semester.sections.iteritems())):
sec.addRosterFromFile(filename)
But iter(sorted ... returns a tuple (key, value) not the item so the line in side the loop gives me an error that tuple does not have a function called addStudent.
Currently I have this fix in place where I loop through the keys and then use the key to access the value like this:
for key in working_semester.sections:
working_semester.sections[key].addRosterFromFile(filename)
There has to be a way to loop over dictionary values, or is this not desirable? My understanding of dictionaries is that they are like lists but rather than grabbing an element by its position it has a specific key, which makes it easier to grab the one you want no matter what order they are in. Am I missing how dictionaries should be used?
Using iteritems is a good approach, you just need to unpack the key and value:
for key, value in iter(sorted(working_semester.sections.iteritems())):
value.addRosterFromFile(filename)
If you really only need the value, you could use the aptly named itervalues:
for sec in sorted(working_semester.sections.itervalues()):
sec.addRosterFromFile(filename)
(It's not clear from your example whether you really need sorted there. If you don't need to iterate over the sections in sorted order just leave sorted out.)
I'm working with the Flask framework in Python, and need to hand off a list of lists to a renderer.
I step through a loop and create a list, sort it, append it to another list, then call the render function with the masterlist, like so:
for itemID in itemsArray:
avgQuantity = getJitaQuantity(itemID)
lowestJitaSell = getJitaLowest(itemID)
candidateArray = findLowestPrices(itemID, lowestJitaSell, candidateArray, avgQuantity)
candidateArray.sort()
multiCandidateArray.append(candidateArray)
renderPage(multiCandidateArray)
My problem is that I need to clear the candidateArray and create a new one each time through the loop, but it looks like the candidateArray that I append to the multiCandidateArray is actually a pointer, not the values themselves.
When I do this:
for itemID in itemsArray:
avgQuantity = getJitaQuantity(itemID)
lowestJitaSell = getJitaLowest(itemID)
candidateArray = findLowestPrices(itemID, lowestJitaSell, candidateArray, avgQuantity)
candidateArray.sort()
multiCandidateArray.append(candidateArray)
**del candidateArray[:]**
renderPage(multiCandidateArray)
I end up with no values.
Is there a way to handle this situation that I'm missing?
I would probably go with something like:
for itemID in itemsArray:
avgQuantity = getJitaQuantity(itemID)
lowestJitaSell = getJitaLowest(itemID)
candidateArray = findLowestPrices(itemID, lowestJitaSell, candidateArray, avgQuantity)
multiCandidateArray.append(sorted(candidateArray))
No need to del anything here, and sorted returns a new list, so even if FindLowestPrices is for some reason returning references to the same list (which is unlikely), then you'll still have unique lists in the multiCandidateArray (although your unique lists could hold references to the same objects).
Your code already creates a new one each time through the loop.
candidateArray = findLowestPrices(...)
This assigns a new list to the variable, candidateArray. It should work fine.
When you do this:
del candidateArray[:]
...you're deleting the contents of the same list you just appended to the master list.
Don't think about pointers or variables; just think about objects, and remember nothing in Python is ever implicitly copied. A list is an object. At the end of the loop, candidateArray names the same list object as multiCandidateArray[-1]. They're different names for the same thing. On the next run through the loop, candidateArray becomes a name for a new list as produced by findLowestPrices, and the list at the end of the master list is unaffected.
I've written about this before; the C way of thinking about variables as being predetermined blocks of memory just doesn't apply to Python at all. Names are moved onto values, rather than values being copied into some fixed number of buckets.
(Also, nitpicking, but Python code generally uses under_scores and doesn't bother with types in names unless it's really ambiguous. So you might have candidates and multi_candidates. Definitely don't call anything an "array", since there's an array module in the standard library that does something different and generally not too useful. :))
I have a dict that has unix epoch timestamps for keys, like so:
lookup_dict = {
1357899: {} #some dict of data
1357910: {} #some other dict of data
}
Except, you know, millions and millions and millions of entries. I'd like to subset this dict, over and over again. Ideally, I'd love to be able to write something like I can in R, like:
lookup_value = 1357900
dict_subset = lookup_dict[key >= lookup_value]
# dict_subset now contains {1357910: {}}
But I confess, I can't find any actual proof that this is something Python can do without having, one way or the other, to iterate over every row. If I understand Python correctly (and I might not), key lookup of the form key in dict uses binary search, and is thus very fast; any way to do a binary search, on dict keys?
To do this without iterating, you're going to need the keys in sorted order. Then you just need to do a binary search for the first one >= lookup_value, instead of checking each one for >= lookup_value.
If you're willing to use a third-party library, there are plenty out there. The first two that spring to mind are bintrees (which uses a red-black tree, like C++, Java, etc.) and blist (which uses a B+Tree). For example, with bintrees, it's as simple as this:
dict_subset = lookup_dict[lookup_value:]
And this will be as efficient as you'd hope—basically, it adds a single O(log N) search on top of whatever the cost of using that subset. (Of course usually what you want to do with that subset is iterate the whole thing, which ends up being O(N) anyway… but maybe you're doing something different, or maybe the subset is only 10 keys out of 1000000.)
Of course there is a tradeoff. Random access to a tree-based mapping is O(log N) instead of "usually O(1)". Also, your keys obviously need to be fully ordered, instead of hashable (and that's a lot harder to detect automatically and raise nice error messages on).
If you want to build this yourself, you can. You don't even necessarily need a tree; just a sorted list of keys alongside a dict. You can maintain the list with the bisect module in the stdlib, as JonClements suggested. You may want to wrap up bisect to make a sorted list object—or, better, get one of the recipes on ActiveState or PyPI to do it for you. You can then wrap the sorted list and the dict together into a single object, so you don't accidentally update one without updating the other. And then you can extend the interface to be as nice as bintrees, if you want.
Using the following code will work out
some_time_to_filter_for = # blah unix time
# Create a new sub-dictionary
sub_dict = {key: val for key, val in lookup_dict.items()
if key >= some_time_to_filter_for}
Basically we just iterate through all the keys in your dictionary and given a time to filter out for we take all the keys that are greater than or equal to that value and place them into our new dictionary
I'm using python for my shopping cart class which has a list of items. When a customer wants to edit an item, I need to pass the JavaScript front-end some way to refer to the item so that it can call AJAX methods to manipulate it.
Basically, I need a simple way to point to a particular item that isn't its index, and isn't a reference to the object itself.
I can't use an index, because another item in the list might be added or removed while the identifier is "held" by the front end. If I were to pass the index forward, if an item got deleted from the list then that index wouldn't point to the right object.
One solution seems to be to use UUIDs, but that seems particularly heavyweight for a very small list. What's the simplest/best way to do this?
Instead of using a list, why not use a dictionary and use small integers as the keys? Adding and removing items from the dictionary will not change the indices into the dictionary. You will want to keep one value in the dictionary that lets you know what the next assigned index will be.
A UUID seems perfect for this. Why don't you want to do that?
Do the items have any sort of product_id? Can the shopping cart have more than one of the same product_id, or does it store a quantity? What I'm getting at is: If product_id's in the cart are unique, you can just use that.
I'm working through some tutorials on Python and am at a position where I am trying to decide what data type/structure to use in a certain situation.
I'm not clear on the differences between arrays, lists, dictionaries and tuples.
How do you decide which one is appropriate - my current understanding doesn't let me distinguish between them at all - they seem to be the same thing.
What are the benefits/typical use cases for each one?
How do you decide which data type to use? Easy:
You look at which are available and choose the one that does what you want. And if there isn't one, you make one.
In this case a dict is a pretty obvious solution.
Tuples first. These are list-like things that cannot be modified. Because the contents of a tuple cannot change, you can use a tuple as a key in a dictionary. That's the most useful place for them in my opinion. For instance if you have a list like item = ["Ford pickup", 1993, 9995] and you want to make a little in-memory database with the prices you might try something like:
ikey = tuple(item[0], item[1])
idata = item[2]
db[ikey] = idata
Lists, seem to be like arrays or vectors in other programming languages and are usually used for the same types of things in Python. However, they are more flexible in that you can put different types of things into the same list. Generally, they are the most flexible data structure since you can put a whole list into a single list element of another list, but for real data crunching they may not be efficient enough.
a = [1,"fred",7.3]
b = []
b.append(1)
b[0] = "fred"
b.append(a) # now the second element of b is the whole list a
Dictionaries are often used a lot like lists, but now you can use any immutable thing as the index to the dictionary. However, unlike lists, dictionaries don't have a natural order and can't be sorted in place. Of course you can create your own class that incorporates a sorted list and a dictionary in order to make a dict behave like an Ordered Dictionary. There are examples on the Python Cookbook site.
c = {}
d = ("ford pickup",1993)
c[d] = 9995
Arrays are getting closer to the bit level for when you are doing heavy duty data crunching and you don't want the frills of lists or dictionaries. They are not often used outside of scientific applications. Leave these until you know for sure that you need them.
Lists and Dicts are the real workhorses of Python data storage.
Best type for counting elements like this is usually defaultdict
from collections import defaultdict
s = 'asdhbaklfbdkabhvsdybvailybvdaklybdfklabhdvhba'
d = defaultdict(int)
for c in s:
d[c] += 1
print d['a'] # prints 7
Do you really require speed/efficiency? Then go with a pure and simple dict.
Personal:
I mostly work with lists and dictionaries.
It seems that this satisfies most cases.
Sometimes:
Tuples can be helpful--if you want to pair/match elements. Besides that, I don't really use it.
However:
I write high-level scripts that don't need to drill down into the core "efficiency" where every byte and every memory/nanosecond matters. I don't believe most people need to drill this deep.