It's a common programming problem to have a list of objects that you want to map to another list. This is usually done with a dictionary:
l1=[1,2,3]
d={1:'_ONE', 2:'_TWO', 3:'_THREE'}
l2= [d[i] for i in l1]
In many cases, it's natural to see a dictionary as a function that maps keys to values. Then you'd simply write l2 = map(d, l1) instead of using the list comprehension (or the ugly map(d.__getitem__, l1)).
Of course, since Python is so flexible, you can get this behaviour easily:
class CallableDict(dict):
    def __call__(self, k):
        return self[k]
but this obviously doesn't work when accessing dicts you don't create yourself.
Are there good reasons why the base python dictionaries are not callable by default?
Disclaimer: The motivation for this question is mostly ruby, where dictionaries work this way
There should be one-- and preferably only one --obvious way to do it. http://legacy.python.org/dev/peps/pep-0020/
If you make dict callable to subscript a dictionary, now you have two ways of doing the same thing:
d[k] # Existing
d(k) # Proposed
Even the case that you bring up, having a callable to pass to map, already has one obvious solution:
map(d.get, it) # Existing
map(d, it) # Proposed
There doesn't seem to be any particular advantage to your proposal, and it duplicates what are already obvious ways.
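To make the existing spellings concrete, here is a small sketch in Python 3 syntax, reusing the names from the question. One caveat worth knowing: d.get and d.__getitem__ are not interchangeable for missing keys, since the first returns None and the second raises KeyError.

```python
d = {1: '_ONE', 2: '_TWO', 3: '_THREE'}
l1 = [1, 2, 3]

# List comprehension, as in the question.
via_comprehension = [d[i] for i in l1]

# Bound methods are ordinary callables, so they can be passed to map().
# In Python 3, map() is lazy, hence the list() call.
via_getitem = list(map(d.__getitem__, l1))
via_get = list(map(d.get, l1))

# The two differ only for missing keys: get() returns None,
# while __getitem__ would raise KeyError for 99.
missing_via_get = list(map(d.get, [1, 99]))
```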
Related
I've read about this cool new dictionary type, the transformdict
I want to use it in my project, by initializing a new transform dict with regular dict:
tran_d = TransformDict(str.lower, {'A':1, 'B':2})
which succeeds but when I run this:
tran_d.keys()
I get:
['A', 'B']
How would you suggest to execute the transform function on the parameter (regular) dict when creating the new transform dict?
Just to be clear I want the following:
tran_d.keys() == ['a', 'b']
I already said it in the comments but it's important to realize that this is not what TransformDict is meant to do. Therefore you could subclass it with a custom implementation for keys:
class MyTransformDict(TransformDict):
    def keys(self):
        return map(self.transform_func, super().keys())
Depending on your Python version you probably need to use list() around the map (Python 3) or provide arguments for super: super(MyTransformDict, self) (Python 2). But it should illustrate the principle.
As @Rawing pointed out in the comments, there will be more methods that don't work as expected, e.g. __iter__, items and probably also __repr__.
Per the implementation I have seen, the transformation function can be accessed through a property named transform_func, so
list(map(tran_d.transform_func, tran_d.keys()))
should do.
I wouldn't bother using TransformDict. It has been proposed as PEP 455 and been rejected. This means it won't be a built-in feature, so you'd have to manually implement it on your own or use some library that does it.
The BDFL delegate's conclusions about the PEP can be found here. The stripped down version is:
It is less readable than converting keys before usage.
It breaks in strange ways that sometimes even emit wrong errors.
It introduces unneeded complexity, since using plain dicts avoids above problems.
In addition to @Ronan-Paixão's answer:
TransformDict was a hypergeneralization which sprang up out of wanting case-folding keys, but with no rigorous research into what real world users might need the generalization for --- meaning the user expectations of what it should do were not well thought through, as the original question illustrates.
A recommendation is to implement your own dictionary subclass, to fit your own use case, as other answers have suggested.
So rather than suggesting "do not use TransformDict" I would suggest, "build your own, but give your class a better, more descriptive name". Then you'll know what it does, will have it quarantined, and won't encourage bad stuff in the repos.
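As a rough illustration of "build your own", here is a minimal sketch of a mapping that lower-cases every key on the way in. The class name LowercaseKeyDict is made up for this example. Unlike TransformDict, which remembers the original key, this one stores the transformed key, which is the behaviour the question asked for. It assumes all keys are strings.

```python
from collections.abc import MutableMapping


class LowercaseKeyDict(MutableMapping):
    """Hypothetical example class: stores every key lower-cased."""

    def __init__(self, *args, **kwargs):
        self._data = {}
        # MutableMapping.update() routes every pair through __setitem__.
        self.update(dict(*args, **kwargs))

    def __setitem__(self, key, value):
        self._data[key.lower()] = value

    def __getitem__(self, key):
        return self._data[key.lower()]

    def __delitem__(self, key):
        del self._data[key.lower()]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)


tran_d = LowercaseKeyDict({'A': 1, 'B': 2})
keys = list(tran_d.keys())
```

Because the class derives from MutableMapping, methods such as keys, items, get and the in operator come from the mixin and go through the five methods defined here, so they all see the lower-cased keys.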
A good reference in addition to the PEP 455 is Hettinger's presentation: http://il.pycon.org/2016/static/sessions/raymond-hettinger.pdf
USAGE CONTEXT ADDED AT END
I often want to operate on an abstract object like a list. e.g.
def list_ish(thing):
    for i in xrange(0, len(thing)):
        print thing[i]
Now this is appropriate if thing is a list, but will fail if thing is a dict, for example. What is the pythonic way to ask "do you behave like a list?"
NOTE:
hasattr(thing, '__getitem__') and not hasattr(thing, 'keys')
this will work for all cases I can think of, but I don't like defining a duck type negatively, as I expect there could be cases that it does not catch.
Really, what I want is to ask:
"hey, do you operate on integer indices in the way I expect a list to do?" e.g.
thing[i], thing[4:7] = [...], etc.
NOTE: I do not want to simply execute my operations inside of a large try/except, since they are destructive. it is not cool to try and fail here....
USAGE CONTEXT
-- A "point-list" is a list-like-thing that contains dict-like-things as its elements.
-- A "matrix" is a list-like-thing that contains list-like-things
-- I have a library of functions that operate on point-lists and also in an analogous way on matrix like things.
-- for example, from the user's point of view, destructive "spreadsheet-like" operations such as "column-slice" can operate on both matrix objects and point-list objects in an analogous way -- the resulting thing is like the original one, but only has the specified columns.
-- since this particular operation is destructive it would not be cool to proceed as if an object were a matrix, only to find out part way thru the operation, it was really a point-list or none-of-the-above.
-- I want my 'is_matrix' and 'is_point_list' tests to be performant, since they sometimes occur inside inner loops. So I would be satisfied with a test which only investigated element zero for example.
-- I would prefer tests that do not involve construction of temporary objects, just to determine an object's type, but maybe that is not the python way.
In general I find the whole duck typing thing to be kinda messy, and fraught with bugs and slowness, but maybe I don't yet think like a true Pythonista.
happy to drink more kool-aid...
One thing you can do, that should work quickly on a normal list and fail on a normal dict, is taking a zero-length slice from the front:
try:
    thing[:0]
except TypeError:
    # probably not list-like
    pass
else:
    # probably list-like
    pass
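The slice fails on dicts because slice objects are not hashable. (That holds for the Python versions this was written for; as far as I know, slices became hashable in Python 3.12, where a dict raises KeyError here instead.)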
However, str and unicode also pass this test, and you mention that you are doing destructive edits. That means you probably also want to check for __delitem__ and __setitem__:
def supports_slices_and_editing(thing):
    if hasattr(thing, '__setitem__') and hasattr(thing, '__delitem__'):
        try:
            thing[:0]
            return True
        except TypeError:
            pass
    return False
I suggest you organize the requirements you have for your input, and the range of possible inputs you want your function to handle, more explicitly than you have so far in your question. If you really just wanted to handle lists and dicts, you'd be using isinstance, right? Maybe what your method does could only ever delete items, or only ever replace items, so you don't need to check for the other capability. Document these requirements for future reference.
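Here is a hedged variant of the check for anyone running it on a recent Python 3. It assumes that newer interpreters (3.12 onward, to my knowledge) accept a slice as a dict key, so a dict may raise KeyError rather than TypeError; catching both keeps the answer the same either way.

```python
def supports_slices_and_editing(thing):
    """Return True if thing looks like a mutable, sliceable sequence."""
    if not (hasattr(thing, '__setitem__') and hasattr(thing, '__delitem__')):
        return False
    try:
        thing[:0]  # zero-length slice from the front
    except (TypeError, KeyError):
        # TypeError: slicing is unsupported or the slice is unhashable.
        # KeyError: a mapping accepted the slice as a (missing) key.
        return False
    return True


samples = ([1, 2], 'abc', (1, 2), {'a': 1}, bytearray(b'xy'))
results = [supports_slices_and_editing(x) for x in samples]
```

Strings and tuples are rejected by the hasattr test because they are immutable; the dict is rejected by the slice attempt.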
When dealing with built-in types, you can use the Abstract Base Classes. In your case, you may want to test against collections.Sequence or collections.MutableSequence (in Python 3 these live in collections.abc):
if isinstance(your_thing, collections.Sequence):
    # access your_thing as a list
This is supported in all Python versions after (and including) 2.6.
If you are using your own classes to build your_thing, I'd recommend that you inherit from these abstract base classes as well (directly or indirectly). This way, you can ensure that the sequence interface is implemented correctly, and avoid all the typing mess.
And for third-party libraries, there's no simple way to check for a sequence interface, if the third-party classes didn't inherit from the built-in types or abstract classes. In this case you'll have to check for every interface that you're going to use, and only those you use. For example, your list_ish function used __len__ and __getitem__, so only check whether these two methods exist. A wrong behavior of __getitem__ (e.g. a dict) should raise an exception.
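A minimal sketch of the ABC approach in Python 3, where the classes are imported from collections.abc. PointList is a made-up container used only for illustration; by inheriting from MutableSequence it passes the isinstance test and gets append, extend and friends for free.

```python
from collections.abc import MutableSequence, Sequence


class PointList(MutableSequence):
    """Hypothetical list-like container that wraps a plain list."""

    def __init__(self, items=()):
        self._items = list(items)

    def __getitem__(self, index):
        return self._items[index]

    def __setitem__(self, index, value):
        self._items[index] = value

    def __delitem__(self, index):
        del self._items[index]

    def __len__(self):
        return len(self._items)

    def insert(self, index, value):
        self._items.insert(index, value)


def is_list_like(thing):
    # Strings and tuples are Sequences too, but they are not mutable.
    return isinstance(thing, MutableSequence)


checks = {
    'list': is_list_like([1, 2]),
    'PointList': is_list_like(PointList([{'x': 1}])),
    'dict': is_list_like({'a': 1}),
    'str': is_list_like('abc'),
    'tuple': is_list_like((1, 2)),
}
str_is_sequence = isinstance('abc', Sequence)
```

Testing against MutableSequence rather than Sequence is a design choice for the destructive-edit case in the question: it rules out strings and tuples without any negative checks.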
Perhaps there is no ideal pythonic answer here, so I am proposing a 'hack' solution, but I don't know enough about the class structure of Python to know if I am getting this right:
def is_list_like(thing):
    return hasattr(thing, '__setslice__')

def is_dict_like(thing):
    return hasattr(thing, 'keys')
My reduced goals here are simply to have performant tests that:
(1) never call a dict-thing or a string-like-thing a list
(2) return the right answer for Python types
(3) return the right answer if someone implements a "full" set of core methods for a list/dict
(4) are fast (ideally not allocating objects during the test)
EDIT: Incorporated ideas from @DanGetz
I was wondering what the correct way is to check a key:value pair of a dict. Let's say I have this dict
dict_ = {
    'key1': 'val1',
    'key2': 'val2'
}
I can check a condition like this
if dict_['key1'] == 'val1':
but I feel like there is a more elegant way that takes advantage of the dict data structure.
What you're doing already does take advantage of the data structure, which is why it's "the one obvious way" to do what you want to do. (You can find examples like this all over the tutorial, the reference docs, and the stdlib implementation.)
However, I can see what you're thinking: the dict is in some sense a container of key-value pairs (even if it's only a collections.Container of keys…), so… shouldn't there be some way to just check whether a key-value pair exists?
Up to Python 2.6, there really isn't.* But in 3.0, the items() method returns a special set-like view of the key-value pairs. And 2.7 backported that functionality, under the name viewitems. So:
('key1', 'val1') in d.viewitems()
But I don't think that's really clearer or cleaner; "items" feels like a lower-level way to think of dictionaries than "mappings", which is what both your original code and smci's answer rely on.
It's also less concise, it doesn't work in 2.6 or earlier, many dict-like mapping objects don't support it,** and it's slightly slower on 2.7 to boot, but these are probably less important, and not what you asked about.
* Well, there is, but only by iterating over all of the items with iteritems, or using items to effectively do the same exhaustive search behind your back, neither of which is what you want.
** In fact, in 2.7, it's not actually possible to support it with a pure-Python class…
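For comparison, this is roughly what the pair-membership test looks like in Python 3, where items() itself returns the set-like view. It is only a sketch using the dict from the question.

```python
dict_ = {'key1': 'val1', 'key2': 'val2'}

# Membership on the items view looks the key up and then compares the value.
pair_present = ('key1', 'val1') in dict_.items()
wrong_value = ('key1', 'other') in dict_.items()
missing_key = ('key3', 'val1') in dict_.items()  # False, no KeyError

# Item views also support set operations when the values are hashable.
common = dict_.items() & {('key1', 'val1'), ('key2', 'nope')}
```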
If you want to avoid throwing KeyError if dict doesn't even contain 'key1':
if dict_.get('key1')=='val1':
(However, throwing an exception for missing key is perfectly fine Python idiom.)
Otherwise, @Cyber is correct that it's already fine! (What exactly is the problem?)
There is a has_key method (Python 2 only; it was removed in Python 3, where you use the in operator instead):
dict_.has_key('key1')
This returns a boolean true or false.
Alternatively, you can have the get method return a default value when the key is not present.
dict_.get('key3','Default Value')
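A short sketch of the same checks written so that they run on Python 3 as well, using the in operator in place of has_key. The dict is the one from the question.

```python
dict_ = {'key1': 'val1', 'key2': 'val2'}

# The `in` operator works in both Python 2 and Python 3.
has_key1 = 'key1' in dict_
has_key3 = 'key3' in dict_

# get() returns the supplied default instead of raising KeyError.
value = dict_.get('key3', 'Default Value')

# Key presence and value comparison in a single expression.
matches = dict_.get('key1') == 'val1'
```

One thing to keep in mind with the last form: if None is a legitimate value in the dict, comparing get() against None cannot tell a missing key from a stored None.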
So, I'm trying to be a good Python programmer and duck-type wherever I can, but I've got a bit of a problem where my input is either a dict or a list of dicts.
I can't distinguish between them being iterable, because they both are.
My next thought was simply to call list(x) and hope that returned my list intact and gave me my dict as the only item in a list; alas, it just gives me the list of the dict's keys.
I'm now officially out of ideas (short of calling isinstance which is, as we all know, not very pythonic). I just want to end up with a list of dicts, even if my input is a single solitary dict.
Really, there is no obvious pythonic way to do this, because it's an unreasonable input format, and the obvious pythonic way to do it is to fix the input…
But if you can't do that, then yes, you need to write an adapter (as close to the input edge as possible). The best way to do that depends on the actual data. If it really is either a dict, or a list of dicts, and nothing else is possible (e.g., you're calling json.loads on the results from some badly-written service that returns an object or an array of objects), then there's nothing wrong with isinstance.
If you want to make it a bit more general, you can use the appropriate ABCs. For example:
if isinstance(dict_or_list, collections.abc.Mapping):
    return [dict_or_list]
else:
    return dict_or_list
But unless you have some good reason to need this generality, you're just hiding the hacky workaround, when you're better off keeping it as visible as possible. If it's, e.g., coming out of json.loads from some remote server, handling a Mapping that isn't a dict is not useful, right?
(If you're using some third-party client library that just returns you "something dict-like" or "something list-like containing dict-like things", then yes, use ABCs. Or, if that library doesn't even support the proper ABCs, you can write code that tries a specific method like keys. But if that's an issue, you'll know the specific details you're working around, and can code and document appropriately.)
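Put together as a small adapter at the input edge, this might look like the sketch below. The function name as_list_of_dicts and the JSON strings are invented for the example.

```python
import json
from collections.abc import Mapping


def as_list_of_dicts(dict_or_list):
    """Wrap a single mapping in a list; return anything else unchanged."""
    if isinstance(dict_or_list, Mapping):
        return [dict_or_list]
    return dict_or_list


# json.loads yields a dict for a JSON object and a list for a JSON array.
single = as_list_of_dicts(json.loads('{"id": 1}'))
several = as_list_of_dicts(json.loads('[{"id": 1}, {"id": 2}]'))
```

After this call the rest of the code can always iterate over a list of dicts and never needs to know about the inconsistent input.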
Accessing a dict using a non-int key will get you either an item, or a KeyError. It will get you a TypeError with a list. So you can use exception handling:
def list_dicts(dict_or_list):
    try:
        dict_or_list[None]
        return [dict_or_list]  # no error, we have a dict
    except TypeError:
        return dict_or_list  # wrong index type, we have a list
    except Exception:
        return [dict_or_list]  # probably KeyError but catch anything to be safe
This function will give you a list of dicts regardless of whether it got a list or a dict. (If it got a dict, it makes a list of one item out of it.) This should be fairly safe type-wise, too; other dict-like or list-like objects would probably be considered broken if they didn't have similar behavior.
You could check for the presence of an items attribute.
dict has it and list does not.
>>> hasattr({}, 'items')
True
>>> hasattr([], 'items')
False
Here's a complete list of the differences in attribute names between dict and list (in Python 3.3.2).
Attributes on list but not dict:
>>> print('\n'.join(sorted(list(set(dir([])) - set(dir({}))))))
__add__
__iadd__
__imul__
__mul__
__reversed__
__rmul__
append
count
extend
index
insert
remove
reverse
sort
Attributes on dict but not list:
>>> print('\n'.join(sorted(list(set(dir({})) - set(dir([]))))))
fromkeys
get
items
keys
popitem
setdefault
update
values
Maybe I'm being naive, but how about something like
try:
    data.keys()
    print "Probs just a dictionary"
except AttributeError:
    print "List o' dictionaries!"
Can you just go ahead and do whatever you were going to do anyways with the data, and decide whether it's a dict or list when something goes awry?
Don't use the types module:
import types
d = {}
print type(d) is types.DictType
l = [{},{}]
print type(l) is types.ListType and len(l) and type(l[0]) is types.DictType
I recently wrote some code that looked something like this:
# dct is a dictionary
if "key" in dct.keys():
However, I later found that I could achieve the same results with:
if "key" in dct:
This discovery got me thinking and I began to run some tests to see if there could be a scenario when I must use the keys method of a dictionary. My conclusion however is no, there is not.
If I want the keys in a list, I can do:
keys_list = list(dct)
If I want to iterate over the keys, I can do:
for key in dct:
    ...
Lastly, if I want to test if a key is in dct, I can use in as I did above.
Summed up, my question is: am I missing something? Could there ever be a scenario where I must use the keys method? Or is it simply a leftover method from an earlier version of Python that should be ignored?
On Python 3, use dct.keys() to get a dictionary view object, which lets you do set operations on just the keys:
>>> for sharedkey in dct1.keys() & dct2.keys(): # intersection of two dictionaries
... print(dct1[sharedkey], dct2[sharedkey])
In Python 2.7, you'd use dct.viewkeys() for that.
In Python 2, dct.keys() returns a list, a copy of the keys in the dictionary. This can be passed around as a separate object that can be manipulated in its own right, including removing elements without affecting the dictionary itself; however, you can create the same list with list(dct), which works in both Python 2 and 3.
You indeed don't want any of these for iteration or membership testing; always use for key in dct and key in dct for those, respectively.
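A slightly fuller sketch of the set operations on key views in Python 3, with two example dictionaries made up for the purpose:

```python
dct1 = {'a': 1, 'b': 2, 'c': 3}
dct2 = {'b': 20, 'c': 30, 'd': 40}

shared = dct1.keys() & dct2.keys()      # keys in both
only_first = dct1.keys() - dct2.keys()  # keys only in dct1
either = dct1.keys() | dct2.keys()      # keys in at least one

# The results are plain sets, so sort them for a stable order.
pairs = [(k, dct1[k], dct2[k]) for k in sorted(shared)]
```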
Source: PEP 234, PEP 3106
Python 2's relatively useless dict.keys method exists for historical reasons. Originally, dicts weren't iterable. In fact, there was no such thing as an iterator; iterating over sequences worked by calling __getitem__, the element access method, with increasing integer indices until an IndexError was raised. To iterate over the keys of a dict, you had to call the keys method to get an explicit list of keys and iterate over that.
When iterators went in, dicts became iterable, because it was more convenient, faster, and all around better to say
for key in d:
than
for key in d.keys():
This had the side-effect of making d.keys() utterly superfluous; list(d) and iter(d) now did everything d.keys() did in a cleaner, more general way. They couldn't get rid of keys, though, since so much code already called it.
(At this time, dicts also got a __contains__ method, so you could say key in d instead of d.has_key(key). This was shorter and nicely symmetrical with for key in d; the symmetry is also why iterating over a dict gives the keys instead of (key, value) pairs.)
In Python 3, taking inspiration from the Java Collections Framework, the keys, values, and items methods of dicts were changed. Instead of returning lists, they would return views of the original dict. The key and item views would support set-like operations, and all views would be wrappers around the underlying dict, reflecting any changes to the dict. This made keys useful again.
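The "view" behaviour is easy to see in a few lines of Python 3. This sketch contrasts a keys view with a list snapshot taken at the same moment:

```python
d = {'a': 1}

keys_view = d.keys()  # live view onto the dict
keys_copy = list(d)   # independent snapshot

d['b'] = 2            # mutate the dict afterwards

view_now = sorted(keys_view)
copy_now = sorted(keys_copy)
```

The view picks up the new key while the list does not, which is the difference between Python 3's keys() and the list that Python 2's keys() returned.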
Assuming you're not using Python 3, list(dct) is equivalent to dct.keys(). Which one you use is a matter of personal preference. I personally think dct.keys() is slightly clearer, but to each their own.
In any case, there isn't a scenario where you "need" to use dct.keys() per se.
In Python 3, dct.keys() returns a "dictionary view object", so if you need to get a hold of an unmaterialized view to the keys (which could be useful for huge dictionaries) outside of a for loop context, you'd need to use dct.keys().
key in dict
is much faster than checking the following in Python 2, where keys() first builds a list that is then searched linearly:
key in dict.keys()