I have a variable (an object from a database). In some cases this variable is a list and in other cases a dictionary.
The standard for loop if the variable is a list:
for value in object_values:
    self.do_something(value)
The standard for loop if the variable is a dictionary:
for key, value in object_values.items():
    self.do_something(value)
I can use isinstance() to check the type, but then I still need two functions, or an if with two for loops. Right now I have an if condition that calls one of two functions: one for iterating over a list (e.g. iterate_list()) and one for iterating over a dictionary (e.g. iterate_dict()).
Is there a more elegant, more Pythonic way to handle not knowing whether the variable will be a list or a dictionary?
In your case, since the data is either the items of the iterable or the values of the dictionary, you could use a ternary expression to get values() or just the iterable itself, depending on the type:
def iterate(self, object_values):
    for value in object_values.values() if isinstance(object_values, dict) else object_values:
        self.do_something(value)
If you pass a tuple, generator or other iterable, it falls back on "standard" iteration. If you pass a dictionary (or OrderedDict or other), it iterates on the values.
Performance-wise, the ternary expression is evaluated only once at the start of the iteration, so it's fine.
The isinstance bit could even be replaced by if hasattr(object_values,"values") so even non-dict objects with a values member would match.
(Note that you should be aware of the "least astonishment" principle: some people may expect an iteration over the keys of the dictionary when calling the method.)
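As a rough illustration of the approach above, here is a minimal standalone sketch. The name iter_values is hypothetical (not from the original post), and it uses the hasattr variant, so anything with a values() method is treated like a dictionary:

```python
from collections import OrderedDict


def iter_values(object_values):
    """Yield the values of a mapping, or the items of any other iterable."""
    # Duck-typed check; isinstance(object_values, dict) would be the stricter variant.
    # The conditional expression is evaluated once, before iteration starts.
    source = object_values.values() if hasattr(object_values, "values") else object_values
    for value in source:
        yield value


print(list(iter_values([1, 2, 3])))
print(list(iter_values({"a": 1, "b": 2})))
print(list(iter_values(OrderedDict([("x", 10), ("y", 20)]))))
print(list(iter_values(n * n for n in range(3))))
```

Keeping the type check in one small generator means the calling code needs only a single for loop, whatever the database returned.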
What is the use case for immutable types/objects like tuple in Python?
tuple('hi')
('h', 'i')
Where can we use such unchangeable sequences?
One common use case is the list of (unnamed) arguments to a function.
In [1]: def foo(*args):
   ...:     print(type(args))
...:
In [2]: foo(1,2,3)
<class 'tuple'>
Technically, tuples are semantically different to lists.
When you have a list, you have something that is... a list. Of items of some sort. And therefore items can be added to it or removed from it.
A tuple, on the other hand, is a set of values in a given order. It just happens to be one value that is made up of more than one value. A composite value.
For example. Say you have a point. X, Y. You could have a class called Point, but that class would have a dictionary to store its attributes. A point is only two values which are, most of the time, used together. You don't need the flexibility or the cost of a dictionary for storing named attributes, you can use a tuple instead.
myPoint = 70, 2
Points are always X and Y. Always 2 values. They are not lists of numbers. They are two values in which the order of a value matters.
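A small sketch of the point example, assuming a hypothetical distance helper that is not part of the original answer. It only shows that a two-value tuple can be unpacked by position, because the order carries the meaning:

```python
import math


def distance(p, q):
    """Euclidean distance between two (x, y) tuples."""
    # Unpack each point into its two positional components.
    px, py = p
    qx, qy = q
    return math.hypot(qx - px, qy - py)


my_point = 70, 2          # the parentheses are optional
x, y = my_point           # position 0 is X, position 1 is Y
print(x, y)
print(distance((0, 0), (3, 4)))
```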
Another example of tuple usage. A function that creates links from a list of tuples. The tuples must be the href and then the label of the link. Fixed order. Order that has meaning.
def make_links(*tuples):
    return "".join('<a href="%s">%s</a>' % t for t in tuples)
make_links(
("//google.com", "Google"),
("//stackoveflow.com", "Stack Overflow")
)
So the reason tuples don't change is because they are supposed to be one single value. You can only assign the whole thing at once.
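To make the "assign the whole thing at once" idea concrete, here is a short sketch (the variable names are only illustrative). Item assignment on a tuple fails, so a changed point is a new tuple:

```python
my_point = (70, 2)

try:
    my_point[0] = 71                 # tuples do not support item assignment
except TypeError as exc:
    error = str(exc)
    print("TypeError:", error)

# "Changing" the point means building a new tuple as a whole.
moved = (my_point[0] + 1, my_point[1])
print(my_point, moved)
```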
Here is a good resource that describes the difference between tuples and lists, and the reasons for using each: https://mail.python.org/pipermail/tutor/2001-September/008888.html
The main reason outlined in that link is that tuples are immutable and offer fewer operations than, say, lists. This makes them useful only in certain situations, but where those situations can be identified, tuples use fewer resources (in CPython, a tuple is typically a little smaller in memory than a list holding the same items).
Immutable objects make life simpler in many cases. They are especially applicable for value types, where objects don't have an identity and so can easily be replaced. They can also make concurrent programming much safer and cleaner (most of the notoriously hard-to-find concurrency bugs are ultimately caused by mutable state shared between threads). However, for large and/or complex objects, creating a new copy of the object for every single change can be very costly and/or tedious. And for objects with a distinct identity, changing an existing object is much simpler and more intuitive than creating a new, modified copy of it.
I have a function that must return many values (statistics) for another function to work with. So I thought about returning them in a sequence. But then I wondered: should I use a list (["foo", "bar"]) or a tuple (("foo", "bar"))? What are the problems or differences when using one instead of the other?
Use a tuple. In your application, it doesn't seem like you will want or need to change the list of results after.
Though, with many return values you might want to consider returning a dictionary with named values. That is more flexible and extensible, as adding a new statistic doesn't require modifying every place where you call the function.
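A minimal sketch of the dictionary suggestion, assuming a hypothetical compute_stats function and an arbitrary choice of statistics (neither comes from the question):

```python
def compute_stats(values):
    """Hypothetical helper: return a few statistics keyed by name."""
    count = len(values)
    total = sum(values)
    return {
        "count": count,
        "min": min(values),
        "max": max(values),
        "mean": total / count,
    }


stats = compute_stats([2, 4, 6, 8])
# Callers pick values by name, so adding a new key later does not break them.
print(stats["mean"], stats["max"])
```

With a tuple, every caller would have to unpack by position, so adding a fifth statistic would mean touching each of them.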
If you do not need to edit the return value, use a tuple. The main difference is that lists can be edited.
See this: What's the difference between lists and tuples?
Do OrderedDict.values(), .keys(), .iterkeys() etc. return the values in the order in which items were first inserted?
I assume that the values/keys functions do not change the order of the dict, and that for an OrderedDict I get the values in the order in which they were added to the dictionary.
Is that true?
Yes, they are preserved. From the documentation (emphasis mine):
Equality tests between OrderedDict objects are order-sensitive and are
implemented as list(od1.items())==list(od2.items()).
That's how equality comparison is done, and it implies that the result of items() is ordered. For the other two functions, you can look at a substitute implementation of OrderedDict here
Your assumption is correct. OrderedDict maintains a list of (thus ordered) keys to iterate over values and keys, so they will always be ordered the same way as in for loops.
The comments in the source code also state this, saying:
The inherited dict provides __getitem__, __len__, __contains__, and get. The remaining methods are order-aware. Big-O running times for all methods are the same as regular dictionaries.
Emphasis mine.
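The behaviour described in these answers can be checked with a short sketch. It is written for Python 3, where .iterkeys() no longer exists and .keys() already returns an ordered view; the sample keys are made up:

```python
from collections import OrderedDict

od = OrderedDict()
od["banana"] = 3
od["apple"] = 1
od["cherry"] = 2

keys = list(od.keys())
values = list(od.values())
print(keys, values)

# Re-assigning an existing key keeps its original position.
od["banana"] = 30
keys_after_update = list(od.keys())

# Equality between OrderedDicts is order-sensitive.
other = OrderedDict([("apple", 1), ("banana", 30), ("cherry", 2)])
print(od == other)
```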
I'm wondering what's happening when I execute this code, and also whether there is a better way to accomplish the same task. Is a list of lists being made in memory to perform the sort, and then bar is assigned to be an iterator of foo.values()? Or possibly foo.values() is sorted in the allotted dictionary memory space (seems unlikely)?
Imagine the first value in the list, the integer, refers to a line number in a file. I want to open the file and update only the lines referenced in the foo.values() lists with the rest of the data in the list (e.g. update line 1 with strings '123' and '097').
>>> from itertools import imap
>>> foo = {'2134':[1, '123', '097'], '6543543':[3, '1'], '12315':[2, '454']}
>>> bar = imap([].sort(), foo.values())
Thanks~
First, you're passing [].sort(), which is just None, as the first argument to imap, meaning it's doing nothing at all. As the docs explain: "If function is set to None, then imap() returns the arguments as a tuple."
To pass a callable to a higher-order function like imap, you have to pass the callable itself, not call it and pass the result.
Plus, you don't want [].sort here; that's a callable with no arguments that just sorts an empty list, which is useless.
You probably wanted list.sort, the unbound method, which is a callable with one argument that will sort whatever list it's given.
So, if you did that, you'd be creating an iterator that, if you iterated it, would generate a bunch of None values and, as a side effect, sort each list in foo.values(). No new lists would be created anywhere, because list.sort mutates the list in-place and returns None.
But since you don't ever iterate it anyway, it hardly matters what you put into imap; what it actually does is effectively nothing.
Generally, abusing map/imap/comprehensions/etc. for side-effects of the expressions is a bad idea. An iterator that generates useless values, but that you have to iterate anyway, is a recipe for confusion at best.
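The points above can be sketched as follows. Two assumptions: this uses Python 3, where the built-in map is lazy like itertools.imap was, and the sample lists contain only integers, because Python 3 cannot sort the mixed int/str lists from the question:

```python
foo = {'2134': [3, 1, 2], '6543543': [9, 7], '12315': [5, 4]}

# Calling sort() on a throwaway list returns None, not a function.
result_of_call = [].sort()
print(result_of_call is None)

# Passing the callable itself builds a lazy iterator; nothing is sorted yet.
lazy = map(list.sort, foo.values())
before = [list(v) for v in foo.values()]

# Consuming the iterator yields only None values; the sorting is a side effect.
produced = list(lazy)
after = [list(v) for v in foo.values()]
print(before, produced, after)
```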
The simple thing to do here is to just use a loop:
for value in foo.values():
    value.sort()
Or, instead of sorting in-place, generate new sorted values:
bar = imap(sorted, foo.values())
Now, as you iterate bar, each list will be sorted and given to you, so you can use it. If you iterate this, it will generate a sorted list in memory for each list… but only one will ever be alive at a time (unless you explicitly stash them somewhere).
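A brief sketch of this last variant, again assuming Python 3 (built-in map instead of imap) and made-up integer data. Each sorted copy is only produced when the iterator is advanced, and the original lists stay untouched:

```python
foo = {'a': [3, 1, 2], 'b': [9, 7]}

bar = map(sorted, foo.values())   # lazy: no sorting has happened yet

first = next(bar)                 # sorts only the first list, into a new list
print(first, foo['a'])

rest = list(bar)                  # the remaining lists, sorted on demand
print(rest, foo['b'])
```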
What does this mean?
The only types of values not acceptable as dictionary keys are values containing lists or dictionaries or other mutable types that are compared by value rather than by object identity, the reason being that the efficient implementation of dictionaries requires a key’s hash value to remain constant.
I think even for tuples, comparison will happen by value.
The problem with a mutable object as a key is that when we use a dictionary, we rarely want to check identity. For example, when we use a dictionary like this:
a = "bob"
test = {a: 30}
print(test["bob"])
We expect it to work - the second string "bob" may not be the same as a, but it is the same value, which is what we care about. This works as any two strings that equate will have the same hash, meaning that the dict (implemented as a hashmap) can find those strings very efficiently.
The issue comes into play when we have a list as a key, imagine this case:
a = ["bob"]
test = {a: 30}
print(test[["bob"]])
We can't do this: lists are unhashable, so Python raises a TypeError as soon as the list is used as a key. If lists were instead hashed by identity, the lookup would still fail, because the key and the lookup value are different instances (id(a) != id(["bob"])).
Python has the choice of making the list's hash change (undermining the efficiency of a hashmap) or simply comparing on identity (which is useless in most cases). Python disallows these specific mutable keys to avoid subtle but common bugs where people expect the values to be equated on value, rather than identity.
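For reference, a short sketch of what happens in practice. The exact wording of the error message may differ between Python versions, so treat the printed text as illustrative:

```python
a = "bob"
test = {a: 30}
print(test["bob"])            # an equal string finds the entry

try:
    bad = {["bob"]: 30}       # a list cannot be used as a key at all
except TypeError as exc:
    message = str(exc)
    print("TypeError:", message)

# A tuple holding the same value is hashable, so lookup by value works.
ok = {("bob",): 30}
print(ok[("bob",)])
```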
The documentation mixes together two different things: mutability, and value-comparable. Let's separate them out.
Immutable objects that compare by identity are fine. The identity can
never change, for any object.
Immutable objects that compare by value are fine. The value can never
change for an immutable object. This includes tuples.
Mutable objects that compare by identity are fine. The identity can
never change, for any object.
Mutable objects that compare by value are not acceptable. The value
can change for a mutable object, which would make the dictionary
invalid.
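Two of these four cases can be tried out with user-defined classes. The class names below are hypothetical; the sketch relies on the Python 3 rule that a class defining __eq__ without __hash__ gets its __hash__ set to None:

```python
class ByIdentity:
    """Mutable, but keeps the default identity-based __eq__ and __hash__."""

    def __init__(self, name):
        self.name = name


class ByValue:
    """Mutable and compared by value; defining only __eq__ disables hashing."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        return isinstance(other, ByValue) and self.name == other.name


key = ByIdentity("bob")
table = {key: 30}
key.name = "alice"            # mutating the object does not change its identity
print(table[key])

try:
    {ByValue("bob"): 30}
    rejected = False
except TypeError as exc:
    rejected = True
    print("TypeError:", exc)
```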
Meanwhile, your wording isn't quite the same as Mapping Types (4.10 in Python 3.3 or 5.8 in Python 2.7), both of which say:
A dictionary’s keys are almost arbitrary values. Values that are not hashable, that is, values containing lists, dictionaries or other mutable types (that are compared by value rather than by object identity) may not be used as keys.
Anyway, the key point here is that the rule is "not hashable"; "mutable types (that are compared by value rather than by object identity)" is just to explain things a little further. It isn't strictly true that comparing by object identity and hashing by object identity are always the same (the only thing that's required is that if id is equal, the hash is equal).
The part about "efficient implementation of dictionaries" from the version you posted just adds to the confusion (which is probably why it's not in the reference documentation). Even if someone came up with an efficient way to deal with storing lists as dict keys tomorrow, the language doesn't allow it.
A hash is a way of calculating an integer code for an object; within one run of the interpreter, this code is always the same for equal objects. For example, hash('test') might be 2314058222102390712, and after a = 'test', hash(a) gives the same 2314058222102390712. (The exact number varies between Python versions and, for strings, between runs.)
Internally, a dictionary value is looked up by the hash of the key, not by the variable you specify. A list is mutable, so a hash for a list, if it were defined, would change whenever the list changes. Python therefore does not define a hash for lists, and lists cannot be used as dictionary keys.
Tuples are immutable, therefore tuples have hashes (as long as their elements are hashable), e.g. hash((1,2)) = 3713081631934410656. One could check whether a tuple a can be equal to the tuple (1,2) by comparing the hashes first: different hashes mean the tuples differ, which is cheaper than comparing both elements. Equal hashes do not prove equality, though, so the values still have to be compared in that case.
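A small sketch of these hash properties. The last check relies on a CPython implementation detail (hash(-1) is reported as -2) and is included only to show that two different objects can share a hash; other interpreters may behave differently:

```python
a = (1, 2)
b = (1, 2)

# Equal objects always have equal hashes, which is what dict lookup relies on.
equal_tuples_same_hash = (a == b) and (hash(a) == hash(b))
numbers_same_hash = hash(1) == hash(1.0) == hash(True)
print(equal_tuples_same_hash, numbers_same_hash)

# The reverse does not hold: unequal objects may share a hash (CPython detail).
collision = hash(-1) == hash(-2)
print(collision, -1 == -2)

# A dict still tells such keys apart, because it compares values after hashing.
table = {-1: "minus one", -2: "minus two"}
print(table[-1], table[-2])
```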