Turn dict into list wipes out the values [closed] - python

Closed 9 years ago.
Short question: why, when we do list(dict()), do we get back the keys of the dict and not the values?
Because from all that I know about (key, value) pairs, what matters is the value, not the key. The key is just a page in a book. Since we don't actually want the page but the content of that page, giving me the page makes no sense at all at first.
I believe that it somehow makes sense, but please clarify this one.
Thanks!
EDITED:
Now, since the most relevant part of a (key, value) pair IS THE VALUE, why doesn't the __iter__ method of dict return the values?

It is simply untrue that the value is "the most relevant part" of the key-value pair. The pair itself is what is relevant. That's why you're using a dict. If all you wanted was the values, you'd just use a list.
Also, as @Blender rightly points out, if you know the key, you can easily get the value, whereas the reverse is not true. So if you're only going to get one, it definitely makes sense to get the key and not the value.
Although it's true that in and iteration behavior are not necessarily linked, it's also true that for most other container types, iterating over the container yields all and only the items for which item in container would be true. I seem to recall seeing threads on comp.lang.python at one point where people said that the decision to make in on dictionaries work by key, and to make iteration work like in, was made a long time ago and then maintained for backwards compatibility, although I can't find any references for that right now.
It is legitimate to wonder why iterating over a dict yields the keys and not the key/value pairs. But the answer to this is just "that's the way the dict API specifies it". Iterating over the key-value pairs (or the values alone, if it comes to that) is so trivially easy, with a single method call, that it hardly matters which one is the default behavior.
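A minimal sketch of the behavior described above, assuming a made-up dict named prices (not from the original question). It shows that default iteration and the in operator both work on keys, while values and pairs are one method call away:

```python
# Hypothetical example data, chosen only for illustration.
prices = {'apple': 3, 'pear': 5}

keys_by_default = list(prices)        # iterating a dict yields its keys
values_only = list(prices.values())   # one method call for the values
pairs = list(prices.items())          # one method call for (key, value) pairs

# Membership tests agree with iteration: both look at keys only.
every_key_is_member = all(key in prices for key in prices)
value_is_member = 3 in prices         # 3 is a value here, not a key

print(keys_by_default, values_only, pairs)
print(every_key_is_member, value_is_member)
```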

The reason this occurs is that list accepts any iterable and obtains an iterator from it by calling iter on it. Since the __iter__ method of the dict type returns an iterator over its keys, calling list on a dict object gives you its keys.
>>> class A(object):
...     def __init__(self, lst):
...         self.lst = lst
...     def __iter__(self):
...         print('iter on A')
...         return iter(self.lst)
...
>>> a = A(range(10))
>>> list(a)
iter on A
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
In terms of implementation, returning only keys is faster than returning both, and since dict explicitly provides an items method, there isn't a very good reason for including values in the default __iter__ implementation. The TimeComplexity data on the Python wiki indicate that iterating over the keys is O(n) and retrieving a single value is O(1), which may seem insignificant until you realize that iterating and retrieving each value by its key is also O(n). This would be wasteful unless you really wanted the key, value pairs (as opposed to just keys, or just values), so it's not the default.
If you wanted it to be the default, you could do this:
class myDict(dict):
    def __iter__(self):
        # self.iteritems() on Python 2; iter(self.items()) also works on Python 3
        return iter(self.items())
and calling list on an instance of myDict will give you key, value pairs.

Why when we do list(dict()) the return is the keys of the dict, but
not the values?
First, doing dict() will not return any keys or values, but an empty dictionary. You are calling the built-in dict function.
If you type exactly that in the shell, you'll end up with an empty list [].
By the way, you also can't do:
d = {'a': 1, 'b': 2}
list(d())
This will raise a TypeError since dictionary objects are not callable.
By default, if you loop over a dictionary, its iterator returns the keys. This is the straightforward answer to your question. The reason for this implementation is that in Python there is only one built-in type that lets you retrieve values using any hashable object, and that is a dictionary. Hence, the primary use case for this type is to retrieve items by their key, which can be any hashable value. In addition, since dictionaries are unordered, easy access to the keys is the simplest way in, and I would argue it is the primary reason to even use a dictionary. Otherwise, what's wrong with a list? Or a tuple?
If you have a dictionary and you want to convert it to a list, you need to somehow 'flatten' the dictionary. This is because lists already have a 0-indexed key, which I am sure you already know.
To get list(somedict) to create a list of the values of any dictionary, you have a few ways.
The first one, which was hinted at in the comments, is the most straightforward way:
list({'a': 1, 'b': 2}.values())
If you want to add some syntactic sugar on it, but this is just being silly:
d = {'a': 1, 'b': 2}.values
list(d())
Finally, if you want to have both the keys and the values in your list, you can do this:
list({'a': 1, 'b': 2}.items())
[('a', 1), ('b', 2)]
Now you have a list of tuples, each representing a key/value pair. Some developers use this to sort the contents of a dictionary, as dictionaries are unsorted in Python. In Python 2.7, OrderedDict was added to the collections module; it provides dictionaries that remember the order in which keys were inserted (it does not sort them).
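As an illustration of sorting via the list of pairs, here is a small sketch. The scores dict is made-up data, and feeding the sorted pairs into an OrderedDict is just one possible way to keep the result in that order:

```python
from collections import OrderedDict
from operator import itemgetter

scores = {'carol': 72, 'alice': 90, 'bob': 85}  # hypothetical data

by_key = sorted(scores.items())                       # tuples compare on the key first
by_value = sorted(scores.items(), key=itemgetter(1))  # compare on the value instead

# OrderedDict remembers insertion order, so inserting already-sorted
# pairs gives a mapping that iterates in sorted-key order.
ordered = OrderedDict(by_key)

print(by_key)
print(by_value)
print(list(ordered))
```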

Related

Use lists to get objects in dictionary but not as keys

I was a bit surprised when I tried to do
if list in dict:
and got
TypeError: unhashable type: 'list'
I know it does not make sense to use lists as keys as they are mutable and their hash could change when you do operations on them. However, why is it not possible to use them to simply look up objects in a dictionary? I know it is not much work doing
if tuple(list) in dict:
rather than just
if list in dict:
But still, I feel like it would work as default behavior as the hash of its current elements should be the exact same thing as the hash of the corresponding tuple that may be in the dictionary? Or am I missing something in what makes lists unusable in dictionaries?
Actually, you could calculate a hash of a list (as of any other sequence of bits) and implement the desired behavior.
However,
tuples are immutable objects, so the hash of a tuple may be calculated once, at initialization
lists are mutable. They may change their state, and hashing based on the elements of a list would produce a different hash once the state has changed. So, with this hashing method, the hash of a list can't be precalculated
Now consider the in operation.
When you check a_tuple in a_dict, the hash of a_tuple is known, and the in operation takes O(1) time on average.
Suppose that someone implemented the a_list in a_dict operation. This would take O(len(a_list)) on top of the lookup itself, because the hash of a_list has to be calculated on every check. Therefore, unintended behavior happens.
On the other hand, consider another hashing method that takes O(1), e.g. hashing by the identity of the object. You would get different behavior for a_list in a_dict: if a_list == b_list but a_list is not b_list, then (a_list in a_dict) and (b_list in a_dict) can give different answers.
Also, notice that
Tuples can be used as keys if they contain only strings, numbers, or tuples
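The points above can be sketched as follows. The names lookup and IdHashedList are hypothetical and exist only for this illustration; IdHashedList is a deliberately bad idea that shows what identity-based hashing would do to equal lists:

```python
lookup = {(1, 2): 'pair'}   # hypothetical dict keyed by a tuple
a_list = [1, 2]

try:
    a_list in lookup         # hashing the list fails before any lookup happens
    list_lookup_error = None
except TypeError as exc:
    list_lookup_error = str(exc)

found_via_tuple = tuple(a_list) in lookup   # explicit conversion works

# A tuple is hashable only if everything inside it is hashable too.
try:
    hash((1, [2, 3]))
    nested_hashable = True
except TypeError:
    nested_hashable = False


class IdHashedList(list):
    """Hypothetical list that hashes by identity (O(1), ignores contents)."""

    def __hash__(self):
        return id(self)


a = IdHashedList([1, 2])
b = IdHashedList([1, 2])
by_identity = {a: 'stored'}

equal_lists = (a == b)            # contents are equal
a_found = a in by_identity        # same object, same hash
b_found = b in by_identity        # equal contents, different hash
print(list_lookup_error, found_via_tuple, nested_hashable)
print(equal_lists, a_found, b_found)
```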
While I don't think "why does Python work like that" questions are on-topic for this site, I'm gonna say this seems like a terrible idea. You are not supposed to (cannot) use lists as keys in a dictionary, so why would you expect them to be among the keys? If you are looking for a tuple, why would you want to pass a list instead?

Logic behind accessing dictionary values in python

Rookie here and I couldn't find a proper explanation for this.
We have a simple dict:
a_dict = {'color': 'blue', 'fruit': 'apple', 'pet': 'dog'}
to loop through and access the values of this dict I have to call
for key in a_dict:
    print(key, '->', a_dict[key])
I am asking about
a_dict[key]
specifically. Why does Python use this convention? What is the logic behind it? When I want to get the values of a dictionary, I would expect to call something like
a_dict[value] or a_dict[values] etc
instead (thinking logically).
Could anyone explain it to make more sense please?
edit:
To be clear: why does Python use a_dict[key] to access a dict VALUE instead of a_dict[value]? LOGICALLY.
According to your question, I think you mean: why doesn't Python use an index instead of a key to reach the values in a dict?
Please note that there are 4 main data containers in Python, each with its own use (there are also other containers, like Counter).
For example, the elements of a list or a tuple are reachable by their indices.
a = [1,2,3,4,5]
print(a[0])  # prints 1
But a dictionary, as its name suggests, maps from some objects (keys in Python terminology) to some other objects (values in Python terminology). So we pass the key instead of an index, and the output is the value.
a = { 'a':1 , 'b':2 }
print(a['a'])  # prints 1
I hope this makes it a bit clearer for you.
I think you are misunderstanding some terminology around dictionaries:
In your example, your keys are color, fruit, and pet.
Your values are blue, apple, and dog.
In python, you access your values by calling a_dict[key], for example a_dict["color"] will return "blue".
If python instead used your suggested method of a_dict[value], you would have to know what your value was before trying to access it, e.g. a_dict["blue"] would be needed to get "blue", which makes very little sense.
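To make the asymmetry concrete, here is a small sketch using the dict from the question. The helper keys_for and the extra 'sky' entry are assumptions added for illustration. Going from key to value is a direct lookup, while going from value to key needs a scan and may find several keys:

```python
a_dict = {'color': 'blue', 'fruit': 'apple', 'pet': 'dog'}

value = a_dict['color']   # key -> value is a direct lookup


def keys_for(mapping, wanted):
    """Hypothetical helper: return every key whose value equals `wanted`."""
    return [key for key, val in mapping.items() if val == wanted]


a_dict['sky'] = 'blue'    # values may repeat, keys may not
matching_keys = keys_for(a_dict, 'blue')

print(value)
print(matching_keys)
```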
As in Feras's answer, try reading up more on how dictionaries work.
It's because a dictionary in Python maps keys to values internally using a hash function.
Thus, to get the value, you have to pass in the key.
You can think of it like the indices of a list versus the elements of the list: to extract a particular element, you would use lst[index]. Dictionaries work the same way, except that instead of passing in an index you have to pass in the key you used in the dictionary, like dict[key].
One more comparison is a real dictionary (the one with words and meanings): the meanings are mapped to the words, so you would of course search for the word and not for the meaning given to the word.
You are searching for a value which you don't know exists in the dict or not, so a_dict[key] is logical and correct.

When dictionary keys are identical, why does Python keep only the last key-value pair?

Let's say I create a dictionary a_dictionary where two of the key-value pairs have an identical key:
In [1]: a_dictionary = {'key': 5, 'another_key': 10, 'key': 50}
In [2]: a_dictionary
Out[2]: {'key': 50, 'another_key': 10}
Why does Python choose here to keep the last key-value pair instead of throwing an error (or at least raising a warning) about using identical keys?
The way I see it, the main downside here is that you may lose data without being aware.
(If it's relevant, I ran the code above on Python 3.6.4.)
If your question is why Python dict displays were originally designed this way… Probably nobody knows.
We know when the decision was made. Python 0.9.x (1991-1993) didn't have dict displays; Python 1.0.x (1994) did. And they worked exactly the same as they do today. From the docs:1
A dictionary display yields a new dictionary object.
The key/datum pairs are evaluated from left to right to define the
entries of the dictionary: each key object is used as a key into the
dictionary to store the corresponding datum.
Restrictions on the types of the key values are listed earlier in
section types.
Clashes between duplicate keys are not detected; the last
datum (textually rightmost in the display) stored for a given key
value prevails.
And, testing it:
$ ./python
Python 1.0.1 (Aug 21 2018)
Copyright 1991-1994 Stichting Mathematisch Centrum, Amsterdam
>>> {'key': 1, 'other': 2, 'key': 3}
{'other': 2, 'key': 3}
But there's no mention of why Guido chose this design in:
The 1.0 docs.
The Design & History FAQ.
Guido's History of Python blog.
Anywhere else I can think of that might have it.
Also, if you look at different languages with similar features, some of them keep the last key-value pair like Python, some keep an arbitrary key-value pair, some raise some kind of error… there are enough of each that you can't argue that this was the one obvious design and that's why Guido chose it.
If you want a wild guess that's probably no better than what you could guess on your own, here's mine:
The compiler not only could, but does, effectively construct const values out of literals by creating an empty dict and inserting key-value pairs into it. So, you get duplicates-allowed, last-key-wins semantics by default; if you wanted anything else, you'd have to write extra code. And, without a compelling reason to pick one over another, Guido chose to not write extra code.
So, if there's no compelling reason for the design, why has nobody tried to change it in the 24 years since?
Well, someone filed a feature request (b.p.o. #16385) to make duplicate keys an error in 3.4 (but apparently went away when it was suggested that they bring it up on -ideas). It may well have come up a few other times, but obviously nobody wanted it changed badly enough to push for it.
Meanwhile, the closest thing to an actual argument for Python's existing behavior is this comment by Terry J. Reedy:
Without more use cases and support (from discussion on python-ideas), I think this should be rejected. Being able to re-write keys is fundamental to Python dicts and why they can be used for Python's mutable namespaces. A write-once or write-key-once dict would be something else.
As for literals, a code generator could depend on being able to write duplicate keys without having to go back and erase previous output.
1. I don't think the docs for 1.0 are directly linkable anywhere, but you can download the whole 1.0.1 source archive and build the docs from the TeX source.
I think @tobias_k has the ultimate answer -- because otherwise there would be inconsistencies. If
{'key': 0, 'key': 1}
threw an error then I would expect
lst = [('key', 0), ('key', 1)]
dict(lst)
to fail and then I would expect
d = {}
d['key'] = 0
d['key'] = 1
to also. But of course, that last option is obviously not what I want, so going back up the chain we reach the current behaviour.
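The consistency argument can be checked directly. In this sketch (the variable names are chosen for illustration), all three ways of building the dict agree, and the last value for the duplicated key wins in each:

```python
pairs = [('key', 0), ('key', 1)]

from_literal = {'key': 0, 'key': 1}   # duplicate key in a dict display
from_pairs = dict(pairs)              # same pairs passed to the constructor

from_assignment = {}
for k, v in pairs:                    # same pairs assigned one at a time
    from_assignment[k] = v

print(from_literal, from_pairs, from_assignment)
```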
Conceptually, you can think of dictionary creation as an iterative, incremental process. In other words, the assignment of a dictionary literal:
a_dictionary = {'key': 5, 'another_key': 10, 'key': 50}
is equivalent to creating an empty dictionary and then running a sequence of single assignment statements:
a_dictionary = {}
a_dictionary['key'] = 5
a_dictionary['another_key'] = 10
a_dictionary['key'] = 50
Naturally, if a key happens more than once, there is nothing wrong with reassigning a new value to it.
Usually you want to overwrite the value rather than throwing an error.
If you want a dictionary that protects itself from overwriting values, create a new class that wraps the dict class and raises an error if any value is overwritten.
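One possible sketch of such a class, written here as a dict subclass rather than a wrapper. The names WriteOnceDict and from_pairs are hypothetical, and raising KeyError is an arbitrary choice. Note that a dict display such as {'key': 5, 'key': 50} is built by the interpreter before any custom class sees it, so this can only guard assignments made through the class itself:

```python
class WriteOnceDict(dict):
    """Hypothetical dict subclass that refuses to overwrite an existing key."""

    def __setitem__(self, key, value):
        if key in self:
            raise KeyError('key already set: {!r}'.format(key))
        super().__setitem__(key, value)

    @classmethod
    def from_pairs(cls, pairs):
        # dict(pairs) and update() do not go through __setitem__ in CPython,
        # so insert the pairs one at a time to get the duplicate check.
        result = cls()
        for key, value in pairs:
            result[key] = value
        return result


d = WriteOnceDict.from_pairs([('key', 5), ('another_key', 10)])
try:
    d['key'] = 50
    overwrite_rejected = False
except KeyError:
    overwrite_rejected = True

print(d, overwrite_rejected)
```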

How can I initialize and increment an undefined value within a list in Python 3?

What I have is a dictionary of words and I'm generating objects that contain
(1) Original word (e.g. cats)
(2) Alphabetized word (e.g. acst)
(3) Length of the word
Without knowing the length of the longest word, is it possible to create an array (or, in Python, a list) such that, as I scan through the dictionary, it will append an object with x chars into a list in array[x]?
For example, when I encounter the word "a", it will append the generated object to the list at array[1]. Next, for aardvark, it will append the generated object to the list at array[8], etc.
I thought about creating an array of size 1 and then adding on to it, but I'm not sure how it would work.
For example: for the first word, a, it will append it to the list stored in array[1]. However, for the next word, aardvark, how am I supposed to check/generate more spots in the list until it hits 8? If I append to array, I need to give the append function an arg. But I can't give it just any arg, since I don't want to change previously entered values (e.g. 'a' in array[1]).
I'm trying to optimize my code for an assignment, so the alternative is going through the list a second time after I've determined the longest word. However, I think it would be better to do it as I alphabetize the words and create the objects such that I don't have to go through the lengthy dictionary twice.
Also, quick question about syntax: listOfStuff[x].append(y) will initialize/append to the list within listOfStuff at the value x with the value y, correct?
Store the lengths as keys in a dict rather than as indexes in a list. This is really easy if you use a defaultdict from the collections module - your algorithm will look like this:
from collections import defaultdict
results = defaultdict(list)
for word in words:
    results[len(word)].append(word)
This ties in to your second question: listOfStuff[x].append(y) will append to a list that already exists at listOfStuff[x]. It will not create a new one if that hasn't already been initialised to a (possibly empty) list. If x isn't a valid index into the list (e.g. x=3 into a listOfStuff of length 2), you'll get an IndexError. If it exists but there is something other than another list there, you will probably get an AttributeError.
Using a dict takes care of the first problem for you - assigning to a non-existent dict key is always valid. Using a defaultdict extends this idea to also reading from a non-existent key - it will insert a default value given by calling the function you give the defaultdict when you create it (in this case, we gave it list, so it calls it and gets an empty list) into the dict the first time you use it.
If you can't use collections for some reason, the next best way is still to use dicts - they have a method called setdefault that works similarly to defaultdicts. You can use it like this:
results = {}
for word in words:
    results.setdefault(len(word), []).append(word)
as you can see, setdefault takes two arguments: a key and a default value. If the key already exists in the dict, setdefault just returns its current value as if you'd done results[key]. If that would be an error, however, it inserts the second argument into the dictionary at that key, and then returns it. This is a little bit clunkier to use than defaultdict, but when your default value is an empty list it is otherwise the same (defaultdict is better to use when your default is expensive to create, however, since it only calls the factory function as needed, but you need to precompute it to pass into setdefault).
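The difference in when the default gets created can be made visible with a small sketch. The factory make_default, the calls list, and the sample words are assumptions added for illustration. The defaultdict calls the factory only for new keys, while the argument to setdefault is built on every call:

```python
from collections import defaultdict

calls = []


def make_default():
    """Hypothetical factory that records each call so they can be counted."""
    calls.append('called')
    return []


words = ['a', 'at', 'it', 'aardvark']   # made-up input with lengths 1, 2, 2, 8

lazy = defaultdict(make_default)
for word in words:
    lazy[len(word)].append(word)
calls_with_defaultdict = len(calls)      # one call per new key

calls.clear()
eager = {}
for word in words:
    eager.setdefault(len(word), make_default()).append(word)
calls_with_setdefault = len(calls)       # one call per word, needed or not

print(calls_with_defaultdict, calls_with_setdefault)
print(dict(lazy) == eager)
```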
It is technically possible to do this with nested lists, but it is ugly. You have to:
Detect the case that the list isn't big enough
Figure out how many more elements the list needs
Grow the list to that size
The most Pythonic way to do the first bit is to catch the error (something you could also do with dicts if setdefault and defaultdict didn't exist). The whole thing looks like this:
results = []
for word in words:
    try:
        results[len(word)]
    except IndexError:
        # Grow the list so that the new highest index is
        # len(word)
        new_length = len(word) + 1
        difference = new_length - len(results)
        results.extend([] for _ in range(difference))
    finally:
        results[len(word)].append(word)
Stay with dicts to avoid this kind of mess. lists are specifically optimised for the case that the exact numeric index of any element isn't meaningful outside of the list, which doesn't meet your use case. This type of code is really common when you have a mismatch between what your code needs to do and what the data structures you're using are good at, and it is worth learning as early as possible how to avoid it.

Extend list at time of creation in python, how?

Here is the sample dict one
one = {'a': 1,'b': 3}
and the second dict is
second = {'x': 45,'y': 45}
here is the key container(of type list)
key_con = one.keys()
key_con = key_con.extend(second.keys())
and all works well.
But I tried to shorten the code
like this
key_con = one.keys().extend(second.keys())
Now key_con is NoneType.
I want to build this key container in one line of code.
How can I achieve it?
key_con = one.keys() + second.keys()
extend modifies the list in-place and doesn't return anything. Are you sure your first snippet works?
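A short sketch of what extend actually returns, using the dicts from the question. The list() calls are an addition so that the snippet also runs on Python 3, where keys() returns a view rather than a list:

```python
one = {'a': 1, 'b': 3}
second = {'x': 45, 'y': 45}

key_con = list(one.keys())      # list() is needed on Python 3, where keys() is a view
result_of_extend = key_con.extend(second.keys())   # mutates key_con, returns None

# One-line alternative that builds a new list instead of mutating one.
combined = list(one.keys()) + list(second.keys())

print(result_of_extend)
print(key_con)
print(combined)
```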
While the other answer by Pavel Anossov answered the question you explicitly asked, I would still argue that it's not the best solution to the problem at hand. Dictionaries are unordered, and can't have duplicate keys, so using a list to store the keys is inherently misleading and a bad idea.
Instead, it is a much better idea to store this data in a set - sets don't have order, and can't contain duplicates, and so fill this role much more effectively.
In Python 3.x, dict.keys() gives a set-like dictionary view, so this would be best done with:
key_con = one.keys() | second.keys()
We use the | (binary or) operator, which, on sets and set-like objects, signals a union (all elements in one set or the other).
In 2.7, the same behaviour can be obtained with dict.viewkeys():
key_con = one.viewkeys() | second.viewkeys()
In older versions, we can use dict.iterkeys() with set():
key_con = set(one.iterkeys()) | set(second.iterkeys())
