I am trying to get a deeper understanding to how for loops for different data types in Python. The simplest way of using a for loop an iterating over an array is as
for i in range(len(array)):
do_something(array[i])
I also know that I can
for i in array:
do_something(i)
What I would like to know is what this does
for i, j in range(len(array)):
# What is i and j here?
or
for i, j in array:
# What is i and j in this case?
And what happens if I try using this same idea with dictionaries or tuples?
The simplest and best way is the second one, not the first one!
for i in array:
do_something(i)
Never do this, it's needlessly complicating the code:
for i in range(len(array)):
do_something(array[i])
If you need the index in the array for some reason (usually you don't), then do this instead:
for i, element in enumerate(array):
print("working with index", i)
do_something(element)
This is just an error, you will get TypeError: 'int' object is not iterable when trying to unpack one integer into two names:
for i, j in range(len(array)):
# What is i and j here?
This one might work, assumes the array is "two-dimensional":
for i, j in array:
# What is i and j in this case?
An example of a two-dimensional array would be a list of pairs:
>>> for i, j in [(0, 1), ('a', 'b')]:
... print('i:', i, 'j:', j)
...
i: 0 j: 1
i: a j: b
Note: ['these', 'structures'] are called lists in Python, not arrays.
Your third loop will not work as it will throw a TypeError for an int not being iterable. This is because you are trying to "unpack" the int that is the array's index into i, and j which is not possible. An example of unpacking is like so:
tup = (1,2)
a,b = tup
where you assign a to be the first value in the tuple and b to be the second. This is also useful when you may have a function return a tuple of values and you want to unpack them immediately when calling the function. Like,
train_X, train_Y, validate_X, validate_Y = make_data(data)
More common loop cases that I believe you are referring to is how to iterate over an arrays items and it's index.
for i, e in enumerate(array):
...
and
for k,v in d.items():
...
when iterating over the items in a dictionary. Furthermore, if you have two lists, l1 and l2 you can iterate over both of the contents like so
for e1, e2 in zip(l1,l2):
...
Note that this will truncate the longer list in the case of unequal lengths while iterating. Or say that you have a lists of lists where the outer lists are of length m and the inner of length n and you would rather iterate over the elements in the inner lits grouped together by index. This is effectively iterating over the transpose of the matrix, you can use zip to perform this operation as well.
for inner_joined in zip(*matrix): # will run m times
# len(inner_joined) == m
...
Python's for loop is an iterator-based loop (that's why bruno desthuilliers says that it "works for all iterables (lists, tuples, sets, dicts, iterators, generators etc)". A string is also another common type of iterable).
Let's say you have a list of tuples. Using that nomenclature you shared, one can iterate through both the keys and values simultaneously. For instance:
tuple_list = [(1, "Countries, Cities and Villages"),(2,"Animals"),(3, "Objects")]
for k, v in tuple_list:
print(k, v)
will give you the output:
1 Countries, Cities and Villages
2 Animals
3 Objects
If you use a dictionary, you'll also gonna be able to do this. The difference here is the need for .items()
dictionary = {1: "Countries, Cities and Villages", 2: "Animals", 3: "Objects"}
for k, v in dictionary.items():
print(k, v)
The difference between dictionary and dictionary.items() is the following
dictionary: {1: 'Countries, Cities and Villages', 2: 'Animals', 3: 'Objects'}
dictionary.items(): dict_items([(1, 'Countries, Cities and Villages'), (2, 'Animals'), (3, 'Objects')])
Using dictionary.items() we'll get a view object containig the key-value pairs of the dictionary, as tuples in a list. In other words, with dictionary.items() you'll also get a list of tuples. If you don't use it, you'll get
TypeError: cannot unpack non-iterable int object
If you want to get the same output using a simple list, you'll have to use something like enumerate()
list = ["Countries, Cities and Villages","Animals", "Objects"]
for k, v in enumerate(list, 1): # 1 means that I want to start from 1 instead of 0
print(k, v)
If you don't, you'll get
ValueError: too many values to unpack (expected 2)
So, this naturally raises the question... do I need always a list of tuples? No. Using enumerate() we'll get an enumerate object.
Actually, "the simplest way of using a for loop an iterating over an array" (the Python type is named "list" BTW) is the second one, ie
for item in somelist:
do_something_with(item)
which FWIW works for all iterables (lists, tuples, sets, dicts, iterators, generators etc).
The range-based C-style version is considered highly unpythonic, and will only work with lists or list-like iterables.
What I would like to know is what this does
for i, j in range(len(array)):
# What is i and j here?
Well, you could just test it by yourself... But the result is obvious: it will raise a TypeError because unpacking only works on iterables and ints are not iterable.
or
for i, j in array:
# What is i and j in this case?
Depends on what is array and what it yields when iterating over it. If it's a list of 2-tuples or an iterator yielding 2-tuples, i and j will be the elements of the current iteration item, ie:
array = [(letter, ord(letter)) for letter in "abcdef"]
for letter, letter_ord in array:
print("{} : {}".format(letter, letter_ord))
Else, it will most probably raise a TypeError too.
Note that if you want to have both the item and index, the solution is the builtin enumerate(sequence), which yields an (index, item) tuple for each item:
array = list("abcdef")
for index, letter in enumerate(array):
print("{} : {}".format(index, letter)
Understood your question, explaining it using a different example.
1. Multiple Assignments using -> Dictionary
dict1 = {1: "Bitcoin", 2: "Ethereum"}
for key, value in dict1.items():
print(f"Key {key} has value {value}")
print(dict1.items())
Output:
Key 1 has value Bitcoin
Key 2 has value Ethereum
dict_items([(1, 'Bitcoin'), (2, 'Ethereum')])
Explaining dict1.items():
dict1_items() creates values dict_items([(1, 'Bitcoin'), (2, 'Ethereum')])
It comes in pairs (key, value) for each iteration.
2. Multiple Assignments using -> enumerate() Function
coins = ["Bitcoin", "Ethereum", "Cardano"]
prices = [48000, 2585, 2]
for i, coin in enumerate(coins):
price = prices[i]
print(f"${price} for 1 {coin}")
Output:
$48000 for 1 Bitcoin
$2585 for 1 Ethereum
$2 for 1 Cardano
Explaining enumerate(coins):
enumerate(coins) creates values ((0, 'Bitcoin'), (1, 'Ethereum'), (2, 'Cardano'))
It comes in pairs (index, value) for each (one) iteration
3. Multiple Assignments using -> zip() Function
coins = ["Bitcoin", "Ethereum", "Cardano"]
prices = [48000, 2585, 2]
for coin, price in zip(coins, prices):
print(f"${price} for 1 {coin}")
Output:
$48000 for 1 Bitcoin
$2585 for 1 Ethereum
$2 for 1 Cardano
Explaining
zip(coins, prices) create values (('Bitcoin', 48000), ('Ethereum', 2585), ('Cardano', 2))
It comes in pairs (value-list1, value-list2) for each (one) iteration.
I just wanted to add that, even in Python, you can get a for in effect using a classic C style loop by just using a local variable
l = len(mylist) #I often need to use this more than once anyways
for n in range(l-1):
i = mylist[n]
print("iterator:",n, " item:",i)
Related
I have an OrderedDict in Python, and I only want to get the first key-vale pairs. How to get it? For example, to get the first 4 elements, i did the following:
subdict = {}
for index, pair in enumerate(my_ordered_dict.items()):
if index < 4:
subdict[pair[0]] = pair[1]
Is this the good way to do it?
That approach involves running over the whole dictionary even though you only need the first four elements, checking the index over and over, and manually unpacking the pairs, and manually performing index checking unnecessarily.
Making it short-circuit is easy:
subdict = {}
for index, pair in enumerate(my_ordered_dict.items()):
if index >= 4:
break # Ends the loop without iterating all of my_ordered_dict
subdict[pair[0]] = pair[1]
and you can nested the unpacking to get nicer names:
subdict = {}
# Inner parentheses mandatory for nested unpacking
for index, (key, val) in enumerate(my_ordered_dict.items()):
if index >= 4:
break # Ends the loop
subdict[key] = value
but you can improve on that with itertools.islice to remove the manual index checking:
from itertools import islice # At top of file
subdict = {}
# islice lazily produces the first four pairs then stops for you
for key, val in islice(my_ordered_dict.items(), 4):
subdict[key] = value
at which point you can actually one-line the whole thing (because now you have an iterable of exactly the four pairs you want, and the dict constructor accepts an iterable of pairs):
subdict = dict(islice(my_ordered_dict.items(), 4))
You can use a map function, like this
item = dict(map(lambda x: (x, subdict[x]),[*subdict][:4]))
Here is one approach:
sub_dict = dict(pair for i, pair in zip(range(4), my_ordered_dict.items()))
The length of zip(a,b) is equal to the length of the shortest of a and b, so if my_ordered_dict.items() is longer than 4, zip(range(4), my_ordered_dict.items() just takes the first 4 items. These key-value pairs are passed to the dict builtin to make a new dict.
if a tuple is iterable why the following code results to an error?
I have read many threads about this but I am not able to understand.
tp1 = (5,7)
for i,j in tp1:
print(i,j)
When you use the for i,j in tp1: syntax, Python is expecting that item is an iterable of iterables, which can be unpacked into the variables i and j. In this case, when you iterate over item, the individual members of item are int's, which cannot be unpacked
if you want to unpack the items in the tuple, you should do it like this:
for i in tp1:
print(i)
and if you want to iterate throught that way you need tuple or list inside tuple or list
(i.e any iterable inside iterable)
a = [(1,2), (3,4)]
for i, j in a: # This unpacks the tuple's contents into i and j as you're iterating over a
print(i, j)
hope it would help you
If a tuple is iterable why the following code results to an error?
Because for i, j in (5, 7, ..., n) is essentially equivalent to:
i, j = 5
i, j = 7
...
i, j = n
And the resulting int on the right-hand side is not iterable. You cannot "unpack"* each element of the tuple further because it is a single integer.
What you can do is the more straightforward assignment:
tp1 = (5, 7)
i, j = tp1
The syntax that you're currently using would apply if each element of the tuple is iterable, such as :
>>> tp2 = (['a', 5], ['b', 6])
>>> for i, j in tp2: print('%s -> %d' % (i, j))
...
a -> 5
b -> 6
*From comments: Can you explain to me what you mean with the phrase 'you cannot "unpack" each element of the tuple further because it is a single integer'?
Unpacking is the concept of expanding or extracting multiple parts of a thing on the right hand side into multiple things on the left-hand side.
Unpacking, extended unpacking, and nested extended unpacking gives a thorough overview of unpacking. There are also some extended forms of unpacking syntax specified in PEP 3132.
Your for loop is trying to unpack a tuple from each element of the tuple. You could instead do:
for i in tp1:
print i
or you could just do:
i, j = tpl
print(i, j)
but combining the two only makes sense if you have a tuple of tuples, like ((5, 7), (9, 11)).
I defined a dictionary like this (list is a list of integers):
my_dictionary = {'list_name' : list, 'another_list_name': another_list}
Now, I want to create a new list by iterating over this dictionary. In the end, I want it to look like this:
my_list = [list_name_list_item1, list_name_list_item2,
list_name_list_item3, another_list_name_another_list_item1]
And so on.
So my question is: How can I realize this?
I tried
for key in my_dictionary.keys():
k = my_dictionary[key]
for value in my_dictionary.values():
v = my_dictionary[value]
v = str(v)
my_list.append(k + '_' + v)
But instead of the desired output I receive a Type Error (unhashable type: 'list') in line 4 of this example.
You're trying to get a dictionary item by it's value whereas you already have your value.
Do it in one line using a list comprehension:
my_dictionary = {'list_name' : [1,4,5], 'another_list_name': [6,7,8]}
my_list = [k+"_"+str(v) for k,lv in my_dictionary.items() for v in lv]
print(my_list)
result:
['another_list_name_6', 'another_list_name_7', 'another_list_name_8', 'list_name_1', 'list_name_4', 'list_name_5']
Note that since the order in your dictionary is not guaranteed, the order of the list isn't either. You could fix the order by sorting the items according to keys:
my_list = [k+"_"+str(v) for k,lv in sorted(my_dictionary.items()) for v in lv]
Try this:
my_list = []
for key in my_dictionary:
for item in my_dictionary[key]:
my_list.append(str(key) + '_' + str(item))
Hope this helps.
Your immediate problem is that dict().values() is a generator yielding the values from the dictionary, not the keys, so when you attempt to do a lookup on line 4, it fails (in this case) as the values in the dictionary can't be used as keys. In another case, say {1:2, 3:4}, it would fail with a KeyError, and {1:2, 2:1} would not raise an error, but likely give confusing behaviour.
As for your actual question, lists do not attribute any names to data, like dictionaries do; they simply store the index.
def f()
a = 1
b = 2
c = 3
l = [a, b, c]
return l
Calling f() will return [1, 2, 3], with any concept of a, b, and c being lost entirely.
If you want to simply concatenate the lists in your dictionary, making a copy of the first, then calling .extend() on it will suffice:
my_list = my_dictionary['list_name'][:]
my_list.extend(my_dictionary['another_list_name'])
If you're looking to keep the order of the lists' items, while still referring to them by name, look into the OrderedDict class in collections.
You've written an outer loop over keys, then an inner loop over values, and tried to use each value as a key, which is where the program failed. Simply use the dictionary's items method to iterate over key,value pairs instead:
["{}_{}".format(k,v) for k,v in d.items()]
Oops, failed to parse the format desired; we were to produce each item in the inner list. Not to worry...
d={1:[1,2,3],2:[4,5,6]}
list(itertools.chain(*(
["{}_{}".format(k,i) for i in l]
for (k,l) in d.items() )))
This is a little more complex. We again take key,value pairs from the dictionary, then make an inner loop over the list that was the value and format those into strings. This produces inner sequences, so we flatten it using chain and *, and finally save the result as one list.
Edit: Turns out Python 3.4.3 gets quite confused when doing this nested as generator expressions; I had to turn the inner one into a list, or it would replace some combination of k and l before doing the formatting.
Edit again: As someone posted in a since deleted answer (which confuses me), I'm overcomplicating things. You can do the flattened nesting in a chained comprehension:
["{}_{}".format(k,v) for k,l in d.items() for v in l]
That method was also posted by Jean-François Fabre.
Use list comprehensions like this
d = {"test1":[1,2,3,],"test2":[4,5,6],"test3":[7,8,9]}
new_list = [str(item[0])+'_'+str(v) for item in d.items() for v in item[1]]
Output:
new_list:
['test1_1',
'test1_2',
'test1_3',
'test3_7',
'test3_8',
'test3_9',
'test2_4',
'test2_5',
'test2_6']
Let's initialize our data
In [1]: l0 = [1, 2, 3, 4]
In [2]: l1 = [10, 20, 30, 40]
In [3]: d = {'name0': l0, 'name1': l1}
Note that in my example, different from yours, the lists' content is not strings... aren't lists heterogeneous containers?
That said, you cannot simply join the keys and the list's items, you'd better cast these value to strings using the str(...) builtin.
Now it comes the solution to your problem... I use a list comprehension
with two loops, the outer loop comes first and it is on the items (i.e., key-value couples) in the dictionary, the inner loop comes second and it is on the items in the corresponding list.
In [4]: res = ['_'.join((str(k), str(i))) for k, l in d.items() for i in l]
In [5]: print(res)
['name0_1', 'name0_2', 'name0_3', 'name0_4', 'name1_10', 'name1_20', 'name1_30', 'name1_40']
In [6]:
In your case, using str(k)+'_'+str(i) would be fine as well, but the current idiom for joining strings with a fixed 'text' is the 'text'.join(...) method. Note that .join takes a SINGLE argument, an iterable, and hence in the list comprehension I used join((..., ...))
to collect the joinands in a single argument.
I am trying to iterate through a dictionary where each key contains a list which in turn contains from 0 up to 20+ sub-lists. The goal is to iterate through the values of dictionary 1, check if they are in any of the sublists of dictionary 2 for the same key, and if so, add +1 to a counter and not consider that sublist again.
The code looks somewhat like this:
dict1={"key1":[[1,2],[6,7]],"key2":[[1,2,3,4,5,6,7,8,9]]}
dict2={"key1":[[0,1,2,3],[5,6,7,8],[11,13,15]],"key2":[[7,8,9,10,11],[16,17,18]]}
for (k,v), (k2,v2) in zip(dict1.iteritems(),dict2.iteritems()):
temp_hold=[]
span_overlap=0
for x in v:
if x in v2 and v2 not in temp_hold:
span_overlap+=1
temp_hold.append(v2)
else:
continue
print temp_hold, span_overlap
This does obviously not work mainly due to the code not being able to check hierarchally through the list and sublists, and partly due to likely incorrect iteration syntax. I have not the greatest of grasp of nested loops and iterations which makes this a pain. Another option would be to first join the sublists into a single list using:
v=[y for x in v for y in x]
Which would make it easy to check if one value is in another dictionary, but then I lose the ability to work specifically with the sublist which contained parts of the values iterated through, nor can I count that sublist only once.
The desired output is a count of 2 for key1, and 1 for key2, as well as being able to handle the matching sublists for further analysis.
Here is one solution. I am first converting the list of lists into a list of sets. If you have any control over the lists, make them sets.
def matching_sublists(dict1, dict2):
result = dict()
for k in dict1:
assert(k in dict2)
result[k] = 0
A = [set(l) for l in dict1[k]]
B = [set(l) for l in dict2[k]]
for sublistA in A:
result[k] += sum([1 for sublistB in B if not sublistA.isdisjoint(sublistB) ])
return result
if __name__=='__main__':
dict1={"key1":[[1,2],[6,7]],"key2":[[1,2,3,4,5,6,7,8,9]]}
dict2={"key1":[[0,1,2,3],[5,6,7,8],[11,13,15]],"key2":[[7,8,9,10,11],[16,17,18]]}
print(matching_sublists(dict1, dict2))
What does for row_number, row in enumerate(cursor): do in Python?
What does enumerate mean in this context?
The enumerate() function adds a counter to an iterable.
So for each element in cursor, a tuple is produced with (counter, element); the for loop binds that to row_number and row, respectively.
Demo:
>>> elements = ('foo', 'bar', 'baz')
>>> for elem in elements:
... print elem
...
foo
bar
baz
>>> for count, elem in enumerate(elements):
... print count, elem
...
0 foo
1 bar
2 baz
By default, enumerate() starts counting at 0 but if you give it a second integer argument, it'll start from that number instead:
>>> for count, elem in enumerate(elements, 42):
... print count, elem
...
42 foo
43 bar
44 baz
If you were to re-implement enumerate() in Python, here are two ways of achieving that; one using itertools.count() to do the counting, the other manually counting in a generator function:
from itertools import count
def enumerate(it, start=0):
# return an iterator that adds a counter to each element of it
return zip(count(start), it)
and
def enumerate(it, start=0):
count = start
for elem in it:
yield (count, elem)
count += 1
The actual implementation in C is closer to the latter, with optimisations to reuse a single tuple object for the common for i, ... unpacking case and using a standard C integer value for the counter until the counter becomes too large to avoid using a Python integer object (which is unbounded).
It's a builtin function that returns an object that can be iterated over. See the documentation.
In short, it loops over the elements of an iterable (like a list), as well as an index number, combined in a tuple:
for item in enumerate(["a", "b", "c"]):
print item
prints
(0, "a")
(1, "b")
(2, "c")
It's helpful if you want to loop over a sequence (or other iterable thing), and also want to have an index counter available. If you want the counter to start from some other value (usually 1), you can give that as second argument to enumerate.
I am reading a book (Effective Python) by Brett Slatkin and he shows another way to iterate over a list and also know the index of the current item in the list but he suggests that it is better not to use it and to use enumerate instead.
I know you asked what enumerate means, but when I understood the following, I also understood how enumerate makes iterating over a list while knowing the index of the current item easier (and more readable).
list_of_letters = ['a', 'b', 'c']
for i in range(len(list_of_letters)):
letter = list_of_letters[i]
print (i, letter)
The output is:
0 a
1 b
2 c
I also used to do something, even sillier before I read about the enumerate function.
i = 0
for n in list_of_letters:
print (i, n)
i += 1
It produces the same output.
But with enumerate I just have to write:
list_of_letters = ['a', 'b', 'c']
for i, letter in enumerate(list_of_letters):
print (i, letter)
As other users have mentioned, enumerate is a generator that adds an incremental index next to each item of an iterable.
So if you have a list say l = ["test_1", "test_2", "test_3"], the list(enumerate(l)) will give you something like this: [(0, 'test_1'), (1, 'test_2'), (2, 'test_3')].
Now, when this is useful? A possible use case is when you want to iterate over items, and you want to skip a specific item that you only know its index in the list but not its value (because its value is not known at the time).
for index, value in enumerate(joint_values):
if index == 3:
continue
# Do something with the other `value`
So your code reads better because you could also do a regular for loop with range but then to access the items you need to index them (i.e., joint_values[i]).
Although another user mentioned an implementation of enumerate using zip, I think a more pure (but slightly more complex) way without using itertools is the following:
def enumerate(l, start=0):
return zip(range(start, len(l) + start), l)
Example:
l = ["test_1", "test_2", "test_3"]
enumerate(l)
enumerate(l, 10)
Output:
[(0, 'test_1'), (1, 'test_2'), (2, 'test_3')]
[(10, 'test_1'), (11, 'test_2'), (12, 'test_3')]
As mentioned in the comments, this approach with range will not work with arbitrary iterables as the original enumerate function does.
The enumerate function works as follows:
doc = """I like movie. But I don't like the cast. The story is very nice"""
doc1 = doc.split('.')
for i in enumerate(doc1):
print(i)
The output is
(0, 'I like movie')
(1, " But I don't like the cast")
(2, ' The story is very nice')
I am assuming that you know how to iterate over elements in some list:
for el in my_list:
# do something
Now sometimes not only you need to iterate over the elements, but also you need the index for each iteration. One way to do it is:
i = 0
for el in my_list:
# do somethings, and use value of "i" somehow
i += 1
However, a nicer way is to user the function "enumerate". What enumerate does is that it receives a list, and it returns a list-like object (an iterable that you can iterate over) but each element of this new list itself contains 2 elements: the index and the value from that original input list:
So if you have
arr = ['a', 'b', 'c']
Then the command
enumerate(arr)
returns something like:
[(0,'a'), (1,'b'), (2,'c')]
Now If you iterate over a list (or an iterable) where each element itself has 2 sub-elements, you can capture both of those sub-elements in the for loop like below:
for index, value in enumerate(arr):
print(index,value)
which would print out the sub-elements of the output of enumerate.
And in general you can basically "unpack" multiple items from list into multiple variables like below:
idx,value = (2,'c')
print(idx)
print(value)
which would print
2
c
This is the kind of assignment happening in each iteration of that loop with enumerate(arr) as iterable.
the enumerate function calculates an elements index and the elements value at the same time. i believe the following code will help explain what is going on.
for i,item in enumerate(initial_config):
print(f'index{i} value{item}')