Python enumerate - what cannot I see?

Can anyone explain what is going on here? I am flummoxed.
I have a module-wide list variable whose elements have fields: mylist with 'n' entries, each with field1, field2..fieldx.
I want to access them in a procedure, so I have (with some trace/debug statements):
print(mylist[1].dataFieldCheckType)
for lIndex, lField in enumerate(mylist, start=1):
    print(lField.dataFieldCheckType)
The first print statement gives the value -4 (which is correct), the second gives a different value, 0, over a simple one-statement step.
To my mind, lField is being created as a new element with default values but I do not know, nor understand, why. Why is the second print statement giving a different value from the first?
What am I doing wrong? Or, probably more pertinently, what am I not understanding?
I have asked this in another forum but no-one has come up with a plausible explanation.

In enumerate(), start does not specify the starting index into the iterable. It specifies the starting value of the count. enumerate() iterates over the whole iterable, from the first (index 0) to the last element, regardless of the start parameter.
The first print statement in your loop prints mylist[0].dataFieldCheckType, just as it ought to. You were simply expecting it to be mylist[1].dataFieldCheckType.
If you want to take all elements of the list starting at the second (index 1), just slice it:
mylist[1:]
And if you really do need the index, too, combine the slice with enumerate():
enumerate(mylist[1:], start=1)
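To make the difference concrete, here is a minimal sketch using a stand-in list of strings (the sample data is an assumption, not the asker's objects). It suggests that start only changes the counter, while the slice changes which elements are visited.

```python
# Hypothetical stand-in for the asker's list of objects
mylist = ['zero', 'one', 'two']

# start=1 only shifts the counter; iteration still begins at mylist[0]
counted = list(enumerate(mylist, start=1))
print(counted)   # [(1, 'zero'), (2, 'one'), (3, 'two')]

# Slicing first skips mylist[0]; start=1 then keeps the counter
# aligned with each element's real index in mylist
aligned = list(enumerate(mylist[1:], start=1))
print(aligned)   # [(1, 'one'), (2, 'two')]
```

In the second form, mylist[index] is the same object as the value that the loop yields, which is what the question expected.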

enumerate yields (index + start, value) tuples for every value of an iterable. The optional start parameter is used as an offset value to compute the first element of the generated tuples:
>>> a = ['hi', 'stack', 'overflow']
>>> for x in enumerate(a, -4):
...     x
...
(-4, 'hi') # 0 + (-4)
(-3, 'stack') # 1 + (-4)
(-2, 'overflow') # 2 + (-4)
If you want to skip elements of an iterable, but don't need that particular slice in memory (all you want to do is iteration), use itertools.islice:
>>> from itertools import islice
>>> for x in islice(a, 2, None):
...     x
...
'overflow'
Of course, you could combine the two for great justice.
>>> for x in islice(enumerate(a), 0, 2):
...     x
...
(0, 'hi')
(1, 'stack')
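For the situation in the question (skip the first element but keep the true index), the same combination can be sketched the other way round. This is only an illustration reusing the list a from above; the variable name skipped is made up for the example.

```python
from itertools import islice

a = ['hi', 'stack', 'overflow']

# enumerate counts from 0 over the whole list; islice then drops the
# first pair lazily, so no sliced copy of the list is created
skipped = list(islice(enumerate(a), 1, None))
print(skipped)   # [(1, 'stack'), (2, 'overflow')]
```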

Related

Using tuples as indexes to compare items across a list

I have a list of three tuples and a list of three strings:
pairs = [(0, 1), (0, 2), (1, 2)]
values = ['aac', 'ccc', 'caa']
I would like to use the elements of the pairs as indexes to compare the strings in the following way:
The first pair of indexes, (0, 1) operate across the first letter of each string: a from the first, c from the second, and c from the third. That is, it compares the values at index 0 and 1 in the sequence a, c, c. Since a is lexically less than c, this comparison should give 'smaller'.
The second pair is (0, 2) and operates across the second letter of each string: a, c, a. Since they're both a, the result should be 'equal'.
Finally, (1, 2) is checked on c, c, a, resulting in 'bigger'.
So the total expected output is the following list:
['smaller', 'bigger', 'equal']
I have tried the following code:
n = 0
for x, y in pairs:
    if ord(values[x][n]) > ord(values[y][n]):
        print('bigger')
        n += 1
    elif ord(values[x][n]) < ord(values[y][n]):
        print('smaller')
        n += 1
    else:
        print('equal')
        n += 1
However, not only does it print the results instead of building a list, it also gives incorrect results (smaller, equal, bigger). How do I achieve my intended result?
You could use a list comprehension with the zip function to combine the two lists:
pairs = [(0, 1), (0, 2), (1, 2)]
values = ['aac', 'ccc', 'caa']
result = [ ("smaller","equal","bigger")[(v[x]>v[y])+(v[x]>=v[y])]
for v,(x,y) in zip(zip(*values),pairs) ]
print(result)
['smaller', 'equal', 'bigger']
zip(*values) will create tuples with the nth character of each string: ('a','c','c'), ('a','c','a'), ('c','c','a')
zip(zip(*values),pairs) combines those character tuples with each corresponding pair: (('a','c','c'),(0,1)), (('a','c','a'),(0,2)), (('c','c','a'),(1,2))
these become v (the nth characters of each value) and x,y (the nth index pair)
The appropriate keyword is then chosen in ("smaller","equal","bigger") using the index 0, 1 or 2
Python treats True as 1 and False as 0 when adding booleans (comparison results), so the index will be 1+1 if v[x] is greater than v[y], 0+1 if v[x] is equal to v[y], and 0+0 otherwise. BTW, you don't need ord() to compare characters.
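The boolean-sum trick can be checked in isolation. The following is a small sketch; the names LABELS and compare_label are made up for illustration and do not appear in the answer above.

```python
LABELS = ("smaller", "equal", "bigger")

def compare_label(a, b):
    # (a > b) + (a >= b) evaluates to:
    #   0 when a < b   (False + False)
    #   1 when a == b  (False + True)
    #   2 when a > b   (True + True)
    return LABELS[(a > b) + (a >= b)]

print(compare_label('a', 'c'))   # smaller
print(compare_label('a', 'a'))   # equal
print(compare_label('c', 'a'))   # bigger
```

An explicit if/elif/else is arguably easier to read; the indexing form mainly earns its place inside a comprehension.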
Your code is good, there are just a few things to improve.
No need to compare the ord, you can compare characters directly.
Instead of printing, save each step in a list
You can use enumerate to enumerate iterables:
out = []
for n, (x, y) in enumerate(pairs):
    if values[x][n] > values[y][n]:
        out.append('bigger')
    elif values[x][n] < values[y][n]:
        out.append('smaller')
    else:
        out.append('equal')
out
Output:
>>> out
['smaller', 'equal', 'bigger']
NB. I am not commenting on the overall logic, as what you ultimately want to do was not made explicit.

Python, getting the last element with slice() object

How do I create a slice() object so that it includes the last element of a list/string?
s = 'abcdef'
s[slice(2,4)]
works fine.
Say I wanted to get elements from second to the end, the equivalent of s[2:]
s[slice(2)] # only gives first two elements, argument is interpreted as the end of the range
s[slice(2,)] # same as above
s[slice(2, -1)] # gives a range from second to the end excluding the last element
s[slice(2, 0)] # gives empty as expected, since end of range before the start
I can get specifically the last element with slice(-1, -2, -1), but this won't work correctly for more than one element.
If you want to include the last element you can do that in the following two ways :
s[slice(2,6)]
or replace 6 with len(s)
Or you could also do:
s[slice(2,None)]
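As a quick check of the None form, this sketch compares the slice object with the literal s[2:] syntax. The variable name to_end is made up for the example.

```python
s = 'abcdef'

# None as the stop value means "up to and including the last element"
to_end = slice(2, None)

print(s[to_end])                # cdef
print(s[to_end] == s[2:])       # True

# slice.indices() shows the concrete (start, stop, step) for a given length
print(to_end.indices(len(s)))   # (2, 6, 1)
```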
You can test it with the magic method __getitem__. The last element can be obtained with slice(-1, None, None):
s = 'abcdef'
class A:
    def __getitem__(self, v):
        print(v)
a = A()
a[-1:]
print("s[-1:] = ", s[-1:])
print("s[slice(-1, None, None)] = ", s[slice(-1, None, None)])
Prints:
slice(-1, None, None)
s[-1:] = f
s[slice(-1, None, None)] = f
Python sequences, including lists and strings, support indexing. Any element can be accessed using a zero-based index. If the index is negative, counting starts from the end. Since we want the last element, use -1 as the index.
So you can just use:
s= "abcdef"
print(s[-1])
Result:
f

What does list.insert() actually do in Python?

I have code like this:
squares = []
for value in range(1, 5):
    squares.insert(value + 1, value ** 2)
print(squares)
print(squares[0])
print(len(squares))
And the output is :
[1, 4, 9, 16]
1
4
So even if I ask Python to insert '1' at index '2', it inserts it at the first available index. How does insert make that decision?
From the Python3 doc:
list.insert(i, x)
Insert an item at a given position. The first
argument is the index of the element before which to insert, so
a.insert(0, x) inserts at the front of the list, and a.insert(len(a),
x) is equivalent to a.append(x).
What is not mentioned is that you can give an index that is out of range, and Python will then append to the list.
If you dig into the CPython implementation, you find the following in the ins1 function that does the insertion:
if (where > n)
    where = n;
So Python caps your index at the length of the list.
In short, insert is similar to append, except that it allows you to insert a new item at any position in the list, as opposed to just at the end.
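The clamping can be observed directly. The sketch below replays the loop from the question and then tries two out-of-range indexes on a made-up list called items; the negative case goes slightly beyond the snippet quoted above, but list.insert treats an index below -len(list) as 0 in the same spirit.

```python
squares = []
for value in range(1, 5):
    # value + 1 is always larger than len(squares) here,
    # so every insert behaves like append
    squares.insert(value + 1, value ** 2)
print(squares)   # [1, 4, 9, 16]

items = ['a', 'b', 'c']         # hypothetical sample data
items.insert(100, 'z')          # index too large: clamped to len(items)
items.insert(-100, 'start')     # index too small: clamped to 0
print(items)     # ['start', 'a', 'b', 'c', 'z']
```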

What does enumerate() mean?

What does for row_number, row in enumerate(cursor): do in Python?
What does enumerate mean in this context?
The enumerate() function adds a counter to an iterable.
So for each element in cursor, a tuple is produced with (counter, element); the for loop binds that to row_number and row, respectively.
Demo:
>>> elements = ('foo', 'bar', 'baz')
>>> for elem in elements:
...     print(elem)
...
foo
bar
baz
>>> for count, elem in enumerate(elements):
...     print(count, elem)
...
0 foo
1 bar
2 baz
By default, enumerate() starts counting at 0 but if you give it a second integer argument, it'll start from that number instead:
>>> for count, elem in enumerate(elements, 42):
...     print(count, elem)
...
42 foo
43 bar
44 baz
If you were to re-implement enumerate() in Python, here are two ways of achieving that; one using itertools.count() to do the counting, the other manually counting in a generator function:
from itertools import count
def enumerate(it, start=0):
    # return an iterator that adds a counter to each element of it
    return zip(count(start), it)
and
def enumerate(it, start=0):
    count = start
    for elem in it:
        yield (count, elem)
        count += 1
The actual implementation in C is closer to the latter, with optimisations to reuse a single tuple object for the common for i, ... unpacking case. It also uses a standard C integer for the counter until the counter becomes too large, and only then switches to a Python integer object (which is unbounded).
It's a builtin function that returns an object that can be iterated over. See the documentation.
In short, it loops over the elements of an iterable (like a list), as well as an index number, combined in a tuple:
for item in enumerate(["a", "b", "c"]):
    print(item)
prints
(0, 'a')
(1, 'b')
(2, 'c')
It's helpful if you want to loop over a sequence (or other iterable thing), and also want to have an index counter available. If you want the counter to start from some other value (usually 1), you can give that as second argument to enumerate.
I am reading a book (Effective Python) by Brett Slatkin and he shows another way to iterate over a list and also know the index of the current item in the list but he suggests that it is better not to use it and to use enumerate instead.
I know you asked what enumerate means, but when I understood the following, I also understood how enumerate makes iterating over a list while knowing the index of the current item easier (and more readable).
list_of_letters = ['a', 'b', 'c']
for i in range(len(list_of_letters)):
    letter = list_of_letters[i]
    print(i, letter)
The output is:
0 a
1 b
2 c
I also used to do something even sillier before I read about the enumerate function:
i = 0
for n in list_of_letters:
    print(i, n)
    i += 1
It produces the same output.
But with enumerate I just have to write:
list_of_letters = ['a', 'b', 'c']
for i, letter in enumerate(list_of_letters):
    print(i, letter)
As other users have mentioned, enumerate returns an iterator that attaches an incrementing index to each item of an iterable.
So if you have a list say l = ["test_1", "test_2", "test_3"], the list(enumerate(l)) will give you something like this: [(0, 'test_1'), (1, 'test_2'), (2, 'test_3')].
Now, when is this useful? One possible use case is when you want to iterate over items but skip a specific one whose index you know but whose value you do not (because the value is not known at that point).
for index, value in enumerate(joint_values):
    if index == 3:
        continue
    # Do something with the other `value`
So your code reads better because you could also do a regular for loop with range but then to access the items you need to index them (i.e., joint_values[i]).
Although another user mentioned an implementation of enumerate using zip, I think a more pure (but slightly more complex) way without using itertools is the following:
def enumerate(l, start=0):
    return zip(range(start, len(l) + start), l)
Example:
l = ["test_1", "test_2", "test_3"]
print(list(enumerate(l)))
print(list(enumerate(l, 10)))
Output:
[(0, 'test_1'), (1, 'test_2'), (2, 'test_3')]
[(10, 'test_1'), (11, 'test_2'), (12, 'test_3')]
As mentioned in the comments, this approach with range will not work with arbitrary iterables as the original enumerate function does.
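The limitation is easy to reproduce with a generator, which has no len(). In this sketch the range-based version is renamed range_enumerate (a hypothetical name, chosen so it does not shadow the builtin) and letters is a made-up generator.

```python
def range_enumerate(l, start=0):
    # range-based variant from above; it needs len(l)
    return zip(range(start, len(l) + start), l)

def letters():
    # a generator: iterable, but it has no length
    yield 'a'
    yield 'b'

# the builtin works with any iterable
builtin_pairs = list(enumerate(letters(), 10))
print(builtin_pairs)   # [(10, 'a'), (11, 'b')]

# the range-based variant fails as soon as it calls len()
try:
    range_enumerate(letters())
    failed = False
except TypeError:
    failed = True
print(failed)          # True
```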
The enumerate function works as follows:
doc = """I like movie. But I don't like the cast. The story is very nice"""
doc1 = doc.split('.')
for i in enumerate(doc1):
    print(i)
The output is
(0, 'I like movie')
(1, " But I don't like the cast")
(2, ' The story is very nice')
I am assuming that you know how to iterate over elements in some list:
for el in my_list:
    # do something
Sometimes you not only need to iterate over the elements but also need the index for each iteration. One way to do it is:
i = 0
for el in my_list:
    # do something, and use the value of "i" somehow
    i += 1
However, a nicer way is to use the function "enumerate". It receives a list (or any iterable) and returns a list-like object (an iterator that you can loop over) in which each element itself contains 2 elements: the index and the value from the original input:
So if you have
arr = ['a', 'b', 'c']
Then the command
enumerate(arr)
returns something like:
[(0,'a'), (1,'b'), (2,'c')]
Now If you iterate over a list (or an iterable) where each element itself has 2 sub-elements, you can capture both of those sub-elements in the for loop like below:
for index, value in enumerate(arr):
    print(index, value)
which would print out the sub-elements of the output of enumerate.
And in general you can basically "unpack" multiple items from list into multiple variables like below:
idx,value = (2,'c')
print(idx)
print(value)
which would print
2
c
This is the kind of assignment happening in each iteration of that loop with enumerate(arr) as iterable.
The enumerate function yields an element's index and the element's value at the same time. I believe the following code will help explain what is going on.
for i, item in enumerate(initial_config):
    print(f'index{i} value{item}')

Optimize search to find next matching value in a list

I have a program that goes through a list and, for each object, finds the next instance that has a matching value. When it does, it prints out the location of each object. The program runs perfectly fine, but with a large volume of data (~6,000,000 objects in the list) it takes much too long. If anyone could provide insight into how I can make the process more efficient, I would greatly appreciate it.
def search(list):
    original = list
    matchedvalues = []
    count = 0
    for x in original:
        targetValue = x.getValue()
        count = count + 1
        copy = original[count:]
        for y in copy:
            if targetValue == y.getValue():
                print(str(x.getLocation()) + "," + str(y.getLocation()))
                break
Perhaps you can make a dictionary that contains a list of indexes that correspond to each item, something like this:
values = [1,2,3,1,2,3,4]
from collections import defaultdict
def get_matches(x):
    my_dict = defaultdict(list)
    for ind, ele in enumerate(x):
        my_dict[ele].append(ind)
    return my_dict
Result:
>>> get_matches(values)
defaultdict(<type 'list'>, {1: [0, 3], 2: [1, 4], 3: [2, 5], 4: [6]})
Edit:
I added this part, in case it helps:
values = [1,1,1,1,2,2,3,4,5,3]
def get_next_item_ind(x, ind):
    my_dict = get_matches(x)
    indexes = my_dict[x[ind]]
    temp_ind = indexes.index(ind)
    if len(indexes) > temp_ind + 1:
        return indexes[temp_ind + 1]
    return None
Result:
>>> get_next_item_ind(values, 0)
1
>>> get_next_item_ind(values, 1)
2
>>> get_next_item_ind(values, 2)
3
>>> get_next_item_ind(values, 3)
>>> get_next_item_ind(values, 4)
5
>>> get_next_item_ind(values, 5)
>>> get_next_item_ind(values, 6)
9
>>> get_next_item_ind(values, 7)
>>> get_next_item_ind(values, 8)
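Note that get_next_item_ind rebuilds the dictionary on every call. If the next match is needed for every element, one possible variation (a sketch, not part of the answer above; the name next_match_indexes is made up) is a single backward pass that remembers the nearest index already seen for each value:

```python
def next_match_indexes(values):
    """For each position, return the index of the next equal value, or None."""
    nearest_right = {}                 # value -> closest index to the right
    result = [None] * len(values)
    # walk backwards so that nearest_right always describes what lies ahead
    for ind in range(len(values) - 1, -1, -1):
        ele = values[ind]
        result[ind] = nearest_right.get(ele)
        nearest_right[ele] = ind
    return result

values = [1, 1, 1, 1, 2, 2, 3, 4, 5, 3]
print(next_match_indexes(values))
# [1, 2, 3, None, 5, None, 9, None, None, None]
```

This does one dictionary lookup and one store per element, so the work grows linearly with the length of the list. It assumes the compared values are hashable; for the objects in the question, the result of getValue() would be the dictionary key.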
There are a few ways you could make this search more efficient by minimising additional memory use (particularly when your data is BIG):
You can operate directly on the list you are passing in. original = list only binds a second name to the same list, and copy = original[count:] is not needed either.
You can use slices of the original list to test against, and enumerate(p) to iterate through the list. You won't need the extra variable count, and enumerate(p) is efficient in Python.
Re-implemented, this would become:
def search(p):
    # iterate over p
    for i, value in enumerate(p):
        # if value occurs more than once, print locations
        # do not re-test values that have already been tested (if value not in p[:i])
        if value not in p[:i] and value in p[(i + 1):]:
            # the last number is the offset of the match within p[(i + 1):]
            print(value, ':', i, p[(i + 1):].index(value))
v = [1,2,3,1,2,3,4]
search(v)
1 : 0 2
2 : 1 2
3 : 2 2
Implementing it this way will only print out the values / locations where a value is repeated (which I think is what you intended in your original implementation).
Other considerations:
More than 2 occurrences of value: If the value repeats many times in the list, then you might want to implement a function to walk recursively through the list. As it is, the question doesn't address this - and it may be that it doesn't need to in your situation.
Using a dictionary: I completely agree with Akavall above; dictionaries are a great way of looking up values in Python, especially if you need to look up values again later in the program. This works best if you construct a dictionary instead of a list when you originally create the data. But if you are only doing this once, it may cost you more time to construct the dictionary and query it than simply iterating over the list as described above.
Hope this helps!
