How to create an iterator for a string list


I have a list of string elements, and in the end I want to get this output:
a hello
b hello
c hello
d hello
This is the code I have so far:
list=['a','b','c','d']
class Iterator:
    def __init__(self, start, end):
        self.start=start
        self.end=end
    def __iter__(self):
        return self
    def __next__(self):
        self.start += ' hello'
        if self.start == list[-1]:
            raise StopIteration
        return self.start
if __name__ == '__main__':
    for item in Iterator(list[0], list[-1]):
        print(item)
However, the __next__ method never moves from list[0] to list[1]. Instead, Python keeps appending "hello" to the value of list[0] over and over, and the program never stops, so I am stuck in an infinite loop.
The problems are:
It keeps appending "hello" to list[0] instead of moving on to list[1].
The program never finishes, even though I wrote a condition for stopping.

Your instance of Iterator isn't tied to the list at all; it's irrelevant that you used the list to create the instance; Iterator.__init__ only saw two string values.
__init__ needs a reference to the list itself for use by __next__. Further, hello is something you append to the return value of __next__, not something you need to append to internal state every time you call __next__.
list=['a','b','c','d']
class Iterator:
    def __init__(self, lst):
        self.lst = lst
        self.start = 0
    def __iter__(self):
        return self
    def __next__(self):
        try:
            value = self.lst[self.start]
        except IndexError:
            raise StopIteration
        self.start += 1
        return value + ' hello'
if __name__ == '__main__':
    for item in Iterator(list):
        print(item)
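For comparison, the same output can probably be produced without a hand-written iterator class by using a generator function. This is only a sketch; the names hello_items and letters are made up for illustration and do not come from the question.

```python
def hello_items(items):
    """Yield each element of items with ' hello' appended."""
    for item in items:
        # The suffix is added to the returned value only;
        # the original list is never modified.
        yield item + ' hello'

letters = ['a', 'b', 'c', 'd']
for line in hello_items(letters):
    print(line)
```

The generator stops by itself when the list is exhausted, so no explicit StopIteration is needed.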

Related

How to get Python iterators not to communicate with each other?

Here's a simple iterator through the characters of a string.
class MyString:
    def __init__(self,s):
        self.s = s
        self._ix = 0
    def __iter__(self):
        return self
    def __next__(self):
        try:
            item = self.s[self._ix]
        except IndexError:
            self._ix = 0
            raise StopIteration
        self._ix += 1
        return item
string = MyString('abcd')
iter1 = iter(string)
iter2 = iter(string)
print(next(iter1))
print(next(iter2))
I'm trying to get this iterator to behave the way it should. There are two requirements. First, the __next__ method must raise StopIteration. Second, multiple iterators running at the same time must not interact with each other.
I accomplished objective 1, but need help on objective 2. As of right now the output is:
'a'
'b'
When it should be:
'a'
'a'
Any advice would be appreciated.
Thank you!

MyString acts as its own iterator, much like a file object:
>>> f = open('deleteme', 'w')
>>> iter(f) is f
True
You use this pattern when you want all iterators to affect each other - in this case advancing through the lines of a file.
The other pattern is to use a separate class to iterate much like a list whose iterators are independent.
>>> l = [1, 2, 3]
>>> iter(l) is l
False
To do this, move the _ix indexer to a separate class that references MyString. Have MyString.__iter__ create an instance of the class. Now you have a separate indexer per iterator.
class MyString:
    def __init__(self,s):
        self.s = s
    def __iter__(self):
        return MyStringIter(self)
class MyStringIter:
    def __init__(self, my_string):
        self._ix = 0
        self.my_string = my_string
    def __iter__(self):
        return self
    def __next__(self):
        try:
            item = self.my_string.s[self._ix]
        except IndexError:
            raise StopIteration
        self._ix += 1
        return item
string = MyString('abcd')
iter1 = iter(string)
iter2 = iter(string)
print(next(iter1))
print(next(iter2))

Your question title asks how to get iterators, plural, to not communicate with each other, but you don't have multiple iterators, you only have one. If you want to be able to get distinct iterators from MyString, you can add a copy method:
class MyString:
    def __init__(self,s):
        self.s = s
        self._ix = 0
    def __iter__(self):
        return self
    def __next__(self):
        try:
            item = self.s[self._ix]
        except IndexError:
            self._ix = 0
            raise StopIteration
        self._ix += 1
        return item
    def copy(self):
        return MyString(self.s)
string = MyString('abcd')
iter1 = string.copy()
iter2 = string.copy()
print(next(iter1))
print(next(iter2))

Python Iterator class index overflows on second of for loop run

I defined a class that holds a list of objects, and defined the __iter__ and __next__ methods so that it can be used in a for loop. The collection here is a Deck class that holds a list of Card objects.
Code:
import random
class Card:
    @staticmethod
    def get_ranks():
        return ("A", "K", "Q", "J", "10", "9", "8", "7", "6", "5", "4", "3", "2") # A is highest, 2 is lowest
    @staticmethod
    def get_suites():
        return ("H", "D", "S", "C")
    def __init__(self, suite, rank):
        if suite not in Card.get_suites():
            raise Exception("Invalid suite")
        if rank not in Card.get_ranks():
            raise Exception("Invalid rank")
        self.suite = suite
        self.rank = rank
    def __lt__(self, card2):
        self_rank = Card.get_ranks().index(self.rank)
        card2_rank = Card.get_ranks().index(card2.rank)
        return self_rank > card2_rank
    def __le__(self, card2):
        self_rank = Card.get_ranks().index(self.rank)
        card2_rank = Card.get_ranks().index(card2.rank)
        return self_rank >= card2_rank
    def __gt__(self, card2):
        self_rank = Card.get_ranks().index(self.rank)
        card2_rank = Card.get_ranks().index(card2.rank)
        return self_rank < card2_rank
    def __ge__(self, card2):
        self_rank = Card.get_ranks().index(self.rank)
        card2_rank = Card.get_ranks().index(card2.rank)
        return self_rank <= card2_rank
    def __eq__(self, card2):
        self_rank = Card.get_ranks().index(self.rank)
        card2_rank = Card.get_ranks().index(card2.rank)
        return self_rank == card2_rank
    def __ne__(self, card2):
        self_rank = Card.get_ranks().index(self.rank)
        card2_rank = Card.get_ranks().index(card2.rank)
        return self_rank != card2_rank
    def __str__(self):
        return(self.rank + self.suite)
    def __repr__(self):
        return str(self)
class Deck:
    def __init__(self):
        self.contents = [Card(suite, rank) for suite in Card.get_suites() for rank in Card.get_ranks()]
        random.shuffle(self.contents)
        self.index = 0
    def __len__(self):
        return len(self.contents)
    def __iter__(self):
        return self
    def __next__(self):
        if self.index == len(self.contents):
            raise StopIteration
        item = self.contents[self.index]
        self.index += 1
        return item
    def pick_card(self):
        choice = random.randrange(len(self))
        card = self.contents.pop(choice)
        return card
    def return_card_and_shuffle(self, card):
        self.contents.append(card)
        random.shuffle(self.contents)
    def __str__(self):
        dstr = ''
        for card in self:
            dstr += str(card) + ", "
        return "{} cards: ".format(len(self)) + dstr[:-2]
def deal_bookends(deck):
    card1 = deck.pick_card()
    card2 = deck.pick_card()
    if card1 > card2:
        temp = card1
        card1 = card2
        card2 = temp
    return (card1, card2)
if __name__ == '__main__':
    deck = Deck()
    for _ in range(3):
        c1, c2 = deal_bookends(deck)
        print("We have {} and {}".format(c1, c2))
        print(deck)
        deck.return_card_and_shuffle(c1)
        print(deck)
        print(deck.contents[-4:])
        deck.return_card_and_shuffle(c2)
        print(deck)
        print(deck.contents[-4:])
On running, I get the following error:
We have 8H and KH
50 cards: 9H, 8C, AC, 7C, 6H, 2S, 2D, 5C, 10H, 5H, JS, 5S, KD, JH, JC, QS, 2H, 3H, 3S, 3D, 4C, 4H, AD, KS, JD, QH, 10D, 6S, 5D, 8D, 3C, 6C, 7D, AS, 7H, AH, 9S, 10C, QC, QD, 7S, 2C, KC, 8S, 4D, 4S, 6D, 10S, 9D, 9C
51 cards: QS
[7D, 5C, 10H, QS]
52 cards: 10C
[KC, 3S, 9H, 10C]
We have 2C and QD
Traceback (most recent call last):
File "playing_cards.py", line 106, in <module>
print(deck)
File "playing_cards.py", line 88, in __str__
for card in self:
File "playing_cards.py", line 73, in __next__
item = self.contents[self.index]
IndexError: list index out of range
It seems the iteration doesn't work on the second run of the for loop, after I push the card object back into the list. How do I solve this while keeping the pop/push functionality?
Edit: The self.index is at 50 after the first call to print(). When the card is added back to list, index remains at 50, whereas the deck length is now 51 cards. So in the second (and third) call to print the last card is printed instead of the entire deck. Then subsequently error is raised.
I think I have read the documentation wrong here. My question is should I reset the index at the StopIteration bit. Is that the correct way to do this, or is the index supposed to reset on its own?

Note: If you are trying to learn how iterators work by implementing your own, then the above advice holds. If you just want to make your Deck iterable, you can just do this in Deck:
def __iter__(self):
    return iter(self.contents) # lists are already iterable, so borrow the list's iterator
Even better, if you want your deck to behave like a list (iterating, indexing, slicing, removal), you can just extend list.
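A rough sketch of what extending list could look like is shown here. It assumes cards are plain strings such as "10H" to stay self-contained (the question uses Card objects), and it keeps the question's method names pick_card and return_card_and_shuffle.

```python
import random

class Deck(list):
    """A deck that inherits iteration, indexing, slicing and len() from list."""

    def __init__(self):
        # Assumption: cards are plain strings here, not Card objects.
        suites = ("H", "D", "S", "C")
        ranks = ("A", "K", "Q", "J", "10", "9", "8", "7", "6", "5", "4", "3", "2")
        super().__init__(rank + suite for suite in suites for rank in ranks)
        random.shuffle(self)

    def pick_card(self):
        # Remove and return a random card.
        return self.pop(random.randrange(len(self)))

    def return_card_and_shuffle(self, card):
        self.append(card)
        random.shuffle(self)

deck = Deck()
card = deck.pick_card()
print(len(deck))            # one card fewer than a full deck
deck.return_card_and_shuffle(card)
print(len(list(deck)))      # the deck can be looped over again and again
```

Because list hands out a new iterator on every loop, there is no index to reset after cards are removed or returned.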
Learning how iterators work:
The problem you have here is you are conflating a collection with an iterator. A collection should hold a group of items. Your Deck is a collection. A collection is iterable, which means I can do for x in collection on it. When we do for x in collection, Python actually does for x in iter(collection), which turns the collection into an iterator.
You want your iterator and collection to be separate. If your collection were its own iterator, then you could only have one iterator over it at a time (itself). Also note that iterators should only be used once. By doing self.index = 0 in your __iter__, you are making your iterator (Deck) reusable.
Consider the following:
nums = [1, 2, 3]
for i in nums:
    for j in nums:
        print(i, j)
We expect this to return:
1 1
1 2
1 3
2 1
2 2
2 3
3 1
3 2
3 3
Note that each time the inner loop iterates over the whole collection. If nums was its own iterator, then we'd have some issues:
# Here internally nums sets the current index as 0
for i in nums:
    # Here internally nums sets the current index as 0 again
    for j in nums:
        print(i, j)
    # Once this inner loop finishes the current index is 3 (past the end).
    # But that is also the index for the outer loop, so the
    # outer loop ends too
Unexpected output:
1 1
1 2
1 3
The solution is Deck.__iter__ should return a new object called DeckIterator, which keeps track of its own index. DeckIterator.__iter__ should return self (as required by the docs), but that is just a detail. By doing this you enable multiple iterations over the deck at once that work as expected.
So a minimal example of this would be:
class Deck:
    # ... snip ...
    def __iter__(self):
        return DeckIterator(self.contents)
class DeckIterator:
    def __init__(self, cards):
        self.index = 0
        self.cards = cards
    def __iter__(self):
        return self
    def __next__(self):
        if self.index >= len(self.cards):
            # We've gotten the end of the deck/list
            raise StopIteration
        item = self.cards[self.index]
        self.index += 1
        return item
Also, if you don't believe me about this list as its own iterator, here's a list that exhibits this bad behavior:
class BadList(list):
    def __iter__(self):
        self._current_index = 0
        return self
    def __next__(self):
        print(f'current index is {self._current_index}', end='')
        if self._current_index >= len(self):
            print(' which is the end, so ending iteration')
            raise StopIteration
        item = self[self._current_index]
        print(f' so returning value {item}')
        self._current_index += 1
        return item
# Using letters instead of numbers so difference between indices
# and items is more clear
letters = BadList('abc')
for i in letters:
    for j in letters:
        print(i, j)
Output from it:
current index is 0 so returning value a
current index is 0 so returning value a
a a
current index is 1 so returning value b
a b
current index is 2 so returning value c
a c
current index is 3 which is the end, so ending iteration
current index is 3 which is the end, so ending iteration

Not sure how you got there, but you are beyond the length of your list. Suggest you compare for >= length of the list like:
def __next__(self):
    if self.index >= len(self.contents):
        raise StopIteration
    .....

Make the following changes,
def __iter__(self):
    self.index = 0
    return self
So that each time __iter__ is called, index is reset.
The reason you're getting this error is that once you iterate through the deck, at the end of the iteration, self.index == len(self.contents).
The next time you iterate, the self.index should be reset to 0.
I made the above change and it worked for me.

Your specific issue at the moment is caused by the check in your __next__ method not being general enough to detect all situations where you've iterated past the last value in self.contents. Since self.contents can change, you need to use a greater-than-or-equal test:
if self.index >= len(self.contents):
This will fix the current issue, but you'll still have other problems, since your Deck can only be iterated once. That's because you've implemented the iterator protocol, rather than the iterable protocol. These are easy to confuse, so don't feel bad if you don't understand the difference immediately.
An iterable is any object with an __iter__ method that returns an iterator. Some iterables return different iterators each time they're called, so you can iterate on them multiple times.
An iterator implements a __next__ method, which yields the next value or raises StopIteration. An iterator must also have an __iter__ method, which returns itself, which allows an iterator to be used wherever an iterable is expected, though it can only be iterated on once.
For your Deck, it probably makes sense to implement the iterable protocol, and return a separate iterator each time __iter__ is called. It's only rarely useful to implement your own iterator type, but if you want to test your knowledge of how the different protocols fit together, it can be interesting:
class Deck:
    def __init__(self):
        self.contents = [Card(suite, rank)
                         for suite in Card.get_suites()
                         for rank in Card.get_ranks()]
        random.shuffle(self.contents)
        # no index here
    def __iter__(self):
        return DeckIterator(self)
    # other methods, but no __next__
class DeckIterator:
    def __init__(self, deck):
        self.deck = deck
        self.index = 0
    def __iter__(self):
        return self
    def __next__(self):
        # >= (not >): an index equal to the length is already past the last card
        if self.index >= len(self.deck):
            raise StopIteration
        value = self.deck.contents[self.index]
        self.index += 1
        return value
A more practical approach is to have Deck.__iter__ borrow some convenient iterator type. For instance, you could do return iter(self.contents) and you'd get an iterator that works exactly like the custom version above. Another option is to make __iter__ a generator function, since generator objects are iterators. This can be convenient if you need to do just a little bit of processing on each item as you iterate over it.
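Both of these options might look roughly like the sketch below. The Deck here is simplified to take its cards as an argument, and LabeledDeck is a hypothetical subclass that exists only to show the generator variant; neither is the question's exact class.

```python
class Deck:
    def __init__(self, cards):
        # Simplified: cards are passed in instead of being built from Card objects.
        self.contents = list(cards)

    def __iter__(self):
        # Borrow the list's own iterator; each call returns a fresh one.
        return iter(self.contents)

class LabeledDeck(Deck):
    def __iter__(self):
        # Generator version: handy when each item needs a little processing.
        for position, card in enumerate(self.contents, start=1):
            yield f"{position}: {card}"

deck = Deck(["AH", "KD", "2C"])
pairs = [(a, b) for a in deck for b in deck]   # nested loops stay independent
print(len(pairs))
print(list(LabeledDeck(["AH", "KD"])))
```

In both variants the deck itself holds no iteration state, so adding or removing cards between loops cannot leave a stale index behind.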

TypeError: object takes no parameters

I'm trying to create a code that utilizes the __iter__() method as a generator, but I am getting an error saying:
TypeError: object() takes no parameters.
Additionally, I am unsure whether my yield function should be called within try: or within the main() function
I am fairly new to Python and coding, so any suggestions and advice would be greatly appreciated so that I can learn. Thanks!
class Counter(object):
    def __init__(self, filename, characters):
        self._characters = characters
        self.index = -1
        self.list = []
        f = open(filename, 'r')
        for word in f.read().split():
            n = word.strip('!?.,;:()$%')
            n_r = n.rstrip()
            if len(n) == self._characters:
                self.list.append(n)
    def __iter(self):
        return self
    def next(self):
        try:
            self.index += 1
            yield self.list[self.index]
        except IndexError:
            raise StopIteration
        f.close()
if __name__ == "__main__":
    for word in Counter('agency.txt', 11):
        print "%s' " % word

Use yield for function __iter__:
class A(object):
    def __init__(self, count):
        self.count = count
    def __iter__(self):
        for i in range(self.count):
            yield i
for i in A(10):
    print i
In your case, __iter__ maybe looks something like this:
def __iter__(self):
    for i in self.list:
        yield i
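Put together, a Python 3 version of the class with a generator __iter__ might look like the sketch below. To keep it self-contained it takes the text directly instead of a filename, and the sample sentence is invented; both are assumptions rather than part of the question.

```python
class Counter:
    """Collects the words of a given length from a piece of text."""

    def __init__(self, text, characters):
        self._characters = characters
        self.words = []
        for word in text.split():
            stripped = word.strip('!?.,;:()$%')
            if len(stripped) == self._characters:
                self.words.append(stripped)

    def __iter__(self):
        # A generator function: every call returns a fresh generator,
        # so no index attribute and no separate next method are needed.
        for word in self.words:
            yield word

sample = "The agency said: cooperation (not competition) wins!"
for word in Counter(sample, 11):
    print(word)
```

With this layout the yield lives only in __iter__; neither a try block nor the main section needs one.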

You mistyped the declaration of the __init__ method, you typed:
def __init
Instead of:
def __init__

Implementation of a Trie in Python

I implemented a Trie as a class in Python. The search and insert functions are clear, but now I am trying to write the __str__ method so that I can print the trie on the screen. My function doesn't work, though!
class Trie(object):
    def __init__(self):
        self.children = {}
        self.val = None
    def __str__(self):
        s = ''
        if self.children == {}: return ' | '
        for i in self.children:
            s = s + i + self.children[i].__str__()
        return s
    def insert(self, key, val):
        if not key:
            self.val = val
            return
        elif key[0] not in self.children:
            self.children[key[0]] = Trie()
        self.children[key[0]].insert(key[1:], val)
Now if I create a Trie object:
tr = Trie()
tr.insert('hallo', 54)
tr.insert('hello', 69)
tr.insert('hellas', 99)
When I now print the Trie, the problem is that the entries hello and hellas are not printed completely.
print tr
hallo | ellas | o
How can I solve that problem?

Why not have str actually dump out the data in the format that it is stored:
def __str__(self):
    if self.children == {}:
        s = str(self.val)
    else:
        s = '{'
        comma = False
        for i in self.children:
            if comma:
                s = s + ','
            else:
                comma = True
            s = s + "'" + i + "':" + self.children[i].__str__()
        s = s + '}'
    return s
Which results in:
{'h':{'a':{'l':{'l':{'o':54}}},'e':{'l':{'l':{'a':{'s':99},'o':69}}}}}

There are several issues you're running into. The first is that if you have several children at the same level, you'll only be prefixing one of them with the initial part of the string, and just showing the suffix of the others. Another issue is that you're only showing leaf nodes, even though you can have terminal values that are not at a leaf (consider what happens when you use both "foo" and "foobar" as keys into a Trie). Finally, you're not outputting the values at all.
To solve the first issue, I suggest using a recursive generator that does the traversal of the Trie. Separating the traversal from __str__ makes things easier since the generator can simply yield each value we come across, rather than needing to build up a string as we go. The __str__ method can assemble the final result easily using str.join.
For the second issue, you should yield the current node's key and value whenever self.val is not None, rather than only at leaf nodes. As long as you don't have any way to remove values, all leaf nodes will have a value, but we don't actually need any special casing to detect that.
And for the final issue, I suggest using string formatting to make a key:value pair. (I suppose you can skip this if you really don't need the values.)
Here's some code:
def traverse(self, prefix=""):
    if self.val is not None:
        yield "{}:{}".format(prefix, self.val)
    for letter, child in self.children.items():
        yield from child.traverse(prefix + letter)
def __str__(self):
    return " | ".join(self.traverse())
If you're using a version of Python before 3.3, you'll need to replace the yield from statement with an explicit loop to yield the items from the recursive calls:
for item in child.traverse(prefix + letter):
    yield item
Example output:
>>> t = Trie()
>>> t.insert("foo", 5)
>>> t.insert("bar", 10)
>>> t.insert("foobar", 100)
>>> str(t)
'bar:10 | foo:5 | foobar:100'

You could go with a simpler representation that just provides a summary of what the structure contains:
class Trie:
    def __init__(self):
        self.__final = False
        self.__nodes = {}
    def __repr__(self):
        return 'Trie<len={}, final={}>'.format(len(self), self.__final)
    def __getstate__(self):
        return self.__final, self.__nodes
    def __setstate__(self, state):
        self.__final, self.__nodes = state
    def __len__(self):
        return len(self.__nodes)
    def __bool__(self):
        return self.__final
    def __contains__(self, array):
        try:
            return self[array]
        except KeyError:
            return False
    def __iter__(self):
        yield self
        for node in self.__nodes.values():
            yield from node
    def __getitem__(self, array):
        return self.__get(array, False)
    def create(self, array):
        self.__get(array, True).__final = True
    def read(self):
        yield from self.__read([])
    def update(self, array):
        self[array].__final = True
    def delete(self, array):
        self[array].__final = False
    def prune(self):
        for key, value in tuple(self.__nodes.items()):
            if not value.prune():
                del self.__nodes[key]
        if not len(self):
            self.delete([])
        return self
    def __get(self, array, create):
        if array:
            head, *tail = array
            if create and head not in self.__nodes:
                self.__nodes[head] = Trie()
            return self.__nodes[head].__get(tail, create)
        return self
    def __read(self, name):
        if self.__final:
            yield name
        for key, value in self.__nodes.items():
            yield from value.__read(name + [key])

Instead of your current strategy for printing, I suggest the following strategy instead:
Keep a list of all characters in order that you have traversed so far. When descending to one of your children, push its character on the end of its list. When returning, pop the end character off of the list. When you are at a leaf node, print the contents of the list as a string.
So say you have a trie built out of hello and hellas. This means that as you descend to hello, you build a list h, e, l, l, o, and at the leaf node you print hello, return once to get (hell), push a, s and at the next leaf you print hellas. This way you re-print letters earlier in the tree rather than having no memory of what they were and missing them.
(Another possibility is to just descend the tree and, whenever you reach a leaf node, walk back up through your parent, your parent's parent, and so on, keeping track of the letters you encounter, then reverse that list and print it. But that may be less efficient.)
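The push/pop strategy could be sketched as follows. The method name words and the nested helper descend are invented for this example, and the sketch reports a key wherever a value is stored rather than only at leaf nodes, which is a small deviation from the description above.

```python
class Trie:
    def __init__(self):
        self.children = {}
        self.val = None

    def insert(self, key, val):
        node = self
        for letter in key:
            node = node.children.setdefault(letter, Trie())
        node.val = val

    def words(self):
        """Collect every stored key while tracking the current path in a list."""
        found = []
        path = []

        def descend(node):
            if node.val is not None:
                found.append(''.join(path))
            for letter, child in node.children.items():
                path.append(letter)   # push on the way down
                descend(child)
                path.pop()            # pop on the way back up

        descend(self)
        return found

tr = Trie()
for key, val in [('hallo', 54), ('hello', 69), ('hellas', 99)]:
    tr.insert(key, val)
print(' | '.join(sorted(tr.words())))
```

Because the shared prefix stays on the list while sibling branches are visited, hello and hellas both come out in full.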

list(y) behavior is "wrong" on first call

I have an iterator with a __len__ method defined. Questions:
If you call list(y) and y has a __len__ method defined, then __len__ is called.
1) Why?
In my output, you will see that the len(list(y)) is 0 on the first try. If you look at the list output, you will see that on the first call, I receive an empty list, and on the second call I receive the "correct" list.
2) Why is it returning a list of length zero at all?
3) Why does the list length correct itself on all subsequent calls?
Also notice that calling "enumerate" is not the issue. Class C does the same thing but using a while loop and calls to next().
Code:
showcalls = False
class A(object):
    _length = None
    def __iter__(self):
        if showcalls:
            print "iter"
        self.i = 0
        return self
    def next(self):
        if showcalls:
            print "next"
        i = self.i + 1
        self.i = i
        if i > 2:
            raise StopIteration
        else:
            return i
class B(A):
    def __len__(self):
        if showcalls:
            print "len"
        if self._length is None:
            for i,x in enumerate(self):
                pass
            self._length = i
            return i
        else:
            return self._length
class C(A):
    def __len__(self):
        if showcalls:
            print "len"
        if self._length is None:
            i = 0
            while True:
                try:
                    self.next()
                except StopIteration:
                    self._length = i
                    return i
                else:
                    i += 1
        else:
            return self._length
if __name__ == '__main__':
    a = A()
    print len(list(a)), len(list(a)), len(list(a))
    print
    b = B()
    print len(list(b)), len(list(b)), len(list(b))
    print
    c = C()
    print len(list(c)), len(list(c)), len(list(c))
Output:
2 2 2
0 2 2
0 2 2

If you call list(y) and y has a __len__ method defined, then __len__ is called. Why?
Because it's faster to build the resulting list with the final length, if known from the start, than to begin with an empty list and append one item at a time. And __len__ is, and must be, 100% guaranteed to be reliable.
IOW, do not implement special methods like __len__ if and when you can't return a reliable value.
As for the second question, your implementations of __len__ are broken because they consume the iterator (and don't return it to its pristine state) -- so they leave no items for following .next calls, so the list constructor gets a StopIteration and decides that your __len__ was just flaky (it's unfortunately flakier than poor list can guess...!-).
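One way to keep __len__ reliable is to compute it from stored data instead of walking the iterator. The sketch below is written for Python 3, and the class name Numbers and its limit parameter are hypothetical; it is one possible design, not the only fix.

```python
class Numbers:
    """An iterable whose __len__ never touches iteration state."""

    def __init__(self, limit):
        self._limit = limit

    def __iter__(self):
        # Generator function: every call starts a fresh, independent iteration.
        for i in range(1, self._limit + 1):
            yield i

    def __len__(self):
        # Derived from stored data, so it has no side effects on iteration.
        return self._limit

y = Numbers(2)
print(len(list(y)), len(list(y)), len(list(y)))
```

Since __len__ consumes nothing, the list constructor finds every item still waiting, on the first call as well as on later ones.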
