Thinks counter is a string, not an integer - python

I'm trying to extract the first few elements of a tab-delimited file using the following:
words = []
name_elements = []
counter = 0
for line in f:
    words = line.split()
    for element in words:
        counter = counter + 1
        if words[element].isupper():
            name_elements = words[0:counter-1]
print type(counter)
When I run this code, I get this error:
TypeError: list indices must be integers, not str
logout
Even though when I run type(counter) it says it's an integer.
What's the issue?

You are trying to index words with element. element is a string; it is already the item you wanted to get.
The for loop is giving you each element from words in turn, assigning it to the element variable. element is not an integer index into the words list.
Note that your counter is going to go out of bounds; if you want to have an index into the words list along with the element, use the enumerate() function. You are also replacing the name_elements list with a slice from words; perhaps you wanted to extend the list instead:
for line in f:
    words = line.split()
    for counter, element in enumerate(words):
        if element.isupper():
            # enumerate() counts from 0, so words[:counter] is everything before element
            name_elements.extend(words[:counter])
although it is not clear exactly what you wanted to do with the words list in this case.
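As a rough illustration of how enumerate() pairs each integer index with its element, here is a minimal sketch. It assumes the goal is to collect the words that come before the first all-uppercase word on a line; the function name and the sample line are made up for this example.

```python
def words_before_first_upper(line):
    """Return the words that come before the first all-uppercase word."""
    words = line.split()
    for index, element in enumerate(words):
        # index is an int position; element is the string at that position
        if element.isupper():
            return words[:index]
    return []  # no all-uppercase word on this line


sample = "John Smith NYC 42"  # hypothetical input line
print(words_before_first_upper(sample))  # ['John', 'Smith']
```

Indexing with element instead of index (words[element]) is what raises the TypeError from the question.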

Related

List index out of range in indented for loops

I am trying to make a program that searches a list of strings, using a second list. The first list is a list of random jumbled up letters and the second is a list of words. The program is supposed to detect whether each word in the second list can be found in any of the strings in the first. If it can, then the word that was found is added to an empty, third list.
counter = 0
for a in FirstList:
    for b in SecondList:
        if SecondList[counter] in FirstList[counter]:
            ThirdList += [SecondList[counter]]
        counter += 1
        if counter >= len(SecondList):
            break
return ThirdList
However, it throws up an error on the first if statement claiming that the list index is out of range. I don't quite understand what I am missing, because I am not editing the contents of any of the lists that I am iterating through, which was the cause of this error in other posts about it.
The issue here is that you are using a counter that should break at the size of SecondList, but if FirstList is smaller, counter will be out of bounds for FirstList.
I think this may be closer to what you are trying to achieve:
ThirdList = [word for word in SecondList for string in FirstList if word in string]
This is a list comprehension which is running through both lists and outputting any values from the second list which appear in any of the values in the first list.
So for example with the code:
FirstList = ["aadada", "asdtest", "hasdk"]
SecondList = ["test", "data"]
ThirdList = [word for word in SecondList for string in FirstList if word in string]
print(ThirdList)
You would receive the output ['test'], as "test" is found in one of the strings in FirstList.
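One thing to be aware of: the nested comprehension emits a word once for every string that contains it. The sketch below uses sample data invented for this example and shows a variant with any() that keeps each word at most once.

```python
first_list = ["aadada", "asdtest", "xxtestxx"]  # hypothetical sample data
second_list = ["test", "data"]

# The nested comprehension emits a word once per matching string.
nested = [word for word in second_list for string in first_list if word in string]

# any() stops at the first match, so each word appears at most once.
unique = [word for word in second_list if any(word in string for string in first_list)]

print(nested)  # ['test', 'test']
print(unique)  # ['test']
```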
You should not index both lists with a single counter. Your counter is exceeding the number of elements in the lists. If I understand correctly what you are trying to do, you could use the following code:
ThirdList = []
for word in SecondList:
    for random_word in FirstList:
        if word in random_word and word not in ThirdList:
            ThirdList.append(word)
return ThirdList
You can remove the second condition of the if statement if you have no problem with duplicate elements in ThirdList.

Python list slicing error

I am using this code: https://pastebin.com/mQkpxdeV
wordlist[overticker] = thesentence[0:spaces]
in this function:
def mediumparser(inpdat3):
    spaceswitch = 0
    overticker = 0
    thesentence = "this sentence is to count the spaces"
    wordlist = []
    while spaceswitch == 0:
        spaces = thesentence.find(' ')
        wordlist[overticker] = thesentence[0:spaces] # this is where we save the words at
        thesentence = thesentence[spaces:len(thesentence)] # this is where we change the sentence for the
        # next run-through
        print('here2')
        print(wordlist)
I can't figure out why it just keeps saying list index out of range.
The program seems to work but it gives an error, what am I doing wrong? I have looked through this book by Mark Lutz for an answer and I can't find one.
The "list index out of range" problem is never with list slicing, as shown in this simple test:
>>> l = []
>>> l[0:1200]
[]
>>> l[-400:1200]
[]
so the problem is with your left hand assignment wordlist[overticker] which uses a list access, not slicing, and which is subject to "list index out of range".
Just those 4 lines of your code are enough to find the issue:
wordlist = []
while spaceswitch == 0:
    spaces = thesentence.find(' ')
    wordlist[overticker] = ...
wordlist is just empty. You have to extend/append the list (or use a dictionary if you want to dynamically create items according to a key)
Instead of assigning to wordlist[overticker] while wordlist is an empty list, you need to use append, since an empty list has no index 0 to assign to.
wordlist.append(thesentence[0:spaces])
Alternatively, you can pre-initialize the list with 20 empty strings.
wordlist = [""]*20
wordlist[overticker] = thesentence[0:spaces]
P.S.
wordlist[overticker] is called indexing, wordlist[1:10] is called slicing.
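For reference, here is a minimal sketch of the loop from the question rewritten around append(). It is not the asker's full program; the function name is made up, and it assumes words are separated by single spaces.

```python
def split_on_spaces(sentence):
    """Split a sentence on single spaces using find() and append()."""
    wordlist = []
    while True:
        spaces = sentence.find(' ')
        if spaces == -1:
            # find() returns -1 when no space is left: store the last word and stop
            wordlist.append(sentence)
            break
        wordlist.append(sentence[:spaces])   # slicing never raises IndexError
        sentence = sentence[spaces + 1:]     # skip past the space itself
    return wordlist


print(split_on_spaces("this sentence is to count the spaces"))
# ['this', 'sentence', 'is', 'to', 'count', 'the', 'spaces']
```

Note that the original loop sliced from spaces rather than spaces + 1, which keeps the leading space in the remaining sentence, so find(' ') would return 0 on every later pass.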

How to reference the next item in a list in Python?

I'm fairly new to Python, and am trying to put together a Markov chain generator. The bit that's giving me problems is focused on adding each word in a list to a dictionary, associated with the word immediately following.
def trainMarkovChain():
    """Trains the Markov chain on the list of words, returning a dictionary."""
    words = wordList()
    Markov_dict = dict()
    for i in words:
        if i in Markov_dict:
            Markov_dict[i].append(words.index(i+1))
        else:
            Markov_dict[i] = [words.index(i+1)]
    print Markov_dict
wordList() is a previous function that turns a text file into a list of words. Just what it sounds like. I'm getting an error saying that I can't concatenate strings and integers, referring to words.index(i+1), but if that's not how to refer to the next item then how is it done?
You can also do it as:
for a,b in zip(words, words[1:]):
This will assign a as an element in the list and b as the next element.
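A minimal sketch of how the zip() pairing could be used to build the dictionary from the question. The function name and the sample sentence are invented for this example, and the list of words is passed in as an argument rather than read from wordList().

```python
def train_markov_chain(words):
    """Map each word to the list of words that immediately follow it."""
    chain = {}
    # zip() stops at the shorter input, so the last word never looks past the end
    for current, following in zip(words, words[1:]):
        chain.setdefault(current, []).append(following)
    return chain


words = "the cat saw the dog".split()  # hypothetical training text
print(train_markov_chain(words))
# {'the': ['cat', 'dog'], 'cat': ['saw'], 'saw': ['the']}
```

With this approach the final word only becomes a key if it also occurs earlier in the text.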
The following code, simplified a bit, should produce what you require. I'll elaborate more if something needs explaining.
words = 'Trains the Markov chain on the list of words, returning a dictionary'.split()
chain = {}
for i, word in enumerate(words):
    # ensure there's a record
    next_words = chain.setdefault(word, [])
    # break on the last word
    if i + 1 == len(words):
        break
    # append the next word
    next_words.append(words[i + 1])
print(words)
print(chain)
assert len(chain) == 11
assert chain['the'] == ['Markov', 'list']
assert chain['dictionary'] == []
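def markov_chain(words):
    markov = {}
    for index, word in enumerate(words):
        # skip the last word, which has no successor
        if index < len(words) - 1:
            markov[word] = words[index + 1]
    return markov
The code above takes a list as input and returns a dictionary that maps each word to the word that follows it. If a word occurs more than once, only its last successor is kept.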
You can use loops to get that, but it's actually a waste to have to put the rest of your code in a loop when you only need the next element.
There are two nice options to avoid this:
Option 1 - if you know the next index, just call it:
my_list[my_index]
Most of the time you won't know the index, but you might still want to avoid the for loop.
Option 2 - use iterators:
my_iterator = iter(my_list)
next(my_iterator) # no loop required
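A small sketch of the iterator option, using a made-up two-item list. The optional second argument of next() is a default that is returned when the iterator is exhausted, which avoids a StopIteration exception.

```python
my_list = ["alpha", "beta"]  # hypothetical data
my_iterator = iter(my_list)

first = next(my_iterator)        # 'alpha'
second = next(my_iterator)       # 'beta'
third = next(my_iterator, None)  # iterator is exhausted, so the default is returned

print(first, second, third)  # alpha beta None
```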

replace wrongly spelled words,python

I have a set of tuples of the form
ref_set = [(a1,b1),(a2,b2),(a3,b3)...]
and so on. I need to compare words from a list of sentences and check if it is equal to a1, a2, a3.. if word == a1, replace it with b1. If word == a2, replace with b2 and so on.
Here's my code:
def replace_words(x):                    #function
    for line in x:                       #iterate over lines in list
        for word in line.split():        #iterate over words in list
            for i,j in ref_set:          #iterate over each tuple
                if word == i:            #if word is equal to first element
                    word = j             #replace it with 2nd one.
I'm getting None as a result; I know I need to return something.
Don't use a list of tuples. Use a dictionary:
ref_map = dict(ref_set)
for line in x:
    line = ' '.join([ref_map.get(word, word) for word in line.split()])
Otherwise you have an NxM loop: every word in your text is compared against every tuple in ref_set, so each extra word adds another full pass over ref_set.
Your code only rebinds the name word; it does not replace the word in the line. The list comprehension above produces a new line value instead. This doesn't replace the line in x though; you need another list comprehension for that:
x = [' '.join([ref_map.get(word, word) for word in line.split()]) for line in x]
It appears from the comments that x is not a list of sentences but rather one sentence. In that case you just process that one line with one list comprehension, as in the loop iteration over x above:
def corrected(line):
    return ' '.join([ref_map.get(word, word) for word in line.split()])
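To make the dictionary lookup concrete, here is a small self-contained sketch. The misspellings in ref_set are invented for this example, and it assumes words are separated by whitespace with no punctuation attached.

```python
ref_set = [("teh", "the"), ("recieve", "receive")]  # hypothetical (wrong, right) pairs
ref_map = dict(ref_set)


def corrected(line):
    # get(word, word) returns the replacement if there is one, else the word itself
    return ' '.join(ref_map.get(word, word) for word in line.split())


print(corrected("teh cat will recieve teh prize"))
# the cat will receive the prize
```

A word with trailing punctuation, such as "teh," would not match the key "teh" and would be left unchanged.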

Why can't I append a char to an empty list in Python?

In a program I am writing to create a list of words from a list of chars, I am getting a "list index out of range" exception.
def getlist(filename):
    f = open('alice.txt','r')
    charlist = f.read()
    wordlist = []
    done = False
    while(not done):
        j = 0
        for i in range(0,len(charlist)):
            if charlist[i] != ' ' and charlist[i] != '\n':
                wordlist[j] += charlist[i]
            else: j+= 1
        done = i == len(charlist)-1
    return wordlist
So I started playing around with how lists work, and found that:
list = ['cars']
list[0]+= '!'
gives list = ['cars!']
However, with:
list = []
list[0]+= '!'
I get an out of bounds error. Why doesn't it do what seems logical: list= ['!']? How can I solve this? If I must initialize with something, how will I know the required size of the list? Are there any better, more conventional, ways to do what I'm attempting?
['cars'] is a list containing one element. That element is the string 'cars', which contains 4 characters.
list[0] += '!' actually does 3 separate things. The list[0] part selects the element of list at position 0. The += part both concatenates the two strings (like 'cars' + '!' would), and stores the resulting string back in the 0th slot of list.
When you try to apply that to the empty list, it fails at the "selects the element at position 0" part, because there is no such element. You are expecting it to behave as if you had not the empty list, but rather ['']; the list containing one element which is the empty string. You can easily append ! onto the end of an empty string, but in your example you don't have an empty string.
To add to a list, including an empty one, use the append() method:
>>> mylist = []
>>> mylist.append('!')
>>> mylist
['!']
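If you do want to build the words character by character, a rough sketch along these lines avoids indexing into the list at all. The function name is made up; it accumulates each word in a plain string and appends it once a space or newline is reached.

```python
def words_from_chars(charlist):
    """Build a list of words from a string, one character at a time."""
    wordlist = []
    current = ''
    for ch in charlist:
        if ch not in ' \n':
            current += ch             # += on a string that already exists
        elif current:
            wordlist.append(current)  # the word is complete, so store it
            current = ''
    if current:
        wordlist.append(current)      # do not lose the final word
    return wordlist


print(words_from_chars("Alice was\nbeginning  to"))
# ['Alice', 'was', 'beginning', 'to']
```

Because words are only appended when current is non-empty, runs of consecutive spaces do not produce empty strings in the result.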
However, with:
list = []
list[0]+= '!'
I get an out of bounds error. Why doesn't it do what seems logical:
list= ['!']?
Because that isn't logical in Python. To append '!' to list[0], list[0] has to exist in the first place. It will not magically turn into an empty string for you to concatenate the exclamation mark to. In the general case, Python would not have a way to figure out what kind of "empty" element to magic up, anyway.
The append method is provided on lists in order to append an element to the list. However, what you're doing is massively over-complicating things. If all you want is a list consisting of the words in the file, that is as easy as:
def getlist(filename):
    with open(filename) as f:
        return f.read().split()
The error does not come from the += in list[0] += '!'; it comes from reading index 0 of an empty list, which raises an IndexError:
>>> my_list = list()
>>> my_list[0]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
IndexError: list index out of range
>>>
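Also, += here is applied to the element list[0], not to the list itself: for a string it concatenates, and for a number it adds. Python implements x += y by calling the following method where the type defines it, and otherwise falls back to x = x + y.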
__iadd__(...)
x.__iadd__(y) <==> x+=y
