Python "list index out of range" error

My goal is to have a list of lists, where each inner list holds a word at its first index and the number of times that word has been seen at its second index. As an example, it should look like this:
[["test1",0],["test2",4],["test3",8]]
The only issue is that when I try to, for instance, access the word "test1" from the first inner-list, I get an index out of range error. Here is my code for how I am attempting to do this:
stemmedList = [[]]
f = open(a_document_name, 'r')
#read each line of file
fileLines = f.readlines()
for fileLine in fileLines:
    #here we end up with stopList, a list of words
    thisReview = Hw1.read_line(fileLine)['text']
    tokenList = Hw1.tokenize(thisReview)
    stopList = Hw1.stopword(tokenList)
    #for each word in stoplist, compare to all terms in return list to
    #see if it exists, if it does add one to its second parameter, else
    #add it to the list as ["word", 0]
    for word in stopList:
        #if list not empty
        if not len(unStemmedList) == 1: #for some reason I have to do this to see if list is empty, I'm assuming when it's empty it returns a length of 1 since I'm initializing it as a list of lists??
            print "List not empty."
            for innerList in unStemmedList:
                if innerList[0] == word:
                    print "Adding 1 to [" + word + ", " + str(innerList[1]) + "]"
                    innerList[1] = (innerList[1] + 1)
                else:
                    print "Adding [" + word + ", 0]"
                    unStemmedList.append([word, 0])
        else:
            print "List empty."
            unStemmedList.append([word, 0])
        print unStemmedList[len(unStemmedList)-1]
return stemmedList
The final output ends up being:
List is empty.
["test1",0]
List not empty.
Crash with list index out of range error which points to the line if innerList[0] == word

You start with a = [[]], so the outer list already holds one element: an empty list.
After appending the first word, you have
a = [ [], ['test', 0] ]
On the next iteration the loop visits that empty inner list first, and innerList[0] raises IndexError because an empty list has no element 0.

Assuming that stemmedList and unStemmedList are similar
stemmedList = [[]]
You have an empty list inside your list of lists, and it has no [0]. Instead, just initialize it to:
stemmedList = []
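As a rough sketch of how the counting loop could look once the outer list starts empty: the function name count_words and the sample words are assumptions made for illustration, not part of the question's code. Counts start at 0 on the first sighting, matching the ["word", 0] format the question asks for.

```python
def count_words(words):
    """Return [[word, count], ...]; a word's count starts at 0 when first seen."""
    counted = []  # start with an empty outer list, not [[]]
    for word in words:
        for entry in counted:
            if entry[0] == word:
                entry[1] += 1  # word already recorded: bump its count
                break
        else:
            # for/else: this runs only when the loop did not break,
            # meaning the word was not found in any existing entry
            counted.append([word, 0])
    return counted


print(count_words(["test1", "test2", "test1"]))  # [['test1', 1], ['test2', 0]]
```

Because every inner list is created as [word, 0], indexing entry[0] and entry[1] is always safe, and an empty outer list really has length 0.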

Isn't this simpler?
counts = dict()
def plus1(key):
    if key in counts:
        counts[key] += 1
    else:
        counts[key] = 1
stoplist = "t1 t2 t1 t3 t1 t1 t2".split()
for word in stoplist:
    plus1(word)
counts
{'t2': 2, 't3': 1, 't1': 4}
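If the standard library is an option, collections.Counter does the same bookkeeping as the dict above. This is only a sketch using the same sample words as the answer, not something the original poster's code uses:

```python
from collections import Counter

stoplist = "t1 t2 t1 t3 t1 t1 t2".split()

# Counter is a dict subclass that counts hashable items in one pass
counts = Counter(stoplist)

print(counts["t1"])           # 4
print(counts.most_common(1))  # [('t1', 4)]
```

Unlike the list-of-lists layout in the question, these counts start at 1 for the first occurrence.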

Related

How can I group a sorted list into tuples of start and endpoints of consecutive elements of this list?

Suppose that my sorted list is as such:
L = ["01-string","02-string","03-string","05-string","07-string","08-string"]
As you can see this list has been sorted. I now want the start and end points of each block of continuous strings in this list, for example, the output for this should be:
L_continuous = [("01-string", "03-string"),("05-string","05-string"),("07-string","08-string")]
So, just to clarify, I need a list of tuples and in each of these tuples I need the start and endpoint of each consecutive block in my list. So, for example, elements 0, 1 and 2 in my list are consecutive because 01,02,03 are consecutive numbers - so the start and endpoints would be "01-string" and "03-string".
The numbers 1-3 are consecutive so they form a block, whereas 5 does not have any consecutive numbers in the list so it forms a block by itself.

Not a one-liner, but something like this might work:
L = ["01-string","02-string","03-string","05-string","07-string","08-string"]
counter = None
# lastNum = None
firstString = ""
lastString = ""
L_continuous = list()
for item in L:
    currentNum = int(item[0:2])
    if counter is None:
        # startTuple
        firstString = item
        counter = currentNum
        lastString = item
        continue
    if counter + 1 == currentNum:
        # continuation of block
        lastString = item
        counter += 1
        continue
    if currentNum > counter + 1:
        # end of block
        L_continuous.append((firstString,lastString))
        firstString = item
        counter = currentNum
        lastString = item
        continue
    else:
        print ('error - not sorted or unique numbers')
# add last block
L_continuous.append((firstString,lastString))
print(L_continuous)

The first thing to do is extract an int from the string data, so that we can compare consecutive numbers:
def extract_int(s):
    return int(s.split('-')[0])
Then a straightforward solution is to keep track of the last number seen, and emit a new block when it is not consecutive with the previous one. At the end of the loop, we need to emit a block of whatever is "left over".
def group_by_blocks(strs):
    blocks = []
    last_s = first_s = strs[0]
    last_i = extract_int(last_s)
    for s in strs[1:]:
        i = extract_int(s)
        if i != last_i + 1:
            blocks.append( (first_s, last_s) )
            first_i, first_s = i, s
        last_i, last_s = i, s
    blocks.append( (first_s, last_s) )
    return blocks
Example:
>>> group_by_blocks(L)
[('01-string', '03-string'), ('05-string', '05-string'), ('07-string', '08-string')]
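For comparison, here is a sketch of the same grouping done with itertools.groupby. It relies on the observation that, within a run of consecutive numbers, the number minus its list position stays constant. The function name group_by_blocks_groupby is an assumption for illustration, and the sketch assumes the input is sorted with unique numbers, as in the question.

```python
from itertools import groupby


def group_by_blocks_groupby(strs):
    """Return (first, last) tuples for each run of consecutive leading numbers."""
    def number(s):
        return int(s.split('-')[0])

    blocks = []
    # number - position is constant inside a consecutive run,
    # so groupby splits the list exactly at the gaps
    for _, run in groupby(enumerate(strs), key=lambda pair: number(pair[1]) - pair[0]):
        items = [s for _, s in run]
        blocks.append((items[0], items[-1]))
    return blocks


L = ["01-string", "02-string", "03-string", "05-string", "07-string", "08-string"]
print(group_by_blocks_groupby(L))
# [('01-string', '03-string'), ('05-string', '05-string'), ('07-string', '08-string')]
```

This variant also returns an empty list for empty input instead of failing on strs[0].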

How to print all elements of the list and not only the last one?

I wrote a script that loops over some files, extracts all the floating-point numbers, and puts them in a list, which it then displays. However, only the last number of each list is kept, while I would like all the numbers to be displayed. How can I do this?
Here is my code:
final_result = []
result = []
k = listFps
k = 0
while k < len(listFps):
    with open(listFps[k], 'r') as f:
        #
        statList = f.readlines()
        statList = [x.strip() for x in statList]
        for line in statList:
            if (re.search("=", str(line))):
                if (re.search('#IND', str(line))):
                    print("ok")
                else:
                    result =re.findall("=\s*?(\d+\.\d+|\d+)", str(line))
                    print (" this is result " ,result)
                    numberList = [float(q) for q in result]
                    print("this is number list :",numberList)
    k+=1
It prints only the last element of my list, like this:
[59.889]
[60.874]
etc.
But I actually want a single list with all the elements:
[59.889,60.874....]
Please help, I have been stuck on this for too long.

Instead of
result = re.findall….
use
result += re.findall…
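A minimal sketch of the accumulate-instead-of-overwrite idea, reduced to a list of lines so it runs on its own. The function name collect_numbers and the sample lines are assumptions for illustration; the pattern is the one from the question, and the '#IND' skip mirrors the question's check.

```python
import re

# Same idea as the question's pattern: a number following an equals sign
NUMBER_AFTER_EQUALS = re.compile(r"=\s*(\d+\.\d+|\d+)")


def collect_numbers(lines):
    """Gather every number found after '=' across all lines into one list."""
    numbers = []  # created once, outside the loop, so nothing is overwritten
    for line in lines:
        if "#IND" in line:
            continue  # skip indeterminate values, as the question does
        # extend() adds to the existing list instead of replacing it
        numbers.extend(float(m) for m in NUMBER_AFTER_EQUALS.findall(line))
    return numbers


sample = ["fps = 59.889", "fps = 60.874", "fps = 1.#IND", "no numbers here"]
print(collect_numbers(sample))  # [59.889, 60.874]
```

The key point is that the list is initialized once before the loop and only ever grown inside it.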

python list within list initialization and printing

I am trying to append an element to a list within a list that has an incremented value each time:
def get_data(file):
    matrix = [ ['one','two','three'] ] #list of lists
    test_count = 0
    line_count = 0 #keep track of which line we are on
    for line in file:
        if line.find('example') != -1: #test for example string
            temp_a = re.findall(r"\'(.+?)\'",line)[0]
            print matrix[test_count][0] #should print 'one'
            matrix[test_count][0].insert(temp_a) #should insert temp_a instead of 'one'
            test_count += 1 #go to next "new" list in the matrix
        line_count += 1 #go to next line
What I want is the result of findall to go into temp_a and from there to insert it into index 0 of the first list within a list. Then the next time findall is true, I want to insert temp_a to index 0 of the second list.
For example if the first temp_a value is 9, I would like the first list in the matrix to be:
[ [9,y,z] ]
If on the second findall my temp_a is 4, I want the matrix to become:
[ [9,y,z], [4,y,z] ]
The above code is my best attempt so far.
I have 2 questions:
1) How can I initialize a 'list of lists' if the number of lists isn't fixed?
2) The list ['one','two','three'] was to test with printing what is going on. If I try to print out matrix[test_count][0], I get an "index out of range" error, but the moment I change it to print out matrix[0][0] it prints 'one' correctly. Is there something with the scope that I'm missing here?

To answer your questions:
1) Like this: matrix = []
Simply put, this just creates an empty list that you can append anything you want into, including more lists. So matrix.append([1,2,3]) gives you a list like this: [[1,2,3]]
2) Your index out of range error comes from the fact that you increment test_count to 1 while your matrix keeps a length of 1 (meaning it only has index 0), since you never append anything. To get the output that you want, you need to make a few changes:
def get_data(file):
    example_list = ['one','two','three']
    matrix = [] #list of lists
    test_count = 0
    line_count = 0 #keep track of which line we are on
    for line in file:
        if line.find('example') != -1: #test for example string
            temp_a = re.findall(r"\'(.+?)\'",line)[0]
            new_list = example_list[:]
            new_list[0] = temp_a
            matrix.append(new_list)
            test_count += 1 #go to next "new" list in the matrix
        line_count += 1 #go to next line
    print matrix #[['boxes', 'two', 'three'], ['equilateral', 'two', 'three'], ['sphere', 'two', 'three']]

For 2), did you try to print out test_count? Since your test_count += 1 is inside the if statement, it shouldn't go out of range without first printing 'one'.
For 1), you could do this before the insert:
if test_count == len(matrix):
    matrix.append([])
It adds a new empty list if test_count is out of range of matrix.
EDIT:
The "out of range" error can come from the line temp_a = re.findall(r"\'(.+?)\'",line)[0] when the pattern finds nothing: the result is an empty list, so [0] is out of range.
def get_data(file):
    matrix = [ ['one','two','three'] ] #list of lists
    test_count = 0
    line_count = 0 #keep track of which line we are on
    for line in file:
        if line.find('example') != -1: #test for example string
            temp_a = re.findall(r"\'(.+?)\'",line)
            if temp_a:
                temp_a = temp_a[0]
            else:
                continue # do something if not found
            print(matrix[test_count][0]) #first element of the current list ('one' on the first match)
            new_list = matrix[test_count][:]
            new_list[0] = temp_a
            matrix.append(new_list) #append to matrix itself so matrix[test_count] exists on the next match
            test_count += 1 #go to next "new" list in the matrix
        line_count += 1 #go to next line

How do I display 2 words before and after a key search word in Python

I am very new to Python programming. How do I display 2 words before and after a search keyword? In the example below I am looking for the search word "lists".
Sample:
Line 1: List of the keyboard shortcuts for Word 2000
Line 2: Sequences: strings, lists, and tuples - PythonLearn
Desired result (the word "lists" is found only in line 2):
Line 2: Sequences: strings, lists, and tuples
Thanks for your help in this.

This solution is based on Avinash Raj's second example with these amendments:
Allows the number of words printed on each side of the search word to be varied.
Uses a list comprehension instead of an if inside the for loop, which may be considered more 'Pythonic', though I'm not sure it's more readable in this case.
s = """List of the keyboard shortcuts for Word 2000
Sequences: strings, lists and tuples - PythonLearn"""
findword = 'lists'
numwords = 2
for i in s.split('\n'):
    z = i.split(' ')
    for x in [x for (x, y) in enumerate(z) if findword in y]:
        print(' '.join(z[max(x-numwords,0):x+numwords+1]))

Through re.findall function.
>>> s = """List of the keyboard shortcuts for Word 2000
Sequences: strings, lists, and tuples - PythonLearn"""
>>> re.findall(r'\S+ \S+ \S*\blists\S* \S+ \S+', s)
['Sequences: strings, lists, and tuples']
Without regex.
>>> s = """List of the keyboard shortcuts for Word 2000
Sequences: strings, lists, and tuples - PythonLearn"""
>>> for i in s.split('\n'):
...     z = i.split()
...     for x,y in enumerate(z):
...         if 'lists' in y:
...             print(z[x-2]+' '+z[x-1]+' '+z[x]+' '+z[x+1]+' '+z[x+2])
Sequences: strings, lists, and tuples
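One caveat with fixed offsets such as z[x-2] and z[x+2]: a negative index silently wraps around to the end of the list, and an index past the end raises IndexError when the keyword sits near either edge of the line. A small sketch that clamps the window with slicing instead (the function name words_around and its width parameter are assumptions for illustration):

```python
def words_around(line, keyword, width=2):
    """Return each match of keyword with up to `width` words on either side."""
    words = line.split()
    hits = []
    for i, word in enumerate(words):
        if keyword in word:
            start = max(i - width, 0)  # clamp so the slice never wraps around
            # slicing past the end of a list is safe and simply stops at the end
            hits.append(" ".join(words[start:i + width + 1]))
    return hits


print(words_around("Sequences: strings, lists, and tuples - PythonLearn", "lists"))
# ['Sequences: strings, lists, and tuples']
print(words_around("lists are handy", "lists"))
# ['lists are handy']
```

The match is case-sensitive, so "List" in the first sample line is not reported, which is the behavior the question asks for.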

This is the solution I can think of right away for your question :-)
def get_word_list(line, keyword, length, splitter):
    word_list = line.split(keyword)
    if len(word_list) == 1:
        return []
    search_result = []
    temp_result = ""
    index = 0
    while index < len(word_list):
        result = word_list[index].strip().split(splitter, length-1)[-1]
        result += " " + keyword
        # >= rather than >: stop when there is no segment to the right of this one
        if index+1 >= len(word_list):
            search_result.append(result.strip())
            break
        right_string = word_list[index+1].lstrip(" ").split(splitter, length+1)[:length]
        print word_list[index+1].lstrip(), right_string
        result += " " + " ".join(right_string)
        search_result.append(result.strip())
        index += 2
    return search_result
def search(file, keyword, length=2, splitter= " "):
    search_results = []
    with open(file, "r") as fo:
        for line in fo:
            line = line.strip()
            search_results += get_word_list(line, keyword, length, splitter)
    for result in search_results:
        print "Result:", result

Find the indices of each character and its number of occurrences in a string

A Char_Record is a 3 item list [char, total, pos_list] where
char is a one character string
total is a Nat representing the number of occurrences of char
pos_list is a list of Nat representing the indices of char
Calling the function build_char_records() should produce a sorted list with every character represented (in lowercase).
For example:
>>>build_char_records('Hi there bye')
[' ',2,[2,8]]
['b',1,[9]]
['e',3,[5,7,11]]
['h',2,[0,4]]
['i',1,[1]]
['r',1,[6]]
['t',1,[3]]
['y',1,[10]]
I wrote only this much and don't know how to continue. Could someone help, please? Thanks.
def build_char_records(s):
    s=sorted(s)
    a=[]
    for i in range(len(s)):

I think that the other answers given thus far are better answers from an overall programming perspective, but based on your question I think this answer is appropriate for your skill level
def build_char_records(phrase):
    phrase = phrase.lower()
    resultList = []
    for character in phrase: ## iterates through the phrase
        if character not in resultList:
            ## This adds each character to the list
            ## if it is not already in the list
            resultList.append(character)
    resultList.sort() ## sorts the list
    for i in range(len(resultList)): ## goes through each unique character
        character = resultList[i] ## the character in question
        tphrase = phrase ## a copy of the phrase
        num = phrase.count(character) ## the number of occurrences
        acc = 0 ## an accumulator to keep track of how many we've found
        locs = [] ## list of the locations
        ## while the number we've found is less than how many
        ## there should be
        while acc < num:
            index = tphrase.find(character) ## finds the first occurrence of the character
            ## chops off everything up to and including the
            ## character
            tphrase = tphrase[index+1:]
            if len(locs) != 0: ## if we have already found an earlier occurrence
                index = locs[acc-1] + index + 1 ## adjusts because we're cutting up the string
            locs.append(index) ## adds the index to the list
            acc += 1 ## increases the accumulator
        resultList[i] = [character, num, locs] ## creates the result in the proper spot
    return resultList ## returns the list of lists
print build_char_records('Hi there bye')
This will print out [[' ', 2, [2, 8]], ['b', 1, [9]], ['e', 3, [5, 7, 11]], ['h', 2, [0, 4]], ['i', 1, [1]], ['r', 1, [6]], ['t', 1, [3]], ['y', 1, [10]]]
Here is a slightly shorter, cleaner version
def build_char_records(phrase):
    phrase = phrase.lower()
    resultList = []
    for character in phrase:
        if character not in resultList:
            resultList.append(character)
    resultList.sort()
    for i in range(len(resultList)):
        tphrase = phrase
        num = phrase.count(resultList[i])
        locs = []
        for j in range(num):
            index = tphrase.find(resultList[i])
            tphrase = tphrase[index+1:]
            if len(locs) != 0:
                index = locs[j-1] + index + 1 # j replaces the acc counter of the longer version
            locs.append(index)
        resultList[i] = [resultList[i], num, locs]
    return resultList
print build_char_records('Hi there bye')

Using only list, this is what you can do:
def build_char_records(s):
    records = [] # Create a list to act as a dictionary
    for idx, char in enumerate(s):
        char = char.lower()
        current_record = None # Try to find the record in our list of records
        for record in records: # Iterate over the records
            if record[0] == char: # We find it!
                current_record = record # This is the record for current char
                break # Stop the search
        if current_record is None: # This means the list does not contain the record for this char yet
            current_record = [char, 0, []] # Create the record
            records.append(current_record) # Put the record into the list of records
        current_record[1] += 1 # Increase the count by one
        current_record[2].append(idx) # Append the position of the character into the list
    for value in records: # Iterate over the Char_Record
        print value # Print the Char_Record
Or, if you need to sort it, you can do what @Dannnno said or, as an example, sort it this way (although you might not have learned about lambda yet):
records.sort(key=lambda x: x[0])
Just put that before printing the records.
Or, you can do it using dict and list:
def build_char_records(s):
    record = {} # Create an empty dictionary
    for idx, char in enumerate(s): # Iterate over the characters of string with the position specified
        char = char.lower() # Convert the character into lowercase
        if char not in record: # If we have not seen the character before, create the Char_Record
            record[char] = [char,0,[]] # Create a Char_Record and put it in the dictionary
        record[char][1] += 1 # Increase the count by one
        record[char][2].append(idx) # Append the position of the character into the list
    for value in record.values(): # Iterate over the Char_Record
        print value # Print the Char_Record

from collections import defaultdict
def build_char_records(s):
    cnt = defaultdict(int)
    positions = defaultdict(list)
    for i,c in enumerate(s):
        cnt[c] += 1
        positions[c].append(i)
    return [ [c, cnt[c], positions[c]] for c in cnt.keys() ]
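The defaultdict version above neither lowercases the input nor sorts the result, both of which the question asks for. A possible variant that adds those two steps is sketched below; the function name build_char_records_sorted is an assumption for illustration, and the count is taken from the length of each position list rather than kept in a second dictionary.

```python
from collections import defaultdict


def build_char_records_sorted(s):
    """Return sorted [char, total, pos_list] records for the lowercased string."""
    positions = defaultdict(list)
    for i, c in enumerate(s.lower()):
        positions[c].append(i)  # record every index where c occurs
    # sorted() orders the (char, positions) pairs by character
    return [[c, len(idxs), idxs] for c, idxs in sorted(positions.items())]


for record in build_char_records_sorted('Hi there bye'):
    print(record)
```

For 'Hi there bye' this yields the records listed in the question, starting with [' ', 2, [2, 8]] and ending with ['y', 1, [10]].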
