IndexError: list assignment index out of range Python - python

def mode(given_list):
    highest_list = []
    highest = 0
    index = 0
    for x in range(0, len(given_list)):
        occurrences = given_list.count(given_list[x])
        if occurrences > highest:
            highest = occurrences
            highest_list[0] = given_list[x]
        elif occurrences == highest:
            highest_list.append(given_list[x])
The code is meant to work out the mode of a given list. I do not understand where I am going wrong.
Exact Error I am receiving.
line 30, in mode
highest_list[0] = given_list[x]
IndexError: list assignment index out of range

The problem is that you have an empty list originally:
highest_list = []
And then in the loop you try to access it at index 0:
highest_list[0] = ...
That fails because the list is empty, so there is no position 0 to assign to. Assignment by index can only replace an existing element; use append() to add one.
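For reference, here is a minimal sketch of how the original loop could be repaired while keeping its structure. It assumes the intent was to collect every value that ties for the highest count; replacing the whole list on a new maximum (instead of assigning to index 0) and skipping values already collected are my changes, not something the question specified.

```python
def mode(given_list):
    """Return every value that occurs most often, in first-seen order."""
    highest_list = []
    highest = 0
    for value in given_list:
        occurrences = given_list.count(value)
        if occurrences > highest:
            # New maximum: start the result list over with this value
            highest = occurrences
            highest_list = [value]
        elif occurrences == highest and value not in highest_list:
            # Tie with the current maximum: keep this value as well
            highest_list.append(value)
    return highest_list

print(mode([1, 2, 3, 3, 4]))   # [3]
print(mode([7, 7, 9, 9, 1]))   # [7, 9]
```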
A better way to find the mode of a list is to use a collections.Counter object:
>>> from collections import Counter
>>> L = [1,2,3,3,4]
>>> counter = Counter(L)
>>> max(counter, key=counter.get)
3
>>> [(mode, n_occurrences)] = counter.most_common(1)
>>> mode, n_occurrences
(3, 2)

As far as getting the mode, you can just use a Counter from the collections module:
from collections import Counter
x = [0, 1, 2, 0, 1, 0] #0 is the mode
g = Counter(x)
mode = max(g, key = lambda x: g[x])

At that point, at the start of the loop, highest_list is empty, so there's no first index. You can initialize highest_list as [0] so that there is always at least one "highest value."
That said, you can accomplish this more simply as follows:
def mode(given_list):
    return max(set(given_list), key=given_list.count)
This returns the item of given_list with the highest count(). Making a set first ensures that each distinct item is only evaluated once. If several items tie for the highest count, only one of them is returned.

Related

removing numbers which are close to each other in a list

I have a list like
mylist = [75,75,76,77,78,79,154,155,154,156,260,262,263,550,551,551,552]
I need to remove numbers that are close to each other, i.e. within at most four of one another:
num-4 <= x <= num +4
the list i need at the end should be like :
list = [75,154,260,550]
or
list = [76,156,263,551]
It doesn't really matter which number stays in the list, as long as only one of those that are close to each other remains.
I tried this, which gave me:
for i in range(len(l)):
    for j in range(len(l)):
        if i==j or i==j+1 or i==j+2 or i == j+3:
            pp= l.pop(j)
            print(pp)
            print(l)
IndexError: pop index out of range
and this one which doesn't work the way i need:
for q in li:
    for w in li:
        print(q,'////',w)
        if q == w or q ==w+1 or q==w+2 or q==w+3:
            rem = li.remove(w)
thanks
The below uses groupby to identify runs from the iterable that start with a value start and contain values that differ from start by no more than 4. We then collect all of those start values into a list.
from itertools import groupby
def runs(difference=4):
    start = None
    def inner(n):
        nonlocal start
        if start is None:
            start = n
        elif abs(start-n) > difference:
            start = n
        return start
    return inner
print([next(g) for k, g in groupby(mylist, runs())])
# [75, 154, 260, 550]
This assumes that the input data is already sorted. If it's not, you'll have to sort it: groupby(sorted(mylist), runs()).
You can accomplish this using a set or a list; you don't need a dict.
usedValues = set()
newList = []
for v in mylist:
    if v not in usedValues:
        newList.append(v)
        for lv in range(v - 4, v + 5):
            usedValues.add(lv)
print(newList)
This method stores all values within 4 of every value you've kept so far. When you look at a new value from mylist, you only need to check whether you've seen something in its ballpark before by checking usedValues.
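The set-based idea can also be wrapped in a function. This is only a sketch: the name thin_out and the radius parameter are hypothetical, and it assumes integer input, since it marks every integer in the neighbourhood as covered.

```python
def thin_out(values, radius=4):
    """Keep the first value seen in each neighbourhood of +/- radius."""
    used = set()
    kept = []
    for v in values:
        if v not in used:
            kept.append(v)
            # Mark every integer within `radius` of v as already covered
            used.update(range(v - radius, v + radius + 1))
    return kept

mylist = [75, 75, 76, 77, 78, 79, 154, 155, 154, 156,
          260, 262, 263, 550, 551, 551, 552]
print(thin_out(mylist))  # [75, 154, 260, 550]
```

Unlike the groupby version, this does not need sorted input, but which representative survives depends on the order in which values are encountered.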

How can I use string formatting to assign unique variable?

I've got a list and I've managed to turn its items into strings. Now I want to assign a variable to each item in the list, using string formatting to append a number to the end of the variable name.
listOne = ['33.325556', '59.8149016457', '51.1289412359']
itemsInListOne = int(len(listOne))
num = 4
varIncrement = 0
while itemsInListOne < num:
    for i in listOne:
        print a = ('%dfinalCoords{0}') % (varIncrement+1)
        print (str(listOne).strip('[]'))
    break
I get the following error: SyntaxError: invalid syntax
How can I fix this and assign a new variable in the format:
a0 = 33.325556
a1 = 59.8149016457 etc.
Your current code has a few issues:
listOne = ['33.325556', '59.8149016457', '51.1289412359']
itemsInListOne = int(len(listOne)) # len will always be an int
num = 4 # magic number - why 4?
varIncrement = 0
while itemsInListOne < num: # why test, given the break?
    for i in listOne:
        print a = ('%dfinalCoords{0}') % (varIncrement+1) # see below
        print (str(listOne).strip('[]')) # prints list once for each item in list
    break # why break on first iteration
One line in particular is giving you trouble:
print a = ('%dfinalCoords{0}') % (varIncrement+1)
This:
simultaneously tries to print and assign a = (hence the SyntaxError);
mixes two different types of string formatting ('%d' and '{0}'); and
never actually increments varIncrement, so you will always get '1finalCoords{0}' anyway.
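To make the formatting point concrete, here is a small sketch of the two styles used separately, next to the mixed string from the question. The variable names are mine, chosen only for illustration.

```python
var_increment = 0

# %-style formatting: %d is replaced by the integer
percent_style = 'a%d' % (var_increment + 1)

# str.format style: {0} is replaced by the first argument
format_style = 'a{0}'.format(var_increment + 1)

# Mixing them: only %d is filled in, {0} is left as literal text
mixed = '%dfinalCoords{0}' % (var_increment + 1)

print(percent_style, format_style, mixed)  # a1 a1 1finalCoords{0}
```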
I would suggest the following:
listOne = ['33.325556', '59.8149016457', '51.1289412359']
a = list(map(float, listOne)) # convert to actual floats
You can easily access or edit individual values by index, e.g.
# edit one value
a[0] = 33.34
# print all values
for coord in a:
    print(coord)
# double every value
for index, coord in enumerate(a):
    a[index] = coord * 2
Looking at your previous question, it seems that you probably want pairs of coordinates from two lists, which can also be done with a simple list of 2-tuples:
listOne = ['33.325556', '59.8149016457', '51.1289412359']
listTwo = ['2.5929778', '1.57945488999', '8.57262235411']
coord_pairs = list(zip(map(float, listOne), map(float, listTwo)))  # list() is needed on Python 3, where zip is lazy
Which gives:
coord_pairs == [(33.325556, 2.5929778),
                (59.8149016457, 1.57945488999),
                (51.1289412359, 8.57262235411)]
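If names like a0 and a1 are still wanted, a dictionary keyed by formatted strings is one way to get them without creating variables dynamically. A minimal sketch; the dictionary name coords is an assumption, not something from the question.

```python
listOne = ['33.325556', '59.8149016457', '51.1289412359']

# Build keys 'a0', 'a1', ... with str.format and convert each value to float
coords = {'a{0}'.format(i): float(value) for i, value in enumerate(listOne)}

print(coords['a0'])  # 33.325556
print(coords['a1'])  # 59.8149016457
```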

Loop to Match Parts of List

My code:
#prints out samenodes
f = open('newerfile.txt')
mylist = list(f)
count = 0
i = 1
while count < 1000:
    if mylist[i] == mylist[i+12] and mylist [i+3] == mylist [i+14]:
        print mylist[i]
    count = count+1
    i = i+12
My intention is to look at elt 1 and elt 2. If elt 1 == elt 13 AND elt 2 == elt 14, I want to print elt 1. Then I want to look at elt 13 and elt 14. If elt 13 matches elt 13+12 AND elt 14 matches elt 14+12, I want to print it, and so on.
There are certainly parts of my list that fit this criteria, but the program returns no output.
One problem is your indices. Be advised that lists begin with an index of 0.
I'm surprised nobody's answered this yet:
#prints out samenodes
f = open('newerfile.txt')
mylist = list(f)
count = 0
i = 0
while count < 1000:
    #print mylist[i]
    #print mylist[i+12]
    #print mylist[i+13]
    #print mylist[i+14]
    #...use prints to help you debug
    if mylist[i] == mylist[i+12] and mylist [i+1] == mylist [i+13]:
        print mylist[i]
    count = count+1
    i = i+12
This is probably what you want.
To iterate over multiple lists (technically, iterables) in "lockstep", you can use zip. In this case, you want to iterate over four versions of mylist, offset by 0, 12, 1 and 13, so that each tuple holds elt 1, elt 13, elt 2 and elt 14.
zippedLists = zip(mylist, mylist[12:], mylist[1:], mylist[13:])
Next, you want the 0th, 12th, 24th, etc elements. This is done with slice:
slicedList = zippedLists[::12]
Then you can iterate over that:
for elt1, elt13, elt2, elt14 in slicedList:
    if elt1 == elt13 and elt2 == elt14:
        print elt1
Putting it together with the file operations, we get
#prints out samenodes
f = open('newerfile.txt')
mylist = list(f)
zippedLists = zip(mylist, mylist[12:], mylist[1:], mylist[13:])
slicedList = zippedLists[::12]
for elt1, elt13, elt2, elt14 in slicedList:
    if elt1 == elt13 and elt2 == elt14:
        print elt1
Code like this is generally considered more "pythonic" than your current version, as using list indexes is generally discouraged when you are iterating over the list.
Note that if you've got a huge number of elements in your list, the above code creates (and at some point destroys) five extra lists. Therefore, you may get better memory performance if you use the equivalent functions in itertools, which use lazy iterators to avoid copying lists needlessly:
from itertools import islice, izip
#prints out samenodes
f = open('newerfile.txt')
mylist = list(f)
# islice(mylist, n, None) lazily skips the first n items, like mylist[n:]
zippedLists = izip(mylist, islice(mylist, 12, None), islice(mylist, 1, None), islice(mylist, 13, None))
slicedList = islice(zippedLists, 0, None, 12)
for elt1, elt13, elt2, elt14 in slicedList:
    if elt1 == elt13 and elt2 == elt14:
        print elt1
There's probably a way in itertools to avoid slurping the entire file into mylist, but I'm not sure I remember what it is - I think that is the use case for itertools.tee.
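For Python 3, where zip is already lazy and izip no longer exists, the same lockstep idea can be sketched as below. The function name matching_nodes, the period parameter and the sample data are assumptions made for illustration; they are not from the question.

```python
from itertools import islice

def matching_nodes(lines, period=12):
    """Yield lines[i] for i = 0, period, 2*period, ... whenever
    lines[i] == lines[i + period] and lines[i + 1] == lines[i + period + 1]."""
    quads = zip(
        lines,
        islice(lines, period, None),      # lines[i + period]
        islice(lines, 1, None),           # lines[i + 1]
        islice(lines, period + 1, None),  # lines[i + period + 1]
    )
    # Only every `period`-th position is a candidate start
    for first, first_next, second, second_next in islice(quads, 0, None, period):
        if first == first_next and second == second_next:
            yield first

# 26 items: the pair at 0/1 repeats at 12/13, but not at 24/25
sample = ["a", "b"] + ["x"] * 10 + ["a", "b"] + ["y"] * 10 + ["a", "c"]
print(list(matching_nodes(sample)))  # ['a']
```

Because zip stops at the shortest input, the loop ends on its own instead of running past the end of the list.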

Python 3.0+ Calculating Mode

I have written a program to calculate the most often occurring number. This works great unless you have 2 most occurring numbers in a list such as 7,7,7,9,9,9. For that I wrote in:
if len(modeList) > 1 and modeList[0] != modeList[1]:
    break
but then I encounter other problems like a set of number with 7,9,9,9,9. What do I do. Below is my code that will calculate one Mode.
list1 = [7,7,7,9,9,9,9]
numList=[]
modeList=[]
finalList =[]
for i in range(len(list1)):
    for k in range(len(list1)):
        if list1[i] == list1[k]:
            numList.append(list1[i])
numList.append("EOF")
w = 0
for w in range(len(numList)):
    if numList[w] == numList[w + 1]:
        modeList.append(numList[w])
    if numList[w + 1] == "EOF":
        break
w = 0
lenMode = len(modeList)
print(lenMode)
while lenMode > 1:
    for w in range(lenMode):
        print(w)
        if w != lenMode - 1:
            if modeList[w] == modeList[w + 1]:
                finalList.append(modeList[w])
                print(w)
    lenFinal = len(finalList)
    modeList = []
    for i in range(lenFinal):
        modeList.append(finalList[i])
    finalList = []
    lenMode = len(modeList)
and then
print(modeList)
We have not learned counters but I would be open to it if someone could explain!
I would just use collections.Counter for this:
>>> from collections import Counter
>>> c = Counter([7,9,9,9,9])
>>> max(c.items(), key=lambda x:x[1])[0]
9
This is really rather simple. All it does is count how many times each value appears in the list, and then selects the element with the highest count.
I would use statistics.mode() for this. In Python versions before 3.8 it raises StatisticsError if there is more than one mode; from 3.8 onwards it returns the first mode encountered. If you need to handle multiple modes (it's not clear to me whether that's the case), you probably want to use a collections.Counter object as suggested by NPE.
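Since the question is about lists with more than one mode, here is a minimal sketch that returns all of them using a Counter. The function name all_modes is hypothetical. On Python 3.8 or later, statistics.multimode does the same job directly.

```python
from collections import Counter
from statistics import multimode  # available from Python 3.8

def all_modes(values):
    """Return every value that shares the highest count, in first-seen order."""
    counts = Counter(values)
    if not counts:
        return []
    top = max(counts.values())
    return [value for value, n in counts.items() if n == top]

print(all_modes([7, 7, 7, 9, 9, 9]))   # [7, 9]
print(all_modes([7, 9, 9, 9, 9]))      # [9]
print(multimode([7, 7, 7, 9, 9, 9]))   # [7, 9]
```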

Finding the most frequent character in a string

I found this programming problem while looking at a job posting on SO. I thought it was pretty interesting and as a beginner Python programmer I attempted to tackle it. However I feel my solution is quite...messy...can anyone make any suggestions to optimize it or make it cleaner? I know it's pretty trivial, but I had fun writing it. Note: Python 2.6
The problem:
Write pseudo-code (or actual code) for a function that takes in a string and returns the letter that appears the most in that string.
My attempt:
import string
def find_max_letter_count(word):
    alphabet = string.ascii_lowercase
    dictionary = {}
    for letters in alphabet:
        dictionary[letters] = 0
    for letters in word:
        dictionary[letters] += 1
    dictionary = sorted(dictionary.items(),
                        reverse=True,
                        key=lambda x: x[1])
    for position in range(0, 26):
        print dictionary[position]
        if position != len(dictionary) - 1:
            if dictionary[position + 1][1] < dictionary[position][1]:
                break
find_max_letter_count("helloworld")
Output:
>>>
('l', 3)
Updated example:
find_max_letter_count("balloon")
>>>
('l', 2)
('o', 2)
There are many ways to do this shorter. For example, you can use the Counter class (in Python 2.7 or later):
import collections
s = "helloworld"
print(collections.Counter(s).most_common(1)[0])
If you don't have that, you can do the tally manually (2.5 or later has defaultdict):
d = collections.defaultdict(int)
for c in s:
    d[c] += 1
print(sorted(d.items(), key=lambda x: x[1], reverse=True)[0])
Having said that, there's nothing too terribly wrong with your implementation.
If you are using Python 2.7 or later, you can do this quickly with the collections module.
collections is a module of high-performance container data types. Read more at
http://docs.python.org/library/collections.html#counter-objects
>>> from collections import Counter
>>> x = Counter("balloon")
>>> x
Counter({'o': 2, 'a': 1, 'b': 1, 'l': 2, 'n': 1})
>>> x['o']
2
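Here is a way to find the most common character using a dictionary:
message = "hello world"
d = {}
letters = set(message)
for l in letters:
    d[message.count(l)] = l  # characters with equal counts overwrite each other
# dictionary keys have no guaranteed order, so pick the largest count explicitly
print(d[max(d)], max(d))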
Here's a way using a for loop and count():
w = input()
r = 0  # highest count seen so far (starting at 0 so that s is always set for non-empty input)
s = ''
for i in w:
    p = w.count(i)
    if p > r:
        r = p
        s = i
print(s)
The way I did it uses no counting helpers from the standard library, only for-loops and if-statements.
def most_common_letter():
    string = str(input())
    letters = set(string)
    if " " in letters: # If you want to count spaces too, ignore this if-statement
        letters.remove(" ")
    max_count = 0
    freq_letter = []
    for letter in letters:
        count = 0
        for char in string:
            if char == letter:
                count += 1
        if count == max_count:
            # tie with the current maximum: keep this letter as well
            freq_letter.append(letter)
        if count > max_count:
            max_count = count
            freq_letter.clear()
            freq_letter.append(letter)
    return freq_letter, max_count
This ensures you get every letter/character that gets used the most, and not just one. It also returns how often it occurs. Hope this helps :)
If you want to have all the characters with the maximum number of counts, then you can do a variation on one of the two ideas proposed so far:
import heapq # Helps finding the n largest counts
import collections
def find_max_counts(sequence):
    """
    Returns an iterator that produces the (element, count)s with the
    highest number of occurrences in the given sequence.
    In addition, the elements are sorted.
    """
    if len(sequence) == 0:
        return  # a bare return ends the generator; raising StopIteration here is an error on Python 3.7+
    counter = collections.defaultdict(int)
    for elmt in sequence:
        counter[elmt] += 1
    counts_heap = [
        (-count, elmt) # The largest counts become the smallest heap entries
        for (elmt, count) in counter.items()]
    heapq.heapify(counts_heap)
    highest_count = counts_heap[0][0]
    while True:
        try:
            (opp_count, elmt) = heapq.heappop(counts_heap)
        except IndexError:
            return
        if opp_count != highest_count:
            return
        yield (elmt, -opp_count)
for (letter, count) in find_max_counts('balloon'):
    print((letter, count))
for (word, count) in find_max_counts(['he', 'lkj', 'he', 'll', 'll']):
    print((word, count))
This yields, for instance:
lebigot#weinberg /tmp % python count.py
('l', 2)
('o', 2)
('he', 2)
('ll', 2)
This works with any sequence: words, but also ['hello', 'hello', 'bonjour'], for instance.
The heapq structure is very efficient at finding the smallest elements of a sequence without sorting it completely. On the other hand, since there are not that many letters in the alphabet, you can probably also run through the sorted list of counts until the count drops below the maximum, without incurring any serious speed loss.
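To illustrate the point about heaps, heapq.nlargest can pull out the top few counts without sorting everything. A minimal sketch, assuming a Counter for the tally; the sample string and variable names are only for illustration.

```python
import heapq
from collections import Counter

counts = Counter("helloworld")

# The two (character, count) pairs with the largest counts, highest first
top_two = heapq.nlargest(2, counts.items(), key=lambda pair: pair[1])
print(top_two)  # [('l', 3), ('o', 2)]
```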
def most_frequent(text):
    frequencies = [(c, text.count(c)) for c in set(text)]
    return max(frequencies, key=lambda x: x[1])[0]
s = 'ABBCCCDDDD'
print(most_frequent(s))
frequencies is a list of tuples that count the characters as (character, count). We apply max to the tuples using count's and return that tuple's character. In the event of a tie, this solution will pick only one.
I noticed that most of the answers only come back with one item even if there is an equal amount of characters most commonly used. For example "iii 444 yyy 999". There are an equal amount of spaces, i's, 4's, y's, and 9's. The solution should come back with everything, not just the letter i:
from collections import Counter
sentence = "iii 444 yyy 999"
# The count of the most common character, i.e. the second field of the
# first tuple returned by Counter().most_common()
largest_count: int = Counter(sentence).most_common()[0][1]
# Keep every (character, count) tuple whose count equals the largest count
most_common_list: list = [(x, y)
                          for x, y in Counter(sentence).items() if y == largest_count]
print(most_common_list)
# RETURNS
# [('i', 3), (' ', 3), ('4', 3), ('y', 3), ('9', 3)]
Question :
Most frequent character in a string
The maximum occurring character in an input string
Method 1 :
a = "GiniGinaProtijayi"
d = {}
for ch in a:
    d[ch] = d.get(ch, 0) + 1
# Sort by count, highest first, and take the first (character, count) pair
chh, max_count = sorted(d.items(), reverse=True, key=lambda item: item[1])[0]
print(chh)
print(max_count)
Method 2 :
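a = "GiniGinaProtijayi"
max_count = 0  # renamed from max to avoid shadowing the built-in
chh = ''
count = [0] * 256  # one slot per character code; assumes all codes are below 256
for ch in a:
    count[ord(ch)] += 1
for ch in a:
    if count[ord(ch)] > max_count:
        max_count = count[ord(ch)]
        chh = ch
print(chh)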
Method 3 :
import collections
line ='North Calcutta Shyambazaar Soudipta Tabu Roopa Roopi Gina Gini Protijayi Sovabazaar Paikpara Baghbazaar Roopa'
bb = collections.Counter(line).most_common(1)[0][0]
print(bb)
Method 4 :
line =' North Calcutta Shyambazaar Soudipta Tabu Roopa Roopi Gina Gini Protijayi Sovabazaar Paikpara Baghbazaar Roopa'
def mostcommonletter(sentence):
    letters = list(sentence)
    return (max(set(letters),key = letters.count))
print(mostcommonletter(line))
Here are a few things I'd do:
Use collections.defaultdict instead of the dict you initialise manually.
Use built-in functions such as sorted and max instead of working it out yourself - it's easier.
Here's my final result:
from collections import defaultdict
def find_max_letter_count(word):
    matches = defaultdict(int) # makes the default value 0
    for char in word:
        matches[char] += 1
    # items() works on both Python 2 and 3; iteritems() exists only on Python 2
    return max(matches.items(), key=lambda x: x[1])
find_max_letter_count('helloworld') == ('l', 3)
If you could not use collections for any reason, I would suggest the following implementation:
s = input()
d = {}
# We iterate through the string; if the character is already in the
# dict, then we just increment its counter, otherwise we start it at 1.
for ch in s:
    if ch in d:
        d[ch] += 1
    else:
        d[ch] = 1
# If we are given an empty string, then we just print a message
# that says so.
print(max(d, key=d.get, default='Empty string was given.'))
sentence = "This is a great question made me wanna watch matrix again!"
char_frequency = {}
for char in sentence:
    if char == " ": #to skip spaces
        continue
    elif char in char_frequency:
        char_frequency[char] += 1
    else:
        char_frequency[char] = 1
char_frequency_sorted = sorted(
    char_frequency.items(), key=lambda ky: ky[1], reverse=True
)
print(char_frequency_sorted[0]) #output -->('a', 9)
# return the letter with the max frequency.
def maxletter(word:str) -> tuple:
    ''' return the letter with the max occurrence '''
    v = 1
    dic = {}
    for letter in word:
        if letter in dic:
            dic[letter] += 1
        else:
            dic[letter] = v
    for k in dic:
        if dic[k] == max(dic.values()):
            return k, dic[k]
l, n = maxletter("Hello World")
print(l, n)
output: l 3
You may also try something like the code below.
from pprint import pprint
sentence = "this is a common interview question"
char_frequency = {}
for char in sentence:
    if char in char_frequency:
        char_frequency[char] += 1
    else:
        char_frequency[char] = 1
pprint(char_frequency, width = 1)
out = sorted(char_frequency.items(),
             key = lambda kv : kv[1], reverse = True)
print(out)
print(out[0])
statistics.mode(data)
Return the single most common data point from discrete or nominal data. The mode (when it exists) is the most typical value and serves as a measure of central location.
If there are multiple modes with the same frequency, returns the first one encountered in the data. If the smallest or largest of those is desired instead, use min(multimode(data)) or max(multimode(data)). If the input data is empty, StatisticsError is raised.
import statistics as stat
test = 'This is a test of the fantastic mode super special function ssssssssssssss'
test2 = ['block', 'cheese', 'block']
val = stat.mode(test)
val2 = stat.mode(test2)
print(val, val2)
mode assumes discrete data and returns a single value. This is the standard treatment of the mode as commonly taught in schools:
mode([1, 1, 2, 3, 3, 3, 3, 4])
3
The mode is unique in that it is the only statistic in this package that also applies to nominal (non-numeric) data:
mode(["red", "blue", "blue", "red", "green", "red", "red"])
'red'
Here is how I solved it, considering the possibility of multiple most frequent chars:
sentence = "Lorem ipsum dolor sit amet, consectetur adipiscing elit, \
sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut \
enim."
joint_sentence = sentence.replace(" ", "")
frequencies = {}
for letter in joint_sentence:
    frequencies[letter] = frequencies.get(letter, 0) +1
biggest_frequency = frequencies[max(frequencies, key=frequencies.get)]
most_frequent_letters = {key: value for key, value in frequencies.items() if value == biggest_frequency}
print(most_frequent_letters)
Output:
{'e': 12, 'i': 12}
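from collections import Counter
# file: filename
# quant: number of most frequent characters you want
def frequent_letters(file, quant):
    # read the whole file and close it again when done
    with open(file) as f:
        text = f.read()
    # Counter over a string tallies individual characters
    op = Counter(text).most_common(quant)
    return op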
# This code returns all characters in a string which have the highest frequency
def find(text):
    y = sorted([[text.count(i), i] for i in set(text)])
    # Each unique character is stored with its count as [count, character]
    # inside y (which is a list), and the pairs are sorted in ascending
    # order (by count, then by character).
    # Eg : for "pradeep", y = [[1,'a'],[1,'d'],[1,'r'],[2,'e'],[2,'p']]
    most_freq = y[len(y)-1][0]
    # the count of the most frequent character is assigned to most_freq
    # ie, most_freq = 2
    x = []
    for j in range(len(y)):
        if y[j][0] == most_freq:
            x.append(y[j])
    # every pair whose count equals the most frequent character's count
    # is appended to list x, so all characters with the highest
    # frequency are collected.
    # eg : "pradeep"
    # x = [[2,'e'],[2,'p']]
    return x
find("pradeep")
