Is it possible to make a letter range in python? - python

Is there a way to do a letter range in python like this:
for x in range(a,h,)

Something like:
[chr(i) for i in range(ord('a'),ord('h'))]
Will give a list of alphabetical characters to iterate through, which you can then use in a loop
for x in [chr(i) for i in range(ord('a'),ord('h'))]:
    print(x)
or this will do the same:
for x in map(chr, range(*map(ord,['a', 'h']))):
    print(x)

You can use ord() to convert the letters into character ordinals and back:
def char_range(start, end, step=1):
    for char in range(ord(start), ord(end), step):
        yield chr(char)
It seems to work just fine:
>>> ''.join(char_range('a', 'z'))
'abcdefghijklmnopqrstuvwxy'
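Note that the output above stops at 'y': like range(), char_range excludes its end. A minimal sketch of an inclusive variant follows. The name char_range_inclusive is made up for illustration, and a positive step is assumed:

```python
def char_range_inclusive(start, end, step=1):
    """Yield characters from start through end, including end (positive step assumed)."""
    for code in range(ord(start), ord(end) + 1, step):
        yield chr(code)

print(''.join(char_range_inclusive('a', 'z')))
```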

There is no built in letter range, but you can write one:
def letter_range(start, stop):
    for c in range(ord(start), ord(stop)):
        yield chr(c)
for x in letter_range('a', 'h'):
    print(x, end=' ')
prints:
a b c d e f g

Emanuele's solution is great as long as one is only asking for a range of single characters, which I will admit is what the original questioner posed. There are also solutions out there to generate all multi-character combinations: How to generate a range of strings from aa... to zz. However I suspect that someone who wants a character like range function might want to be able to deal with generating an arbitrary range from say 'y' to 'af' (rolling over from 'z' to 'aa'). So here is a more general solution that includes the ability to either specify the last member of the range or its length.
def strange(start, end_or_len, sequence='ABCDEFGHIJKLMNOPQRSTUVWXYZ'):
    """Create a generator of a range of 'sequential' strings from
    start to end_or_len if end_or_len is a string or containing
    end_or_len entries if end_or_len is an integer.
    >>> list(strange('D', 'F'))
    ['D', 'E', 'F']
    >>> list(strange('Y', 'AB'))
    ['Y', 'Z', 'AA', 'AB']
    >>> list(strange('Y', 4))
    ['Y', 'Z', 'AA', 'AB']
    >>> list(strange('A', 'BAA', sequence='AB'))
    ['A', 'B', 'AA', 'AB', 'BA', 'BB', 'AAA', 'AAB', 'ABA', 'ABB', 'BAA']
    >>> list(strange('A', 11, sequence='AB'))
    ['A', 'B', 'AA', 'AB', 'BA', 'BB', 'AAA', 'AAB', 'ABA', 'ABB', 'BAA']
    """
    seq_len = len(sequence)
    start_int_list = [sequence.find(c) for c in start]
    if isinstance(end_or_len, int):
        inclusive = True
        end_int_list = list(start_int_list)
        i = len(end_int_list) - 1
        end_int_list[i] += end_or_len - 1
        while end_int_list[i] >= seq_len:
            j = end_int_list[i] // seq_len
            end_int_list[i] = end_int_list[i] % seq_len
            if i == 0:
                end_int_list.insert(0, j-1)
            else:
                i -= 1
                end_int_list[i] += j
    else:
        end_int_list = [sequence.find(c) for c in end_or_len]
    while len(start_int_list) < len(end_int_list) or start_int_list <= end_int_list:
        yield ''.join([sequence[i] for i in start_int_list])
        i = len(start_int_list)-1
        start_int_list[i] += 1
        while start_int_list[i] >= seq_len:
            start_int_list[i] = 0
            if i == 0:
                start_int_list.insert(0,0)
            else:
                i -= 1
                start_int_list[i] += 1
if __name__ =='__main__':
    import doctest
    doctest.testmod()

import string
def letter_range(f,l,al = string.ascii_lowercase):
    for x in al[al.index(f):al.index(l)]:
        yield x
print(' '.join(letter_range('a','h')))
result
a b c d e f g

this is easier for me at least to read/understand (and you can easily customize which letters are included, and in what order):
letters = 'abcdefghijklmnopqrstuvwxyz'
for each in letters:
    print(each)
result:
a
b
c
...
z

how about slicing an already pre-arranged list?
import string
s = string.ascii_lowercase
print( s[ s.index('b'):s.index('o')+1 ] )

Malcom's example works great, but there is a small problem caused by how Python compares lists. A range from 'A' to 'Z', or from any string to 'ZZ' or 'ZZZ', iterates incorrectly.
For the loop to stop, "AA" <= "Z" and "AAA" <= "ZZ" would have to be false, but lists are compared element by element, so both come out true.
In Python [0,0,0] is smaller than [1,1] when compared with the "<" or "<=" operator.
So the line below
while len(start_int_list) < len(end_int_list) or start_int_list <= end_int_list:
should be rewritten as below
while len(start_int_list) < len(end_int_list) or\
( len(start_int_list) == len(end_int_list) and start_int_list <= end_int_list):
It is well explained here
https://docs.python.org/3/tutorial/datastructures.html#comparing-sequences-and-other-types
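The comparison rule can be checked directly. A small sketch, where ordered is a hypothetical helper that mirrors the corrected loop condition:

```python
# Lists compare element by element from the left; length only breaks ties.
print([0, 0, 0] < [1, 1])   # True: decided by 0 < 1 at the first position
print([0, 0] <= [25])       # True: so 'AA' ([0, 0]) still counts as <= 'Z' ([25])

def ordered(a, b):
    """Hypothetical helper: shorter lists sort first, equal lengths compare element-wise."""
    return len(a) < len(b) or (len(a) == len(b) and a <= b)

print(ordered([0, 0], [25]))  # False: 'AA' comes after 'Z' in the intended order
print(ordered([24], [0, 1]))  # True: 'Y' comes before 'AB'
```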
I rewrote the code example below.
def strange(start, end_or_len, sequence='ABCDEFGHIJKLMNOPQRSTUVWXYZ'):
    seq_len = len(sequence)
    start_int_list = [sequence.find(c) for c in start]
    if isinstance(end_or_len, int):
        inclusive = True
        end_int_list = list(start_int_list)
        i = len(end_int_list) - 1
        end_int_list[i] += end_or_len - 1
        while end_int_list[i] >= seq_len:
            j = end_int_list[i] // seq_len
            end_int_list[i] = end_int_list[i] % seq_len
            if i == 0:
                end_int_list.insert(0, j-1)
            else:
                i -= 1
                end_int_list[i] += j
    else:
        end_int_list = [sequence.find(c) for c in end_or_len]
    while len(start_int_list) < len(end_int_list) or\
            (len(start_int_list) == len(end_int_list) and start_int_list <= end_int_list):
        yield ''.join([sequence[i] for i in start_int_list])
        i = len(start_int_list)-1
        start_int_list[i] += 1
        while start_int_list[i] >= seq_len:
            start_int_list[i] = 0
            if i == 0:
                start_int_list.insert(0,0)
            else:
                i -= 1
                start_int_list[i] += 1
Anyway, Malcom's code example is a great illustration of how generators work in Python.

Sometimes one can over-design what can be a simple solution.
If you know the range of letters you want, why not just use:
for letter in "ABCDEFGHIJKLMNOPQRSTUVWXYZ":
    print(letter)
Or even:
start = 4
end = 9
alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
for letter in alphabet[start:end]:
    print(letter)
In the second example I illustrate an easy way to pick how many letters you want from a fixed list.

Related

Counting the number of elements in a molecular compound with Python (recursion if possible)?

So I'm trying to code something to tell me the number of elements in any given compound. I'm not even sure where to start: I tried coding something but then realized that it only worked for simple compounds (or did not work at all). Here's an example of what I want:
>>> function_input : 'NaMg3Al6(BO3)3Si6O18(OH)4', 'O'
>>> function_return : 31
I've come so far in my mess of a code (IT DOESN'T WORK, it just illustrates my rough thought process):
def get_pos_from_count(string: str, letter: str):
    count = string.count(letter)
    lens = [-1]
    for i in range(count):
        lens += [string[lens[i] + 1:].index(letter) + lens[i] + 1]
    return lens[1:]
def find_number(string, letter):
    if string.count(letter) == 0: return 0
    numbers = '1234567890'
    try:
        mul1 = int(string[0])
    except ValueError:
        mul1 = 0
    mul2 = []
    sub_ = 1
    list_of_positions = get_pos_from_count(string, letter)
    for i in list_of_positions:
        try:
            sub_ += int(string[i + 1]) if string[i + 1] in numbers else 0
        except IndexError: pass
        if string[i + 1:].count(')') > string[i + 1].count('('):
            try:
                mul2 += int(string[string[i + 1:].count(')') + 1])
            except (IndexError, ValueError): pass
    return mul1 * sub_ * mul2
The approach I was trying to implement was:
Find the number of occurrences of said element in compound.
Find each subscript, multiplying by subscript outside bracket if said element is in bracket.
Sum up all subscripts, multiply by number of compounds (first character in string)
Return said number to user
But then I realized my code would either be extremely long or require recursion, which I don't know how to apply here.
If possible, I'd like a semi-working function, but a quick tip on how to approach this is also helpful!
And I don't want to use external libraries if possible.
tl;dr: This question for atomicity of elements, without external libraries (if possible).
EDIT: Yes, the question I linked does have hints on how to do this, but when I tried to make any code work for only one element, and set its weight to 1, I ran into a host of issues I don't know how to solve.
Let us divide the task into three parts:
Tokenize the string into a list of elements, numbers, and brackets;
Parse the bracket to have a nested list with sublists;
Count the elements in the nested list.
Introducing my tools:
Tokenize: more_itertools.split_when;
Parsing brackets: recursion;
Counting elements: collections.Counter.
from more_itertools import split_when, pairwise
from itertools import chain
from collections import Counter
def nest_brackets(tokens, i = 0):
    l = []
    while i < len(tokens):
        if tokens[i] == ')':
            return i,l
        elif tokens[i] == '(':
            i,subl = nest_brackets(tokens, i+1)
            l.append(subl)
        else:
            l.append(tokens[i])
        i += 1
    return i,l
def parse_compound(s):
    tokens = [''.join(t) for t in split_when(s, lambda a,b: b.isupper() or b in '()' or (b.isdigit() and not a.isdigit()))]
    tokens = [(int(t) if t.isdigit() else t) for t in tokens]
    i, l = nest_brackets(tokens)
    assert(i == len(tokens)) # crash if unmatched ')'
    return l
def count_elems(parsed_compound):
    c = Counter()
    for a,b in pairwise(chain(parsed_compound, (1,))):
        if not isinstance(a, int):
            subcounter = count_elems(a) if isinstance(a, list) else {a: 1}
            n = b if isinstance(b, int) else 1
            for elem,k in subcounter.items():
                c[elem] += k * n
    return c
s = 'NaMg3Al6(B(CO2)3)3Si6O18(OH)4'
l = parse_compound(s)
print(l)
# ['Na', 'Mg', 3, 'Al', 6, ['B', ['C', 'O', 2], 3], 3, 'Si', 6, 'O', 18, ['O', 'H'], 4]
c = count_elems(l)
print(c)
# Counter({'O': 40, 'C': 9, 'Al': 6, 'Si': 6, 'Mg': 3, 'B': 3, 'Na': 1})
print(c['O'])
# 40
Try with this recursive function:
import re
def find_number(s, e):
    if s == '':
        return 0
    # Match an innermost bracketed group and its optional multiplier.
    x = re.search(r'\(([A-Za-z0-9]*)\)([0-9]*)', s)
    if x is not None:
        return (find_number(s[:x.start()] + s[x.end():], e) +
                find_number(x.groups()[0], e) * int(x.groups()[1] or '1'))
    # (?![a-z]) keeps 'C' from matching the start of 'Ca' or 'Cl'.
    return sum(int(x.groups()[0] or '1')
               for x in re.finditer(rf'{e}(?![a-z])([0-9]*)', s))
Example:
>>> find_number('NaMg3Al6(BO3)3Si6O18(OH)4', 'O')
31
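For comparison, here is a minimal sketch that avoids both external libraries and recursion by keeping a stack of Counters, one per open bracket. It is an alternative to the answers above, not a rewrite of them. The name count_atoms is made up, and it assumes a well-formed formula with balanced brackets and no leading coefficient:

```python
import re
from collections import Counter

def count_atoms(formula):
    """Count atoms per element; nested brackets are handled with a stack."""
    # Tokens are element symbols, numbers, or single brackets.
    tokens = re.findall(r'[A-Z][a-z]*|\d+|[()]', formula)
    stack = [Counter()]  # stack[-1] collects counts for the innermost open group
    for i, tok in enumerate(tokens):
        # A number right after an element or ')' is that item's multiplier.
        nxt = tokens[i + 1] if i + 1 < len(tokens) else ''
        mult = int(nxt) if nxt.isdigit() else 1
        if tok == '(':
            stack.append(Counter())
        elif tok == ')':
            group = stack.pop()
            for elem, n in group.items():
                stack[-1][elem] += n * mult
        elif not tok.isdigit():  # digits were already used as multipliers
            stack[-1][tok] += mult
    return stack[0]

print(count_atoms('NaMg3Al6(BO3)3Si6O18(OH)4')['O'])  # 31
```

Because every closing bracket folds its group into the one below it, nested groups such as (B(CO2)3)3 are multiplied through both levels.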

Is there a way to improve a function that finds the substrings of a very large string?

I tried to run this code, but the function takes too much time. I want to improve it:
def minion_game(string):
    k = 0
    s = 0
    for i in range(len(string)):
        for j in range(i + 1, len(string) + 1):
            ss = string[i:j]
            if ss[0] in ['A', 'E', 'I', 'O', 'U']:
                k += 1
            else:
                s += 1
    if len(string) in range(0, 10 ** 6):
        if string.isupper():
            if k > s:
                print(f"Kevin {k}")
            if s > k:
                print(f"Stuart {s}")
            if k == s:
                print("Draw")
Using the Counter class is usually pretty efficient in a case like this. Be aware that this version counts each vowel and consonant once rather than once per substring, so its totals will differ from yours, but it should be much quicker.
from collections import Counter
k_and_s = Counter('k' if c in 'AEIOU' else 's' for c in string)
k, s = k_and_s['k'], k_and_s['s']
if k > s:
    print(f'Kevin {k}')
elif k < s:
    print(f'Stuart {s}')
else:
    print('Draw')
Zooming in on k_and_s = Counter('k' if c in 'AEIOU' else 's' for c in string), this uses a generator expression in place of a loop. It is roughly equivalent to this:
k_and_s = Counter()
for c in string:
    if c in 'AEIOU':
        k_and_s['k'] += 1
    else:
        k_and_s['s'] += 1
The answer by @jamie-deith is good and fast. It will process the complete works of Shakespeare in about 0.56 seconds on my computer. I gave up timing the original answer and modifications of it, as it simply goes on and on.
This version is simple and produces the same answer in 0.26 seconds. I'm sure there are even faster answers:
with open("shakespeare.txt", encoding="utf-8") as file_in:
    shakespeare = file_in.read().upper()
kevin = len([character for character in shakespeare if character in 'AEIOU'])
stuart = len(shakespeare) - kevin
if kevin > stuart:
    print(f'Kevin {kevin}')
elif kevin < stuart:
    print(f'Stuart {stuart}')
else:
    print(f'Draw')
Taking the (perhaps doubtful) position that your code is doing what you intend, but slowly, I note:
The amount added to k or s for any value of i depends on how many times we go around the j loop. You're repeatedly testing the character at i (with the same result every time of course) and adding one to either s or k, as many times as we go around the loop.
So we don't need to actually go around the j loop; we can just add that amount on a single test. For the first character you go the same number of times around the loop as the length of the string, then reducing by one as you shift along the string.
So we can lose i and iterate through the string characters directly.
Then finally we don't report anything if the string is too long or not upper case, so we can just do that test first, and not even calculate in those circumstances.
def minion_game(string):
    if len(string) < 10**6 and string.isupper():
        k = 0
        s = 0
        j = len(string)
        for ss in string:
            if ss in 'AEIOU':
                k += j
            else:
                s += j
            j -= 1 # reducing amount to add
        if k > s:
            print(f"Kevin {k}")
        elif s > k:
            print(f"Stuart {s}")
        else:
            print("Draw")
As a hint for even faster options, I'll note that k+s is constant depending on the length of the string.
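Following that hint, here is a sketch under the same scoring rules as the code above. A string of length n has n*(n+1)/2 substrings in total, so only one score needs to be accumulated. The function name minion_scores is an assumption, and it returns the scores instead of printing them:

```python
def minion_scores(string):
    """Return (kevin, stuart): substrings starting with a vowel vs. all the others."""
    n = len(string)
    # The character at index i starts n - i substrings.
    kevin = sum(n - i for i, ch in enumerate(string) if ch in 'AEIOU')
    # Every substring belongs to exactly one player, so the scores sum to n*(n+1)/2.
    stuart = n * (n + 1) // 2 - kevin
    return kevin, stuart

print(minion_scores('BANANA'))  # (9, 12)
```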
def minion_game(string):
    k = 0
    s = 0
    l = len(string) # save length in a variable
    for i in range(l):
        for j in range(i + 1, l + 1):
            ss = string[i] # take only the first
            if ss in ['A', 'E', 'I', 'O', 'U']:
                k += 1
            else:
                s += 1
    if l in range(0, 10 ** 6):
        if string.isupper():
            if k > s: # change from three if's to if, elif, else
                print(f"Kevin {k}")
            elif s > k:
                print(f"Stuart {s}")
            else:
                print("Draw")
I made a few edits that should speed up your code. They are described in comments on the lines. There seems to be some logic missing in the j-loop.
I'm not sure what you are doing on the line if l in range(0, 10 ** 6):. If you wanted to remove it, then it'd look like:
def minion_game(string):
    k = 0
    s = 0
    l = len(string) # save length in a variable
    for i in range(l):
        for j in range(i + 1, l + 1):
            ss = string[i] # take only the first
            if ss in ['A', 'E', 'I', 'O', 'U']:
                k += 1
            else:
                s += 1
    # removed the length check, which definitely saves time
    if string.isupper():
        if k > s: # change from three if's to if, elif, else
            print(f"Kevin {k}")
        elif s > k:
            print(f"Stuart {s}")
        else:
            print("Draw")

How to compute the weight of a string in Python?

Given a string S, we define its weight, weight(S), as the product of the positions of the vowels in the string (counting from 1). Ex: weight(“e”) = 1; weight(“age”) = 3; weight(“pippo”) = 10.
I tried this:
def weight(s):
    vowels = ['a','e','i','o','u']
    numbers = []
    for c in s:
        if c in vowels:
            n = s.index(c)+1
            numbers.append(n)
    result = 1
    for x in numbers:
        result = result*x
    print(result)
But it works only with different vowels. If there is the same vowel in the string, the number is wrong.
What am I missing?
Thank you all.
You can use this:
import numpy as np
s = 'pippo'
np.prod([i+1 for i,v in enumerate(s) if v in ['a','e','i','o','u']])
10
str.index() works like str.find() in that it only returns the index of the first occurrence:
Return the lowest index in the string where substring sub is found [...]
(Source: str.index -> str.find)
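To see the effect on a string with a repeated vowel, here is a small sketch (the word "banana" is just an illustrative input):

```python
word = "banana"
# str.index always reports the first match, so every 'a' maps to index 1.
print([word.index(c) for c in word if c == "a"])    # [1, 1, 1]
# enumerate pairs each character with its own index instead.
print([i for i, c in enumerate(word) if c == "a"])  # [1, 3, 5]
```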
functools.reduce and operator.mul together with enumerate (from 1) makes this a one-liner:
from operator import mul
from functools import reduce
value = reduce(mul, (i for i,c in enumerate("pippo",1) if c in "aeiou"))
Or for all your strings:
for t in ["e","age","pippo"]:
    # oneliner (if you omit the imports and iterating over all your given examples)
    print(t, reduce(mul, (i for i,c in enumerate(t,1) if c in "aeiou")))
Output:
e 1
age 3
pippo 10
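One caveat with the reduce one-liner: without an initial value it raises a TypeError when the string contains no vowels. A hedged alternative, assuming Python 3.8 or newer (where math.prod is available), returns 1, the empty product, in that case. The function name weight is reused from the question and the vowels parameter is added for illustration:

```python
from math import prod

def weight(s, vowels="aeiou"):
    """Product of the 1-based positions of the vowels in s (1 if there are none)."""
    return prod(i for i, c in enumerate(s, 1) if c in vowels)

for t in ["e", "age", "pippo", "xyz"]:
    print(t, weight(t))
```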
Maybe not an optimal way to do it, but this works.
vowels = ['a', 'e', 'i', 'o', 'u', 'y']
mystring = 'pippo'
weight = 1
i = 0
while i < len(mystring):
    if mystring[i] in vowels:
        weight *= i+1
    i += 1
if weight == 1 and mystring[0] not in vowels:
    weight = 0
print(weight)
The final if statement handles the one exceptional case where the string contains no vowels.
You may want to use enumerate. Makes the job easy
The code becomes:
def weight(s):
    vowels = ['a','e','i','o','u']
    wt=1
    for i,c in enumerate(s):
        if c in vowels:
            wt*=i+1
    return wt
print(weight("asdew"))
When you call s.index(c), it returns the index of the first occurrence of the character in the string.
You should use enumerate to iterate through the string. It gives you both the index and the value of each element while iterating over an iterable.
def weight(s):
    vowels = ['a','e','i','o','u']
    numbers = []
    for ind, c in enumerate(s):
        if c in vowels:
            n = ind+1
            numbers.append(n)
    result = 1
    for x in numbers:
        result = result*x
    print(result)
You can read about enumerate at the link below:
http://book.pythontips.com/en/latest/enumerate.html

Finding Subarrays of Vowels from a given String

You are given a string S, and you have to find all the amazing substrings of S.
Amazing Substring is one that starts with a vowel (a, e, i, o, u, A, E, I, O, U).
Input
The only argument given is string S.
Output
Return a single integer X mod 10003, here X is number of Amazing Substrings in given string.
Constraints
1 <= length(S) <= 1e6
S can have special characters
Example
Input
ABEC
Output
6
Explanation
Amazing substrings of given string are :
1. A
2. AB
3. ABE
4. ABEC
5. E
6. EC
here number of substrings are 6 and 6 % 10003 = 6.
I have implemented the following algo for the above Problem.
class Solution:
    # @param A : string
    # @return an integer
    def solve(self, A):
        x = ['a', 'e','i','o', 'u', 'A', 'E', 'I', 'O', 'U']
        y = []
        z = len(A)
        for i in A:
            if i in x:
                n = A.index(i)
                m = z
                while m > n:
                    y.append(A[n:m])
                    m -= 1
        if y:
            return len(y)%10003
        else:
            return 0
Above Solution works fine for strings of normal length but not for greater length.
For example,
A = "pGpEusuCSWEaPOJmamlFAnIBgAJGtcJaMPFTLfUfkQKXeymydQsdWCTyEFjFgbSmknAmKYFHopWceEyCSumTyAFwhrLqQXbWnXSn"
The above algorithm outputs 1630 subarrays, but the expected answer is 1244.
Please help me improve the above algorithm. Thanks for the help.
Focus on the required output: you do not need to find all of those substrings. All you need is the quantity of substrings.
Look again at your short example, ABEC. There are two vowels, A and E.
A is at location 0. There are 4 total substrings, ending there and at each following location.
E is at location 2. There are 2 total substrings, ending there and at each following location.
2+4 => 6
All you need do is to find the position of each vowel, subtract from the string length, and accumulate those differences:
A = "pGpEusuCSWEaPOJmamlFAnIBgAJGtcJaMPFTLfUfkQKXeymydQsdWCTyEFjFgbSmknAmKYFHopWceEyCSumTyAFwhrLqQXbWnXSn"
lenA = len(A)
vowel = "aeiouAEIOU"
count = 0
for idx, char in enumerate(A):
    if char in vowel:
        count += lenA - idx
print(count%10003)
Output:
1244
In a single command:
print( sum(len(A) - idx if char.lower() in "aeiou" else 0
for idx, char in enumerate(A)) )
When you hit a vowel in a string, all sub-strings that start with this vowel are 'amazing' so you can just count them:
def solve(A):
    x = ['a', 'e','i','o', 'u', 'A', 'E', 'I', 'O', 'U']
    ans = 0
    for i in range(len(A)):
        if A[i] in x:
            ans = (ans + len(A)-i)%10003
    return ans
When you are looking for the index of the element n = A.index(i), you get the index of the first occurrence of the element. By using enumerate you can loop through indices and elements simultaneously.
def solve(A):
    x = ['a', 'e','i','o', 'u', 'A', 'E', 'I', 'O', 'U']
    y = []
    z = len(A)
    for n,i in enumerate(A):
        if i in x:
            m = z
            while m > n:
                y.append(A[n:m])
                m -= 1
    if y:
        return len(y)%10003
    else:
        return 0
A more general solution is to find all amazing substrings and then count them :
string = "pGpEusuCSWEaPOJmamlFAnIBgAJGtcJaMPFTLfUfkQKXeymydQsdWCTyEFjFgbSmknAmKYFHopWceEyCSumTyAFwhrLqQXbWnXSn"
amazing_substring_start = ['a','e','i','o','u','A','E','I','O','U']
amazing_substrings = []
for i in range(len(string)):
    if string[i] in amazing_substring_start:
        for j in range(len(string[i:])+1):
            amazing_substring = string[i:i+j]
            if amazing_substring!='':
                amazing_substrings += [amazing_substring]
print(amazing_substrings, len(amazing_substrings)%10003)
Create a loop to calculate the number of amazing subarrays created by every vowel:
def Solve(A):
    sumn = 0
    for i in range(len(A)):
        if A[i] in "aeiouAEIOU":
            sumn += len(A[i:])
    return sumn%10003

String index out of range in Python

def romanToNum(word):
    word = word.upper()
    numeralList2 = list(zip(
        [1000, 500, 100, 50, 10, 5, 1],
        ['M', 'D', 'C', 'L', 'X', 'V', 'I']
    ))
    num = 0
    x = []
    a = 0
    b = 2
    if len(word) % 2 != 0:
        word = word + "s"
    for i in range(0,len(word)):
        x.append(word[a:b])
        a = a + 2
        b = b + 2
        print(x[i])
    for n in x:
        for nNum,rNum in numeralList2:
            if n == rNum:
                num = nNum + num
            elif n == (n[0] + n[1]):
                num = (nNum*2) + num
            elif n[0] == rNum:
                r1 = 0
                r1 = nNum
            elif n[1] == rNum:
                r2 = 0
                r2 = nNum
            elif r1 < r2:
                num = num + (r2 - r1)
            elif r1 > r2:
                num = num + (r1 + r2)
    return num
romanToNum("xxx")
I am getting the following error:
elif n == (n[0] + n[1]):
IndexError: string index out of range
and it doesn't matter where I put that in the loop, it just won't recognize that n has an index value.
I also get this error: Traceback (most recent call last):
which points to where I call my function: romanToNum("xxx")
I'm not really sure what's going on, because I added a print statement where I'm appending to my list and there is an index of at least [0] when I print it all out. Any help here?
I have looked through Stack Overflow for similar questions, but the solutions there were an indentation fix or a negative index ([-1]) or something along those lines, whereas all my indentation is correct and my indices are all positive.
Well, n is an element of x. The IndexError on the line n == n[0] + n[1] means that a certain n has length less than 2.
You added word = word + 's', probably to guard against having one-character elements in x, but it doesn't really work.
If you look at how you build the x list you do:
x = []
a = 0
b = 2
if len(word) % 2 != 0:
    word = word + "s"
for i in range(0,len(word)):
    x.append(word[a:b])
    a = a + 2
    b = b + 2
    print(x[i])
So in your example you start with x = [] and word = 'XXX'. Then you add an s to obtain word = 'XXXs'.
The loop over i does the following:
i=0 so x.append(word[0:2]); a = a+2; b = b+2 so that x = ['XX'] and a=2 and b=4.
i=1 so x.append(word[2:4]); a = a+2; b = b+2 so that x = ['XX', 'Xs'] and a=4 and b=6.
i=2 so x.append(word[4:6]); a = a+2; b = b+2 so that x = ['XX', 'Xs', ''] and a=6 and b=8.
i=3 so x.append(word[6:8]); a = a+2; b = b+2 so that x = ['XX', 'Xs', '', ''] and a=8 and b=10.
And here you see that n can be the empty string, which means when doing n == n[0] + n[1] you end up with an IndexError.
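The difference between slicing and indexing past the end can be seen in isolation in this short sketch, which uses the same word as the walkthrough:

```python
word = "XXXs"
print(repr(word[2:4]))  # 'Xs'
print(repr(word[4:6]))  # '': slicing past the end quietly returns an empty string

n = word[4:6]
try:
    n[0]  # indexing, unlike slicing, fails on an empty string
except IndexError as err:
    print("IndexError:", err)
```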
I believe you wanted to group the characters two by two, but then the i should use a step of 2:
for i in range(0, len(word), 2):
    x.append(word[i:i+2])
In this way i is 0, then 2, then 4 etc
By the way: once you have fixed this the condition n == n[0] + n[1] seems pretty odd, because if n is a two character string (as it should be if you fix the code) then the condition will always be true. What are you trying to really do here?
This is the culprit:
for i in range(0,len(word)):
    x.append(word[a:b])
    a = a + 2
    b = b + 2
At the end of this loop, x will be ['XX', 'Xs', '', '']. Since you are grouping the characters in groups of two, the total number of groups will be half the length of the string. So just halve the number of iterations with range(0,len(word)//2) or range(0,len(word),2)
Your first for loop runs farther than expected and ends up assigning an empty string to x[i]. It should probably be: for i in range(int(len(word)/2)):
Then your second loop needs fixing too.
if n == rNum: is never true, since rNum is a one-character string and each element of x has length 2. Try n == rNum+"s".
n == n[0] + n[1] is always True for a string of 2 characters. You must mean n == rNum * 2
Also, the use of x += 1 is recommended instead of x = x + 1.
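For reference, here is a sketch of a different, commonly used approach rather than a fix of the pair-grouping code above: walk the numerals left to right and subtract a value whenever a larger one follows it. The names ROMAN_VALUES and roman_to_num are made up for this example, and the input is assumed to contain only valid numerals:

```python
ROMAN_VALUES = {'M': 1000, 'D': 500, 'C': 100, 'L': 50, 'X': 10, 'V': 5, 'I': 1}

def roman_to_num(word):
    """Convert a Roman numeral string to an integer (no validation is done)."""
    values = [ROMAN_VALUES[ch] for ch in word.upper()]
    total = 0
    # Pair each value with the one after it; the last value is paired with 0.
    for current, following in zip(values, values[1:] + [0]):
        if current < following:
            total -= current  # subtractive notation, e.g. the I in IV
        else:
            total += current
    return total

print(roman_to_num("xxx"))      # 30
print(roman_to_num("MCMXCIV"))  # 1994
```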
