Loop to find space with python - python

c = "ab cd ef gf"
n = []
for x in c:
    if x == " ":
        d = c.find(x)
        n.append(d)
print(n)
I want this code to give me something like this. [2,5,8]
But instead it is giving me this. [2,2,2]
Please help me find the mistake. Thank you.

find() will find the first instance, so it always finds the space at index 2. You could keep track of the index as you go with enumerate() so you don't need find():
c = "ab cd ef gf"
n = []
for i, x in enumerate(c):
    if x == " ":
        n.append(i)
print(n)
Alternatively as a list comprehension:
[i for i, x in enumerate(c) if x == " "]
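To see why the original loop produces [2, 2, 2], it may help to look at what find() returns on its own. This is a minimal sketch using the string from the question; the variable names first, second, third and positions are only illustrative. The optional second argument of str.find is the index at which the search starts.

```python
c = "ab cd ef gf"

# find() scans from the beginning unless told otherwise,
# so every plain call returns the index of the first space.
first = c.find(" ")
print(first)            # 2

# Passing a start index makes find() skip what was already seen.
second = c.find(" ", first + 1)
third = c.find(" ", second + 1)
print(second, third)    # 5 8

# enumerate() avoids the repeated searching altogether.
positions = [i for i, ch in enumerate(c) if ch == " "]
print(positions)        # [2, 5, 8]
```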

One way to do it would be:
s = "ab cd ef gf"
space_idxs = []
for idx, char in enumerate(s):
    if char == ' ':
        space_idxs.append(idx)
print(space_idxs)

That's because find(pattern) returns the index of the first occurrence of the pattern. Here is a find_all(string, pattern) function you can add to your code:
def find_all(string, pattern):
    start = 0
    indexes = []
    while True:
        start = string.find(pattern, start)
        if start == -1:
            return indexes
        indexes.append(start)
        # skip past this match; max() keeps an empty pattern from looping forever
        start += max(len(pattern), 1)
c = "ab cd ef gf"
n = find_all(c, " ")
print(n)
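Because find_all above advances by len(pattern) after each hit, overlapping matches of a multi-character pattern are skipped. If overlapping matches matter, a variant that advances by one position is one option. The function name find_all_overlapping is hypothetical, and treating an empty pattern as "no matches" is an assumption made for this sketch.

```python
def find_all_overlapping(string, pattern):
    """Return the start index of every match, overlapping ones included."""
    if not pattern:
        # Assumption: an empty pattern is treated as having no matches.
        return []
    indexes = []
    start = string.find(pattern)
    while start != -1:
        indexes.append(start)
        # Move one character forward so overlapping matches are still found.
        start = string.find(pattern, start + 1)
    return indexes

print(find_all_overlapping("ab cd ef gf", " "))  # [2, 5, 8]
print(find_all_overlapping("aaaa", "aa"))        # [0, 1, 2]
```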

Try:
c = "ab cd ef gh"
x = " "
print([t for t, k in enumerate(c) if k == x])
It will print [2, 5, 8].
In your code you search for the index of x in c three times:
the for loop takes the characters of your string one by one,
the if statement checks whether the character is a space,
and when it is a space, the body of the if statement runs.
find() then looks for x (a space) in c, starting from the beginning,
which always gives 2.
This is repeated three times, so 2 is appended to n three times.
If you want the result in a list:
n = [t for t, k in enumerate(c) if k == x]

Related

Print all the "a" of a string on python

I'm trying to print all the "a" characters (or any other character) in an input string. I tried the .find() function, but it prints only one position. How do I print every position where a specific character such as "a" occurs?
You can use find() in a while loop:
a = "aabababaa"
k = 0
while True:
    k = a.find("a", k)  # resume the search from where the last match was found
    if k == -1:
        break
    print(k)
    k += 1
This can also be done with less code if you don't need to control where the search starts:
a = "aabababaa"
for i, c in enumerate(a):  # i = index, c = character, provided by enumerate()
    if c == "a":
        print(i)
Some first-get-all-matches-and-then-print versions:
With a list comprehension:
s = "aslkbasaaa"
CHAR = "a"
pos = [i for i, char in enumerate(s) if char == CHAR]
print(*pos, sep='\n')
or with itertools and composition of functions
from itertools import compress
s = "aslkbasaaa"
CHAR = "a"
pos = compress(range(len(s)), map(CHAR.__eq__, s))
print(*pos, sep='\n')
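If regular expressions are acceptable, re.finditer is another way to collect every position in one pass. This is only a sketch of an alternative, not something the answers above rely on; re.escape is used so that characters with a special regex meaning are matched literally.

```python
import re

s = "aslkbasaaa"
CHAR = "a"

# Each match object knows where it starts; collect those start indexes.
positions = [m.start() for m in re.finditer(re.escape(CHAR), s)]
print(positions)  # [0, 5, 7, 8, 9]
```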

How do you separate all possible substrings in a string?

For example, let's say:
Str = "abc"
The desired output I am looking for is:
a, b, c, ab, bc, abc
so far I have:
# input
Str = input("Please enter a word: ")
# length of the word
n = len(Str)
# nested for loops to separate the string into substrings
for Len in range(1, n + 1):
    for i in range(n - Len + 1):
        j = i + Len - 1
        for k in range(i, j + 1):
            # printing all the substrings
            print(Str[k], end="")
this would get me:
abcabbcabc
which has all the correct substrings, but not separated. What do I do to get my desired output? I thought end='' would do the trick of putting each substring on its own line, but it doesn't. Any suggestions?
You could add an extra print() in the i loop, but it's easier to use a slice instead:
s = "abc"
n = len(s)
for size in range(1, n+1):
    for start in range(n-size+1):
        stop = start + size
        print(s[start:stop])
Output:
a
b
c
ab
bc
abc
On the other hand, if you want them literally joined on comma-spaces as you wrote, the simplest way is to save them in a list then join at the end.
s = "abc"
n = len(s)
L = []
for size in range(1, n+1):
    for start in range(n-size+1):
        stop = start + size
        L.append(s[start:stop])
print(*L, sep=', ')
Or, I would probably use a list comprehension for this:
s = "abc"
n = len(s)
L = [s[j:j+i] for i in range(1, n+1) for j in range(n-i+1)]
print(*L, sep=', ')
Output:
a, b, c, ab, bc, abc
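A more Pythonic-looking alternative uses itertools.combinations. Note that it produces subsequences rather than contiguous substrings, so the result also contains 'ac':
Code:
import itertools
s = "abc"
for i in range(1, len(s) + 1):
    print(["".join(word) for word in itertools.combinations(s, i)])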
result:
['a', 'b', 'c']
['ab', 'ac', 'bc']
['abc']
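The difference between contiguous substrings and the output of combinations can be checked directly. The generator name substrings below is hypothetical and simply wraps the slicing approach from the earlier answer.

```python
from itertools import combinations

def substrings(s):
    """Yield every contiguous substring of s, shortest first."""
    n = len(s)
    for size in range(1, n + 1):
        for start in range(n - size + 1):
            yield s[start:start + size]

s = "abc"
subs = list(substrings(s))
print(", ".join(subs))  # a, b, c, ab, bc, abc

# combinations() picks characters by position without requiring adjacency,
# so it can return strings that never appear contiguously in s.
subseqs = ["".join(c) for size in range(1, len(s) + 1)
           for c in combinations(s, size)]
extra = [x for x in subseqs if x not in subs]
print(extra)  # ['ac']
```

A string of length n has n * (n + 1) / 2 contiguous substrings when counted by position, which is 6 for "abc".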

how to extract the max numeric substring from a string?

Given a string, how can I extract the biggest numeric substring without using regex?
For example, given the string 24some555rrr444,
555 is the biggest numeric substring.
def maximum(s1)
    sub=[]
    max=0
    for x in s1
        if x.isnummeric() and x>max
            sub.append(x)
            max=x
    return max
What do I need to change to make this code work?
Thank you in advance!
Replace every non-digit with a space, split the result on whitespace, convert each piece to an int, and then take the max:
>>> s = '24some555rrr444'
>>> max(map(int, ''.join(c if c.isdigit() else ' ' for c in s).split()))
555
You could use itertools.groupby to pull out the digits in groups and find the max:
from itertools import groupby
s = "24some555rrr444"
max(int(''.join(g)) for k, g in groupby(s, key=str.isdigit) if k)
# 555
Not using regex is weird, but OK:
s = "24some555rrr444"
n = len(s)
m = 0
for i in range(n):
    for end in range(i + 1, n + 1):
        try:
            v = int(s[i:end])
            m = v if v > m else m
        except ValueError:
            # this slice is not a valid integer, so skip it
            pass
print(m)
Or, if you really want to compress it to basically one line (apart from the convert function), you can use:
s = "24some555rrr444"
n = len(s)
def convert(s):
    try:
        return int(s)
    except ValueError:
        return -1
m = max(convert(s[i:l]) for i in range(n) for l in range(i + 1, n + 1))
print(m)
To stay close to your original approach, I propose this:
# A sample string mixing digits and letters
OriginalStringChain = "123245hjkh2313k313j23b"
# List that will hold the numbers extracted from the string
resultstable = []
# b accumulates the digits of the number currently being read
b = ""
last = len(OriginalStringChain) - 1
for j, i in enumerate(OriginalStringChain):
    # isdigit() is the Python counterpart of the isnumeric check you attempted
    # Digits are concatenated one by one; when a run of digits ends, the number is added to resultstable
    if i.isdigit():
        b += i
        # the run ends at the last character or just before a non-digit
        if j == last or not OriginalStringChain[j + 1].isdigit():
            resultstable.append(int(b))
    else:
        b = ""
print(resultstable)
# Every value in the list is one of the extracted numbers, so you can simply use the max function
print(max(resultstable))
I hope that was clear.
Cheers
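For comparison, the same single-pass idea can be written without building an intermediate list. This is a sketch under stated assumptions: the function name max_numeric_run is hypothetical, only the ASCII digits 0 to 9 count as digits, and None is returned when the text contains no digits.

```python
def max_numeric_run(text):
    """Return the largest integer formed by consecutive digits, or None."""
    best = None
    current = ""
    # The trailing space acts as a sentinel so the last run is flushed too.
    for ch in text + " ":
        if ch in "0123456789":
            current += ch
        elif current:
            value = int(current)
            if best is None or value > best:
                best = value
            current = ""
    return best

print(max_numeric_run("24some555rrr444"))          # 555
print(max_numeric_run("123245hjkh2313k313j23b"))   # 123245
```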

Python Optimization : Find the most occured sequence of 4 letters inside a 1000 letters string randomly generated

I'm here to ask for help with my program.
I wrote a program whose raison d'être is to find the most frequent four-letter string within a longer, randomly generated string of x letters.
For example, to find the most frequent four-letter sequence in 'abcdeabcdef', it is easy to see that it is 'abcd', so the program returns that.
Unfortunately, my program is very slow: it takes 119.7 seconds to analyze all the possibilities and display the results for a string of only 1000 letters.
This is my program right now:
import random
chars = ['a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z']
string = ''
for _ in range(1000):
    string += str(chars[random.randint(0, 25)])
print(string)
number = []
for ____ in range(0,26):
    print(____)
    for ___ in range(0,26):
        for __ in range(0, 26):
            for _ in range(0, 26):
                test = chars[____] + chars[___] + chars[__] + chars[_]
                print('trying :',test, end = ' ')
                number.append(0)
                for i in range(len(string) -3):
                    if string[i: i+4] == test:
                        number[len(number) -1] += 1
                print('>> finished')
_max = max(number)
for i in range(len(number)-1):
    if number[i] == _max :
        j, k, l, m = i, 0, 0, 0
        while j > 25:
            j -= 26
            k += 1
        while k > 25:
            k -= 26
            l += 1
        while l > 25:
            l -= 26
            m += 1
        Result = chars[m] + chars[l] + chars[k] + chars[j]
        print(str(Result),'occured',_max, 'times' )
I think there are ways to optimize it, but at my level I really don't know how. Maybe the structure itself is not the best. I hope you can help me :D
You only need to loop through your string once to count the 4-letter sequences. Your current code tries all 26**4 combinations and rescans the string for each one. You can use zip to build the 997 overlapping four-letter tuples, then use Counter to count them:
from collections import Counter
import random
chars = ['a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z']
s = "".join([chars[random.randint(0, 25)] for _ in range(1000)])
it = zip(s, s[1:], s[2:], s[3:])
counts = Counter(it)
counts.most_common(1)
Edit:
.most_common(x) returns a list of the x most common strings. counts.most_common(1) returns a single item list with the tuple of letters and number of times it occurred like; [(('a', 'b', 'c', 'd'), 2)]. So to get a string, just index into it and join():
''.join(counts.most_common(1)[0][0])
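If string keys are preferred over tuples, the same counting can be done on slices. This is a minimal sketch; the function name most_common_window and its length parameter are hypothetical, and the example string is the one from the question.

```python
from collections import Counter

def most_common_window(s, length=4):
    """Return (substring, count) for the most frequent window, or None."""
    # One pass over the string: every window of `length` characters is counted.
    counts = Counter(s[i:i + length] for i in range(len(s) - length + 1))
    return counts.most_common(1)[0] if counts else None

result = most_common_window("abcdeabcdef")
print(result)  # the count is 2; note that 'abcd' and 'bcde' are tied
```

When several windows share the top count, only one of them is returned, so a tie needs separate handling if all winners are wanted.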
Even with your current approach of iterating through every possible 4-letter combination, you can speed up a lot by keeping a dictionary instead of a list, and testing whether the sequence occurs at all first before trying to count the occurrences:
counts = {}
for a in chars:
    for b in chars:
        for c in chars:
            for d in chars:
                test = a + b + c + d
                print('trying :', test, end=' ')
                if test in s:  # if it occurs at all
                    # then record how often it occurs
                    # (len(s) - 3 start positions, so the last window is included)
                    counts[test] = sum(1 for i in range(len(s) - 3)
                                       if test == s[i:i+4])
The nested loops can be replaced with itertools.product, though this improves readability rather than performance:
import itertools
length = 4
for sequence in itertools.product(chars, repeat=length):
    test = "".join(sequence)
    if test in s:
        counts[test] = sum(1 for i in range(len(s) - length + 1) if test == s[i:i+length])
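One detail worth checking when swapping nested loops for an itertools helper: permutations never reuses an element, whereas product with repeat=n generates the same candidates as n nested loops over the same list. A small check on a two-letter alphabet (the variable names are illustrative):

```python
from itertools import permutations, product

letters = "ab"

# permutations() draws without replacement: no 'aa' or 'bb'.
perms = ["".join(p) for p in permutations(letters, 2)]
print(perms)  # ['ab', 'ba']

# product(..., repeat=2) matches two nested loops over the same letters.
prods = ["".join(p) for p in product(letters, repeat=2)]
print(prods)  # ['aa', 'ab', 'ba', 'bb']
```

With 26 letters and length 4, product yields 26**4 candidates, while permutations would omit every sequence containing a repeated letter.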
You can then display the results like this:
_max = max(counts.values())
for k, v in counts.items():
    if v == _max:
        print(k, "occurred", _max, "times")
Provided that the string is shorter or around the same length as 26**4 characters, then it is much faster still to iterate through the string rather than through every combination:
length = 4
counts = {}
for i in range(len(s) - length + 1):  # + 1 so the final window is counted
    sequence = s[i:i+length]
    if sequence in counts:
        counts[sequence] += 1
    else:
        counts[sequence] = 1
This is equivalent to the Counter approach already suggested.

Code to output the first repeated character in given string?

I'm trying to find the first repeated character in my string and output that character using Python. When checking my code, I can see that I'm not indexing the last character of my string.
What am I doing wrong?
letters = 'acbdc'
for a in range (0,len(letters)-1):
    #print(letters[a])
    for b in range(0, len(letters)-1):
        #print(letters[b])
        if (letters[a]==letters[b]) and (a!=b):
            print(b)
            b=b+1
a=a+1
You can do this in an easier way:
letters = 'acbdc'
found_dict = {}
for i in letters:
    if i in found_dict:
        print(i)
        break
    else:
        found_dict[i] = 1
Output:
c
Here's a solution with sets, it should be slightly faster than using dicts.
letters = 'acbdc'
seen = set()
for letter in letters:
    if letter in seen:
        print(letter)
        break
    else:
        seen.add(letter)
Here is a solution that would stop iteration as soon as it finds a dup
>>> from itertools import dropwhile
>>> s=set(); next(dropwhile(lambda c: not (c in s or s.add(c)), letters))
'c'
You should use range(0, len(letters)) instead of range(0, len(letters) - 1) because range already stops counting at one less than the designated stop value. Subtracting 1 from the stop value simply makes you skip the last character of letters in this case.
Please read the documentation of range:
https://docs.python.org/3/library/stdtypes.html#range
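A quick check of the two ranges on the string from the question shows which indexes each one visits. The names short and full are illustrative.

```python
letters = 'acbdc'

# The stop value is excluded, so subtracting 1 drops the last index (4).
short = list(range(0, len(letters) - 1))
print(short)  # [0, 1, 2, 3]

# Using len(letters) as the stop value covers every index of the string.
full = list(range(0, len(letters)))
print(full)   # [0, 1, 2, 3, 4]
```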
There were a few issues with your code:
1. Remove the -1 from len(letters).
2. Move b = b + 1 back one indent so it runs even if you don't go into the if statement.
3. Indent a = a + 1 so it runs inside the first for loop.
See below for how to fix your code:
letters = 'acbdc'
for a in range(0, len(letters)):
    # print(letters[a])
    for b in range(0, len(letters)):
        # print(letters[b])
        if (letters[a] == letters[b]) and (a != b):
            print(b)
        b = b + 1
    a = a + 1
Nice one-liner generator:
l = 'acbdc'
next(e for e in l if l.count(e)>1)
Or following the rules in the comments to fit the "abba" case:
l = 'acbdc'
next(e for c,e in enumerate(l) if l[:c+1].count(e)>1)
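The two one-liners answer slightly different questions, and "abba" is the input where they disagree. In this sketch the function names are hypothetical: the first mirrors the count-based one-liner, the second mirrors the set-based loop from the earlier answer.

```python
def first_char_occurring_twice(text):
    """First character, in reading order, that appears more than once anywhere."""
    return next((ch for ch in text if text.count(ch) > 1), None)

def first_char_to_repeat(text):
    """Character whose second occurrence comes earliest in the string."""
    seen = set()
    for ch in text:
        if ch in seen:
            return ch
        seen.add(ch)
    return None

print(first_char_occurring_twice("abba"))  # a
print(first_char_to_repeat("abba"))        # b
print(first_char_to_repeat("acbdc"))       # c
```

The count-based version rescans the string for every character, while the set-based version makes a single pass.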
If complexity is not an issue then this will work fine.
letters = 'acbdc'
found = False
for i in range(0, len(letters)-1):
    for j in range(i+1, len(letters)):
        if (letters[i] == letters[j]):
            print (letters[j])
            found = True
            break
    if (found):
        break
The code below prints the first repeated character in a string. It uses a list to keep track of the characters seen so far.
def findChar(inputString):
    seen = []  # renamed from "list" to avoid shadowing the built-in
    for c in inputString:
        if c in seen:
            return c
        else:
            seen.append(c)
    return 'None'
print(findChar('gotgogle'))
It works fine as well and gives 'g' as the result.
def first_repeated_char(str1):
    for index, c in enumerate(str1):
        if str1[:index+1].count(c) > 1:
            return c
    return "None"
print(first_repeated_char("abcdabcd"))
str_24 = input("Enter the string:")
for i in range(0, len(str_24)):
    first_repeated_count = str_24.count(str_24[i])
    if first_repeated_count > 1:
        print("First repeated char is: {} and its count is {}".format(str_24[i], first_repeated_count))
        break
else:
    # the loop finished without finding any character that occurs twice
    print("No repeated character found")
