Split string by all non-alphabetic character occurrences in Python

I'm trying to do the following function: We need to build a list out of a string. The list should only have alphabetical characters in it.
#if the input is the following string
mystring = "ashtray ol'god for, shure! i.have "
#the output should give a list like this:
mylist = ['ashtray','ol','god','for','shure','i','have']
No modules should be imported. I wrote the following code and it works, but I would be happy if someone could suggest a better way to do it.
for ch in mystring:
    if not ch.isalpha():
        mystring = mystring.replace(ch, ' ')
mylist = mystring.split()
By alphabetical character I mean any alphabetic character that can be encoded in UTF-8, which includes Arabic, Hebrew, and other non-Latin letters.

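Try this code:
mystring = "ashtray ol'god for, shure! i.have "
lst = []
mystr = ''
for i in mystring:
    temp = ord(i)
    # ASCII letters only: A-Z (65-90) and a-z (97-122)
    if (65 <= temp <= 90) or (97 <= temp <= 122):
        mystr += i
    else:
        if mystr:
            lst.append(mystr)
            mystr = ''
# Flush the last word if the string ends with a letter
if mystr:
    lst.append(mystr)
print(lst)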
Or
mystring = "ashtray ol'god for, shure! i.have "
lst = []
mystr = ''
for i in mystring:
    if i in 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz':
        mystr += i
    else:
        if mystr:
            lst.append(mystr)
            mystr = ''
# Flush the last word if the string ends with a letter
if mystr:
    lst.append(mystr)
print(lst)
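Or (including non-English characters)
mystring = "ashtray ol'god for, shure! i.have "
lst = []
mystr = ''
for i in mystring:
    # str.isalpha() is True for any Unicode letter, not just A-Z and a-z
    if i.isalpha():
        mystr += i
    else:
        if mystr:
            lst.append(mystr)
            mystr = ''
# Flush the last word if the string ends with a letter
if mystr:
    lst.append(mystr)
print(lst)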
Output:
['ashtray', 'ol', 'god', 'for', 'shure', 'i', 'have']
Let me know if it doesn't work.
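The same idea can be wrapped in a reusable function. This is a minimal sketch; the name `split_alpha` is hypothetical and not from the original post. It relies on `str.isalpha()`, so it treats any Unicode letter as alphabetic, and it flushes a word that runs to the very end of the string.

```python
def split_alpha(text):
    """Split text into runs of alphabetic characters (no imports needed)."""
    words = []
    current = ''
    for ch in text:
        if ch.isalpha():
            current += ch
        elif current:
            # A non-letter ends the current word
            words.append(current)
            current = ''
    # Keep a trailing word when the text ends with a letter
    if current:
        words.append(current)
    return words


print(split_alpha("ashtray ol'god for, shure! i.have "))
print(split_alpha("naïve café"))
```

Building the word character by character avoids calling `replace` once per non-letter, which rescans the whole string each time.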

Related

Python iterations mischaracterizes string value

For this problem, I am given strings ThatAreLikeThis where there are no spaces between words and the 1st letter of each word is capitalized. My task is to lowercase each capital letter and add spaces between words. The following is my code. What I'm doing there is using a while loop nested inside a for-loop. I've turned the string into a list and check if the capital letter is the 1st letter or not. If so, all I do is make the letter lowercase and if it isn't the first letter, I do the same thing but insert a space before it.
def amendTheSentence(s):
    s_list = list(s)
    for i in range(len(s_list)):
        while(s_list[i].isupper()):
            if (i == 0):
                s_list[i].lower()
            else:
                s_list.insert(i-1, " ")
                s_list[i].lower()
    return ''.join(s_list)
However, for the test case, this is the behavior:
Input: s: "CodesignalIsAwesome"
Output: undefined
Expected Output: "codesignal is awesome"
Console Output: Empty

You can use re.sub for this:
re.sub(r'(?<!\b)([A-Z])', ' \\1', s)
Code:
import re
def amendTheSentence(s):
    return re.sub(r'(?<!\b)([A-Z])', ' \\1', s).lower()
On run:
>>> amendTheSentence('GoForPhone')
'go for phone'

Try this:
def amendTheSentence(s):
    start = 0
    string = ""
    for i in range(1, len(s)):
        if s[i].isupper():
            string += (s[start:i] + " ")
            start = i
    string += s[start:]
    return string.lower()
print(amendTheSentence("CodesignalIsAwesome"))
print(amendTheSentence("ThatAreLikeThis"))
Output:
codesignal is awesome
that are like this

def amendTheSentence(s):
    new_sentence = ''
    for char in s:
        if char.isupper():
            # Add a space before every capital except at the very start
            if new_sentence:
                new_sentence += ' '
            new_sentence += char.lower()
        else:
            new_sentence += char
    return new_sentence
new_sentence = amendTheSentence("CodesignalIsAwesome")
print(new_sentence)
The result is: codesignal is awesome
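For context, the code in the question most likely never terminates because `str.lower()` returns a new string and leaves the list element unchanged, so the `while` condition stays true. A short sketch of the difference (the variable name `letters` is illustrative only):

```python
letters = list("Ab")

# Calling lower() and discarding the result changes nothing in the list.
letters[0].lower()
print(letters)  # ['A', 'b']

# Assigning the result back is what actually updates the element.
letters[0] = letters[0].lower()
print(letters)  # ['a', 'b']
```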

How do I convert all the vowels within a string to white-spaces in python

I am trying to take an input string and replace any vowels in it with a whitespace. How do you do this?
def w_space (s):
    vowels = "aeiouAEIOU"
    string = s
    for a in string:
        for b in vowels:
            if string[a] == vowels[b]:
                vowels = ""
    return string

First of all, instead of setting string = s, simply loop through s directly. Also, instead of looping through both s and vowels, loop through s only and check whether each letter is in vowels: if it is, add a space to string; if not, add the letter. Here is the code:
def w_space (s):
    vowels = "aeiouAEIOU"
    string = ""
    for a in s:
        if a in vowels:
            string += " "
        else:
            string += a
    return string

def f(s):
    l = list(s)
    for i in range(len(l)):
        if l[i] in "aeiouAEIOU":
            l[i] = " "
    return ''.join(l)

This can be accomplished with a list comprehension
def w_space(s):
    vowels = "aeiouAEIOU"
    return "".join([" " if x in vowels else x for x in s])

Regex is an option as well
def w_space (s):
    from re import sub, IGNORECASE
    vowels = r'[aeiou]'
    return sub(vowels, ' ', s, flags=IGNORECASE)
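Another standard-library option, not mentioned in the answers above, is `str.translate`. The sketch below assumes the same vowel set as the other answers; the names `VOWEL_TABLE` and `w_space_translate` are made up for this example.

```python
# Map every vowel (both cases) to a single space.
VOWEL_TABLE = str.maketrans(dict.fromkeys("aeiouAEIOU", " "))


def w_space_translate(s):
    # translate() applies the whole mapping in one pass over the string
    return s.translate(VOWEL_TABLE)


print(w_space_translate("Hello World"))
```

The table is built once at module level, so repeated calls only pay for the translation itself.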

Remove white spaces in string without split function

I need to remove all excess white space in a string, including any at the beginning and end. I cannot use the split function, only if and while statements. I have this so far, but every time I run it, it returns the input exactly as it was.
def cleanstring(S):
    i=0
    startpos=0
    endpos=0
    end=-1
    word=0
    #find position of first letter
    while S[i]==(" "):
        i=i+0
        startpos=i
    #find last letter
    while (S[end]==(" ")):
        end=end-1
        endpos=S[len(S)-end]
    #make first letter found the first letter in the string
    if S[i]!=(" "):
        word=S[i]
    #start between startpos and endpos to find word
    while (i<endpos) and (i>startpos):
        while S[i]!=(" "):
            word=word+S[i]
        if S[i]==(" "):
            if (S[i+1]==("")) or (S[i-1]==(" ")):
                word=word+(" ")
            else:
                word=word+(" ")
    #return the word
    print(word)
Input=[" Hello to the world "]

Concatenate characters into a temporary string as you go. When you hit a whitespace character, check whether the temporary string is non-empty; if it is, yield it and reset it.
s = " Hello to the world "
def split(s):
    temp_s = ""
    for ch in s:
        if ch.isspace():
            if temp_s:
                yield temp_s
                temp_s = ""
        else:
            temp_s += ch
    if temp_s:
        yield temp_s
Output:
In [5]: s = " Hello to the world "
In [6]: list(split(s))
Out[6]: ['Hello', 'to', 'the', 'world']
In [7]: s = " Hello\tto\r\nthe world "
In [8]: list(split(s))
Out[8]: ['Hello', 'to', 'the', 'world']
In [10]: list(split(s))
Out[10]: ['Hello', 'world']
In [11]: s = "Hello"
In [12]: list(split(s))
Out[12]: ['Hello']
If needed, you can change the for loop to a while loop.
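Since the question allows only `if` and `while`, the same idea can be sketched with an index-based loop that builds the cleaned string directly. This is one possible shape under that assumption, not the answer author's code; `clean_spaces` is a hypothetical name, and only the space character is treated as whitespace.

```python
def clean_spaces(text):
    result = ''
    i = 0
    pending_space = False  # True when a space was seen after at least one word
    while i < len(text):
        ch = text[i]
        if ch == ' ':
            # Remember the gap only if some word has already been emitted,
            # which drops leading spaces.
            if result != '':
                pending_space = True
        else:
            # Emit exactly one space before the next word; a pending space
            # at the end of the text is never written, dropping trailing spaces.
            if pending_space:
                result += ' '
                pending_space = False
            result += ch
        i += 1
    return result


print(repr(clean_spaces("  Hello   to the  world ")))
```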

If you call your cleanstring function with a string that starts with a space, this will cause an infinite loop:
while S[i]==(" "):
    i=i+0
    startpos=i
Since you are adding zero to i, it never changes. You should increment it by 1, which can be done like this:
i += 1
which is shorthand for
i = i + 1
However, Input is not even a string, but a list with a string in it. You should change the input expression to this:
Input = " Hello to the world "
The square brackets you have make it a list containing a string.

Using for:
def cleanstring(str_in):
    str_out = ''
    last_char = None
    for cur_char in str_in:
        str_out += '' if last_char == ' ' and cur_char ==' ' else cur_char
        last_char = cur_char
    return str_out
Using while:
def cleanstring(str_in):
    str_out = ''
    last_char = None
    index = 0
    while str_in[index:index+1]:
        cur_char = str_in[index:index+1]
        str_out += '' if last_char == ' ' and cur_char ==' ' else cur_char
        last_char = cur_char
        index+=1
    return str_out
If the last character and the current one are both spaces, the current space is not appended.
This assumes that spaces are the only whitespace of concern. Otherwise, here is a solution for a set of whitespace characters:
def cleanstring(str_in):
    str_out = ''
    last_char = None
    index = 0
    whitespace = [' ','\t','\n','\r','\f','\v']
    while str_in[index:index+1]:
        a = str_in[index:index+1]
        str_out += '' if last_char in whitespace and a in whitespace else a
        last_char = a
        index+=1
    return str_out
That collapses every run of whitespace to its first character. If instead we only want to collapse runs of the same repeated whitespace character, keeping the first instance:
def cleanstring(str_in):
    str_out = ''
    last_char = None
    index = 0
    whitespace = [' ','\t','\n','\r','\f','\v']
    while str_in[index:index+1]:
        a = str_in[index:index+1]
        str_out += '' if last_char == a and a in whitespace else a
        last_char = a
        index+=1
    return str_out
If you are concerned about the use of in, it can be replaced as follows (using the last version of cleanstring as an example):
def cleanstring(str_in):
    def is_whitespace_in(char):
        whitespace = [' ','\t','\n','\r','\f','\v']
        local_index = 0
        while whitespace[local_index:local_index+1]:
            a = whitespace[local_index:local_index+1][0]
            if a[0] == char:
                return True
            local_index+=1
        return False
    str_out = ''
    last_char = None
    index = 0
    while str_in[index:index+1]:
        a = str_in[index:index+1]
        str_out += '' if last_char == a and is_whitespace_in(a) else a
        last_char = a
        index+=1
    return str_out
The whitespace set in the last examples follows from the definition of \s in CPython's re module:
\s Matches any whitespace character; equivalent to [ \t\n\r\f\v] in
bytes patterns or string patterns with the ASCII flag.
Lines 73-74
I know this may not be the most Pythonic or PEP 8 compliant code; please feel free to edit it.

Just use the string.strip() method.

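Is this homework or something?
If you can't use 'for', only 'if' and 'while', then I'd use a counter and check each character in your string.
def clean(text):
    if not text:
        return text
    # Start from the second character; the first one is already in out
    idx = 1
    out = text[0]
    while idx < len(text):
        # Skip a space only when the previously kept character is also a space
        if text[idx] != out[-1] or text[idx] != ' ':
            out += text[idx]
        idx += 1
    return out
Of course, it's not the full solution (a leading or trailing space is still kept), but you get the idea.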

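Note that this removes every whitespace character in the table, including the single spaces between words, not just the excess ones.
TABLE = str.maketrans('','',' \n\r\t\f')
def clrstr(inp):
    return inp.translate(TABLE)
However, it does not help much if you are learning while and for loops.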

Splitting strings in Python without split()

What are other ways to split a string without using the split() method? For example, how could ['This is a Sentence'] be split into ['This', 'is', 'a', 'Sentence'] without the use of the split() method?

sentence = 'This is a sentence'
split_value = []
tmp = ''
for c in sentence:
    if c == ' ':
        split_value.append(tmp)
        tmp = ''
    else:
        tmp += c
if tmp:
    split_value.append(tmp)

You can use regular expressions if you want:
>>> import re
>>> s = 'This is a Sentence'
>>> re.findall(r'\S+', s)
['This', 'is', 'a', 'Sentence']
The \S represents any character that isn't whitespace, and the + says to find one or more of those characters in a row. re.findall will create a list of all strings that match that pattern.
But, really, s.split() is the best way to do it.
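One detail worth keeping in mind when comparing hand-written splitters with the built-in: `s.split()` with no argument collapses runs of whitespace and ignores leading and trailing whitespace, while `s.split(' ')` splits on every single space and can return empty strings. A quick illustration (the sample string is made up for this example):

```python
s = '  This  is a Sentence '

no_arg = s.split()         # splits on runs of any whitespace
with_space = s.split(' ')  # splits on each individual space

print(no_arg)      # ['This', 'is', 'a', 'Sentence']
print(with_space)  # ['', '', 'This', '', 'is', 'a', 'Sentence', '']
```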

A recursive version, breaking out the steps in detail:
def my_split(s, sep=' '):
    s = s.lstrip(sep)
    if sep in s:
        pos = s.index(sep)
        found = s[:pos]
        # Pass sep along so a custom separator is used at every level
        remainder = my_split(s[pos+1:], sep)
        remainder.insert(0, found)
        return remainder
    else:
        return [s]
print(my_split("This is a sentence"))
Or, the short, one-line form:
def my_split(s, sep=' '):
    return [s[:s.index(sep)]] + my_split(s[s.index(sep)+1:], sep) if sep in s else [s]

Starting with a list of strings, there are a couple of ways to split them, depending on the output you want.
Case 1: One list of strings (old_list) split into one new list of strings (new_list).
For example ['This is a Sentence', 'Also a sentence'] -> ['This', 'is', 'a', 'Sentence', 'Also', 'a', 'sentence'].
Steps:
Loop through the strings: for sentence in old_list:
Create a new string to keep track of the current word (word).
Loop through the characters in each of these strings: for ch in sentence:
If you come across the character(s) you want to split on (spaces in this example), check that word is not empty and, if so, add it to the new list and reset word. Otherwise, add the character to word.
Make sure to add word to the list after looping through all the characters.
The final code:
new_list = []
for sentence in old_list:
    word = ''
    for ch in sentence:
        if ch == ' ':
            # Only store non-empty words, so repeated spaces are skipped
            if word != '':
                new_list.append(word)
                word = ''
        else:
            word += ch
    if word != '':
        new_list.append(word)
When words are separated by single spaces, this is equivalent to
new_list = []
for sentence in old_list:
    new_list.extend(sentence.split(' '))
or even simpler
new_list = ' '.join(old_list).split(' ')
Case 2: One list of strings (old_list) split into a new list of lists of strings (new_list).
For example ['This is a Sentence', 'Also a sentence'] -> [['This', 'is', 'a', 'Sentence'], ['Also', 'a', 'sentence']].
Steps:
Loop through the strings: for sentence in old_list:
Create a new string to keep track of the current word (word) and a new list to keep track of the words in this string (sentence_list).
Loop through the characters in each of these strings: for ch in sentence:
If you come across the character(s) you want to split on (spaces in this example), check that word is not empty and, if so, add it to sentence_list and reset word. Otherwise, add the character to word.
Make sure to add word to sentence_list after looping through all the characters.
Append (not extend) sentence_list to the new list and move on to the next string.
The final code:
new_list = []
for sentence in old_list:
    sentence_list = []
    word = ''
    for ch in sentence:
        if ch == ' ':
            # Only store non-empty words, so repeated spaces are skipped
            if word != '':
                sentence_list.append(word)
                word = ''
        else:
            word += ch
    if word != '':
        sentence_list.append(word)
    new_list.append(sentence_list)
When words are separated by single spaces, this is equivalent to
new_list = []
for sentence in old_list:
    new_list.append(sentence.split(' '))
or using list comprehensions
new_list = [sentence.split(' ') for sentence in old_list]

This simple code splits a string into a list of its individual characters, e.g.:
INPUT : UDDDUDUDU
s = [str(i) for i in input().strip()]
print(s)
OUTPUT: ['U','D','D','D','U','D','U','D','U']

sentence = 'This is a sentence'
word=""
for w in sentence :
    if w.isalpha():
        word=word+w
    elif not w.isalpha():
        print(word)
        word=""
print(word)

string1 = 'bella ciao amigos'
split_list = []
tmp = ''
for s in string1:
    if s == ' ':
        split_list.append(tmp)
        tmp = ''
    else:
        tmp += s
if tmp:
    split_list.append(tmp)
print(split_list)
Output:
------> ['bella', 'ciao', 'amigos']
reverse_list = split_list[::-1]
print(reverse_list)
Output:
------> ['amigos', 'ciao', 'bella']

def mysplit(strng):
    strng = strng.lstrip()
    strng = strng.rstrip()
    lst=[]
    temp=''
    for i in strng:
        if i == ' ':
            lst.append(temp)
            temp = ''
        else:
            temp += i
    if temp:
        lst.append(temp)
    return lst
print(mysplit("Hello World"))
print(mysplit(" "))
print(mysplit(" abc "))
print(mysplit(""))

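This mimics the split method for a single-character separator, dropping empty strings from the result:
def splitter(x, y = ' '):
    l = []
    for _ in range(x.count(y) + 1):
        a = ''
        # Collect characters up to the next separator
        for ch in x:
            if ch == y: break
            a += ch
        # Drop the collected word and the separator that followed it
        x = x[len(a) + 1 : len(x)]
        l.append(a)
    return ([i for i in l if i != ''])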

my_str='This is a sentence'
split_value = []
tmp = ''
for i in my_str+' ':
    if i == ' ':
        split_value.append(tmp)
        tmp = ''
    else:
        tmp += i
print(split_value)
Just a small modification to the code already given

Python - Pyg Latin?

I am trying to expand on the Codecademy pig latin convertor so that it accepts sentences rather than just single words and converts each word in the sentence. Here's the code that I have:
pyg = 'ay'
pyg_input = raw_input("Please enter a sentence: ")
print
if len(pyg_input) > 0 and pyg_input.isalpha():
    lwr_input = pyg_input.lower()
    lst = lwr_input.split()
    for item in lst:
        frst = lst[item][0]
        if frst == 'a' or frst == 'e' or frst == 'i' or frst == 'o' or frst == 'u':
            lst[item] = lst[item] + pyg
        else:
            lst[item] = lst[item][1:len(lst[item]) + frst + pyg
    print ' '.join(lst)
I'm not sure what is wrong, so I would be grateful for any help.
Thanks

A sentence can contain non-alphabetic characters (for example, spaces), so pyg_input.isalpha() will return False.
You're using lst[item] to access each word, but item is already the word itself (for item in lst iterates over elements, not indexes). Use item instead.
Assigning to item inside the loop does not update the list, so in the following code I collect the results in another list called latin.
Your code has a SyntaxError in the following line (no closing bracket):
lst[item][1:len(lst[item])
The following code is not perfect. For example, you still need to filter out non-alphabetic characters such as , and .
pyg = 'ay'
pyg_input = raw_input("Please enter a sentence: ")
print
if len(pyg_input) > 0:# and pyg_input.isalpha():
    lwr_input = pyg_input.lower()
    lst = lwr_input.split()
    latin = []
    for item in lst:
        frst = item[0]
        if frst in 'aeiou':
            item = item + pyg
        else:
            item = item[1:] + frst + pyg
        latin.append(item)
    print ' '.join(latin)
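For readers on Python 3, where `raw_input` and the `print` statement no longer exist, the same logic might look like the sketch below. The function name `to_pig_latin` and its `suffix` parameter are hypothetical, and like the code above it does not strip punctuation.

```python
def to_pig_latin(sentence, suffix='ay'):
    latin = []
    # split() with no argument also copes with repeated spaces
    for word in sentence.lower().split():
        first = word[0]
        if first in 'aeiou':
            # Words starting with a vowel just get the suffix
            latin.append(word + suffix)
        else:
            # Otherwise move the first letter to the end, then add the suffix
            latin.append(word[1:] + first + suffix)
    return ' '.join(latin)


print(to_pig_latin("Hello apple world"))
```

Returning the string instead of printing it makes the function easier to test and reuse; reading the sentence with `input()` can stay at the call site.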

I have tried the below way to implement the pyg_latin translator
import enchant
input_str = raw_input("Enter a word:")
d = enchant.Dict("en_US")
d.check(input_str)
pyg_latin = input_str[1:]+input_str[0]+"ay"
print pyg_latin
