I have a string eg:
line = "a sentence with a few words"
I want to convert the above into a string with each of the words in double quotes, eg:
'"a" "sentence" "with" "a" "few" "words"'
Any suggestions?
Split the line into words, wrap each word in quotes, then re-join:
' '.join('"{}"'.format(word) for word in line.split(' '))
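One caveat worth a short sketch: line.split(' ') yields empty strings when the input contains consecutive spaces, and those would come out as empty "" pairs. Calling split() with no argument splits on any run of whitespace instead. The function name quote_words below is made up for illustration.

```python
def quote_words(line):
    # split() with no argument splits on runs of whitespace and drops empty strings
    return ' '.join('"{}"'.format(word) for word in line.split())

print(quote_words("a sentence with a few words"))
print(quote_words("two  spaces   here"))
```

Whether collapsing repeated spaces is acceptable depends on the input; if the exact spacing must be preserved, the regex approach may be a better fit.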
Since you say:
I want to convert the above into a string with each of the words in double quotes
You can use the following regex:
>>> line="a sentence with a few words"
>>> import re
>>> re.sub(r'(\w+)',r'"\1"',line)
'"a" "sentence" "with" "a" "few" "words"'
This also takes punctuation into account (if that is really what you wanted):
>>> line="a sentence with a few words. And, lots of punctuations!"
>>> re.sub(r'(\w+)',r'"\1"',line)
'"a" "sentence" "with" "a" "few" "words". "And", "lots" "of" "punctuations"!'
Or you can do something simpler (more code, but easier for beginners): search for each space in the quote, slice out whatever is between the spaces, add " before and after it, then print it.
quote = "they stumble who run fast"
first_space = 0
last_space = quote.find(" ")
while last_space != -1:
    print("\"" + quote[first_space:last_space] + "\"")
    first_space = last_space + 1
    last_space = quote.find(" ", last_space + 1)
Above code will output for you the following:
"they"
"stumble"
"who"
"run"
The answer above misses the last word of the original quote: "fast" is never printed.
This solution also prints the last word:
quote = "they stumble who run fast"
start = 0
location = quote.find(" ")
while location >= 0:
    index_word = quote[start:location]
    print(index_word)
    start = location + 1
    location = quote.find(" ", location + 1)
# this runs outside the while loop and prints the final word
index_word = quote[start:]
print(index_word)
This is the result:
they
stumble
who
run
fast
I need a way to find an exact word in a string.
Everything I have read online only shows how to search for letters in a string, so
98787This is correct
will still come back as true in an if statement.
This is what I have so far.
if 'This is correct' in text:
    print("correct")
This will work with any combination of letters before the This is correct... For example fkrjThis is correct, 4123This is correct and lolThis is correct will all come back as true in the if statement. When I want it to come back as true only if it exactly matches This is correct.
You can use the word-boundaries of regular expressions. Example:
import re
s = '98787This is correct'
for words in ['This is correct', 'This', 'is', 'correct']:
    if re.search(r'\b' + words + r'\b', s):
        print('{0} found'.format(words))
That yields:
is found
correct found
For an exact match, replace the \b assertions with ^ and $ to restrict the match to the beginning and end of the line.
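As a minimal sketch of the anchored variant: the helper name is_exact is hypothetical, and re.escape is added on the assumption that the phrase should be treated as literal text rather than as a pattern.

```python
import re

def is_exact(phrase, text):
    # re.escape keeps any regex metacharacters in the phrase literal;
    # ^ and $ anchor the match to the start and end of the string
    return re.search('^' + re.escape(phrase) + '$', text) is not None

print(is_exact('This is correct', 'This is correct'))
print(is_exact('This is correct', '98787This is correct'))
```

The first call should report a match and the second should not, since the digits in front of the phrase break the ^ anchor.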
Use the comparison operator == instead of in then:
if text == 'This is correct':
    print("Correct")
This checks whether the whole string is exactly 'This is correct'. If it isn't, the comparison is False.
Actually, you should look for 'This is correct' string surrounded by word boundaries.
So
import re
if re.search(r'\bThis is correct\b', text):
    print('correct')
should work for you.
I suspect that you are looking for the startswith() method. This checks whether the characters of one string match the start of another string.
"abcde".startswith("abc") -> True
"abcde".startswith("bcd") -> False
There is also the endswith() method, for checking at the other end.
You can make a few changes.
elif 'This is correct' in text[:len('This is correct')]:
or
elif ' This is correct ' in ' '+text+' ':
Both work. The latter is more flexible.
It could be a complicated problem if we want to solve it without using regular expression. But I came up with a little trick.
First we need to pad the original string with whitespaces.
After that we can search the text, which is also padded with whitespaces.
Example code here:
incorrect_str = "98787This is correct"
correct_str = "This is a great day and This is correct"
# Padding with whitespaces
new_incorrect_str = " " + incorrect_str + " "
new_correct_str = " " + correct_str + " "
if " This is correct " in new_correct_str:
    print("correct")
else:
    print("incorrect")
Break up the string into a list of strings with .split() then use the in operator.
This is much simpler than using regular expressions.
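A minimal sketch of that idea, assuming the search term is a single word (a multi-word phrase such as 'This is correct' cannot be found this way, because split() breaks it apart). The function name contains_word is made up for illustration.

```python
def contains_word(text, word):
    # split() breaks the text on whitespace; `in` then compares whole words only
    return word in text.split()

print(contains_word("98787This is correct", "This"))
print(contains_word("I think This is correct", "This"))
```

Punctuation stays attached to the words, so "correct." would not match "correct" unless the punctuation is stripped first.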
Below is a solution without using regular expressions. The program searches for an exact word, in this case 'CASINO', and prints the matching sentences.
words_list = ["The Learn Python Challenge Casino.", "They bought a car while at the casino", "Casinoville"]
search_string = 'CASINO'
def text_manipulation(words_list, search_string):
    search_result = []
    for sentence in words_list:
        # Strip basic punctuation, then split the sentence into words
        words = sentence.replace('.', '').replace(',', '').split(' ')
        # Keep the sentence for each word that matches, ignoring case
        for w in words:
            if w.upper() == search_string:
                search_result.append(sentence)
    print(search_result)
text_manipulation(words_list, search_string)
This will print the results - ['The Learn Python Challenge Casino.', 'They bought a car while at the casino']
I am trying to build a function that will take a string and print every other letter of the string, but it has to be without the spaces.
For example:
def PrintString(string1):
    for i in range(0, len(string1)):
        if i%2==0:
            print(string1[i], sep="")
PrintString('My Name is Sumit')
It shows the output:
M
a
e
i
u
i
But I don't want the spaces. Any help would be appreciated.
Use a step size of 2, string1[::2], to iterate over every 2nd character of the string, and skip the character if it is " ":
def PrintString(string1):
    print("".join([i for i in string1[::2] if i != " "]))
PrintString('My Name is Sumit')
Remove all the spaces before you do the loop.
And there's no need to test i%2 in the loop. Use a slice that returns every other character.
def PrintString(string1):
    string1 = string1.replace(' ', '')
    print(string1[::2])
Replace all the spaces and get every other letter:
def PrintString(string1):
    # print() returns None, so there is nothing useful to return here
    print(string1.replace(" ", "")[::2])
PrintString('My Name is Sumit')
It depends if you want to first remove the spaces and then pick every second letter or take every second letter and print it, unless it is a space:
s = "My name is Summit"
print(s.replace(" ", "")[::2])
print(''.join([ch for ch in s[::2] if ch != " "]))
Prints:
MnmiSmi
Maeiumt
You could always create a quick function for it that simply replaces the spaces with an empty string.
Example
def remove(string):
    return string.replace(" ", "")
There are a lot of different approaches to this problem. This thread explains it pretty well in my opinion: https://www.geeksforgeeks.org/python-remove-spaces-from-a-string/
I have been working on a program which will take a hex file, and if the file name starts with "CID", it should remove the first 104 characters; after that point there are a few words. I also want to remove everything after the words, but the problem is that the part I want to isolate varies in length.
My code is currently like this:
import os
y = 0
files = os.listdir(".")
filenames = []
for names in files:
    if names.endswith(".uexp"):
        filenames.append(names)
        y +=1
print(y)
print(filenames)
for x in range(1,y):
    filenamestart = (filenames[x][0:3])
    print(filenamestart)
    if filenamestart == "CID":
        openFile = open(filenames[x],'r')
        fileContents = (openFile.read())
        ItemName = (fileContents[104:])
        print(ItemName)
Input Example file (pulled from HxD):
.........................ýÿÿÿ................E.................!...1AC9816A4D34966936605BB7EFBC0841.....Sun Tan Specialist.................9.................!...9658361F4EFF6B98FF153898E58C9D52.....Outfit.................D.................!...F37BE72345271144C16FECAFE6A46F2A.....Don't get burned............................................................................................................................Áƒ*ž
I have got it working to remove the first 104 characters, but I would also like to remove the characters after 'Sun Tan Specialist', which will differ in length, so I am left with only that part.
I appreciate any help that anyone can give me.
One way to remove non-alphabetic characters in a string is to use regular expressions [1].
>>> import re
>>> re.sub(r'[^a-z]', '', "lol123\t")
'lol'
EDIT
The first argument r'[^a-z]' is the pattern that captures what will be removed (here, by replacing it with an empty string ''). The square brackets denote a character class (the pattern will match any character in this class), the ^ is a "not" operator and the a-z denotes all the lowercase alphabetic characters. More information here:
https://docs.python.org/3/library/re.html#regular-expression-syntax
So for instance, to keep also capital letters and spaces it would be:
>>> re.sub(r'[^a-zA-Z ]', '', 'Lol !this *is* a3 -test\t12378')
'Lol this is a test'
However from the data you give in your question the exact process you need seems to be a bit more complicated than just "getting rid of non-alphabetical characters".
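One possible direction, sketched under loud assumptions: that the wanted name is the first run of letters, spaces and apostrophes after the 104-character header, and that the contents have already been read into a string. The function name first_phrase and the sample data are invented for illustration; a real .uexp file is binary and may need different handling.

```python
import re

def first_phrase(contents, skip=104):
    # Look only at the text after the fixed-length header (assumed to be 104 characters)
    tail = contents[skip:]
    # A phrase here means: a letter, then letters, spaces or apostrophes, ending in a letter
    match = re.search(r"[A-Za-z][A-Za-z' ]*[A-Za-z]", tail)
    return match.group(0) if match else None

# Made-up stand-in for the file contents: 104 filler characters, then the name
sample = "." * 104 + ".....Sun Tan Specialist.................9...."
print(first_phrase(sample))
```

Using re.search rather than re.sub stops at the first match, so everything after the name is ignored without needing to know its length.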
You can use filter:
import string
print(''.join(filter(lambda character: character in string.ascii_letters + string.digits, '(ABC), DEF!'))) # => ABCDEF
You mentioned in a comment that you got the string down to Sun Tan SpecialistFEFFBFFECDOutfitDFBECFECAFEAFADont get burned
Essentially your goal at this point is to remove any uppercase letter that isn't immediately followed by a lowercase letter because Upper Lower indicates the start of a phrase. You can use a for loop to do this.
h = "Sun Tan SpecialistFEFFBFFECDOutfitDFBECFECAFEAFADont get burned"
output = ""
for i in range(len(h)):
    # Keep spaces
    if h[i] == " ":
        output += h[i]
    # Start of a phrase found, so separate with a space (unless one is already there) and store the character
    elif h[i].isupper() and i + 1 < len(h) and h[i+1].islower():
        if output and not output.endswith(" "):
            output += " "
        output += h[i]
    # We want all lowercase characters
    elif h[i].islower():
        output += h[i]
print(output)
# If you don't care about Outfit since it is always there, remove it
print(output.replace("Outfit ", ""))
Output:
Sun Tan Specialist Outfit Dont get burned
Sun Tan Specialist Dont get burned
I need to make a program that will capitalize the first word in a sentence, and I want to be sure that all the special characters that are used to end a sentence can be used.
I cannot import anything! This is for a class and I just want some examples of how to do this.
I have tried to use if to look in the list to see if it finds the matching character and do the correct split operation...
This is the function I have now... I know it's not good at all, as it just returns the original string...
def getSplit(userString):
    userStringList = []
    if "? " in userString:
        userStringList=userString.split("? ")
    elif "! " in userStringList:
        userStringList = userString.split("! ")
    elif ". " in userStringList:
        userStringList = userString.split(". ")
    else:
        userStringList = userString
    return userStringList
I want to be able to input something like this is a test. this is a test? this is definitely a test!
and get ['this is a test.', 'this is a test?', 'this is definitely a test!']
and then send the list of sentences to another function that capitalizes the first letter of each sentence.
This is an old homework assignment where I could only make it use one special character to separate the string into a list, but I want the user to be able to put in more than just one kind of sentence...
This may help. Use str.replace to replace the special characters with spaces and then use str.split.
Ex:
def getSplit(userString):
    return userString.replace("!", " ").replace("?", " ").replace(".", " ").split()
print([x.capitalize() for x in getSplit("sdfsdf! sdfsdfdf? sdfsfdsf.sdfsdfsd!fdfgdfg?dsfdsfgf")])
Normally, you could use re.split(), but since you cannot import anything, the best option would be just to do a for loop. Here it is:
def getSplit(user_input):
    n = len(user_input)
    sentences = []
    previdx = 0
    for i in range(n - 1):
        if user_input[i:i+2] in ['. ', '! ', '? ']:
            sentences.append(user_input[previdx:i+2].capitalize())
            previdx = i + 2
    sentences.append(user_input[previdx:n].capitalize())
    return "".join(sentences)
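If a list of sentences is what is wanted (as in the question), the same no-import loop idea can be adapted. This is only a sketch: the name split_sentences is hypothetical, and it assumes a sentence ends at ., ! or ? followed by a space or by the end of the text.

```python
def split_sentences(text):
    sentences = []
    start = 0
    for i, ch in enumerate(text):
        # A sentence ends at ., ! or ? when followed by a space or the end of the text
        if ch in ".!?" and (i + 1 == len(text) or text[i + 1] == " "):
            sentences.append(text[start:i + 1].strip())
            start = i + 1
    # Keep any trailing text that has no closing punctuation
    rest = text[start:].strip()
    if rest:
        sentences.append(rest)
    return sentences

print(split_sentences("this is a test. this is a test? this is definitely a test!"))
```

Each sentence keeps its own end mark, so a later function can capitalize the entries one by one.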
I would split the string at each white space. Then scan the list for words that contain the special character. If any is present, the next word is capitalised. Join the list back at the end. Note that splitting and re-joining collapses any run of whitespace into a single space.
def capitalise(text):
    words = text.split()
    new_words = []
    # the first word always starts a sentence
    capitalise_next = True
    for word in words:
        new_words.append(word.capitalize() if capitalise_next else word)
        # a word containing ., ! or ? ends a sentence, so capitalise the one after it
        capitalise_next = "." in word or "!" in word or "?" in word
    return " ".join(new_words)
If you can use the re module which is available by default in python, this is how you could do it:
import re
a = 'test this. and that, and maybe something else?even without space. or with multiple.\nor line breaks.'
print(re.sub(r'[.!?]\s*\w', lambda x: x.group(0).upper(), a))
Would lead to:
test this. And that, and maybe something else?Even without space. Or with multiple.\nOr line breaks.
Here is a long string that I convert to a list so I can manipulate it, and then join it back together. I am having trouble getting an iterator to go through the list and, when it reaches, let us say, every 5th object, insert a '\n' right there. Here is an example:
string = "Hello my name is Josh I like pizza and python I need this string to be really really long"
string = string.split()
# do the magic here
string = ' '.join(string)
print(string)
Output:
Hello my name is Josh
I like pizza and python
I need this string to
be really really long
Any idea how I can achieve this?
I tried using:
for words in string:
    if words % 5 == 0:
        string.append('\n')
but it doesn't work. What am I missing?
What you're doing wrong is attempting to change string in your example which doesn't affect the string contained in your list... instead you need to index into the list and directly change the element.
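text = "Hello my name is Josh I like pizza and python I need this string to be really really long"
words = text.split()
for wordno in range(len(words)):
    # wordno starts at 0, so the 5th, 10th, ... words have (wordno + 1) divisible by 5
    if (wordno + 1) % 5 == 0:
        words[wordno] += '\n'
# joining with spaces leaves a space after each newline, so strip it
print(' '.join(words).replace('\n ', '\n'))
You don't want to call something string, as that is the name of a standard library module that is sometimes used and may confuse things. I've also tested wordno + 1 rather than wordno, because 0 % 5 == 0 and you would otherwise end up with a single-word line in your rejoin...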
The problem with the for loop you attempted to use is that it didn't keep the index of the word, and thus could not determine which word was the 5th. By using enumerate(iterable) you can get the index of the word and the word at the same time. You could also just use range(len(iterable)) to get the index and do it the same way.
string = "Hello my name is Josh I like pizza and python I need this string to be really really long"
string = string.split()
for word_num, word in enumerate(string):
    # word_num starts at 0, so test word_num + 1 to break after every 5th word
    if (word_num + 1) % 5 == 0:
        # access the list, since changing only the variable word won't work
        string[word_num] += '\n'
# joining with spaces leaves a space after each newline, so strip it
string = ' '.join(string).replace('\n ', '\n')
print(string)
Edit:
As @JonClements pointed out, testing word_num % 5 == 0 causes "Hello" to be printed on its own line, because 0 % 5 == 0. Therefore the check uses word_num + 1, which is divisible by 5 exactly at every fifth word.
In this case you should probably be creating a new string instead of trying to modify the existing one. You can just use a counter to determine which word you're on. Here's a very simple solution, though Jon Clements has a more sophisticated (and probably more efficient) one:
newstring = ""
text = "Hello my name is Josh I like pizza and python I need this string to be really really long"
strlist = text.split()
for i, words in enumerate(strlist):
    newstring += words
    if (i + 1) % 5 == 0:
        newstring += "\n"
    else:
        newstring += " "
print(newstring)
enumerate returns both the index of the word in your list of words, as well as the word itself. It's a handy automated counter to determine which word you're on.
Also, don't actually name your string string or str. string is the name of a standard library module and str is the built-in string type, and you don't want to shadow either.
I am sure this could be much shorter, but this does what you want. The key improvements are a new string to hold everything and the use of enumerate to catch the ith word in the list.
string = "Hello my name is Josh I like pizza and python I need this string to be really really long"
string2 = ""
for (i, word) in enumerate(string.split()):
    string2 = string2 + " " + word
    if (i + 1) % 5 == 0:
        string2 = string2 + '\n'
print(string2)
You can use enumerate() on your list of words after you split the string,
and iterate like this:
new_string_list = []
for index, word in enumerate(string):
    if (index + 1) % 5 == 0:
        word += '\n'
    new_string_list.append(word)
string = ' '.join(new_string_list)
Use enumerate to get the index, and the % operation on i + 1 to append '\n' after every n words:
new_string_list = []
strlist = oldstr.split()
# do magic here
for i, word in enumerate(strlist):
    # it is better to append to a list and then join;
    # using + to concatenate strings is inefficient
    new_string_list.append(word)
    # end every 5th word with a newline and the others with a space
    new_string_list.append('\n' if (i + 1) % 5 == 0 else ' ')
print("".join(new_string_list))
Using list slicing...
s='hello my name is josh i like pizza and python i need this string to be really really long'
l=s.split()
l[4::5]=[v+'\n' for v in l[4::5] ]
print('\n'.join(' '.join(l).split('\n ')))
hello my name is josh
i like pizza and python
i need this string to
be really really long
Horrible one-liner I cooked up as an example of what you should never do.
s='hello my name is josh i like pizza and python i need this string to be really really long'
print('\n'.join([' '.join(l) for l in [s.split()[i:i + 5] for i in range(0, len(s.split()), 5)]]))
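The chunking idea behind that one-liner reads more clearly when split into steps. A minimal sketch, where the function name wrap_words and the per_line parameter are made up for illustration:

```python
def wrap_words(text, per_line=5):
    words = text.split()
    # Slice the word list into consecutive groups of per_line words
    groups = [words[i:i + per_line] for i in range(0, len(words), per_line)]
    # Join the words of each group with spaces, and the groups with newlines
    return '\n'.join(' '.join(group) for group in groups)

text = "Hello my name is Josh I like pizza and python I need this string to be really really long"
print(wrap_words(text))
```

Because the newline is used as the separator between groups rather than appended to a word, no stray space ends up at the start of a line.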