Is it possible to check each element of a list and, if it matches a word in "test01.txt", replace it with a space?
test01.txt:
to
her
too
a
for
My code:
with open('C:/test01.txt') as words:
    ws = words.read().splitlines()
with open('C:/test02.txt') as file_modify4:
    for x in file_modify4:
        sx = map(str.strip, x.split("\t"))
        ssx = sx[0].split(" ")
        print ssx
Results from "print ssx":
['wow']
['listens', 'to', 'her', 'music']
['too', 'good']
['a', 'film', 'for', 'stunt', 'scheduling', 'i', 'think']
['really', 'enjoyed']
How can I replace the matching elements in ssx?
Expected result:
['wow']
['listens', ' ', ' ', 'music']
[' ', 'good']
[' ', 'film', ' ', 'stunt', 'scheduling', 'i', 'think']
['really', 'enjoyed']
Any suggestions?
Use a list comprehension, storing the words in a set first for faster membership testing:
ws = set(ws)
# ...
ssx = [w if w not in ws else ' ' for w in ssx]
or, as a complete solution:
with open('C:/test01.txt') as words:
    ws = set(words.read().splitlines())
with open('C:/test02.txt') as file_modify4:
    for x in file_modify4:
        ssx = [w if w not in ws else ' ' for w in x.strip().split('\t')[0].split()]
        print(ssx)
The naive solution is:
new_ssx = []
for word in ssx:
    if word in ws:
        new_ssx.append(' ')
    else:
        new_ssx.append(word)
Of course, whenever you have an empty list that you just append to in a loop, you can turn it into a list comprehension:
new_ssx = [' ' if word in ws else word for word in ssx]
If ws is more than a few words, you probably want to turn it into a set to make the lookups faster.
So, putting it all together:
with open('C:/test01.txt') as words:
    ws = set(words.read().splitlines())
with open('C:/test02.txt') as file_modify4:
    for x in file_modify4:
        # a list comprehension instead of map() so that indexing works on Python 3 too
        sx = [part.strip() for part in x.split("\t")]
        ssx = sx[0].split(" ")
        new_ssx = [' ' if word in ws else word for word in ssx]
        print(new_ssx)
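If you want to try the idea without the two files, here is a minimal sketch. The function name blank_out and the sample data are made up for illustration; the data only mirrors what the question shows.

```python
def blank_out(words, stop_words):
    """Return a copy of words with every stop word replaced by a single space."""
    return [' ' if w in stop_words else w for w in words]

# Hypothetical stand-ins for the contents of test01.txt and test02.txt.
stop_words = {'to', 'her', 'too', 'a', 'for'}
lines = ['wow', 'listens to her music\tsecond column', 'too good']

for line in lines:
    # Only the first tab-separated column is processed, as in the question.
    first_column = line.strip().split('\t')[0]
    print(blank_out(first_column.split(), stop_words))
```

Keeping the replacement in a small function makes it easy to test separately from the file handling.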
Related
I am looking for an output string with the words that contain vowels removed.
Input: My name is 123
Output: my 123
I tried the code below:
def without_vowels(sentence):
    vowels = 'aeiou'
    word = sentence.split()
    for l in word:
        for k in l:
            if k in vowels:
                l = ''
without_vowels('my name 123')
Can anyone give me a solution using a list comprehension?
You can use a regex search for the characters 'a|e|i|o|u', calling .lower() on each word first in case it contains upper-case characters, like below:
>>> import re
>>> st = 'My nAmE Is 123 MUe'
>>> [s for s in st.split() if not re.search(r'a|e|i|o|u',s.lower())]
['My', '123']
>>> ' '.join(s for s in st.split() if not re.search(r'a|e|i|o|u',s.lower()))
'My 123'
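This is one way to do it:
def without_vowels(sentence):
    words = sentence.split()
    vowels = ['a', 'e', 'i', 'o', 'u']
    cleaned_words = [w for w in words if not any(v in w for v in vowels)]
    cleaned_string = ' '.join(cleaned_words)
    print(cleaned_string)
Calling without_vowels('my name 123') outputs my 123. Note that the check is case-sensitive, so a word such as 'Is' is only removed if you compare against w.lower().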
def rem_vowel(string):
    vowels = ['a','e','i','o','u']
    result = [letter for letter in string if letter.lower() not in vowels]
    result = ''.join(result)
    print(result)
string = "My name is 123"
rem_vowel(string)
import re
def rem_vowel(string):
    return re.sub("[aeiouAEIOU]", "", string)
# Driver program
string = " I am uma Bhargav "
print(rem_vowel(string))
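The answers above do two different things: some drop every word that contains a vowel (which is what the question's expected output shows), while others only strip the vowel letters. A small sketch that puts both side by side; the function names are made up for illustration.

```python
VOWELS = set('aeiou')

def drop_words_with_vowels(sentence):
    """Keep only the words that contain no vowel at all."""
    return ' '.join(w for w in sentence.split() if not VOWELS & set(w.lower()))

def strip_vowel_letters(sentence):
    """Remove the vowel letters themselves and keep everything else."""
    return ''.join(ch for ch in sentence if ch.lower() not in VOWELS)

print(drop_words_with_vowels('My name is 123'))  # My 123
print(strip_vowel_letters('My name is 123'))     # My nm s 123
```

Both functions lower-case before comparing, so upper-case vowels are handled as well.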
I was wondering if it would be possible to split a string such as
string = 'hello world [Im nick][introduction]'
into an array such as
['hello', 'world', '[Im nick][introduction]']
It doesn't have to be efficient; I just need a way to split a sentence into words, except for the parts in brackets, which should not be split.
I need this because I have a markdown file with sentences such as
- What is the weather in [San antonio, texas][location]
I need the San antonio, texas part to stay as a single element of the array. Would this be possible? The array would look like:
array = ['what', 'is', 'the', 'weather', 'in', '[San antonio, texas][location]']
Maybe this could work for you:
>>> s = 'What is the weather in [San antonio, texas][location]'
>>> i1 = s.index('[')
>>> i2 = s.index('[', i1 + 1)
>>> part_1 = s[:i1].split() # everything before the first bracket
>>> part_2 = [s[i1:i2], ] # first bracket pair
>>> part_3 = [s[i2:], ] # second bracket pair
>>> parts = part_1 + part_2 + part_3
>>> s
'What is the weather in [San antonio, texas][location]'
>>> parts
['What', 'is', 'the', 'weather', 'in', '[San antonio, texas]', '[location]']
It searches for the left brackets and uses their positions as reference points before splitting on spaces.
This assumes:
that there is no other text between the first closing bracket and the second opening bracket;
that there is nothing after the second closing bracket.
Here is a more robust solution:
def do_split(s):
    parts = []
    while '[' in s:
        start = s.index('[')
        end = s.index(']', s.index(']')+1) + 1  # looks for second closing bracket
        parts.extend(s[:start].split())  # everything before the opening bracket
        parts.append(s[start:end])  # 2 pairs of brackets
        s = s[end:]  # remove processed part of the string
    parts.extend(s.split())  # add remainder
    return parts
This yields:
>>> do_split('What is the weather in [San antonio, texas][location] on [friday][date]?')
['What', 'is', 'the', 'weather', 'in', '[San antonio, texas][location]', 'on', '[friday][date]', '?']
Maybe this short snippet can help you. But note that this only works if everything you said holds true for all the entries in the file.
s = 'What is the weather in [San antonio, texas][location]'
s = s.split(' [')
s[1] = '[' + s[1] # add back the split character
mod = s[0] # store in a variable
mod = mod.split(' ') # split the first part on space
mod.append(s[1]) # attach back the right part
print(mod)
Outputs:
['What', 'is', 'the', 'weather', 'in', '[San antonio, texas][location]']
and for s = 'hello world [Im nick][introduction]'
['hello', 'world', '[Im nick][introduction]']
For a one-liner, use functional programming tools such as reduce from the functools module. Like the snippet above, this assumes that s is the original string and that the bracketed part comes last:
from functools import reduce
reduce(lambda acc, part: acc + (['[' + part] if part.endswith(']') else part.split()), s.split(' ['), [])
or, slightly shorter, using map and sum:
sum(map(lambda part: ['[' + part] if part.endswith(']') else part.split(), s.split(' [')), [])
This code below will work with your example. Hope it helps :)
I'm sure it can be better but now I have to go. Please enjoy.
string = 'hello world [Im nick][introduction]'
words = string.split(' ')  # avoid shadowing the built-in name list
final = []
for idx, elem in enumerate(words):
    if elem[0] == '[' and elem[-1] != ']':
        # rejoin the two halves of the bracketed part, restoring the space
        final.append(elem + ' ' + words[(idx + 1) % len(words)])
    elif elem[0] != '[' and elem[-1] != ']':
        final.append(elem)
print(final)
Let me offer an alternative to the ones above:
import re
string = 'hello world [Im nick][introduction]'
re.findall(r'(\[.+\]|\w+)', string)
Produces:
['hello', 'world', '[Im nick][introduction]']
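One caveat with the pattern above: .+ is greedy, so if a sentence contains more than one bracketed part, everything from the first '[' to the last ']' ends up in a single element. A possible variation, as a sketch only (the name split_keep_brackets is made up), matches each run of adjacent [...] groups separately and also keeps punctuation that \w+ would drop:

```python
import re

# Either one or more adjacent [...] groups, or a run of characters that
# are neither whitespace nor brackets.
TOKEN = re.compile(r'(?:\[[^\]]*\])+|[^\s\[\]]+')

def split_keep_brackets(text):
    """Split on whitespace, but keep bracketed parts together."""
    return TOKEN.findall(text)

print(split_keep_brackets('hello world [Im nick][introduction]'))
print(split_keep_brackets('in [San antonio, texas][location] on [friday][date]?'))
```

This assumes brackets are never nested and are always closed.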
You can use re.split with a negative lookahead so that only the spaces outside brackets are split on. Note that it is simpler to filter out empty entries with filter or a list comprehension than to avoid them in the regex:
import re
s = 'sss sss bbb [zss sss][zsss ss] sss sss bbb [ss sss][sss ss]'
[x for x in re.split(r" (?![^\[\]]*\])", s) if x]
My code handles str data:
def characters(self, content):
    self.contentText = content.split()
    # self.contentText is a list here
I pass the self.contentText list to another module like this:
self.contentText = Formatter.formatter(self.contentText)
In this method, I have written the code below:
remArticles = {' a ':'', ' the ':'', ' and ':'', ' an ':'', '&nbsp;':''}
contentText = [i for i in contentText if i not in remArticles.keys()]
But it does not remove anything. Should remArticles be a list rather than a dict?
I tried a list too, and it still did not remove anything.
With a list, of course, the code is:
contentText = [i for i in contentText if i not in remArticles]
This is a continuation of "Accessing Python List Type".
Initially I was trying:
for i in remArticles:
    print type(contentText)
    print "1"
    contentText = contentText.replace(i, remArticles[i])
    print type(contentText)
But that threw errors:
contentText = contentText.replace(i, remArticles[i])
AttributeError: 'list' object has no attribute 'replace'
Your question is not clear but if your goal is to convert a string to a list, remove unwanted words, and then turn the list back into a string, then you can do it like this:
def clean_string(s):
    words_to_remove = ['a', 'the', 'and', 'an', ' ']
    list_of_words = s.split()
    cleaned_list = [word for word in list_of_words if word not in words_to_remove]
    new_string = ' '.join(cleaned_list)
    return new_string
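This is how you could do the same without converting to a list, using a regular expression with word boundaries so that only whole words are removed (a plain s.replace('a', '') would also delete the letter inside other words):
import re
def clean_string(s):
    words_to_remove = ['a', 'the', 'and', 'an']
    pattern = r'\b(?:' + '|'.join(words_to_remove) + r')\b\s*'
    return re.sub(pattern, '', s)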
And if you wanted more flexibility in removing some words but replacing others, you could do the following with a dictionary:
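import re
def clean_string(s):
    words_to_replace = {'a': '', 'the': '', 'and': '&', 'an': ''}
    # match whole words only; str.replace would also hit letters inside other words
    pattern = r'\b(?:' + '|'.join(words_to_replace) + r')\b'
    return re.sub(pattern, lambda m: words_to_replace[m.group(0)], s)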
Your problem is that the keys of your dict contain surrounding spaces, while the words produced by split() do not. Stripping the keys first solves your problem:
unwanted = {k.strip() for k in remArticles}
contentText = [i for i in contentText if i not in unwanted]
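To see why the original filter never matched, here is a small self-contained sketch. The names rem_articles and remove_articles and the sample sentence are made up for illustration.

```python
rem_articles = {' a ': '', ' the ': '', ' and ': '', ' an ': ''}

def remove_articles(words, articles):
    """Drop every word that equals one of the (stripped) article keys."""
    # split() never yields tokens with surrounding spaces, so strip the keys once.
    unwanted = {key.strip() for key in articles}
    return [w for w in words if w not in unwanted]

content = 'the cat and a dog'.split()
print([w for w in content if w in rem_articles])  # [] because 'the' != ' the '
print(remove_articles(content, rem_articles))     # ['cat', 'dog']
```

The comparison is case-sensitive, so 'The' at the start of a sentence would survive unless you lower-case the words first.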
I want to get rid of the white space at the end of each line.
w = input("Words: ")
w = w.split()
k = 1
length = []
for ws in w:
    length.append(len(ws))
y = sorted(length)
while k <= y[-1]:
    if k in length:
        for ws in w:
            if len(ws) != k:
                continue
            else:
                print(ws, end=" ")
        print("")
    k += 1
The output gives me lines of words in ascending length, e.g. if I type in I do love QI:
I
do QI
love
But there is white space at the end of each line. If I try to .rstrip() it, I also delete the spaces between the words and get:
I
doQI
love
Use " ".join(...) instead and it will automatically put the words on the same line without a trailing space (you will need to collect the words of each length in a list rather than printing them one by one).
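A minimal sketch of the join approach, assuming the words of each length are collected in a dictionary first. The function name lines_by_length is made up, and the input is hard-coded instead of read with input().

```python
from collections import defaultdict

def lines_by_length(text):
    """Group words by length and return one space-joined line per length."""
    groups = defaultdict(list)
    for word in text.split():
        groups[len(word)].append(word)
    # join() only puts spaces between words, so no line ends with a space.
    return [' '.join(groups[n]) for n in sorted(groups)]

for line in lines_by_length('I do love QI'):
    print(line)
```

For the example input this prints I, then do QI, then love, each on its own line.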
You can use re.sub from the re module to remove the trailing spaces from a string x:
re.sub(r"[ ]*$","",x)
You need to use rstrip.
Demo:
>>> 'hello '.rstrip()
'hello'
rstrip removes any whitespace from the right.
lstrip removes whitespace from the left:
>>> ' hello '.lstrip()
'hello '
while strip removes it from both ends:
>>> ' hello '.strip()
'hello'
You need to use split to convert a string to a list:
>>> "hello,how,are,you".split(',') # if ',' is the delimiter
['hello', 'how', 'are', 'you']
>>> "hello how are you".split() # if whitespace is delimiter
['hello', 'how', 'are', 'you']
If I have a list of strings:
common = ['the','in','a','for','is']
and I have a sentence broken up into a list:
lst = ['the', 'man', 'is', 'in', 'the', 'barrel']
how can I compare the two and, if there are any words in common, print the full sentence again as a title while leaving those words in lower case? I have part of it working, but my end result prints the common words twice: once unchanged and once in title case.
new_title = lst.pop(0).title()
for word in lst:
    for word2 in common:
        if word == word2:
            new_title = new_title + ' ' + word
    new_title = new_title + ' ' + word.title()
print(new_title)
output:
The Man is Is in In the The Barrel
So I'm trying to get the lower-case words in common to stay in the new sentence as they are, without the title-cased duplicates.
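>>> # assumes 'is' is not in common; with the question's list the result is 'The Man is in the Barrel'
>>> new_title = ' '.join(w.title() if w not in common else w for w in lst)
>>> new_title = new_title[0].capitalize() + new_title[1:]
>>> new_title
'The Man Is in the Barrel'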
If all you’re trying to do is to see whether any of the elements of lst appear in common, you can do
>>> common = ['the','in','a','for']
>>> lst = ['the', 'man', 'is', 'in', 'the', 'barrel']
>>> sorted(set(common).intersection(lst))  # sorted, because set order is arbitrary
['in', 'the']
and just check to see whether the resulting list has any elements in it.
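If you only need a yes/no answer, a short sketch using set.isdisjoint avoids building the intersection at all. The variable name has_common_word is made up for illustration.

```python
common = {'the', 'in', 'a', 'for'}
lst = ['the', 'man', 'is', 'in', 'the', 'barrel']

# isdisjoint() returns True when the two collections share no element,
# so negating it tells us whether at least one common word is present.
has_common_word = not common.isdisjoint(lst)
print(has_common_word)  # True
```

isdisjoint accepts any iterable, so lst does not have to be converted to a set first.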
If you want the words in common to stay lowercase and all of the other words to be title-cased, do something like this:
def title_case(words):
    common = {'the','in','a','for'}
    partial = ' '.join(word.title() if word not in common else word for word in words)
    return partial[0].capitalize() + partial[1:]
words = ['the', 'man', 'is', 'in', 'the', 'barrel']
title_case(words)  # gives "The Man Is in the Barrel"