Sorry if the title isn't descriptive enough. Basically, I have a list like
["The house is red.", "Yes it is red.", "Very very red."]
and I'd like to insert the word "super" before the first character, between the middle characters and after the last character of each string. So I would have something like this for the first element:
["superThe houssupere is red.super",...]
How would I do this? I know that with a single string I could add "super" to the beginning, use len() to find the middle and insert "super" there, then add it again at the end. Is there a way to get this to work with a list, or should I try a different approach?
The method used here is to iterate through the original list, splitting each item into two halves and building the final item string using .format before appending it into a new list.
orig_list = ["The house is red.", "Yes it is red.", "Very very red."]
new_list = []
word = 'super'
for item in orig_list:
    first_half = item[:len(item) // 2]
    second_half = item[len(item) // 2:]
    item = '{}{}{}{}{}'.format(word, first_half, word, second_half, word)
    new_list.append(item)
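The same split-and-join can be wrapped in a small helper and applied with a list comprehension. This is only a sketch: the name insert_word is made up for illustration and does not come from the answer above.

```python
def insert_word(text, word):
    """Return text with word added at the start, at the midpoint and at the end."""
    mid = len(text) // 2  # integer midpoint; for odd lengths the second half is one char longer
    return word + text[:mid] + word + text[mid:] + word


orig_list = ["The house is red.", "Yes it is red.", "Very very red."]
new_list = [insert_word(item, "super") for item in orig_list]
print(new_list[0])  # superThe houssupere is red.super
```

Keeping the string logic in one function means the list handling is just a comprehension, and the helper can be tried on a single string first.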
I'm looking for a fast way to find the indexes of all the words in a string that match items in a list (one or multiple words). I don't need the index in the list; I need the index in the string.
I have a list of words and a string like these:
words = ['must', 'shall', 'may','should','forbidden','car',...]
string= 'you should wash the car every day'
desired output:
[1,4]# should=1, car=4
The list can sometimes have more than hundreds of items, and the string can be longer than tens of thousands.
I need a very fast approach because it is called a thousand times in each iteration.
I know how to implement it with loops that check all the items one by one, but that is too slow!
One solution is to make words a set instead of a list and then use a simple list comprehension:
words = {'must', 'shall', 'may','should','forbidden','car'}
string= 'you should wash the car every day'
out = [i for i, w in enumerate(string.split()) if w in words]
print(out)
Prints:
[1, 4]
You need the Aho-Corasick algorithm to do this.
Given a set of strings and a text, it finds occurrences of all strings from the set in the given text in O(len+ans), where len is the length of the text and ans is the size of the answer.
It uses an automaton and can be modified to suit your needs.
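To make the automaton idea concrete, here is a minimal pure-Python sketch of Aho-Corasick. The function names build_automaton and find_all are assumptions made for this example, not part of any library. Note that it reports character offsets of substring matches, so 'car' would also be found inside 'scar'; mapping offsets to word indexes is left out.

```python
from collections import deque


def build_automaton(patterns):
    """Build the trie, failure links and output lists for the patterns."""
    goto, fail, out = [{}], [0], [[]]
    for pat in patterns:
        state = 0
        for ch in pat:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(pat)
    # Breadth-first pass: a node's failure link points to the longest
    # proper suffix of its path that is also a path in the trie.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            out[nxt] = out[nxt] + out[fail[nxt]]
    return goto, fail, out


def find_all(text, patterns):
    """Return (start_offset, pattern) for every occurrence in text."""
    goto, fail, out = build_automaton(patterns)
    state, hits = 0, []
    for i, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for pat in out[state]:
            hits.append((i - len(pat) + 1, pat))
    return hits


words = ['must', 'shall', 'may', 'should', 'forbidden', 'car']
print(find_all('you should wash the car every day', words))
# [(4, 'should'), (20, 'car')]
```

If the same word list is reused across many calls, building the automaton once and scanning each text with it would avoid repeating the setup work.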
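You can use a dictionary.
The average time complexity of a dictionary lookup is O(1). Note that if a word occurs more than once in the string, this mapping keeps only its last index.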
string = 'you should wash the car every day'
wordToIndex = {word: index for index, word in enumerate(string.split())}
words = ['must', 'shall', 'may','should','forbidden','car']
result = [wordToIndex[word] for word in words if word in wordToIndex]
# [1,4]
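If repeated words matter, a variation is to map each word to every position where it occurs. This is a sketch under that assumption; word_positions is a name invented for the example.

```python
from collections import defaultdict


def word_positions(text, words):
    """Map each requested word to the list of word indexes where it occurs."""
    positions = defaultdict(list)
    for index, token in enumerate(text.split()):
        positions[token].append(index)
    # Only report words that actually appear in the text.
    return {w: positions[w] for w in words if w in positions}


print(word_positions('the car should pass the other car', ['should', 'car', 'may']))
# {'should': [2], 'car': [1, 6]}
```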
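Use a list comprehension (index() returns the first position of a word, so repeated words give repeated indexes):
tokens = string.split()
print([tokens.index(w) for w in tokens if w in words])
# [1, 4]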
I have a list of strings and I want to take the last "word" of each one.
Here's my list:
myList = ["code 53 value 281", "code 53 value 25", ....]
And I want to keep only the number at the end:
myList = ["281", "25", ....]
Thank you.
Let's break down your problem.
So first off, you've got a list of strings. You know that each string will end with some kind of numeric value, you want to pull that out and store it in the list. Basically, you want to get rid of everything except for that last numeric value.
To write it in code terms, we need to iterate on that list, split each string by a space character ' ', then grab the last word from that collection, and store it in the list.
There are quite a few ways you could do this, but the simplest would be list comprehension.
myList = ["Hey 123", "Hello 456", "Bye 789"] # we want 123, 456, 789
myNumericList = [x.split(' ')[-1] for x in myList]
# for x in myList is pretty obvious, looks like a normal for loop
# x.split(' ') will split the string by the space, as an example, "Hey 123" would become ["Hey", "123"]
# [-1] gets the last element from the collection
print(myNumericList) # ['123', '456', '789']
I don't know why you would want to check whether there are integers in your text, extract them, convert them back to strings and add them to a list. Anyhow, you can use .split() to split the text on whitespace and then take the last piece of each string, like so:
myList = ["code 53 value 281", "code 53 value 25"]
result = []  # not named `list`, which would shadow the built-in
for var in myList:
    result.append(var.split()[-1])
print(result)
Loop through the list and for a particular value at i-th index in the list simply pick the last value.
See code section below:
ans=[]
for i in myList:
    ans.append(i.split(" ")[-1])
print(ans)
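Since only the last word is needed, another option is str.rsplit with maxsplit=1, which splits once from the right instead of splitting the whole string. The sketch below assumes every string is non-empty; the variable names are illustrative.

```python
my_list = ["code 53 value 281", "code 53 value 25"]

# rsplit(maxsplit=1) splits at most once, starting from the right,
# so [-1] is the last whitespace-separated word.
last_words = [s.rsplit(maxsplit=1)[-1] for s in my_list]
print(last_words)  # ['281', '25']

# If real numbers are wanted rather than strings, convert the digit-only ones.
numbers = [int(w) for w in last_words if w.isdigit()]
print(numbers)  # [281, 25]
```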
Here's the question I have to answer for school:
For the purposes of this question, we will define a word as ending a sentence if that word is immediately followed by a period. For example, in the text “This is a sentence. The last sentence had four words.”, the ending words are ‘sentence’ and ‘words’. In a similar fashion, we will define the starting word of a sentence as any word that is preceded by the end of a sentence. The starting words from the previous example text would be “The”. You do not need to consider the first word of the text as a starting word. Write a program that has:
An endwords function that takes a single string argument. This function must return a list of all sentence-ending words that appear in the given string. There should be no duplicate entries in the returned list, and the periods should not be included in the ending words.
The code I have so far is:
def startwords(astring):
    mylist = astring.split()
    if mylist.endswith('.') == True:
        return my list
but I don't know if I'm using the right approach. I need some help
Several issues with your code. The following would be a simple approach. Create a list of bigrams and pick the second token of each bigram where the first token ends with a period:
def startwords(astring):
    mylist = astring.split() # a list! Has no 'endswith' method
    bigrams = zip(mylist, mylist[1:])
    return [b[1] for b in bigrams if b[0].endswith('.')]
zip and list comprehensions are two things worth reading up on.
mylist = astring.split()
if mylist.endswith('.')
That cannot work; one of the reasons is that mylist is a list and doesn't have endswith as a method.
Another answer fixed your approach so let me propose a regular expression solution:
import re
print(re.findall(r"\.\s*(\w+)","This is a sentence. The last sentence had four words."))
This matches every word that follows a dot and optional whitespace.
Result: ['The']
def endwords(astring):
    mylist = astring.split('.')
    temp_words = [x.rpartition(" ")[-1] for x in mylist if len(x) > 1]
    return list(set(temp_words))
This creates a set so there are no duplicates. It loops over the sentences (the text split on "."), splits each sentence into words, uses [-1:] to make a list holding only the last word, and takes item [0] of that list. Here s is the input string.
print (set([ x.split()[-1:][0] for x in s.split(".") if len(x.split())>0]))
The if is needed because splitting on "." leaves an empty string after the final period, and indexing its empty word list would raise an IndexError.
This works as well:
print (set([ x.split() [len(x.split())-1] for x in s.split(".") if len(x.split())>0]))
This is one way to do it ->
#!/usr/bin/env python3
sentence = 'This is a sentence. The last sentence had four words.'
uniq_end_words = set()  # the built-in set replaces the old sets.Set
for word in sentence.split():
    # keep only words whose last character is a period
    if word.endswith('.'):
        uniq_end_words.add(word.rstrip('.'))
print(list(uniq_end_words))
Output (list of all the end words in a given sentence; the order may vary because sets are unordered) ->
['words', 'sentence']
If your input string has a period inside one of its words (let's say the last word), something like this ->
'I like the documentation of numpy.random.rand.'
The output would be - ['numpy.random.rand']
And for input string 'I like the documentation of numpy.random.rand a lot.'
The output would be - ['lot']
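If the order in which the ending words appear should be kept, one possible variant removes duplicates with dict.fromkeys instead of a set. This is a sketch that follows the question's definition (a word immediately followed by a period), not a definitive solution.

```python
def endwords(text):
    """Return the sentence-ending words of text, without duplicates, in order of appearance."""
    # A word ends a sentence if it is immediately followed by a period.
    ending = (word.rstrip('.') for word in text.split() if word.endswith('.'))
    # dict keys keep insertion order, so this de-duplicates while preserving order;
    # the `if word` filter drops tokens that consisted only of periods.
    return list(dict.fromkeys(word for word in ending if word))


print(endwords('This is a sentence. The last sentence had four words.'))
# ['sentence', 'words']
```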
So I have a list in which each element is a string in the form of a sentence, like this
a = ["This is a sentence with some words.", "And this is a sentence as well.", "Also this right here is a sentence."]
What I want to do with this list is to only keep the third and fourth word of each string, so in the end I want a list like
b = ["a sentence", "is a", "right here"]
The first thing to do, I presume, is to split each string on spaces, so something like
for x in a:
    x.split()
However, I'm a bit confused about how to continue. The above loop should basically produce one list per sentence in which every word is its own element. I thought about doing this
e = []
for x in a:
    x.split()
    a = x[0:2]
    a = x[2:]
    e.append(a)
but instead of removing words it removes characters and I get the following output
['is is a sentence with some words.', 'd this is a sentence as well.', 'so this right here is a sentence.']
I'm not sure why it produces this behavior. I have been sitting at this for a while now and probably missed something really stupid, so I would really appreciate some help.
Nothing can modify a string; strings are immutable. You can only derive new data from one. As others have said, you need to store the value returned by .split().
Lists are mutable but slicing them also does not modify them in place, it creates a new sublist which you need to store somewhere. Overall this can be done like so:
e = [' '.join(x.split()[2:4]) for x in a]
The whole thing is a list comprehension in case you're not familiar. .join() converts the sublist back into a string.
When you do x.split(), the result is not applied to x itself, since strings are immutable; the call returns a new list of strings that you have to store:
lst = x.split()
Then just join your desired items:
e.append(' '.join(lst[2:4]))
Strings are immutable. x.split() returns a list of strings, but does not modify x. However you do not capture that return value, so it is lost.
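Putting these points together, a small sketch might look like the following. The helper name keep_words and its parameters are invented for illustration.

```python
def keep_words(sentences, start, stop):
    """For each sentence, keep the words at positions start..stop-1, joined by spaces."""
    # split() returns a new list; slicing it and joining builds a new string,
    # so the original sentences are left untouched.
    return [' '.join(sentence.split()[start:stop]) for sentence in sentences]


a = ["This is a sentence with some words.",
     "And this is a sentence as well.",
     "Also this right here is a sentence."]
b = keep_words(a, 2, 4)
print(b)  # ['a sentence', 'is a', 'right here']
```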
I am looking for a specific string in a list; this string is part of a longer string.
Basically I loop through a text file and add each line as a separate element of a list. Now my objective is to scan the whole list to find out whether any of the element strings contains a specific string.
example of the source file:
asfasdasdasd
asdasdasdasdasd mystring asdasdasdasd
asdasdasdasdasdasdadasdasdasdas
Now imagine that each of the 3 strings is an element of the list, and you want to know whether the list has the string "my string" in any of its elements (I don't need to know where it is, or how many occurrences of the string are in the list). I tried to get it with this, but it doesn't seem to find any occurrence
work_list=["asfasdasdasd", "asdasdasdasd my string asdasdasdasd", "asdadadasdasdasdas"]
has_string=False
for item in work_list:
    if "mystring" in work_list:
        has_string=True
        print("***Has string TRUE*****")
print(" \n".join(work_list))
The output is just the list, and the bool has_string stays False.
Am I missing something, or am I using the in statement in the wrong way?
You want it to be:
if "mystring" in item:
A concise (and usually faster) way to do this:
if any("my string" in item for item in work_list):
    has_string = True
    print("found mystring")
But really what you've done is implement grep.
Method 1
[s for s in stringList if ("my string" in s)]
# --> ["blah my string blah", "my string", ...]
This will yield a list of all the strings which contain "my string".
Method 2
If you just want to check if it exists somewhere, you can be faster by doing:
any(("my string" in s) for s in stringList)
# --> True|False
This has the benefit of terminating the search on the first occurrence of "my string".
Method 3
You will want to put this in a function, preferably a lazy generator:
def search(stringList, query):
    for s in stringList:
        if query in s:
            yield s
list( search(["an apple", "a banana", "a cat"], "a ") )
# --> ["a banana", "a cat"]
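Because any() and generators consume their input lazily, the same check can run directly over the lines of a file without first building a list. The sketch below uses io.StringIO as a stand-in for an open file, and contains is a hypothetical helper name.

```python
import io


def contains(lines, query):
    """Return True as soon as one line contains query; stops reading at the first hit."""
    return any(query in line for line in lines)


# io.StringIO stands in for a real file opened with open(...).
fake_file = io.StringIO(
    "asfasdasdasd\n"
    "asdasdasdasdasd mystring asdasdasdasd\n"
    "asdasdasdasdasdasdadasdasdasdas\n"
)
print(contains(fake_file, "mystring"))  # True
```

The same function also accepts a list of strings, since it only needs something it can iterate over.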