String to list in Python - python

I'm trying to split a string:
'QH QD JC KD JS'
into a list like:
['QH', 'QD', 'JC', 'KD', 'JS']
How would I go about doing this?

>>> 'QH QD JC KD JS'.split()
['QH', 'QD', 'JC', 'KD', 'JS']
From the documentation for str.split:
Return a list of the words in the
string, using sep as the delimiter
string. If maxsplit is given, at most
maxsplit splits are done (thus, the
list will have at most maxsplit+1
elements). If maxsplit is not
specified, then there is no limit on
the number of splits (all possible
splits are made).
If sep is given, consecutive
delimiters are not grouped together
and are deemed to delimit empty
strings (for example,
'1,,2'.split(',') returns ['1', '', '2']). The sep argument may consist of
multiple characters (for example,
'1<>2<>3'.split('<>') returns ['1', '2', '3']). Splitting an empty string
with a specified separator returns
[''].
If sep is not specified or is None, a
different splitting algorithm is
applied: runs of consecutive
whitespace are regarded as a single
separator, and the result will contain
no empty strings at the start or end
if the string has leading or trailing
whitespace. Consequently, splitting an
empty string or a string consisting of
just whitespace with a None separator
returns [].
For example, ' 1 2 3 '.split()
returns ['1', '2', '3'], and ' 1 2 3 '.split(None, 1) returns ['1', '2 3 '].
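A short sketch of the rules quoted above, run on the hand from the question. The variable names are only illustrative and not part of the original answer:

```python
hand = 'QH QD JC KD JS'

# No sep: runs of whitespace act as a single separator.
cards = hand.split()
# maxsplit=2: at most two splits, so at most three elements.
first_two = hand.split(None, 2)
# An explicit sep does not group consecutive delimiters.
with_sep = 'QH  QD'.split(' ')
without_sep = 'QH  QD'.split()
# Empty string: [] without sep, [''] with one.
empty_cases = (''.split(), ''.split(','))

print(cards)        # ['QH', 'QD', 'JC', 'KD', 'JS']
print(first_two)    # ['QH', 'QD', 'JC KD JS']
print(with_sep)     # ['QH', '', 'QD']
print(without_sep)  # ['QH', 'QD']
print(empty_cases)  # ([], [''])
```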

Here is the simplest way to split a string into individual characters:
a = [x for x in 'abcdefgh'] #['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']

Maybe like this:
list('abcdefgh') # ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']

Or for fun:
>>> import ast
>>> s = 'QH QD JC KD JS'
>>> ast.literal_eval('[%s]'%','.join(map(repr,s.split())))
['QH', 'QD', 'JC', 'KD', 'JS']

You can use the str.split() method, which returns a list, to separate them.
letters = 'QH QD JC KD JS'
letters_list = letters.split()
Printing letters_list now shows:
['QH', 'QD', 'JC', 'KD', 'JS']
Now you have a list that you can work with just like any other list. For example, you can access elements by index:
print(letters_list[2])
This prints the third element of your list, which is 'JC'.

Related

Recreating the strip() method using list comprehensions but output returns unexpected result

I am trying to 'recreate' the str.strip() method in Python for fun.
def ourveryownstrip(string, character):
    newString = [n for n in string if n != character]
    return newString
print("The string is", ourveryownstrip(input("Enter a string.`n"), input("enter character to remove`n")))
The function takes two arguments: 1) the string supplied, and 2) a string or character that the person wants to remove from it (whitespace, for example). A list comprehension with a condition then stores the 'modified' string as a new list, and the function returns that list.
The output, however, is a list containing every character of the string as a separate element.
Actual output:
Boeing 747 Boeing 787
enter character to removeBoeing
The string is ['B', 'o', 'e', 'i', 'n', 'g', ' ', '7', '4', '7', ' ', 'B', 'o', 'e', 'i', 'n', 'g', ' ', '7', '8', '7']
How can I fix this?
Your list comprehension compares each individual character of the string with 'Boeing'. A single character never equals a six-character string, so nothing is filtered out and the whole input comes back. The result is a list because a list comprehension always builds a list. As @BrutusForcus said, this can be solved with string slicing and the str.index() method, like this:
def ourveryownstrip(string, character):
    while character in string:
        string = string[:string.index(character)] + string[string.index(character)+len(character):]
    return string
This will first check if the value you want removed is in your string. If it is then string[:string.index(character)] will get all of the string before the first occurrence of the character variable value and string[string.index(character)+len(character):] will get everything in the string after the first occurrence of the variable value. That will keep happening until the variable value doesn't occur in the string anymore.
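For comparison, here is a minimal sketch using the built-in str.replace, which removes non-overlapping occurrences in a single pass. The helper name remove_all is hypothetical. On the Boeing input it gives the same result as the loop above, but the two can differ when a removal creates a new occurrence, as the second call shows:

```python
def remove_all(text, target):
    """Hypothetical helper: drop every non-overlapping occurrence of target."""
    return text.replace(target, '')

cleaned = remove_all('Boeing 747 Boeing 787', 'Boeing')
print(repr(cleaned))  # ' 747  787'

# One pass only: removing the inner 'ab' leaves a new 'ab' behind,
# whereas the while loop above would keep going until nothing is left.
single_pass = remove_all('aabb', 'ab')
print(repr(single_pass))  # 'ab'
```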

What regex will emulate the default behavior of split() in python?

Using split() I can easily create from a string the list of tokens that are divided by space:
>>> 'this is a test 200/2002'.split()
['this', 'is', 'a', 'test', '200/2002']
How do I do the same using re.compile and re.findall? I need something similar to the following example, but without splitting the "200/2002".
>>> import re
>>> test = re.compile(r'\w+')
>>> test.findall('this is a test 200/2002')
['this', 'is', 'a', 'test', '200', '2002']
This should output the desired list:
>>> test = re.compile(r'\S+')
>>> test.findall('this is a test 200/2002')
['this', 'is', 'a', 'test', '200/2002']
\S matches any character that is not whitespace (space, tab, newline, ...).
From the str.split() documentation:
If sep is not specified or is None, a different splitting algorithm is
applied: runs of consecutive whitespace are regarded as a single
separator, and the result will contain no empty strings at the start
or end if the string has leading or trailing whitespace. Consequently,
splitting an empty string or a string consisting of just whitespace
with a None separator returns [].
findall() with the above regex should behave the same way:
>>> test.findall(" a\nb\tc d ")
['a', 'b', 'c', 'd']
>>> " a\nb\tc d ".split()
['a', 'b', 'c', 'd']
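A quick way to check the claimed equivalence is to run both approaches over a few inputs. This is only a sketch: the sample strings are made up, and agreeing on a handful of ASCII cases is not a proof for every input.

```python
import re

token_re = re.compile(r'\S+')

# Illustrative inputs: empty, whitespace-only, mixed whitespace, and the question's string.
samples = ['', '   ', ' a\nb\tc d ', 'this is a test 200/2002']

# Pair each regex result with the str.split() result for the same input.
results = [(token_re.findall(s), s.split()) for s in samples]

for sample, (found, split) in zip(samples, results):
    print(repr(sample), found == split)
```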

Python-Getting contents between current and next occurrence of pattern in a string

I want to implement the following in Python:
(1) Search for a pattern in a string.
(2) Get the content up to the next occurrence of the same pattern in the same string.
Repeat (1) and (2) until the end of the string.
I searched the available answers, but none of them helped.
Thanks in advance.
As mentioned by Blckknght in the comment, you can achieve this with re.split. re.split retains all empty strings between a) the beginning of the string and the first match, b) the last match and the end of the string and c) between different matches:
>>> re.split('abc', 'abcabcabcabc')
['', '', '', '', '']
>>> re.split('bca', 'abcabcabcabc')
['a', '', '', 'bc']
>>> re.split('c', 'abcabcabcabc')
['ab', 'ab', 'ab', 'ab', '']
>>> re.split('a', 'abcabcabcabc')
['', 'bc', 'bc', 'bc', 'bc']
If you want to retain only c) the strings between 2 matches of the pattern, just slice the resulting array with [1:-1].
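As a small illustration of that slice, using one of the inputs above (the variable names are only for this sketch):

```python
import re

parts = re.split('a', 'abcabcabcabc')
# Drop the piece before the first match and the piece after the last match.
between = parts[1:-1]

print(parts)    # ['', 'bc', 'bc', 'bc', 'bc']
print(between)  # ['bc', 'bc', 'bc']
```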
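Do note that there are two caveats with this method:
Before Python 3.7, re.split did not split on an empty-string match (the call below returned ['abcabc']). Since 3.7 it splits at every empty match:
>>> re.split('', 'abcabc')
['', 'a', 'b', 'c', 'a', 'b', 'c', '']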
Content in capturing groups will be included in the resulting array.
>>> re.split(r'(.)(?!\1)', 'aaaaaakkkkkkbbbbbsssss')
['aaaaa', 'a', 'kkkkk', 'k', 'bbbb', 'b', 'ssss', 's', '']
You have to write your own function with finditer if you need to handle those use cases.
This is the variant where only case c) is matched.
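import re
def findbetween(pattern, text):
    out = []
    start = 0
    for m in re.finditer(pattern, text):
        # Text between the previous match (or the start of the string) and this match.
        out.append(text[start:m.start()])
        start = m.end()
    # Drop the first piece, which precedes the first match (case a).
    return out[1:]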
Sample run:
>>> findbetween('abc', 'abcabcabcabc')
['', '', '']
>>> findbetween(r'', 'abcdef')
['a', 'b', 'c', 'd', 'e', 'f']
>>> findbetween(r'ab', 'abcabcabc')
['c', 'c']
>>> findbetween(r'b', 'abcabcabc')
['ca', 'ca']
>>> findbetween(r'(?<=(.))(?!\1)', 'aaaaaaaaaaaabbbbbbbbbbbbkkkkkkk')
['bbbbbbbbbbbb', 'kkkkkkk']
(In the last example, (?<=(.))(?!\1) matches the empty string at the end of the string, so 'kkkkkkk' is included in the list of results)
You can use something like this:
re.findall(r"pattern.*?(?=pattern|$)",test_Str)
Here we search for the pattern, and the lookahead makes the lazy match stop just before the next occurrence of the pattern or at the end of the string.
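A concrete run of that idea, with 'ab' standing in for the pattern and a made-up input string. Note that each result includes the pattern itself, and that . does not cross newlines unless re.DOTALL is passed:

```python
import re

text = 'ab1ab22ab333'  # illustrative input, not from the question

# Lazy match from each 'ab' up to (but not including) the next 'ab' or the end.
chunks = re.findall(r'ab.*?(?=ab|$)', text)
print(chunks)  # ['ab1', 'ab22', 'ab333']

# Strip the leading pattern if only the content after it is wanted.
contents = [chunk[len('ab'):] for chunk in chunks]
print(contents)  # ['1', '22', '333']
```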

Split by regex without resulting empty strings in Python [duplicate]

I want to split a string that contains an irregularly repeating delimiter, the way the split() method does:
>>> ' a b c de '.split()
['a', 'b', 'c', 'de']
However, when I split with a regular expression, the result is different (empty strings sneak into the resulting list):
>>> re.split(r'\s+', ' a b c de ')
['', 'a', 'b', 'c', 'de', '']
>>> re.split(r'\.+', '.a.b...c..de..')
['', 'a', 'b', 'c', 'de', '']
And what I want to see:
>>> some_smart_split_method('.a.b...c..de..')
['a', 'b', 'c', 'de']
The empty strings are an inevitable result of the regex split (though there is good reasoning as to why that behavior might be desirable). To get rid of them, you can call filter on the result.
results = re.split(...)
results = list(filter(None, results))
Note the list() transform is only necessary in Python 3 -- in Python 2 filter() returns a list, while in 3 it returns a filter object.
>>> re.findall(r'\S+', ' a b c de ')
['a', 'b', 'c', 'de']
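Both suggestions can be applied to the dot-delimited string from the question. This is a sketch with illustrative variable names: the first line filters the empty strings out of the re.split result, the second matches runs of non-delimiter characters directly.

```python
import re

text = '.a.b...c..de..'

# Option 1: split on runs of dots, then discard the empty pieces.
filtered = [piece for piece in re.split(r'\.+', text) if piece]

# Option 2: match runs of anything that is not a dot.
matched = re.findall(r'[^.]+', text)

print(filtered)  # ['a', 'b', 'c', 'de']
print(matched)   # ['a', 'b', 'c', 'de']
```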

Take user input and put it in a list of characters (Python 3)

I'm looking for a simple way of taking input from the user and placing it into a list of single character elements.
For example:
Input: word2
List = ['w', 'o', 'r', 'd', '2']
Use input to read the user's input:
>>> word = input()
word2 <--- the user types this.
>>> word
'word2'
Use list to convert a string into a list containing each character.
>>> list(word)
['w', 'o', 'r', 'd', '2']
This works because list accepts any iterable and string objects are iterable; iterating over a string yields each character in turn.
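The same point in a short sketch, with a fixed string standing in for input() so that it runs without a prompt:

```python
word = 'word2'  # stands in for input()

# list() consumes any iterable; a string yields its characters one by one.
chars = list(word)
# An explicit loop over the string produces the same elements.
from_loop = [ch for ch in word]
# Another iterable, to show list() is not specific to strings.
numbers = list(range(3))

print(chars)               # ['w', 'o', 'r', 'd', '2']
print(chars == from_loop)  # True
print(numbers)             # [0, 1, 2]
```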
