String splitting after every other comma in a string in Python

I have a string in which every word is separated by a comma. I want to split the string at every other comma in Python. How should I do this?
E.g., "xyz,abc,jkl,pqr" should give "xyzabc" as one string and "jklpqr" as another string.

It's probably easier to split on every comma, and then rejoin pairs
>>> original = 'a,1,b,2,c,3'
>>> s = original.split(',')
>>> s
['a', '1', 'b', '2', 'c', '3']
>>> alternate = list(map(''.join, zip(s[::2], s[1::2])))  # list() so Python 3 also shows a list
>>> alternate
['a1', 'b2', 'c3']
Is that what you wanted?

Split, and rejoin.
So, to split:
In [173]: "a,b,c,d".split(',')
Out[173]: ['a', 'b', 'c', 'd']
And to rejoin:
In [193]: z = iter("a,b,c,d".split(','))
In [194]: [a+b for a,b in zip(*([z]*2))]
Out[194]: ['ab', 'cd']
This works because ([z]*2) is a list of two elements, both of which are the same iterator z. Thus, zip takes the first, then second element from z to create each tuple.
This also works as a oneliner, because in [foo]*n foo is evaluated only once, whether or not it is a variable or a more complex expression:
In [195]: [a+b for a,b in zip(*[iter("a,b,c,d".split(','))]*2)]
Out[195]: ['ab', 'cd']
I've also cut out a pair of brackets, because unary * has lower precedence than binary *.
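As a quick check of the shared-iterator explanation, here is a minimal sketch (Python 3 assumed; the names `both`, `same_object` and `pairs` are only illustrative). It shows that the two elements of `[z] * 2` are one and the same iterator object, which is why `zip` ends up pulling consecutive items:

```python
parts = "a,b,c,d".split(",")
z = iter(parts)

# [z] * 2 repeats the reference, it does not copy the iterator.
both = [z] * 2
same_object = both[0] is both[1]

# zip asks both[0] for an item, then both[1]; since they are the same
# iterator, each tuple holds two consecutive items from parts.
pairs = [a + b for a, b in zip(*both)]

print(same_object)
print(pairs)
```

If the list held two independent iterators instead (for example `[iter(parts), iter(parts)]`), each item would be paired with itself rather than with its neighbour.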
Thanks to @pillmuncher for pointing out that this can be extended with izip_longest to handle lists with an odd number of elements:
In [214]: from itertools import izip_longest
In [215]: [a+b for a,b in izip_longest(*[iter("a,b,c,d,e".split(','))]*2, fillvalue='')]
Out[215]: ['ab', 'cd', 'e']
(See: http://docs.python.org/library/itertools.html#itertools.izip_longest )
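For readers on Python 3, where `izip_longest` is named `zip_longest`, the same idea might look like the sketch below. The function name `join_pairs` and its `sep` parameter are made up for illustration and are not part of the answer above:

```python
from itertools import zip_longest


def join_pairs(text, sep=","):
    """Split text on sep and glue the pieces back together two at a time."""
    it = iter(text.split(sep))
    # Passing the same iterator twice makes zip_longest consume it in pairs;
    # fillvalue='' pads the last pair when the number of pieces is odd.
    return [a + b for a, b in zip_longest(it, it, fillvalue="")]


print(join_pairs("xyz,abc,jkl,pqr"))
print(join_pairs("a,b,c,d,e"))
```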

parts = "a,b,c,d".split(",")  # named parts to avoid shadowing the built-in str
print(["".join(parts[i:i+2]) for i in range(0, len(parts), 2)])

Just split on every comma, then combine it back:
splitList = someString.split(",")
# Note: with an odd number of items, the unpaired last item is dropped.
joinedString = ','.join([splitList[i - 1] + splitList[i] for i in range(1, len(splitList), 2)])

Like this
"a,b,c,d".split(',')

Related

Get all possible ordered sublists of a list

Let's say I have a list with the following letters:
lst=['A','B','C','D']
And I need to get all the possible sublists of that list that maintain the order. Thus, the result must be:
res=['A'
'AB'
'ABC'
'ABCD'
'B'
'BC'
'BCD'
'C'
'CD'
'D']
I had implemented the following for loop, but an error occurs, saying: TypeError: can only concatenate str (not "list") to str
res=[]
for x in range(len(lst)):
    for y in range(len(lst)):
        if x==y:
            res.append(x)
        if y>x:
            res.append(lst[x]+lst[y:len(lst)-1])
Is there a better and more efficient way to do this?
lst=['A','B','C','D']
out = []
for i in range(len(lst)):
    for j in range(i, len(lst)):
        out.append( ''.join(lst[i:j+1]) )
print(out)
Prints:
['A', 'AB', 'ABC', 'ABCD', 'B', 'BC', 'BCD', 'C', 'CD', 'D']
Rather than nested loops with redefined inner loop bounds on each go, you can use itertools to generate the bounds for you:
from itertools import combinations
lst = ['A','B','C','D']
out = []
for s, e in combinations(range(len(lst) + 1), 2):
    out.append(''.join(lst[s:e]))
combinations conveniently produces all possible start and end indices from a single range, producing each set one at a time in your desired order. It also simplifies the code enough that the equivalent listcomp isn't too unreadable, allowing you to condense three lines of code down to one:
out = [''.join(lst[s:e]) for s, e in combinations(range(len(lst) + 1), 2)]
Either way, out ends up with the value:
['A', 'AB', 'ABC', 'ABCD', 'B', 'BC', 'BCD', 'C', 'CD', 'D']
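To see why this works, it can help to look at the index pairs themselves. A small sketch (the variable name `bounds` is only for illustration):

```python
from itertools import combinations

lst = ['A', 'B', 'C', 'D']

# Every (start, end) pair with start < end, drawn from 0..len(lst).
bounds = list(combinations(range(len(lst) + 1), 2))
print(bounds[:5])

# Each pair is used directly as a slice: lst[start:end].
out = [''.join(lst[s:e]) for s, e in bounds]
print(out)
```

The pairs come out in lexicographic order, so all slices starting at index 0 are produced first, then those starting at 1, and so on, which matches the order asked for in the question.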
This is probably the closest to what you have got and will produce the desired result:
res=[]
for x in range(len(lst)):
    for y in range(len(lst)):
        if x==y:
            res.append(lst[x])
        if y>x:
            res.append(''.join(lst[x:y+1]))
The error you are describing means that you are trying to concatenate a string with a list:
lst[x]+lst[y:len(lst)-1]
lst[x] is a single-character string and lst[y:len(lst)-1] is a list of characters, and Python does not know how to add the two together. It can, however, combine the characters of a list slice into one string with str.join.
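A minimal sketch that reproduces the error and shows the join-based alternative (the names `error_message` and `joined` are only illustrative):

```python
lst = ['A', 'B', 'C', 'D']

try:
    lst[0] + lst[1:3]            # str + list is not defined
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Slicing first and joining afterwards keeps everything a string.
joined = ''.join(lst[0:3])

print(error_message)
print(joined)
```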

Basic Sorting / Order Algorithm

Trying to implement and form a very simple algorithm. This algorithm takes in a sequence of letters or numbers. It first creates an array (list) out of each character or digit. Then it checks each individual character compared with the following character in the sequence. If the two are equal, it removes the character from the array.
For example the input: 12223344112233 or AAAABBBCCCDDAAABB
And the output should be: 1234123 or ABCDAB
I believe the issue stems from the fact I created a counter and increment each loop. I use this counter for my comparison using the counter as an index marker in the array. Although, each time I remove an item from the array it changes the index while the counter increases.
Here is the code I have:
def sort(i):
    iter = list(i)
    counter = 0
    for item in iter:
        if item == iter[counter + 1]:
            del iter[counter]
        counter = counter + 1
    return iter
You're iterating over the same list that you are deleting from. That usually causes behaviour that you would not expect. Make a copy of the list & iterate over that.
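To illustrate the pitfall, here is a rough sketch. The first function is deliberately buggy and is not the OP's code. The second one sidesteps the problem by building a new list instead of copying and deleting, which is a slightly different fix from the one suggested above. Both function names are made up for this example:

```python
def drop_twos_in_place(items):
    """Buggy on purpose: removes items from the list it is iterating over."""
    for item in items:
        if item == 2:
            items.remove(item)   # shifts later items left, so one gets skipped
    return items


def dedupe_adjacent(seq):
    """Collapse runs of equal neighbours by appending to a fresh list."""
    result = []
    for item in seq:
        if not result or result[-1] != item:
            result.append(item)
    return result


print(drop_twos_in_place([1, 2, 2, 3]))          # one of the 2s survives
print(''.join(dedupe_adjacent('12223344112233')))
```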
However, there is a simpler solution: Use itertools.groupby
import itertools
def sort(i):
    return [x for x, _ in itertools.groupby(list(i))]
print(sort('12223344112233'))
Output:
['1', '2', '3', '4', '1', '2', '3']
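If it is not obvious what groupby is doing here, this sketch prints each run it finds along with its length (`runs` and `collapsed` are just illustrative names):

```python
from itertools import groupby

s = '12223344112233'

# groupby yields one (key, group) pair per run of consecutive equal items.
runs = [(key, len(list(group))) for key, group in groupby(s)]
print(runs)

# Keeping only the keys collapses every run to a single character.
collapsed = ''.join(key for key, _ in groupby(s))
print(collapsed)
```

Because groupby only groups neighbouring items, the later runs of '1', '2' and '3' are reported separately from the earlier ones, which is exactly the behaviour the question asks for.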
A few alternatives, all using s = 'AAAABBBCCCDDAAABB' as setup:
>>> import re
>>> re.sub(r'(.)\1+', r'\1', s)
'ABCDAB'
>>> p = None
>>> [c for c in s if p != (p := c)]
['A', 'B', 'C', 'D', 'A', 'B']
>>> [c for c, p in zip(s, [None] + list(s)) if c != p]
['A', 'B', 'C', 'D', 'A', 'B']
>>> [c for i, c in enumerate(s) if not s.endswith(c, None, i)]
['A', 'B', 'C', 'D', 'A', 'B']
The other answers are good. This one iterates over the list in reverse to avoid skipping items, and uses the look-ahead approach the OP described. Quick note, OP: this really isn't a sorting algorithm.
def sort(input_str: str) -> str:
    as_list = list(input_str)
    # Walk backwards so a deletion never shifts the indices still to be visited.
    for idx in range(len(as_list) - 1, 0, -1):
        if as_list[idx] == as_list[idx - 1]:
            del as_list[idx]
    return ''.join(as_list)

Merge elements in list of lists

I have a list of lists like this:
A = [('b', 'a', 'a', 'a', 'a'), ('b', 'a', 'a', 'a', 'a')]
How can I merge all the elements of each inner list to get the result A = ['baaaa', 'baaaa']?
I would prefer to do this outside of a loop, if possible, to speed up the code.
If you don't want to write a loop you can use map and str.join
>>> list(map(''.join, A))
['baaaa', 'baaaa']
However, the loop using a list comprehension is almost as short to write, and I think is clearer:
>>> [''.join(e) for e in A]
['baaaa', 'baaaa']
You can use str.join:
>>> ["".join(t) for t in A]
['baaaa', 'baaaa']
>>> list(map(''.join, A))  # with map
['baaaa', 'baaaa']
>>> help(str.join)
Help on method_descriptor:
join(...)
S.join(iterable) -> str
Return a string which is the concatenation of the strings in the
iterable. The separator between elements is S.
>>>
Use the join method of the empty string. This means: "make a string by concatenating every element of a tuple (for example ('b', 'a', 'a', 'a', 'a')) with '' (the empty string) between each of them".
Thus, what you are looking for is:
[''.join(x) for x in A]
If you prefer functional programming, you can achieve the same result with the reduce function.
Note that reduce was a built-in function in Python 2.7, but in Python 3 it was moved to the functools module:
from functools import reduce
The import is only required on Python 3; in Python 2, reduce is available without it.
A = [('b', 'a', 'a', 'a', 'a'), ('b', 'a', 'a', 'a', 'a')]
result = [reduce(lambda a, b: a+b, i) for i in A]
If you don't want to use a loop or even a list comprehension, here is another way:
list(map(lambda i: reduce(lambda a, b: a+b, i), A))
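As a sanity check, the approaches from the answers above can be compared side by side. This is only a sketch; `operator.add` is used here in place of the lambda, which is my substitution and not something the answers above use:

```python
from functools import reduce
from operator import add

A = [('b', 'a', 'a', 'a', 'a'), ('b', 'a', 'a', 'a', 'a')]

with_join = [''.join(t) for t in A]          # list comprehension + str.join
with_map = list(map(''.join, A))             # map + str.join
with_reduce = [reduce(add, t) for t in A]    # pairwise concatenation

print(with_join == with_map == with_reduce)
print(with_join)
```

One difference worth knowing: ''.join(()) returns an empty string, whereas reduce on an empty tuple raises a TypeError unless an initial value such as '' is passed as the third argument.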

match the pattern at the end of a string?

Imagine I have the following strings:
['a','b','c_L1', 'c_L2', 'c_L3', 'd', 'e', 'e_L1', 'e_L2']
Here the "c" string has important sub-categories (L1, L2, L3). These indicate special data for our purposes that have been generated in a program based on a pre-designated string "L". In other words, I know that the special entries should have the form:
name_Lnumber
Knowing that I'm looking for this pattern, and that I am using "L" or more specifically "_L" as my designation of these objects, how could I return a list of entries that meet this condition? In this case:
['c', 'e']
Use a simple filter:
>>> l = ['a','b','c_L1', 'c_L2', 'c_L3', 'd', 'e', 'e_L1', 'e_L2']
>>> list(filter(lambda x: "_L" in x, l))  # list() is needed on Python 3, where filter is lazy
['c_L1', 'c_L2', 'c_L3', 'e_L1', 'e_L2']
Alternatively, use a list comprehension
>>> [s for s in l if "_L" in s]
['c_L1', 'c_L2', 'c_L3', 'e_L1', 'e_L2']
Since you need the prefix only, you can just split it:
>>> set(s.split("_")[0] for s in l if "_L" in s)
set(['c', 'e'])
You can use the following comprehension:
>>> set(i.split('_')[0] for i in l if '_L' in i)
set(['c', 'e'])
Or, if you want to match only the elements that end with _L(digit) and not something like _Lm, you can use a regex:
>>> import re
>>> set(i.split('_')[0] for i in l if re.match(r'.*?_L\d$',i))
set(['c', 'e'])
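A variation on the regex idea, sketched under a few assumptions of my own: the suffix may have more than one digit (`\d+` instead of `\d`), the prefix is everything before the final `_L<digits>`, and the result should keep first-seen order rather than being a set. The names `SUFFIX` and `prefixes` are hypothetical:

```python
import re

# Prefix (captured), then "_L" followed by one or more digits at the very end.
SUFFIX = re.compile(r'(.+)_L\d+$')


def prefixes(names):
    """Return the distinct prefixes of names shaped like name_L<number>."""
    seen = []
    for name in names:
        match = SUFFIX.match(name)
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen


names = ['a', 'b', 'c_L1', 'c_L2', 'c_L3', 'd', 'e', 'e_L1', 'e_L2']
print(prefixes(names))
```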

Joining pairs of elements of a list [duplicate]

I know that a list can be joined to make one long string as in:
x = ['a', 'b', 'c', 'd']
print(''.join(x))
Obviously this would output:
'abcd'
However, what I am trying to do is simply join the first and second strings in the list, then join the third and fourth and so on. In short, from the above example instead achieve an output of:
['ab', 'cd']
Is there any simple way to do this? I should probably also mention that the lengths of the strings in the list will be unpredictable, as will the number of strings within the list, though the number of strings will always be even. So the original list could just as well be:
['abcd', 'e', 'fg', 'hijklmn', 'opq', 'r']
You can use slice notation with steps:
>>> x = "abcdefghijklm"
>>> x[0::2] #0. 2. 4...
'acegikm'
>>> x[1::2] #1. 3. 5 ..
'bdfhjl'
>>> [i+j for i,j in zip(x[::2], x[1::2])] # zip makes (0,1),(2,3) ...
['ab', 'cd', 'ef', 'gh', 'ij', 'kl']
The same logic applies to lists too. String length doesn't matter, because you're simply adding two strings together.
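Applied to the list from the question, the slicing approach could look like this (a short sketch; `pairs` is just an illustrative name):

```python
x = ['abcd', 'e', 'fg', 'hijklmn', 'opq', 'r']

# x[::2] holds the items at even indices, x[1::2] those at odd indices;
# zip lines them up as (0, 1), (2, 3), (4, 5).
pairs = [first + second for first, second in zip(x[::2], x[1::2])]
print(pairs)
```

Note that zip stops at the shorter slice, so with an odd number of items the last one would be silently left out.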
Use an iterator.
List comprehension:
>>> si = iter(['abcd', 'e', 'fg', 'hijklmn', 'opq', 'r'])
>>> [c+next(si, '') for c in si]
['abcde', 'fghijklmn', 'opqr']
Very efficient in terms of memory usage.
Exactly one traversal of the list.
Generator expression:
>>> si = iter(['abcd', 'e', 'fg', 'hijklmn', 'opq', 'r'])
>>> pair_iter = (c+next(si, '') for c in si)
>>> pair_iter # can be used in a for loop
<generator object at 0x4ccaa8>
>>> list(pair_iter)
['abcde', 'fghijklmn', 'opqr']
use as an iterator
Using map, str.__add__, iter
>>> si = iter(['abcd', 'e', 'fg', 'hijklmn', 'opq', 'r'])
>>> list(map(str.__add__, si, si))  # list() is needed on Python 3, where map is lazy
['abcde', 'fghijklmn', 'opqr']
next(iterator[, default]) is available starting in Python 2.6
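The default argument to next is what keeps the comprehension from failing on an odd number of items. A minimal sketch, with a made-up function name `pairwise_join`:

```python
def pairwise_join(items, default=''):
    """Join items two at a time; an unpaired last item is joined with default."""
    it = iter(items)
    # The for loop takes one item, next() takes the one after it.
    # When nothing is left, next() returns default instead of raising.
    return [first + next(it, default) for first in it]


print(pairwise_join(['abcd', 'e', 'fg', 'hijklmn', 'opq', 'r']))
print(pairwise_join(['a', 'b', 'c']))
```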
just to be pythonic :-)
>>> x = ['a1sd','23df','aaa','ccc','rrrr', 'ssss', 'e', '']
>>> [x[i] + x[i+1] for i in range(0,len(x),2)]
['a1sd23df', 'aaaccc', 'rrrrssss', 'e']
In case you want to be alerted if the list length is odd, you can try:
[x[i] + x[i+1] if not len(x) %2 else 'odd index' for i in range(0,len(x),2)]
Best of Luck
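If an exception is preferable to a list of marker strings, something along these lines might do. This is only a sketch: the function name `join_pairs_strict` and the choice of ValueError are my own assumptions, not part of the answer above:

```python
def join_pairs_strict(items):
    """Join neighbours in pairs, refusing input with an odd number of items."""
    if len(items) % 2:
        raise ValueError("expected an even number of items, got %d" % len(items))
    return [items[i] + items[i + 1] for i in range(0, len(items), 2)]


print(join_pairs_strict(['a1sd', '23df', 'aaa', 'ccc']))

try:
    join_pairs_strict(['a', 'b', 'c'])
except ValueError as exc:
    print("rejected:", exc)
```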
Without building temporary lists (itertools.izip exists only in Python 2; in Python 3 the built-in zip is already lazy):
>>> import itertools
>>> s = 'abcdefgh'
>>> si = iter(s)
>>> [''.join(each) for each in itertools.izip(si, si)]
['ab', 'cd', 'ef', 'gh']
or:
>>> import itertools
>>> s = 'abcdefgh'
>>> si = iter(s)
>>> map(''.join, itertools.izip(si, si))
['ab', 'cd', 'ef', 'gh']
>>> lst = ['abcd', 'e', 'fg', 'hijklmn', 'opq', 'r']
>>> print([lst[2*i]+lst[2*i+1] for i in range(len(lst)//2)])
['abcde', 'fghijklmn', 'opqr']
Well, I would do it this way, as I am no good with regexes.
Code:
t = '1. eat, food\n\
7am\n\
2. brush, teeth\n\
8am\n\
3. crack, eggs\n\
1pm'.splitlines()
print([i + ' ' + j for i, j in zip(t[::2], t[1::2])])  # the space matches the output shown below
output:
['1. eat, food 7am', '2. brush, teeth 8am', '3. crack, eggs 1pm']
Hope this helps :)
I came across this page yesterday while trying to solve a similar issue and found it interesting. I wanted to join items first in pairs using one string in between, and then join the pairs together using another string. Based on the code above, I came up with the following function:
def pairs(params, pair_str, join_str):
    """Complex string join where items are first joined in pairs
    """
    terms = iter(params)
    pairs = [pair_str.join(filter(len, [term, next(terms, '')])) for term in terms]
    return join_str.join(pairs)
This results in the following:
a = ['1','2','3','4','5','6','7','8','9']
print(pairs(a, ' plus ', ' and '))
>>1 plus 2 and 3 plus 4 and 5 plus 6 and 7 plus 8 and 9
The filter step prevents the '' which is produced in case of an odd number of terms from putting a final pair_str at the end.
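The effect of the filter step is easiest to see on the last, unpaired term in isolation. A small sketch (variable names are illustrative):

```python
pair_str = ' plus '

# The last term '9' has no partner, so next(terms, '') supplies ''.
last_pair = ['9', '']

with_filter = pair_str.join(filter(len, last_pair))   # '' is dropped first
without_filter = pair_str.join(last_pair)             # '' is joined as well

print(repr(with_filter))
print(repr(without_filter))
```

filter(len, ...) keeps only items whose length is non-zero, so the empty placeholder never reaches join and no dangling ' plus ' is produced.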
