create new string by adding characters of smaller strings one after another - python

You have strings
"aaaaa"
"bbbbb"
"ccccc"
"ddddd"
Now you want to generate the string:
"abcdabcdabcdabcdabcd"
What would be the fastest way to do it?
PS. This is a very simplified example. I really need to generate the new string from the existing smaller strings.

If the strings are of equal length you can use zip:
result = ''.join(map(''.join, zip(*strings)))
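As a quick check of the zip approach with the four strings from the question, here is a minimal sketch. The list name strings comes from the answer above and is assumed to hold the input strings.

```python
# Input strings from the question; all have the same length.
strings = ["aaaaa", "bbbbb", "ccccc", "ddddd"]

# zip(*strings) yields one tuple per position: ('a', 'b', 'c', 'd'), ...
# Each tuple is joined into "abcd", then the pieces are joined together.
result = ''.join(map(''.join, zip(*strings)))
print(result)  # abcdabcdabcdabcdabcd
```

Note that zip stops at the shortest input, so characters beyond that length are silently dropped.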

Use izip_longest from the itertools library, and the flatten recipe from the itertools documentation.
from itertools import izip_longest, chain  # Python 2 name; Python 3 calls it zip_longest
def flatten(listOfLists):
    "Flatten one level of nesting"
    return chain.from_iterable(listOfLists)
result = ''.join(flatten(izip_longest("aaaaa", "bbbbb", "ccc", "dddd", fillvalue='')))
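For strings of unequal length on Python 3, the same idea can be sketched with itertools.zip_longest. The list name strings is an assumption made for this example.

```python
from itertools import chain, zip_longest

strings = ["aaaaa", "bbbbb", "ccc", "dddd"]

# zip_longest pads the shorter strings with fillvalue ('' here),
# so exhausted strings simply contribute nothing to later groups.
result = ''.join(chain.from_iterable(zip_longest(*strings, fillvalue='')))
print(result)  # abcdabcdabcdabdab
```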

string1 = "aaaaaaaaa"
string2 = "bbbbbbbbb"
string3 = "ccccccccc"
string4 = "ddddddddd"
new_string = ""
for i in range(len(string1)):
    new_string = new_string+string1[i]+string2[i]+string3[i]+string4[i]
print(new_string)
Results:
abcdabcdabcdabcdabcdabcdabcdabcdabcd

Related

isolate data from long string [duplicate]

Suppose I had a string
string1 = "498results should get"
Now I need to get only integer values from the string like 498. Here I don't want to use list slicing because the integer values may increase like these examples:
string2 = "49867results should get"
string3 = "497543results should get"
So I want to get only integer values out from the string exactly in the same order. I mean like 498,49867,497543 from string1,string2,string3 respectively.
Can anyone let me know how to do this in one or two lines?
>>> import re
>>> string1 = "498results should get"
>>> int(re.search(r'\d+', string1).group())
498
If there are multiple integers in the string:
>>> map(int, re.findall(r'\d+', string1))
[498]
An answer taken from ChristopheD here: https://stackoverflow.com/a/2500023/1225603
r = "456results string789"
s = ''.join(x for x in r if x.isdigit())
print(int(s))
456789
Here's your one-liner, without using any regular expressions, which can get expensive at times:
>>> ''.join(filter(str.isdigit, "1234GAgade5312djdl0"))
returns:
'123453120'
If you have multiple sets of numbers, this is another option:
>>> import re
>>> print(re.findall(r'\d+', 'xyz123abc456def789'))
['123', '456', '789']
It's no good for floating-point number strings, though.
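If the strings may contain decimal numbers, one possible extension (an assumption, not part of the answer above) is a pattern with an optional fractional part. The pattern and the sample text are illustrative only; signs and exponents are not handled.

```python
import re

# Digits, optionally followed by a dot and more digits.
NUMBER = re.compile(r'\d+(?:\.\d+)?')

text = 'xyz12.5abc456def7.89'
numbers = [float(m) for m in NUMBER.findall(text)]
print(numbers)  # [12.5, 456.0, 7.89]
```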
Iterator version
>>> import re
>>> string1 = "498results should get"
>>> [int(x.group()) for x in re.finditer(r'\d+', string1)]
[498]
>>> import itertools
>>> int(''.join(itertools.takewhile(lambda s: s.isdigit(), string1)))
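Note that takewhile only collects the leading run of digits, and int('') raises ValueError when the string does not start with a digit. A small sketch of a guarded version follows; the helper name leading_int and its default parameter are hypothetical.

```python
from itertools import takewhile

def leading_int(text, default=None):
    """Return the integer formed by the leading digits of text, or default."""
    digits = ''.join(takewhile(str.isdigit, text))
    return int(digits) if digits else default

print(leading_int("498results should get"))  # 498
print(leading_int("results 498"))            # None
```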
With Python 3.6, these two lines return a list (which may be empty):
>>> [int(x) for x in re.findall(r'\d+', your_string)]
Similar to:
>>> list(map(int, re.findall(r'\d+', your_string)))
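Applied to the three strings from the question, the findall approach can be sketched as follows (Python 3). The helper name extract_ints and the list samples are made up for illustration.

```python
import re

def extract_ints(text):
    """Return every run of digits in text as an int, in order of appearance."""
    return [int(x) for x in re.findall(r'\d+', text)]

samples = [
    "498results should get",
    "49867results should get",
    "497543results should get",
]
for s in samples:
    print(extract_ints(s))
# [498]
# [49867]
# [497543]
```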
This approach uses a list comprehension: pass the string as an argument to the function and it returns a list of the integers in that string. It only picks up numbers that are separated from other text by whitespace.
def getIntegers(string):
    numbers = [int(x) for x in string.split() if x.isnumeric()]
    return numbers
Like this:
print(getIntegers('this text contains some numbers like 3 5 and 7'))
Output:
[3, 5, 7]
def function(string):
    final = ''
    for i in string:
        try:
            final += str(int(i))
        except ValueError:
            # stop at the first non-digit character
            break
    return int(final)
print(function("4983results should get"))
Another option is to remove the trailing letters using rstrip and string.ascii_lowercase (to get the letters):
import string
# strings is assumed to be the list [string1, string2, string3] from the question
out = [int(s.replace(' ','').rstrip(string.ascii_lowercase)) for s in strings]
Output:
[498, 49867, 497543]
integerstring=""
string1 = "498results should get"
for i in string1:
    if i.isdigit():
        integerstring=integerstring+i
print(integerstring)

Possible occurrences of splitting a string by delimiter

I have a string: str = "Quote_Policy_Generalparty_NameInfo"
I am splitting the string with str.split("_"), which gives me a list in Python.
Any help in getting the output below is appreciated:
[ Quote, Quote_Policy, Quote_Policy_Generalparty, Quote_Policy_Generalparty_NameInfo ]
You can use range(len(data)) to create the slices data[:1], data[:2], etc., and then "_".join(...) to concatenate each slice:
text = "Quote_Policy_Generalparty_NameInfo"
data = text.split('_')
result = []
for x in range(len(data)):
    part = data[:x+1]
    part = "_".join(part)
    result.append(part)
print(result)
input = "Quote_Policy_Generalparty_NameInfo"
tokenized = input.split("_")
combined = [
    "_".join(tokenized[:i])
    for i, token in enumerate(tokenized, 1)
]
The value of combined above will be
['Quote', 'Quote_Policy', 'Quote_Policy_Generalparty', 'Quote_Policy_Generalparty_NameInfo']
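The same idea can be wrapped in a small reusable function. This is only a sketch; the name prefixes and the sep parameter are hypothetical and do not come from the answers above.

```python
def prefixes(text, sep="_"):
    """Return every leading sub-path of text, split on sep."""
    parts = text.split(sep)
    # Join the first 1, 2, ..., len(parts) tokens back together.
    return [sep.join(parts[:i]) for i in range(1, len(parts) + 1)]

print(prefixes("Quote_Policy_Generalparty_NameInfo"))
```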
You could use accumulate from itertools. Its optional second argument is a function that decides how to combine the running result with the next element.
from itertools import accumulate
input = "Quote_Policy_Generalparty_NameInfo"
output = [*accumulate(input.split('_'), lambda str1, str2 : '_'.join([str1,str2])),]
which gives :
Out[11]:
['Quote',
'Quote_Policy',
'Quote_Policy_Generalparty',
'Quote_Policy_Generalparty_NameInfo']
If you find the above answers too clean and satisfactory, you can also consider regular expressions:
>>> import regex as re # For `overlapped` support
>>> x = "Quote_Policy_Generalparty_NameInfo"
>>> list(map(lambda s: s[::-1], re.findall('(?<=_).*$', '_' + x[::-1], overlapped=True)))
['Quote_Policy_Generalparty_NameInfo', 'Quote_Policy_Generalparty', 'Quote_Policy', 'Quote']

How can you terminate a string after k consecutive numbers have been found?

Say I have some list with files of the form *.1243.*, and I wish to obtain everything before these 4 digits. How do I do this efficiently?
An ugly, inefficient example of working code is:
names = []
for file in file_list:
    words = file.split('.')
    for i, word in enumerate(words):
        if word.isdigit():
            if int(word)>999 and int(word)<10000:
                names.append(' '.join(words[:i]))
                break
print(names)
Obviously though, this is far from ideal and I was wondering about better ways to do this.
You may want to use regular expressions for this.
import re
name = []
for file in file_list:
    m = re.match(r'^(.+?)\.\d{4}\.', file)
    if m:
        name.append(m.groups()[0])
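To see how this pattern behaves, here is a sketch on a few made-up file names (the contents of file_list are an assumption; the question does not show real data). Names without exactly four digits between dots are skipped.

```python
import re

# Lazily capture everything before the first ".dddd." group.
PATTERN = re.compile(r'^(.+?)\.\d{4}\.')

file_list = ['hello.1235.sas', 'test.5678.hai', 'nodigits.txt', 'a.12345.b']

names = []
for file in file_list:
    m = PATTERN.match(file)
    if m:  # match() returns None when the pattern is not found
        names.append(m.group(1))

print(names)  # ['hello', 'test']
```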
Using a regular expression, this becomes simpler:
import re
names = ['hello.1235.sas','test.5678.hai']
for fn in names:
    myreg = r'(.*)\.(?:\d{4})\..*'
    output = re.findall(myreg,fn)
    print(output)
output:
['hello']
['test']
If you know that all entries have the same format, here is a list comprehension approach:
[parts[0] for parts in filter(lambda parts: len(parts[1]) == 4, (item.split('.') for item in file_list))]
To be fair, I also like the solution provided by @James. Note that the downside of this list comprehension is that it chains three iteration steps:
1. Splitting every item
2. Filtering the items that match
3. Building the result.
With a regular for loop it could be more efficient:
output = []
for item in file_list:
    beginning, digits, end = item.split('.')
    if len(digits) == 4:
        output.append(beginning)
It does only one loop, which is way better.
You can use a positive lookahead, (?=(\.\d{4})):
import re
pattern=r'(.*)(?=(\.\d{4}))'
text=['*hello.1243.*','*.1243.*','hello.1235.sas','test.5678.hai','a.9999']
print(list(map(lambda x:re.search(pattern,x).group(0),text)))
output:
['*hello', '*', 'hello', 'test', 'a']

How to split this string to dict with python?

String
string1 = '"{ABCD-1234-3E3F},MEANING1","{ABCD-1B34-3X5F},MEANING2","{XLMN-2345-KFDE},WHITE"'
Expected Result
dict1 = {'{ABCD-1234-3E3F}' : 'MEANING1', '{ABCD-1B34-3X5F}' : 'MEANING2', '{XLMN-2345-KFDE}' : 'WHITE'}
Perhaps it is a simple question: is there an easy way to split string1 into dict1?
If you are looking for a one-liner, this will work:
>>> dict(tuple(x.split(',')) for x in string1[1:-1].split('","'))
{'{ABCD-1B34-3X5F}': 'MEANING2', '{XLMN-2345-KFDE}': 'WHITE', '{ABCD-1234-3E3F}': 'MEANING1'}
string1 = '"{ABCD-1234-3E3F},MEANING1","{ABCD-1B34-3X5F},MEANING2","{XLMN-2345-KFDE},WHITE"'
elements = string1.replace('"','').split(',')
dict(zip(elements[::2],elements[1::2]))
You can first split the original string and extract the elements from it. Then pair the elements at even and odd positions and turn the pairs into a dict.
And here is another one-liner alternative:
>>> string1 = '"{ABCD-1234-3E3F},MEANING1","{ABCD-1B34-3X5F},MEANING2","{XLMN-2345-KFDE},WHITE"'
>>> dict((nm, v) for nm,v in [pair.split(',') for pair in eval(string1)])
{'{ABCD-1234-3E3F}': 'MEANING1', '{ABCD-1B34-3X5F}': 'MEANING2', '{XLMN-2345-KFDE}': 'WHITE'}
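Because eval executes arbitrary code, a more cautious variant of the same idea uses ast.literal_eval, which only accepts Python literals. This is a sketch, assuming every quoted item contains one key and one value separated by the first comma.

```python
import ast

string1 = '"{ABCD-1234-3E3F},MEANING1","{ABCD-1B34-3X5F},MEANING2","{XLMN-2345-KFDE},WHITE"'

# The text parses as a tuple of string literals.
pairs = ast.literal_eval(string1)
if isinstance(pairs, str):  # a single quoted item parses as a plain string
    pairs = (pairs,)

# Split each item on the first comma only, in case the value contains commas.
dict1 = dict(pair.split(',', 1) for pair in pairs)
print(dict1)
```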

how to make python 3 itertools product/combinations/permutations from string

Hi!
I want to write a string-mixing program in Python 3.4. Given input such as string1 = 123, string2 = abc, string3 = 4g6, the output should be the combinations of string1, string2 and string3, for example 1234g6abc. I searched itertools and found only ways to combine letters, not words, and I want words. Please help.
Add your strings to a list, use itertools permutations, and then join the result.
import itertools
L = [string1, string2, string3]
# Convert perms to list if you want to iterate multiple times
perms = itertools.permutations(L)
for perm in perms:
    print(''.join(perm))
>>> import itertools as it
>>> list(map(''.join,list(it.permutations(['234','g6','abc']))))
['234g6abc', '234abcg6', 'g6234abc', 'g6abc234', 'abc234g6', 'abcg6234']
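With the strings from the question filled in (assuming they are the literal values '123', 'abc' and '4g6'), the permutations approach can be sketched like this:

```python
from itertools import permutations

string1, string2, string3 = "123", "abc", "4g6"

# Each permutation is a tuple of whole strings; join it into one string.
mixed = [''.join(p) for p in permutations([string1, string2, string3])]
print(mixed)
# ['123abc4g6', '1234g6abc', 'abc1234g6', 'abc4g6123', '4g6123abc', '4g6abc123']
```

The example output from the question, 1234g6abc, appears as the second entry.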
