python distinguish number and string solution - python

I'm new to Python and am trying to separate the digits in a string from the letters.
For example:
Input: 111aa111aa
Output: Number: 111111, String: aaaa

Here is one way to do it with regular expressions.
For the digits:
import re
x = '111aa111aa'
num = ''.join(re.findall(r'\d+', x))  # '111111'
For the letters:
import re
x = '111aa111aa'
alphabets = ''.join(re.findall(r'[a-zA-Z]', x))  # 'aaaa'

You can use the built-in string methods isdigit() and isalpha():
>>> x = '111aa111aa'
>>> number = ''.join([i for i in x if i.isdigit()])
>>> number
'111111'
>>> string = ''.join([i for i in x if i.isalpha()])
>>> string
'aaaa'
Or you can use a regex here:
>>> x = '111aa111aa'
>>> import re
>>> numbers = ''.join(re.findall(r'\d+', x))
>>> numbers
'111111'
>>> string = ''.join(re.findall(r'[a-zA-Z]', x))
>>> string
'aaaa'

>>> my_string = '111aa111aa'
>>> ''.join(filter(str.isdigit, my_string))
'111111'
>>> ''.join(filter(str.isalpha, my_string))
'aaaa'

Try isalpha() for the letters and isdigit() for the numbers:
In [45]: a = '111aa111aa'
In [47]: ''.join([i for i in a if i.isalpha()])
Out[47]: 'aaaa'
In [48]: ''.join([i for i in a if i.isdigit()])
Out[48]: '111111'
Or, in Python 2, where filter() on a str returns a str (in Python 3, wrap each filter() call in ''.join() and use print() as a function):
In [18]: strings,numbers = filter(str.isalpha,a),filter(str.isdigit,a)
In [19]: print strings,numbers
aaaa 111111

As you mentioned you are new to Python: most of the approaches presented here, using str.join with list comprehensions or a functional style, are quite sufficient. As an alternative, here are some options using dictionaries, which can help organize the data, starting with a basic example and moving to a slightly more intricate one.
Basic Alternative
s = "111aa111aa"
# Dictionary
d = {"Number": "", "String": ""}
for char in s:
    if char.isdigit():
        d["Number"] += char
    elif char.isalpha():
        d["String"] += char
d
# {'Number': '111111', 'String': 'aaaa'}
d["Number"]  # access by key
# '111111'
import collections as ct
# Default Dictionary: missing keys start out as the empty string
dd = ct.defaultdict(str)
for char in s:
    if char.isdigit():
        dd["Number"] += char
    elif char.isalpha():
        dd["String"] += char
dd
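The approaches above can be wrapped in a small reusable helper. This is only a minimal sketch: the function name split_digits_letters is hypothetical (it does not come from any of the answers), and it assumes that anything that is neither a digit nor a letter should simply be dropped.

```python
def split_digits_letters(text):
    """Return (digits, letters) from text, dropping every other character."""
    digits = ''.join(ch for ch in text if ch.isdigit())
    letters = ''.join(ch for ch in text if ch.isalpha())
    return digits, letters


number, string = split_digits_letters('111aa111aa')
print('Number: {} , String: {}'.format(number, string))
```

Returning a tuple keeps both results together without needing a dictionary.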

Related

python split a unicode string by 3-bytes utf8 character

Suppose that we have a unicode string in Python:
s = u"abc你好def啊"
Now I want to split it at the boundaries between ASCII and non-ASCII characters, with a result like
result = ["abc", "你好", "def", "啊"]
So, how can I implement that?
With a regex you can simply alternate between runs of ASCII alphanumeric characters and runs of everything else.
>>> import re
>>> re.findall('([a-zA-Z0-9]+|[^a-zA-Z0-9]+)', u"abc你好def啊")
["abc", "你好", "def", "啊"]
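Or, with all printable ASCII characters (escaped with re.escape so that characters such as ] and \ are taken literally inside the character class):
>>> ascii_chars = re.escape(''.join(chr(x) for x in range(33, 127)))
>>> re.findall('([{}]+|[^{}]+)'.format(ascii_chars, ascii_chars), u"abc你好def啊")
['abc', '你好', 'def', '啊']
Or, even simpler, as suggested by @Dolda2000:
>>> re.findall('([ -~]+|[^ -~]+)', u"abc你好def啊")
['abc', '你好', 'def', '啊']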
You can do something like this:
s = u"abc你好def啊"
status = ord(s[0]) < 128  # True while we are inside an ASCII run
word = ""
res = []
for b, letter in zip([ord(c) < 128 for c in s], s):
    if b != status:
        # The character class changed, so close the current run
        res.append(word)
        status = b
        word = ""
    word += letter
res.append(word)
print(res)
# ['abc', '你好', 'def', '啊']
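import re
s = "abc你好def啊"
filter(None, re.split(r'(\w+|\W+)', s))
This works in Python 2.x, where \w matches only ASCII word characters by default. In Python 3, \w also matches the Chinese characters, so pass flags=re.ASCII to re.split and wrap the filter() call in list().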
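Just ... split. (This slices the UTF-8 byte string into 3-byte chunks, so it assumes Python 2 and input shaped like the example.)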
[s[i:i+3] for i in xrange(0, len(s), 3)]
http://ideone.com/PeoGaF
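For Python 3, the same run-splitting idea can be sketched with itertools.groupby. This is an illustrative alternative rather than one of the posted answers; the function name split_ascii_runs is hypothetical, and "ASCII" is assumed to mean a code point below 128.

```python
from itertools import groupby


def split_ascii_runs(text):
    """Split text into alternating runs of ASCII and non-ASCII characters."""
    # groupby starts a new group whenever the key (is this character ASCII?) changes.
    return [''.join(group)
            for _, group in groupby(text, key=lambda ch: ord(ch) < 128)]


print(split_ascii_runs("abc你好def啊"))
```

Because the key is computed per character, no regular expression or manual state tracking is needed.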

Same Python code returns different results for same input string

The code below is supposed to return the most common letter in the text string, following these rules:
always lowercase
ignoring punctuation and spaces
in the case of words such as "One", where no letter occurs more than once, return the letter that comes first in the alphabet
Each time I run the code with the same string, e.g. "One", the result cycles through the letters. Weirdly, though, this happens only from the third try (in this "One" example).
text = input('Insert String: ')
def mwl(text):
    from string import punctuation
    from collections import Counter
    for l in punctuation:
        if l in text:
            text = text.replace(l, '')
    text = text.lower()
    text = ''.join(text.split())
    text = sorted(text)
    collist = Counter(text).most_common(1)
    print(collist[0][0])
mwl(text)
Counter uses a dictionary:
>>> Counter('one')
Counter({'e': 1, 'o': 1, 'n': 1})
Dictionaries are not ordered (before Python 3.7 they did not preserve insertion order), hence the behavior.
You can get a stable output with OrderedDict by replacing the following two lines:
text= sorted(text)
collist=Counter(text).most_common(1)
with:
collist = OrderedDict([(i,text.count(i)) for i in text])
collist = sorted(collist.items(), key=lambda x:x[1], reverse=True)
You also need to import OrderedDict for this.
Demo:
>>> from collections import Counter, OrderedDict
>>> text = 'One'
>>> collist = OrderedDict([(i,text.count(i)) for i in text])
>>> print(sorted(collist.items(), key=lambda x:x[1], reverse=True)[0][0])
O
>>> print(sorted(collist.items(), key=lambda x:x[1], reverse=True)[0][0])
O # it will always return O
>>> text = 'hello'
>>> collist = OrderedDict([(i,text.count(i)) for i in text])
>>> print(sorted(collist.items(), key=lambda x:x[1], reverse=True)[0][0])
l # l returned because it is most frequent
This can also be done without Counter or OrderedDict:
In [1]: s = 'Find the most common letter in THIS sentence!'
In [2]: letters = [letter.lower() for letter in s if letter.isalpha()]
In [3]: max(set(letters), key=letters.count)
Out[3]: 'e'
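If ties must go to the letter that comes first in the alphabet, as the question requires, the tie-break can be made explicit in the sort key. The following is a minimal sketch and not taken from any of the answers; the function name most_common_letter is hypothetical, and raising ValueError for input without letters is an assumption.

```python
from collections import Counter


def most_common_letter(text):
    """Return the most frequent letter in lowercase; ties go to the earliest letter."""
    counts = Counter(ch.lower() for ch in text if ch.isalpha())
    if not counts:
        raise ValueError("text contains no letters")
    # Key: highest count first (negated), then alphabetical order for ties.
    return min(counts, key=lambda letter: (-counts[letter], letter))


print(most_common_letter('One'))
print(most_common_letter('hello'))
```

Since the result depends only on the counts and the letters themselves, it does not rely on dictionary or set ordering.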

python3 string "abcd" print: aababcabcd?

If I have a string like abcd or 1234 etc., how can I print the first character, then the first two characters, then the first three, and so on, all together on one line?
For example, for the string 1234 I would like to print/return 1121231234, and for abcd, aababcabcd.
I have this code so far:
def string_splosion(str):
    i = 0
    while i <= len(str):
        i += 1
        print(str[:i])
print(string_splosion('abcd'))
But it prints/returns the pieces on separate lines. I could write it manually as print(str[0:1], str[1:2] <...>), but how do I make Python do it when I don't know how long the string is going to be?
You shouldn't use str as a variable name, because it shadows the built-in str type. You could join the sliced strings together in your loop:
def string_splosion(string):
    i, result = 0, ''
    while i < len(string):  # < instead of <=
        i += 1
        result += string[:i]
    return result
It's possible to shorten your code a little using str.join and range:
def string_splosion(string):
    return ''.join(string[:i] for i in range(1, len(string) + 1))
or using itertools.accumulate (Python 3.2+):
import itertools
def string_splosion(string):
    return ''.join(itertools.accumulate(string))
In the timings below, the itertools.accumulate approach is roughly 2 times faster than both the str.join variants and the original loop-based solution:
string_splosion_loop(abcdef): 2.3944241080715223
string_splosion_join_gen(abcdef): 2.757582983268288
string_splosion_join_lc(abcdef): 2.2879220573578865
string_splosion_itertools(abcdef): 1.1873638161591886
The code I used to time the functions is
import itertools
from timeit import timeit
string = 'abcdef'
def string_splosion_loop():
    i, result = 0, ''
    while i < len(string):
        i += 1
        result += string[:i]
    return result
def string_splosion_join_gen():
    return ''.join(string[:i] for i in range(1, len(string) + 1))
def string_splosion_join_lc():
    # str.join performs faster when the argument is a list
    return ''.join([string[:i] for i in range(1, len(string) + 1)])
def string_splosion_itertools():
    return ''.join(itertools.accumulate(string))
funcs = (string_splosion_loop, string_splosion_join_gen,
         string_splosion_join_lc, string_splosion_itertools)
for f in funcs:
    print('{.__name__}({}): {}'.format(f, string, timeit(f)))
Just use:
"".join([s[:i] for i in range(len(s)+1)])
As @abc noted, don't use str as a variable name, because it is one of the built-in types; see https://docs.python.org/2/library/stdtypes.html#sequence-types-str-unicode-list-tuple-bytearray-buffer-xrange
E.g.:
>>> s = "1234"
>>> "".join([s[:i] for i in range(len(s)+1)])
'1121231234'
>>> s = "abcd"
>>> "".join([s[:i] for i in range(len(s)+1)])
'aababcabcd'
range(len(s)+1) is needed because of how slicing works; see Explain Python's slice notation. (The outputs below are from Python 2, where range returns a list.)
>>> s = "1234"
>>> len(s)
4
>>> range(len(s))
[0, 1, 2, 3]
>>> s[:3]
'123'
>>> range(len(s)+1)
[0, 1, 2, 3, 4]
>>> s[:4]
'1234'
Then:
>>> s[:0]
''
>>> s[:1]
'1'
>>> s[:2]
'12'
>>> s[:3]
'123'
>>> s[:4]
'1234'
Lastly, join list([s[:1], s[:2], s[:3], s[:4]]) using "".join(list), see https://docs.python.org/2/library/string.html#string.join:
>>> list([s[:1], s[:2], s[:3], s[:4]])
['1', '12', '123', '1234']
>>> x = list([s[:1], s[:2], s[:3], s[:4]])
>>> "".join(x)
'1121231234'
>>> "-".join(x)
'1-12-123-1234'
>>> " ".join(x)
'1 12 123 1234'
To avoid the extra iteration in the loop, you can use range(1,len(s)+1), since s[:0] is just the empty string:
>>> s = "1234"
>>> "".join([s[:i] for i in range(1,len(s)+1)])
'1121231234'
>>> "".join([s[:i] for i in range(len(s)+1)])
'1121231234'
If you are using Python 3, you can use this to print without a newline:
print(yourString, end="")
So your function could be:
def string_splosion(s):
    # Print every prefix s[:1], s[:2], ..., s[:len(s)] on one line
    for i in range(1, len(s) + 1):
        print(s[:i], end="")
string_splosion('abcd')

Changing binary characters in string

I would like to convert this list:
a = [['0001', '0101'], ['1100', '0011']]
to:
a' = [['1110', '1010'],['0011','1100']]
In the second list, every character is changed to its opposite (i.e. '1' is changed to '0' and '0' is changed to '1').
The code I have tried is:
for i in a:
    for j in i:
        s = list(j)
        for k in s:
            position = s.index(k)
            if k == '0':
                s[position] = '1'
            elif k == '1':
                s[position] = '0'
        ''.join(s)
But it doesn't work properly. What can I do?
Thanks
You can use a function that flips the bits like this (Python 2; in Python 3, use str.maketrans instead of string.maketrans):
from string import maketrans
flip_table = maketrans('01', '10')
def flip(s):
    return s.translate(flip_table)
Then just call it on each item in the list like this:
>>> flip('1100')
'0011'
A nested list comprehension that inverts each character also works:
[["".join([str(int(not int(t))) for t in x]) for x in d] for d in a]
Example:
>>> a = [['0001', '0101'], ['1100', '0011']]
>>> a_ = [["".join([str(int(not int(t))) for t in x]) for x in d] for d in a]
>>> a_
[['1110', '1010'], ['0011', '1100']]
Using a simple list comprehension:
[[k.translate({48:'1', 49:'0'}) for k in i] for i in a]
48 is the code for "0", and 49 is the code for "1".
Demo:
>>> a = [['0001', '0101'], ['1100', '0011']]
>>> [[k.translate({48:'1', 49:'0'}) for k in i] for i in a]
[['1110', '1010'], ['0011', '1100']]
For Python 2.x:
from string import translate, maketrans
[[translate(k, maketrans('10', '01')) for k in i] for i in a]
from ast import literal_eval
import re
a = [['0001', '0101'], ['1100', '0011']]
print(literal_eval(re.sub('[01]', lambda m: '0' if m.group() == '1' else '1', str(a))))
literal_eval() is safer than eval() because it only evaluates Python literals.
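For Python 3, the translation-table idea can be applied to the whole nested list in one step. This is a minimal sketch assuming Python 3; the names FLIP_TABLE and flip_all are hypothetical and chosen only for illustration.

```python
# str.maketrans is the Python 3 counterpart of Python 2's string.maketrans.
FLIP_TABLE = str.maketrans('01', '10')


def flip_all(rows):
    """Return a new nested list with every '0' and '1' swapped."""
    return [[bits.translate(FLIP_TABLE) for bits in row] for row in rows]


a = [['0001', '0101'], ['1100', '0011']]
print(flip_all(a))
```

The table is built once and reused, and the original list is left unchanged.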

Single integer to multiple integer translation in Python

I'm trying to translate a single integer input to a multiple integer output, and am currently using the maketrans function. For instance,
intab3 = "abcdefg"
outtab3 = "ABCDEFG"
trantab3 = maketrans(intab3, outtab3)
is the most basic version of what I'm doing. What I'd like to be able to do is have the input be a single letter and the output be multiple letters. So something like:
intab4 = "abc"
outtab = "yes,no,maybe"
but commas and quotation marks don't work.
It keeps saying :
ValueError: maketrans arguments must have same length
Is there a better function I should be using? Thanks,
You can use a dict here:
>>> dic = {"a":"yes", "b":"no", "c":"maybe"}
>>> strs = "abcd"
>>> "".join(dic.get(x,x) for x in strs)
'yesnomaybed'
In Python 3, str.translate accepts a mapping from code points to strings of any length, so this just works:
>>> intab4 = "abc"
>>> outtab = "yes,no,maybe"
>>> d = {ord(k): v for k, v in zip(intab4, outtab.split(','))}
>>> print(d)
{97: 'yes', 98: 'no', 99: 'maybe'}
>>> 'abcdefg'.translate(d)
'yesnomaybedefg'
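In Python 3 the table can also be built directly from a dictionary with single-character keys by passing it to str.maketrans, which converts the keys to code points. A minimal sketch, assuming Python 3; the names mapping and table are illustrative only.

```python
# Single-character keys map to replacement strings of any length.
mapping = {"a": "yes", "b": "no", "c": "maybe"}

# With one dict argument, str.maketrans converts each key to its code point.
table = str.maketrans(mapping)

print('abcdefg'.translate(table))
```

Characters without an entry in the table, such as d through g here, pass through unchanged.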
