How to check if characters in a string are alphabetically ordered - python

I have been trying these code but there is something wrong. I simply want to know if the first string is alphabetical.
def alp(s1):
s2=sorted(s1)
if s2 is s1:
return True
else:
return False
This always prints False and when i say print s1 or s2, it says "NameError: name 's1' is not defined"

is is identity testing which compares the object IDs, == is the equality testing:
In [1]: s1 = "Hello World"
In [2]: s2 = "Hello World"
In [3]: s1 == s2
Out[3]: True
In [4]: s1 is s2
Out[4]: False
Also note that sorted returns a list, so change it to:
if ''.join(s2) == s1:
Or
if ''.join(sorted(s2)) == s1:

You could see this answer and use something which works for any sequence:
all(s1[i] <= s1[i+1] for i in xrange(len(s1) - 1))
Example:
>>> def alp(s1):
... return all(s1[i] <= s1[i+1] for i in xrange(len(s1) - 1))
...
>>> alp("test")
False
>>> alp("abcd")
True

I would do it using iter to nicely get the previous element:
def is_ordered(ss):
ss_iterable = iter(ss)
try:
current_item = next(ss_iterable)
except StopIteration:
#replace next line to handle the empty string case as desired.
#This is how *I* would do it, but others would prefer `return True`
#as indicated in the comments :)
#I suppose the question is "Is an empty sequence ordered or not?"
raise ValueError("Undefined result. Cannot accept empty iterable")
for next_item in ss_iterable:
if next_item < current_item:
return False
current_item = next_item
return True
This answer has complexity O(n) in the absolute worst case as opposed to the answers which rely on sort which is O(nlogn).

Make sure that you are comparing strings with strings:
In [8]: s = 'abcdef'
In [9]: s == ''.join(sorted(s))
Out[9]: True
In [10]: s2 = 'zxyw'
In [11]: s2 == ''.join(sorted(s2))
Out[11]: False
If s1 or s2 is a string, sorted will return a list, and you will then be comparing a string to a list. In order to do the comparison you want, using ''.join() will take the list and join all the elements together, essentially creating a string representing the sorted elements.

use something like this:
sorted() returns a list and you're trying to compare a list to a string, so change that list to a string first:
In [21]: "abcd"=="".join(sorted("abcd"))
Out[21]: True
In [22]: "qwerty"=="".join(sorted("qwerty"))
Out[22]: False
#comparsion of list and a string is False
In [25]: "abcd"==sorted("abcd")
Out[25]: False

Related

Simple login function to check length of input and not an integer [duplicate]

Most of the questions I've found are biased on the fact they're looking for letters in their numbers, whereas I'm looking for numbers in what I'd like to be a numberless string.
I need to enter a string and check to see if it contains any numbers and if it does reject it.
The function isdigit() only returns True if ALL of the characters are numbers. I just want to see if the user has entered a number so a sentence like "I own 1 dog" or something.
Any ideas?
You can use any function, with the str.isdigit function, like this
def has_numbers(inputString):
return any(char.isdigit() for char in inputString)
has_numbers("I own 1 dog")
# True
has_numbers("I own no dog")
# False
Alternatively you can use a Regular Expression, like this
import re
def has_numbers(inputString):
return bool(re.search(r'\d', inputString))
has_numbers("I own 1 dog")
# True
has_numbers("I own no dog")
# False
You can use a combination of any and str.isdigit:
def num_there(s):
return any(i.isdigit() for i in s)
The function will return True if a digit exists in the string, otherwise False.
Demo:
>>> king = 'I shall have 3 cakes'
>>> num_there(king)
True
>>> servant = 'I do not have any cakes'
>>> num_there(servant)
False
Use the Python method str.isalpha(). This function returns True if all characters in the string are alphabetic and there is at least one character; returns False otherwise.
Python Docs: https://docs.python.org/3/library/stdtypes.html#str.isalpha
https://docs.python.org/2/library/re.html
You should better use regular expression. It's much faster.
import re
def f1(string):
return any(i.isdigit() for i in string)
def f2(string):
return re.search('\d', string)
# if you compile the regex string first, it's even faster
RE_D = re.compile('\d')
def f3(string):
return RE_D.search(string)
# Output from iPython
# In [18]: %timeit f1('assdfgag123')
# 1000000 loops, best of 3: 1.18 µs per loop
# In [19]: %timeit f2('assdfgag123')
# 1000000 loops, best of 3: 923 ns per loop
# In [20]: %timeit f3('assdfgag123')
# 1000000 loops, best of 3: 384 ns per loop
You could apply the function isdigit() on every character in the String. Or you could use regular expressions.
Also I found How do I find one number in a string in Python? with very suitable ways to return numbers. The solution below is from the answer in that question.
number = re.search(r'\d+', yourString).group()
Alternatively:
number = filter(str.isdigit, yourString)
For further Information take a look at the regex docu: http://docs.python.org/2/library/re.html
Edit: This Returns the actual numbers, not a boolean value, so the answers above are more correct for your case
The first method will return the first digit and subsequent consecutive digits. Thus 1.56 will be returned as 1. 10,000 will be returned as 10. 0207-100-1000 will be returned as 0207.
The second method does not work.
To extract all digits, dots and commas, and not lose non-consecutive digits, use:
re.sub('[^\d.,]' , '', yourString)
I'm surprised that no-one mentionned this combination of any and map:
def contains_digit(s):
isdigit = str.isdigit
return any(map(isdigit,s))
in python 3 it's probably the fastest there (except maybe for regexes) is because it doesn't contain any loop (and aliasing the function avoids looking it up in str).
Don't use that in python 2 as map returns a list, which breaks any short-circuiting
You can accomplish this as follows:
if a_string.isdigit():
do_this()
else:
do_that()
https://docs.python.org/2/library/stdtypes.html#str.isdigit
Using .isdigit() also means not having to resort to exception handling (try/except) in cases where you need to use list comprehension (try/except is not possible inside a list comprehension).
You can use NLTK method for it.
This will find both '1' and 'One' in the text:
import nltk
def existence_of_numeric_data(text):
text=nltk.word_tokenize(text)
pos = nltk.pos_tag(text)
count = 0
for i in range(len(pos)):
word , pos_tag = pos[i]
if pos_tag == 'CD':
return True
return False
existence_of_numeric_data('We are going out. Just five you and me.')
You can use range with count to check how many times a number appears in the string by checking it against the range:
def count_digit(a):
sum = 0
for i in range(10):
sum += a.count(str(i))
return sum
ans = count_digit("apple3rh5")
print(ans)
#This print 2
import string
import random
n = 10
p = ''
while (string.ascii_uppercase not in p) and (string.ascii_lowercase not in p) and (string.digits not in p):
for _ in range(n):
state = random.randint(0, 2)
if state == 0:
p = p + chr(random.randint(97, 122))
elif state == 1:
p = p + chr(random.randint(65, 90))
else:
p = p + str(random.randint(0, 9))
break
print(p)
This code generates a sequence with size n which at least contain an uppercase, lowercase, and a digit. By using the while loop, we have guaranteed this event.
any and ord can be combined to serve the purpose as shown below.
>>> def hasDigits(s):
... return any( 48 <= ord(char) <= 57 for char in s)
...
>>> hasDigits('as1')
True
>>> hasDigits('as')
False
>>> hasDigits('as9')
True
>>> hasDigits('as_')
False
>>> hasDigits('1as')
True
>>>
A couple of points about this implementation.
any is better because it works like short circuit expression in C Language and will return result as soon as it can be determined i.e. in case of string 'a1bbbbbbc' 'b's and 'c's won't even be compared.
ord is better because it provides more flexibility like check numbers only between '0' and '5' or any other range. For example if you were to write a validator for Hexadecimal representation of numbers you would want string to have alphabets in the range 'A' to 'F' only.
What about this one?
import string
def containsNumber(line):
res = False
try:
for val in line.split():
if (float(val.strip(string.punctuation))):
res = True
break
except ValueError:
pass
return res
containsNumber('234.12 a22') # returns True
containsNumber('234.12L a22') # returns False
containsNumber('234.12, a22') # returns True
I'll make the #zyxue answer a bit more explicit:
RE_D = re.compile('\d')
def has_digits(string):
res = RE_D.search(string)
return res is not None
has_digits('asdf1')
Out: True
has_digits('asdf')
Out: False
which is the solution with the fastest benchmark from the solutions that #zyxue proposed on the answer.
Also, you could use regex findall. It's a more general solution since it adds more control over the length of the number. It could be helpful in cases where you require a number with minimal length.
s = '67389kjsdk'
contains_digit = len(re.findall('\d+', s)) > 0
Simpler way to solve is as
s = '1dfss3sw235fsf7s'
count = 0
temp = list(s)
for item in temp:
if(item.isdigit()):
count = count + 1
else:
pass
print count
alp_num = [x for x in string.split() if x.isalnum() and re.search(r'\d',x) and
re.search(r'[a-z]',x)]
print(alp_num)
This returns all the string that has both alphabets and numbers in it. isalpha() returns the string with all digits or all characters.
This too will work.
if any(i.isdigit() for i in s):
print("True")
You can also use set.intersection
It is quite fast, better than regex for small strings.
def contains_number(string):
return True if set(string).intersection('0123456789') else False
An iterator approach. It consumes all characters unless a digit is met. The second argument of next fix the default value to return when the iterator is "empty". In this case it set to False but also '' works since it is casted to a boolean value in the if.
def has_digit(string):
str_iter = iter(string)
while True:
char = next(str_iter, False)
# check if iterator is empty
if char:
if char.isdigit():
return True
else:
return False
or by looking only at the 1st term of a generator comprehension
def has_digit(string):
return next((True for char in string if char.isdigit()), False)
I'm surprised nobody has used the python operator in. Using this would work as follows:
foo = '1dfss3sw235fsf7s'
bar = 'lorem ipsum sit dolor amet'
def contains_number(string):
for i in range(10):
if str(i) in list(string):
return True
return False
print(contains_number(foo)) #True
print(contains_number(bar)) #False
Or we could use the function isdigit():
foo = '1dfss3sw235fsf7s'
bar = 'lorem ipsum sit dolor amet'
def contains_number(string):
for i in list(string):
if i.isdigit():
return True
return False
print(contains_number(foo)) #True
print(contains_number(bar)) #False
These functions basically just convert s into a list, and check whether the list contains a digit. If it does, it returns True, if not, it returns False.

Check whether a String value contains a numeric figure using Python [duplicate]

Most of the questions I've found are biased on the fact they're looking for letters in their numbers, whereas I'm looking for numbers in what I'd like to be a numberless string.
I need to enter a string and check to see if it contains any numbers and if it does reject it.
The function isdigit() only returns True if ALL of the characters are numbers. I just want to see if the user has entered a number so a sentence like "I own 1 dog" or something.
Any ideas?
You can use any function, with the str.isdigit function, like this
def has_numbers(inputString):
return any(char.isdigit() for char in inputString)
has_numbers("I own 1 dog")
# True
has_numbers("I own no dog")
# False
Alternatively you can use a Regular Expression, like this
import re
def has_numbers(inputString):
return bool(re.search(r'\d', inputString))
has_numbers("I own 1 dog")
# True
has_numbers("I own no dog")
# False
You can use a combination of any and str.isdigit:
def num_there(s):
return any(i.isdigit() for i in s)
The function will return True if a digit exists in the string, otherwise False.
Demo:
>>> king = 'I shall have 3 cakes'
>>> num_there(king)
True
>>> servant = 'I do not have any cakes'
>>> num_there(servant)
False
Use the Python method str.isalpha(). This function returns True if all characters in the string are alphabetic and there is at least one character; returns False otherwise.
Python Docs: https://docs.python.org/3/library/stdtypes.html#str.isalpha
https://docs.python.org/2/library/re.html
You should better use regular expression. It's much faster.
import re
def f1(string):
return any(i.isdigit() for i in string)
def f2(string):
return re.search('\d', string)
# if you compile the regex string first, it's even faster
RE_D = re.compile('\d')
def f3(string):
return RE_D.search(string)
# Output from iPython
# In [18]: %timeit f1('assdfgag123')
# 1000000 loops, best of 3: 1.18 µs per loop
# In [19]: %timeit f2('assdfgag123')
# 1000000 loops, best of 3: 923 ns per loop
# In [20]: %timeit f3('assdfgag123')
# 1000000 loops, best of 3: 384 ns per loop
You could apply the function isdigit() on every character in the String. Or you could use regular expressions.
Also I found How do I find one number in a string in Python? with very suitable ways to return numbers. The solution below is from the answer in that question.
number = re.search(r'\d+', yourString).group()
Alternatively:
number = filter(str.isdigit, yourString)
For further Information take a look at the regex docu: http://docs.python.org/2/library/re.html
Edit: This Returns the actual numbers, not a boolean value, so the answers above are more correct for your case
The first method will return the first digit and subsequent consecutive digits. Thus 1.56 will be returned as 1. 10,000 will be returned as 10. 0207-100-1000 will be returned as 0207.
The second method does not work.
To extract all digits, dots and commas, and not lose non-consecutive digits, use:
re.sub('[^\d.,]' , '', yourString)
I'm surprised that no-one mentionned this combination of any and map:
def contains_digit(s):
isdigit = str.isdigit
return any(map(isdigit,s))
in python 3 it's probably the fastest there (except maybe for regexes) is because it doesn't contain any loop (and aliasing the function avoids looking it up in str).
Don't use that in python 2 as map returns a list, which breaks any short-circuiting
You can accomplish this as follows:
if a_string.isdigit():
do_this()
else:
do_that()
https://docs.python.org/2/library/stdtypes.html#str.isdigit
Using .isdigit() also means not having to resort to exception handling (try/except) in cases where you need to use list comprehension (try/except is not possible inside a list comprehension).
You can use NLTK method for it.
This will find both '1' and 'One' in the text:
import nltk
def existence_of_numeric_data(text):
text=nltk.word_tokenize(text)
pos = nltk.pos_tag(text)
count = 0
for i in range(len(pos)):
word , pos_tag = pos[i]
if pos_tag == 'CD':
return True
return False
existence_of_numeric_data('We are going out. Just five you and me.')
You can use range with count to check how many times a number appears in the string by checking it against the range:
def count_digit(a):
sum = 0
for i in range(10):
sum += a.count(str(i))
return sum
ans = count_digit("apple3rh5")
print(ans)
#This print 2
import string
import random
n = 10
p = ''
while (string.ascii_uppercase not in p) and (string.ascii_lowercase not in p) and (string.digits not in p):
for _ in range(n):
state = random.randint(0, 2)
if state == 0:
p = p + chr(random.randint(97, 122))
elif state == 1:
p = p + chr(random.randint(65, 90))
else:
p = p + str(random.randint(0, 9))
break
print(p)
This code generates a sequence with size n which at least contain an uppercase, lowercase, and a digit. By using the while loop, we have guaranteed this event.
any and ord can be combined to serve the purpose as shown below.
>>> def hasDigits(s):
... return any( 48 <= ord(char) <= 57 for char in s)
...
>>> hasDigits('as1')
True
>>> hasDigits('as')
False
>>> hasDigits('as9')
True
>>> hasDigits('as_')
False
>>> hasDigits('1as')
True
>>>
A couple of points about this implementation.
any is better because it works like short circuit expression in C Language and will return result as soon as it can be determined i.e. in case of string 'a1bbbbbbc' 'b's and 'c's won't even be compared.
ord is better because it provides more flexibility like check numbers only between '0' and '5' or any other range. For example if you were to write a validator for Hexadecimal representation of numbers you would want string to have alphabets in the range 'A' to 'F' only.
What about this one?
import string
def containsNumber(line):
res = False
try:
for val in line.split():
if (float(val.strip(string.punctuation))):
res = True
break
except ValueError:
pass
return res
containsNumber('234.12 a22') # returns True
containsNumber('234.12L a22') # returns False
containsNumber('234.12, a22') # returns True
I'll make the #zyxue answer a bit more explicit:
RE_D = re.compile('\d')
def has_digits(string):
res = RE_D.search(string)
return res is not None
has_digits('asdf1')
Out: True
has_digits('asdf')
Out: False
which is the solution with the fastest benchmark from the solutions that #zyxue proposed on the answer.
Also, you could use regex findall. It's a more general solution since it adds more control over the length of the number. It could be helpful in cases where you require a number with minimal length.
s = '67389kjsdk'
contains_digit = len(re.findall('\d+', s)) > 0
Simpler way to solve is as
s = '1dfss3sw235fsf7s'
count = 0
temp = list(s)
for item in temp:
if(item.isdigit()):
count = count + 1
else:
pass
print count
alp_num = [x for x in string.split() if x.isalnum() and re.search(r'\d',x) and
re.search(r'[a-z]',x)]
print(alp_num)
This returns all the string that has both alphabets and numbers in it. isalpha() returns the string with all digits or all characters.
This too will work.
if any(i.isdigit() for i in s):
print("True")
You can also use set.intersection
It is quite fast, better than regex for small strings.
def contains_number(string):
return True if set(string).intersection('0123456789') else False
An iterator approach. It consumes all characters unless a digit is met. The second argument of next fix the default value to return when the iterator is "empty". In this case it set to False but also '' works since it is casted to a boolean value in the if.
def has_digit(string):
str_iter = iter(string)
while True:
char = next(str_iter, False)
# check if iterator is empty
if char:
if char.isdigit():
return True
else:
return False
or by looking only at the 1st term of a generator comprehension
def has_digit(string):
return next((True for char in string if char.isdigit()), False)
I'm surprised nobody has used the python operator in. Using this would work as follows:
foo = '1dfss3sw235fsf7s'
bar = 'lorem ipsum sit dolor amet'
def contains_number(string):
for i in range(10):
if str(i) in list(string):
return True
return False
print(contains_number(foo)) #True
print(contains_number(bar)) #False
Or we could use the function isdigit():
foo = '1dfss3sw235fsf7s'
bar = 'lorem ipsum sit dolor amet'
def contains_number(string):
for i in list(string):
if i.isdigit():
return True
return False
print(contains_number(foo)) #True
print(contains_number(bar)) #False
These functions basically just convert s into a list, and check whether the list contains a digit. If it does, it returns True, if not, it returns False.

Presence of a number in a string with regural expression in python [duplicate]

Most of the questions I've found are biased on the fact they're looking for letters in their numbers, whereas I'm looking for numbers in what I'd like to be a numberless string.
I need to enter a string and check to see if it contains any numbers and if it does reject it.
The function isdigit() only returns True if ALL of the characters are numbers. I just want to see if the user has entered a number so a sentence like "I own 1 dog" or something.
Any ideas?
You can use any function, with the str.isdigit function, like this
def has_numbers(inputString):
return any(char.isdigit() for char in inputString)
has_numbers("I own 1 dog")
# True
has_numbers("I own no dog")
# False
Alternatively you can use a Regular Expression, like this
import re
def has_numbers(inputString):
return bool(re.search(r'\d', inputString))
has_numbers("I own 1 dog")
# True
has_numbers("I own no dog")
# False
You can use a combination of any and str.isdigit:
def num_there(s):
return any(i.isdigit() for i in s)
The function will return True if a digit exists in the string, otherwise False.
Demo:
>>> king = 'I shall have 3 cakes'
>>> num_there(king)
True
>>> servant = 'I do not have any cakes'
>>> num_there(servant)
False
Use the Python method str.isalpha(). This function returns True if all characters in the string are alphabetic and there is at least one character; returns False otherwise.
Python Docs: https://docs.python.org/3/library/stdtypes.html#str.isalpha
https://docs.python.org/2/library/re.html
You should better use regular expression. It's much faster.
import re
def f1(string):
return any(i.isdigit() for i in string)
def f2(string):
return re.search('\d', string)
# if you compile the regex string first, it's even faster
RE_D = re.compile('\d')
def f3(string):
return RE_D.search(string)
# Output from iPython
# In [18]: %timeit f1('assdfgag123')
# 1000000 loops, best of 3: 1.18 µs per loop
# In [19]: %timeit f2('assdfgag123')
# 1000000 loops, best of 3: 923 ns per loop
# In [20]: %timeit f3('assdfgag123')
# 1000000 loops, best of 3: 384 ns per loop
You could apply the function isdigit() on every character in the String. Or you could use regular expressions.
Also I found How do I find one number in a string in Python? with very suitable ways to return numbers. The solution below is from the answer in that question.
number = re.search(r'\d+', yourString).group()
Alternatively:
number = filter(str.isdigit, yourString)
For further Information take a look at the regex docu: http://docs.python.org/2/library/re.html
Edit: This Returns the actual numbers, not a boolean value, so the answers above are more correct for your case
The first method will return the first digit and subsequent consecutive digits. Thus 1.56 will be returned as 1. 10,000 will be returned as 10. 0207-100-1000 will be returned as 0207.
The second method does not work.
To extract all digits, dots and commas, and not lose non-consecutive digits, use:
re.sub('[^\d.,]' , '', yourString)
I'm surprised that no-one mentionned this combination of any and map:
def contains_digit(s):
isdigit = str.isdigit
return any(map(isdigit,s))
in python 3 it's probably the fastest there (except maybe for regexes) is because it doesn't contain any loop (and aliasing the function avoids looking it up in str).
Don't use that in python 2 as map returns a list, which breaks any short-circuiting
You can accomplish this as follows:
if a_string.isdigit():
do_this()
else:
do_that()
https://docs.python.org/2/library/stdtypes.html#str.isdigit
Using .isdigit() also means not having to resort to exception handling (try/except) in cases where you need to use list comprehension (try/except is not possible inside a list comprehension).
You can use NLTK method for it.
This will find both '1' and 'One' in the text:
import nltk
def existence_of_numeric_data(text):
text=nltk.word_tokenize(text)
pos = nltk.pos_tag(text)
count = 0
for i in range(len(pos)):
word , pos_tag = pos[i]
if pos_tag == 'CD':
return True
return False
existence_of_numeric_data('We are going out. Just five you and me.')
You can use range with count to check how many times a number appears in the string by checking it against the range:
def count_digit(a):
sum = 0
for i in range(10):
sum += a.count(str(i))
return sum
ans = count_digit("apple3rh5")
print(ans)
#This print 2
import string
import random
n = 10
p = ''
while (string.ascii_uppercase not in p) and (string.ascii_lowercase not in p) and (string.digits not in p):
for _ in range(n):
state = random.randint(0, 2)
if state == 0:
p = p + chr(random.randint(97, 122))
elif state == 1:
p = p + chr(random.randint(65, 90))
else:
p = p + str(random.randint(0, 9))
break
print(p)
This code generates a sequence with size n which at least contain an uppercase, lowercase, and a digit. By using the while loop, we have guaranteed this event.
any and ord can be combined to serve the purpose as shown below.
>>> def hasDigits(s):
... return any( 48 <= ord(char) <= 57 for char in s)
...
>>> hasDigits('as1')
True
>>> hasDigits('as')
False
>>> hasDigits('as9')
True
>>> hasDigits('as_')
False
>>> hasDigits('1as')
True
>>>
A couple of points about this implementation.
any is better because it works like short circuit expression in C Language and will return result as soon as it can be determined i.e. in case of string 'a1bbbbbbc' 'b's and 'c's won't even be compared.
ord is better because it provides more flexibility like check numbers only between '0' and '5' or any other range. For example if you were to write a validator for Hexadecimal representation of numbers you would want string to have alphabets in the range 'A' to 'F' only.
What about this one?
import string
def containsNumber(line):
res = False
try:
for val in line.split():
if (float(val.strip(string.punctuation))):
res = True
break
except ValueError:
pass
return res
containsNumber('234.12 a22') # returns True
containsNumber('234.12L a22') # returns False
containsNumber('234.12, a22') # returns True
I'll make the #zyxue answer a bit more explicit:
RE_D = re.compile('\d')
def has_digits(string):
res = RE_D.search(string)
return res is not None
has_digits('asdf1')
Out: True
has_digits('asdf')
Out: False
which is the solution with the fastest benchmark from the solutions that #zyxue proposed on the answer.
Also, you could use regex findall. It's a more general solution since it adds more control over the length of the number. It could be helpful in cases where you require a number with minimal length.
s = '67389kjsdk'
contains_digit = len(re.findall('\d+', s)) > 0
Simpler way to solve is as
s = '1dfss3sw235fsf7s'
count = 0
temp = list(s)
for item in temp:
if(item.isdigit()):
count = count + 1
else:
pass
print count
alp_num = [x for x in string.split() if x.isalnum() and re.search(r'\d',x) and
re.search(r'[a-z]',x)]
print(alp_num)
This returns all the string that has both alphabets and numbers in it. isalpha() returns the string with all digits or all characters.
This too will work.
if any(i.isdigit() for i in s):
print("True")
You can also use set.intersection
It is quite fast, better than regex for small strings.
def contains_number(string):
return True if set(string).intersection('0123456789') else False
An iterator approach. It consumes all characters unless a digit is met. The second argument of next fix the default value to return when the iterator is "empty". In this case it set to False but also '' works since it is casted to a boolean value in the if.
def has_digit(string):
str_iter = iter(string)
while True:
char = next(str_iter, False)
# check if iterator is empty
if char:
if char.isdigit():
return True
else:
return False
or by looking only at the 1st term of a generator comprehension
def has_digit(string):
return next((True for char in string if char.isdigit()), False)
I'm surprised nobody has used the python operator in. Using this would work as follows:
foo = '1dfss3sw235fsf7s'
bar = 'lorem ipsum sit dolor amet'
def contains_number(string):
for i in range(10):
if str(i) in list(string):
return True
return False
print(contains_number(foo)) #True
print(contains_number(bar)) #False
Or we could use the function isdigit():
foo = '1dfss3sw235fsf7s'
bar = 'lorem ipsum sit dolor amet'
def contains_number(string):
for i in list(string):
if i.isdigit():
return True
return False
print(contains_number(foo)) #True
print(contains_number(bar)) #False
These functions basically just convert s into a list, and check whether the list contains a digit. If it does, it returns True, if not, it returns False.

How to return False when using issubset and an empty set

When I have two sets e.g.
s1 = set()
s2 = set(['somestring'])
and I do
print s1.issubset(s2)
it returns True; so apparently, an empty set is always a subset of another set.
For my analysis, it should actually return False and I am wondering about the best way to do this. I can write a function like this:
def check_set(s1, s2):
if s1 and s1.issubset(s2):
return True
return False
which then indeed returns False for the example above. Is there any better way of doing this?
I would do that like this:
s1 <= s2 if s1 else False
It should be faster, because it uses the built-in operators supported by sets rather than using more expensive function calls and attribute lookups. It's logically equivalent.
You can take advantage of how Python evaluates the truthiness of an object plus how it short-circuits boolean and expressions with:
bool(s1) and s1 <= s2
Essentially this means: if s1 is something not empty AND it's a subset of s2
Instead of using an if you can force the result to be a bool by doing this:
def check_set(s1, s2):
return bool(s1 and s1.issubset(s2))
Why not just return the value? That way, you avoid having to write return True or return False.
def check_set(s1, s2):
return bool(s1 and s1.issubset(s2))
Instead of using an empty set, you could use a set with an empty value:
s1 = set(['']) or s1 = set([None])
Then your print statement would work as you expected.

Comparing sets that contain nan in python

I'm trying to compare two sets in python that contain nan but struggling to do so because {float('nan')} != {float('nan')}. For example:
s1 = {float('nan'), 1}
s2 = {float('nan'), 1, 2}
assert set.issubset(s1, s2)
And I get an assertion error. How can I handle this?
One approach: identity is tested before equality (see here in the docs, for example), so it'd work if you use the same nan:
>>> nan = float("nan")
>>> s1 = {nan, 1}
>>> s2 = {nan, 1, 2}
>>> set.issubset(s1, s2)
True
even though
>>> s1 = {float("nan"), 1}
>>> s2 = {float("nan"), 1, 2}
>>> set.issubset(s1, s2)
False
Working with nans is awkward enough that I'd try to avoid putting them in sets and switch to a different canonical form. But you could always just make sure it's the same one:
>>> def one_nan(x, nan=float("nan")):
... return nan if math.isnan(x) else x
...
>>> set.issubset(set(map(one_nan, s1)), set(map(one_nan, s2)))
True
or a thousand variants on the same. (I sometimes use x != x as a shortcut for nan-detection but it's probably a good idea to be explicit here.)
You could also write a simple function for this. Note that float('nan') == float('nan') is False for nan; to check if any element is nan, we just have to compare it with itself.
def is_subset(s1, s2):
no_nan_set = lambda s: {x for x in s if x == x}
s1_nan, s2_nan = no_nan_set(s1), no_nan_set(s2)
if s1_nan != s1 and s2_nan != s2:
return s1_nan.issubset(s2_nan)
elif s1_nan == s1 and s2_nan == s2:
return s1.issubset(s2)
else:
return False
You can simplify the if-elif-else block
def is_subset(s1, s2):
no_nan_set = lambda s: {x for x in s if x == x}
s1_nan, s2_nan = no_nan_set(s1), no_nan_set(s2)
return (s1_nan != s1 and s2_nan != s2 and s1_nan.issubset(s2_nan)) \
or (s1_nan == s1 and s2_nan == s2 and s1.issubset(s2))
Note that if either of your set has two or more nans (because float('nan') != float('nan')), this will work correctly, and similarly it will work all right if the ids of the nans are different. And lastly, this will work even if you don't have the nans in one or both of your set.
Create temporary sets with all nan values removed, and compare those instead. Afterwards, handle the nan comparison separately. For example, you could check if both of the original sets contain nan.
Even if you could perform the comparison for your sets without the assertion exception, float('nan') == float('nan') will return False so there is little value gained from this set comparison (it will invalidate the rest of the comparison). You can check this behavior by printing set.issubset.
s1 = frozenset({float('nan'), 1})
s2 = frozenset({float('nan'), 1, 2})
print frozenset.issubset(s1,s2)
which prints False.
Although set is deprecated, you may generate the temporary sets as follows:
s3 = set([value for value in s1 if not math.isnan(value)])
(repeat for each temporary set as needed)

Categories