I have seen methods like isAlpha(), but it accepts spaces and punctuation, which I don't want. Is there any way to check that a string contains only lowercase or uppercase letters?
E.g. pseudocode:
"asdf".isLetters() -> true
"as df".isLetters() -> false
"as. df:".isLetters() -> false
>>> "asdf".isalpha()
True
>>> "as df".isalpha()
False
>>> "as. df:".isalpha()
False
According to the documentation for .isalpha() it does what it seems you're after:
Return true if all characters in the string are alphabetic and there is at least one character, false otherwise.
To check for uppercase, use my_str.isupper()
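A minimal sketch of how the two checks can be combined. The function names are made up for illustration, and the ASCII-only variant assumes Python 3.7 or later (for str.isascii()), since str.isalpha() on its own also accepts non-ASCII letters:

```python
def is_letters(s):
    # True only if s is non-empty and every character is a letter.
    return s.isalpha()

def is_upper_letters(s):
    # isupper() alone ignores digits and punctuation, so pair it with isalpha().
    return s.isalpha() and s.isupper()

def is_ascii_letters(s):
    # Restrict to A-Z / a-z; str.isascii() needs Python 3.7 or later.
    return s.isascii() and s.isalpha()
```

Here is_letters("as df") is False because of the space, and is_upper_letters("AS1") is False because of the digit.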
import re
if re.match(r"^[A-Za-z]*$", some_string):
    print("yey!")
Related
Is there any way to test for all special characters in python other than manually putting them in, perhaps something similar to the .isalnum or .isalpha functions? I'm relatively new to coding, so I have no idea.
Assuming that any non-alphanumeric character counts as special, you can put not in front of isalnum(); the expression is True when the string contains at least one special character:
test = "1$%a"
print(not test.isalnum())
# prints True
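If you also want to know which characters are special, something along these lines might help. This is only a sketch, and both function names are hypothetical:

```python
def special_characters(text):
    # Collect every character that is neither a letter nor a digit.
    return [ch for ch in text if not ch.isalnum()]

def has_special(text):
    # any() is False for an empty string, unlike `not "".isalnum()`.
    return any(not ch.isalnum() for ch in text)

print(special_characters("1$%a"))  # ['$', '%']
```

Note the difference for the empty string: not "".isalnum() is True, whereas has_special("") is False.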
You could define your own is_alphanumeric function:
def is_alphanumeric(mystring):
    """Return True if all characters of `mystring` are either letters or digits:
    >>> is_alphanumeric('hellowörld')
    True
    >>> is_alphanumeric('Hello World!')
    False
    """
    # string.letters exists only in Python 2; str.isalnum() is Unicode-aware.
    return all(character.isalnum() for character in mystring)
If you want to restrict it to ascii:
from string import ascii_letters, digits
def is_alphanumeric_ascii(mystring):
    """Return True if all characters of `mystring` are either ASCII letters or digits:
    >>> is_alphanumeric_ascii('hellowörld')
    False
    >>> is_alphanumeric_ascii('HelloWorld')
    True
    """
    return all(character in ascii_letters + digits for character in mystring)
I don't mean specific characters, I just mean anything that isn't alphanumeric. I've tried asking if the string contains only alphabetic and numeric characters like so:
if userInput.isalpha() and userInput.isdigit() == False:
    print("Not valid, contains symbols or spaces")
but this doesn't work and denies all passwords I put in.
Firstly, a string can't be both all-alphabetic and all-numeric. Your expression is also parsed as userInput.isalpha() and (userInput.isdigit() == False), so it is true for any purely alphabetic input and never detects symbols or spaces.
Secondly, the methods isalpha() and isdigit() evaluate to True or False, so there is no need to use == False.
I would suggest using .isalnum().
If that doesn't satisfy your requirements, you should use a regex.
alnum looks for both: https://docs.python.org/2/library/stdtypes.html#str.isalnum and works in both Python 2 and Python 3.x.
Example:
>>> 'sometest'.isalnum()
True
>>> 'some test'.isalnum()
False
>>> 'sometest231'.isalnum()
True
>>> 'sometest%231'.isalnum()
False
>>> '231'.isalnum()
True
You have three problems:
if a and b == False is not the same as if a == False and b == False;
if b == False should be written if not b; and
You aren't using str.isalnum, which saves you from the problem anyway.
So it should be:
if not userInput.isalnum():
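To see the difference between the two conditions, here is a small sketch. The helper names original_check and is_invalid are invented for this illustration:

```python
def original_check(user_input):
    # Parsed as: isalpha() and (isdigit() == False)
    return user_input.isalpha() and user_input.isdigit() == False

def is_invalid(user_input):
    # True when the input is empty or contains anything but letters/digits.
    return not user_input.isalnum()

for candidate in ("abc", "abc123", "abc 123", "abc$"):
    print(candidate, original_check(candidate), is_invalid(candidate))
```

The original condition fires only for "abc", while is_invalid flags "abc 123" and "abc$", which is presumably what was intended.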
Is there an easy way to check whether a given character has a special meaning in a regex?
Of course I can collect regex characters in a list like ['.', "[", "]", etc.] to check that, but I guess there is a more elegant way.
You could use re.escape. For example:
>>> re.escape("a") == "a"
True
>>> re.escape("[") == "["
False
The idea is that if a character is a special one, then re.escape returns the character with a backslash in front of it. Otherwise, it returns the character itself.
You can use re.escape inside all() as follows:
>>> def checker(st):
...     return all(re.escape(i) == i for i in st)
...
>>> checker('aab]')
False
>>> checker('aab')
True
>>> checker('aa.b3')
False
Per the documentation, re.escape will (emphasis mine):
Return string with all non-alphanumerics backslashed; this is useful
if you want to match an arbitrary literal string that may have regular
expression metacharacters in it.
So it tells you whether a character could be a meaningful one, not whether it is. For example:
>>> re.escape('&') == '&'
False
This is useful for processing arbitrary strings, as it ensures that all control characters are escaped, but not for telling you which actually needed to be. The simplest approach, in my view, is the one dismissed in the question:
char in set(r'.^$*+?{}[]\|()')
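Wrapped in a function, the explicit-set approach could look like the sketch below. The names REGEX_SPECIAL and is_regex_special are hypothetical, and the set is assumed to be the list of special characters given in the re module documentation (parentheses included):

```python
# Characters with special meaning outside character classes (per the re docs).
REGEX_SPECIAL = set(r'.^$*+?{}[]\|()')

def is_regex_special(char):
    # Plain membership test: no pattern is compiled or matched.
    return char in REGEX_SPECIAL

print(is_regex_special('['))  # True
print(is_regex_special('&'))  # False
```

Unlike the re.escape comparison, this reports '&' as not special.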
Elegance lies in the eyes of the beholder, however (IMHO) this (below) is the most generic/"timeproof" way of checking if a character is considered to be special by the Python Regex engine -
import re

def isFalsePositive(char):
    # A literal character must not match a different character; '.' does.
    other = 'b' if char == 'a' else 'a'
    m = re.match(char, other)
    return m is not None and m.end() == 1

def isSpecial(char):
    try:
        m = re.match(char, char)
    except re.error:
        # The character alone is not a valid pattern, e.g. '[' or '*'.
        return True
    if m is not None and m.end() == 1:
        return isFalsePositive(char)
    return True
P.S. -
isFalsePositive() may be overkill to check the special case of '.' (dot). :-)
I'm trying to check if a string only contains letters, not digits or symbols.
For example:
>>> only_letters("hello")
True
>>> only_letters("he7lo")
False
Simple:
if string.isalpha():
    print("It's all letters")
str.isalpha() is only true if all characters in the string are letters:
Return true if all characters in the string are alphabetic and there is at least one character, false otherwise.
Demo:
>>> 'hello'.isalpha()
True
>>> '42hello'.isalpha()
False
>>> 'hel lo'.isalpha()
False
The str.isalpha() function works, i.e.:
if my_string.isalpha():
    print('it is letters')
For people finding this question via Google who might want to know if a string contains only a subset of all letters, I recommend using regexes:
import re
def only_letters(tested_string):
    match = re.match("^[ABCDEFGHJKLM]*$", tested_string)
    return match is not None
You can leverage regular expressions.
>>> import re
>>> pattern = re.compile("^[a-zA-Z]+$")
>>> pattern.match("hello")
<_sre.SRE_Match object; span=(0, 5), match='hello'>
>>> pattern.match("hel7lo")
>>>
The match() method will return a Match object if a match is found. Otherwise it will return None.
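One detail to be aware of: $ also matches just before a trailing newline, so "hello\n" passes the pattern above. A possible variant, sketched here with the made-up names LETTERS and only_letters, uses fullmatch() instead:

```python
import re

LETTERS = re.compile(r"[a-zA-Z]+")

def only_letters(tested_string):
    # fullmatch() requires the whole string to match, so no ^ or $ is needed.
    return LETTERS.fullmatch(tested_string) is not None

print(only_letters("hello"))    # True
print(only_letters("hello\n"))  # False
```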
An easier approach is to use the .isalpha() method
>>> "Hello".isalpha()
True
>>> "Hel7lo".isalpha()
False
isalpha() returns True if the string contains at least one character and all of its characters are letters.
Actually, we now live in the globalized world of the 21st century, and people no longer communicate using ASCII only, so when answering a question about "is it letters only" you need to take letters from non-ASCII alphabets into account as well. Python has a pretty cool unicodedata library which, among other things, allows categorization of Unicode characters:
>>> import unicodedata
>>> unicodedata.category('陳')
'Lo'
>>> unicodedata.category('A')
'Lu'
>>> unicodedata.category('1')
'Nd'
>>> unicodedata.category('a')
'Ll'
The categories and their abbreviations are defined in the Unicode standard. From here you can quite easily come up with a function like this:
def only_letters(s):
    for c in s:
        cat = unicodedata.category(c)
        if cat not in ('Ll','Lu','Lo'):
            return False
    return True
And then:
only_letters('Bzdrężyło')
True
only_letters('He7lo')
False
As you can see the whitelisted categories can be quite easily controlled by the tuple inside the function. See this article for a more detailed discussion.
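As a variation on the whitelist idea, every Unicode letter category (Lu, Ll, Lt, Lm, Lo) starts with "L", so a prefix test covers them all. A rough sketch, where the allowed_prefixes parameter is an invention of this example:

```python
import unicodedata

def only_letters(s, allowed_prefixes=("L",)):
    # Every letter category (Lu, Ll, Lt, Lm, Lo) starts with "L".
    return all(unicodedata.category(c).startswith(allowed_prefixes) for c in s)

print(only_letters('Bzdrężyło'))  # True
print(only_letters('He7lo'))      # False
```

Passing allowed_prefixes=("L", "N") would, under the same assumption, also admit digits and other number characters.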
The string.isalpha() function will work for you.
See http://www.tutorialspoint.com/python/string_isalpha.htm
Looks like people are saying to use str.isalpha.
This is the one line function to check if all characters are letters.
def only_letters(string):
    return all(letter.isalpha() for letter in string)
all accepts an iterable of booleans, and returns True iff all of the booleans are True.
More generally, all returns True if the objects in your iterable would be considered True. These would be considered False
0
None
Empty data structures (ie: len(list) == 0)
False. (duh)
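One consequence of how all works: it returns True for an empty iterable, so the one-liner treats the empty string differently from str.isalpha(). A short sketch, where only_letters_nonempty is a hypothetical name:

```python
def only_letters(string):
    return all(letter.isalpha() for letter in string)

print(only_letters(""))   # True: all() of an empty iterable is True
print("".isalpha())       # False: str.isalpha() needs at least one character

def only_letters_nonempty(string):
    # Guard against the empty string explicitly.
    return bool(string) and all(letter.isalpha() for letter in string)
```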
(1) Use str.isalpha() and print its result.
(2) See the program below for reference:
text = "this"  # no spaces or digits in this string
print(text.isalpha())  # prints True
text = "this is 2"
print(text.isalpha())  # prints False
Note: I checked the above example on Ubuntu.
A pretty simple solution I came up with: (Python 3)
def only_letters(tested_string):
    for letter in tested_string:
        if letter not in "abcdefghijklmnopqrstuvwxyz":
            return False
    return True
You can add a space in the string you are checking against if you want spaces to be allowed.
What is the best pure Python implementation to check if a string contains ANY letters from the alphabet?
string_1 = "(555).555-5555"
string_2 = "(555) 555 - 5555 ext. 5555"
Where string_1 would return False for having no letters of the alphabet in it and string_2 would return True for having letters.
Regex should be a fast approach:
re.search('[a-zA-Z]', the_string)
How about:
>>> string_1 = "(555).555-5555"
>>> string_2 = "(555) 555 - 5555 ext. 5555"
>>> any(c.isalpha() for c in string_1)
False
>>> any(c.isalpha() for c in string_2)
True
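The two suggestions so far differ in what they count as a letter. A sketch that puts them side by side, with function names invented for this comparison:

```python
import re

def has_letter_regex(text):
    # ASCII letters only.
    return re.search(r"[a-zA-Z]", text) is not None

def has_letter_isalpha(text):
    # Any Unicode letter counts.
    return any(c.isalpha() for c in text)

print(has_letter_regex("(555) 555 - 5555 ext. 5555"))  # True
print(has_letter_isalpha("(555).555-5555"))            # False
```

For a string such as "(555) é" the regex version returns False while the isalpha version returns True.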
You can use islower() on your string to see if it contains some lowercase letters (amongst other characters). Combine it with isupper() using or to also check whether it contains some uppercase letters:
Below, there are letters in the string, so the test yields True:
>>> z = "(555) 555 - 5555 ext. 5555"
>>> z.isupper() or z.islower()
True
Below, there are no letters in the string, so the test yields False:
>>> z= "(555).555-5555"
>>> z.isupper() or z.islower()
False
>>>
Not to be confused with isalpha(), which returns True only if all characters are letters, which isn't what you want.
Note that Barm's answer completes mine nicely, since mine doesn't handle the mixed case well.
I liked the answer provided by #jean-françois-fabre, but it is incomplete.
His approach will work, but only if the text contains purely lower- or uppercase letters:
>>> text = "(555).555-5555 extA. 5555"
>>> text.islower()
False
>>> text.isupper()
False
The better approach is to first upper- or lowercase your string and then check.
>>> string1 = "(555).555-5555 extA. 5555"
>>> string2 = '555 (234) - 123.32 21'
>>> string1.upper().isupper()
True
>>> string2.upper().isupper()
False
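One caveat that may matter for non-English text: isupper() only considers cased characters, so letters from scripts without case are not detected by this trick. A small sketch with hypothetical function names:

```python
def has_cased_letter(text):
    # True only if the text contains at least one letter that has a case.
    return text.upper().isupper()

def has_any_letter(text):
    # True for any Unicode letter, cased or not.
    return any(c.isalpha() for c in text)

print(has_cased_letter("555 陳"))  # False
print(has_any_letter("555 陳"))    # True
```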
You can use regular expression like this:
import re
print(re.search('[a-zA-Z]+', string))
I tested each of the above methods for finding whether a given string contains any letters, and measured the average processing time per string on a standard computer:
~250 ns for import re
~3 µs for re.search('[a-zA-Z]', string)
~6 µs for any(c.isalpha() for c in string)
~850 ns for string.upper().isupper()
Contrary to what was claimed, importing re takes negligible time, and searching with re takes about half the time of iterating with isalpha(), even for a relatively small string.
Hence, for larger strings and higher counts, re would be significantly more efficient.
But converting the string to one case and checking the case (i.e. either upper().isupper() or lower().islower()) wins here. In every loop it is significantly faster than re.search(), and it doesn't require any additional imports.
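To reproduce this kind of comparison yourself, a timeit-based sketch along the following lines could be used. The sample string, the run count and the variable names are assumptions of this example, and the numbers will vary with the machine, the Python version and where the first letter appears in the string:

```python
import re
import timeit

sample = "(555) 555 - 5555 ext. 5555"

candidates = {
    "re.search": lambda: re.search("[a-zA-Z]", sample),
    "any/isalpha": lambda: any(c.isalpha() for c in sample),
    "upper/isupper": lambda: sample.upper().isupper(),
}

results = {}
for name, func in candidates.items():
    # Average seconds per call over 10,000 runs.
    results[name] = timeit.timeit(func, number=10_000) / 10_000
    print(f"{name}: {results[name] * 1e9:.0f} ns per call")
```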
You can also do this in addition
import re
string='24234ww'
val = re.search('[a-zA-Z]+',string)
val[0].isalpha() # True, because the matched text consists only of letters
print(val[0]) # prints the first matching run of letters
Also note that if re.search() returns None, the search did not find a match, and val[0] would then raise a TypeError.
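A guarded version might look like this sketch, where first_letters is a made-up helper name:

```python
import re

def first_letters(text):
    # Return the first run of ASCII letters, or None if there is none.
    match = re.search(r"[a-zA-Z]+", text)
    return match.group(0) if match else None

print(first_letters("24234ww"))  # ww
print(first_letters("24234"))    # None
```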