International phone number validation - python

I need to do pretty basic phone-number validation and formatting on all US and international phone numbers in Python. Here's what I have so far:
import re
def validate(number):
    # Strip everything except digits.
    number = re.sub(r'[^0-9]', '', number)
    if len(number) == 10:
        # ten-digit number, great
        return number
    elif len(number) == 7:
        # 7-digit number, should include area code
        # (ValidationError is assumed to come from the surrounding framework)
        raise ValidationError("INCLUDE YOUR AREA CODE OR ELSE.")
    else:
        # I have no clue what to do here
        pass
def format(number):
    if len(number) == 10:
        # basically return XXX-XXX-XXXX
        return re.sub(r'^(\d{3})(\d{3})(\d{4})$', r'\1-\2-\3', number)
    else:
        # basically return +XXX-XXX-XXX-XXXX
        return re.sub(r'^(\d+)(\d{3})(\d{3})(\d{4})$', r'+\1-\2-\3-\4', number)
My main problem is that I have NO idea as to how international phone numbers work. I assume that they're simply 10-digit numbers with a \d+ of the country code in front of them. Is this true?

E.164 numbers can be up to fifteen digits, and you should have no expectation that beyond the country code of 1-3 digits that they will fit any particular form. Certainly there are lots of countries where it is not XXX-XXX-XXXX. As I see it you have three options:
Painstakingly create a database of the number formats for every country code. Then check each country individually for updates on a periodic basis. (Edit: it looks like Google already does this, so if you trust them and the Python porter to keep libphonenumber correct and up to date, and don't mind upgrading this library every time there is a change, that might work for you.)
Eliminate all delimiters in the supplied telephone numbers and format them without any spacing: +12128675309
Format the numbers as the user supplies them rather than reformatting them yourself incorrectly.
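As a rough illustration of the second option, here is a minimal sketch that strips the delimiters and emits +<digits>. The helper name to_e164 and the rule that a bare ten-digit number is treated as a North American number with country code 1 are assumptions made for this example. It only enforces the fifteen-digit E.164 ceiling and knows nothing about per-country formats.

```python
import re

MAX_E164_DIGITS = 15  # E.164 allows at most fifteen digits, country code included


def to_e164(raw):
    """Return raw as '+<digits>' with all delimiters removed (hypothetical helper)."""
    digits = re.sub(r"\D", "", raw)
    has_country_code = raw.strip().startswith("+")
    if not has_country_code:
        if len(digits) != 10:
            raise ValueError("Please include a country code, e.g. +1 212 867 5309")
        # Assumption: a bare ten-digit number belongs to country code 1.
        digits = "1" + digits
    if not digits or len(digits) > MAX_E164_DIGITS:
        raise ValueError("Not a plausible E.164 number")
    return "+" + digits


print(to_e164("(212) 867-5309"))    # +12128675309
print(to_e164("+44 20 7946 0958"))  # +442079460958
```

Anything that needs real per-country validation would still have to go through option one.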

I ignore the formatting, i.e. where the spaces and dashes go.
But here is the regex function I use to validate that numbers:
optionally start with a + and some digits for the country code
optionally contain one set of brackets with digits inside for the area code or optional 0
finish with a digit
may contain spaces or dashes in the number itself (not in the country or area codes):
import re
def is_valid_phone(phone):
    return re.match(r'(\+[0-9]+\s*)?(\([0-9]+\))?[\s0-9\-]+[0-9]+', phone)
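One thing worth knowing about the pattern above: re.match only anchors at the start of the string, so trailing junk after a valid prefix still counts as a match. A possible stricter variant, sketched here with re.fullmatch (the name is_valid_phone_strict is made up for this example), requires the whole string to fit the pattern.

```python
import re

# Same pattern as in the answer, compiled once.
PHONE_RE = re.compile(r'(\+[0-9]+\s*)?(\([0-9]+\))?[\s0-9\-]+[0-9]+')


def is_valid_phone_strict(phone):
    """True only if the entire string matches the pattern (hypothetical helper)."""
    return PHONE_RE.fullmatch(phone) is not None


print(is_valid_phone_strict("+1 (212) 867-5309"))  # True
print(is_valid_phone_strict("555-1234abc"))        # False
print(PHONE_RE.match("555-1234abc") is not None)   # True: the prefix alone matches
```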

Related

Why is this random generator sometimes returning 3 digits instead of 4?

if choice == '1':
    print('Your card has been created')
    card = create_card()
    print(f'Your card number:\n{card}')
    pin = int(''.join(f'{random.randint(0, 9)}' for _ in range(4)))
    print(f'Your card PIN:\n{pin}\n')
    cards.append([card, pin])
Please can someone explain why the above code sometimes generates a 3-digit number as opposed to a 4-digit number? For example:
Others have explained why you sometimes end up with three-digit (or even two- or one-digit) results. You could view this as a formatting problem -- printing a number with leading zeroes to a specified width. Python does have features supporting that.
But I urge you to instead recognize that the problem is really that your data isn't actually a number in the first place. Rather, it is a string of digits, all of which are always significant. The easiest and best thing to do, then, is to simply keep it as a string instead of converting it to a number.
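A minimal sketch of both ideas, assuming a helper called make_pin (not part of the question's code): the PIN is built and stored as a string, and the zero-padding format spec is shown for the case where you already hold an int. The random module is used only to match the question; for anything security-sensitive the standard library's secrets module is generally the better fit.

```python
import random

DIGITS = '0123456789'


def make_pin(length=4):
    """Build a PIN as a string so leading zeros are never lost (hypothetical helper)."""
    return ''.join(random.choice(DIGITS) for _ in range(length))


pin = make_pin()
print(f'Your card PIN:\n{pin}\n')  # always four characters

# If the value is already an int, pad it when printing instead.
print(f'{322:04d}')  # 0322
```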
Your random generator sometimes fails to return a four-digit number because it can put a zero in the first position. A string like 0322 is generated and then converted to the number 322. It can also produce two-digit and one-digit numbers when there are several leading zeros. If you only want four-digit numbers, use pin = random.randint(1000, 9999). If you want to allow leading zeros, use pin = ''.join(f'{random.randint(0, 9)}' for _ in range(4)). Keeping the pin as a string stops the leading zeros from being removed.
In your logic, it's possible to generate a string that starts with 0. If you pass a leading 0 numeric string to int(), that leading 0 is ignored:
print(int("0999"))
Output
999
To fix this, you could just change the range start value, at the cost of never generating a 0 in any position:
pin = int(''.join(f'{random.randint(1, 9)}' for _ in range(4)))
Edit: To prove this to yourself, print both the generated string and the result of the int() function, like below.
for i in range(100):
    st = ''.join(f'{random.randint(0, 9)}' for _ in range(4))
    print(st, int(st))

Why do I have to change integers to strings in order to iterate them in Python?

First of all, I have only recently started to learn Python on codeacademy.com and this is probably a very basic question, so thank you for the help and please forgive my lack of knowledge.
The function below takes a positive integer as input and returns the sum of all of that number's digits. What I don't understand is why I have to change the type of the input to str first, and then back to int, in order to add the number's digits to each other. Could someone help me out with an explanation, please? The code works fine for the exercise, but I feel I am missing the big picture here.
def digit_sum(n):
    num = 0
    for i in str(n):
        num += int(i)
    return num
Integers are not sequences of digits. They are just (whole) numbers, so they can't be iterated over.
By turning the integer into a string, you created a sequence of digits (characters), and a string can be iterated over. It is no longer a number, it is now text.
See it as a representation; you could also have turned the same number into hexadecimal text, or octal text, or binary text. It would still be the same numerical value, just written down differently in text.
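To make the "same value, different text" point concrete, here is a small sketch using only built-ins. The variable names are arbitrary.

```python
n = 255

# Four textual representations of the same value.
as_text = {
    "decimal": format(n, "d"),
    "hex": format(n, "x"),
    "octal": format(n, "o"),
    "binary": format(n, "b"),
}
print(as_text)

# Parsing each text with the matching base gives back the same integer.
assert int(as_text["decimal"], 10) == n
assert int(as_text["hex"], 16) == n
assert int(as_text["octal"], 8) == n
assert int(as_text["binary"], 2) == n

# Only the text can be iterated, one character at a time.
print(list(as_text["decimal"]))  # ['2', '5', '5']
```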
Iteration over a string works, and gives you single characters, which for a number means that each character is also a digit. The code takes that character and turns it back into a number with int(i).
You don't have to use that trick. You could also use maths:
def digit_sum(n):
    total = 0
    while n:
        # divmod returns (quotient, remainder); the remainder is the last digit
        n, digit = divmod(n, 10)
        total += digit
    return total
This uses a while loop, and repeatedly divides the input number by ten (keeping the remainder) until 0 is reached. The remainders are summed, giving you the digit sum. So 1234 is turned into 123 and 4, then 12 and 3, etc.
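To see those intermediate values, a small tracing sketch can help. The function name divmod_steps is invented for this illustration; it records each (quotient, digit) pair that the loop produces.

```python
def divmod_steps(n):
    """Return the (quotient, digit) pairs produced by repeated divmod(n, 10)."""
    steps = []
    while n:
        n, digit = divmod(n, 10)
        steps.append((n, digit))
    return steps


for quotient, digit in divmod_steps(1234):
    print(quotient, digit)
# 123 4
# 12 3
# 1 2
# 0 1

print(sum(digit for _, digit in divmod_steps(1234)))  # 10
```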
Take the number 12345.
I need the digits 1, 2, 3, 4, 5 from that number so I can sum them.
So how do you get the individual digits? One mathematical way is the one @Martijn Pieters showed.
Another is to convert the number into a string, which makes it iterable.
This is one of the many ways to do it:
>>> sum(map(int, list(str(12345))))
15
The list() function breaks a string into individual characters, so I needed a string. Once I have all the digits as individual characters, I can convert them into integers and add them up.
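Since a string is already iterable, the list() call is optional. A slightly shorter sketch of the same idea, wrapped in a function for reuse:

```python
def digit_sum(n):
    """Sum the decimal digits of a non-negative integer via its string form."""
    return sum(int(ch) for ch in str(n))


print(digit_sum(12345))  # 15
```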

Python - How to make one input correspond to several functions

My program should have a single input where you write either an arabic number, a roman numeral, or a sum of roman numerals. For example:
Year: 2001
... MMI
Year: LX
... 60
Year: MI + CI + I
... MCIII
Year: ABC
... That's not a correct roman numeral
Well, I guess you get the deal. First, I tried with something like this:
def main():
    year = input ("Year: ")
    if type(int(year)) == type(1):
        arab_to_rome(year)
    else:
        rome_to_arab(year)
main()
This has obvious problems: first, everything other than an integer is treated as a roman numeral, and second, it doesn't take addition into consideration.
Then I googled and found something called isinstance. This is the result of trying that:
def main(input):
    year = input ("Year: ")
    if isinstance(input,int):
        arab_to_rome(year)
    elif isinstance(input,str):
        rome_to_arab (year)
    else:
        print ("Error")
main()
This has problems as well. I get an error message stating: Invalid syntax.
This code doesn't take addition into consideration either.
You can use a try/except block to see if the input is an int. This makes up the first 3 lines of the code. For more on that see this question.
year = input('Year: ')
try:
    print(arab_to_rome(int(year)))
except ValueError:
    result = 0
    for numeral in year.split('+'):
        # str.replace returns a new string, so the result must be reassigned
        numeral = numeral.replace(' ', '')
        result += rome_to_arab(numeral)
    print(result)
The part after the except is a simple way to handle the addition, and you could migrate this into your rome_to_arab function if you wish. It splits the string on each + and then adds the results of the Arabic calculations together to make a complete Arabic result. The replace() function gets rid of any extra spaces to make sure it doesn't break your other methods. See this question for more info on split() and replace().
As you haven't shown us the conversion methods, I'm going to assume arab_to_rome returns a string and rome_to_arab returns an int. If not, you should probably change them so that they do (not just for my example, but for good code convention).
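For completeness, here is one way the whole thing could fit together. The question never showed the two conversion functions, so the bodies below are stand-ins written for this sketch. The convert wrapper, the 1 to 3999 range, and the round-trip check used to reject malformed numerals are likewise assumptions. Unlike the snippet above, a sum is converted back to a roman numeral, as in the question's sample session.

```python
ROMAN = [(1000, "M"), (900, "CM"), (500, "D"), (400, "CD"), (100, "C"),
         (90, "XC"), (50, "L"), (40, "XL"), (10, "X"), (9, "IX"),
         (5, "V"), (4, "IV"), (1, "I")]
SYMBOL_VALUES = {symbol: value for value, symbol in ROMAN if len(symbol) == 1}
ERROR = "That's not a correct roman numeral"


def arab_to_rome(number):
    """Convert an int in 1..3999 to a roman numeral string (stand-in)."""
    if not 0 < number < 4000:
        raise ValueError("out of range for roman numerals")
    parts = []
    for value, symbol in ROMAN:
        count, number = divmod(number, value)
        parts.append(symbol * count)
    return "".join(parts)


def rome_to_arab(numeral):
    """Convert a roman numeral string to an int (stand-in)."""
    numeral = numeral.strip().upper()
    if not numeral or any(ch not in SYMBOL_VALUES for ch in numeral):
        raise ValueError("invalid roman numeral")
    total, previous = 0, 0
    for ch in reversed(numeral):
        value = SYMBOL_VALUES[ch]
        # A smaller symbol to the left of a larger one is subtracted.
        total += -value if value < previous else value
        previous = value
    # Round trip: reject non-canonical spellings such as "IIII" or "IM".
    if arab_to_rome(total) != numeral:
        raise ValueError("invalid roman numeral")
    return total


def convert(text):
    """Dispatch on the input: arabic number, roman numeral, or a sum of numerals."""
    try:
        return arab_to_rome(int(text))
    except ValueError:
        pass  # not a usable arabic number, so try roman numerals next
    try:
        terms = [rome_to_arab(term) for term in text.split("+")]
        if len(terms) == 1:
            return str(terms[0])
        return arab_to_rome(sum(terms))
    except ValueError:
        return ERROR


for sample in ["2001", "LX", "MI + CI + I", "ABC"]:
    print("Year:", sample, "...", convert(sample))
```

Keeping the input() call outside convert makes the dispatch logic easy to test on its own.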

Thinking logically: Calculate how many times a certain number appears in an integer

While going through a self-study book, I came across some code that counts how many times specific digits appear in an integer. How did the authors automatically know to use modulo 10? Is this a trick you learn as a programmer in your CompSci classes?
def num_zero_and_five_digits(n):
    count = 0
    while n:
        digit = n % 10 # This divides w/10 for remainder. How did they know to use 10?
        if digit == 0 or digit == 5: #These can be changed to whatever digits you want.
            count = count + 1
        n = n // 10 # integer division drops the last digit
    return count
I understand the code, but don't own it. What I mean is that if I was asked to 'write code' that would find how many times a certain digit is in an integer, I would personally do something like this:
integer = str(22342445)
looker = list(integer)
counter = 0
find = raw_input("What number are you looking for")
for num in looker:
    if find == num:
        print "We found it!"
        counter += 1
print "There are %d, %s's in %s" % (counter, find,integer )
Now, my main questions are:
What if someone wants to look for the integer "10" or higher? How can I account for that in the first solution?
What steps would you personally take to come up with a solution like the first? How would you just "know" that you needed to do modulo 10?
The exercise itself is improperly defined. Rather than looking for a specific number, it should ask you to look for a specific numeral (digit). Since we write numbers in decimal (base-10) representation, that restricts the search to one of 10 possibilities. And because the representation is base 10, we divide by 10 and take the remainder modulo 10 to separate the number into its digits so that we can compare the numerals. If we were talking about a hexadecimal number instead, we would use 16 to separate the digits; for octal we would use 8, and so on.
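The same loop generalises to any base by swapping the 10 for the base. The sketch below uses a hypothetical count_digit helper; it also shows why "10" cannot be a digit in base 10, since the check on digit rejects it. The special case for zero and the use of abs() are choices made for this example.

```python
def count_digit(n, digit, base=10):
    """Count how often `digit` occurs in `n` written in `base` (hypothetical helper)."""
    if not 0 <= digit < base:
        raise ValueError("a single digit must be in range(base)")
    n = abs(n)
    if n == 0:
        # Zero is written as the single digit 0.
        return 1 if digit == 0 else 0
    count = 0
    while n:
        n, remainder = divmod(n, base)  # remainder is the lowest digit
        if remainder == digit:
            count += 1
    return count


print(count_digit(22342445, 4))          # 3
print(count_digit(0xFFA, 15, base=16))   # 2 (two F digits)
print(count_digit(0o1771, 7, base=8))    # 2

# A multi-digit pattern such as "10" is a text search, not a digit test.
print(str(101010).count("10"))           # 3
```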

Replacing in Python

I need to complete a basic task in Python that requires me to convert a standard phone number into an international one.
So for example, if the user's phone number was 0123456789, the program should display: Your international phone number is +44123456789.
I don't know how to replace the 0 with a 44. I don't know many techniques in Python, so advice on how to do this is welcome, thanks.
EDIT:
#Python Number Conversion
def GetInternational(PhoneNumber):
    if num.startswith('0'):
        num = num.replace('0','+44',1)
    return GetInternational
PhoneNumber = input("Enter your phone number: ")
print('Your international number is',GetInternational,'')
I'm missing something obvious but not sure what...
Another way to do it would be:
num.replace('0','+44',1) #Only one, the leftmost zero is replaced
Therefore, if the number starts with zero, we replace only that one:
num = "0123456789"
if num.startswith('0'):
    num = num.replace('0','+44',1)
Well, the simplest way to do it is to strip off the first character (the 0) and then concatenate the rest onto "+44":
num = "0123456789"
if num.startswith("0"):
    num = "+44" + num[1:]
For clarity I added a startswith check to make sure the substitution only happens if the number starts with a zero.
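Applied to the function from the question's edit, the pieces might fit together as sketched below. The original version used a parameter named PhoneNumber but read from num, returned the function object itself, and was never called. The snake_case name and the country_code parameter are choices made for this sketch.

```python
def get_international(phone_number, country_code="+44"):
    """Swap a single leading 0 for the country code (illustrative rewrite)."""
    if phone_number.startswith("0"):
        return country_code + phone_number[1:]
    # Numbers without a leading 0 are returned unchanged.
    return phone_number


# In the real program the argument would come from input(...).
print('Your international number is', get_international("0123456789"))
# Your international number is +44123456789
```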
