Convert a line of text which contains numbers and letters to int - python

I am working on a program that reads an RFID card, and then pulls information about that card from a database. I am using Python with MySQL for this but in order for it to work I need to convert a string, e.g. "2345d566k", to an int. I don't need the letters to be in there, just the numbers.
When I do the following:
test = "2345d566k"
test2 = int(test)
it returns: ValueError: invalid literal for int() with base 10: '2345d566k'
How could I convert this string to an int?

Assuming you want to ignore characters other than digits, a simple solution would be
test = "ab23cd56e3f"
test2 = int("".join(filter(str.isdigit, test)))
# test2 is now 23563
filter() applies isdigit() to every character of the string, keeping only those that are digits. On Python 3, filter() returns an iterator rather than a string, so the characters are joined back together before int() converts the result to an integer.
As pointed out in the comments, this will only work if every line of text you want to convert contains at least 1 digit.
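To cover the case where the text contains no digits at all, the conversion can be wrapped in a small helper. This is only a sketch: the name digits_to_int and the default parameter are made up for illustration, and it assumes that only the ASCII digits 0-9 matter for the card IDs.

```python
import string


def digits_to_int(text, default=None):
    """Keep only ASCII digits and convert them; return default if none are found."""
    # digits_to_int and default are hypothetical names, not part of any library.
    digits = "".join(ch for ch in text if ch in string.digits)
    return int(digits) if digits else default


print(digits_to_int("2345d566k"))
print(digits_to_int("no digits here"))
```

Returning a default instead of raising lets the caller decide what a card ID with no digits should mean.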

You can easily use regex for this
import re
"".join(re.findall(r'\d', "ab23cd56e3f"))
This extracts every digit from the given string. If you put \D in place of \d, you get every non-digit character instead (letters, but also punctuation and whitespace).
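For comparison, here is a short sketch of both character classes side by side, plus re.sub as an alternative that deletes the non-digits in one step. The variable names are just for illustration.

```python
import re

text = "ab23cd56e3f"

# \d picks out the digits, \D picks out everything else.
digits = "".join(re.findall(r"\d", text))
non_digits = "".join(re.findall(r"\D", text))

# Alternative: delete every non-digit, then convert.
number = int(re.sub(r"\D", "", text))

print(digits, non_digits, number)
```

Like the filter() version, int() here still fails if the string has no digits, because the substitution leaves an empty string.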

Related

How to generate a string consisting of 16 digits but between every 4 numbers there is a hyphen(-)?

What kind of technique would allow me to generate a string in Python similar to this output:
1234-1234-1234-1234
Python has a built in wrap function, made to split strings like this. In conjunction with string.join, you can rejoin the split string with your character of choice:
from textwrap import wrap
s = '1234567890abcdef'
print('-'.join(wrap(s, 4)))
>>> 1234-5678-90ab-cdef
The wrap function takes your string, and the number of characters to split on (in this case 4).
The result from this is used in '-'.join to join each element together with dashes, which gives the result you were looking for.
Note: if you're starting with a number instead of a string, you can easily convert it using str():
s = str(1234567890123456)
print('-'.join(wrap(s, 4)))
>>> 1234-5678-9012-3456
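Since the question asks about generating such a string, here is a rough sketch that builds 16 random digits and groups them with plain slicing instead of textwrap.wrap. The helper names group_digits and random_digit_string are made up for this example. Slicing may be the safer choice if the input could ever contain spaces, because wrap is designed for breaking text at whitespace.

```python
import random


def group_digits(s, size=4, sep="-"):
    """Split s into chunks of `size` characters and join them with `sep`."""
    return sep.join(s[i:i + size] for i in range(0, len(s), size))


def random_digit_string(length=16):
    """Return `length` random decimal digits as a string (not for security use)."""
    return "".join(random.choice("0123456789") for _ in range(length))


print(group_digits("1234567890123456"))
print(group_digits(random_digit_string()))
```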

Python - Adding comments into a triple-quote string

Is there a way to add comments into a multiline string, or is it not possible? I'm trying to write data into a csv file from a triple-quote string. I'm adding comments in the string to explain the data. I tried doing this, but Python just assumed that the comment was part of the string.
"""
1,1,2,3,5,8,13 # numbers to the Fibonacci sequence
1,4,9,16,25,36,49 # numbers of the square number sequence
1,1,2,5,14,42,132,429 # numbers in the Catalan number sequence
"""
No, it's not possible to have comments in a string. How would Python know that the hash sign # in your string is supposed to start a comment, and is not just a hash sign? It makes a lot more sense to interpret the # character as part of the string than as a comment.
As a workaround, you can make use of automatic string literal concatenation:
(
"1,1,2,3,5,8,13\n" # numbers to the Fibonacci sequence
"1,4,9,16,25,36,49\n" # numbers of the square number sequence
"1,1,2,5,14,42,132,429" # numbers in the Catalan number sequence
)
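To check that the comments really stay out of the data, the concatenated literal can be assigned to a name and inspected. A minimal sketch (the names data and rows are just for illustration):

```python
# Adjacent string literals are merged at compile time; the comments between
# them are ordinary Python comments and never reach the string.
data = (
    "1,1,2,3,5,8,13\n"  # Fibonacci numbers
    "1,4,9,16,25,36,49\n"  # square numbers
    "1,1,2,5,14,42,132,429"  # Catalan numbers
)

rows = [line.split(",") for line in data.splitlines()]
print("#" in data)
print(len(rows))
```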
If you add comments into the string, they become part of the string. If that weren't true, you'd never be able to use a # character in a string, which would be a pretty serious problem.
However, you can post-process the string to remove comments, as long as you know this particular string isn't going to have any other # characters.
For example:
import re

s = """
1,1,2,3,5,8,13 # numbers to the Fibonacci sequence
1,4,9,16,25,36,49 # numbers of the square number sequence
1,1,2,5,14,42,132,429 # numbers in the Catalan number sequence
"""
s = re.sub(r'#.*', '', s)
If you also want to remove trailing whitespace before the #, change the regex to r'\s*#.*'.
If you don't understand what these regexes are matching and how, see regex101 for a nice visualization.
If you plan to do this many times in the same program, you can even use a trick similar to the popular D = textwrap.dedent idiom:
import functools
import re

C = functools.partial(re.sub, r'#.*', '')
And now:
s = C("""
1,1,2,3,5,8,13 # numbers to the Fibonacci sequence
1,4,9,16,25,36,49 # numbers of the square number sequence
1,1,2,5,14,42,132,429 # numbers in the Catalan number sequence
""")
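Putting the post-processing idea together with the CSV goal from the question, something like the following might work. It is a sketch under the same assumption as above (no # characters in the real data); strip_comments is a hypothetical helper name, and it uses [ \t]* rather than \s* so that the match never reaches across a line break.

```python
import csv
import io
import re


def strip_comments(text):
    """Remove '#' comments plus the spaces before them, and drop blank lines."""
    cleaned = re.sub(r"[ \t]*#.*", "", text)
    return "\n".join(line for line in cleaned.splitlines() if line.strip())


raw = """
1,1,2,3,5,8,13 # Fibonacci numbers
1,4,9,16,25,36,49 # square numbers
"""

# Parse the cleaned text as CSV to confirm the comments are gone.
rows = list(csv.reader(io.StringIO(strip_comments(raw))))
print(rows)
```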

From the 4 numbers code-point to the unicode character?

I've got a 4-digit string corresponding to the code point of a Unicode character.
I need to dynamically convert it to its Unicode character and store that in a variable.
For example, my program will produce a variable a = '0590' during its loop. (https://www.compart.com/en/unicode/U+0590)
How do I get the variable b = '\u0590'?
I've tried string concatenation '\u' + a but obviously it's not the way.
chr will take a code point as an integer and convert it to the corresponding character. You need to have an integer though, of course.
a = '0590'
result = chr(int(a))
print(result)
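Note that int(a) reads the string as a decimal number, so the snippet above yields code point 590 (U+024E). To interpret the string as a hexadecimal code point such as U+0590, pass an explicit radix of 16 to int. On Python 2, the function is called unichr, not chr.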
a = '0590'
result = unichr(int(a, 16))
print(result)
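On Python 3 the hexadecimal version can be written with chr. A minimal sketch, with code_point_to_char as a made-up helper name:

```python
def code_point_to_char(hex_digits):
    """Turn a hex code point string such as '0590' into its character."""
    return chr(int(hex_digits, 16))


a = '0590'
b = code_point_to_char(a)
print(b == '\u0590')
print(code_point_to_char('0041'))
```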

String splitting in python by finding non-zero character

I want to do the following split:
input: 0x0000007c9226fc output: 7c9226fc
input: 0x000000007c90e8ab output: 7c90e8ab
input: 0x000000007c9220fc output: 7c9220fc
I use the following line of code to do this but it does not work!
split = element.rpartition('0')
I got these outputs which are wrong!
input: 0x000000007c90e8ab output: e8ab
input: 0x000000007c9220fc output: fc
What is the fastest way to do this kind of split?
The only idea I have right now is to loop over the characters and check each one, but that is a little time-consuming.
I should mention that the number of zeros in input is not fixed.
Each string can be converted to an integer using int() with a base of 16. Then convert back to a string.
for s in '0x000000007c9226fc', '0x000000007c90e8ab', '0x000000007c9220fc':
    print('%x' % int(s, 16))
Output
7c9226fc
7c90e8ab
7c9220fc
input[2:].lstrip('0')
That should do it. The [2:] skips over the leading 0x (which I assume is always there), then the lstrip('0') removes all the zeros from the left side.
In fact, we can use lstrip ability to remove more than one leading character to simplify:
input.lstrip('x0')
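One edge case to keep in mind: if the value is zero, lstrip removes every character and leaves an empty string. Here is a sketch that guards against that, using a hypothetical helper name strip_hex_padding and assuming the 0x prefix may or may not be present:

```python
def strip_hex_padding(text):
    """Drop a leading '0x' and any zero padding, keeping at least one digit."""
    digits = text[2:] if text.lower().startswith("0x") else text
    return digits.lstrip("0") or "0"


for s in ("0x0000007c9226fc", "0x000000007c90e8ab", "0x0000"):
    print(strip_hex_padding(s))
```

Going through int(s, 16) and formatting with 'x' handles zero the same way and also rejects strings that are not valid hex.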
format is handy for this:
>>> print('{:x}'.format(0x000000007c90e8ab))
7c90e8ab
>>> print('{:x}'.format(0x000000007c9220fc))
7c9220fc
In this particular case you can just do
your_input[10:]
You'll most likely want to properly parse this; your idea of splitting on separation of non-zero does not seem safe at all.
Seems to be the XY problem.
If the number of characters in the string is constant, then you can use the following code.
input = "0x000000007c9226fc"
output = input[10:]
Documentation
Also, you are using rpartition, which is documented as:
str.rpartition(sep)
Split the string at the last occurrence of sep, and return a 3-tuple containing the part before the separator, the separator itself, and the part after the separator. If the separator is not found, return a 3-tuple containing two empty strings, followed by the string itself.
Because your input can contain several 0 characters and rpartition splits only at the last occurrence, your code returns just the part after the final 0.
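The behaviour is easy to see by unpacking the 3-tuple for one of the inputs from the question:

```python
# rpartition splits at the LAST '0', which here sits inside the value itself.
before, sep, after = "0x000000007c90e8ab".rpartition("0")

print(before)
print(sep)
print(after)
```

The last element is e8ab, which matches the wrong output reported in the question.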
A regular expression for the 0x prefix followed by zeros is (0x[0]+); replace the match with an empty string.
import re
st = "0x000007c922433434000fc"
reg = '(0x[0]+)'
rep = re.sub(reg, '', st)
print(rep)

python: regular expressions, how to match a string of undefined length which has a structure and finishes with a specific group

I need to create a regexp to match strings like this 999-123-222-...-22
The string can be finished by &Ns=(any number) or without this... So valid strings for me are
999-123-222-...-22
999-123-222-...-22&Ns=12
999-123-222-...-22&Ns=12
And following are not valid:
999-123-222-...-22&N=1
I have been testing this for several hours already but did not manage to solve it. I really need some help.
Not sure if you want to literally match 999-123-222-...-22 or if that can be any sequence of numbers/dashes. Here are two different regexes:
/^[\d-]+(&Ns=\d+)?$/
/^999-123-222-\.\.\.-22(&Ns=\d+)?$/
The key idea is the (&Ns=\d+)?$ part, which matches an optional &Ns=<digits>, and is anchored to the end of the string with $.
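In Python this could be tried out roughly as follows. The sketch assumes the ... in the question stands for more dash-separated numbers rather than three literal dots, so the pattern is a slightly stricter variant of the first regex above. The names PATTERN and is_valid are made up for illustration.

```python
import re

# One or more dash-separated digit groups, then an optional &Ns=<digits> suffix.
PATTERN = re.compile(r"\d+(?:-\d+)*(?:&Ns=\d+)?")


def is_valid(s):
    """Return True if the whole string matches the pattern."""
    return PATTERN.fullmatch(s) is not None


for candidate in ("999-123-222-22", "999-123-222-22&Ns=12", "999-123-222-22&N=1"):
    print(candidate, is_valid(candidate))
```

re.fullmatch plays the role of the ^ and $ anchors, so a trailing &N=1 cannot slip through.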
If you just want to allow the strings 999-123-222-...-22 and 999-123-222-...-22&Ns=12, it is better to use a string function.
If you want to allow any numbers between - you can use the regex:
^(\d+-){3}[.]{3}-\d+(&Ns=\d+)?$
If the numbers must be of only 3 digits and the last number of only 2 digits you can use:
^(\d{3}-){3}[.]{3}-\d{2}(&Ns=\d{2})?$
This looks like a phone number with extension information.
Why not make things simpler for yourself (and anyone who has to read this later) and split the input rather than use a complicated regex?
s = '999-123-222-...-22&Ns=12'
parts = s.split('&Ns=') # splits on Ns and removes it
If the piece before the "&" is a phone number, you could do another split and get the area code etc into separate fields, like so:
phone_parts = parts[0].split('-') # breaks up the digit string and removes the '-'
area_code = phone_parts[0]
The portion found after the optional '&Ns=' can be checked to see if it is numeric with the string method isdigit, which returns True if all characters in the string are digits and there is at least one character, and False otherwise.
if len(parts) > 1:
    extra_digits_ok = parts[1].isdigit()
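Pulled together into one function, the split-based check might look like this. It is a sketch: is_valid is a made-up name, and it assumes the part before '&Ns=' is nothing but dash-separated digit groups.

```python
def is_valid(s):
    """Validate 'digits-digits-...' with an optional '&Ns=<digits>' suffix."""
    parts = s.split('&Ns=')
    if len(parts) > 2:
        return False  # '&Ns=' appeared more than once
    # Every dash-separated group before the suffix must be numeric.
    if not all(group.isdigit() for group in parts[0].split('-')):
        return False
    # The suffix, when present, must be numeric too.
    return len(parts) == 1 or parts[1].isdigit()


print(is_valid('999-123-222-22&Ns=12'))
print(is_valid('999-123-222-22&N=1'))
```

Checking each group matters because a string ending in &N=1 does not contain '&Ns=' at all, so split returns it in one piece and only the digit test rejects it.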
