How to make "int" parse blank strings? - python

I have a parsing system for fixed-length text records based on a layout table:
parse_table = [
    ('name', type, length),
    ....
    ('numeric_field', int, 10), # int example
    ('textc_field', str, 100), # string example
    ...
]
The idea is that given a table for a message type, I just go through the string, and reconstruct a dictionary out of it, according to entries in the table.
Now, I can handle strings and proper integers, but int() will not parse all-spaces fields (for a good reason, of course).
I wanted to handle it by defining a subclass of int that handles blank strings. This way I could go and change the type of appropriate table entries without introducing additional kludges in the parsing code (like filters), and it would "just work".
But I can't figure out how to override the constructor of a built-in type in a subtype, as defining a constructor in the subclass does not seem to help. I feel I'm missing something fundamental here about how Python built-in types work.
How should I approach this? I'm also open to alternatives that don't add too much complexity.

Use the int() function with the argument s.strip() or 0, i.e.:
int(s.strip() or 0)
Or if you know that the string will always contain only digit characters or is empty (""), then just:
int(s or 0)
In your specific case you can use a lambda expression, e.g.:
parse_table = [
    ....
    ('numeric_field', lambda s: int(s.strip() or 0), 10), # int example
    ...
]
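As a rough illustration of how such a table could drive the parsing, here is a minimal sketch. The helper names blank_to_zero and parse_record, the field names, and the field widths are all assumptions made up for this example; they do not come from the question.

```python
def blank_to_zero(s):
    """Hypothetical converter: parse an int, treating blank fields as 0."""
    return int(s.strip() or 0)

# Hypothetical layout table: (field name, converter, field width).
parse_table = [
    ('numeric_field', blank_to_zero, 10),
    ('text_field', str, 5),
]

def parse_record(record, table):
    """Slice a fixed-length record according to the table and convert each field."""
    result = {}
    offset = 0
    for name, convert, length in table:
        result[name] = convert(record[offset:offset + length])
        offset += length
    return result

record = ' ' * 10 + 'hello'
print(parse_record(record, parse_table))
# {'numeric_field': 0, 'text_field': 'hello'}
```

Because the converter is just a callable stored in the table, the parsing loop itself needs no special case for blank fields.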

Use a factory function instead of int or a subclass of int:
def mk_int(s):
    s = s.strip()
    return int(s) if s else 0
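If a subclass is still preferred over a factory function, the piece the question seems to be missing is that int is immutable, so its value is fixed in __new__ rather than in __init__. The sketch below is one possible way to do it; the class name BlankInt is hypothetical, and mapping blank input to 0 is an assumption about the desired behavior.

```python
class BlankInt(int):
    """Hypothetical int subclass that treats blank strings as 0."""

    def __new__(cls, value=0):
        # int is immutable, so the value must be chosen here, not in __init__.
        if isinstance(value, str) and not value.strip():
            value = 0
        return super().__new__(cls, value)

print(BlankInt('   '))   # 0
print(BlankInt(' 42 '))  # 42
```

Such a class can be placed in the table wherever int was used, at the cost of every parsed value being a BlankInt instance rather than a plain int.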

lenient_int = lambda string: int(string) if string.strip() else None
#else 0
#else ???

Note that mylist is a list of tuples, and each tuple may contain:
i) null / empty values,
ii) numbers stored as strings,
iii) empty lists.
For example:
mylist=[('','1',[]),('',[],2)]
@Arlaharen, I am repeating your solution here in a slightly different form in order to add keywords, because I lost a lot of time trying to find it!
The following solution treats empty strings and empty lists as zero, while keeping non-empty strings that contain digits and converting them to integers.
Simple solution. Note that the "0" indices can be replaced by loop variables.
Note that the first solution cannot handle empty lists inside tuples, because a list has no strip() method.
int(mylist[0][0]) if mylist[0][0].strip() else 0
I found an even simpler way, which can also handle empty lists in a tuple:
int(mylist[0][0] or '0')
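The two expressions above can be combined and applied to every element rather than only to mylist[0][0]. This is a minimal sketch; the helper name to_int is hypothetical, and treating whitespace-only strings as 0 is an assumption carried over from the other answers.

```python
def to_int(value):
    """Hypothetical helper: convert to int, treating '', blanks, None and [] as 0."""
    if isinstance(value, str):
        value = value.strip()
    return int(value or 0)

mylist = [('', '1', []), ('', [], 2)]
converted = [tuple(to_int(v) for v in row) for row in mylist]
print(converted)  # [(0, 1, 0), (0, 0, 2)]
```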

Related

python string to list (special list)

I'm trying to convert this string into a list. How can I do that, please?
My string:
x = "[(['xyz1'], 'COM95'), (['xyz2'], 'COM96'), (['xyz3'], 'COM97'), (['xyz4'], 'COM98'), (['xyz5'], 'COM99'), (['xyz6'], 'COM100')]"
I want to convert it to a list, so that:
print(list[0])
Output : (['xyz1'], 'COM95')
If you have this string instead of a list, that presumes it is coming from somewhere outside your control (otherwise you'd just make a proper list). If the string is coming from a source outside your program, eval() is dangerous: it will gladly run any code passed to it. In this case you can use ast.literal_eval(), which is safer (but make sure you understand the warning in the docs):
import ast
x = "[(['xyz1'], 'COM95'), (['xyz2'], 'COM96'), (['xyz3'], 'COM97'), (['xyz4'], 'COM98'), (['xyz5'], 'COM99'), (['xyz6'], 'COM100')]"
l = ast.literal_eval(x)
Which gives an l of:
[(['xyz1'], 'COM95'),
(['xyz2'], 'COM96'),
(['xyz3'], 'COM97'),
(['xyz4'], 'COM98'),
(['xyz5'], 'COM99'),
(['xyz6'], 'COM100')]
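To illustrate the difference in behavior, the following sketch wraps ast.literal_eval so that anything other than a plain list literal is rejected. The function name parse_list_literal and the choice to return None on failure are assumptions made for this example.

```python
import ast

def parse_list_literal(text):
    """Hypothetical wrapper: return the parsed list, or None if text is not a list literal."""
    try:
        value = ast.literal_eval(text)
    except (ValueError, SyntaxError):
        # literal_eval refuses function calls, names, and malformed input.
        return None
    return value if isinstance(value, list) else None

print(parse_list_literal("[(['xyz1'], 'COM95')]"))      # [(['xyz1'], 'COM95')]
print(parse_list_literal("__import__('os').getcwd()"))  # None
```

Unlike eval(), the second call never executes the expression; it is rejected because it is not a literal.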
If the structure is uniformly a list of tuples, each holding a one-element list of strings and an individual string, you can parse it manually using the single quote as a separator. This gives you one string value in every other component of the split (which you can access using a striding subscript). You can then build the actual tuples by pairing the values up two at a time:
tuples = [([a],s) for a,s in zip(*[iter(x.split("'")[1::2])]*2)]
print(tuples[0])
(['xyz1'], 'COM95')
Note that this does not cover the case where an individual string contains a single quote that needs escaping.
Do you mean converting a list-like string into a list? Maybe you can use eval().
For example:
a="[1,2,3,4]"
a=eval(a)
Then a becomes a list.
To convert it to a list, use x = eval(x).
print(list[0]) will not give you what you want, because list is the name of Python's built-in list type, not your variable (before Python 3.9 it raises a TypeError).
You should do print(x[0]) to get what you want.

How to split a string with an integer into two variables?

For example, is it possible to convert the input
x = 10hr
into something like
y = 10
z = hr
I considered slicing, but the individual parts of the string will never be of a fixed length -- for example, the base string could also be something like 365d or 9minutes.
I'm aware of split() and re.match which can separate items into a list/group based on delimitation. But I'm curious what the shortest way to split a string containing a string and an integer into two separate variables is, without having to reassign the elements of the list.
You could use a list comprehension and join the result into a string:
x='10hr'
digits="".join([i for i in x if not i.isalpha()])
letters="".join([i for i in x if i.isalpha()])
You don't need some fancy function or regex for your use case
x = '10hr'
i = 0
# advance i past the leading digits (the bounds check guards against all-digit input)
while i < len(x) and x[i].isdigit():
    i += 1
The solution assumes that the string is in the format you mentioned: 10hr, 365d, 9minutes, etc.
The loop above gives you the index i at which the string part starts:
>>> i
2
>>> x[:i]
'10'
>>> x[i:]
'hr'
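Since the question mentions re.match, here is a sketch of a regex-based variant that unpacks directly into two variables. The function name split_number_unit is hypothetical, and the pattern assumes the input is always digits followed by ASCII letters.

```python
import re

def split_number_unit(text):
    """Hypothetical helper: split '10hr' into (10, 'hr')."""
    match = re.fullmatch(r"(\d+)([A-Za-z]+)", text.strip())
    if match is None:
        raise ValueError("expected digits followed by letters: {!r}".format(text))
    number, unit = match.groups()
    return int(number), unit

y, z = split_number_unit("10hr")
print(y, z)  # 10 hr
```

Tuple unpacking on the return value avoids reassigning list elements by hand.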

How to Remove single Inverted comma from array in python?

I have the following array:
a =['1','2']
I want to convert this array into the below format :
a=[1,2]
How can I do that?
You can do it like this: convert each element of a (each of which is a string) to an integer.
a=[int(x) for x in a]
The single inverted comma you are talking about is the difference between str and int. This is pretty basic Python stuff.
A string is a sequence of characters, displayed with inverted commas around it. 'Hello' is a string, but '1' can be a string too.
In your case, ['1','2'] is a list of strings, and [1,2] is a list of numbers.
To convert a string to an int, you can do what is called casting: converting one type to another (the value has to be compatible, though). Casting 'hello' to a number doesn't make sense and won't work.
Casting '1' to a number is possible by calling int('1'), which will result in 1.
In your case you can cast all elements in your list by calling a = [int(x) for x in a].
For more info on types see this article.
For information on list comprehensions (What I used to change your list) see this article.
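The point about incompatible values can be seen in a short sketch. The function name to_ints is hypothetical; the only behavior relied on is that int() raises ValueError for a string such as 'hello'.

```python
def to_ints(items):
    """Hypothetical helper: convert a list of numeric strings to a list of ints."""
    result = []
    for item in items:
        try:
            result.append(int(item))
        except ValueError:
            # int('hello') cannot work, so report which element was at fault.
            raise ValueError("not an integer: {!r}".format(item))
    return result

a = ['1', '2']
a = to_ints(a)
print(a)  # [1, 2]
```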

python pandas dataframe to list string error

I am relatively new to pandas and now trying to convert pandas DataFrame rows to lists of strings.
It works well, however the strings in the original DataFrame are strangely modified in the list, as some append an "L" character for some reason.
I appreciate your help very much..
>>> data = pd.DataFrame(Data)
>>> for r in data.iterrows():
...     r[1].tolist()
>>> r[1]
a 16593
b 15
c 179.069
d 110000
e 5906
Name: 0, dtype: object
>>> r[1].tolist()
[16593L, 15.0, 179.068851, 110000.0, 5906L]
In fact, I figured out that the numbers with an appended L are integers; for floats it works.
Every column in the DataFrame has a specific "type" associated with it.
Typically this means they are of type "string", "int", or "float".
Right now, your .tolist() call converts the row into a list, but it doesn't necessarily change the type of all the values into a string.
When you type a list into the console, Python uses the "repr" method to find a string representation of the list. This involves putting in the brackets and calling "repr" on each of the elements. This is slightly different than casting the value to a string, which is done with the "str" method.
You can test this out for yourself:
# For regular ints, repr and str do the same thing
a = 5
str(a) #'5'
repr(a) #'5'
# The L means it's a *long* (Python 2 only), an integer type that is not limited to the machine word size
a = 5L
str(a) #'5'
repr(a) #'5L'
Note: this isn't the case in Python 3, where all ints behave like 'long', so no L is appended, as it would be redundant.
So, in the end, if you really want to convert your list of various types (float, int, str, depending on each column) to strings, you could use something like this:
my_list = [str(x) for x in my_list]
However, if you plan on doing some processing using these numbers, it's better to just leave them as their numerical type rather than convert back and forth to string.
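For comparison, here is a sketch of the same conversion under Python 3, using a plain list with the values from the question in place of the DataFrame row (the variable names row and as_strings are made up for this example).

```python
# Values copied from the question's output, without the Python 2 "L" suffix.
row = [16593, 15.0, 179.068851, 110000.0, 5906]

# str() is applied to each element explicitly, so the result is a list of strings.
as_strings = [str(value) for value in row]
print(as_strings)  # ['16593', '15.0', '179.068851', '110000.0', '5906']

# repr() and str() differ for strings: repr keeps the quotes.
print(repr('15'), str('15'))  # '15' 15
```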

complement/negate a boolean string in python

I have a list of boolean strings. Each string is of length 6. I need to get the complement of each string. E.g, if the string is "111111", then "000000" is expected. My idea is
bin(~int(s,2))[-6:]
That is: convert it to an integer and negate it by treating it as a binary number, then convert it back to a binary string and use the last 6 characters.
I think it is correct but it is not readable. And it only works for strings of length less than 30. Is there a better and general way to complement a boolean string?
I googled a 3rd party package "bitstring". However, it is too much for my code.
Well, you basically have a string in which you want to change all the 1s to 0s and vice versa. I think I would forget about the Boolean meaning of the strings and just use maketrans to make a translation table:
# Python 3: maketrans is a static method of str (Python 2 used: from string import maketrans)
complement_tt = str.maketrans('01', '10')
s = '001001'
s = s.translate(complement_tt) # It's now '110110'
Replace in three steps:
>>> s = "111111"
>>> s.replace("1", "x").replace("0", "1").replace("x", "0")
'000000'
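If an integer-based approach is still wanted, one option is to XOR with an all-ones mask of the same width instead of using ~, which avoids the negative-number representation and works for any length. This is only a sketch; the function name complement_bits is hypothetical.

```python
def complement_bits(s):
    """Hypothetical helper: flip every bit in a string of '0'/'1' characters."""
    if not s:
        return ''
    width = len(s)
    mask = (1 << width) - 1  # e.g. 0b111111 for width 6
    # XOR with the mask flips each bit; format() pads back to the original width.
    return format(int(s, 2) ^ mask, '0{}b'.format(width))

print(complement_bits('111111'))  # 000000
print(complement_bits('001001'))  # 110110
```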
