I have a list of boolean strings. Each string is of length 6. I need to get the complement of each string. E.g., if the string is "111111", then "000000" is expected. My idea is
bin(~int(s,2))[-6:]
convert it to integer and negate it by treating it as a binary number
convert it back to a binary string and use the last 6 characters.
I think it is correct but it is not readable. And it only works for strings of length less than 30. Is there a better and general way to complement a boolean string?
I googled a 3rd party package "bitstring". However, it is too much for my code.
Well, you basically have a string in which you want to change all the 1s to 0s and vice versa. I think I would forget about the Boolean meaning of the strings and just use maketrans to make a translation table:
# Python 3: maketrans is a static method of str (Python 2 used string.maketrans)
complement_tt = str.maketrans('01', '10')
s = '001001'
s = s.translate(complement_tt) # It's now '110110'
Replace in three steps:
>>> s = "111111"
>>> s.replace("1", "x").replace("0", "1").replace("x", "0")
'000000'
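The integer idea from the question can also be made to work for any length by XOR-ing with an all-ones mask instead of using ~, which yields a negative number in Python. A minimal sketch, assuming the string is non-empty and contains only '0' and '1' (the function name complement is just for illustration):

```python
def complement(bits):
    # Assumes bits is a non-empty string of '0' and '1' characters.
    width = len(bits)
    mask = (1 << width) - 1               # width ones, e.g. 0b111111 for width 6
    flipped = int(bits, 2) ^ mask         # XOR with all ones flips every bit
    return format(flipped, f'0{width}b')  # zero-pad back to the original width

print(complement('001001'))  # 110110
```

Because Python integers have arbitrary precision, this is not limited to short strings, though the translate approach above is probably still the more readable choice.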
I'd like to take some numbers that are in a string in python, round them to 2 decimal spots in place and return them. So for example if there is:
"The values in this string are 245.783634 and the other value is: 25.21694"
I'd like to have the string read:
"The values in this string are 245.78 and the other value is: 25.22"
What you'd have to do is find the numbers, round them, then replace them. You can use regular expressions to find them, and if we use re.sub(), it can take a function as its "replacement" argument, which can do the rounding:
import re
s = "The values in this string are 245.783634 and the other value is: 25.21694"
n = 2
result = re.sub(r'\d+\.\d+', lambda m: format(float(m.group(0)), f'.{n}f'), s)
Output:
The values in this string are 245.78 and the other value is: 25.22
Here I'm using the most basic regex and rounding code I could think of. You can vary it to fit your needs, for example check if the numbers have a sign (regex: [-+]?) and/or use something like the decimal module for handling large numbers better.
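The variations mentioned above (an optional sign and the decimal module) might look something like the following. This is only a sketch: the function name round_numbers and the choice of ROUND_HALF_UP are assumptions, not part of the original answer.

```python
import re
from decimal import Decimal, ROUND_HALF_UP

def round_numbers(text, places=2):
    # Smallest step to round to, e.g. Decimal('0.01') for places=2.
    quantum = Decimal(10) ** -places

    def repl(match):
        # Decimal parses the matched text exactly, avoiding binary float error.
        value = Decimal(match.group(0))
        return str(value.quantize(quantum, rounding=ROUND_HALF_UP))

    # [-+]? allows an optional sign in front of the number.
    return re.sub(r'[-+]?\d+\.\d+', repl, text)

s = "The values in this string are 245.783634 and the other value is: 25.21694"
print(round_numbers(s))
```

Using Decimal means a value such as -1.005 rounds to -1.01 under ROUND_HALF_UP, which a float-based format may not do because 1.005 has no exact binary representation.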
Another alternative using regex for what it is worth:
import re
def rounder(string, decimal_points):
    fmt = f".{decimal_points}f"
    return re.sub(r'\d+\.\d+', lambda x: f"{float(x.group()):{fmt}}", string)
text = "The values in this string are 245.783634 and the other value is: 25.21694"
print(rounder(text, 2))
Output:
The values in this string are 245.78 and the other value is: 25.22
I'm not sure quite what you are trying to do. "Round them in place and return them" -- do you need the values saved as variables that you will use later? If so, you might look into using a regular expression (as noted above) to extract the numbers from your string and assign them to variables.
But if you just want to be able to format numbers on-the-fly, have you looked at f-strings?
print(f"The values in this string are {245.783634:.2f} and the other value is: {25.21694:.2f}.")
output:
The values in this string are 245.78 and the other value is: 25.22.
You can simply use an f-string with a format specifier:
link=f'{23.02313:.2f}'
print(link)
This is one hacky way but many other solutions do exist. I did that in one of my recent projects.
Given the string "001000100", I want to replace only the zeros surrounded by "1" by "1". The result in this case would be "001111100" In this case there's guaranteed to be only one sequence of zeros surrounded by ones.
Given the string "100" or "001" or "110" or "011", I want the original string returned.
Performance is not an issue as the string (which is currently "101"), is only expected to increase slowly over time when electricity and/or tax rates change.
I think this should be trivial but my limited regex experience and web searches have failed to come up with an answer. Any help coming up with the relevant regex pattern will be appreciated.
EDIT: since posting this question, I've received quite a bit of useful feedback. To ensure any answers address my requirements I've rethought the requirements and I think (since I'm still not 100% certain) that they can be summarized as follows:
‘string’ shall always contain at least one 1
‘string’ shall have zero or one sequence of one or more 0 surrounded by a 1
a sequence of one or more 0 surrounded by a 1 shall be replaced by the same number of 1
‘string’ that does not have at least one 0 surrounded by 1 shall be returned as-is
Another useful piece of information is that the original input is not a string but a Python list of Booleans. Therefore any solution that uses regex will have to convert the list of Booleans to a string and vice versa.
I solved my problem thanks to the essential contributions of Kelly Bundy and bobble bubble. The following Python function meets the requirements but improvements are of course welcome:
import re
def make_contiguous(booleans):  # replaces each '0' surrounded by '1' with '1'
    string = "".join(str(int(i)) for i in booleans)  # convert list of Booleans to str to allow use of regex
    string = re.sub('10*1', lambda m: '1' * len(m[0]), string)  # replace each match with the same number of '1's
    booleans = [bool(int(i)) for i in string]  # convert the str back to Booleans
    return booleans
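Since the input is a list of Booleans anyway, the round trip through a string can be avoided. The sketch below is an alternative, not the accepted regex approach: it fills everything between the first and last True, which matches the stated requirements only under the assumption that there is at most one run of zeros surrounded by ones. The name fill_between_ones is made up for illustration.

```python
def fill_between_ones(flags):
    # No True at all: nothing to fill, return a copy unchanged.
    if True not in flags:
        return list(flags)
    first = flags.index(True)                        # position of the first True
    last = len(flags) - 1 - flags[::-1].index(True)  # position of the last True
    # Everything from first to last (inclusive) becomes True.
    return [first <= i <= last for i in range(len(flags))]

print(fill_between_ones([False, False, True, False, True, False]))
# [False, False, True, True, True, False]
```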
Say we have a numpy.ndarray with numpy.str_ elements. For example, below arr is the numpy.ndarray with two numpy.str_ elements like this:
arr = ['12345"""ABCDEFG' '1A2B3C"""']
I am trying to perform string slicing on each numpy element.
For example, how can we slice the first element '12345"""ABCDEFG' so that we replace its 10 last characters with the string REPL, i.e.
arr = ['12345REPL' '1A2B3C"""']
Also, is it possible to perform string substitutions, e.g. substitute all characters after a specific symbol?
Strings are immutable, so you should either create slices and manually recombine or use regular expressions. For example, to replace the last 10 characters of the first element in your array, arr, you could do:
import numpy as np
import re
arr = np.array(['12345"""ABCDEFG', '1A2B3C"""'])
arr[0] = re.sub(re.escape(arr[0][-10:]) + '$', 'REPL', arr[0])  # escape the slice so it is matched literally, anchored at the end
print(arr)
#['12345REPL' '1A2B3C"""']
If you want to replace all characters after a specific character you could use a regular expression or find the index of that character in the string and use that as the slicing index.
EDIT: Your comment is more about regular expressions than simply Python slicing, but this is how you could replace everything after the triple quote:
re.sub('["]{3}(.+)', 'REPL', arr[0])
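This line essentially says, "Find the triple quote followed by at least one more character, and replace the whole match (the quotes included) with REPL." A string that ends with the triple quote, like the second element, is left unchanged because .+ needs at least one character to match.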
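The index-based alternative mentioned above can be sketched without regular expressions using str.partition. Note that this variant keeps the marker and replaces only what follows it, which is one reading of "substitute all characters after a specific symbol"; the helper name replace_after is hypothetical.

```python
def replace_after(text, marker, replacement):
    # partition splits on the first occurrence of marker:
    # head is the part before it, sep is the marker itself (or '' if absent).
    head, sep, _tail = text.partition(marker)
    if not sep:  # marker not found: return the string unchanged
        return text
    return head + sep + replacement

print(replace_after('12345"""ABCDEFG', '"""', 'REPL'))  # 12345"""REPL
```

To apply it to every element, a list comprehension over the array (wrapped in np.array again if an array is needed) should be enough.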
In python, strings are immutable. Also, in NumPy, array scalars are immutable; your string is therefore immutable.
What you would want to do in order to slice is to treat your string like a list and access the elements.
Say we had a string where we wanted to slice at the 3rd letter, excluding the third letter:
my_str = 'purple'
sliced_str = my_str[:2]  # 'pu': the slice stops before index 2, so the third letter is excluded
Now that we have that part of the string, say we want to replace everything after the point where we sliced. We have to take the slice we kept and concatenate it with the replacement text to build a new string:
# say I want to replace the end of 'my_str', from where we sliced, with a string named 's'
s = 'dandylion'
new_string = sliced_str + s # returns 'pudandylion'
Because string types are immutable, you have to store elements you want to keep, then combine the stored elements with the elements you would like to add in a new variable.
np.char has a replace function, which applies the corresponding string method to each element of the array:
In [598]: arr = np.array(['12345"""ABCDEFG', '1A2B3C"""'])
In [599]: np.char.replace(arr,'"""ABCDEFG',"REPL")
Out[599]:
array(['12345REPL', '1A2B3C"""'],
dtype='<U9')
In this particular example it can be made to work, but it isn't nearly as general purpose as re.sub. Also these char functions are only modestly faster than iterating on the array. There are some good examples of that in @Divakar's link.
I have a binary string, say '01110000', and I want to return the number of leading zeros without writing a for loop. Does anyone have any idea how to do that? Preferably in a way that also returns 0 if the string immediately starts with a '1'.
If you're really sure it's a "binary string":
input = '01110000'
zeroes = input.index('1')
Update: it breaks (index raises ValueError) when there's nothing but "leading" zeroes
An alternate form that handles the all-zeroes case.
zeroes = (input+'1').index('1')
Here is another way:
In [36]: s = '01110000'
In [37]: len(s) - len(s.lstrip('0'))
Out[37]: 1
It differs from the other solutions in that it actually counts the leading zeroes instead of finding the first 1. This makes it a little bit more general, although for your specific problem that doesn't matter.
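Wrapped in a function, the lstrip approach and its edge cases can be sketched like this (the name count_leading_zeros is just for illustration):

```python
def count_leading_zeros(s):
    # lstrip('0') removes every leading '0'; the length difference is the count.
    return len(s) - len(s.lstrip('0'))

print(count_leading_zeros('01110000'))  # 1
print(count_leading_zeros('0000'))      # 4
```

It returns 0 for a string that starts with '1' and for the empty string, and it does not fail on an all-zero string.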
A simple one-liner:
x = '01110000'
leading_zeros = len(x.split('1', 1)[0])
This partitions the string into everything up to the first '1' and the rest after it, then counts the length of the prefix. The second argument to split is just an optimization and represents the number of splits to perform, meaning the function will stop after it has found the first '1' instead of splitting on all occurrences. You could just use x.split('1')[0] if performance doesn't matter.
I'd use:
import itertools
s = '00001010'
sum(1 for _ in itertools.takewhile('0'.__eq__, s))
Rather pythonic, works in the general case, for example on the empty string and non-binary strings, and can handle strings of any length (or even iterators).
If you know it's only 0 or 1:
x.find('1')
(will return -1 if all zeros; you may or may not want that behavior)
If you don't know which number would be next to zeros i.e. "1" in this case, and you just want to check if there are leading zeros, you can convert to int and back and compare the two.
"0012300" == str(int("0012300"))
How about re module?
a = re.search('(?!0)', data)
then a.start() is the position.
I'm using has_leading_zero = re.match(r'0\d+', str(data)) as a solution that accepts any number and treats 0 as a valid number without a leading zero
I have a parsing system for fixed-length text records based on a layout table:
parse_table = [\
('name', type, length),
....
('numeric_field', int, 10), # int example
('textc_field', str, 100), # string example
...
]
The idea is that given a table for a message type, I just go through the string, and reconstruct a dictionary out of it, according to entries in the table.
Now, I can handle strings and proper integers, but int() will not parse all-spaces fields (for a good reason, of course).
I wanted to handle it by defining a subclass of int that handles blank strings. This way I could go and change the type of appropriate table entries without introducing additional kludges in the parsing code (like filters), and it would "just work".
But I can't figure out how to override the constructor of a built-in type in a sub-type, as defining a constructor in the subclass does not seem to help. I feel I'm missing something fundamental here about how Python built-in types work.
How should I approach this? I'm also open to alternatives that don't add too much complexity.
Use int() function with the argument s.strip() or 0, i.e:
int(s.strip() or 0)
Or if you know that the string will always contain only digit characters or is empty (""), then just:
int(s or 0)
In your specific case you can use lambda expression, e.g:
parse_table = [\
....
('numeric_field', lambda s: int(s.strip() or 0), 10), # int example
...
]
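To show how such a converter plugs into the table-driven parsing described in the question, here is a minimal sketch. The function parse_record, the field names, and the sample layout are all hypothetical; the question does not show its actual parsing loop.

```python
def parse_record(record, table):
    # Walk the fixed-width record left to right, one table entry per field.
    result, pos = {}, 0
    for name, convert, length in table:
        result[name] = convert(record[pos:pos + length])
        pos += length
    return result

# Hypothetical layout: a 5-character integer field, then an 8-character text field.
layout = [
    ('id', lambda s: int(s.strip() or 0), 5),  # blank field becomes 0
    ('label', str.strip, 8),                   # assumption: trailing padding is unwanted
]

print(parse_record('   42widget  ', layout))  # {'id': 42, 'label': 'widget'}
print(parse_record('     widget  ', layout))  # {'id': 0, 'label': 'widget'}
```

Any callable works in the type column, so the parsing code itself does not need to know about blank handling.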
Use a factory function instead of int or a subclass of int:
def mk_int(s):
    s = s.strip()
    return int(s) if s else 0
lenient_int = lambda string: int(string) if string.strip() else None
#else 0
#else ???
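On the subclassing approach the question asks about: int is immutable, so its value is fixed in __new__ rather than __init__, which is why overriding only __init__ does not help. A minimal sketch, assuming blank fields should become 0 (the class name BlankInt is made up):

```python
class BlankInt(int):
    """int subclass that treats empty or whitespace-only strings as 0."""

    def __new__(cls, value=0, *args):
        # The value of an immutable type must be chosen here, not in __init__.
        if isinstance(value, str) and not value.strip():
            return super().__new__(cls, 0)
        return super().__new__(cls, value, *args)

print(BlankInt('   '))   # 0
print(BlankInt(' 42 '))  # 42
```

BlankInt can then be placed in the type column of the table in place of int. Whether this is preferable to a plain factory function is a matter of taste; the function is arguably simpler.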
Note that mylist is a list of tuples, and the tuples contain:
i) empty strings,
ii) numbers, either as strings or as integers, and
iii) empty lists. For example:
mylist=[('','1',[]),('',[],2)]
@Arlaharen, I am repeating your solution here in a slightly different form, with added keywords, because I lost a lot of time trying to find it!
The following solution converts empty strings and empty lists to zero, while non-empty strings that contain digits are converted to the corresponding integers.
Simple solution. Note that "0" can be replaced by iterable variables.
Note that the first solution cannot handle empty lists inside the tuples, because a list has no strip() method:
int(mylist[0][0]) if mylist[0][0].strip() else 0
I found an even simpler way that can also handle empty lists in a tuple:
int(mylist[0][0] or '0')
convert string to digits / convert string to number / convert string to integer
strip empty lists / strip empty string / treat empty string as digit / number
convert null string as digit / number / convert null string as integer
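The two expressions above can be combined into one helper that handles every kind of element in mylist. This is only a sketch; the name to_int and the default parameter are assumptions added for illustration.

```python
def to_int(value, default=0):
    if isinstance(value, str):
        value = value.strip()   # whitespace-only strings count as empty
    if isinstance(value, int):
        return value            # already a number (including 0)
    # Empty strings, empty lists and None are falsy and map to the default.
    # Note: a non-empty list would still raise TypeError in int().
    return int(value) if value else default

mylist = [('', '1', []), ('', [], 2)]
print([[to_int(v) for v in row] for row in mylist])  # [[0, 1, 0], [0, 0, 2]]
```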