I am dealing with a number of byte objects of different lengths. However, they all contain a certain byte that I want to end on (i.e. I always want to get the values up to that byte but not past it). The problem is that it is not always the last byte, so I can't just use [:-1]. I am never interested in what comes after that byte, so it is okay for me to ignore it, but I do need to capture what comes before it.
Is there a way in Python to slice up to a certain value as opposed to a certain index?
i.e.
[2:'\xf0']
to slice from the third byte to the \xf0 byte?
In Python 2.x, you can use the index function and slicing, like this
a = bytearray(b"abcd\xf0asda")
print a[2:a.index('\xf0')]
# cd
In Python 3.x, you just need to search with the bytes object, like this
a = b"abcd\xf0asda"
print(a[2:a.index(b'\xf0')])
# b'cd'
The index function returns the index of the first occurrence of the item you are looking for. Beware: it raises a ValueError if the item is not found in the object.
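If the terminating byte might be missing, find() can be used in place of index(), since it returns -1 rather than raising. The following is a minimal Python 3 sketch; the helper name slice_until and the choice to return the rest of the data when no terminator is found are assumptions for illustration, not part of the answer above.

```python
def slice_until(data: bytes, start: int, terminator: bytes) -> bytes:
    """Return data[start:end], where end is the first terminator at or after start."""
    end = data.find(terminator, start)  # find() returns -1 instead of raising
    if end == -1:
        # Assumption: with no terminator present, keep everything from start onward.
        return data[start:]
    return data[start:end]


a = b"abcd\xf0asda"
print(slice_until(a, 2, b"\xf0"))  # b'cd'
print(slice_until(a, 2, b"\xff"))  # b'cd\xf0asda'
```

Raising an exception instead of returning the remainder would be just as reasonable if a missing terminator indicates corrupt input.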
I have run into a weird problem that I have not been able to solve for days. I created a byte array containing the values 1 to 250 and wrote it to a binary file from C# using WriteAllBytes.
Later I read it from Python using np.fromfile(filename, dtype=np.ubyte). However, I realized that this function seemed to add arbitrary commas (see the image). Interestingly, they are not visible in the array property, and if I call numpy.array2string, the comma turns into '\n'. One solution is to replace the commas with nothing, but I have very long sequences, and running replace on 100 GB of data would take forever. I also rechecked the files by reading them with .NET Core, and I am quite sure the commas are not there.
What could I be missing?
Edit:
I was trying to read all byte values into an array and cast each member, or the entire array, to a string. I found that the most reliable way to do this is:
list(map(str, ubyte_array))
The code above returns a list of strings whose elements contain no arbitrary commas or blank spaces.
I want to retrieve multiple strings from one row of my terminal. Right now I'm using instr(), but that only extracts the string at that exact position. The function that should actually do this is inchstr(), but that doesn't seem to work in Python. Or does it?
No. Python's curses binding does not extend the underlying curses library (much). There's more than one related curses function which python might use, depending on what you are looking at, but none read more than a single line of text:
int instr(char *str);
int inwstr(wchar_t *wstr);
int inchstr(chtype *chstr);
int in_wchstr(cchar_t *wchstr);
The first (instr) and third (inchstr) both read from the screen, but the latter returns attributes (color, underline, etc) along with the text.
Python's instr appears to use the former, since its documentation states
Return a bytes object of characters, extracted from the window starting at the current cursor position, or at y, x if specified. Attributes are stripped from the characters. If n is specified, instr() returns a string at most n characters long (exclusive of the trailing NUL).
The second (inwstr) and fourth (in_wchstr) differ from the other two by allowing wide characters to be read directly. Python really should provide for using either set (narrow or wide-character interfaces), since ncurses' wide-character interface is better suited to returning Unicode strings, but it uses the narrow interface in either case, returning a byte array (and requiring the application to puzzle out how to convert the data into a string).
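Since instr() reads at most one line and returns bytes, reading several rows means calling it once per row and decoding each result. The following is a rough sketch of that loop; the helper name read_rows and the default "utf-8" encoding are assumptions for illustration, and the right encoding depends on the terminal's locale.

```python
def read_rows(win, first_row, last_row, encoding="utf-8"):
    """Collect rows first_row..last_row (inclusive) of a curses window as str."""
    _, width = win.getmaxyx()  # (height, width) of the window
    rows = []
    for y in range(first_row, last_row + 1):
        raw = win.instr(y, 0, width)  # bytes, attributes stripped
        # Assumption: the screen content is valid text in `encoding`.
        rows.append(raw.decode(encoding, errors="replace").rstrip())
    return rows
```

Inside a function run by curses.wrapper, this could be called as read_rows(stdscr, 0, 2) to get the first three rows of the screen as a list of strings.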
Let's assume I have a variable tmp of type bytes that contains zeros and ones. I want to replace the value at the fourth position (index 3) within tmp by setting an explicit value (e.g. 1).
I wonder what a clean way is to replace individual bits within an object (tmp) of type bytes. I would like to set the value directly. My attempt does not work. Help in understanding the problem with my approach would be highly appreciated.
print(tmp) # -> b'00101001'
print(type(tmp)) # -> <class 'bytes'>
tmp[3] = 1 # Expected b'00111001' but actually got TypeError: 'bytes' object does not support item assignment
Is there a function like set_bit_in(tmp, position, bit_value)?
A bytes object is immutable in Python; you can index it and iterate over it, though.
You can turn it into a bytearray, however, and that would be the easiest way to go about it.
Or you can, for example, turn it into a list and then change the value, as follows:
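tmp_list = list(bin(int(tmp, 2))[2:].zfill(len(tmp)))  # bin() needs an int; zfill restores leading zeros
tmp_list[3] = '1'
tmp = ''.join(tmp_list).encode()  # b'00111001'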
The first two characters are stripped with [2:] because bin() always prefixes its result with '0b'.
Also, tmp holds the ASCII characters '0' and '1', not raw bit values, so in the list above the value to assign is the character '1', not the integer 1.
If converting to a list and back is not the way you want to go, you can also copy the string representation and change just the one element you want to change.
Alternatively, you can perform bitwise operations on the integer value itself, if you are comfortable working with binary.
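The bytearray route and the bitwise route can both be sketched in a few lines. The helper names set_char_bit and set_int_bit are made up for this illustration, and the sketch assumes that tmp holds ASCII '0'/'1' characters as in the question and that positions are counted from the left.

```python
def set_char_bit(tmp: bytes, position: int, bit_value: int) -> bytes:
    """Replace one ASCII '0'/'1' character in tmp via a mutable bytearray copy."""
    if bit_value not in (0, 1):
        raise ValueError("bit_value must be 0 or 1")
    buf = bytearray(tmp)                    # mutable copy of the immutable bytes
    buf[position] = ord(str(bit_value))     # 48 for '0', 49 for '1'
    return bytes(buf)


def set_int_bit(value: int, position: int, bit_value: int, width: int = 8) -> int:
    """Set or clear one bit of an integer; position 0 is the leftmost of `width` bits."""
    mask = 1 << (width - 1 - position)
    return value | mask if bit_value else value & ~mask


tmp = b"00101001"
print(set_char_bit(tmp, 3, 1))                                  # b'00111001'
print(format(set_int_bit(int(tmp, 2), 3, 1), "08b").encode())   # b'00111001'
```

The bytearray version keeps the data as text throughout, while the integer version is the one to prefer if the value is going to be used as a number anyway.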
In a dictionary, I have the following value containing equals signs:
{"appVersion":"o0u5jeWA6TwlJacNFnjiTA=="}
To be explicit, I need to replace each = with its Unicode escape '\u003d' (basically the reverse of what json.loads() does). How can I assign the escaped value to a variable without storing it with two escapes (\\u003d)?
I've tried different ways, including encode/decode, repr(), unichr(61), etc., and even after a lot of searching I couldn't find anything that does this. Every approach gives me the following final result (or the original value):
'o0u5jeWA6TwlJacNFnjiTA\\u003d\\u003d'
Thanks in advance for your attention.
EDIT
When I debug the code, it shows the value of the variable with two escapes. The program takes this value and uses it in the subsequent actions, including the extra escape. I'm using this code to build a JSON document with json.dumps(), and the result returned is a unicode string with two escapes.
Below is a printout of the final result after the JSON construction. I need to find a way to store the value in the variable with just one escape.
I don't know if it makes a difference, but I'm doing this for a custom Burp plugin that manipulates some selected requests.
Here is an image of my POC, getting the value of the var.
The extra backslash is not actually added. The Python interpreter displays the repr() of the string, which doubles the backslash to show that it is a literal backslash rather than the start of an escape such as \t or \n. When the string is printed, only one backslash appears:
I hope this helps:
>>> t['appVersion'] = t["appVersion"].replace('=', '\u003d')
>>> t['appVersion']
'o0u5jeWA6TwlJacNFnjiTA\\u003d\\u003d'
>>> print(t['appVersion'])
o0u5jeWA6TwlJacNFnjiTA\u003d\u003d
>>> t['appVersion'] == 'o0u5jeWA6TwlJacNFnjiTA\u003d\u003d'
True
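The same distinction matters once json.dumps() is involved, because a real backslash stored in the value is escaped again during serialization. The following Python 3 sketch shows the difference; doing the replacement on the serialized text is one possible workaround suggested here as an assumption, not something taken from the answer above.

```python
import json

data = {"appVersion": "o0u5jeWA6TwlJacNFnjiTA=="}

# Replacing inside the value stores a real backslash, which json.dumps() then escapes.
escaped_value = data["appVersion"].replace("=", "\\u003d")
doubled = json.dumps({"appVersion": escaped_value})

# Replacing in the serialized text instead leaves one backslash per escape.
# '=' never occurs in JSON syntax outside of strings, so this does not touch the structure.
payload = json.dumps(data).replace("=", "\\u003d")

print(repr(escaped_value))          # repr shows each backslash doubled
print(escaped_value)                # print shows the single backslash actually stored
print(doubled)                      # two backslashes per escape in the JSON text
print(payload)                      # one backslash per escape in the JSON text
print(json.loads(payload) == data)  # True
```

A JSON parser reading payload decodes \u003d back to =, so the receiving side sees the original value.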
I have an integer representing a unicode character which I want to transform to the actual character so I can print it out.
However, the function unichr() gives me different behaviour depending on whether there is a leading zero or not. (See the screenshot below for a better explanation.)
However, when the integer is stored in a variable I always get the first behavior whilst I want to achieve the second. How can I do this?