How to replace string character in Python? - python

for i in range(len(contents)):
contents = contents.replace(contents[i],chr(ord(contents[i]) + 1))
print(contents)
for i in range(len(contents)):
contents = contents.replace(contents[i],chr(ord(contents[i]) - 1))
print(contents)
This is where I get confused, shouldn't it just add 1 int to the character more and give you a character that is one byte (in UNICODE) above? Shouldn't it after you subtract one give you back the same result as before?
I have a string This is some sample text!. When I run the code, the string is converted to Ukkz%kz%zqoh%zboqmh%zhzz%.
Then, it should decrypt it back, but it show Tees es sole salole sess.

Thanks to #Jon Clements i have found a fix:
I converted the string into an array, then looped trough the array adding 1 to the character ordinal.
s = list(contents)
def encrypt(s):
for i in range(len(s)):
s[i] = chr(ord(s[i]) + 1)
ret = ''.join(s)
return ret

contents = "This is some sample text!"
print(f"Original value of contents: {contents}")
for i in range(len(contents)):
contents = contents[:i]+contents[i:].replace(contents[i],chr(ord(contents[i]) + 1),1)
print(f"Modified value of contents: {contents}")
for i in range(len(contents)):
contents = contents[:i]+contents[i:].replace(contents[i],chr(ord(contents[i]) - 1),1)
print(f"Reversed value of contents: {contents}")

Related

Read Null terminated string in python

I'm trying to read a null terminated string but i'm having issues when unpacking a char and putting it together with a string.
This is the code:
def readString(f):
str = ''
while True:
char = readChar(f)
str = str.join(char)
if (hex(ord(char))) == '0x0':
break
return str
def readChar(f):
char = unpack('c',f.read(1))[0]
return char
Now this is giving me this error:
TypeError: sequence item 0: expected str instance, int found
I'm also trying the following:
char = unpack('c',f.read(1)).decode("ascii")
But it throws me:
AttributeError: 'tuple' object has no attribute 'decode'
I don't even know how to read the chars and add it to the string, Is there any proper way to do this?
Here's a version that (ab)uses __iter__'s lesser-known "sentinel" argument:
with open('file.txt', 'rb') as f:
val = ''.join(iter(lambda: f.read(1).decode('ascii'), '\x00'))
How about:
myString = myNullTerminatedString.split("\x00")[0]
For example:
myNullTerminatedString = "hello world\x00\x00\x00\x00\x00\x00"
myString = myNullTerminatedString.split("\x00")[0]
print(myString) # "hello world"
This works by splitting the string on the null character. Since the string should terminate at the first null character, we simply grab the first item in the list after splitting. split will return a list of one item if the delimiter doesn't exist, so it still works even if there's no null terminator at all.
It also will work with byte strings:
myByteString = b'hello world\x00'
myStr = myByteString.split(b'\x00')[0].decode('ascii') # "hello world" as normal string
If you're reading from a file, you can do a relatively larger read - estimate how much you'll need to read to find your null string. This is a lot faster than reading byte-by-byte. For example:
resultingStr = ''
while True:
buf = f.read(512)
resultingStr += buf
if len(buf)==0: break
if (b"\x00" in resultingStr):
extraBytes = resultingStr.index(b"\x00")
resultingStr = resultingStr.split(b"\x00")[0]
break
# now "resultingStr" contains the string
f.seek(0 - extraBytes,1) # seek backwards by the number of bytes, now the pointer will be on the null byte in the file
# or f.seek(1 - extraBytes,1) to skip the null byte in the file
(edit version 2, added extra way at the end)
Maybe there are some libraries out there that can help you with this, but as I don't know about them lets attack the problem at hand with what we know.
In python 2 bytes and string are basically the same thing, that change in python 3 where string is what in py2 is unicode and bytes is its own separate type, which mean that you don't need to define a read char if you are in py2 as no extra work is required, so I don't think you need that unpack function for this particular case, with that in mind lets define the new readString
def readString(myfile):
chars = []
while True:
c = myfile.read(1)
if c == chr(0):
return "".join(chars)
chars.append(c)
just like with your code I read a character one at the time but I instead save them in a list, the reason is that string are immutable so doing str+=char result in unnecessary copies; and when I find the null character return the join string. And chr is the inverse of ord, it will give you the character given its ascii value. This will exclude the null character, if its needed just move the appending...
Now lets test it with your sample file
for instance lets try to read "Sword_Wea_Dummy" from it
with open("sword.blendscn","rb") as archi:
#lets simulate that some prior processing was made by
#moving the pointer of the file
archi.seek(6)
string=readString(archi)
print "string repr:", repr(string)
print "string:", string
print ""
#and the rest of the file is there waiting to be processed
print "rest of the file: ", repr(archi.read())
and this is the output
string repr: 'Sword_Wea_Dummy'
string: Sword_Wea_Dummy
rest of the file: '\xcd\xcc\xcc=p=\x8a4:\xa66\xbfJ\x15\xc6=\x00\x00\x00\x00\xeaQ8?\x9e\x8d\x874$-i\xb3\x00\x00\x00\x00\x9b\xc6\xaa2K\x15\xc6=;\xa66?\x00\x00\x00\x00\xb8\x88\xbf#\x0e\xf3\xb1#ITuB\x00\x00\x80?\xcd\xcc\xcc=\x00\x00\x00\x00\xcd\xccL>'
other tests
>>> with open("sword.blendscn","rb") as archi:
print readString(archi)
print readString(archi)
print readString(archi)
sword
Sword_Wea_Dummy
ÍÌÌ=p=Š4:¦6¿JÆ=
>>> with open("sword.blendscn","rb") as archi:
print repr(readString(archi))
print repr(readString(archi))
print repr(readString(archi))
'sword'
'Sword_Wea_Dummy'
'\xcd\xcc\xcc=p=\x8a4:\xa66\xbfJ\x15\xc6='
>>>
Now that I think about it, you mention that the data portion is of fixed size, if that is true for all files and the structure on all of them is as follow
[unknow size data][know size data]
then that is a pattern we can exploit, we only need to know the size of the file and we can get both part smoothly as follow
import os
def getDataPair(filename,knowSize):
size = os.path.getsize(filename)
with open(filename, "rb") as archi:
unknown = archi.read(size-knowSize)
know = archi.read()
return unknown, know
and by knowing the size of the data portion, its use is simple (which I get by playing with the prior example)
>>> strins_data, data = getDataPair("sword.blendscn", 80)
>>> string_data, data = getDataPair("sword.blendscn", 80)
>>> string_data
'sword\x00Sword_Wea_Dummy\x00'
>>> data
'\xcd\xcc\xcc=p=\x8a4:\xa66\xbfJ\x15\xc6=\x00\x00\x00\x00\xeaQ8?\x9e\x8d\x874$-i\xb3\x00\x00\x00\x00\x9b\xc6\xaa2K\x15\xc6=;\xa66?\x00\x00\x00\x00\xb8\x88\xbf#\x0e\xf3\xb1#ITuB\x00\x00\x80?\xcd\xcc\xcc=\x00\x00\x00\x00\xcd\xccL>'
>>> string_data.split(chr(0))
['sword', 'Sword_Wea_Dummy', '']
>>>
Now to get each string a simple split will suffice and you can pass the rest of the file contained in data to the appropriated function to be processed
Doing file I/O one character at a time is horribly slow.
Instead use readline0, now on pypi: https://pypi.org/project/readline0/ . Or something like it.
In 3.x, there's a "newline" argument to open, but it doesn't appear to be as flexible as readline0.
Here is my implementation:
import struct
def read_null_str(f):
r_str = ""
while 1:
back_offset = f.tell()
try:
r_char = struct.unpack("c", f.read(1))[0].decode("utf8")
except:
f.seek(back_offset)
temp_char = struct.unpack("<H", f.read(2))[0]
r_char = chr(temp_char)
if ord(r_char) == 0:
return r_str
else:
r_str += r_char

Python: converting hex values, stored as string, to hex data

(Answer found. Close the topic)
I'm trying to convert hex values, stored as string, in to hex data.
I have:
data_input = 'AB688FB2509AA9D85C239B5DE16DD557D6477DEC23AF86F2AABD6D3B3E278FF9'
I need:
data_output = '\xAB\x68\x8F\xB2\x50\x9A\xA9\xD8\x5C\x23\x9B\x5D\xE1\x6D\xD5\x57\xD6\x47\x7D\xEC\x23\xAF\x86\xF2\xAA\xBD\x6D\x3B\x3E\x27\x8F\xF9'
I was trying data_input.decode('hex'), binascii.unhexlify(data_input) but all they return:
"\xabh\x8f\xb2P\x9a\xa9\xd8\\#\x9b]\xe1m\xd5W\xd6G}\xec#\xaf\x86\xf2\xaa\xbdm;>'\x8f\xf9"
What should I write to receive all bytes in '\xFF' view?
updating:
I need representation in '\xFF' view to write this data to a file (I'm opening file with 'wb') as:
«hЏІPљ©Ш\#›]бmХWЦG}м#Ї†тЄЅm;>'Џщ
update2
Sorry for bothering. An answer lies under my nose all the time:
data_output = data_input.decode('hex')
write_file(filename, data_output) #just opens a file 'wb', ant write a data in it
gives the same result as I need
I like chopping strings into fixed-width chunks using re.findall
print '\\x' + '\\x'.join(re.findall('.{2}', data_input))
If you want to actually convert the string into a list of ints, you can do that like this:
data = [int(x, 16) for x in re.findall('.{2}', data_input)]
It's an inefficient solution, but there's always:
flag = True
data_output = ''
for char in data_input:
if flag:
buffer = char
flag = False
else:
data_output = data_output + '\\x' + buffer + char
flag = True
EDIT HOPEFULLY THE LAST: Who knew I could mess up in so many different ways on that simple a loop? Should actually run now...
>>> int('0x10AFCC', 16)
1093580
>>> hex(1093580)
'0x10afcc'
So prepend your string with '0x' then do the above

String to integer implicit change when not called for?

I'm trying to create a simple encryption/decryption code in Python like this (maybe you can see what I'm going for):
def encrypt():
import random
input1 = input('Write Text: ')
input1 = input1.lower()
key = random.randint(10,73)
output = []
for character in input1:
number = ord(character) - 96
number = number + key
output.append(number)
output.insert(0,key)
print (''.join(map(str, output)))
def decrypt():
text = input ('What to decrypt?')
key = int(text[0:2])
text = text[2:]
n=2
text = text
text = [text[i:i+n] for i in range(0, len(text), n)]
text = map(int,text)
text = [x - key for x in text]
text = ''.join(map(str,text))
text = int(text)
print (text)
for character in str(text):
output = []
character = int((character+96))
number = str(chr(character))
output.append(number)
print (''.join(map(str, output)))
When I run the decryptor with the output from the encryption output, I get "TypeError: Can't convert 'int' object to str implicitly."
As you can see, I've added some redundancies to help try to fix things but nothing's working. I ran it with different code (can't remember what), but all that one kept outputting was something like "generatorobject at ."
I'm really lost and I could use some pointers guys, please and thank you.
EDIT: The problem arises on line 27.
EDIT 2: Replaced "character = int((character+96))" with "character = int(character)+96", now the problem is that it only prints (and as I can only assume) only appends the last letter of the decrypted message.
EDIT 2 SOLVED: output = [] was in the for loop, thus resetting it every time. Problem solved, thank you everyone!
Full traceback would help, but it looks like character = int(character)+96 is what you want on line 27.

I want to write math expression in file

ı want to write that command but it is not working
for k in range 900 :
l=len(str(a[k]) **a[ ] is a string which gives random float numbers**
f.write("\n*l") **ı need to write space as the number of string length**
The multiplication needs to be on the string, not in it.
f.write("\n" * l)
I assume that you get length correctly, problem is to write to file.
All you need to do is save to string first and write that string into file.
temp = ''
for k in range 900 :
l=len(str(a[k]) **a[ ] is a string which gives random float numbers**
temp = temp + str(l) + "\n" # store all your values to temp string
f.write(temp) # then write than temp string into file

python : ValueError: invalid literal for int() with base 10: ' '

I have a text file which contains entry like
70154::308933::3
UserId::ProductId::Score
I wrote this program to read:
(Sorry the indendetion is bit messed up here)
def generateSyntheticData(fileName):
dataDict = {}
# rowDict = []
innerDict = {}
try:
# for key in range(5):
# count = 0
myFile = open(fileName)
c = 0
#del innerDict[0:len(innerDict)]
for line in myFile:
c += 1
#line = str(line)
n = len(line)
#print 'n: ',n
if n is not 1:
# if c%100 ==0: print "%d: "%c, " entries read so far"
# words = line.replace(' ','_')
words = line.replace('::',' ')
words = words.strip().split()
#print 'userid: ', words[0]
userId = int( words[0]) # i get error here
movieId = int (words[1])
rating =float( words[2])
print "userId: ", userId, " productId: ", movieId," :rating: ", rating
#print words
#words = words.replace('_', ' ')
innerDict = dataDict.setdefault(userId,{})
innerDict[movieId] = rating
dataDict[userId] = (innerDict)
innerDict = {}
except IOError as (errno,strerror):
print "I/O error({0}) :{1} ".format(errno,strerror)
finally:
myFile.close()
print "total ratings read from file",fileName," :%d " %c
return dataDict
But i get the error:
ValueError: invalid literal for int() with base 10: ''
Funny thing is, it is working just fine reading the same format data from other file..
Actually while posting this question, I noticed something weird..
The entry 70154::308933::3
each number has a space.in between like 7 space 0 space 1 space 5 space 4 space :: space 3...
BUt the text file looks fine..:( on copy pasting only it shows this nature..
Anyways.. but any clue whats going on.
Thanks
The "spaces" thay you are seeing appear to be NULs ("\x00"). There is a 99.9% chance that your file is encoded in UTF-16, UTF-16LE, or UTF-16BE. If this is a one-off file, just open it with Notepad and save as "ANSI", not "Unicode" and not "Unicode bigendian". If however you need to process it as is, you'll need to know/detect what the encoding is. To find out which, do this:
print repr(open("yourfile.txt", "rb").read(20))
and compare the srtart of the output with the following:
>>> ucode = u"70154:"
>>> for sfx in ["", "LE", "BE"]:
... enc = "UTF-16" + sfx
... print enc, repr(ucode.encode(enc))
...
UTF-16 '\xff\xfe7\x000\x001\x005\x004\x00:\x00'
UTF-16LE '7\x000\x001\x005\x004\x00:\x00'
UTF-16BE '\x007\x000\x001\x005\x004\x00:'
>>>
You can make a detector that's good enough for your purposes by inspecting the first 2 bytes:
[pseudocode]
if f2b in `"\xff\xfe\xff"`: UTF-16
elif f2b[1] == `"\x00"`: UTF-16LE
elif f2b[0] == `"\x00"`: UTF-16BE
else: cp1252 or UTF-8 or whatever else is prevalent in your neck of the woods.
You could avoid hard-coding the fallback encoding:
>>> import locale
>>> locale.getpreferredencoding()
'cp1252'
Your line-reading code will look like this:
rawbytes = open(myFile, "rb").read()
enc = detect_encoding(rawbytes[:2])
for line in rawbytes.decode(enc).splitlines():
# whatever
Oh, and the lines will be unicode objects ... if that gives you a problem, ask another question.
Debugging 101: simply change the line:
words = words.strip().split()
to:
words = words.strip().split()
print words
and see what comes out.
I will mention a couple of things. If you have the literal UserId::... in the file and you try to process it, it won't take kindly to trying to convert that to an integer.
And the ... unusual line:
if n is not 1:
I would probably write as:
if n != 1:
If, as you indicate in your comment, you end up seeing:
['\x007\x000\x001\x005\x004\x00', '\x003\x000\x008\x009\x003\x003\x00', '3']
then I'd be checking your input file for binary (non-textual) data. You should never end up with that binary information if you're just reading text and trimming/splitting.
And because you state that the digits seem to have spaces between them, you should do a hex dump of the file to find out what's really in there. It may be a UTF-16 Unicode string, for example.

Categories