Entering Strings of Bytes into C Program User Input

I'm working on a problem that involves performing a buffer overflow on a C program. The program asks the user for command-line input, and I need to use this input to enter my byte string, including NOPs, shellcode, return address, etc. The problem is that the C program interprets everything I enter as a literal string, so if I enter something like "\xff\xab\x12..." the program treats each separate character as a different byte. I can enter individual bytes by creating a string of the corresponding ASCII characters, but this limits me to byte values 0x00-0x7F. I'm wondering whether there's some way I can use Python to enter the byte string, or to format the user input so that the C program interprets it the way I need. Please note that I don't know what the return address in the byte string will be until after I start running the program, so I can't build the byte string before running the C program. I'm working in a Linux x86 shell, and I do not have the source code for the C program.
I've tried calling python to print the string I need into the input using commands like,
$(python -c 'print "\x41" * 205 + "\x34\x86")
but the C program just interprets all these characters literally. I've also tried using extended ASCII characters but the C program doesn't seem to interpret them correctly, associating them all with an unknown character symbol and 0xc3 hex value. Is it possible there is some other character/hex mapping that the C program may be using?
If anyone knows a way I can enter the input the way I need to, your help would be very much appreciated.
thanks

You were missing the closing single quote (') in your shell call to python. Additionally:
If your executable takes its input in the form of a CLI argument:
$ ./yourprog $(python -c 'print "\x41" * 205 + "\x34\x86"')
If it takes its input in the form of STDIN:
$ python -c 'print "\x41" * 205 + "\x34\x86"' | ./yourprog
Replace "./yourprog" with the name of your executable.
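The commands above rely on Python 2's print. As a rough Python 3 counterpart for the CLI-argument case, the sketch below builds the same payload as bytes and passes it as an argument without going through a shell. The names build_payload and argv_bytes_seen_by_child are hypothetical, and a small Python child process stands in for ./yourprog (an assumption made so the example can run on any Linux machine).

```python
import os
import subprocess
import sys


def build_payload(pad_len: int = 205, tail: bytes = b"\x34\x86") -> bytes:
    """Build the padding plus trailing bytes as a bytes object (no text encoding)."""
    return b"\x41" * pad_len + tail


# Stand-in for the target program (assumption): it reports the raw bytes
# it received in argv[1] as a hex string.
CHILD = b"import os, sys; print(os.fsencode(sys.argv[1]).hex())"


def argv_bytes_seen_by_child(payload: bytes) -> str:
    """Run the stand-in child with payload as argv[1]; no shell is involved."""
    result = subprocess.run(
        [os.fsencode(sys.executable), b"-c", CHILD, payload],
        capture_output=True,
        check=True,
    )
    return result.stdout.decode().strip()


if __name__ == "__main__":
    payload = build_payload()
    print(argv_bytes_seen_by_child(payload) == payload.hex())
```

Because no shell parses the argument, whitespace bytes in the payload are not split into separate words. A payload passed this way still cannot contain a zero byte, since each argument is a NUL-terminated string.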

Related

Python3 handling non-ASCII characters in a weird way

I was trying to solve a pwnable with Python 3. For that I need to print some characters that are not in the ASCII range.
Python 3 is converting these characters into some weird Unicode.
For example if I print "\xff" in Python 3, I get this:
root@kali:~# python3 -c 'print("\xff")' | xxd
00000000: c3bf 0a ...
\xff gets converted to \xc3\xbf
But in Python 2 it works as expected, like this:
root@kali:~# python -c 'print("\xff")' | xxd
00000000: ff0a ..
So how can I print it like that in Python 3?

In Python 2, print '\xff' writes a bytes string directly to the terminal, so you get the byte you print.
In Python 3, print('\xff') encodes the Unicode character U+00FF to the terminal using the default encoding...in your case UTF-8.
To directly output bytes to the terminal in Python 3 you can't use print, but you can use the following to skip encoding and write a byte string:
python3 -c "import sys; sys.stdout.buffer.write(b'\xff')"

In Python 2, str and bytes were the same thing, so when you wrote '\xff', the result contained the actual byte 0xFF.
In Python 3, str is closer to Python 2's unicode object, and is not an alias for bytes. \xff is no longer a request to insert a byte, but rather a request to insert a Unicode character whose code can be represented in 8 bits. The string is printed with your default encoding (probably UTF-8), in which character 0xFF is encoded as the bytes \xc3\xbf. \x is basically the one-byte version of \u when it appears in a string. It's still the same thing as before when it appears in a bytes though.
Now for a solution. If you just want some bytes, do
b'\xff'
That will work the same as in Python 2. You can write these bytes to a binary file, but you can't then print directly, since everything you print gets converted to str. The problem with printing is that everything gets encoded in text mode. Luckily, sys.stdout has a buffer attribute that lets you output bytes directly:
sys.stdout.buffer.write(b'\xff\n')
This will only work if you don't replace sys.stdout with something fancy that doesn't have a buffer.
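The difference between the two output paths can be reproduced without a terminal. This is a minimal sketch, assuming a UTF-8 text stream: an in-memory io.TextIOWrapper stands in for sys.stdout, and demo_print_vs_buffer is a hypothetical name.

```python
import io


def demo_print_vs_buffer():
    """Return the bytes produced by print() and by a direct buffer write."""
    sink = io.BytesIO()
    # Stand-in for sys.stdout (assumption): a UTF-8 text layer over a byte sink.
    fake_stdout = io.TextIOWrapper(sink, encoding="utf-8", newline="\n")

    # Text path: the character U+00FF is encoded by the text layer.
    print("\xff", file=fake_stdout)
    fake_stdout.flush()
    printed = sink.getvalue()

    # Byte path: .buffer bypasses the encoder entirely.
    fake_stdout.buffer.write(b"\xff\n")
    fake_stdout.buffer.flush()
    written = sink.getvalue()[len(printed):]
    return printed, written


if __name__ == "__main__":
    printed, written = demo_print_vs_buffer()
    print(printed.hex(), written.hex())
```

The text path yields c3 bf 0a, matching the xxd output in the question, while the byte path yields ff 0a.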

Python: Image's path as a raw string an input to a function

Python: I want to get an image path from the user as a raw string. I used input() to get the path. Giving the path as a raw string makes the program work: in code I can do that by putting r before the path literal, but when I prepend 'r' to the input, Image.open() treats the r as part of the string and produces an error. Can someone help me resolve this problem?
path=input('Please enter the path of the image')
im=Image.open(path)
I get an error saying that no file was found.
If I instead use
y='r'+path
im=Image.open(y)
then the error is
OSError: [Errno 22] Invalid argument: 'rC:\\Users\\User\\Desktop\.......jpeg'
I am new to python, so please help me if there is any method by which I can solve this issue.

raw strings are for a programmer's convenience; you don't have to have your users enter raw strings as normal input.
See the end of this post for the solution to your problem. Because you said you are new to Python, I have decided to give a detailed answer here.
Why raw strings?
Normal strings assign special meaning to the \ (backslash) character. This is fine as \ can be escaped by using \\ (two backslashes) to represent a single backslash.
However, this can sometimes become ugly.
Consider, for example, a path: C:\Users\Abhishek\test.txt. To represent this as a normal string in Python, all \ must be escaped:
string = 'C:\\Users\\Abhishek\\test.txt'
You can avoid this by using raw strings. Raw strings don't treat \ specially.
string = r'C:\Users\Abhishek\test.txt'
That's it. This is the only use of raw strings, viz., convenience.
Solution
If you are using Python 2, use raw_input instead of input. If you are using Python 3 (as you should be) input is fine. Don't try to input the path as a raw string.
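A short sketch of the points above: the raw and escaped literals are the same string, and prepending the letter r at runtime only changes the text. The clean_path helper is hypothetical; it assumes the "no file found" error came from stray whitespace or quotes in what the user typed, which the question does not confirm.

```python
# Two spellings of the same string: raw literals only change how source is parsed.
escaped = 'C:\\Users\\Abhishek\\test.txt'
raw = r'C:\Users\Abhishek\test.txt'

# Prepending 'r' at runtime just adds a character to the path.
broken = 'r' + raw


def clean_path(typed: str) -> str:
    """Hypothetical cleanup for typed paths: drop surrounding spaces and quotes."""
    return typed.strip().strip('"\'')


if __name__ == "__main__":
    print(escaped == raw)
    print(broken)
    print(clean_path(' "C:\\Users\\Abhishek\\test.txt" '))
```

The string returned by input() already contains single literal backslashes, so it can be passed to Image.open() as it is.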

Hex string to ASCII conversion with errors?

I am trying to write a python script to convert a hex string into ASCII and save the result into a file in .der cert format. I can do this in Notepad++ using the conversion plugin, but I would like to find a way to do this conversion in a python script from command line, either by invoking the notepad++ NppConverter plugin or using python modules.
I am part way there, but my conversion is not identical to the ASCII output seen in Notepad++. Below is a snippet of the output in Notepad++
But my python conversion is displaying a slightly different output below
As you can see, my script causes missing characters in the output, and to be honest I don't know why certain blocks are outlined in black. But these missing blocks are needed in the same format as in the first picture.
Here's my basic code. I am working in Python 3, and I am using the backslashreplace error handler as this is the only way I can get the problematic hex to appear in the output file
result = bytearray.fromhex('380c2fd6172cd06d1f30').decode('ascii', 'backslashreplace')
text_file = open("C:\Output.der", "w")
text_file.write(result)
text_file.close()
Any guidance would be greatly appreciated.

MikG, I would say that python did exactly what you requested.
You told it to decode the bytes to a string, and to replace bytes that have the most significant bit set with an escape sequence (except for \xFF char).
Characters \x17 (ETB) and \x1F (US) are perfectly legal ASCII chars (though non-printable), and they are written out using their literal value.
Characters \xd6 and \xd0 are illegal in ASCII: they need all 8 bits. They are written as a 4-character escape sequence, as you asked: "\" (backslash char) followed by "xd6" / "xd0".
I'm not good with DER, but I suppose that you expect to have raw 8-bit sequences. Here is how this could be accomplished:
result = bytearray.fromhex('380c2fd6172cd06d1f30')
with open("Output.der", "wb") as text_file:
    text_file.write(result)
Please note the "wb" mode passed to open: it tells Python to do binary I/O.
I also used a with statement to ensure that text_file is closed whatever happens during write.
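To see the two behaviours side by side, this sketch decodes the same hex string with backslashreplace and also writes the untouched bytes to a file in binary mode. The roundtrip helper and the temporary directory are assumptions made for illustration; only the hex string and the Output.der name come from the post.

```python
import os
import tempfile

raw = bytes.fromhex('380c2fd6172cd06d1f30')

# Text route: non-ASCII bytes become literal backslash escapes (4 characters each).
as_text = raw.decode('ascii', 'backslashreplace')


def roundtrip(data: bytes) -> bytes:
    """Write data in binary mode to a temporary Output.der and read it back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "Output.der")
        with open(path, "wb") as der_file:
            der_file.write(data)
        with open(path, "rb") as der_file:
            return der_file.read()


if __name__ == "__main__":
    print(len(raw), len(as_text))
    print(roundtrip(raw) == raw)
```

The 10 input bytes turn into 16 characters on the text route, because 0xd6 and 0xd0 each expand to four characters, while the binary route preserves all 10 bytes.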

How to give raw bytes as parameter to C program

I'm working on an online CTF challenge and I need to somehow give raw bytes to this compiled C program. I've tried the following using python:
./program `python -c 'print "\x00\x00"'`
... but for some reason that doesn't seem to be giving me what I'm expecting. Is there some conversion/formatting that's happening that I'm not aware of? I would expect this to give raw bytes as an argument.

Command line args in C are an array of 0 terminated strings. There is no way to pass "raw bytes" (any 0 byte won't behave as expected).
I'd suggest having the program read the bytes either from stdin or from a file specified on the command line.
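The stdin route can be sketched as follows. A small Python child process stands in for the compiled C program (an assumption), and send_via_stdin is a hypothetical helper name.

```python
import subprocess
import sys

# Stand-in for the target program (assumption): it echoes its stdin as hex.
CHILD = "import sys; data = sys.stdin.buffer.read(); print(data.hex())"


def send_via_stdin(data: bytes) -> str:
    """Feed raw bytes, zero bytes included, to the child's standard input."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        input=data,
        capture_output=True,
        check=True,
    )
    return result.stdout.decode().strip()


if __name__ == "__main__":
    print(send_via_stdin(b"\x00\x00"))
```

Unlike a command-line argument, a pipe carries zero bytes unchanged, so the child here receives both of them.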

incorrect strcpy in buffer from python input

I'm trying to fill a simple buffer in C with an input generated with Python. This is practice for a ROP project. Here's the simple C-code:
#include <string.h>
int main(int argc, char **argv)
{
    char buf[128];
    strcpy(buf, argv[1]);
}
compiled as: gcc -m32 -ggdb -fno-stack-protector -mpreferred-stack-boundary=2 test.c -o test
my hardware: x86-64, Linux Mint.
Here's part of the python input:
from struct import pack
p = '//bin/sh' #address 0xffffd15c
p += 'A'*28
#null terminate our string
p += pack("<I", 0x0806e67a) # pop edx ; ret
p += pack("<I", 0xffffd163) # # "/bin/sh" + 7
p += pack("<I", 0x080bac56) # pop eax ; ret
p += pack("<I", 0xffffffff) # 0xffffffff, or could xor the instruction
p += pack("<I", 0x0807b0cf) # inc eax ; ret
p += pack("<I", 0x08099fad) # mov dword ptr [edx], eax ; ret
For some reason, when I pass this as argv[1], the buffer fills correctly up until the last line. Instead of filling the buffer with 0x08099fad, it holds 0x00009fad. There's more input after this line, but this is where it goes wrong, causing the rest of the input to be junk (not what I entered).
For some reason it seems like a null byte was put into strcpy, possibly terminating it prematurely. But I don't know where the null byte is. The same happens when I try to input this address, as well later on: 0x080acedc.
Any thoughts?
Thanks!

I presume you are providing that string as a command-line argument to your C utility. (By the way, test is not a good name for a utility since it is a standard shell function often implemented as a builtin.)
Now suppose you were to invoke your utility from the terminal:
./test some thing
Clearly argv[1] would consist of a four-character word, with another word being placed in argv[2]. Had you wanted the single argument to be the entire rest of the command-line, you would need to quote it:
./test "some thing"
Now, normally when we invoke a utility from a program, we don't actually want the arguments to be interpreted by a shell. We would like to just exec the process with an argv array with the actual argument strings. That way, we don't have to worry about whitespace and shell metacharacters, and tear our hair out trying to correctly quote an arbitrary string.
But for the benefit of masochists, Python provides the possibility of specifying shell=True. Even though the manual clearly warns against using this option, and even though people routinely get into trouble using it, it continues to be an oddly popular choice.
By the way, there is no space in your generated string (although there could be). Space is 0x20. But the shell interprets other bytes as whitespace. For example, tab is 0x09. I'll leave it as an exercise to figure out what the consequence of 0x0A is.
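A quick way to check a payload for such bytes is sketched below. It assumes the shell's default word-splitting set (space, tab, newline); find_shell_whitespace is a hypothetical helper, and the two addresses are the ones quoted in the question.

```python
from struct import pack

# Bytes an unquoted shell expansion splits on by default (assumed default IFS).
SHELL_WHITESPACE = {0x20: "space", 0x09: "tab", 0x0A: "newline"}


def find_shell_whitespace(payload: bytes):
    """Return (offset, name) for every byte the shell would treat as a separator."""
    return [
        (offset, SHELL_WHITESPACE[value])
        for offset, value in enumerate(payload)
        if value in SHELL_WHITESPACE
    ]


# The two gadget addresses from the question, packed little-endian.
gadget_a = pack("<I", 0x08099fad)  # bytes: ad 9f 09 08
gadget_b = pack("<I", 0x080acedc)  # bytes: dc ce 0a 08

if __name__ == "__main__":
    print(find_shell_whitespace(gadget_a))
    print(find_shell_whitespace(gadget_b))
```

Both addresses contain a separator in their third byte. If the shell splits the argument there, argv[1] ends after ad 9f, which is consistent with the 0x00009fad seen in the buffer.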

Just so that someone who searches for this gets actual help:
I encountered the same problem today.
What I found out is that Python itself re-encodes these characters when printing.
If you write the same program in C, it will work.
If you print a variable like this:
jmpto = "\xfb\x84\x04\x08"
print(jmpto)
and then save the output to a file and view it in a hex editor, you will see that it actually printed:
"C3 BB C2 84 04 08"
When I instead tried the same with:
jmpto = "\x41\x42\x43\x44"
print(jmpto)
looking at it in a hex editor it printed:
"41 42 43 44"
I sadly don't know a solution as to how to print those characters correctly using python.
Simplest solution seems to be writing it in C.
P.S.: @weather-vane, what is the point in shaming someone for being interested in learning how it works under the hood?
Security by obscurity (i.e. not telling anyone so that no one finds out) doesn't work.
Someone is going to break it.
Better to show interested white hats how to do it; they might try to fix those problems.
EDIT:
I found the solution thanks to someone at Fraunhofer FKIE.
Using sys.stdout.buffer.write(ex_str)
Where ex_str needs to be of type bytes.
First create your string as a bytearray, then convert it to the bytes type:
import sys
#convert this Hex Address or Hex ASM Code to int: fb 84 04 08
jmpto = [251, 132, 4, 8]
ex_str = bytes(bytearray(b"A"*(132 + 4)) + bytearray(jmpto))
sys.stdout.buffer.write(ex_str)
You can also use subprocess.call() or subprocess.run() to start the executable and pass it the bytes Object.
Hope someone found this helpful.
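As an alternative to converting the address to a list of integers by hand, struct.pack can produce the same little-endian bytes. This is a sketch under the assumption that the address is 0x080484fb (the value implied by the bytes fb 84 04 08 above); build_exploit_bytes is a hypothetical name.

```python
import struct
import sys


def build_exploit_bytes(address: int, pad_len: int = 132 + 4) -> bytes:
    """Padding followed by a 32-bit little-endian address, all as bytes."""
    return b"A" * pad_len + struct.pack("<I", address)


def main() -> None:
    # Write the raw bytes to stdout without any text encoding.
    sys.stdout.buffer.write(build_exploit_bytes(0x080484FB))


# Call main() from a script to emit the payload, e.g. when piping it
# into the target program.
```

The result is byte-for-byte the same as the bytearray construction above, with the address kept readable in the source.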
