How to add a software watchpoint following a breakpoint on gdb - python

I am trying to debug a C program which allocates and frees various instances of a particular structure during its lifetime. At some point, one of these instances is getting corrupted.
To debug it, I would like to set watchpoints shortly after these structures are allocated and remove the watchpoints shortly before they are free'd. For that, I wrote a python gdb script (see below) which implements two subclasses of gdb.Breakpoint: BreakpointAlloc() and BreakpointFree(). The former has a stop() method which adds a watchpoint on the allocated structure and the latter has a stop() method which removes the watchpoint. (Watchpoints are kept in a dict indexed by a string containing the address of the allocated instance.)
Due to the large number of instances allocated (over 100), I cannot use hardware watchpoints. When using software watchpoints (by first running gdb.execute("set can-use-hw-watchpoints 0")), however, the program seems to wedge and I can't tell what's happening.
#!/usr/bin/env python
import gdb
wps = {}  # watchpoints, keyed by the address string of the watched instance
class BreakpointAlloc(gdb.Breakpoint):
    def stop(self):
        ptr = gdb.parse_and_eval("ptr").address
        wp = gdb.Breakpoint("*({0})({1})".format(ptr.type, ptr), gdb.BP_WATCHPOINT)
        wps["{0}".format(ptr)] = wp
        return False  # do not stop; let the program continue
class BreakpointFree(gdb.Breakpoint):
    def stop(self):
        ptr = gdb.parse_and_eval("ptr").address
        wp = wps["{0}".format(ptr)]
        wp.delete()
        del wps["{0}".format(ptr)]
        return False
bp_alloc = BreakpointAlloc("prog.c:111")
bp_free = BreakpointFree("prog.c:222")
gdb.execute("set can-use-hw-watchpoints 0")
The documentation for the gdb python API suggests you shouldn't do what I'm doing. I believe that may explain why the program is wedging:
Function: Breakpoint.stop (self)
...
You should not alter the execution state of the inferior (i.e., step, next, etc.), alter the current frame context (i.e., change the current active frame), or alter, add or delete any breakpoint.
Considering the documentation, I have also tried modifying my program to add/del the watchpoints via stop events (see below). When using software watchpoints, the same problem occurs: the program seems to wedge.
#!/usr/bin/env python
import gdb
wps = {}
bp_alloc = gdb.Breakpoint("prog.c:111")
bp_free = gdb.Breakpoint("prog.c:222")
def stopAlloc():
    ptr = gdb.parse_and_eval("ptr").address
    wp = gdb.Breakpoint("*({0})({1})".format(ptr.type, ptr), gdb.BP_WATCHPOINT)
    wps["{0}".format(ptr)] = wp
def stopFree():
    ptr = gdb.parse_and_eval("ptr").address
    wp = wps["{0}".format(ptr)]
    wp.delete()
    del wps["{0}".format(ptr)]
def handleStop(stopEvent):
    # Only breakpoint stops carry a "breakpoints" attribute
    for bp in getattr(stopEvent, "breakpoints", []):
        if bp == bp_alloc:
            stopAlloc()
        elif bp == bp_free:
            stopFree()
gdb.events.stop.connect(handleStop)
gdb.execute("set can-use-hw-watchpoints 0")
Any ideas on how I could manipulate watchpoints off breakpoints?
(Or any other ideas on how to debug this issue?)

Related

LLDB Python ReadMemory returns NoneType

I have a Python script that checks the value of the x0 register, which holds a pointer to some memory address, and reads memory from there. But in my Python script, ReadMemory returns None, and because of that the bytearray call throws an error. In the LLDB console, memory read <x0 value> works. The code is below:
import random
import lldb
debugger = lldb.SBDebugger.Create()
target = debugger.GetTargetAtIndex(0)
def modify_memory(debugger, command, result, internal_dict):
    thread = debugger.GetThread()
    # if called from console itself, debugger argument type is true but if it comes from breakpoint
    # debugger type == SBFrame. SBFrame has GetThread() method; so debugger above is actually frame
    db = lldb.SBDebugger.Create()
    if thread:
        #db.HandleCommand("print \"a\"")
        frame = thread.GetSelectedFrame()
        # Read the value of register x0
        x0 = frame.FindRegister("x0")
        x0_value = x0.GetValue()
        x0_value = int(x0_value,16)
        print(x0_value)
        # Read memory at the address stored in x0
        memory = process.ReadMemory(x0_value+16+4, 256, lldb.SBError())
        print(memory)
        if memory != None:
            print("finally something!")
            # Modify a random byte in the memory
            random_byte = random.randint(2, 255)
            memory = bytearray(memory)
            memory[random_byte] = random.randint(0, 255)
            # Write the modified memory back to the original location
            process.WriteMemory(x0_value, memory, lldb.SBError())
        process.Continue()
    else:
        db.HandleCommand("print \"thread NOT found\"")
        process.Continue()
I add modify_memory as a command to breakpoints.
I tried to run the command from the interactive console, but did not manage to re-create the thread, process, etc. variables. Also, when I add the function modify_memory as a command, the "debugger" argument arrives as an SBDebugger (which is actually correct :) ), but when I attach it to a breakpoint, "debugger" is an SBFrame, which has the GetThread method.
You want to use the newer command definition form:
def command_function(debugger, command, exe_ctx, result, internal_dict):
If you define your function that way, lldb will pass an SBExecutionContext in the exe_ctx parameter that contains the frame/thread/process/target you should act on in the command.
The point is that at any given stop, lldb first queries all the threads and, for each one that has stopped for a reason, runs the relevant callbacks, then decides whether to stop or not, then computes the selected thread. So at the time breakpoint callbacks are being run, the selected thread is still the one from the last time the debuggee stopped.
The original form was really an oversight in the design. We kept the old form around for compatibility reasons, but at this point it's unlikely there are any lldb's around that don't include the more useful form. So that's really the one you want to use.
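A minimal sketch of the newer form applied to the question's script, assuming it runs inside LLDB's embedded Python. The helper name read_x0_memory and its offset/size defaults are illustrative and not part of LLDB; the lldb import sits inside the command function only so that the helper can be exercised outside LLDB.

```python
def read_x0_memory(exe_ctx, error, offset=0, size=256):
    """Read `size` bytes at (x0 + offset) using the context LLDB passed in.

    Hypothetical helper: returns the bytes read, or None if the read failed.
    """
    frame = exe_ctx.GetFrame()      # frame of the thread the command acts on
    process = exe_ctx.GetProcess()
    address = frame.FindRegister("x0").GetValueAsUnsigned() + offset
    data = process.ReadMemory(address, size, error)
    return data if error.Success() else None


def modify_memory(debugger, command, exe_ctx, result, internal_dict):
    import lldb  # only importable inside LLDB's Python

    error = lldb.SBError()
    # 20 mirrors the 16 + 4 offset used in the question
    data = read_x0_memory(exe_ctx, error, offset=20)
    if data is None:
        result.SetError("memory read failed: {}".format(error.GetCString()))
        return
    result.AppendMessage("read {} bytes".format(len(data)))
```

Keeping the SBError in a variable, rather than passing a throwaway lldb.SBError() as in the question, makes it possible to see why a read returned nothing. The function could then be registered with something like command script add -f yourmodule.modify_memory modify_memory, where yourmodule is a placeholder for the imported script's module name.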

How to rewrite a state machine in a clearer style?

I am interacting with an external device, and I have to issue certain commands in order. Sometimes I have to jump back and redo steps. Pseudocode (the actual code has more steps and jumps):
enter_update_mode() # step 1
success = start_update()
if not success:
    retry from step 1
leave_update_mode()
How do I handle this the cleanest way? What I did for now is to define an enum, and write a state machine. This works, but is pretty ugly:
from enum import Enum
class Step(Enum):
    ENTER_UPDATE_MODE = 1
    START_UPDATE = 2
    LEAVE_UPDATE_MODE = 3
    EXIT = 4
def main():
    next_step = Step.ENTER_UPDATE_MODE
    while True:
        if next_step == Step.ENTER_UPDATE_MODE:
            enter_update_mode()
            next_step = Step.START_UPDATE
        elif next_step == Step.START_UPDATE:
            success = start_update()
            if success:
                next_step = Step.LEAVE_UPDATE_MODE
            else:
                next_step = Step.ENTER_UPDATE_MODE
        ....
I can imagine an alternative would be to just call the functions nested. As long as this is only a few levels deep, it should not be a problem:
def enter_update_mode():
    # do stuff ...
    # call next step:
    perform_update()
def perform_update():
    # ...
    # call next step:
    if success:
        leave_update_mode()
    else:
        enter_update_mode()
I have looked into the python-statemachine module, but it seems to be there to model state machines. You can define states and query which state it is in, and you can attach behavior to states. But that is not what I'm looking for. I am looking for a way to write the behavior code in a very straightforward, imperative style, like you would use for pseudocode or instructions to a human.
There is also a module to add goto to Python, but I think it is a joke and would not like to use it in production :-).
Notes:
This code is synchronous, meaning it is a terminal app or a separate thread. Running concurrently with other code would be an added complication. If a solution allows that (e.g. by using yield) that would be a bonus, but not necessary.
I left out a lot of retry logic. A step may only be retried a certain number of times.
Related discussion of explicit state machine vs. imperative style: https://softwareengineering.stackexchange.com/q/147182/62069
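One possible imperative shape for the three-step example, offered as a sketch rather than a general answer: the "retry from step 1" jump becomes the next iteration of a bounded loop. The step functions are passed in as parameters only to keep the sketch self-contained, and max_attempts is an assumed stand-in for the retry limit mentioned in the notes.

```python
def run_update(enter_update_mode, start_update, leave_update_mode, max_attempts=3):
    """Run the update sequence, restarting from step 1 whenever step 2 fails.

    Returns the number of attempts used; raises RuntimeError if all fail.
    """
    for attempt in range(1, max_attempts + 1):
        enter_update_mode()            # step 1
        if start_update():             # step 2 succeeded
            leave_update_mode()        # step 3
            return attempt
        # step 2 failed: the next loop iteration restarts at step 1
    raise RuntimeError("update failed after {} attempts".format(max_attempts))
```

With more steps and jumps, each backward jump target would need its own enclosing loop or helper function, so this style stays readable only while the jumps nest cleanly; jumps that cross each other are where an explicit state machine tends to become the simpler option.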

How to spawn threads in pyobjc

I am learning how to use pyobjc for some basic prototyping. Right now I have a main UI set up and a python script that runs the main application. The only issue is when the script runs, the script runs on the main thread thus blocking the UI.
So this is the sample code snippet that I attempted in Python using the threading module:
def someFunc(self):
    i = 0
    while i < 20:
        NSLog(u"Hello I am in someFunc")
        i = i + 1
@objc.IBAction
def buttonPress(self, sender):
    thread = threading.Thread(target=self.threadedFunc)
    thread.start()
def threadedFunc(self):
    NSLog(u"Entered threadedFunc")
    self.t = NSTimer.NSTimer.scheduledTimerWithTimeInterval_target_selector_userInfo_repeats_(1/150., self,self.someFunc,None, True)
    NSLog(u"Kicked off Runloop")
    NSRunLoop.currentRunLoop().addTimer_forMode_(self.t,NSDefaultRunLoopMode)
When I click the button, the NSLog calls in threadedFunc print to the console, but someFunc is never entered.
So I decided to use NSThread to kick off a thread. On Apple's documentation the Objective-C call looks like this:
+ (void)detachNewThreadSelector:(SEL)aSelector
                       toTarget:(id)aTarget
                     withObject:(id)anArgument
So I translated that to what I interpreted as pyobjc rules for calling objective-c function:
detachNewThreadSelector_aSelector_aTarget_anArgument_(self.threadedFunc, self, 1)
So in context the IBAction function looks like this:
@objc.IBAction
def buttonPress(self, sender):
    detachNewThreadSelector_aSelector_aTarget_anArgument_(self.threadedFunc, self, 1)
But when the button is pressed, I get this message: global name 'detachNewThreadSelector_aSelector_aTarget_anArgument_' is not defined.
I've also tried similar attempts with Grand Central Dispatch, but the same kind of message kept popping up: global name some_grand_central_function is not defined.
Clearly I am not understanding the nuances of Python threads or the PyObjC calling conventions. I was wondering if someone could shed some light on how to proceed.
So I got the result that I wanted by following the structure below. As I stated in my response to the comments: on a background thread, NSThread will not allow you to perform certain tasks (i.e. updating certain UI elements, prints, etc.). So I used performSelectorOnMainThread_withObject_waitUntilDone_ for the things that I needed to perform in between thread operations. The operations were short and not intensive, so it didn't affect performance much. Thank you Michiel Kauw-A-Tjoe for pointing me in the right direction!
def someFunc(self):
    i = 0
    someSelector = objc.selector(self.someSelector, signature='v@:')
    while i < 20:
        self.performSelectorOnMainThread_withObject_waitUntilDone_(someSelector, None, False)
        NSLog(u"Hello I am in someFunc")
        i = i + 1
@objc.IBAction
def buttonPress(self, sender):
    NSThread.detachNewThreadSelector_toTarget_withObject_(self.threadedFunc, self, 1)
def threadedFunc(self):
    NSLog(u"Entered threadedFunc")
    self.t = NSTimer.NSTimer.scheduledTimerWithTimeInterval_target_selector_userInfo_repeats_(1/150., self,self.someFunc,None, True)
    NSLog(u"Kicked off Runloop")
    self.t.fire()
The translated function name should be
detachNewThreadSelector_toTarget_withObject_(aSelector, aTarget, anArgument)
You're currently applying the conversion rule to the arguments part instead of the Objective-C call parts. Calling the function with the arguments from your example:
detachNewThreadSelector_toTarget_withObject_(self.threadedFunc, self, 1)
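The conversion rule described above can be stated mechanically: take the Objective-C selector (the keyword parts with their colons, without argument names or types) and replace each colon with an underscore. A small illustrative helper, whose name is made up for this sketch and is not part of PyObjC:

```python
def objc_selector_to_pyobjc(selector):
    """Translate an Objective-C selector into the Python method name PyObjC uses."""
    # Each colon marks one argument; PyObjC spells it as an underscore.
    return selector.replace(":", "_")


print(objc_selector_to_pyobjc("detachNewThreadSelector:toTarget:withObject:"))
# detachNewThreadSelector_toTarget_withObject_
```

Because detachNewThreadSelector:toTarget:withObject: is a class method of NSThread, the translated name is looked up on the class, as in NSThread.detachNewThreadSelector_toTarget_withObject_(...), which is the form the working code in the question's update uses.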

Python, pyserial program for communicating with Zaber TLSR300B

I'm still new here so I apologize if I make any mistakes or if my question is not specific enough; please correct me! I'm working on a program for controlling two Zaber TLSR300B linear motion tracks in a laser lab via a serial connection. I have no problems communicating with them using pyserial; I've been able to write and read data without trouble. My question is more about how to structure my program to achieve the desired functionality (I have very little formal programming training).
What I would like the program to do is provide a number of methods which allow the user to send commands to the tracks which then return what the track responds. However, I do not want the program to hang while checking for responses, so I can't just write methods with write commands followed by a read command. For some commands, the tracks respond right away (return ID, return current position, etc.) but for others the tracks respond once the requested action has been performed (move to location, move home, etc.). For example, if a move_absolute command is sent the track will move to the desired position and then send a reply with its new position. Also, the tracks can be moved manually with a physical knob, which causes them to continuously send their current position as they move (this is why I need to continuously read the serial data).
I've attached code below where I implement a thread to read data from the serial port when there is data to read and put it in a queue. Then another thread takes items from the queue and handles them, modifying the appropriate values of the ZaberTLSR300B objects which store the current attributes of each track. The methods at the bottom are some of the methods I'd like the user to be able to call. They simply write commands to the serial port, and the responses are then picked up and handled by the continuously running read thread.
The main problem I run into is that those methods will have no knowledge of what the track responds so they can't return the reply, and I can't think of a way to fix this. So I can't write something like:
newPosition = TrackManager.move_absolute(track 1, 5000)
or
currentPosition = TrackManager.return_current_position(track 2)
which is in the end what I want the user to be able to do (also since I'd like to implement some sort of GUI on top of this in the future).
Anyways, am I going about this in the right way? Or is there a cleaner way to implement this sort of behavior? I am willing to completely rewrite everything if need be!
Thanks, and let me know if anything is unclear.
Zaber TLSR300B Manual (if needed): http://www.zaber.com/wiki/Manuals/T-LSR
CODE:
TRACK MANAGER CLASS
import serial
import threading
import struct
import time
from collections import deque
from zaberdevices import ZaberTLSR300B
class TrackManager:
    def __init__(self):
        self.serial = serial.Serial("COM5",9600,8,'N',timeout=None)
        self.track1 = ZaberTLSR300B(1, self.serial)
        self.track2 = ZaberTLSR300B(2, self.serial)
        self.trackList = [self.track1, self.track2]
        self.serialQueue = deque()
        self.runThread1 = True
        self.thread1 = threading.Thread(target=self.workerThread1)
        self.thread1.start()
        self.runThread2 = True
        self.thread2 = threading.Thread(target=self.workerThread2)
        self.thread2.start()
    def workerThread1(self):
        while self.runThread1 == True:
            while self.serial.inWaiting() != 0:
                bytes = self.serial.read(6)
                self.serialQueue.append(struct.unpack('<BBl', bytes))
    def workerThread2(self):
        while self.runThread2 == True:
            try:
                reply = self.serialQueue.popleft()
                for track in self.trackList:
                    if track.trackNumber == reply[0]:
                        self.handleReply(track, reply)
            except:
                continue
    def handleReply(self, track, reply):
        if reply[1] == 10:
            track.update_position(reply[2])
        elif reply[1] == 16:
            track.storedPositions[address] = track.position
        elif reply[1] == 20:
            track.update_position(reply[2])
        elif reply[1] == 21:
            track.update_position(reply[2])
        elif reply[1] == 60:
            track.update_position(reply[2])
    def move_absolute(self, trackNumber, position):
        packet = struct.pack("<BBl", trackNumber, 20, position)
        self.serial.write(packet)
    def move_relative(self, trackNumber, distance):
        packet = struct.pack("<BBl", trackNumber, 21, distance)
        self.serial.write(packet)
    def return_current_position(self, trackNumber):
        packet = struct.pack("<BBl", trackNumber, 60, 0)
        self.serial.write(packet)
    def return_stored_position(self, trackNumber, address):
        packet = struct.pack("<BBl", trackNumber, 17, address)
        self.serial.write(packet)
    def store_current_position(self, trackNumber, address):
        packet = struct.pack("<BBl", trackNumber, 16, address)
        self.serial.write(packet)
zaberdevices.py ZaberTLSR300B class
class ZaberTLSR300B:
    def __init__(self, trackNumber, serial):
        self.trackNumber = trackNumber
        self.serial = serial
        self.position = None
        self.storedPositions = []
    def update_position(self, position):
        self.position = position
There is an option on the controllers to disable all of the replies that don't immediately follow a command. You can do this by enabling bits 0 and 5 of the Device Mode setting. These bits correspond to 'disable auto-replies' and 'disable manual move tracking'. Bit 11 is enabled by default, so the combined value to enable these bits is 2081.
My recommendation would be to disable these extra responses so that you can then rely on the predictable and reliable command->response model. For example to move a device you would send the move command, but never look for a response from that command. To check whether the movement has completed, you could either use the Return Current Position command and read the position response, or use the Return Status (Cmd_54) command to check whether the device is busy (i.e. moving) or idle.
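A rough sketch of that command->response model, reusing the 6-byte "<BBl" packet layout and the command numbers 20 and 60 from the question plus command 54 from this answer. The function names, the polling interval, the timeout and the assumption that a status value of 0 means idle are mine and should be checked against the manual; port is any object with pyserial-style write() and read() methods, opened with a read timeout so that read() cannot block forever.

```python
import struct
import time

# Device Mode bits named above: 0 and 5 are switched on, 11 is on by default.
DISABLE_AUTO_REPLY = 1 << 0
DISABLE_MANUAL_MOVE_TRACKING = 1 << 5
DEFAULT_ENABLED_BIT_11 = 1 << 11
DEVICE_MODE = DISABLE_AUTO_REPLY | DISABLE_MANUAL_MOVE_TRACKING | DEFAULT_ENABLED_BIT_11

CMD_MOVE_ABSOLUTE = 20
CMD_RETURN_STATUS = 54
CMD_RETURN_CURRENT_POSITION = 60
STATUS_IDLE = 0  # assumption: confirm the idle status code in the manual


def query(port, track_number, command, data=0):
    """Send one command and block until its 6-byte reply arrives."""
    port.write(struct.pack("<BBl", track_number, command, data))
    reply = port.read(6)
    if len(reply) != 6:
        raise TimeoutError("no reply to command {}".format(command))
    return struct.unpack("<BBl", reply)


def move_absolute(port, track_number, position, poll_interval=0.05, timeout=30.0):
    """Start a move, poll until the track reports idle, return its position."""
    # With auto-replies disabled, the move command itself is not answered.
    port.write(struct.pack("<BBl", track_number, CMD_MOVE_ABSOLUTE, position))
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        _, _, status = query(port, track_number, CMD_RETURN_STATUS)
        if status == STATUS_IDLE:
            _, _, new_position = query(port, track_number, CMD_RETURN_CURRENT_POSITION)
            return new_position
        time.sleep(poll_interval)
    raise TimeoutError("track {} still busy after {} s".format(track_number, timeout))
```

Under these assumptions a call such as new_position = move_absolute(port, 1, 5000) returns a value directly, which is the calling style the question asks for, at the cost of blocking the caller while the track moves. DEVICE_MODE evaluates to 2081, matching the combined value given above.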
So, the way I went about this when a multithreaded application needed to share the same data was to implement my own "locker". Basically, your threads would call this CheckWrite method, which returns true or false. If it returns true, it sets the lock to true, and the thread then uses the shared resource by calling another method inside of it; after that completes, it sets the lock to false. If it returns false, the thread waits some time, then tries again. CheckWrite would be its own class. You can look online for some multithreading lock examples and you should be golden. Also, this is all based on my interpretation of your code and description above... I like pictures.
[Image: Custom Locks Threading python]
EDIT
My fault. When your ListenerThread obtains data and needs to write it to the response stack, it will need to ask permission. So it will call a static method, CheckWrite, which takes the data as a parameter and returns a bit or boolean. Inside this function, it checks whether the stack is in use; we will call this state Locked. If it is locked, the method returns false, and your listener waits some time and then tries again. If it is unlocked, the method sets it to locked and proceeds to write to the response stack. Once it is finished, it unlocks it. On your response stack, you will have to implement the same functionality; it slipped my mind when I created the graphic. Your main program, or whatever wants to read the data, will need to ask permission to read, lock the Locker value, and proceed to read the data and clear it. Also, you can get rid of the stack altogether and have a method inside that write method that goes ahead and conducts logic on the data. I would separate it out, but to each their own.
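The ask-permission, use, release cycle described here can be sketched with the standard library's threading.Lock in place of a hand-rolled Locked flag. This is a substitution on my part, not the custom class the answer has in mind: a plain boolean that is checked and then set in two separate steps can let two threads both see "unlocked", whereas a Lock performs that check-and-set atomically and blocks instead of sleeping and retrying. The class and method names below are illustrative.

```python
import threading


class ReplyStore:
    """Shared list of replies guarded by a lock (illustrative names)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._replies = []

    def add(self, reply):
        # Called by the listener thread: waits until the lock is free.
        with self._lock:
            self._replies.append(reply)

    def drain(self):
        # Called by the reader: take everything stored so far and clear it.
        with self._lock:
            replies, self._replies = self._replies, []
        return replies
```

The listener thread would call add() for each decoded packet and the main program would call drain() whenever it wants the pending replies. If the store only needs first-in, first-out hand-off between threads, queue.Queue from the standard library already wraps this pattern.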

Python Memory Leak - Why is it happening?

For some background on my problem, I'm importing a module, data_read_module.pyd, written by someone else, and I cannot see the contents of that module.
I have one file; let's call it myfunctions. Ignore the ### for now; I'll comment on the commented portions later.
import data_read_module
def processData(fname):
    data = data_read_module.read_data(fname)
    ''' process data here '''
    return t, x
    ### return 1
I call this within the framework of a larger program, a TKinter GUI specifically. For purposes of this post, I've pared down to the bare essentials. Within the GUI code, I call the above as follows:
import myfunctions
class MyApplication:
    def __init__(self,parent):
        self.t = []
        self.x = []
    def openFileAndProcessData(self):
        # self.t = None
        # self.x = None
        self.t,self.x = myfunctions.processData(fname)
        ## myfunctions.processData(fname)
I noticed that every time I run openFileAndProcessData, Windows Task Manager reports that my memory usage increases, so I thought that I had a memory leak somewhere in my GUI application. So the first thing I tried is the
# self.t = None
# self.x = None
that you see commented above. Next, I tried calling myfunctions.processData without assigning the output to any variables as follows:
## myfunctions.processData(fname)
This also had no effect. As a last ditch effort, I changed the processData function so it simply returns 1 without even processing any of the data that comes from the module, data_read_module.pyd. Unfortunately, even this results in more memory being taken up with each successive call to processData, which narrows the problem down to data_read_module.read_data. I thought that within the Python framework, this is the exact type of thing that is automatically taken care of. Referring to this website, it seems that memory taken up by a function will be released when the function terminates. In my case, I would expect the memory used in processData to be released after a call [with the exception of the output that I am keeping track of with self.t and self.x]. I understand I won't get a fix to this kind of issue without access to data_read_module.pyd, but I'd like to understand how this can happen to begin with.
A .pyd file is basically a DLL. You're calling code written in C, C++, or another such compiled language. If that code allocates memory and doesn't release it properly, you will get a memory leak. The fact that the code is being called from Python won't magically fix it.
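One hedged way to check this from the Python side is the standard library's tracemalloc module, which only sees memory requested through Python's own allocators. Memory that compiled code obtains directly from the C runtime generally does not show up there, so a leak of that kind would appear as growth in Task Manager without matching growth in tracemalloc. The helper name and the number of repeats below are illustrative.

```python
import tracemalloc


def python_level_growth(func, repeats=5):
    """Return the change in Python-tracked memory (bytes) over `repeats` calls."""
    tracemalloc.start()
    try:
        func()  # warm-up call so one-time caches are not counted
        before, _peak = tracemalloc.get_traced_memory()
        for _ in range(repeats):
            func()
        after, _peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return after - before
```

For the code in the question, something like python_level_growth(lambda: myfunctions.processData(fname)) staying near zero while the process's memory keeps climbing would be consistent with the leak living inside data_read_module.pyd rather than in the Python code, though it would not prove it.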
