I am developing a rather large Python application (wxPython) that supports a data analysis workflow. Performing all steps of the workflow can take quite a long time, and the user is not likely to do everything at once. More likely they would prefer to do different parts of the processing at different points in time. It would therefore be very handy to be able to store the application's current status with some sort of "save project" functionality. Opening the application and loading a project file would set up the application as it was previously and allow the user to continue where they left off last time.
However, I have a large number of objects to save, most of which carry attributes coming from wxPython. This causes pickle to fail with the following error:
TypeError: can't pickle PySwigObject objects
Does anyone have experience with this? What would be a best practice to obtain the required functionality? Are there libraries devoted to this?
Thank you.
wxPython is a wrapper around a C++ library known as wxWidgets, so you cannot use normal Python serialization to save its state. However, you can use the persist library to save most widgets' state: http://wxpython.org/Phoenix/docs/html/lib.agw.persist.html
I'm not sure when this library was added to wxPython, but I'm guessing it was with 2.9 or perhaps the latest version of 2.8. Otherwise you can probably find it in the latest version of 2.8's source.
As others have said, it's usually better to just save the process's state and then load that information back to the GUI when it's started.
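A minimal sketch of that last approach, assuming the workflow state can be reduced to plain Python values. The `ProjectState` fields and the `save_project` / `load_project` helpers are hypothetical names chosen for illustration, not part of wxPython; the point is only that the saved object holds no widgets, so serialization never touches a PySwigObject.

```python
import json
from dataclasses import dataclass, asdict, field


@dataclass
class ProjectState:
    """Plain data only: no wx objects, so it serializes cleanly."""
    input_path: str = ""
    completed_steps: list = field(default_factory=list)
    parameters: dict = field(default_factory=dict)


def save_project(state, path):
    # Write the state as JSON; the GUI is not involved at all.
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(asdict(state), fh, indent=2)


def load_project(path):
    # Rebuild the state object; the GUI then repopulates its widgets from it.
    with open(path, encoding="utf-8") as fh:
        return ProjectState(**json.load(fh))
```

On startup the application would call `load_project` and push the values into its widgets, rather than trying to restore the widgets themselves.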
TL;DR: Is there a Python library that allows me to grab an application window's frame as an image and write it back to that application?
So the whole story is that I want to write an application using Python that does something similar to Lossless Scaling and Magpie. I want to grab an application window (a videogame window, for example), get the current frame as an image, then use some Machine Learning/Deep Learning algorithm (like FSR or DLSS) to upscale said image, then rewrite the current frame from the application with said upscaled image.
So far, I have been playing around with some upscaling algorithms like the one from Real-ESRGAN, but now my main problem is how to upscale the video game images in real-time. The only thing I found that does something related to what I need to do is PyAutoGUI. But this package only allows you to take screenshots of an application but not rewrite the graphics of said application.
I hope I have clarified my problem; feel free to comment if you still have any questions.
Thank you for reading this post, and have a good day.
Doing this with Python is going to be very difficult. A lot of the performance involved in this sort of thing is in avoiding as many memory copies as possible, and Python's idiom for string and bytes processing unfortunately makes quite a few additional copies in the course of any idiomatic program. I say this as a die-hard Python fan who is constantly trying to cram Python in everywhere it doesn't belong: you'd be better off doing this in Rust.
Update: After receiving some feedback from some folks with more direct experience in this sort of thing, I may have overstated the difficulty here. Many ML tools in Python provide zero-copy access, you can easily access and manipulate memory-mapped data from numpy and there is even a CUDA protocol for doing this to data in GPU memory, so while it's not exactly easy, as long as your operations are implemented as numpy operations and not as pure-python pixel-by-pixel logic, it shouldn't be much harder than other python machine learning applications which require access to native APIs for accessing their source data.
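To make the copy-versus-view distinction concrete, here is a small standard-library sketch. The 4x4 `frame` buffer is a made-up stand-in for real frame data; NumPy arrays expose the same idea through the buffer protocol, but this example deliberately avoids any third-party dependency.

```python
# Stand-in for a 4x4 single-channel frame buffer (hypothetical data).
frame = bytearray(4 * 4)

# Slicing a bytearray allocates a new object and copies the bytes.
copied = bytes(frame[0:4])

# Slicing a memoryview only creates a new window onto the same memory.
view = memoryview(frame)[0:4]

# Write to the original buffer after both slices were taken.
frame[0] = 255

# The copy is stale, while the view sees the new value.
copy_sees_write = copied[0] == 255
view_sees_write = view[0] == 255
```

Here `copy_sees_write` ends up false and `view_sees_write` true, which is the property that matters when every frame is several megabytes.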
However, there's no way to access framebuffer data directly from python, so step 1 is going to be writing your own bindings over the relevant DirectX APIs. Since Magpie is open source, you can see which APIs it's using, for example, in its various C++ "Frame Source" backends. For example, this looks relevant: https://github.com/Blinue/Magpie/blob/42cfcba1222b07e4cec282eaff639aead229f123/Runtime/GraphicsCaptureFrameSource.cpp#L87
You can then look those APIs up on MSDN; that one, for example, is here: https://learn.microsoft.com/en-us/uwp/api/windows.graphics.capture.direct3d11captureframepool.createfreethreaded?view=winrt-22621
CFFI is a good choice for writing native wrappers: https://cffi.readthedocs.io/en/latest/
Gluing these together appropriately is left as an exercise for the reader :).
Each time I make big structural changes to my code (in flow or other features), I save it under a new version number...
so when I use the code it looks like this:
import custom3 as c
function = c.do_thing()
as I save up to custom4, I change it to
import custom4 as c
function = c.do_thing()
very simple update.
My problem is that I have many scripts where I'm using import custom# as c, so when I update the version number, I have to go back and change the number everywhere.
Is there a way to centrally control this? Basically dynamically importing a library using another script? I guess I can use something like modules = map(__import__, moduleNames) and keep a spreadsheet of latest version? And write a script to access that file first every time?
Has anybody implemented anything else more elegant?
The way to do this that the pros use is not to create different modules for different versions, but to use a version control system to manage and track changes to the same module.
A good version control system will do the following:
allow you to keep and view a history of changes to your module
allow you to mark your versions with a meaningful annotation, e.g. "develop", "release"
allow you to recover from mistakes and revert back to another earlier version without having to rewrite code
allow you to share your work with other developers.
There are many version control systems available; some are proprietary, but others are available for free. Git is probably the most popular open source system at the moment, and can scale from a lone developer to a large team. Plus there is already a whole ecosystem of code sharing available with GitHub.
As you learn programming, take the time to learn and use version control. You won't regret it.
You can use importlib.
import importlib
version = "3"
c = importlib.import_module("custom"+version)
function = c.do_thing()
But yeah, as suggested in the comments, use some file versioning system like git. The learning curve is a bit steep, but it would make your life a lot easier.
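If you do stay with numbered modules for a while, one way to centralize the number is a tiny loader module that every script imports. This is only a sketch: `CURRENT_VERSION` and `load_custom` are hypothetical names, and the file is assumed to sit next to the `custom3.py`, `custom4.py`, ... modules.

```python
import importlib

# The only place the version number is ever edited.
CURRENT_VERSION = "4"


def load_custom(version=CURRENT_VERSION, prefix="custom"):
    """Import and return e.g. the `custom4` module."""
    return importlib.import_module(f"{prefix}{version}")
```

A script would then do `c = load_custom()` followed by `c.do_thing()`, and pass an explicit `version="3"` only when it needs to pin an older one.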
I am using Python inside another application (CINEMA 4D) to create a nice connection to our issue tracker (Jira) inside the application. The rationale behind this is to make it really easy for our plugin users to report and track bugs, and to have things like machine specs, screenshots or scene files (including textures) attached automatically.
So far it has been a really smooth ride and the integration is coming along great. I started grabbing the icons for issue priorities, projects, issue types, etc. from Jira as well so they can be displayed for better overview. To read the image files I am using CINEMA 4D functionality that is available inside its Python binding.
The problem now is, that most icons from Jira come in GIF format and the CINEMA 4D SDK doesn't read GIF files directly (actually it does read them, but only through a back door so users can load them, but I can't use that functionality through Python or the SDK). So I need another way to read the GIF files.
There are a few questions on stackoverflow that go into this direction, but they all seem to recommend PIL. This doesn't feel like the right solution for a few reasons:
While that looks nice, it's not part of the standard distribution and seems to be really only maintained for Windows (even though there are builds for Mac OS X).
It also seems to install itself into the current system installation of Python, but CINEMA 4D comes with its own, so I'd have to rip it apart and distribute it with my plugin.
And then it is quite large, while I really only want a compact script to have a compact solution (preferably out of the box, but that doesn't seem to be an option)
I was wondering if there is a simpler or at least more compact way. Since GIF seems to be a relatively simple file format, I am wondering if there may even be a simple parser as a python function/class.
I found a link where somebody disassembles a GIF file's embedded frames, but doesn't actually grab the image contents: Python, how i can get gif frames
I'm fine with putting in some time on my own, and I would've already been coding away if the file format was something uncompressed, but I am a little reluctant since the compression seems to raise the bar slightly.
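For a sense of scale, the uncompressed part of the format really is small. The sketch below reads only the GIF signature and logical screen descriptor with the standard `struct` module; it does not decode the LZW-compressed pixel data, which is where most of the work lies. `read_gif_header` is a hypothetical helper name.

```python
import struct


def read_gif_header(data):
    """Return (version, width, height, has_global_palette, palette_size)."""
    if data[:3] != b"GIF":
        raise ValueError("not a GIF file")
    version = data[3:6].decode("ascii")  # "87a" or "89a"
    # Logical screen descriptor: width, height (little-endian uint16),
    # then a packed flags byte, background color index, pixel aspect ratio.
    width, height, packed, _background, _aspect = struct.unpack(
        "<HHBBB", data[6:13]
    )
    has_global_palette = bool(packed & 0x80)
    palette_size = 2 ** ((packed & 0x07) + 1)  # number of palette entries
    return version, width, height, has_global_palette, palette_size
```

The image blocks that follow the header are what would need an LZW decoder.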
Due to several edits, this question might have become a bit incoherent. I apologize.
I'm currently writing a Python server. It will never see more than 4 active users, but I'm a computer science student, so I'm planning for it anyway.
Currently, I'm about to implement a function to save a backup of the current state of all relevant variables into CSV files. Of those I currently have 10, and they will never be really big, but... well, computer science student and so on.
So, I am currently thinking about two things:
When to run a backup?
What kind of backup?
When to run:
I can either run a backup every time a variable changes, which has the advantage of always having the current state in the backup, or something like once every minute, which has the advantage of not rewriting the file hundreds of times per minute if the server gets busy, but will create a lot of useless rewrites of the same data if I don't implement a detection which variables have changed since the last backup.
Directly related to that is the question what kind of backup I should do.
I can either do a full backup of all variables (Which is pointless if I'm running a backup every time a variable changes, but might be good if I'm running a backup every X minutes), or a full backup of a single variable (Which would be better if I'm backing up each time the variables change, but would involve either multiple backup functions or a smart detection of the variable that is currently backed up), or I can try some sort of delta-backup on the files (Which would probably involve reading the current file and rewriting it with the changes, so it's probably pretty stupid, unless there is a trick for this in Python I don't know about).
I cannot use shelves because I want the data to be portable between different programming languages (java, for example, probably cannot open python shelves), and I cannot use MySQL for different reasons, mainly that the machine that will run the Server has no MySQL support and I don't want to use an external MySQL-Server since I want the server to keep running when the internet connection drops.
I am also aware of the fact that there are several ways to do this with preimplemented functions of python and / or other software (sqlite, for example). I am just a big fan of building this stuff myself, not because I like to reinvent the wheel, but because I like to know how the things I use work. I'm building this server partly just for learning python, and although knowing how to use SQLite is something useful, I also enjoy doing the "dirty work" myself.
In my usage scenario of possibly a few requests per day I am tending towards the "backup on change" idea, but that would quickly fall apart if, for some reason, the server gets really, really busy.
So, my question basically boils down to this: Which backup method would be the most useful in this scenario, and have I possibly missed another backup strategy? How do you decide on which strategy to use in your applications?
Please note that I raise this question mostly out of a general curiosity for backup strategies and the thoughts behind them, and not because of problems in this special case.
Use sqlite. You're asking about building persistent storage using csv files, and about how to update the files as things change. What you're asking for is a lightweight, portable relational (as in, table based) database. Sqlite is perfect for this situation.
Python has had sqlite support in the standard library since version 2.5 with the sqlite3 module. Since a sqlite database is implemented as a single file, it's simple to move them across machines, and Java has a number of different ways to interact with sqlite.
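As a rough sketch of how little code this takes, assuming each server variable can be stored as a name/value pair of strings. The `state` table and the helper names are made up for illustration.

```python
import sqlite3


def open_store(path=":memory:"):
    # Pass a file name instead of ":memory:" to persist across restarts.
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS state (name TEXT PRIMARY KEY, value TEXT)"
    )
    return conn


def save_value(conn, name, value):
    # The connection as a context manager commits on success
    # and rolls back if the statement raises.
    with conn:
        conn.execute(
            "INSERT OR REPLACE INTO state (name, value) VALUES (?, ?)",
            (name, value),
        )


def load_value(conn, name):
    row = conn.execute(
        "SELECT value FROM state WHERE name = ?", (name,)
    ).fetchone()
    return row[0] if row else None
```

Each `save_value` call touches only the changed row, which sidesteps the "full backup versus single variable" question entirely.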
I'm all for doing things for the sake of learning, but if you really want to learn about data persistence, I wouldn't get too attached to the idea of a "csv database". I would start by looking at the Wikipedia page for Persistence. What you're thinking about is basically a "System Image" for your data. The Wikipedia article describes some of the same shortcomings of this approach that you've mentioned:
State changes made to a system after its last image was saved are lost in the case of a system failure or shutdown. Saving an image for every single change would be too time-consuming for most systems.
Rather than trying to update your state wholesale at every change, I think you'd be better off looking at some other form of persistence. For example, some sort of journal could work well. This makes it simple to just append any change to the end of a log-file, or some similar construct.
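A journal along those lines can be sketched in a few lines, assuming one JSON object per line and that replaying the file from the start is cheap enough. `append_change` and `replay` are hypothetical helper names.

```python
import json


def append_change(path, name, value):
    # Appending never rewrites earlier entries, so each change is one small write.
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps({"name": name, "value": value}) + "\n")


def replay(path):
    # Rebuild the current state by applying every entry in order;
    # later entries for the same name overwrite earlier ones.
    state = {}
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            entry = json.loads(line)
            state[entry["name"]] = entry["value"]
    return state
```

The file grows without bound, so a real implementation would occasionally compact it by writing out the replayed state and starting a fresh journal.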
However, if you end up with many concurrent users, with processes running on multiple threads, you'll run into concerns of whether or not your changes are atomic, or if they conflict with one another. While operating systems generally have some ways of dealing with locking files for edits, you're opening up a can of worms trying to learn about how that works and interacts with your system. At this point you're back to needing a database.
So sure, play around with a couple different approaches. But as soon as you're looking to just get it working in a clear and consistent manner, go with sqlite.
If your data is in CSV files, why not use a revision control system on those files? E.g. git would be pretty fast and give excellent history. The repository would be wholly contained in the directory where the files reside, so it's pretty easy to handle. You could also replicate that repository to other machines or directories easily.
Let us assume that there is a big, commercial project (a.k.a. Project), which uses Python under the hood to manage plugins for configuring new control surfaces which can be attached to and used by Project.
There was a small information leak: some part of Project's Python API became public, and people were able to write Python scripts which were called by the underlying Python implementation as part of Project's plugin loading mechanism.
Further on, using inspect module and raw __dict__ readings, people were able to find out a major part of Project's underlying Python implementation.
Is there a way to keep the Python secret codes secret?
A quick look at Python's documentation revealed a way to suppress an import of the inspect module this way:
import sys
sys.modules['inspect'] = None
Does it solve the problem completely?
No, this does not solve the problem. Someone could just rename the inspect module to something else and import it.
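A sketch of why the block is so easy to sidestep, assuming a standard CPython install where `inspect.py` sits in the stdlib directory. The alias `not_inspect` and the helper name are made up; the function also restores `sys.modules` afterwards so the demonstration has no lasting effect.

```python
import importlib.util
import os
import sys
import sysconfig


def load_inspect_under_another_name(alias="not_inspect"):
    """Block `inspect`, then load the same source file under a different name."""
    missing = object()
    saved = sys.modules.get("inspect", missing)
    sys.modules["inspect"] = None  # the block from the question
    try:
        try:
            import inspect  # noqa: F401
            blocked = False
        except ImportError:
            blocked = True  # the plain import is refused

        # The source file is still on disk; only the *name* was blocked.
        path = os.path.join(sysconfig.get_path("stdlib"), "inspect.py")
        spec = importlib.util.spec_from_file_location(alias, path)
        clone = importlib.util.module_from_spec(spec)
        sys.modules[alias] = clone
        spec.loader.exec_module(clone)
        return blocked, clone
    finally:
        # Undo the block so the rest of the process is unaffected.
        if saved is missing:
            del sys.modules["inspect"]
        else:
            sys.modules["inspect"] = saved
```

The returned `clone` offers the same functions as `inspect`, even though `import inspect` failed a few lines earlier.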
What you're trying to do is not possible. The python interpreter must be able to take your bytecode and execute it. Someone will always be able to decompile the bytecode. They will always be able to produce an AST and view the flow of the code with variable and class names.
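To illustrate how much survives compilation, the sketch below compiles a made-up `check_license` function and reads names and constants straight off the code object, without the source. The function and its "secret" are invented for the example.

```python
import types

# Hypothetical plugin source; in practice only the compiled form would ship.
SOURCE = '''
def check_license(key):
    expected = "hunter2"
    return key == expected
'''

module_code = compile(SOURCE, "<plugin>", "exec")

# The function body is stored as a nested code object among the constants.
function_code = next(
    const for const in module_code.co_consts if isinstance(const, types.CodeType)
)

recovered = {
    "name": function_code.co_name,            # function name
    "locals": function_code.co_varnames,      # argument and local names
    "constants": function_code.co_consts,     # literals, including the string
}
```

Calling `dis.dis(function_code)` from the standard `dis` module would additionally print the instruction listing, which is the starting point for any decompiler.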
Note that this process can also be done with compiled language code; the difference there is that you will get assembly. Some tools can infer C structure from the assembly, but I don't have enough experience with that to comment on the details.
What specific piece of information are you trying to hide? Could you keep the algorithm server side and make your software into a client that touches your web service? Keeping the code on a machine you control is the only way to really keep control over the code. You can't hand someone a locked box, the keys to the box, and prevent them from opening the box when they have to open it in order to run it. This is the same reason DRM does not work.
All that being said, it's still possible to make it hard to reverse engineer, but it will never be impossible when the client has the executable.
There is no way to keep your application code an absolute secret.
Frankly, if a group of dedicated and determined hackers (in the good sense, not in the pejorative sense) can crack the PlayStation's code signing security model, then your app doesn't stand a chance. Once you put your app into the hands of someone outside your company, it can be reverse-engineered.
Now, if you want to put some effort into making it harder, you can compile your own embedded python executable, strip out unnecessary modules, obfuscate the compiled python bytecode and wrap it up in some malware rootkit that refuses to start your app if a debugger is running.
But you should really think about your business model. If you see the people who are passionate about your product as a threat, if you see those who are willing to put time and effort into customizing your product to personalize their experience as a danger, perhaps you need to re-think your approach to security. Assuming you're not in the DRM business, or have a similar model that involves squeezing money from reluctant consumers, consider developing an approach that involves sharing information with your users, and allowing them to collaboratively improve your product.
Is there a way to keep the Python secret codes secret?
No there is not.
Python is particularly easy to reverse engineer, but other languages, even compiled ones, are easy enough to reverse.
You cannot fully prevent reverse engineering of software - if it comes down to it, one can always analyze the assembler instructions your program consists of.
You can, however, significantly complicate the process, for example by messing with Python internals. However, before jumping to how to do it, I'd suggest you evaluate whether to do it. It's usually harder to "steal" your code (one needs to fully understand it to be able to extend it, after all) than to write it oneself. A pure, unobfuscated Python plugin interface, however, can be vital in creating a whole ecosystem around your program, far outweighing the possible downsides of having someone peek into your maybe not perfectly designed internals.