I have a GUI driven image analysis package in IDL that needs some serious rewriting. Python has been suggested to be as an alternative to IDL (with benefits of cost and some nice libraries among other things). I've poked around now with PyQT4 and it looks like it should work nicely. However, one of the best things about IDL (being interpreted) is that if the code hits a bug, you can correct it on the fly, type 'retall' and then continue with your work. If you are hours into some analysis and have lots of datafiles open, etc., this is a HUGE improvement over having to exit, then change and restart the program. Not only that but we can quickly try some things on the command line, then if it looks good, code up a routine, put it in the menu structure, and then 'retall' and we are back with the new functionality, all without ever having to restart the program.
My question is, is this possible with Python? A little googling makes the answer seem like no but since it is an interpreted language I don't understand why not. If the answer really is no I'd strongly urge someone to think about implementing this -- it is probably the feature that made me most happy about IDL over the past decade.
Thanks in advance,
Eric
I don't think it can be done.
However, you may consider structuring your package to cache intermediate results on disk and to allow resume from thoses cached results. This has benefits outside the bug case you describe.
Fixing a bug "on the fly" seems potentially dangerous to me as it is so easy to overlook possible side-effects of the bug. It breaks the link between execution flow and code and make it extremely difficult to debug subsequent issues, etc ...
Related
I'm creating a program in python (2.7) and I want to protect it from reverse engineering.
I compiled it using cx_freeze (supplies basic security- obfuscation and anti-debugging)
How can I add more protections such as obfuscation, packing, anti-debugging, encrypt the code recognize VM.
I thought maybe to encrypt to payload and decrypt it on run time, but I have no clue how to do it.
Generally speaking, it's almost impossible for you to make your program unbreakable as long as there's enough motive for the hackers.
But still you can make it harder to be reverse engineered, try to use cython to compile your core codes into pyd or so files.
There's no way to make anything digital safe nowadays.
What you CAN do is making it hard to a point where it's frustrating to do it, but I admit I don't know python specific ways to achieve that. The amount of security of your program is not actually a function of programsecurity, but of psychology.
Yes, psychology.
Given the fact that it's an arms race between crackers and anti-crackers, where both continuously attempt to top each other, the only thing one can do is trying to make it as frustrating as possible. How do we achieve that?
By being a pain in the rear!
Every additional step you take to make sure your code is hard to decipher is a good one.
For example could you turn your program into a single compiled block of bytecode, which you call from inside your program. Use an external library to encrypt it beforehand and decrypt it afterwards. Do the same with extra steps for codeblocks of functions. Or, have functions in precompiled blocks ready, but broken. At runtime, utilizing byteplay, repair the bytecode with bytes depending on other bytes of different functions, which would then stop your program from working when modified.
There are lots of ways of messing with people's heads and while I can't tell you any python specific ways, if you think in context of "How to be difficult", you'll find the weirdest ways of making it a mess to deal with your code.
Funnily enough this is much easier in assembly, than python, so maybe you should look into executing foreign code via ctypes or whatever.
Summon your inner Troll!
Story time: I was a Python programmer for a long time. Recently I joined in a company as a Python programmer. My manager was a Java programmer for a decade I guess. He gave me a project and at the initial review, he asked me that are we obfuscating the code? and I said, we don't do that kind of thing in Python. He said we do that kind of things in Java and we want the same thing to be implemented in python. Eventually I managed to obfuscate code just removing comments and spaces and renaming local variables) but entire python debugging process got messed up.
Then he asked me, Can we use ProGuard? I didn't know what the hell it was. After some googling I said it is for Java and cannot be used in Python. I also said whatever we are building we deploy in our own servers, so we don't need to actually protect the code. But he was reluctant and said, we have a set of procedures and they must be followed before deploying.
Eventually I quit my job after a year tired of fighting to convince them Python is not Java. I also had no interest in making them to think differently at that point of time.
TLDR; Because of the open source nature of the Python, there are no viable tools available to obfuscate or encrypt your code. I also don't think it is not a problem as long as you deploy the code in your own server (providing software as a service). But if you actually provide the product to the customer, there are some tools available to wrap up your code or byte code and give it like a executable file. But it is always possible to view your code if they want to. Or you choose some other language that provides better protection if it is absolutely necessary to protect your code. Again keep in mind that it is always possible to do reverse engineering on the code.
As the title says. I'll give some background as to why I ask this question. I have just started learning a bit of python, and embarrassingly I have re-named all the files in the anaconda folder by running a script I had written in the wrong folder due to misinterpreting the code I was using as an example. This is incredibly frustrating as you can imagine and seems to be quite an easy mistake for a beginner to make. I was wondering if there are any techniques to prevent this sort of thing happening when you're learning, other than being more careful?
Thanks for your advice.
Christian
With low level languages like Assembly, it was much more common for beginners to make damaging mistakes but now adays, you cant do too much damage without really trying to do damage if that makes sense.
When it comes to scripts you don't entirely follow, you should look for things like
Does it modify/read a file?
Does it run any system commands like "rm -rf"
Does it have any infinite loops or anything that seems strange?
Of course this is just a very simple list, there is no actual check list for "bad stuff".
In your case, it sounds like you just got unlucky. Don't look at it as a failure, look at it as you learned what the script can do.
You can also set up a Virtual Machine to use as a "sandbox" for your applications.
One last thing, since you are starting on this journey, it is important to make backups of your code/workspace and your important computer files. This is because you might want to go back to a previous version of your code.
I prefere setting up a virtual machine while coding
"potential dangerous things"
VM's also have the advantage of making snapshots before running the code that afterwards easily can be reset (e.g. for testing things multiple times)
I'm creating a program in python (2.7) and I want to protect it from reverse engineering.
I compiled it using cx_freeze (supplies basic security- obfuscation and anti-debugging)
How can I add more protections such as obfuscation, packing, anti-debugging, encrypt the code recognize VM.
I thought maybe to encrypt to payload and decrypt it on run time, but I have no clue how to do it.
Generally speaking, it's almost impossible for you to make your program unbreakable as long as there's enough motive for the hackers.
But still you can make it harder to be reverse engineered, try to use cython to compile your core codes into pyd or so files.
There's no way to make anything digital safe nowadays.
What you CAN do is making it hard to a point where it's frustrating to do it, but I admit I don't know python specific ways to achieve that. The amount of security of your program is not actually a function of programsecurity, but of psychology.
Yes, psychology.
Given the fact that it's an arms race between crackers and anti-crackers, where both continuously attempt to top each other, the only thing one can do is trying to make it as frustrating as possible. How do we achieve that?
By being a pain in the rear!
Every additional step you take to make sure your code is hard to decipher is a good one.
For example could you turn your program into a single compiled block of bytecode, which you call from inside your program. Use an external library to encrypt it beforehand and decrypt it afterwards. Do the same with extra steps for codeblocks of functions. Or, have functions in precompiled blocks ready, but broken. At runtime, utilizing byteplay, repair the bytecode with bytes depending on other bytes of different functions, which would then stop your program from working when modified.
There are lots of ways of messing with people's heads and while I can't tell you any python specific ways, if you think in context of "How to be difficult", you'll find the weirdest ways of making it a mess to deal with your code.
Funnily enough this is much easier in assembly, than python, so maybe you should look into executing foreign code via ctypes or whatever.
Summon your inner Troll!
Story time: I was a Python programmer for a long time. Recently I joined in a company as a Python programmer. My manager was a Java programmer for a decade I guess. He gave me a project and at the initial review, he asked me that are we obfuscating the code? and I said, we don't do that kind of thing in Python. He said we do that kind of things in Java and we want the same thing to be implemented in python. Eventually I managed to obfuscate code just removing comments and spaces and renaming local variables) but entire python debugging process got messed up.
Then he asked me, Can we use ProGuard? I didn't know what the hell it was. After some googling I said it is for Java and cannot be used in Python. I also said whatever we are building we deploy in our own servers, so we don't need to actually protect the code. But he was reluctant and said, we have a set of procedures and they must be followed before deploying.
Eventually I quit my job after a year tired of fighting to convince them Python is not Java. I also had no interest in making them to think differently at that point of time.
TLDR; Because of the open source nature of the Python, there are no viable tools available to obfuscate or encrypt your code. I also don't think it is not a problem as long as you deploy the code in your own server (providing software as a service). But if you actually provide the product to the customer, there are some tools available to wrap up your code or byte code and give it like a executable file. But it is always possible to view your code if they want to. Or you choose some other language that provides better protection if it is absolutely necessary to protect your code. Again keep in mind that it is always possible to do reverse engineering on the code.
Python is a relatively new language for me and I already see some of the trouble areas of maintaining a scripting language based project. I am just wondering how the larger community , with a scenario when one has to maintain a fairly large code base written by people who are not around anymore, deals with the following situations:
Return type of a function/method. Assuming past developers didn't document the code very well, this is turning out to be really annoying as I am basically reading code line by line to figure out what a method/function is suppose to return.
Code refactoring: I figured a lot of code need to be moved around, edited/deleted and etc. But lot of times simple errors, which would otherwise be compile time error in other compiled languages e.g. - wrong number of arguments, wrong type of arguments, method not present and etc, only show up when you run the code and the code reaches the problematic area. Therefore, whether a re-factored code will work at all or not can only be known once you run the code thoroughly. I am using PyLint with PyDev but still I find it very lacking in this respect.
You are right, that's an issue with dynamically typed interpreted languages.
There are to important things that can help:
Good documentation
Extensive unit-testing.
They apply to other languages as well of course, but here they are especially important.
As far as I know If code is not documented at all and the author isn't around anymore it's up to you to find out what the ode actually does.
That's why people should always stick to certain guidelindes that can be enforced by stylecheckers like pep8. https://pypi.python.org/pypi/pep8
Comments and docstrings should be included in every method to avoid such situation you're describing. http://www.python.org/dev/peps/pep-0257/#what-is-a-docstring
Also unittests are very helpfull for refactoring since you can check if you broke something with the click of a button. http://docs.python.org/2/library/unittest.html
hope this helps
Others have already mentioned documentation and unit-testing as being the main tools here. I want to add a third: the Python shell. One of the huge advantages of a non-compiled language like Python is that you can easily fire up the shell, import your module, and run the code there to see what it does and what it returns.
Linked to this is the Python debugger: just put import pdb;pdb.set_trace() at any point in your code, and when you run it you will be dropped into the interactive debugger where you can inspect the current values of the variables. In fact, the pdb shell is an actual Python shell as well, so you can even change things there.
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 4 years ago.
Improve this question
I've written a few little things in python, and I am ramping up to build something a little more challenging.
The last project I made basically ingested some text files, did some regex over each file and structured the data in an useful way so I could investigate some data I have.
I found it quite tough near the end to remember what section operated on what part of the text, especially as the code grew as I 'fixed' things along the way.
In my head, I imagine my code to be a series of small interconnected modules - descrete .py files that I can leave to one side knowing what they do, and how they interoperate.
The colleague that showed me how to def functions basically meant that I ended up with one really long piece of code that I found really hard to navigate and troubleshoot.
(1) Is this the right way? or is there an easier way of making modules that pass variables between them, I think i would find this better, as I could visualise the flow better (mainly becuase its how I was used to working in MATLAB a few years ago I guess)
(2) Can you use this method to plan out the various layers of functions before hand to give you a 'map' to write towards?
(3) is there any easy to access tutorials for this kind of stuff? I often find the tutorials suddenly jump way over my head....
Thanks.
(1) It is possible to write a fine programme in a single .py file
(2) In any style of programming, it is always (apart from special, hardware-driven cases) best to break your code up into short functions (or methods) that accomplish a discrete task.
(3) Experienced programmers will frequent write their code one way, discover a problem, either write more code, or different code, and consider whether any of their existing code can be broken out into a separate function.
A sign that you need to do this is when you are sequentially assigning to variables to pass data down your function. Never copy-paste your code to another place, even with changes, unless it be to break it out as a function, and replace the original code with a call to that function.
(4) In many cases, it can be useful to organise your code into classes and objects, even when it is not technologically necessary to do so. It can help you see that you have defined a complete set of operations (or not) necessary on some collection of data.
(5) Programming is actually quite hard. Even among those who have a talent for it, it takes a while to be comfortable. As an illustration, when I was doing my master's degree, I and my (fairly talented) friends all felt only in our final year that we had begun to achieve a degree of facility and competence (and these are all people who had been programming since at least their teenage years).
The important thing is to keep learning and improving, rather than repeating the same one or two years of experience over and over.
(6) To that end, read books and articles. Try new things. Think.
Others have suggested studying other experienced programmers' code from open source projects, etc. and from tutorials and textbooks, which is sound advice. Sometimes a similar example is all you need to set you on the right path.
I also suggest to use your own frustration and experience as feedback to help yourself improve. Whenever you find yourself thinking any of the following:
It feels like I'm writing the same code over and over again with only small changes
I wrote this code myself, but I had to study it for a long time to re-learn how it works
Each time I go back and add something to this code it takes me longer to get it working again
There's a bug in this code somewhere, but I haven't a clue where
Surely somebody somewhere has solved this problem already
Why is this taking me so long to get done?
That means you have room for improvement in your technique. A lot of the difference between an expert and beginning programmer is the ability to do the following:
Don't Repeat Yourself (DRY): Instead of copy-pasting code, or writing the same code over and over with variations, write a single common routine with one or more parameters that can do all of those things. Then call that routine in multiple places.
Keep It Simple (KIS): Break up your code into simple well-defined behaviors/routines that make sense on their own, organized into classes/modules/packages, so that each part of the overall program is easy to understand and maintain. Write informative and concise comments, and document the calls even if you don't intend to publish them.
Divide & Conquer Testing: Thoroughly test each individual class, function, etc. by itself (preferably with a unit-testing framework) as you develop it, rather than only testing the entire application.
Don't Re-invent the Wheel: Use open source frameworks or other tools where possible to solve problems that are general and not specific to your application. In all but the most trivial cases, there is a risk that you do not fully understand the problem and your home-grown solution may be lacking in an important way.
Estimate Honestly: Study your own previous efforts to learn how long it takes you to do certain things. Try to work faster next time, but don't assume you will. Measure yourself and use your own experience to estimate future effort. Set expectations and bargain with scope.
It's hard to know even where to begin answering your question without a snippet of your code for reference. You might want to post your code to a free public site such as http://www.bitbucket.org/ or http://www.github.org/ and then include some specific questions about small snippets of code with links back to your repository. This allows respondents here to look at the code and comment on it. (Both of these options even include color syntax highlighting, and interested correspondent can even pull the code down, make changes and push up a patch or create their own branch of your code and send you a "pull" request so you can look at the differences and pull selected changesets back into your branch).
More generally there are a number of approaches to program design. You seem to be trying to re-invent a very old methodology which is referred to as "functional decomposition" --- look at the overall task at hand as a function (digest text files) and consider how that breaks down (decomposes) into smaller functions (ingest input files, parse them, prepare results, output those) and then breaking those down further until you have units which are small enough to be coded easily in your programming environment (Python).
Modern approaches (and tools) tend to use object oriented design methodologies. You might try reading: http://www.itmaybeahack.com/homepage/books/oodesign/build-python/html/index.html