pyCharm: safe refactoring information for application on depending code

If I refactor a library, PyCharm updates all dependent applications that are known to the currently running PyCharm instance.
But code that is not known to the current PyCharm instance does not get updated.
Is there a way to store the refactoring information in version control, so that dependent applications can be updated when they receive the new version of the library?
Use Case:
class Server:
    pass
gets renamed to
class ServerConnection:
    pass
If a teammate updates to the new version of my library, his usage of Server needs to be changed to ServerConnection.
It would be very nice if PyCharm (or another tool) could help my teammate update his code automatically.

As far as I can tell, this is not possible with vanilla PyCharm, with a plugin, or with a 3rd party tool.
It is not mentioned in the official documentation.
There is no such plugin in the JetBrains Plugin Repositories.
If PyCharm writes refactoring information to its internal logs, you could build this yourself (but would you really want to?)
I am also not aware of any Python-specific refactoring tool that does that. You can check for yourself: there is another SO question for the most popular refactoring tools.
But ...
I am sure there are reasons why your situation is the way it is - there always are good reasons (and most of the time the terms 'historic' and 'grown' turn up in explanations of these reasons), but I still feel obligated to point out what qarma already mentioned in his comment: the fact that you want to replay a refactoring on a different code base points towards a problem that should be solved in a different way.
Alternative 1: introduce an API
If you have different pieces of software that depend on each other on such a deep level, it might be a good idea to define an API that decouples the code bases from each other's internals. With an API it is clear which parts have to be stable. If changes have to be made at the API level, they must be communicated and coordinated with the involved teams.
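For a rename like the Server example from the question, one common way to keep an API stable is to leave the old name in place as a deprecated alias for a transition period. The following is only a minimal sketch under that assumption; it is a general Python convention, not something PyCharm generates, and the warning text is made up for illustration.

```python
import warnings


class ServerConnection:
    """The new, preferred name."""


class Server(ServerConnection):
    """Deprecated alias kept so that existing callers keep working."""

    def __init__(self, *args, **kwargs):
        # stacklevel=2 makes the warning point at the caller's line,
        # which is the line the teammate has to change.
        warnings.warn(
            "Server is deprecated; use ServerConnection instead",
            DeprecationWarning,
            stacklevel=2,
        )
        super().__init__(*args, **kwargs)
```

Dependent code that still says Server() continues to run and gets a pointer to the new name, so the rename can be rolled out at each team's own pace.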
Alternative 2: Make it what it actually is: one code base
If A1 for whatever reason is not possible I would conclude that you actually have one system distributed over different code bases and then those should be merged into one code base. Different teams can still work on the same code base (hopefully using a DVCS) but global refactorings can be done with tooling help and they reach all parts of the system.
Alternative 3: Make these refactorings in PyCharm over all involved code bases
Even if you can't merge them into one code base, you can easily combine them in PyCharm by loading different projects into the same window. I do this without problems with two git projects that have to be in different repositories but still share certain aspects. PyCharm handles commits to these repositories transparently: if you make changes in several repositories and commit them, you write one commit message and the commits are made to all repositories.

Related

Integrating Python monorepo application into Yocto

A little context before the actual questions. As a project that I have worked on grew, the number of repositories used for the application we're developing grew as well, to the point that it has become unbearable to keep making modifications to the codebase: when something is changed in the core library in one repository, we need to make adjustments to other Python middleware projects that use this library. While this does not sound that terrifying, managing it in Yocto has become a burden: on every version bump of the Python library we need to bump all of the dependent projects as well. After some thinking, we decided to try a monorepo approach to deal with the complexity. I'm omitting the logic behind choosing this approach, but I can delve into it if you think this is wrong.
1) Phase 1 was easy. Just bring every component of our middleware into one repository. Right now we have a big repository with more than 20 git submodules. Submodules will be gone once we finish the transition, but as the project doesn't stop while we're doing the transition, submodules were chosen to track the changes and keep the monorepo up to date. Every submodule is a Python project with a setup.py/Pipfile et al. that does the bootstrapping.
2) Phase 2 is to integrate everything in Yocto. It has turned out to be more difficult than I had anticipated.
Right now we have this:
Application monorepo
Pipfile
python-library1/setup.py
python-library1/Makefile
python-library1/python-library1/<code>
python-library2/setup.py
python-library2/Makefile
python-library2/python-library2/<code>
...
python-libraryN/setup.py
python-libraryN/Makefile
python-libraryN/python-libraryN/<code>
Naturally, right now we have python-library1_<some_version>.bb, python-library2_<some_other_version>.bb in Yocto. Now we want to get rid of this version hell and stick to monorepo versions so that we could have just monorepo_1.0.bb that is to be updated when anything changes in any part.
Right now I have this monorepo_1.0.bb (or rather, this is what I would like to have, as this approach doesn't work)
SRCREV=<hash>
require monorepo.inc
monorepo.inc is as follows:
SRC_URI=<gitsm://the location of the monorepo,branch=...,>
S = "${WORKDIR}/git"
require python-library1.inc
require python-library2.inc
...
require python-libraryN.inc
python-libraryN.inc:
S = "${WORKDIR}/git/python-libraryN"
inherit setuptools
This approach doesn't work for multiple reasons. Basically every new require will override ${S} to another git/python-library and the only package that will be built is the last one.
Even though I know that this approach won't end up being the final solution I just wanted to tell you what I've strived for: very easy updates. The only thing that I need to do is just update SRCREV (or ${PV} in the future when this will be deployed) and the whole stack will be brought up to date.
So, well, the question is how to structure the Yocto recipes for Python monorepo manipulation?
P.S.
1) The monorepo structure is not set in stone so if you have any suggestions, I'm more than open to any criticism
2) The .inc file structure might not be suitable for another reason. This stack is deployed across a dozen different devices, and some of those python-library_N.bb have .bbappends in other layers that are custom for the devices. I mean that some of them might require different components of the system to be installed, or some config and file modifications.
Any suggestion on how to deal with the complexity of a big Python application in Yocto will be welcome. Thanks in advance.

I'm currently looking into the same issue as you.
I found that the recipe for the boost library also builds several packages from a "monorepo".
This recipe can be found in the poky layer:
http://git.yoctoproject.org/cgit/cgit.cgi/poky/tree/meta/recipes-support/boost
Look especially into the boost.inc file, which builds the actual package list, while the other files define both SRC_URI and various patches.
Hope that helps!

How do I keep my code obfuscated from my enterprise client?

Software these days can be separated into two categories: runs on client infrastructure (like in the case of enterprise software, like Splunk or Tibco), OR runs on the infrastructure of a software provider (like in the case of Facebook, where you need to use their API to access the backend).
In the first category, the client pays for a license and receives the software to run on their own machines on their premises of choice. The client IS in possession of the actual code and software.
In the second category, the software resides somewhere external and can be accessed only by an API. The client is NOT in possession of the software and can only use it to the extent allowed by the API.
My question is: in the first category above, how is the actual code kept hidden from the client?
Let's say I've built a really awesome analysis engine in Python for analyzing output logs. A corporate client is interested in using it for their internal applications. However, they insist that my engine must run on their own machines for security reasons. If I succumb and give them my Python code, then I will risk my intellectual property.
In that case, do I need to rewrite all my code in a compiled language like C++ to obfuscate it at compile time? Or is there a way to keep it in Python but secure the source code in another way?
Update:
Given the answers below, in that case, would the more efficient pathway to developing a client-hosted application (i.e.: first category above) be to write a proof of concept in a more convenient language like Python first, and then take those ideas and rewrite it into C++?

The short answer is that you pretty much can't. You can do workarounds, but in the end almost anyone can reverse engineer your code no matter how you obfuscate it.
Your best bet might be to use something like PyInstaller and see if you can include only .pyc files. That doesn't protect you all the way, but it at least makes it a pain to reverse. You might even be able to find an obfuscator to run on the code first, but I don't know much about that part.
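To illustrate what "only .pyc files" means in practice, here is a minimal sketch using only the standard library (py_compile and importlib), not PyInstaller itself. The file name engine.py and the function analyze are hypothetical stand-ins for the analysis engine; the point is just that a module can be compiled to bytecode, the source removed, and the bytecode file still imported.

```python
import importlib.util
import pathlib
import py_compile
import tempfile

# Hypothetical one-function "engine" written to a scratch directory.
workdir = pathlib.Path(tempfile.mkdtemp())
source = workdir / "engine.py"
source.write_text("def analyze(line):\n    return line.upper()\n")

# Compile to bytecode next to the source, then delete the source file.
compiled = pathlib.Path(
    py_compile.compile(str(source), cfile=str(workdir / "engine.pyc"), doraise=True)
)
source.unlink()

# Load the module from the .pyc alone (a "sourceless" import).
spec = importlib.util.spec_from_file_location("engine", compiled)
module = importlib.util.module_from_spec(spec)
spec.loader.exec_module(module)

result = module.analyze("error: disk full")
```

As the answer says, this only raises the effort needed: the .pyc still contains the function names, constants and bytecode, and it is tied to the Python version that produced it.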

Similar to @gabeappleton's suggestion above, compile the Python code into an EXE. I use cx_freeze quite regularly and have had good success. It's pretty well documented and has reasonable support on these forums.

How to get code-completion for COM programming in PyCharm?

When using app = win32com.client.Dispatch('Some.Application'), is there any feasible way to get code-completion in PyCharm? It is rather tedious having to retype (or copy-paste) everything from the API documentation, and so would creating skeletons be. Is there no other way to let PyCharm know about the interface provided via COM, especially if I can provide a .tlb file? Or is there at least some way to automatically generate such a skeleton (or a wrapping module?) from the TypeLib?

Since there is no way for PyCharm to know the runtime type of app, you shouldn't expect to get code completion on app directly; at least not until they decide to add built-in support for generating code from type libraries.
However, you can exploit the fact that win32com implicitly generates code based on the type library as described in the first part of this answer, together with PyCharm's support for type hinting, to get code completion on COM methods.
Make sure that the Python types have been generated; their location is determined by the GUID of the COM object. For example, the types for Microsoft Word 2016 on my machine are available in
C:\Users\username\appdata\local\temp\gen_py\3.6\00020905-0000-0000-c000-000000000046x0x8x7\.
Add this folder to the path of your PyCharm Python interpreter; see e.g. this answer.
Import the modules for which you want code completion.
In the screenshots below, we use this approach with Word's Find:
Now, besides feeling dirty, this approach relies on the relevant types having been generated and the code completion is limited to the methods published by the object, so I imagine its usefulness in practice might be somewhat limited; in particular, anybody working on the code will have to generate the code, or the annotations will cause NameErrors. Personally, I would probably prefer using Jupyter for the exploratory part of the implementation process, and with minimal tweaks outlined in the answer mentioned above, Jupyter can be extended to have full code completion with win32com.

Canonical way to build a plug-in system in Python

I'll try to keep this question objective. What is the canonical way to build a plugin-system for a Python desktop application?
Is it possible to have an easy to use (for the developer) system which achieves the following:
Users input their code using an in-app editor (the editor could end up dumping their plugins into a subdirectory of the app, if necessary)
Keep the source of the application "closed" (Yes, there are pitfalls to obfuscation)
Thanks

I am guessing you need some sort of educational system where the user can submit code, presumably to check that the code performs as required by an exercise.
My immediate thought about this would be to use a web interface. In this manner, the code of the evaluating system is entirely hidden (unless the student hacks your webserver, but this is an entirely different topic).
However, for this to work you must be aware of the pitfalls of allowing others to submit code to your service that is then executed. The code must be rigorously checked for the obvious things, but the issues here are open-ended. You must also protect your service at the back end by providing a safe environment for execution (e.g. a sandbox).
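As a very rough illustration of running submitted code outside the evaluating process, the sketch below starts a separate interpreter with a timeout. The function name run_submission and its parameters are made up for this example. Note that this is an assumption-laden starting point and not a sandbox: the child process can still read files and use the network, so real isolation (containers, restricted users, resource limits) would have to be added around it.

```python
import subprocess
import sys


def run_submission(source: str, timeout_s: float = 2.0) -> tuple[bool, str]:
    """Run submitted code in a separate interpreter; return (ok, output)."""
    try:
        proc = subprocess.run(
            # -I runs Python in isolated mode (ignores user site dir and env vars).
            [sys.executable, "-I", "-c", source],
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # Runaway submissions (e.g. infinite loops) are killed after the timeout.
        return False, "timed out"
    if proc.returncode != 0:
        return False, proc.stderr
    return True, proc.stdout
```

A call such as run_submission("print(2 + 3)") returns the success flag together with whatever the submission printed, which the evaluating system could then compare against the expected answer for the exercise.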

How to prevent decompilation or inspecting python code?

Let us assume that there is a big, commercial project (a.k.a. Project), which uses Python under the hood to manage plugins for configuring new control surfaces which can be attached to and used by Project.
There was a small information leak: some part of the Project's Python API became public, and people were able to write Python scripts which were called by the underlying Python implementation as a part of Project's plugin loading mechanism.
Further on, using inspect module and raw __dict__ readings, people were able to find out a major part of Project's underlying Python implementation.
Is there a way to keep the Python secret codes secret?
A quick look at Python's documentation revealed a way to suppress the import of the inspect module:
import sys
sys.modules['inspect'] = None
Does it solve the problem completely?

No, this does not solve the problem. Someone could just rename the inspect module to something else and import it.
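A small sketch of why blocking the import is not enough. It shows a different route than the rename mentioned above: no inspect module is needed at all, because the same information is reachable through plain attributes. The class ControlSurface and its method are hypothetical stand-ins for Project's internals.

```python
import sys


class ControlSurface:
    """Hypothetical stand-in for a class the host application wants to hide."""

    def _secret_handshake(self, key):
        return key * 2


saved = sys.modules.get("inspect")
sys.modules["inspect"] = None  # the trick from the question
try:
    try:
        import inspect  # noqa: F401
        import_blocked = False
    except ImportError:
        import_blocked = True
finally:
    # Undo the change so the rest of the process is unaffected.
    if saved is None:
        del sys.modules["inspect"]
    else:
        sys.modules["inspect"] = saved

# Even with the import blocked, plain attribute access reveals the internals.
method_names = [name for name in vars(ControlSurface) if not name.startswith("__")]
arg_names = ControlSurface._secret_handshake.__code__.co_varnames
```

Here import_blocked ends up True, yet method_names and arg_names still expose the private method and its parameters.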
What you're trying to do is not possible. The python interpreter must be able to take your bytecode and execute it. Someone will always be able to decompile the bytecode. They will always be able to produce an AST and view the flow of the code with variable and class names.
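To make this concrete, the standard library's dis module already shows how much survives compilation. In this minimal sketch the function license_check and its salt are invented for illustration; local variable names and string constants can be read straight off the code object.

```python
import dis


def license_check(key):
    # Hypothetical "secret" logic that a vendor might want to hide.
    secret_salt = "s3cr3t"
    return key + secret_salt


code = license_check.__code__
local_names = code.co_varnames   # parameter and local variable names
constants = code.co_consts       # literal values embedded in the bytecode
listing = dis.Bytecode(license_check).dis()  # human-readable disassembly
```

Printing listing gives an instruction-by-instruction view of the function, which is the raw material that decompilers turn back into readable source.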
Note that this process can also be done with compiled language code; the difference there is that you will get assembly. Some tools can infer C structure from the assembly, but I don't have enough experience with that to comment on the details.
What specific piece of information are you trying to hide? Could you keep the algorithm server side and make your software into a client that touches your web service? Keeping the code on a machine you control is the only way to really keep control over the code. You can't hand someone a locked box, the keys to the box, and prevent them from opening the box when they have to open it in order to run it. This is the same reason DRM does not work.
All that being said, it's still possible to make it hard to reverse engineer, but it will never be impossible when the client has the executable.

There is no way to keep your application code an absolute secret.
Frankly, if a group of dedicated and determined hackers (in the good sense, not in the pejorative sense) can crack the PlayStation's code signing security model, then your app doesn't stand a chance. Once you put your app into the hands of someone outside your company, it can be reverse-engineered.
Now, if you want to put some effort into making it harder, you can compile your own embedded python executable, strip out unnecessary modules, obfuscate the compiled python bytecode and wrap it up in some malware rootkit that refuses to start your app if a debugger is running.
But you should really think about your business model. If you see the people who are passionate about your product as a threat, if you see those who are willing to put time and effort into customizing your product to personalize their experience as a danger, perhaps you need to re-think your approach to security. Assuming you're not in the DRM business, or have a similar model that involves squeezing money from reluctant consumers, consider developing an approach that involves sharing information with your users, and allowing them to collaboratively improve your product.

Is there a way to keep the Python secret codes secret?
No there is not.
Python is particularly easy to reverse engineer, but other languages, even compiled ones, are easy enough to reverse.

You cannot fully prevent reverse engineering of software - if it comes down to it, one can always analyze the assembler instructions your program consists of.
You can, however, significantly complicate the process, for example by messing with Python internals. However, before jumping to how to do it, I'd suggest you evaluate whether to do it. It's usually harder to "steal" your code (one needs to fully understand it to be able to extend it, after all) than to write it oneself. A pure, unobfuscated Python plugin interface, however, can be vital in creating a whole ecosystem around your program, far outweighing the possible downsides of having someone peek into your maybe not perfectly designed internals.
