How to get code completion for COM programming in PyCharm?

When using app = win32com.client.Dispatch('Some.Application'), is there any feasible way to get code completion in PyCharm? It is rather tedious having to retype (or copy-paste) everything from API documentation, and creating skeletons by hand would be just as tedious. Is there no other way to let PyCharm know about the interface provided via COM, especially if I can provide a .tlb file? Or is there at least some way to automatically generate such a skeleton (or a wrapping module) from the type library?

Since there is no way for PyCharm to know the runtime type of app, you shouldn't expect to get code completion on app directly; at least not until they decide to add built-in support for generating code from type libraries.
However, you can exploit the fact that win32com implicitly generates code based on the type library as described in the first part of this answer, together with PyCharm's support for type hinting, to get code completion on COM methods.
Make sure that the Python types have been generated; their location is determined by the GUID of the COM object. For example, the types for Microsoft Word 2016 on my machine are available in
C:\Users\username\appdata\local\temp\gen_py\3.6\00020905-0000-0000-c000-000000000046x0x8x7\.
Add this folder to the path of your PyCharm Python interpreter; see e.g. this answer.
Import the modules for which you want code completion.
The original answer showed screenshots of this approach applied to Word's Find object.
Now, besides feeling dirty, this approach relies on the relevant types having been generated and the code completion is limited to the methods published by the object, so I imagine its usefulness in practice might be somewhat limited; in particular, anybody working on the code will have to generate the code, or the annotations will cause NameErrors. Personally, I would probably prefer using Jupyter for the exploratory part of the implementation process, and with minimal tweaks outlined in the answer mentioned above, Jupyter can be extended to have full code completion with win32com.
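A minimal sketch of how the annotation could be written so that it does not raise a NameError on machines where the types have not been generated. It assumes that the generated folder, once on the interpreter path, exposes a module named Find containing a class Find; that layout is an assumption, not something the answer confirms. The helper name replace_all is made up for illustration.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Only evaluated by the IDE / type checker, never at runtime.
    # Assumed layout: the gen_py folder for Word is on the interpreter path
    # and contains a module "Find" with a class "Find".
    from Find import Find


def replace_all(find: "Find", old: str, new: str) -> bool:
    """Replace every occurrence of `old` with `new` using a Word Find object."""
    find.Text = old
    find.Replacement.Text = new
    # 2 is the value of Word's wdReplaceAll constant.
    return find.Execute(Replace=2)
```

Because the annotation is a string and the import sits behind TYPE_CHECKING, the module still imports on a machine without the generated code, while PyCharm can resolve the type where the folder is available.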

Related

PyCharm: save refactoring information for dependent applications

If I refactor a library, PyCharm updates all dependent applications that are known to the currently running PyCharm instance.
But code that is not known to the current PyCharm instance does not get updated.
Is there a way to store the refactoring information in version control, so that dependent applications can be updated when they get the new version of the library?
Use case:
class Server:
    pass
gets renamed to
class ServerConnection:
    pass
If a teammate updates the code of my library, his usage of Server needs to be changed to ServerConnection.
It would be very nice if PyCharm (or another tool) could help my teammate update his code automatically.
As far as I can tell, this is not possible with vanilla PyCharm, with a plugin, or with a third-party tool.
It is not mentioned in the official documentation.
There is no such plugin in the JetBrains plugin repository.
If PyCharm writes refactoring information to its internal logs, you could build this yourself (but would you really want to?).
I am also not aware of any Python-specific refactoring tool that does that. You can check for yourself: there is another SO question about the most popular refactoring tools.
But ...
I am sure there are reasons why your situation is the way it is. There always are good reasons (and most of the time the terms 'historic' and 'grown' turn up in explanations of these reasons), but I still feel obligated to point out what qarma already mentioned in his comment: the fact that you want to do something like replaying a refactoring on a different code base points towards a problem that should be solved in a different way.
Alternative 1: introduce an API
If you have different pieces of software that depend on each other at such a deep level, it might be a good idea to define an API that decouples the code bases from each other's internals. With an API it is clear which parts have to be stable. If changes have to be made at the API level, they must be communicated and coordinated with the teams involved.
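One common way to coordinate an API-level rename like the one in the question is to keep the old name around for a while as a deprecated alias. This is a sketch of that general technique, not something the answer itself proposes; the warning text is made up.

```python
import warnings


class ServerConnection:
    """The new, stable name exposed by the library's API."""


class Server(ServerConnection):
    """Deprecated alias kept so that existing client code keeps working."""

    def __init__(self, *args, **kwargs):
        warnings.warn(
            "Server is deprecated; use ServerConnection instead",
            DeprecationWarning,
            stacklevel=2,  # point the warning at the caller's line
        )
        super().__init__(*args, **kwargs)
```

Client code that still instantiates Server continues to run, and the warning tells each teammate what to change before the alias is removed in a later release.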
Alternative 2: Make it what it actually is: one code base
If Alternative 1 is not possible for whatever reason, I would conclude that you actually have one system distributed over different code bases, and those should then be merged into one code base. Different teams can still work on the same code base (hopefully using a DVCS), but global refactorings can be done with tooling help and they reach all parts of the system.
Alternative 3: Make these refactorings in PyCharm over all involved code bases
Even if you can't merge them into one code base, you can easily combine them in PyCharm by loading the different projects into the same window. I do this without problems with two git projects that have to be in different repositories but still share certain aspects. PyCharm handles commits to these repositories transparently: if you make changes in several repositories and commit them, you write one commit message and the commits are made to all repositories.

Advice needed for static vs dynamic linking

I have a Python code that needs to be able to execute a C++ code. I'm new to the idea of creating libraries but from what I have learned so far I need to know whether I need to use static or dynamic linking.
I have read up on the pros and cons of both but there is a lot of jargon thrown around that I do not understand yet and since I need to do this ASAP I was wondering if some light can be shed on this from somebody who can explain it simply to me.
So here's the situation. My C++ code generates some text files that have data. My Python code then uses those text files to plot the data. As a starter, I need to be able to run the C++ code directly from Python. Is DLL more suitable than SL? Or am I barking up the completely wrong tree?
Extra: is it possible to edit variables in my C++ code, compile it and execute it, all directly from Python?
It depends on your desired deployment. If you use dynamic linking, you will need to carefully manage the libraries (.so, .dll) on your path and ensure that the correct version is loaded. Including the version number in the filename helps, but that has its own problems (security: displaying version numbers of your code is a bad idea).
Another benefit is that you can swap your library functionality without a re-compile as long as the interface does not change.
Statically linking is conceptually simpler and practically simpler. You only have to deploy one artefact (an .exe for example). I recommend you start with that until you need to move to the more complicated shared library setup.
Edit: I don't understand your "extra credit" question. What do you mean by "edit values"? If you mean can you modify variables that were declared in your C++ code, then yes you can as long as you use part of the public interface to do it.
BTW this advice is for the general decision. If you are linking from Python to C/C++ I think you need to use a shared library. Not sure as I haven't done it myself.
EDIT: To expand on "public interface". When you create a C++ library of whatever kind, you specify which functions are available to outside code (look up how to do that). This is what I mean by public interface. Parts of your library are inaccessible, but others (those you specify) can be called from client code (i.e. your Python script). This allows you to modify the values that are stored in memory.
If you DO mean that you want to edit the actual C++ code from within your python I would suggest that you should re-design your application. You should be able to customise the run-time behaviour of your C++ library by providing the appropriate configuration.
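One way to read "providing the appropriate configuration" is to pass the values as command-line arguments to the compiled program instead of editing its source. A minimal sketch, assuming the C++ program accepts flags of the form --name=value; that flag format and the helper names build_args and run_with_config are assumptions for illustration.

```python
import subprocess


def build_args(params):
    """Turn {'steps': 100} into ['--steps=100'] (assumed flag format)."""
    return [f"--{name}={value}" for name, value in sorted(params.items())]


def run_with_config(command, params):
    """Run `command` (a list such as ['./simulate']) with `params` as flags.

    Returns the program's standard output; raises CalledProcessError
    if the program exits with a non-zero status.
    """
    completed = subprocess.run(
        list(command) + build_args(params),
        capture_output=True,
        text=True,
        check=True,
    )
    return completed.stdout
```

With this approach the C++ code is compiled once, and the Python script only changes the values it passes in before plotting the generated files.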
If you give a solid example of what you mean by that we'll be able to give you better advice.
Yes, it is possible!
Try exploring the subprocess module in Python.
The following could be an example implementation of your scenario (the paths are placeholders for your own files):
import subprocess

# Compile yourfile.cpp into an executable.
compile_args = ['g++', '-o', 'your_executable_name_with_path', 'yourfile.cpp_with_path']
your_compile = subprocess.Popen(compile_args, stdin=subprocess.PIPE, stderr=subprocess.PIPE, stdout=subprocess.PIPE)
# communicate() also waits for the compiler to finish.
output, compilation_error = your_compile.communicate()

# A zero return code means the compilation succeeded and the executable exists.
# Checking stderr alone is unreliable, because compiler warnings are written there too.
if your_compile.returncode == 0:
    # Run the freshly built executable.
    run_args = ['your_executable_name_with_path']
    your_run = subprocess.Popen(run_args, stdin=subprocess.PIPE, stderr=subprocess.PIPE, stdout=subprocess.PIPE)
    your_code_output, runtime_error = your_run.communicate()
From here you can handle more cases and come up with an efficient design.
I'm not quite sure how the idea of linking comes into what you are asking, but it sounds to me like you want to use something like SWIG, which allows you to create wrappers around C++ functions (and many other languages) which you can then call directly from your Python code.
Extra: is it possible to edit values in my C++ code, compile it and execute it directly from Python?
If I'm understanding this correctly, you want to use Python to change your C++ code, then compile and execute it? If this is the case, you may want to look into embedding the Python interpreter in your C++ program. This would mean doing things the other way around and having C++ run your Python script, instead of trying to do everything from Python.

checking/verifying python code

Python is a relatively new language for me, and I already see some of the trouble areas of maintaining a project based on a scripting language. I am wondering how the larger community deals with the following situations when maintaining a fairly large code base written by people who are no longer around:
Return type of a function/method. Assuming past developers didn't document the code very well, this is turning out to be really annoying, as I am basically reading code line by line to figure out what a method/function is supposed to return.
Code refactoring: I figured a lot of code needs to be moved around, edited, deleted, etc. But a lot of the time simple errors that would be compile-time errors in compiled languages (wrong number of arguments, wrong type of arguments, method not present, etc.) only show up when you run the code and execution reaches the problematic area. Therefore, whether refactored code will work at all can only be known once you run the code thoroughly. I am using PyLint with PyDev, but I still find it very lacking in this respect.
You are right, that's an issue with dynamically typed interpreted languages.
There are two important things that can help:
Good documentation
Extensive unit-testing.
They apply to other languages as well of course, but here they are especially important.
As far as I know, if code is not documented at all and the author isn't around anymore, it's up to you to find out what the code actually does.
That's why people should always stick to certain guidelines that can be enforced by style checkers like pep8. https://pypi.python.org/pypi/pep8
Comments and docstrings should be included in every method to avoid the situation you're describing. http://www.python.org/dev/peps/pep-0257/#what-is-a-docstring
Also, unit tests are very helpful for refactoring, since you can check whether you broke something with the click of a button. http://docs.python.org/2/library/unittest.html
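As a small illustration of how tests can pin down behaviour before a refactoring, here is a sketch using the standard unittest module. The function parse_price is a hypothetical stand-in for an undocumented legacy function; the tests record its return type and reject a call with the wrong number of arguments.

```python
import unittest


def parse_price(text):
    """Hypothetical legacy function: convert a string like '$3.50' to a float."""
    return round(float(text.strip().lstrip("$")), 2)


class ParsePriceTest(unittest.TestCase):
    def test_returns_float(self):
        # Documents the return type that the code itself never states.
        self.assertIsInstance(parse_price("$3.50"), float)

    def test_strips_whitespace_and_currency_symbol(self):
        self.assertEqual(parse_price(" $3.50 "), 3.5)

    def test_rejects_wrong_number_of_arguments(self):
        # The kind of error a compiler would catch in a statically typed language.
        with self.assertRaises(TypeError):
            parse_price("1", "2")
```

Saved in a test module, these can be run with python -m unittest after every refactoring step.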
hope this helps
Others have already mentioned documentation and unit-testing as being the main tools here. I want to add a third: the Python shell. One of the huge advantages of a non-compiled language like Python is that you can easily fire up the shell, import your module, and run the code there to see what it does and what it returns.
Linked to this is the Python debugger: just put import pdb;pdb.set_trace() at any point in your code, and when you run it you will be dropped into the interactive debugger where you can inspect the current values of the variables. In fact, the pdb shell is an actual Python shell as well, so you can even change things there.
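The kind of exploration described above might look like the following when typed into the shell. The function load_settings is a hypothetical example of undocumented legacy code; inspect.signature shows which arguments it accepts, and calling it shows what it actually returns.

```python
import inspect


def load_settings(path, defaults=None):
    """Hypothetical legacy function whose return value is undocumented."""
    settings = dict(defaults or {})
    settings["path"] = path
    return settings


# Which arguments does the function take?
print(inspect.signature(load_settings))

# What does it actually return for a sample input?
result = load_settings("app.ini", {"debug": True})
print(type(result).__name__, result)
```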

How to prevent decompilation or inspecting python code?

Let us assume that there is a big commercial project (a.k.a. Project), which uses Python under the hood to manage plugins for configuring new control surfaces that can be attached to and used by Project.
There was a small information leak: part of Project's Python API became public, and people were able to write Python scripts that were called by the underlying Python implementation as part of Project's plugin loading mechanism.
Further on, using the inspect module and raw __dict__ readings, people were able to find out a major part of Project's underlying Python implementation.
Is there a way to keep the Python secret codes secret?
A quick look at Python's documentation revealed a way to suppress imports of the inspect module:
import sys
sys.modules['inspect'] = None
Does it solve the problem completely?
No, this does not solve the problem. Someone could just rename the inspect module to something else and import it.
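A short sketch of why the block is so weak. The class Secret is a made-up stand-in for Project's internals. Even while the import is blocked, plain attribute access still exposes the object's state, and any plugin can undo the block by touching sys.modules itself.

```python
import sys


class Secret:
    """Made-up stand-in for an internal class the vendor wants to hide."""

    def __init__(self):
        self._key = "hidden"


saved = sys.modules.get("inspect")
sys.modules["inspect"] = None  # the block proposed in the question
try:
    try:
        import inspect
        blocked = False
    except ImportError:
        blocked = True

    # Introspection does not need the inspect module at all.
    leaked = vars(Secret())
finally:
    # A plugin can remove the block just as easily as it was installed.
    if saved is None:
        sys.modules.pop("inspect", None)
    else:
        sys.modules["inspect"] = saved

import inspect  # works again
```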
What you're trying to do is not possible. The Python interpreter must be able to take your bytecode and execute it. Someone will always be able to decompile the bytecode. They will always be able to produce an AST and view the flow of the code with variable and class names.
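To illustrate how much survives in the bytecode, the sketch below serialises a code object with marshal (roughly the payload a compiled .pyc file carries), loads it back, and inspects it with the standard dis module. The function check_license and its contents are invented for the example.

```python
import dis
import marshal


def check_license(key):
    """Invented example of code a vendor might want to keep secret."""
    secret_salt = "s3cr3t"
    return key == secret_salt


# Serialise the compiled code, as it would be shipped without the source.
payload = marshal.dumps(check_license.__code__)

# Anyone holding the payload can load it back and look inside.
recovered = marshal.loads(payload)
print(recovered.co_varnames)  # local variable names are preserved
print(recovered.co_consts)    # string constants are preserved too
dis.dis(recovered)            # human-readable instruction listing
```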
Note that this process can also be done with compiled language code; the difference there is that you will get assembly. Some tools can infer C structure from the assembly, but I don't have enough experience with that to comment on the details.
What specific piece of information are you trying to hide? Could you keep the algorithm server side and make your software into a client that touches your web service? Keeping the code on a machine you control is the only way to really keep control over the code. You can't hand someone a locked box, the keys to the box, and prevent them from opening the box when they have to open it in order to run it. This is the same reason DRM does not work.
All that being said, it's still possible to make it hard to reverse engineer, but it will never be impossible when the client has the executable.
There is no way to keep your application code an absolute secret.
Frankly, if a group of dedicated and determined hackers (in the good sense, not in the pejorative sense) can crack the PlayStation's code signing security model, then your app doesn't stand a chance. Once you put your app into the hands of someone outside your company, it can be reverse-engineered.
Now, if you want to put some effort into making it harder, you can compile your own embedded Python executable, strip out unnecessary modules, obfuscate the compiled Python bytecode and wrap it up in some malware rootkit that refuses to start your app if a debugger is running.
But you should really think about your business model. If you see the people who are passionate about your product as a threat, if you see those who are willing to put time and effort into customizing your product to personalize their experience as a danger, perhaps you need to re-think your approach to security. Assuming you're not in the DRM business, or have a similar model that involves squeezing money from reluctant consumers, consider developing an approach that involves sharing information with your users, and allowing them to collaboratively improve your product.
Is there a way to keep the Python secret codes secret?
No, there is not.
Python is particularly easy to reverse engineer, but other languages, even compiled ones, are easy enough to reverse.
You cannot fully prevent reverse engineering of software - if it comes down to it, one can always analyze the assembler instructions your program consists of.
You can, however, significantly complicate the process, for example by messing with Python internals. However, before jumping to how to do it, I'd suggest you evaluate whether to do it. It's usually harder to "steal" your code (one needs to fully understand it to be able to extend it, after all) than to write it oneself. A pure, unobfuscated Python plugin interface, however, can be vital in creating a whole ecosystem around your program, far outweighing the possible downsides of having someone peek into your maybe not perfectly designed internals.

Sweave for python

I've recently started using Sweave* for creating reports of analyses run with R, and am now looking to do the same with my Python scripts.
I've found references to embedding Python in Sweave docs, but that seems like a bit of a hack. Has anyone worked out a better solution, or is there an equivalent for Python I'm not aware of?
* Sweave is a tool that allows you to embed the R code for complete data analyses in LaTeX documents
I have written a Python implementation of Sweave called Pweave that implements basic functionality and some options of Sweave for Python code embedded in reST or LaTeX documents. You can get it here: http://mpastell.com/pweave and see the original blog post here: http://mpastell.com/2010/03/03/pweave-sweave-for-python/
Some suggestions:
I have been using Pweave for several years now, and it is very similar to Sweave. Highly recommended.
The most popular tool for embedded reports in Python at this stage is Jupyter notebooks, which allow you to embed markdown. They are quite useful, although I personally still like writing things in LaTeX...
You can also have a look at PyLit, which is intended for literate programming with Python but is not as well maintained as some of the alternatives.
Sphinx is great for documenting with Python, and can output LaTeX.
Here's a list of tools for literate programming. Some of these work with any programming language.
Dexy is a very similar product to Sweave. One advantage of Dexy is that it is not exclusive to one single language. You could create a Dexy document that included R code, Python code, or about anything else.
This is a bit late, but for future reference you might consider my PythonTeX package for LaTeX. PythonTeX allows you to enter Python code in a LaTeX document, run it, and bring back the output. But unlike Sweave, the document you actually edit is a valid .tex document (not .Snw or .Rnw), so editing the non-code part of the document is fast and convenient.
PythonTeX provides many features, including the following:
The document can be compiled without running any Python code; code only needs to be executed when it is modified.
All Python output is saved or cached.
Code runs in user-defined sessions. If there are multiple sessions, sessions automatically run in parallel using all available cores.
Errors and warnings are synchronized with the line numbers of the .tex document, so you know exactly where they came from.
Code can be executed, typeset, or typeset and executed. Syntax highlighting is provided by Pygments.
Anything printed by Python is automatically brought into the .tex document.
You can customize when code is re-executed (modified, errors, warnings, etc.).
The PythonTeX utilities class is available in all code that is executed. It allows you to automatically track dependencies and specify created files that should be cleaned up. For example, you can set the document to detect when the data it depends on is modified, so that code will be re-executed.
A basic PythonTeX file looks like this:
\documentclass{article}
\usepackage{pythontex}
\begin{document}
\begin{pycode}
#Whatever you want here!
\end{pycode}
\end{document}
You might consider noweb, which is language independent and is the basis for Sweave. I've used it for Python and it works well.
http://www.cs.tufts.edu/~nr/noweb/
I've restructured Matti's Pweave a bit, so that it is possible to define arbitrary "chunk-processors" as plugin-modules. This makes it easy to extend for several chunk-based text-preprocessing applications. The restructured version is available at https://bitbucket.org/edgimar/pweave/src. As an example, you could write the following LaTeX-Pweave document (notice the "processor name" in this example is specified with the name 'mplfig'):
\documentclass[a4paper]{article}
\usepackage{graphicx}
\begin{document}
\title{Test document}
\maketitle
Don't miss the great information in Figure \ref{myfig}!
<<p=mplfig, label=myfig, caption = "Figure caption...">>=
import sys
import pylab as pl
pl.plot([1,2,3,4,5],[2,4,6,8,10], 'b.', markersize=15)
pl.axis('scaled')
pl.axis([-3,3, -3,3]) # [xmin,xmax, ymin,ymax]
@
\end{document}
You could try SageTeX, which implements Sweave-like functionality for the SAGE mathematics platform. I haven't played around with it as much as I would like to, but SAGE is basically a Python shell and evaluates Python as its native language.
I have also thought about the same thing many times. After reading your questions and looking into your link I made small modifications to the custom python Sweave driver, that you link to. I modified it to also keep the source code and produce the output as well the same way that Sweave does for R.
I posted the modified version and an example here: http://mpastell.com/2010/02/09/python-in-sweave-document/
Granted, it is not optimal but I'm quite happy with the output and I like the ability to include both R and Python in the same document.
Edit about PyLit:
I also like PyLit and, contrary to my original answer, you can catch output with it as well, although it is not as elegant as Sweave! Here is a small example of how to do it:
import sys
# Catch PyLit output
a = list(range(3))
sys.stdout = open('output.txt', 'w')
print(a)
# Close the file so the output is flushed before it is included.
sys.stdout.close()
sys.stdout = sys.__stdout__
# .. include:: output.txt
What you're looking for is achieved with GNU Emacs and org-mode*. org-mode does far more than can be detailed in a single response, but the relevant points are:
Support for literate programming with the ability to integrate multiple languages within the same document (including using one language's results as the input for another language).
Graphics integration.
Export to LaTeX, HTML, PDF, and a variety of other formats natively, automatically generating the markup (but you can do it manually, too).
Everything is 100% customizable, allowing you to adapt the editor to your needs.
I don't have Python installed on my system, but below is an example of two different languages being run within the same session. The excerpt is modified from the wonderful org-mode R tutorial by Erik Iverson which explains the set up and effective use of org-mode for literate programming tasks. This SciPy 2013 presentation demonstrates how org-mode can be integrated into a workflow (and happens to use Python).
Emacs may seem intimidating. But for statistics/data science, it offers tremendous capabilities that either aren't offered anywhere else or are spread across various systems. Emacs allows you to integrate them all into a single interface. I think Daniel Gopar says it best in his Emacs tutorial,
Are you guys that lazy? I mean, c'mon, just read the tutorial, man.
An hour or so with the Emacs tutorial opens the door to some extremely powerful tools.
* Emacs comes with org-mode. No separate install is required.
Well, with reticulate, a recent Python interface for R, you could continue using Sweave and call Python inline through the R interpreter. For example, this now works in a .Rnw or .Rmd file.
```{r example, include=FALSE}
library(reticulate)
use_python("./dir/python")
```
```{python}
import pandas
data = pandas.read_csv("./data.csv")
print(data.head())
```
I think that Jupyter-book may do what you want.
