What are the multiple output files from Cython for?

I am on Python 2.7 and new to Cython.
Background:
I have 20+ .py files in my project, and I found that the slowness is coming from 3 of them.
So I used Cython for those files; they are now compiled with Cython into .pyd files without any issue. (I spent days investigating the problem, looking for the best solution and improving the way the Python code is written, but I still have to use Cython for performance reasons.)
Besides the .pyd file, there are a few more files under the build folder with the same filename but different extensions, namely ".c", ".exp", ".lib", ".obj" and ".pyd.manifest".
The project still seems to work, and the performance stays at the Cython level, even after I moved those files (".c", ".exp", ".lib", ".obj" and ".pyd.manifest") away.
I am confused by those output files from the compiler: I'm not sure which of them are necessary and which are not, or how I should use and treat them.
My setup.py:
from distutils.core import setup
from Cython.Build import cythonize
setup(
    ext_modules=cythonize("myCythonFile.pyx")
)
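(For reference, a setup.py like this is normally driven with the standard distutils build command; the commands below are not from the original post, just the usual invocation. The intermediate files land under the build folder, and --inplace additionally copies the finished .pyd next to the source.)
# Build the extension; intermediates go under build/
python setup.py build_ext
# Build and also place the compiled .pyd next to myCythonFile.pyx
python setup.py build_ext --inplace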

All of these files are temporary files.
Cython compiles each of your pyx files (you only have one) to C code in matching .c files. It can also emit other files, like an HTML file to make the C code more readable, but by default, this is all it gives you, and you didn't ask for anything extra.
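If you do want that HTML annotation (it highlights which lines still fall back to slow Python API calls), cythonize takes an annotate flag; a minimal sketch, reusing the filename from the question:
from distutils.core import setup
from Cython.Build import cythonize
setup(
    # annotate=True writes an .html report next to the generated .c file
    ext_modules=cythonize("myCythonFile.pyx", annotate=True)
)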
Cython then asks whatever C compiler you have configured via distutils (in your case, that's MSVC, the Microsoft Visual C++ compiler that comes with Visual Studio) to build a .dll/.pyd file out of those .c files. The full details of which files that creates and what they mean depend on your compiler version, but basically it creates a .obj file for each .c file, then a .lib import library and .exp exports file to go with your .dll, and a .manifest file that allows loading the library as an assembly.
Some of these files—in particular the .c and .obj files—are very handy for debugging if something goes wrong in the compiled code. (Cython-generated C code can be pretty ugly to trace through, but raw machine code can be even worse.)
All of these files can help make rebuilds after minor changes faster.
Some of these files are also needed if you want to do more complicated things like linking other libraries against your library.
If you're not doing any of those things, you don't need them. But there's also really no reason to get rid of them. (If you want to redistribute your code, you're probably going to build a source package, and a binary wheel, and both of those know how to skip over unnecessary intermediate files.)
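(For what it's worth, those two artifacts are typically produced with the standard packaging commands shown below; the wheel step assumes a setuptools-based setup.py with the wheel package installed, and neither command copies the build intermediates into the result.)
# Source distribution: ships the .pyx/.c sources
python setup.py sdist
# Binary wheel: ships the compiled extension (.pyd)
python setup.py bdist_wheel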

Related

Running Cython install generates unwanted files

I am currently creating a Python library for my Python projects, and I need certain things to run much faster than normal Python can manage; Cython is the only way I can think of to do this.
I have created a setup.py file and have tried multiple methods of achieving the cython build:
I have used
from distutils.core import setup
from Cython.Build import cythonize
# Note: filePath is the path to the .pyx file written out literally, not a variable
setup(ext_modules=cythonize(filePath))
Running python setup.py install builds the extension and then installs it; however, it also generates many extra folders and files from previous projects where I have used Cython. I only expected the file I had given it to be built into an extension module.
I have tried different methods of creating the extension files; however, none of them do anything different, and they all give the same result: loads of folders and files created in my project that I didn't ask for.
Any help as to how I should solve this problem would be greatly appreciated.
Thank you
Fixed this issue.
It turns out that Cython treats files with the same name as belonging to the same project, so simply changing the name of my project was enough to fix it. This is not intuitive, though in a way it makes sense.
I hope this helps anyone who comes across this problem.
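A related knob worth knowing about (an addition here, not part of the original fix): cythonize accepts a build_dir argument that keeps the generated C files out of the source tree. A sketch, with a placeholder filename:
from distutils.core import setup
from Cython.Build import cythonize
setup(
    # build_dir collects the generated .c files under build/ instead of
    # scattering them next to the sources ("my_module.pyx" is a placeholder)
    ext_modules=cythonize("my_module.pyx", build_dir="build")
)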

How to distribute a package without exposing the source code in Python?

Consider that I have a package called "A" consisting of several modules and also nested packages. Now, I want to distribute this package to users, and I do not want users to see my code at all. I heard that ".pyc" files can be decompiled. So I am just wondering what other alternatives there are for this problem.
It would be great if someone could give some ideas in this regard.
You actually have a few options. First, you can compile your code into .pyc files; however, this can be circumvented with the disassembler library dis (or a decompiler), though that requires a fair amount of technical know-how. You can also use py2exe to package it as an .exe file; this bundles the .pyc files into an executable. That can still be disassembled, but it adds an extra layer. You also have a few encryption solutions; for example, you can use pyconcrete to encrypt your imports until they are loaded into memory. You can also encrypt the entire application and then ship the decrypter and launcher with it as a C/C++ application (or one in any other compiled language). Lastly, if you are comfortable with getting Python to run custom C/C++ code, you can put your private code into a DLL or SO and call it directly from the script.
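On that last point, Cython is a common way to get such a compiled extension without hand-writing C: it will compile an ordinary .py module into a native .pyd/.so, so only the binary needs to be shipped. A minimal sketch (the module name is hypothetical):
from distutils.core import setup
from Cython.Build import cythonize
setup(
    # Compiles secret_module.py to C and then to a native extension module;
    # the .py source does not need to be distributed afterwards.
    ext_modules=cythonize("secret_module.py")
)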
Python is an interpreted language. That means that if you want to distribute .pyc files, you'll have to run them under the same Python version as the one that produced them, or you'll run into subtle problems. That, and the fact that most code can be decompiled to some degree, would urge me to rethink your use case.
Can you rethink your package as a service instead?

Cx_Freeze's extra stuff

Whenever I build an exe with cx_Freeze and Python, I get a bunch of extra stuff like the Library.zip and all the .dll files. Is there a way I can make it just one executable file that I can send over to someone and have them run, without having to give them all the extra files as well? Python 3.4. Thank you!
Not really.[1] Your best option for a single-file distribution is probably to create an installer.
You can, however, append the library.zip to your executable:
params['options'] = {
    'append_script_to_exe': True,
    'create_shared_zip': False,
    ...
}
setup(**params)
But this only reduces the number of files by 1.
There are two reasons why you can't do this. The first is that some modules are not "zip safe" (those that contain data files that are read with open()). The second, and more important, reason is that Python requires various DLLs in order to run, and Windows's dynamic linker doesn't know how to find and load those DLLs if they're inside a zip file.
See: http://cx-freeze.readthedocs.org/en/latest/faq.html#single-file-executables
[1] If you're really ambitious, you could theoretically create an entirely static build of Python (statically linking all of the library source code, the C runtimes, etc.) and do the same with any C modules that you might be using. That, plus appending the Library.zip file to the exe, might give you a single-file distribution.
However, tracking down and building all those dependencies would be a very large effort.
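(For orientation, here is roughly how a params dict like the one above tends to sit in a complete cx_Freeze setup script. The names "MyApp" and "main.py" are placeholders, the two option keys are simply taken from the snippet above, and nesting them under 'build_exe' is an assumption based on how cx_Freeze options are usually organised; treat this as a sketch, not a definitive recipe.)
from cx_Freeze import setup, Executable
params = {
    'name': 'MyApp',
    'executables': [Executable('main.py')],
    'options': {
        'build_exe': {
            'append_script_to_exe': True,
            'create_shared_zip': False,
        },
    },
}
setup(**params)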
Yes, if you're on Windows this method works.
Run -> iexpress
Follow the instructions.
This will bundle all the files into one exe, but first you need to create the exe using cx_Freeze, then browse to that directory in IExpress and it will do the rest.

Where do I put my cython files in a python distribution?

I write and maintain a Python library for quantum chemistry calculations called PyQuante. I have a fairly standard Python distribution with a setup.py file in the main directory, a subdirectory called "PyQuante" that holds all of the Python modules, and one called "Src" that contains source code for C extension modules.
I've been lucky enough to have some users donate code that uses Cython, which I hadn't used before, since I started PyQuante before either it or Pyrex existed. On my suggestion, they put the code into the Src subdirectory, since that's where all the C code went.
However, looking at the code that generates the extensions, I wonder whether I should have simply put the code in subdirectories of the Python branch instead. And thus my question is:
what are the best practices for the directory structure of python distributions with both Python and Cython source files?
Do you put the .pyx files in the same directory as the .py files?
Do you put them in a subdirectory of the one that holds the .py files?
Do you put them in a child of the .py directory's parent?
Does the fact that I'm even asking this question betray my ignorance at distributing .pyx files? I'm sure there are many ways to make this work, and am mostly concerned with what has worked best for people.
Thanks for any help you can offer.
Putting the .pyx files in the same directory as .py files makes the most sense to me. It's what the authors of scikit-learn have done and what I've done in my py-earth module. I guess I think of Cython modules as optimized replacements for Python modules. I will often begin by writing a package in pure Python, then replace some modules with Cython if I need better performance. Since I'm treating Cython modules as replacements for Python modules, it makes sense to me to keep them in the same place. It also works well for test builds using the --inplace argument.
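As a concrete illustration of that convention (package and file names are hypothetical), the .pyx modules simply live alongside the .py ones and the setup script globs for them:
# Layout sketch:
#   mypackage/
#       __init__.py
#       utils.py          # ordinary Python module
#       _fast_core.pyx    # Cython-optimized module
from distutils.core import setup
from Cython.Build import cythonize
setup(
    # cythonize accepts glob patterns, so every .pyx in the package gets built
    ext_modules=cythonize("mypackage/*.pyx")
)
Building with python setup.py build_ext --inplace then drops each compiled extension next to its .pyx source, which is what makes the in-tree test builds convenient.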

Trimming Python Runtime

We've got a (Windows) application, with which we distribute an entire Python installation (including several 3rd-party modules that we use), so we have consistency and so we don't need to install everything separately. This works pretty well, but the application is pretty huge.
Obviously, we don't use everything available in the runtime. I'd like to trim down the runtime to only include what we really need.
I plan on trying out py2exe, but I'd like to try and find another solution that will just help me remove the unneeded parts of the Python runtime.
One trick I've learned while trimming down .py files to ship: delete all the .pyc files in the standard library, then run your application thoroughly (that is, enough to be sure all the Python modules it needs will be loaded). If you examine the standard library directories, there will be .pyc files for all the modules that were actually used. .py files without a .pyc are ones that you don't need.
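If you want to automate that check, a small helper along these lines works; this is a hypothetical sketch, and it assumes the old Python 2 behaviour where each imported module leaves its .pyc right next to the .py file (rather than in a __pycache__ directory):
import os
def modules_never_imported(lib_dir):
    """Yield .py files that have no sibling .pyc after exercising the app."""
    for root, dirs, files in os.walk(lib_dir):
        for name in files:
            if name.endswith(".py") and (name + "c") not in files:
                yield os.path.join(root, name)
# The path below is a placeholder for wherever the shipped library lives
for path in modules_never_imported(r"C:\Python27\Lib"):
    print(path)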
Both py2exe and pyinstaller (NOTE: for the latter, use the SVN version; the released one is VERY long in the tooth ;-) do their "trimming" via modulefinder, the standard library module for finding all modules used by a given Python script. You can of course use modulefinder yourself to identify all the needed modules, if you don't trust pyinstaller or py2exe to do it properly and automatically on your behalf.
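Running modulefinder by hand takes only a few lines; a minimal sketch (the script name is a placeholder for your application's entry point):
from modulefinder import ModuleFinder
finder = ModuleFinder()
finder.run_script("your_app_entry_point.py")
# finder.modules maps module names to Module objects for everything the
# script imports, directly or indirectly
for name in sorted(finder.modules):
    print(name)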
This py2exe page on compression suggests using UPX to compress any DLLs or .pyd files (which are really just DLLs anyway). Obviously this doesn't help in trimming out unneeded modules, but it can/will trim down the size of your distribution, if that's a large concern.
