I wanted to see how a math.py function was implemented, but when I opened the file in PyCharm I found that all the functions are empty and there is a simple pass. For example:
def ceil(x): # real signature unknown; restored from __doc__
"""
ceil(x)
Return the ceiling of x as a float.
This is the smallest integral value >= x.
"""
pass
I guess it is because the functions being used are actually from the C standard library. How does it work?
PyCharm is lying to you. The source code you're looking at is a fake that PyCharm has created. PyCharm knows what functions should be there, and it can guess at their signatures using the function docstrings, but it has no idea what the function bodies should look like.
If you want to see the real source code, you can look at it in the official Github repository in Modules/mathmodule.c. A lot of the functions in there are macro-generated thin wrappers around C functions from math.h, but there's also a bunch of manually-written code to handle things like inconsistent or insufficient standard library implementations, functions with no math.h equivalent, and customization hooks like __ceil__.
There are no source code written in python for some python libraries.
These libraries was written on C or another languages. They are not presented by .py files and PyCharm or any another IDE cannot open their sources in readable view.
You can check sources in ...\AppData\Local\Programs\Python\Python310\Lib\site-packages. Most likely in your-package folder there will be .pui, .pud, .dll and other similiar files
Related
Consider the following scenario:
A python tool should be deployed for external users in a way that (a) the source is protected and (b) it is ensured that only users with valid license can use it. Furthermore, for internal users, the code should be available as pure python, so debugging with a python debugger is possible and no compiling is necessary.
A license check is available and written in C.
(a) is solved by using cython.
Thoughts on (b):
The license check could be applied to a "central" part of the program (i.e., "a class that contains enough business logic so that no one could replace it"). Of course I could use cython to link the license check to this class's *.pyd and use the c function, but then it would be impossible for the internal user to use / debug the class. On the other hand, applying the license check to a class that is "unimportant" while debugging would mean that it could (presumably) be exchanged rather easily.
Creating a separate function "licenseIsValid() -> bool" and calling it from a central position in the code is not feasible as it can be replaced easily.
As a matter of fact, (almost) everything written in python can be replaced and therefore is unsafe.
So I came up with the idea of implementing the license check before anything python-related is executed: inside of the dllmain function. I could call the license check function on startup of the dll / pyd and stop the application before executing any cython code.
Is there a way to add a custom dllmain to a cythonized module? Is it even safe to add a dllmain (or does cython require a "specialized" dllmain)?
One last remark: I am aware that python is the wrong choice of language for creating closed source code with license check. But the advantages of using python far outweigh this...
Thank you in advance,
Jan
I'm trying to look at the source code of the pow() function, and would like to get to know my way around the Python34 directory. I've looked at various libraries in the Lib directory such as fractions.py, and base64.py, but I can't seem to find the .py file where built-in functions are stored. Where should I look?
Builtins and functions from the standard library are not always implemented in Python. In CPython (the reference Python implementation), they're often written in C
The method you are looking for is defined in the following files (depending on the type you're interested in):
Float: Objects/floatobject.c:643
Long: Objects/longobject.c:3809
...
You can actually grep the Objects directory in the CPython source tree for /*nb_power*/ to find more. Try this search query on GitHub.
Functions which repr contain the wording "built-in" are not actually defined in Python code, in cPython - they are ratehr written in C, and amde available as binary modules.
For example, when you get the repr of "pow" you get:
>>> repr(pow)
'<built-in function pow>'
In contrast with a Python defined function:
>>> import glob
>>> repr (glob.glob)
'<function glob at 0x7efefe69aa70>'
So, you can search for the C source code of functions marked as "built-in" in the source code for Python itself, if you download it from Python.org (or fetch it with mercurial).
If you want a pure Python implementation of any function in the satandard lib, that should be included with the pypy distribution - just fetch it from:
https://bitbucket.org/pypy/pypy and you should see pure Python implementation of all functions that in cPython are implemented in C for performance or historic reasons.
I was reading Dietel's C++ programming book. In this book they mention how a programmer should release only the interface part of his code and not the implementation.
So carrying this over to python:
I have 2 files:
1) the implementation file = accountClass.py and
2) the interface file = useAccountClass.py
I have compiled the implementation file and have obtained the .pyc file. So when I provide my code to someone else, I would provide him with the .pyc file and the interface file, right?
Also, if I provide someone else with ONLY the .pyc file, can I expect him to write the interface on his own? I'm going to say no. But there's this one nagging doubt that I have:
The creators of numpy and scipy did not share the implementation with us end users. And I don't think they shared any interfaces either. But we can still search for the different classes and their methods inside both numpy and scipy. So, using this example of numpy and scipy, I guess what I'm trying to ask is:
Is it possible for someone else to create an interface to my code if I provide him/ her with only the compiled implementation file (in this case accountClass.pyc)? How will that person know what classes and methods I have defined in my implementation? I mean, will they use the
if __name__ = "__main__" :
blah blah
or is there some other way??
You got that entirely wrong. Or perhaps it's a horrible book whose author got something seriously wrong. Code using other code should indeed, barring significant counterarguments, adhere to an interface and not care about the details of the implementation. However, even in the world of static compilation to machine code (e.g. C++), this does not mean you should lock away the source code of the implementation.
Whether someone has access to the implementation, and whether they make use of that knowledge while writing a specific piece of code, are completely different issues. Heck, even the author of the implementation can/should still program to an interface when working on other code (e.g. other modules). Likewise, even if you lock the implementation away from someone, they may very well rely on implementation quirks which are not part of the interface. If anyone in the world of static compilation to machine code provides only headers and object files, and not the source code, it's because the projects are closed source, not to encourage good programming practices among clients.
In Python, your question makes no sense - there are no "interface" and "implementation" files, there's just code which is run and defines functions, classes, and other values. There is no such thing as an interface file you'd provide. You provide an implementation - and (hopefully) documentation which details both interface and possibly implementation details. And once a module is imported, the class objects, function objects, and other objects, contain plenty of information (including, in many cases, the text from which large parts of the documentation was generated). This is also true for extension modules like numpy. And note that their implementation is accessible, it's just not included in all distributions because it's of little use. With Python code, you practically have to distribute the source code because anything else is platform-specific.
On a side note, .pyc files are pretty high level, and easily understood when disassembled (which is as easy as importing the module and running the stdlib module dis on any function inside). I consider this a minor technicality as it's already the wrong question to ask.
Deitel's advice to C++ programmers doesn't apply to Python, for a number of reasons:
Python isn't compiled to machine code, so no matter what form you provide the program in, it will be relatively easy for someone to read the code.
Python doesn't have .h and .c files, all you can provide is the .py or .pyc files.
Treating code as a secret is kind of silly anyway. What is in your code that you need to keep hidden from others?
Numpy and Scipy are largely implemented in C, which is why you don't have the source, for your own convenience. You can get the source if you like. The "interface" to that code is the module that you can import and then call.
You should not confuse "user interface" with "class interface". If you have a useAccountClass file, that file probably performs some task using the classes and methods defined in the accountClass file, if I understood right.
If you send the file to other person, they are not supposed to "guess" what your compiled class does. That's what DOCUMENTATION is for: a description of the functions contained in the module (compiled or not), which parameters they take, which values they return, and what they are expected to do, the "meaning" of the task they perform.
As an abstract example, let's suppose you have an image processing class. If that class has the function findCircles(image), the documentation should explain that it takes an image, possibly containing circles, and returns a list or array of coordinates of the centers of circles contained in the image. HOW the circles are detected is not important, you don't need to know that to use the function. Now if the function was called like findCircles(image, gaussian_threshold=10), the caller would have to know the function uses some "gaussian_threshold" parameter, that is, the caller would NEED to know about the function's entrails, and in OOP this is Not Good. If you decided to use another algorithm in the future, every code using that function would have to be rewritten, because the gaussian_threshold most probably wouldn't make sense anymore.
So, the interface, in OOP, is the abstraction used to communicate to the object only the canonical parameters or inputs it needs to know to perform a task in the language of the problem, not in the language of the implementation (that can change anytime).
The documentation, in this sense, is a contract that assures to the user (in this case, another developer) that the function will perform as expected if sane inputs are given to it.
Now the FINAL USER, a non-technical person wanting to use your program, would need the WHOLE working program (controls and views), not only the class definitions (the model).
Hope this helps, and I must recommend the books "Code Complete 2nd ed." and "Pragmatic Programmer - From Journeyman to Master" as VERY enlightening readings on the broad topic.
How do I, at run-time (no LD_PRELOAD), intercept/hook a C function like fopen() on Linux, a la Detours for Windows? I'd like to do this from Python (hence, I'm assuming that the program is already running a CPython VM) and also reroute to Python code. I'm fine with just hooking shared library functions. I'd also like to do this without having to change the way the program is run.
One idea is to roll my own tool based on ptrace(), or on rewriting code found with dlsym() or in the PLT, and targeting ctypes-generated C-callable functions, but I thought I'd ask here first. Thanks.
You'll find from one of ltrace developer a way to do this. See this post, which includes a full patch in order to catch dynamically loaded library. In order to call it from python, you'll probably need to make a C module.
google-perftools has their own implementation of Detour under src/windows/preamble_patcher* . This is windows-only at the moment, but I don't see any reason it wouldn't work on any x86 machine except for the fact that it uses win32 functions to look up symbol addresses.
A quick scan of the code and I see these win32 functions used, all of which have linux versions:
GetModuleHandle/GetProcAddress : get the function address. dlsym can do this.
VirtualProtect : to allow modification of the assembly. mprotect.
GetCurrentProcess: getpid
FlushInstructionCache (apparently a nop according to the comments)
It doesn't seem too hard to get this compiled and linked into python, but I'd send a message to the perftools devs and see what they think.
This may seem like a weird question, but I would like to know how I can run a function in a .dll from a memory 'signature'. I don't understand much about how it actually works, but I needed it badly. Its a way of running unexported functions from within a .dll, if you know the memory signature and adress of it.
For example, I have these:
respawn_f "_ZN9CCSPlayer12RoundRespawnEv"
respawn_sig "568BF18B06FF90B80400008B86E80D00"
respawn_mask "xxxxx?xxx??xxxx?"
And using some pretty nifty C++ code you can use this to run functions from within a .dll.
Here is a well explained article on it:
http://wiki.alliedmods.net/Signature_Scanning
So, is it possible using Ctypes or any other way to do this inside python?
If you can already run them using C++ then you can try using SWIG to generate python wrappers for the C++ code you've written making it callable from python.
http://www.swig.org/
Some caveats that I've found using SWIG:
Swig looks up types based on a string value. For example
an integer type in Python (int) will look to make sure
that the cpp type is "int" otherwise swig will complain
about type mismatches. There is no automatic conversion.
Swig copies source code verbatim therefore even objects in the same namespace
will need to be fully qualified so that the cxx file will compile properly.
Hope that helps.
You said you were trying to call a function that was not exported; as far as I know, that's not possible from Python. However, your problem seems to be merely that the name is mangled.
You can invoke an arbitrary export using ctypes. Since the name is mangled, and isn't a valid Python identifier, you can use getattr().
Another approach if you have the right information is to find the export by ordinal, which you'd have to do if there was no name exported at all. One way to get the ordinal would be using dumpbin.exe, included in many Windows compiled languages. It's actually a front-end to the linker, so if you have the MS LinK.exe, you can also use that with appropriate commandline switches.
To get the function reference (which is a "function-pointer" object bound to the address of it), you can use something like:
import ctypes
func = getattr(ctypes.windll.msvcrt, "##myfunc")
retval = func(None)
Naturally, you'd replace the 'msvcrt' with the dll you specifically want to call.
What I don't show here is how to unmangle the name to derive the calling signature, and thus the arguments necessary. Doing that would require a demangler, and those are very specific to the brand AND VERSION of C++ compiler used to create the DLL.
There is a certain amount of error checking if the function is stdcall, so you can sometimes fiddle with things till you get them right. But if the function is cdecl, then there's no way to automatically check. Likewise you have to remember to include the extra this parameter if appropriate.