Python: How are objects/variables such as the complex '1j' defined?

And how can I define my own types of variables/objects so that they behave with the operations already provided in Python?
Can I do this without having to create my own object classes and operators as a separate and distinct entity? I don't mind having to create my own object class (that's a given), but I want it to integrate flawlessly with the already existing constructs. So, emphasis on my wish to avoid "separate and distinct".
I'm trying to create a quaternion object class. I want to define a 1i and 1k that are distinct from 1j.
And yes, a package might already exist; this is purely academic and for my own programming practice and understanding. I'm trying to extend what is already there, and not build something that is distinct and separate.
I already write my own object classes, but unfortunately they require a redefinition of the basic operations in order to be usable, and even then I have to "declare" these objects before I can use them, quite unlike '1j'.
I hope I am clear with my intent. The end result of a quaternion is not my intent; it is the types of methods, objects and generalizations I'm trying to figure out how to do, to extend and make use of what is already built into Python.
It seems to me whoever built numpy and cmath has already achieved this endeavor.
Thanks to the commentary below, I now have the vocabulary to express my intent better.
I'm trying to add new features to Python's SYNTAX.
If anyone can offer resources on how to do this, I'd appreciate it.

I see two options for you here:
Change the Python syntax (fork CPython); there's a surprising number of articles about how to do that.
Build some kind of preprocessor like Mypy.
Either way it seems like too much trouble just to have a new literal value.

Python does not support custom operators nor custom literals.
A language that supports custom literals is C++ (since C++11 I believe), but it does not support custom operators.
A language that supports custom operators is, for example, Haskell.
If you want to add this feature to Python you'll have to take the Python sources, modify the grammar, modify the lexer/parser and, most importantly, the compiler.
However, at that point you have effectively created a new language, which means you have broken compatibility with Python.
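Note that what Python does support is overloading the existing operators: a class that implements the dunder protocol (__add__, __radd__, __mul__, ...) mixes freely with the built-in numeric types, and module-level constants get you close to the literal feel. The missing piece is only the 1i/1k spelling itself. A minimal, hypothetical sketch (the coercion rules, in particular mapping a complex value's imaginary part onto one axis, are a modeling choice, not an established convention):

class Quaternion(object):
    def __init__(self, w=0.0, x=0.0, y=0.0, z=0.0):
        self.w, self.x, self.y, self.z = w, x, y, z

    @staticmethod
    def _coerce(value):
        # lift plain numbers (and complex) into quaternions
        if isinstance(value, Quaternion):
            return value
        if isinstance(value, complex):
            return Quaternion(value.real, 0.0, value.imag, 0.0)
        return Quaternion(float(value))

    def __add__(self, other):
        other = self._coerce(other)
        return Quaternion(self.w + other.w, self.x + other.x,
                          self.y + other.y, self.z + other.z)

    __radd__ = __add__  # makes expressions like 2 + q work too

    def __mul__(self, scalar):
        # scalar multiplication only; the full Hamilton product is omitted
        s = float(scalar)
        return Quaternion(self.w * s, self.x * s, self.y * s, self.z * s)

    __rmul__ = __mul__

    def __repr__(self):
        return 'Quaternion(%r, %r, %r, %r)' % (self.w, self.x, self.y, self.z)

i = Quaternion(x=1.0)
k = Quaternion(z=1.0)
print(1 + 2 * i)  # Quaternion(1.0, 2.0, 0.0, 0.0)

Importing i and k from such a module is as close as stock Python gets to new literals; everything beyond that needs one of the approaches below.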
The easiest solution would simply be to write a simple preprocessor that replaces some simple syntax with an expanded equivalent. For example:
sed -E -i 's/([0-9]+)\+([0-9]+)i/MyComplex(\1, \2)/g' my_file.py
Then you can execute the preprocessor in the build step of your library/application.
This has the advantage of letting you write the code you want, but when you ship or use it, it is translated into normal Python, keeping 100% compatibility with existing installations.
I believe using import hooks it would be possible to avoid having to ship the preprocessed version of your library... basically the preprocessor could be included in the import step and done on the fly. This would avoid having to deal with temporary preprocessed files.
The only requirement would be that people who need to use your library will have to install the import hook somehow.
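A rough sketch of such a hook (the 3+4i syntax and the MyComplex name are invented for the example; this shows the mechanism, not a hardened implementation):

import re
import sys
from importlib.machinery import FileFinder, SourceFileLoader

class PreprocessingLoader(SourceFileLoader):
    def get_data(self, path):
        data = super().get_data(path)
        if path.endswith('.py'):
            text = data.decode('utf-8')
            # rewrite literals like 3+4i into MyComplex(3, 4)
            text = re.sub(r'\b(\d+)\+(\d+)i\b', r'MyComplex(\1, \2)', text)
            return text.encode('utf-8')
        return data

# put the hook in front of the standard path-based finder;
# note this simplistic finder handles .py files only
sys.path_hooks.insert(0, FileFinder.path_hook((PreprocessingLoader, ['.py'])))
sys.path_importer_cache.clear()

One caveat: the preprocessed source is what ends up in the bytecode cache, so if the rewrite rules change you may need to clear __pycache__.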

Related

Interpret Python bytecode in C# (with fine control)

For a project idea of mine, I have the following need, which is quite precise:
I would like to be able to execute Python code (pre-compiled before hand if necessary) on a per-bytecode-instruction basis. I also need to access what's inside the Python VM (frame stack, data stacks, etc.). Ideally, I would also like to remove a lot of Python built-in features and reimplement a few of them my own way (such as file writing).
All of this must be coded in C# (I'm using Unity).
I'm okay with losing a few of Python's actual features, especially concerning complicated stuff with imports, etc. However, I would like most of it to stay intact.
I looked a little bit into IronPython's code but it remains very obscure to me and it seems quite enormous too. I began translating Byterun (a Python bytecode interpreter written in Python) but I face a lot of difficulties as Byterun leverages a lot of Python's features to... interpret Python.
Today, I don't ask for a pre-made solution (except if you have one in mind?), but rather for some advice, places to look at, etc. Do you have any ideas about the things I should research first?
I've tried to do my own implementation of the Python VM in the distant past and learned a lot but never came even close to a fully working implementation. I used the C implementation as a starting point, specifically everything in https://github.com/python/cpython/tree/main/Objects and
https://github.com/python/cpython/blob/main/Python/ceval.c (look for switch(opcode))
Here are some pointers:
Come to grips with the Python object model. Implement an abstract PyObject class with the necessary methods for instancing, attribute access, indexing and slicing, calling, comparisons, arithmetic operations and representation. Provide concrete implementations for None, booleans, ints, floats, strings, tuples, lists and dictionaries.
Implement the core of your VM: a Frame object that loops over the opcodes and dispatches, using a giant switch statement (following the C implementation here), to the corresponding methods of the PyObject. The frame should maintain a stack of PyObjects for the operands of the opcodes. Depending on the opcode, arguments are popped from and pushed on this stack. A dict can be used to store and retrieve local variables. Use the Frame object to create a PyObject for function objects. (A toy version of this dispatch loop is sketched after this list.)
Get familiar with the idea of a namespace and the way Python builds on the concept of namespaces. Implement a module, a class and an instance object, using a dict to map (attribute) names to objects.
Finally, add as many builtin functions as you think you need to get a useful implementation.
I think it is easy to underestimate the amount of work you're getting yourself into, but ... have fun!
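To make the dispatch-loop idea concrete, here is a toy sketch in Python itself, driven by the dis module. It handles just enough opcodes to evaluate a trivial expression, and opcode names vary across CPython versions, so treat it as an illustration only:

import dis

def evaluate(code, names):
    stack = []
    for instr in dis.get_instructions(code):
        if instr.opname == 'RESUME':                        # bookkeeping opcode, 3.11+
            continue
        elif instr.opname == 'LOAD_NAME':
            stack.append(names[instr.argval])
        elif instr.opname == 'LOAD_CONST':
            stack.append(instr.argval)
        elif instr.opname in ('BINARY_ADD', 'BINARY_OP'):   # renamed in 3.11
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)                      # assumes the op is '+'
        elif instr.opname == 'RETURN_VALUE':
            return stack.pop()
        else:
            raise NotImplementedError(instr.opname)

print(evaluate(compile('a + b', '<expr>', 'eval'), {'a': 1, 'b': 2}))  # 3

A real VM replaces the Python values on the stack with your own PyObject implementations and the if/elif chain with the giant switch described above.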

How to apply Open-Closed and Inversion of Control principles in Python?

I'm building out a new application now and struggling a lot with the implementation part of the "Open-Closed" and "Inversion of Control" principles I'm following after reading the Clean Architecture book by Uncle Bob.
How can I implement them in Python?
Usually, these two principles come hand in hand and are depicted in UML as an interface reversing control from module/package A to B.
I'm confused because:
Python does not possess interfaces as Java and C++ do. Yes, there are ABCs and @abstractmethod, but that is not Pythonic style and is redundant from my point of view if you are not developing a framework.
Passing a class to the method of another one (I understand that is a way to implement the open-closed principle) is a little bit dangerous in Python, since it does not have a compiler to catch the issues that may (and will) happen if one of the two loosely coupled objects changes.
After neglecting interfaces and passing a top-level class to lower-level ones... I still need to import everything somewhere in the top module, and by that the whole thing is violated.
So, as you can see, I'm super confused and having a hard time programming according to the design I came up with. Can you help me, please?
You just pass an object that implements the methods you need it to implement.
True, there is no "Interface" to define what those methods have to be, but that's just the way it is in Python.
You pass around arguments all the time that have to be lists, maps, tuples, or whatever, and none of these are type-checked. You can write code that calls whatever you want on these things and python will not notice any kind of problem until that code is actually executed.
It's exactly the same when you need those arguments to implement whatever IoC interface you're using. Make sure you detail the requirements in comments.
Yes, this is all pretty dangerous. That's why we prefer statically typed languages for large systems that have complex interfaces.
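A minimal sketch of what "just pass an object" looks like in practice; every name here is made up for illustration:

class ReportService(object):
    """Depends on any object with a save(name, data) method."""
    def __init__(self, storage):
        # control is inverted: the caller injects the concrete storage
        self.storage = storage

    def publish(self, name, body):
        self.storage.save(name, body)

class FileStorage(object):
    def save(self, name, data):
        with open(name, 'w') as f:
            f.write(data)

class InMemoryStorage(object):
    def __init__(self):
        self.items = {}
    def save(self, name, data):
        self.items[name] = data

service = ReportService(InMemoryStorage())  # swap in FileStorage() freely
service.publish('q3.txt', 'quarterly report')

If you do want an explicit contract, abc.ABC with @abstractmethod gives you one that is checked at instantiation time, at the cost of the boilerplate the question complains about.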

Statically Typed Metaprogramming?

I've been thinking about what I would miss in porting some Python code to a statically typed language such as F# or Scala; the libraries can be substituted, the conciseness is comparable, but I have lots of Python code which is as follows:
@specialclass
class Thing(object):
    @specialFunc
    def method1(self, arg1, arg2):
        ...

    @specialFunc
    def method2(self, arg3, arg4, arg5):
        ...
Where the decorators do a huge amount: replacing the methods with callable objects with state, augmenting the class with additional data and properties, etc. Although Python allows dynamic monkey-patch metaprogramming anywhere, anytime, by anyone, I find that essentially all my metaprogramming is done in a separate "phase" of the program. i.e.:
load/compile .py files
transform using decorators
// maybe transform a few more times using decorators
execute code // no more transformations!
These phases are basically completely distinct; I do not run any application level code in the decorators, nor do I perform any ninja replace-class-with-other-class or replace-function-with-other-function in the main application code. Although the "dynamic"ness of the language says I can do so anywhere I want, I never go around replacing functions or redefining classes in the main application code because it gets crazy very quickly.
I am, essentially, performing a single re-compile on the code before I start running it.
The only similar metaprogramming I know of in statically typed languages is reflection: i.e. getting functions/classes from strings, invoking methods using argument arrays, etc. However, this basically converts the statically typed language into a dynamically typed language, losing all type safety (correct me if I'm wrong?). Ideally, I think, I would have something like the following:
load/parse application files
load/compile transformer
transform application files using transformer
compile
execute code
Essentially, you would be augmenting the compilation process with arbitrary code, compiled using the normal compiler, that will perform transformations on the main application code. The point is that it essentially emulates the "load, transform(s), execute" workflow while strictly maintaining type safety.
If the application code is borked the compiler will complain; if the transformer code is borked the compiler will complain; and if the transformer code compiles but doesn't do the right thing, either it will crash or the compilation step after it will complain that the final types don't add up. In any case, you will never get the runtime type errors possible when using reflection to do dynamic dispatch: it would all be statically checked at every step.
So my question is, is this possible? Has it already been done in some language or framework which I do not know about? Is it theoretically impossible? I'm not very familiar with compiler or formal language theory; I know it would make the compilation step Turing complete with no guarantee of termination, but it seems to me that this is what I would need to match the sort of convenient code transformation I get in a dynamic language while maintaining static type checking.
EDIT: One example use case would be a completely generic caching decorator. In Python it would be:
import functools

cacheDict = {}

def cache(func):
    @functools.wraps(func)
    def wrapped(*args, **kwargs):
        # kwargs is an unhashable dict, so freeze it for the cache key
        cachekey = hash((args, tuple(sorted(kwargs.items()))))
        if cachekey not in cacheDict:
            cacheDict[cachekey] = func(*args, **kwargs)
        return cacheDict[cachekey]
    return wrapped

@cache
def expensivepurefunction(arg1, arg2):
    # do stuff
    return result
While higher-order functions or objects-with-functions-inside can do some of this, AFAIK they cannot be generalized to work with any function taking an arbitrary set of parameters and returning an arbitrary type while maintaining type safety. I could do stuff like:
public Thingy wrap(Object O){ // this probably won't compile, but you get the idea
    return (params Object[] args) => {
        // check cache
        return InvokeWithReflection(O, args);
    };
}
But all the casting completely kills type safety.
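As an aside, Python's own annotations have since grown expressive enough for this particular example: with ParamSpec (Python 3.10+) a static checker such as mypy can verify that a generic decorator preserves the wrapped function's signature. A sketch:

import functools
from typing import Callable, ParamSpec, TypeVar

P = ParamSpec('P')
R = TypeVar('R')

def cache(func: Callable[P, R]) -> Callable[P, R]:
    store: dict = {}

    @functools.wraps(func)
    def wrapped(*args: P.args, **kwargs: P.kwargs) -> R:
        key = (args, tuple(sorted(kwargs.items())))
        if key not in store:
            store[key] = func(*args, **kwargs)
        return store[key]

    return wrapped

Decorators that change the signature, as the edit below asks for, remain much harder to express in any of these systems.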
EDIT: This is a simple example, where the function signature does not change. Ideally what I am looking for could modify the function signature, changing the input parameters or output type (à la function composition) while still maintaining type checking.
Very interesting question.
Some points regarding metaprogramming in Scala:
In Scala 2.10 there will be developments in Scala reflection.
There is work in source-to-source transformation (macros), which is what you are looking for: scalamacros.org
Java has introspection (through the reflection API) but does not allow self-modification. However, you can use tools to support this (such as Javassist). In theory you could use these tools in Scala to achieve more than introspection.
From what I could understand of your development process, you separate your domain code from your decorators (a cross-cutting concern, if you will), which lets you achieve modularity and code simplicity. This could be a good fit for aspect-oriented programming, which allows just that. For Java there is a library (AspectJ), but I'm dubious whether it will work with Scala.
So my question is, is this possible?
There are many ways to achieve the same effect in statically-typed programming languages.
You have essentially described the process of doing some term rewriting on a program before executing it. This functionality is perhaps best known in the form of the Lisp macro but some statically typed languages also have macro systems, most notably OCaml's camlp4 macro system which can be used to extend the language.
More generally, you are describing one form of language extensibility. There are many alternatives and different languages provide different techniques. See my blog post Extensibility in Functional Programming for more information. Note that many of these languages are research projects so the motivation is to add novel features and not necessarily good features, so they rarely retrofit good features that were invented elsewhere.
The ML (meta language) family of languages including Standard ML, OCaml and F# were specifically designed for metaprogramming. Consequently, they tend to have awesome support for lexing, parsing, rewriting, interpreting and compiling. However, F# is the most far removed member of this family and lacks the mature tools that languages like OCaml benefit from (e.g. camlp4, ocamllex, dypgen, menhir etc.). F# does have a partial implementation of fslex, fsyacc and a Haskell-inspired parser combinator library called FParsec.
You may well find that the problem you are facing (which you have not described) is better solved using more traditional forms of metaprogramming, most notably a DSL or EDSL.
Without knowing why you're doing this, it's difficult to know whether this kind of approach is the right one in Scala or F#. But ignoring that for now, it's certainly possible to achieve in Scala, at least, although not at the language level.
A compiler plugin gives you access to the tree and allows you to perform all kinds of manipulation of that tree, all fully typechecked.
There are some issues with generating synthetic methods in Scala compiler plugins - it's difficult for me to know whether that will be a problem for you.
It is possible to work around this by creating a compiler plugin that generates source code which is then compiled in a separate pass. This is how ScalaMock works, for instance.
You might be interested in source-to-source program transformation systems (PTS).
Such tools parse the source code, producing an AST, and then allow one to define arbitrary analyses and/or transformations on the code, finally regenerating source code from the modified AST.
Some tools provide parsing, tree building and AST navigation by a procedural interface, such as ANTLR. Many of the more modern dynamic languages (Python, Scala, etc.) have had some self-hosting parser libraries built, and even Java (compiler plug-ins) and C# (open compiler) are catching on to this idea.
But mostly these tools only provide procedural access to the AST. A system with surface-syntax rewriting allows you to express "if you see this change it to that" using patterns with the syntax of the language(s) being manipulated. These include Stratego/XT and TXL.
It is our experience that manipulating complex languages requires complex compiler support and reasoning; this is the canonical lesson from 70 years of people building compilers. All of the above tools suffer from not having access to symbol tables and various kinds of flow analysis; after all, how one part of the program operates depends on actions taken in remote parts, so information flow is fundamental. [As noted in comments on another answer, you can implement symbol tables/flow analysis with those tools; my point is that they give you no special support for doing so, and these are difficult tasks, even worse on modern languages with complex type systems and control flows.]
Our DMS Software Reengineering Toolkit is a PTS that provides all of the above facilities (Life After Parsing), at some cost in configuring it to your particular language or DSL, which we try to ameliorate by providing these off-the-shelf for mainstream languages. [DMS provides explicit infrastructure for building/managing symbol tables, control and data flow; this has been used to implement these mechanisms for Java 1.8 and full C++14].
DMS has also been used to define meta-AOP: tools that enable one to build AOP systems for arbitrary languages and apply AOP-like operations.
In any case, to the extent that you simply modify the AST, directly or indirectly, you have no guarantee of "type safety". You can only get that by writing transformation rules that don't break it. For that, you'd need a theorem prover to check that each modification (or composition of such) didn't break type safety, and that's pretty much beyond the state of the art. However, you can be careful how you write your rules, and get pretty useful systems.
You can see an example of a specification of a DSL and manipulation with surface-syntax source-to-source rewriting rules, which preserves semantics, in this example that defines and manipulates algebra and calculus using DMS. I note this example is kept simple to make it understandable; in particular, it does not exhibit any of the flow analysis machinery DMS offers.
Ideally what I am looking for could modify the function signature, changing the input parameters or output type (à la function composition) while still maintaining type checking.
I have the same need: making R APIs available in the type-safe world. This way we would bring the wealth of scientific code from R into the (type-)safe world of Scala.
Rationale
Make it possible to document the business-domain aspects of the APIs through Specs2 (see https://etorreborre.github.io/specs2/guide/SPECS2-3.0/org.specs2.guide.UserGuide.html; the guide is generated from Scala code). Think Domain-Driven Design applied backwards.
Take a language-oriented approach to the challenges faced by SparkR, which tries to combine Spark with R.
See https://spark-summit.org/east-2015/functionality-and-performance-improvement-of-sparkr-and-its-application/ for attempts to improve how it is currently done in SparkR. See also https://github.com/onetapbeyond/renjin-spark-executor for a simplistic way to integrate.
In terms of a solution, we could use Renjin (a Java-based R interpreter) as the runtime engine, but use Stratego/XT (Metaborg) to parse R and generate strongly typed Scala APIs (as you describe).
Stratego/XT (http://www.metaborg.org/en/latest/) is the most powerful DSL development platform I know. It allows combining/embedding languages using a parsing technology that supports composing languages (longer story).

How to document and test interfaces required of formal parameters in Python 2?

To ask my very specific question I find I need quite a long introduction to motivate and explain it -- I promise there's a proper question at the end!
While reading part of a large Python codebase, sometimes one comes across code where the interface required of an argument is not obvious from "nearby" code in the same module or package. As an example:
def make_factory(schema):
    entity = schema.get_entity()
    ...
There might be many "schemas" and "factories" that the code deals with, and "def get_entity()" might be quite common too (or perhaps the function doesn't call any methods on schema, but just passes it to another function). So a quick grep isn't always helpful to find out more about what "schema" is (and the same goes for the return type). Though "duck typing" is a nice feature of Python, sometimes the uncertainty in a reader's mind about the interface of arguments passed in as the "schema" gets in the way of quickly understanding the code (and the same goes for uncertainty about typical concrete classes that implement the interface). Looking at the automated tests can help, but explicit documentation can be better because it's quicker to read. Any such documentation is best when it can itself be tested so that it doesn't get out of date.
Doctests are one possible approach to solving this problem, but that's not what this question is about.
Python 3 has a "parameter annotations" feature (part of the function annotations feature, defined in PEP 3107). The uses to which that feature might be put aren't defined by the language, but it can be used for this purpose. That might look like this:
def make_factory(schema: "xml_schema"):
    ...
Here, "xml_schema" identifies a Python interface that the argument passed to this function should support. Elsewhere there would be code that defines that interface in terms of attributes, methods & their argument signatures, etc. and code that allows introspection to verify whether particular objects provide an interface (perhaps implemented using something like zope.interface / zope.schema). Note that this doesn't necessarily mean that the interface gets checked every time an argument is passed, nor that static analysis is done. Rather, the motivation of defining the interface is to provide ways to write automated tests that verify that this documentation isn't out of date (they might be fairly generic tests so that you don't have to write a new test for each function that uses the parameters, or you might turn on run-time interface checking but only when you run your unit tests). You can go further and annotate the interface of the return value, which I won't illustrate.
So, the question:
I want to do exactly that, but using Python 2 instead of Python 3. Python 2 doesn't have the function annotations feature. What's the "closest thing" in Python 2? Clearly there is more than one way to do it, but I suspect there is one (relatively) obvious way to do it.
For extra points: name a library that implements the one obvious way.
Take a look at plac, which uses annotations to define a command-line interface for a script. On Python 2.x it uses the plac.annotations() decorator.
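The underlying pattern is simple enough to sketch: since functions are objects, a decorator can attach the same mapping that PEP 3107 syntax would have recorded (the decorator shown here is an illustration, not plac's actual implementation):

def annotations(**ann):
    """Attach Python-3-style annotations to a Python 2 function."""
    def decorate(func):
        func.__annotations__ = ann
        return func
    return decorate

@annotations(schema="xml_schema")
def make_factory(schema):
    entity = schema.get_entity()
    # ... build and return the factory ...

print(make_factory.__annotations__)  # {'schema': 'xml_schema'}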
The closest thing is, I believe, an annotation library called PyAnno.
From the project webpage:
"The Pyanno annotations have two functions:
Provide a structured way to document Python code
Perform limited run-time checking "

Are multiple classes in a single file recommended? [duplicate]

Possible Duplicate:
How many Python classes should I put in one file?
Coming from a C++ background I've grown accustomed to organizing my classes such that, for the most part, there's a 1:1 ratio between classes and files. By making it so that a single file contains a single class I find the code more navigable. As I introduce myself to Python I'm finding lots of examples where a single file contains multiple classes. Is that the recommended way of doing things in Python? If so, why?
Am I missing this convention in PEP 8?
Here are some possible reasons:
Python is not exclusively class-based - the natural unit of code decomposition in Python is the module. Modules are just as likely to contain functions (which are first-class objects in Python) as classes. In Java, the unit of decomposition is the class. Hence, Python has one module=one file, and Java has one (public) class=one file.
Python is much more expressive than Java, and if you restrict yourself to one class per file (which Python does not prevent you from doing) you will end up with lots of very small files - more to keep track of with very little benefit.
An example of roughly equivalent functionality: Java's log4j => a couple of dozen files, ~8000 SLOC. Python logging => 3 files, ~ 2800 SLOC.
There's a mantra, "flat is better than nested," that generally discourages an overuse of hierarchy. I'm not sure there are any hard-and-fast rules as to when you want to create a new module; for the most part, people just use their discretion to group logically related functionality (classes and functions that pertain to a particular problem domain).
Good thread from the Python mailing list, and a quote by Fredrik Lundh:
even more important is that in Python, you don't use classes for everything; if you need factories, singletons, multiple ways to create objects, polymorphic helpers, etc, you use plain functions, not classes or static methods.
once you've gotten over the "it's all classes", use modules to organize things in a way that makes sense to the code that uses your components. make the import statements look good.
The book Expert Python Programming has some related discussion in
Chapter 4, "Choosing Good Names": "Building the Namespace Tree" and "Splitting the Code".
My crude one-line summary: collecting related classes into one module (source file), and related modules into one package, is helpful for code maintenance.
In Python, classes can also be used for small tasks (just for grouping, etc.). Maintaining a 1:1 relation would result in having too many files with small or little functionality.
There is no specific convention for this - do whatever makes your code the most readable and maintainable.
A good example of not having separate files for each class might be the models.py file within a Django app. Each Django app may have a handful of classes that are related to that app, and putting them into individual files just makes more work.
Similarly, having each view in a different file again is likely to be counterproductive.
