How can I learn more about Python’s internals? [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 7 years ago.
I have been programming in Python for slightly more than half a year now, and I am more interested in Python internals than in using Python to develop applications. Currently I am working on porting a few libraries from Python 2 to Python 3. However, I have only a rather abstract view of how to port things from Python 2 to Python 3, as most of the changes deal with design issues in Python 2.x.
I'd like to learn more about Python internals; should I go for a top-down or a bottom-up approach? Are there any references you could recommend?

It sounds like you want to know more about the rationale behind the design of the language, rather than internals. "internals" to me means things like how objects are laid out in memory, how reference counting works, and so on.
If you're looking for a deeper understanding of the design decisions, try reading the PEPs: they are the proposals for changes in the language, and often include detailed discussions of the reasons for the changes, rejected alternatives, and so on. Even the rejected PEPs are useful, because they show the thinking that has shaped the language.
For example:
3105: Making print a function
3110: Catching exceptions in Python 3.x
3131: Supporting non-ASCII identifiers
and so on..
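As a rough illustration of what those three PEPs changed, here is a minimal Python 3 sketch. The variable names are made up for the example; only the language features themselves come from the PEPs.

```python
import io

# PEP 3105: print is an ordinary function, so it takes keyword
# arguments such as sep, end and file.
buffer = io.StringIO()
print("spam", "eggs", sep="-", file=buffer)

# PEP 3110: the caught exception is bound with "as", and the name is
# unbound again when the except block ends, so copy what you need.
try:
    int("not a number")
except ValueError as exc:
    message = str(exc)

# PEP 3131: identifiers may contain non-ASCII letters.
größe = 42
```

Reading the PEP text next to a snippet like this makes it easier to see which Python 2 design issue each change was meant to address.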
If you really want to learn about Python internals, then start by reading about the Python C API, which is used to build Python itself: my talk A Whirlwind Excursion through Python C Extensions is one place to start. Then you can dive into the Python source code itself for anything you need to learn about.
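Before opening the C source, some of these internals can be poked at from Python itself. The sketch below assumes CPython (reference counts and bytecode are implementation details) and uses only the standard `sys` and `dis` modules; the function `add` is just a placeholder.

```python
import dis
import sys

# Reference counting: binding a second name to the same object
# raises its reference count by one (CPython-specific behaviour).
value = object()
before = sys.getrefcount(value)
alias = value
after = sys.getrefcount(value)

def add(a, b):
    return a + b

# Bytecode: list the opcode names the compiler produced for add().
# The exact opcodes differ between CPython versions.
instructions = [ins.opname for ins in dis.get_instructions(add)]
```

Calling `dis.dis(add)` prints the same instructions in a readable table, which is a handy companion when reading the interpreter loop in the source.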

For anyone stumbling upon this question from related links or a search: Yaniv Aknin has written a series of articles on Python internals. It starts from scratch and is highly readable.

I find Yaniv Aknin's Python's Innards series fantastic, too. I discovered it thanks to Planet Python.
You may also be interested in TryPyPy's answer in this SO thread.

I would first read the What's New document for Python 3. It gives a good high-level overview and touches on the detailed changes.
You might also do a search for 'porting to python 3' or similar. There are lots of good resources and tools.
One tool that's new and hard to find is six, by Benjamin Peterson. It enables writing of code that is compatible across the Python 2*3 gap.
The part I found most difficult about maintaining code compatible with both Python 2 and Python 3 was deployment. I could write code that would run just fine, but when I went to package and deploy, it was unclear when the conversion should happen. I ultimately found a distutils command, build_py_2to3, that would do the trick. By using that command in my setup.py, I could release a source distribution that would deploy on either Python 2 or Python 3. An example can be found in jaraco.util.
You also asked about the internals. If you really want to get at the internals, you can view the source for Python 2.x and Python 3.x, though honestly, I would stick with reading the tutorials and maybe some of the .py files in the Python libs.

should I go for a top-down or a bottom-up approach?
Both! Seriously.

Have you tried this?
Automated Python 2 to 3 code translation

Related

What's the best way to do literate programming in Python on Windows? [closed]

Closed 5 years ago.
I've been playing with various ways of doing literate programming in Python. I like noweb, but I have two main problems with it: first, it is hard to build on Windows, where I spend about half my development time; and second, it requires me to indent each chunk of code as it will be in the final program --- which I don't necessarily know when I write it. I don't want to use Leo, because I'm very attached to Emacs.
Is there a good literate programming tool that:
Runs on Windows
Allows me to set the indentation of the chunks when they're used, not when they're written
Still lets me work in Emacs
Thanks!
Correction: noweb does allow me to indent later --- I misread the paper I found on it.
By default, notangle preserves whitespace and maintains indentation when expanding chunks. It can therefore be used with languages like Miranda and Haskell, in which indentation is significant
That leaves me with only the "Runs on Windows" problem.

I have written Pweave (http://mpastell.com/pweave), which is aimed at dynamic report generation and uses noweb syntax. It is a pure Python script, so it also runs on Windows. It doesn't fix your indent problem, but maybe you can modify it for that; the code is really quite simple.

The de-facto standard in the community is IPython notebooks.
Excellent example in which Peter Norvig demonstrates algorithms to solve the Travelling Salesman Problem: https://nbviewer.org/url/norvig.com/ipython/TSP.ipynb
More examples listed at https://github.com/jupyter/jupyter/wiki

I did this:
http://sourceforge.net/projects/pywebtool/
You can get any number of web/weave products that will help you construct a document and code in one swoop.
You can -- pretty easily -- write your own. It's not rocket science to yank the Python code blocks out of RST source and assemble it. Indeed, I suggest you write your own Docutils directives to assemble the Python code from an RST source document.
You run the RST through docutils rst2html (or Sphinx) to produce your final HTML report.
You run your own utility on the same RST source to extract the Python code blocks and produce the final modules.
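The "write your own" extraction step could look roughly like the sketch below. It is not the pywebtool code: the function name `extract_python_blocks` is made up, it does plain line scanning instead of using Docutils directives, and it assumes the code lives in unindented `.. code-block:: python` directives without options.

```python
import textwrap

DIRECTIVE = ".. code-block:: python"

def extract_python_blocks(rst_text):
    """Return the dedented body of every Python code-block directive."""
    blocks = []
    current = None  # lines of the block being collected, or None
    for line in rst_text.splitlines():
        if current is not None:
            # Blank and indented lines still belong to the block.
            if line.strip() == "" or line[0] in " \t":
                current.append(line)
                continue
            # The first unindented line ends the block.
            blocks.append(current)
            current = None
        if line.strip() == DIRECTIVE:
            current = []
    if current is not None:
        blocks.append(current)
    return [textwrap.dedent("\n".join(b)).strip("\n") + "\n" for b in blocks]
```

Writing the returned strings to a module file, in order, gives the "tangle" half; the same RST file still goes through rst2html or Sphinx for the "weave" half.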

You could use org-mode and babel-tangle.
That works quite well, since you can give :noweb-ref to source blocks.
Here’s a minimal example: Activate org-babel-tangle, then put this into the file noweb-test.org:
#+begin_src python :exports none :noweb-ref c
abc = "abc"
#+end_src
#+begin_src python :noweb yes :tangle noweb-test.py
def x():
    <<c>>
    return abc
print(x())
#+end_src
You can also use properties of headlines for giving the noweb-ref. It can then even automatically concatenate several source blocks into one noweb reference.
Add :results output to the #+begin_src line of the second block to see the print results under that block when you hit C-c C-c in the block.

You might find noweb 3 easier to build on Windows. It was designed to be more portable than standard noweb.

Found this tool to be useful: https://github.com/bslatkin/pyliterate

See also my latest LP tool: https://code.google.com/archive/p/nano-lp/. It does not require a special input format; it supports Markdown/MultiMarkdown, reStructuredText, OpenOffice/LibreOffice, Creole and TeX/LaTeX, and it has a very light and clean syntax, so no more cryptic literate programs.

Are there programmatic tools for Perl to Python conversion? [closed]

Closed 8 years ago.
In my new job more people are using Python than Perl, and I have a very useful API that I wrote myself and I'd like to make available to my co-workers in Python.
I thought that a compiler that compiled Perl code into Python code would be really useful for such a task. Before trying to write something that parsed Perl (or at least, the subset of Perl that I've used in defining my API), I came across bridgekeeper from a consultancy.
It's almost certainly not worth the money for me to engage a consultancy to translate this API, but that's a really interesting tool.
Does anyone know of a compiler that will parse (or try to parse!) Perl5 code and compile it into Python? If there isn't such a thing, how should I start writing a simple compiler that parses my object-oriented Perl code and turns it into Python? Is there an ANTLR or YACC grammar that I can use as a starting point?
Edit: I found perl.y, which might be a starting point if I were to roll my own compiler.

James,
I recommend that you just rewrite the module in Python, for several reasons:
Parsing Perl is DARN HARD. Unless this is an important and desirable exercise for you, you'll find yourself spending much more time on the translation than on useful work.
By rewriting it, you'll have a great chance to practice Python. Learning is best done by doing, and having a task you really need done is a great boon.
Finally, Python and Perl have quite different philosophies. To get a more Pythonic API, it's best to just rewrite it in Python.

I think you should rewrite your code. The quality of the results of a parsing effort depends on your Perl coding style.
I think the quote below sums up the theoretical side very well.
From the Wikipedia article on Perl:
Perl has a Turing-complete grammar because parsing can be affected by run-time code executed during the compile phase.[25] Therefore, Perl cannot be parsed by a straight Lex/Yacc lexer/parser combination. Instead, the interpreter implements its own lexer, which coordinates with a modified GNU bison parser to resolve ambiguities in the language.
It is often said that "Only perl can parse Perl," meaning that only the Perl interpreter (perl) can parse the Perl language (Perl), but even this is not, in general, true. Because the Perl interpreter can simulate a Turing machine during its compile phase, it would need to decide the Halting Problem in order to complete parsing in every case. It's a long-standing result that the Halting Problem is undecidable, and therefore not even Perl can always parse Perl. Perl makes the unusual choice of giving the user access to its full programming power in its own compile phase. The cost in terms of theoretical purity is high, but practical inconvenience seems to be rare.
Other programs that undertake to parse Perl, such as source-code analyzers and auto-indenters, have to contend not only with ambiguous syntactic constructs but also with the undecidability of Perl parsing in the general case. Adam Kennedy's PPI project focused on parsing Perl code as a document (retaining its integrity as a document), instead of parsing Perl as executable code (which not even Perl itself can always do). It was Kennedy who first conjectured that, "parsing Perl suffers from the 'Halting Problem'."[26], and this was later proved.[27]

Starting in 5.10, you can compile perl with the experimental Misc Attribute Decoration enabled and set the PERL_XMLDUMP environment variable to a filename to get an XML dump of the parse tree (including comments - very helpful for language translators). Though as the doc says, this is a work in progress.

I never tried it and it seems unmaintained, but maybe PyPerl is an option?
How big is this API? If it really is this useful, then why don't you rewrite it in Python? Writing an automatic converter will probably take longer than rewriting the API.
And even if you manage to automatically rewrite it, the resulting code probably won't be very pythonic anyway.
Be sure to check out the answers by weismat and eliben

As much as it might be fun to convert it to or rewrite it in python, I wouldn't make either of those my first choice. Then you'd be stuck with a forked code base. Any modifications you make will have to be duplicated.
Write some sort of wrapper for your API that you can access from outside of Perl. One possibility is a RESTful interface. Another, if you don't want to deal with networking issues, is to create a set of command line tools that access the API (possibly passing information as JSON). Then you can write an easy python library which accesses the wrapper API using httplib2 or subprocess (depending on how you've implemented the wrapper).
You'll still have to update the Python API whenever the interface changes, but now it's only for interface changes.
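A minimal sketch of the command-line variant, assuming the wrapper reads a JSON request on stdin and prints a JSON reply on stdout. The helper name `call_wrapper` and the script name in the comment are hypothetical, and a small Python one-liner stands in for the Perl tool so the example runs on its own.

```python
import json
import subprocess
import sys

def call_wrapper(command, payload):
    """Send payload as JSON on stdin and parse the JSON the tool prints."""
    completed = subprocess.run(
        command,
        input=json.dumps(payload),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return json.loads(completed.stdout)

# Stand-in for the real tool; in practice this would be something like
# ["perl", "my_api_cli.pl"] (hypothetical script name).
FAKE_TOOL = [
    sys.executable,
    "-c",
    "import json, sys; req = json.load(sys.stdin); "
    "print(json.dumps({'echo': req['name'].upper()}))",
]

result = call_wrapper(FAKE_TOOL, {"name": "widget"})
```

The Python library then stays a thin layer: each public function builds a payload, calls the wrapper and returns the decoded reply, so only interface changes on the Perl side need mirroring.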

You could try writing a parser with PPI, dump it to some intermediary form and write Python mechanically from there. Hard, but doable. Useful? Er....
Or you could port your code to Perl 6 and wait for Pynie to be ready enough to allow direct calls from Python to Perl 6 within the same runtime! It's not that far away after all. Too bad Ponie's dead though.

https://perthon.sourceforge.net could probably work? While it is still in alpha, I see a lot of potential.

What's the best online tutorial for starting with Spring Python [closed]

Closed 2 years ago.
Spring Python seems to be the gold-standard for how to define good quality APIs in Python - it's based on Spring which also seems to be the gold-standard for Java APIs.
My manager has complained (with good reason) that our APIs are in a mess - we need to impose some order on them. Since we will be re-factoring it makes sense to take advantage of what is considered best practice - so we would like to consider Spring.
Could somebody point me to the best learning resources for getting started with Spring? I've googled for a while and not found anything which seems to start from first principles. I'm looking for something which assumes good knowledge of Python but zero knowledge of Spring on other platforms or its principles.

How did you come to decide on Spring Python as your API of choice? Spring works well on Java where there's a tradition of declarative programming; defining your application primarily using XML to control a core engine is a standard pattern in Java.
In Python, while the underlying patterns like Inversion of Control are still apposite (depending on your use case), the implementation chosen by Spring looks like a classic case of something produced by a Java programmer who doesn't want to learn Python. See the oft-referenced article Python is Not Java.
I applaud your decision to introduce order and thoughtfulness to your codebase, but you may wish to evaluate a number of options before making your decision. In particular, you may find that using Spring Python will make it difficult to hire good Python programmers, many of whom will run the other way when faced with 1000-line XML files describing object interactions.
Perhaps start by re-examining what you really want to accomplish. The problem cannot simply be that "you need a framework". There are lots of frameworks out there, and it's hard to evaluate a) if you truly need one and b) which one will work if you haven't identified what underlying software problems you need to solve.
If the real problem is that your code is an unmaintainable mess, introducing a framework probably won't fix the issue. Instead of just messy code, you'll have code that is messy in someone else's style :-) Perhaps rigour in the dev team is where you should recommend starting first: good planning, code reviews, stringent hiring practices, a "cleanup" release, etc...
Good luck with the research.

I won't go so far as to suggest that Spring Python is bad (because I don't know enough about it). But, to call Spring Python the "gold standard for Python APIs" is a stretch. To me, it seems that Spring Python is more of a way to allow Python apps to interact with Java Apps using Spring.
At any rate, after taking a cursory glance at the official documentation, it seems fairly easy to understand for me, having decent knowledge of Python but no knowledge of Spring. Aside from the fact that it almost looks like Java code where the author forgot the type names, semicolons, and curly braces. :-)

Is there a library that will detect the source code language of a block of code? [closed]

Closed 8 years ago.
Writing a python script and it needs to find out what language a block of code is written in. I could easily write this myself, but I'd like to know if a solution already exists.
Pygments is insufficient and unreliable.

Pygments can guess too. Here is an example from the documentation:
>>> from pygments.lexers import guess_lexer, guess_lexer_for_filename
>>> guess_lexer('#!/usr/bin/python\nprint "Hello World!"')
<pygments.lexers.PythonLexer>
>>> guess_lexer_for_filename('test.py', 'print "Hello World!"')
<pygments.lexers.PythonLexer>

I guess you should try what this very site uses: google-code-prettify (from this question)
[EDIT]J.F. Sebastian pointed me to Pygments (see this answer)

This can be a little difficult to do reliably. For example, what language is the following:
print("blah");
The most reliable way (aside from having the user select the correct language, of course) is to check if the first line starts with #! ("hashbang") - whatever comes after this is the interpreter for the scripting language.
That will work reliably for a lot of scripting languages (including python, shell scripting, perl, ruby etc etc..), but not for compiled languages..
You could look for unique syntax stylings, or specific keywords and weight each one towards a specific language. For example $#somevar is probably Perl. somevar.each do |another| ..... end is probably ruby.. but this would end up being a lot of work, and will not always work (especially with short code blocks)
The other obvious way is to use the file-extension. If it's *.pl it's probably Perl code..
What are you trying to achieve? If you want to syntax highlight, look at what google-code-prettify does - basically a reasonably intelligent, generic syntax highlighter..
In the ambiguous example above, print is probably a statement or function name, "blah" is probably a string. If you highlight those two differently, you've successfully highlighted a lot of different languages, without having to detect which one it actually is.. but that may not always work, depending on the task..
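The hashbang check and the keyword weighting described above could be combined roughly as follows. This is only a sketch: the rule table is a made-up, deliberately tiny sample, the weights are arbitrary, and `guess_language` is a hypothetical name, not part of any library.

```python
import re

# Hypothetical rule set: (pattern, language, weight).
RULES = [
    (re.compile(r"\$#\w+"), "perl", 3),
    (re.compile(r"\bmy\s+[$@%]\w+"), "perl", 2),
    (re.compile(r"\.each\s+do\s*\|"), "ruby", 3),
    (re.compile(r"^\s*end\s*$", re.MULTILINE), "ruby", 1),
    (re.compile(r"^\s*def\s+\w+\(.*\):\s*$", re.MULTILINE), "python", 3),
    (re.compile(r"^\s*(from\s+\w+\s+)?import\s+\w+", re.MULTILINE), "python", 1),
]

SHEBANG = re.compile(r"^#!\s*(\S+)(?:\s+(\S+))?")

def guess_language(code):
    """Return a language name, or None when nothing matches."""
    first_line = code.split("\n", 1)[0]
    match = SHEBANG.match(first_line)
    if match:
        interpreter = match.group(1).rsplit("/", 1)[-1]
        if interpreter == "env" and match.group(2):
            interpreter = match.group(2)  # "#!/usr/bin/env python3"
        return re.sub(r"[\d.]+$", "", interpreter)  # drop version suffix
    scores = {}
    for pattern, language, weight in RULES:
        hits = len(pattern.findall(code))
        if hits:
            scores[language] = scores.get(language, 0) + weight * hits
    if not scores:
        return None
    return max(scores, key=scores.get)
```

With these rules the ambiguous `print("blah");` line yields no match at all, which is the honest answer for a snippet that short.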

Ohcount has been developed for this exactly:
http://labs.ohloh.net/ohcount
They are using it at www.ohloh.net to measure people's contributions per language.
The bad news is that it is coded in ruby, but I am sure that you can integrate it one way or the other in python.

Since you asked this question, GitHub have released the code they use to detect programming languages, Linguist. In my experience, GitHub is very accurate.
Language detection
Linguist defines the list of all languages known to GitHub in a yaml file. In order for a file to be highlighted, a language and lexer must be defined there.
Most languages are detected by their file extension. This is the fastest and most common situation.
For disambiguating between files with common extensions, we use a bayesian classifier. For example, this helps us tell the difference between .h files which could be either C, C++, or Obj-C.
Ruby gem: http://rubygems.org/gems/github-linguist
If you can't use Ruby for whatever reason, the logic is simple enough to port https://github.com/github/linguist/blob/master/lib/linguist/language.rb

Vim uses a bunch of interesting tests and regular expressions to look for certain file formats. You can look at the vim instruction file at vim/vim71/filetype.vim, or here online.

what language a block of code is written in
What are your alternatives, among what languages? There is no way to determine this universally. But if you narrow your focus there is probably a tool somewhere

You can check highlight.js, which automatically highlights code blocks; they say they use some kind of heuristic method to accomplish this: http://softwaremaniacs.org/soft/highlight/en/

As others have said, Pygments will be your best bet.

What refactoring tools do you use for Python? [closed]

Closed 3 years ago.
I have a bunch of classes I want to rename. Some of them have names that are small and that name is reused in other class names, where I don't want that name changed. Most of this lives in Python code, but we also have some XML code that references class names.
Simple search and replace only gets me so far. In my case, I want to rename AdminAction to AdminActionPlug and AdminActionLogger to AdminActionLoggerPlug, so the first one's search-and-replace would also hit the second, wrongly.
Does anyone have experience with Python refactoring tools ? Bonus points if they can fix class names in the XML documents too.

In the meantime, I've tried two tools that have some sort of integration with vim.
The first is Rope, a python refactoring library that comes with a Vim (and emacs) plug-in. I tried it for a few renames, and that definitely worked as expected. It allowed me to preview the refactoring as a diff, which is nice. It is a bit text-driven, but that's alright for me, just takes longer to learn.
The second is Bicycle Repair Man which I guess wins points on name. Also plugs into vim and emacs. Haven't played much with it yet, but I remember trying it a long time ago.
Haven't played with either enough yet, or tried more types of refactoring, but I will do some more hacking with them.

I would strongly recommend PyCharm - not just for refactorings. Since the first PyCharm answer was posted here a few years ago the refactoring support in PyCharm has improved significantly.
Python Refactorings available in PyCharm (last checked 2016/07/27 in PyCharm 2016.2)
Change Signature
Convert to Python Package/Module
Copy
Extract Refactorings
Inline
Invert Boolean
Make Top-Level Function
Move Refactorings
Push Members down
Pull Members up
Rename Refactorings
Safe Delete
XML refactorings (I checked in context menu in an XML file):
Rename
Move
Copy
Extract Subquery as CTE
Inline
Javascript refactorings:
Extract Parameter in JavaScript
Change Signature in JavaScript
Extract Variable in JavaScript

WingIDE 4.0 (WingIDE is my Python IDE of choice) will support a few refactorings, but I just tried out the latest beta, beta6, and... there's still work to be done. Extract Method works nicely, but Rename Symbol does not.
Update: The 4.0 release has fixed all of the refactoring tools. They work great now.

I would take a look at Bowler (https://pybowler.io).
It's better suited for use directly from the command-line than rope and encourages scripting (one-off scripts).

Your IDE can support refactorings !!
Check it out: Eric, Eclipse and WingIDE have built-in tools for refactoring (including Rename). And those are very safe refactorings: if something could go wrong, the IDE won't perform the refactoring.
Also consider adding a few unit tests to ensure your code did not suffer during the refactoring.

PyCharm has some refactoring features.
PYTHON REFACTORING
The Rename refactoring lets you perform global code changes safely and instantly. Local changes within a file are performed in-place. Refactorings work in plain Python and Django projects.
Use Introduce Variable/Field/Constant and Inline Local for improving the code structure within a method, Extract Method to break up longer methods, Extract Superclass, Push Up, Pull Down and Move to move the methods and classes.

You can use sed to perform this. The trick is to recall that regular expressions can recognize word boundaries. This works on all platforms provided you get the tools, which on Windows is Cygwin, Mac OS may require installing the dev tools, I'm not sure, and Linux has this out of the box. So grep, xargs, and sed should do the trick, after 12 hours of reading man pages and trial and error ;)
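The same word-boundary trick works in Python's own `re` module, which avoids the platform question entirely. The sketch below uses the two class names from the question; the `RENAMES` table and `rename_classes` helper are illustrative names, and because it treats the input as plain text it applies to the XML files as well as the Python sources.

```python
import re

RENAMES = {
    "AdminAction": "AdminActionPlug",
    "AdminActionLogger": "AdminActionLoggerPlug",
}

# \b matches only at a word boundary, so "AdminAction" is not found
# inside "AdminActionLogger" or "MyAdminAction". All names are replaced
# in a single pass, so already renamed text is never matched again.
PATTERN = re.compile(r"\b(" + "|".join(map(re.escape, RENAMES)) + r")\b")

def rename_classes(text):
    """Replace each whole-word class name with its new name."""
    return PATTERN.sub(lambda m: RENAMES[m.group(1)], text)
```

Being purely textual, this also rewrites matches inside strings and comments, so reviewing the resulting diff is still worthwhile.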
