CPython behaviour for SyntaxErrors


As far as I understand, CPython parses a file completely and only begins to run it afterwards(?). Is it possible to have each line executed before the parser continues, or to run the whole program line by line, so that something like this would work:
print('not yet')
exit()
print('but here is a SyntaxError incoming'
My goal would be to make a vanilla module that tries to fix SyntaxErrors of the file it is imported into in some way like:
from SyntaxFixer import SyntaxFixer
SyntaxFixer.run()  # this line would have to be executed
print(  # before this line to fix it, or terminate the program before a SyntaxError could be thrown
I know that the C reference is quite extensive about the definitions of its behaviour, or when the behaviour is undefined, but I couldn't find a similar definition for Python that would make my project impossible. I want to get it to work with CPython, as this is the default and most used implementation.
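To make the question concrete, here is a minimal sketch of what happens when the three-line example is compiled and run as one unit. It uses the built-in compile() and exec() to stand in for running a file; the helper name try_run and the "<example>" filename are made up for illustration. It suggests that the whole source is compiled before the first statement executes, so nothing is printed.

```python
import contextlib
import io

# The example from the question: the last line has an unclosed parenthesis.
SOURCE = (
    "print('not yet')\n"
    "exit()\n"
    "print('but here is a SyntaxError incoming'\n"
)


def try_run(source):
    """Compile and then execute source; return (captured stdout, error or None)."""
    buffer = io.StringIO()
    error = None
    with contextlib.redirect_stdout(buffer):
        try:
            # The whole source is compiled first; no statement has run yet.
            code = compile(source, "<example>", "exec")
            exec(code, {"__name__": "__example__"})
        except SyntaxError as exc:
            error = exc
    return buffer.getvalue(), error


output, error = try_run(SOURCE)
print(repr(output), type(error).__name__)
```

Under these assumptions the first print never runs, because the error is raised while compiling, before execution starts.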

Related

How to read a txt file line by line? [duplicate]

I have this code:
import sys
def random(size=16):
    return open(r"C:\Users\ravishankarv\Documents\Python\key.txt").read(size)
def main():
    key = random(13)
    print(key)
When I try running the script, there are no errors, but nothing appears to happen. I expected it to print some content from the key file, but nothing is printed.
What is wrong? How do I make the code run?

You've not called your main function at all, so the Python interpreter won't call it for you.
Add this as the last line to just have it called at all times:
main()
Or, if you use the commonly seen:
if __name__ == "__main__":
    main()
It will make sure your main method is called only if that module is executed as the starting code by the Python interpreter. More about that here: What does if __name__ == "__main__": do?
If you want to know how to write the best possible 'main' function, Guido van Rossum (the creator of Python) wrote about it here.

Python isn't like other languages where the main() function is called automatically. All you have done is define your function.
You have to manually call your main function:
main()
Also, you may commonly see this in some code:
if __name__ == '__main__':
    main()

There's no special main method in Python. What you have to do is:
if __name__ == '__main__':
    main()

Something does happen, it just isn't noticeable
Python runs scripts from top to bottom. def is a statement, and it executes when it is encountered, just like any other statement. However, the effect of this is to create the function (and assign it a name), not to call it. Similarly, import is a statement that loads the other module (and makes its code run top to bottom, with its own global-variable context), and assigns it a name.
When the example code runs, therefore, three things happen:
The code for the sys standard library module runs, and then the name sys in our own module's global variables is bound to that module
A function is created from the code for random, and then the name random is bound to that function
A function is created from the code for main, and then the name main is bound to that function
There is nothing to call the functions, so they aren't called. Since they aren't called, the code inside them isn't run - it's only used to create the functions. Since that code doesn't run, the file isn't read and nothing is printed.
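A small sketch of the same point, using a list instead of a file so that it can run anywhere (the names calls, defined_only and after_call are only for illustration): executing the def statement creates the function, but its body does not run until the function is called.

```python
calls = []


def main():
    # This body only runs when main() is actually called.
    calls.append("main ran")


# The def statement above has executed, but the body has not.
defined_only = list(calls)

main()  # the explicit call
after_call = list(calls)

print(defined_only, after_call)
```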
There are no "special" function names
Unlike in some other languages, Python does not care that a function is named main, or anything else. It will not be run automatically.
As the Zen of Python says, "Explicit is better than implicit". If we want a function to be called, we have to call it. The only things that run automatically are the things at top level, because those are the instructions we explicitly gave.
The script starts at the top
In many real-world scripts, you may see a line that says if __name__ == '__main__':. This is not "where the script starts". The script runs top to bottom.
Please read What does if __name__ == "__main__": do? to understand the purpose of such an if statement (short version: it makes sure that part of your top-level code is skipped if someone else imports this file as a module). It is not mandatory, and it does not have any kind of special "signalling" purpose to say where the code starts running. It is just a perfectly normal if statement, that is checking a slightly unusual condition. Nothing requires you to use it in a script (aside from wanting to check what it checks), and nothing prevents you from using it more than once. Nothing prevents you from checking whether __name__ is equal to other values, either (it's just... almost certainly useless).
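As a rough illustration that this is an ordinary if statement, the sketch below runs the same source twice with exec(), once with __name__ set to "__main__" and once with another name. The helper run_as and the events list are invented for this example; a real import sets __name__ for you.

```python
SOURCE = '''
events.append("top-level code ran")
if __name__ == "__main__":
    events.append("guarded code ran")
'''


def run_as(name):
    """Run SOURCE top to bottom with the given value of __name__."""
    events = []
    exec(SOURCE, {"__name__": name, "events": events})
    return events


as_script = run_as("__main__")      # like running the file directly
as_module = run_as("some_module")   # like importing it from elsewhere

print(as_script)
print(as_module)
```

In both cases the code starts at the top; the only difference is whether the condition of the if statement happens to be true.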

You're not calling the function. Put main() at the bottom of your code.

Embed Python in Python?

I wrote a "compiler" PypTeX that converts an input file a.tex containing Hello #{3+4} to an output file a.pyptex containing Hello 7. I evaluate arbitrary Python fragments like #{3+4} using something like eval(compile('3+4','a.tex',mode='eval'),myglobals), where myglobals is some (initially empty) dict. This creates a thin illusion of an embedded interpreter for running code in a.tex; however, the call stack when running '3+4' looks pretty weird, because it backs up all the way into the PypTeX interpreter, instead of topping out at the user code '3+4' in a.tex.
Is there a way of doing something like eval but chopping off the top of the call stack?
Motivation: debugging
Imagine an exception is raised by the Python fragment deep inside numpy, and pdb is launched. The user types up until they reach the scope of their user code and then they type list. The way I've done it, this displays the a.tex file, which is the right context to be showing to the user and is the reason why I've done it this way. However, if the user types up again, the user ends up in the bowels of the PypTeX compiler.
An analogy would be if the g++ compiler had an error deep in a template, displayed a template "call stack" in its error message, but that template call stack backed all the way out into the bowels of the actual g++ call stack and exposed internal g++ details that would only serve to confuse the user.
Embedding Python in Python
Maybe the problem is that the illusion of the "embedded interpreter" created by eval is slightly too thin. eval lets you specify globals, but it inherits whatever call stack the caller has, so if one could somehow supply eval with a truncated call stack, that would resolve my problem. Alternatively, if pdb could be told "you shall go no further up" past a certain stack frame, that would help too. For example, if I could chop off a part of the stack in the traceback object and then pass it to pdb.post_mortem().
Or if one could do from sys import Interpreter; foo = Interpreter(); foo.eval(...), meaning that foo is a clean embedded interpreter with a distinct call stack, global variables, etc..., that would also be good.
Is there a way of doing this?
A rejected alternative
One way that is not good is to extract all Python fragments from a.tex by regular expression, dump them into a temporary file a.py and then run them by invoking a fresh new Python interpreter at the command line. This causes pdb to eventually top out into a.py. I've tried this and it's a very bad user experience. a.py should be an implementation detail; it is automatically generated and will look very unfamiliar to the user. It is hard for the user to figure out what bits of a.py came from what bits of a.tex. For large documents, I found this to be much too hard to use. See also pythontex.

I think I found a sufficient solution:
import pdb, traceback
def exec_and_catch(cmd, globals):
    try:
        exec(cmd, globals)  # execute a user program
    except Exception as e:
        tb = e.__traceback__.tb_next  # if user program raises an exception, remove the
        f = e.with_traceback(tb)  # top stack frame and return the exception.
        return f
    return None  # otherwise, return None signifying success.
foo = exec_and_catch("import module_that_does_not_exist", {})
if foo is not None:
    # positional arguments work across Python 3 versions
    traceback.print_exception(type(foo), foo, foo.__traceback__)
    pdb.post_mortem(foo.__traceback__)
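To check what the trimmed traceback contains without starting pdb, here is a small variant of the same idea. It is only a sketch: the filename "a.tex" passed to compile() and the two-line user program are assumptions chosen to mirror the question, and traceback.extract_tb is used to inspect the remaining frames.

```python
import traceback


def exec_and_catch(cmd, globals_dict, filename="a.tex"):
    """Run user code; on failure return the exception minus this helper's frame."""
    try:
        # Compiling with a filename makes tracebacks point at the user's file.
        exec(compile(cmd, filename, "exec"), globals_dict)
    except Exception as exc:
        # The first traceback entry belongs to exec_and_catch itself; skip it.
        return exc.with_traceback(exc.__traceback__.tb_next)
    return None


err = exec_and_catch("x = 1\ny = x / 0\n", {})
frames = traceback.extract_tb(err.__traceback__)
for frame in frames:
    print(frame.filename, frame.lineno)
```

With this setup the only remaining frame is the user code, so walking up in a post-mortem session would stop there rather than continuing into the surrounding tool.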

Load and execute a full python script from a raw link?

I'm facing some problems trying to load a full python script from my pastebin/github pages.
I followed this link, trying to convert the raw into a temp file and use it like a module: How to load a python script from a raw link (such as Pastebin)?
And this is my test (Using a really simple python script as raw, my main program is not so simple unfortunately): https://trinket.io/python/0e95ba50c8
When I run the script (that now is creating a temp file in the current directory of the .py file) I get this error:
PermissionError: [Errno 13] Permission denied: 'C:\\Users\\BOT\\Images\\tempxm4xpwpz.py'
I also tried the exec() function, but unfortunately the results were no better.
With this code:
import requests as rq
import urllib.request
def main():
    code = "https://pastebin.com/raw/MJmYEKqh"
    response = urllib.request.urlopen(code)
    data = response.read()
    exec(data)
I get this error:
File "<string>", line 10, in <module>
File "<string>", line 5, in hola
NameError: name 'printest' is not defined
Since my program is more complex compared to this simple test, I don't know how to proceed...
Basically, what I want to achieve is to keep the full script of my program on GitHub and connect it to a .exe, so that when I update the raw file my program is updated too. That way I avoid generating and sharing (only with my friends) a new .exe every time...
Do you think this is possible? If so, what am I doing wrong?
PS: I'm also open to other ways of letting my friends update the program without downloading the .exe every time, as long as they don't have to install anything (that's why I'm using a .exe).

Disclaimer: it is really not a good idea to run unverified (let alone untrusted) code. That being said, if you really want to do it...
Probably the easiest and "least dirty" way would be to run a whole new process. This can be done directly in Python. Something like this should work (inspired by the answer you linked in your question):
import urllib.request
import tempfile
import subprocess
code = "https://pastebin.com/raw/MJmYEKqh"
response = urllib.request.urlopen(code)
data = response.read()
with tempfile.NamedTemporaryFile(suffix='.py') as source_code_file:
    source_code_file.write(data)
    source_code_file.flush()
    subprocess.run(['python3', source_code_file.name])
You can also make your code with exec run correctly.
What may work:
exec(data, {}) -- All you need to do is supply {} as the second argument. The exec function accepts two additional optional arguments, globals and locals. If you supply just one, the same dictionary is used for locals as well. The code within the exec then behaves as if it ran in a sort of "clean" environment at the top level, which is what you are aiming for.
exec(data, globals()) -- The second option is to supply the globals from your current scope. This will also work, though you probably have no need to give the executed code access to your globals, given that the code will set up everything it needs anyway.
What does not work:
exec(data, {}, {}) -- In this case the executed code gets two different dictionaries (albeit both empty) for locals and globals. As such it behaves as if it were inside a function (I'm not really sure about this part, but when I tested it, it seemed so), meaning that it adds the printest and hola functions to the local scope instead of the global scope. Regardless, I expected it to work: I expected hola to simply look up printest in the local scope instead of the global one. However, for some reason the hola function in this case gets compiled in such a way that it expects printest to be in the global scope, where it is not. I did not figure out why. So this results in the NameError.
exec(data, globals(), locals()) -- This will provide access to the state from the caller function. Nevertheless, it will crash for the very same reason as in the previous case.
exec(data) -- This is just a shorthand for exec(data, globals(), locals()).
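The difference between the variants can be reproduced with a short local script instead of the downloaded one. The sketch below reuses the names printest and hola from the error message; the source text and the helper run are invented for illustration. For context, the exec documentation notes that when separate globals and locals objects are given, the code is executed as if it were embedded in a class definition, which matches the behaviour described above: the functions are stored in locals, while a function body looks up names in globals.

```python
SOURCE = '''
def printest():
    return "printed"

def hola():
    return printest()  # looked up in globals when hola runs

result = hola()
'''


def run(globals_dict, locals_dict=None):
    """Execute SOURCE and return its result, or the NameError it raised."""
    try:
        if locals_dict is None:
            exec(SOURCE, globals_dict)  # one dict serves as globals and locals
            return globals_dict["result"]
        exec(SOURCE, globals_dict, locals_dict)  # definitions land in locals_dict
        return locals_dict["result"]
    except NameError as exc:
        return exc


same_dict = run({})
separate_dicts = run({}, {})

print(same_dict)
print(repr(separate_dicts))
```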

How do I use stereoscopy in IDLE/Python?

I have downloaded and installed the stereoscopy library. I know how to execute the program via command line as it shows clearly here: https://pypi.org/project/stereoscopy/#description
However, I looked at its code and wanted to do it myself. I want to insert the code from: https://github.com/2sh/StereoscoPy/blob/master/stereoscopy/init.py and see if it works there.
I copied the code and when I run it, nothing happens. No errors or anything, but no picture shows up and no picture is saved.
So I would like to learn how to use this library to make my own anaglyph pictures by coding it myself and not use the command line executable.
Thank you for your help :)

Running StereoscoPy from a command line executes stereoscopy.__main__, which mainly consists of
from . import _main
_main()
_main is imported from __init__.py. The latter defines some constants, classes, and functions, including _main, but does not call any of them. You need to do what __main__.py does: call _main, after stuffing arguments into sys.argv. One way to do the latter is to use sys.argv = shlex.split(cli_string, posix=True), where cli_string is the string that would be typed at a command line.
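A minimal sketch of that approach is shown below. The _main function here is a hypothetical stand-in written with argparse, not the library's real entry point, and the positional arguments and the --output flag are invented for the example; the real package defines its own options. The helper run_cli restores sys.argv afterwards so the calling program is not affected.

```python
import argparse
import shlex
import sys


def _main():
    """Hypothetical stand-in for a package's command-line entry point."""
    parser = argparse.ArgumentParser()
    parser.add_argument("inputs", nargs=2)              # invented for illustration
    parser.add_argument("--output", default="out.jpg")  # invented for illustration
    return parser.parse_args()  # reads sys.argv[1:], like a real CLI would


def run_cli(cli_string):
    """Call _main as if cli_string had been typed at a command line."""
    saved_argv = sys.argv
    sys.argv = shlex.split(cli_string, posix=True)
    try:
        return _main()
    finally:
        sys.argv = saved_argv  # leave the caller's arguments untouched


args = run_cli("prog left.jpg right.jpg --output anaglyph.jpg")
print(args.inputs, args.output)
```

The first word of cli_string becomes sys.argv[0], the program name, so it has to be present even though the parser ignores it.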

Run python program from another python program (with certain requirements)

Let's say I have two python scripts A.py and B.py. I'm looking for a way to run B from within A in such a way that:
B believes it is __main__ (so that code in an if __name__=="__main__" block in B will run)
B is not actually __main__ (so that it does not, e.g., overwrite the "__main__" entry in sys.modules)
Exceptions raised within B propagate to A (i.e., could be caught with an except clause in A).
Those exceptions, if not caught, generate a correct traceback referencing line numbers within B.
I've tried various techniques, but none seem to satisfy all my requirements.
using tools from the subprocess module means exceptions in B do not propagate to A.
execfile("B.py", {}) runs B, but it doesn't think it's main.
execfile("B.py", {'__name__': '__main__'}) makes B.py think it's main, but it also seems to screw up the exception traceback printing, so that the tracebacks refer to lines within A (i.e., the real __main__).
using imp.load_source with __main__ as the name almost works, except that it actually modifies sys.modules, thus stomping on the existing value of __main__
Is there any way to get what I want?
(The reason I'm doing this is because I'm doing some cleanup on an existing library. This library has no real test suite, just a set of "example" scripts that produce certain output. I'm trying to leverage these as tests to ensure that my cleanup doesn't affect the library's ability to execute these examples, so I want to run each example script from within my test suite. I'd like to be able to see exceptions from these scripts within the test script so the test script can report the type of failure, instead of just reporting a generic SubprocessError whenever an example script raises some exception.)

Your use case makes sense, but I still think you'd be better off refactoring the tests such that they can be run externally.
Do your test scripts have something like this?
def test():
    pass
if __name__ == '__main__':
    test()
If not, perhaps you should convert your tests to calling a function such as test. Then, from your main test script, you can just:
import test1
test1.test()
import test2
test2.test()
Provide a common interface to running tests, that the tests themselves use. Having a big block of code in a __main__ check is Not A Good Thing.
Sorry that I didn't answer the question you asked, but I feel this is the correct solution without deviating too far from the original test code.

Answering my own question because the result is kind of interesting and might be useful to others:
It turns out I was wrong: execfile("B.py", {'__name__': '__main__'}) is the way to go after all. It does correctly produce the tracebacks. What I was seeing with incorrect line numbers weren't exceptions but warnings. These warnings were produced using warnings.warn("blah", stacklevel=2). The stacklevel=2 argument is supposed to allow for things like deprecation warnings to be raised where the deprecated thing is used, rather than at the warning call (see the documentation).
However, it seems that the execfile-d file doesn't count as a "stack level" for this purpose, and is invisible for stacklevel purposes. So if code at the top level of an execfile-d module causes a warning with stacklevel 2, the warning is not raised at the right line number in the execfile-d source file; instead it is raised at the corresponding line number of the file which is running the execfile.
This is unfortunate, but I can live with it, since these are only warnings, so they won't impact the actual performance of the tests. (I didn't notice at first that it was only the warnings that were affected by the line-number mismatches, because there were lots of warnings and exceptions intermixed in the test output.)
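execfile only exists in Python 2. On Python 3 the same idea can be sketched with compile() and exec(); the helper name run_as_main and the temporary B.py written below are assumptions made so that the example is self-contained. Passing the real path to compile() is what lets the traceback refer to line numbers within B.

```python
import os
import tempfile
import traceback


def run_as_main(path):
    """Rough Python 3 counterpart of execfile(path, {'__name__': '__main__'})."""
    with open(path, "rb") as handle:
        source = handle.read()
    code = compile(source, path, "exec")  # path is recorded for tracebacks
    exec(code, {"__name__": "__main__", "__file__": path})


# A throwaway B.py whose guarded block raises on its second line.
SCRIPT = 'if __name__ == "__main__":\n    raise ValueError("raised inside B")\n'

with tempfile.TemporaryDirectory() as folder:
    b_path = os.path.join(folder, "B.py")
    with open(b_path, "w") as handle:
        handle.write(SCRIPT)
    try:
        run_as_main(b_path)
    except ValueError as exc:
        # The exception propagates to the caller like any other.
        caught = exc
        last_frame = traceback.extract_tb(exc.__traceback__)[-1]

print(type(caught).__name__, os.path.basename(last_frame.filename), last_frame.lineno)
```

Because the code runs in a fresh dictionary rather than through the import system, sys.modules["__main__"] is left alone.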
