How to import modules in Google App Engine? - python

I have created a simple GAE app based on the default template. I want to add an external module like short_url. How do I do this? The directions that I have found so far are confusing and GAE doesn't seem to use PYTHONPATH for obvious reasons I guess.

Simply place the short_url.py file in your app's directory.
Sample App Engine project:
myapp/
    app.yaml
    index.yaml
    main.py
    short_url.py
    views.py
And in views.py (or wherever), you can then import like so:
import short_url
For more complex projects, perhaps a better method is to create a directory especially for dependencies; say lib:
myapp/
    lib/
        __init__.py
        short_url.py
    app.yaml
    index.yaml
    main.py
    views.py
from lib import short_url
Edit #2:
Apologies, I should have mentioned this earlier. You need to modify your path; thanks to Nick Johnson for the following fix.
Ensure that this code is run before starting up your app; something like this:
import os
import sys
import wsgiref.handlers

from google.appengine.ext import webapp

def fix_path():
    # credit: Nick Johnson of Google
    sys.path.append(os.path.join(os.path.dirname(__file__), 'lib'))

def main():
    import views  # imported here so fix_path() has already run
    url_map = [('/', views.IndexHandler)]  # etc.
    app = webapp.WSGIApplication(url_map, debug=False)
    wsgiref.handlers.CGIHandler().run(app)

if __name__ == "__main__":
    fix_path()
    main()
Edit #3:
To get this code to run before all other imports, you can put the path-managing code in a file of its own in your app's base directory (Python recognizes everything in that directory without any path modifications).
And then you'd just ensure that this import
import fix_path
...is listed before all other imports in your main.py file.
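For example, a standalone fix_path.py could look like this (a minimal sketch; the lib directory name is the one used above):
# fix_path.py -- lives in the app's base directory; runs once, at import time
import os
import sys

lib_dir = os.path.join(os.path.dirname(__file__), 'lib')
if lib_dir not in sys.path:
    sys.path.insert(0, lib_dir)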
Here's a link to a full, working example in case my explanation wasn't clear.

I will second the answers given by Adam Bernier and S.Mark, although Adam's explains things in a bit more detail. In general, you can add any pure Python module/package to your App Engine directory and use it as-is, as long as it doesn't try to work outside of the sandbox, i.e., it cannot create files, cannot open network sockets, etc.
Also keep in mind the hard limits:
maximum total number of files (app files and static files): 3,000
maximum size of an application file: 10 megabytes
maximum size of a static file: 10 megabytes
maximum total size of all application and static files: 150 megabytes
UPDATE (Oct 2011): most of these numbers have been increased to:
maximum total number of files (app files and static files): 10,000
maximum size of an application file: 32MB
maximum size of a static file: 32MB
UPDATE (Jun 2012): the last limit was bumped up to:
maximum total size of all application and static files: 1GB

You can import Python packages as ZIPs, which allows you to avoid the maximum file count.
The App Engine docs address this:
Python 2.5: zipimport is supported.
Python 2.7: zipimport is not supported, but Python 2.7 can natively import from .zip files.
This is how I import boto.
import sys

sys.path.insert(0, 'boto.zip')
import boto  # pylint: disable=F0401
from boto import connect_fps  # pylint: disable=F0401
The cons of this technique include having to manually re-archive many packages.
For example, boto.zip decompresses into the "boto" subdirectory, with the "boto" module inside of it (as another subdirectory).
So to import boto naturally you may have to do from boto import boto, but this can cause weirdness with a lack of __init__.py.
To solve this, simply decompress, and archive the boto subfolder manually as boto.zip, and place that in your application folder.
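If you'd rather script that re-archiving step, something along these lines should work (a sketch using only the standard library; the 'boto/boto' layout is the one described above):
import os
import zipfile

def rezip_package(package_dir, zip_name):
    # Zip package_dir so the package itself sits at the archive root.
    parent = os.path.dirname(os.path.abspath(package_dir))
    zf = zipfile.ZipFile(zip_name, 'w', zipfile.ZIP_DEFLATED)
    for root, _, files in os.walk(package_dir):
        for name in files:
            path = os.path.join(root, name)
            zf.write(path, os.path.relpath(path, parent))
    zf.close()

rezip_package('boto/boto', 'boto.zip')  # the inner package directory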

Since that url_shortener program is written in Python, you could just include it in your source code and import it like any other Python module.

Related

Handling imports in externally-called multi-file script

I have a file structure like the following:
config.py
main.py
some_other_file.py
where config.py contains easily accessible parameters but not much code otherwise. These should be accessible to all other code files. Normally import config would do, but in this case the Python script is called externally from another program, and therefore the root calling directory is not the same as the one where the files are located (so a plain import results in an exception since it does not find the files).
Right now, the solution I have is to include into my main.py file (the one that is directly called by the third program) the following:
code_path = "Path\\To\\My\\Project\\"
import sys
sys.path.insert(0, code_path)
import config
import some_other_file
...
However, this means having to modify main.py every time the code is moved around. I could live with that, but I would certainly like having one single, simple file with all necessary configuration, not needing to dig through the others (especially since the code may later be passed onto others who just want it to work as effortlessly as possible).
I tried having the sys.path.insert inside the config file, and having that be the file directly called by the external script (with all other files called from there). However, I ran into the problem that when the other files do import config, it results in an import loop, since they are also being imported from config.py.
Typically, I believe this is solved by running the path setup in config.py only once, through something like if __name__ == "__main__": (see below). This does not work in my case, and the script never enters the if statement, possibly because it is being called as a subroutine by a third program and is not the main program itself. As a result, I have no way of enforcing that a portion of the code in config.py is executed only once.
This is what I meant above for config.py (which does not work for my case):
... # Some parameter definitions

if __name__ == "__main__":
    code_path = "Path\\To\\My\\Project\\"
    import sys
    sys.path.insert(0, code_path)
    import main  # Everything is then executed in main.py (where config.py is also cross-imported)
Is there any way to enforce the portion of code inside the if above to only be executed once even if cross-imported, but without relying on __name__ == "__main__"? Or any other way to handle this at all, while keeping all parameters and configuration data within one single, simple file?
By the way, I am using IronPython for this (not exactly by choice), but since I am sticking to hopefully very simple stuff, I believe it is common to all Python versions.
tl;dr: I want a config.py file with a bunch of parameters, including the directory where the program is located, to be accessible to multiple .py files. I want to avoid needing the project directory written in the "main" code file, since that should be a parameter and therefore only in config.py. The whole thing is passed to and executed by a third external program, so the directory where these files are located is not the same as where they are called from (therefore the project directory has to be included to system path at some point to import the different files).
A possible design alternative that is fairly common would be to rely on environment variables configured with a single file. Your multi-program system would then be started with some run script, and your Python application would use something along the lines of os.environ[...] to get/set/check the needed variables. Your directory would then look something along the lines of:
.
.
.
.env (environment variables - doesn't have to be called .env)
main.py
run.sh (starts system of programs - doesn't have to be called run.sh)
.
.
.
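As a sketch, the .env file could simply export whatever values your code needs (the variable names here are hypothetical):
# .env
export MYPROJECT_HOME="/Path/To/My/Project"
export MYPROJECT_MAX_RETRIES=3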
For the run script, you could then "activate" the environment variables and afterwards start the relevant programs. If using bash as your terminal:
source .env # "activate" your environment variables
# Then the command to start whatever you need to; for example
#
# python main.py
# or
# ./myprogram
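On the Python side, the application would then read the variables back; it could even use one of them to extend sys.path before importing config (MYPROJECT_HOME is the hypothetical name from the sketch above):
import os
import sys

code_path = os.environ.get("MYPROJECT_HOME", ".")
sys.path.insert(0, code_path)

import config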

How to correctly import modules when working in a subdirectory?

I have a project where I want to structure the code in layers. The different parts of the program do very different things, and I wish to have a clean upper layer which binds all of the code in sub-directories together.
However, I struggle with importing modules correctly.
Say I have the structure
Project
└──manage.py
└──part a
├──script_a.py
├──__init__.py
└──modules_a
├──module_a1.py
├──module_a2.py
├──module_a3.py
└──__init__.py
└──part b
├──script_b.py
├──__init__.py
└──modules_b
├──module_b1.py
├──module_b2.py
├──module_b3.py
└──__init__.py
If I am writing code in script_a.py that depends on something from module_a1.py I use
from modules_a import module_a1
This works, but VS Code is never happy about the importing, always marking the imports as errors. Therefore, I am wondering if there is something that I have logically misunderstood, especially since script_a.py is not in the root folder?
If you are inside a package and you want to access a subpackage, you have to put a . in front of the subpackage name. So change your import statement from
from modules_a import module_a1
to
from .modules_a import module_a1
Then the error disappears.
I decided to solve it by adding a testing file in the root folder and only running the script from the testing file, which will have similar functionality to the manage.py that will be my execution script later.
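As a sketch, that root-level testing file could be as small as this (note the directories would need importable names like part_a rather than "part a", since package names cannot contain spaces; the main() entry point is hypothetical):
# test_run.py, placed next to manage.py in the project root
from part_a import script_a  # resolvable because we run from the root

script_a.main()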

Python Change Directory Google App Engine

I have my webapp written in Python running on Google App Engine; so far I have only been testing on localhost. My script needs to read text files located in various directories. So in my Python script I would simply use os.chdir("./"+subdirectory_name) and then start opening and reading the files, using the text for analysis. However I cannot do that when trying to move it over to App Engine.
I assume I need to make some sort of changes to app.yaml but I'm stuck after that. Can anyone walk me through this somewhat thoroughly?
From the answer below, I was able to fix this part, I have a followup question about the app.yaml structure though. I have my folder set up like the following, and I'm wondering what I need to put in my app.yaml file to make it work when I deploy the project (it works perfectly on localhost right now).
Main_Folder/
    python_my_app.py
    app.yaml
    text_file1
    text_file2
    text_file3
    subfolder1_with_10_text_files_inside/
    subfolder2_with_10_text_files_inside/
    subfolder3_with_10_text_files_inside/
...how do I specify this structure in my app.yaml and do I need to change anything in my code if it works right now on localhost?
You don't need to change your working directory at all to read files. Use absolute paths instead.
Use the current file location as the starting point, and resolve all relative paths from there:
import os.path
HERE = os.path.dirname(os.path.abspath(__file__))
somefile = os.path.abspath(os.path.join(HERE, 'subfolder1_with_10_text_files_inside/somefile.txt'))
If you want to be able to read static files, do remember to set the application_readable flag on these, as Google will otherwise only upload them to its content delivery network, not to App Engine itself.
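For instance, an app.yaml fragment along these lines would serve one of the subfolders statically while keeping it readable by the application code (a sketch only; the URL paths are hypothetical and the file names are taken from the layout above):
handlers:
- url: /texts
  static_dir: subfolder1_with_10_text_files_inside
  application_readable: true
- url: /.*
  script: python_my_app.py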
You can package your text files inside your application and then do something like this:
import os

path = os.path.join(os.path.dirname(__file__), 'subdir', 'file')
file = open(path)

Configuring celery with a .conf file

Good day.
I set up a separate project from the main one, called myproject-celery. It is a buildout-based project which contains the async part of my project. For convenience I want a file that will contain this machine's configuration. I know that Celery provides the Python config file, but I do not like this configuration style.
Let's say I have a configuration in a Yaml config file named myproject.yaml
What I want to achieve:
./bin/celery worker --config /absolute/path/to/project/myproject.yaml --app myproject.celery
The problem really is that I want to specify the file's location, because it can change. I tried writing a custom loader class, but I failed, because I do not even know why and when the many custom methods of this class are called (the only doc that I found is http://docs.celeryproject.org/en/latest/reference/celery.loaders.base.html?highlight=loader#id1 and it's no help for me). I tried to do something at import time for the app module, but I cannot pass the file path to that module's code... The only solution I came up with was using a custom ENV variable that will contain the path, but I do not see why it can't be a launch parameter like in most apps that I use (referring to Pyramid with its paster serve myproject.ini).
So the question:
What do I have to do to set up the config from a file that I could specify by an absolute path?
EDIT:
The question was not answered, so I posted an issue on Celery's GitHub and will wait for a response.
https://github.com/celery/celery/issues/1100
Looking at celery.loaders.base it looks like the method you want to override is read_configuration:
from celery.datastructures import DictAttr
from celery.loaders.base import BaseLoader

class YAMLLoader(BaseLoader):
    def read_configuration(self):
        # Load the YAML file here and return a DictAttr instance,
        # e.g. by passing the file's contents to yaml.safe_load()
        raise NotImplementedError  # sketch: fill in the YAML loading
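To make the worker actually use a custom loader, one route is the CELERY_LOADER environment variable, set before the Celery app is created (a sketch; myproject.loaders is a hypothetical module path):
import os

os.environ.setdefault('CELERY_LOADER', 'myproject.loaders.YAMLLoader')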

The right way to share settings among modules

I have a project with 10 different python files. It has classes and functions - pretty much the lot.
I do want to share specific data that will represent the settings in the project between all the project files.
I came up with creating a settings.py file:
settings = {}
settings['max_bitrate'] = 160000
settings['dl_dir'] = r"C:\Downloads"
and then I import it from every other file.
Is there a more suitable way to do it?
I'm probably a little old-school in this regard, but in my latest project, I created a config file in /etc, then created a config module that uses ConfigParser to read it in and make it available, and import that config module wherever I need to read settings.
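As a sketch of that approach (the file path, section, and option names are hypothetical):
# config.py
import ConfigParser  # configparser on Python 3

_parser = ConfigParser.ConfigParser()
_parser.read('/etc/myproject.conf')

max_bitrate = _parser.getint('settings', 'max_bitrate')
dl_dir = _parser.get('settings', 'dl_dir')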
Your method sounds good to me, and has the advantage that you can easily change the implementation of the settings module later, for example to use configuration files or the Windows registry, or to provide a read-only API.
