I'm currently trying to find out what is allowed and what is not on GAE.
On Google's Developers website, I found that the _socket C library and the socket module are not allowed on GAE.
How did they disable these modules? Did they perform a complete rebuild of the Python interpreter, or did they develop their own (like PyPy)?
You don't really need to rebuild the whole Python interpreter just to disable modules. You can, for example, delete the libraries or (as App Engine did) install an import hook that checks each module being loaded against a whitelist of modules that are allowed to be loaded.
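As a rough illustration of the import-hook idea, here is a minimal sketch using the Python 3 import machinery. It is not App Engine's actual implementation (which the answer does not show); the names WhitelistFinder and restricted_imports are made up for this example.

```python
import sys
from contextlib import contextmanager
from importlib.abc import MetaPathFinder


class WhitelistFinder(MetaPathFinder):
    """Hypothetical finder that refuses imports outside a whitelist."""

    def __init__(self, allowed):
        self.allowed = frozenset(allowed)

    def find_spec(self, fullname, path=None, target=None):
        # Decide based on the top-level package name only.
        top_level = fullname.partition(".")[0]
        if top_level not in self.allowed:
            raise ImportError(f"module {fullname!r} is not whitelisted")
        return None  # allowed: let the regular finders locate the module


@contextmanager
def restricted_imports(allowed):
    """Install the finder at the front of sys.meta_path, then remove it."""
    finder = WhitelistFinder(allowed)
    sys.meta_path.insert(0, finder)
    try:
        yield finder
    finally:
        sys.meta_path.remove(finder)


if __name__ == "__main__":
    with restricted_imports({"keyword"}):
        import keyword  # on the whitelist, so the import proceeds normally
        try:
            import colorsys  # not on the whitelist
        except ImportError as exc:
            print(exc)
```

Note that modules already present in sys.modules are returned without consulting any finder, so a real sandbox would have to install the hook before user code runs and would also need to cover the dependencies of every whitelisted module.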
Some of my Airflow jobs are going to use the Google DataStore. There are at least two obvious possibilities to access DataStore from Airflow:
Use Google's Python client library
Use the Airflow Hook for DataStore interaction
Using the Python library is much more convenient than using the hook. It implements all the nice things one needs, whereas the hook is more or less just a pure API wrapper.
However, I'm wondering if there are some advantages from using the hook instead of the client library.
First, let me point out that you are referring to an old version of the hook. The updated version can be found in the provider package here. See this answer for instructions on how to install it.
Then you can import the hook as:
from airflow.providers.google.cloud.hooks.datastore import DatastoreHook
The updated version might have the functions that were missing in the old version of the contrib hook.
The idea of hooks is to wrap the Python library, thus saving you a lot of headaches. For example, when you use the hook you don't need to handle setting up the connection. The hook does that for you.
You can always use the Python library directly, but I consider this bad practice. It's very common to use the same library for different use cases, and a hook can be shared by more than one operator, which saves you a lot of code duplication.
If the relevant functions from the Python library don't exist in the hook, you can always create a custom hook that inherits from the upstream (open source) hook and use that:
from airflow.providers.google.cloud.hooks.datastore import DatastoreHook

class MyDatastoreHook(DatastoreHook):
    def missing_method(self):
        # Wrap a function from the Python library
        conn = self.get_conn()
        ...  # function code
I am new to Google Cloud Platform, and so far I have worked entirely in Python 3. I am trying to find out which version of Python is better supported on Google App Engine: Python 2.7 or Python 3.
As I'm starting to work with Google App Engine, I have realised that continuing to use Python 3 seems too painful, as basic tools like dev_appserver.py are written for Python 2 only. Now I am hitting the opposite problem: the cloudstorage module seems to exist only for Python 3. Again, when I install it, it seems the only way I can test reading from and writing to a Google bucket locally is by authenticating with google.appengine.ext, which in turn only works within dev_appserver.py or remotely. This leaves me confused about which environment to choose.
What is a general agreement / what is the focus of Google App Engine: Python 2 or Python 3?
In App Engine, you have two options: the Standard environment and the Flexible environment.
Python 2.7 is available in both Standard and Flexible, while Python 3.6 is only available in Flexible.
Also, the choice between Standard and Flexible depends on what you want to do/what libraries you need:
Some third-party libraries are already built into the Standard environment, and you can include other libraries, but those libraries can't include C extensions; they must be written in pure Python. If you need libraries with C extensions, you will have to move to Flexible.
In Standard, you can use proprietary libraries (like google.appengine.ext, as you mentioned) for tasks like accessing databases, while in Flexible you can use other libraries (like the client you mentioned).
There are also other important differences, like pricing, scaling, etc. The choice will depend, as I said, on the needs of your application.
EDIT
dev_appserver.py is only used when developing in Standard. There is a tutorial here, using Flask. If you are in Flexible, you can test the app locally by running it like any ordinary Python file, as in this other example.
You can use buckets in both Standard and Flexible
The python3-only cloudstorage support assumption based on the SO post you referenced is not correct:
the import appears to be done in a regular python shell or as a standalone script, not from a standard environment GAE app - different things, see import cloudstorage, ImportError: No module named google.appengine.api.
it is not specified where that library comes from
GCS is definitely supported in the standard env GAE (i.e. on python 2), you just need to follow the steps from the official documentation: Setting Up Google Cloud Storage and Reading and Writing to Google Cloud Storage.
Both are good. The real question is what kind of environment you want: the Standard environment or the Flexible environment.
Find your answer in this document: https://cloud.google.com/appengine/docs/python/
It kind of depends on what you're using it for. If you're doing data science, for example, I'm seeing a few notices of Python libraries that are (finally) dropping support for Python 2. numpy is one that is dropping support.
Generally speaking, I would recommend Python 3 over Python 2. Why spend time developing in an aging version when its replacement has matured nicely and is more consistent?
I am deploying a Django app on Google App Engine, but I get an error when importing django-widget-tweaks:
appcfg.py: error: Error parsing ./app.yaml: the library "django-widget-tweaks" is not supported
Is there any way to fix this, apart from not using the library?
You can install third-party libraries yourself. Since this is not one of the runtime-provided third-party libraries you have to fulfill the following criteria:
The library must be implemented as pure Python code (no C extensions).
The code is uploaded to App Engine with your application code, and counts toward file quotas.
Use pip to install the library and the vendor module to enable importing packages from the third-party library directory.
From what I can tell, the module you want to use is fully implemented in Python, so this should be straightforward. Consult the docs on vendoring for more information on how to do this.
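To make the vendoring step concrete, here is a small sketch that can be run anywhere. On the Standard environment the documented setup is to install the package into a lib directory (pip install -t lib <package>) and call vendor.add('lib') from appengine_config.py. The sketch below stands in for that with site.addsitedir from the standard library, and the module vendored_demo is a made-up placeholder for the real library.

```python
import importlib
import os
import site
import tempfile

# Stand-in for the "lib" directory that `pip install -t lib <package>` fills.
lib_dir = os.path.join(tempfile.mkdtemp(), "lib")
os.makedirs(lib_dir)

# Hypothetical pure-Python module playing the role of the vendored library.
with open(os.path.join(lib_dir, "vendored_demo.py"), "w") as f:
    f.write("def greet():\n    return 'hello from lib'\n")

# On App Engine Standard, appengine_config.py would instead contain:
#     from google.appengine.ext import vendor
#     vendor.add('lib')
# Here the directory is put on sys.path with the standard library.
site.addsitedir(lib_dir)
importlib.invalidate_caches()

import vendored_demo

print(vendored_demo.greet())
```

The library name also should not be listed under libraries: in app.yaml, since that section is only for the runtime-provided libraries, which is what the error message in the question is complaining about.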
Is the poster.encode module supported on Python App Engine?
If not, what are the possible alternatives?
You'd need to deploy the module with your code by including it in your application's directory when deploying, but it does appear to be a pure python module and looking through the source I see no reason why it wouldn't work just fine in App Engine.
The only modules that won't work are those that use C extensions or make use of features like threads, sockets, etc. that are disabled in the App Engine runtime. poster.streaminghttp, for example, almost certainly won't work, as it uses sockets.
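A quick, rough way to see whether a given module is a C extension is to look at which loader Python would use for it. The helper below is only a sketch: the name is_pure_python is made up, it inspects a single module rather than its submodules or dependencies, and it treats anything not loaded from a .py source file (built-in, frozen, extension, zipped) as "not pure".

```python
import importlib.machinery
import importlib.util


def is_pure_python(module_name):
    """Return True if the module would be loaded from a .py source file."""
    spec = importlib.util.find_spec(module_name)
    if spec is None:
        raise ImportError(f"cannot find module {module_name!r}")
    # Extension modules use ExtensionFileLoader and built-ins use
    # BuiltinImporter; only plain source files use SourceFileLoader.
    return isinstance(spec.loader, importlib.machinery.SourceFileLoader)


if __name__ == "__main__":
    for name in ("json", "_socket"):
        print(name, is_pure_python(name))
```

Passing this check does not guarantee that a module runs on App Engine; a pure-Python module can still depend on disabled features such as sockets, as poster.streaminghttp does.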
From what I see in https://bitbucket.org/chrisatlee/poster/src/bd5ab4c5005c/poster/encode.py, I don't see anything that would be forbidden by Google App Engine. Just upload it like any of your own code.
How to check available Python libraries on Google App Engine & add more?
Is SQLite available, or must we use GQL with their database system only?
Thank you in advance.
SQLite is there (but since you cannot write to files, you must use it in a read-only way, or on a :memory: database).
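For reference, using SQLite on a :memory: database looks like this with the standard sqlite3 module. This is a generic sketch with a made-up table name, not something specific to App Engine.

```python
import sqlite3

# ":memory:" keeps the whole database in RAM, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE greetings (id INTEGER PRIMARY KEY, text TEXT)")
conn.executemany(
    "INSERT INTO greetings (text) VALUES (?)",
    [("hello",), ("world",)],
)

rows = conn.execute("SELECT text FROM greetings ORDER BY id").fetchall()
print(rows)

conn.close()
```

The data disappears when the connection is closed, so this only suits scratch work within a single request or process.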
App engine docs do a good job at documenting what's there. You can add any other pure-python library, typically as a zipfile of .py (NOT .pyc) files to upload in the main directory of your app (you can directly import from inside the zipfile, of course).
A few more pure-Python third-party libraries included with app engine are listed and documented here -- the paragraph on zipimport at this URL has a bit more details on the ways and limitations of using zipfiles to add more third-party pure-Python libs to your app.
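The zipfile approach can be tried locally with a small sketch like the one below. The archive name thirdparty.zip and the module zipped_demo are made up for the example; in a real app the zip would sit in the app's main directory and contain the library's .py files.

```python
import importlib
import os
import sys
import tempfile
import zipfile

# Build a zip archive containing one pure-Python module (.py, not .pyc).
zip_path = os.path.join(tempfile.mkdtemp(), "thirdparty.zip")
with zipfile.ZipFile(zip_path, "w") as zf:
    zf.writestr("zipped_demo.py", "ANSWER = 42\n")

# Putting the archive itself on sys.path lets zipimport load modules from it.
sys.path.insert(0, zip_path)
importlib.invalidate_caches()

import zipped_demo

print(zipped_demo.ANSWER)
```

zipimport only handles Python modules, so anything that expects to read its own data files from disk may not work when zipped.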
AFAIK, you can only use the GAE-specific database.