I am building a news aggregator app, and the backend can be separated (mostly) into two logical parts:
Crawling, information extraction, parsing, clustering, storing...
Serving the user requests
What I would like to do is:
a) create a heavy Google Compute Engine VM instance to do the crawling (since that isn't doable on Google App Engine, where instance RAM is relatively small)
b) create a Google App Engine group of instances to serve the client requests, which are light-weight and don't require much computational power per request
Is it possible to mix the two, Google App Engine and Google Compute Engine?
Or do I need to create the instance group on my own via GCE?
Another option you should explore is App Engine Flexible. (disclaimer, I work at Google on App Engine)
We allow you to build an App Engine application that has multiple modules. Those modules run on GCE virtual machines, which are managed by App Engine. We auto-scale, auto-provision, etc. Under the hood, we're actually provisioning a managed instance group and an autoscaler the same way you would with GCE (just with none of the work on your side). You can also customize the CPU and memory of the machines we run your app on.
That way, both your front end and back end can run in the same project. Check out:
https://cloud.google.com/appengine/docs/flexible/python/
Hope this helps!
I am trying to create a Django project which uses App Engine's task queue, and would like to test it locally before deploying (using gcloud's dev_appserver.py).
I can't seem to find resources that help with local development, and the closest thing was a Medium article that helps with setting up Django with Datastore (https://medium.com/#bcrodrigues/quick-start-django-datastore-app-engine-standard-3-7-dev-appserver-py-way-56a0f90c53a3).
Does anyone have an example that I could look into for understanding how to start my implementation?
I don't think you can test it locally. I'd create a new app engine project and test it there. You should be able to stay within the free quota.
Once you have it working, you can write unit tests with mocks of task queue API calls.
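For example, here is a minimal sketch of what such a test could look like, assuming your Django code wraps the task queue call in a helper of your own (the myapp.tasks module, enqueue_article() function and CloudTasksClient name below are hypothetical placeholders, not part of your project or the question):

    import unittest
    from unittest import mock

    # "myapp.tasks" stands in for whichever module in your Django project
    # actually talks to the task queue API inside enqueue_article().
    from myapp import tasks


    class EnqueueTests(unittest.TestCase):
        @mock.patch("myapp.tasks.CloudTasksClient")
        def test_enqueue_article_creates_one_task(self, mock_client_cls):
            mock_client = mock_client_cls.return_value

            tasks.enqueue_article(article_id=42)

            # The helper should have asked the (mocked) client to create
            # exactly one task; no real API call is made in the test.
            mock_client.create_task.assert_called_once()


    if __name__ == "__main__":
        unittest.main()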
I am new to Google App Engine and I am a little bit confused by answers related to connecting to a local Datastore.
My ultimate goal is to stream data from Google Datastore to a BigQuery dataset, similar to https://blog.papercut.com/google-cloud-dataflow-data-migration/. I have a copy of this Datastore locally, accessible when I run a local App Engine, i.e. I can access it through an admin console when I use ${GOOGLE_SDK_PATH}/dev_appserver.py --datastore_path=./datastore.
I would like to know if it is possible to connect to this datastore using services outside of the App Engine instance, with the Python google-cloud-datastore library or even Apache Beam's ReadFromDatastore method. If not, should I use the Datastore Emulator with the file generated by the App Engine Datastore?
If anyone has an idea on how to proceed, I would be more than grateful.
If it is possible, it would have to be through the Datastore Emulator, which can also serve apps other than App Engine ones. But it ultimately depends on the implementation of the libraries you intend to use - whether the underlying access methods understand the DATASTORE_EMULATOR_HOST environment variable pointing to a running Datastore emulator and use that instead of the real Datastore. I guess you'll just have to give it a try.
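As a rough illustration (not tested against your particular setup), this is how the Python google-cloud-datastore client can be pointed at a running emulator via that environment variable; the host, port and project id below are just placeholders:

    import os

    from google.cloud import datastore

    # The client library reads these at construction time; adjust them to
    # wherever your emulator is actually listening.
    os.environ["DATASTORE_EMULATOR_HOST"] = "localhost:8081"
    os.environ["DATASTORE_PROJECT_ID"] = "my-local-project"

    client = datastore.Client(project="my-local-project")

    # Simple round trip to verify the calls go to the emulator,
    # not to the real Cloud Datastore.
    key = client.key("Article", 1)
    entity = datastore.Entity(key=key)
    entity.update({"title": "hello"})
    client.put(entity)
    print(client.get(key))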
But be aware that the local storage dir internal format used by the Datastore Emulator may be different from the one used by the development server, so make a backup of your .datastore dir before trying stuff, just in case. From Local data format conversion:
Currently, the local Datastore emulator stores data in sqlite3 while the Cloud Datastore Emulator stores data as Java objects.
When dev_appserver is launched with legacy sqlite3 data, the data will be converted to Java objects. The original data is backed up with the filename {original-data-filename}.sqlitestub.
In the Google App Engine Firebase tic-tac-toe example here: https://cloud.google.com/solutions/using-firebase-real-time-events-app-engine
ndb is used to create the Game data model. This model is used in the code to store the state of the tic-tac-toe game. I thought ndb was used to store data in Cloud Datastore, but, as far as I can tell, nothing is being stored in the Cloud Datastore of the associated Google Cloud project. I think this is because I am launching the app in 'dev mode' with python dev_appserver.py app.yaml. In this case, is the data being stored in memory instead of actually being written to Cloud Datastore?
You're correct, running the application locally uses a datastore emulation contained inside dev_appserver.py.
The data is not stored in memory, but on the local disk. So even if the development server restarts, it will still find the "datastore" data written in a previous execution.
You can check the data actually saved using the local development server's admin interface at http://localhost:8000/datastore
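For instance, a small ndb snippet like the following, placed in one of your handlers (the Game fields here are placeholders, not the actual sample's model), would under dev_appserver.py write to that local datastore file rather than to your Cloud project:

    from google.appengine.ext import ndb


    class Game(ndb.Model):
        # Placeholder fields; the tic-tac-toe sample defines its own.
        board = ndb.StringProperty()
        moveX = ndb.BooleanProperty()


    # Under dev_appserver.py this put() lands in the local datastore file,
    # which is why nothing shows up in your project's Cloud Datastore.
    game = Game(id="game-1", board=" " * 9, moveX=True)
    game.put()
    print(Game.get_by_id("game-1"))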
Dan's answer is correct; your "dev_appserver.py" automatically creates a local datastore.
I would like to add that if you do wish to emulate a real Cloud Datastore environment and be able to generate usable indexes for your production Cloud Datastore, we have an emulator that can do that. I assume that's why you want your dev app to use the real Datastore?
Either way, if you are just doing testing and need persistent storage to test against (not for production), then both the default dev server local storage and the Cloud Datastore Emulator would suffice.
I am currently running a Python app on GAE (App Engine standard). It's a small app running only 1 instance, and when I load test it, it increases/decreases instances as needed and I get billed for usage. Fine and dandy.
Now when it comes to Node.js, as per the documentation I am on the beta version of App Engine flexible (previously Managed VMs - thanks, marketing). I created an app and deployed it for the first time, and it created 2 Compute Engine instances (1 CPU, small, 1.7 GB mem). As far as I can tell from the documentation, these 2 instances will be running even if there is no load - is that correct? So App Engine flexible has a minimum deployment of 2 instances, correct? In App Engine standard I have days where I get charged nothing because no one visited the app (I know it's not popular - that's another issue). But in the case of Node.js/flexible I will be charged for those days, correct?
Thanks for any clarification. Hmm... this gives a whole new meaning to the word "flexible".
I built a complex set of NDB models using ndb.Model and ndb.PolyModel with lots of StructuredProperty and specialized ndb.Property attributes for my Python App Engine app.
Is it possible to use the google.appengine.ext.ndb etc. libraries that work on App Engine on a Compute Engine instance as well?
This way I could use the same, comfortable NDB object model on both App Engine and Compute Engine for storing and querying data.
NDB does not currently support Google Cloud Datastore (the API you can call from Compute Engine), but we are working on adding it. We don't have a timeline to share at this time, but you can receive notifications about the feature by subscribing to this issue on our GitHub tracker:
https://github.com/GoogleCloudPlatform/google-cloud-datastore/issues/2
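In the meantime, if you only need to read and write entities from Compute Engine (without the NDB model layer), the standalone Cloud Datastore client library works there today, relying on the instance's default service account. A rough sketch, where the project id, kind and property names are placeholders rather than anything from your models:

    from google.cloud import datastore

    # On a Compute Engine instance the client picks up the default service
    # account credentials automatically; the project id is a placeholder.
    client = datastore.Client(project="my-project")

    # Query entities of a placeholder kind, ordered by a placeholder property.
    query = client.query(kind="Article")
    query.order = ["-published"]
    for entity in query.fetch(limit=10):
        print(entity.key, dict(entity))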