Getting Objects with urllib2 - python

I have two GAE apps working in conjunction. One holds an object in a database, and the other gets that object from the first app. Below is the bit of code where the first app is asked for, and returns, the Critter object. I am trying to access the first app's object via urllib2. Is this really possible? I know it can be used for JSON, but can it be used for objects?
Just for some context: I am developing this as a project for a class. The students will learn how to host a GAE app by creating their critters. Then they will give me the URLs for their critters, and my app will use those URLs to collect all of their critters and put them into my app's world.
I've only recently heard about pickle and have not looked into it yet; might that be a better alternative?
critter.py:
class Access(webapp2.RequestHandler):
    def get(self):
        creature = CritStore.all().order('-date').get()
        if creature:
            stats = loads(creature.stats)
            return SampleCritter(stats)
        else:
            return SampleCritter()
map.py:
class Out(webapp2.RequestHandler):
    def post(self):
        url = self.request.POST['url']  # from a simple html textbox
        critter = urllib2.urlopen(url)
        ...work with critter as if it were the critter object...

Yes, you can use pickle.
Here is some sample code to transfer an entity, including the key:
entity_dict = entity.to_dict() # First create a dict of the NDB entity
entity_dict['entity_ndb_key_safe'] = entity.key.urlsafe() # add the key of the entity to the dict
pickled_data = pickle.dumps(entity_dict, 1) # serialize the object
encoded_data = base64.b64encode(pickled_data) # encode it for safe transfer
As an alternative to urllib2 you can use GAE's urlfetch.fetch().
In the requesting app you can:
entity_dict = pickle.loads(base64.b64decode(encoded_data))
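Putting those pieces together, here is a minimal hedged sketch of both sides. CritStore and SampleCritter are the names from the question; everything else is illustrative, and note that unpickling data fetched from a URL you do not control is unsafe.
# Serving app (sketch): return the pickled payload from the Access handler.
import base64
import pickle

import webapp2

class Access(webapp2.RequestHandler):
    def get(self):
        creature = CritStore.all().order('-date').get()
        payload = {'stats': creature.stats}   # plain dict of whatever you want to share
        self.response.out.write(base64.b64encode(pickle.dumps(payload, 1)))

# Requesting app (sketch): fetch the URL and rebuild the object locally.
from google.appengine.api import urlfetch

result = urlfetch.fetch(url)                  # url comes from the posted form field
payload = pickle.loads(base64.b64decode(result.content))
critter = SampleCritter(payload['stats'])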

Related

How to use django commands to feed a db with an external API?

I'm learning Django and I want to feed my Django db with the https://pokeapi.co API so I can make a drop-down list in HTML with every Pokemon name, kept up to date.
fetchnames.py
import requests as r

def nameslist():
    payload = {'limit': 809}
    listpokemons = []
    response = r.get('https://pokeapi.co/api/v2/pokemon', params=payload)
    pokemons = response.json()
    for line in pokemons['results']:
        listpokemons.append(line['name'])
    return listpokemons

### Function that requests from the API and returns a list of pokemon names (['Bulbassaur', 'Ivyssaur',...)
core_app/management/commands/queryapi.py
from django.core.management.base import BaseCommand

from core_app.models import TablePokemonNames
from core_app.fetchnames import nameslist

class FetchApi(BaseCommand):
    help = "Update DB with https://pokeapi.co/"

    def add_model_value(self):
        table = TablePokemonNames()
        table.names = nameslist()
        table.save()
core_app/models.py
from django.db import models

class TablePokemonNames(models.Model):
    id = models.AutoField(primary_key=True)
    names = models.CharField(max_length=100)
I'm pretty sure I'm missing a lot since I'm still learning Django, and I'm still confused about how I should use Django commands, but I tried to make a Django command with the nameslist() function and nothing happened in the db. Is there something wrong with using a list to feed a db?
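A minimal hedged sketch of how a Django management command is usually wired up, not a definitive fix: Django only discovers a class literally named Command that subclasses BaseCommand and implements handle(), and a CharField stores a single string, so one row per name is assumed here.
# core_app/management/commands/queryapi.py (sketch)
from django.core.management.base import BaseCommand

from core_app.fetchnames import nameslist
from core_app.models import TablePokemonNames

class Command(BaseCommand):
    help = "Update DB with https://pokeapi.co/"

    def handle(self, *args, **options):
        # One row per pokemon name; bulk_create inserts them in a single query.
        names = nameslist()
        TablePokemonNames.objects.bulk_create(
            [TablePokemonNames(names=name) for name in names])
        self.stdout.write("Added %d pokemon names" % len(names))
The command would then be run with python manage.py queryapi.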

Odoo - How to access recordsets in a web controller

I am using a web controller in Odoo 8 to make a REST API that will get some data and return values from the database. The problem is that I am not able to reach the database through the built-in ORM.
I tried to call osv.pool.get() but it gave me the error:
AttributeError: type object 'Model' has no attribute 'pool'
Odoo 8 apparently uses recordsets, but I can't use them either, and I couldn't find anything useful in the docs.
How can I browse database data in a web controller?
My code:
class TestWebService(http.Controller):
    @http.route('/test', type='http', auth='none')
    def test(self):
        objects = osv.osv.pool.get("some_table")
        # I need to get the objects from some_table and search them
        return "Hello World"
Try the following:
myobj = request.env['some.table']
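A slightly fuller sketch of how that can be used inside the controller; this assumes the model is registered as some.table, that the route runs with an authenticated user (with auth='none' no user is bound to the environment, so you may need sudo()), and the 'name' field is only illustrative:
from openerp import http
from openerp.http import request

class TestWebService(http.Controller):

    @http.route('/test', type='http', auth='user')
    def test(self):
        # request.env exposes the ORM as recordsets in Odoo 8.
        records = request.env['some.table'].search([], limit=10)
        names = records.mapped('name')  # assumes the model has a 'name' field
        return "Found: %s" % ", ".join(names)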

Does web scraping have patterns?

I have not done much web scraping in my experience. So far I am using Python with BeautifulSoup4 to scrape the Hacker News page.
I was just wondering if there are patterns I should keep in mind when scraping. Right now the code looks very ugly and I feel like a hack.
Code:
import requests
from bs4 import BeautifulSoup
from django.core.management.base import BaseCommand

class Command(BaseCommand):
    page = {}
    td_count = 2
    data_count = 0

    def handle(self, *args, **options):
        for i in range(1, 4):
            self.page_no = i
            self.parse()
        print self.page[1]

    def get_result(self):
        return requests.get('https://news.ycombinator.com/news?p=%s' % self.page_no)

    def parse(self):
        soup = BeautifulSoup(self.get_result().text, 'html.parser')
        for x in soup.find_all('table')[2].find_all('tr'):
            self.data_count += 1
            self.page[self.data_count] = {'other_data': None, 'url': ''}
            if self.td_count % 3 == 0:
                try:
                    subtext = x.find_all('td', 'subtext')[0]
                    self.page[self.data_count - 1]['other_data'] = subtext
                except IndexError:
                    pass
            title = x.find_all('td', 'title')
            if title:
                try:
                    self.page[self.data_count]['url'] = title[1].a
                    print title[1].a
                except IndexError:
                    print 'Done page %s' % self.page_no
            self.td_count += 1
Actually, I treat scrapable data as part of my domain (business) data, which allows me to use Domain Driven Design to structure the problem:
Entities and Value Objects
I use entities and value objects to hold the extracted information in my programming language's data structures, so I can work with it cleanly.
Repository Pattern
I use the repository pattern to delegate the job of gathering data to a separate class. The repository class is given a site, fetches the data, and pre-builds the entities if needed.
Transformer/Presenter pattern
After fetching the data from the repository, I pass the html data to a presenter class. The presenter class has the duty of creating my business entity/value objects from the given HTML string.
Service Layer
If there is more processing than described above, I make a service class that wraps the whole problem: it calls the repository, hands the fetched data to the presenter, the presenter builds the entities, and the result may then be used by another service, for example to store it in a SQL database. A rough Python sketch of this layering follows below.
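A minimal, hypothetical Python sketch of that layering; the class names, CSS selectors, and Hacker News URL are illustrative assumptions, not code from the answer's project:
import requests
from bs4 import BeautifulSoup

class Story(object):
    """Entity: one scraped item held in plain data structures."""
    def __init__(self, title, url):
        self.title = title
        self.url = url

class HackerNewsRepository(object):
    """Repository: knows how to fetch the raw HTML for a given page."""
    def fetch_page(self, page_no):
        return requests.get('https://news.ycombinator.com/news?p=%s' % page_no).text

class StoryPresenter(object):
    """Presenter/transformer: turns raw HTML into entities."""
    def build(self, html):
        soup = BeautifulSoup(html, 'html.parser')
        return [Story(title=a.get_text(), url=a.get('href'))
                for a in soup.select('td.title a')]

class ScrapeService(object):
    """Service: wires the repository and the presenter together."""
    def __init__(self, repository, presenter):
        self.repository = repository
        self.presenter = presenter

    def stories_for_page(self, page_no):
        html = self.repository.fetch_page(page_no)
        return self.presenter.build(html)

# usage
service = ScrapeService(HackerNewsRepository(), StoryPresenter())
for story in service.stories_for_page(1):
    print("%s -> %s" % (story.title, story.url))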
If you are familiar with PHP, I have programmed a small app in Laravel which fetches the Alexa rank of a given website every 15 minutes and notifies that website's subscribers by email.
GitHub repository: Alexa Watcher
Folder of Repository classes
Command line application layer class which calls the service
The Service class which is also a presenter that builds needed entities.
The Service class which pushes detected changes to subscriber emails.

Updating DataStore JSON values using endpoints (Python)

I am trying to use endpoints to update some JSON values in my datastore. I have the following Datastore in GAE...
class UsersList(ndb.Model):
    UserID = ndb.StringProperty(required=True)
    ArticlesRead = ndb.JsonProperty()
    ArticlesPush = ndb.JsonProperty()
In general what I am trying to do with the API is have the method take in a UserID and a list of articles read (with an article being represented by a dictionary holding an ID and a boolean field saying whether or not the user liked the article). My messages (centered on this logic) are the following...
class UserID(messages.Message):
    id = messages.StringField(1, required=True)

class Articles(messages.Message):
    id = messages.StringField(1, required=True)
    userLiked = messages.BooleanField(2, required=True)

class UserIDAndArticles(messages.Message):
    id = messages.StringField(1, required=True)
    items = messages.MessageField(Articles, 2, repeated=True)

class ArticleList(messages.Message):
    items = messages.MessageField(Articles, 1, repeated=True)
And my API/Endpoint method that is trying to do this update is the following...
@endpoints.method(UserIDAndArticles, ArticleList,
                  name='user.update',
                  path='update',
                  http_method='GET')
def get_update(self, request):
    userID = request.id
    articleList = request.items
    queryResult = UsersList.query(UsersList.UserID == userID)
    currentList = []
    # This query always returns only one result back, and this for loop is the only way
    # I could figure out how to access the query results.
    for thing in queryResult:
        currentList = json.loads(thing.ArticlesRead)
    for item in articleList:
        currentList.append(item)
    for blah in queryResult:
        blah.ArticlesRead = json.dumps(currentList)
        blah.put()
    for thisThing in queryResult:
        pushList = json.loads(thisThing.ArticlesPush)
    return ArticleList(items=pushList)
I am having two problems with this code. The first is that I can't seem to figure out (using the localhost Google APIs Explorer) how to send a list of articles to the endpoints method using my UserIDAndArticles class. Is it possible to have a messages.MessageField() as an input to an endpoint method?
The other problem is that I am getting an error on the 'blah.ArticlesRead = json.dumps(currentList)' line. When I try to run this method with some random inputs, I get the following error...
TypeError: <Articles
id: u'hi'
userLiked: False> is not JSON serializable
I know that I have to make my own JSON encoder to get around this, but I'm not sure what the format of the incoming request.items is like and how I should encode it.
I am new to GAE and endpoints (as well as this kind of server side programming in general), so please bear with me. And thanks so much in advance for the help.
A couple things:
http_method should definitely be POST, or better yet PATCH because you're not overwriting all existing values but only modifying a list, i.e. patching.
you don't need json.loads and json.dumps, NDB does it automatically for you.
you're mixing Endpoints messages and NDB model properties.
Here's the method body I came up with:
# get UsersList entity and raise an exception if none found.
uid = request.id
userlist = UsersList.query(UsersList.UserID == uid).get()
if userlist is None:
    raise endpoints.NotFoundException('List for user ID %s not found' % uid)

# update user's read articles list, which is actually a dict.
for item in request.items:
    userlist.ArticlesRead[item.id] = item.userLiked
userlist.put()

# assuming userlist.ArticlesPush is actually a list of article IDs.
pushItems = [Articles(id=id) for id in userlist.ArticlesPush]
return ArticleList(items=pushItems)
Also, you should probably wrap this method in a transaction.
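For the transaction, here is a minimal hedged sketch using NDB's transactional decorator; the helper name is made up, and the entity is re-fetched by key inside the transaction because non-ancestor queries are not allowed there:
from google.appengine.ext import ndb

@ndb.transactional
def _append_read_articles(userlist_key, items):
    # Re-fetch by key inside the transaction so the read-modify-write is atomic.
    userlist = userlist_key.get()
    for item in items:
        userlist.ArticlesRead[item.id] = item.userLiked
    userlist.put()

# In the endpoint method, after the initial lookup:
# _append_read_articles(userlist.key, request.items)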

Best way to make subapps with Traversal

Ok, so I have my app, which takes requests from the root /. Almost everything is using traversal.
But I'd like to build a REST API on top of that site.
So I have two choices: I either separate it into two different apps and put the REST application at rest.site.com, or I can move it to site.com/rest/*traversal.
If I'm doing "/rest/*traversal", I guess I'll have to add a route called rest_traversal whose traversal path is *traversal, with the pattern /rest/*traversal. I did that once for an admin page.
I was wondering if there is a cleaner way to do that. I tried to use virtual_root, but as I understand it, virtual_root actually gets added to the path for traversal.
For example, having virtual_root = /cms and requesting /fun will create the path /cms/fun.
I, on the other hand, wish to have /cms/fun turned into /fun.
I know this has been answered already, but in case someone arrives here looking for another possible way to make "subapps" and use them in Pyramid, I wanted to point out that some interesting things can be done with pyramid.wsgi:
"""
example of wsgiapp decorator usage
http://docs.pylonsproject.org/projects/pyramid/en/1.3-branch/api/wsgi.html
"""
from pyramid.wsgi import wsgiapp2, wsgiapp
from pyramid.config import Configurator
from webob import Request, Response
import pprint
# define some apps
def wsgi_echo(environ, start_response):
    """pretty print out the environ"""
    response = Response(body=pprint.pformat({k: v for k, v in environ.items()
                                             if k not in ["wsgi.errors",
                                                          "wsgi.input",
                                                          "SCRIPT_NAME"]}))
    return response(environ, start_response)
print Request.blank("/someurl").send(wsgi_echo).body
# convert wsgi app to a pyramid view callable
pyramid_echo = wsgiapp(wsgi_echo)
pyramid_echo_2 = wsgiapp2(wsgi_echo)
# wire up a pyramid application
config = Configurator()
config.add_view(pyramid_echo, name="foo") # /foo
config.add_view(pyramid_echo, name="bar") # /bar
config.add_view(pyramid_echo_2, name="foo_2") # /foo
config.add_view(pyramid_echo_2, name="bar_2") # /bar
pyramid_app = config.make_wsgi_app()
#call some urls
foo_body = Request.blank("/foo").send(pyramid_app).body
bar_body = Request.blank("/bar").send(pyramid_app).body
foo_body_2 = Request.blank("/foo_2").send(pyramid_app).body
bar_body_2 = Request.blank("/bar_2").send(pyramid_app).body
# both should be different because we arrived at 2 different urls
assert foo_body != bar_body, "bodies should not be equal"
# should be equal because wsgiapp2 fixes stuff before calling
# application in fact there's an additional SCRIPT_NAME in the
# environment that we are filtering out
assert foo_body_2 == bar_body_2, "bodies should be equal"
# so how to pass the path along? like /foo/fuuuu should come back
# /fuuuu does it
foo_body = Request.blank("/foo_2/fuuuu").send(pyramid_app).body
assert "'/fuuuu'," in foo_body, "path didn't get passed along"
# tldr: a wsgi app that is decorated with wsgiapp2 will receive data
# as if it was mounted at "/", any url generation it has to do should
# take into account the SCRIPT_NAME variable that may arrive in the
# environ when it is called
If you're using traversal already, why not just use it to return your "rest API root" object when Pyramid traverses to /rest/? From there, everything will work naturally.
class ApplicationRoot(object):
    def __getitem__(self, name):
        if name == "rest":
            return RestAPIRoot(parent=self, name=name)
        ...
If your "application tree" and "API tree" have the same children and you want to have different views registered for them depending on which branch of the tree the child is located in, you can use containment view predicate to register your API views, so they will only match when the child is inside the "API branch":
containment
This value should be a reference to a Python class or interface that a
parent object in the context resource’s lineage must provide in order
for this view to be found and called. The resources in your resource
tree must be “location-aware” to use this feature.
If containment is not supplied, the interfaces and classes in the
lineage are not considered when deciding whether or not to invoke the
view callable.
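A hedged sketch of what such a registration might look like; ApplicationRoot, RestAPIRoot and UserResource are hypothetical resource classes standing in for your own tree:
from pyramid.config import Configurator

class ApplicationRoot(object):
    """Stub root resource (hypothetical)."""
    __name__ = ''
    __parent__ = None

class RestAPIRoot(object):
    """Stub API-branch resource (hypothetical)."""
    def __init__(self, parent, name):
        self.__parent__ = parent
        self.__name__ = name

class UserResource(object):
    """Stub leaf resource (hypothetical)."""
    def __init__(self, parent, name):
        self.__parent__ = parent
        self.__name__ = name

def user_api_view(context, request):
    # Only invoked when RestAPIRoot is in the context's lineage,
    # i.e. when traversal went through the /rest/ branch.
    return {'name': context.__name__}

config = Configurator()
config.add_view(user_api_view,
                context=UserResource,
                containment=RestAPIRoot,
                renderer='json')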
Another approach would be not to build a separate "API tree" but to use your "main" application's "URI-space" as RESTful API. The only problem with this is that GET and possibly POST request methods are already "taken" on your resources and mapped to your "normal" views which return HTML or consume HTTP form POSTs. There are numerous ways to work around this:
register the API views with a separate name, so, say GET /users/123 would return HTML and GET /users/123/json would return a JSON object. Similarly, POST /users/123 would expect HTTP form to be posted and POST /users/123/json would expect JSON. A nice thing about this approach is that you can easily add, say, an XML serializer at GET /users/123/xml.
use custom view predicates so GET /users/123 and GET /users/123?format=json are routed to different views. Actually, there's a built-in request_param predicate for that since Pyramid 1.2
use xhr predicate to differentiate requests based on HTTP_X_REQUESTED_WITH header or accept predicate to differentiate on HTTP_ACCEPT header
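As a rough illustration of the request_param approach from the list above (the User resource class and the view bodies are hypothetical):
from pyramid.config import Configurator
from pyramid.response import Response

class User(object):
    """Hypothetical resource class for /users/123."""

def user_html_view(context, request):
    # Default view: GET /users/123 returns HTML.
    return Response('<h1>User page</h1>')

def user_json_view(context, request):
    # The same URL with ?format=json is routed here instead.
    return {'id': 123, 'name': 'example'}

config = Configurator()
config.add_view(user_html_view, context=User)
config.add_view(user_json_view, context=User,
                request_param='format=json', renderer='json')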
