How to POST/PUT scraped data to RESTful API - python

I have a scraper that outputs JSON. I would like to programmatically read this output (e.g. on a daily basis) and deserialize it into my Django model, through a RESTful API like Tastypie. I would like to check for any duplicate entries / validate data before updating the model.
What is the best practice and most seamless way to do this?
--
JSON Output From Scraper (returns structured data)
Note: exchange_id is the foreign key of the Exchange object in my Django model
{
    "website": "http://www.igg.com/",
    "exchange_id": 1,
    "ticker": "8002",
    "full_name": "IGG Inc"
}
Django model
class Company(models.Model):
    ticker = models.CharField(max_length=10, null=True)
    full_name = models.CharField(max_length=200, null=True)
    exchange = models.ForeignKey(Exchange, null=True)
    website = models.URLField(null=True)

    def __unicode__(self):
        return self.ticker

    def website_url(self):
        if self.website:
            return '<a href="%s">%s</a>' % (self.website, self.website)
        else:
            return ''
    website_url.allow_tags = True

    class Meta:
        verbose_name_plural = "Companies"

I am going to assume that your app is private, and only you will have access to it. What you can do is implement django-restless with a model form.
from restless.http import Http201, Http400
from restless.views import Endpoint
from .forms import NewCompanyForm

class APIEndpoint(Endpoint):
    """
    Endpoint for posting json data to server
    """
    def post(self, request):
        company_form = NewCompanyForm(request.data)
        if company_form.is_valid():
            # Check for duplicate data
            # ...
            if unique:
                company_form.save()
                return Http201({"message": "Post successful"})
            else:
                return Http400(reason='Data was not unique')
        else:
            return Http400(reason='You did not post a valid input')
Also, here is an example app using this library, https://github.com/dobarkod/django-restless/blob/master/testproject/testapp/views.py
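The duplicate check that the endpoint leaves elided could take many forms. Here is a minimal sketch in plain Python, assuming a company is identified by its (exchange_id, ticker) pair. That key choice and the helper name split_new_and_duplicates are assumptions for illustration, not something stated in the question:

```python
def split_new_and_duplicates(records, existing_keys):
    """Separate scraped records into new ones and duplicates.

    `existing_keys` is a set of (exchange_id, ticker) tuples already stored.
    Treating that pair as the natural key is an assumption for illustration.
    """
    seen = set(existing_keys)
    new, duplicates = [], []
    for record in records:
        key = (record["exchange_id"], record["ticker"])
        if key in seen:
            duplicates.append(record)
        else:
            seen.add(key)  # also catches repeats within the same scrape
            new.append(record)
    return new, duplicates


scraped = [
    {"website": "http://www.igg.com/", "exchange_id": 1,
     "ticker": "8002", "full_name": "IGG Inc"},
    {"website": "http://www.igg.com/", "exchange_id": 1,
     "ticker": "8002", "full_name": "IGG Inc"},
]
new, duplicates = split_new_and_duplicates(scraped, existing_keys=set())
```

In a Django view the existing keys could come from Company.objects.values_list("exchange_id", "ticker"). A unique_together constraint on the model would enforce the same rule at the database level.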

As far as I understand, you need a tool to POST/PUT data to your service through its REST API. Look at slumber. It is very simple and works well with Tastypie.
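To make the client side concrete, here is a rough sketch that builds the POST request for one scraped record. It uses only the standard library instead of slumber so that it stays dependency-free. The API root, the resource name "company", and the exchange resource URI layout are all assumptions about how the Tastypie API might be set up:

```python
import json
from urllib.request import Request

API_ROOT = "http://localhost:8000/api/v1"  # hypothetical Tastypie API root


def to_payload(record):
    """Turn one scraped record into a Tastypie-style payload.

    Tastypie usually expects related objects as resource URIs, so the
    numeric exchange_id is rewritten; the URI layout is an assumption.
    """
    payload = {k: v for k, v in record.items() if k != "exchange_id"}
    payload["exchange"] = "/api/v1/exchange/%d/" % record["exchange_id"]
    return payload


def build_post_request(record, resource="company"):
    """Build (but do not send) a JSON POST request for one record."""
    body = json.dumps(to_payload(record)).encode("utf-8")
    return Request(
        "%s/%s/" % (API_ROOT, resource),
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


record = {"website": "http://www.igg.com/", "exchange_id": 1,
          "ticker": "8002", "full_name": "IGG Inc"}
request = build_post_request(record)
# urllib.request.urlopen(request) would send it; omitted so the sketch runs offline.
```

A daily cron job or management command could loop over the scraper output and send one such request per record, after filtering out duplicates.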

Related

Creating new model instance through views.py including args through a url

I am trying to create a new model instance every time a URL is accessed. So far, the function in my views.py works, but the new model instance is created with empty fields (because I have not specified in the view what should go in them).
views.py
def session_invent(self):
    session = Session()  # I can add field data in here, but I want to get it via the URL
    session.save()
    messages.success(self, f'session invented!')
    return redirect('blog-home')
urls.py
path('session/invent/', views.session_invent, name="session-invent"),
models.py
class Session(models.Model):
    uid = models.CharField(max_length=50)
    cid = models.CharField(max_length=50)
    qid = models.CharField(max_length=50)
    aid = models.CharField(max_length=50)
    session_date = models.DateTimeField(auto_now_add=True)

    def qid_plus_aid(self):
        return '{}_{}'.format(self.qid, self.aid)

    def __str__(self):
        return self.uid

    def get_absolute_url(self):
        return reverse('session-detail', kwargs={'pk': self.pk})
OK, so here is what I am trying to pull off:
Right now, if I enter mywebsite.com/session/invent/, a new Session model instance is created with empty fields. Is there a way I can fill in those fields with args in the URL? For example, something like...
mywebsite.com/session/invent/?uid=test_uid&cid=test_cid&qid=test_qid&aid=test_aid
Finished Answered code:
From the answer below here is how the updated views.py should look:
def session_invent(request):
    session = Session.objects.create(
        uid=request.GET['uid'],
        cid=request.GET['cid'],
        qid=request.GET['qid'],
        aid=request.GET['aid']
    )
    messages.success(request, f'session invented from URL!')
    return redirect('blog-home')
So, If I enter the following URL, a new record is created in my database with the values in each field set from the URL:
mywebsite.com/session/invent/?uid=test_uid&cid=test_cid&qid=test_qid&aid=test_aidz
Yes, the parameters are stored in the querystring, and you can use request.GET for a dictionary-like representation of the querystring, so:
def session_invent(request):
    session = Session.objects.create(
        uid=request.GET['uid'],
        cid=request.GET['cid'],
        qid=request.GET['qid'],
        aid=request.GET['aid']
    )
    messages.success(request, f'session invented!')
    return redirect('blog-home')
This will raise an HTTP 500 if one of the keys is missing from request.GET. You can use request.GET.get(…) [Django-doc] to access an element with an optional default value.
A GET request is however not supposed to have side effects. It is furthermore quite odd for a POST request to have a querystring.
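The request.GET.get(…) suggestion can be sketched as a small helper that collects the four fields and reports which ones are absent. The helper name extract_session_fields and the REQUIRED_FIELDS tuple are assumptions for illustration. A plain dict stands in for request.GET here so the snippet runs without Django:

```python
REQUIRED_FIELDS = ("uid", "cid", "qid", "aid")


def extract_session_fields(params):
    """Return (fields, missing) from a dict-like object such as request.GET.

    Uses .get() so a missing key yields None instead of raising.
    """
    fields = {name: params.get(name) for name in REQUIRED_FIELDS}
    missing = [name for name, value in fields.items() if not value]
    return fields, missing


fields, missing = extract_session_fields({"uid": "test_uid", "cid": "test_cid"})
print(missing)  # ['qid', 'aid']
```

In the view, a non-empty missing list could be answered with HttpResponseBadRequest, and otherwise Session.objects.create(**fields) would create the record. Reading the same fields from request.POST in a POST-only view would avoid side effects on GET.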

Object of type User is not JSON serializable

I am creating an autocomplete search for my web page, where I am trying to fetch the names of users from the database using AJAX calls.
My AJAX call is running fine and going to the designated URL.
I have tried using a JSON encoder, but that did not work either.
I am a bit new to Django. Please help.
My views.py
def autocomplete(request):
    if request.is_ajax():
        q = request.GET.get('search', '').capitalize()
        search_qs = Profile.objects.filter(user__username__startswith=q)
        results = []
        print(q)
        for r in search_qs:
            results.append(r.user)
        data = json.dumps(list(results), cls=DjangoJSONEncoder)
    else:
        data = 'fail'
    mimetype = 'application/json'
    return HttpResponse(data, mimetype)
My models.py
class Profile(models.Model):
    user = models.OneToOneField(User, on_delete=models.CASCADE)
    image = models.ImageField(default='default.jpg', upload_to='profile_pics')

    def __str__(self):
        return f'{self.user.username} Profile'
Error I am getting
TypeError: Object of type User is not JSON serializable
[] "GET /ajax_calls/search/?**term=he** HTTP/1.1" 500 15860
I don't know why you are querying through Profile when you can query User directly. I think the proper implementation should look like this:
import json
from django.core.serializers import serialize
from django.core.serializers.json import DjangoJSONEncoder

users = User.objects.filter(username__startswith=q)
# `cls` is optional here, because DjangoJSONEncoder is already the default
str_data = serialize('json', users, cls=DjangoJSONEncoder)
data = json.loads(str_data)
Documentation can be found here.
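The error comes from handing model instances to the JSON encoder, which only knows how to encode plain types. The sketch below reproduces it without Django and shows one way around it. FakeUser is a stand-in for the real User model and exists only for illustration:

```python
import json


class FakeUser:
    """Stand-in for django.contrib.auth.models.User (illustration only)."""

    def __init__(self, username):
        self.username = username


users = [FakeUser("helen"), FakeUser("henry")]

try:
    json.dumps(users)
    error = None
except TypeError as exc:
    error = str(exc)  # the encoder cannot handle arbitrary objects

# Reduce each object to plain strings before encoding.
data = json.dumps([user.username for user in users])
print(data)  # ["helen", "henry"]
```

With the real model, User.objects.filter(username__startswith=q).values_list("username", flat=True) yields plain strings directly, which is usually all an autocomplete widget needs.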

How do i verify user in database in google appengine app ( can anyone recommend the best way to do user authentication for google appengine app)?

I have the following code in models.py. I can query the datastore by key, but not by string.
from google.appengine.ext import ndb

class Roles(ndb.Model):
    name = ndb.StringProperty()
    owner = ndb.KeyProperty(kind='User')
    created = ndb.DateTimeProperty(required=True, auto_now_add=True)

    class RESTMeta:
        user_owner_property = 'owner'
        include_output_properties = ['name']

class Users(ndb.Model):
    name = ndb.StringProperty()
    email = ndb.StringProperty()
    password = ndb.StringProperty()
    roles = ndb.KeyProperty(kind='Roles')
    owner = ndb.KeyProperty(kind='User')
    created = ndb.DateTimeProperty(required=True, auto_now_add=True)

    class RESTMeta:
        user_owner_property = 'owner'
        include_output_properties = ['name']
And the following in api.py
app = webapp2.WSGIApplication([
    RESTHandler(
        '/api/roles',  # The base URL for this model's endpoints
        models.Roles,  # The model to wrap
        permissions={
            'GET': PERMISSION_ANYONE,
            'POST': PERMISSION_ANYONE,
            'PUT': PERMISSION_OWNER_USER,
            'DELETE': PERMISSION_ADMIN
        },
        # Will be called for every PUT, right before the model is saved (also supports callbacks for GET/POST/DELETE)
        put_callback=lambda model, data: model
    ),
    RESTHandler(
        '/api/users',  # The base URL for this model's endpoints
        models.Users,  # The model to wrap
        permissions={
            'GET': PERMISSION_ANYONE,
            'POST': PERMISSION_ANYONE,
            'PUT': PERMISSION_OWNER_USER,
            'DELETE': PERMISSION_ADMIN
        },
        # Will be called for every PUT, right before the model is saved (also supports callbacks for GET/POST/DELETE)
        put_callback=lambda model, data: model
    )], debug=True, config=config)
I can successfully get users by key with api\users?q=roles=key('key')
How do I get a specific user by string, e.g. api\users?q=email=String('String')?
The question is: how do I do user authentication for a Google App Engine app?
You seem to be asking several questions in one.
To get a user by email, simply do this:
users = Users.query(Users.email == 'query_email').fetch(1)
# note: fetch() always returns a list
if users:
    user_exists = True
else:
    user_exists = False
Please note, you may need to update your datastore index to support that query. The easiest way to do it is to first run the code in your local development server, and the index will be automatically updated for you.
To answer your second question, for user authentication I would recommend Django's built-in user authentication. Please note that you can always run vanilla Django on App Engine with a Flexible VM, using CloudSQL instead of the Datastore.
Alternatively, you could use the Appengine Users API, though your users would need to have Google Accounts.
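If you do keep your own Users model with a password property, the stored value would normally be a salted hash rather than the plain text. Below is a minimal sketch using only the Python 3 standard library. The function names, the "salt$hash" storage format, and the iteration count are assumptions for illustration and are not part of any App Engine API:

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumed work factor; tune for your hardware


def hash_password(password, salt=None):
    """Return 'salt_hex$hash_hex', suitable for storing in a string property."""
    salt = salt or os.urandom(16)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS)
    return "%s$%s" % (salt.hex(), digest.hex())


def verify_password(password, stored):
    """Check a login attempt against the stored 'salt$hash' string."""
    salt_hex, digest_hex = stored.split("$")
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), bytes.fromhex(salt_hex), ITERATIONS)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(candidate.hex(), digest_hex)


stored = hash_password("s3cret")
```

A login handler would then fetch the user by email as shown above and call verify_password(submitted_password, user.password). An established framework such as Django's auth system already does all of this for you.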

Better Way to validate file by content-type in django

I have implemented the snippet hosted here:
https://djangosnippets.org/snippets/1303/
here is my code so far:
models.py
class Vehicle(models.Model):
    pub_date = models.DateTimeField('Date Published', auto_now_add=True)
    make = models.CharField(max_length=200)
    model = models.CharField(max_length=200)
    picture = models.FileField(upload_to='picture')

    def __unicode__(self):
        return self.title

    def get_absolute_url(self):
        return reverse('recipe_edit', kwargs={'pk': self.pk})
views.py
def vehicle_list(request, template_name='vehicle/vehicle_list.html'):
    if request.POST:
        form = VehicleForm(request.POST, request.FILES)
        if form.is_valid():
            form.save()
            return redirect('vehicle_list')
    else:
        form = VehicleForm()  # Create empty form
    vehicles = Vehicle.objects.all()  # Retrieve all vehicles from DB
    return render(request, template_name, {
        'vehicles': vehicles,
        'form': form
    }, context_instance=RequestContext(request))
forms.py
class VehicleForm(forms.ModelForm):
    class Meta:
        model = Vehicle

    def clean_picture(self):
        content = self.cleaned_data['picture']
        content_type = content.content_type.split('/')[0]
        if content_type in settings.CONTENT_TYPES:
            if content.size > settings.MAX_UPLOAD_SIZE:
                raise forms.ValidationError('Please keep file size under %s', filesizeformat(content.size))
        else:
            raise forms.ValidationError('File type is not supported')
From what I understand, this approach can still easily be bypassed by modifying the header. What I am asking is whether there is a better approach for this situation.
To verify that the file content matches the content type given by the client, you need a full-fledged database that describes how each content type can be detected.
You can rely on the libmagic project for that instead. Bindings for this library are available on PyPI: python-magic
You need to adjust your VehicleForm so that it does the content type detection:
import magic

class VehicleForm(forms.ModelForm):
    class Meta(object):
        model = Vehicle

    def clean_picture(self):
        content = self.cleaned_data['picture']
        try:
            content.open()
            # read only a small chunk or a large file could nuke the server
            file_content_type = magic.from_buffer(content.read(32768),
                                                  mime=True)
        finally:
            content.close()
        client_content_root_type = content.content_type.split('/')[0]
        file_content_root_type = file_content_type.split('/')[0]
        if client_content_root_type in settings.CONTENT_TYPES and \
                file_content_root_type in settings.CONTENT_TYPES:
            if content.size > settings.MAX_UPLOAD_SIZE:
                # interpolate the limit into the message before raising
                raise forms.ValidationError(
                    'Please keep file size under %s'
                    % filesizeformat(settings.MAX_UPLOAD_SIZE))
        else:
            raise forms.ValidationError('File type is not supported')
        return content
This chunk of code was written to show how it works, not with reducing redundant code in mind.
I wouldn't recommend doing this yourself in production code; I would recommend using existing code instead. If you really only need to verify that an image has been uploaded and can be viewed, use the ImageField form field. Note that ImageField uses Pillow to ensure that the image can be opened. This might or might not pose a threat to your server.
There are also several other projects that implement exactly this feature: ensuring that a file of a certain content type has been uploaded.
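To illustrate the idea behind libmagic, the sketch below compares the leading bytes of a file against a few well-known signatures. It is a deliberately tiny table (PNG, JPEG, GIF only) meant to show the principle, not a replacement for python-magic. The names SIGNATURES and sniff_content_type are assumptions for illustration:

```python
SIGNATURES = {
    b"\x89PNG\r\n\x1a\n": "image/png",
    b"\xff\xd8\xff": "image/jpeg",
    b"GIF87a": "image/gif",
    b"GIF89a": "image/gif",
}


def sniff_content_type(head):
    """Guess a MIME type from the first bytes of a file, or return None."""
    for signature, mime in SIGNATURES.items():
        if head.startswith(signature):
            return mime
    return None


print(sniff_content_type(b"\x89PNG\r\n\x1a\n" + b"\x00" * 8))  # image/png
print(sniff_content_type(b"<?php echo 1; ?>"))                 # None
```

Because the decision is based on the bytes actually uploaded, changing the Content-Type header alone no longer gets a file past the check.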

Django Tastypie add Content-Length header

I am quite new in Django development. I have a Resource and Model:
Model
class Player(models.Model):
    pseudo = models.CharField(max_length=32, unique=True)
ModelResource
class PlayerResource(ModelResource):
    class Meta:
        queryset = Player.objects.all()
        resource_name = 'player'
        authentication = BasicAuthentication()
        authorization = Authorization()
        serializer = Serializer(formats=['xml'])
        filtering = {
            'pseudo': ALL,
        }
I am requesting all the players with /api/v1/player/?format=xml, but the Content-Length response header seems to be missing, which causes some issues in my app. How can I add it to the response headers?
Thanks a lot.
The lack of Content-Length was due to a missing middleware.
For more information, see: How do I get the content length of a Django response object?
But you can also add Content-Length manually like this:
def create_response(self, request, data, response_class=HttpResponse, **response_kwargs):
    desired_format = self.determine_format(request)
    serialized = self.serialize(request, data, desired_format)
    response = response_class(content=serialized, content_type=build_content_type(desired_format), **response_kwargs)
    response['Content-Length'] = len(response.content)
    return response
You can add the Content-Length header by overriding the create_response method in your own resource, for example:
class MyResource(ModelResource):
    class Meta:
        queryset = MyModel.objects.all()

    def create_response(self, request, data, response_class=HttpResponse, **response_kwargs):
        # Let Tastypie build the response as usual, then set the header on it
        response = super(MyResource, self).create_response(
            request, data, response_class=response_class, **response_kwargs)
        response['Content-Length'] = len(response.content)
        return response
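One detail worth keeping in mind when setting the header by hand: Content-Length counts bytes, not characters. Using len(response.content) works because Django stores the content as a byte string. The sketch below shows the difference with a plain helper. The function name content_length and the sample XML are assumptions for illustration:

```python
def content_length(body, encoding="utf-8"):
    """Return the Content-Length value for a response body.

    The header counts bytes on the wire, so text is encoded first.
    """
    if isinstance(body, str):
        body = body.encode(encoding)
    return len(body)


xml = "<player><pseudo>Zoé</pseudo></player>"
print(len(xml))             # 37 characters
print(content_length(xml))  # 38 bytes: "é" takes two bytes in UTF-8
```

Computing the length from an unencoded string would under-report the size whenever a pseudo contains non-ASCII characters.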
