I recently started playing around with django-social-auth and am looking for some help from the community to figure out the best way to move forward with an idea.
Once a user has registered, you have access to their OAuth token, which allows you to pull certain data.
In my case I want to build a nice little profile based on the user's avatar, location, and maybe some other information if it's available.
Would the best way be to:
build a custom Celery task that pulls the information and builds the profile?
or, make use of signals to build the profile?
This really comes down to synchronous vs. asynchronous. Django signals are synchronous: they run inline and block the response until they complete. Celery tasks are asynchronous.
Which is better depends on whether the benefit of building the profile asynchronously outweighs the cost of maintaining the extra infrastructure Celery requires.
It's basically impossible to answer this without more specific information about your situation.
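The two are not mutually exclusive, either: a common pattern is to have the signal handler do nothing but enqueue a Celery task, so the response stays fast while the profile is built in a worker. A rough sketch, assuming django-social-auth's socialauth_registered signal and a hypothetical build_profile Celery task:

from django.dispatch import receiver
from social_auth.signals import socialauth_registered

from myapp.tasks import build_profile  # hypothetical Celery task

@receiver(socialauth_registered)
def queue_profile_build(sender, user, response, details, **kwargs):
    # The handler itself runs synchronously, but only long enough to enqueue the job
    build_profile.delay(user.pk)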
So, I am currently working on a Django project hosted on PythonAnywhere that includes a notifications feature and also receives external data from sensors through AWS. I have been thinking about the best practice for implementing this.
I currently have a simple implementation: a view that checks all notifications and performs the required actions, plus an always-on task (which simply means a script running independently) that sends a REST request to the server every minute.
Server side:
views.py:
from django.http import HttpResponse
from .models import notifications

def checkNotifications(request):
    # Collect the distinct "thing" values that currently have notifications
    notificationsObject = notifications.objects.order_by('thing').values_list('thing', flat=True).distinct()
    thingsList = list(notificationsObject)
    for thing in thingsList:
        valuesDic = returnAllField(thing)  # helper defined elsewhere in the project
        thingNotifications = notifications.objects.filter(thing=thing)
        # Do stuff for each notification
    return HttpResponse(status=204)
urls:
path('notifications/', views.checkNotifications, name="checkNotification")
and the client just sends a GET request to my URL/notifications/, which works.
Now, while researching, I saw some other options, such as the ones discussed here with Django Background Tasks and/or Celery:
How to initialize repeating tasks using Django Background Tasks?
Celery task best practices in Django/Python
as well as some other options.
My question is: is there a benefit to moving from my first implementation to one of these? The only benefit I can see directly is avoiding abuse from another service hitting my URL to check notifications too often, but I can (and do) require authentication to avoid that. Also, is there a "best practice" here? Given how often this repeating task runs, it feels like there should be a more proper/cleaner solution. For one, I am not sure whether running a repeating task is the best option on PythonAnywhere.
(https://help.pythonanywhere.com/pages/AsyncInWebApps/ suggests using always-on tasks, but it also mentions django background tasks)
Thank you
To use Django Background Tasks on PythonAnywhere you need to run it from an always-on task, so it is not an alternative, just another use of always-on tasks.
You can also run your Django code directly in the always-on task with some kind of long-running management command, so you do not need to hit your web app with a special request.
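A rough sketch of such a command, reusing the model and field names from the question (the per-notification processing is left as a placeholder):

# myapp/management/commands/process_notifications.py
import time

from django.core.management.base import BaseCommand

from myapp.models import notifications  # app label is a placeholder

class Command(BaseCommand):
    help = "Check notifications once a minute, forever"

    def handle(self, *args, **options):
        while True:
            things = (notifications.objects.order_by('thing')
                      .values_list('thing', flat=True).distinct())
            for thing in things:
                thingNotifications = notifications.objects.filter(thing=thing)
                # Do stuff for each notification, exactly as in the view
            time.sleep(60)

The always-on task would then simply run python manage.py process_notifications.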
We're about to build an "Accounting Software" (we'll call it AS) for mid- and large-sized companies, so AS is going to be comprehensive and have a lot of interrelated modules. AS will run in the cloud and follow an SOA approach.
What I'd like to ask is: is using Python + Tornado a good idea for development? What are the advantages and disadvantages, especially when features like async (non-blocking) I/O and multithreading are considered?
If you do not support this idea, which infrastructure do you think is best for our future AS?
Tornado is a good decision if you need a lot of realtime events to be shown in your web application, for example chat (event: deliver new messages to all chat members) or other actions (someone likes your post and you know about it immediately). This is where the async approach has all its pros.
Databases
When you choose a database, keep in mind that you need an async driver for it.
For example, for MongoDB the best choice is Motor; for PostgreSQL you probably want Momoko.
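For illustration, a minimal async Tornado handler backed by Motor could look like this (database, collection, and URL names are made up):

import motor.motor_tornado
import tornado.ioloop
import tornado.web

client = motor.motor_tornado.MotorClient("mongodb://localhost:27017")
db = client.accounting

class InvoiceHandler(tornado.web.RequestHandler):
    async def get(self, invoice_id):
        # The await keeps the IOLoop free while MongoDB is working
        doc = await db.invoices.find_one({"_id": invoice_id})
        if doc is None:
            raise tornado.web.HTTPError(404)
        self.write(doc)

if __name__ == "__main__":
    app = tornado.web.Application([(r"/invoices/(\w+)", InvoiceHandler)])
    app.listen(8888)
    tornado.ioloop.IOLoop.current().start()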
The cons of Tornado are:
hard to start coding if you are not familiar with the async approach; with Django (the most popular blocking Python web framework), for example, it is easier to get started because a lot of batteries are included
smaller community than Django's
no ORM included
no admin site out of the box; you'll need to create it yourself
Also, here you can find some additional thoughts about this topic and an example of a Tornado application.
I am building an application on GAE that needs to notify users when another user performs an action that affects them. A real world analogy would be being alerted when your friend comments on your facebook status.
I understand how the Channel API works to actually send notifications in real time, but I'm trying to understand the most effective way to store those notifications in the datastore. Ideally, I want the notification code to be decoupled from the actual event being performed. Is this a good use case for Prospective Search? It doesn't quite feel right, since I don't need to perform any kind of searching; I just need: when a new comment appears, create a notification, store it in the datastore, and push it to the client through the Channel API if they are connected. I basically need a database trigger, but I don't think GAE supports that.
Why don't you want to couple the event and its notifications in the first place?
I think it may be interesting to know in order to help you with your use case :)
If I had to do this, I would enqueue a task any time I write something to the datastore that might fire events...
That way you can do your write and have a separate "layer" to process the events.
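For example, a sketch using the Python task queue API (the model, handler URL, and helper names are made up):

from google.appengine.api import taskqueue
from google.appengine.ext import ndb

class Comment(ndb.Model):
    status_key = ndb.KeyProperty()
    author = ndb.StringProperty()
    text = ndb.TextProperty()

def create_comment(status_key, author, text):
    comment = Comment(status_key=status_key, author=author, text=text)
    comment.put()
    # Hand the event off to a worker at /tasks/notify instead of notifying inline
    taskqueue.add(url='/tasks/notify',
                  params={'comment': comment.key.urlsafe()})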
Triggers would not even be that good of an option, since your application would have to "poll" the database to push events to the users' UI.
I think your process (firing events) does not belong in the database, since it may well need business rules the datastore cannot provide: for example, when a user ignores another one, you should not fire events.
The more business logic you put in your database system, the more complex it gets to maintain & scale IMHO...
Looks like GAE does support mimicking database triggers using hooks.
Hooks can be useful for
query caching
auditing Datastore activity per-user
mimicking database triggers
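With NDB, for instance, a _post_put_hook runs after every put() and can behave much like an AFTER INSERT/UPDATE trigger (the notify helper here is hypothetical):

from google.appengine.ext import ndb

class Comment(ndb.Model):
    author = ndb.StringProperty()
    text = ndb.TextProperty()

    def _post_put_hook(self, future):
        # Called after every put(); enqueue or push the notification from here
        notify_subscribers(self.key)  # hypothetical helper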
I have a set of URLs in my Django application that trigger certain actions or processes. This would be similar to cron jobs. I have a script that polls one or more of these URLs at some regular interval, and I'm interested in adding a layer of security.
I'd like to set up an account for the script and require authentication before the processes would execute. I've been reading around in the Django user authentication documentation, along with Python's urllib2 library, and I'm just a bit lost. I have some ideas of how this might be done, but I don't have a lot of experience with security like this.
Any suggested reading materials?
I have a script that polls one or more of these URLs at some regular interval and I'm interested in adding a layer of security.
Have you considered using Celery? Celery works seamlessly with Django. This will let you periodically run jobs using the same authentication mechanism as the rest of the project. You can also make things a bit more uniform by avoiding urllib2.
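A minimal sketch of what a periodic task could look like with Celery beat (the broker URL, module name, task name, and interval are just examples):

from celery import Celery
from celery.schedules import crontab

app = Celery('proj', broker='redis://localhost:6379/0')

@app.task
def run_scheduled_actions():
    # Call the same code your polled URLs trigger, directly, with no HTTP involved
    pass

app.conf.beat_schedule = {
    'run-actions-every-minute': {
        'task': 'tasks.run_scheduled_actions',
        'schedule': crontab(),  # every minute
    },
}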
I'm looking into ways to track events in a Django application (events would generally be clicks tied to a specific unique user id).
These events would essentially contain an event type like "click"; each click event would be assigned to a unique id (many events can go to one id), and each event would carry a data set including items like referrer, etc.
I have tried Mixpanel, but for now the data API they offer seems too limiting, as I can't seem to find a way to get all of my data out by a unique id (apart from the event itself).
I'm looking into using django-eventracker, but I'm curious about others' thoughts on the best way to do this. Mongo or CouchDB seem like a great choice here, but Celery/RabbitMQ looks really attractive with Mongo. Pumping these events into the existing application's DB seems limiting at this point.
Anyways, this is just a thread to see what others' thoughts are on this and how they have implemented something like this...
shoot
I am not familiar with the pre-packaged solutions you mention. Were I to design this from scratch, I'd have a simple JS collecting info on clicks and posting it back to the server via Ajax (using whatever JS framework you're already using), and on the server side I'd simply append that info to a log file for later "offline" processing -- so that would be independent of django or other server-side framework, essentially.
Appending to a log file is a very lightweight action, while DBs for web use are generally optimized for read-intensive (not write-intensive) operation, so I agree with you that force-fitting that info (as it trickles in) into the existing app's DB is unlikely to offer good performance.
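A rough sketch of the server side of that approach in Django (the posted field names are assumptions about what the client-side JS would send, and a 'clicks' logger writing to a file is assumed to be configured in LOGGING):

import json
import logging

from django.http import HttpResponse
from django.views.decorators.csrf import csrf_exempt

click_log = logging.getLogger('clicks')

@csrf_exempt
def track_click(request):
    event = {
        'user_id': request.POST.get('user_id'),
        'event_type': 'click',
        'referrer': request.META.get('HTTP_REFERER', ''),
        'target': request.POST.get('target'),
    }
    # Appending one line per event keeps the app's database out of the write path
    click_log.info(json.dumps(event))
    return HttpResponse(status=204)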
You probably want to keep a flexible format for your logs to anticipate future needs or changes. In this sense, the schema-less document-oriented databases are nice. One advantage is that the structure of your data will be close to your application needs for whatever analyses you perform later (so, avoiding some of the inevitable parsing/data munging work).
If you're thinking about using mysql, postgresql or such, then you should look into something like rsyslog for buffering writes and avoiding the performance penalty with heavy logging. (I can't say much about celery and other queueing mechanisms for this type of thing, but they sound promising.)
MongoDB has some nice features that make it amenable to logging, such as capped collections. A summary can be found in this post.
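For example, with PyMongo (the collection name and size are arbitrary):

from pymongo import MongoClient

client = MongoClient()
db = client.analytics

# A capped collection preserves insertion order and recycles space automatically,
# which suits an append-only event log
db.create_collection('click_events', capped=True, size=100 * 1024 * 1024)
db.click_events.insert_one({'user_id': 42, 'event_type': 'click'})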
If by click, you mean a click on a link that loads a new page (or performs an AJAX request), then what you aim to do is fairly straightforward. Web servers tend to keep plain-text logs about requests - with information about the user, time/date, referrer, the page requested, etc. You could examine these logs and mine the statistics you need.
On the other hand, if you have a web application where clicks don't necessarily generate server requests, then collecting click information with javascript is your best bet.