Proper way of avoiding to store the same attachment twice

Proper way of avoiding to store the same attachment twice - python

I'm using the project.task model where delegation creates a parent/child link between both.
When delegating I would like the person who gets the delegated task to also have access to the attachments on the original task, how could I avoid to have to really copy it?
I've thought about using an <act_window> or a wizard which checks if there is a parent task and if so (also) show the parent task attachments.
The problem with act_window is that you would need to specify 2 different act_window records and that would still only cover one parent and one child relation (the task could be delegated more)
For the wizard approach it seems to be a lot of overkill work for something that could maybe be solved easier (hence the question).

I think building a wizard is the only way that will work, because there isn't a real link between attachment and project.task. If I were you, I would build a wizard that walks the parent relation to build a list of all ancestor task ids, plus the current task id. Then have the wizard open the attachment window using that list of ids as one of the domain search criteria.

Related

Lock the system

I am writing a mini-CRM system that two users can login at the same time and they can answer received messages. However, the problem is that they might response the same message because messages can only disappear when they click "Response" button. Is there any suggestion to me to lock the system?

This sounds like a great case for an 'optimistic locking' approach. Here are two methods I've used with much success. Often, I combine the two methods to ensure no data is lost by mis-matched object instances on POSTs.
The easy way: Add a version field to your model. On POST, check the POSTed version number vs. the object's version number. If they don't match, raise a validation error. If they do match, increment the version by 1.
More elegant approach: Django's Generic Relations (part of the content types framework). A table which stores the content_type and object_id of the object that's locked, along with the user who 'owns' that lock. Check this lock on GET requests, and disable POSTing if it's 'locked' by another user. 'Release' the lock on a page unload, session end, or browser exit. You can get very creative with this approach.

Add some boolean field (answered, is_answered.. etc) and check on every "Response" click if it answered.
Hope it will help.

How do I generate alfresco sites where different users can see different content?

I'd like to build a site which contains a few folders for different teams. However, there is one team that is common to one folder on all sites. I do not want that team to be allowed to see the content of the other folders. I tried creating a folder in a site and giving permission to a user via CMIS (in python), however that folder doesn't seem to be accessible from their share UI.
I'm not even sure this is the best way to do this. The organisation of the information requires that the areas are in the same place (i.e. the same site) however if you have access to the site you seem to have access to all the folders (I can't figure out a way of removing access to a folder on a site for a single user)
Also the requirement here is that it needs to be done programmatically; I'm not bothered particularly about using CMIS and if I have to rewrite the file/folder code, but in my head the best thing to do would be to add a widget on the share UI that access all the folders that a user has access to in the absence of being able to deny access to a folder.

As Gagravarr says you are going to have to break inheritance on the documentLibrary folder to get it to work like you want. Breaking inheritance is not supported by CMIS so you'll have to write your own web script to do that.
I would manually set the permissions until it works like you want it to, then once you get that working, write a web script that puts it into effect for all of your sites.

A share site comes with a security model where every person goes in at least one of four groups: Manager, Collaborator, Contributor and Consumer - either directly or indirectly via another group. Access is generally managed by access control lists. You might want to look at Alfresco: Folder permission by role to get an idea how that works. The site security model does not work for you if you find the need for more than those four groups to map access control. It can still be done, but I would strongly discourage you implementing that because it can get very hard to understand quickly.

How to setup a 3-tier web application project

EDIT:
I have added [MVC] and [design-patterns] tags to expand the audience for this question as it is more of a generic programming question than something that has direclty to do with Python or SQLalchemy. It applies to all applications with business logic and an ORM.
The basic question is if it is better to keep business logic in separate modules, or to add it to the classes that our ORM provides:
We have a flask/sqlalchemy project for which we have to setup a structure to work in. There are two valid opinions on how to set things up, and before the project really starts taking off we would like to make our minds up on one of them.
If any of you could give us some insights on which of the two would make more sense and why, and what the advantages/disadvantages would be, it would be greatly appreciated.
My example is an HTML letter that needs to be sent in bulk and/or displayed to a single user. The letter can have sections that display an invoice and/or a list of articles for the user it is addressed to.
Method 1:
Split the code into 3 tiers - 1st tier: web interface, 2nd tier: processing of the letter, 3rd tier: the models from the ORM (sqlalchemy).
The website will call a server side method in a class in the 2nd tier, the 2nd tier will loop through the users that need to get this letter and it will have internal methods that generate the HTML and replace some generic fields in the letter, with information for the current user. It also has internal methods to generate an invoice or a list of articles to be placed in the letter.
In this method, the 3rd tier is only used for fetching data from the database and perhaps some database related logic like generating a full name from a users' first name and last name. The 2nd tier performs most of the work.
Method 2:
Split the code into the same three tiers, but only perform the loop through the collection of users in the 2nd tier.
The methods for generating HTML, invoices and lists of articles are all added as methods to the model definitions in tier 3 that the ORM provides. The 2nd tier performs the loop, but the actual functionality is enclosed in the model classes in the 3rd tier.
We concluded that both methods could work, and both have pros and cons:
Method 1:
separates business logic completely from database access
prevents that importing an ORM model also imports a lot of methods/functionality that we might not need, also keeps the code for the model classes more compact.
might be easier to use when mocking out ORM models for testing
Method 2:
seems to be in line with the way Django does things in Python
allows simple access to methods: when a model instance is present, any function it
performs can be immediately called. (in my example: when I have a letter-instance available, I can directly call a method on it that generates the HTML for that letter)
you can pass instances around, having all appropriate methods at hand.

Normally, you use the MVC pattern for this kind of stuff, but most web frameworks in python have dropped the "Controller" part for since they believe that it is an unnecessary component. In my development I have realized, that this is somewhat true: I can live without it. That would leave you with two layers: The view and the model.
The question is where to put business logic now. In a practical sense, there are two ways of doing this, at least two ways in which I am confrontet with where to put logic:
Create special internal view methods that handle logic, that might be needed in more than one view, e.g. _process_list_data
Create functions that are related to a model, but not directly tied to a single instance inside a corresponding model module, e.g. check_login.
To elaborate: I use the first one for strictly display-related methods, i.e. they are somehow concerned with processing data for displaying purposes. My above example, _process_list_data lives inside a view class (which groups methods by purpose), but could also be a normal function in a module. It recieves some parameters, e.g. the data list and somehow formats it (for example it may add additional view parameters so the template can have less logic). It then returns the data set to the original view function which can either pass it along or process it further.
The second one is used for most other logic which I like to keep out of my direct view code for easier testing. My example of check_login does this: It is a function that is not directly tied to display output as its purpose is to check the users login credentials and decide to either return a user or report a login failure (by throwing an exception, return False or returning None). However, this functionality is not directly tied to a model either, so it cannot live inside an ORM class (well it could be a staticmethod for the User object). Instead it is just a function inside a module (remember, this is Python, you should use the simplest approach available, and functions are there for something)
To sum this up: Display logic in the view, all the other stuff in the model, since most logic is somehow tied to specific models. And if it is not, create a new module or package just for logic of this kind. This could be a separate module or even a package. For example, I often create a util module/package for helper functions, that are not directly tied for any view, model or else, for example a function to format dates that is called from the template but contains so much python could it would be ugly being defined inside a template.
Now we bring this logic to your task: Processing/Creation of letters. Since I don't know exactly what processing needs to be done, I can only give general recommendations based on my assumptions.
Let's say you have some data and want to bring it into a letter. So for example you have a list of articles and a costumer who bought these articles. In that case, you already have the data. The only thing that may need to be done before passing it to the template is reformatting it in such a way that the template can easily use it. For example it may be desired to order the purchased articles, for example by the amount, the price or the article number. This is something that is independent of the model, the order is now only display related (you could have specified the order already in your database query, but let's assume you didn't). In this case, this is an operation your view would do, so your template has the data ready formatted to be displayed.
Now let's say you want to get the data to create a specifc letter, for example a list of articles the user bough over time, together with the date when they were bought and other details. This would be the model's job, e.g. create a query, fetch the data and make sure it is has all the properties required for this specifc task.
Let's say in both cases you with to retrieve a price for the product and that price is determined by a base value and some percentages based on other properties: This would make sense as a model method, as it operates on a single product or order instance. You would then pass the model to the template and call the price method inside it. But you might as well reformat it in such a way, that the call is made already in the view and the template only gets tuples or dictionaries. This would make it easier to pass the same data out as an API (see below) but it might not necessarily be the easiest/best way.
A good rule for this decision is to ask yourself If I were to provide a JSON API additionally to my standard view, how would I need to modify my code to be as DRY as possible?. If theoretical is not enough at the start, build some APIs for the templates and see where you need to change things to the API makes sense next to the views themselves. You may never use this API and so it does not need to be perfect, but it can help you figure out how to structure your code. However, as you saw above, this doesn't necessarily mean that you should do preprocessing of the data in such a way that you only return things that can be turned into JSON, instead you might want to make some JSON specifc formatting for the API view.
So I went on a little longer than I intended, but I wanted to provide some examples to you because that is what I missed when I started and found out those things via trial and error.

How to change ancestor of an NDB record?

In the High-Replication Datastore (I'm using NDB), the consistency is eventual. In order to get a guaranteed complete set, ancestor queries can be used. Ancestor queries also provide a great way to get all the "children" of a particular ancestor with kindless queries. In short, being able to leverage the ancestor model is hugely useful in GAE.
The problem I seem to have is rather simplistic. Let's say I have a contact record and a message record. A given contact record is being treated as the ancestor for each message. However, it is possible that two contacts are created for the same person (user error, different data points, whatever). This situation produces two contact records, which have messages related to them.
I need to be able to "merge" the two records, and bring put all the messages into one big pile. Ideally, I'd be able to modify ancestor for one of the record's children.
The only way I can think of doing this, is to create a mapping and make my app check to see if record has been merged. If it has, look at the mappings to find one or more related records, and perform queries against those. This seems hugely inefficient. Is there more of "by the book" way of handling this use case?

The only way to change the ancestor of an entity is to delete the old one and create a new one with a new key. This must be done for all child (and grand child, etc) entities in the ancestor path. If this isn't possible, then your listed solution works.
This is required because the ancestor path of an entity is part of its unique key. Parents of entities (i.e., entities in the ancestor path) need not exist, so changing a parent's key will leave the children in the datastore with no parent.

Is it safe to pass Google App Engine Entity Keys into web pages to maintain context?

I have a simple GAE system that contains models for Account, Project and Transaction.
I am using Django to generate a web page that has a list of Projects in a table that belong to a given Account and I want to create a link to each project's details page. I am generating a link that converts the Project's key to string and includes that in the link to make it easy to lookup the Project object. This gives a link that looks like this:
My Project Name
Is it secure to create links like this? Is there a better way? It feels like a bad way to keep context.
The key string shows up in the linked page and is ugly. Is there a way to avoid showing it?
Thanks.

There is few examples, in GAE docs, that uses same approach, and also Key are using characters safe for including in URLs. So, probably, there is no problem.
BTW, I prefer to use numeric ID (obj_key.id()), when my model uses number as identifier, just because it's looks not so ugly.

Whether or not this is 'secure' depends on what you mean by that, and how you implement your app. Let's back off a bit and see exactly what's stored in a Key object. Take your key, go to shell.appspot.com, and enter the following:
db.Key(your_key)
this returns something like the following:
datastore_types.Key.from_path(u'TestKind', 1234, _app=u'shell')
As you can see, the key contains the App ID, the kind name, and the ID or name (along with the kind/id pairs of any parent entities - in this case, none). Nothing here you should be particularly concerned about concealing, so there shouldn't be any significant risk of information leakage here.
You mention as a concern that users could guess other URLs - that's certainly possible, since they could decode the key, modify the ID or name, and re-encode the key. If your security model relies on them not guessing other URLs, though, you might want to do one of a couple of things:
Reconsider your app's security model. You shouldn't rely on 'secret URLs' for any degree of real security if you can avoid it.
Use a key name, and set it to a long, random string that users will not be able to guess.
A final concern is what else users could modify. If you handle keys by passing them to db.get, the user could change the kind name, and cause you to fetch a different entity kind to that which you intended. If that entity kind happens to have similarly named fields, you might do things to the entity (such as revealing data from it) that you did not intend. You can avoid this by passing the key to YourModel.get instead, which will check the key is of the correct kind before fetching it.
All this said, though, a better approach is to pass the key ID or name around. You can extract this by calling .id() on the key object (for an ID - .name() if you're using key names), and you can reconstruct the original key with db.Key.from_path('kind_name', id) - or just fetch the entity directly with YourModel.get_by_id.

After doing some more research, I think I can now answer my own question. I wanted to know if using GAE keys or ids was inherently unsafe.
It is, in fact, unsafe without some additional code, since a user could modify URLs in the returned webpage or visit URL that they build manually. This would potentially let an authenticated user edit another user's data just by changing a key Id in a URL.
So for every resource that you allow access to, you need to ensure that the currently authenticated user has the right to be accessing it in the way they are attempting.
This involves writing extra queries for each operation, since it seems there is no built-in way to just say "Users only have access to objects that are owned by them".

I know this is an old post, but i want to clarify one thing. Sometimes you NEED to work with KEYs.
When you have an entity with a #Parent relationship, you cant get it by its ID, you need to use the whole KEY to get it back form the Datastore. In these cases you need to work with the KEY all the time if you want to retrieve your entity.

They aren't simply increasing; I only have 10 entries in my Datastore and I've already reached 7001.
As long as there is some form of protection so users can't simply guess them, there is no reason not to do it.

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.