How to delete a document without deleting the collection in Firestore? - Python

I want to create some kind of collection that cannot be deleted. The reason is that when the document is empty, my website can't run its data creation process.
Is it possible to create a collection in Firestore that has an empty document?
I use Python with firebase_admin.

In Firestore, there is no such thing as an "empty collection". Collections simply appear in the console when there is a document present, and disappear when the last document is deleted. If you want to know if a collection is "empty", then you can simply query it and check that it has 0 documents.
Ideally, your code should be robust enough to handle the possibility of a missing document, because Firestore will do nothing to stop a document from being deleted if that's what you do in the console or your code.
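A minimal sketch of the emptiness check described above, assuming the Python Admin SDK. The helper name collection_is_empty and the collection name "items" are made up for illustration, and main() assumes credentials are already configured:

```python
def collection_is_empty(collection_ref):
    """Return True when the collection currently holds no documents."""
    # limit(1) keeps the check cheap: at most one document is read.
    first = next(iter(collection_ref.limit(1).stream()), None)
    return first is None


def main():
    # Assumes credentials are configured for firebase_admin.
    import firebase_admin
    from firebase_admin import firestore

    firebase_admin.initialize_app()
    db = firestore.client()
    # "items" is a hypothetical collection name.
    if collection_is_empty(db.collection("items")):
        print("No documents yet; the collection appears after the first write.")
```

Since the collection comes back as soon as a document is written to it, the website code can simply write the first document instead of relying on a placeholder.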

Related

Not able to retrieve data from firebase in django

I've been trying to get data from Firebase into my Django app. The issue I face is that some of the documents are retrieved and some aren't. A really weird thing I noticed is that on the admin page, the documents that can be accessed are highlighted in a darker shade than the ones that we aren't able to get from the database.
The highlighted issue is shown in the image above. The first document is highlighted but the second isn't and the first is read by the django function below
from django.shortcuts import render
from firebase_admin import firestore

def home(request, user=""):
    db = firestore.client()
    docs = db.collection(u'FIR_NCR').stream()
    for doc in docs:
        print(doc.id,end="->")
        s = db.collection(u'FIR_NCR').document(u'{}'.format(doc.id)).collection(u'all_data').get()
        print(s[0].id,end="->")
        print(s[0].to_dict())
    return render(request, "home.html", {"user":user})
Here, docs does not contain the complete list of the necessary documents, hence the issue.
It would be wonderful if someone could help me understand what I'm doing wrong. T.I.A.
The document ID isn't actually highlighted. The difference between the first and the second ID is that the second one is in italics. That means there is no actual document with that ID. The reason why the Firestore console shows you a document ID at all for a missing document is because it has a nested subcollection. You can click into that missing document, then again click into the subcollection.
In Firestore, you can have subcollections nested under documents that don't exist. This is OK. Just be aware that these missing documents can't be discovered by a normal query in the collection where you see them in the console.
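One possible way to reach those subcollections anyway is list_documents(), which returns document references and can include references to missing documents that only have subcollections. A rough sketch, where read_subcollections is a made-up helper name:

```python
def read_subcollections(parent_collection, sub_name):
    """Map each document ID under parent_collection to the dicts in its subcollection."""
    results = {}
    # list_documents() yields references, including ones for missing documents
    # that exist only as parents of subcollections.
    for doc_ref in parent_collection.list_documents():
        snapshots = doc_ref.collection(sub_name).stream()
        results[doc_ref.id] = [snap.to_dict() for snap in snapshots]
    return results


# Possible usage with the names from the question (db is a Firestore client):
# data = read_subcollections(db.collection('FIR_NCR'), 'all_data')
```

Unlike stream() on the parent collection, this does not depend on the parent documents actually existing.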

Elasticsearch for python - Get documents deleted by query

I'm using Elasticsearch in Python, and I can't figure out how to get the IDs of the documents deleted by the delete_by_query() method! By default it only returns the number of documents deleted.
There is a parameter called _source that, if set to True, should return the source of the deleted documents. This doesn't happen; nothing changes.
Is there a good way to know which documents were deleted?
The delete by query endpoint only returns a macro summary of what happened during the task, mainly how many documents were deleted and some other details.
If you want to know the IDs of the documents that are going to be deleted, you can do a search (with _source: false) before running the delete by query operation and you'll get the expected IDs.
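A sketch of that search-then-delete approach, assuming the keyword-argument style of the 8.x Python client (query=, source=); older clients pass a body= dict instead. The helper name delete_and_report_ids is made up, and only the first batch_size hits are handled here, so larger result sets would need scrolling (for example elasticsearch.helpers.scan):

```python
def delete_and_report_ids(es, index, query, batch_size=1000):
    """Collect the IDs matching `query`, delete exactly those, and return them."""
    # source=False: only metadata such as _id comes back, not the documents.
    resp = es.search(index=index, query=query, source=False, size=batch_size)
    ids = [hit["_id"] for hit in resp["hits"]["hits"]]
    if ids:
        # Delete by the collected IDs so the report matches what is targeted,
        # even if other documents start matching `query` in the meantime.
        es.delete_by_query(index=index, query={"ids": {"values": ids}})
    return ids
```

Deleting by the collected IDs rather than re-running the original query is a design choice to keep the returned list and the deleted set in step.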

Mongoengine: Check if document is already in DB

I am working on a kind of initialization routine for a MongoDB using mongoengine.
The documents we deliver to the user are read from several JSON files and written into the database at the start of the application using the above mentioned init routine.
Some of these documents have unique keys which would raise a mongoengine.errors.NotUniqueError error if a document with a duplicate key is passed to the DB. This is not a problem at all since I am able to catch those errors using try-except.
However, some other documents are something like a bunch of values or parameters. So there is no unique key which I can check in order to prevent those from being inserted into the DB twice.
I thought I could read all existing documents from the desired collection like this:
docs = MyCollection.objects()
and check whether the document to be inserted is already available in docs by using:
doc = MyCollection(parameter='foo')
print(doc in docs)
Which prints False even if there is a MyCollection(parameter='foo') document in the DB already.
How can I achieve a duplicate detection without using unique keys?
You can check using an if statement; an empty queryset is falsy:
if not MyCollection.objects(parameter='foo'):
    # insert your documents
    MyCollection(parameter='foo').save()
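The same check can be wrapped in a small helper for the init routine. This is only a sketch: insert_if_absent is a made-up name, and it takes the document class as an argument so it works with any mongoengine Document:

```python
def insert_if_absent(document_cls, **fields):
    """Save a new document unless one with the same field values already exists."""
    # objects(**fields) filters on the server; first() returns None when nothing matches.
    existing = document_cls.objects(**fields).first()
    if existing is not None:
        return existing, False
    doc = document_cls(**fields)
    doc.save()
    return doc, True


# Possible usage with the class from the question:
# doc, created = insert_if_absent(MyCollection, parameter='foo')
```

Note that check-then-insert is not atomic, so it fits a single startup routine better than concurrent writers.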

In Eve, what is the difference between inserting a document into a collection using the http method POST and using the mongo shell?

Background Information
The answer to my previous question (In Eve, how can you make a sub-resource of a collection and keep the parent collections endpoint?) was to use the multiple endpoints, one datasource feature of Eve. In the IRC channel, I was speaking with cuibonobo, and she was able to get this working by changing the game_id to be an objectid instead of a string, as shown here:
http://gist.github.com/uunsamp/d969116367181bb30731
I however didn't get this working, and as you can see from the conversation, I was putting documents into the collection differently:
14:59 < cuibonobo> no. it's just that since your previous settings file saved the game id as a string, the lookup won't work
15:00 < cuibonobo> it will only work on documents where game_id has been saved as an ObjectId
15:01 < cuibonobo> the way Eve currently works, if you set the type to 'objectid', it will convert the string to a Mongo ObjectId before saving it in the database. but that conversion doesn't happen with strings
15:02 < znn> i haven't been using eve for storing objects
15:02 < znn> i've been using the mongo shell interface for inserting items
15:03 < cuibonobo> oh. hmm. that might complicate things. Eve does type conversions and other stuff before inserting documents.
15:04 < cuibonobo> so inserting stuff directly into mongo generally isn't recommended
Question
Which leads me to stackoverflow :)
What is the difference between inserting a document into a collection using the http method POST and using the mongo shell? Will users eventually be able to use either method of document insertion?
Extra information
I was looking through http://github.com/nicolaiarocci/eve/blob/develop/eve/methods/post.py before asking this question, but this could take a while to understand, much longer than just asking someone who maybe is more familiar with the code than myself.
The quick answer is that Eve adds a few meta fields (etag, updated, created) along with every stored document. If you want to store documents locally (not going through HTTP) you can use post_internal:
Intended for internal post calls, this method is not rate limited,
authentication is not checked and pre-request events are not raised.
Adds one or more documents to a resource. Each document is validated
against the domain schema. If validation passes the document is inserted
and ID_FIELD, LAST_UPDATED and DATE_CREATED along with a link to the
document are returned. If validation fails, a list of validation issues
is returned.
Usage example:
from run import app
from eve.methods.post import post_internal

payload = {
    "firstname": "Ray",
    "lastname": "LaMontagne",
    "role": ["contributor"]
}

with app.test_request_context():
    x = post_internal('people', payload)
    print(x)
Documents inserted with post_internal are subject to the same validation and will be stored like they were by API clients via HTTP. In 0.5-dev (not released yet) PATCH, PUT and DELETE internal methods have been added too.

Storing/working with tags in mongodb for a document management system

So I am working on a pet project where I'm storing various text files. I have set up my app to save the tags as a string in one of my collections, so an example would be:
tags: "Linux Apache WSGI"
Storing them and searching for them work just fine but my question comes when I want to do something like a tag cloud, count all the various tags, or make a dynamic selection system based on tags, what is the best way to break them up to work with? Or should I be storing them some other way?
Logically I could scan through every record and get all the tags, break them based on space, then cache the result somehow. Maybe that's the right answer but I wanted to ask the community wisdom.
I'm using pymongo to interact with my database.
Or should I be storing them some other way?
The standard way to store tags is to store them as an array. In your case, the DB would look something like:
tags: ['linux', 'apache', 'wsgi']
... what is the best way to break them up to work with?
This is what Map/Reduce is designed for. This effectively "scans every record". The output of a Map/Reduce is another collection that you can query.
However, there's also another way to do this and that's to keep "counters" and update them. So when you save a new document you also increment all of the tags related to that document.
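A rough sketch of the array-plus-counters idea with pymongo. The function names and the layout of the counters collection (one document per tag with a count field) are assumptions for illustration:

```python
def split_tags(tag_string):
    """Turn "Linux Apache WSGI" into a de-duplicated, lower-cased list."""
    seen = []
    for tag in tag_string.lower().split():
        if tag not in seen:
            seen.append(tag)
    return seen


def save_with_tag_counters(docs, counters, document, tag_string):
    """Insert the document with an array of tags and bump one counter per tag."""
    tags = split_tags(tag_string)
    docs.insert_one({**document, "tags": tags})
    for tag in tags:
        # upsert=True creates the counter document the first time a tag is seen.
        counters.update_one({"_id": tag}, {"$inc": {"count": 1}}, upsert=True)
    return tags


# Possible usage, where db is a pymongo Database and the collection names are hypothetical:
# save_with_tag_counters(db.files, db.tag_counts, {"name": "notes.txt"}, "Linux Apache WSGI")
```

A tag cloud can then read the counters collection directly. The counters would also need to be decremented when a document is deleted or its tags change, which this sketch does not cover.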
