pymongo getTimestamp without ObjectId - python

In my MongoDB, I have a collection where the docs are created without using ObjectId. How can I get the timestamp (generation_time in pymongo) of those docs? Thank you.

If you don't store timestamps in documents, they wouldn't have any timestamps to retrieve.
If you store timestamps in some other way than via ObjectId, you would retrieve them based on how they are stored.
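As a minimal sketch of the two cases above: the helper below returns generation_time when the _id exposes it (as bson.ObjectId does in pymongo) and otherwise falls back to a stored field. The function name creation_time and the field name created_at are hypothetical and only serve as illustration; documents are represented as plain dicts, which is what pymongo returns.

```python
from datetime import datetime, timezone


def creation_time(doc, fallback_field="created_at"):
    """Best-effort creation time of a document given as a dict.

    `fallback_field` is a hypothetical field name; use whatever field
    your application actually writes when it creates a document.
    """
    _id = doc.get("_id")
    # bson.ObjectId exposes `generation_time`; custom ids (str, int, ...) do not.
    generated = getattr(_id, "generation_time", None)
    if generated is not None:
        return generated
    # No ObjectId: the only timestamp available is one you stored yourself.
    return doc.get(fallback_field)


stored = datetime(2023, 5, 1, tzinfo=timezone.utc)
print(creation_time({"_id": "order-17", "created_at": stored}))
print(creation_time({"_id": "order-18"}))  # nothing stored, so None
```

If neither source exists, the helper returns None, which matches the answer above: there is no timestamp to retrieve.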

Related

Pandas df `to_gbq` with nested data

I'm working in a limited Airflow environment in which I don't have access to google-cloud-bigquery but do have access to pandas-gbq. My goal is to load some JSON API data using some schema involving records into a BigQuery table. My strategy is to first read all the data into a pandas dataframe using a dictionary to represent the records: e.g.
uuid | metadata1 | ...
001 | {u'time_updated': u'', u'name': u'jeff'} | ...
Then I've been trying to use pandas_gbq.to_gbq to load the dataframe into BQ. The issue is that I get this error:
Error at Row: 0, Reason: invalid, Location: metadata1, Message: This field: metadata1 is not a record.
I realize this is because, according to the Google Cloud website, pandas-gbq "Converts the DataFrame to CSV format before sending to the API, which does not support nested or array values."
So I won't be able to upload a dataframe with records to BQ this way, since again I can't use google-cloud-bigquery in my environment.
What would be the best strategy for me to upload my data to BQ (around 30k rows and 6 or so columns with 8ish nested fields each)?
I know this sounds like a very bad strategy, but I could upload each record to the BQ table as a single flattened string and then run a query from my code to replace these flattened fields with their record-form versions. This seems really bad, though, since for a time the table would have the wrong schema.
Any thoughts would be much appreciated. Thanks in advance.
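For illustration only, the "flatten each record into a single string" step from the question could be sketched as below, using JSON as the string format. The helper name flatten_records is hypothetical, rows are shown as plain dicts rather than a dataframe, and the later step of turning the strings back into records inside BigQuery is not covered.

```python
import json


def flatten_records(rows, nested_fields):
    """Return copies of `rows` with each nested field serialized to a JSON string."""
    flat = []
    for row in rows:
        out = dict(row)  # shallow copy so the input rows stay untouched
        for field in nested_fields:
            if field in out:
                out[field] = json.dumps(out[field], sort_keys=True)
        flat.append(out)
    return flat


rows = [{"uuid": "001", "metadata1": {"time_updated": "", "name": "jeff"}}]
print(flatten_records(rows, ["metadata1"]))

# With a pandas dataframe the same idea would be, per nested column:
#   df["metadata1"] = df["metadata1"].apply(json.dumps)
```

The resulting column holds plain strings, so it survives the CSV conversion that pandas-gbq performs, at the cost of the temporary string-typed schema the question already points out.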

Query mongoDB by date document created using pymongo

There are lots of solutions for querying MongoDB using a date/time field, but what if the document doesn't have a date/time field?
I've noticed that when I hover the mouse over a document _id (using NoSQLBooster for MongoDB) I get a "createdAt" dropdown. Is there any way to do a query using pymongo where documents are filtered by a date/time range using this "createdAt" metadata?
In MongoDB, the ObjectId used as a document's _id contains the timestamp of creation; this is mentioned in another question.
You can write a script that adds a date field to each document using this information and then query on that field, or you can query directly on the ObjectId.
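A rough sketch of the direct approach: the first 4 bytes of an ObjectId are the creation time in seconds since the Unix epoch (big-endian), so a time range can be turned into a pair of boundary ObjectId values. The helper name objectid_hex_floor is hypothetical, and the pymongo calls are shown only as comments so that the block runs without pymongo installed.

```python
from datetime import datetime, timezone


def objectid_hex_floor(moment):
    """Hex string of the smallest ObjectId whose embedded timestamp is `moment`.

    Pass a timezone-aware datetime; a naive one is interpreted as local time.
    """
    seconds = int(moment.timestamp())
    # 4 timestamp bytes (8 hex digits) followed by 8 zero bytes (16 hex digits).
    return f"{seconds:08x}" + "0" * 16


start = datetime(2023, 1, 1, tzinfo=timezone.utc)
end = datetime(2023, 2, 1, tzinfo=timezone.utc)
low_hex, high_hex = objectid_hex_floor(start), objectid_hex_floor(end)
print(low_hex, high_hex)

# With pymongo installed, the range filter would then look like this
# (`collection` is a hypothetical pymongo collection):
#   from bson import ObjectId
#   query = {"_id": {"$gte": ObjectId(low_hex), "$lt": ObjectId(high_hex)}}
#   docs = collection.find(query)
# bson also provides ObjectId.from_datetime(start), which builds the same
# kind of boundary value directly.
```

This only works for documents whose _id is an ObjectId generated at insert time; documents with custom _id values carry no such timestamp.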

firebase query by value in map

I am using Python with the Firebase SDK and have a table named jobs. Each record has a field named client, which is a map, and each client has an id field. I would like to query the table for all the jobs whose client has a certain id value. I found an explanation of how to query by array members but can't find anything about querying by the values of map fields.
Will something like
.where("client.id", "==", id) work and be efficient? How can I do this query efficiently? Create an index, maybe?
It should work without creating an index. What you wrote should filter on the id property of the client field object for all documents of a collection.
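To make the meaning of the dotted path concrete, here is a small in-memory sketch of how "client.id" selects a nested value. It only imitates the filter on plain dicts and is not the Firestore implementation; the helper names get_by_path and where_equal and the sample jobs are hypothetical.

```python
_MISSING = object()  # sentinel so that a stored None is not confused with "absent"


def get_by_path(doc, field_path):
    """Follow a dot-separated path through nested dicts."""
    current = doc
    for part in field_path.split("."):
        if not isinstance(current, dict) or part not in current:
            return _MISSING
        current = current[part]
    return current


def where_equal(docs, field_path, value):
    """Keep the documents whose value at `field_path` equals `value`."""
    return [d for d in docs if get_by_path(d, field_path) == value]


jobs = [
    {"title": "paint fence", "client": {"id": "c1", "name": "Ann"}},
    {"title": "fix roof", "client": {"id": "c2", "name": "Bob"}},
    {"title": "mow lawn"},  # no client map at all
]
print(where_equal(jobs, "client.id", "c1"))

# The corresponding query from the question, with `db` a hypothetical
# Firestore client:
#   db.collection("jobs").where("client.id", "==", client_id).stream()
```

Documents without a client map, or without an id inside it, simply do not match.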
See also:
Firestore: Query documents by property of object
Query Google Firestore database by custom object fields

How to get last modified record from Elasticsearch, when there is no timestamp field?

In my index, I don't have a timestamp field. I need to fetch only the new records added each day, instead of fetching all records every time. Is there any way to do this even though I don't have a timestamp field?
Elasticsearch does not store such a field on its own.
You can use the ingest feature (an ingest pipeline) to set one at index time, or simply add a timestamp field to your documents yourself.
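Both options can be sketched as plain request bodies built in Python. The field name indexed_at and the helper names are hypothetical; the pipeline body uses the set processor with the {{_ingest.timestamp}} value, and the search body is a range query with date math. Sending these to a cluster is left out, since that depends on your client setup.

```python
import json
from datetime import datetime, timezone

TIMESTAMP_FIELD = "indexed_at"  # hypothetical field name

# Option 1: body for an ingest pipeline that stamps every incoming document.
pipeline_body = {
    "description": "Stamp each document with the time it was ingested",
    "processors": [
        {"set": {"field": TIMESTAMP_FIELD, "value": "{{_ingest.timestamp}}"}}
    ],
}


# Option 2: add the field yourself before indexing.
def with_timestamp(doc, now=None):
    """Return a copy of `doc` with an ISO 8601 UTC timestamp added."""
    stamped = dict(doc)
    stamped[TIMESTAMP_FIELD] = (now or datetime.now(timezone.utc)).isoformat()
    return stamped


def added_since_query(since="now-1d/d"):
    """Search body matching documents stamped at or after `since`."""
    return {"query": {"range": {TIMESTAMP_FIELD: {"gte": since}}}}


print(json.dumps(pipeline_body, indent=2))
print(json.dumps(added_since_query(), indent=2))
```

Either way, only documents indexed after the field is introduced will carry it; records that are already in the index have no creation time to recover.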

How to do a bulk insert without overwriting existing data - Pymongo?

I am trying to bulk insert data into MongoDB without overwriting existing data. I want to insert new data into the database only if there is no match on the unique id (sourceID). Looking at the documentation for Pymongo I have written some code but cannot make it work. Any ideas as to what I am doing wrong?
db.bulk_write(UpdateMany({"sourceID"}, test, upsert=True))
db is the name of my database, sourceID is the unique ID of the documents that I don't want to overwrite in the existing data, and test is the array that I am trying to insert.
Either I don't understand your requirement or you misunderstand the UpdateMany operation. As per the documentation, this operation modifies the existing documents that match the query; only if no documents match the query and upsert=True does it insert a new document. Are you sure you don't want to use the insert_many method?
Also, in your example, the first parameter, which should be the filter for the update, is not a valid query; it has to be in the form {"key": "value"} (in Python, {"sourceID"} is a set, not a dict).
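If the goal is "insert when sourceID is new, leave the existing document alone otherwise", one common alternative to the insert_many suggestion above is one upsert per document with the $setOnInsert operator, which writes fields only when the upsert creates a document. The sketch below builds the filter/update pairs as plain dicts; the helper name build_insert_if_absent_specs and the sample data are hypothetical, and the pymongo calls are shown as comments so the block runs without a server. Note that in pymongo, bulk_write is a method of a collection, not of the database.

```python
def build_insert_if_absent_specs(docs, key="sourceID"):
    """Return (filter, update) pairs that insert each doc only if `key` is new."""
    specs = []
    for doc in docs:
        if key not in doc:
            raise KeyError(f"document is missing the {key!r} field")
        # A valid filter has the form {"key": "value"}; $setOnInsert applies
        # only when the upsert inserts, so matching documents stay unchanged.
        specs.append(({key: doc[key]}, {"$setOnInsert": doc}))
    return specs


test = [
    {"sourceID": "a1", "value": 10},
    {"sourceID": "b2", "value": 20},
]
specs = build_insert_if_absent_specs(test)
print(specs)

# With pymongo, the pairs become bulk operations
# (`collection` is a hypothetical pymongo collection):
#   from pymongo import UpdateOne
#   ops = [UpdateOne(flt, upd, upsert=True) for flt, upd in specs]
#   collection.bulk_write(ops, ordered=False)
```

A unique index on sourceID is worth considering alongside this, so that concurrent writers cannot create duplicates.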
