append in array pymongo - python

I want to update the bookmarks array of the element matched by _id_collection.
result = collections.find_one(
    {"_id": ObjectId(id)},
    {'array_of_collections': {'$elemMatch': {'_id_collection': ObjectId('5e9871582be940b6af4a9b41')}}})
print(result)
# {'_id': ObjectId('5e986b4a07b94384ae8016b7'), 'array_of_collections': [{'_id_collection': ObjectId('5e9871582be940b6af4a9b41'), 'name_of_collection': 'test2', 'bookmarks': []}]}
This is my code and the result of searching for that object. Now I want to append some values to the bookmarks array, but I don't know how to do it. The printed result above shows my MongoDB structure.

If you do the $elemMatch as part of the query and then use the positional operator $, you can successfully push values to your desired bookmarks array.
Try this:
import pprint

from bson import ObjectId
from pymongo import MongoClient

client = MongoClient()
db = client.tst1
coll = db.coll6

coll.update_one(
    {"_id": ObjectId("5e9898c69c26fe7ba93476f6"),
     'array_of_collections': {'$elemMatch': {'_id_collection': ObjectId("5e9898c69c26fe7ba93476f4")}}},
    {'$push': {'array_of_collections.$.bookmarks': 'Pushed Value 1'}})

mlist1 = list(coll.find())
for mdoc in mlist1:
    pprint.pprint(mdoc)
Result document:
{'_id': ObjectId('5e9898c69c26fe7ba93476f6'),
 'array_of_collections': [{'_id_collection': ObjectId('5e9898c69c26fe7ba93476f2'),
                           'bookmarks': [],
                           'name_of_collection': 'test'},
                          {'_id_collection': ObjectId('5e9898c69c26fe7ba93476f3'),
                           'bookmarks': [],
                           'name_of_collection': 'test2'},
                          {'_id_collection': ObjectId('5e9898c69c26fe7ba93476f4'),
                           'bookmarks': ['Pushed Value 1'],
                           'name_of_collection': 'test3'},
                          {'_id_collection': ObjectId('5e9898c69c26fe7ba93476f5'),
                           'bookmarks': [],
                           'name_of_collection': 'test4'}]}
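If you need to append several bookmarks in one update, the same filter should also work with the $each modifier of $push (a hedged sketch reusing the ids and field names from the example above):
coll.update_one(
    {"_id": ObjectId("5e9898c69c26fe7ba93476f6"),
     'array_of_collections': {'$elemMatch': {'_id_collection': ObjectId("5e9898c69c26fe7ba93476f4")}}},
    # $each pushes every value in the list onto the matched element's bookmarks array
    {'$push': {'array_of_collections.$.bookmarks': {'$each': ['Pushed Value 2', 'Pushed Value 3']}}})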

Related

Why does Pymongo return an E11000 duplicate key error when using the Python console

I am following the tutorial in the PyMongo documentation:
import datetime

import pymongo
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017/")
test_db = client.test_db  # This creates a database
posts = test_db.posts  # This creates a collection of documents

post = {"author": "Doug",
        "text": "My first blog post!",
        "tags": ["mongodb", "python"],
        "date": datetime.datetime.utcnow()}

post_id = posts.insert_one(post).inserted_id
The code works in both cases: pressing Run in the IDE or running it line by line in the Python console. However, whenever I run it in the Python console, it gives me the following error:
pymongo.errors.DuplicateKeyError: E11000 duplicate key error collection: text_database.another_collection index: _id_ dup key: { _id: ObjectId('5f505d1e233d210283dd4632') }, full error: {'index': 0, 'code': 11000, 'keyPattern': {'_id': 1}, 'keyValue': {'_id': ObjectId('5f505d1e233d210283dd4632')}, 'errmsg': "E11000 duplicate key error collection: text_database.another_collection index: _id_ dup key: { _id: ObjectId('5f505d1e233d210283dd4632') }"}
Creating a new empty collection and inserting the document otherwise works normally; only the duplicate key error occurs. May I ask why?
After insert_one(), your post dictionary will look like this:
>>> print(post)
{'_id': ObjectId('5f506285b54093f4b9202abe'),
 'author': 'Doug',
 'date': datetime.datetime(2020, 9, 3, 3, 27, 1, 729287),
 'tags': ['mongodb', 'python'],
 'text': 'My first blog post!'}
If you now try to insert the same post again, it will throw an error because the dictionary already has an _id field, and that _id already exists in the collection.
If you don't want the dictionary to be modified by the insert, you can insert a copy of it instead; create the copy with dict(post) or post.copy().
Now you can change your insert as below:
post_id = posts.insert_one(dict(post)).inserted_id
or
post_id = posts.insert_one(post.copy()).inserted_id
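Alternatively, if you want to keep reusing the same dictionary object (a hedged sketch, not part of the original answer), drop the _id that the previous insert_one() added so MongoDB generates a fresh one on the next insert:
post.pop('_id', None)  # remove the _id set by the previous insert_one(), if any
post_id = posts.insert_one(post).inserted_id  # a new unique _id is generated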

How to pull data from the JSON file

I have a JSON file which contains some data as below:
{
    'count': 2,
    'next': '?page=2',
    'previous': None,
    'results': [
        {
            'category': 'Triggers',
            'id': '783_23058',
            'name': 'Covid-19'
        },
        {
            'category': 'Sources',
            'id': '426_917746',
            'name': 'Covid19Conversations'
        }
    ]
}
I am able to extract the first 'id' and 'name' values as below
Doc_details = dict()
for item in companies:
    doc_id = companies['results'][0]['id']
    doc_name = companies['results'][0]['name']
    Doc_details[doc_name] = doc_id

for key, value in Doc_details.items():
    print(key, value)
Output:
Covid-19 783_23058
I am new to Python. Can someone help me with the following:
Loop through it and extract all the key, value pairs.
Save the results to an Excel file.
If you already have the object, you can iterate through companies['results'] and map each object to an [id, name] pair.
companies = {
    'count': 2,
    'next': '?page=2',
    'previous': None,
    'results': [{
        'category': 'Triggers',
        'id': '783_23058',
        'name': 'Covid-19'
    }, {
        'category': 'Sources',
        'id': '426_917746',
        'name': 'Covid19Conversations'
    }]
}

pairs = list(map(lambda x: [x['id'], x['name']], companies['results']))

csv = '\n'.join('\t'.join(val for val in pair) for pair in pairs)
print(csv)
Result
783_23058 Covid-19
426_917746 Covid19Conversations
Writing to a file
Convert the list of pairs to a CSV file. See: Writing a Python list of lists to a csv file.
import csv

# Open in text mode with newline='' (Python 3); mode 'wb' would raise a TypeError here.
with open('pairs.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerows(pairs)
If you only want the name, id pairs, you can just do:
for result in companies['results']:
    print(result['name'], result['id'])
# =>
# Covid-19 783_23058
# Covid19Conversations 426_917746
IIUC: you can use the built-in json package to parse the JSON file into a Python dict, and then use the pandas library to write the Excel file.
Try this:
import json
import pandas as pd
from pandas import ExcelWriter
with open("json_file.json", "r") as file:
info = json.load(file) # info contains all key-value pairs
# save to excel
writer = ExcelWriter('excel_file.xlsx')
pd.DataFrame(info["results"]).to_excel(writer, index=False)
writer.save()

How can I query a MongoDB database with different child levels?

I'm new to MongoDB and am using PyMongo. I'm trying to query a collection and also get a specific child from a field. This is what I tried:
import csv

import pymongo
from pymongo import MongoClient

connection = MongoClient()
db = connection.database
collection1 = db.data1
collection2 = db.data2

writer = csv.writer(open("Result_example.csv", "w"))
with open('Data_example.csv') as csvfile:
    spamreader = csv.reader(csvfile, delimiter=';')
    for row in spamreader:
        for rows in collection1.find({"_id": row[0]}, {"childs.first.name": 1}):
            writer.writerow([row[0], rows.get("childs.first.name")])
The database structure is like this:
child
first
name
What I want to get is the name...Any ideas?
Thanks!!!
Other than the field name, which is not pluralized in the example structure you provided (child, not childs), the query looks fine:
for rows in collection1.find({"_id": row[0]}, {"child.first.name": 1}):
Note that the child field is singular.
rows is a reference to a dictionary object like below:
{
    'child': {
        'first': {
            'name': 'Vorname'
        }
    }
}
rows.get("childs.first.name") returns None in writer.writerow([row[0], rows.get("childs.first.name")])
You can retrieve the name using
rows.get('child').get('first').get('name')
Or
rows['child']['first']['name']
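If you need this kind of dotted-path lookup in several places, a small helper can resolve a path like "child.first.name" against the nested dictionary (an illustrative sketch, not part of the original answer; get_nested is a hypothetical name):
def get_nested(doc, dotted_path, default=None):
    # Walk the nested dictionaries one key at a time.
    value = doc
    for key in dotted_path.split('.'):
        if not isinstance(value, dict) or key not in value:
            return default
        value = value[key]
    return value

# get_nested(rows, "child.first.name") -> 'Vorname'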
You can avoid these nested key accesses by running an aggregation that returns the document id and first name in place of collection1.find({"_id": row[0]}, {"child.first.name": 1}).
from bson import ObjectId  # required for ObjectId

children_names = db.collection1.aggregate([
    {
        '$match': {'_id': ObjectId(row[0])}
    },
    {
        '$replaceRoot': {'newRoot': {'_id': '$_id', 'first_name': '$child.first.name'}}
    },
])
Key access could then be done once.
for rows in children_names:
    writer.writerow([row[0], rows.get("first_name")])

SQLAlchemy - Update table using list of dictionaries

I have a table containing user data and I would like to update information for many of the users using a list of dictionaries. At the moment I am using a for loop to send an update statement one dictionary at a time, but it is slow and I am hoping that there is a bulk method to do this.
user_data = [{'user_id': '12345', 'user_name': 'John'},
             {'user_id': '11223', 'user_name': 'Andy'}]

connection = engine.connect()
metadata = MetaData()

for row in user_data:
    stmt = update(users_table).where(users_table.columns.user_id == row['user_id'])
    results = connection.execute(stmt, row)
Thanks in advance!
from sqlalchemy.sql.expression import bindparam

connection = engine.connect()

stmt = users_table.update().\
    where(users_table.c.user_id == bindparam('_id')).\
    values({
        'user_id': bindparam('user_id'),
        'user_name': bindparam('user_name'),
    })

connection.execute(stmt, [
    {'user_id': '12345', 'user_name': 'John', '_id': '12345'},
    {'user_id': '11223', 'user_name': 'Andy', '_id': '11223'}
])
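If you want to build that parameter list from the existing user_data instead of writing it by hand, a small sketch (assuming the same keys as in the question) could be:
# Copy each row and add the '_id' key that the WHERE clause binds against.
# A separate name is used because 'user_id' is already taken by the bindparam
# in the values() clause.
params = [dict(row, _id=row['user_id']) for row in user_data]
connection.execute(stmt, params)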

python elasticsearch: only an exact match returns results

I am new to Elasticsearch. There is an item in my index with the title: Using Python with Elasticsearch.
When I search for it, I always get zero hits unless I search for "Using Python with Elasticsearch" exactly.
Here is what I have tried:
1. The code:
import elasticsearch
from elasticsearch import Elasticsearch

INDEX_NAME = 'test_1'
es = Elasticsearch()

es.index(index=INDEX_NAME, doc_type='post', id=1,
         body={'title': 'Using Python with Elasticsearch',
               'tags': ['python', 'elasticsearch', 'tips']})
es.indices.refresh(index=INDEX_NAME)
res = es.indices.get(index=INDEX_NAME)
print res
The output is:
{u'test_1': {u'warmers': {}, u'settings': {u'index': {u'number_of_replicas': u'1', u'number_of_shards': u'5', u'uuid': u'Z2KLxeQLRay4rFgdK4yo9A', u'version': {u'created': u'2020199'}, u'creation_date': u'1484405704970'}}, u'mappings': {u'post': {u'properties': {u'blog': {u'type': u'string'}, u'title': {u'type': u'string'}, u'tags': {u'type': u'string'}, u'author': {u'type': u'string'}}}}, u'aliases': {}}}
2. I change the mappings with the code below:
INDEX_NAME = 'test_1'
from elasticsearch import Elasticsearch
es = Elasticsearch()

request_body = {
    'mappings': {
        'post': {
            'properties': {
                'title': {'type': 'text'}
            }
        }
    }
}

if es.indices.exists(INDEX_NAME):
    res = es.indices.delete(index=INDEX_NAME)
    print(" response: '%s'" % (res))

res = es.indices.create(index=INDEX_NAME, body=request_body, ignore=400)
print res
The output is:
response: '{u'acknowledged': True}'
{u'status': 400, u'error': {u'caused_by': {u'reason': u'No handler for type [text] declared on field [title]', u'type': u'mapper_parsing_exception'}, u'root_cause': [{u'reason': u'No handler for type [text] declared on field [title]', u'type': u'mapper_parsing_exception'}], u'type': u'mapper_parsing_exception', u'reason': u'Failed to parse mapping [post]: No handler for type [text] declared on field [title]'}}
3. I update the elasticsearch package from 1.9 to (5, 1, 0, 'dev').
4. I also try to change the mappings with the code below:
request_body = {
    'mappings': {
        'post': {
            'properties': {
                'title': {'type': 'string', 'index': 'not_analyzed'}
            }
        }
    }
}
5. I also change the mappings in this way:
request_body = {
    'mappings': {
        'post': {
            'properties': {
                'title': {'type': 'string', 'index': 'analyzed'}
            }
        }
    }
}
But it still cannot get any hits with the query "Using Python"!
I only installed the Python elasticsearch package; the code is just the simple demo code from the web.
Thanks a lot!
When you specify {"index": "not_analyzed"} in the mapping, it means that Elasticsearch is going to store the field as it is, without analyzing it. That's why you do not get a result when you search for just "Using Python". With Elasticsearch 5.x, if you specify the datatype of your field as text, Elasticsearch will analyze it first and then store it. With that you will be able to get a match for 'Using Python' in your query. You can find more documentation around the text type here.
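As a rough illustration (a sketch assuming Elasticsearch 5.x and the Python client used above, not tested against your setup), recreating the index with the title field mapped as text and then running a match query should return the document for a partial phrase:
from elasticsearch import Elasticsearch

INDEX_NAME = 'test_1'
es = Elasticsearch()

# Recreate the index with the title field mapped as analyzed text (5.x syntax).
if es.indices.exists(INDEX_NAME):
    es.indices.delete(index=INDEX_NAME)
es.indices.create(index=INDEX_NAME, body={
    'mappings': {
        'post': {
            'properties': {
                'title': {'type': 'text'}
            }
        }
    }
})

es.index(index=INDEX_NAME, doc_type='post', id=1,
         body={'title': 'Using Python with Elasticsearch'})
es.indices.refresh(index=INDEX_NAME)

# A match query analyzes the search string, so a partial phrase can match.
res = es.search(index=INDEX_NAME,
                body={'query': {'match': {'title': 'Using Python'}}})
print(res['hits']['total'])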
