KeyError when accessing collections from MongoDB (PyMongo) - python

Why does this work:
import pymongo
from selenium import webdriver
import smtplib
import sys
import json
from pymongo import MongoClient
client = MongoClient('localhost', 27017)
db = client.properties
collection = db['capitalpacific']
fromDB = []
if collection.count() != 0:
    for post in collection.find():
        fromDB.append(post)
print(fromDB[0]['url'])
This correctly prints the url of the first document in the collection (xxx.com),
but I get a KeyError when I do this:
for i in range(0, 2):
    print(fromDB[i]['url'])
KeyError: 'url'
The documents stored in the DB look like this:
{'url':'xxx.com', 'location':'oregon'}

KeyError generally means the key doesn't exist in the dictionary collection.
For example :
>>> mydoc1=dict(url='xxx.com', location='oregon')
>>> mydoc2=dict(wrongkey='yyy.com', location='oregon')
>>> mylist=[]
>>> mylist.append(mydoc1)
>>> mylist.append(mydoc2)
>>> print(mylist[0]['url'])
xxx.com
>>> for i in range(0, 2):
...     print(mylist[i]['url'])
...
xxx.com
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
KeyError: 'url'
>>>
Here, mydoc2 doesn't have a key called 'url', so a KeyError is raised for the second element in the list.
So, are you sure 'url' exists in the first two records? Print the contents of fromDB and make sure that both of the first two records have a 'url' key.
>>> print(mylist)
[{'url': 'xxx.com', 'location': 'oregon'}, {'wrongkey': 'yyy.com', 'location': 'oregon'}]
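To find out which documents lack the key, one option is to scan the list and collect the offending indices. The sketch below is written for Python 3 and uses plain dictionaries as stand-ins for the PyMongo documents in fromDB; the helper name find_missing_key and the sample data are made up for illustration.

```python
def find_missing_key(docs, key):
    """Return the indices of the documents that do not contain `key`."""
    return [i for i, doc in enumerate(docs) if key not in doc]


# Hypothetical sample data standing in for the contents of fromDB.
from_db = [
    {'url': 'xxx.com', 'location': 'oregon'},
    {'wrongkey': 'yyy.com', 'location': 'oregon'},
]

missing = find_missing_key(from_db, 'url')
print(missing)

# dict.get() returns a default instead of raising KeyError when the key is absent.
urls = [doc.get('url', '<no url>') for doc in from_db]
print(urls)
```

On the database side, a query along the lines of collection.find({'url': {'$exists': False}}) should return the same documents without loading the whole collection first.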

Related

Failing to access a dict in a dict: "string indices must be integers" error

import boto3
client = boto3.client('secretsmanager')
response = client.get_secret_value(SecretId='xxxx')
print('entire response:', response)
print('SecretString:',response['SecretString'])
print('testvalue:', response['SecretString']["testkey"])
I am trying to use AWS Secrets Manager and need to access the testvalue.
entire response:{---, u'SecretString': u'{"testkey":"testvalue","testkey2":"testvalue2"}', ----}
Secretstring:{"testkey":"testvalue","testkey2":"testvalue2"}
Traceback (most recent call last):
  File "secretmanagertest.py", line 7, in <module>
    print('testvalue',response['SecretString']["testkey"])
TypeError: string indices must be integers
When I use an integer index instead, I only get a single character.
print(response['SecretString'][0])
{
print(response['SecretString'][1])
"
print(response['SecretString'][2])
t
etc.
The SecretString value is a JSON document stored as a string; it is not a dictionary yet. Decode it first, with json.loads():
import json
secret = json.loads(response['SecretString'])
print(secret['testkey'])
Demo:
>>> import json
>>> response = {u'SecretString': u'{"testkey":"testvalue","testkey2":"testvalue2"}'}
>>> response['SecretString']
u'{"testkey":"testvalue","testkey2":"testvalue2"}'
>>> json.loads(response['SecretString'])
{u'testkey2': u'testvalue2', u'testkey': u'testvalue'}
>>> json.loads(response['SecretString'])['testkey']
u'testvalue'
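The decoding step can be wrapped in a small helper so callers never index the raw string by mistake. This is a minimal Python 3 sketch; the function name read_secret_key is hypothetical, and the response dictionary is a hand-written stand-in for what get_secret_value returned in the question.

```python
import json


def read_secret_key(response, key):
    """Decode the JSON text under 'SecretString' and return one entry."""
    secret = json.loads(response['SecretString'])
    return secret[key]


# Hand-written stand-in for the boto3 response shown in the question.
response = {'SecretString': '{"testkey":"testvalue","testkey2":"testvalue2"}'}

# The raw value is still a string, which is why string keys fail on it.
raw_is_string = isinstance(response['SecretString'], str)

value = read_secret_key(response, 'testkey')
print(value)
```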

Get replicationLag in mongo with pyMongo

I am trying to get the replication delay using db.rs.printSlaveReplicationInfo from Python with PyMongo, but I can't find a proper way to do so.
I tried the following, without success.
>>> from pymongo import MongoClient
>>> client = MongoClient()
>>> db = client.test_database
>>> db.rs.printSlaveReplicationInfo
Collection(Database(MongoClient([u'10.0.0.19:10006', u'10.0.0.68:10002']), u'xyz'), u'rs.printSlaveReplicationInfo')
>>> db.rs.printSlaveReplicationInfo()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib64/python2.7/site-packages/pymongo/collection.py", line 2413, in __call__
    self.__name.split(".")[-1])
TypeError: 'Collection' object is not callable. If you meant to call the 'printSlaveReplicationInfo' method on a 'Collection' object it is failing because no such method exists.
>>> db.rs
Collection(Database(MongoClient([u'10.0.0.19:10006', u'10.0.0.68:10002']), u'xyz'), u'rs')
Can anyone help with this, or show me how to do it?
Thanks in advance.
I found the answer. Here is the complete code:
(Note: You need to have admin privileges to run this command.)
import pymongo
# Placeholder URI: substitute your own username, password, host and port.
uri = "mongodb://username:password@host:port/admin"
conn = pymongo.MongoClient(uri)
db = conn['admin']
db_stats = db.command({'replSetGetStatus': 1})
primary_optime = 0
secondary_optime = 0
for key in db_stats['members']:
    if key['stateStr'] == 'SECONDARY':
        secondary_optime = key['optimeDate']
    if key['stateStr'] == 'PRIMARY':
        primary_optime = key['optimeDate']
print('primary_optime : ' + str(primary_optime))
print('secondary_optime : ' + str(secondary_optime))
# total_seconds() gives the lag in seconds rather than as a timedelta object
seconds_lag = (primary_optime - secondary_optime).total_seconds()
print('secondary_lag : ' + str(seconds_lag))
optime represents the date up to which that mongo node has data.
You can read more about it here:
https://docs.mongodb.com/manual/reference/command/replSetGetStatus/
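Note that the loop above keeps only the last SECONDARY it sees. If the replica set has several secondaries, a per-member variant may be more useful. The following Python 3 sketch works on a plain dictionary shaped like a replSetGetStatus reply; the function name secondary_lags and the sample hosts and timestamps are invented for illustration, and only the name, stateStr and optimeDate fields are assumed.

```python
from datetime import datetime


def secondary_lags(status):
    """Map each SECONDARY member's name to its lag behind the PRIMARY, in seconds."""
    members = status['members']
    # Raises StopIteration if the reply contains no PRIMARY member.
    primary = next(m for m in members if m['stateStr'] == 'PRIMARY')
    return {
        m['name']: (primary['optimeDate'] - m['optimeDate']).total_seconds()
        for m in members
        if m['stateStr'] == 'SECONDARY'
    }


# Made-up sample containing only the fields used above.
sample_status = {
    'members': [
        {'name': 'host1:27017', 'stateStr': 'PRIMARY',
         'optimeDate': datetime(2020, 1, 1, 12, 0, 10)},
        {'name': 'host2:27017', 'stateStr': 'SECONDARY',
         'optimeDate': datetime(2020, 1, 1, 12, 0, 7)},
        {'name': 'host3:27017', 'stateStr': 'SECONDARY',
         'optimeDate': datetime(2020, 1, 1, 12, 0, 10)},
    ]
}

lags = secondary_lags(sample_status)
print(lags)
```

With a live connection, the dictionary returned by db.command({'replSetGetStatus': 1}) would take the place of sample_status.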

JSON sub for loop produces KeyError, but key exists

I'm trying to add the JSON output below into a dictionary, to be saved into a SQL database.
{'Parkirisca': [
    {
        'ID_Parkirisca': 2,
        'zasedenost': {
            'Cas': '2016-10-08 13:17:00',
            'Cas_timestamp': 1475925420,
            'ID_ParkiriscaNC': 9,
            'P_kratkotrajniki': 350
        }
    }
]}
I am currently using the following code to add the value to a dictionary:
import scraperwiki
import json
import requests
import datetime
import time
from pprint import pprint
html = requests.get("http://opendata.si/promet/parkirisca/lpt/")
data = json.loads(html.text)
for carpark in data['Parkirisca']:
    zas = carpark['zasedenost']
    free_spaces = zas.get('P_kratkotrajniki')
    last_updated = zas.get('Cas_timestamp')
    parking_type = carpark.get('ID_Parkirisca')
    if parking_type == "Avtomatizirano":
        is_automatic = "Yes"
    else:
        is_automatic = "No"
    scraped = datetime.datetime.fromtimestamp(time.time()).strftime('%Y-%m-%d %H:%M:%S')
    savetodb = {
        'scraped': scraped,
        'id': carpark.get("ID_Parkirisca"),
        'total_spaces': carpark.get("St_mest"),
        'free_spaces': free_spaces,
        'last_updated': last_updated,
        'is_automatic': is_automatic,
        'lon': carpark.get("KoordinataX_wgs"),
        'lat': carpark.get("KoordinataY_wgs")
    }
    unique_keys = ['id']
    pprint savetodb
However when I run this, it gets stuck at for zas in carpark["zasedenost"] and outputs the following error:
Traceback (most recent call last):
  File "./code/scraper", line 17, in <module>
    for zas in carpark["zasedenost"]:
KeyError: 'zasedenost'
I've been led to believe that zas is in fact now a string, rather than a dictionary, but I'm new to Python and JSON, so I don't know what to search for to find a solution. I've also searched here on Stack Overflow for "KeyError when key exists" questions, but they didn't help, and I believe this might be because it's a nested for loop.
Update: Now, when I swapped the double quotes for single quotes, I get the following error:
Traceback (most recent call last):
  File "./code/scraper", line 17, in <module>
    free_spaces = zas.get('P_kratkotrajniki')
AttributeError: 'unicode' object has no attribute 'get'
I fixed up your code:
Added required imports.
Fixed the pprint savetodb line which isn't valid Python.
Didn't try to iterate over carpark['zasedenost'].
I then added another pprint statement in the for loop to see what's in carpark when the KeyError occurs. From there, the error is clear. (Not all the elements in the array in your JSON contain the 'zasedenost' key.)
Here's the code I used:
import datetime
import json
from pprint import pprint
import time
import requests
html = requests.get("http://opendata.si/promet/parkirisca/lpt/")
data = json.loads(html.text)
for carpark in data['Parkirisca']:
    pprint(carpark)
    zas = carpark['zasedenost']
    free_spaces = zas.get('P_kratkotrajniki')
    last_updated = zas.get('Cas_timestamp')
    parking_type = carpark.get('ID_Parkirisca')
    if parking_type == "Avtomatizirano":
        is_automatic = "Yes"
    else:
        is_automatic = "No"
    scraped = datetime.datetime.fromtimestamp(time.time()).strftime('%Y-%m-%d %H:%M:%S')
    savetodb = {
        'scraped': scraped,
        'id': carpark.get("ID_Parkirisca"),
        'total_spaces': carpark.get("St_mest"),
        'free_spaces': free_spaces,
        'last_updated': last_updated,
        'is_automatic': is_automatic,
        'lon': carpark.get("KoordinataX_wgs"),
        'lat': carpark.get("KoordinataY_wgs")
    }
    unique_keys = ['id']
    pprint(savetodb)
And here's the output on the iteration where the KeyError occurs:
{u'A_St_Mest': None,
 u'Cena_dan_Eur': None,
 u'Cena_mesecna_Eur': None,
 u'Cena_splosno': None,
 u'Cena_ura_Eur': None,
 u'ID_Parkirisca': 7,
 u'ID_ParkiriscaNC': 72,
 u'Ime': u'P+R Studenec',
 u'Invalidi_St_mest': 9,
 u'KoordinataX': 466947,
 u'KoordinataX_wgs': 14.567929171694901,
 u'KoordinataY': 101247,
 u'KoordinataY_wgs': 46.05457609543313,
 u'Opis': u'2,40 \u20ac /dan',
 u'St_mest': 187,
 u'Tip_parkirisca': None,
 u'U_delovnik': u'24 ur (ponedeljek - petek)',
 u'U_sobota': None,
 u'U_splosno': None,
 u'Upravljalec': u'JP LPT d.o.o.'}
Traceback (most recent call last):
  File "test.py", line 14, in <module>
    zas = carpark['zasedenost']
KeyError: 'zasedenost'
As you can see, the error is quite accurate. There's no key 'zasedenost' in the dictionary. If you look through your JSON, you'll see that's true for a number of the elements in that array.
I'd suggest a fix, but I don't know what you want to do in the case where this dictionary key is absent. Perhaps you want something like this:
zas = carpark.get('zasedenost')
if zas is not None:
    free_spaces = zas.get('P_kratkotrajniki')
    last_updated = zas.get('Cas_timestamp')
else:
    free_spaces = None
    last_updated = None
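The same fallback can be folded into a helper that substitutes an empty dictionary when the key is absent. This is a Python 3 sketch under the assumption that missing occupancy data should simply become None; the function name occupancy and the trimmed sample records are hypothetical.

```python
def occupancy(carpark):
    """Return (free_spaces, last_updated); both are None when 'zasedenost' is absent."""
    # Fall back to an empty dict so the .get() calls below are always safe.
    zas = carpark.get('zasedenost') or {}
    return zas.get('P_kratkotrajniki'), zas.get('Cas_timestamp')


# Trimmed sample records: the first mirrors the JSON in the question,
# the second has no 'zasedenost' key at all.
carparks = [
    {'ID_Parkirisca': 2,
     'zasedenost': {'Cas_timestamp': 1475925420, 'P_kratkotrajniki': 350}},
    {'ID_Parkirisca': 7, 'St_mest': 187},
]

rows = [occupancy(c) for c in carparks]
print(rows)
```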

How to get highlighted searches on whoosh

I used example code from pythonhosted.org, but nothing seems to happen. This is the code I used:
results = mysearcher.search(myquery)
for hit in results:
    print(hit["title"])
I entered this code into Python, but it gives an error saying mysearcher is not defined. I'm not sure whether I'm missing something; I'm just trying to get the basics up and running.
You haven't defined the searcher mysearcher; you need the whole example, not just that snippet. Here is a complete example:
>>> import whoosh
>>> from whoosh.index import create_in
>>> from whoosh.fields import *
>>> schema = Schema(title=TEXT(stored=True), path=ID(stored=True), content=TEXT)
>>> ix = create_in("indexdir", schema)
>>> writer = ix.writer()
>>> writer.add_document(title=u"First document", path=u"/a",
...                     content=u"This is the first document we've added!")
>>> writer.add_document(title=u"Second document", path=u"/b",
...                     content=u"The second one is even more interesting!")
>>> writer.commit()
>>> from whoosh.qparser import QueryParser
>>> with ix.searcher() as searcher:
...     query = QueryParser("content", ix.schema).parse("first")
...     results = searcher.search(query)
...     results[0]
...
{"title": u"First document", "path": u"/a"}
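Then you can highlight like this (run the loop inside the with block, while the searcher is still open):
for hit in results:
    print(hit["title"])
    # Assumes the "content" field is stored, i.e. content=TEXT(stored=True) in the schema
    print(hit.highlights("content"))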

mongoengine: test1 is not a valid ObjectId

I got the following error message:
$ python tmp2.py
why??
Traceback (most recent call last):
  File "tmp2.py", line 15, in <module>
    test._id = ObjectId(i[0])
  File "/home/mictadlo/.virtualenvs/unisnp/lib/python2.7/site-packages/bson/objectid.py", line 92, in __init__
    self.__validate(oid)
  File "/home/mictadlo/.virtualenvs/unisnp/lib/python2.7/site-packages/bson/objectid.py", line 199, in __validate
    raise InvalidId("%s is not a valid ObjectId" % oid)
bson.errors.InvalidId: test1 is not a valid ObjectId
with this code:
from bson.objectid import ObjectId
from mongoengine import *
class Test(Document):
    _id = ObjectIdField(required=True)
    tag = StringField(required=True)
if __name__ == "__main__":
    connect('dbtest2')
    print "why??"
    for i in [('test1', "a"), ('test2', "b"), ('test3', "c")]:
        test = Test()
        test._id = ObjectId(i[0])
        test.char = i[1]
        test.save()
How can I use my own ids, which are also unique?
According to the documentation (http://docs.mongoengine.org/apireference.html#fields), ObjectIdField is 'A field wrapper around MongoDB’s ObjectIds.', so it cannot accept the string test1 as an object id.
You may have to change the code to something like this:
for i in [(ObjectId('test1'), "a"), (ObjectId('test2'), "b"), (ObjectId('test3'), "c")]:
for your code to work (assuming test1 etc. are replaced with valid ids; as the traceback shows, 'test1' itself is not one).
Two things:
ObjectId takes a 24-character hex string, so you can't initialize it with 'test1'. Instead, you can use a string such as '53f6b9bac96be76a920e0799' or '111111111111111111111111'. You don't even need to construct an ObjectId yourself; you could do something like this:
...
test._id = '53f6b9bac96be76a920e0799'
test.save()
...
I don't know what you are trying to accomplish by using _id. If you are trying to create an id field or "primary key" for your document, it's not necessary, because one is generated automatically. Your code would be:
class Test(Document):
    tag = StringField(required=True)
for tag in ["a", "b", "c"]:
    test = Test()
    test.tag = tag  # set the required 'tag' field declared on the class
    test.save()
    print(test.id) # would print something similar to 53f6b9bac96be76a920e0799
If you insist on using a field named _id, be aware that it will be the same thing as id, because internally MongoDB calls it _id. If you still want to use a string such as test1 as the identifier, you should do:
class Test(Document):
    _id = StringField(primary_key=True)
    tag = StringField(required=True)
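To see why 'test1' is rejected while the other example strings are accepted, the 24-character hex rule mentioned above can be checked with the standard library alone. This Python 3 sketch is only an approximation of the validation that bson performs; the function name looks_like_object_id is hypothetical.

```python
import string


def looks_like_object_id(value):
    """True if `value` is a 24-character hexadecimal string."""
    return (
        isinstance(value, str)
        and len(value) == 24
        and all(ch in string.hexdigits for ch in value)
    )


# The three candidate ids discussed in the answers above.
checks = {
    s: looks_like_object_id(s)
    for s in ('test1', '53f6b9bac96be76a920e0799', '111111111111111111111111')
}
print(checks)
```

With the bson package installed, ObjectId.is_valid offers a similar check without hand-rolled code.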
