using boto to scan dynamodb table - python

Folks,
I have an 'admins' table, with 'UserName' as its HashKey.
The table looks like this:
admins = Table('admins')
admins.put_item(data={
'UserName':'jon',
'password':'pass1',
})
admins.put_item(data={
'UserName':'tom',
'password':'pass2',
})
So to pull the users out, I am trying to do the following, but failing:
admins = Table('admins')
all_admins = admins.scan()
for x in all_admins:
print x['UserName']
Why am I getting an empty set?
Thanks!

What you are doing looks correct.
Have you confirmed the data actually got written?(take a look at the AWS console)
Are you trying to read directly after writing? The default read is eventually consistent, and thus you may not find items directly after writing them

Use Item= instead of data= to insert the items, you can also use batch
admins = Table('admins')
admins.put_item(Item={
'UserName':'jon',
'password':'pass1'})
admins.put_item(Item={
'UserName':'tom',
'password':'pass2'})
And to pull the users you should use
admins = Table('admins')
all_admins = admins.scan(
ConsistentRead=True)
items = all_admins['Items']
for x in items:
print x['UserName']
This should do the job

Related

how to fetch firebase data?

I am new to python and firebase and I am trying to flaten my firebase database.
I have a database in this format
each cat has thousands of data in it. All I want is to fetch the cat names and put them in an array. for example I want the output to be ['cat1','cat2'....]
I was using this tutorial
http://ozgur.github.io/python-firebase/
from firebase import firebase
firebase = firebase.FirebaseApplication('https://your_storage.firebaseio.com', None)
result = firebase.get('/Data', None)
the problem with the above code is it'll attempt to fetch all the data under Data. How can I only fetch the "cats"?
if you want to get the values inside the cats as columns, try using the pyrebase, using pip install pyrebase at cmd / anaconda prompt(later prefered if you didn't set up PIP or Python at your environment paths. after installing:
import pyrebase
config {"apiKey": yourapikey
"authDomain": yourapidomain
"databaseURL": yourdatabaseurl,
"storageBucket": yourstoragebucket,
"serviceAccount": yourserviceaccount
}
Note: you can find all the information above at your Firebase's console:
https://console.firebase.google.com/project/ >>> your project >>> click on the icon "<'/>" with the tag "add firebase to your web app
back to the code...
make a neat definition so you can store it into a py file:
def connect_firebase():
# add a way to encrypt those, I'm a starter myself and don't know how
username: "usernameyoucreatedatfirebase"
password: "passwordforaboveuser"
firebase = pyrebase.initialize_app(config)
auth = firebase.auth()
#authenticate a user > descobrir como não deixar hardcoded
user = auth.sign_in_with_email_and_password(username, password)
#user['idToken']
# At pyrebase's git the author said the token expires every 1 hour, so it's needed to refresh it
user = auth.refresh(user['refreshToken'])
#set database
db = firebase.database()
return db
Ok, now save this into a neat .py file
NEXT, at your new notebook or main .py you're going to import this new .py file that we'll call auth.py from now on...
from auth import *
# add do a variable
db = connect_firebase()
#and now the hard/ easy part that took me a while to figure out:
# notice the value inside the .child, it should be the parent name with all the cats keys
values = db.child('cats').get()
# adding all to a dataframe you'll need to use the .val()
data = pd.DataFrame(values.val())
and thats it, print(data.head()) to check if the values / columns are where they're expected to be.
Firebase Realtime Database is one big JSON tree:
when you fetch data at a location in your database, you also retrieve
all of its child nodes.
The best practice is to denormalize your data, creating multiple locations (nodes) for the same data:
Many times you can denormalize the data by using a query to retrieve a
subset of the data
In your case, you may create a second node named "categories" where you list "only" the category names.
/cat1
/...
/cat2
/...
/cat3
/...
/cat4
/...
/categories
/cat1
/cat2
/cat3
/cat4
In this scenario you can use the update() method to write to more than one location at the same time.
I was exploring pyrebase documentation. As per that, we may extract only keys from some path.
To return just the keys at a particular path use the shallow() method.
all_user_ids = db.child("users").shallow().get()
In your case, it'll be something like:
firebase = pyrebase.initialize_app(config)
db = firebase.database()
allCats = db.child("data").shallow().get()
Let me know if it didn't help.

How do I retrieve a path's data from firebase database using python?

I have this firebase database structure
I want to print out the inventory list(Inventory) for each ID under Businesses.
So I tried this code
db = firebase.database()
all_users = db.child("Businesses").get()
for user in all_users.each():
userid = user.key()
inventorydb = db.child("Businesses").child(userid).child("Inventory")
print(inventorydb)
but all I got is this
<pyrebase.pyrebase.Database object at 0x1091eada0>
what am I doing wrong and how can I loop through each Business ID and print out their inventory?
First, you're printing a Database object. You need to get the data still.
You seem to already know how to get that as well as the children. Or you only copied the examples without understanding it...
Either way, you can try this
db = firebase.database()
businesses = db.child("Businesses")
for userid in businesses.shallow().get().each():
inventory = businesses.child(userid).child("Inventory").get()
print( inventory.val() )
On a side note, National_Stock_Numbers looks like it should be a value of the name, not a key for a child

Python, CouchDb: how to Update already existing document by ID

I am trying to update an already existing document by ID. My intention is to find the doc by its id, then change its "firstName" with new value coming in "json", then update it into the CouchDB database.
Here is my code:
def updateDoc(self, id, json):
doc = self.db.get(id)
doc["firstName"] = json["firstName"]
doc_id, doc_rev = self.db.save(doc)
print doc_id, doc_rev
print "Saved"
//"json" is retrieved from PUT request (request.json)
at self.db.save(doc) I'm getting exception as "too many values to unpack".
I am using Bottle framework, Python 2.7 and Couch Query.
How do I update the document by id? what is the right way to do it?
In couchdb-python the db.save(doc) method returns tuple of _id and _rev. You're using couch-query - a bit different project that also has a db.save(doc) method, but it returns a different result. So your code should look like this:
def updateDoc(self, id, json):
doc = self.db.get(id)
doc["firstName"] = json["firstName"]
doc = self.db.save(doc)
print doc['_id'], doc['_rev']
print "Saved"

Updating DataStore JSON values using endpoints (Python)

I am trying to use endpoints to update some JSON values in my datastore. I have the following Datastore in GAE...
class UsersList(ndb.Model):
UserID = ndb.StringProperty(required=True)
ArticlesRead = ndb.JsonProperty()
ArticlesPush = ndb.JsonProperty()
In general what I am trying to do with the API is have the method take in a UserID and a list of articles read (with an article being represented by a dictionary holding an ID and a boolean field saying whether or not the user liked the article). My messages (centered on this logic) are the following...
class UserID(messages.Message):
id = messages.StringField(1, required=True)
class Articles(messages.Message):
id = messages.StringField(1, required=True)
userLiked = messages.BooleanField(2, required=True)
class UserIDAndArticles(messages.Message):
id = messages.StringField(1, required=True)
items = messages.MessageField(Articles, 2, repeated=True)
class ArticleList(messages.Message):
items = messages.MessageField(Articles, 1, repeated=True)
And my API/Endpoint method that is trying to do this update is the following...
#endpoints.method(UserIDAndArticles, ArticleList,
name='user.update',
path='update',
http_method='GET')
def get_update(self, request):
userID = request.id
articleList = request.items
queryResult = UsersList.query(UsersList.UserID == userID)
currentList = []
#This query always returns only one result back, and this for loop is the only way
# I could figure out how to access the query results.
for thing in queryResult:
currentList = json.loads(thing.ArticlesRead)
for item in articleList:
currentList.append(item)
for blah in queryResult:
blah.ArticlesRead = json.dumps(currentList)
blah.put()
for thisThing in queryResult:
pushList = json.loads(thisThing.ArticlesPush)
return ArticleList(items = pushList)
I am having two problems with this code. The first is that I can't seem to figure out (using the localhost Google APIs Explorer) how to send a list of articles to the endpoints method using my UserIDAndArticles class. Is it possible to have a messages.MessageField() as an input to an endpoint method?
The other problem is that I am getting an error on the 'blah.ArticlesRead = json.dumps(currentList)' line. When I try to run this method with some random inputs, I get the following error...
TypeError: <Articles
id: u'hi'
userLiked: False> is not JSON serializable
I know that I have to make my own JSON encoder to get around this, but I'm not sure what the format of the incoming request.items is like and how I should encode it.
I am new to GAE and endpoints (as well as this kind of server side programming in general), so please bear with me. And thanks so much in advance for the help.
A couple things:
http_method should definitely be POST, or better yet PATCH because you're not overwriting all existing values but only modifying a list, i.e. patching.
you don't need json.loads and json.dumps, NDB does it automatically for you.
you're mixing Endpoints messages and NDB model properties.
Here's the method body I came up with:
# get UsersList entity and raise an exception if none found.
uid = request.id
userlist = UsersList.query(UsersList.UserID == uid).get()
if userlist is None:
raise endpoints.NotFoundException('List for user ID %s not found' % uid)
# update user's read articles list, which is actually a dict.
for item in request.items:
userslist.ArticlesRead[item.id] = item.userLiked
userslist.put()
# assuming userlist.ArticlesPush is actually a list of article IDs.
pushItems = [Article(id=id) for id in userlist.ArticlesPush]
return ArticleList(items=pushItems)
Also, you should probably wrap this method in a transaction.

couchdb-python change notifications

I'm trying to use couchdb.py to create and update databases. I'd like to implement notification changes, preferably in continuous mode. Running the test code posted below, I don't see how the changes scheme works within python.
class SomeDocument(Document):
#############################################################################
# def __init__ (self):
intField = IntegerField()#for now - this should to be an integer
textField = TextField()
couch = couchdb.Server('http://127.0.0.1:5984')
databasename = 'testnotifications'
if databasename in couch:
print 'Deleting then creating database ' + databasename + ' from server'
del couch[databasename]
db = couch.create(databasename)
else:
print 'Creating database ' + databasename + ' on server'
db = couch.create(databasename)
for iii in range(5):
doc = SomeDocument(intField=iii,textField='somestring'+str(iii))
doc.store(db)
print doc.id + '\t' + doc.rev
something = db.changes(feed='continuous',since=4,heartbeat=1000)
for iii in range(5,10):
doc = SomeDocument(intField=iii,textField='somestring'+str(iii))
doc.store(db)
time.sleep(1)
print something
print db.changes(since=iii-1)
The value
db.changes(since=iii-1)
returns information that is of interest, but in a format from which I haven't worked out how to extract the sequence or revision numbers, or the document information:
{u'last_seq': 6, u'results': [{u'changes': [{u'rev': u'1-9c1e4df5ceacada059512a8180ead70e'}], u'id': u'7d0cb1ccbfd9675b4b6c1076f40049a8', u'seq': 5}, {u'changes': [{u'rev': u'1-bbe2953a5ef9835a0f8d548fa4c33b42'}], u'id': u'7d0cb1ccbfd9675b4b6c1076f400560d', u'seq': 6}]}
Meanwhile, the code I'm really interested in using:
db.changes(feed='continuous',since=4,heartbeat=1000)
Returns a generator object and doesn't appear to provide notifications as they come in, as the CouchDB guide suggests ....
Has anyone used changes in couchdb-python successfully?
I use long polling rather than continous, and that works ok for me. In long polling mode db.changes blocks until at least one change has happened, and then returns all the changes in a generator object.
Here is the code I use to handle changes. settings.db is my CouchDB Database object.
since = 1
while True:
changes = settings.db.changes(since=since)
since = changes["last_seq"]
for changeset in changes["results"]:
try:
doc = settings.db[changeset["id"]]
except couchdb.http.ResourceNotFound:
continue
else:
// process doc
As you can see it's an infinite loop where we call changes on each iteration. The call to changes returns a dictionary with two elements, the sequence number of the most recent update and the objects that were modified. I then loop through each result loading the appropriate object and processing it.
For a continuous feed, instead of the while True: line use for changes in settings.db.changes(feed="continuous", since=since).
I setup a mailspooler using something similar to this. You'll need to also load couchdb.Session() I also use a filter for only receiving unsent emails to the spooler changes feed.
from couchdb import Server
s = Server('http://localhost:5984/')
db = s['testnotifications']
# the since parameter defaults to 'last_seq' when using continuous feed
ch = db.changes(feed='continuous',heartbeat='1000',include_docs=True)
for line in ch:
doc = line['doc']
// process doc here
doc['priority'] = 'high'
doc['recipient'] = 'Joe User'
# doc['state'] + 'sent'
db.save(doc)
This will allow you access your doc directly from the changes feed, manipulate your data as you see fit, and finally update you document. I use a try/except block on the actual 'db.save(doc)' so I can catch when a document has been updated while I was editing and reload the doc before saving.

Categories