I'm working on a program using Kivy and Python, and I will be saving some data to a JSON file, say 'items.json'.
The thing is, I intend to retrieve the data from the store and use it to form a list of buttons in my app. Here is an example:
store = JsonStore('items.json')
store.put('infinix', name='infinix', category='gadgets')
store.put('wrist watch', name='wrist watch', category='outfits')
store.put('t-shirt', name='t-shirt', category='outfits')
This works well. But my problem is in retrieving the data.
I would like to get it back in the same order I entered it into the store.
For example, if I do
store.keys()
I would like it to return
['infinix', 'wrist watch', 't-shirt']
which is the same order I entered the data.
Currently, whenever I try to retrieve the data, the order is mixed up.
Is there a way to achieve what I need?
Any help is greatly appreciated.
The simplest option would seem to be just adding an extra storage key containing a list of your items in the correct order. You can then just check this first, and load them in that order.
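For example, a minimal sketch of that approach (the '_order' key name here is just an illustration, not anything JsonStore requires):

from kivy.storage.jsonstore import JsonStore

store = JsonStore('items.json')

store.put('infinix', name='infinix', category='gadgets')
store.put('wrist watch', name='wrist watch', category='outfits')
store.put('t-shirt', name='t-shirt', category='outfits')

# Keep the insertion order in a dedicated key:
store.put('_order', keys=['infinix', 'wrist watch', 't-shirt'])

# Load the items back in the order they were entered:
ordered_keys = store.get('_order')['keys']
ordered_items = [store.get(key) for key in ordered_keys]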
I'm brand new here and brand new to Python and programming in general. I wrote a simple script today that I'm pretty proud of as a new beginner. I used BS4 and Requests to scrape some data from a website. I put all of the data in dictionaries inside a list. The same key/value pairs exist for every list item. For simplicity, I'm left with something like this:
[{'country': 'us', 'state': 'new york', 'people': 50}, {'country': 'us', 'state': 'california', 'people': 30}]
Like I said, pretty simple, but then I can turn it into a Pandas dataframe and everything is organized, with a few hundred different dictionaries inside the list. My next step is to run this scrape every hour for 5 hours, and the only thing that changes is the value of the 'people' key. All of a sudden I'm not sure a list of lists of dictionaries (did I say that right?!) is a great idea. Plus, I really only need to get the updated values of 'people' from the webpage. Is this something I can realistically do with built-in Python lists and dictionaries? I don't know much about databases, but I'm thinking that maybe SQLite might be good to use. I really only know about it in concept but haven't worked with it. Thoughts?
Ideally, after several scrapes, I would have easy access to the data to say, see 'people' in 'new york' over time. Or find at what time 'california' had the highest number of people. And then I could plot the data in 1000 different ways! I'd love any guidance or direction here. Thanks a bunch!
You could create a Python class, like this:
class StateStats:
    def __init__(self, country, state, people):
        self.country = country
        self.state = state
        self.people = people

    def update(self):  # note: update needs self as its first argument
        # Do whatever your update script is here.
        # Except, update the value self.people when it changes,
        # like this: self.people = new_people_value
        pass
And then create instances of it like this:
# For each state you have scraped, make a new instance of this class.
# This assumes that the list you gathered is stored in a variable named my_list.
state_stats_list = []
for dictionary in my_list:
    state_stats_list.append(
        StateStats(
            dictionary['country'],
            dictionary['state'],
            dictionary['people']
        )
    )
# Or, instead, you can just create the class instances
# when you scrape the webpage, instead of creating a
# list and then creating another list of classes from that list.
You could also use a database like SQLite, but I think this will be fine for your purpose. Hope this helps!
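And if you do want the over-time queries from your question (e.g. 'people' in 'new york' across scrapes), one hedged tweak to the sketch above is to keep a timestamped history instead of overwriting self.people (the history attribute is an illustration, not something the class requires):

import time

class StateStats:
    def __init__(self, country, state, people):
        self.country = country
        self.state = state
        # (timestamp, people) readings, so past scrapes stay queryable
        self.history = [(time.time(), people)]

    def update(self, new_people):
        # Append rather than overwrite, preserving the older readings
        self.history.append((time.time(), new_people))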
I am fairly new to Python, and what I am trying to achieve is to read JSON and store it in a dictionary. I need to store a common variable index, a change variable index, and a list of changes for the change variable.
So far I get the JSON message, set the common variable index, and append it to a list. I then store the change variable in the same list before appending all the changes for it.
For example: if I had two authors and each author wrote three books, for each book I need to store the changes they made (store the page numbers changed). Once all that data is stored, I would need to loop over all the changes for each book and see if they appear in the other books, then print out the author and the book that has the same page number changed. (I know it is not the best example, but you get the point.)
Reading in from a file:
import re

data = []  # the original used the name "list", which shadows the builtin
item = []
with open("tst.file", 'r') as f:
    for line in f:
        authorRegEx = re.search(r"\([A-Z0-9-]+\)", line)
        author = authorRegEx.group(0).strip("()")
        data.append(author)

        bookItemRegEx = re.search(r"\*[A-Z0-9-]+\*", line)
        book = bookItemRegEx.group(0).strip("*")
        data.append(book)

        # Then call a REST API to get the JSON message
        modifiedPages = api_call()  # placeholder for the actual REST call

        for i in range(len(modifiedPages['values'])):
            pageNumber = modifiedPages['values'][i]['page']
            item.append(pageNumber)
        data.append(item)
        item = []

print data
Which outputs
['Author1', 'book1', [u'page1', u'page2', u'page3'], 'Author1', 'book2', [u'page2', u'page4'], 'Author2', 'book1', [u'page2', u'page3']]
At this point I am at a loss as to what to do next. I need to loop over the pages for each author and book and check whether each page appears in the others, then print out which author, book, and page share the same page number.
Thanks in advance.
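One hedged sketch of that cross-check, assuming the flat list above is first regrouped into (author, book, pages) triples (the grouping and names here are illustrative):

# Regroup the flat list, which alternates author, book, page-list:
records = [(data[i], data[i + 1], data[i + 2])
           for i in range(0, len(data), 3)]

# For each page, collect every (author, book) it appears in.
pages_seen = {}
for author, book, pages in records:
    for page in pages:
        pages_seen.setdefault(page, []).append((author, book))

# Any page listed under more than one book was changed in several places.
for page, owners in pages_seen.items():
    if len(owners) > 1:
        for author, book in owners:
            print author, book, page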
I'm using Python and "requests" to practice using an API. I've had success with basic requests and parsing, but I'm having difficulty with list comprehensions for a more complex project.
I requested from a server and got a dictionary. From there, I used:
participant_search = (match1_request['participantIdentities'])
To convert the values of the participantIdentities key to get the following data:
[{'player':
{'summonerName': 'Crescent Bladex',
'matchHistoryUri': '/v1/stats/player_history/NA1/226413119',
'summonerId': 63523774,
'profileIcon': 870},
'participantId': 1},
My goal here is to combine the summonerId and participantId into one list. Which would normally be easy, but the order of participantIdentities is randomized, so the player I want information on will sometimes be first on the list and other times third.
So I can't use var = list[0] the way I normally would.
I have access to summonerId, so I'm thinking I can search the list for the summonerId and then somehow collect all the information around it. For instance, if I knew 63523774, then I could find the key for it. From here, is it possible to find the parent list of the key?
Any guidance would be appreciated.
Edit (Clarification):
Here's the data I'm working with: http://pastebin.com/spHk8VP0
Line 1691 is where the nested dictionary 'participantIdentities' is. From there, there are 10 dictionaries. Each of these 10 dictionaries contains a nested 'player' dictionary and a 'participantId' key.
My goal is to search these 10 dictionaries for the one dictionary that has the summonerId. The summonerId is something I already know before I make this request to the server.
So I'm looking for some sort of "search" method that goes beyond true/false: one that, when a value is found within an object, gives back the entire dictionary (key: value) containing it.
Not sure if I properly understood you, but would this work?
for i in range(len(match1_request['participantIdentities'])):
    if match1_request['participantIdentities'][i]['player']['summonerId'] == 63523774:
        pass  # do whatever you want with it
i becomes the index you're searching for.
ds = match1_request['participantIdentities']
result_ = [d for d in ds if d["player"]["summonerId"] == 12345]
result = result_[0] if result_ else {}
See if it works for you.
You can use a dict comprehension to build a dict which uses summonerIds as keys:
players_list = response['participantIdentities']
{p['player']['summonerId']: p['participantId'] for p in players_list}
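For instance (binding the comprehension above to a name, which is an illustrative choice), the lookup then becomes plain dict access:

summoner_to_participant = {p['player']['summonerId']: p['participantId']
                           for p in players_list}
participant_id = summoner_to_participant[63523774]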
I think what you are asking for is: "How do I get the stats for a given a summoner?"
You'll need a mapping of participantId to summonerId.
For example, would it be helpful to know this?
summoner[1] = 63523774
summoner[2] = 44610089
...
If so, then:
# This is probably what you are asking for:
summoner = {ident['participantId']: ident['player']['summonerId']
for ident in match1_request['participantIdentities']}
# Then you can do this:
summoner_stats = {summoner[p['participantId']]: p['stats']
for p in match1_request['participants']}
# And to lookup a particular summoner's stats:
print summoner_stats[44610089]
(ref: raw data you pasted)
I have a database with a bunch of regular documents that look something like this (example from wiki):
{
"_id":"some_doc_id",
"_rev":"D1C946B7",
"Subject":"I like Plankton",
"Author":"Rusty",
"PostedDate":"2006-08-15T17:30:12-04:00",
"Tags":["plankton", "baseball", "decisions"],
"Body":"I decided today that I don't like baseball. I like plankton."
}
I'm working in Python with couchdb-python and I want to know if it's possible to add a field to each document. For example, if I wanted to have a "Location" field or something like that.
Thanks!
Regarding IDs
Every document in couchdb has an id, whether you set it or not. Once the document is stored you can access it through the doc._id field.
If you want to set your own ids you'll have to assign the id value to doc._id. If you don't set it, then couchdb will assign a uuid.
If you want to update a document, then you need to make sure you have the same id and a valid revision. If, say, you are working from a blog post and the user adds the Location, then the URL of the post may be a good id to use. You'd be able to instantly access the document in this case.
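For example, a minimal sketch with couchdb-python, assuming the post URL is used as the id (the server and database names here are illustrative):

import couchdb

couch = couchdb.Server()
db = couch['posts']

doc = {
    '_id': 'http://example.com/posts/i-like-plankton',  # your own id
    'Subject': 'I like Plankton',
    'Author': 'Rusty',
}
db.save(doc)

# Later, instant access by that id:
doc = db.get('http://example.com/posts/i-like-plankton')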
So what's a revision
In your code snippet above you have the doc._rev element. This is the identifier of the revision. If you save a document with an id that already exists, couchdb requires you to prove that you hold the latest revision, so that you are not unknowingly overwriting someone else's changes.
So how do I update a document
If you have the id of your document, you can just access each document by using the db.get(id) function. You can then update the document like this:
doc = db.get(id)
doc['Location'] = "On a couch"
db.save(doc)
I have an example where I store weather forecast data. I update the forecasts approximately every 2 hours. A separate process looks up data that I get from a different provider, based on characteristics of tweets on that day.
This looks something like this.
doc = db.get(id)
doc_with_loc = GetLocationInformationFromOtherProvider(doc) # takes about 40 seconds.
doc_with_loc["_rev"] = doc["_rev"]
db.save(doc_with_loc) # This will fail if the weather update has also updated the document.
If you have concurrent processes, then the _rev can become invalid, so you have to have a failsafe, e.g. this could do:
doc = db.get(id)
doc_with_loc = GetLocationInformationFromAltProvider(doc)
update_outstanding = True
while update_outstanding:
    doc = db.get(id)  # re-retrieve to pick up the current _rev
    doc_with_loc["_rev"] = doc["_rev"]
    try:
        db.save(doc_with_loc)
        update_outstanding = False
    except couchdb.http.ResourceConflict:
        pass  # another process updated the doc first; retry
So how do I get the Ids?
One option suggested above is that you actively set the id, so you can retrieve it. I.e. if a user sets a given location that is attached to a URL, use the URL. But you may not know which document you want to update, or you may even want a process that finds all the documents that don't have a location and assigns one.
You'll most likely be using a view for this. Views have a mapper and a reducer. You'll use the first one, forget about the last one. A view with a mapper does the following:
It returns a simplified/transformed way of looking at your data. You can emit multiple values per document or skip some. It gives the data you emit a key, and if you use the include_docs option it will give you the full document (with _id and _rev alongside).
The simplest view is the default view db.view('_all_docs'); this will return all documents, and you may not want to update all of them. Views, for example, are stored as documents as well when you define them.
The next simple way is to have a view that only returns items that are of the type of the document. I tend to have a _type="article" field in my database. Think of this as marking that a document belongs to a certain table, if you had stored them in a relational database.
Finally, you can filter on whether elements have a location, so you'd have a view that lets you iterate over all those docs that still need a location and fix this in a separate process. The best documentation on writing views can be found here.
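A minimal sketch of that last idea with couchdb-python, assuming a _type field as above (the map function is illustrative, and in production you would store it in a design document rather than run it as a temporary view):

# Map function: emit only the docs that still lack a Location.
map_fun = '''function(doc) {
    if (doc._type == "article" && !doc.Location) {
        emit(doc._id, null);
    }
}'''

for row in db.query(map_fun):  # temporary view query
    doc = db.get(row.id)
    doc['Location'] = find_location(doc)  # hypothetical helper
    db.save(doc)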
Basically, I'm trying to use the API offered by govtrack.us to pull information and store it into my own app's datastore for further manipulation (using Python). The API serves JSON by default, and when I get the JSON I start to get lost on how to add each element I want to the datastore. For example, this JSON has the following keys: {u'meta', u'objects'}. Each object has the following keys:
[u'gender_label', u'osid', u'id', u'pvsid', u'current_role', u'name_sortable', u'firstname', u'twitterid', u'middlename', u'lastname', u'bioguideid', u'birthday', u'link', u'youtubeid', u'nickname', u'name', u'roles', u'gender', u'namemod', u'metavidid', u'name_no_details', u'resource_uri']
I want to be able to take each object and store it into the datastore (but not all the information, just some, like u'current_role', u'youtubeid', u'name', etc.).
Right now, I've got this function that pulls the json:
import urllib2
from urllib2 import URLError

def get_congressman():
    url = 'http://www.govtrack.us/api/v1/person?roles__current=true&limit=3000'
    content = None
    try:
        content = urllib2.urlopen(url).read()
    except URLError:
        return
    if content:
        return content
And this to iterate over the returned json:
import json

current_congressman = get_congressman()
j = json.loads(current_congressman)
name = [c['name_no_details'] for c in j['objects']]
youtube = [c['youtubeid'] for c in j['objects']]
gender_list = [c['gender_label'] for c in j['objects']]
Instead of adding all the people's names, genders, YouTube feeds, etc. to separate lists, I would like to turn each object into its own list containing the information needed to add to the datastore. Basically, a list like:
["Gary Ackerman", "Male", "RepAckerman"]
But one for each object. What would be the best way to go about this? Or do I have to have a list of all names, another list of all genders, and so forth and then match them up?
First you need to define a model. See the NDB Overview for details.
Let's say your model is something like
class Congressman(ndb.Model):
    ...
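For instance, a minimal sketch (the property names are illustrative guesses based on the fields mentioned in the question):

from google.appengine.ext import ndb

class Congressman(ndb.Model):
    name = ndb.StringProperty()
    gender = ndb.StringProperty()
    youtubeid = ndb.StringProperty()
    current_role = ndb.StringProperty()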
Then, instead of computing one list of names and another for YouTube ids, you traverse all the objects once, creating a Congressman entity for each and storing it.
for congressman_info in j['objects']:
    congressman = Congressman(gender=congressman_info['gender_label'],
                              name=congressman_info['name_no_details'], ...)
    congressman.put()