How to add an entry to firebase using requests module?


I have a Firebase database that holds a list of words, so it has a structure like this:
words_to_add
--words
----0:'hello'
----1:'goodbye'
so basically it's {words: ['hello','goodbye']}.
I want to add to this dictionary so that it holds additional words.
I have tried requests.patch("Firebase_URL", data='{"words":"[new_list_of_words]"}') and requests.put(), but they overwrite the contents of words with a string of the list, which isn't what I want. How can I just add another entry to the list?

A PATCH request to the REST API of the Realtime Database takes each key in the dictionary you pass to it and writes the value from your request into the database for that key. Since you specify words as the key, all existing words will be replaced.
You can perform a deep update by providing a path as the key, for example data='{"words/2": "banana", "words/3": "pie"}'. Although I haven't tested this on arrays, it should add the elements at indexes 2 and 3 to the array.
The problem with this, of course, is that you'd have to know (and agree with any other users on) how many items there currently are in the array, leading to all kinds of possible problems. This is one of the many reasons that Firebase recommends against using arrays in its vintage blog post: Best Practices: Arrays in Firebase.
In this case, the better structure would be to use a map like this for the words in your database:
words: {
"hello": true,
"goodbye": true
}
The true value here has no real meaning; it is only needed because Firebase won't store a key without a value.
Now that we have a map of words, you can patch it with:
requests.patch("Firebase_URL", data='{"words/banana":true, "words/pie":true}')
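As a rough sketch of that last call, the payload can be built from a list of words and then sent with requests. The helper names build_word_patch and add_words and the base_url argument are made up for illustration. The sketch assumes the database REST endpoint ends in .json and that PATCH accepts path-style keys as described above.

```python
import json


def build_word_patch(words):
    """Map each new word to a "words/<word>": true entry for a PATCH body."""
    return {f"words/{word}": True for word in words}


def add_words(base_url, words):
    """Send the PATCH request; base_url is a placeholder for your database URL."""
    import requests  # third-party; imported here so build_word_patch works without it

    # The Realtime Database REST API expects the request path to end in ".json".
    url = f"{base_url.rstrip('/')}/.json"
    response = requests.patch(url, data=json.dumps(build_word_patch(words)))
    response.raise_for_status()
    return response.json()


payload = build_word_patch(["banana", "pie"])
print(json.dumps(payload))  # {"words/banana": true, "words/pie": true}
```

Because each word is its own key, two clients adding different words at the same time do not need to agree on an index.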

Related

Can you access a parent field from a child object in Django?

tl;dr: I want to express something like [child.child_field_value, child.parent_field_value] on a Django child model and get an iterable like ['Alsatian', 'Dog'] or similar.
Context: I'm trying to prepare a dict for a JSON API in Django, such that I have two models, Evaluation and its parent Charity.
In the view I filter for all Evaluations meeting certain parameters, and then use a dict comp nested in a list comp on evaluation.__dict__.items() to drop Django's '_state' field (this isn't the focus of this question, but please tell me if you know a better practice!):
response = {'evaluations': [
    {key: value for key, value in evaluation.__dict__.items()
     if key not in ['_state']}
    for evaluation in evaluations]}
But I want a good way to combine the fields charity_name and charity_abbreviation of each Evaluation's parent charity with the rest of that evaluation's fields. So far the best way I can find/think of is during the dict comp to conditionally check whether the field we're iterating through is charity_id and if so to look up that charity and return an array of the two fields.
But I haven't figured out how to do that, and it seems likely to end up with something very messy which isn't functionally ideal, since I'd rather that array was two key:value pairs in line with the rest of the dictionary.

Populate DB with JSON files, search values, return only matching. Or something different

First, I am relatively new at programming; Python is the only language I have any familiarity with using. Secondly, I put DB in the question because that's what seems right to me after searching around, but I am open to not using a DB at all if that's easier or more efficient.
What I Have to Work With
I have a folder with ~75,000 JSON files. They all have the same structure; here is an example of what they look like (more on that below):
{
    "id": 93480,
    "author": "",
    "joined by": [],
    "date_created": "2010-04-28T16:07:21Z",
    "date_modified": "2020-02-21T21:42:45.655644Z",
    "type": "010combined",
    "page_count": null,
    "plain_text": "",
    "html": "",
    "extracted_by_ocr": false,
    "cited": [
    ]
}
One way that the real files differ from the above is that either the "plain_text" or the "html" key will have an actual value, namely text (whether plaintext or HTML). The length of that text can vary from a couple of sentences to over 200 pages worth of text. Thus, the JSON files range in size from 907 bytes at the smallest to 2.1 MB.
What I'm Trying to Do
I want to be able, essentially, to search through all the files for a word or phrase contained in either the plain_text or HTML fields and, at a minimum, return a list of files containing that word or phrase. [Ideally, I'd do other things with them, as well, but I can figure that stuff out later. What I'm stumped on is where to begin.]
What I Can't Figure Out
Whether to even bother with a document-store DB like MongoDB (or PostgreSQL). If that's the appropriate way to handle this, I'm open to working my way through it. But I can't even tell if that's how I should attack the problem, or if I should instead just use a Python script to iterate over the files in the folder directly. Can you populate a DB with all the files in a folder, then search for a substring in each row? The fact that some of these files have a ton of text in one of the values makes it seem weird to me to use a DB at all, but again: I don't know what I'm doing.
I think I know how to iterate over the files directly with Python. I know how to open files, and I know how to get a list of keys from JSON files. But how do you search for a matching substring in two JSON values? And then, if the substring is found in one of them, how do you return the "id" field to a list, close the file, and move to the next one? I mean, obviously, the basic structure is a conditional. Here's the logical structure of what I'm thinking:
Variable = "substring I want to match"
List = [] # Will hold ids of files containing variable
Open file
Read file to the end
Search file [or just the two JSON keys?] for variable
If variable found append "id" to list
Close file
Move to the next one in the directory
It's the actual code part that I'm stumbling over.
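The logical structure above could be sketched with only the standard library, roughly as follows. The function name find_matching_ids and its parameters are invented for illustration, and the sketch assumes every file has the "id", "plain_text" and "html" keys shown in the example.

```python
import json
from pathlib import Path


def find_matching_ids(folder, phrase):
    """Return the "id" of every JSON file whose text fields contain phrase."""
    matches = []  # will hold ids of files containing the phrase
    for path in sorted(Path(folder).glob("*.json")):
        # The with-block closes each file before the loop moves to the next one.
        with path.open(encoding="utf-8") as handle:
            record = json.load(handle)
        # Either field may be empty or null, so fall back to an empty string.
        if any(phrase in (record.get(key) or "") for key in ("plain_text", "html")):
            matches.append(record["id"])
    return matches
```

Calling find_matching_ids("path/to/folder", "some phrase") returns a list of ids. The match is case-sensitive; lowercasing both the phrase and the text first would make it case-insensitive.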

Idea using pandas since I don't know about search engines, some copied from: How to read multiple json files into pandas dataframe?
import json
import pandas as pd

records = []  # one dict per JSON file
for file in file_list:
    with open(file, encoding="utf-8") as f:
        records.append(json.load(f))  # each file holds a single JSON object
temp = pd.DataFrame(records)  # one row per file
Creating it will take a while, but once that's done you can search and do operations quickly. E.g. if you want to find all ids where author is not empty:
id_list = temp.loc[temp['author'] != '', 'id'].tolist()
If the combined size of all your files is gigantic, you may want to consult the docs on storing things more efficiently (https://pandas.pydata.org/pandas-docs/stable/user_guide/scale.html) or use another method.

How do I not create a new document if the previous document in the collection contains a certain value?

I am using the python library to programmatically create documents in a collection as such:
user = client.query(q.create(q.collection("my_collection"), {
    "data": {
        "UTC_datetime": str(datetime.now(pytz.UTC)),
        "item_one": str(value_one),
        "item_two": str(value_two),
        "item_three": str(value_three)
    }
}))
Upon certain conditions being met the python app executes again.
If item_two on the next app execution has the same value again I do not want a new document to be created.
How do I craft the above query to perform this?
Currently, I am reading the previous document, extracting the value from item_two and performing an if/else statement to either proceed to store a new document or sys.exit().
I'm positive there is a more elegant solution that is based within Fauna's logic instead of Python's, however, I have not been able to achieve this.

You can create a unique index (https://docs.fauna.com/fauna/current/api/fql/indexes) for item_two to ensure that duplicates are not possible. You may also want an upsert implementation:
https://forums.fauna.com/t/multi-document-upsert/488/3
https://forums.fauna.com/t/does-fauna-supports-upserts/208
q.if_(
    q.exists(q.match(q.index('unique_item_two'), str(value_two))),
    q.update(...),
    q.create(...)
)

Python- Insert new values into 'nested' list?

What I'm trying to do isn't a huge problem in php, but I can't find much assistance for Python.
In simple terms, from a list which produces output as follows:
{"marketId":"1.130856098","totalAvailable":null,"isMarketDataDelayed":null,"lastMatchTime":null,"betDelay":0,"version":2576584033,"complete":true,"runnersVoidable":false,"totalMatched":null,"status":"OPEN","bspReconciled":false,"crossMatching":false,"inplay":false,"numberOfWinners":1,"numberOfRunners":10,"numberOfActiveRunners":8,"runners":[{"status":"ACTIVE","ex":{"tradedVolume":[],"availableToBack":[{"price":2.8,"size":34.16},{"price":2.76,"size":200},{"price":2.5,"size":237.85}],"availableToLay":[{"price":2.94,"size":6.03},{"price":2.96,"size":10.82},{"price":3,"size":33.45}]},"sp":{"nearPrice":null,"farPrice":null,"backStakeTaken":[],"layLiabilityTaken":[],"actualSP":null},"adjustmentFactor":null,"removalDate":null,"lastPriceTraded":null,"handicap":0,"totalMatched":null,"selectionId":12832765}...
All I want to do is add an extra field, containing the 'runner name' from the data set below, to each of the 'runners' sub-lists in the initial data set, matching on selection_id=selectionId.
So initially I iterate through the full dataset, and then create a separate list to get the runner name from the runner id (I should point out that runnerId===selectionId===selection_id; no idea why multiple names are used). This works fine, and the code is shown below:
for market_book in market_books:
    market_catalogues = trading.betting.list_market_catalogue(
        market_projection=["RUNNER_DESCRIPTION", "RUNNER_METADATA", "COMPETITION", "EVENT", "EVENT_TYPE", "MARKET_DESCRIPTION", "MARKET_START_TIME"],
        filter=betfairlightweight.filters.market_filter(
            market_ids=[market_book.market_id],
        ),
        max_results=100)
    data = []
    for market_catalogue in market_catalogues:
        for runner in market_catalogue.runners:
            data.append(
                (runner.selection_id, runner.runner_name)
            )
So as you can see I have the data in data[], but what I need to do is add it to the initial data set, based on the selection_id.
I'm more comfortable with PHP or JavaScript, so apologies if this seems a bit simplistic, but the code snippets I've found online only seem to assist with very simple Python lists and nothing 'nested' (to me the structure seems similar to a nested array).
As per the request below, here is the full list:
{"marketId":"1.130856098","totalAvailable":null,"isMarketDataDelayed":null,"lastMatchTime":null,"betDelay":0,"version":2576584033,"complete":true,"runnersVoidable":false,"totalMatched":null,"status":"OPEN","bspReconciled":false,"crossMatching":false,"inplay":false,"numberOfWinners":1,"numberOfRunners":10,"numberOfActiveRunners":8,"runners":[{"status":"ACTIVE","ex":{"tradedVolume":[],"availableToBack":[{"price":2.8,"size":34.16},{"price":2.76,"size":200},{"price":2.5,"size":237.85}],"availableToLay":[{"price":2.94,"size":6.03},{"price":2.96,"size":10.82},{"price":3,"size":33.45}]},"sp":{"nearPrice":null,"farPrice":null,"backStakeTaken":[],"layLiabilityTaken":[],"actualSP":null},"adjustmentFactor":null,"removalDate":null,"lastPriceTraded":null,"handicap":0,"totalMatched":null,"selectionId":12832765},{"status":"ACTIVE","ex":{"tradedVolume":[],"availableToBack":[{"price":20,"size":3},{"price":19.5,"size":26.36},{"price":19,"size":2}],"availableToLay":[{"price":21,"size":13},{"price":22,"size":2},{"price":23,"size":2}]},"sp":{"nearPrice":null,"farPrice":null,"backStakeTaken":[],"layLiabilityTaken":[],"actualSP":null},"adjustmentFactor":null,"removalDate":null,"lastPriceTraded":null,"handicap":0,"totalMatched":null,"selectionId":12832767},{"status":"ACTIVE","ex":{"tradedVolume":[],"availableToBack":[{"price":11,"size":9.75},{"price":10.5,"size":3},{"price":10,"size":28.18}],"availableToLay":[{"price":11.5,"size":12},{"price":13.5,"size":2},{"price":14,"size":7.75}]},"sp":{"nearPrice":null,"farPrice":null,"backStakeTaken":[],"layLiabilityTaken":[],"actualSP":null},"adjustmentFactor":null,"removalDate":null,"lastPriceTraded":null,"handicap":0,"totalMatched":null,"selectionId":12832766},{"status":"ACTIVE","ex":{"tradedVolume":[],"availableToBack":[{"price":48,"size":2},{"price":46,"size":5},{"price":42,"size":5}],"availableToLay":[{"price":60,"size":7},{"price":70,"size":5},{"price":75,"size":10}]},"sp":{"nearPrice":null,"farPrice":null,"backStakeTaken":[],"layLiabilityTaken
":[],"actualSP":null},"adjustmentFactor":null,"removalDate":null,"lastPriceTraded":null,"handicap":0,"totalMatched":null,"selectionId":12832769},{"status":"ACTIVE","ex":{"tradedVolume":[],"availableToBack":[{"price":18.5,"size":28.94},{"price":18,"size":5},{"price":17.5,"size":3}],"availableToLay":[{"price":21,"size":20},{"price":23,"size":2},{"price":24,"size":2}]},"sp":{"nearPrice":null,"farPrice":null,"backStakeTaken":[],"layLiabilityTaken":[],"actualSP":null},"adjustmentFactor":null,"removalDate":null,"lastPriceTraded":null,"handicap":0,"totalMatched":null,"selectionId":12832768},{"status":"ACTIVE","ex":{"tradedVolume":[],"availableToBack":[{"price":4.3,"size":9},{"price":4.2,"size":257.98},{"price":4.1,"size":51.1}],"availableToLay":[{"price":4.4,"size":20.97},{"price":4.5,"size":30},{"price":4.6,"size":16}]},"sp":{"nearPrice":null,"farPrice":null,"backStakeTaken":[],"layLiabilityTaken":[],"actualSP":null},"adjustmentFactor":null,"removalDate":null,"lastPriceTraded":null,"handicap":0,"totalMatched":null,"selectionId":12832771},{"status":"ACTIVE","ex":{"tradedVolume":[],"availableToBack":[{"price":24,"size":6.75},{"price":23,"size":2},{"price":22,"size":2}],"availableToLay":[{"price":26,"size":2},{"price":27,"size":2},{"price":28,"size":2}]},"sp":{"nearPrice":null,"farPrice":null,"backStakeTaken":[],"layLiabilityTaken":[],"actualSP":null},"adjustmentFactor":null,"removalDate":null,"lastPriceTraded":null,"handicap":0,"totalMatched":null,"selectionId":12832770},{"status":"ACTIVE","ex":{"tradedVolume":[],"availableToBack":[{"price":5.7,"size":149.33},{"price":5.5,"size":29.41},{"price":5.4,"size":5}],"availableToLay":[{"price":6,"size":85},{"price":6.6,"size":5},{"price":6.8,"size":5}]},"sp":{"nearPrice":null,"farPrice":null,"backStakeTaken":[],"layLiabilityTaken":[],"actualSP":null},"adjustmentFactor":null,"removalDate":null,"lastPriceTraded":null,"handicap":0,"totalMatched":null,"selectionId":10064909}],"publishTime":1551612312125,"priceLadderDefinition":{"type":
"CLASSIC"},"keyLineDescription":null,"marketDefinition":{"bspMarket":false,"turnInPlayEnabled":false,"persistenceEnabled":false,"marketBaseRate":5,"eventId":"28180290","eventTypeId":"2378961","numberOfWinners":1,"bettingType":"ODDS","marketType":"NONSPORT","marketTime":"2019-03-29T00:00:00.000Z","suspendTime":"2019-03-29T00:00:00.000Z","bspReconciled":false,"complete":true,"inPlay":false,"crossMatching":false,"runnersVoidable":false,"numberOfActiveRunners":8,"betDelay":0,"status":"OPEN","runners":[{"status":"ACTIVE","sortPriority":1,"id":10064909},{"status":"ACTIVE","sortPriority":2,"id":12832765},{"status":"ACTIVE","sortPriority":3,"id":12832766},{"status":"ACTIVE","sortPriority":4,"id":12832767},{"status":"ACTIVE","sortPriority":5,"id":12832768},{"status":"ACTIVE","sortPriority":6,"id":12832770},{"status":"ACTIVE","sortPriority":7,"id":12832769},{"status":"ACTIVE","sortPriority":8,"id":12832771},{"status":"LOSER","sortPriority":9,"id":10317013},{"status":"LOSER","sortPriority":10,"id":10317010}],"regulators":["MR_INT"],"countryCode":"GB","discountAllowed":true,"timezone":"Europe\/London","openDate":"2019-03-29T00:00:00.000Z","version":2576584033,"priceLadderDefinition":{"type":"CLASSIC"}}}

I think I understand what you are trying to do now.
First, load your data as a Python object (you gave us a JSON object):
import json
my_data = json.loads(my_json_string)
for item in my_data['runners']:
    item['selectionId'] = [item['selectionId'], my_name_here]
The thing is that my_data['runners'][i]['selectionId'] holds a single value (the numeric id), so unless you want to concatenate the name and the id together, you should turn it into a list or even a dictionary.
Each item is a dictionary, so you can also always add a new key to it:
item['new_key'] = my_value
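Putting the two pieces together, a minimal sketch of adding the name as a new key could look like this. The JSON is a trimmed-down stand-in for the market book in the question, the runner names are invented, and "runnerName" is just an illustrative key name.

```python
import json

# Trimmed-down stand-in for the market book JSON from the question.
market_json = (
    '{"marketId": "1.130856098", '
    '"runners": [{"selectionId": 12832765}, {"selectionId": 12832767}]}'
)
# (selection_id, runner_name) pairs as collected in `data`; the names are invented.
data = [(12832765, "Runner A"), (12832767, "Runner B")]

market = json.loads(market_json)
names_by_id = dict(data)  # lets us look up a name by selection id

for runner in market["runners"]:
    # .get() stores None when an id has no matching name.
    runner["runnerName"] = names_by_id.get(runner["selectionId"])

print(json.dumps(market["runners"]))
```

Turning the list of pairs into a dictionary avoids looping over every pair for every runner.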

So, essentially this works...with one exception...I can see from the print(...) in the loop that the attribute is updated, however what I can't seem to do is then see this update outside the loop.
mkt_runners = []
for market_catalogue in market_catalogues:
    for r in market_catalogue.runners:
        mkt_runners.append((r.selection_id, r.runner_name))
for market_book in market_books:
    for runner in market_book.runners:
        for x in mkt_runners:
            if runner.selection_id in x:
                setattr(runner, 'x', x[1])
                print(market_book.market_id, runner.x, runner.selection_id)
    print(market_book.json())
So the print(market_book.market_id, ...) call displays as expected, but when I print the whole list it shows the un-updated version. I can't seem to find an obvious solution, which is odd, as it seems like a really simple thing. (I tried messing around with indents in case that was the problem, but it doesn't seem to be; it's as if the market_book list isn't refreshed after the runners sub-list is updated.)

How to insert clean data into Firebase via REST API and Python

I have a simple Firebase that I mostly interact with via Javascript, which works really well. However, I also have a Python program that needs to get data from existing children and put/update data on existing children. I tried python-firebasin, which would do what I want, but it is unreliable (hangs, fails, etc.).
So I'm looking at the python-firebase REST wrapper. This seems efficient, and works well. However, every time I try to post() data, I get not just the data I'm posting, but some kind of unique string paired with it, all inserted as a child.
For example, via Javascript, I might say:
db = new Firebase('https://myfirebase.firebaseio.com/testval/');
db.transaction(function(current) { return 1; });
This would then give me a Firebase that looked like:
|---testval: 1
But when I try to do something similar with the Python Firebase REST wrapper, such as:
db = firebase.FirebaseApplication('https://myfirebase.firebaseio.com/')
db.post('/testval/',1)
My Firebase looks something like this:
|---testval:
|---JI4BiBbICSEAnM9mDXf: 1
In other words, it inserts a new child, gives it a new string, and then appends the data. Is there any way to insert/modify data on my Firebase using the REST wrapper that would do it cleanly like I'm doing with Javascript? Without adding children, without adding these unique strings?

Try this instead:
db.put(1)
db.post() is the equivalent of .push() in the JavaScript API, so it creates a unique ID for you. db.put() is equivalent to .set() and will just set the data, which appears to be what you want.
Note that there is no equivalent for transactions in the REST API, but your example was just using a transaction to do a .set() so hopefully you don't actually need them.
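For comparison, here is a rough sketch of the same two operations made directly against the REST API with requests instead of the wrapper. The helper names firebase_url, set_value and push_value are made up for illustration, and the sketch assumes an endpoint that needs no authentication.

```python
import json


def firebase_url(base_url, path):
    """Build the REST endpoint for a database path (it must end in .json)."""
    return f"{base_url.rstrip('/')}/{path.strip('/')}.json"


def set_value(base_url, path, value):
    """PUT overwrites the value at path, like .set() in the JavaScript API."""
    import requests  # third-party; imported lazily so firebase_url works without it

    return requests.put(firebase_url(base_url, path), data=json.dumps(value))


def push_value(base_url, path, value):
    """POST adds a child with a generated key under path, like .push()."""
    import requests  # third-party

    return requests.post(firebase_url(base_url, path), data=json.dumps(value))


print(firebase_url("https://myfirebase.firebaseio.com/", "/testval/"))
# https://myfirebase.firebaseio.com/testval.json
```

With these helpers, set_value(base, "testval", 1) would leave testval: 1 in the database, while push_value(base, "testval", 1) would produce the generated child key seen in the question.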

Try this:
db = firebase.FirebaseApplication('https://myfirebase.firebaseio.com/')
db.put('', 'testval', 1)
put takes three arguments: the first is the URL or path, the second is the key name (or snapshot name), and the third is the data (JSON).
