I was adding some data to Firebase with Python and I want to use the MD5 strings I generated as the unique key for each record. The auto-generated key in Firebase looks like "-KgMvzKKxVgj4RKN-3x5". Is it possible to replace it with my own value from Python? I know how to do it with JavaScript, but not with Python. Please help... Thanks in advance!
f = firebase.FirebaseApplication('https://xxxxx.firebaseio.com')
f.post('meeting/', {
    "MD5index": MD5String,
    "title": title,
    "date": date,
    "time": time,
    "location": location
})
It sure is. Just use put instead of post:
f = firebase.FirebaseApplication('https://xxxxx.firebaseio.com')
f.put('meeting/', 'mymeetingkey', {
    "MD5index": MD5String,
    "title": title,
    "date": date,
    "time": time,
    "location": location
})
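Since the goal is to key each record by its MD5 hash, you can pass the hash directly as the name argument of put. A minimal sketch, assuming the python-firebase package and that the choice of source string for the hash is up to you:

import hashlib

# Derive the key from the record's title (hypothetical choice of source string).
MD5String = hashlib.md5(title.encode('utf-8')).hexdigest()

# put(url, name, data) stores the record under meeting/<MD5String>
# instead of an auto-generated push key like "-KgMvzKKxVgj4RKN-3x5".
f.put('meeting/', MD5String, {
    "title": title,
    "date": date,
    "time": time,
    "location": location
})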
I am new to any kind of programming. This is an issue I encountered while using MongoDB. Below is the structure of the documents I imported from two different CSV files.
{
"_id": {
"$oid": "61bc4217ed94f9d5fe6a350c"
},
"Telephone Number": "8429950810",
"Date of Birth": "01/01/1945"
}
{
"_id": {
"$oid": "61bc4217ed94f9d5fe6a350c"
},
"Telephone Number": "8129437810",
"Date of Birth": "01/01/1998"
}
{
"_id": {
"$oid": "61bd98d36cc90a9109ab253c"
},
"TELEPHONE_NUMBER": "9767022829",
"DATE_OF_BIRTH": "16-Jun-98"
}
{
"_id": {
"$oid": "61bd98d36cc9090109ab253c"
},
"TELEPHONE_NUMBER": "9567085829",
"DATE_OF_BIRTH": "16-Jan-91"
}
The first two entries are from one CSV and the next two are from another CSV file. Now I am creating a user interface where users can search for a telephone number. How do I write a query that searches for the telephone number value under both keys (Telephone Number and TELEPHONE_NUMBER) using find()? If that is not possible, is there a way to change the field names to a desired format while importing the CSV into the database? Or could I create two different collections, import each CSV into its own collection, and then perform a combined search across both? Or can we create a compound index and then search the compound index instead? I am using pymongo for all the operations.
Thank you.
You can use an $or query if different keys are used to store the same type of data:
yourmongocoll.find({"$or": [{"Telephone Number": "8429950810"}, {"TELEPHONE_NUMBER": "8429950810"}]})
Assuming you have your connection string to connect via pymongo, the following is an example of how to query for the telephone number "8429950810":
from pymongo import MongoClient
client = MongoClient("connection_string")
db = client["db"]
collection = db["collection"]
results = collection.find({"Telephone Number":"8429950810"})
Please note this will return a Cursor; if you would like your documents in a list, consider wrapping the query in list() like so:
results = list(collection.find({"Telephone Number":"8429950810"}))
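If you would rather normalise the field names once instead of querying both spellings every time, a one-off $rename update is another option. A minimal sketch, reusing the connection above and assuming you want the space-separated spelling everywhere:

# Rename the upper-case variants so every document uses the same keys;
# documents that do not contain these fields are left untouched.
collection.update_many(
    {"TELEPHONE_NUMBER": {"$exists": True}},
    {"$rename": {"TELEPHONE_NUMBER": "Telephone Number",
                 "DATE_OF_BIRTH": "Date of Birth"}}
)

# After that, a single find() covers documents from both CSV imports.
results = list(collection.find({"Telephone Number": "9767022829"}))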
I'm working on a REST application in Python Flask with a driver called pymongo. But if someone knows MongoDB well, he/she may be able to answer my question.
Suppose I'm inserting a new document into a collection, say students. I want to get the whole inserted document back as soon as it is saved in the collection. Here is what I've tried so far.
res = db.students.insert_one({
    "name": args["name"],
    "surname": args["surname"],
    "student_number": args["student_number"],
    "course": args["course"],
    "mark": args["mark"]
})
If I call:
print(res.inserted_id)  # I get the id
How can I get something like:
{
"name": "student1",
"surname": "surname1",
"mark": 78,
"course": "ML",
"student_number": 2
}
from the res object? Because if I print res I am getting <pymongo.results.InsertOneResult object at 0x00000203F96DCA80>.
Put the data to be inserted into a dictionary variable; on insert, the variable will have the _id added by pymongo.
from pymongo import MongoClient
db = MongoClient()['mydatabase']
doc = {
"name": "name"
}
db.students.insert_one(doc)
print(doc)
prints:
{'name': 'name', '_id': ObjectId('60ce419c205a661d9f80ba23')}
Unfortunately, the commenters are correct. The PyMongo pattern doesn't specifically allow for what you are asking. You are expected to just use the inserted_id from the result and, if you need the full object from the collection later, do a regular query operation afterwards.
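A minimal sketch of that follow-up query, reusing the res object from the question:

# Fetch the complete document back using the id returned by insert_one().
inserted_doc = db.students.find_one({"_id": res.inserted_id})
print(inserted_doc)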
I'll try to explain my goal. I have to write reports based on a document sent to me that has common strings in it. For example, the document sent to me contains data like:
"reportId": 84561234,
"dateReceived": "2020-01-19T17:54:31.000+0000",
"reportingEsp": {
"firstName": "Google",
"lastName": "Reviewer",
"addresses": {
"address": [
{
"street1": "1600 Ampitheater Parkway",
"street2": null,
"city": "Mountainview",
"postalCode": "94043",
"state": "CA",
"nonUsaState": null,
"country": "US",
"type": "BUSINESS"
This is an example of the 'raw' data. It is also presented in a PDF. I have tried scraping the PDF using tabula, but there seems to be some issue with fonts, so I only get about 10% of the text. I am thinking that going after the raw data will be more accurate/easier (if you think scraping the PDF would be easier, please let me know).
So I used this code:
with open('filetobesearched.txt', 'r') as searchfile:
    for line in searchfile:
        if 'reportId' in line:
            print(line)
        if 'dateReceived' in line:
            print(line)
        if 'firstName' in line:
            print(line)
And this is where the trouble starts... there are multiple occurrences of the string 'firstName' in the file, so my code as it exists prints each of those one after the other. In the raw file those fields exist in different sections, each preceded by a section header like 'reportingEsp' in the example above. I'd like my code to somehow know that one 'firstName' belongs to a given section and that the next occurrence belongs to another section, to be printed with it... (make sense?)
Eventually I'd like to parse out the address information but omit any fields with a null.
And ULTIMATELY I'd like the data outputted into a file I could then in turn import into my report template and fill those fields as applicable. Which seems like a huge thing to me... so I'll be happy with help simply parsing through the raw data and outputting the results to a file in the proper order.
Thanks in advance for any help!
Thanks, yes TIL - it's JSON data. Here's how I accomplished my goal.
My code:
import json

# Read and parse the raw report file
with open('file.json', 'r') as myjsonfile:
    obj = json.load(myjsonfile)

# Parse through the json data to populate report variables
rptid = str(obj['reportId'])
dateReceived = str(obj['dateReceived'])
print('Report ID: ', rptid)
print('Date Received: ', dateReceived)
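The nested sections mentioned in the question can be reached the same way by indexing into the parsed dict, and null fields can be dropped before output. A small sketch, assuming the structure matches the sample data above:

# Fields inside the 'reportingEsp' section are nested one level down.
reporter = obj['reportingEsp']
print('Reporter: ', reporter['firstName'], reporter['lastName'])

# Addresses live under reportingEsp -> addresses -> address (a list);
# skip any keys whose value is null (None in Python) before output.
for addr in reporter['addresses']['address']:
    cleaned = {k: v for k, v in addr.items() if v is not None}
    print(cleaned)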
So now that I have those as variables, I am trying to use them to fill a docx template... but that's another question, I think.
Consider this one answered. Thanks again!
Is it possible to have field names that do not conform to Python variable naming rules? To elaborate, is it possible to have the field name as "Job Title" instead of "job_title" in the export file? While it may not be useful in JSON or XML exports, such functionality might be useful when exporting in CSV format, for instance if I need to import this data into another system that is already configured to accept CSVs with certain field names.
I tried reading the Item Pipelines documentation, but it appears to cover what happens after "an item has been scraped by a spider", not the field names themselves (could be totally wrong though).
Any help in this direction would be really helpful!
I would suggest using a third-party lib called scrapy-jsonschema. With that you can define your Items like this:
from scrapy_jsonschema.item import JsonSchemaItem

class MyItem(JsonSchemaItem):
    jsonschema = {
        "$schema": "http://json-schema.org/draft-04/schema#",
        "title": "MyItem",
        "description": "My Item with spaces",
        "type": "object",
        "properties": {
            "id": {
                "description": "The unique identifier for the employee",
                "type": "integer"
            },
            "name": {
                "description": "Name of the employee",
                "type": "string"
            },
            "job title": {
                "description": "The title of employee's job.",
                "type": "string"
            }
        },
        "required": ["id", "name", "job title"]
    }
And populate it like this:
item = MyItem()
item['job title'] = 'Boss'
You can read more about it here.
This solution addresses the Item definition as you asked, but you can achieve similar results without defining an Item. For example, you could just scrape the data into a dict and yield it back to Scrapy:
yield {
    "id": response.xpath('...').get(),
    "name": response.xpath('...').get(),
    "job title": response.xpath('...').get(),
}
With scrapy crawl myspider -o file.csv, that would scrape into a CSV and the columns will have the names you chose.
You could also have the spider or its pipeline write directly to a CSV, etc. There are several ways to do it without an Item definition.
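A minimal spider sketch along those lines, with a placeholder URL and selectors, just to show that dict keys containing spaces flow straight through to the CSV header:

import scrapy

class EmployeeSpider(scrapy.Spider):
    name = "myspider"
    start_urls = ["https://example.com/employees"]  # placeholder URL

    def parse(self, response):
        for row in response.css("tr.employee"):  # placeholder selectors
            # The dict keys become the CSV column names as-is, spaces included,
            # when exported with: scrapy crawl myspider -o file.csv
            yield {
                "id": row.css("td.id::text").get(),
                "name": row.css("td.name::text").get(),
                "job title": row.css("td.title::text").get(),
            }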
I want to update a field in every document of my MongoDB collection, using the field's own current value to compute the new one.
Example: if I have this document: "string": "foo", a possible update would do this: "string": $string.lower(). Here, $string would be "foo", but I don't know how to do this with PyMongo.
I've tried this:
user_collection.update_many({}, { "$set": { "word": my_func("$word")}})
This just replaces every value with the literal string "$word".
I've been able to do it successfully by iterating over each document, but it takes too long.
As far as I know, you can't combine find and update with an arbitrary Python function in one statement. You can either use the Mongo query language with an aggregation-pipeline update (MongoDB 4.2+), so that $set can reference the field's current value:
user_collection.update_many({}, [{"$set": {"name": {"$concat": ["$name", "_2"]}}}])
or use separate functions of pymongo:
for obj in user_collection.find({}):  # put your own filter here
    user_collection.update_one({"_id": obj['_id']}, {"$set": {"name": my_func(obj['name'])}})
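For the lower-casing example from the question specifically, there is a server-side operator, so no Python function is needed at all. A sketch, assuming MongoDB 4.2+ and the "string" field from the question:

# $toLower runs on the server, so every document is rewritten in one round trip.
user_collection.update_many(
    {},
    [{"$set": {"string": {"$toLower": "$string"}}}]
)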