I'm connecting to my MongoDB database using PyMongo:
mongo = MongoClient('localhost', 27017)
mongo_db = mongo['test']
mongo_coll = mongo_db['test']  # the tweets collection
I have a cursor and am looping through every record:
cursor = mongo_coll.find()
for record in cursor:  # for all the tweets in the database
    try:
        msgurl = record["entities"]["urls"]  # look for URLs in the tweets
    except:
        continue
The reason for the try/except is that if ["entities"]["urls"] does not exist, the lookup raises a KeyError.
How can I determine whether ["entities"]["urls"] exists?
Record is a dictionary in which the key "entities" links to another dictionary, so just check to see if "urls" is in that dictionary.
if "urls" in record["entities"]:
If you just want to proceed in any case, you can also use get.
msgurl = record["entities"].get("urls")
This will cause msgurl to equal None if there is no such key.
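A small illustrative sketch of both approaches (the record dicts here are made up, shaped like the tweet documents above):

```python
# a record with URL entities, and one without
record_with = {"entities": {"urls": ["http://example.com"]}}
record_without = {"entities": {"hashtags": []}}

# membership test, as above
print("urls" in record_with["entities"])     # True
print("urls" in record_without["entities"])  # False

# get() returns None when the key is missing
print(record_with["entities"].get("urls"))     # ['http://example.com']
print(record_without["entities"].get("urls"))  # None

# chaining get() with a {} default also guards against a missing "entities" key
msgurl = record_without.get("entities", {}).get("urls")
print(msgurl)  # None
```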
I'm not familiar with pymongo, but why don't you change your query so it only returns results that contain "urls"? Something like:
mongo_coll.find({"entities.urls": {"$exists": 1}})
(Note that in Python the $exists operator must be written as a quoted string.)
http://docs.mongodb.org/manual/reference/operator/exists/
Related
I'm pretty new to SQL, but I need it for a school project. I'm trying to make a Python web app that requires accounts. I'm able to put data into my SQL database, but now I need some way to verify whether an e-mail address (entered via an HTML form) already exists in the database. Probably the easiest query ever, but I don't have a clue how to get started. :(
I'm sorry if this is a duplicate question but I can't find anything out there that does what I need.
If you are using Flask with SQLAlchemy in your project:
@app.route("/check_email")
def check_email():
    # get the email from the form data
    email = request.form.get("email")
    # check whether someone has already registered with this email
    user = Users.query.filter_by(email=email).first()
    if not user:
        # the email doesn't exist
        pass
    else:
        # the email exists
        pass
Users.query.filter_by(email=email).first() is roughly equivalent to the SQL:
SELECT * FROM users WHERE email = 'EMAIL_FROM_FORM_DATA' LIMIT 1
If you are using pymysql (or something similar):
import pymysql

@app.route("/check_email")
def check_email():
    # get the email from the form data
    email = request.form.get("email")
    conn = pymysql.connect(host='localhost', port=3306, user='', password='', database='essentials')
    cs1 = conn.cursor()
    params = [email]
    # execute() returns the number of affected rows
    count = cs1.execute('SELECT * FROM users WHERE email = %s', params)  # parameterized to prevent SQL injection
    if count == 0:
        # the email doesn't exist
        pass
    else:
        # the email exists
        # and if you want to fetch the user's info:
        user_info = cs1.fetchall()  # user_info will be a tuple of rows
    # close the connection
    cs1.close()
    conn.close()
I was able to solve my issue by simply using
INSERT IGNORE and then checking, via the primary key, whether the insert was ignored.
Thank you for everyone that helped out though!
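A runnable sketch of that approach, using sqlite3 and its INSERT OR IGNORE variant (MySQL's INSERT IGNORE behaves analogously; the users table here is invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE users (email TEXT PRIMARY KEY)")

# first insert succeeds
cur.execute("INSERT OR IGNORE INTO users (email) VALUES (?)", ("a@example.com",))
print(cur.rowcount)  # 1: the row was inserted

# duplicate primary key is silently ignored
cur.execute("INSERT OR IGNORE INTO users (email) VALUES (?)", ("a@example.com",))
print(cur.rowcount)  # 0: the insert was ignored, so the email already existed

conn.close()
```

Checking cursor.rowcount after the insert tells you whether the row was actually written, which is exactly the "was it ignored?" test.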
I would like the data to be inserted into mycollection, but it literally inserts into a collection called 'collection' when I use db.collection before insert_one.
client = MongoClient()
db = client['mydb']
collection = db['mycollection']
db.collection.insert_one({"id": "hello"})
I didn't realize I had to remove the db part: db.collection uses attribute access, so PyMongo looks up a collection literally named 'collection' instead of using my variable. This worked:
collection = db['mycollection']
collection.insert_one({"id": "hello"})
I'm using pypyodbc with SQL Server 2016.
I am trying to insert and grab the last id inserted into database following another user's remarks but the returned value seems to be encrypted. Is there a way to decrypt it?
def executeSQL(command):
    connection = pypyodbc.connect('Driver={SQL Native Client};'
                                  r'Server=blah\blah;'
                                  'Database=Impact;'
                                  'uid=Admin;pwd=F$sfgdfgs99')
    cursor = connection.cursor()
    cursor.execute(command)
    id = cursor.execute("SELECT @@IDENTITY")
    connection.commit()
    connection.close()
    return id
sqlexecute = 'INSERT INTO PERSONS([LastName], [FirstName]) VALUES(\''+lastname.encode('utf-8')+'\',\''+firstname.encode('utf-8')+'\');\n'
lastid = executeSQL(sqlexecute)
print lastid
Output:
<pypyodbc.Cursor instance at 0x000000000B870C88>
It is not encrypted, it is telling you the type of the object that this is an instance of. In this case, it is pypyodbc.Cursor.
To fetch the actual rows, call fetchone() or fetchall() on the cursor; the identity value is the first field of the first row. Note that your function closes the connection before returning, so do the fetch inside executeSQL (before connection.close()) and return the fetched value instead of the cursor.
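The same pattern holds for any DB-API driver. Here is a runnable analogue using sqlite3 (its last_insert_rowid() standing in for SQL Server's @@IDENTITY; the persons table is invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE persons (id INTEGER PRIMARY KEY, lastname TEXT)")
cur.execute("INSERT INTO persons (lastname) VALUES (?)", ("Smith",))

# execute() returns the cursor itself, not the data
result = cur.execute("SELECT last_insert_rowid()")
print(result)  # a sqlite3.Cursor object, not the id

# fetch the actual row, then index into it to get the value
last_id = result.fetchone()[0]
print(last_id)  # 1

conn.close()
```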
I am using AppEngine with the Python runtime environment to host a dashboard for my team. The data for the dashboard is stored in Memcache and/or Cloud Datastore. New data is pulled into the application using the BigQuery API.
class ExampleForStackOverflow(webapp2.RequestHandler):
    def get(self):
        credentials = GoogleCredentials.get_application_default()
        bigquery_service = build('bigquery', 'v2', credentials=credentials)
        query = """SELECT field1, field2
                   FROM [table_name];"""
        try:
            timeout = 10000
            num_retries = 5
            query_request = bigquery_service.jobs()
            query_data = {
                'query': query,
                'timeoutMs': timeout,
            }
            query_response = query_request.query(
                projectId='project_name',
                body=query_data).execute(num_retries=num_retries)
            # Insert query response into the datastore
            for row in query_response['rows']:
                parent_key = ndb.Key(MyModel, 'default')
                item = MyModel(parent=parent_key)
                item.field1 = row['f'][0]['v']
                item.field2 = row['f'][1]['v']
                item.put()
        except HttpError as err:
            print('Error: {}'.format(err.content))
            raise err
These queries will return an indeterminate number of records. I want the dashboard to display the results of the queries regardless of the number of records so using order() by created and then using fetch() to pull a certain number of records won't help.
Is it possible to write a query to return everything from the last put() operation?
So far I have tried returning all records written within a certain time window (e.g. "How to query all entries from past 6 hours (datetime) in GQL?").
That isn't working for me in a reliable way because every so often the cron job that queries for the new data will fail so I'm left with a blank graph until the cron job runs the following day.
I need a resilient query that will always return data. Thanks in advance.
You could add a DateTimeProperty to MyModel, let's call it last_put, with the auto_now option set to True. The datetime of the most recent update of each entity would then be captured in its last_put property.
In your get() method you'd start with an ancestor query on the MyModel entities, sorted by last_put and fetching only one item - it will be the most recently updated one.
The last_put property value of that MyModel entity will give the datetime of the last put() you're seeking. Which you can then use in your bigquery query, as mentioned in the post you referenced, to get the entities since that datetime.
Dan's answer led me down the right path but I used a variation of what he suggested (mostly because I don't have a good understanding of ancestor queries). I know this isn't the most efficient way to do this but it'll work for now. Thanks, Dan!
My model:
class MyModel(ndb.Model):
    field1 = ndb.StringProperty(indexed=True)
    field2 = ndb.StringProperty(indexed=True)
    # auto_now_add stamps each entity at creation time;
    # default=datetime.datetime.now() would be evaluated only once, at import time
    created = ndb.DateTimeProperty(auto_now_add=True)
My query:
query = MyModel.query().order(-MyModel.created)
latest = query.fetch(1, projection=[MyModel.created])
for a in latest:
    time_created = a.created
query = MyModel.query()
query = query.filter(MyModel.created == time_created)
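The same two-step pattern (find the most recent timestamp, then return every row written at that timestamp) can be sketched in plain SQL with sqlite3; the events table and its columns here are invented for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE events (field1 TEXT, created TEXT)")
cur.executemany("INSERT INTO events VALUES (?, ?)",
                [("a", "2024-01-01"), ("b", "2024-01-02"), ("c", "2024-01-02")])

# step 1: find the most recent timestamp
cur.execute("SELECT MAX(created) FROM events")
latest = cur.fetchone()[0]

# step 2: return everything written at that timestamp
cur.execute("SELECT field1 FROM events WHERE created = ?", (latest,))
rows = [r[0] for r in cur.fetchall()]
print(rows)  # ['b', 'c']

conn.close()
```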
I am pretty sure my code is logical and makes sense, but when I run it against my test the result comes back quite unusual. I am trying to find a session from a cookie and then use it to retrieve the user from the sessions table. I am using the Bottle framework to test my program.
active_session = bottle.request.get_cookie(COOKIE_NAME)
cursor = db.cursor()
if active_session:
    cursor.execute("SELECT usernick FROM sessions WHERE sessionid=?", (active_session,))
    active_user = cursor.fetchone()
    return active_user
else:
    return None
The result is as follows
self.assertEqual(nick_from_cookie, nick)
AssertionError: ('Bobalooba',) != 'Bobalooba'
I know I am so close. Could someone point me in the right direction?
if active_session:
    cursor.execute("SELECT usernick FROM sessions WHERE sessionid=?", (active_session,))
    active_user = cursor.fetchone()
    return active_user[0] if active_user else active_user
It's because cursor.fetchone() returns the entire row as a tuple, and you are comparing that tuple to a single string. You need to grab the single field contained in the retrieved row:
return active_user[0]
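A quick runnable illustration of why the comparison fails, using an in-memory sqlite3 table standing in for the sessions table (the session id value is made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE sessions (sessionid TEXT, usernick TEXT)")
cur.execute("INSERT INTO sessions VALUES (?, ?)", ("abc123", "Bobalooba"))

cur.execute("SELECT usernick FROM sessions WHERE sessionid=?", ("abc123",))
row = cur.fetchone()
print(row)     # ('Bobalooba',), a one-element tuple, not a string
print(row[0])  # Bobalooba, the value the test expects

conn.close()
```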