I have a record like this in my MongoDB collection:
{
    _id: "611ae422a01534cecce5533d",
    firstname: "Test1",
    lastname: "Test2"
}
This is my code
res = firstnameCollection.aggregate([{'$sample': {'size': 1 }}])
print(list(res))
I want to get a random record out of my database, which holds many records like the one above.
The output I get is a list containing the record. When I remove the list() call, I get:
<pymongo.command_cursor.CommandCursor object at 0x0000021078B82EB0>
I just want to output the firstname of the element:
Test1
Once you convert the response to a list, you can retrieve the first element using an index. You could do something like this:
docs = list(res)
first_name = docs[0]['firstname'] if docs and 'firstname' in docs[0] else 'placeholder'
The approach makes sure you don't get an IndexError by taking advantage of the fact that empty lists are considered false. It also checks that firstname is in the document so that you don't get a KeyError.
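A minimal sketch of the same guard wrapped in a helper, run against plain dicts so it works without a database connection. The helper name `first_name_or_default` and the sample documents are assumptions made for illustration; with PyMongo you would pass `list(res)` instead.

```python
def first_name_or_default(docs, default="placeholder"):
    """Return the 'firstname' of the first document, or `default`.

    `docs` is a list of dicts, e.g. list(cursor) from an aggregate call.
    """
    if not docs:  # empty list: the query returned nothing
        return default
    # dict.get avoids a KeyError when the field is missing
    return docs[0].get("firstname", default)


# Hypothetical sample data standing in for list(res)
sample = [{"_id": "611ae422a01534cecce5533d", "firstname": "Test1", "lastname": "Test2"}]
print(first_name_or_default(sample))
print(first_name_or_default([]))
```

Using `dict.get` instead of the `in` check is only a stylistic variation; both avoid the KeyError.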
I'm working on a human stats database for a simulation game and cannot figure out a certain function. The data about each person is stored as a string in a humandict list. Every in-game year the ageup() function should be called to update the age value in each string.
This is the string data format that I use (the list consists of these values, which store data about every human):
##ID, age, strengths, smarts
I call the .split() method to divide the numbers in a string into a list, and apply int() to each list item in order to use them in math. The ageup() function should access every humandict item and increase the age value by 1. This is the code I currently use (which doesn't work as intended):
for unit in humandict:
    human = unit.split()
    age = int(human[1])
    age += 1
    replace = (str(human[0])+" "+str(age)+" "+str(human[2])+" "+str(human[3]))
    humandict[int(human[0])] = replace
print(humandict)
The code successfully runs once, and the next time the function is called I then get the following error:
File "main.py", line 15, in ageup
human = unit.split()
AttributeError: 'NoneType' object has no attribute 'split'
I can't tell exactly where the problem arises; it could be due to wrong ID assignment or something else. But I am sure that using dictionaries here is a better and more efficient way to handle the data.
Here is how you can implement the same thing with dictionaries:
human_list_of_dict = [{'ID':<int>, 'age':<int>, 'strengths':<str>, 'smarts':<str>}]
The line above is a list of dictionaries that stores the data. Right now it contains only one dictionary, but it can hold as many as you need. You then iterate over it just like a list, with a few changes:
for unit in human_list_of_dict:
    unit['age'] = unit['age'] + 1
This way you save yourself the hassle of converting strings to lists and back. The code is also more efficient, since there is less data manipulation.
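For a runnable version of the idea, here is a small sketch that parses the question's string format once and then ages everyone in place. The helper name `parse_human` and the sample records are made up for illustration, and it assumes all four fields are integers separated by whitespace.

```python
def parse_human(record):
    """Turn an 'ID age strengths smarts' string into a dict of ints."""
    human_id, age, strengths, smarts = (int(part) for part in record.split())
    return {"ID": human_id, "age": age, "strengths": strengths, "smarts": smarts}


def ageup(humans):
    """Add one year to every human, in place."""
    for human in humans:
        human["age"] += 1


# Made-up records in the question's string format
raw_records = ["0 30 5 7", "1 42 8 3"]
humans = [parse_human(record) for record in raw_records]

ageup(humans)
ageup(humans)  # calling it repeatedly is safe; nothing is rebuilt from strings
print([human["age"] for human in humans])
```

Because the dictionaries are updated in place, there is no index arithmetic based on the ID, which removes one possible source of the original error.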
I'm trying to filter data using queryset.filter() in Django, but it does not return what I expect. Can someone correct me?
A single data cell looks like the one below (lines are separated by \n):
こちらは1行です。(0.57)\n
こちらは2行です。(0.67)\n
こちらは3行です。(0.77)\n
こちらは4行です。(0.87)\n
こちらは5行です。(0.697)
My code looks like this:
queryset = queryset.filter(predicted_result__regex = r"\A.*", predicted_result__contains='(0.5') |\
queryset.filter(predicted_result__regex = r"\A.*", predicted_result__contains='(0.6') |\
queryset.filter(predicted_result__regex = r"\A.*", predicted_result__contains='(0.7')
output:
All 5 lines are considered, not only the first line.
target:
Only match against the first line, where the score (inside the brackets) is between 0.5 and 0.8. All other lines should be ignored.
expected result:
こちらは1行です。(0.57)\n
The problem with your current query is that your 'single data cell' is one string. A regex or contains lookup only decides whether a row matches; when it does, you still get the entire string back, not just the matching line. You should separate the 'text' from the 'float' values in order to query more efficiently.
from django.db import models
class Model(models.Model):
    descriptor = models.CharField(max_length=255)  # max_length value is an assumption
    value = models.FloatField()
Now you can actually query the 'value' on its own.
If you are unable to change the model for whatever reason, you need to do multiple queries and combine them with the union/intersection/difference methods.
fives_queryset = queryset.filter(predicted_result__contains='(0.5')
sixes_queryset = queryset.filter(predicted_result__contains='(0.6')
sevens_queryset = queryset.filter(predicted_result__contains='(0.7')
result_queryset = fives_queryset.union(sixes_queryset, sevens_queryset)
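If only the first line of the cell should count, one option is to check it in plain Python after fetching the rows. The sketch below is an illustration, not part of the original answer: the names `FIRST_LINE_SCORE` and `first_line_in_range` are hypothetical, and the two sample cells are shortened versions of the question's data. Whether the same pattern can be passed to `predicted_result__regex` depends on the regex dialect of your database backend.

```python
import re

# Matches "(0.5...)" up to "(0.7...)" on the first line only:
# [^\n]* cannot cross a newline, and match() anchors at the start of the string.
FIRST_LINE_SCORE = re.compile(r"[^\n]*\(0\.[5-7]\d*\)")


def first_line_in_range(cell):
    """Return the first line if its bracketed score is in [0.5, 0.8), else None."""
    if FIRST_LINE_SCORE.match(cell):
        return cell.split("\n", 1)[0]
    return None


cell_hit = "こちらは1行です。(0.57)\nこちらは2行です。(0.67)"
cell_miss = "こちらは1行です。(0.87)\nこちらは2行です。(0.57)"
print(first_line_in_range(cell_hit))
print(first_line_in_range(cell_miss))
```

The second cell is rejected even though its second line holds a score in range, which is the behaviour the question asks for.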
I got a view like this:
function (doc, meta) {
    if (doc.type) {
        var id = doc.id ? doc.id : "";
        var company = doc.company ? doc.company : "";
        var store = doc.store ? doc.store : "";
        emit([doc.type, id, company, store]);
    }
}
And documents which all contain a type and a combination of the other 3 fields, depending on the type.
I want to query generically via this view with the following function:
def find_by_type_pageing_by_id_company_store(self, format_function=None, page=None, rows=None, recent=None, type=None, id="", company="", store="", include_docs=True):
    if not type:
        logger.error("No Type Provided in find by type query")
        raise exceptions.InvalidQueryParams("No Type Provided in find by type query")
    view = VIEW_BY_TYPE_VIN_COMPANY_STORE
    cb = self.get_cb_bucket()
    query = Query()
    # 'recent' and 'rows' are equivalent and will be unified to 'limit' here
    if recent and rows:
        raise exceptions.InvalidQueryParams(detail="Query may not contain both 'recent' and 'rows'")
    limit = rows or recent
    if limit:
        try:
            rows_per_page = int(limit)
        except ValueError:
            raise exceptions.InvalidQueryParams(detail="Query params 'recent' and 'rows' have to be integers")
        if rows_per_page > settings.PAGINATION_MAX_ROWS_LIMIT:
            raise exceptions.InvalidQueryParams(detail="Query params 'recent' and 'rows' may not exceed %s. "
                                                       "Use the additional param 'page=2', 'page=3', etc. to access "
                                                       "more objects" % settings.PAGINATION_MAX_ROWS_LIMIT)
        try:
            page = 1 if page is None else int(page)
        except ValueError:
            raise exceptions.InvalidQueryParams(detail="Query param 'page' has to be an integer")
        skip = rows_per_page * (page - 1)
        query.limit = rows_per_page
        query.skip = skip
    query.mapkey_range = [
        [type, id, company, store],
        [type, id + query.STRING_RANGE_END, company + query.STRING_RANGE_END, store + query.STRING_RANGE_END]
    ]
    rows = cb.query(view['doc'], view['view'], include_docs=include_docs, query=query)
    if format_function is None:
        format_function = self.format_function_default
    return_array = format_function(rows)
    return return_array
It works flawlessly when querying only for a certain type, or for a type and an id range.
But if, for example, I want all docs of a certain type that belong to one company, regardless of id and store, docs of other companies are returned as well.
I tried:
query.mapkey_range = [
    ["Vehicle", "", "abc", ""],
    ["Vehicle", q.STRING_RANGE_END, "abc", q.STRING_RANGE_END]
]
I know that the order of the values in the composite key matters somehow, which is probably why the query for an id range succeeds.
But I could not find any detailed explanation of how the order matters and how to handle this use case.
Any idea or hint on how to cope with this?
Thank you in advance.
With compound keys, the order in emit determines the internal "sorting" of the index. A range query relies on this order.
In your case:
index contains all Vehicles
all the Vehicles are then sorted by id
for each similar id, Vehicles are sorted by company
for each similar id and company, Vehicles are then sorted by store
Let's take an example of 4 vehicles. Here is what the index would look like:
Vehicle,a,ACME,store100
Vehicle,c,StackOverflow,store1001
Vehicle,d,ACME,store100
Vehicle,e,StackOverflow,store999
Here is what happens with a range query:
The view engine finds the first row >= to the startKey from your range
It then finds the last one that is <= to the endKey of your range
It returns every row in between in the array
You can see how, depending on the ids, this can lead to seemingly bad results. For [["Vehicle", "", "ACME", ""], ["Vehicle", RANGE_END, "ACME", RANGE_END]], here is what happens:
row 1 (a) is identified as the lowest row matching the startKey
keys are compared component by component from the left, and the first component that differs decides the order
every id sorts below RANGE_END, so the comparison against the endKey is settled at the second component; the "ACME" in third position is never consulted
hence rows 1-4 all fall inside the range, including rows 2 and 4 from "StackOverflow"
TL;DR: Ordering in the emit matters; you cannot query with sparse "jokers" on the left side of the compound key.
Change the map function to emit([doc.type, company, store, id]) (most generic to least generic attribute) and it should work fine after you rework your query accordingly.
Here is a link from the docs explaining compound keys and ranges with dates: Partial Selection With Compound Keys
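The effect of key order can be simulated in plain Python, since tuples also compare element by element from the left. This is only a sketch under assumptions: `RANGE_END` stands in for the SDK's STRING_RANGE_END (the value used here is assumed), `range_query` is a hypothetical helper that mimics an inclusive start/end scan, and Python's code-point string ordering is used in place of the view engine's Unicode collation.

```python
RANGE_END = "\u0fff"  # stand-in for the SDK's STRING_RANGE_END; value is an assumption

# (id, company, store) for the four example vehicles
vehicles = [
    ("a", "ACME", "store100"),
    ("c", "StackOverflow", "store1001"),
    ("d", "ACME", "store100"),
    ("e", "StackOverflow", "store999"),
]


def range_query(index, start_key, end_key):
    """Return every key k in the sorted index with start_key <= k <= end_key."""
    return [key for key in sorted(index) if start_key <= key <= end_key]


# Original emit order: [type, id, company, store]
by_id = [("Vehicle", vid, company, store) for vid, company, store in vehicles]
leaky = range_query(
    by_id,
    ("Vehicle", "", "ACME", ""),
    ("Vehicle", RANGE_END, "ACME", RANGE_END),
)

# Reordered emit: [type, company, store, id]
by_company = [("Vehicle", company, store, vid) for vid, company, store in vehicles]
acme_only = range_query(
    by_company,
    ("Vehicle", "ACME", "", ""),
    ("Vehicle", "ACME", RANGE_END, RANGE_END),
)

print({key[2] for key in leaky})      # companies returned with the original order
print({key[1] for key in acme_only})  # companies returned with the reordered key
```

With the id in second position the range cannot exclude other companies, while putting the company right after the type lets the wildcard parts sit on the right, where a range can cover them.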
You have two options for querying your documents by a variable number/order of fields:
Use a multidimensional view (aka. spatial view), which lets you omit parts of the compound key in the query. Here is an example of using such a view: http://developer.couchbase.com/documentation/server/4.0/views/sv-example2.html
Use N1QL, which lets you actually query on any number of fields dynamically. Make sure you add indexes for the fields you intend to query, and use the EXPLAIN statement to check that your queries execute as you expect them to. Here is how you use N1QL in Python: http://developer.couchbase.com/documentation/server/4.0/sdks/python-2.0/n1ql-queries.html
As you've already discovered, you cannot use a regular view, because you can only query it by the exact order of fields in your compound key.
I'm trying to save a Django form to the database without manually specifying the field names (as I do in the 2nd code block). I'm attempting it as shown in the 1st code block, following a tip from another S.O. post. However, when I try this I get the error "dictionary update sequence element #0 has length 4; 2 is required". I even tried it, as below, with just a testdict dictionary instead of request.POST, but I still get the error. The field value is obviously fine, since it works in the 2nd code block, so I am stumped as to why this is happening. I would appreciate it if anyone could shed some light on this. Thanks.
Trying it this way gives the error:
testdict = {'name':'account_username','value':'vvvvvv'}
for name, value in testdict.iteritems():
    if name != '' and name != 'top_select':
        b = Twitter(**dict((name, value)))
        b.save()
>>> dictionary update sequence element #0 has length 4; 2 is required
but this works fine:
b = Twitter(account_username='vvvvvv')
b.save()
Not sure what you are trying to do, but maybe you want something like this:
b = Twitter(**{name: value})
But to get the equivalent of Twitter(account_username='vvvvvv') you would need something like this:
Twitter(**{testdict['name']: testdict['value']})
where testdict contains only a single entry to send to Twitter().
Then the code would look more like this:
test_twits = [{'name':'account_username','value':'vvvvvv'},
              {'name':'account_username','value':'wwwwww'},
              ]
for twit in test_twits:
    name = twit['name']
    value = twit['value']
    if name != '' and name != 'top_select':
        b = Twitter(**{name: value})
        b.save()
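To see that unpacking a one-entry dictionary is the same as writing the keyword argument by hand, here is a small sketch that runs without Django. The `Account` dataclass is a hypothetical stand-in for the Twitter model, not the real class.

```python
from dataclasses import dataclass


@dataclass
class Account:
    """Hypothetical stand-in for the Twitter model; not a Django model."""
    account_username: str = ""


name, value = "account_username", "vvvvvv"

explicit = Account(account_username="vvvvvv")
unpacked = Account(**{name: value})  # the field name is chosen at runtime

print(explicit == unpacked)
```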
Correct me if I am wrong.
From your second code snippet I take it that the Twitter class needs account_username as a keyword argument. When you iterate through the dictionary using iteritems, you pass the name, i.e. the key of the dictionary, as the keyword argument to the class. Isn't this wrong? The dictionary's keys are name and value, not account_username. I believe you need to pass one of the values from the dictionary as the keyword argument, not one of the keys.
Just do this:
dict(((name, value),))
dict() takes a sequence of (key, value) tuples, whereas you are giving it a single (key, value) tuple.
The reason it says '... sequence element #0 has length 4' is that the key 'name' from testdict has a length of 4.
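The difference is easy to reproduce in isolation. The sketch below uses the first key/value pair the loop would see; the variable names `message` and `fixed` are only for illustration.

```python
name, value = "name", "account_username"

try:
    dict((name, value))  # one tuple: each of its items is treated as a pair
except ValueError as error:
    message = str(error)
    print(message)

# Wrapping the tuple in another tuple gives dict() a sequence of one pair
fixed = dict(((name, value),))
print(fixed)
```

The string 'name' is unpacked as if it were a (key, value) pair, and since it has four characters instead of two, dict() raises the ValueError.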