Django haystack testing - python

I am trying to test that my search view renders the results from my search correctly. However, the search uses the index built from my live database and not my test database, so when I create some objects for my test case, they are not reflected on the search page.
How can I make haystack use an index built from the test database or, better still, just fake it and not use an index at all, querying the database as is? This would be fine for that test case and probably faster.
The only article I could find when googling was http://reliablybroken.com/b/2012/12/testing-django-haystack-whoosh/
and it does not work with the current versions.
Pip versions:
django==1.7.5
django-haystack==2.4.0

I've encountered a similar use case in our project. Here's the rough idea of our implementation. Note that if you use the simple backend (simple_backend), some of your custom filters or prepare methods may not work as expected. Hence, it's advisable to use a full backend (e.g. Elasticsearch) even in testing mode.
from django.core.urlresolvers import reverse
from django.test import TestCase
from django.test.utils import override_settings
from haystack import connections
from haystack.utils.loading import ConnectionHandler, UnifiedIndex
from myapp.models import MyModel
from myapp.search_indexes import MyModelIndex

TEST_INDEX = {
    'default': {
        'ENGINE': 'haystack.backends.elasticsearch_backend.ElasticsearchSearchEngine',
        'URL': 'http://127.0.0.1:9200/',
        'INDEX_NAME': 'test_index',
    },
}

@override_settings(HAYSTACK_CONNECTIONS=TEST_INDEX)
class SearchViewTest(TestCase):
    def setUp(self):
        """
        Some ideas taken from here:
        https://github.com/django-haystack/django-haystack/blob/v2.6.0/test_haystack/test_views.py#L40
        """
        # Note: inside setUp this local name shadows the imported `connections`.
        connections = ConnectionHandler(TEST_INDEX)
        super(SearchViewTest, self).setUp()
        self.mm = MyModel.objects.create(name='Dummy Title')
        # Stow.
        self.old_unified_index = connections['default']._index
        self.ui = UnifiedIndex()
        self.mmi = MyModelIndex()
        self.ui.build(indexes=[self.mmi])
        connections['default']._index = self.ui
        # Update the 'index'.
        backend = connections['default'].get_backend()
        backend.clear()
        backend.update(self.mmi, MyModel.objects.all())

    def tearDown(self):
        connections['default']._index = self.old_unified_index
        super(SearchViewTest, self).tearDown()

    def test_search_results(self):
        response = self.client.get('/search?q=dummy')
        # assertContains checks the rendered body, not the response object.
        self.assertContains(response, self.mm.name)

Create a test index and make sure that your haystack configs point to the test index when in test mode.
Something like a check for if 'test' in sys.argv: in your settings.
Besides this, you can either add another index in the search engine or use another search engine (such as Whoosh) for your tests.
The other option is to mock the response that you get from the search engine and pass that to the views / forms you are trying to test.
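As a rough sketch of the 'test' in sys.argv idea, the settings module could pick the connection dictionary through a small helper. The helper name pick_haystack_connections, the two dictionaries, and the index names are assumptions made for illustration; only the HAYSTACK_CONNECTIONS setting itself comes from haystack.

```python
import sys

# Hypothetical connection dictionaries; adjust engine, URL and index names.
LIVE_CONNECTIONS = {
    'default': {
        'ENGINE': 'haystack.backends.elasticsearch_backend.ElasticsearchSearchEngine',
        'URL': 'http://127.0.0.1:9200/',
        'INDEX_NAME': 'live_index',
    },
}
TEST_CONNECTIONS = {
    # Same backend, but a separate index so tests never touch live data.
    'default': dict(LIVE_CONNECTIONS['default'], INDEX_NAME='test_index'),
}


def pick_haystack_connections(argv):
    """Return the test connections when `test` appears on the command line."""
    return TEST_CONNECTIONS if 'test' in argv else LIVE_CONNECTIONS


# In settings.py this would be the line Django actually reads:
HAYSTACK_CONNECTIONS = pick_haystack_connections(sys.argv)
```

Keeping the decision in a function makes it easy to check in isolation, but a plain if 'test' in sys.argv: block in settings.py would do the same job.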

Related

What is the proper way to define the collection name at the application startup with MongoEngine?

I use MongoEngine as an ODM in my Flask application. Depending on the passed configuration document, MongoEngine should use a different collection.
At the moment I achieve this by changing the internal meta variable model._meta['collection']. Is there an alternative for selecting the collection?
from flask import Flask
from mongoengine import connect
from api_service.model import MyModel

def create_app(config):
    app = Flask(__name__)
    # load app.config
    connect(app.config['MONGODB_DB'],
            host=app.config['MONGODB_HOST'],
            port=app.config['MONGODB_PORT'],
            username=app.config['MONGODB_USERNAME'],
            password=app.config['MONGODB_PASSWORD'],
            )
    MyModel._meta['collection'] = app.config['MONGODB_MYMODEL_COLLECTION']
    return app
I know that you can define the collection by meta = {} in the class body of the model (see here). But I am not in the app context there, and therefore I cannot access app.config.
You can simply modify the meta attribute inside the class itself:
class MyModel(Document):
    meta = {"collection": "my_actual_collection_name"}
    ...
Check this for more meta attributes you can use.
Solution Update
I defined a helper class that provides access to the application's configuration:
class AppConfigHelper:
    from flask import current_app
    APP_CONFIG = current_app.config
and in the document module, import and use that class to get the collection name:
class MyModel(Document):
    meta = {'collection': AppConfigHelper.APP_CONFIG['MONGODB_MYMODEL_COLLECTION']}
    ...
This is not the best solution I can think of, but it does the job.
Caution: this will not work if you run it outside of Flask; it will crash. Run it inside the app itself, or use flask shell.

How to programmatically generate the CREATE TABLE SQL statement for a given model in Django?

I need to programmatically generate the CREATE TABLE statement for a given unmanaged model in my Django app (managed = False).
Since I'm working on a legacy database, I don't want to create a migration and use sqlmigrate.
The ./manage.py sql command was useful for this purpose, but it has been removed in Django 1.8.
Do you know of any alternatives?
As suggested, I'm posting a complete answer for the case that the question might imply.
Suppose you have an external DB table that you decided to access as a Django model, and have therefore described it as an unmanaged model (Meta: managed = False).
Later you need to be able to create it in your code, e.g. for some tests using your local DB. Django doesn't make migrations for unmanaged models and therefore won't create it in your test DB.
This can be solved with a Django API, without resorting to raw SQL: SchemaEditor. See a more complete example below, but as a short answer you would use it like this:
from django.db import connections

with connections['db_to_create_a_table_in'].schema_editor() as schema_editor:
    schema_editor.create_model(YourUnmanagedModelClass)
A practical example:
# your_app/models/your_model.py
from django.db import models

class IntegrationView(models.Model):
    """A read-only model to access a view in some external DB."""

    class Meta:
        managed = False
        db_table = 'integration_view'

    name = models.CharField(
        db_column='object_name',
        max_length=255,
        primary_key=True,
        verbose_name='Object Name',
    )
    some_value = models.CharField(
        db_column='some_object_value',
        max_length=255,
        blank=True,
        null=True,
        verbose_name='Some Object Value',
    )

    # Depending on the situation it might be a good idea to redefine
    # some methods as a NOOP as a safety-net.
    # Note, that it's not completely safe this way, but might help with some
    # silly mistakes in user code
    def save(self, *args, **kwargs):
        """Preventing data modification."""
        pass

    def delete(self, *args, **kwargs):
        """Preventing data deletion."""
        pass
Now, suppose you need to be able to create this model via Django, e.g. for some tests.
# your_app/tests/some_test.py
# This will allow to access the `SchemaEditor` for the DB
from django.db import connections
from django.test import TestCase
from your_app.models.your_model import IntegrationView

class SomeLogicTestCase(TestCase):
    """Tests some logic, that uses `IntegrationView`."""

    # Since it is assumed that `IntegrationView` is read-only for the
    # case being described, it's a good idea to put setup logic in the class
    # setup fixture, which will run only once for the whole test case
    @classmethod
    def setUpClass(cls):
        """Prepares `IntegrationView` mock data for the test case."""
        # Django's TestCase expects the parent class setup to run as well
        super(SomeLogicTestCase, cls).setUpClass()
        # This is the actual part, that will create the table in the DB
        # for the unmanaged model (Any model in fact, but managed models will
        # have their tables created already by the Django testing framework)
        # Note: Here we're able to choose which DB, defined in your settings,
        # will be used to create the table
        with connections['external_db'].schema_editor() as schema_editor:
            schema_editor.create_model(IntegrationView)
        # That's all you need, after the execution of this statements
        # a DB table for `IntegrationView` will be created in the DB
        # defined as `external_db`.
        # Now suppose we need to add some mock data...
        # Again, if we consider the table to be read-only, the data can be
        # defined here, otherwise it's better to do it in `setUp()` method.
        # Remember `IntegrationView.save()` is overridden as a NOOP, so simple
        # calls to `IntegrationView.save()` or `IntegrationView.objects.create()`
        # won't do anything, so we need to "Improvise. Adapt. Overcome."
        # One way is to use the `save()` method of the base class,
        # but provide the instance of our class
        integration_view = IntegrationView(
            name='Biggus Dickus',
            some_value='Something really important.',
        )
        super(IntegrationView, integration_view).save(using='external_db')
        # Another one is to use the `bulk_create()`, which doesn't use
        # `save()` internally, and in fact is a better solution
        # if we're creating many records
        IntegrationView.objects.using('external_db').bulk_create([
            IntegrationView(
                name='Sillius Soddus',
                some_value='Something important',
            ),
            IntegrationView(
                name='Naughtius Maximus',
                some_value='Whatever',
            ),
        ])

    # Don't forget to clean after
    @classmethod
    def tearDownClass(cls):
        with connections['external_db'].schema_editor() as schema_editor:
            schema_editor.delete_model(IntegrationView)
        super(SomeLogicTestCase, cls).tearDownClass()

    def test_some_logic_using_data_from_integration_view(self):
        self.assertTrue(IntegrationView.objects.using('external_db').filter(
            name='Biggus Dickus',
        ))
To make the example more complete: since we're using multiple DBs (default and external_db), Django will try to run migrations on both of them for the tests, and as of now there's no option in the DB settings to prevent this. So we have to use a custom DB router for testing.
# your_app/tests/base.py
class PreventMigrationsDBRouter:
    """DB router to prevent migrations for specific DBs during tests."""
    _NO_MIGRATION_DBS = {'external_db', }

    def allow_migrate(self, db, app_label, model_name=None, **hints):
        """Actually disallows migrations for specific DBs."""
        return db not in self._NO_MIGRATION_DBS
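Because the router is plain Python, its decision can be checked without a Django project. A minimal sketch, repeating the class so the snippet runs on its own; the app label 'your_app' is just a placeholder:

```python
class PreventMigrationsDBRouter:
    """DB router to prevent migrations for specific DBs during tests."""

    _NO_MIGRATION_DBS = {'external_db'}

    def allow_migrate(self, db, app_label, model_name=None, **hints):
        # True lets Django migrate on `db`; False makes it skip that DB.
        return db not in self._NO_MIGRATION_DBS


router = PreventMigrationsDBRouter()
default_allowed = router.allow_migrate('default', 'your_app')
external_allowed = router.allow_migrate('external_db', 'your_app')
print(default_allowed, external_allowed)
```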
And a test settings file example for the described case:
# settings/test.py
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.oracle',
        'NAME': 'db_name',
        'USER': 'username',
        'HOST': 'localhost',
        'PASSWORD': 'password',
        'PORT': '1521',
    },
    # For production here we would have settings to connect to the external DB,
    # but for testing purposes we could get by with an SQLite DB
    'external_db': {
        'ENGINE': 'django.db.backends.sqlite3',
    },
}
# Not necessary to use a router in production config, since if the DB
# is unspecified explicitly for some action Django will use the `default` DB
DATABASE_ROUTERS = ['your_app.tests.base.PreventMigrationsDBRouter', ]
Hope this detailed, beginner-friendly example helps someone and saves them time.
Unfortunately there seems to be no easy way to do this, but luckily I have just succeeded in producing a working snippet for you by digging into the internals of the Django migrations jungle.
Just:
save the code to get_sql_create_table.py (for example)
run $ export DJANGO_SETTINGS_MODULE=yourproject.settings
launch the script with python get_sql_create_table.py yourapp.yourmodel
and it should output what you need.
Hope it helps!
import django
django.setup()
from django.db.migrations.state import ModelState
from django.db.migrations import operations
from django.db.migrations.migration import Migration
from django.db import connections
from django.db.migrations.state import ProjectState

def get_create_sql_for_model(model):
    model_state = ModelState.from_model(model)
    # Create a fake migration with the CreateModel operation
    cm = operations.CreateModel(name=model_state.name, fields=model_state.fields)
    migration = Migration("fake_migration", "app")
    migration.operations.append(cm)
    # Let the migration framework think that the project is in an initial state
    state = ProjectState()
    # Get the SQL through the schema_editor bound to the connection
    connection = connections['default']
    with connection.schema_editor(collect_sql=True, atomic=migration.atomic) as schema_editor:
        state = migration.apply(state, schema_editor, collect_sql=True)
    # return the CREATE TABLE statement
    return "\n".join(schema_editor.collected_sql)

if __name__ == "__main__":
    import importlib
    import sys
    if len(sys.argv) < 2:
        print("Usage: {} <app.model>".format(sys.argv[0]))
        sys.exit(100)
    app, model_name = sys.argv[1].split('.')
    models = importlib.import_module("{}.models".format(app))
    model = getattr(models, model_name)
    rv = get_create_sql_for_model(model)
    print(rv)
For Django v4.1.3, the get_create_sql_for_model source code above changes like this:
from django.db.migrations.state import ModelState
from django.db.migrations import operations
from django.db.migrations.migration import Migration
from django.db import connections
from django.db.migrations.state import ProjectState

def get_create_sql_for_model(model):
    model_state = ModelState.from_model(model)
    table_name = model_state.options['db_table']
    # Create a fake migration with the CreateModel operation
    cm = operations.CreateModel(name=model_state.name, fields=model_state.fields.items())
    migration = Migration("fake_migration", "app")
    migration.operations.append(cm)
    # Let the migration framework think that the project is in an initial state
    state = ProjectState()
    # Get the SQL through the schema_editor bound to the connection
    connection = connections['default']
    with connection.schema_editor(collect_sql=True, atomic=migration.atomic) as schema_editor:
        state = migration.apply(state, schema_editor, collect_sql=True)
    sqls = schema_editor.collected_sql
    items = []
    for sql in sqls:
        # Skip the SQL comment lines that the schema editor collects
        if sql.startswith('--'):
            continue
        items.append(sql)
    return table_name, items
#EOP
I used it to create all tables (like the syncdb command of old Django versions):
# select_by_raw_sql and exec_by_raw_sql are the author's own helpers (not shown here)
for app in settings.INSTALLED_APPS:
    app_name = app.split('.')[0]
    app_models = apps.get_app_config(app_name).get_models()
    for model in app_models:
        table_name, sqls = get_create_sql_for_model(model)
        if settings.DEBUG:
            s = "SELECT COUNT(*) AS c FROM sqlite_master WHERE name = '%s'" % table_name
        else:
            s = "SELECT COUNT(*) AS c FROM information_schema.TABLES WHERE table_name='%s'" % table_name
        rs = select_by_raw_sql(s)
        if not rs[0]['c']:
            for sql in sqls:
                exec_by_raw_sql(sql)
            print('CREATE TABLE DONE:%s' % table_name)
The full source code can be found at Django syncdb command came back for v4.1.3 version

Refresh mongodb collection structure through python mongoengine

I'm writing a simple Flask app, with the sole purpose of learning Python and MongoDB.
I've managed to reach the point where all the collections are defined and CRUD operations work in general. Now, one thing that I really want to understand is how to refresh a collection after updating its structure. For example, say that I have the following model:
user.py
class User(db.Document, UserMixin):
    email = db.StringField(required=True, unique=True)
    password = db.StringField(required=True)
    active = db.BooleanField()
    first_name = db.StringField(max_length=64, required=True)
    last_name = db.StringField(max_length=64, required=True)
    # Pass the callable itself so the default is evaluated per document,
    # not once at import time
    registered_at = db.DateTimeField(default=datetime.datetime.utcnow)
    confirmed = db.BooleanField()
    confirmed_at = db.DateTimeField()
    last_login_at = db.DateTimeField()
    current_login_at = db.DateTimeField()
    last_login_ip = db.StringField(max_length=45)
    current_login_ip = db.StringField(max_length=45)
    login_count = db.IntField()
    companies = db.ListField(db.ReferenceField('Company'), default=[])
    roles = db.ListField(db.ReferenceField(Role), default=[])
    meta = {
        'indexes': [
            {'fields': ['email'], 'unique': True}
        ]
    }
Now, I already have entries in my user collection, but I want to change companies to:
company = db.ReferenceField('Company')
How can I refresh the collection's structure, without having to bring the whole database down?
I do have a manage.py script that helps me and also provides a shell:
#!/usr/bin/python
from flask.ext.script import Manager
from flask.ext.script.commands import Shell
from app import factory

app = factory.create_app()
manager = Manager(app)
manager.add_command("shell", Shell(use_ipython=True))
# manager.add_command('run_tests', RunTests())

if __name__ == "__main__":
    manager.run()
and I have tried a couple of commands, based on information that I could gather and on my basic knowledge:
>>> from app.models import db, User
>>> import mongoengine
>>> mongoengine.Document(User)
field = iter(self._fields_ordered)
AttributeError: 'Document' object has no attribute '_fields_ordered'
>>> mongoengine.Document(User).modify() # well, same result as above
Any pointers on how to achieve this?
Update
I am asking all of this because I have updated my user.py to match my new requirements, but any time I interact with the db itself, since the collection's structure was not refreshed, I get the following error:
FieldDoesNotExist: The field 'companies' does not exist on the
document 'User', referer: http://local.faqcolab.com/company
The solution is easier than I expected:
db.getCollection('user').update(
    // query
    {},
    // update
    {
        $rename: {
            'companies': 'company'
        }
    },
    // options
    {
        "multi" : true, // update all documents
        "upsert" : false // do not insert a new document if none matches the query
    }
);
Explanation for each of the {}:
The first is empty because I want to update all documents in the user collection.
The second contains $rename, which is the action that renames the fields I want.
The last contains additional settings for the query to be executed.
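To make the effect of $rename concrete, here is a minimal Python sketch that mimics it on plain dictionaries standing in for documents. The helper rename_field and the sample data are made up for illustration and do not touch a real database. Note that a rename only changes the key: the value stored under company is still the old list, so turning it into a single reference would take a separate step.

```python
def rename_field(document, old, new):
    """Return a copy of `document` with key `old` renamed to `new`.

    Documents that lack `old` come back unchanged, which is also how
    MongoDB's $rename treats a missing field.
    """
    renamed = dict(document)
    if old in renamed:
        renamed[new] = renamed.pop(old)
    return renamed


# Plain dicts standing in for documents of the `user` collection.
users = [
    {'email': 'a@example.com', 'companies': ['acme', 'globex']},
    {'email': 'b@example.com'},  # no `companies` field at all
]

# Similar in spirit to the multi-document update with an empty query.
migrated = [rename_field(user, 'companies', 'company') for user in users]
print(migrated)
```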
"I have updated my user.py to match my new requests, but anytime I interact with the db its self, since the table's structure was not refreshed, I get the following error"
MongoDB does not have a "table structure" like relational databases do. After a document has been inserted, you can't change its schema by changing the document model.
I don't want to sound like I'm telling you that the answer is to use different tools, but seeing things like db.ListField(db.ReferenceField('Company')) makes me think you'd be much better off with a relational database (Postgres is well supported in the Flask ecosystem).
Mongo works best for storing schema-less documents (you don't know beforehand how your data is structured, or it varies significantly between documents). Unless you have data like that, it's worth looking at other options. Especially since you're just getting started with Python and Flask, there's no point in making things harder than they are.

Django Search: Setting Haystack_Default_Operator = 'OR' has no effect

I'm using Haystack and Whoosh to do search with a Django site I'm building. I'd like to use an OR operator on search terms (e.g. "Search String" will find objects with text "Search" OR "String" instead of "Search" AND "String").
This seems pretty straightforward, as Haystack allows you to override the default "AND" operator by setting HAYSTACK_DEFAULT_OPERATOR = 'OR' in your settings.py file.
Unfortunately, adding this to my settings.py has had no effect. I've found a couple of tangential references to this behavior on Stack Overflow, but no solution. I've also found an issue posted on GitHub, but it's been there since last year with no comments or classification.
I may be doing something wrong, so I figured I'd post here and see if there's a solution. I'm kinda stuck without one!
My haystack settings in my settings.py:
HAYSTACK_CONNECTIONS = {
    'default': {
        'ENGINE': 'haystack.backends.whoosh_backend.WhooshEngine',
        'PATH': os.path.join(os.path.dirname(__file__), 'whoosh_index'),
    },
}
HAYSTACK_DEFAULT_OPERATOR = 'OR'
HAYSTACK_SIGNAL_PROCESSOR = 'haystack.signals.RealtimeSignalProcessor'
My view:
from haystack import views as hsviews

def search_test(request):
    return hsviews.basic_search(request)
My search_indexes.py file:
import datetime
from haystack import indexes
from myApp.models import MyModel
from django.contrib.auth.models import User

class MyModelIndex(indexes.SearchIndex, indexes.Indexable):
    text = indexes.NgramField(document=True, use_template=True)
    isPublic = indexes.BooleanField(model_attr='isPubliclyVisible')
    brand = indexes.CharField(model_attr='brand')
    model = indexes.CharField(model_attr='model')
    owner = indexes.CharField(model_attr='owner')
    owner_username = indexes.CharField()
    obj_type = indexes.CharField()

    def get_model(self):
        return MyModel

    def index_queryset(self, using=None):
        """Used when the entire index for model is updated."""
        return self.get_model().objects.filter(isPubliclyVisible=True)

    def prepare_owner_username(self, obj):
        return obj.owner.user.username

    def prepare_obj_type(self, obj):
        return 'MyModel'
I did find this workaround (which I haven't tested/thought through for my solution yet), but I figured this warranted its own question in case I/we are doing something wrong.
Instead of using the Haystack built-in basic_search function, I would suggest writing your own view so you have more control over how the search queries are performed. That way, you can process more complex searches by extending your view or custom search query function, and it is easier to test.
For example, you can build a separate SearchQuerySet filter for each of the keywords you're searching for, then "OR" them together, like this:
from haystack.query import EmptySearchQuerySet, SearchQuerySet

def get_query(request):
    """
    This function retrieves any query terms (e.g q=search+term)
    from the request object.
    :param request: request object
    :returns: query terms as a list (split on whitespace)
    """
    query = None
    qs_keyword = 'q'
    if (qs_keyword in request.GET) and request.GET[qs_keyword].strip():
        query_string = request.GET[qs_keyword]
        query = query_string.split()
    return query

def perform_query(request):
    """
    This is a helper function to perform the actual query.
    You can extend this to handle more complicated searches using AND,
    OR, boolean qualifiers, etc.
    :param request: request object
    :returns: SearchQuerySet results
    """
    query = get_query(request)
    if not query:
        results = EmptySearchQuerySet()
    else:
        results = SearchQuerySet()
        for search_term in query:
            # you can use the "|" (or) operator
            results |= results.filter(content=search_term)
            # or else use "filter_or"
            # results = results.filter_or(content=search_term)
    return results

def your_search_view(request, *args, **kwargs):
    """
    This is your search view to process the query and display your results.
    """
    # call "perform_query" to do the actual search
    results = perform_query(request)
    # do the rest of your view processing ...
    return render_to_response(etc.)
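The difference between the default AND behaviour and an OR combination of terms can be pictured without a search backend. The following is only an in-memory illustration: the matches helper and the sample documents are invented for this sketch and are not part of Haystack.

```python
def matches(text, terms, operator='AND'):
    """Check whether `text` contains all (AND) or any (OR) of `terms`."""
    words = set(text.lower().split())
    hits = [term.lower() in words for term in terms]
    return all(hits) if operator == 'AND' else any(hits)


documents = ['search engine basics', 'string handling', 'search string tips']
terms = 'Search String'.split()  # same whitespace split as a q=... query

# AND keeps only documents containing every term; OR keeps any match.
and_results = [d for d in documents if matches(d, terms, 'AND')]
or_results = [d for d in documents if matches(d, terms, 'OR')]
print(and_results)
print(or_results)
```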

set namespace for all db operations

I'm trying to set the namespace for all DB operations on Google App Engine in Python, but I can't get it done.
Currently my code looks something like this:
""" Set Google namespace """
if user:
    namespace = thisUser.namespace
    namespace_manager.set_namespace(namespace)
""" End Google namespace """
# Then I have all sorts of classes:
class MainPage(BaseHandler):
    def get(self):
        #code with DB operations like get and put...

class MainPage2(BaseHandler):
    def get(self):
        #code with DB operations like get and put...

class MainPage3(BaseHandler):
    def get(self):
        #code with DB operations like get and put...

app = webapp2.WSGIApplication([ ... ], debug=True, config=webapp2_config)
The problem with this is that in the classes all DB operations are still done on the default namespace (as if no namespace were set), even though I set the namespace at the very top of my code.
When I print the variable "namespace", which I also set at the top of the code, I do get to see the namespace that I wish to use.
But it looks like Google App Engine resets the namespace to empty somewhere before running the code in the classes.
So now I'm wondering if there's a good way to set the namespace once somewhere.
Currently I set it like this in every "def":
class MainPage(BaseHandler):
    def get(self):
        namespace_manager.set_namespace(namespace)
        #code with DB operations like get and put...

class MainPage2(BaseHandler):
    def get(self):
        namespace_manager.set_namespace(namespace)
        #code with DB operations like get and put...
etc...
It's just not a very elegant solution.
You need to write a middleware that will intercept the request and will set the namespace according to your app logic.
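A minimal sketch of such a middleware, written as a plain WSGI wrapper. The set_namespace argument stands in for namespace_manager.set_namespace, and namespace_for_request is a hypothetical function holding your own app logic; those names, the demo app, and the X-Tenant header used below are assumptions for illustration only.

```python
class NamespaceMiddleware(object):
    """WSGI middleware that sets the namespace before each request."""

    def __init__(self, app, namespace_for_request, set_namespace):
        self.app = app
        self.namespace_for_request = namespace_for_request
        self.set_namespace = set_namespace

    def __call__(self, environ, start_response):
        # Decide the namespace from the incoming request, then hand over
        # to the wrapped application.
        self.set_namespace(self.namespace_for_request(environ))
        return self.app(environ, start_response)


def demo_app(environ, start_response):
    """Stand-in for the real application object."""
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'ok']


chosen = []  # records what the stand-in `set_namespace` received
wrapped = NamespaceMiddleware(
    demo_app,
    namespace_for_request=lambda environ: environ.get('HTTP_X_TENANT', ''),
    set_namespace=chosen.append,
)
body = wrapped({'HTTP_X_TENANT': 'tenant-a'}, lambda status, headers: None)
```

On App Engine the wrapper would be applied to the webapp2 application object, with namespace_manager.set_namespace passed in as set_namespace, so the namespace is chosen per request instead of once at import time.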
A good solution is to add a hook. Something like this should work:
from google.appengine.api import apiproxy_stub_map

NAMESPACE_NAME = 'noname'

def namespace_call(service, call, request, response):
    if hasattr(request, 'set_name_space'):
        request.set_name_space(NAMESPACE_NAME)

apiproxy_stub_map.apiproxy.GetPreCallHooks().Append(
    'datastore-hooks', namespace_call, 'datastore_v3')
You can add it to your main.py or appengine_config.py. This way the hook is configured while the instance is loading and keeps its state.
You can use appengine_config.py and define namespace_manager_default_namespace_for_request().
Have a read of https://developers.google.com/appengine/docs/python/multitenancy/multitenancy and see the first section, "Setting the Current Namespace".
