Django-Haystack & Elasticsearch break with queries containing special characters - python

So I've been trying to fix a bug that really annoys me: Django-Haystack & Elasticsearch queries work with accents, but they break every time the query contains special characters such as a dash (-) or an apostrophe (').
For example, let's use Baie-d'Urfé as the query.
Here's my code:
forms.py
from haystack.forms import FacetedSearchForm
from haystack.inputs import Exact

class FacetedProductSearchForm(FacetedSearchForm):
    def __init__(self, *args, **kwargs):
        data = dict(kwargs.get("data", []))
        self.ptag = data.get('ptags', [])
        self.q_from_data = data.get('q', '')
        super(FacetedProductSearchForm, self).__init__(*args, **kwargs)

    def search(self):
        sqs = super(FacetedProductSearchForm, self).search()
        # Ideally we would tell django-haystack to only apply q to destination
        # ...but we're not sure how to do that, so we'll just re-apply it ourselves here.
        q = self.q_from_data
        sqs = sqs.filter(destination=Exact(q))
        print('should be applying q: {}'.format(q))
        print(sqs)
        if self.ptag:
            print('filtering with tags')
            print(self.ptag)
            sqs = sqs.filter(ptags__in=[Exact(tag) for tag in self.ptag])
        return sqs
Using FacetedSearch in View.py
class FacetedSearchView(BaseFacetedSearchView):
    form_class = FacetedProductSearchForm
    facet_fields = ['ptags']
    template_name = 'search_result.html'
    paginate_by = 30
    context_object_name = 'object_list'
And my search_indexes.py
from django.utils import timezone
from haystack import indexes

class ProductIndex(indexes.SearchIndex, indexes.Indexable):
    text = indexes.EdgeNgramField(
        document=True, use_template=True,
        template_name='search/indexes/product_text.txt')
    destination = indexes.CharField(model_attr="destination")  # boost=1.125
    # Tags
    ptags = indexes.MultiValueField(model_attr='_ptags', faceted=True)
    # for auto complete
    content_auto = indexes.EdgeNgramField(model_attr='destination')
    # Spelling suggestions
    suggestions = indexes.FacetCharField()

    def get_model(self):
        return Product

    def index_queryset(self, using=None):
        """Used when the entire index for model is updated."""
        return self.get_model().objects.filter(timestamp__lte=timezone.now())
Any ideas on how to fix this?
Thanks a lot!

The problem seems to be related to Elasticsearch itself, so I removed all my Elasticsearch instances and rewrote my search view to use simple PostgreSQL queries.
Final observation after solving this:
$50/month saved and a search engine that works like a charm!
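The answer does not show the replacement queries. As a rough illustration of why an exact match in plain SQL copes with dashes and apostrophes, here is a minimal sketch that uses the standard-library sqlite3 module as a stand-in for PostgreSQL. The table name, column names, sample rows and the find_by_destination helper are all hypothetical; the point is only that a parameterized query passes the search term as data, so no character in it is treated as query syntax.

```python
import sqlite3


def find_by_destination(conn, destination):
    """Return product names whose destination matches the term exactly."""
    cursor = conn.execute(
        # The "?" placeholder sends the term as a bound parameter,
        # so dashes and apostrophes need no escaping.
        "SELECT name FROM product WHERE destination = ? ORDER BY name",
        (destination,),
    )
    return [row[0] for row in cursor.fetchall()]


# In-memory database with made-up rows (assumption, for illustration only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (name TEXT, destination TEXT)")
conn.executemany(
    "INSERT INTO product (name, destination) VALUES (?, ?)",
    [("Lakeside cottage", "Baie-d'Urfé"), ("City loft", "Montréal")],
)

results = find_by_destination(conn, "Baie-d'Urfé")
print(results)
```

In a Django view the same idea would normally go through the ORM, which also binds parameters, rather than through hand-written SQL.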

Related

Python Django Invalid Input Syntax Error for Int even though it isn't supposed to be an integer

I am making an e-commerce app using Python, Django 2.1, and PostgreSQL. When I click on one of the categories to show the products specific to that category, I get a data error. Apparently there is a string in the URL where an integer is expected, and I don't understand why. Please help.
I have tried many different things, but none seem to work.
My URL Patterns:
path('products/phones/', product_views.CategoryDetailView.as_view(template_name='products/category_details/phones.html'), name='phones'),
path('products/laptops/', product_views.CategoryDetailView.as_view(template_name='products/category_details/laptops.html'), name='laptops'),
path('products/desktops/', product_views.CategoryDetailView.as_view(template_name='products/category_details/desktops.html'), name='desktops'),
path('products/keyboards/', product_views.CategoryDetailView.as_view(template_name='products/category_details/keyboards.html'), name='keyboards'),
path('products/mice-and-mouse-pads/', product_views.CategoryDetailView.as_view(template_name='products/category_details/mice.html'), name='mice'),
path('products/headsets/', product_views.CategoryDetailView.as_view(template_name='products/category_details/headsets.html'), name='headsets'),
path('products/printers-scanners-and-fax/', product_views.CategoryDetailView.as_view(template_name='products/category_details/printers.html'), name='printers'),
path('products/consoles/', product_views.CategoryDetailView.as_view(template_name='products/category_details/consoles.html'), name='consoles'),
path('products/misc/', product_views.CategoryDetailView.as_view(template_name='products/category_details/misc.html'), name='misc'),
The error received is:
django.db.utils.DataError: invalid input syntax for integer: "Phones"
LINE 1: ...s_product" WHERE "products_product"."category_id" = 'Phones'
My Views
model = Category
queryset = Category.objects.all()
template_name = 'products/category_detail.html'
context_object_name = 'categories'

def get_context_data(self, **kwargs):
    context = super().get_context_data(**kwargs)
    # Query Sets
    context['phones'] = Product.objects.filter(category='Phones')
    context['laptops'] = Product.objects.filter(category='Laptops')
    context['total_laptops'] = len(Product.objects.filter(category='Laptops'))
    context['desktops'] = Product.objects.filter(category='Desktops')
    context['total_desktops'] = len(Product.objects.filter(category='Desktops'))
    context['keyboards'] = Product.objects.filter(category='Keyboards')
    context['total_keyboards'] = len(Product.objects.filter(category='Keyboards'))
    context['mice'] = Product.objects.filter(category='Mice and Mouse Pads')
    context['total_mice'] = len(Product.objects.filter(category='Mice and Mouse Pads'))
    context['printers'] = Product.objects.filter(category='Printers, Scanners, and Fax Machines')
    context['total_printers'] = len(Product.objects.filter(category='Printers, Scanners, and Fax Machines'))
    context['consoles'] = Product.objects.filter(category='Consoles')
    context['total_consoles'] = len(Product.objects.filter(category='Consoles'))
    context['miscellaneous'] = Product.objects.filter(category='Miscellaneous')
    context['total_miscellaneous'] = len(Product.objects.filter(category='Miscellaneous'))
    return context

Why is this Django QuerySet returning no results?

I have this class in a project, and I'm trying to get the previous and next elements relative to the current one.
def get_context(self, request, *args, **kwargs):
    context = super(GuidePage, self).get_context(request, *args, **kwargs)
    context = get_article_context(context)
    all_guides = GuidePage.objects.all().order_by("-date")
    context['all_guides'] = all_guides
    context['next_guide'] = all_guides.filter(date__lt=self.date)
    context['prev_guide'] = all_guides.filter(date__gt=self.date)
    print context['next_guide']
    print context['prev_guide']
    return context
These two lines:
context['prev_guide'] = all_guides.filter(date__lt=self.date)
context['next_guide'] = all_guides.filter(date__gt=self.date)
are returning empty results as printed in the console:
<QuerySet []>
<QuerySet []>
What am I missing?
EDIT:
I changed lt and gt to lte and gte. As I understand it, that will also include results with an equal date.
In this case I got ALL elements. All elements were created on the same day but, of course, at different times, so they should differ by minutes. Is this difference not taken into account when filtering for greater/lesser?
If you want to filter not only by date but also by time, you must change the relevant field in your model from a DateField to a DateTimeField.
Like this:
from django.db import models

class MyModel(models.Model):
    date_time = models.DateTimeField()
Now you can do things like all_guides.filter(date_time__lte=self.date_time) or all_guides.filter(date_time__gte=self.date_time).
Be careful to use two underscores (__) in the lookup.
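To see why the strict comparisons returned nothing and the inclusive ones returned everything, here is a small plain-Python sketch of what the two field types compare. It does not touch the database; the three timestamps are made-up values standing in for guides created on the same day.

```python
from datetime import datetime

# Three hypothetical guides created on the same day at different times.
created = [
    datetime(2017, 5, 1, 9, 0),
    datetime(2017, 5, 1, 9, 30),
    datetime(2017, 5, 1, 10, 15),
]
current = created[1]

# What a DateField stores: the calendar day only, so all three are equal.
days = [moment.date() for moment in created]
today = current.date()
before_by_day = [d for d in days if d < today]   # like date__lt
upto_by_day = [d for d in days if d <= today]    # like date__lte

# What a DateTimeField stores: the full timestamp, so they differ.
before_by_time = [m for m in created if m < current]  # like date_time__lt
after_by_time = [m for m in created if m > current]   # like date_time__gt

print(len(before_by_day), len(upto_by_day))     # 0 3
print(len(before_by_time), len(after_by_time))  # 1 1
```

With only the day stored, "less than" matches nothing and "less than or equal" matches every row, which is the behaviour described in the question.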

HAYSTACK_SIGNAL_PROCESSOR failing to update index in realtime as None object is passed

I am using Haystack and Elasticsearch. I build my index as follows:
class BookIndex(indexes.SearchIndex, indexes.Indexable):
    text = indexes.CharField(document=True, use_template=True)
    content_auto = indexes.EdgeNgramField(model_attr='title', boost=1.5)
    isbn_13 = indexes.CharField(model_attr='isbn_13')
    category = indexes.CharField()
    sub_category = indexes.CharField()

    def prepare_sellers(self, Book):
        return [seller.name for seller in Book.sellers.all()]

    def prepare_category(self, Book):
        return [Book.category.name]

    def prepare_sub_category(self, Book):
        return [Book.sub_category.name]
And I have included the following in the settings file:
HAYSTACK_SIGNAL_PROCESSOR = 'haystack.signals.RealtimeSignalProcessor'
But when I add data to my database (http://dpaste.com/01739SW), the Haystack index update fails with the following error: http://dpaste.com/2YGXZ8J
Can someone please help me fix this issue? Thank you.

UnprojectedPropertyError in google app engine

The following is working code on Google App Engine. It displays some records from a SentMail model. SentMail has a large number of fields; here I show only the ones needed. Since displaying the data does not require the complete record, I used a projection. Note: this code works.
class SentMail(ndb.Model):
    to_email = ndb.StringProperty()
    bounced = ndb.BooleanProperty(default=False)
    bounce_type = ndb.StringProperty()
    bounceSubType = ndb.StringProperty()

    @staticmethod
    def get_bounced_emails():
        db_query = SentMail.query(SentMail.bounced==True, projection=['to_email', 'bounce_type'], distinct=True).order(SentMail.bounce_type).order(SentMail.to_email).fetch()
        return db_query if len(db_query) > 0 else None

class BounceCompaintRender(session_module.BaseSessionHandler):
    """docstring for BounceCompaintRender"""
    def get(self):
            bounced_emails = self.get_data('complete_bounced_rec')
            template_values = {
                'bounced_emails': bounced_emails
            }
            path = os.path.join(os.path.dirname(__file__), 'templates/bounce_complaint_emails.html')
            self.response.out.write(template.render(path, template_values))
        else:
            self.redirect('/login')

    def get_data(self, key):
        data = memcache.get(key)
        if data is not None:
            for each in data:
                logging.info(each)
            return data
        else:
            data = SentMail.get_bounced_emails()
            memcache.add(key="complete_bounced_rec", value=data, time=3600)
            return data
Here the only change I made is in SentMail.get_bounced_emails():
@staticmethod
def get_bounced_emails():
    db_query = SentMail.query(SentMail.bounced==True, projection=['to_email', 'bounceSubType'], distinct=True).order(SentMail.bounceSubType).order(SentMail.to_email).fetch()
    return db_query if len(db_query) > 0 else None
I now get the error UnprojectedPropertyError: Property bounceSubType is not in the projection. I checked the logs and found that one property is missing from the _projection field, even though the entity shows a value of None for it (None is not the only value stored). I tried clearing memcache, but the problem persists. The following is the log:
SentMail(key=Key('SentMail', 655553213235462), bounceSubType=None, to_email=u'test@example.com', _projection=('to_email',))
This error is due to a mismatch between the SentMail model defined in models.py and the entities stored in the datastore (i.e., their properties differ). To fix it, you need to update all SentMail records in the datastore so that the datastore and the model have the same fields.

Separating "user-owned" from "other" data in Django template

I have an Openstack-powered, Django-modified application that shows the disk images and snapshots available for a user to launch. The user currently sees both snapshots they created and ones they did not. I would like to separate the current table into two based on whether they are owned by the user or not.
My two table definitions are as follows (note I altered row_actions accordingly):
class UserSnapshotsTable(OldSnapshotsTable):
    cloud = tables.Column(get_cloud, verbose_name=_("Cloud"))

    class Meta:
        name = "usersnapshots"
        verbose_name = _("User Snapshots")
        table_actions = (DeleteSnapshot,)
        row_actions = (LaunchSnapshot, LaunchCluster, EditImage, DeleteSnapshot)
        pagination_param = "snapshot_marker"
        row_class = UpdateRow
        status_columns = ["status"]

class OtherSnapshotsTable(OldSnapshotsTable):
    cloud = tables.Column(get_cloud, verbose_name=_("Cloud"))

    class Meta:
        name = "othersnapshots"
        verbose_name = _("Other Snapshots")
        table_actions = (DeleteSnapshot,)
        row_actions = (LaunchSnapshot, LaunchCluster)
        pagination_param = "snapshot_marker"
        row_class = UpdateRow
        status_columns = ["status"]
I have altered the HTML template to pull the "UserSnapshotsTable" and "OtherSnapshotsTable" tables (I copied the original table and renamed both), but both full tables still generate under the respective headings. There are two functions generating the data:
def get_usersnapshots_data(self):
    req = self.request
    marker = req.GET.get(UserSnapshotsTable._meta.pagination_param, None)
    try:
        usersnaps, self._more_snapshots = api.snapshot_list_detailed(req,
                                                                     marker=marker)
    except:
        usersnaps = []
        exceptions.handle(req, _("Unable to retrieve user-owned snapshots."))
    return usersnaps

def get_othersnapshots_data(self):
    req = self.request
    marker = req.GET.get(OtherSnapshotsTable._meta.pagination_param, None)
    try:
        othersnaps, self._more_snapshots = api.snapshot_list_detailed(req,
                                                                      marker=marker)
    except:
        othersnaps = []
        exceptions.handle(req, _("Unable to retrieve non-user-owned snapshots."))
    return othersnaps
There are also Edit/Delete options defined for images, and imported for snapshots, that seem to have a key comparison. Here's the "Delete" one (line 7):
class DeleteImage(tables.DeleteAction):
    data_type_singular = _("Image")
    data_type_plural = _("Images")

    def allowed(self, request, image=None):
        if image:
            return image.owner == request.user.tenant_id
        # Return True to allow table-level bulk delete action to appear.
        return True

    def delete(self, request, obj_id):
        api.image_delete(request, obj_id)
How can I separate those tables out? This is my first time asking a question here, so please let me know if I can provide further information. Apologies for the length of it.
As far as I can see, you are using glanceclient. If so, you can use the extra_filters parameter of snapshot_list_detailed() to fetch only the user's images, like this:
usersnaps, self._more_snapshots = api.snapshot_list_detailed(
    req,
    marker=marker,
    extra_filters={"owner": "user_name"}
)
Under the hood, snapshot_list_detailed uses the GET images call of the OpenStack Image Service API.
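If filtering on the server is not an option, another possibility is to fetch the list once and split it on the client, reusing the ownership test that DeleteImage.allowed already applies (image.owner == request.user.tenant_id). The sketch below is only an illustration under that assumption: split_by_owner is a hypothetical helper, and SimpleNamespace objects with made-up names and tenant IDs stand in for the image objects returned by the API.

```python
from types import SimpleNamespace


def split_by_owner(snapshots, tenant_id):
    """Partition snapshots into (owned by tenant_id, owned by anyone else)."""
    owned, others = [], []
    for snap in snapshots:
        # Same comparison as DeleteImage.allowed in the question.
        if snap.owner == tenant_id:
            owned.append(snap)
        else:
            others.append(snap)
    return owned, others


# Stand-in data (assumption): real code would pass the API result here.
snapshots = [
    SimpleNamespace(name="snap-a", owner="tenant-1"),
    SimpleNamespace(name="snap-b", owner="tenant-2"),
    SimpleNamespace(name="snap-c", owner="tenant-1"),
]

owned, others = split_by_owner(snapshots, "tenant-1")
print([s.name for s in owned], [s.name for s in others])
```

Each get_*_data method could then return its half of the split. Note that splitting after a paginated fetch can leave the two tables with uneven page sizes, which is one reason the server-side filter may be preferable.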
