Split in Python, write in a Django template - python

asd = "qweasdzxc";
qwen = asd.split("")
self.response.out.write(qwen[0]) # write q
I can split the string this way, but I want to do it with a Django template in my HTML document.
How can I do this?
Thanks for helping.

You might want to write a custom filter for Django for this. Here is a snippet for splitting.
There are also make_list and slice (look on the same page as the last link) to accomplish the same goal.
Given the example, slice is probably what you want.
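If you do go the custom-filter route, a minimal version might look like this (the module path and the filter name letters are illustrative, not from the snippet):
# myapp/templatetags/string_extras.py
from django import template

register = template.Library()

@register.filter
def letters(value):
    """Split a string into a list of its individual characters."""
    return list(value)
After {% load string_extras %}, {{ asd|letters|first }} renders "q". The built-ins work too: {{ asd|make_list }} produces the same character list, and {{ asd|slice:":1" }} or {{ asd|first }} gives just the first character.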

Related

Django SearchVector using icontains

I am trying to search for a list of values across multiple columns in Postgres (via Django). I was able to use SearchQuery and SearchVector, and this works great if one of the search values matches a full word. I was hoping to use icontains so that partial strings could also be used in the search. Is this possible, and if so, could someone point me in the right direction? Here is an example of my approach below.
Example Data:
Superhero.objects.create(
    superhero='Batman',
    publisher='DC Comics',
    alter_ego='Bruce Wayne',
)
Superhero.objects.create(
    superhero='Hulk',
    publisher='Marvel Comics',
    alter_ego='Bruce Banner',
)
Django filter:
from django.contrib.postgres.search import SearchQuery, SearchVector
query = SearchQuery('man') | SearchQuery('Bruce')
vector = SearchVector('superhero', 'alter_ego', 'publisher')
queryset = queryset.annotate(search=vector).filter(search=query)
This returns the Hulk record, but I am hoping I can somehow use something like icontains so that when searching for 'man' the Batman record would also be returned. Any help is appreciated!
You can apply icontains to the filter like:
queryset = queryset.annotate(search=vector).filter(search__icontains=query)
So SearchQuery and SearchVector are part of Django's full-text search functionality, and it doesn't look like you can achieve what I was wanting to do with these functions. I have taken a different approach, thanks to Julien Phalip's approach here: https://www.julienphalip.com/blog/adding-search-to-a-django-site-in-a-snap/
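For reference, a rough sketch of that kind of icontains search built from Q objects, roughly in the spirit of the linked post (the helper name is mine; the field names come from the Superhero example above):
from django.db.models import Q

def icontains_search(queryset, terms):
    # OR together an icontains lookup per term across the three text columns.
    query = Q()
    for term in terms:
        query |= (
            Q(superhero__icontains=term)
            | Q(alter_ego__icontains=term)
            | Q(publisher__icontains=term)
        )
    return queryset.filter(query)

# icontains_search(Superhero.objects.all(), ['man', 'Bruce']) matches both Batman and Hulk.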

Grouping a list of similar URLs in Python

I have a large set of URLs. Some are similar to each other, i.e. they represent a similar set of pages.
For example:
http://example.com/product/1/
http://example.com/product/2/
http://example.com/product/40/
http://example.com/product/33/
are similar. Similarly,
http://example.com/showitem/apple/
http://example.com/showitem/banana/
http://example.com/showitem/grapes/
are also similar. So I need to represent them as http://example.com/product/(Integers)/,
where (Integers) = 1, 2, 40, 33, and as http://example.com/showitem/(strings)/, where (strings) = apple, banana, grapes... and so on.
Is there any built-in function or library in Python to find these similar URLs in a large set of mixed URLs? How can this be done efficiently? Please suggest. Thanks in advance.
Use a string to store the first part of the URL and just handle the IDs, for example:
In [1]: PRODUCT_URL='http://example.com/product/%(id)s/'
In [2]: _ids = '1 2 40 33'.split() # split string into list of IDs
In [3]: for id in _ids:
   ...:     print PRODUCT_URL % {'id':id}
   ...:
http://example.com/product/1/
http://example.com/product/2/
http://example.com/product/40/
http://example.com/product/33/
The statement print PRODUCT_URL % {'id':id} uses Python string formatting to build the product URL from the id variable passed in.
UPDATE:
I see you've changed your question. The solution to your problem is quite domain-specific and depends on your data set. There are several approaches, some more manual than others. One such approach would be to get the top-level URLs, i.e. to retrieve the domain name:
In [7]: _url = 'http://example.com/product/33/' # url we're testing with
In [8]: ('/').join(_url.split('/')[:3]) # get domain
Out[8]: 'http://example.com'
In [9]: ('/').join(_url.split('/')[:4]) # get domain + first URL sub-part
Out[9]: 'http://example.com/product'
[:3] and [:4] above just slice the list resulting from split('/').
You can set the result as a key on a dict and keep a count of how many times you encounter each URL part, and move on from there. Again, the solution depends on your data. If it gets more complex than the above, I suggest you look into regex, as the other answers suggest.
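A small sketch of that dict-based grouping (the function name is mine, not from the answer above):
from collections import defaultdict

def group_by_prefix(urls):
    # Group URLs by domain + first path segment and collect the members.
    groups = defaultdict(list)
    for url in urls:
        prefix = '/'.join(url.split('/')[:4])  # e.g. 'http://example.com/product'
        groups[prefix].append(url)
    return groups

urls = [
    'http://example.com/product/1/',
    'http://example.com/product/40/',
    'http://example.com/showitem/apple/',
]
for prefix, members in group_by_prefix(urls).items():
    print('%s -> %d urls' % (prefix, len(members)))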
You can use regular expressions to handle those cases. You can go to the Python documentation to see how this is handled.
You can also see how Django implements this in its routing system.
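As a rough illustration of the regex route, something like this collapses the two URL shapes from the question into patterns (the regexes are mine and only cover those two shapes):
import re

def url_pattern(url):
    # Replace a trailing numeric segment, then a trailing word segment.
    url = re.sub(r'/\d+/$', '/(Integers)/', url)
    url = re.sub(r'/[A-Za-z]+/$', '/(strings)/', url)
    return url

print(url_pattern('http://example.com/product/40/'))      # http://example.com/product/(Integers)/
print(url_pattern('http://example.com/showitem/apple/'))  # http://example.com/showitem/(strings)/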
I'm not exactly sure what specifically you are looking for. It sounds to me like you are looking for something to match URLs. If this is indeed what you want, then I suggest you use something built with regular expressions. One example can be found here.
I also suggest you take a look at Django and its routing system.
Not in Python, but I've created a Ruby library (and an accompanying app):
https://rubygems.org/gems/LinkGrouper
It works on all links (it doesn't need to know any pattern).

How do I modify a URL that I pick at random in Python

I have an app that shows images from reddit. Some images come like this: http://imgur.com/Cuv9oau, when I need them to look like this: http://i.imgur.com/Cuv9oau.jpg. Just add an "i." at the beginning of the host and ".jpg" at the end.
You can use a string replace:
s = "http://imgur.com/Cuv9oau"
s = s.replace("//imgur", "//i.imgur")+(".jpg" if not s.endswith(".jpg") else "")
This sets s to:
'http://i.imgur.com/Cuv9oau.jpg'
This function should do what you need. I expanded on jh314's response, made the code a little less compact, and checked that the URL starts with http://imgur.com, since that code would cause issues with other URLs, like the Google search I included. It also only replaces the first instance, since replacing every instance could cause issues.
def fixImgurLinks(url):
    if url.lower().startswith("http://imgur.com"):
        url = url.replace("http://imgur", "http://i.imgur", 1)  # Only replace the first instance.
        if not url.endswith(".jpg"):
            url += ".jpg"
    return url

for u in ["http://imgur.com/Cuv9oau", "http://www.google.com/search?q=http://imgur"]:
    print fixImgurLinks(u)
Gives:
>>> http://i.imgur.com/Cuv9oau.jpg
>>> http://www.google.com/search?q=http://imgur
You should use Python's regular expressions to insert the i. As for the .jpg, you can just append it.
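For instance, a regex version might look like this (a sketch; the pattern and function name are mine):
import re

def fix_imgur(url):
    # Rewrite http://imgur.com/<id> to http://i.imgur.com/<id>.jpg.
    match = re.match(r'^http://imgur\.com/(\w+)$', url)
    if match:
        return 'http://i.imgur.com/%s.jpg' % match.group(1)
    return url

print(fix_imgur('http://imgur.com/Cuv9oau'))  # http://i.imgur.com/Cuv9oau.jpg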

How to transform hyperlink codes into normal URL strings?

I'm trying to build a blog system, so I need to do things like transforming '\n' into <br /> and transforming http://example.com into <a href='http://example.com'>http://example.com</a>.
The former is easy - just use the string replace() method.
The latter is more difficult, but I found a solution here: Find Hyperlinks in Text using Python (twitter related)
But now I need to implement an "Edit Article" function, so I have to reverse that transformation.
So, how can I transform <a href='http://example.com'>http://example.com</a> back into http://example.com?
Thanks! And I'm sorry for my poor English.
Sounds like the wrong approach. Making round-trips work correctly is always challenging. Instead, store the source text only, and only format it as HTML when you need to display it. That way, alternate output formats / views (RSS, summaries, etc) are easier to create, too.
Separately, we wonder whether this particular wheel needs to be reinvented again ...
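A minimal sketch of that store-source-only idea (the model and helper names are mine): keep the raw text in the database and build the HTML on the way out, so nothing ever needs converting back.
import re
from django.db import models
from django.utils.html import escape

LINK_RE = re.compile(r'(http://[^ ]+)')

class Article(models.Model):
    body = models.TextField()  # raw source text, exactly as the author typed it

    def body_as_html(self):
        # Escape first, then linkify URLs and convert newlines.
        html = escape(self.body)
        html = LINK_RE.sub(r'<a href="\1">\1</a>', html)
        return html.replace('\n', '<br />')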
Since you are using the answer from that other question, your links will always be in the same format, so it should be pretty easy using regex. I don't know Python, but going by the answer to the last question:
import re
myString = "Check out my blog <a href='http://example.com'>http://example.com</a>"
r = re.compile(r"<a href='([^']+)'>.*?</a>")  # capture the href, drop the surrounding tags
print r.sub(r'\1', myString)
Should work.

Django-Haystack with Solr contains search

I am using Haystack within a project, with Solr as the backend. I want to be able to perform a "contains" search, similar to Django's .filter(something__contains="...").
The __startswith option does not suit our needs since, as the name suggests, it looks for words that start with the string.
I tried to use something like *keyword*, but Solr does not allow * to be used as the first character.
Thanks.
To get "contains" functionallity you can use:
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="100" side="back"/>
<filter class="solr.LowerCaseFilterFactory" />
as the index analyzer.
This will create n-grams for every whitespace-separated word in your field. For example:
"Index this!" => x, ex, dex, ndex, index, !, s!, is!, his!, this!
As you can see, this will expand your index greatly, but if you now enter a query like:
"nde*"
it will match "ndex" giving you a hit.
Use this approach carefully to make sure that your index doesn't get too large. If you increase minGramSize or decrease maxGramSize, it will not expand the index as much, but it will also reduce the "contains" functionality. For instance, setting minGramSize="3" will require at least 3 characters in your contains query.
You can achieve the same behavior without having to touch the Solr schema. In your index, make your text field an EdgeNgramField instead of a CharField. Under the hood this will generate a schema similar to what lindstromhenrik suggested.
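A minimal sketch of what such an index might look like (the model and field names are illustrative, not from the answer):
from haystack import indexes
from myapp.models import Note  # illustrative model with a 'body' text field

class NoteIndex(indexes.SearchIndex, indexes.Indexable):
    # EdgeNgramField makes Haystack emit an edge n-gram analyzer in the
    # generated Solr schema, so no manual schema.xml edits are needed.
    text = indexes.EdgeNgramField(document=True, model_attr='body')

    def get_model(self):
        return Note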
I am using an expression like:
.filter(something__startswith='...')
.filter_or(name=''+s'...')
as it seems Solr does not like expressions like '...*', but combined with OR it will do.
None of the answers here does a real substring search for *keyword*.
They don't find a keyword that is part of a bigger string (not just a prefix or suffix).
Using EdgeNGramFilterFactory or EdgeNgramField in the index can only do "startswith" or "endswith" filtering.
The solution is to use an NgramField, like this:
class MyIndex(indexes.SearchIndex, indexes.Indexable):
    ...
    field_to_index = indexes.NgramField(model_attr='field_name')
    ...
This is very elegant, because you don't need to manually add anything to schema.xml.
