Django - regex optional function parameters in URL - python

I am starting off with Django now (already have quite a bit of Python knowledge, as well as with other languages). I am wondering whether it is possible to pass optional parameters through the url to a view (function that is called when a certain url is entered). What I have:
url(regex=r'^bydate/year=(?P<year>[0-9]+)_month=(?P<month>[0-9]+)_day=(?P<day>[0-9]+)/$', view=views.question_by_date, name='question_by_date')
So, in other words, if the end of the url looks like this, for example:
...bydate/year=2001_month=11_day=2/
then it calls the question_by_date function, whose signature looks as follows:
question_by_date(request, **kwargs)
So with the above url, question_by_date will be called as
question_by_date(request, year=2001, month=11, day=2)
But I also want the user to be able to type in the url specifying just the year, e.g.
...bydate/year=2005/
which will call
question_by_date(request, year=2005)
Or for that matter, any combination of year, month, day (like just the year and the month, or just the year and day even, etc.)
So, is this possible? I am not so experienced in regex, and I understand that you can have optional string matches (zero or more) in regex, which will match the above just fine in normal circumstances, but here we are also passing (optional) parameters to a function.
NOTE:
A very similar question to this has already been asked here. I realize that I could make a different URL for each combination, but that would entail making 8 different URLs. Also, that question was asked 6 years ago. Hopefully some enhancement has been made in the meantime?

I think what you need is GET parameters for this:
Define your URL without any parameters:
url(r'^bydate/$', views.question_by_date, name='question-by-date')
In your views, extract GET parameters:
from datetime import date

def question_by_date(request):
    year = request.GET.get('year', 2005)
    month = request.GET.get('month', 1)
    day = request.GET.get('day', 1)
    # use the parameters however you want afterwards
Call your url like:
http://localhost:8000/bydate/?year=2016&month=1&day=1
Check the Django documentation for more details about HTTP GET parameters.
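Note that query-string values arrive as strings, while the defaults above are integers. A minimal standard-library sketch of the same idea, using urllib.parse as a stand-in for Django's request.GET; the helper name parse_date_params and the default date are assumptions made for illustration:

```python
from datetime import date
from urllib.parse import urlsplit, parse_qs


def parse_date_params(url, default=date(2005, 1, 1)):
    """Read optional year/month/day query parameters, falling back to defaults."""
    query = parse_qs(urlsplit(url).query)  # maps each name to a list of strings

    def pick(name, fallback):
        values = query.get(name)
        # Query values are strings, so convert them explicitly.
        return int(values[0]) if values else fallback

    return date(pick('year', default.year),
                pick('month', default.month),
                pick('day', default.day))


print(parse_date_params('http://localhost:8000/bydate/?year=2016&month=1&day=1'))
print(parse_date_params('http://localhost:8000/bydate/?year=2016'))
```

Any combination of the three parameters can be supplied or omitted without extra URL patterns. A non-numeric value would raise ValueError here, so a real view would need to validate its input.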

Related

Return PLS-00306 During login in with python

I am working on a crawler in Python to grab some data from our company's internal website, but when I posted all the data, it returned:
PLS-00306: wrong number or type of arguments in call to PM_USER_LOGIN_SP
ORA-06550: line 1, column 7
PL/SQL: Statement ignored
I checked my Firefox inspector again and again, and all my request data were right. When I removed or changed some of the request data, it returned a different error code.
Can someone help me work out what the problem is?
Oracle procedure PM_USER_LOGIN_SP has one or more parameters, each of them having its own data type. When calling that procedure, you must match number and data type of each of them.
For example, if it expects 3 parameters, you can't pass only 2 (nor 4) of them (because of wrong number of arguments (parameters)).
If parameter #1 is DATE, you can't pass letter A to it (because of a wrong type). Note that DATEs are kind of "special", because something that looks like a date to us, humans (such as 20.01.2018, which is today) passed to Oracle procedure's DATE data type parameter must really be a date. '20.01.2018' is a string, so either pass date literal, such as DATE '2018-01-20' or use appropriate function with a format mask, TO_DATE('20.01.2018', 'dd.mm.yyyy').
Therefore, have a look at the procedure first, pay attention to what it expects. Then check what you pass to it.

Grouping list of similar urls in python

I have a large set of URLs. Some are similar to each other, i.e. they represent a similar set of pages.
For example,
http://example.com/product/1/
http://example.com/product/2/
http://example.com/product/40/
http://example.com/product/33/
are similar. Similarly
http://example.com/showitem/apple/
http://example.com/showitem/banana/
http://example.com/showitem/grapes/
are also similar. So I need to represent them as http://example.com/product/(Integers)/
where (Integers) = 1,2,40,33, and http://example.com/showitem/(strings)/ where (strings) = apple,banana,grapes, and so on.
Is there any built-in function or library in Python to find these similar URLs in a large set of mixed URLs? How can this be done efficiently? Please suggest. Thanks in advance.
Use a string to store the first part of the URL and just handle IDs, example:
In [1]: PRODUCT_URL='http://example.com/product/%(id)s/'
In [2]: _ids = '1 2 40 33'.split() # split string into list of IDs
In [3]: for id in _ids:
   ...:     print(PRODUCT_URL % {'id': id})
   ...:
http://example.com/product/1/
http://example.com/product/2/
http://example.com/product/40/
http://example.com/product/33/
The statement print(PRODUCT_URL % {'id': id}) uses Python string formatting to build the product URL from the variable id.
UPDATE:
I see you've changed your question. The solution for your problem is quite domain-specific and depends on your data set. There are several approaches, some more manual than others. One such approach would be to get the top-level URLs i.e. to retrieve the domain name:
In [7]: _url = 'http://example.com/product/33/' # url we're testing with
In [8]: ('/').join(_url.split('/')[:3]) # get domain
Out[8]: 'http://example.com'
In [9]: ('/').join(_url.split('/')[:4]) # get domain + first URL sub-part
Out[9]: 'http://example.com/product'
[:3] and [:4] above are just slicing the list resulting from split('/')
You can set the result as a key on a dict for which you keep a count of each time you encounter the URL part. And move on from there. Again the solution depends on your data. If it gets more complex than above then I suggest you look into regex as the other answers suggest.
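The counting step described above could look roughly like this. It is a sketch only: the sample list and the helper name url_prefix are made up for illustration, and the prefix depth (4 parts) is the same assumption as the [:4] slice.

```python
from collections import Counter

urls = [
    'http://example.com/product/1/',
    'http://example.com/product/2/',
    'http://example.com/product/40/',
    'http://example.com/showitem/apple/',
    'http://example.com/showitem/banana/',
]


def url_prefix(url, parts=4):
    """Return scheme + domain + first path segment (the [:4] slice)."""
    return '/'.join(url.split('/')[:parts])


# Count how often each prefix occurs across the whole set.
counts = Counter(url_prefix(u) for u in urls)
for prefix, n in counts.most_common():
    print(prefix, n)
```

Prefixes with a high count are candidates for a group of similar pages.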
You can use regular expressions to handle these cases. The Python documentation for the re module explains how.
You can also look at how Django implements this in its URL routing system.
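A rough sketch of the regex approach, under the assumption that only the last path segment varies within a group. The function name url_template and the placeholder strings are illustrative choices, not part of any library:

```python
import re
from collections import defaultdict
from urllib.parse import urlsplit


def url_template(url):
    """Replace the last path segment with a placeholder describing its kind."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split('/') if s]
    if not segments:
        return url, ''  # nothing to generalize
    last = segments[-1]
    placeholder = '(Integers)' if re.fullmatch(r'[0-9]+', last) else '(strings)'
    path = '/'.join(segments[:-1] + [placeholder])
    return f'{parts.scheme}://{parts.netloc}/{path}/', last


urls = [
    'http://example.com/product/1/',
    'http://example.com/product/2/',
    'http://example.com/product/40/',
    'http://example.com/product/33/',
    'http://example.com/showitem/apple/',
    'http://example.com/showitem/banana/',
    'http://example.com/showitem/grapes/',
]

# Group the varying values under the template they belong to.
groups = defaultdict(list)
for url in urls:
    template, value = url_template(url)
    groups[template].append(value)

for template, values in groups.items():
    print(template, values)
```

Real data sets usually need more rules than this (several varying segments, query strings), so treat the classification step as a starting point.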
I'm not exactly sure what specifically you are looking for. It sounds to me that you are looking for something to match URLs. If this is indeed what you want then I suggest you use something that is built using regular expressions. One example can be found here.
I also suggest you take a look at Django and its routing system.
Not in Python, but I've created a Ruby Library (and an accompanying app) --
https://rubygems.org/gems/LinkGrouper
It works on all links (doesn't need to know any pattern).

How to define multiple optional variables in the URL?

The Django docs suggest using urlpatterns instead of GET parameters, and they provide a convenient way to handle these variables. But if any of the variables is optional, I have to write more lines in urls.py. Is there a way to avoid this?
Example:
If I want to select the posts from a given year, I should add something like this to urlpatterns:
url(r'^articles/(?P<year>\d{4})/$', 'news.views.show_archive'),
url: .../articles/1994/
If I want to select the posts from a particular month of a specific year, I should add something like this to urlpatterns:
url(r'^articles/(?P<year>\d{4})/(?P<month>\d{2})/$', 'news.views.show_archive'),
url: .../articles/2003/03/
But if I want to see the posts created in a particular month across all years, I also have to add this line:
url(r'^articles/(?P<month>\d{2})/$', 'news.views.show_archive'),
url: .../articles/03/
But I would like to write only one line that specifies the full set of variables and handles any of these URLs.
To be honest I'm not sure that this is possible.
Regexps can have optional parts, and view functions can have optional arguments. Also, you can still use querystrings (through request.GET) for what has no business being part of the URL (like query terms for a "search" view, or ordering and filtering for a listing view).
The point of using urlpatterns instead of querystrings is to build clean "semantic" URLs, i.e. /blog/posts/<post_id>/ instead of /blog/posts/?post_id=<post_id>.
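To illustrate the two ingredients (optional regex parts and optional function arguments) outside Django, here is a small sketch using only the re module. The names ARCHIVE_RE, dispatch and show_archive are hypothetical stand-ins, and dispatch is not Django's URL resolver:

```python
import re

# Each segment is wrapped in an optional non-capturing group.
ARCHIVE_RE = re.compile(r'^articles/(?:(?P<year>\d{4})/)?(?:(?P<month>\d{2})/)?$')


def show_archive(year=None, month=None):
    """Stand-in for a view: optional arguments default to None."""
    return {'year': year, 'month': month}


def dispatch(path):
    """Match the path and call the view with the groups that matched."""
    match = ARCHIVE_RE.match(path)
    if match is None:
        return None
    # Unmatched optional groups come back as None; drop them.
    kwargs = {k: v for k, v in match.groupdict().items() if v is not None}
    return show_archive(**kwargs)


for path in ['articles/1994/', 'articles/2003/03/', 'articles/03/', 'articles/']:
    print(path, dispatch(path))
```

One pattern covers all four URL shapes here because a four-digit year and a two-digit month cannot be confused. With parameters that look alike, optional groups become ambiguous and querystrings are the safer choice.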
You could try something like this, making each part of the pattern optional:
url(r'^articles/(?:(?P<year>\d{4})/)?(?:(?P<month>\d{2})/)?$', 'news.views.show_archive'),

def show_archive(request, year=None, month=None):
    if year and month:
        ...
    elif year:
        ...
    elif month:
        ...

How do I display most recent week items by default with WeekArchiveView?

I'm astonished by how little documentation on class-based generic views there is.
Anything slightly more complex than a trivial sample has to get done through guesswork, trial and error.
I want to use WeekArchiveView to display a week's item list.
Here's my urls.py entry:
url(r'^items/(?P<year>\d{4})/week/(?P<week>\d{1,2})/$', ItemWeekArchiveView.as_view())
When no year or week is specified, I get an error page.
I want them to equal today's year and week by default.
Where is the right place to tweak this? Should I introduce another mixin and override a method?
Urls like /items/ or /items/2011/ wouldn't match your regexp because \d{4} means exactly 4 digits.
You should probably add separate URL entries for those cases, for example:
url(r'^items/$', AchievementListView.as_view(
    year=str(date.today().year), week=str(date.today().isocalendar()[1])
)),
url(r'^items/(?P<year>\d{4})/week/(?P<week>\d{1,2})/$', ItemWeekArchiveView.as_view()),
(Using isocalendar to get the week number).
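Two caveats are worth a sketch. Arguments passed to as_view() in urls.py are evaluated once, when the module is imported, so the default week would go stale in a long-running process. Also, around New Year the ISO year can differ from the calendar year. The helper below (default_year_week is a hypothetical name) computes both values at call time. Whether ISO numbering matches the week format your view is configured with is something to verify separately.

```python
from datetime import date


def default_year_week(today=None):
    """Return (year, week) strings for the ISO week containing `today`."""
    today = today or date.today()
    # isocalendar() yields (ISO year, ISO week number, ISO weekday).
    iso_year, iso_week, _weekday = today.isocalendar()
    return str(iso_year), str(iso_week)


print(default_year_week())                   # current year and week
print(default_year_week(date(2021, 1, 1)))   # ISO year differs from 2021 here
```

Calling such a helper from inside the view (per request) rather than from urls.py keeps the default current.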

Irregular String Parsing on Python

I'm new to Python/Django and I am trying to get more useful information out of my scraper. Currently, the scraper takes a list of comic book titles and correctly divides them into a CSV list in three parts (Published Date, Original Date, and Title). I then pass the current date and title through to different parts of my database, which I do in my Loader script (convert mm/dd/yy into yyyy-mm-dd, save to "pub_date" column, title goes to "title" column).
A common string can look like this:
10/12/11|10/12/11|Stan Lee's Traveler #12 (10 Copy Incentive Cover)
I am successfully grabbing the date, but the title is trickier. In this instance, I'd ideally like to fill three different columns with the information after the second "|". The title should go to "title", a CharField. The number 12 (after the '#') should go into the DecimalField "issue_num", and everything between the parentheses should go into the "Special" CharField. I am not sure how to do this kind of rigorous parsing.
Sometimes, there are multiple #'s (one comic in particular is described as a bundle, "Containing issues #90-#95") and several have multiple '()' groups (such as "Betrayal Of The Planet Of The Apes #1 (Of 4)(25 Copy Incentive Cover)").
What would be a good way to start cracking this problem? My knowledge of if/else statements quickly fell apart for the more complicated lines. How can I efficiently and (if possible) Pythonically parse these lines and subdivide them so I can later slot them into the correct place in my database?
Use the regular expression module re. For example, if you have the third |-delimited field of your sample record in a variable s, then you can do
import re
match = re.match(r"^(?P<title>[^#]*) #(?P<num>[0-9]+) \((?P<special>.*)\)$", s)
title = match.group('title')
issue = match.group('num')
special = match.group('special')
If a record does not fit the pattern, re.match returns None and the last three lines raise an AttributeError. Adapt the RE until it parses everything you want.
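One possible adaptation, sketched under the assumption that every title contains a single '#<number>' followed by zero or more parenthesized groups. The names TITLE_RE and parse_title are illustrative only:

```python
import re

# Title text, then '#<issue>', then whatever follows (parsed separately).
TITLE_RE = re.compile(r'^(?P<title>[^#]*?)\s*#(?P<num>[0-9]+)\s*(?P<rest>.*)$')


def parse_title(s):
    """Return (title, issue number, list of parenthesized notes), or None."""
    match = TITLE_RE.match(s)
    if match is None:
        return None  # no '#<number>' in this record
    # Collect every '(...)' group instead of requiring exactly one.
    specials = re.findall(r'\(([^)]*)\)', match.group('rest'))
    return match.group('title'), int(match.group('num')), specials


print(parse_title("Stan Lee's Traveler #12 (10 Copy Incentive Cover)"))
print(parse_title("Betrayal Of The Planet Of The Apes #1 (Of 4)(25 Copy Incentive Cover)"))
print(parse_title("A title without an issue number"))
```

Returning None for unparseable records lets the caller log them and refine the pattern, rather than crashing on the first odd title.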
Parsing the title is the hard part; it sounds like you can handle the dates etc. yourself. The problem is that no single rule can parse every title. There are many rules, and you can only guess which one works on a particular title.
I usually handle this by creating a list of rules, from most specific to general and try them out one by one until one matches.
To write such rules you can use the re module or even pyparsing.
The general idea goes like this:
import re

class CantParse(Exception):
    pass

# one rule to parse one kind of title
def title_with_special(title):
    """ accepts only a title of the form
    <text> #<issue> (<special>) """
    m = re.match(r"[^#]*#(\d+) \(([^)]+)\)", title)
    if m:
        return m.group(1), m.group(2)
    else:
        raise CantParse(title)

def parse_extra(title, rules):
    """ tries to parse extra information from a title using the rules """
    for rule in rules:
        try:
            return rule(title)
        except CantParse:
            pass
    # nothing matched
    raise CantParse(title)

# let's try this out
rules = [title_with_special]  # list of rules to apply, add more functions here
titles = ["Stan Lee's Traveler #12 (10 Copy Incentive Cover)",
          "Betrayal Of The Planet Of The Apes #1 (Of 4)(25 Copy Incentive Cover) )"]
for title in titles:
    try:
        issue, special = parse_extra(title, rules)
        print("Parsed", title, "to issue=%s special='%s'" % (issue, special))
    except CantParse:
        print("No matching rule for", title)
As you can see, the first title is parsed correctly, but the second is not: the rule stops after the first parenthesized group ('Of 4') and drops the rest. You'll have to write a set of rules that account for every title format in your data.
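As a sketch of what additional rules might look like, here are two more in the same style, ordered from most specific to most general. The rule names and the choice to return None for the special part are assumptions made for illustration. The block repeats CantParse and parse_extra so that it runs on its own.

```python
import re


class CantParse(Exception):
    pass


def title_with_issue_range(title):
    """Accept titles that mention a range such as '#90-#95'."""
    m = re.search(r"#(\d+)\s*-\s*#(\d+)", title)
    if m is None:
        raise CantParse(title)
    return (int(m.group(1)), int(m.group(2))), None


def title_with_issue_only(title):
    """Fallback: a single '#<issue>', ignoring anything after it."""
    m = re.search(r"#(\d+)", title)
    if m is None:
        raise CantParse(title)
    return int(m.group(1)), None


def parse_extra(title, rules):
    """Try each rule in order and return the first result."""
    for rule in rules:
        try:
            return rule(title)
        except CantParse:
            continue
    raise CantParse(title)


rules = [title_with_issue_range, title_with_issue_only]  # most specific first
print(parse_extra("Containing issues #90-#95", rules))
print(parse_extra("Stan Lee's Traveler #12", rules))
```

The order matters: the fallback rule would also accept the range title, so it has to come last.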
Regular expressions are the way to go. But if you feel uncomfortable writing them, you can try a small parser that I wrote (https://github.com/hgrecco/stringparser). It translates a string format (PEP 3101) to a regular expression. In your case, you would do the following:
>>> from stringparser import Parser
>>> p = Parser(r"{date:s}\|{date2:s}\|{title:s}#{issue:d} \({special:s}\)")
>>> x = p("10/12/11|10/12/11|Stan Lee's Traveler #12 (10 Copy Incentive Cover)")
OrderedDict([('date', '10/12/11'), ('date2', '10/12/11'), ('title', "Stan Lee's Traveler "), ('issue', 12), ('special', '10 Copy Incentive Cover')])
>>> x.issue
12
The output in this case is an (ordered) dictionary. This will work for any simple cases and you might tweak it to catch multiple issues or multiple ()
One more thing: notice that in the current version you need to manually escape regex characters (i.e. if you want to find |, you need to type \|). I am planning to change this soon.
