MongoAlchemy regex return nothing - python

I'm testing MongoAlchemy for a project and I have to search for users by name.
I'm trying to use a regex, but the query result is always empty.
I tried two methods:
import re
users = User.query.filter({"name":re.compile("/a/", re.IGNORECASE)}).all()
And:
users = User.query.filter(User.name.regex('/a/', ignore_case=True)).all()
Even if I use a very general regex like /.*/, the result is always empty.
Thank you.

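In Python, regular expressions are not written as /regexp/; that is JavaScript syntax. In a Python pattern the slashes are literal characters, so "/a/" only matches names that contain the text /a/.
The proper way to create the regular expression is: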
re.compile(r".*", re.IGNORECASE)
So you should use:
users = User.query.filter({"name": re.compile(r".*", re.IGNORECASE)}).all()
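To see why the original queries came back empty, a quick check with the standard re module shows that the slashes are matched literally. No database is involved here, and the sample names are made up for illustration:

```python
import re

names = ["Anna", "Bob", "Carla"]  # hypothetical sample data

# JavaScript-style delimiters are not stripped: this pattern requires literal slashes
js_style = re.compile("/a/", re.IGNORECASE)
# The same search written as a plain Python pattern
py_style = re.compile(r"a", re.IGNORECASE)

js_matches = [n for n in names if js_style.search(n)]
py_matches = [n for n in names if py_style.search(n)]

print(js_matches)  # []
print(py_matches)  # ['Anna', 'Carla']
```

Assuming the filter dictionary is handed to MongoDB as in the answer above, a pattern such as re.compile(r"a", re.IGNORECASE) would then select the names containing the letter a, rather than every document as ".*" does.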

Related

How to remove the sensitive information before @github.com to sanitize it correctly using Python 3.9 and/or regex?

I need to include a username and token in a github url to access a private repo on github.
After accessing it, I need to sanitize it to obtain the clean version.
The input pattern is https://{username}:{token}@github.com/{repo_owner}/{repo-name}
The output pattern I want is https://github.com/{repo_owner}/{repo-name}
For example, I am given this:
https://usernameabc:token1234@github.com/abc/easy-as-123
and I want this:
https://github.com/abc/easy-as-123
How do I do this with Python? I am okay with using regex.
What I use that works
I am currently using this:
def sanitize_github_url(github_url_with_username_token):
    github_url_with_username_token = github_url_with_username_token.lower()
    index = github_url_with_username_token.find("github.com/", 0)
    suffix = github_url_with_username_token[index:]
    return f"https://{suffix}"
And it works for my purposes. Is there a better way to do this?
I'd prefer not to use regex in this scenario, and would instead use a URL manipulation library like furl.
For example:
from furl import furl
url = furl("https://usernameabc:token1234@github.com/abc/easy-as-123")
url.password = None
url.username = None
print(str(url))
output:
https://github.com/abc/easy-as-123
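If adding a dependency is not an option, roughly the same thing can be sketched with the standard library's urllib.parse. This is an alternative to furl, not what the answer above uses. The function name strip_credentials is made up for illustration, and the example assumes the credentials are separated from the host by @, as in standard URL syntax:

```python
from urllib.parse import urlsplit, urlunsplit


def strip_credentials(url):
    """Return the URL with any username:password@ part removed (hypothetical helper)."""
    parts = urlsplit(url)
    # hostname excludes the userinfo; note that it is returned in lowercase
    host = parts.hostname or ""
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    return urlunsplit((parts.scheme, host, parts.path, parts.query, parts.fragment))


print(strip_credentials("https://usernameabc:token1234@github.com/abc/easy-as-123"))
# https://github.com/abc/easy-as-123
```

Unlike the lower()-based approach in the question, this leaves the case of the repository path untouched.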
Use a regex with lookbehind and lookahead:
import re
raw = r'https://usernameabc:token1234@github.com/abc/easy-as-123'
re.sub(r"(?<=https://).*?(?=github\.com)", "", raw)

How can I use regex in django's Replace function

I'm trying to update all URLs in my queryset with Django's update() and the Replace function, using a regex.
Here's what I've tried so far, but it seems Django's Value expression does not recognize my regex.
from django.db.models import Value
from django.db.models.functions import Replace
Foo.objects.filter(
    some_url__iregex=r'^\/some-link\/\d+\/'
).update(
    some_url=Replace(
        'some_url',
        Value(r'^\/some-link\/\d+\/'),
        Value('/some-link/'),
    )
)
My goal is to remove all numbers after /some-link/ (e.g. /some-link/55/test to just /some-link/test)
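Django's Replace performs a literal substring replacement (it maps to the SQL REPLACE function), so the pattern inside Value() is treated as plain text rather than as a regex. I wasn't able to get Replace working for this, but I was able to use the REGEXP_REPLACE function described in this answer: https://stackoverflow.com/a/68402017/3056056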
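For reference, the intended transformation can be checked in plain Python before pushing it into a database function. The sketch below only exercises the pattern with re.sub and does not touch Django. The helper name and the second sample path are made up for illustration:

```python
import re

# Same pattern as in the question; the slashes need no escaping in Python
PATTERN = re.compile(r'^/some-link/\d+/')


def strip_numeric_segment(path):
    """Drop the numeric segment that follows /some-link/, if present."""
    return PATTERN.sub('/some-link/', path, count=1)


print(strip_numeric_segment('/some-link/55/test'))  # /some-link/test
print(strip_numeric_segment('/other/55/test'))      # /other/55/test
```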

Django domain + regex parameter not working on production machine

I currently have a Django view with a fairly simple search function (it takes user input and returns a list of objects). For usability, I'd like the option of passing search parameters via the URL, like so:
www.example.com/search/mysearchstring
Where mysearchstring is the input to the search function. I'm using regex to validate any alphanumeric or underscore characters.
The problem I'm having is that while this works perfectly in my development environment, it breaks on the live machine.
Currently, I am using this exact same method (with different regex patterns) in other Django views without any issues. This leads me to believe that either:
1) My regex is truly bad (more likely)
2) There is a difference in regex validators between environments (less likely)
The machine running this is using django 1.6 and python 2.7, which are slightly behind my development machine, but not significantly.
urls.py
SEARCH_REGEX = '(?P<pdom>\w*)?'
urlpatterns = patterns('',
    ....
    url(r'^polls/search/' + SEARCH_REGEX, 'polls.views.search'),
    ...)
This is passed to the view like this:
views.py
def search(request, pdom):
    ...
When loading up the page, I get the following error:
ImproperlyConfigured: "^polls/search/(?P<pdom>\w*)?" is not a valid regular expression: nothing to repeat
I've been scratching my head over this one for a while. I've attempted to use a few different methods of encapsulation around the expression with no change in results. Would appreciate any insight!
I would change it to this:
SEARCH_REGEX = r'(?P<pdom>.+)$'
It's usually a good idea to use raw strings (r'') for regular expressions in Python. The error most likely comes from the trailing ?: the group can already match an empty string because of \w*, so making it optional is redundant, and some older versions of Python's re module reject that combination with "nothing to repeat".
The group will match the entire content of the search part of your URL. I would handle query string validation in the view instead of in the URL regex. If someone tries to search polls/search/two+words, you should not return a 404, but a 400 status and an error message explaining that the search string was malformed.
Finally, you might want to follow the common convention for search URLs, which is to use a query parameter called q. Your URL pattern would then be ^polls/search/$, and you would handle q in the view using something like this:
def search_page_view(request):
    query_string = request.GET.get('q', '')
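The validation suggested above could look roughly like the following. This is a framework-independent sketch: the helper name is_valid_search_term is hypothetical, and the rule (ASCII letters, digits and underscores only) is taken from the question. A view would return a 400 response when the helper yields False.

```python
import re

# \w with re.ASCII matches only [a-zA-Z0-9_]
WORD_ONLY = re.compile(r'\w+', re.ASCII)


def is_valid_search_term(term):
    """Return True if the whole term consists of word characters (hypothetical helper)."""
    return WORD_ONLY.fullmatch(term) is not None


print(is_valid_search_term('mysearchstring'))  # True
print(is_valid_search_term('two+words'))       # False
print(is_valid_search_term(''))                # False
```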

Django URL regex "is not a valid regular expression" error

I'm having a bit of trouble configuring the following URL. I want it to match pages whose URL starts with a category and ends with a slug, for example:
/category1/post1/
/category2/post2/
/category3/post3/
/category1/post4/
/category2/post5/
I've tried many different methods with no success... I always get an "is not a valid regular expression" error.
This is how I thought it should work:
url(r'^(?P<category1|category2|category3>[\w\-]+)/(?P<slug>[\w\-]+)/$', blog_post, name = 'blog_post'),
I am fairly new to regex and trying to learn so any help on this one with an explanation would be much appreciated :)
Your pattern is incorrect; you are putting the alternative values in the wrong place. You put them in the name of the group:
(?P<category1|category2|category3>...)
Put them in the part the name is supposed to match instead:
(?P<category>category1|category2|category3)
Making the full registration:
url(r'^(?P<category>category1|category2|category3)/(?P<slug>[\w\-]+)/$', blog_post, name='blog_post'),
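I'm assuming your blog_post callable looks something like this (Django passes the request as the first argument, followed by the named groups as keyword arguments):
def blog_post(request, category, slug):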
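The difference between the two patterns can be checked with the re module alone. This is a small sketch outside Django; the sample paths omit the leading slash, on the assumption that the URL resolver matches against the path without it:

```python
import re

# Alternation inside the group body: the group is named "category"
valid = re.compile(r'^(?P<category>category1|category2|category3)/(?P<slug>[\w\-]+)/$')

match = valid.match('category2/post5/')
print(match.groupdict())  # {'category': 'category2', 'slug': 'post5'}

# Alternation inside the group name is rejected when the pattern is compiled
try:
    re.compile(r'^(?P<category1|category2|category3>[\w\-]+)/$')
    compile_failed = False
except re.error as exc:
    compile_failed = True
    print('invalid pattern:', exc)
```

A path with an unknown category, such as other/post1/, simply does not match the valid pattern.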

how do I modify a url that I pick at random in python

I have an app that shows images from reddit. Some image links come like this: http://imgur.com/Cuv9oau, but I need them to look like this: http://i.imgur.com/Cuv9oau.jpg. That is, add an i. at the beginning and .jpg at the end.
You can use a string replace:
s = "http://imgur.com/Cuv9oau"
s = s.replace("//imgur", "//i.imgur")+(".jpg" if not s.endswith(".jpg") else "")
This sets s to:
'http://i.imgur.com/Cuv9oau.jpg'
This function should do what you need. I expanded on @jh314's response, made the code a little less compact, and added a check that the URL starts with http://imgur.com, since the original code would cause issues with other URLs, like the Google search I included. It also replaces only the first instance, because replacing every occurrence could cause issues.
def fixImgurLinks(url):
    if url.lower().startswith("http://imgur.com"):
        url = url.replace("http://imgur", "http://i.imgur", 1)  # Only replace the first instance.
        if not url.endswith(".jpg"):
            url += ".jpg"
    return url
for u in ["http://imgur.com/Cuv9oau", "http://www.google.com/search?q=http://imgur"]:
    print(fixImgurLinks(u))
Gives:
>>> http://i.imgur.com/Cuv9oau.jpg
>>> http://www.google.com/search?q=http://imgur
You should use Python's regular expressions to place the i. As for the .jpg you can just append it.
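A regex-based version might look like the sketch below. It is only one way to do it: the function name is made up, it assumes the link has the form http(s)://imgur.com/<id> with an id made of word characters, and it assumes the image is a JPEG, as in the question.

```python
import re

# Matches an imgur page link and captures the scheme and the image id
IMGUR_PAGE = re.compile(r'^(https?://)imgur\.com/(\w+)$')


def to_direct_image_url(url):
    """Rewrite an imgur page link to a direct .jpg link; leave other URLs alone."""
    match = IMGUR_PAGE.match(url)
    if match is None:
        return url
    scheme, image_id = match.groups()
    return f"{scheme}i.imgur.com/{image_id}.jpg"


print(to_direct_image_url("http://imgur.com/Cuv9oau"))
# http://i.imgur.com/Cuv9oau.jpg
```

Because the pattern is anchored at both ends, links that are already direct (i.imgur.com/...jpg) and unrelated URLs that merely contain the text imgur are returned unchanged.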
