get best match for django query

I need a suggestion for the following scenario.
I have a model, ordercode, whose table stores prefixes like this:
1 | US
12 | Canada
13 | UK
134 | Australia
and more.
Given a string such as 12345678, I need to find the best (longest) matching prefix for it.
If the user enters 12345678, the best match is 12 | Canada; if the user enters 135678975, the best match is 13 | UK; and if the user enters 1345676788, the best match is 134 | Australia.
How can I do this with a Django query?
Thanks,

This approach needs multiple queries, one for each leading slice of the given order code:
def get_matching_country(order_num):
    matching_req = Country.objects.none()
    # Grow the prefix one character at a time and keep the last one that matches.
    for i in range(1, len(order_num) + 1):
        req = Country.objects.filter(country_code=order_num[:i])
        if not req.exists():
            break
        matching_req = req
    return matching_req

def match_country(request, string):
    # Or use int(string), but then you may want to handle the exception raised when the input contains letters.
    # Note: __startswith selects codes that begin with `string`, not codes that `string` begins with.
    qs = Country.objects.filter(country_code__startswith=string)
    return ...
Have you saved your country codes as strings or integers?
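
The longest-prefix idea from the question can also be sketched in plain Python, without one database query per character. This is only an illustration under assumptions: the codes are stored as strings, and the names candidate_prefixes, best_prefix_match and CODES are hypothetical. In Django, the same list of candidate prefixes could presumably be passed to a country_code__in filter and the longest returned code picked afterwards.

```python
def candidate_prefixes(number):
    """Return all non-empty prefixes of `number`, longest first."""
    return [number[:end] for end in range(len(number), 0, -1)]


def best_prefix_match(number, codes):
    """Return (prefix, country) for the longest known prefix, or None."""
    for prefix in candidate_prefixes(number):
        if prefix in codes:
            return prefix, codes[prefix]
    return None


# Hypothetical sample data mirroring the table in the question.
CODES = {"1": "US", "12": "Canada", "13": "UK", "134": "Australia"}

print(best_prefix_match("12345678", CODES))
```

Checking the longest prefix first means the first hit is the best match, and gaps in the table (a code "134" without a code "13") do not stop the search early.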

Related

Python: how to dynamically find a person's name in a string

I'm working on a project where I use speech-to-text input to determine who to call. Speech-to-text can give unexpected results, so I want some dynamic matching of the strings. I'm starting small and trying to match a single name. My name is Nick Vaes, and I try to match my name against the spoken text, but I also want it to match when the text contains, for example, Nik. Ideally I would like something that matches whenever only one letter is wrong, so
Nick
ick
nik
nic
nck
would all match my name. The current simple code I have is:
def user_to_call(s):
    redirect = None
    if "NICK" in s.upper() or "NIK" in s.upper():  # test each substring separately
        redirect = "Nick"
    if redirect:
        return redirect
For a 4-letter name it's possible to put all the possibilities in the filter, but for names with 12 letters that is overkill, and I'm pretty sure it can be done far more efficiently.

You need the Levenshtein distance (edit distance).
NLTK provides a Python implementation:
import nltk
nltk.edit_distance("humpty", "dumpty")
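
If installing NLTK is not an option, the same distance can be computed with a short dynamic-programming routine. The sketch below is a plain-Python illustration, not NLTK's code; the helper matches_name and its max_edits parameter are hypothetical names chosen to express the "only one letter wrong" rule from the question.

```python
def edit_distance(a, b):
    """Levenshtein distance between strings a and b (row-by-row DP)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or keep) the character
            ))
        previous = current
    return previous[-1]


def matches_name(candidate, name, max_edits=1):
    """True if candidate is within max_edits edits of name, ignoring case."""
    return edit_distance(candidate.lower(), name.lower()) <= max_edits


for word in ["Nick", "ick", "nik", "nic", "nck", "nickey"]:
    print(word, matches_name(word, "Nick"))
```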

What you basically need is fuzzy string matching, see:
https://en.wikipedia.org/wiki/Approximate_string_matching
https://www.datacamp.com/community/tutorials/fuzzy-string-python
Based on that, you can check how similar the input is to the entries in your dictionary:
from fuzzywuzzy import fuzz
name = "nick"
tomatch = ["Nick", "ick", "nik", "nic", "nck", "nickey", "njick", "nickk", "nickn"]
for candidate in tomatch:  # avoid shadowing the built-in str
    ratio = fuzz.ratio(candidate.lower(), name.lower())
    print(ratio)
This code will produce the following output:
100
86
86
86
86
80
89
89
89
You will have to experiment with different ratio thresholds to find one that tolerates exactly one missing or wrong letter.
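
A similar ratio can be computed with the standard library's difflib, which may be enough for experimenting with thresholds. This is a sketch, not fuzzywuzzy's implementation; the function names and the threshold of 85 are assumptions picked for illustration.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Case-insensitive similarity of two strings on a 0-100 scale."""
    return round(100 * SequenceMatcher(None, a.lower(), b.lower()).ratio())


def is_close(candidate, name, threshold=85):
    """True if candidate is at least `threshold` percent similar to name."""
    return similarity(candidate, name) >= threshold


for word in ["Nick", "nik", "nck", "bob"]:
    print(word, similarity(word, "Nick"), is_close(word, "Nick"))
```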

From what I understand, you are not looking for fuzzy matching (because you did not upvote the other responses).
If you just want to evaluate what you specified in your question, here is the code. I have added some extra conditions that print an appropriate message. Feel free to remove them.
def wordmatch(baseword, wordtoMatch, lengthOfMatch):
    lis_of_baseword = list(baseword.lower())
    lis_of_wordtoMatch = list(wordtoMatch.lower())
    matches = 0  # renamed from sum to avoid shadowing the built-in
    for index_i, i in enumerate(lis_of_wordtoMatch):
        for index_j, j in enumerate(lis_of_baseword):
            if i in lis_of_baseword:
                # count a letter once if it appears at the same or a later position in the base word
                if i == j and index_i <= index_j:
                    matches = matches + 1
                    break
            else:
                print("word to match has characters which are not in baseword")
                return 0
    if matches >= lengthOfMatch and len(wordtoMatch) <= len(baseword):
        return 1
    elif matches >= lengthOfMatch and len(wordtoMatch) > len(baseword):
        print("word to match has no of characters more than that of baseword")
        return 0
    else:
        return 0
base = "Nick"
tomatch = ["Nick", "ick", "nik", "nic", "nck", "nickey", "njick", "nickk", "nickn"]
wordlength_match = 3  # how many letters of the base word must match; in your case, 3
for t_word in tomatch:
    print(wordmatch(base, t_word, wordlength_match))
The output looks like this:
1
1
1
1
1
word to match has characters which are not in baseword
0
word to match has characters which are not in baseword
0
word to match has no of characters more than that of baseword
0
word to match has no of characters more than that of baseword
0
Let me know if this served your purpose.

Need to search a string for a "two word" pattern in python

I’m trying to search a long string of characters for a country name. The country name is sometimes more than one word, such as Costa Rica.
Here is my code:
eol = len(CountryList)
for c in range(0, eol):
    country = str(CountryList[c])
    countrymatch = re.search(country, fullsampledata)
    if countrymatch:
        ...
fullsampledata is a long string with all the data on one line. I’m trying to parse out the country by cycling through a list of valid country names. If the country is only one word, such as ‘Holland’, it finds it. However, if the country is two or more words, such as ‘Costa Rica’, it doesn’t find it. Why?

You can search for a substring in a string using the .find() method, which returns -1 when the substring is absent (.index() raises ValueError instead):
fullsampledata = "hwfekfwekjfnkwfehCosta Ricakwjfkwfekfekfw"
fullsampledata.find("Morocco")
-1
fullsampledata.index("Costa Rica")
17
So you can write your if statement as follows:
fullsampledata = "hwfekfwekjfnkwfehCosta Ricakwjfkwfekfekfw"
country = "Costa Rica"
if fullsampledata.find(country) != -1:
    # Found
    pass
else:
    # Not found
    pass

In [1]: long_string = 'asdfsadfCosta Ricaasdkj asdfsd asdjas USA alsj'
In [2]: 'Costa Rica' in long_string
Out[2]: True
You don't have your code properly shown and I'm a little too lazy to parse it. Hope this helps.
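
Applied to the loop in the question, the substring test might look like the sketch below. The function name and sample data are hypothetical. Stripping each list entry is an assumption about one possible cause of the failure (stray whitespace around multi-word names), not a confirmed diagnosis.

```python
def find_countries(text, countries):
    """Return the country names from `countries` that occur in `text`."""
    found = []
    for raw_name in countries:
        name = str(raw_name).strip()  # assumption: entries may carry stray whitespace
        if name and name in text:
            found.append(name)
    return found


# Hypothetical sample data in the spirit of the answers above.
long_string = "asdfsadfCosta Ricaasdkj asdfsd Holland alsj"
print(find_countries(long_string, ["Holland", "Costa Rica\n", "Morocco"]))
```

A plain `in` test also sidesteps regex metacharacters: if re.search is kept, passing each name through re.escape first avoids names being read as pattern syntax.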

Add letters to string conditionally

Input: 1 10 avenue
Desired Output: 1 10th avenue
As you can see above I have given an example of an input, as well as the desired output that I would like. Essentially I need to look for instances where there is a number followed by a certain pattern (avenue, street, etc). I have a list which contains all of the patterns and it's called patterns.
If that number does not have "th" after it, I would like to add "th". Simply adding "th" is fine, because other portions of my code will correct it to either "st", "nd", "rd" if necessary.
Examples:
1 10th avenue OK
1 10 avenue NOT OK, TH SHOULD BE ADDED!
I have implemented a working solution, which is this:
def Add_Th(address):
    try:
        address = address.split(' ')
    except AttributeError:
        pass
    for pattern in patterns:
        try:
            location = address.index(pattern) - 1
            number_location = address[location]
        except (ValueError, IndexError):
            continue
        if 'th' not in number_location:
            new = number_location + 'th'
            address[location] = new
    address = ' '.join(address)
    return address
I would like to convert this implementation to regex, as this solution seems a bit messy to me, and occasionally causes some issues. I am not the best with regex, so if anyone could steer me in the right direction that would be greatly appreciated!
Here is my current attempt at the regex implementation:
def add_th(address):
    find_num = re.compile(r'(?P<number>[\d]{1,2}(' + "|".join(patterns + ')(?P<following>.*)')
    check_th = find_num.search(address)
    if check_th is not None:
        if re.match(r'(th)', check_th.group('following')):
            return address
        else:
            # this is where I would add th. I know I should use re.sub, i'm just not too sure
            # how I would do it
    else:
        return address
I do not have a lot of experience with regex, so please let me know if any of the work I've done is incorrect, as well as what would be the best way to add "th" to the appropriate spot.
Thanks.

Just one way, finding the positions behind a digit and ahead of one of those pattern words and placing 'th' into them:
>>> address = '1 10 avenue 3 33 street'
>>> patterns = ['avenue', 'street']
>>>
>>> import re
>>> pattern = re.compile(r'(?<=\d)(?= ({}))'.format('|'.join(patterns)))
>>> pattern.sub('th', address)
'1 10th avenue 3 33th street'
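
The same lookaround idea can be wrapped in a function, as sketched below. The function signature is hypothetical. Two additions go beyond the session above and are assumptions on my part: re.escape guards against pattern words containing regex metacharacters, and the trailing \b keeps a word like "street" from matching inside "streets".

```python
import re


def add_th(address, patterns):
    """Insert 'th' after a digit that directly precedes one of the pattern words."""
    alternatives = "|".join(re.escape(word) for word in patterns)
    # Zero-width match: a digit behind, a space plus a pattern word ahead.
    regex = re.compile(r"(?<=\d)(?= (?:{})\b)".format(alternatives))
    return regex.sub("th", address)


print(add_th("1 10 avenue 3 33 street", ["avenue", "street"]))
```

Because the lookbehind requires a digit immediately before the space, an address that already reads "10th avenue" is left unchanged.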

"In" or "reverse contains" query in Django filters for comparing strings

I'm writing the backend code for an autocomplete form. Each entry I return is described by a category name and a number within that category.
When the user types "CAT123", I want to use Django filters to filter down to category names or numbers that are contained in the user's query.
In other words, I want to execute a query like:
Entry.objects.filter(Q(category__in = query) | Q(num__in = query))
where the filters test if category AAA is in query AAA 555, and if number 555 is in query AAA 555, respectively.
However, __in seems to only work for lists, and __contains checks the other way ('AAA 555' is not in 'AAA').
What's the right filter expression to use for this "is contained in a string" idea?
Or is there a way to reverse the contains expression, so that the filter looks like Q(query__contains = category)?

You should use __contains, but split the query into its category and number parts before you use it. This is a good case for a regex. Below is a simple example, but be aware that it won't work well for Unicode category names (for example, with letters such as üéøā). Look around for more tutorials if you want a more flexible regex.
import re
entries = [
    "sdf",
    "A124",
    "124",
    "ASDF 124",
]
for i, e in enumerate(entries):
    number = re.search(r'\d+', e)       # first run of digits, if any
    word = re.search(r'[A-Za-z]+', e)   # first run of ASCII letters, if any
    print(i)
    if number:
        print( " " + number.group() )
    if word:
        print( " " + word.group() )
returns:
0
sdf
1
124
A
2
124
3
124
ASDF
You would use the captured matches like this (after checking that neither match is None):
Entry.objects.filter(Q(category__contains = word.group()) | Q(num__contains = number.group()))
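
The splitting step can be packaged as a small helper that returns plain strings (or None), so the caller can decide which Q objects to build. This is a sketch. The name split_query is hypothetical, and it keeps the ASCII-only limitation mentioned above.

```python
import re


def split_query(query):
    """Split a query like 'CAT123' into (word, number); a missing part is None."""
    number = re.search(r"\d+", query)
    word = re.search(r"[A-Za-z]+", query)
    return (
        word.group() if word else None,
        number.group() if number else None,
    )


for query in ["sdf", "A124", "124", "ASDF 124", "CAT123"]:
    print(query, split_query(query))
```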

Find all IPs on an HTML Page

I want to get an HTML page with python and then print out all the IPs from it.
I will define an IP as the following:
x.x.x.x:y
Where:
x = a number between 0 and 256.
y = a number with < 7 digits.
Thanks.

Right. The only part I can't do is the regular expression one. If someone shows me that, I will be fine. – das
import re
ip = re.compile(r"\b(?:(?:25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(?:25[0-5]|2[0-4]\d|[01]?\d\d?):\d{1,6}\b")
junk = " 1.1.1.1:123 2.2.2.2:321 312.123.1.12:123 "
print(ip.findall(junk))
# outputs ['1.1.1.1:123', '2.2.2.2:321']
Here is a complete example:
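import re
from urllib.request import urlopen  # urllib2.urlopen in Python 2
f = urlopen("http://www.samair.ru/proxy/ip-address-01.htm")
junk = f.read().decode("utf-8", errors="replace")  # decode bytes so the str pattern applies
ip = re.compile(r"\b(?:(?:25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(?:25[0-5]|2[0-4]\d|[01]?\d\d?):\d{1,6}\b")
print(ip.findall(junk))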
# ['114.30.47.10:80', '118.228.148.83:80', '119.70.40.101:8080', '12.47.164.114:8888', '121.
# 17.161.114:3128', '122.152.183.103:80', '122.224.171.91:3128', '123.234.32.27:8080', '124.
# 107.85.115:80', '124.247.222.66:6588', '125.76.228.201:808', '128.112.139.75:3128', '128.2
# 08.004.197:3128', '128.233.252.11:3124', '128.233.252.12:3124']

The basic approach would be:
1. Use urllib2 (urllib.request in Python 3) to download the contents of the page.
2. Use a regular expression to extract IPv4-like addresses.
3. Validate each match against the numeric constraints on each octet.
4. Print out the list of matches.
Please provide a clearer indication of what specific part you are having trouble with, along with evidence to show what it is you've tried thus far.
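
The extraction and validation steps of the approach above can be sketched without any network access. The pattern below is deliberately loose and the range check is done in Python afterwards. The function name is hypothetical, and the 0-255 octet range is an assumption (the question says "between 0 and 256").

```python
import re

# Loose candidate pattern: four groups of 1-3 digits, then a port of 1-6 digits.
CANDIDATE = re.compile(r"\b(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3}):(\d{1,6})\b")


def extract_endpoints(text):
    """Return 'a.b.c.d:port' strings whose octets are all within 0-255."""
    results = []
    for match in CANDIDATE.finditer(text):
        *octets, _port = match.groups()
        if all(int(octet) <= 255 for octet in octets):
            results.append(match.group(0))
    return results


print(extract_endpoints(" 1.1.1.1:123 2.2.2.2:321 312.123.1.12:123 "))
```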

Not to turn this into a who's-a-better-regex-author-war but...
(\d{1,3}\.){3}\d{1,3}\:\d{1,6}

Try:
re.compile(r"\d?\d?\d\.\d?\d?\d\.\d?\d?\d\.\d?\d?\d:\d+").findall(urllib2.urlopen(url).read())

In action:
\b(?:                 # A.B.C in A.B.C.D:port
    (?:
        25[0-5]
      | 2[0-4][0-9]
      | 1[0-9][0-9]
      | [1-9]?[0-9]
    )\.
){3}
(?:                   # D in A.B.C.D:port
    25[0-5]
  | 2[0-4][0-9]
  | 1[0-9][0-9]
  | [1-9]?[0-9]
)
:[1-9]\d{0,5}         # port number any number in (0,999999]
\b
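
A commented pattern like this needs re.VERBOSE to compile in Python. The sketch below condenses the same alternatives onto fewer lines; the constant name ENDPOINT and the sample string are illustrative only.

```python
import re

ENDPOINT = re.compile(r"""
    \b(?:                                                    # A.B.C. in A.B.C.D:port
        (?:25[0-5] | 2[0-4][0-9] | 1[0-9][0-9] | [1-9]?[0-9])
        \.
    ){3}
    (?:25[0-5] | 2[0-4][0-9] | 1[0-9][0-9] | [1-9]?[0-9])    # D in A.B.C.D:port
    :[1-9]\d{0,5}                                            # port: 1-6 digits, no leading zero
    \b
""", re.VERBOSE)

junk = " 1.1.1.1:123 2.2.2.2:321 312.123.1.12:123 "
print(ENDPOINT.findall(junk))
```

In verbose mode, whitespace outside character classes is ignored and # starts a comment, which is why the pattern can be laid out and annotated this way.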
