What is the equivalent of My.Resources(vb.net) in Python? - python

I would like to use resources in a Python project with Flask and output their names. I know how it works in VB, but I have no idea what the equivalent of My.Resources.ResourceManager is in Python. Does Python offer the same functionality?
I want to save multiple regex patterns, as shown below.
I also want to use them in code by name.
Name     Value
Regex1   (?P<nickname>\s*.+?)
Regex2   (?P<address>\s*.+?)

Welcome to SO!
Essentially, you don't need to worry about resource management in python most of the time because it is done automatically for you. So, to save a regex pattern:
import re
# create pattern strings (raw strings keep backslashes such as \s intact)
regex1 = r'(?P<nickname>\s*.+?)'
regex2 = r'(?P<address>\s*.+?)'
test_string = 'nickname jojo rabbit.'
matches = re.search(regex1, test_string)
As you probably noticed, there is nothing special here. Creating and storing these patterns is just like declaring any string or other type of variables.
If you want to save all your patterns more neatly, you can use a dictionary where the names of the patterns are the keys and the pattern strings are the values, like so:
import re
regex_dictionary = {'regex1': r'(?P<nickname>\s*.+?)'}
# to add another regex pattern:
regex_dictionary['regex2'] = r'(?P<address>\s*.+?)'
test_string = 'nickname jojo rabbit.'
# to access and search using a regex pattern:
matches = re.search(regex_dictionary['regex1'], test_string)
I hope this makes sense!
Read more about python's regex: https://www.w3schools.com/python/python_regex.asp#matchobject
Read more about python's dictionaries: https://www.w3schools.com/python/python_dictionaries.asp
Read more about python's resource management: https://www.drdobbs.com/web-development/resource-management-in-python/184405999
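To get closer to the "look up by name and list the names" behaviour of My.Resources, the dictionary idea above can be wrapped in a small module. This is only a sketch: the names PATTERNS, resource_names and get_pattern are made up for illustration and are not part of Python or Flask.

```python
import re

# Hypothetical "resource table": pattern names mapped to compiled patterns.
PATTERNS = {
    "Regex1": re.compile(r"(?P<nickname>\s*.+?)"),
    "Regex2": re.compile(r"(?P<address>\s*.+?)"),
}


def resource_names():
    """Return the stored pattern names in sorted order."""
    return sorted(PATTERNS)


def get_pattern(name):
    """Look up a compiled pattern by name; raises KeyError if it is unknown."""
    return PATTERNS[name]


# Output every name together with its pattern text.
for name in resource_names():
    print(name, get_pattern(name).pattern)
```

Keeping the table in its own module (for example a patterns.py next to the Flask app) lets the rest of the code import it and refer to patterns by name only.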

Related

How to find a pattern in string & replace in HTML Code

I have HTML code in a string variable. I want to modify this tag from this:
<a href="/images/3.jpg">3.jpg</a>
to
<a href="/images/3.jpg" download="3.jpg">3.jpg</a>, basically adding download="3.jpg".
I want to do this for all links that end with a .jpg, .png, .gif, .jpeg, or .mp4 extension.
There might be easier ways to accomplish this, but I think one way to start could be using regex. Define a pattern to find all of the file endings. Then fetch the file-name (e.g., 3.jpg) to compile a string that .replace()s the first pattern. Like this:
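import re
# all possible formats you mentioned
# (link markup assumed to look like this, with files under /images/):
html = ['<a href="/images/3.jpg">3.jpg</a>',
        '<a href="/images/3.png">3.png</a>',
        '<a href="/images/3.gif">3.gif</a>',
        '<a href="/images/3.jpeg">3.jpeg</a>',
        '<a href="/images/3.mp4">3.mp4</a>']
# regex patterns (everything within parentheses is going to be extracted)
regex1 = re.compile(r'(\.jpg\"|\.png\"|\.gif\"|\.jpeg\"|\.mp4\")')
regex2 = re.compile(r'\/images\/(.*?)\.')
# iterate over the strings
for x in html:
    if regex1.search(x):  # if pattern is found:
        # find and extract the extension (with closing quote) and the file name
        a = regex1.search(x).group(1)
        b = regex2.search(x).group(1)
        # compile new string by replacing a
        new = x.replace(a, f'{a} download="{b + a}')
        print(new)
This gives you:
<a href="/images/3.jpg" download="3.jpg">3.jpg</a>
<a href="/images/3.png" download="3.png">3.png</a>
<a href="/images/3.gif" download="3.gif">3.gif</a>
<a href="/images/3.jpeg" download="3.jpeg">3.jpeg</a>
<a href="/images/3.mp4" download="3.mp4">3.mp4</a>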
If you want to learn more on regex, see the documentation.
Also, note that f-strings (as in f'{a} download="{b + a}') are supported in Python 3.6 and later.
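The same find-then-replace idea can also be done in one pass with re.sub. The following is a sketch only: it assumes the links are written as href="..." with double quotes, and the names LINK and add_download are invented for this example. For arbitrary real-world HTML, an HTML parser is usually the safer tool.

```python
import re

# Matches href="<optional path/><file name>.<one of the listed extensions>".
# Group 2 captures just the file name, e.g. 3.jpg.
LINK = re.compile(r'href="([^"]*/)?([^"/]+\.(?:jpg|jpeg|png|gif|mp4))"')


def add_download(html):
    """Append download="<file name>" after every matching href attribute."""
    return LINK.sub(lambda m: f'{m.group(0)} download="{m.group(2)}"', html)


print(add_download('<a href="/images/3.jpg">3.jpg</a>'))
# <a href="/images/3.jpg" download="3.jpg">3.jpg</a>
```

Links whose href does not end in one of the listed extensions are left unchanged.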

Extract specific word from the string using Python

I have a string Job_Cluster_AK_Alaska_Yakutat_CDP.png
From the string above, I want to extract only the part after Job_Cluster_AK_Alaska_ and before .png.
So basically I want everything after the fourth underscore-separated word, up to but not including .png.
I am new to regex.
Finally I want only Yakutat_CDP.
I think what you are asking for is something like this:
import os
# I think you will have different jobs/pngs, so pass these variables from somewhere
jobPrefix = 'Job_Cluster_AK_Alaska_'
pngString = 'Job_Cluster_AK_Alaska_Yakutat_CDP.png'
# Split filename/extension
pngTitle = os.path.splitext(pngString)[0]
# Get the filename without the jobPrefix
finalTitle = pngTitle[len(jobPrefix):]
Edit
Try to avoid regular expressions here, as they are generally slower than plain string slicing.
You can do it without regex like so:
s = 'Job_Cluster_AK_Alaska_Yakutat_CDP.png'
print(s[len('Job_Cluster_AK_Alaska_'):-len('.png')])
In essence here I take the substring starting immediately after Job_Cluster_AK_Alaska_ and ending before .png.
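Still, a regex approach is probably more readable and maintainable:
import re
s = 'Job_Cluster_AK_Alaska_Yakutat_CDP.png'
# escape the dot so it matches a literal '.', and pass the string to search
m = re.match(r'Job_Cluster_AK_Alaska_(.*)\.png', s)
print(m[1])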
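If the prefix changes from file to file but always consists of four underscore-separated words, as the question describes, splitting on the underscore avoids hard-coding the prefix. A minimal sketch; the function name strip_leading_fields and its parameters are made up for illustration:

```python
import os


def strip_leading_fields(filename, n_fields=4, sep="_"):
    """Drop the extension and the first n_fields separator-delimited words."""
    stem, _ext = os.path.splitext(filename)
    return sep.join(stem.split(sep)[n_fields:])


print(strip_leading_fields("Job_Cluster_AK_Alaska_Yakutat_CDP.png"))
# Yakutat_CDP
```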

find multiple things in a string using regex in python

My input string contains various entities like this:
conn_type://host:port/schema#login#password
I want to find out all of them using regex in python.
As of now, I am able to find them one by one, like
conn_type = re.search(r'[a-zA-Z]+', test_string)
if conn_type:
    print("conn_type:", conn_type.group())
    next_substr_len = conn_type.end()
    host = re.search(r'[^:/]+', test_string[next_substr_len:])
and so on.
Is there a way to do it without if and else?
I expect there to be some way, but not able to find it. Please note that every entity regex is different.
Please help, I don't want to write a boring code.
Why don't you use re.findall?
Here is an example:
import re
s = 'conn_type://host:port/schema#login#password asldasldasldasdasdwawwda conn_type://host:port/schema#login#email'
def get_all_matches(s):
    # raw string; '/' needs no escaping in Python regexes
    matches = re.findall(r'[a-zA-Z]+_[a-zA-Z]+:/+[a-zA-Z]+:+[a-zA-Z]+/+[a-zA-Z]+#+[a-zA-Z]+#[a-zA-Z]+', s)
    return matches
print(get_all_matches(s))
This returns a list of all matches for your current regex, which in this case would be:
['conn_type://host:port/schema#login#password', 'conn_type://host:port/schema#login#email']
If you need help making regex patterns in Python I would recommend using the following website:
A pretty neat online regex tester
Also check the re module's documentation for more on re.findall
Documentation for re.findall
Hope this helps!
>>> import re
>>> uri = "conn_type://host:port/schema#login#password"
>>> res = re.findall(r'(\w+)://(.*?):([A-Za-z0-9]+)/(\w+)#(\w+)#(\w+)', uri)
>>> res
[('conn_type', 'host', 'port', 'schema', 'login', 'password')]
No need for ifs. Use findall or finditer to search through your collection of connection strings, and filter the list of tuples as needed.
If you prefer to do it yourself, consider writing a tokenizer; that is an elegant, Pythonic solution.
Or use the standard library: https://docs.python.org/3/library/urllib.parse.html. Note, however, that your sample URL is not fully valid: there is no scheme 'conn_type' and the string contains two # separators, so urlparse wouldn't work as expected. For real-life URLs, though, I highly recommend this approach.
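Since every entity has its own sub-pattern, named groups make the result easier to use than positional tuples. The sketch below assumes the exact layout conn_type://host:port/schema#login#password from the question; the names CONN_PATTERN and parse_conn are invented for illustration.

```python
import re

# One named group per entity, in the order used in the question.
CONN_PATTERN = re.compile(
    r'(?P<conn_type>\w+)://'
    r'(?P<host>[^:/]+):'
    r'(?P<port>[^/]+)/'
    r'(?P<schema>\w+)#'
    r'(?P<login>\w+)#'
    r'(?P<password>\w+)'
)


def parse_conn(text):
    """Return a dict of the named parts, or None if the text does not fit."""
    m = CONN_PATTERN.fullmatch(text)
    return m.groupdict() if m else None


parts = parse_conn("conn_type://host:port/schema#login#password")
print(parts["host"], parts["login"])
# host login
```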

Matching regex to set

I am looking for a way to match the beginning of a line to a regex and for the line to be returned afterwards. The set is quite extensive hence why I cannot simply use the method given on Python regular expressions matching within set. I was also wondering if regex is the best solution. I have read the http://docs.python.org/3.3/library/re.html alas, it does not seem to hold the answer. Here is what I have tried so far...
import re
import os
import itertools
f2 = open(file_path)
unilist = []
bases=['A','G','C','N','U']
patterns= set(''.join(per) for per in itertools.product(bases, repeat=5))
#stuff
if re.match(r'.*?(?:patterns)', line):
    print(line)
    unilist.append(next(f2).strip())
print(unilist)
You see, the problem is that I do not know how to refer to my set...
The file I am trying to match it to looks like:
#SRR566546.970 HWUSI-EAS1673_11067_FC7070M:4:1:2299:1109 length=50 TTGCCTGCCTATCATTTTAGTGCCTGTGAGGTGGAGATGTGAGGATCAGT
+
hhhhhhhhhhghhghhhhhfhhhhhfffffeee[X]b[d[ed`[Y[^Y
You are going about it the wrong way.
You simply leave the set of characters to the regular expression:
re.search('[AGCNU]{5}', line)
matches any 5-character pattern built from those 5 characters; that matches the same 3125 different combinations you generated with your set line, but doesn't need to build all possible combinations up front.
Also, your regular expression attempt had no connection to your patterns variable: the pattern r'.*?(?:patterns)' matches 0 or more arbitrary characters followed by the literal text 'patterns'.
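As a quick check of the claim above, the character-class pattern can be compared against the explicitly generated set, and then used to keep only the lines that start with such a 5-mer. This is a sketch; the name lines_starting_with_5mer and the sample lines are made up for illustration.

```python
import itertools
import re

BASES = "AGCNU"
FIVE_MER = re.compile(r"[AGCNU]{5}")

# Every combination from the question's set is matched by the single pattern.
combos = {"".join(p) for p in itertools.product(BASES, repeat=5)}
assert len(combos) == 3125
assert all(FIVE_MER.fullmatch(c) for c in combos)


def lines_starting_with_5mer(lines):
    """Keep the lines whose first five characters are all in AGCNU."""
    # re.match only matches at the beginning of each string.
    return [line for line in lines if FIVE_MER.match(line)]


print(lines_starting_with_5mer(["AUGNAxyz", "xxAUGNA", "TTGCC"]))
# ['AUGNAxyz']
```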
According to what I've understood from your question, it seems to me that this could fit your need:
import re
sss = '''dfgsdfAUGNA321354354
!=**$=)"nNNUUG54788
=AkjhhUUNGffdffAAGjhff1245GGAUjkjdUU
.....cv GAUNAANNUGGA'''
print(re.findall(r'^(.+?[AGCNU]{5})', sss, re.MULTILINE))

python regex on variable

Please help with my regex problem
Here is my string
source="http://www.amazon.com/ref=s9_hps_bw_g200_t2?pf_rd_m=ATVPDKIKX0DER&pf_rd_i=3421"
source_resource="pf_rd_m=ATVPDKIKX0DER"
The source_resource string occurs in source and may be followed by & or by . (for example).
So far,
regex = re.compile("pf_rd_m=ATVPDKIKX0DER+[&.]")
regex.findall(source)
[u'pf_rd_m=ATVPDKIKX0DER&']
I have hard-coded the text here. Instead of the literal text, how can I use the source_resource variable, followed by & or ., to find this?
If the goal is to extract the pf_rd_m value (which it apparently is, as you are using regex.findall), then I'm not sure a regex is the easiest solution here:
>>> from urllib.parse import urlparse, parse_qs  # Python 3; in Python 2 this was the urlparse module
>>> qs = urlparse(source).query
>>> parse_qs(qs)
{'pf_rd_m': ['ATVPDKIKX0DER'], 'pf_rd_i': ['3421']}
>>> parse_qs(qs)['pf_rd_m']
['ATVPDKIKX0DER']
You also have to escape the . when it appears outside a character class; inside [...] the escape is optional but harmless:
pattern = re.compile(source_resource + r'[&\.]')
You can just build the string for the regular expression like a normal string, utilizing all string-formatting options available in Python:
import re
source_and="http://rads.stackoverflow.com/amzn/click/B0030DI8NA/pf_rd_m=ATVPDKIKX0DER&"
source_dot="http://rads.stackoverflow.com/amzn/click/B0030DI8NA/pf_rd_m=ATVPDKIKX0DER."
source_resource="pf_rd_m=ATVPDKIKX0DER"
regex_string = source_resource + r"[&\.]"
regex = re.compile(regex_string)
print(regex.findall(source_and))
print(regex.findall(source_dot))
This prints:
['pf_rd_m=ATVPDKIKX0DER&']
['pf_rd_m=ATVPDKIKX0DER.']
I hope this is what you mean.
Just note that I modified your regular expression: I escaped the . (outside a character class it is a special symbol; inside [...] the escape is optional but harmless) and dropped the +, which applied only to the preceding R (I assumed the string occurs only once, which makes the + unnecessary).
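When the variable part comes from data rather than from a literal you control, it is safer to pass it through re.escape so that characters such as . or + in it are matched literally. A minimal sketch; the function name find_resource is made up for illustration:

```python
import re


def find_resource(source, resource):
    """Find resource followed by '&' or '.', treating resource as literal text."""
    pattern = re.compile(re.escape(resource) + r"[&.]")
    return pattern.findall(source)


source = "http://www.amazon.com/ref=s9_hps_bw_g200_t2?pf_rd_m=ATVPDKIKX0DER&pf_rd_i=3421"
print(find_resource(source, "pf_rd_m=ATVPDKIKX0DER"))
# ['pf_rd_m=ATVPDKIKX0DER&']
```

Without re.escape, a resource such as a.b would also match axb, because the unescaped . matches any character.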
