I am using Python's regex with an if-statement: if the match is None, then it should go to the else clause. But it shows this error:
AttributeError: 'NoneType' object has no attribute 'group'
The script is:
import string
chars = re.escape(string.punctuation)
sub='FW: Re: 29699'
if re.search("^FW: (\w{10})",sub).group(1) is not None :
    d=re.search("^FW: (\w{10})",sub).group(1)
else:
    a=re.sub(r'['+chars+']', ' ',sub)
    d='_'.join(a.split())
Every help is great help!
Your problem is this: if your search doesn't find anything, it will return None. You can't do None.group(1), which is what your code amounts to. Instead, check whether the search result is None, not the search result's first group.
import re
import string
chars = re.escape(string.punctuation)
sub='FW: Re: 29699'
search_result = re.search(r"^FW: (\w{10})", sub)
if search_result is not None:
    d = search_result.group(1)
else:
    a = re.sub(r'['+chars+']', ' ', sub)
    d = '_'.join(a.split())
print(d)
# FW_Re_29699
I wrote the search code below and I want to store whatever is between double quotes as a single item in the list. How can I do that? In this case I have three lists, but the second one, should, is not what I want.
import re
message='read read read'
others = ' '.join(re.split('\(.*\)', message))
others_split = others.split()
to_compile = re.compile('.*\((.*)\).*')
to_match = to_compile.match(message)
ors_string = to_match.group(1)
should = ors_string.split(' ')
must = [term for term in re.findall(r'\(.*?\)|(-?(?:".*?"|\w+))', message) if term and not term.startswith('-')]
must_not = [term for term in re.findall(r'\(.*?\)|(-?(?:".*?"|\w+))', message) if term and term.startswith('-')]
must_not = [s.replace("-", "") for s in must_not]
print(f'must: {must}')
print(f'should: {should}')
print(f'must_not: {must_not}')
Output:
must: ['read', '"find find"', 'within', '"plane"']
should: ['"exactly', 'needed"', 'empty']
must_not: ['russia', '"destination good"']
Wanted result:
must: ['read', '"find find"', 'within', '"plane"']
should: ['"exactly needed"', 'empty'] <---
must_not: ['russia', '"destination good"']
When I edit the message I get the following error. How should I handle it?
Traceback (most recent call last):
ors_string = to_match.group(1)
AttributeError: 'NoneType' object has no attribute 'group'
Your should list is split on whitespace (should = ors_string.split(' ')), which is why the quoted phrase is broken up in the list. The following code gives you the output you requested, but I'm not sure that it solves your problem for future inputs.
import re
message = 'read "find find":within("exactly needed" OR empty) "plane" -russia -"destination good"'
others = ' '.join(re.split('\(.*\)', message))
others_split = others.split()
to_compile = re.compile('.*\((.*)\).*')
to_match = to_compile.match(message)
ors_string = to_match.group(1)
# Split on OR instead of whitespace.
should = ors_string.split('OR')
to_remove_or = "OR"
while to_remove_or in should:
    should.remove(to_remove_or)
# Remove trailing whitespace that is left after the split.
should = [word.strip() for word in should]
must = [term for term in re.findall(r'\(.*?\)|(-?(?:".*?"|\w+))', message) if term and not term.startswith('-')]
must_not = [term for term in re.findall(r'\(.*?\)|(-?(?:".*?"|\w+))', message) if term and term.startswith('-')]
must_not = [s.replace("-", "") for s in must_not]
print(f'must: {must}')
print(f'should: {should}')
print(f'must_not: {must_not}')
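The AttributeError from the question appears when the message has no parenthesised part, because match() then returns None. A minimal sketch of a guard for that case follows; the helper name parse_should is made up for this example, and it assumes that OR always appears as a standalone word between terms.

```python
import re

def parse_should(message):
    """Return the OR-separated terms between the first '(' and the last ')'."""
    match = re.search(r'\((.*)\)', message)
    if match is None:
        # No parenthesised group at all, e.g. message = 'read read read'.
        return []
    # Split on OR only when it is surrounded by whitespace.
    terms = re.split(r'\s+OR\s+', match.group(1))
    return [term.strip() for term in terms if term.strip()]

message = 'read "find find":within("exactly needed" OR empty) "plane" -russia -"destination good"'
print(parse_should(message))
print(parse_should('read read read'))
```

Splitting on whitespace-delimited OR, rather than on the bare substring 'OR', avoids cutting words that merely contain those two letters.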
I am writing a JavaScript crawler application.
The application needs to open JavaScript files and find some specific code in order to do some stuff with them.
I am using regular expressions to find the code of interest.
Consider the following JavaScript code:
let nlabel = rs.length ? st('string1', [st('string2', ctx = 'ctx2')], ctx = 'ctx1') : st('Found {0}', [st(this.param)]);
As you can see there is the st function which is called three times in the same line. The first two calls have an extra parameter named ctx but the third one doesn't have it.
What I need to do is to have 3 re matches as below:
Match 1
Group: function = "st('"
Group: string = "string1"
Group: ctx = "ctx1"
Match 2
Group: function = "st('"
Group: string = "string2"
Group: ctx = "ctx2"
Match 3
Group: function = "st('"
Group: string = "Found {0}"
Group: ctx = (None)
I am using regex101.com to test my patterns, and the pattern that comes closest to what I am looking for is the following:
(?P<function>st\([\"'])(?P<string>.+?(?=[\"'](\s*,ctx\s*|\s*,\s*)))
You can see it in action here.
However, I have no idea how to make it return the ctx group the way I want it.
For your reference I am using the following Python code:
import re

matches = []
code = "let nlabel = rs.length ? st('string1', [st('string2', ctx = 'ctx2')], ctx = 'ctx1') : st('Found {0}', [st(this.param)], ctx = 'ctxparam'"
pattern = "(?P<function>st\([\"'])(?P<string>.+?(?=[\"'](\s*,ctx\s*|\s*,\s*)))"
for m in re.compile(pattern).finditer(code):
    fnc = m.group('function')
    msg = m.group('string')
    ctx = m.group('ctx')
    idx = m.start()
    matches.append([idx, fnc, msg, ctx])
print(matches)
I have the feeling that re alone isn't capable of doing exactly what I am looking for, but any suggestion or solution that gets closer is more than welcome.
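One possible direction, offered as a rough sketch rather than a complete solution: use a regex only to find the start of each st('...') call, then walk forward while counting brackets, and accept a ctx argument only when it sits at the top level of that call. The function name find_st_calls and both patterns are made up for this example. It assumes that string literals contain no brackets or commas and that ctx values are quoted.

```python
import re

# Start of a call: st( followed by a quoted string.
CALL_START = re.compile(r"""\bst\((?P<quote>["'])(?P<string>.*?)(?P=quote)""")
# A ctx argument directly after a comma: ctx = '...'
CTX_ARG = re.compile(r"""\s*ctx\s*=\s*(["'])(?P<ctx>.*?)\1""")

def find_st_calls(code):
    """Return (index, string, ctx) for each st('...') call; ctx may be None."""
    results = []
    for m in CALL_START.finditer(code):
        depth = 1          # we are just inside the "(" of this call
        ctx = None
        i = m.end()
        while i < len(code) and depth > 0:
            ch = code[i]
            if ch in "([":
                depth += 1
            elif ch in ")]":
                depth -= 1
            elif ch == "," and depth == 1:
                # A top-level argument of this call starts here.
                arg = CTX_ARG.match(code, i + 1)
                if arg:
                    ctx = arg.group("ctx")
            i += 1
        results.append((m.start(), m.group("string"), ctx))
    return results

code = "let nlabel = rs.length ? st('string1', [st('string2', ctx = 'ctx2')], ctx = 'ctx1') : st('Found {0}', [st(this.param)]);"
for idx, string, ctx in find_st_calls(code):
    print(idx, string, ctx)
```

For anything beyond simple cases like this line, a real JavaScript parser is likely to be more robust than bracket counting.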
http://cs1.ucc.ie/~adc2/cgi-bin/lab7/index.html
You can reproduce the error yourself by entering anything into any one of the boxes; it doesn't have to be all of them. Any help would be great. The code follows:
from cgitb import enable
enable()
from cgi import FieldStorage,escape
print('Content-Type: text/html')
print()
actor=''
genre=''
theyear=''
director=''
mood=''
result=''
form_data= FieldStorage()
if len(form_data) != 0:
    try:
        actor=escape(form_data.getfirst('actor'))
        genre=escape(form_data.getfirst('genre'))
        theyear=escape(form_data.getfirst('theyear'))
        director=escape(form_data.getfirst('director'))
        mood= escape(form_data.getfirst('mood'))
        connection = db.connect('####', '###', '####', '###')
        cursor = connection.cursor(db.cursors.DictCursor)
        cursor.execute("""SELECT title
        FROM films
        WHERE (actor = '%s')
        OR (actor='%s' AND genre='%s')
        OR (actor='%s' AND genre='%s' AND theyear='%i')
        OR (actor='%s' AND genre='%s' AND theyear='%i' AND director='%s')
        OR (actor='%s' AND genre='%s' AND theyear='%i' AND director='%s' AND mood='%s') % (actor, actor,genre, actor,genre,theyear, actor,genre,theyear,director,actor,genre,theyear,director,mood))
        """)
        result = """<table>
        <tr><th>Your movie!</th></tr>
        <tr><th></th></tr>"""
        for row in cursor.fetchall():
            result+= '<tr><td>%s</td></tr>' ,(row['title'])
        result+= '</table>'
        cursor.close()
        connection.close()
    except db.Error:
        result = '<p>Sorry! We are currently experiencing technical difficulties.</p>'
Your <input> is named year, but you call escape(form_data.getfirst('theyear')). getfirst returns None when there is no corresponding form value, and escape fails on None. For the same reason you need to handle optional fields more carefully, as Willem said in the comments.
According to the error code:
/users/2020/adc2/public_html/cgi-bin/lab7/index.py in ()
24 try:
25 actor=escape(form_data.getfirst('actor'))
=> 26 genre=escape(form_data.getfirst('genre'))
27 theyear=escape(form_data.getfirst('theyear'))
28 director=escape(form_data.getfirst('director'))
genre = '', escape = <function escape>, form_data = FieldStorage(None, None, [MiniFieldStorage('actor', 'i')]), form_data.getfirst = <bound method FieldStorage.getfirst of FieldStorage(None, None, [MiniFieldStorage('actor', 'i')])>
/usr/local/lib/python3.4/cgi.py in escape(s=None, quote=None)
1038 warn("cgi.escape is deprecated, use html.escape instead",
1039 DeprecationWarning, stacklevel=2)
=> 1040 s = s.replace("&", "&amp;") # Must be done first!
1041 s = s.replace("<", "&lt;")
1042 s = s.replace(">", "&gt;")
s = None, s.replace undefined
escape() seems to get None as an argument. According to the snippet above, escape() calls replace() directly on its argument. So the quick fix is to make sure that you never pass None to escape(), but an empty string instead.
my_non_none_value = form_data.getfirst('actor') if form_data.getfirst('actor') else ""
bla = escape(my_non_none_value)
Long version:
my_non_none_value = form_data.getfirst('actor')
if my_non_none_value is None:
    my_non_none_value = ""
bla = escape(my_non_none_value)
Side note: escape() in cgi is deprecated, use html.escape() instead.
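A minimal sketch of the same guard built on html.escape. The helper name safe_escape is made up for this example, and the input values are invented. Keep in mind that html.escape also escapes quote characters by default, which cgi.escape did not.

```python
import html

def safe_escape(value, default=""):
    """Escape a form value for HTML, treating a missing field (None) as default."""
    if value is None:
        return default
    return html.escape(value)

print(repr(safe_escape(None)))
print(safe_escape("<b>Tom & Jerry</b>"))
```

Alternatively, getfirst accepts a default as its second argument, so form_data.getfirst('actor', '') never returns None in the first place.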
I have a model with
class dbf_att(models.Model):
name = models.CharField(max_length=50, null=True)
I'd like to check later whether object.name matches some regex:
if re.compile('^\d+$').match(att.name):
    ret = 'Integer'
elif re.compile('^\d+\.\d+$').match(att.name):
    ret = 'Float'
else:
    ret = 'String'
return ret
This always returns 'String', even when some of the att.name values should match those regexes.
Thanks!
You can try RegexValidator.
Or you can do it with the django-regex-field package, but I would recommend the built-in solution: the fewer third-party apps, the better.
Regexes are great, but sometimes other approaches are simpler and more readable. For example, how about using the built-in types to check the type:
try:
    att_name = float(att.name)
    ret = "Integer" if att_name.is_integer() else "Float"
except (TypeError, ValueError):
    # TypeError covers att.name being None (the field is declared null=True).
    ret = "String"
FYI, your regex code works perfectly fine. You might want to inspect the data that is being checked.
Demo:
>>> import re
>>> a = re.compile('^\d+$')
>>> b = re.compile('^\d+\.\d+$')
>>> a.match('10')
<_sre.SRE_Match object at 0x10fe7eb28>
>>> a.match('10.94')
>>> b.match('10')
>>> b.match('10.94')
<_sre.SRE_Match object at 0x10fe7eb90>
>>> a.match("string")
>>> b.match("string")
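Since the patterns themselves are fine, one thing worth checking is whether the stored values carry stray whitespace or are None. This is only a guess about the data, not something the question confirms. A small sketch under that assumption, with the helper name classify made up for this example:

```python
import re

INT_RE = re.compile(r'\d+')
FLOAT_RE = re.compile(r'\d+\.\d+')

def classify(value):
    """Classify a stored string as 'Integer', 'Float' or 'String'."""
    if value is None:
        return 'String'
    text = value.strip()          # drop padding such as '10   ' or '\n10'
    if INT_RE.fullmatch(text):
        return 'Integer'
    if FLOAT_RE.fullmatch(text):
        return 'Float'
    return 'String'

for raw in ['10', ' 10.94 ', 'abc', None]:
    print(repr(raw), classify(raw))
```

Printing repr(att.name) is a quick way to see whether such hidden characters are present.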
Can somebody help me with this code? I'm trying to make a Python script that will play videos, and I found this file that downloads YouTube videos. I am not entirely sure what is going on and I can't figure out this error.
Error:
AttributeError: 'NoneType' object has no attribute 'group'
Traceback:
Traceback (most recent call last):
File "youtube.py", line 67, in <module>
videoUrl = getVideoUrl(content)
File "youtube.py", line 11, in getVideoUrl
grps = fmtre.group(0).split('&')
Code snippet:
(lines 66-71)
content = resp.read()
videoUrl = getVideoUrl(content)
if videoUrl is not None:
    print('Video URL cannot be found')
    exit(1)
(lines 9-17)
def getVideoUrl(content):
    fmtre = re.search('(?<=fmt_url_map=).*', content)
    grps = fmtre.group(0).split('&')
    vurls = urllib2.unquote(grps[0])
    videoUrl = None
    for vurl in vurls.split('|'):
        if vurl.find('itag=5') > 0:
            return vurl
    return None
The error is in your line 11: your re.search is returning no result, i.e. None, and then you're trying to call fmtre.group, but fmtre is None, hence the AttributeError.
You could try:
def getVideoUrl(content):
    fmtre = re.search('(?<=fmt_url_map=).*', content)
    if fmtre is None:
        return None
    grps = fmtre.group(0).split('&')
    vurls = urllib2.unquote(grps[0])
    videoUrl = None
    for vurl in vurls.split('|'):
        if vurl.find('itag=5') > 0:
            return vurl
    return None
You use a regex to match the URL, but it can't find a match, so the result is None, and NoneType doesn't have a group attribute.
You should add some code to check the result. If the pattern doesn't match, the function should not carry on with the code below it.
def getVideoUrl(content):
    fmtre = re.search('(?<=fmt_url_map=).*', content)
    if fmtre is None:
        return None  # no URL matched; return None to tell the calling function
    grps = fmtre.group(0).split('&')
    vurls = urllib2.unquote(grps[0])
    videoUrl = None
    for vurl in vurls.split('|'):
        if vurl.find('itag=5') > 0:
            return vurl
    return None
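For readers on Python 3, here is a rough equivalent, assuming content is already a str (a response body would need decoding first) and using urllib.parse.unquote in place of urllib2.unquote. The name get_video_url and the sample string are made up for this example. Note also that the calling code in the question prints the error when videoUrl is not None, which looks inverted; the sketch tests for None instead.

```python
import re
from urllib.parse import unquote

def get_video_url(content):
    """Return the first URL containing 'itag=5', or None if nothing matches."""
    fmtre = re.search(r'(?<=fmt_url_map=).*', content)
    if fmtre is None:
        return None
    first_param = fmtre.group(0).split('&')[0]
    for vurl in unquote(first_param).split('|'):
        if 'itag=5' in vurl:
            return vurl
    return None

# Hypothetical page content, invented for this example.
sample = 'x=1&fmt_url_map=http%3A//example.com/a%3Fitag%3D5|http%3A//example.com/b&y=2'

video_url = get_video_url(sample)
if video_url is None:
    print('Video URL cannot be found')
else:
    print(video_url)
```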
Just wanted to mention the new walrus operator in this context, because this question is marked as a duplicate quite often and the operator can solve the problem very easily.
Before Python 3.8 we needed:
match = re.search(pattern, string, flags)
if match:
    # do sth. useful here
As of Python 3.8 we can write the same as:
if (match := re.search(pattern, string, flags)) is not None:
    # do sth. with match
Other languages had this before (think of C or PHP), but IMO it makes for cleaner code.
For the above code this could be
def getVideoUrl(content):
    if (fmtre := re.search('(?<=fmt_url_map=).*', content)) is None:
        return None
    ...
Just wanted to add to the answers: a pattern is matched against a contiguous run of text, so every part of it has to appear in the string in sequence, with nothing skipped. If the pattern skips a word that is in the sentence, it no longer matches, re.search returns None, and calling .group() fails. See the example below for clarification. (Note that re.compile is not deprecated; it is simply optional here, since re.search also accepts a pattern string.)
import re

msg = "Malcolm reads lots of books"

# 'lots books' is not contiguous in msg, so re.search returns None;
# calling book.group(0) here would raise the AttributeError.
book = re.compile('lots books')
book = re.search(book, msg)
print(book)  # None

# 'of books' is contiguous in msg, so this works as expected.
book = re.compile('of books')
book = re.search(book, msg)
print(book.group(0))  # of books

# Understanding this concept will help in your further
# research. Cheers.