I am trying to match a pattern in the given statement and print the result as a dictionary. The pattern has four parts. When I test the parts individually, each one matches, but when I combine them the match returns None. Kindly help me.
I tried the following code, but it does not work as expected.
pattern= """
(?P<host>[0-9]{3}\.[0-9]{3}\.[0-9]{3}\.[0-9]{3}) - (?P<user_name>[\w]*) (?P<time>\[.*?\]) (?P<Request>\".*?\")
"""
statement= """ 146.204.224.152 - feest6811 [21/Jun/2019:15:45:24 -0700] "POST /incentivize HTTP/1.1"
"""
item=re.match(pattern,statement,re.VERBOSE|re.IGNORECASE)
print(item.groupdict())
I want to print the following output.
host :146.204.224.152 , user_name: feest6811 , time:[21/Jun/2019:15:45:24 -0700], Request: "POST /incentivize HTTP/1.1"
This could be helpful:
import re, json
regex = r"(?P<host>(?:\d{3}\.){3}\d{3})\s-\s(?P<user_name>\S+)\s\[(?P<date>.+?)\]\s\"(?P<request>.+?)\""
test_str = "146.204.224.152 - feest6811 [21/Jun/2019:15:45:24 -0700] \"POST /incentivize HTTP/1.1\""
for m in re.finditer(regex, test_str):
    if m:
        print(m.groupdict())
        # as a string
        print(json.dumps(m.groupdict()))
    else:
        print("No match")
Should give output as:
{'host': '146.204.224.152', 'user_name': 'feest6811', 'date': '21/Jun/2019:15:45:24 -0700', 'request': 'POST /incentivize HTTP/1.1'}
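One likely reason the original pattern returns None: under re.VERBOSE, unescaped whitespace in the pattern is ignored, so the literal spaces and the " - " separator never match. The statement also begins with a space, which re.match does not skip. A minimal sketch, assuming it is acceptable to write literal spaces as [ ], to allow 1 to 3 digits per octet, and to use re.search instead of re.match:

```python
import re

# Whitespace in a VERBOSE pattern is ignored unless escaped or placed in a
# character class, so "a b" is compiled as "ab" and cannot match "a b".
print(re.search("a b", "a b", re.VERBOSE))  # None

pattern = r"""
    (?P<host>\d{1,3}(?:\.\d{1,3}){3})  # dotted IPv4 address
    [ ]-[ ]                            # literal space, dash, space
    (?P<user_name>\w*)                 # user name
    [ ](?P<time>\[.*?\])               # bracketed timestamp
    [ ](?P<Request>".*?")              # quoted request line
"""

# Sample statement from the question, including its leading space.
statement = ' 146.204.224.152 - feest6811 [21/Jun/2019:15:45:24 -0700] "POST /incentivize HTTP/1.1"\n'

# re.search scans for the first match, so the leading space is not a problem.
item = re.search(pattern, statement, re.VERBOSE)
if item:
    print(item.groupdict())
```

The group names and the sample statement come from the question; the brackets and quotes stay inside the captured values, as in the requested output.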
I am using Wagtail to build a blog website. To count the visits to each blog post, I override get_context() and serve() to set a cookie. The code is as follows:
def get_context(self, request):
    authorname=self.author.get_fullname_or_username()
    data = count_visits(request, self)
    context = super().get_context(request)
    context['client_ip'] = data['client_ip']
    context['location'] = data['location']
    context['total_hits'] = data['total_hits']
    context['total_visitors'] =data['total_vistors']
    context['cookie'] = data['cookie']
    context['author']=authorname
    return context
def serve(self, request):
    context = self.get_context(request)
    print(request.COOKIES)
    template = self.get_template(request)
    response = render(request, template, context)
    response.set_cookie(context['cookie'], 'true', max_age=3000)
    return response
And the function count_visits is as follows:
def count_visits(request, obj):
    data={}
    ct = ContentType.objects.get_for_model(obj)
    key = "%s_%s_read" % (ct.model, obj.pk)
    if not request.COOKIES.get(key):
        blog_visit_count, created =BlogVisitNumber.objects.get_or_create(content_type=ct, object_id=obj.pk)
        blog_visit_count.count += 1
        blog_visit_count.save()
The problem is that when I reload the blog page, serve() is called twice. The first request carries the cookie information, but in the second request the cookies are {}, so the visit count is wrong.
I do not understand why serve() is called twice every time I reload the page.
After adding a print statement in serve(), the log for reloading the blog page is as follows:
[19/Sep/2022 23:08:44] "GET /notifications/api/unread_list/?max=5 HTTP/1.1" 200 38
{'csrftoken': 'vAnNLAp8Fb9zPkUOSrHTFGB5Bp4W4aLu8mQWPklIhX1JpDTcKIceeId03jTKJpA3', 'sessionid': '9qdgjulitvosan1twtqe0h5bpfn4z3hg', 'bloglistingpage_7_read': 'true', 'blogdetailpage_6_read': 'true'}
[19/Sep/2022 23:08:44] "GET /blog-1/ HTTP/1.1" 200 27594
{}
[19/Sep/2022 23:08:45] "GET /blog-1/ HTTP/1.1" 200 23783
[19/Sep/2022 23:08:46] "GET /notifications/api/unread_list/?max=5 HTTP/1.1" 200 38
And the log of curl is:
[19/Sep/2022 23:12:24] "GET /notifications/api/unread_list/?max=5 HTTP/1.1" 200 38
{}
[19/Sep/2022 23:12:29] "GET /blog-1/ HTTP/1.1" 200 23783
I have written a function containing regex to separate some special parts of a txt file. The code works fine but I would like to get a dictionary as an output from this and the length should be 979:
import re
def logs():
    with open("C:/Users/ASUS/Desktop/logdata.txt", "r") as file:
        logdata = file.read()
    pattern = '''
    (?P<host>\d{1,}\.\d{1,}\.\d{1,}\.\d{1,}) # host name
    \s+\S+\s+
    (?P<user_name>(?<=-\s)(\w+|-)(?=\s))\s+\[ # user_name
    (?P<time>([^[]+))\]\s+" # time
    (?P<request>[^"]+)" # request
    '''
    for item in re.finditer(pattern, logdata, re.VERBOSE):
        print(item.groupdict())
This function is supposed to turn a text like this:
146.204.224.152 - feest6811 [21/Jun/2019:15:45:24 -0700] "POST /incentivize HTTP/1.1" 302 4622
to this capturing host, user_name etc:
{"host":"146.204.224.152",
"user_name":"feest6811",
"time":"21/Jun/2019:15:45:24 -0700",
"request":"POST /incentivize HTTP/1.1"}
How can I do this?
Just use groupdict() directly:
import re
def rtr_dict(txt):
    pattern = '''
    (?P<host>\d{1,}\.\d{1,}\.\d{1,}\.\d{1,}) # host name
    \s+\S+\s+
    (?P<user_name>(?<=-\s)(\w+|-)(?=\s))\s+\[ # user_name
    (?P<time>([^[]+))\]\s+" # time
    (?P<request>[^"]+)" # request
    '''
    if m:=re.match(pattern, txt, flags=re.VERBOSE):
        return m.groupdict()
tgt='146.204.224.152 - feest6811 [21/Jun/2019:15:45:24 -0700] "POST /incentivize HTTP/1.1" 302 4622'
>>>rtr_dict(tgt)
{'host': '146.204.224.152', 'user_name': 'feest6811', 'time': '21/Jun/2019:15:45:24 -0700', 'request': 'POST /incentivize HTTP/1.1'}
Could you please tell me how to do this for more than one line, the way I used a for loop?
Given:
tgt='''146.204.224.152 - feest6811 [21/Jun/2019:15:45:24 -0700] "POST /incentivize HTTP/1.1" 302 4622
146.204.224.153 - feest6811 [21/Jun/2019:15:45:24 -0700] "POST /incentivize HTTP/1.1" 302 4623
146.204.224.154 - feest6811 [21/Jun/2019:15:45:24 -0700] "POST /incentivize HTTP/1.1" 302 4624'''
If you have more than one match, you can return a list of dicts:
def rtr_dict(txt):
    pattern = '''
    (?P<host>\d{1,}\.\d{1,}\.\d{1,}\.\d{1,}) # host name
    \s+\S+\s+
    (?P<user_name>(?<=-\s)(\w+|-)(?=\s))\s+\[ # user_name
    (?P<time>([^[]+))\]\s+" # time
    (?P<request>[^"]+)" # request
    '''
    return [m.groupdict() for m in re.finditer(pattern, txt, flags=re.VERBOSE)]
>>> rtr_dict(tgt)
[{'host': '146.204.224.152', 'user_name': 'feest6811', 'time': '21/Jun/2019:15:45:24 -0700', 'request': 'POST /incentivize HTTP/1.1'}, {'host': '146.204.224.153', 'user_name': 'feest6811', 'time': '21/Jun/2019:15:45:24 -0700', 'request': 'POST /incentivize HTTP/1.1'}, {'host': '146.204.224.154', 'user_name': 'feest6811', 'time': '21/Jun/2019:15:45:24 -0700', 'request': 'POST /incentivize HTTP/1.1'}]
Or use a generator:
def rtr_dict(txt):
    pattern = '''
    (?P<host>\d{1,}\.\d{1,}\.\d{1,}\.\d{1,}) # host name
    \s+\S+\s+
    (?P<user_name>(?<=-\s)(\w+|-)(?=\s))\s+\[ # user_name
    (?P<time>([^[]+))\]\s+" # time
    (?P<request>[^"]+)" # request
    '''
    for m in re.finditer(pattern, txt, flags=re.VERBOSE):
        yield m.groupdict()
>>> list(rtr_dict(tgt))
# same list of dicts...
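For a large log file, the same idea can be applied line by line so that the whole file need not be held as one string. A sketch under assumptions: the pattern below is a simplified variant of the one above (no lookarounds), iter_log_dicts is a hypothetical helper name, and io.StringIO stands in for the opened log file:

```python
import io
import re

# Compiled once and reused for every line.
LOG_PATTERN = re.compile(r'''
    (?P<host>\d+\.\d+\.\d+\.\d+)    # host address
    \s+\S+\s+                       # separator field ("-")
    (?P<user_name>\w+|-)            # user name, or "-" when absent
    \s+\[(?P<time>[^\]]+)\]         # timestamp without the brackets
    \s+"(?P<request>[^"]+)"         # request without the quotes
''', re.VERBOSE)


def iter_log_dicts(lines):
    """Yield one dict per line that matches LOG_PATTERN; skip other lines."""
    for line in lines:
        m = LOG_PATTERN.search(line)
        if m:
            yield m.groupdict()


# Stand-in for open("logdata.txt"); the lines are taken from the example above.
sample_file = io.StringIO(
    '146.204.224.152 - feest6811 [21/Jun/2019:15:45:24 -0700] "POST /incentivize HTTP/1.1" 302 4622\n'
    '146.204.224.153 - feest6811 [21/Jun/2019:15:45:24 -0700] "POST /incentivize HTTP/1.1" 302 4623\n'
)

records = list(iter_log_dicts(sample_file))
print(len(records))
print(records[0])
```

Because a file object yields its lines lazily, passing the open file to iter_log_dicts keeps only one line in memory at a time.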
It's late, but this verbose regex will also work (it returns a list of dictionaries):
import re
def logs():
    with open("C:/Users/ASUS/Desktop/logdata.txt", "r") as file:
        logdata = file.read()
    pattern = """
    (?P<host>[\d\.]*) #IP host
    (\ -\ ) #followed by
    (?P<user_name>[\w-]*) #user name
    (\ *\[) #followed by
    (?P<time>[^\]]*) #time
    (\]\ *") #followed by
    (?P<request>[^\"]*) #request"""
    return [item.groupdict() for item in re.finditer(pattern, logdata, re.VERBOSE)]
Background information:
I have developed a server in Django (Python) and deployed it on AWS Elastic Beanstalk behind an ELB.
The server exposes many APIs, such as /testapi, /xyz, etc.
Problem:
Calls to /testapi sometimes return 4xx and sometimes 200.
I am attaching logs from the AWS EC2 instance, which show that some requests fail with a 4xx response code while others succeed with 200.
Please guide me on the right path so I can solve it.
-----------------------Start of code-------------------
@api_view(['POST'])
@permission_classes((AllowAny,))
def get_profile_details(request):
    if request.method == 'POST':
        try:
            logger = logging.getLogger("oauth")
            logger.info("get_profile_details POST endpoint called.")
            data = dict()
            client_id = request.data.get(UiString.client_id)
            client_secret = request.data.get(UiString.client_secret)
            logger.info("get_profile_details : client_id = " + client_id)
            logger.info("get_profile_details : client_secret = " + client_secret)
            if helper.validateClientIdAndClientSecret(client_id, client_secret):
                logger.info("get_profile_details : Invalid Client Credentials")
                return helper.returnResponse(ResponseCodes.Fail, "Unauthorized access", {}, False, status.HTTP_401_UNAUTHORIZED)
            try:
                type = request.data.get(UiString.type)
                email = request.data.get(UiString.email)
                mobile = request.data.get(UiString.mobile)
                username = request.data.get(UiString.username)
                if helper.isNullOrEmpty(type):
                    return helper.returnResponse(ResponseCodes.missing_required_param, "Type should not be null", data, False,
                                                 status.HTTP_400_BAD_REQUEST)
                if type == 'email' and helper.isNullOrEmpty(email):
                    return helper.returnResponse(ResponseCodes.missing_required_param, "Email should not be null", data, False,
                                                 status.HTTP_400_BAD_REQUEST)
                elif type == 'mobile' and helper.isNullOrEmpty(mobile):
                    return helper.returnResponse(ResponseCodes.missing_required_param, "Mobile should not be null", data, False,
                                                 status.HTTP_400_BAD_REQUEST)
                elif type == 'username' and helper.isNullOrEmpty(username):
                    return helper.returnResponse(ResponseCodes.missing_required_param, "Username should not be null", data,
                                                 False, status.HTTP_400_BAD_REQUEST)
                try:
                    # Check user detail based on type
                    if type == 'email':
                        logger.info("get_profile_details : by using email as type")
                        user_details = User.objects.get(email=email)
                        user_profile_detail = user_details.user_profile
                    elif type == 'username':
                        logger.info("get_profile_details : by using username as type")
                        user_details = User.objects.get(username=username)
                        user_profile_detail = user_details.user_profile
                    elif type == 'mobile':
                        logger.info("get_profile_details : by using mobile as type")
                        user_profile_detail = UserProfile.objects.get(mobile1=mobile)
                        #user_details = User.objects.get(pk=user_profile_detail.user_id)
                except (UserProfile.DoesNotExist, User.DoesNotExist) as ex:
                    logger.info("get_profile_details : user does not exist")
                    logger.exception(ex)
                    return helper.returnResponse(ResponseCodes.user_does_not_exist, "user does not exist",
                                                 data,
                                                 False, status.HTTP_200_OK)
                except Exception as ex:
                    logger.info("get_profile_details : Got a exception")
                    logger.exception(ex)
                    return helper.returnResponse(ResponseCodes.exception_in_finding_user, "Got a exception",
                                                 data,
                                                 False, status.HTTP_200_OK)
                user = user_profile_detail.user
                user_profile_serializer = UserProfileSerializer(user.user_profile)
                logger.info("get_profile_details : Got user detail : " + user.username)
                data[UiString.user] = {
                    UiString.email : user.email,
                    UiString.username : user.username,
                    UiString.first_name : user.first_name,
                    UiString.last_name : user.last_name,
                    UiString.user_profile : user_profile_serializer.data,
                }
                logger.info("get_profile_details : User details retrived successfully")
                return helper.returnResponse(ResponseCodes.success, "User details retrived successfully",
                                             data,
                                             False, status.HTTP_200_OK)
            except Exception as ex:
                logger.info("get_profile_details : Exception in finding user")
                logger.exception(ex)
                return helper.returnResponse(ResponseCodes.exception_in_finding_user, "Exception in finding user detail api",
                                             data,
                                             False, status.HTTP_200_OK)
        except:
            response = helper.generateStandardResponse(ResponseCodes.exception_in_finding_user, "Failed get user from db. Something went wrong.", data,
                                                       False);
            return Response(response, status.HTTP_417_EXPECTATION_FAILED)
    return Response("Method not allowed.", status.HTTP_405_METHOD_NOT_ALLOWED)
--------------------Start Of Logs -------------------
172.31.16.19 - - [28/Oct/2017:12:01:59 +0000] "POST /testapi/ HTTP/1.1" 404 218 "http://192.168.0.167/" "MozillaXYZ/1.0"
172.31.2.71 - - [28/Oct/2017:12:02:01 +0000] "POST /testapi/ HTTP/1.1" 404 218 "http://192.168.0.167/" "MozillaXYZ/1.0"
172.31.2.71 - - [28/Oct/2017:12:02:09 +0000] "POST /testapi/ HTTP/1.1" 404 218 "http://192.168.0.167/" "MozillaXYZ/1.0"
127.0.0.1 (-) - - [28/Oct/2017:12:02:22 +0000] "GET / HTTP/1.1" 200 54 "-" "Python-urllib/2.7"
172.31.2.71 (35.154.225.66) - - [28/Oct/2017:12:02:44 +0000] "PUT /xyz/ HTTP/1.1" 200 872 "http://192.168.0.167/" "MozillaXYZ/1.0"
172.31.16.19 (35.154.225.66) - - [28/Oct/2017:12:02:54 +0000] "PUT /xyz/ HTTP/1.1" 200 834 "http://192.168.0.167/" "MozillaXYZ/1.0"
172.31.2.71 (35.154.225.66) - - [28/Oct/2017:12:03:23 +0000] "POST /testapi/ HTTP/1.1" 200 866 "http://192.168.0.167/" "MozillaXYZ/1.0"
172.31.16.19 (35.154.225.66) - - [28/Oct/2017:12:03:25 +0000] "PUT /xyz/ HTTP/1.1" 200 857 "http://192.168.0.167/" "MozillaXYZ/1.0"
::1 (-) - - [28/Oct/2017:12:03:33 +0000] "OPTIONS * HTTP/1.0" 200 - "-" "Apache/2.4.25 (Amazon) mod_wsgi/3.5 Python/3.4.3 (internal dummy connection)"
::1 (-) - - [28/Oct/2017:12:03:34 +0000] "OPTIONS * HTTP/1.0" 200 - "-" "Apache/2.4.25 (Amazon) mod_wsgi/3.5 Python/3.4.3 (internal dummy connection)"
::1 (-) - - [28/Oct/2017:12:03:44 +0000] "OPTIONS * HTTP/1.0" 200 - "-" "Apache/2.4.25 (Amazon) mod_wsgi/3.5 Python/3.4.3 (internal dummy connection)"
::1 (-) - - [28/Oct/2017:12:03:45 +0000] "OPTIONS * HTTP/1.0" 200 - "-" "Apache/2.4.25 (Amazon) mod_wsgi/3.5 Python/3.4.3 (internal dummy connection)"
172.31.2.71 (35.154.225.66) - - [28/Oct/2017:12:03:51 +0000] "POST /testapi/ HTTP/1.1" 200 906 "http://192.168.0.167/" "MozillaXYZ/1.0"
172.31.16.19 (35.154.225.66) - - [28/Oct/2017:12:04:03 +0000] "POST /testapi/ HTTP/1.1" 200 884 "http://192.168.0.167/" "MozillaXYZ/1.0"
172.31.2.71 (35.154.225.66) - - [28/Oct/2017:12:04:08 +0000] "POST /testapi/ HTTP/1.1" 200 904 "http://192.168.0.167/" "MozillaXYZ/1.0"
172.31.2.71 (35.154.225.66) - - [28/Oct/2017:12:04:30 +0000] "POST /testapi/ HTTP/1.1" 200 882 "http://192.168.0.167/" "MozillaXYZ/1.0"
172.31.2.71 (52.66.89.157) - - [28/Oct/2017:12:05:17 +0000] "POST /testapi/ HTTP/1.1" 200 1268 "http://192.168.0.167/" "MozillaXYZ/1.0"
172.31.16.19 (35.154.225.66) - - [28/Oct/2017:12:05:33 +0000] "POST /testapi/ HTTP/1.1" 200 861 "http://192.168.0.167/" "MozillaXYZ/1.0"
172.31.16.19 (35.154.225.66) - - [28/Oct/2017:12:05:48 +0000] "POST /testapi/ HTTP/1.1" 400 88 "http://192.168.0.167/" "MozillaXYZ/1.0"
172.31.2.71 (35.154.225.66) - - [28/Oct/2017:12:05:53 +0000] "POST /testapi/ HTTP/1.1" 200 882 "http://192.168.0.167/" "MozillaXYZ/1.0"
172.31.2.71 (35.154.225.66) - - [28/Oct/2017:12:06:11 +0000] "PUT /xyz/ HTTP/1.1" 200 830 "http://192.168.0.167/" "MozillaXYZ/1.0"
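To see which requests fail, it may help to tally the status codes in the access log. This is only a diagnostic sketch: the pattern is an assumption based on the log lines above, the parenthesised field is assumed to be a client address added by the load balancer, and ACCESS_LINE and tally are hypothetical names:

```python
import re
from collections import Counter

# Assumed layout: host, optional "(client)" field, two "-" fields,
# [timestamp], "METHOD path protocol", status code.
ACCESS_LINE = re.compile(
    r'^(?P<host>\S+)'                              # address that connected
    r'(?:\s+\((?P<client>[^)]*)\))?'               # optional "(...)" field
    r'\s+\S+\s+\S+'                                # the two "-" fields
    r'\s+\[(?P<time>[^\]]+)\]'                     # timestamp
    r'\s+"(?P<method>\S+)\s+(?P<path>\S+)[^"]*"'   # request line
    r'\s+(?P<status>\d{3})'                        # HTTP status code
)


def tally(lines):
    """Count (path, status, has_client_field) combinations."""
    counts = Counter()
    for line in lines:
        m = ACCESS_LINE.match(line)
        if m:
            has_client = m.group('client') is not None
            counts[(m.group('path'), m.group('status'), has_client)] += 1
    return counts


# Three lines copied from the log excerpt above.
sample = [
    '172.31.16.19 - - [28/Oct/2017:12:01:59 +0000] "POST /testapi/ HTTP/1.1" 404 218 "http://192.168.0.167/" "MozillaXYZ/1.0"',
    '172.31.2.71 (35.154.225.66) - - [28/Oct/2017:12:03:23 +0000] "POST /testapi/ HTTP/1.1" 200 866 "http://192.168.0.167/" "MozillaXYZ/1.0"',
    '172.31.16.19 (35.154.225.66) - - [28/Oct/2017:12:05:48 +0000] "POST /testapi/ HTTP/1.1" 400 88 "http://192.168.0.167/" "MozillaXYZ/1.0"',
]

for key, count in sorted(tally(sample).items()):
    print(key, count)
```

In the excerpt above, the 404 lines are the ones without the parenthesised field, which may be a useful lead, though the log alone does not show the cause.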
I'm developing a site and there are some old URLs to redirect. I use something like this:
url(r'^i/personagens/*(?:index\.php)*', 'whatever.views.legacy_personanges_home', name='legacy_personanges_home'),
and in the views
def legacy_personanges_home(request):
    return HttpResponsePermanentRedirect('/bio/')
While developing I collect old urls and run them on a simple verifier to check my redirects and old pages are handled accordingly. Here it is:
host = 'http://127.0.0.1:8000'
oks = 0
bads = 0
good_codes = [200,301,302]
legacy_urls = [
'/i/personagens/index.php',
'/i/personagens/',
'/i/personagens/index.php?x=1',
]
import urllib2
for path in legacy_urls:
    try:
        request = urllib2.Request(host + path)
        request.get_method = lambda : 'HEAD'
        response = urllib2.urlopen(request)
        print response.getcode(), path, '-->', response.geturl()
        if response.getcode() in good_codes:
            oks += 1
        else:
            bads += 1
    except urllib2.HTTPError, e:
        print e.code, path
        bads += 1
total = oks + bads
print 'OK: %d/%d' % (oks, total)
print 'BAD: %d/%d' % (bads, total)
Response:
200 /i/personagens/index.php --> http://127.0.0.1:8000/bio/
200 /i/personagens/ --> http://127.0.0.1:8000/bio/
200 /i/personagens/index.php?x=1 --> http://127.0.0.1:8000/bio/
OK: 3/3
BAD: 0/3
[Finished in 2.7s]
Nice! urlopen() follows redirects and the response should be a 200. geturl() shows me the final url.
Later I discovered there were lots of different new urls to redirect, so I decided to use the Django Redirects framework. I installed it and put those very 3 urls there to redirect to my '/bio/' new url.
The weird part:
I opened http://127.0.0.1:8000/personagens/index.php in the browser and it worked: I could see the 301 followed by the 200.
I ran the same verifier again:
404 /i/personagens/index.php
404 /i/personagens/
404 /i/personagens/index.php?x=1
OK: 0/3
BAD: 3/3
[Finished in 1.2s]
I could see the logs
[28/Feb/2014 13:00:41] "HEAD /i/personagens/index.php HTTP/1.1" 404 0
[28/Feb/2014 13:00:41] "HEAD /i/personagens/ HTTP/1.1" 404 0
[28/Feb/2014 13:00:41] "HEAD /i/personagens/index.php?x=1 HTTP/1.1" 404 0
I changed HEAD to GET just in case, but it still does not work.
Thanks in advance.
I'm writing a quick app to view a giant XML file with some AJAX-style calls to viewgroup. My problem is that session['groups'] does not persist. An old array with only 4 members is stuck somewhere (a cookie?). That value is present when view is called. I then overwrite that session member with data from the recently opened XML file, which contains 20+ members.
However, when viewgroup is called, the session variable has reverted to the old value with only 4 members in the array!
The code is followed by its output. Note the 3 sessionStatus() calls.
def sessionStatus():
    print "# of groups in session = " + str(len(session['groups']))
@app.route('/')
def index():
    cams = [file for file in os.listdir('xml/') if file.lower().endswith('xml')]
    return render_template('index.html', cam_files=cams)
@app.route('/view/<xmlfile>')
def view(xmlfile):
    path = 'xml/' + secure_filename(xmlfile)
    print 'opening ' + path
    xmlf = open(path, 'r')
    tree = etree.parse(xmlf)
    root = tree.getroot()
    p = re.compile(r'Group')
    groups = []
    for g in root:
        if (p.search(g.tag) is not None) and (g.attrib['Comment'] != 'Root'):
            groups.append(Group(g.attrib['Comment']))
    sessionStatus()
    session['groups'] = groups
    sessionStatus()
    return render_template('view.html', xml=xmlfile, groups=groups)
@app.route('/viewgroup/<name>')
def viewGroup(name):
    groups = session['groups']
    sessionStatus()
    if groups is None or len(groups) == 0:
        raise Exception('invalid group name')
    groups_filtered = [g for g in groups if g.name == name]
    if len(groups_filtered) != 1:
        raise Exception('invalid group name', groups_filtered)
    group = groups_filtered[0]
    prop_names = [p.name for p in group.properties]
    return prop_names
Output
opening xml/d.xml
# of groups in session = 5
# of groups in session = 57
127.0.0.1 - - [17/Aug/2011 17:27:29] "GET /view/d.xml HTTP/1.1" 200 -
127.0.0.1 - - [17/Aug/2011 17:27:29] "GET /static/ivtl.css HTTP/1.1" 304 -
127.0.0.1 - - [17/Aug/2011 17:27:29] "GET /static/jquery.js HTTP/1.1" 304 -
127.0.0.1 - - [17/Aug/2011 17:27:29] "GET /static/raphael-min.js HTTP/1.1" 304 -
127.0.0.1 - - [17/Aug/2011 17:27:29] "GET /static/ivtl.css HTTP/1.1" 304 -
127.0.0.1 - - [17/Aug/2011 17:27:29] "GET /favicon.ico HTTP/1.1" 404 -
# of groups in session = 5
127.0.0.1 - - [17/Aug/2011 17:27:31] "GET /viewgroup/DeviceInformation HTTP/1.1" 200 -
I need all 57 groups to stay around. Any hints?
The data was simply too big to serialize into the session. Now I generate a key into a global dict and store that key in the session.
gXmlData[path] = groups
One drawback is that the global dict stays around forever and accumulates more and more keys, but the process isn't meant to live long.
Maybe your data is too big.
If your data is larger than 4KB, a server-side session is needed. Have a look at Flask-Session.
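The workaround described above can be sketched without any framework. This is an illustration only: a plain dict stands in for Flask's session, server_side_store, save_groups and load_groups are hypothetical names, the group data is made up, and the 4 KB figure is the approximate cookie limit mentioned above:

```python
import json
import uuid

COOKIE_LIMIT_BYTES = 4096  # approximate limit, as mentioned above

# Hypothetical in-process store; its contents are lost when the process exits.
server_side_store = {}


def save_groups(session, groups):
    """Keep the large payload on the server and put only a small key in the session."""
    key = uuid.uuid4().hex
    server_side_store[key] = groups
    session['groups_key'] = key


def load_groups(session):
    """Look the payload up by the key held in the session; empty list if missing."""
    return server_side_store.get(session.get('groups_key'), [])


session = {}  # stand-in for flask.session

# Made-up data: 57 groups, each with a list of properties.
groups = [{'name': 'Group%d' % i, 'properties': list(range(50))} for i in range(57)]

payload_size = len(json.dumps(groups).encode('utf-8'))
print(payload_size > COOKIE_LIMIT_BYTES)  # the full payload would not fit in a cookie

save_groups(session, groups)
print(len(json.dumps(session)))   # the session now holds only a short key
print(len(load_groups(session)))  # all groups are still retrievable
```

A server-side session extension such as Flask-Session follows the same idea while keeping the usual session interface.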