Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 8 years ago.
I am not sure how to extract the named groups I created in my regular expression, specifically datetime and IP. I have read other posts and the documentation, but I am still confused. Could someone give me an example to follow? I would like to extract datetime and IP and store them in variables for later use.
sample log:
log = 'Oct 7 13:24:36 192.168.10.2 2013: 10:07-13:24:35 httpproxy[15359]: id="0001"
httpproxy515139 = re.compile(r'(?P<datetime>\w\w\w\s+\d+\s+\d\d:\d\d:\d\d)\s+(?P<IP>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}).*')
This sample should help you:
>>> import re
>>> sample = 'this is a sample text'
>>> third_word = re.compile(r'\S+ \S+ (?P<word>\S+) .*')
>>> ms = third_word.match(sample)
>>> ms.groupdict()
{'word': 'a'}
You need to access the groupdict() method of the returned match object.
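Applied to the log line from the question, the same idea might look like the sketch below. It assumes the log line ends where the question's sample stops and that the IP group was meant to start with \d; the variable names pattern, datetime_str and ip are only illustrative.

```python
import re

# Sample line from the question (assumed to end after id="0001").
log = 'Oct 7 13:24:36 192.168.10.2 2013: 10:07-13:24:35 httpproxy[15359]: id="0001"'

# Named groups: (?P<name>...) makes the group reachable by name.
pattern = re.compile(
    r'(?P<datetime>\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+'
    r'(?P<IP>\d{1,3}(?:\.\d{1,3}){3})'
)

m = pattern.match(log)
if m:
    # Either pull single groups by name...
    datetime_str = m.group('datetime')
    ip = m.group('IP')
    # ...or get all named groups at once as a dict.
    fields = m.groupdict()
    print(datetime_str)  # Oct 7 13:24:36
    print(ip)            # 192.168.10.2
```

Checking that m is not None before calling group() avoids an AttributeError on lines that do not match.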
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 2 years ago.
I have a string
x='125mg'
First, I want to detect whether a number and text are joined together, and if they are, separate them into 125 and mg.
Try this:
import re
a = '125mg switch'
' '.join(re.findall(r'[A-Za-z]+|\d+', a))
Output:
'125 mg switch'
This is a simple task in Python using the re (regular expression) module. Here is code that extracts the number from the string:
Python code:
import re
a = '125msg'
result = re.findall(r'\d+', a)  # every run of digits in the string
for i in result:
    print(i)
You can do this with a regular expression in Python. I don't know whether pandas can do it.
Read more about it from this link.
import re
test_str = "125mg"
res = re.findall(r'[A-Za-z]+|\d+', test_str)
print(str(res))
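The question also asks how to detect that the number and the text are joined. One possible sketch uses re.fullmatch, which only succeeds when the whole string is digits followed by letters; the helper name split_number_unit is made up for this example.

```python
import re

def split_number_unit(text):
    """Return (number, unit) if text is digits followed directly by letters, else None."""
    m = re.fullmatch(r'(\d+)([A-Za-z]+)', text)
    if m is None:
        return None
    return m.group(1), m.group(2)

print(split_number_unit('125mg'))   # ('125', 'mg')
print(split_number_unit('125'))     # None
print(split_number_unit('125 mg'))  # None
```

The None result for '125 mg' is deliberate here: the space means the number and unit are already separate.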
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 4 years ago.
I'd like to remove all whitespace in URLs / email addresses. The addresses are in a "normal" string, like: "Today the weather is fine. Tomorrow, we'll see. More information: www.weather .com or info #weather.com"
I'm looking for a good regex (using Python's re module), but my attempts can't handle all cases:
re.sub(u'(www)([ .])([a-zA-Z\-]+)([ .])([a-z]+)', '\\1.\\3.\\5')
Your expression for the URL just requires a little fixing. The expression for the email address can be derived from the URL expression.
>>> #EXPRESSIONS:
>>> url = r"(www)+([ .])+([a-zA-Z\-]+)+([ .])+([a-z]+)"
>>> ema = r"([a-zA-Z]+)+([ +#]+)+([a-zA-Z\-]+\.com)"
>>>
>>> #IMPORTINGS:
>>> import re
>>>
>>> #YOUR DATA:
>>> string = "Today the weather is fine. Tomorrow, we'll see. More information: www.weather .com or info #weather.com"
>>>
>>> #Scraping Data
>>> "".join(re.findall(url,string)[0])
'www.weather.com'
>>> "".join(re.findall(ema,string)[0]).replace(" ","")
'info#weather.com'
>>>
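If the goal is to clean the addresses in place rather than pull them out, re.sub with backreferences may be a better fit. This is only a sketch for the sample sentence: the patterns assume URLs start with www and that # is the separator in the email address, as in the question, and the names url_pattern, email_pattern and cleaned are illustrative.

```python
import re

text = ("Today the weather is fine. Tomorrow, we'll see. "
        "More information: www.weather .com or info #weather.com")

# Allow optional whitespace around each dot of the URL, then rebuild it without spaces.
url_pattern = re.compile(r'(www)\s*\.\s*([a-zA-Z-]+)\s*\.\s*([a-z]+)')
# Allow optional whitespace around the '#' separator of the address.
email_pattern = re.compile(r'([a-zA-Z]+)\s*#\s*([a-zA-Z-]+\.[a-z]+)')

cleaned = url_pattern.sub(r'\1.\2.\3', text)
cleaned = email_pattern.sub(r'\1#\2', cleaned)
print(cleaned)
```

These patterns will not cover every real-world URL or address; they only handle stray whitespace next to the separators.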
Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 7 years ago.
I am new to python (pardon my bad terminology). I cannot find a solution to my problem.
I am trying to make a simple encryption system. I want to convert the characters of user input to specific characters. For example: ABC would turn into ZYX.
Can anyone help me with this? Thanks.
Assuming you just want a simple substitution cipher, you can use the translate method:
# in Python 2:
# from string import maketrans
# table = maketrans('ABC', 'ZYX')
# in Python 3:
table = str.maketrans('ABC', 'ZYX')  # add the rest of the alphabet and the desired
# substitutions
print('CBA'.translate(table))
# output: XYZ
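To cover the whole alphabet, the table can be built from string.ascii_uppercase and its reverse. This is a sketch for Python 3 that assumes the intended mapping is A to Z, B to Y and so on, as the ABC to ZYX example suggests; characters outside A-Z pass through unchanged.

```python
import string

upper = string.ascii_uppercase
# Map A->Z, B->Y, ..., Z->A (a reversed-alphabet substitution).
table = str.maketrans(upper, upper[::-1])

encrypted = 'ABC'.translate(table)
decrypted = encrypted.translate(table)  # the same table undoes the mapping
print(encrypted)  # ZYX
print(decrypted)  # ABC
```

Because the mapping is its own inverse, one table serves for both encryption and decryption. Note that this kind of substitution is not secure encryption.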
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 7 years ago.
I have a content like this:
aid: "1168577519", cmt_id = 1168594403;
Now I want to get all the number sequences:
1168577519
1168594403
by regex.
I have never dealt with a regex problem before, but this time I need one for a parsing job.
So far I can only get the sequences after "aid" and "cmt_id" separately. I don't know how to merge them into one regex.
My current progress:
pattern = re.compile('(?<=aid: ").*?(?=",)')
print(pattern.findall(s))
and
pattern = re.compile('(?<=cmt_id = ).*?(?=;)')
print(pattern.findall(s))
There are many different approaches to designing a suitable regular expression which depend on the range of possible inputs you are likely to encounter.
The following would solve your exact question but could fail on differently styled input. You need to provide more details, but this would be a start.
import re
s = 'aid: "1168577519", cmt_id = 1168594403;'
re_content = re.search(r'aid: "([0-9]+)",\W*cmt_id = ([0-9]+);', s)
print(re_content.groups())
This gives the following output:
('1168577519', '1168594403')
This example assumes that there might be other numbers in your input, and you are trying to extract just the aid and cmt_id values.
The simplest solution is to use re.findall
Example
>>> import re
>>> string = 'aid: "1168577519", cmt_id = 1168594403;'
>>> re.findall(r'\d+', string)
['1168577519', '1168594403']
>>>
\d+ matches one or more digits.
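If the input can contain other numbers and only the values after aid and cmt_id are wanted, the two lookbehind patterns from the question can be merged with alternation (|). A minimal sketch, assuming the input looks like the sample in the question:

```python
import re

s = 'aid: "1168577519", cmt_id = 1168594403;'

# Each alternative has its own fixed-width lookbehind, so only digits
# that directly follow 'aid: "' or 'cmt_id = ' are returned.
pattern = re.compile(r'(?<=aid: ")\d+|(?<=cmt_id = )\d+')
numbers = pattern.findall(s)
print(numbers)  # ['1168577519', '1168594403']
```

Python's re module requires each lookbehind to have a fixed width, which is why the two prefixes stay in separate alternatives instead of sharing one lookbehind.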
Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 8 years ago.
For some reason, when I use a regex to get the number I need, it returns None.
But when I run it here http://regexr.com/38n3o it works.
The regex was designed to get the last number of the IP so it can be removed.
lanip=74.125.224.72
notorm=re.search("/([1-9])\w+$/g", lanip)
That is not how you define a regular expression in Python. The correct way would be:
import re
lanip = "74.125.224.72"
notorm = re.search(r"([1-9])\w+$", lanip)
print(notorm)
Output:
<_sre.SRE_Match object at 0x10131df30>
You were using JavaScript regex syntax. To read more about the correct Python syntax, read the documentation.
If you want to match the last number of an IP, use:
import re
lanip = "74.125.224.72"
notorm = re.search(r"(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)", lanip)
print(notorm.group(4))
Output:
72
Regex used from http://www.regular-expressions.info/examples.html
Your example did work in this scenario, but would match a lot of false positives.
What is lanip's type? That can't run.
It needs to be a string, i.e.
lanip = "74.125.224.72"
Also, your RE syntax looks strange; make sure you've read the documentation on Python's RE syntax.
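Since the stated goal is to get the last number of the IP so it can be removed, a shorter pattern anchored at the end of the string may be enough. This sketch assumes lanip is already a well-formed dotted string and does not validate it; last_octet and network_part are illustrative names.

```python
import re

lanip = "74.125.224.72"

# Digits at the very end of the string are the last octet.
last = re.search(r'(\d+)$', lanip)
last_octet = last.group(1) if last else None

# Strip the final ".<digits>" to remove the last octet.
network_part = re.sub(r'\.\d+$', '', lanip)

print(last_octet)    # 72
print(network_part)  # 74.125.224
```

In Python the pattern is a plain string, so the JavaScript-style / delimiters and the g flag are dropped.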