Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Questions asking for code must demonstrate a minimal understanding of the problem being solved. Include attempted solutions, why they didn't work, and the expected results. See also: Stack Overflow question checklist
Closed 9 years ago.
I have a string like
x = '''
*Anrede:*
Herr
*Name:*
Tobias
*Firma:*
*Strasse/Nr:*
feringerweg
*PLZ/Ort:*
72531
*Mail:*
tovoe#gmeex.de [1]
'''
The string contains a zip code in the PLZ/Ort: field. I want to extract that zip code from the whole string. A regex seems like the right tool, but I don't know regex.
Assuming the input in your example is a file with multiple lines, you can try something like this:
import re
with open(filename, 'r') as f:
    for line in f:
        # match only lines that consist of exactly five digits
        match = re.match(r"^(\d{5})$", line)
        if match:  # re.match returns None for non-matching lines
            print(match.group(1))  # the five-digit group
If this is just one long string, use re.search instead of re.match (re.match only tries the start of the string) and drop the ^ (line begin) and $ (line end) anchors: (\d{5})
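For the single-string case, a minimal sketch could look like the following. The sample text is a shortened copy of the string from the question, and the \b word boundaries are my own addition, assumed here so that a longer run of digits is not mistaken for a five-digit code.

```python
import re

# Shortened copy of the string from the question (assumed sample data).
x = """
*Strasse/Nr:*
feringerweg
*PLZ/Ort:*
72531
*Mail:*
"""

# re.search scans the whole string; re.match would only try position 0.
# \b...\b (an added assumption) rejects digit runs longer than five.
match = re.search(r"\b(\d{5})\b", x)
zip_code = match.group(1) if match else None
print(zip_code)
```

Checking for None before calling group() avoids an AttributeError when the string contains no zip code.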
I'm assuming that the Postleitzahl always follows the *PLZ/Ort:* line, separated from it only by whitespace or line breaks, and that it's the only text on its line. If that's the case, then you can use something like:
import re
m = re.search(r'^\*PLZ/Ort:\*\s+(\d{5})$', x, re.M)
if m:
    print(m.group(1))
You can try this regex:
(?<=PLZ\/Ort)[\s\S]+?([a-zA-Z0-9\- ]{3,9})
It also supports alphanumeric postal codes between 3 and 9 characters long.
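As a rough sketch of how that pattern might be applied in Python (the sample text below is an assumption modelled on the question, and the variable names are illustrative):

```python
import re

# Assumed sample, modelled on the string in the question.
text = "*PLZ/Ort:*\n72531\n*Mail:*\n"

# The lookbehind starts the match right after the "PLZ/Ort" label,
# [\s\S]+? lazily skips the ":*" and the newline, and group 1 is the code.
pattern = re.compile(r"(?<=PLZ/Ort)[\s\S]+?([a-zA-Z0-9\- ]{3,9})")
m = pattern.search(text)
postal_code = m.group(1) if m else None
print(postal_code)

# Caveat: with an empty field the lazy skip keeps going into the next label.
empty = pattern.search("*PLZ/Ort:*\n*Mail:*\n")
print(empty.group(1))
```

The second search shows the trade-off of such a permissive character class: when the postal code is missing, text from the following label is captured instead, so the result may need an extra sanity check.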
Closed. This question needs details or clarity. It is not currently accepting answers.
Want to improve this question? Add details and clarify the problem by editing this post.
Closed 1 year ago.
Write a Python program that will search for lines that start with 'F', followed by 2 characters, followed by 'm:' using the mbox-short.txt text file.
Write a Python program that will search for lines that start with From and have an @ sign.
My code:
import re
file_hand = open("mbox-short.txt")
for line in file_hand:
    line = line.rstrip()
    if re.search('From:', line):
        print(line)
Your code seems to lack the actual regular expression that will find the result you are looking for. If I understand correctly, your aim is to find lines starting with F, followed by ANY two characters, and to print each such line to the terminal. Let me guide you:
import re
file_hand = open("mbox-short.txt")
for line in file_hand:  # NB: the loop body must be indented
    result = re.search("^f..", line)  # arguments: pattern, string to search
    # ^ anchors the match at the start of the line
    # . matches one occurrence of any character (except a newline)
    if result:  # re.search returns None when there is no match
        print(line)
I trust you can follow this. If you need capital F, I will leave it as a homework exercise for you to find out how to do the capital F.
You can practice with regexp here:
https://regexr.com/
Or read more about it here:
https://www.youtube.com/watch?v=rhzKDrUiJVk
I think you didn't ask your question clearly enough for everybody to understand. Also, format your code as a code block ('Code Sample') for better readability. I already did that with your code, so you can have a look at that.
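Both exercises from the question can be sketched as below. The sample lines are hypothetical stand-ins for mbox-short.txt, and the second pattern assumes that the sign meant in the exercise is the at sign (@) of an e-mail address.

```python
import re

# Hypothetical lines in the style of mbox-short.txt (not the real file).
lines = [
    "From: alice@example.com",
    "From alice@example.com Sat Jan  5 09:14:16 2008",
    "Subject: weekly report",
    "From: no-address-here",
]

# Exercise 1: 'F', then any two characters, then 'm:' at the start of the line.
f_lines = [line for line in lines if re.search(r"^F..m:", line)]

# Exercise 2: starts with 'From' and contains an at sign somewhere after it.
at_lines = [line for line in lines if re.search(r"^From.*@", line)]

print(f_lines)
print(at_lines)
```

To run this against the real file, the list could be replaced by iterating over open("mbox-short.txt") and calling rstrip() on each line, as in the question's code.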
Closed. This question needs debugging details. It is not currently accepting answers.
Edit the question to include desired behavior, a specific problem or error, and the shortest code necessary to reproduce the problem. This will help others answer the question.
Closed 5 years ago.
I am new to regex and encountered a problem. I need to parse a list of last names and first names to use in a url and fetch an html page. In my last names or first names, if it's something like "John, Jr" then it should only return John but if it's something like "J.T.R", it should return "JTR" to make the url work. Here is the code I wrote but it doesn't capture "JTR".
import re
last_names_parsed = []
for ln in last_names:
    L_name = re.match(r'\w+', ln)
    last_names_parsed.append(L_name[0])
However, this will not capture J.T.R properly. How should I modify the code to properly handle both?
You can add \. to the regular expression:
import re
final_data = [re.sub(r'\.', '', re.findall(r'(?<=^)[a-zA-Z\.]+', i)[0]) for i in last_names]
Regex explanation:
(?<=^): positive lookbehind, ensures that the ensuing regex only registers a match at the beginning of the string
[a-zA-Z\.]: matches any alphabetical character: [a-zA-Z], along with a period .
+: repeats the previous character class ([a-zA-Z\.]) one or more times, so the match continues for as long as a period or alphabetic character is found. For instance, in "John, Jr", only John will be matched, because the comma , is not included in the character class [a-zA-Z\.], thus halting the match.
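The same idea can be written as a small helper. This is only a sketch: clean_name is a hypothetical name, and it assumes names consist of ASCII letters and periods, so a name such as "O'Neil" would be cut at the apostrophe.

```python
import re

def clean_name(name):
    """Return the leading run of letters and periods, with the periods removed."""
    # re.match already anchors at the start of the string, so no lookbehind is needed.
    m = re.match(r"[A-Za-z.]+", name)
    return m.group(0).replace(".", "") if m else ""

last_names = ["John, Jr", "J.T.R", "Smith"]  # assumed sample input
last_names_parsed = [clean_name(n) for n in last_names]
print(last_names_parsed)
```

Returning an empty string for names that do not start with a letter or period avoids the IndexError that indexing [0] on an empty findall result would raise.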
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Closed 9 years ago.
... html ...
[{"url":"/test/test/url","id":"111111"},{"url":"/test/test/url","id":"111111"}, {"url":"/test/test/url","id":"1111"}]
.... html ...
I have a JSON-like string embedded in HTML.
How do I write a regular expression that extracts values such as
"/test/test/url" and the "1111" that comes after "id":?
Thanks in advance,
Don't use regular expressions here, use the json module. This is what it's designed for.
import json
# html must hold only the JSON array, not the surrounding page markup
mylist = json.loads(html)
for subdict in mylist:
    print(subdict['url'])
    print(subdict['id'])
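Since the array in the question sits inside other HTML, one possible approach is to cut the array out first and then hand it to json.loads. The sketch below assumes the page contains exactly one such array; the surrounding markup is invented for illustration.

```python
import json
import re

# Hypothetical page; only the embedded array is taken from the question.
html = (
    '<p>... html ...</p>\n'
    '[{"url":"/test/test/url","id":"111111"}, {"url":"/test/test/url","id":"1111"}]\n'
    '<p>.... html ...</p>'
)

# Assumption: "[{" and "}]" occur only at the boundaries of the one array.
m = re.search(r"\[\{.*\}\]", html, re.S)
records = json.loads(m.group(0)) if m else []

pairs = [(rec["url"], rec["id"]) for rec in records]
print(pairs)
```

The regex only locates the array; the actual parsing is still left to the json module, which handles escaping and whitespace correctly.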
You should go with @Haidro's answer on this, but if you want to use a regex, or see how you would, then here's some sample code:
import re
regex = re.compile(r'"url":("[^"]+"),"id":("[^"]+")')
for m in regex.finditer(yourString):
    print(m.group(1), m.group(2))
[^"] is a negated character class that matches any character except ".
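A slightly more tolerant variant, offered only as a sketch, moves the quotes outside the capture groups and allows optional whitespace around the colon and comma. The sample string and the name pair_re are assumptions for illustration.

```python
import re

# Assumed sample; the second record has a space after the comma.
your_string = '[{"url":"/test/test/url","id":"111111"}, {"url":"/a/b", "id":"1111"}]'

# Quotes sit outside the groups, so the captured values come back unquoted.
pair_re = re.compile(r'"url"\s*:\s*"([^"]+)"\s*,\s*"id"\s*:\s*"([^"]+)"')
pairs = pair_re.findall(your_string)
print(pairs)
```

This still depends on "url" always appearing directly before "id", which a JSON parser would not require.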
EDIT:
I love how I recommend the other answer, but explain how to do it if one really wants to know, yet I somehow still get downvoted.
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Closed 9 years ago.
I want to perform re.search using the pattern as a raw string like below.
m=re.search(r'pattern',string)
But what if the pattern is stored in a variable, like pat='pattern'? How do I perform a raw search then?
You declare the pattern string as a raw string:
regexpattern = r'pattern'
m=re.search(regexpattern,string)
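It may help to see that "raw" is a property of the literal in the source code, not of the string object, so a variable needs no special treatment. The sketch below uses made-up sample text; re.escape is shown for the separate case where run-time text should be matched literally.

```python
import re

# The r prefix only changes how the literal is written; the object is a plain str.
same = (r"\d+" == "\\d+")

pat = r"\d+"                        # pattern stored in a variable
m = re.search(pat, "order 66")      # passed to re.search like any other string
number = m.group(0) if m else None

# Hypothetical run-time text that should match literally, metacharacters included.
user_text = "1+1"
literal = re.search(re.escape(user_text), "what is 1+1?")

print(same, number, literal.group(0))
```

Without re.escape, the + in "1+1" would be read as a quantifier and the search would look for one or more 1 characters followed by another 1.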
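You can also build the pattern from a variable this way, where test is a string variable and l is the string to search. Note that a triple-quoted literal is not raw by itself; write r"""...""" if the fixed part contains backslashes.
import re
pat = """pat%s""" % test
pattern = re.compile(pat, re.I | re.M)
match = pattern.search(l)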
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Closed 9 years ago.
I'm new to python regular expression so any help will be appreciated. Thanks in advance.
I have this
string = "Restaurant_Review-g503927-d3864736-Reviews"
I would like to extract 'g503927' and 'd3864736' from it.
I know you can use re.match(pattern, string, flags=0)
But I'm not sure how to write the regex for it. Please help.
Using re.findall:
>>> import re
>>> s = "Restaurant_Review-g503927-d3864736-Reviews"
>>> re.findall(r'[a-z]\d+', s)
['g503927', 'd3864736']
[a-z]\d+ matches a lowercase letter followed by one or more digits.
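If the two IDs always appear in this order and with these prefixes (an assumption based on the single example in the question), a stricter sketch can capture them by position:

```python
import re

s = "Restaurant_Review-g503927-d3864736-Reviews"

# Assumes a 'g' ID immediately followed by a 'd' ID, each delimited by hyphens.
m = re.search(r"-(g\d+)-(d\d+)-", s)
g_id, d_id = m.groups() if m else (None, None)
print(g_id, d_id)
```

Unlike findall with [a-z]\d+, this does not pick up stray letter-digit pairs elsewhere in the string, at the cost of being tied to this exact layout.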
This should work:
import re
pattern = re.compile(r"[a-z][0-9]+")
print(pattern.findall(string))  # string is the variable from the question
A non-regex solution is also possible, but it depends on what delimits the units; here I assume it's a -:
s = "Restaurant_Review-g503927-d3864736-Reviews"
outputs = [i for i in s.split('-') if i[0].isalpha() and i[1:].isdigit()]
No need to use a regex; use the split() method:
s = "Restaurant_Review-g503927-d3864736-Reviews"
print(s.split('-'))
print(s.split('-')[1])
print(s.split('-')[2])
more info here: http://docs.python.org/2/library/stdtypes.html#str.split