Python regex: Add "-" inside a string using re.sub - python

I have this line in my .txt file:
2016CT1021
I want to make it like this:
2016-CT-1021
I tried to use re.sub with this regex:
data = re.sub(r'\d\d+(?:\w\w\d\d\d\d)', r'\d\d+(?:-\w\w-\d\d\d\d)', data)
But it didn't replace anything. Can someone please help me? Thank you!

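For the current example, this will work:
re.sub(r'(\d\d+)(\w\w)(\d\d\d\d)', r'\1-\2-\3', data)
You should group each part with parentheses and refer to the groups by number in the replacement expression. The replacement string is not a pattern, so regex syntax such as \d or (?:...) has no meaning there.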
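A minimal runnable sketch of the same idea follows. The sample value stands in for the line read from the .txt file, and the stricter character classes ([A-Za-z]{2}, \d{4}) are an assumption about the data format, not part of the answer above.

```python
import re

# Hypothetical sample; in the question this line comes from a .txt file.
data = "2016CT1021"

# Three capture groups: leading digits, two letters, four trailing digits.
pattern = re.compile(r"(\d+)([A-Za-z]{2})(\d{4})")

# \1, \2 and \3 in the replacement refer back to the captured groups.
result = pattern.sub(r"\1-\2-\3", data)
print(result)  # 2016-CT-1021
```

Restricting the middle group to letters avoids \w, which also matches digits and underscores.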

Related

Python Regex - How do I fetch a word after a specific word in a string using python regex?

I need to fetch the value of "repo-name", which is "sonar-repo", from the multi-line commit string below. Can this be achieved with regex? Expected output: sonar-repo
Here is the string I need to read using regex:
commit_message=
"""repo-name=sonar-repo;repo-title=Sonar;repo-description=A little demo;repo-requester=Jack
"""
You should be able to use a regex that looks for repo-name=, then for the next ; after it, and returns what is in between. Something like this:
(?<=repo-name=).*?(?=;)
Tested it here with regex101
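For reference, the lookaround pattern above can be used from Python roughly like this. It is a small sketch that reuses the commit string from the question; the variable name repo_name is only illustrative.

```python
import re

commit_message = """repo-name=sonar-repo;repo-title=Sonar;repo-description=A little demo;repo-requester=Jack
"""

# The lookbehind anchors the match right after "repo-name=",
# and the lookahead stops it just before the next ";".
match = re.search(r"(?<=repo-name=).*?(?=;)", commit_message)
repo_name = match.group(0) if match else None
print(repo_name)  # sonar-repo
```

Because both lookarounds are zero-width, the whole match (group 0) is already the value, so no capture group is needed.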
Try this:
import re
commit_message= 'repo-name=sonar-repo;repo-title=Sonar;repo-description=A little demo;repo-requester=Jack'
print(re.search(r'repo-name=(.*?);', commit_message).group(1))
Output:
sonar-repo

Extract values in name=value lines with regex

I'm really sorry for asking, because there are already some questions like this around, but I can't adapt their answers to my problem.
These are the input lines (e.g. from a config file):
profile2.name=share2
profile8.name=share8
profile4.name=shareSSH
profile9.name=share9
I just want to extract the values after the = sign with a regex in Python 3.9.
I tried this on regex101.
^profile[0-9]\.name=(.*?)
But this gives me the variable name including the = sign as the result, e.g. profile2.name=. I want exactly the opposite.
The expected results (what Python's re.findall() returns) are
['share2', 'share8', 'shareSSH', 'share9']
Try the pattern profile\d+\.name=(.*); see the regex101 example.
import re
# txt is assumed to hold the input lines from the question
re.findall(r'profile\d+\.name=(.*)', txt)
# output
['share2', 'share8', 'shareSSH', 'share9']
But this problem doesn't necessarily need a regex; str.split should work absolutely fine.
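The split-based code from this answer is not preserved here, so the following is only a sketch of what such an approach could look like, assuming the input lines are held in a string named txt.

```python
# Input lines from the question, assumed to be available as one string.
txt = """profile2.name=share2
profile8.name=share8
profile4.name=shareSSH
profile9.name=share9"""

# Split each line once at the first "=" and keep the right-hand side.
values = [line.split("=", 1)[1] for line in txt.splitlines() if "=" in line]
print(values)  # ['share2', 'share8', 'shareSSH', 'share9']
```

Passing 1 as maxsplit keeps any later = characters inside the value intact.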
Try removing the ? quantifier. It will make your capture group match an empty string.
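The difference is easy to see side by side. This sketch assumes the input is in a string named txt and uses re.MULTILINE so that ^ matches at the start of every line.

```python
import re

txt = """profile2.name=share2
profile8.name=share8
profile4.name=shareSSH
profile9.name=share9"""

# Lazy group at the end of the pattern: it is satisfied by zero characters.
lazy = re.findall(r"^profile[0-9]\.name=(.*?)", txt, flags=re.MULTILINE)

# Greedy group: it runs to the end of each line.
greedy = re.findall(r"^profile[0-9]\.name=(.*)", txt, flags=re.MULTILINE)

print(lazy)    # ['', '', '', '']
print(greedy)  # ['share2', 'share8', 'shareSSH', 'share9']
```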

Regex to remove strings from list that do not match given prefix

I have a string that includes multiple comma-separated lists of values, always embedded between <mks:Field name="MyField"> and </mks:Field>.
For example:
<mks:Field name="MyField">X001_ABC</mks:Field><mks:Field name="AnotherField">X002_XYZ</mks:Field><mks:Field name="MyField"></mks:Field><mks:Field name="MyField">X000_Test1,X000_Test2</mks:Field><mks:Field name="MyField">X001_ABC,X000_Test1</mks:Field><mks:Field name="MyField">X000_Test1,X000_Test2,X002_XYZ</mks:Field>
In this example I have the following values to work with:
X001_ABC
(empty)
X000_Test1,X000_Test2
X001_ABC,X000_Test1
X000_Test1,X000_Test2,X002_XYZ
Now I want to remove all the values that do not start with the prefix "X000_", including any needless commas, so that my result looks like this:
<mks:Field name="MyField"></mks:Field><mks:Field name="AnotherField">X002_XYZ</mks:Field><mks:Field name="MyField"></mks:Field><mks:Field name="MyField">X000_Test1,X000_Test2</mks:Field><mks:Field name="MyField">X000_Test1</mks:Field><mks:Field name="MyField">X000_Test1,X000_Test2</mks:Field>
I have tried the following regex, but it does not work properly when a field contains only a single value that does not match. I also do not want to change the regex whenever a new value with my prefix is introduced (e.g. X000_Test3).
Search: (?<=name="MyField">)[^<>](?:.*?(X000_Test1,X000_Test2|X000_Test1|X000_Test2))?.*?(?=</mks:Field>)
Replace: \1
This gives me the following result that does not match the expected output:
<mks:Field name="MyField">X000_Test1,X000_Test2</mks:Field><mks:Field name="MyField">X000_Test1</mks:Field><mks:Field name="MyField">X000_Test2</mks:Field>
Unfortunately I cannot simply parse the string with something else - I only have the option of a regex search/replace in this case.
Thank you in advance, any help would be appreciated.
If you are using JavaScript, use this:
const prefix = 'X000';
let pattern = new RegExp(`((?<=>)|,)((?!${prefix}|[>\<,]).)*(,|(?=\<))`, 'g');
For any other language, use this:
'/((?<=>)|,)((?!X000|[>\<,]).)*(,|(?=\<))/';
X000 is the prefix you want to keep.
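As a rough check of the pattern above, here is a small Python sketch that applies it with re.sub. The empty replacement string and the two sample fragments are assumptions for illustration, since the answer does not say what to replace with. Note that the pattern is not tied to name="MyField", so it also empties other fields whose values lack the prefix.

```python
import re

# The answer's pattern, with the unnecessary backslashes before "<" dropped.
pattern = re.compile(r'((?<=>)|,)((?!X000|[><,]).)*(,|(?=<))')

my_field = '<mks:Field name="MyField">X001_ABC,X000_Test1</mks:Field>'
other_field = '<mks:Field name="AnotherField">X002_XYZ</mks:Field>'

# Assumed replacement: an empty string.
cleaned_my_field = pattern.sub('', my_field)
cleaned_other_field = pattern.sub('', other_field)

print(cleaned_my_field)     # <mks:Field name="MyField">X000_Test1</mks:Field>
print(cleaned_other_field)  # <mks:Field name="AnotherField"></mks:Field>
```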

Filtering a string based on some group of characters

I would like to filter the string mentioned below in python.
u'reviews': [{u'content':
Can someone please let me know how I should create a new string like the one below?
reviews: [{content:
I am aware that Python provides excellent pattern matching, but I don't know how it works.
Thanks in advance
You can use re.sub for this:
import re
x = "u'reviews': [{u'content':"
# Drop the u'...' wrapper and keep only the text between the quotes
print(re.sub(r"u'([^']*)'", r"\1", x))

Regex to grab number in line

I have an HTML file that I am reading the line below from. I would like to grab only the number that appears after the ':' and before the ',' using a regex. Thanks in advance.
"totalPages":15,"bloodhoundHtml"
"totalPages":([0-9]*),
You can see the Demo here
Then the Python code is:
import re
p = re.compile(r'"totalPages":([0-9]*),')
print(p.findall('"totalPages":15,"bloodhoundHtml"'))
You can try :\d+, to get ':15,'
then trim the leading ':' and the trailing ',' to get just the number.
I don't know whether Python can use a variable (a named group) in the regex. I'm a C# programmer, and in C# I can use :(?<id>\d+), to match this string and get the number directly with result.Groups["id"].
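Python does support named groups, written as (?P<name>...). A minimal sketch using the line from the question follows. The group name id mirrors the C# example above, and the "totalPages" prefix is added here to make the match more specific.

```python
import re

line = '"totalPages":15,"bloodhoundHtml"'

# (?P<id>...) is Python's syntax for a named capture group.
match = re.search(r'"totalPages":(?P<id>\d+),', line)
total_pages = int(match.group("id")) if match else None
print(total_pages)  # 15
```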
:\d{1,},
This also works for parsing the line you gave. According to this post, you might run into some trouble parsing the HTML.
