Regex capture group is not capturing data [closed] - python

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 2 years ago.
I am trying to capture any alphanumeric characters between single quotes ('').
Regex:
'(.*.doc)' will only capture .doc files.
'(\w)' should capture any alphanumeric character.
But I am looking to capture any character between the quotes except the dash characters (----).

Here you can use the following regular expression: ([^\-\[\][\n']+)
An example:
regexr.com/5btcs

Does this work for you?
'[^'-]*'
It means: a single ', then any run of characters that are neither ' nor -, then another '.
If you want to keep the text around the dashes, though, you may have to capture everything between the quotes and filter the dashes out afterwards.
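As a rough illustration of the "capture inclusively, then filter" idea, the Python sketch below grabs everything between each pair of single quotes and removes the dashes afterwards. The sample text and the helper name quoted_without_dashes are made up for this example and do not come from the question.

```python
import re

def quoted_without_dashes(text):
    """Return the contents of each '...' pair with all '-' characters removed."""
    # Capture everything between a pair of single quotes, dashes included.
    captured = re.findall(r"'([^']*)'", text)
    # Drop the dashes afterwards and skip entries that end up empty.
    cleaned = [item.replace("-", "") for item in captured]
    return [item for item in cleaned if item]

sample = "'report.doc' '----' 'abc123' 'a-b'"  # hypothetical input
print(quoted_without_dashes(sample))  # ['report.doc', 'abc123', 'ab']
```

Capturing with [^']* first keeps the opening and closing quotes paired correctly; excluding - inside the character class can cause the closing quote of one string to pair with the opening quote of the next.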

Related

Filtering string column with specific character [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 1 year ago.
I have a string column like this and want to extract the content between the "/" characters:
A
9/17/001.a.x.y.16.04451
006.b.021017006814
2/17/000.c.m.n.15.00668/008
And the expected output is
A
001.a.x.y.16.04451
006.b.021017006814
000.c.m.n.15.00668
How can I do this with Python, R, or MySQL?
Thank you.
In MySQL, you can use regexp_replace():
select t.*,
       regexp_replace(a, '^[^/]+/[^/]+/([^/]*).*$', '$1')
from t;
The logic is that you seem to want the third component between slashes if there is one. Otherwise, you seem to want the entire string.
Here is a db<>fiddle.
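Since the question also asks about Python, here is a minimal sketch of the same logic there: take the third slash-separated component if there is one, otherwise keep the whole string. The helper name third_component and the compiled pattern are illustrative choices, not part of the original answer.

```python
import re

# Two leading slash-separated parts, then the third part (captured), then the rest.
THIRD_PART = re.compile(r"^[^/]+/[^/]+/([^/]*).*$")

def third_component(value):
    """Return the third '/'-separated part if present, else the value unchanged."""
    return THIRD_PART.sub(r"\1", value)

rows = [
    "9/17/001.a.x.y.16.04451",
    "006.b.021017006814",
    "2/17/000.c.m.n.15.00668/008",
]
for row in rows:
    print(third_component(row))
```

Rows without two leading slashes do not match the pattern, so re.sub returns them unchanged.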

Find all words including those with special characters [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 3 years ago.
I have text in an Excel file that looks something like this:
alpha123_4rf
45beta_Frank
Red5Great_Sam_Fun
dan.dan_mmem_ber
huh_k
han.jk_jj
huhu
I am trying to use a regex to match all of these words and save them into a set().
I have tried r"(\w+..*?_.*?\w+)", as seen here, but I can't seem to capture the word huhu, which has no special characters.
Your regex only captures words that contain a _, and huhu doesn't.
You could change your regex to match one or more letters, digits, underscores, and dots:
([\w.]+)
I've forked your regex101.
If you wish to match something more precise, you might need to give us more information about your context and what exactly you are trying to match.
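A small Python sketch of that pattern, assuming the cell contents have already been read from the Excel file into one string (the reading step is not shown and the variable names are only for illustration):

```python
import re

# Sample words from the question, one per line.
text = """alpha123_4rf
45beta_Frank
Red5Great_Sam_Fun
dan.dan_mmem_ber
huh_k
han.jk_jj
huhu"""

# [\w.]+ matches runs of letters, digits, underscores, and dots.
words = set(re.findall(r"[\w.]+", text))
print(sorted(words))
```

Because the character class no longer requires an underscore, huhu ends up in the set alongside the other words.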

Regex: match everything but escaped characters [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 3 years ago.
I am using Scrapy to scrape the date that a comment was posted on a forum. I have been able to scrape the contents of the div that contains the date, but it has escaped characters on both sides that make the string unusable. I need a regular expression that matches everything except the escaped characters.
The string I am working with is "\r\n\t\t\t\r\n\t\t\t\t08-07-2019, 11:37:16 AM\r\n\t\t\t\r\n\t\t\t\r\n\t\t\t". I want only to match the date inside.
The pattern I was trying to use was (?<!\\\\)\\+[\\w-]+, as recommended in other topics, but it doesn't match anything in that string.
You don't need a regex if you want to match everything. I strongly recommend using Item Loaders in Scrapy to process your fields (using .strip() etc.).
You can also remove unwanted characters from your string using XPath normalize-space():
event_time = response.xpath('normalize-space(string(//YOUR/XPATH/HERE))').get()
But if you want to match part of a complex string, you can of course use a regular expression:
event_time = response.xpath('//YOUR/XPATH/HERE').re_first(r'(\d{2}-\d{2}-\d{4},\s+\d{2}:\d{2}:\d{2}\s+\w{2})')
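Outside of Scrapy, the same two ideas can be tried on the raw string from the question with plain Python. This is only a sketch; the variable names are illustrative, and it assumes the date is the only non-whitespace content in the string.

```python
import re

raw = "\r\n\t\t\t\r\n\t\t\t\t08-07-2019, 11:37:16 AM\r\n\t\t\t\r\n\t\t\t\r\n\t\t\t"

# Option 1: the surrounding characters are all whitespace, so strip() is enough.
stripped = raw.strip()

# Option 2: search for the date/time shape explicitly.
DATE_RE = re.compile(r"\d{2}-\d{2}-\d{4},\s+\d{2}:\d{2}:\d{2}\s+\w{2}")
match = DATE_RE.search(raw)
extracted = match.group(0) if match else None

print(stripped)   # 08-07-2019, 11:37:16 AM
print(extracted)  # 08-07-2019, 11:37:16 AM
```

The \r, \n, and \t sequences are ordinary whitespace characters once the string is parsed, which is why strip() removes them without any regex.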

Python regex with re.findall [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 5 years ago.
I work with Python 2.7. It is my first question here.
Here is my code:
import re
string = "0581111105822222749533333"
result = re.findall(r'058',string) # ['058', '058']
I want to also capture the 5 digits after 058 and get:
# ['05811111','05822222']
How can I do this?
Thank you.
You can extract what you need using the pattern 058\d{5}. It matches the literal characters 058 followed by exactly 5 digits.
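Applied to the code from the question, that could look like the following sketch (it reuses the question's variable names):

```python
import re

string = "0581111105822222749533333"

# 058 followed by exactly five more digits.
result = re.findall(r"058\d{5}", string)
print(result)  # ['05811111', '05822222']
```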

Matches whatever regular expression is not inside the parentheses [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 6 years ago.
How can I match strings that are not in a given set of strings using Python regular expressions?
Ex: set of strings ('/abc|/bcd')
I want to match any string other than those in the parentheses. The comparison should be an exact match.
Here's a fun one for you:
^(?!\/(?:abc|bcd)$).+
It uses a negative lookahead to ensure that the string being matched isn't one of the strings you don't want, then grabs whatever else is left.
Demo on Regex101
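In Python, the lookahead could be used roughly like this. The helper name is_allowed and the test strings are made up for illustration; the forward slash does not need escaping in Python, so \/ is written as /.

```python
import re

# Reject exactly "/abc" or "/bcd"; accept any other non-empty string.
NOT_EXCLUDED = re.compile(r"^(?!/(?:abc|bcd)$).+")

def is_allowed(candidate):
    """Return True if candidate is not one of the excluded strings."""
    return NOT_EXCLUDED.match(candidate) is not None

for candidate in ["/abc", "/bcd", "/abcd", "/xyz"]:
    print(candidate, is_allowed(candidate))
```

The $ inside the lookahead is what makes the exclusion exact: /abcd is accepted because the excluded alternative is not followed by the end of the string.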
