I'm a network engineer with no programming background. I recently started with Python and am making small improvements every day.
I need help checking for multiple matches in an if statement, like this:
if "access-class 30" in output and "exec-timeout 5 5" in output:
    print('###### ACL VTY OK!!! ######')
Is it possible to check for multiple keywords in a single string?
Thanks for all your time.
Use the all function with a generator expression:
data = ["access-class 30", "exec-timeout 5 5"]
if all(s in output for s in data):
    print('###### ACL VTY OK!!! ######')
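A small sketch of the same idea wrapped in helpers, so that a failed check can also report which keywords are missing. The sample `output` text and the helper names `contains_all` and `missing_keywords` are made up for illustration:

```python
def contains_all(text, keywords):
    """Return True only if every keyword occurs somewhere in text."""
    return all(keyword in text for keyword in keywords)


def missing_keywords(text, keywords):
    """Return the keywords that do not occur in text, in their original order."""
    return [keyword for keyword in keywords if keyword not in text]


# Hypothetical device output, standing in for the real `output` variable.
output = "line vty 0 4\n access-class 30 in\n exec-timeout 5 5\n"
data = ["access-class 30", "exec-timeout 5 5"]

if contains_all(output, data):
    print('###### ACL VTY OK!!! ######')
else:
    print('Missing:', missing_keywords(output, data))
```

Replacing `all` with `any` would instead check that at least one keyword is present.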
Yes, it is possible.
You can use regular expressions (regex).
import re
li = []  # list of all the keywords
for l in li:
    # re.escape keeps characters such as '.' or '+' in a keyword literal
    for m in re.finditer(re.escape(l), output):
        print('match found')
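One case where a regex helps beyond a plain `in` test: the substring check also accepts "access-class 300" when you ask for "access-class 30". A minimal sketch, assuming a keyword should count only when it is delimited by whitespace or by the start/end of the text (the helper name `has_exact` is hypothetical):

```python
import re


def has_exact(text, keyword):
    """True if keyword occurs in text with no non-space character touching it."""
    pattern = r"(?<!\S)" + re.escape(keyword) + r"(?!\S)"
    return re.search(pattern, text) is not None


print(has_exact("access-class 30 in", "access-class 30"))   # True
print(has_exact("access-class 300 in", "access-class 30"))  # False
```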
I'm looking for a way in PySpark to search for a string made up of two separate words, for example: Iphone x or Samsung s10 ...
I want to provide a text file and a composite string (for example "Iphone x") and get the result.
Everything I find on the internet only counts single words.
IIUC:
In Spark 2.0, if you are going to read it from a file, for example a .csv file:
df = spark.read.format("csv").option("header", "true").load("pathtoyourcsvfile.csv")
then you can filter it using regex like this:
pattern = r"\s+(word1|word2)\s+"
filtered = df.filter(df['<thedesiredcolumnhere>'].rlike(pattern))
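Note that a pattern with `\s+` on both sides only matches when the phrase has whitespace before and after it, so it misses a phrase at the very start or end of a value. A sketch of a pattern builder that uses word boundaries instead, shown with Python's `re` module rather than Spark. The helper name `phrase_pattern` is hypothetical, and whether the resulting string behaves identically inside `rlike` (which uses Java regex syntax) is an assumption to check:

```python
import re


def phrase_pattern(phrases):
    """Build a case-insensitive regex matching any phrase as whole words."""
    alternatives = "|".join(
        r"\s+".join(re.escape(word) for word in phrase.split())
        for phrase in phrases
    )
    return r"(?i)\b(?:" + alternatives + r")\b"


pattern = phrase_pattern(["iphone x", "samsung s10"])
print(bool(re.search(pattern, "Iphone x is out")))      # True, even at the start
print(bool(re.search(pattern, "I bought an iphone xs")))  # False, "xs" is another word
```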
You can try writing your own UDF combined with wordsegment to segment your words, and you can add new words to its dictionary to help the library segment new terms such as "Iphone x".
For example:
>>> from wordsegment import load, clean, segment
>>> load()  # wordsegment 1.0 and later need the word lists loaded first
>>> clean('She said, "Python rocks!"')
'shesaidpythonrocks'
>>> segment('She said, "Python rocks!"')
['she', 'said', 'python', 'rocks']
If you don't want to use a library, you can also look at "Word segmentation using dynamic programming".
This is the answer:
import re
# give a file
rdd = sc.textFile("/root/PycharmProjects/Spark/file")
# give a composite string
string_ = "Iphone x"
# filter by line containing the string
new_rdd = rdd.filter(lambda line: string_ in line)
# collect these lines
rt = str(new_rdd.collect())
# apply regex to find all occurrences and count them
count = len(re.findall(re.escape(string_), rt))
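The filter-then-count logic can be tried without a cluster. This is a plain-Python sketch in which a list of strings stands in for the RDD; the sample lines and the helper name `count_phrase` are made up:

```python
def count_phrase(lines, phrase):
    """Count non-overlapping occurrences of phrase across all lines."""
    return sum(line.count(phrase) for line in lines)


# Hypothetical file contents, one list item per line.
lines = [
    "Iphone x is cheaper than Iphone x max",
    "nothing relevant here",
    "I want an Iphone x",
]
print(count_phrase(lines, "Iphone x"))  # 3
```

On a real RDD the same count could probably be written as `rdd.map(lambda line: line.count(string_)).sum()`, which avoids collecting every matching line to the driver.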
I am trying to extract separated multi-word expressions from a list of Python strings, using two other lists as the query strings. My list of sentences is:
lst = ['we have the terrible HIV epidemic that takes down the life expectancy of the African ','and I take the regions down here','The poorest are down']
lst_verb = ['take','go','wake']
lst_prep = ['down','up','in']
import re
output = []
item = 'down'
p = re.compile(r'(?:\w+\s+){1,20}' + item)
for i in lst:
    output.append(p.findall(i))
for item in output:
    print(item)
With this I am able to extract words from the list. However, I only want to extract separated multi-word expressions, i.e. it should only extract the words from the sentence "and I take the regions down here".
Furthermore, I want to use the words from lst_verb and lst_prep as the query strings.
for example
re.findall(r \lst_verb+'*.\b'+ \lst_prep)
Thank you for your answer.
You can use a regex like
(?is)^(?=.*\b(take)\b)(?=.*?\b(go)\b)(?=.*\b(where)\b)(?=.*\b(wake)\b).*
to match multiple words.
As in your example, use a function to build the regex string from the verbs and prepositions.
Hope this helps.
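One possible way to build such a regex from the two lists, as a sketch. It assumes "separated" means at least one other word sits between the verb and the preposition, and it matches the verbs only in the exact form listed (so "takes" does not count as "take"). The function name `build_separated_pattern` is hypothetical:

```python
import re

lst = [
    'we have the terrible HIV epidemic that takes down the life expectancy of the African ',
    'and I take the regions down here',
    'The poorest are down',
]
lst_verb = ['take', 'go', 'wake']
lst_prep = ['down', 'up', 'in']


def build_separated_pattern(verbs, preps):
    """Match a verb, then one or more words, then a preposition."""
    verb_alt = "|".join(re.escape(v) for v in verbs)
    prep_alt = "|".join(re.escape(p) for p in preps)
    return re.compile(
        r"\b(?:" + verb_alt + r")\b(?:\s+\w+)+?\s+(?:" + prep_alt + r")\b"
    )


pattern = build_separated_pattern(lst_verb, lst_prep)
matches = [m.group(0) for sentence in lst for m in pattern.finditer(sentence)]
print(matches)  # ['take the regions down']
```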
I'm trying to get the "real" name of a movie from its name when you download it.
So for instance, I have
Star.Wars.Episode.4.A.New.Hope.1977.1080p.BrRip.x264.BOKUTOX.YIFY
and would like to get
Star Wars Episode 4 A New Hope
So I'm using this regex:
.*?\d{1}?[ .a-zA-Z]*
which works fine, but only for a movie with a number, as in 'Iron Man 3' for example.
I'd like to be able to get movies like 'Interstellar' from
Interstellar.2014.1080p.BluRay.H264.AAC-RARBG
and I currently get
Interstellar 2
I tried several ways, and spent quite a lot of time on it already, but figured it wouldn't hurt asking you guys if you had any suggestion/idea/tip on how to do it...
Thanks a lot!
Given your examples and assuming you always download in 1080p (or know that field's value):
x = 'Interstellar.2014.1080p.BluRay.H264.AAC-RARBG'
y = x.split('.')
print " ".join(y[:y.index('1080p')-1])
Forget the regex (for now anyway!) and work with the fixed field layout. Find a field you know (1080p) and remove the information you don't want (the year). Recombine the results and you get "Interstellar" and "Star Wars Episode 4 A New Hope".
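A sketch of the same fixed-layout idea that does not hard-code 1080p: it looks for the first field shaped like a resolution (three or four digits followed by "p") and assumes the year is the field just before it. The function name `title_from_release` is hypothetical:

```python
import re


def title_from_release(name):
    """Return the title part of a dotted release name."""
    fields = name.split(".")
    for i, field in enumerate(fields):
        if re.fullmatch(r"\d{3,4}p", field):
            # Keep everything before the year, which precedes the resolution.
            return " ".join(fields[:max(i - 1, 0)])
    # No resolution field found: fall back to just replacing the dots.
    return " ".join(fields)


print(title_from_release("Star.Wars.Episode.4.A.New.Hope.1977.1080p.BrRip.x264.BOKUTOX.YIFY"))
print(title_from_release("Interstellar.2014.1080p.BluRay.H264.AAC-RARBG"))
```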
The following regex would work (assuming the format is something like moviename.year.1080p.anything or moviename.year.720p.anything):
.*(?=.\d{4}.*\d{3,}p)
\.(?=.*?(?:19|20)\d{2}\b)|(?:19|20)\d{2}\b.*$
Try this with re.sub. See the demo:
https://regex101.com/r/hR7tH4/10
import re
p = re.compile(r'\.(?=.*?(?:19|20)\d{2}\b)|(?:19|20)\d{2}\b.*$', re.MULTILINE)
test_str = "Star.Wars.Episode.4.A.New.Hope.1977.1080p.BrRip.x264.BOKUTOX.YIFY\nInterstellar.2014.1080p.BluRay.H264.AAC-RARBG\nIron Man 3"
subst = " "
result = re.sub(p, subst, test_str)
Assuming there is always a four-digit year or a four-digit resolution in the movie's file name, a simple solution is to replace the unwanted parts, matched by
"(?:\.|\d{4,4}.+$)"
with a blank and strip() the result afterwards.
For example:
import re
test1 = "Star.Wars.Episode.4.A.New.Hope.1977.1080p.BrRip.x264.BOKUTOX.YIFY"
test2 = "Interstellar.2014.1080p.BluRay.H264.AAC-RARBG"
res1 = re.sub(r"(?:\.|\d{4,4}.+$)",' ',test1).strip()
res2 = re.sub(r"(?:\.|\d{4,4}.+$)",' ',test2).strip()
print(res1, res2, sep='\n')
>>> Star Wars Episode 4 A New Hope
>>> Interstellar
I am making a simple chat bot in Python. It has a text file with regular expressions which help to generate the output. The user input and the bot output are separated by a | character.
my name is (?P<'name'>\w*) | Hi {'name'}!
This works fine for single sets of input and output responses, however I would like the bot to be able to store the regex values the user inputs and then use them again (i.e. give the bot a 'memory'). For example, I would like to have the bot store the value input for 'name', so that I can have this in the rules:
my name is (?P<'word'>\w*) | You said your name is {'name'} already!
my name is (?P<'name'>\w*) | Hi {'name'}!
Having no value for 'name' yet, the bot will first output 'Hi steve', and once the bot does have this value, the 'word' rule will apply. I'm not sure if this is easily feasible given the way I have structured my program. I have made it so that the text file is made into a dictionary with the key and value separated by the | character, when the user inputs some text, the program compares whether the user input matches the input stored in the dictionary, and prints out the corresponding bot response (there is also an 'else' case if no match is found).
I must need something to happen at the comparing part of the process so that the user's regular expression text is saved and then substituted back into the dictionary somehow. All of my regular expressions have different names associated with them (there are no two instances of 'word', for example...there is 'word', 'word2', etc), I did this as I thought it would make this part of the process easier. I may have structured the thing completely wrong to do this task though.
Edit: code
import re

io = {}

with open("rules.txt") as brain:
    for line in brain:
        key, value = line.split('|')
        io[key] = value

string = str(raw_input('> ')).lower()+' word'

x = 1
while x == 1:
    for regex, output in io.items():
        match = re.match(regex, string)
        if match:
            print(output.format(**match.groupdict()))
            string = str(raw_input('> ')).lower()+' word'
        else:
            print ' Sorry?'
            string = str(raw_input('> ')).lower()+' word'
I had some difficulty understanding the principle of your algorithm because I'm not used to working with named groups.
The following code is how I would solve your problem; I hope it gives you some ideas.
I think that relying on a single dictionary isn't a good idea: it complicates both the reasoning and the algorithm. So I based the code on two dictionaries: direg and memory.
The keys of these two dictionaries are group indexes. Not all the indexes, only particular ones: the index of the last group in each individual pattern.
That is because, for fun, I decided that the regexes must be able to have several groups.
What I call individual patterns in my code are the following strings:
"[mM]y name [Ii][sS] (\w*)"
"[Ii]n repertory (\w*) I [wW][aA][nN][tT] file (\w*)"
"[Ii] [wW][aA][nN][tT] to ([ \w]*)"
You see that the second individual pattern has 2 capturing groups: consequently there are 3 individual patterns, but a total of 4 groups across all the individual patterns.
So creating the dictionaries needs some additional care, because the index of the last matching group (which I get from the lastindex attribute of a regex MatchObject) may not correspond to the position of the individual pattern within the combined regex: it's harder to explain than to understand. That's why the function distr() counts the occurrences of the strings {0} {1} {2} {3} {4} etc., whose number MUST be the same as the number of groups defined in the corresponding individual pattern.
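A short illustration of the lastindex behaviour described above, using simplified, case-insensitive versions of the three individual patterns joined with '|'. Groups are numbered across the whole combined regex, so the second pattern owns groups 2 and 3 and reports lastindex 3, not 2:

```python
import re

combined = re.compile(
    r"my name is (\w+)"                        # group 1
    r"|in repertory (\w+) i want file (\w+)"   # groups 2 and 3
    r"|i want to ([ \w]+)",                    # group 4
    re.IGNORECASE,
)

for text in ("my name is Bob",
             "in repertory ONE i want file SPACE",
             "i want to sing"):
    match = combined.search(text)
    print(match.lastindex, match.groups())

# 1 ('Bob', None, None, None)
# 3 (None, 'ONE', 'SPACE', None)
# 4 (None, None, None, 'sing')
```

This is why the dictionaries are keyed by 1, 3 and 4 rather than 1, 2 and 3.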
I found the suggestion of Laurence D'Oliveiro to use '||' instead of '|' as separator interesting.
My code simulates a session in which several inputs are done:
import re

regi = ("[mM]y name [Ii][sS] (\w*)"
        "||Hi {0}!"
        "||You said that your name was {0} !!!",

        "[Ii]n repertory (\w*) I [wW][aA][nN][tT] file (\w*)"
        "||OK here's your file {0}\\{1} :"
        "||I already gave you the file {0}\\{1} !",

        "[Ii] [wW][aA][nN][tT] to ([ \w]*)"
        "||OK, I will do {0}"
        "||You already did {0}. Do yo really want again ?")

direg = {}
memory = {}

def distr(regi,cnt = 0,di = direg,mem = memory,
          regnb = re.compile('{\d+}')):
    for i,el in enumerate(regi,start=1):
        sp = el.split('||')
        cnt += len(regnb.findall(sp[1]))
        di[cnt] = sp[1]
        mem[cnt] = sp[2]
        yield sp[0]

regx = re.compile('|'.join(distr(regi)))
print 'direg :\n',direg
print
print 'memory :\n',memory

for inp in ('I say that my name is Armano the 1st',
            'In repertory ONE I want file SPACE',
            'I want to record music',
            'In repertory ONE I want file SPACE',
            'I say that my name is Armstrong',
            'But my name IS Armstrong now !!!',
            'In repertory TWO I want file EARTH',
            'Now my name is Helena'):
    print '\ninput ==',inp
    mat = regx.search(inp)
    if direg[mat.lastindex]:
        print 'output ==',direg[mat.lastindex]\
              .format(*(d for d in mat.groups() if d))
        direg[mat.lastindex] = None
        memory[mat.lastindex] = memory[mat.lastindex]\
                                .format(*(d for d in mat.groups() if d))
    else:
        print 'output ==',memory[mat.lastindex]\
              .format(*(d for d in mat.groups() if d))
        if not memory[mat.lastindex].startswith('Sorry'):
            memory[mat.lastindex] = 'Sorry, ' \
                                    + memory[mat.lastindex][0].lower()\
                                    + memory[mat.lastindex][1:]
result
direg :
{1: 'Hi {0}!', 3: "OK here's your file {0}\\{1} :", 4: 'OK, I will do {0}'}
memory :
{1: 'You said that your name was {0} !!!', 3: 'I already gave you the file {0}\\{1} !', 4: 'You already did {0}. Do yo really want again ?'}
input == I say that my name is Armano the 1st
output == Hi Armano!
input == In repertory ONE I want file SPACE
output == OK here's your file ONE\SPACE :
input == I want to record music
output == OK, I will do record music
input == In repertory ONE I want file SPACE
output == I already gave you the file ONE\SPACE !
input == I say that my name is Armstrong
output == You said that your name was Armano !!!
input == But my name IS Armstrong now !!!
output == Sorry, you said that your name was Armano !!!
input == In repertory TWO I want file EARTH
output == Sorry, i already gave you the file ONE\SPACE !
input == Now my name is Helena
output == Sorry, you said that your name was Armano !!!
OK, let me see if I understand this:
You want a dictionary of key-value pairs. This will be the “memory” of the chatbot.
You want to apply regular-expression rules to user input. But which rules might apply is conditional on which keys are already present in the memory dictionary: if “name” is not yet defined, then the rule that defines “name” applies; but if it is, then the rule that mentions “word” applies.
Seems to me you need more information attached to your rules. For example, the “word” rule you gave above shouldn’t actually add “word” to the dictionary, otherwise it would only apply once (imagine if the user keeps trying to say “my name is x” more than twice).
Does that give you a bit more idea about how to proceed?
Oh, by the way, I think “|” is a poor choice for a separator character, because it can occur in regular expressions. Not sure what to suggest: how about “||”?
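To make the "more information attached to your rules" idea concrete, here is one possible sketch, not the only way to structure it. Each rule carries a `needs` key that must already be in memory for the rule to apply, and a `store` flag saying whether its captures are remembered. The field names, the `respond` function and the rule list are all hypothetical, and the rules are written inline with standard `(?P<name>...)` groups instead of being read from rules.txt:

```python
import re

RULES = [
    # Applies only once a name is known; does not overwrite the memory.
    {"pattern": r"my name is (?P<word>\w+)",
     "reply": "You said your name is {name} already!",
     "needs": "name", "store": False},
    # Applies when no name is known yet; remembers the captured name.
    {"pattern": r"my name is (?P<name>\w+)",
     "reply": "Hi {name}!",
     "needs": None, "store": True},
]


def respond(text, memory):
    """Return the reply of the first applicable rule, updating memory."""
    for rule in RULES:
        if rule["needs"] is not None and rule["needs"] not in memory:
            continue
        match = re.search(rule["pattern"], text.lower())
        if match is None:
            continue
        if rule["store"]:
            memory.update(match.groupdict())
        # Remembered values take precedence over the current captures.
        values = dict(match.groupdict(), **memory)
        return rule["reply"].format(**values)
    return "Sorry?"


memory = {}
print(respond("My name is Steve", memory))  # Hi steve!
print(respond("My name is Bob", memory))    # You said your name is steve already!
print(respond("Hello there", memory))       # Sorry?
```

Because rules are tried in list order, the more specific "already known" rule is placed first.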
I am trying to parse some data and have just started reading up on regular expressions, so I am pretty new to them. This is the code I have so far:
String = "MEASUREMENT 3835 303 Oxygen: 235.78 Saturation: 90.51 Temperature: 24.41 DPhase: 33.07 BPhase: 29.56 RPhase: 0.00 BAmp: 368.57 BPot: 18.00 RAmp: 0.00 RawTem.: 68.21"
String = String.strip('\t\x11\x13')
String = String.split("Oxygen:")
print String[1]
String[1].lstrip
print String[1]
What I am trying to do is extract the oxygen value (235.78) and put it in its own variable using a regular expression search. I realize that there should be an easy solution, but I am trying to figure out how regular expressions work and they are making my head hurt. Thanks for any help.
Richard
re.search( r"Oxygen: *([\d.]+)", String ).group( 1 )
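The one-liner raises AttributeError when the line contains no oxygen reading, because re.search returns None in that case. A guarded sketch that also converts the value to a float (the function name `extract_oxygen` is hypothetical):

```python
import re


def extract_oxygen(line):
    """Return the oxygen reading as a float, or None if it is absent."""
    match = re.search(r"Oxygen:\s*([\d.]+)", line)
    return float(match.group(1)) if match else None


print(extract_oxygen("MEASUREMENT 3835 303 Oxygen: 235.78 Saturation: 90.51"))  # 235.78
print(extract_oxygen("MEASUREMENT 3835 303 Saturation: 90.51"))                 # None
```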
import re
string = "blabla Oxygen: 10.10 blabla"
regex_oxygen = re.compile('''Oxygen:\W+([0-9.]*)''')
result = re.findall(regex_oxygen,string)
print result
What for?
print String.split()[4]
For general parsing of lists like this, one could do:
import re
String = "MEASUREMENT 3835 303 Oxygen: 235.78 Saturation: 90.51"
String = String.replace(':','')
value_list=re.split("MEASUREMENT\W+[0-9]+\W+[0-9]+\W",String)[1].rstrip().split()
values = dict(zip(value_list[::2],map(float,value_list[1::2])))
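An alternative sketch that builds the same kind of dictionary directly from the "Name: value" pairs, without first stripping the MEASUREMENT header. It assumes every field name consists of letters and dots and is followed by a colon and a number, and it uses a shortened sample line; the function name `parse_measurements` is hypothetical:

```python
import re

line = ("MEASUREMENT 3835 303 Oxygen: 235.78 Saturation: 90.51 "
        "Temperature: 24.41 RawTem.: 68.21")


def parse_measurements(text):
    """Map each 'Name: number' pair in text to name -> float."""
    pairs = re.findall(r"([A-Za-z.]+):\s*(-?\d+(?:\.\d+)?)", text)
    return {name: float(value) for name, value in pairs}


values = parse_measurements(line)
print(values["Oxygen"])  # 235.78
```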
I believe the answer to your specific problem has been posted. However, I wanted to show you a few resources on regular expressions for Python. The Python documentation on regular expressions is the place to start.
O'Reilly also has many good books on the subject, whether you want to understand regular expressions deep down or just enough to make things work.
Finally, regular-expressions.info is a good resource for regular expressions in mainstream languages. You can even test your regular expression on the website.
I would like to share my "is this an email?" regular expression, just to inspire you. :)
import re

emailregex = "^[a-zA-Z.a-zA-Z]+#mycompany.org$"

def validateEmail(email):
    """returns 1 if is an email, 0 if not """
    # len(x.y#mycompany.org) = 17
    if len(email) >= 17:
        if re.match(emailregex, email) != None:
            return 1
    return 0
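A somewhat stricter sketch of the same check, assuming the `#` in the pattern above stands in for `@`. It escapes the dot in the domain (an unescaped `.` matches any character), requires letters on both sides of each dot in the local part, and returns a bool. Unlike the original it has no minimum-length test, so a one-letter local part is accepted. The name `is_company_email` is hypothetical:

```python
import re

COMPANY_EMAIL = re.compile(r"[a-zA-Z]+(?:\.[a-zA-Z]+)*@mycompany\.org")


def is_company_email(email):
    """True if email is letters (optionally dot-separated) at mycompany.org."""
    return COMPANY_EMAIL.fullmatch(email) is not None


print(is_company_email("x.y@mycompany.org"))   # True
print(is_company_email("x.y@mycompanyXorg"))   # False
```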