Regex in Python to find the exact string - python

Just started working with Python and got stuck on a find-and-match problem.
The code is below:
for line in f:
    if c in line:
        catch = line.split(' ', 1)[1]
        return catch
f.close()
So if my line has "TRA trace" and I input TR, it returns the value for TRA and not TR. Is there anything that can be done in the if condition to match the exact input string? Thank you.

You can do:
for line in f:
    catch = line.split(' ', 1)
    if c in catch:  # checks if one of the tokens is c
        return catch
    # Or
    if c == catch[0]:  # checks if the first token is c
        return catch

And for the regex you can go for this:
import re
m = re.search('(' + c + ')', line)
if m:  # re.search returns None when nothing matches
    return m.group(1)

You can use regex to match the exact input string
EX:
import re
def checkStr(word, line):
    result = re.search(r'\b' + word + r'\b', line)
    if result:
        print "Found!"
        print result.group()
    else:
        print "No Match!!!"
word = 'TR'
line = "TRA trace"
checkStr(word, line) #No Match!!!
word = 'TR'
line = "TR trace"
checkStr(word, line) #Found! TR
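Applied to the original lookup (return the value that follows a key at the start of a line), the word-boundary idea could look roughly like this. It is only a Python 3 sketch: `find_value` and the sample lines are hypothetical, and it assumes every line has the form "KEY value". `re.escape` is used so that a key containing regex metacharacters is still matched literally.

```python
import re

def find_value(lines, key):
    """Return the text after `key` on the first line that starts with exactly `key`.

    Hypothetical helper, assuming lines of the form "KEY value".
    """
    # \b after the escaped key rejects longer keys such as "TRA" when key is "TR"
    pattern = re.compile(re.escape(key) + r'\b\s*(.*)')
    for line in lines:
        m = pattern.match(line)  # match() anchors at the start of the line
        if m:
            return m.group(1)
    return None

sample = ["TRA trace", "TR track"]
print(find_value(sample, "TR"))
print(find_value(sample, "TRA"))
```

With the sample above, looking up TR skips the "TRA trace" line because there is no word boundary between R and A.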

Related

Python Coding for Reversing Sentences

This is the code I am using so far.
translated = []
line = input('Line: ')
while line != '':
    for word in line.split():
        letters = list(word)
        letters.reverse()
        word = ''.join(letters)
        translated.append(word)
    if line == '':
        print(' '.join(translated))
    elif line:
        line = input('Line: ')
It is supposed to read lines of input from the user. An empty line is supposed to signify the end of the input. The program is then supposed to reproduce all the lines in their original order, with each word reversed in place.
For example, if I were to input: Hello how are you
Its output should be: olleH woh era uoy
Currently it asks for the inputs and stops when there is an empty line, but it does not produce anything. No reversed words, nothing.
Can anyone tell me what I am doing wrong and help me out with my code?
The print statement needs to be outside the loop. Your loop condition ensures that line is never '' inside the loop, so the if condition is never satisfied.
For the same reason, you need to rethink the elif.
As @Flav points out, all lines should be read before an empty line ends the input. I have edited the solution as below:
lines = []  # to store all line inputs
while True:
    line = raw_input('Line: ')  # use input() on Python 3, raw_input() on Python 2.6/2.7
    if line == '':
        break
    lines.append(line)
for line in lines:
    print(' '.join([word[::-1] for word in line.split(' ')]))
You could probably do it like this.
' '.join( [ i[::-1] for i in line.split( ' ' ) ] )
Split the line into words
Reverse each word
Put them back together
The issue is that when the line is empty, your while loop stops. You should get rid of the if / else which are useless here.
Full script:
translated = []
line = input('Line: ')
while line != '':
    for word in line.split():
        letters = list(word)
        letters.reverse()
        word = ''.join(letters)
        translated.append(word)
    #The above for loop could be done in one line with:
    #translated.extend([word[::-1] for word in line.split()])
    line = input('Line: ')
print(' '.join(translated))
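To keep the lines separate in the output, as the question asks, the reading and the reversing can be split into two small steps. The following is a Python 3 sketch only: `reverse_words`, `collect_until_blank` and the `typed` list are hypothetical names, and the list stands in for what the user would type at the `input('Line: ')` prompt.

```python
def reverse_words(line):
    """Reverse every whitespace-separated word, keeping the word order."""
    return ' '.join(word[::-1] for word in line.split())

def collect_until_blank(source):
    """Take lines from any iterable until the first empty line."""
    lines = []
    for line in source:
        if line == '':
            break
        lines.append(line)
    return lines

# Simulated user input; in a real script each item would come from input('Line: ')
typed = ['Hello how are you', 'fine thanks', '', 'ignored']
for line in collect_until_blank(typed):
    print(reverse_words(line))
```

Because the empty-line check happens before anything is stored, the printing loop never has to test for the sentinel again.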
This works perfectly:
import re
a = "Hello how are you"
" ".join([ "".join(reversed(x)) for x in re.findall(r'\w+',a) ])

How to get word by word from a string?

For the purpose of sentiment analysis, I want to analyse each word in a sentence. I want to store each word in a variable and then process it. I use the following code and I get this error message:
AttributeError: 'list' object has no attribute 'split'
line = ' hello this is a test sentence'
while line:
    line=line.split(' ')
    print '\n'
What is the solution to the above problem?
Here is what happens in your code:
line = "..." - line is a string
while line: - start looping, as non-empty string evaluates to True
line = line.split(" ") - split line by spaces, line is now a list
print '\n' - print a newline character
while line: - non-empty list evaluates True, so loop again
line = line.split(" ") - line is a list, hence AttributeError
I am not sure why you are using a while loop here, you probably want:
for word in line.split(" "):
    print word
    # ... process word
The issue is that by the second iteration of the loop, line is no longer a string. The while condition only checks that line is non-empty and then the body calls split on it again, but at that point line is a list.
What you really want is
line = 'hello this is a sentence'
words = line.split()
for w in words:
    print w
Here are two ways:
string.split(' ') ?
>>> a="1.MATCHES$$TEXT$$STRING"
>>> a.split("$$TEXT$$")
['1.MATCHES', 'STRING']
>>> a="2.MATCHES $$TEXT$$ STRING"
>>> a.split("$$TEXT$$")
['2.MATCHES ', ' STRING']
and:
>>> [x.strip() for x in "2.MATCHES $$TEXT$$ STRING".split("$$TEXT$$")]
['2.MATCHES', 'STRING']
What's nice is that you don't have to loop to do the split; you just assign the result and use it.
a="my;string;here"
a = a.split(";")
for w in a:
    print w
Just split the string once: wordList = line.split()
And use the wordList to iterate over:
for x in wordList:
    doWork...
p.s.: I don't quite get why you would print a newline character in each iteration of the loop.
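The answers above use both split(' ') and split(); the difference shows up with the question's own string, which starts with a space. A short Python 3 sketch (the upper() call is just a placeholder for whatever per-word processing is needed):

```python
line = ' hello this is a test sentence'

# split(' ') splits on every single space, so the leading space yields an empty string
print(line.split(' '))

# split() with no argument splits on runs of whitespace and drops empty strings
words = line.split()
print(words)

for word in words:
    # placeholder for the real per-word processing
    print(word.upper())
```

For word-by-word processing, split() without an argument is usually the safer choice because it never produces empty "words".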

Forward slash not found in last string in line

My code is not very good, but this is a really interesting problem. When looking for forward slashes in a string, all are found except when the forward slash is in the last word in the file. Here is my code.
#!/usr/bin/python
import sys
if len(sys.argv)!=2:
    print "usage: %s filename\n" % (sys.argv[0]);
    exit(0);
f = open(sys.argv[1]);
lines = [i for i in f.readlines()]
finals = [];
for line in lines:
    words = line.split(",");
    for word in words:
        if word.find("/") != -1:
            datefixes = word.split("/")
            if datefixes[2].__len__() == 4:
                temp = datefixes[2]
                word = datefixes[0] + "-" + datefixes[1] + "-" + temp[-2:]
        finals += "," + word;
tempstring = ''.join(finals)
finallist = tempstring.split("\r\n")
finalstring = ""
for tmpstrpart in finallist:
    if tmpstrpart != "" or tmpstrpart !="\r\n":
        finalstring += tmpstrpart[1:] + "\r\n"
print finalstring
and here is a sample input
ACPVBF,1930-729,Z729,12/16/2014,6/10/2008,1/5/2003,44-48-46,39-43-41,35-39-37,29-33-31
ACPVGT,1930-729,Z729,25-29-27,19-23-21,14-18-16,7/11/2009,2/6/2004,48-2-0,42-46-44
ACPUQH,1930-729,Z729,32-40-19,26-34-13,21-29-8,14-22-1,9/17/1946,5/13/1942,49-7-36
ACPVOU,1930-729,Z729,42-0-29,36-44-23,31-39-18,24-32-11,19-27-6,15-23-2,9/17/1946
In the code, these lines are split by commas. If the word at the end of a line contains a /, the forward slash is not found, but only if it is at the end; the rest work fine.
Edit: The output I am currently getting on these lines is:
ACPVBF,1930-729,Z729,12-16-14,6-10-08,1-5-03,44-48-46,39-43-41,35-39-37,29-33-31
ACPVGT,1930-729,Z729,25-29-27,19-23-21,14-18-16,7-11-09,2-6-04,48-2-0,42-46-44
ACPUQH,1930-729,Z729,32-40-19,26-34-13,21-29-8,14-22-1,9-17-46,5-13-42,49-7-36
ACPVOU,1930-729,Z729,42-0-29,36-44-23,31-39-18,24-32-11,19-27-6,15-23-2,9/17/1946
the output that I am trying to get from these lines is:
ACPVBF,1930-729,Z729,12-16-14,6-10-08,1-5-03,44-48-46,39-43-41,35-39-37,29-33-31
ACPVGT,1930-729,Z729,25-29-27,19-23-21,14-18-16,7-11-09,2-6-04,48-2-0,42-46-44
ACPUQH,1930-729,Z729,32-40-19,26-34-13,21-29-8,14-22-1,9-17-46,5-13-42,49-7-36
ACPVOU,1930-729,Z729,42-0-29,36-44-23,31-39-18,24-32-11,19-27-6,15-23-2,[9-17-46]
I want the one with the brackets around it to change as well.
Final working code based on BrenBarn's answer:
#!/usr/bin/python
import sys
import re
if len(sys.argv)!=2:
    print "usage: %s filename\n" % (sys.argv[0]);
    exit(0);
f = open(sys.argv[1]);
x = f.read()
f.close()
filename = sys.argv[1]
filename = filename[:-4] + " finished.csv"
f = open(filename, 'w')
f.write(re.sub(r'(\d{1,2})/(\d{1,2})/\d{2}(\d{2})', r'\1-\2-\3', x))
f.close()
Thanks for all the help. Sorry I can't upvote yet.
I'm not sure what the problem is, but I think what you're trying to do can be accomplished way more easily by just using a regular expression.
>>> print x
ACPVBF,1930-729,Z729,12/16/2014,6/10/2008,1/5/2003,44-48-46,39-43-41,35-39-37,29-33-31
ACPVGT,1930-729,Z729,25-29-27,19-23-21,14-18-16,7/11/2009,2/6/2004,48-2-0,42-46-44
ACPUQH,1930-729,Z729,32-40-19,26-34-13,21-29-8,14-22-1,9/17/1946,5/13/1942,49-7-36
ACPVOU,1930-729,Z729,42-0-29,36-44-23,31-39-18,24-32-11,19-27-6,15-23-2,9/17/1946
>>> print re.sub(r'(\d{1,2})/(\d{1,2})/\d{2}(\d{2})', r'\1-\2-\3', x)
ACPVBF,1930-729,Z729,12-16-14,6-10-08,1-5-03,44-48-46,39-43-41,35-39-37,29-33-31
ACPVGT,1930-729,Z729,25-29-27,19-23-21,14-18-16,7-11-09,2-6-04,48-2-0,42-46-44
ACPUQH,1930-729,Z729,32-40-19,26-34-13,21-29-8,14-22-1,9-17-46,5-13-42,49-7-36
ACPVOU,1930-729,Z729,42-0-29,36-44-23,31-39-18,24-32-11,19-27-6,15-23-2,9-17-46
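I think the issue with your code is the trailing newline. Each line read from the file still carries its line ending ("\n" or "\r\n"), so the last field on a line is something like "9/17/1946\r\n"; after splitting on "/", datefixes[2] is longer than 4 characters and the conversion is skipped. My guess is it would fail for the last word in each line. I would suggest you call line.strip() before splitting.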
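The effect of the line ending can be checked directly. Below is a Python 3 sketch, not the asker's script: `fix_dates` is a hypothetical helper, and `raw` is a shortened version of the last sample line with an assumed "\r\n" ending.

```python
def fix_dates(line):
    """Rewrite M/D/YYYY fields in a comma-separated line as M-D-YY."""
    fields = line.strip().split(',')  # strip() removes the trailing "\r\n" or "\n"
    fixed = []
    for field in fields:
        parts = field.split('/')
        if len(parts) == 3 and len(parts[2]) == 4:
            field = parts[0] + '-' + parts[1] + '-' + parts[2][-2:]
        fixed.append(field)
    return ','.join(fixed)

raw = 'ACPVOU,1930-729,Z729,15-23-2,9/17/1946\r\n'
# Without strip(), the last field keeps its line ending, so the year part is 6 characters long
print(repr(raw.split(',')[-1].split('/')[2]))
print(fix_dates(raw))
```

The slash is found in both cases; it is the length check on the year that fails when the line ending is still attached.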

String reverse in Python

Write a simple program that reads a line from the keyboard and outputs the same line where every word is reversed. A word is defined as a continuous sequence of alphanumeric characters or hyphens (‘-’). For instance, if the input is
“Can you help me!”
the output should be
“naC uoy pleh em!”
I just tried the following code, but there is a problem with it:
print "Enter the string:"
str1 = raw_input()
print (' '.join((str1[::-1]).split(' ')[::-1]))
It prints "naC uoy pleh !em". Note the exclamation mark (!): that is the problem here. Can anybody help me?
The easiest is probably to use the re module to split the string:
import re
pattern = re.compile(r'(\W)')
string = raw_input('Enter the string: ')
print ''.join(x[::-1] for x in pattern.split(string))
When run, you get:
Enter the string: Can you help me!
naC uoy pleh em!
You could use re.sub() to find each word and reverse it:
In [8]: import re
In [9]: s = "Can you help me!"
In [10]: re.sub(r'[-\w]+', lambda w:w.group()[::-1], s)
Out[10]: 'naC uoy pleh em!'
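Since the exercise counts a hyphen as part of a word, the two regex approaches above differ on hyphenated input. A small Python 3 sketch to compare them; the function names and the sample sentence are made up for illustration:

```python
import re

def reverse_words(text):
    """Reverse each run of word characters or hyphens, leaving everything else in place."""
    return re.sub(r'[-\w]+', lambda m: m.group()[::-1], text)

def reverse_split_on_nonword(text):
    """Split on every non-word character (hyphens included) and reverse each piece."""
    return ''.join(piece[::-1] for piece in re.split(r'(\W)', text))

s = 'A well-known fact!'
print(reverse_words(s))             # treats "well-known" as one word
print(reverse_split_on_nonword(s))  # reverses "well" and "known" separately
```

Both give the same result for "Can you help me!"; only the `[-\w]+` version follows the exercise's definition of a word when hyphens appear.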
My answer, more verbose though. It handles more than one punctuation mark at the end as well as punctuation marks within the sentence.
import string
import re
valid_punctuation = string.punctuation.replace('-', '')
# a word (alphanumerics or hyphens) followed by optional trailing punctuation
word_pattern = re.compile(r'([\w-]+)([' + re.escape(valid_punctuation) + r']*)$')
# reverses word. ignores punctuation at the end.
# assumes a single word (i.e. no spaces)
def word_reverse(w):
    m = word_pattern.match(w)
    if m is None:  # e.g. an empty string or a token that starts with punctuation
        return w
    return ''.join(reversed(m.group(1))) + m.group(2)
def sentence_reverse(s):
    return ' '.join([word_reverse(w) for w in re.split(r'\s+', s)])
str1 = raw_input('Enter the sentence: ')
print sentence_reverse(str1)
Simple solution without using re module:
print 'Enter the string:'
string = raw_input()
line = word = ''
for char in string:
    if char.isalnum() or char == '-':
        word = char + word
    else:
        if word:
            line += word
            word = ''
        line += char
print line + word
You can do this (note that punctuation attached to a word is reversed along with it):
print "Enter the string:"
str1 = raw_input()
print(' '.join(str1[::-1].split(' ')[::-1]))
Or this:
print(' '.join([w[::-1] for w in str1.split(' ')]))

python file reading

I have a file /tmp/gs.pid with the content
client01: 25778
I would like to retrieve the second word from it,
i.e. 25778.
I have tried the code below, but it didn't work.
>>> f=open ("/tmp/gs.pid","r")
>>> for line in f:
...     word=line.strip().lower()
...     print "\n -->" , word
Try this:
>>> f = open("/tmp/gs.pid", "r")
>>> for line in f:
...     word = line.strip().split()[1].lower()
...     print " -->", word
>>> f.close()
It will print the second word of every line in lowercase. split() will take your line and split it on any whitespace and return a list, then indexing with [1] will take the second element of the list and lower() will convert the result to lowercase. Note that it would make sense to check whether there are at least 2 words on the line, for example:
>>> f = open("/tmp/gs.pid", "r")
>>> for line in f:
...     words = line.strip().split()
...     if len(words) >= 2:
...         print " -->", words[1].lower()
...     else:
...         print 'Line contains fewer than 2 words.'
>>> f.close()
word="client01: 25778"
pid=word.split(": ")[1] #or word.split()[1] to split from the separator
If all lines are of the form abc: def, you can extract the 2nd part with
second_part = line[line.find(": ")+2:]
If not you need to verify line.find(": ") really returns a nonnegative number first.
with open("/tmp/gs.pid") as f:
    for line in f:
        p = line.find(": ")
        if p != -1:
            second_part = line[p+2:].lower()
            print "\n -->", second_part
>>> open("/tmp/gs.pid").read().split()[1]
'25778'
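If the value is needed as a number and malformed lines should be skipped rather than raise an IndexError, str.partition is another option. This is a Python 3 sketch under the assumption that lines look like "name: pid"; `parse_pid_line` is a hypothetical helper, and it works on strings so it can be tried without the file.

```python
def parse_pid_line(line):
    """Return (name, pid) from a line such as "client01: 25778", or None if malformed."""
    name, sep, rest = line.partition(':')  # sep is '' when no colon is present
    rest = rest.strip()
    if not sep or not rest.isdigit():
        return None
    return name.strip(), int(rest)

print(parse_pid_line('client01: 25778\n'))
print(parse_pid_line('no separator here'))
```

To use it on the real file, each line from `open("/tmp/gs.pid")` could be passed to the function in a loop.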
