Python pattern search return only inside quotes - python

I need help with the output. Right now my code prints the entire matching line:
import re
pattern = re.compile('Computer([0-9]|[1-9][0-9]|[1-9][0-9][0-9])Properties')
with open("Test.xml") as f:
    for line in f:
        if pattern.search(line):
            print(line)
Result
<Computer0Properties name="BRSM">
</Computer0Properties>
<Computer1Properties name="4U-142">
</Computer1Properties>
What I want is just the names, with no quotes or anything else around them:
BRSM
4U-142
I have tried:
print(re.findall(r"(\"[\w\s]+\")",line ))
which outputs:
['"BRSM"']
[]
[]
completely missing the second result.

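import re
pattern = re.compile('Computer([0-9]|[1-9][0-9]|[1-9][0-9][0-9])Properties')
with open("Test.xml") as f:
    for line in f:
        # Closing tags also match the pattern but contain no quotes, so skip them.
        if pattern.search(line) and '"' in line:
            # The text between the first pair of quotes is element 1 of the split.
            print(line.split('"')[1])
# BRSM
# 4U-142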

import re
with open(filepath) as f:
    for line in f:
        # Parse each line and keep only the names inside the quotes.
        resultName = re.findall('"([^"]*)"', line)
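The attempt in the question missed 4U-142 because the hyphen is matched by neither \w nor \s. As a self-contained check of the capture-group idea, the sketch below runs on the sample lines from the question instead of reading Test.xml. The names sample_lines and name_pattern are illustrative, and the pattern assumes each opening tag carries a single name attribute.

```python
import re

# Sample lines copied from the question; a real script would read Test.xml.
sample_lines = [
    '<Computer0Properties name="BRSM">',
    '</Computer0Properties>',
    '<Computer1Properties name="4U-142">',
    '</Computer1Properties>',
]

# The group in parentheses captures only the text between the quotes.
name_pattern = re.compile(r'<Computer\d{1,3}Properties\s+name="([^"]*)"')

names = []
for line in sample_lines:
    match = name_pattern.search(line)
    if match:  # closing tags have no name attribute, so they are skipped
        names.append(match.group(1))

print(names)
```

For anything beyond a flat, line-oriented file, a real XML parser such as xml.etree.ElementTree is likely a safer choice than a regex.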

Related

Reading from a file reads only the last line in python

I have a problem with Python and file reading. When I run this code, it reads only the last line of the text file. This is the code:
with open("main.txt", "r") as f:
    for line in f:
        if line.startswith("console.out(") and line.endswith(")"):
            if line.startswith("console.out(\"") and line.endswith("\")"):
                consoleFixWS = line.replace("console.out(\"", "")
                finalOutWS = consoleFixWS.replace("\")", "")
                print(finalOutWS)
The text file:
console.out("Test")
console.out("Test2")
It prints only "Test2" and I tried everything. Thanks in advance.
The first line actually ends with a \n newline character, so line.endswith(")") is False for it. If you run line = line.strip() before the conditional, it removes any leading and trailing whitespace, including the newline.
with open("main.txt") as f:
    for line in f:
        line = line.strip()
        if line.startswith("console.out(") and line.endswith(")"):
            if line.startswith("console.out(\"") and line.endswith("\")"):
                consoleFixWS = line.replace("console.out(\"", "")
                finalOutWS = consoleFixWS.replace("\")", "")
                print(finalOutWS)
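To see the newline effect in isolation, here is a small sketch that uses an in-memory list in place of main.txt. The list raw_lines is an assumption that mimics what iterating over the file yields when the file has no trailing newline.

```python
# Lines as file iteration yields them: the first one still carries its "\n".
raw_lines = ['console.out("Test")\n', 'console.out("Test2")']

# Without strip(), only the last line ends with ")".
before = [line.endswith(")") for line in raw_lines]
# After strip(), the newline is gone and both lines pass the check.
after = [line.strip().endswith(")") for line in raw_lines]

print(before)  # [False, True]
print(after)   # [True, True]
```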
The problem is probably in your startswith/replace logic. Try the re module instead; it is flexible and handles this well:
import re
# Raw string so the backslashes reach the regex engine unchanged.
regex = r"console\.out\((?P<text>(.*))\)"
with open("main.txt", "r") as f:
    for line in f:
        search = re.search(regex, line)
        if search:
            text = search.group('text')
            finalOutWS = text.replace("\\", "")
            print(finalOutWS)
Works fine

Eliminate specific number in a data file using Python

I have a large file and I want to delete all the values '24' within the data file. I have used this code but it doesn't do what I want. Suggestions please. Thanks
This is the data file
24,24,24,24,24,24,1000,1000,24,24,24,1000,1000,1000,1000,24,24,24,24,24,24,24,24,24,24,1000,1000,1000,1000,1000,1000,1000,1000,24,24,24,24,1000,1000,1000,1000,24,1000,24,24,24,24,1000,1000,1000,1000,1000,24,24,24,24,24,24,1000,24,24,24,24,1000,1000,1000,1000,1000,1000,1000,1000,1000,24,24,24,24,1000,1000,1000,1000,24,1000,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,1000,1000,24,24,24,24,24,24,1000,1000,1000,24,24,24,24,1000,1000,1000,1000,1000,1000,1000,1000,1000,24,24,24,24,24,24,24,24,24,24,24,24,24,1000,1000,24,24,24,24,24,24,24,24,24,1000,1000,1000,24,24,24,1000,24,24,1000,1000,24,24,24,24,1000,1000,1000,1000,1000,1000,1000,24,24,24,1000,1000,1000,1000,1000,1000,24,24,24,1000,1000,1000,1000,1000,1000,1000,24,24,24,24,1000,1000,24,1000,1000,24,24,1000,1000,1000,1000,1000,1000,1000,24,24,24,1000,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,1000,1000,24,24,24,1000,1000,1000,1000,1000,24,24,24,24,24,24,24,24,1000,1000,1000,1000,1000,24,24,24,24,24,24,1000,24,24,24,24,24,24,24,24,24,1000,1000,1000,1000,1000,1000,24,24,24,24,24,24,24,24,24,24,1000,1000,1000,24,1000,1000,1000,1000,24,24,1000,1000,24,24,24,24,24,24,24,1000,24,24,24,24,24,24,1000,1000,1000,1000,1000,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,24,1000,1000,1000,1000,1000
Code
content = open('txt1.txt', 'r').readlines()
cleandata = []
for line in content:
    line = {i:None for i in line.replace("\n", "").split()}
    for value in line.copy():
        if value == "24":
            line.pop(value)
    cleandata.append(" ".join(line) + "\n")
open('txt2.txt', 'w').writelines(cleandata)
This should do it:
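with open('txt1.txt') as src, open('txt2.txt', 'w') as dst:
    for line in src:
        # Split on commas and drop only the fields that are exactly '24',
        # so values such as 1240 stay intact and no stray commas remain.
        values = [v for v in line.strip().split(',') if v != '24']
        dst.write(','.join(values) + '\n')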
You could use a regex for it, to match the 24 and delete it.
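import re
# \b keeps the match from touching values such as 1240 or 2400.
# A 24 at the very end of a line leaves a trailing comma behind.
regex24 = re.compile(r"\b24,?\b")
with open('txt1.txt', 'r') as f:
    cleans = [regex24.sub("", line) for line in f]
with open('txt2.txt', 'w') as out:
    out.writelines(cleans)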
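As a quick way to check the filtering logic without touching any files, the sketch below applies a split-and-filter approach to a short string. The function name drop_value and the sample values are illustrative; the sample includes 1240 to show that only whole fields equal to 24 are removed.

```python
def drop_value(line, unwanted="24"):
    """Return the comma-separated line without fields equal to `unwanted`."""
    fields = line.strip().split(",")
    kept = [field for field in fields if field != unwanted]
    return ",".join(kept)


sample = "24,24,1000,1240,24,1000,24"
print(drop_value(sample))  # 1000,1240,1000
```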

How to replace a line that has a specific text

I want to search for particular text and replace the line if the text is present in that line.
In this code I replace line 125, but want to replace dynamically according to the text:
with open("config.ini", "r") as file:
    lines = file.readlines()
lines[125] = "minimum_value_gain = 0.01" + '\n'
with open("config.ini", "w") as f:
    f.writelines(lines)
How do I make it that if a line has:
minimum_value_gain =
then replace that line with:
minimum_value_gain = 0.01
There is no reason for you to manually parse a config.ini file textually. You should use configparser to make things much simpler. This library reads the file for you, and in a way converts it to a dict so processing the data is much easier. For your task you can do something like:
import configparser
config = configparser.ConfigParser()
config.read("config.ini")
for section in config:
    if config.has_option(section, "minimum_value_gain"):
        config.set(section, "minimum_value_gain", "0.01")
with open("config.ini", 'w') as f:
    config.write(f)
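The same idea can be tried without a file on disk. This sketch assumes a hypothetical config with two sections (the section names and values in sample_ini are invented for illustration) and uses read_string and an in-memory buffer instead of config.ini.

```python
import configparser
import io

# Hypothetical config contents, used in place of config.ini.
sample_ini = """\
[training]
minimum_value_gain = 0.5
score_threshold = 0.2

[other]
verbose = yes
"""

config = configparser.ConfigParser()
config.read_string(sample_ini)

# sections() skips the special DEFAULT section.
for section in config.sections():
    if config.has_option(section, "minimum_value_gain"):
        config.set(section, "minimum_value_gain", "0.01")

# Write to an in-memory buffer instead of overwriting a real file.
buffer = io.StringIO()
config.write(buffer)
print(buffer.getvalue())
```

One thing to keep in mind: configparser does not preserve comments when it writes a file back, so this approach may not suit a hand-annotated config.ini.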
Since you are replacing the complete line, an if statement will do the trick; there is no need to replace text within the line:
# Updated: make sure one line doesn't have both values
with open("config.ini", "r") as file:
    lines = file.readlines()
newlines = []
for line in lines:
    if "minimum_value_gain" in line:
        line = "minimum_value_gain = 0.01" + '\n'
    if "score_threshold" in line:
        line = "Values you want to add" + '\n'
    newlines.append(line)
with open("config.ini", "w") as f:
    f.writelines(newlines)
A little messy and not optimized, but it gets the job done. It first reads all the lines, then truncates the file and writes the lines back, replacing each line that starts with search_text with search_text followed by string_to_add:
def replace_in_file(filename: str, search_text: str, string_to_add: str) -> None:
    with open(filename, "r+") as file_to_write:
        lines = file_to_write.readlines()
        file_to_write.seek(0)
        file_to_write.truncate()
        for line in lines:
            if line.startswith(search_text):
                # Rebuild the line instead of appending to the old value.
                line = search_text + string_to_add + "\n"
            file_to_write.write(line)
replace_in_file("sdf.txt", "minimum_value_gain", " = 0.01")
You can also use Python's regex library.
Here is an example.
It is better not to read and write the same file in place. Write to a different file, then rename it afterwards.
import re
pattern = 'minimum_value_gain'
string_to_replace = 'minimum_value_gain = 0.01\n'
with open("config.ini", "r") as file:
    lines = file.readlines()
# re.match only matches at the start of the line.
newlines = [string_to_replace if re.match(pattern, line) else line for line in lines]
# Open with "w" so a leftover new_config.ini is overwritten, not appended to.
with open("new_config.ini", "w") as fileout:
    fileout.writelines(newlines)
You can rename the file afterwards:
import os
os.remove("config.ini")
os.rename("new_config.ini", "config.ini")
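The write-then-rename advice can be folded into one helper. The sketch below is one possible way to do it, not the answer's own code: the function name replace_matching_lines and the demo file contents are hypothetical, and it uses os.replace, which overwrites the destination in one step so the separate os.remove is not needed.

```python
import os
import tempfile


def replace_matching_lines(path, prefix, new_line):
    """Rewrite `path`, replacing every line that starts with `prefix`."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write the new contents to a temporary file in the same directory.
    with open(path) as src, tempfile.NamedTemporaryFile(
        "w", dir=directory, delete=False
    ) as dst:
        for line in src:
            dst.write(new_line + "\n" if line.startswith(prefix) else line)
    # os.replace overwrites the destination if it already exists.
    os.replace(dst.name, path)


# Demo on a throwaway file with hypothetical contents.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "config.ini")
with open(demo_path, "w") as f:
    f.write("name = demo\nminimum_value_gain = 0.5\n")

replace_matching_lines(demo_path, "minimum_value_gain", "minimum_value_gain = 0.01")
with open(demo_path) as f:
    result = f.read()
print(result)
```

Keeping the temporary file in the same directory as the target means the final rename stays on one filesystem.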
Set the string you would like to look for (match_string = 'example').
Create an empty list, output_list.
Use with open(x, y) as z: (this automatically closes the file when the block ends).
Loop with for line in file.readlines() to run through each line of the file.
The if statement appends your replacement line if match_string is in the line; otherwise it appends the line unchanged.
NOTE: Variables can have any name that is not reserved, but avoid shadowing built-ins (don't call something just 'list').
match_string = 'example'
output_list = []
with open("config.ini", "r") as file:
    for line in file.readlines():
        if match_string in line:
            output_list.append('minimum_value_gain = 0.01\n')
        else:
            output_list.append(line)
This may not be ideal (or the most readable) for a first introduction to Python, but I would have solved the problem as follows:
with open('config.ini', 'r') as in_file:
    out_file = ['minimum_value_gain = 0.01\n' if 'example' in line else line for line in in_file.readlines()]
To replace a specific piece of text in a string:
a = 'My name is Zano'
b = a.replace('Zano', 'Zimmer')

Python: modify part of a string inside a .cfg file

I want to access a file (C:\Programmer\Test.txt), find a string inside that file beginning with 'SS' and replace everything after that on the same line with a new string 'C:\Test\Flash'.
The code below prints out the line I want to modify but I can't seem to find a suitable function that will replace everything after the 'SS' with the new string.
import re
# Raw string so the backslashes in the Windows path are kept literally.
for line in open(r'C:\Programmer\Build\Test.txt'):
    if line.startswith('SS'):
        print(line)
        storedline = line
        print(storedline)
You can do
# Raw strings keep the backslashes in the Windows paths literal.
file_path = r'C:\Programmer\Build\Test.txt'
new_line_content = r'C:\Test\Flash'
output = []
with open(file_path, 'r') as infile:
    line = infile.readline()
    while line:
        if line[0:2] == 'SS':
            output.append('SS{}\n'.format(new_line_content))
        else:
            output.append(line)
        line = infile.readline()
with open(file_path, 'w') as outfile:
    outfile.write(''.join(output))
Note that the line detection here, if line[0:2] == 'SS', interprets your requirement 'find a string inside that file beginning with SS' literally: only lines that start with 'SS' are replaced.
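The replacement rule itself can be tested on a list of strings before it is pointed at a real file. In this sketch the function name replace_after_prefix and the sample lines are made up for illustration; only the 'SS' prefix and the new path come from the question.

```python
def replace_after_prefix(lines, prefix, new_content):
    """Return lines where the text after `prefix` is swapped for `new_content`."""
    result = []
    for line in lines:
        if line.startswith(prefix):
            # Keep the prefix, discard the rest of the line.
            result.append(prefix + new_content + "\n")
        else:
            result.append(line)
    return result


sample = ["AA=1\n", "SSC:\\Old\\Path\n", "BB=2\n"]
updated = replace_after_prefix(sample, "SS", r"C:\Test\Flash")
print("".join(updated))
```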

execfile issue with the filename argument in python

I have two Python scripts: a match script and a regex script. I want to start with matchParser.py and then continue with regexParser.py, and I want regexParser.py to know the name of the file that matchParser.py wrote so it can keep working with it. I hope that explains it clearly. I tried a lot, but unfortunately without success.
The error I get: TypeError: coercing to Unicode: need string or buffer, file found
matchParser.py
import glob
# Raw string so \x in the path is not read as an escape sequence.
intfiles = glob.glob(r"C:\Users\x\Desktop\x\*.csv")
header_saved = False
with open('outputdre121.csv','wb') as fout:
    for filename in intfiles:
        with open(filename) as fin:
            header = next(fin)
            if not header_saved:
                fout.write(header)
                header_saved = True
            for line in fin:
                fout.write(line)
print "start next script: regexParser.py"
execfile("regexParser.py")
regexParser.py
import re
import matchParser
lines = [line.strip() for line in open(matchParser.fout)] ## here the filename from MatchParser.py
with open('outputregexdre13.csv', "w") as output_csv:
    for result in lines:
        match = re.search(r'((\s\d+?[1-9])!?[ ])', result)
        if match: output_csv.write(match.group(0) + '\n')
thanks!
I have found the solution.
It is very simple, actually: open() needs a file name, but matchParser.fout is a file object.
So I use the .name attribute of matchParser.fout:
lines = [line.strip() for line in open(matchParser.fout.name)]
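The difference between a file object and its name can be shown in a few lines. Note that this sketch is written for Python 3, while the scripts in the question are Python 2 (print statement, execfile), so the exact error message differs; the file name output.csv and its contents are hypothetical.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "output.csv")  # hypothetical file name

with open(path, "w") as fout:
    fout.write("a,b\n1,2\n")

# Passing the file object itself to open() fails: it expects a path.
try:
    open(fout)
except TypeError as exc:
    print("TypeError:", exc)

# The .name attribute holds the path the file was opened with.
with open(fout.name) as fin:
    lines = [line.strip() for line in fin]
print(lines)
```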
