Searching for text in a file - python

Hi there, I've got a couple of problems. Say my text file contains:
abase
abased
abasement
abasements
abases
The code below is meant to find a word in a file and print every line from that word to the end of the file. Instead, it only prints the line containing my search term and not the rest of the file.
    search_term = r'\b%s\b' % search_term
    for line in open(f, 'r'):
        if re.match(search_term, line):
            if search_term in line:
                f = 1
            if f: print line,
If I searched for abasement, I would like the output to be:
abasement
abasements
abases
My final problem: I would like to search a file and print the lines my search term is in, plus a number of lines before and after the search term. If I searched the example text above for 'abasement' and set the number of lines to print on either side to 1, my output would be:
abased
abasement
abasements
    numb = ' the number of lines to print either side of the search line '
    search_term = 'what i search'
    f=open("file")
    d={}
    for n,line in enumerate(f):
        d[n%numb]=line.rstrip()
        if search_term in line:
            for i in range(n+1,n+1+numb):
                print d[i%numb]
            for i in range(1,numb):
                print f.next().rstrip()

For the first part of the question, unindent your if f: print line, statement. Otherwise, you only try to print on lines where the regex matches.
It's not clear to me what your question is in the second part. I see what you're trying to do, and your code, but you've not indicated how it misbehaves.
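One possible reading of the second part is "print each matching line with numb lines of context on either side". A minimal Python 3 sketch of that reading follows. The function name grep_context and the whole-word matching are assumptions, chosen so that searching for abasement does not also match abasements.

```python
import re
from collections import deque


def grep_context(lines, term, numb):
    """Return each whole-word match of `term` with `numb` lines before and after."""
    pattern = re.compile(r"\b%s\b" % re.escape(term))
    before = deque(maxlen=numb)  # the last `numb` lines not yet emitted
    after_left = 0               # trailing context lines still owed
    out = []
    for raw in lines:
        line = raw.rstrip("\n")
        if pattern.search(line):
            out.extend(before)   # emit the buffered leading context
            before.clear()
            out.append(line)
            after_left = numb
        elif after_left > 0:
            out.append(line)     # trailing context after a match
            after_left -= 1
        else:
            before.append(line)  # remember as possible leading context
    return out


words = ["abase\n", "abased\n", "abasement\n", "abasements\n", "abases\n"]
print("\n".join(grep_context(words, "abasement", 1)))
```

The deque with maxlen=numb plays the role of the d[n%numb] ring buffer in the question, without the index arithmetic.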

For the first part the algorithm goes like this (in pseudo code):
    found = False
    for every line in the file:
        if line contains search term:
            found = True
        if found:
            print line
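The pseudocode translates almost line for line into Python 3. This is a sketch only: the function name lines_from_match is made up for illustration, and it uses a plain substring test rather than the regex from the question.

```python
def lines_from_match(lines, term):
    """Return every line from the first one containing `term` to the end."""
    found = False
    result = []
    for line in lines:
        if term in line:
            found = True          # once set, the flag never resets
        if found:
            result.append(line.rstrip("\n"))
    return result


words = ["abase\n", "abased\n", "abasement\n", "abasements\n", "abases\n"]
for line in lines_from_match(words, "abasement"):
    print(line)
```

Passing an open file object instead of the list works the same way, since both yield one line per iteration.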

Related

how to go to next line if match is found and again check for the word count in that line

I am trying to get a word count: find a matching line and, if a match is found, go to the next line and count the words in that line.
    id = open('id.txt','r')
    ids = id.readlines()
    for i in range(0, len(ids) - 1, 1):
        actual_id = ids[i]
        print(actual_id)
        with open('sample2.txt', 'r') as f:
            for line in f:
                if re.search(r'{actual_id}|RQ', line):
                    next_line = line.next()
                    if next_line == 'RQ':
                        print(line)
                        with open('output.txt', 'a') as f:
                            f.write('\n' + line)
Sample.txt text file:
[07-12-2022 13:27:45.728|Info|0189B31C|RQ]
<ServiceRQ><SaleInfo><CityCode Solution=1>BLQ</CityCode><CountryCode Solution=2>NL</CountryCode><CurrencyCode>EUR</CurrencyCode><Channel>ICI</Channel></ServiceRQ>
[07-12-2022 13:27:45.744|Info|0189B31D|RQ]
<ServiceRQ><SaleInfo><CityCode Solution=1>BLQ</CityCode><CountryCode>NL</CountryCode><CurrencyCode>EUR</CurrencyCode><Channel>ICI</Channel></ServiceRQ>
0189B31C
0189B31D
The last two lines are the unique IDs, which are stored in a separate text file. I am trying to read the first ID from that file and match it in Sample.txt; if a match is found, go to the next line, count the number of Solution words, and print the count.
Can someone please help me find the right code? I am a little confused.
I have no experience with the "requests" module. But since no one has answered your question yet, I thought maybe this would suit you. The code should work fine if the number of lines is even. I mean, the code will put strings in the "payload" and do the rest only if there is an entire pair consisting of an odd and an even string.
    with open('Sample.txt', 'r') as f:
        while True:
            try:
                odd_line=next(f)
                even_line=next(f)
            except StopIteration:
                break
            #payload=...
            #headers=...
            #response=...
            #print(response.text)
You can use the flag re.DOTALL with the regex {idf}\|RQ.*?</ServiceRQ>. With this flag the dot matches any character, including a newline, and the non-greedy quantifier (.*?) makes sure that as few characters as possible are matched until the string </ServiceRQ> is found. Then you can use findall to count the occurrences of Solution in the matched string.
    import re
    with open('sample2.txt', 'r') as sample_file:
        sample2 = sample_file.read()
    id_dict = {}
    with open('id.txt', 'r') as id_file:
        for idf in id_file.read().split():
            id_found = re.findall(fr'{idf}\|RQ.*?</ServiceRQ>', sample2, re.DOTALL)
            if id_found:
                solution_found = re.findall('Solution', id_found[0])
                id_dict[idf] = len(solution_found)
    print(id_dict)
Output from id_dict
{
'0189B31C': 2,
'0189B31D': 1
}
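To try the same idea without any files on disk, here is a self-contained sketch that works on an in-memory string. The SAMPLE text is an abridged copy of the log from the question, the function name count_solutions is hypothetical, and re.escape is added on the assumption that an ID could contain regex metacharacters.

```python
import re

# Abridged copy of the log from the question (some tags dropped for brevity).
SAMPLE = """[07-12-2022 13:27:45.728|Info|0189B31C|RQ]
<ServiceRQ><SaleInfo><CityCode Solution=1>BLQ</CityCode><CountryCode Solution=2>NL</CountryCode></ServiceRQ>
[07-12-2022 13:27:45.744|Info|0189B31D|RQ]
<ServiceRQ><SaleInfo><CityCode Solution=1>BLQ</CityCode><CountryCode>NL</CountryCode></ServiceRQ>
"""


def count_solutions(log_text, ids):
    """Map each ID to the number of 'Solution' words in its request body."""
    counts = {}
    for idf in ids:
        # Non-greedy match from the ID header to the first closing tag;
        # re.DOTALL lets '.' cross the newline between header and body.
        pattern = re.escape(idf) + r"\|RQ.*?</ServiceRQ>"
        match = re.search(pattern, log_text, re.DOTALL)
        if match:
            counts[idf] = match.group(0).count("Solution")
    return counts


print(count_solutions(SAMPLE, ["0189B31C", "0189B31D"]))
```

IDs that never appear in the log are simply left out of the result.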

Search for a multiple line string in multiple files and give the next string out

I am trying to learn Python by working on projects I would really like to have.
Now I'm at a point where I don't know how to proceed.
I want to search through a file and list all the strings that come after an indicator string, which can itself vary.
Therefore, I need to search for a multi-line string with an unknown number of tabs between its parts, and I would like to know the string that follows it (multiple times in a file).
    solution.append(
        base.fresher(
            current = indicator_var,
            nominal = unknown_value,
            comment = "comment XYZ"
        )
    )
    #comment
    solution.append(
        base.fresher(
            current = indicator_var,
            nominal = unknown_value,
            comment = "comment ABCDEFG"
        )
    )
"base.fresher( current = indicator_var" is what I would like to search for. The problem is that I don't know how many tabs are between "base.fresher(" and "current = indicator_var"; this can vary.
And how should I proceed after I have found it? How do I get the "unknown_value", and do so multiple times in one file?
I don't really have any code to show you. I tried so many times that the result was an unreadable piece of code, which is even more confusing.
This is all I have right now:
    your_search_word = "base.fresher("
    file = open("test_file.txt", "r")
    for line in file:
        splitted = line.split("\n")
        variables.append(splitted)
    your_string = variables
    list_of_words = your_string.split()
    next_word = list_of_words[list_of_words.index(your_search_word) + 1]
    print(next_word)
I had a little success with part of this code a few days ago, so I'm clinging to it, but I also know I have no idea how to get anywhere here.
    words = []
    your_search_word = "base.fresher("
    with open("test_file.txt", "r") as f:
        found = False
        for line in f:
            line = line.strip()
            if found:
                words.append(line)
            elif your_search_word == line:
                found = True
Notes:
with open is safer, because the file is closed for you and you can't forget to do it.
I used strip() to get rid of the newlines and the surrounding whitespace.
We set a variable to True when a line equals the search word.
We only add lines to our list 'words' when the found variable is True.
With the help of nessuno I made a few changes (remove "=" and ",")
    words = []
    your_search_word = "current = indicator_var,"
    with open("test_file.txt", "r") as f:
        found = False
        for line in f:
            line = line.strip()
            if found:
                words.append(line.split("= ")[1].split(",")[0])
                found = False
            elif your_search_word == line:
                found = True
    print(words)
['unknown_value', 'unknown_value']
This is not the final solution yet; I will work on it for a while and post my results here.
Thanks a lot!
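Since the number of tabs and line breaks between the parts can vary, a regular expression with \s* between the tokens is one way to handle it. The sketch below assumes the layout from the question (current first, then nominal); the function name nominal_values and the placeholder values value_a and value_b are made up for illustration.

```python
import re

# Stand-in for the file content; tabs and line breaks vary on purpose.
SOURCE = """solution.append(
\tbase.fresher(
\t\t\tcurrent = indicator_var,
\t\tnominal = value_a,
\t\tcomment = "comment XYZ"
\t)
)
#comment
solution.append(
base.fresher(   current = indicator_var,
\tnominal = value_b,
\tcomment = "comment ABCDEFG"
)
)
"""


def nominal_values(text, indicator="indicator_var"):
    """Return the value after 'nominal =' for each base.fresher( call
    whose 'current' argument equals `indicator`."""
    pattern = re.compile(
        r"base\.fresher\(\s*current\s*=\s*" + re.escape(indicator)
        + r"\s*,\s*nominal\s*=\s*([^,\s]+)"
    )
    # \s matches spaces, tabs and newlines, so the amount of
    # whitespace between the tokens does not matter.
    return pattern.findall(text)


print(nominal_values(SOURCE))
```

With a real file, reading it in one go (f.read()) and passing the string to the function should behave the same way.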

How to check to see if a certain line is found before a certain point in a txt file?

I need to figure out if a certain phrase/line is found before another phrase takes place in a text file. If the phrase is found, I will pass, if it does not exist, I will add a line above the cutoff. Of note, the phrase can occur later in the document as well.
An example of what this txt format would be could be:
woijwoi
woeioasd
woaije
Is this found
owijefoiawjwfioj
This is the cutoff
asoi w
more text lines
Is this found
aoiw
The search should cut off on the phrase "This is the cutoff". It is unknown what line the cutoff will be on. If "Is this found" exists before the cutoff, pass. If it does not, I want to add the phrase "Adding a line" right above the cutoff to the output doc.
An example of the code I've tried so far, with all strings previously defined:
    find = 'Is this found'
    with open(longStr1) as old_file:
        lines = old_file.readlines()
        with open(endfile1, "w") as new_file:
            for num, line in enumerate(lines):
                if "This is the" in line:
                    base_num = num
                for num in range(1, base_num):
                    if not find in line:
                        if line.startswith("This is the"):
                            line = newbasecase + line
I am getting the error "name 'base_num' is not defined". Is there a better way to perform this search?
What about something like this? It looks up the index positions of both the find phrase and the cutoff, then cycles through the list of lines. When it reaches the cutoff line, it checks whether the find phrase occurred earlier; if not, it writes the "Adding a line" line. It then writes the cutoff line and ends the new file.
    find = "Is this found"
    cutoff = "This is the cutoff"
    with open(longStr1) as old_file:
        lines = old_file.readlines()
    # Each line keeps its trailing newline, so test the phrase against
    # every line instead of checking "find in lines".
    find_index = next((i for i, l in enumerate(lines) if find in l), None)
    cutoff_index = next((i for i, l in enumerate(lines) if cutoff in l), None)
    with open(endfile1, "w") as new_file:
        for num, line in enumerate(lines):
            if num == cutoff_index:
                # Add the extra line only if the phrase was not seen before the cutoff
                if find_index is None or find_index > cutoff_index:
                    new_file.write("Adding a line\n")
                new_file.write(line)
                break
            new_file.write(line)
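If the rest of the document after the cutoff should be kept as well, a single pass over the lines is enough. The following is a sketch under that assumption; the function name add_line_before_cutoff and its parameters are hypothetical, and it works on a list of lines such as the one returned by readlines().

```python
def add_line_before_cutoff(lines, find, cutoff, extra="Adding a line\n"):
    """Insert `extra` right above the cutoff line unless `find`
    already occurs on some line before the cutoff."""
    for index, line in enumerate(lines):
        if find in line:
            return list(lines)   # phrase seen first: nothing to add
        if cutoff in line:
            # Cutoff reached without seeing the phrase: insert above it
            return lines[:index] + [extra] + lines[index:]
    return list(lines)           # no cutoff in the document at all


doc = ["woaije\n", "This is the cutoff\n", "Is this found\n"]
print("".join(add_line_before_cutoff(doc, "Is this found", "This is the cutoff")))
```

The result can then be written out in one call with new_file.writelines(...).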

How to open a file in python, read the comments ("#"), find a word after the comments and select the word after it?

I have a function that loops through a file that looks like this:
"#" XDI/1.0 XDAC/1.4 Athena/0.9.25
"#" Column.4: pre_edge
Content
That is to say, a comment follows each "#". My function aims to read each line and, if the comment starts with a specific word, select what comes after the ":".
For example, given these two lines, I would like to read through them and, if a line starts with "#" and contains the word "Column.4", store the word "pre_edge".
An example of my current approach follows:
    with open(file, "r") as f:
        for line in f:
            if line.startswith ('#'):
                word = line.split(" Column.4:")[1]
            else:
                print("n")
I think my trouble is specifically this: after finding a line that starts with "#", how can I parse or search through it and save its content if it contains the desired word?
If the # comment contains the string Column.4: as shown above, you could parse it this way.
    with open(filepath) as f:
        for line in f:
            if line.startswith('#'):
                # Here you process comment lines
                if 'Column.4' in line:
                    first, remainder = line.split('Column.4: ')
                    # Remainder contains everything after '# Column.4: '
                    # So if you want to get first word ->
                    word = remainder.split()[0]
            else:
                # Here you can process lines that are not comments
                pass
Note
It is also good practice to use a for line in f: loop instead of f.readlines() (as used in other answers), because this way you don't load all lines into memory but process them one by one.
You should start by reading the file into a list and then work through that instead:
    file = 'test.txt' #<- call file whatever you want
    with open(file, "r") as f:
        txt = f.readlines()
        for line in txt:
            if line.startswith ('"#"'):
                word = line.split(" Column.4: ")
                try:
                    print(word[1])
                except IndexError:
                    print(word)
            else:
                print("n")
Output:
>>> ['"#" XDI/1.0 XDAC/1.4 Athena/0.9.25\n']
>>> pre_edge
Used a try and except catch because the first line also starts with "#" and we can't split that with your current logic.
Also, as a side note, in the question you have the file with lines starting as "#" with the quotation marks so the startswith() function was altered as such.
    with open('stuff.txt', 'r+') as f:
        data = f.readlines()
        for line in data:
            words = line.split()
            if words and ('#' in words[0]) and ("Column.4:" in words):
                print(words[-1])
                # pre_edge
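The answers above differ on whether the lines start with a bare # or with the quoted "#" shown in the question. A small sketch that accepts both and returns the stored word instead of printing it could look like this; the function name header_value and its key parameter are assumptions made for illustration.

```python
def header_value(lines, key="Column.4"):
    """Return the first word after '<key>:' on a comment line, or None."""
    for line in lines:
        stripped = line.strip()
        # Accept both a bare # and the quoted "#" shown in the question
        if not (stripped.startswith("#") or stripped.startswith('"#"')):
            continue
        # partition() never raises: sep is empty when the key is absent
        _, sep, remainder = stripped.partition(key + ":")
        if sep and remainder.split():
            return remainder.split()[0]
    return None


sample = [
    '"#" XDI/1.0 XDAC/1.4 Athena/0.9.25\n',
    '"#" Column.4: pre_edge\n',
    "Content\n",
]
print(header_value(sample))
```

Using partition() instead of split() avoids the IndexError that the try/except in one of the answers works around.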

Printing Lines in a File

So here's the problem I have.
I can find the search term in my file, but at the moment I can only print the line that the search term is in (thanks to questions posted by people earlier). I cannot print all the lines from the search term to the end of the file. Here is the code I have so far:
    search_term = r'\b%s\b' % search_term
    for line in open(f, 'r'):
        if re.match(search_term, line):
            print line,
Thanks in advance!
It can be much improved if you first compile the regex:
    search_term_regex = re.compile(r'\b%s\b' % search_term)
    found = False
    for line in open(f):
        if not found:
            found = bool(search_term_regex.findall(line))
        if found:
            print line,
Then you're not repeating the print line.
You could set a boolean flag, e.g. "found = True";
and do a check for found==True, and if so print the line.
Code below:
    search_term = r'\b%s\b' % search_term
    found = False
    for line in open(f, 'r'):
        if found==True:
            print line,
        elif re.match(search_term, line):
            found = True
            print line,
To explain this a bit: With the boolean flag you are adding some state to your code to modify its functionality. What you want your code to do is dependent on whether you have found a certain line of text in your file or not, so the best way to represent such a binary state (have I found the line or not found it?) is with a boolean variable like this, and then have the code do different things depending on the value of the variable.
Also, the elif is just a shortening of else if.
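One detail that separates the two answers: re.match only matches at the start of the line, while findall (like re.search) looks anywhere in it. The Python 3 sketch below illustrates the difference. The helper names are made up, and re.escape is added on the assumption that the search term should be treated as literal text.

```python
import re


def word_pattern(term):
    """Compile a whole-word pattern for a literal search term."""
    return re.compile(r"\b%s\b" % re.escape(term))


def matches_at_start(term, line):
    # re.match anchors at position 0 of the line
    return word_pattern(term).match(line) is not None


def matches_anywhere(term, line):
    # re.search scans the whole line
    return word_pattern(term).search(line) is not None


line = "see abasement here"
print(matches_at_start("abasement", line))    # term is not at the start
print(matches_anywhere("abasement", line))    # term occurs mid-line
```

For a word list with one term per line, as in the question, both calls behave the same; the difference only shows once the term can appear mid-line.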
