I volunteer as a moderator for a modded Minecraft community, and we deal with chat logs frequently.
I'm building a program that takes a server's chat log, finds lines matching a username, and writes them to a new file.
At the moment it reads the file and converts each line into a list item, then uses a regular expression to look for the username and writes the line if it contains a match. The problem is that usernames in the chat logs are formatted like this: BOTusername. I want the program to strip the BOT part before searching (it makes the output neater when it is written at the end).
I know this is possible when you read the whole file into one string with f.read(), but I was wondering if it's possible to do this with a list instead. Here is an example of what the list looks like.
The code I have so far is as follows:
import os
import re

username = 'UsernameHere'
path = os.path.dirname(__file__)
with open('chatlogs.txt', 'r') as f:
    chatlogs = f.readlines()
print(chatlogs)  # for debugging
# Checks the chatlogs for username matches
for line in chatlogs:
    if re.match('(.*)' + username + '(.*)', line):
        d = open('output.txt', 'a')
        d.write(line)
If your username variable consistently looks like "BOTsomeusername", you can strip the first three characters with some simple indexing:
username = username[3:]
Thanks to Hoog for pointing me in the right direction!
I got it to work with the following code:
for line in chatlogs:
    if re.match('(.*)' + username + '(.*)', line):
        # "with" closes the output file again after each write
        with open('output.txt', 'a') as d:
            timestamp = line.split(']', 1)[0] + '] '  # includes the bracket that gets removed in the split operation
            message = line.split('] ', 1)[1][3:]  # saves the remainder of the message after the third character (after BOT)
            d.write(timestamp + message)
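The same idea can be sketched as a reusable function. This is only an illustration: it assumes each log line looks like `[12:00:01] BOTusername: message` (that layout is inferred from the split calls above, not confirmed), and the helper name `clean_matching_lines` is made up for the example.

```python
import re


def clean_matching_lines(lines, username, prefix="BOT"):
    """Return lines mentioning `username`, with `prefix` removed from the sender."""
    # re.escape keeps characters such as '.' in a username from acting as regex syntax.
    wanted = re.compile(re.escape(username))
    cleaned = []
    for line in lines:
        if not wanted.search(line):
            continue
        # Split once on the first '] ' so the timestamp stays intact.
        timestamp, sep, message = line.partition("] ")
        if sep and message.startswith(prefix):
            message = message[len(prefix):]
            cleaned.append(timestamp + sep + message)
        else:
            # Unexpected layout: keep the line unchanged rather than guess.
            cleaned.append(line)
    return cleaned


sample = [
    "[12:00:01] BOTUsernameHere: hello\n",
    "[12:00:02] BOTSomeoneElse: hi\n",
]
print(clean_matching_lines(sample, "UsernameHere"))
```

Returning a list instead of writing inside the loop means the output file only has to be opened once, for example with `writelines()` inside a single `with` block.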
Related
I am a complete beginner in programming and am trying to write my first Python script. I have a URL of this form: https://stackoverflow.com/questions/ID
where ID changes on every pass through a loop, and the list of IDs is given in a text file.
I tried to do it this way:
if id_str != "":
    y = f'https://stackoverflow.com/questions/{id}'
    browser.get(y)
but it opens only the first ID in the text file, so I need to know how to make it get a different ID from the text file every time.
Thanks in advance.
Generally it can be something like this:
with open(filename) as file:
    lines = file.readlines()
for line in lines:
    id_str = line.strip()  # one ID per line, without the trailing newline
    if id_str != "":
        y = f'https://stackoverflow.com/questions/{id_str}'
        browser.get(y)
where filename is a text file containing question IDs, with a single ID string on each line.
It can be more complicated, depending on your needs and implementation.
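To check the loop without a browser, the URL building can be pulled into a small function. This is a sketch only; `build_question_urls` is a hypothetical helper name, and the sample IDs are invented.

```python
def build_question_urls(lines, base="https://stackoverflow.com/questions/"):
    """Turn an iterable of raw text lines into one URL per non-empty ID."""
    urls = []
    for line in lines:
        id_str = line.strip()  # drop the trailing newline and stray spaces
        if id_str:             # skip blank lines
            urls.append(base + id_str)
    return urls


# In the real script the lines would come from open(filename), and each URL
# would be passed to browser.get(url) inside the loop.
for url in build_question_urls(["111\n", "\n", "222\n"]):
    print(url)
```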
I have multiple text files in a folder, say "configs". I want to search for a particular text, "-cfg", in each file and copy the data that follows -cfg, from the opening to the closing inverted commas ("data"). The result should be written to another text file, "result.txt", with the filename, test name and config for each file.
NOTE: Each file can have multiple "cfg" entries on separate lines, each along with the test name related to that configuration.
E.g.: cube_demo -cfg "RGB 888; MODE 3"
My approach is to open each text file one at a time and find the pattern, then store the required result in a buffer. Later, copy the entire result into a new file.
I came across Python, and it looks like this is easy to do in Python. I am still learning Python and trying to figure out how to do it. Please help. Thanks.
I know how to open the file and iterate over each line to search for a particular string:
import re
search_term = "Cfg\s(\".*\")" // Not sure, if it's correct
ifile = open("testlist.csv", "r")
ofile = open("result.txt", "w")
searchlines = ifile.readlines()
for line in searchlines:
    if search_term in line:
        if re.search(search_term, line):
            ofile.write(\1)
            // trying to get string with the \number special sequence
ifile.close()
ofile.close()
But this gives me the complete line, I could not find how to use regular expression to get only the "data" and how to iterate over files in the folder to search the text.
Not quite there yet...
import re
search_term = "Cfg\s(\".*\")" // Not sure, if it's correct
"//" is not a valid comment marker, you want "#"
With regard to your regexp, you want (from your specs): 'cfg', followed by zero or more spaces, followed by any text between double quotes, stopping at the first closing double quote, and you want to capture the part between these double quotes. This is spelled as 'cfg *"(.+?)"'. Since you don't want to deal with escape chars, the best way is to use a raw single-quoted string:
exp = r'cfg *"(.+?)"'
Now, since you're going to reuse this expression in a loop, you might as well compile it up front:
exp = re.compile(r'cfg *"(.+?)"')
So now exp is a re.Pattern object instead of a string. To use it, you call its search(<text>) method with your current line as the argument. If the line matches the expression, you'll get a re.Match object; otherwise you'll get None:
>>> match = exp.search('foo bar "baaz" boo')
>>> match is None
True
>>> match = exp.search('foo bar -cfg "RGB 888; MODE 3" tagada "tsoin"')
>>> match is None
False
>>>
To get the part between the double quotes, you call match.group(1) (the first captured group; group 0 is the text matched by the whole expression):
>>> match.group(0)
'cfg "RGB 888; MODE 3"'
>>> match.group(1)
'RGB 888; MODE 3'
>>>
Now you just have to learn to use files correctly... First hint: files are context managers that know how to close themselves. Second hint: files are iterable, so there is no need to read the whole file into memory. Third hint: file.write("text") WON'T append a newline after "text".
If we glue all this together, your code should look something like:
import re

search_term = re.compile(r'cfg *"(.+?)"')
with open("testlist.csv", "r") as ifile:
    with open("result.txt", "w") as ofile:
        for line in ifile:
            match = search_term.search(line)
            if match:
                ofile.write(match.group(1) + "\n")
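The question also asks how to walk every file in the folder and record the filename and test name. One possible sketch, using `pathlib` from the standard library, is below. It makes several assumptions that are not in the original post: the files end in `.txt`, the test name is the first word on the line (as in `cube_demo -cfg "RGB 888; MODE 3"`), and the output is tab-separated. The function names are hypothetical.

```python
import re
from pathlib import Path

CFG_PATTERN = re.compile(r'cfg *"(.+?)"')


def collect_configs(folder):
    """Return (filename, test name, config) tuples for every -cfg line in folder."""
    results = []
    for path in sorted(Path(folder).glob("*.txt")):
        with open(path, "r") as ifile:
            for line in ifile:
                match = CFG_PATTERN.search(line)
                if match:
                    # Assumption: the test name is the first word on the line.
                    test_name = line.split()[0]
                    results.append((path.name, test_name, match.group(1)))
    return results


def write_results(results, out_path):
    """Write one tab-separated line per result."""
    with open(out_path, "w") as ofile:
        for filename, test_name, config in results:
            ofile.write(f"{filename}\t{test_name}\t{config}\n")
```

A call such as `write_results(collect_configs("configs"), "result.txt")` would tie the two together. Keeping result.txt outside the configs folder avoids it being picked up as an input file on the next run.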
EDIT: See bottom of post for the entire code
I am new to this forum and I have an issue that I would be grateful for any help solving.
Situation and goal:
- I have a list of strings. Each string is one word, like this: ['WORD', 'LINKS', 'QUOTE' ...] and so on.
- I would like to write this list of words (strings) on separate lines in a new text file.
- One would think the way to do this would be by appending the '\n' to every item in the list, but when I do that, I get a blank line between every list item. WHY?
Please have a look at this simple function:
def write_new_file(input_list):
    with open('TEKST\\TEKST_ny.txt', mode='wt') as output_file:
        for linje in input_list:
            output_file.write(linje + '\n')
This produces a file that looks like this:
WORD

LINKS

QUOTE
If I remove the '\n', then the file looks like this:
WORDLINKSQUOTE
Instead, the file should look like this:
WORD
LINKS
QUOTE
I am obviously doing something wrong, but after a lot of experimenting and reading around the web, I can't seem to get it right.
Any help would be deeply appreciated, thank you!
Response to the link to the thread about write() vs. writelines():
writelines() doesn't fix this by itself; it produces the same result as write() without the '\n', unless I add a newline to every list item before passing it to writelines(). But then we're back at the first option and the blank lines...
I tried one of the answers in the linked thread, using '\n'.join() and then write(), but I still get the blank lines.
It comes down to this: for some reason, I get two newlines for every '\n', no matter how I use it. I am .strip()'ing the newline characters from the list items to be sure, and without the newline everything is just one massive block of text anyway.
On using another editor: I tried opening the txt file in Windows Notepad and in Notepad++. Is there any reason why these programs wouldn't display it correctly?
EDIT: This is the entire code. Sorry for the Norwegian naming. The purpose of the program is to read and clean up a text file and return the words first as a list and ultimately as a new file with each word on a new line. The text file is a list of Scrabble words, so it's rather big (9 MB or so). PS: I don't advocate Scrabble cheating, this is just a programming exercise :)
def renskriv(opprinnelig_ord):
    nytt_ord = ''
    for bokstav in opprinnelig_ord:
        if bokstav.isupper() == True:
            nytt_ord = nytt_ord + bokstav
    return nytt_ord

def skriv_ny_fil(ny_liste):
    with open('NSF\\NSF_ny.txt', 'w') as f:
        for linje in ny_liste:
            f.write(linje + '\n')

def behandle_kildefil():
    innfil = open('NSF\\NSF_full.txt', 'r')
    f = innfil.read()
    kildeliste = f.split()
    ny_liste = []
    for item in kildeliste:
        nytt_ord = renskriv(item)
        nytt_ord = nytt_ord.strip('\n')
        ny_liste.append(nytt_ord)
    skriv_ny_fil(ny_liste)
    innfil.close()

def main():
    behandle_kildefil()

if __name__ == '__main__':
    main()
I think there must be some empty or newline-only items among your lines; try skipping them.
I suggest this code:
def write_new_file(input_list):
    with open('TEKST\\TEKST_ny.txt', 'w') as output_file:
        for linje in input_list:
            if linje.strip():  # skip items that are empty or only whitespace
                output_file.write(linje.strip() + '\n')
You've said in the comments that Python is writing two carriage return ('\r') characters for each line feed ('\n') character you write. It's a bit bizarre that Python is replacing each line feed with two carriage returns, but this is a feature of opening a file in text mode (normally the translation would be to something more useful). If instead you open your file in binary mode then this translation will not be done and the file should display as you wish in Notepad++. NB. Using binary mode may cause problems if you need characters outside the ASCII range -- ASCII is basically just Latin letters (no accents), digits and a few symbols.
For python 2 try:
filename = "somefile.txt"
with open(filename, mode="wb") as outfile:
    outfile.write("first line")
    outfile.write("\n")
    outfile.write("second line")
Python 3 will be a bit more tricky. Each string literal you wish to write must be prefixed with a b (for bytes). Each string you don't have immediate access to, or don't wish to change to a bytes literal, must be encoded using the encode() method on the string, e.g.:
filename = "somefile.txt"
with open(filename, mode="wb") as outfile:
    outfile.write(b"first line")
    outfile.write(b"\n")
    some_text = "second line"
    outfile.write(some_text.encode())
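In Python 3 there is also a way to stay in text mode: the built-in open() accepts a newline argument, and newline="\n" writes each '\n' exactly as given instead of translating it to the platform default. The sketch below assumes the doubled line endings really do come from newline translation, which the thread does not confirm; `write_words` and the file name are invented for the example.

```python
import os
import tempfile


def write_words(path, words):
    """Write one word per line, using a bare line feed on every platform."""
    # newline="\n" turns off the platform-specific translation of '\n' on write.
    with open(path, "w", encoding="utf-8", newline="\n") as output_file:
        for word in words:
            output_file.write(word.strip() + "\n")


# Demonstration in a temporary directory; the file name is arbitrary.
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "words.txt")
    write_words(target, ["WORD", "LINKS", "QUOTE"])
    with open(target, "rb") as check:
        print(check.read())  # b'WORD\nLINKS\nQUOTE\n'
```

Unlike binary mode, this keeps the encoding handling of text mode, so characters outside ASCII (such as Norwegian letters) are still written correctly.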
My code looks like this:
def storescores():
    hs = open("hst.txt","a")
    hs.write(name)
    hs.close()
so if I run it and enter "Ryan"
then run it again and enter "Bob"
the file hst.txt looks like
RyanBob
instead of
Ryan
Bob
How do I fix this?
If you want a newline, you have to write one explicitly. The usual way is like this:
hs.write(name + "\n")
This uses a backslash escape, \n, which Python converts to a newline character in string literals. It just concatenates your string, name, and that newline character into a bigger string, which gets written to the file.
It's also possible to use a multi-line string literal instead, which looks like this:
"""
"""
Or, you may want to use string formatting instead of concatenation:
hs.write("{}\n".format(name))
All of this is explained in the Input and Output chapter in the tutorial.
In Python >= 3.6 you can use the f-string literal feature:
with open('hst.txt', 'a') as fd:
    fd.write(f'\n{name}')
Note that the with statement automatically closes the file when the block exits.
All answers seem to work fine. If you need to do this many times, be aware that writing
hs.write(name + "\n")
constructs a new string in memory and appends that to the file.
More efficient would be
hs.write(name)
hs.write("\n")
which does not create a new string, just appends to the file.
The answer is not to add a newline after writing your string. That may solve a different problem. What you are asking is how to add a newline before you start appending your string. If you want to add a newline, but only if one does not already exist, you need to find out first, by reading the file.
For example,
with open('hst.txt') as fobj:
    text = fobj.read()
name = 'Bob'
with open('hst.txt', 'a') as fobj:
    if not text.endswith('\n'):
        fobj.write('\n')
    fobj.write(name)
You might want to add the newline after name, or you may not, but in any case, it isn't the answer to your question.
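For a large file, reading everything just to inspect the final character is wasteful. A possible variation, sketched here under the assumption that the file uses '\n' line endings, looks only at the last byte. The helper name `append_line` is hypothetical, and unlike the example above it also writes a trailing newline after the text.

```python
import os


def append_line(path, text):
    """Append `text` on its own line, adding a leading newline only if needed."""
    needs_newline = False
    if os.path.exists(path) and os.path.getsize(path) > 0:
        with open(path, "rb") as fobj:
            fobj.seek(-1, os.SEEK_END)  # jump to the last byte
            needs_newline = fobj.read(1) != b"\n"
    with open(path, "a") as fobj:
        if needs_newline:
            fobj.write("\n")
        fobj.write(text + "\n")
```

Because every call ends the file with a newline, later calls normally skip the extra write; the check only matters for files that were written some other way.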
I had the same issue, and I was able to solve it by using a format string.
file_name = "abc.txt"
new_string = "I am a new string."
opened_file = open(file_name, 'a')
# %s writes the text as-is; %r would write its repr(), including the quotes
opened_file.write("%s\n" % new_string)
opened_file.close()
I hope this helps.
There is also one thing you have to consider.
You should first check whether your file is empty before adding anything to it, because if it is empty you probably don't want a blank line at the beginning of the file. This code first checks whether the file is empty. If it is, it simply adds your input text to the file; otherwise it adds a new line and then your text. You should wrap os.path.getsize() in a try/except to catch any exceptions.
Code:
import os
def storescores():
    hs = open("hst.txt","a")
    if(os.path.getsize("hst.txt") > 0):
        hs.write("\n"+name)
    else:
        hs.write(name)
    hs.close()
I presume that all you are wanting is simple string concatenation:
def storescores():
    hs = open("hst.txt","a")
    hs.write(name + " ")
    hs.close()
Alternatively, change the " " to "\n" for a newline.
import subprocess
subprocess.check_output('echo "' + YOURTEXT + '" >> hello.txt',shell=True)
f=open("Python_Programs/files_forhndling.txt","a+")
inpt=str(input("Enter anything:\n>>"))
f.write(inpt)
f.write("\n")
print("Data inserted Successfully")
f.close()
welcome to file handling
new line
file handling in python
123456
78875454
You need to change the mode parameter from "a" to "a+".
Follow the code below:
def storescores():
    hs = open("hst.txt","a+")
Reading the million resources around the web, I'm getting more confused than helped, as I believe that there are many ways to do what I need to do with py.
So I hope some of you python gurus can lend me a hand.
What I need to do is the following:
Prompt user for input [INPUT]
Open an html file (simple, nothing too big)
Search for <a target="_top" href="http://website">Local website</a>
Replace http://website (which is never the same string) with [INPUT]
Write the file (as the same file opened)
Now, if I understand correctly I should use regex within python, is this correct?
My pseudo code (sorry, I know it looks terrible) would be:
var = raw_input("Enter input: ")
print var, "will be the new site"
import re
o = open("test.html","w")
data = open("test.html").read()
o.write( re.sub("<a target="_top" href="(*)">Local website</a>",var,data) )
o.close()
The above is probably not even the best way to do this, but it works without the regex part, doing a simple match-replace (where the match is always the same).
Any hint from you folks?
Your code looks pretty good. I just changed it a little. I wasn't entirely clear on what your question was, since your code seems to be functional. Hope it helps:
import re
INFILE = 'test.html'
OUTFILE = 'replaced.html'
new_site_name = raw_input('Enter input: ')
print new_site_name, 'will be the new site.'
# Non-greedy quantifiers keep the match inside a single <a> tag
pattern = '<a .*? href="(.+?)">.+?</a>'
# Quote the href value so the output is valid HTML
replacement = '<a target="_top" href="%s">Local website</a>' % new_site_name
with open(INFILE, 'r') as f:
    html_text = f.read()
with open(OUTFILE, 'w') as f:
    f.write(re.sub(pattern, replacement, html_text))
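For readers on Python 3, a narrower variant might look like the sketch below. It is only an illustration: it assumes the link text is always exactly "Local website", as in the question, and the names `LINK_PATTERN` and `replace_local_link` are made up here. It changes only the href value and leaves the other attributes of the tag untouched.

```python
import re

# Group 1: everything up to and including href=" ; group 2: the old URL
# (discarded); group 3: the rest of the tag, ending in the fixed link text.
LINK_PATTERN = re.compile(r'(<a\b[^>]*\bhref=")([^"]*)("[^>]*>Local website</a>)')


def replace_local_link(html_text, new_url):
    """Swap the href of the "Local website" anchor for `new_url`."""
    # A function replacement avoids backslashes in new_url being read as escapes.
    return LINK_PATTERN.sub(lambda m: m.group(1) + new_url + m.group(3), html_text)


html = '<p><a target="_top" href="http://website">Local website</a></p>'
print(replace_local_link(html, "http://example.org"))
```

Regular expressions are fragile on general HTML; for anything beyond a simple, known file like this one, an HTML parser is usually the safer choice.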