Save each line into a separate output file - python

I already have this code:
#!/usr/bin/env python
with open('honeyd.txt', 'r') as infile, open('test.rule', 'w') as outfile:
    for line in infile:
        outfile.write('alert {} {} -> {} {}\n'.format(*line.split()))
This code splits every line and saves the results into a single file.
My goal is to split all lines and save them into separate files, as many output files as there are lines in honeyd.txt: one line per output file. If I have 3 lines, each line must be saved into its own output file, so I get 3 output files. If I have 10 lines, I get 10 output files.
Can anyone help with it?

Assuming you're ok with sequential numbering for your filenames:
with open('honeyd.txt', 'r') as infile:
    for index, line in enumerate(infile, 1):
        with open('test.rule{}'.format(index), 'w') as outfile:
            outfile.write('alert {} {} -> {} {}\n'.format(*line.split()))
This will create files named test.rule1, test.rule2, etc.
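As a quick illustration, here is a minimal self-contained demo; the honeyd.txt contents are invented, since the question does not show the real file:
sample = "tcp 10.0.0.1 any 80\nudp 10.0.0.2 any 53\n"
with open('honeyd.txt', 'w') as f:
    f.write(sample)  # write a made-up two-line input

with open('honeyd.txt', 'r') as infile:
    for index, line in enumerate(infile, 1):
        with open('test.rule{}'.format(index), 'w') as outfile:
            outfile.write('alert {} {} -> {} {}\n'.format(*line.split()))
After running this, test.rule1 contains "alert tcp 10.0.0.1 -> any 80" and test.rule2 contains "alert udp 10.0.0.2 -> any 53".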

Try this:
with open('honeyd.txt') as f:
    lines = [line.strip().split() for line in f]  # a list of lists
for i in range(len(lines)):
    with open('test_{}.rule'.format(i), 'w') as f2:
        f2.write("alert {} {} -> {} {}\n".format(*lines[i]))

Related

Python: remove lines containing a specific char at the front

How do I remove lines from a txt file which start with ">"?
For example, the txt file has about 250k+ lines, and if I were to use the code below, it takes quite some time.
data = ""
with open(fileName) as f:
    for line in f:
        if ">" not in line:
            line = line.replace("\n", "")
            data += line
An example of the txt file is:
> version 1.0125 revision 0... # This is the line to be removed
some random line 1
some random line 2
> version 1.0126 revision 0... # This is the line to be removed
...
I have tried using data = f.read(), which is instant, but then data contains the lines that start with ">".
Any help is appreciated. Thank you :)
Not knowing what you want to do with the data afterwards, this should be fast and correct:
with open(fileName) as f:
    data = "".join(line for line in f if not line.startswith(">"))
If you just want to remove these lines from the file, I would honestly not do it in Python, but in your shell directly, e.g. on Linux:
$ grep -v '^>' original_file.txt >fixed_file.txt
If you insist on Python, do it on a line-by-line basis:
with open(original_file) as f:
    with open(new_file, "w") as g:
        for line in f:
            if not line.startswith(">"):
                g.write(line)
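If the goal is to replace the original file with the filtered version, here is a sketch using a temporary file and os.replace (Python 3.3+; the temporary-file name is an arbitrary choice):
import os

src = "original_file.txt"
tmp = src + ".tmp"  # arbitrary temporary name
with open(src) as f, open(tmp, "w") as g:
    for line in f:
        if not line.startswith(">"):
            g.write(line)
os.replace(tmp, src)  # atomically swap the filtered file into place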
Use two files, one for reading, the second for appending:
with open(fileName, 'r') as f, open(fileName.replace('.txt', '_1.txt'), 'a+') as df:
    for line in f.readlines():
        if not line.startswith('>'):
            df.write(line)

Making the reading and writing of text files quicker

I have the following code, where I read an input list, split each line on its forward slashes, and then append the variable evid to the evids list. Next, I open a file called evids.txt and write the evids to that file. How do I speed up/reduce the number of lines in this code? Thanks.
evids = []
with open('evid_list.txt', 'r') as infile:
    data = infile.readlines()
for i in data:
    evid = i.split('/')[2]
    evids.append(evid)
with open('evids.txt', 'w') as f:
    for i in evids:
        f.write("%s" % i)
with open('evid_list.txt', 'r') as infile, open('evids.txt', 'w') as ofile:
    for line in infile:
        ofile.write('{}\n'.format(line.split('/')[2]))
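One caveat, sketched under the assumption that malformed lines should be skipped: if a line has fewer than three '/'-separated parts, the [2] index raises IndexError, so a guard helps:
with open('evid_list.txt', 'r') as infile, open('evids.txt', 'w') as ofile:
    for line in infile:
        parts = line.rstrip('\n').split('/')
        if len(parts) > 2:  # skip lines without enough slashes
            ofile.write('{}\n'.format(parts[2]))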

Comparing two lines from two text files according to a single part of the text file

I have two text files and I want to write out two new text files according to whether there is a common section to each line in the two original text files.
The format of the text files is as follows:
commontextinallcases uniquetext2 potentiallycommontext uniquetext4
There are more than 4 columns, but you get the idea. I want to check the 'potentiallycommontext' part in each text file and, if they are the same, write out the whole line from each text file to a new text file, with each file's own unique text still in place.
Splitting it is fairly easy just using the .split() method when reading it in. I have found the following code:
with open('some_file_1.txt', 'r') as file1:
    with open('some_file_2.txt', 'r') as file2:
        same = set(file1).intersection(file2)
same.discard('\n')
with open('some_output_file.txt', 'w') as file_out:
    for line in same:
        file_out.write(line)
But I am not sure this would work for my case where I need to split the lines. Is there a way to do this I am missing?
Thanks
I don't think that this set approach is suitable for your case.
I'd try something like:
with open('some_file_1.txt', 'r') as file1, open('some_file_2.txt', 'r') as file2, open('some_output_file.txt', 'w') as file_out:
    for line1, line2 in zip(file1, file2):
        if line1.split()[2] == line2.split()[2]:
            file_out.write(line1)
            file_out.write(line2)
There might be shorter solutions, but this should work:
PCT_IDX = _  # find which index of line.split() corresponds to potentiallycommontext

def lines(filename):
    with open(filename, 'r') as file:
        for line in file:
            line = line.rstrip('\n')
            yield line

lines_1 = lines('some_file_1.txt')
lines_2 = lines('some_file_2.txt')

with open('some_output_file.txt', 'w') as file_out:
    for (line_1, line_2) in zip(lines_1, lines_2):
        maybe_cmn1 = line_1.split()[PCT_IDX]
        maybe_cmn2 = line_2.split()[PCT_IDX]
        if maybe_cmn1 == maybe_cmn2:
            file_out.write(line_1)
            file_out.write(line_2)
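Both answers above assume the matching lines sit on the same row in both files. If they might not, here is a sketch that first indexes file 1 by the common field (PCT_IDX = 2 is an assumption based on the example format):
PCT_IDX = 2  # assumed column of 'potentiallycommontext'

by_key = {}
with open('some_file_1.txt', 'r') as f1:
    for line in f1:
        key = line.split()[PCT_IDX]
        by_key.setdefault(key, []).append(line)

with open('some_file_2.txt', 'r') as f2, open('some_output_file.txt', 'w') as file_out:
    for line in f2:
        key = line.split()[PCT_IDX]
        for match in by_key.get(key, []):
            file_out.write(match)  # line from file 1
            file_out.write(line)   # matching line from file 2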

Python read .txt and split words after symbol #

I have a large 11 GB .txt file with email addresses. I would like to save only the part of each address before the # symbol, one result per line. My output only generates the first line. I have reused this code from an earlier project. I would like to save the output to a different .txt file. I hope someone can help me out.
my code:
import re

def get_html_string(file, start_string, end_string):
    answer = "nothing"
    with open(file, 'rb') as open_file:
        for line in open_file:
            line = line.rstrip()
            if re.search(start_string, line):
                answer = line
                break
    start = answer.find(start_string) + len(start_string)
    end = answer.find(end_string)
    #print(start, end, answer)
    return answer[start:end]

beginstr = ''
end = '#'
file = 'test.txt'
readstring = str(get_html_string(file, beginstr, end))
print readstring
Your file is quite big (11 GB), so you shouldn't keep all those strings in memory. Instead, process the file line by line and write each result before reading the next line.
This should be efficient:
with open('test.txt', 'r') as input_file:
    with open('result.txt', 'w') as output_file:
        for line in input_file:
            prefix = line.split('#')[0]
            output_file.write(prefix + '\n')
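A minor variant (my suggestion, not part of the answer above): str.partition stops at the first '#' and avoids building the full list of pieces:
with open('test.txt', 'r') as input_file:
    with open('result.txt', 'w') as output_file:
        for line in input_file:
            prefix, _, _ = line.partition('#')  # text before the first '#'
            output_file.write(prefix + '\n')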
If your file looks like this example:
user#google.com
user2#jshds.com
Useruser#jsnl.com
You can use this:
def get_email_name(file_name):
    with open(file_name) as file:
        lines = file.readlines()
    result = list()
    for line in lines:
        result.append(line.split('#')[0])
    return result
get_email_name('emails.txt')
Out:
['user', 'user2', 'Useruser']
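A caveat worth adding: readlines() loads the entire file into memory, which will not work for an 11 GB input. A generator sketch that yields one name at a time instead:
def iter_email_names(file_name):
    with open(file_name) as file:
        for line in file:  # one line in memory at a time
            yield line.split('#')[0]

for name in iter_email_names('emails.txt'):
    print(name)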

join separate files with python

I want to join 100 different files into one.
Example of file with data:
example1.txt is in this format:
something
something
something
example2.txt is in this format:
something
something
something
All 100 files have the same format of data and share a common name, example1 ... example100: the "example" part is the same and only the number changes.
from itertools import chain

infiles = [open('{}_example.txt'.format(i+1), 'r') for i in xrange(113)]
with open('example.txt', 'w') as fout:
    for lines in chain(*infiles):
        fout.write(lines)
I used this, but the problem is that the first line of the next file gets joined with the last line of the previous file.
If you have 100 files, better to just use an array of files:
from itertools import izip_longest

separator = ' '  # assumed value; the original answer leaves this undefined

infiles = [open('example{}.txt'.format(i+1), 'r') for i in xrange(100)]
with open('Join.txt', 'w') as fout:
    for lines in izip_longest(*infiles, fillvalue=''):
        lines = [line.rstrip('\n') for line in lines]
        print >> fout, separator.join(lines)
I would open a new file as writable, Join.txt, and then loop through the files you want with range(1, 101):
with open('Join.txt', 'w') as join:
    for i in range(1, 101):  # 1 through 100 inclusive
        with open('example{}.txt'.format(i), 'r') as f:
            for line in f:
                join.write(line)
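Another option, as a sketch of my own rather than part of either answer: shutil.copyfileobj copies each file into the output in chunks, avoiding per-line Python overhead:
import shutil

with open('Join.txt', 'wb') as fout:
    for i in range(1, 101):
        with open('example{}.txt'.format(i), 'rb') as fin:
            shutil.copyfileobj(fin, fout)  # chunked binary copy
Note that, like the code in the question, this will run two files together if one does not end with a newline.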
