Check if there are new strings in a txt file - python

I am trying to make a function which will compare two txt files. If it recognizes new lines that are in one file but not in the other, it will add them in a list and also in that file that does not contain those new lines. It fails to do that. Here is my function. What am I doing wrong?
newLinks = []
def newer():
    with open('cnbcNewLinks.txt', 'w') as newL:
        for line in open('cnbcCleanedLinks.txt'):
            if line not in "cnbcNewLinks.txt":
                newLinks.append(line)
                newL.write(line)
            else:
                continue
        cleaned = ''.join(newLinks)
        print(cleaned)

I put into Python code what @Alex suggested.
See the documentation for set.
I replaced your text file names with a.txt and b.txt to make it easier to read.
# First read the files and compare them using `set`
with open('a.txt', 'r') as newL, open('b.txt', 'r') as cleanL:
    a = set(newL)
    b = set(cleanL)
add_to_cleanL = list(a - b) # lines in newL that are not in cleanL
add_to_newL = list(b - a) # lines in cleanL that are not in newL
# Then open in append mode to add at the end of each file
with open('a.txt', 'a') as newL, open('b.txt', 'a') as cleanL:
    newL.write(''.join(add_to_newL)) # append the missing lines at the end of newL
    cleanL.write(''.join(add_to_cleanL)) # append the missing lines at the end of cleanL

If the files are not big, read each one into a list,
convert both lists to sets, and take the set difference (a - b, or set.difference) twice, once in each direction.
Then append each difference to the file that is missing those lines.
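As for what goes wrong in the question's function: `line not in "cnbcNewLinks.txt"` tests whether the line is a substring of the literal file name, not of the file's contents, and opening a file with 'w' empties it before anything can be compared. Below is a minimal sketch that keeps the original line order, assuming both files exist and fit in memory. The function and parameter names are illustrative and do not come from the question.

```python
def append_new_lines(source_path, target_path):
    """Append lines of source_path that target_path lacks; return them."""
    # Read the target first, in 'r' mode, so its contents are not truncated.
    with open(target_path, 'r') as target:
        content = target.read()
    seen = set(content.splitlines())

    new_lines = []
    with open(source_path, 'r') as source:
        for line in source:
            text = line.rstrip('\n')
            if text and text not in seen:
                seen.add(text)  # also skips repeats within the source file
                new_lines.append(text)

    if new_lines:
        # Append mode adds to the end of the file instead of overwriting it.
        with open(target_path, 'a') as target:
            if content and not content.endswith('\n'):
                target.write('\n')  # keep the first new line on its own line
            target.writelines(text + '\n' for text in new_lines)
    return new_lines
```

Calling append_new_lines('cnbcCleanedLinks.txt', 'cnbcNewLinks.txt') would then return the list of newly found lines and leave them appended to cnbcNewLinks.txt.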

Related

Read content of txt files into lists to find duplicates

I'm new to Python.
My code should read 2 different .txt files into lists and compare them to find and delete duplicates.
Code
import os
dir = os.listdir
T = "Albums"
if T not in dir():
    os.mkdir("Albums")
with open('list.txt','w+') as f:
    linesA = f.readlines()
    print(linesA) # output empty
with open('completed.txt','w+') as t:
    linesB = t.readlines()
    print(linesB) # output empty
for i in linesA[:]:
    if i in linesB:
        linesA.remove(i)
print(linesA)
print(linesB)
I tried the code above with the following inputs:
in list.txt I wrote (on separate lines) A, B and C.
in completed.txt I wrote (also on separate lines) A and B.
It should first have output the contents of the lists, which were empty for some reason.
Why are the lists empty after reading?
Does this help?
I suggest using not os.path.exists(entry) instead of not entry in os.listdir(). It's not relevant to the problem, but I point it out anyway. (Also, you overwrote the built-in dir function.)
I've split the file contents into lines using split("\n").
I've changed the mode the files are opened with to r+, which doesn't clear the file, unlike w+.
Please note that if you want to use readlines, you have to remove the trailing newline from each entry.
import os
with open('list.txt','w+') as file:
    file.write("Foo\n")
    file.write("Bar")
with open('completed.txt','w+') as file:
    file.write("Bar\n")
    file.write("Python")
T = "Albums"
if not os.path.exists(T):
    os.mkdir("Albums")
with open('list.txt','r+') as f:
    linesA = f.read().split("\n")
    print(linesA)
with open('completed.txt','r+') as t:
    linesB = t.read().split("\n")
    print(linesB)
for entry in list(linesA):
    if entry in linesB:
        linesA.remove(entry)
print(linesA)
print(linesB)
Output:
['Foo', 'Bar']
['Bar', 'Python']
['Foo']
['Bar', 'Python']
This line is misleading:
dir = os.listdir
It assigns a reference to the os.listdir function without calling it,
and it shadows the built-in dir. The later dir() call does work,
but it hides what is actually being called.
Better to dispense with dir and just phrase it this way:
if T not in os.listdir():
The real problem is here:
with open('list.txt','w+') as f:
    linesA = f.readlines()
...
with open('completed.txt','w+') as t:
    linesB = t.readlines()
You wanted to open those with 'r' read mode,
rather than write. Mode 'w+' truncates the file as soon as it is opened, so there is nothing left to read and both lists come back empty.
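Combining the points from both answers, here is a sketch of the duplicate removal that opens the files in read mode and strips the newlines before comparing. The helper names read_entries and remove_completed are hypothetical; the file names in the usage note are the ones from the question.

```python
def read_entries(path):
    """Return the non-empty lines of a text file without trailing newlines."""
    with open(path, 'r') as f:  # 'r' leaves the file contents untouched
        return [line.strip() for line in f if line.strip()]


def remove_completed(list_path, completed_path):
    """Return the entries of list_path that do not appear in completed_path."""
    completed = set(read_entries(completed_path))
    return [entry for entry in read_entries(list_path) if entry not in completed]
```

With list.txt containing A, B and C and completed.txt containing A and B, remove_completed('list.txt', 'completed.txt') returns ['C'].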

generating list by reading from file

I want to generate a list of server addresses and credentials by reading from a file, as a single list split on the newlines in the file.
The file is in this format:
login:username
pass:password
destPath:/directory/subdir/
ip:10.95.64.211
ip:10.95.64.215
ip:10.95.64.212
ip:10.95.64.219
ip:10.95.64.213
The output I want looks like this:
[['login:username', 'pass:password', 'destPath:/directory/subdirectory', 'ip:10.95.64.211;ip:10.95.64.215;ip:10.95.64.212;ip:10.95.64.219;ip:10.95.64.213']]
I tried this:
with open('file') as f:
    credentials = [x.strip().split('\n') for x in f.readlines()]
and this returns lists within a list:
[['login:username'], ['pass:password'], ['destPath:/directory/subdir/'], ['ip:10.95.64.211'], ['ip:10.95.64.215'], ['ip:10.95.64.212'], ['ip:10.95.64.219'], ['ip:10.95.64.213']]
I am new to Python. How can I split on the newline character and create a single list? Thank you in advance.
You could do it like this:
with open('servers.dat') as f:
    L = [[line.strip() for line in f]]
print(L)
Output
[['login:username', 'pass:password', 'destPath:/directory/subdir/', 'ip:10.95.64.211', 'ip:10.95.64.215', 'ip:10.95.64.212', 'ip:10.95.64.219', 'ip:10.95.64.213']]
Just use a list comprehension to read the lines. You don't need to split on \n, as the regular file iterator reads line by line. The double list is a bit unconventional; just remove the outer [] if you decide you don't want it.
I just noticed you wanted the list of IP addresses joined into one string. That wasn't clear, as it's off the screen in the question and you make no attempt to do it in your own code.
To do that, read the first three lines individually using next, then join the remaining lines using ; as your delimiter.
def reader(f):
    yield next(f)
    yield next(f)
    yield next(f)
    yield ';'.join(ip.strip() for ip in f)
with open('servers.dat') as f:
    L2 = [[line.strip() for line in reader(f)]]
For which the output is
[['login:username', 'pass:password', 'destPath:/directory/subdir/', 'ip:10.95.64.211;ip:10.95.64.215;ip:10.95.64.212;ip:10.95.64.219;ip:10.95.64.213']]
It does not match your expected output exactly, as your expected output has a typo: 'destPath:/directory/subdirectory' instead of the 'destPath:/directory/subdir/' in the data.
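The reader generator relies on the first three lines always being login, pass and destPath. If that order is not guaranteed, one possible alternative is to group lines by their prefix instead. This is only a sketch: the function name read_credentials is made up for illustration, and it assumes every address line starts with 'ip:'.

```python
def read_credentials(path):
    """Return [[non-ip lines..., 'ip:a;ip:b;...']] from a key:value file."""
    fields, ips = [], []
    with open(path) as f:
        for raw in f:
            line = raw.strip()
            if not line:
                continue  # ignore blank lines
            if line.startswith('ip:'):
                ips.append(line)
            else:
                fields.append(line)
    if ips:
        # All address lines collapse into one ';'-separated string.
        fields.append(';'.join(ips))
    return [fields]
```

Because the ip lines are collected separately, they end up joined at the end of the inner list regardless of where they appear in the file.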
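This should work, wrapped in a function so that return is valid:
def read_lines(filename='file'):
    arr = []
    with open(filename) as f:
        for line in f:
            arr.append(line.strip()) # drop the trailing newline
    return [arr]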
You could just treat the file as a list and iterate through it with a for loop:
arr = []
with open('file', 'r') as f:
    for line in f:
        arr.append(line.strip('\n'))

importing from a text file to a dictionary

filename:dictionary.txt
YAHOO:YHOO
GOOGLE INC:GOOG
Harley-Davidson:HOG
Yamana Gold:AUY
Sotheby’s:BID
inBev:BUD
code:
infile = open('dictionary.txt', 'r')
content= infile.readlines()
infile.close()
counters ={}
for line in content:
    counters.append(content)
print(counters)
I am trying to import the contents of the file into a dictionary. I have searched through Stack Overflow, but please give an answer in a simple way (not with open...).
First off, instead of opening and closing the file explicitly, you can use a with statement, which closes the file automatically at the end of the block.
Secondly, as file objects are iterator-like objects (one-shot iterables), you can loop over the lines and split them at the : character. You can do all of this in a generator expression passed to the dict function:
with open('dictionary.txt') as infile:
    my_dict = dict(line.strip().split(':') for line in infile)
I assume that you don't have colons in your keys.
In that case you should:
# read the lines from your file
infile = open('dictionary.txt')
lines = infile.read().split('\n')
infile.close()
# create an empty dictionary (not named dict, which would shadow the built-in)
counters = {}
# split every line at ':' and use the left element as the key for the right value
for l in lines:
    if ':' in l: # skip empty lines, such as the one after a trailing newline
        content = l.split(':')
        counters[content[0]] = content[1]
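The dict(...) one-liner needs every line to split into exactly two parts; a blank line, or a line with a second colon, raises a ValueError. A hedged sketch that tolerates both follows. The name load_pairs is illustrative, the utf-8 encoding is an assumption about how the file was saved (Python 3 syntax), and with is used only to guarantee that the file gets closed.

```python
def load_pairs(path):
    """Build a dict from 'key:value' lines, ignoring lines without a colon."""
    pairs = {}
    with open(path, encoding='utf-8') as infile:
        for line in infile:
            # partition splits at the first ':' only and never raises.
            key, sep, value = line.strip().partition(':')
            if sep:  # sep is '' when the line contains no colon
                pairs[key.strip()] = value.strip()
    return pairs
```

For the sample file, load_pairs('dictionary.txt')['GOOGLE INC'] would give 'GOOG'.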

Opening a file in Python

Question:
How can I open a file in Python that contains one integer value per line, have Python read the file, store the data in a list, and then print the list?
I have to ask the user for a file name and then do everything above. The contents of the file entered by the user will be used as 'alist' in the function below.
Thanks
def selectionSort(alist):
    for index in range(0, len(alist)):
        ismall = index
        for i in range(index,len(alist)):
            if alist[ismall] > alist[i]:
                ismall = i
        alist[index], alist[ismall] = alist[ismall], alist[index]
    return alist
I think this is exactly what you need:
file = open('filename.txt', 'r')
lines = [int(line.strip()) for line in file.readlines()]
print(lines)
I didn't use a with statement here, as I was not sure whether or not you intended to use the file further in your code.
EDIT: You can just assign an input to a variable...
filename = input('Enter file path: ')
And then the above stuff, except open the file using that variable as a parameter...
file = open(filename, 'r')
Finally, submit the list lines to your function, selectionSort.
selectionSort(lines)
Note: This will only work if the file already exists, but I am sure that is what you meant, as there would be no point in creating a new one since it would be empty. Also, if the file specified is not in the current working directory, you would need to specify the full path, not just the filename.
Easiest way to open a file in Python and store its contents in a string:
with open('file.txt') as f:
    contents = f.read()
For your problem:
with open('file.txt') as f:
    values = [int(line) for line in f.readlines()]
print(values)
Edit: As noted in one of the other answers, the file f is only open within the indented with-block. This construction closes the file automatically, even if an error occurs, which you would otherwise have to handle with a try/finally construct.
You can read the integers into a single string or into a list.
file = open('file.txt', mode = 'r')
values = file.read()
Here values is a single string, which can be printed directly.
file = open('file.txt', mode = 'r')
values = file.readlines()
Here values is a list with one string per line, each still ending in '\n', so convert the entries with int() before treating them as numbers.
f.readlines() reads all the lines of your file into memory at once, but what if your file contains a lot of lines?
You can try this instead:
new_list = [] ## start a list variable
with open('filename.txt', 'r') as f:
    for line in f:
        ## remove '\n' from the end of the line
        line = line.strip()
        ## store each line as an integer in the list variable
        new_list.append(int(line))
print(new_list)
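To tie the pieces of this thread together, here is a small sketch of a helper that reads one integer per line and skips blank lines, so a trailing empty line does not break int(). The name read_ints is illustrative; selectionSort is the function from the question.

```python
def read_ints(path):
    """Read one integer per line from a text file, skipping blank lines."""
    numbers = []
    with open(path, 'r') as f:
        for line in f:
            line = line.strip()
            if line:  # int('') would raise ValueError
                numbers.append(int(line))
    return numbers


# Possible usage together with the question's selectionSort:
#   filename = input('Enter file path: ')
#   print(selectionSort(read_ints(filename)))
```

A line that is neither blank nor a valid integer still raises ValueError, which is probably the desired behaviour for malformed input.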

How can I append to the new line of a file while using write()?

In Python:
Let's say I have a loop, and during each cycle I produce a list with the following format:
['n1','n2','n3']
After each cycle I would like to append the produced entry to a file (which contains all the outputs from the previous cycles). How can I do that?
Also, is there a way to make a list whose entries are the outputs of this cycle? i.e.
[[],[],[]] where each internal []=['n1','n2','n3'] etc
Writing single list as a line to file
You can write it to a file after converting it to a string:
with open('some_file.dat', 'w') as f: # use 'a' instead of 'w' to keep what the file already contains
    for x in range(10): # assume 10 cycles
        line = []
        # ... (here is your code, appending data to line) ...
        f.write('%r\n' % line) # here you write the representation on a separate line
Writing all lines at once
When it comes to the second part of your question:
Also, is there a way to make a list whose entries are the outputs of this cycle? i.e. [[],[],[]] where each internal []=['n1','n2','n3'] etc
this is also pretty basic. Assuming you want to save it all at once, just write:
lines = [] # container for a list of lines
for x in range(10): # assume 10 cycles
    line = []
    # ... (here is your code, appending data to line) ...
    lines.append('%r\n' % line) # here you add line to the list of lines
# here "lines" is your list of cycle results
with open('some_file.dat', 'w') as f:
    f.writelines(lines)
Better way of writing a list to file
Depending on what you need, you should probably use a more specialized format than a plain text file. Instead of writing list representations (which are okay, but not ideal), you could use e.g. the csv module (a format similar to an Excel spreadsheet): http://docs.python.org/3.3/library/csv.html
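As a rough illustration of the csv suggestion, the sketch below appends one list per call as a row and reads all rows back. The helper names append_row and read_rows are made up for this example, and the code assumes Python 3, where open() accepts the newline argument.

```python
import csv


def append_row(path, row):
    """Append one list as a CSV row at the end of the file."""
    # 'a' creates the file if needed and always writes at the end;
    # newline='' lets the csv module control the line endings itself.
    with open(path, 'a', newline='') as f:
        csv.writer(f).writerow(row)


def read_rows(path):
    """Read every row back as a list of strings."""
    with open(path, newline='') as f:
        return list(csv.reader(f))
```

Calling append_row('some_file.csv', ['n1', 'n2', 'n3']) once per cycle keeps the earlier rows, and read_rows('some_file.csv') then returns the list of lists asked about in the second part of the question.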
In f = open(file, 'a'), the first parameter is the path of the file and the second is the mode: 'a' is append, 'w' is write, 'r' is read, and so on.
In my opinion, you can use f.write(str(lst) + '\n') to write one line per loop cycle; alternatively, f.writelines(lst) also works if lst contains only strings.
Hope this can help you:
lVals = []
with open(filename, 'a') as f:
    for x,y,z in zip(range(10), range(5, 15), range(10, 20)):
        lVals.append([x,y,z])
        f.write(str(lVals[-1]) + '\n') # one list per line
