Create a hash table from every 2 elements of a file in Python

I'm trying to combine every 2 elements of a txt file, and hash it to create a hash table, using Python. My code is as below:
import hashlib
def SHA1_hash(string):
    hash_obj = hashlib.sha1(string.encode())
    return(hash_obj.hexdigest())
with open("/Users/admin/Downloads/Project_files/dictionary.txt") as f:
    text_file = open("/Users/admin/Downloads/Project_files/text_combined.txt", "w",encoding = 'utf-8')
    for i in f.readlines():
        for j in f.readlines():
            text_c = i.strip() + j.strip()
            n = text_file.write(SHA1_hash(text_c) + "\n")
    text_file.close()
The file is 64KB (more than 5700 lines). I tried to run the code but it is not working nor showing any errors. The destination file (text_combined.txt) did not have anything either. Can I ask if I am doing it right or wrong?
I am new to Python as well as programming so please excuse me if I ask any bad questions. Thank you so much.

The second f.readlines() has nothing to read, because you've already read the entire file.
Read the file into a list variable, then iterate through the list.
with open("/Users/admin/Downloads/Project_files/dictionary.txt") as f, open("/Users/admin/Downloads/Project_files/text_combined.txt", "w", encoding='utf-8') as text_file:
    lines = f.readlines()
    for i in lines:
        for j in lines:
            text_c = i.strip() + j.strip()
            text_file.write(SHA1_hash(text_c) + "\n")
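To try the idea without touching the real dictionary file, here is a minimal, self-contained sketch. The helper names `sha1_hex` and `hash_all_pairs` and the sample words are made up for illustration, and it assumes the goal is the SHA-1 of every ordered pair of lines, as in the question.

```python
import hashlib
from itertools import product


def sha1_hex(text):
    """Return the SHA-1 hex digest of a string (UTF-8 encoded)."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def hash_all_pairs(words):
    """Map every ordered concatenation of two words to its SHA-1 digest.

    hash_all_pairs is a hypothetical helper, not part of hashlib.
    """
    table = {}
    for first, second in product(words, repeat=2):
        combined = first + second
        table[combined] = sha1_hex(combined)
    return table


# Sample data standing in for the stripped lines of dictionary.txt.
sample_words = ["ab", "c"]
pair_table = hash_all_pairs(sample_words)
for combined, digest in pair_table.items():
    print(combined, digest)
```

Keep in mind that 5,700 lines give roughly 32 million pairs, so for the real file it is probably better to write each digest straight to the output file, as in the loop above, than to hold a dictionary of that size in memory.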

Related

How to output ONLY new additions between files with Python Difflib?

I am comparing two text files using Difflib like so:
import difflib
new_file = open(file_name, "r")
old_file = open(old_file_name, "r")
file_difference = difflib.ndiff(old_file.readlines(), new_file.readlines())
My goal is to ONLY output additions. I do not want to know about changes to existing lines. However, I've run into a problem where all changes/additions are marked with "+ ", and all subtractions are marked with "- ". I've done a lot of searching, and it appears there's no way to differentiate a line that has been changed, and a line that is brand new. I am confused on how to proceed.
# Read the old file into a set of lines for fast membership tests
with open(old_file_name, "r") as f1:
    old_lines = set(f1.readlines())
with open(file_name, "r") as f2:
    new_lines = f2.readlines()
# Append every line of the new file that does not occur in the old file
with open(output_path, "a") as out:
    for line in new_lines:
        if line not in old_lines:
            out.write(line)
A great friend of mine provided a code snippet that answered my question:
# Open the files for comparison
with open(file_name, "r") as new_file:
    with open(old_file_name, "r") as old_file:
        # Find the differences between the two files
        file_difference = difflib.ndiff(old_file.readlines(), new_file.readlines())
        new_lines = []
        file_difference = tuple(x for x in file_difference)
        idx = 0
        fdiff_size = len(file_difference)
        while idx < fdiff_size:
            line = file_difference[idx]
            if line.startswith("- "):
                if idx + 1 < fdiff_size and file_difference[idx + 1].startswith("? "):
                    # this chunk is a change, so ignore this and the next 3 lines
                    idx += 4
                    continue
            elif line.startswith("+ "):
                new_lines.append(line)
            # always iterate after new item or no change
            idx += 1
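An alternative that avoids counting ndiff marker lines, offered as a sketch rather than a drop-in replacement: difflib.SequenceMatcher reports 'insert' and 'replace' as separate opcodes, so pure additions can be picked out directly. The function name `added_lines` and the sample data are invented for illustration. What counts as an insertion versus a replacement follows SequenceMatcher's own matching, so a new line sitting directly next to a changed line may be grouped into a 'replace' block.

```python
import difflib


def added_lines(old_lines, new_lines):
    """Return lines that SequenceMatcher classifies as pure insertions.

    added_lines is a hypothetical helper; it skips 'replace' blocks,
    which is where modified lines end up.
    """
    matcher = difflib.SequenceMatcher(None, old_lines, new_lines)
    result = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        if tag == "insert":
            result.extend(new_lines[j1:j2])
    return result


old = ["alpha\n", "beta\n", "gamma\n"]
new = ["alpha\n", "BETA\n", "gamma\n", "delta\n"]
print(added_lines(old, new))
```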

Adding a new string to the end of a specific line in a text file

I'm new to python hence I am unable to implement the solutions I've found online in order to fix my problem.
I am trying to add a specific string to the end of a specific line to a textfile. As I understand text commands, I must overwrite the file if I don't want to append to the end of it. So, my solution is as follows:
ans = 'test'
numdef = ['H',2]
f = open(textfile, 'r')
lines = f.readlines()
f.close()
f = open(textfile, 'w')
f.write('')
f.close()
f = open(textfile, 'a')
for line in lines:
    if int(line[0]) == numdef[1]:
        if str(line[2]) == numdef[0]:
            k = ans+ line
            f.write(k)
    else:
        f.write(line)
Basically, I am trying to add variable ans to the end of a specific line, the line which appears in my list numdef. So, for example, for
2 H: 4,0 : Where to search for information : google
I want
2 H: 4,0 : Where to search for information : google test
I have also tried using line.insert() but to no avail.
I understand using the 'a' function of the open command is not so relevant and helpful here, but I am out of ideas. Would love tips with this code, or if maybe I should scrap it and rethink the whole thing.
Thank you for your time and advice!
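When you use the method
lines = f.readlines()
every line in the returned list keeps its trailing "\n" (Python does not strip it), so anything appended after it ends up on the next line.
Instead of:
k = line+ans
try the following, which strips the newline, appends ans, and then puts the newline back:
k = line.rstrip('\n') + ans + '\n'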
Good luck!
Try this. You don't have an else case if it meets the first requirement but not the other.
ans = 'test'
numdef = ['H',2]
f = open(textfile, 'r')
lines = f.readlines()
f.close()
f = open(textfile, 'w')
f.write('')
f.close()
f = open(textfile, 'a')
for line in lines:
    if int(line[0]) == numdef[1] and str(line[2]) == numdef[0]:
        # strip the newline, append ans, then restore the newline
        k = line.replace('\n','')+ans+'\n'
        f.write(k)
    else:
        f.write(line)
f.close()
Better way:
# initialize variables
ans = 'test'
numdef = ['H',2]
# open file in read mode, add lines into lines
with open(textfile, 'r') as f:
    lines=f.readlines()
# open file in write mode, override everything
with open(textfile, 'w') as f:
    # In the list comprehension, loop through each line in lines. If both
    # conditions are true, take the line, remove all newlines, and add ans;
    # otherwise just remove the newlines. Then combine the list into a string
    # with newlines as separators ('\n'.join) and write it to the file.
    f.write('\n'.join([line.replace('\n','')+ans if int(line[0]) == numdef[1] and str(line[2]) == numdef[0] else line.replace('\n','') for line in lines]))
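A variant that might be easier to test, sketched on a throwaway file. The helper name `append_to_matching_lines` is hypothetical, and it assumes whitespace separates the leading number from the letter, as in the sample line from the question. Splitting into fields also copes with line numbers of more than one digit, which indexing with line[0] does not.

```python
import os
import tempfile


def append_to_matching_lines(path, number, letter, suffix):
    """Append suffix to each line that starts with '<number> <letter>'.

    Hypothetical helper; assumes the line looks like '2 H: 4,0 : ...'.
    """
    with open(path, "r") as f:
        lines = f.read().splitlines()

    updated = []
    for line in lines:
        fields = line.split()
        matches = (
            len(fields) >= 2
            and fields[0] == str(number)
            and fields[1].startswith(letter)
        )
        updated.append(line + " " + suffix if matches else line)

    # Rewrite the whole file, one newline-terminated line at a time.
    with open(path, "w") as f:
        f.writelines(line + "\n" for line in updated)


# Demonstration on a temporary file so no real data is touched.
fd, demo_path = tempfile.mkstemp(suffix=".txt")
os.close(fd)
with open(demo_path, "w") as f:
    f.write("1 A: 1,0 : first : foo\n")
    f.write("2 H: 4,0 : Where to search for information : google\n")

append_to_matching_lines(demo_path, 2, "H", "test")
with open(demo_path) as f:
    demo_result = f.read().splitlines()
os.remove(demo_path)
print(demo_result)
```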

Counting to 100,000 and writing that to a file

I haven't used Python for a while but I decided to create a program today to help me with some work I am trying to do. I am trying to create a program that writes the numbers 1-100,000 with the symbol | after each but can't seem to strip the file after I create it so it shows like this: 1|2|3|4.
My Code:
a = 0
b = "|"
while a < 100000:
    a += 1 # Same as a = a + 1
    new = (a,b)
    f = open("export.txt","a") #opens file with name of "export.txt"
    f.write(str(new))
    f.close()
infile = "export.txt"
outfile = "newfile.txt"
delete_list = ["(","," "'"]
fin = open(infile)
fout = open(outfile, "w+")
for line in fin:
    for word in delete_list:
        line = line.replace(word, "")
    fout.write(line)
fin.close()
fout.close()
It looks like you're doing a lot of work unnecessarily.
If all you want is a file that has the numbers 0-99999 with | after each, you could do:
delim = "|"
with open('export.txt', 'w') as f:
    for a in range(100000):
        f.write("%d%s" % (a, delim))
I'm not sure what the purpose of the second file is, but, in general, to open one file to read from and a second to write to, you could do:
with open('export.txt', 'r') as fi:
    with open('newfile.txt', 'w') as fo:
        for line in fi:
            for word in line.split('|'):
                print(word)
                fo.write(word)
Note that there are no newlines in the original file, so for line in fi is actually reading the entire contents of "export.txt" -- this could cause issues.
Try this for writing your file:
numbers = []
for x in range(1,100001):
    numbers.append(str(x))
f = open('export.txt', 'w')
f.write('|'.join(numbers))
f.close()
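The join approach can be checked on a small range first. A minimal sketch, where the function name `write_numbers` is made up and a temporary file stands in for export.txt; join puts the delimiter between the numbers only, so the result has the 1|2|3|4 shape from the question with no trailing |.

```python
import os
import tempfile


def write_numbers(path, last, delimiter="|"):
    """Write 1..last to path, separated (not terminated) by delimiter."""
    with open(path, "w") as f:
        f.write(delimiter.join(str(n) for n in range(1, last + 1)))


# Small demonstration run; use last=100000 for the real file.
fd, demo_path = tempfile.mkstemp(suffix=".txt")
os.close(fd)
write_numbers(demo_path, 5)
with open(demo_path) as f:
    demo_text = f.read()
os.remove(demo_path)
print(demo_text)
```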

Python Insert text before a specific line

I want to insert the text 'Hello Everyone' before the first line starting with 'Number'.
My code:
import re
result = []
with open("text2.txt", "r+") as f:
    a = [x.rstrip() for x in f] # stores all lines from f into an array and removes "\n"
    # Find the first occurance of "Centre" and store its index
    for item in a:
        if item.startswith("Number"): # same as your re check
            break
    ind = a.index(item) #here it produces index no./line no.
    result.extend(a[:ind])
    f.write('Hello Everyone')
Text file:
QWEW
RW
...
Number hey
Number ho
Expected output:
QWEW
RW
...
Hello Everyone
Number hey
Number ho
Please help me fix my code: nothing gets inserted into my text file. Please help!
Answers will be appreciated!
The problem
When you do open("text2.txt", "r"), you open your file for reading, not for writing. Therefore, nothing appears in your file.
The fix
Using r+ instead of r allows you to also write to the file (this was also pointed out in the comments). However, writing overwrites the existing content at the current position instead of inserting, so be careful (this is an OS limitation). The following should do what you desire: it inserts "Hello everyone" into the list of lines and then overwrites the file with the updated lines.
with open("text2.txt", "r+") as f:
    a = [x.rstrip() for x in f]
    index = 0
    for item in a:
        if item.startswith("Number"):
            a.insert(index, "Hello everyone") # Inserts "Hello everyone" into `a`
            break
        index += 1
    # Go to start of file and clear it
    f.seek(0)
    f.truncate()
    # Write each line back
    for line in a:
        f.write(line + "\n")
The correct answer to your problem is the hlt one, but consider also using the fileinput module:
import fileinput
found = False
# with inplace=True, everything printed inside the loop is written back to the file
for line in fileinput.input('DATA', inplace=True):
    if not found and line.startswith('Number'):
        print('Hello everyone')
        found = True
    print(line, end='')
This is basically the same as another question, where the proposed approach has three steps: read everything, insert, rewrite everything.
with open("/tmp/text2.txt", "r") as f:
    lines = f.readlines()
for index, line in enumerate(lines):
    if line.startswith("Number"):
        break
lines.insert(index, "Hello everyone !\n")
with open("/tmp/text2.txt", "w") as f:
    f.writelines(lines)
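The read / insert / rewrite steps can be wrapped in a small function. This is only a sketch: the name `insert_before` is hypothetical, and it assumes the file should be left untouched when no line starts with the prefix (the for/else branch handles that case). A temporary file stands in for text2.txt.

```python
import os
import tempfile


def insert_before(path, prefix, new_line):
    """Insert new_line before the first line starting with prefix.

    Hypothetical helper; leaves the file untouched if no line matches.
    Returns True when an insertion happened.
    """
    with open(path, "r") as f:
        lines = f.readlines()

    for index, line in enumerate(lines):
        if line.startswith(prefix):
            lines.insert(index, new_line + "\n")
            break
    else:
        return False  # loop finished without finding a match

    with open(path, "w") as f:
        f.writelines(lines)
    return True


# Demonstration with the sample data from the question.
fd, demo_path = tempfile.mkstemp(suffix=".txt")
os.close(fd)
with open(demo_path, "w") as f:
    f.write("QWEW\nRW\nNumber hey\nNumber ho\n")

inserted = insert_before(demo_path, "Number", "Hello Everyone")
with open(demo_path) as f:
    demo_lines = f.read().splitlines()
os.remove(demo_path)
print(demo_lines)
```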

Python- How to Remove Columns from a File

I'd like to remove the first column from a file. The file contains 3 columns separated by spaces, and the columns have the following titles:
'X', 'Displacement' and 'Force' (please see the image).
I have come up with the following code, but to my disappointment it doesn't work!
f = open("datafile.txt", 'w')
for line in f:
    line = line.split()
    del x[0]
f.close()
Any help is much appreciated !
Esan
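First of all, you're attempting to read from a file (by iterating through the file contents) that is open for writing. This will give you an IOError, and opening the file with 'w' also truncates it, so the original data is erased as soon as open() runs.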
Second, there is no variable named x in existence (you have not declared/set one in the script). This will generate a NameError.
Thirdly and finally, once you have finished (correctly) reading and editing the columns in your file, you will need to write the data back into the file.
To avoid loading a (potentially large) file into memory all at once, it is probably a good idea to read from one file (line by line) and write to a new file simultaneously.
Something like this might work:
f = open("datafile.txt", "r")
g = open("datafile_fixed.txt", "w")
for line in f:
    if line.strip():
        g.write("\t".join(line.split()[1:]) + "\n")
f.close()
g.close()
Some reading about python i/o might be helpful, but something like the following should get you on your feet:
with open("datafile.txt", "r") as fin:
    with open("outputfile.txt", "w") as fout:
        for line in fin:
            line = line.split(' ')
            if len(line) == 3:
                del line[0]
                fout.write(line[0] + ' ' + line[1])
            else:
                fout.write('\n')
EDIT: fixed to work with blank lines
print('\n'.join(' '.join(l.split()[1:]) for l in open('datafile.txt')))
or, if you want to preserve spaces and you know that the second column always starts at, say, the 12th character (index 11):
print(''.join(l[11:] for l in open('datafile.txt')))
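For a version that can be tried without a data file, here is a small sketch using in-memory text streams. The function name `drop_first_column` and the sample values are made up; it assumes columns are separated by whitespace and that blank lines should stay blank.

```python
import io


def drop_first_column(source, target):
    """Copy source to target without the first whitespace-separated column.

    Hypothetical helper; blank lines are copied as blank lines.
    """
    for line in source:
        fields = line.split()
        target.write(" ".join(fields[1:]) + "\n")


# io.StringIO objects stand in for the input and output files.
sample = io.StringIO("X Displacement Force\n0 0.1 2.5\n\n1 0.2 5.0\n")
output = io.StringIO()
drop_first_column(sample, output)
print(output.getvalue())
```

Because the function only iterates over source and calls target.write, it works the same way with two real file objects opened with open(), one in "r" mode and one in "w" mode.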
