Python compare two files using a list - python

I'm trying to compare two files using regex patterns and print the result. There seems to be a problem with my loop, as only the last line gets printed. What am I missing?
import re
delist = [r'"age":.*",', r'"average":.*",', r'"class":.*",']
with open('test1.txt', 'r') as bolo:
    boloman = bolo.read()
for dabo in delist:
    venga = re.findall(dabo, boloman)
    for vaga in venga:
        with open('test.txt', 'r') as f:
            content = f.read()
        venga2 = re.findall(dabo, content)
        for vaga2 in venga2:
            mboa = content.replace(vaga2, vaga, 1)
print(mboa)

The first problem I see is that you overwrite mboa on every iteration, so only the last result is left when you print it. I think what you really want is to create a list and append each result to it.
import re
mboa = []
delist = [r'"age":.*",', r'"average":.*",', r'"class":.*",']
with open('test1.txt', 'r') as bolo:
    boloman = bolo.read()
for dabo in delist:
    venga = re.findall(dabo, boloman)
    for vaga in venga:
        with open('test.txt', 'r') as f:
            content = f.read()
        venga2 = re.findall(dabo, content)
        for vaga2 in venga2:
            mboa.append(content.replace(vaga2, vaga, 1))
print(mboa)
Does that solve the issue? If it doesn't, add a comment to this question and I'll try to fix it ;)
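One more thing may be worth checking: every replace() call in the code above starts again from the freshly read content, so the substitutions never accumulate in a single text. The following is a minimal sketch, assuming the goal is to copy the first match of each pattern from one text into the other. The function name merge_fields and the sample strings are made up for illustration and do not come from the question.

```python
import re


def merge_fields(source_text, target_text, patterns):
    """Copy the first match of each pattern from source_text into target_text."""
    result = target_text
    for pattern in patterns:
        source_match = re.search(pattern, source_text)
        target_match = re.search(pattern, result)
        if source_match and target_match:
            # Replace inside the running result so earlier changes are kept.
            result = result.replace(target_match.group(0), source_match.group(0), 1)
    return result


# Hypothetical sample data standing in for test1.txt and test.txt.
patterns = [r'"age":.*",', r'"average":.*",', r'"class":.*",']
source = '"age": "30",\n"average": "7.5",\n"class": "B",\n'
target = '"age": "21",\n"average": "5.0",\n"class": "A",\n"name": "x",\n'

merged = merge_fields(source, target, patterns)
print(merged)
```

Because the running result is passed from one pattern to the next, a single print at the end shows all three substitutions at once.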

Related

Python list iterate not working as expected

I have a file called list.txt:
['d1','d2','d3']
I want to loop through all the items in the list. Here is the code:
deviceList = open("list.txt", "r")
deviceList = deviceList.read()
for i in deviceList:
    print(i)
Here the issue is that, when I run the code, it will split all the characters:
% python3 run.py
[
'
d
1
'
,
'
d
2
'
,
'
d
3
'
]
It's as if all the items are treated as one string. I think it needs to be parsed? Please let me know what I am missing.
That happens because you do not have a list: read() returns the contents of the file as one plain string, and iterating over a string yields single characters.
I suggest writing the file without the [] so you can use the split() function.
Write the file like this: d1;d2;d3
and use this script to obtain a list:
f = open("filename", 'r')
line = f.read().strip()  # read() returns one string; readlines() would return a list, which has no split()
f.close()
items = line.split(";")
If you need the [] in the file, simply add a strip() call like this:
f = open("filename", 'r')
line = f.read().strip()  # drop the trailing newline first
f.close()
stripped = line.strip("[]")
items = stripped.split(";")
It should work the same.
This isn't the cleanest solution, but it will do if your .txt file is always in the "[x,y,z]" format.
with open("list.txt", "r") as f:
    deviceList = f.read().strip()  # the file object itself cannot be sliced, so read it into a string first
deviceList = deviceList[1:-1]
deviceList = deviceList.split(",")
for i in deviceList:
    print(i)
This takes your string, strips the "[" and "]", and then splits what remains at the commas to produce a list. Note that each item keeps its surrounding quote characters, for example 'd1'. As other users have suggested, there are probably better ways to store this list than a text file as it is, but this solution will do what you are asking. Hope this helps!
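If the file really holds a Python list literal such as ['d1','d2','d3'], another option is to let the standard library parse it. This is a small sketch using ast.literal_eval, which evaluates literals only and does not run arbitrary code. The helper name parse_list_text is made up for illustration, and the sample string stands in for the contents of list.txt.

```python
import ast


def parse_list_text(text):
    """Turn the text of a Python list literal into a real list."""
    value = ast.literal_eval(text.strip())
    if not isinstance(value, list):
        raise ValueError("expected a list literal")
    return value


# Stand-in for open("list.txt").read()
devices = parse_list_text("['d1','d2','d3']\n")
for device in devices:
    print(device)
```

Here each item is a plain string without the quote characters, so the loop prints d1, d2 and d3 on separate lines.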

Python Read strings from file

Unfortunately I cannot get my code to read strings from a file. The relevant part of the code is:
names = ["Johnatan", "Jackson"]
I tried it with this:
open("./names.txt", "r")
instead of the list of names in the code above, but unfortunately this does not work. With the hard-coded list instead of the file, it works without problems.
I would be very happy if someone could help me and tell me exactly where the problem is.
f = open('./names.txt', 'r', encoding='utf-8')
words_string = f.read()  # one string; readlines() would return a list, which has no split()
f.close()
words = words_string.split(',')
print(words)
I hope it helps. Assuming your file contains all the words separated by commas, read() gives us the whole file as a single string, and we then split that string using .split(','). This results in the list of words that you need.
Try reading the data in the file like this:
with open(file_path, "r") as f:
    data = f.readlines()
# data is now a list of lines, for example ['maria, Johnatan, Jackson']
Each element of data is one line holding the names. You can parse such a line using split(",").
If the file contains Johnatan,Jackson,Maria you can split it into names by doing:
with open("./names.txt", "r", encoding='utf-8') as fp:
    content = fp.read()
names = content.split(",")
If you want it to be a one-liner, you can also do:
names = open("./names.txt", "r", encoding='utf-8').read().split(",")
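The split(",") calls above leave any spaces and the trailing newline attached to the names. Here is a minimal sketch that also trims each entry, assuming the file holds comma-separated names on one or more lines. The helper name parse_names is hypothetical, and the sample string stands in for the contents of names.txt.

```python
def parse_names(text):
    """Split comma-separated names and trim surrounding whitespace."""
    names = []
    for line in text.splitlines():
        for piece in line.split(","):
            name = piece.strip()
            if name:  # skip empty pieces such as a trailing comma
                names.append(name)
    return names


# Stand-in for open("./names.txt", encoding="utf-8").read()
names = parse_names("maria, Johnatan, Jackson\n")
print(names)
```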

Python writing apostrophes to file

I'm converting a downloaded Facebook Messenger conversation from JSON to a text file using Python. I've converted the JSON to text and it's all looking fine. I need to strip the unnecessary information and reverse the order of the messages, then save the output to a file, which I've done. However, after formatting the messages with Python, the output file sometimes contains â where there should be an apostrophe.
My Python isn't great as I normally work with Java, so there's probably a lot of things I could improve. If someone could suggest some better tags for this question, I'd also be very appreciative.
Example of apostrophe working: You're not making them are you?
Example of apostrophe not working: Itâs just a button I discovered
What is causing this to happen, and why does it not happen every time there is an apostrophe?
Here is the script:
#!/usr/bin/python3
import datetime
def main():
    input_file = open('messages.txt', 'r')
    output_file = open('results.txt', 'w')
    content_list = []
    sender_name_list = []
    time_list = []
    line = input_file.readline()
    while line:
        line = input_file.readline()
        if "sender_name" in line:
            values = line.split("sender_name")
            sender_name_list.append(values[1][1:])
        if "timestamp_ms" in line:
            values = line.split("timestamp_ms")
            time_value = values[1]
            timestamp = int(time_value[1:])
            time = datetime.datetime.fromtimestamp(timestamp / 1000.0)
            time_truncated = time.replace(microsecond=0)
            time_list.append(time_truncated)
        if "content" in line:
            values = line.split("content")
            content_list.append(values[1][1:])
    content_list.reverse()
    sender_name_list.reverse()
    time_list.reverse()
    for x in range(1, len(content_list)):
        output_file.write(sender_name_list[x])
        output_file.write(str(time_list[x]))
        output_file.write("\n")
        output_file.write(content_list[x])
        output_file.write("\n\n")
    input_file.close()
    output_file.close()
if __name__ == "__main__":
    main()
Edit:
The answer to the question was adding
import codecs
input_file = codecs.open('messages.txt', 'r', 'utf-8')
output_file = codecs.open('results.txt','w', 'utf-8')
Without seeing the incoming data it's hard to be sure, but I suspect that instead of an apostrophe (Unicode U+0027 ' APOSTROPHE), you've got a curly equivalent (U+2019 ’ RIGHT SINGLE QUOTATION MARK) in there, and its bytes are being interpreted as an old-fashioned single-byte encoding (ASCII or similar) instead of UTF-8.
Instead of
output_file = open('results.txt', 'w')
try
import codecs
output_file = codecs.open('results.txt','w', 'utf-8')
You may also need the equivalent on your input file.
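To see why only some apostrophes break, here is a small sketch of the suspected mechanism. It assumes the text is UTF-8 and is being decoded with a single-byte codec; latin-1 is used here purely as an example, since the codec actually involved depends on the platform. In Python 3 the built-in open() also accepts an encoding argument, which the second half uses on a temporary file.

```python
import os
import tempfile

curly = "It\u2019s just a button"    # U+2019 RIGHT SINGLE QUOTATION MARK
straight = "You're not making them"  # U+0027 APOSTROPHE, plain ASCII

# Decoding UTF-8 bytes with a single-byte codec garbles only the non-ASCII character.
garbled = curly.encode("utf-8").decode("latin-1")
unchanged = straight.encode("utf-8").decode("latin-1")

# Passing encoding= explicitly on both write and read keeps the text intact.
path = os.path.join(tempfile.mkdtemp(), "results.txt")
with open(path, "w", encoding="utf-8") as output_file:
    output_file.write(curly)
with open(path, "r", encoding="utf-8") as input_file:
    round_trip = input_file.read()

print(ascii(garbled))       # the curly quote has become three separate characters
print(unchanged == straight)
print(round_trip == curly)
```

The straight apostrophe is a single ASCII byte, so it survives the wrong decoding unchanged, while the curly one is three bytes in UTF-8 and turns into â followed by two characters that many viewers do not display.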

Adding a new string to the end of a specific line in a text file

I'm new to Python, so I have been unable to apply the solutions I've found online to my problem.
I am trying to add a specific string to the end of a specific line in a text file. As I understand the file commands, I must overwrite the file if I don't want to append to the end of it. So, my solution is as follows:
ans = 'test'
numdef = ['H',2]
f = open(textfile, 'r')
lines = f.readlines()
f.close()
f = open(textfile, 'w')
f.write('')
f.close()
f = open(textfile, 'a')
for line in lines:
    if int(line[0]) == numdef[1]:
        if str(line[2]) == numdef[0]:
            k = ans+ line
            f.write(k)
    else:
        f.write(line)
Basically, I am trying to add variable ans to the end of a specific line, the line which appears in my list numdef. So, for example, for
2 H: 4,0 : Where to search for information : google
I want
2 H: 4,0 : Where to search for information : google test
I have also tried using line.insert() but to no avail.
I understand that using the 'a' mode of the open command is not really relevant or helpful here, but I am out of ideas. I would love tips on this code, or advice on whether I should scrap it and rethink the whole thing.
Thank you for your time and advice!
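When you use the method
lines = f.readlines()
each line in the returned list keeps its trailing "\n", so anything added after it lands on the next line.
Instead of:
k = ans+ line
try the following:
k = line.rstrip('\n') + ' ' + ans + '\n'  # remove the newline, append ans, then put the newline back
Good luck!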
Try this. You don't have an else case for a line that meets the first requirement but not the second.
ans = 'test'
numdef = ['H',2]
f = open(textfile, 'r')
lines = f.readlines()
f.close()
f = open(textfile, 'w')
f.write('')
f.close()
f = open(textfile, 'a')
for line in lines:
    if int(line[0]) == numdef[1] and str(line[2]) == numdef[0]:
        # strip the newline, append ans, then restore the newline so the next line stays separate
        k = line.replace('\n','') + ' ' + ans + '\n'
        f.write(k)
    else:
        f.write(line)
f.close()
Better way:
# initialize variables
ans = 'test'
numdef = ['H',2]
# open file in read mode, read all lines into lines
with open(textfile, 'r') as f:
    lines = f.readlines()
# open file in write mode, overwriting everything
with open(textfile, 'w') as f:
    # In the list comprehension, loop through each line in lines. If both
    # conditions are true, take the line, remove all newlines, and add ans.
    # Otherwise, remove all the newlines and don't add anything. Then combine
    # the list into a string with newlines as separators ('\n'.join), and
    # write this string to the file.
    f.write('\n'.join([line.replace('\n','')+ans if int(line[0]) == numdef[1] and str(line[2]) == numdef[0] else line.replace('\n','') for line in lines]))
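The approaches above can also be sketched as a small function that works on a list of lines, which keeps the matching logic separate from the file handling. This is only an illustration under some assumptions: the function name append_to_matching_lines is made up, lines are assumed to look like "2 H: 4,0 : ...", and the fields are compared after splitting on whitespace rather than by character position, so multi-digit numbers would also work.

```python
def append_to_matching_lines(lines, number, letter, suffix):
    """Return a copy of lines with suffix appended to lines matching number and letter."""
    updated = []
    for line in lines:
        body = line.rstrip("\n")
        fields = body.split(maxsplit=2)  # e.g. ["2", "H:", "4,0 : ..."]
        matches = (
            len(fields) >= 2
            and fields[0] == str(number)
            and fields[1].rstrip(":") == letter
        )
        # Always write the newline back so lines stay separate.
        updated.append(f"{body} {suffix}\n" if matches else f"{body}\n")
    return updated


# Hypothetical sample lines standing in for f.readlines()
lines = [
    "1 H: 3,0 : Some other question : answer\n",
    "2 H: 4,0 : Where to search for information : google\n",
]
result = append_to_matching_lines(lines, 2, "H", "test")
print("".join(result), end="")
```

The returned list can then be written back with f.writelines(result) after reopening the file in 'w' mode.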

Read links from a list from a txt file - Python

I have a file named links.txt which contains the following list
(List name : Set_of_links) :
[https://link1.com, https://link2.com, https://link3.com/hello, https://links4.com/index.php, . . . . ]
I'm executing the program, links_python.py which needs to read each link from that file and store it in a local variable in the python script. I'm using the following program :
i = 0
with open(links.txt, "r") as f:
    f.read(set_of_links[i])
    i+=1
Seems to be not working.
If you have only one line of links, remove the brackets and spaces from the file and try
links = []
with open('links.txt') as f:
    links = f.read().strip().split(',')
Try the following (thanks to #Jean for the edit). It assumes one link per line:
set_of_links = []
with open("links.txt", "r") as f:  # the file name must be a quoted string
    for line in f:
        set_of_links.append(line.strip())
If you want to separate each link and append it to set_of_links, you can use re to remove the characters [ ] , ( and ), then create the list by splitting on whitespace. Using a list comprehension it should look like:
import re
with open('links.txt', 'r') as f:
    set_of_links = [re.sub(r'[(\[\],)]', '', x) for x in f.read().split()]
print(set_of_links)
output:
['https://link1.com', 'https://link2.com', 'https://link3.com/hello', 'https://links4.com/index.php']
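The same result can be reached without a regular expression. Below is a minimal sketch, assuming the file holds a single bracketed, comma-separated list and that none of the URLs contain a comma. The helper name parse_links is made up, and the sample string stands in for the contents of links.txt.

```python
def parse_links(text):
    """Parse '[a, b, c]' style text into a list of trimmed strings."""
    inner = text.strip().strip("[]")  # drop the newline, then the outer brackets
    return [item.strip() for item in inner.split(",") if item.strip()]


# Stand-in for open("links.txt").read()
sample = "[https://link1.com, https://link2.com, https://link3.com/hello, https://links4.com/index.php]\n"
links = parse_links(sample)
for link in links:
    print(link)
```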
