Python 3.6 - Read encoded text from file and convert to string - python

Hopefully someone can help me out with the following. It is probably not too complicated but I haven't been able to figure it out. My "output.txt" file is created with:
f = open('output.txt', 'w')
print(tweet['text'].encode('utf-8'))
print(tweet['created_at'][0:19].encode('utf-8'))
print(tweet['user']['name'].encode('utf-8'))
f.close()
If I don't encode it for writing to file, it will give me errors. So "output" contains 3 rows of utf-8 encoded output:
b'testtesttest'
b'line2test'
b'\xca\x83\xc9\x94n ke\xc9\xaan'
In "main.py", I am trying to convert this back to a string:
f = open("output.txt", "r", encoding="utf-8")
text = f.read()
print(text)
f.close()
Unfortunately, the b'' - format is still not removed. Do I still need to decode it? If possible, I would like to keep the 3 row structure.
My apologies for the newbie question, this is my first one on SO :)
Thank you so much in advance!

With the help of the people who answered my question, I got it to work. The solution is to change how the file is written: open it with an explicit encoding and write the strings directly instead of encoding them first.
import json

tweet = json.loads(data)  # data: the raw JSON for one tweet, defined elsewhere
tweet_text = tweet['text']  # content of the tweet
tweet_created_at = tweet['created_at'][0:19]  # tweet created at
tweet_user = tweet['user']['name']  # tweet created by
with open('output.txt', 'w', encoding='utf-8') as f:
    f.write(tweet_text + '\n')
    f.write(tweet_created_at + '\n')
    f.write(tweet_user + '\n')
Then read it back like this:
f = open("output.txt", "r", encoding='utf-8')
text = f.read()
print(text)
f.close()
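For anyone who wants to try the write-then-read round trip without real tweet data, here is a minimal sketch. The `tweet` dictionary is a made-up stand-in for the parsed JSON, and the file is created in a temporary directory, so both are assumptions rather than part of my actual script:

```python
import os
import tempfile

# Hypothetical stand-in for the parsed tweet; the values are made up for illustration.
tweet = {
    "text": "testtesttest",
    "created_at": "line2test",
    "user": {"name": "ʃɔn keɪn"},
}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "output.txt")

    # Write str objects; the file object encodes them to UTF-8 for us.
    with open(path, "w", encoding="utf-8") as f:
        f.write(tweet["text"] + "\n")
        f.write(tweet["created_at"] + "\n")
        f.write(tweet["user"]["name"] + "\n")

    # Read the text back with the same encoding and keep the three-row structure.
    with open(path, "r", encoding="utf-8") as f:
        rows = f.read().splitlines()

print(rows)
```

Because the strings are never passed through `.encode()` before writing, no `b''` wrapper ends up in the file.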

Instead of specifying the encoding when opening the file, use it to decode as you read.
f = open("output.txt", "rb")
text = f.read().decode(encoding="utf-8")
print(text)
f.close()

If b and the quote ' are literally in your file, that means there is a problem with how the file was written. Someone probably did write(print(line)) instead of write(line). To decode it now, you can use ast.literal_eval. Otherwise @m_callens's answer should be OK.
import ast
with open("b.txt", "r") as f:
    text = [ast.literal_eval(line) for line in f]
for l in text:
    print(l.decode('utf-8'))
# testtesttest
# line2test
# ʃɔn keɪn
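A self-contained sketch of the same idea, in case you want to check it without the original file. It assumes the file holds exactly the three lines from the question and writes them to a temporary "b.txt" first; the temporary directory and the `raw_lines` name are only for illustration:

```python
import ast
import os
import tempfile

# Lines as they appear in the question's output.txt: the repr of bytes objects.
raw_lines = [
    "b'testtesttest'",
    "b'line2test'",
    r"b'\xca\x83\xc9\x94n ke\xc9\xaan'",
]

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "b.txt")
    with open(path, "w", encoding="ascii") as f:
        f.write("\n".join(raw_lines) + "\n")

    with open(path, "r", encoding="ascii") as f:
        # literal_eval turns each "b'...'" line back into a bytes object;
        # decode() then converts the UTF-8 bytes into str.
        decoded = [
            ast.literal_eval(line.strip()).decode("utf-8")
            for line in f
            if line.strip()  # skip blank lines
        ]

for row in decoded:
    print(row)
```

literal_eval only evaluates Python literals, so it is a safer choice here than eval.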

Related

Python Read strings from file

Unfortunately I cannot read strings from a file. The relevant part of the code is:
names = ["Johnatan", "Jackson"]
I tried it with this instead of the list of names above:
open("./names.txt", "r")
but unfortunately this does not work. If I use the list without the file, it works without problems.
I would be very happy if someone could help me and tell me exactly where the problem is.
f = open('./names.txt', 'r', encoding='utf-8')
words_string = f.read()  # read() returns the whole file as one string
f.close()
words = words_string.split(',')
print(words)
I hope it helps. Since your file contains all the words separated by commas, read() gives you the file's contents as a single string, which you can then split using .split(','). (readlines() would return a list of lines, which has no split method.) This results in the list of words that you need.
Try reading the data in the file like this:
with open(file_path, "r") as f:
    data = f.readlines()
# data is now a list with one string per line, for example:
data = ['maria, Johnatan, Jackson']
The names are inside data. You can parse each line using split(",").
If the file contains Johnatan,Jackson,Maria, you can split it into names by doing:
with open("./names.txt", "r", encoding='utf-8') as fp:
    content = fp.read()
names = content.split(",")
You can also do:
names = open("./names.txt", "r", encoding='utf-8').read().split(",")
if you want it to be a one-liner.
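One thing the snippets above do not handle is stray whitespace: split(",") keeps the spaces after commas and the newline at the end of the file. A small sketch that strips each name, assuming a comma-separated file; io.StringIO and the sample contents are stand-ins for the real names.txt:

```python
import io

# Hypothetical file contents; io.StringIO stands in for open("./names.txt").
fake_file = io.StringIO("Johnatan, Jackson,Maria\n")

content = fake_file.read()
# split(",") leaves surrounding spaces and the trailing newline, so strip each
# piece and drop empty entries (for example from a trailing comma).
names = [name.strip() for name in content.split(",") if name.strip()]
print(names)
```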

Replace single line into multi line python

I have file.txt the contents are below
{"action":"validate","completed_at":"2019-12-24T15:24:40+05:30"}{"action":"validate","completed_at":"2019-12-24T15:24:42+05:30"}{"action":"validate","completed_at":"2019-12-24T15:24:45+05:30"}{"action":"validate","completed_at":"2019-12-24T15:24:48+05:30"}
How to convert to like below
{"action":"validate","completed_at":"2019-12-24T15:24:40+05:30"}
{"action":"validate","completed_at":"2019-12-24T15:24:42+05:30"}
{"action":"validate","completed_at":"2019-12-24T15:24:45+05:30"}
{"action":"validate","completed_at":"2019-12-24T15:24:48+05:30"}
I tried
with open('file.txt', w) as f:
    f.replace("}{", "}\n{")
Is there a better way to do the replacement?
If your file is small enough, you could try:
with open('file.txt', 'r+') as f:
    content = f.read()
    f.seek(0)
    f.truncate()
    f.write(content.replace("}{", "}\n{"))
I would not replace the content in place, but rather read it, split it, and then write it again, using a simple regex \{.*?\}:
import re

with open('file.txt', 'r') as f:
    value = f.read()
# Non-greedy match: each {...} group becomes one entry (assumes no nested braces)
contents = re.findall(r'\{.*?\}', value)
with open('file.txt', 'w') as f:
    for content in contents:
        f.write(content + "\n")
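Both approaches above work on the raw text, so they could misfire if a string value ever contained "}{" or if the objects were nested. As an alternative sketch (not what either answer uses), the standard library's json.JSONDecoder.raw_decode can parse one object at a time from a given position. The helper name `split_concatenated_json` and the shortened sample are made up for illustration:

```python
import json


def split_concatenated_json(text):
    """Return the objects found in a string of back-to-back JSON documents."""
    decoder = json.JSONDecoder()
    objects = []
    pos = 0
    while pos < len(text):
        # Skip whitespace between documents; raw_decode expects the document
        # to start exactly at the given position.
        if text[pos].isspace():
            pos += 1
            continue
        obj, pos = decoder.raw_decode(text, pos)
        objects.append(obj)
    return objects


sample = (
    '{"action":"validate","completed_at":"2019-12-24T15:24:40+05:30"}'
    '{"action":"validate","completed_at":"2019-12-24T15:24:42+05:30"}'
)
# Serialize each object on its own line, using compact separators.
lines = [json.dumps(obj, separators=(",", ":")) for obj in split_concatenated_json(sample)]
print("\n".join(lines))
```

This also validates the data along the way: raw_decode raises json.JSONDecodeError if a chunk is not valid JSON.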

Inserting a comma in between columns in text file

I have a text (CSV) file that is missing its commas, and I would like to insert them so that I can use the file in LaTeX to make a table. I ran an MWE of code from another question, but it did not work. Could someone guide me on how to change it?
So far I have tried one piece of Python code that produces a blank file, another that produces a blank document, and another that removes the spaces.
import fileinput
input_file = 'C:/Users/Light_Wisdom/Documents/Python Notes/test.txt'
output= open('out.txt','w+')
with open('out.txt', 'w+') as output:
    for each_line in fileinput.input(input_file):
        output.write("\n".join(x.strip() for x in each_line.split(',')))
The text file contains more numbers, but it looks like this:
0 2.58612
0.00616025 2.20018
0.0123205 1.56186
0.0184807 0.371172
0.024641 0.327379
0.0308012 0.368863
0.0369615 0.322228
0.0431217 0.171899
Outcome
0.049282, -0.0635003
0.0554422, -0.110747
0.0616025, 0.0701394
0.0677627, 0.202381
0.073923, 0.241264
0.0800832, 0.193697
Renewed Attempt:
with open("CSV.txt","r") as file:
    new = list(map(lambda x: ''.join(x.split()[0:1]+[","]+x.split()[0:2]),file.readlines()))
with open("New_CSV.txt","w+") as output:
    for i in new:
        output.writelines(i)
        output.writelines("\n")
This can be done with .split and .join: split each line into a list, then join the list with commas. Because split() with no arguments splits on any run of whitespace, this also handles several consecutive spaces in the file:
f1 = open(input_file, "r")
with open("out.txt", 'w') as f2:
    for line in f1:
        f2.write(",".join(line.split()) + "\n")
f1.close()
You can also use the csv module to handle the writing automatically:
import csv
f1 = open(input_file, "r")
# newline='' lets the csv module control the line endings itself
with open("out.txt", 'w', newline='') as f2:
    writer = csv.writer(f2)
    for line in f1:
        writer.writerow(line.split())
f1.close()
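To see what the split-then-write approach produces without touching real files, here is a small sketch. io.StringIO stands in for both the input and the output file, the sample rows come from the question (with extra spaces and a blank line added on purpose), and skipping blank lines is my own assumption about what is wanted:

```python
import csv
import io

# Sample rows from the question; io.StringIO stands in for the real files.
source = io.StringIO("0 2.58612\n0.00616025   2.20018\n\n0.0123205 1.56186\n")
target = io.StringIO()

writer = csv.writer(target, lineterminator="\n")
for line in source:
    fields = line.split()  # splits on any run of whitespace
    if fields:             # skip blank lines
        writer.writerow(fields)

result = target.getvalue()
print(result)
```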

Python compare two files using a list

I'm trying to compare two files using regex strings and print the output. I seem to have an issue with my loop, as only the last line gets printed out. What am I missing?
import re
delist = [r'"age":.*",',r'"average":.*",',r'"class":.*",']
with open('test1.txt', 'r') as bolo:
    boloman = bolo.read()
for dabo in delist:
    venga = re.findall(dabo, boloman)
    for vaga in venga:
        with open ('test.txt', 'r' ) as f:
            content = f.read()
        venga2 = re.findall(dabo, content)
        for vaga2 in venga2:
            mboa = content.replace(vaga2,vaga,1)
print (mboa)
The first problem I see is that you overwrite mboa on every iteration, so it only ever holds the last result. I think what you really want is to create a list and append each result to it.
import re
mboa = []
delist = [r'"age":.*",',r'"average":.*",',r'"class":.*",']
with open('test1.txt', 'r') as bolo:
    boloman = bolo.read()
for dabo in delist:
    venga = re.findall(dabo, boloman)
    for vaga in venga:
        with open ('test.txt', 'r' ) as f:
            content = f.read()
        venga2 = re.findall(dabo, content)
        for vaga2 in venga2:
            mboa.append(content.replace(vaga2,vaga,1))
print (mboa)
Does that solve the issue? If it doesn't, add a comment and I'll try to sort it out ;)

Reading and Stripping a text file

import re
f= ('HelloHowAreYou')
f = re.sub(r"([a-z\d])([A-Z])", r'\1 \2', f)
# Makes the string space separated. You can use split to convert it to list
f = f.split()
print (f)
This works fine for splitting the string at each capital letter. However, when I change the code to read from a text file, I run into issues. Can anyone shed some light on why?
to read a file I'm using:
f = open('words.txt','r')
But that code doesn't read the file, it only opens it. Try:
my_file = open('words.txt','r')
f = my_file.read()
my_file.close()
Or
with open('words.txt','r') as my_file:
    f = my_file.read()
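Putting the two pieces together, here is a minimal sketch of reading the file and then applying the regex from the question. It assumes words.txt contains the same HelloHowAreYou string, and it writes that file into a temporary directory first so the example can run on its own:

```python
import os
import re
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "words.txt")
    # Hypothetical file contents, matching the string in the question.
    with open(path, "w") as my_file:
        my_file.write("HelloHowAreYou\n")

    with open(path, "r") as my_file:
        text = my_file.read()  # a str, not a file object

# Insert a space before each capital that follows a lowercase letter or digit.
spaced = re.sub(r"([a-z\d])([A-Z])", r"\1 \2", text)
words = spaced.split()
print(words)
```

re.sub needs a string, which is why passing the object returned by open() directly does not work.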
