Write a complete list into a text file in Python

I have a list like this that I read from a temporary text file earlier:
['Usain','Jamaican','9','2','0']
Now I need to write this list into a new text file that contains a list of lists. The text file should look like this:
[['Usain','Jamaican','9','2','0'], ['Christopher','Costarican','0','1','2']]
I've tried writing the list to a text file, but I only end up taking the elements of the list and writing each one on its own line. My code looks like this:
import os

def terminar():
    with open('temp.txt', 'r') as f:
        registroFinal = [line.strip() for line in f]
    final = open('final.txt', 'a')
    for item in registroFinal:
        final.write("%s\n" % item)
    final.close()
    os.remove('temp.txt')

You can use json to dump out the list of lists:
import json

with open('final.txt', 'a') as final:
    json.dump(registroFinal, final)
You would load this back in with json.load(). Otherwise, you could use repr(registroFinal) to write the representation out to a file. How is this data going to be used? If you plan to read it back into a Python object, I would favour the json approach.
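One hedged way to keep final.txt a single valid JSON list of lists across repeated calls is a read-modify-write sketch; it assumes registroFinal was read from temp.txt as in the question:

import json

# Load the existing list of lists, or start fresh if final.txt doesn't exist yet.
try:
    with open('final.txt') as f:
        registros = json.load(f)
except FileNotFoundError:
    registros = []

registros.append(registroFinal)  # registroFinal was read from temp.txt above

with open('final.txt', 'w') as f:
    json.dump(registros, f)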
Python also has facilities to manage temporary files; see the tempfile module.
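For example, a minimal tempfile sketch (the written content is purely illustrative):

import tempfile

# The temporary file is deleted automatically when the context exits,
# so there is no need for os.remove().
with tempfile.NamedTemporaryFile(mode='w+', suffix='.txt') as tmp:
    tmp.write("Usain,Jamaican,9,2,0\n")
    tmp.seek(0)
    print(tmp.read())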

Related

Constantly append element to list in json file [duplicate]

In Python, I know that I can dump a list of dictionaries into a .json file for storage with json.dump() from the json module. However, after dumping a list, is it possible to append more dictionaries to that list in the .json file without explicitly loading the full list, appending to it, and then dumping it again?
e.g.
In the .json file I have
[{'a': 1}]
Is it possible to add {'b': 2} to the list in the .json file so that it becomes
[{'a': 1}, {'b': 2}]
The actual list is much longer (on the order of ten million entries), so I'm wondering whether there is a more direct way of doing this without reading the entire list from the file, to save memory.
Edit:
PS: I'm also open to other file formats, as long as they can effectively store a large list of dictionaries and support the operation above.
It sounds like this can be a simple file-manipulation problem under the right circumstances. If you are sure that the root data structure of the dump is indeed a JSON array, you can delete the last "]" in the file and then append a new dump to the file.
You can append with the dumps function.
from json import dumps, dump
import os

# This represents your current dump call
with open('out.json', 'w') as f:
    dump([{'version': 1}], f)

# This removes the final ']'
with open('out.json', 'rb+') as f:
    f.seek(-1, os.SEEK_END)
    f.truncate()

# This appends the new dictionary
with open('out.json', 'a') as f:
    f.write(',')
    f.write(dumps({'n': 1}))
    f.write(']')
This also seems to work if you dump with indent, because the dump function doesn't end with a newline character in either case.
Handling an empty array
If the list was empty the first time you dumped it, the file will contain an empty JSON array "[]", and appending a comma as in my example will produce something like "[,...]", which you probably don't want.
The way I've seen this handled in the wild by protocols like i3bar (which uses a never-ending JSON array to send information) is to always start with a header element. In their case they use { "version": 1 }.
So make sure you have such an element at the start of your list when you do the first dump, unless you're sure the list will always have something in it.
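If you'd rather not rely on a header element, here is a hedged sketch that peeks at the last character after truncating (the function name append_to_json_array is hypothetical; it assumes the file already holds a valid JSON array with no trailing newline):

import os
from json import dumps

def append_to_json_array(path, obj):
    with open(path, 'rb+') as f:
        f.seek(-1, os.SEEK_END)
        f.truncate()               # drop the closing ']'
        f.seek(-1, os.SEEK_END)
        last = f.read(1)           # peek at what is now the final character
    with open(path, 'a') as f:
        if last != b'[':           # array wasn't empty, so separate with a comma
            f.write(',')
        f.write(dumps(obj))
        f.write(']')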
Other notes
Even though this sort of manual JSON hack is used by projects like i3bar, I wouldn't personally recommend doing it in a production environment.
JSON requires that the list be closed with a "]", so it is not natively appendable. You could try something tricky like seeking to the end of the file, removing the "]", and fiddling with the new JSON you are trying to write, but that's messy.
An interesting thing about JSON is that the encoding doesn't need newlines. You can pretty-print JSON, but if you don't, you can write an entire JSON record on a single line. So, instead of a single JSON list, just have a bunch of lines, each of which is the JSON encoding of one of your dicts.
import json

def append_dict(filename, d):
    with open(filename, 'a', encoding='utf-8') as fp:
        fp.write(json.dumps(d))
        fp.write("\n")

def read_list(filename):
    with open(filename, encoding='utf-8') as fp:
        return [json.loads(line) for line in fp]
Since this file is now a bunch of JSON objects, not a single JSON list, any program expecting a single list in this file will fail.

Parse a file of strings in python separated by newline into a json array

I have a file called path_text.txt whose contents are two strings separated by a newline:
/gp/oi/eu/gatk/inputs/NA12878_24RG_med.hg38.bam
/gp/oi/eu/gatk/inputs/NA12878_24RG_small.hg38.bam
I would like to have a json array object like this:
["/gp/oi/eu/gatk/inputs/NA12878_24RG_med.hg38.bam","/gp/oi/eu/gatk/inputs/NA12878_24RG_small.hg38.bam"]
I have tried something like this:
with open('path_text.txt','w',encoding='utf-8') as myfile:
    myfile.write(','.join('\n'))
But it does not work
I don't see where you're actually reading from the file in the first place. You have to actually read path_text.txt before you can format its contents, right?
with open('path_text.txt','r',encoding='utf-8') as myfile:
    content = myfile.read().splitlines()
Which will give you ['/gp/oi/eu/gatk/inputs/NA12878_24RG_med.hg38.bam', '/gp/oi/eu/gatk/inputs/NA12878_24RG_small.hg38.bam'] in content.
Now, if you want to write this data to a file in the format ["/gp/oi/eu/gatk/inputs/NA12878_24RG_med.hg38.bam", "/gp/oi/eu/gatk/inputs/NA12878_24RG_small.hg38.bam"]:
import json

with open('path_json.json', 'w') as f:
    json.dump(content, f)
Now the path_json.json file looks like this:
["/gp/oi/eu/gatk/inputs/NA12878_24RG_med.hg38.bam", "/gp/oi/eu/gatk/inputs/NA12878_24RG_small.hg38.bam"]
which is valid JSON, in case you want to load it back from the file.
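For instance, a minimal sketch of loading it back (using the same file written above):

import json

with open('path_json.json') as f:
    paths = json.load(f)
print(paths[0])  # /gp/oi/eu/gatk/inputs/NA12878_24RG_med.hg38.bam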
See below:
with open('path_text.txt') as f:
    data = [l.strip() for l in f.readlines()]

How to convert a series of JSON strings into one json file?

I am using Python and json to construct a JSON file. I have a string, outputString, which consists of multiple lines of dictionaries turned into JSON objects, in the following format:
{size:1, title:"Hello", space:0}
{size:21, title:"World", space:10}
{size:3, title:"Goodbye", space:20}
I would like to take this string of JSON objects and turn it into one new JSON file, with each item still on its own line. I have attached the code showing how I got outputString and what I have tried. Right now, the code I have writes everything to the file on one line; I would like the lines to be separated as they are in the string.
for value in outputList:
    newOutputString = json.dumps(value)
    outputString += (newOutputString + "\n")

with open('data.json', 'w') as outfile:
    for item in outputString.splitlines():
        json.dump(item, outfile)
        json.dump("\n", outfile)
PROBLEM: when you json.dump("\n", outfile), the newline is written as the escaped JSON string "\n", not as an actual line break, so everything ends up on the same line.
SOLUTION: make sure you write the new line as a plain Python string, not a JSON-encoded one:
# We are appending to the file so that we can add a new line for each of the different json strings
with open('data.json', 'a') as outfile:
    for item in outputString.splitlines():
        json.dump(item, outfile)
        outfile.write("\n")  # write the newline as a plain Python string; no need to encode it with json
See comments for explanation.
Please ensure that the file you write to is empty if you want only these JSON objects in it.
Your value rows are not in actual JSON format if the property names are not enclosed in double quotes.
This would be a proper json data format:
{"size":1, "title":"Hello", "space":0}
Having said that, here is a solution to your question with the type of data you provided.
I am assuming your data comes in like this:
outputList = ['{size:1, title:"Hello", space:0}',
              '{size:21, title:"World", space:10}',
              '{size:3, title:"Goodbye", space:20}']
So the only thing you need to do is write each value using the file.write() function.
Python 3.6 and above:
with open('data.json', 'w') as outfile:
    for value in outputList:
        outfile.write(f"{value}\n")
Python 3.5 and below:
with open('data.json', 'w') as outfile:
    for value in outputList:
        outfile.write(value + "\n")
data.json file will look like this:
{size:1, title:"Hello", space:0}
{size:21, title:"World", space:10}
{size:3, title:"Goodbye", space:20}
Note: As someone already commented, your data.json file will not be a truly JSON-formatted file, but it serves the purpose of your question. Enjoy! :)
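If you do want each line to be valid JSON, here is a hedged sketch, assuming you still have the original dictionaries rather than the pre-formatted strings:

import json

# Hypothetical source data; the real dicts would come from wherever outputList was built.
rows = [{"size": 1, "title": "Hello", "space": 0},
        {"size": 21, "title": "World", "space": 10},
        {"size": 3, "title": "Goodbye", "space": 20}]

with open('data.json', 'w') as outfile:
    for row in rows:
        outfile.write(json.dumps(row) + "\n")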

storing and retrieving lists from files

I have a very big list of lists. One of my programs does this:
power_time_array = [[1,2,3],[1,2,3]]  # In a short form
with open(file_name, 'w') as out:
    out.write(str(power_time_array))
Now another, independent script needs to read this list of lists back.
How do I do this?
What I have tried:
with open(file_name, 'r') as app_trc_file:
    power_trace_of_application.append(app_trc_file.read())
Note: power_trace_of_application is a list of lists of lists.
This stores it as a list with one element: one huge string.
How does one efficiently store and retrieve big lists, or lists of lists, from files in Python?
You can serialize your list to JSON and deserialize it back. This really doesn't change anything in the representation; your list is already valid JSON:
import json

power_time_array = [[1,2,3],[1,2,3]]  # In a short form
with open(file_name, 'w') as out:
    json.dump(power_time_array, out)
and then just read it back:
with open(file_name, 'r') as app_trc_file:
    power_trace_of_application = json.load(app_trc_file)
For speed, you can use a json library with a C backend (like ujson). And this works with custom objects too.
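A minimal sketch, assuming the third-party ujson package is installed (pip install ujson) and noting that its dump/load mirror the standard library's; the filename is illustrative:

import ujson

power_time_array = [[1, 2, 3], [1, 2, 3]]
with open('trace.json', 'w') as out:
    ujson.dump(power_time_array, out)

with open('trace.json') as app_trc_file:
    data = ujson.load(app_trc_file)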
Use the json library to efficiently read and write structured information (in the form of JSON) to a text file.
To write data to the file, use json.dump(); to retrieve JSON data from the file, use json.load().
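A short round-trip sketch of those two calls (the filename is illustrative):

import json

# Write the list of lists...
with open('data.json', 'w') as f:
    json.dump([[1, 2, 3], [1, 2, 3]], f)

# ...and read it back.
with open('data.json') as f:
    data = json.load(f)
print(data)  # [[1, 2, 3], [1, 2, 3]]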
It will be faster:
from ast import literal_eval

power_time_array = [[1,2,3],[1,2,3]]
with open(file_name, 'w') as out:
    out.write(repr(power_time_array))

with open(file_name, 'r') as app_trc_file:
    power_trace_of_application.append(literal_eval(app_trc_file.read()))

Python: Converting Entire Directory of JSON to Python Dictionaries to send to MongoDB

I'm relatively new to Python, and extremely new to MongoDB (as such, I'll only be concerned with taking the text files and converting them). I'm currently trying to take a bunch of .txt files that contain JSON and move them into MongoDB. My approach is to open each file in the directory, read each line, convert it from JSON to a dictionary, and then overwrite that line of JSON with the dictionary. Then it'll be in a format to send to MongoDB.
(If there's any flaw in my reasoning, please point it out)
At the moment, I've written this:
"""
Kalil's step by step iteration / write.
JSON dumps takes a python object and serializes it to JSON.
Loads takes a JSON string and turns it into a python dictionary.
So we return json.loads so that we can take that JSON string from the tweet and save it as a dictionary for Pymongo
"""
import os
import json
import pymongo
rootdir='~/Tweets'
def convert(line):
line = file.readline()
d = json.loads(lines)
return d
for subdir, dirs, files in os.walk(rootdir):
for file in files:
f=open(file, 'r')
lines = f.readlines()
f.close()
f=open(file, 'w')
for line in lines:
newline = convert(line)
f.write(newline)
f.close()
But it isn't writing.
Which... as a rule of thumb, if you're not getting the effect that you want, you're making a mistake somewhere.
Does anyone have any suggestions?
When you decode a JSON file you don't need to convert it line by line, as the parser will iterate over the file for you (that is, unless you have one JSON document per line).
Once you've loaded the JSON document you'll have a dictionary, which is an in-memory data structure and cannot be written directly back to a file without first serializing it into some format such as JSON, YAML, or many others (the format MongoDB uses is called BSON, but your driver will handle the encoding for you).
The overall process to load a json file and dump it into mongo is actually pretty simple and looks something like this:
import json
from glob import glob
from pymongo import Connection

db = Connection().test
for filename in glob('~/Tweets/*.txt'):
    with open(filename) as fp:
        doc = json.load(fp)
    db.tweets.save(doc)
A dictionary in Python is an object that lives within the program; you can't save the dictionary directly to a file unless you pickle it (pickling is a way to save objects to files so you can retrieve them later). I think a better approach would be to read the lines from the file, load the JSON (which converts each line to a dictionary), and save that info into MongoDB right away; there is no need to write it back to a file.
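For reference, a minimal pickle round-trip sketch (the filename and data are illustrative), though for this use case loading the JSON and inserting straight into MongoDB, as above, is simpler:

import pickle

d = {'user': 'Usain', 'medals': 9}
with open('data.pkl', 'wb') as f:
    pickle.dump(d, f)

with open('data.pkl', 'rb') as f:
    restored = pickle.load(f)
print(restored)  # {'user': 'Usain', 'medals': 9}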
