I'm trying to read a text file that contains dictionaries separated by commas and convert it to a list of dictionaries.
How can I do this with Python?
I tried reading it as a JSON file and using the split method.
{
"id": "b1",
"name": "Some Name"
},
{
"id": "b2",
"name": "Another Name"
},
....
result should be:
[ {"id" : "b1", "name" : "Some Name"} , {"id" : "b2", "name" : "Another Name"}, .... ]
If your file is not too big, you can do the following:
import json

with open('filename.txt', 'r') as file:
    result = json.loads('[' + file.read() + ']')
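One caveat: if the file itself ends with a comma after the last dictionary (as the .... in the question suggests it might), wrapping the raw text in brackets produces invalid JSON. A small guard, shown here with an inline string standing in for the file contents:

```python
import json

# Inline stand-in for the file contents; note the trailing comma.
raw = '{"id": "b1", "name": "Some Name"},\n{"id": "b2", "name": "Another Name"},'

# Strip whitespace and any trailing comma before wrapping in brackets,
# since '[...,]' is not valid JSON.
result = json.loads('[' + raw.strip().rstrip(',') + ']')
print(result)
```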
You can use the json module in Python.
JSON (JavaScript Object Notation), specified by RFC 7159 (which obsoletes RFC 4627) and by ECMA-404, is a lightweight data interchange format inspired by JavaScript object literal syntax (although it is not a strict subset of JavaScript).
json exposes an API familiar to users of the standard library marshal and pickle modules.
https://docs.python.org/2/library/json.html
In your case, if your file is a valid JSON file, you can use the json.loads() method directly:
import json

with open('test.txt', 'r') as file:
    result = json.loads(file.read())
Welcome to Stack Overflow!
If you are using a JSON file to store data, then I highly recommend the Python json library. How do you use it, you ask? Read this.
If you plan on using a text file to store data, I recommend storing the data a bit differently than the JSON format:
b1,Some Name
b2,Another Name
.
.
This would make reading the text file much easier. Just use split("\n") to separate the lines from one another, and then use split(",") on each line to separate the ID from the name.
Here is the code:
lines = open("filename.txt").read().split("\n")
dictionary_list = []
for item in lines:
    if item:  # skip blank lines
        id, name = item.split(",")
        dictionary_list.append({id: name})
Hope this works..! SDA
Related
I am stuck on a task where my requirement is combining multiple JSON files into a single JSON file and compressing it in an S3 folder.
Somehow I did it, but the JSON contents are merging as dictionary keys. I used a dictionary to load my JSON content from the files, because loading it as a list throws me JSONDecodeError "Extra data: line 1 column 432 (431)".
my files look like below:
file1 (there will be no .json extension)
{"abc":"bcd","12354":"31354321"}
file 2
{"abc":"bcd","12354":"31354321":"hqeddeqf":"5765354"}
my code:
import json
import boto3

s3_client = boto3.client('s3')
bucket_name = '<my bucket>'

def lambda_handler(event, context):
    key = '<Bucket key>'
    jsonfilesname = ['<name of the json files which stored in list>']
    result = []
    json_data = {}
    for f in range(len(jsonfilesname)):
        s3_client.download_file(bucket_name, key + jsonfilesname[f], '/tmp/' + key + jsonfilesname[f])
        infile = open('/tmp/' + jsonfilesname[f]).read()
        json_data[infile] = result
    with open('/tmp/merged_file', 'w') as outfile:
        json.dump(json_data, outfile)
my output for the outfile by the above code is
{
"{"abc":"bcd","12354":"31354321"}: []",
"{"abc":"bcd","12354":"31354321":"hqeddeqf":"5765354"} :[]"
}
my expectation is:
{"abc":"bcd","12354":"31354321"},{"abc":"bcd","12354":"31354321":"hqeddeqf":"5765354"}
Please can someone help and advise what needs to be done to get my expected output?
First of all:
file 2 is not a valid JSON file, correctly it should be:
{
"abc": "bcd",
"12354": "31354321",
"hqeddeqf": "5765354"
}
Also, the output is not valid JSON; what you would expect after merging two JSON files is an array of JSON objects:
[
{
"abc": "bcd",
"12354": "31354321"
},
{
"abc": "bcd",
"12354": "31354321",
"hqeddeqf": "5765354"
}
]
Knowing this, we can write a Lambda to merge JSON files:
import json
import boto3

s3 = boto3.client('s3')

def lambda_handler(event, context):
    bucket = '...'
    jsonfilesname = ['file1.json', 'file2.json']
    result = []
    for key in jsonfilesname:
        data = s3.get_object(Bucket=bucket, Key=key)
        content = json.loads(data['Body'].read().decode("utf-8"))
        result.append(content)
    # Do something with the merged content
    print(json.dumps(result))
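The question also mentions compressing the merged file in S3. A sketch of gzip-compressing the merged list before uploading (the actual upload, e.g. via s3.put_object, is omitted; the sample data is the question's content):

```python
import gzip
import json

merged = [
    {"abc": "bcd", "12354": "31354321"},
    {"abc": "bcd", "12354": "31354321", "hqeddeqf": "5765354"},
]

# Serialize the merged list and gzip-compress it; the resulting bytes can be
# uploaded with s3.put_object(..., Body=body) or written to /tmp first.
body = gzip.compress(json.dumps(merged).encode("utf-8"))

# Round trip: decompressing and parsing yields the original structure.
assert json.loads(gzip.decompress(body).decode("utf-8")) == merged
```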
If you are using AWS, I would recommend S3DistCp for JSON file merging, as it provides a fault-tolerant, distributed approach that can keep up with large files by leveraging MapReduce. However, it does not seem to support in-place merging.
I am trying to make a script that will delete everything within abc2. But right now, it just deletes all the JSON code.
The JSON code is located in a file named "demo".
There are multiple objects like this in the file.
Python:
import json

with open('demo.json', 'w') as dest_file:
    with open('demo.json', 'r') as source_file:
        for parameters in source_file:
            element = json.loads(parameters.strip())
            if 'abc1' in element:
                del element['abc1']
            dest_file.write(json.dumps(element))
snippet of Json:
{
"parameters": [{
"abc1": {
"type": "string",
"defaultValue": "HELLO1"
},
"abc2": {
"type": "string",
"defaultValue": "HELLO2"
}
}]
}
When opening a file with 'w' it clears it, so do it in two steps:
read the content, keep what you need, delete what you need
write the new content
import json

to_keep = []
with open('demo.json') as file:
    content = json.load(file)
    for parameter in content['parameters']:
        print(parameter)
        if 'abc1' in parameter:
            del parameter['abc1']
        to_keep.append(parameter)

with open('demo.json', 'w') as file:
    json.dump({'parameters': to_keep}, file, indent=4)
Opening the file for writing is truncating the file before you can read it.
You should read the entire file into memory, then you can overwrite the file.
You also need to loop through the parameters list, and delete the abc2 properties in its elements. And when you write the JSON back to the file, you need to separate each of them with newline (but it's generally a bad idea to put multiple JSON strings in a single file -- it would be better to collect them all in a list and load and dump it all at once).
import json

with open('demo.json', 'r+') as source_file:
    lines = source_file.readlines()
    source_file.seek(0)  # overwrite the file
    for parameters in lines:
        element = json.loads(parameters.strip())
        for param in element['parameters']:
            if 'abc2' in param:
                del param['abc2']
        source_file.write(json.dumps(element) + '\n')
    source_file.truncate()
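To illustrate the list-based layout suggested above, here is a sketch that keeps every object in one list and dumps it in a single call (the sample data is modeled on the question's snippet):

```python
import json

# One list holding every object, instead of one JSON string per line.
elements = [
    {"parameters": [{"abc1": {"type": "string"}, "abc2": {"type": "string"}}]},
    {"parameters": [{"abc2": {"type": "string"}}]},
]

for element in elements:
    for param in element["parameters"]:
        param.pop("abc2", None)  # remove abc2 wherever it appears

serialized = json.dumps(elements, indent=4)  # write this string back to demo.json
```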
I have a file like this:
machineA=1.2.3.4.5
machineB=2.5.7.8.9
I need to read the above file and make a string like this for each line:
{"test":"hello world.","server":"machineA","ip":"1.2.3.4.5","reason":"data"}
{"test":"hello world.","server":"machineB","ip":"2.5.7.8.9","reason":"data"}
As you can see, in my JSON string only the values of server and ip change; the values of the other two keys always stay the same. How can I generate a string like this for each line?
I am not able to figure out how to make the corresponding JSON for each line and print it to the console.
f = open('hosts.txt')
line = f.readline()
while line:
    print(line)
    machine = line.split("=")[0]
    ip = line.split("=")[1]
    # now how to make a json string for each line
    line = f.readline()
f.close()
I'd suggest making a dictionary then using the built in json module to convert it to a JSON string, because I find that cleaner to read.
import json
mydict = {
"test": "hello world.",
"server": machine,
"ip": ip,
"reason": "data"
}
json.dumps(mydict)
See Robᵩ's answer for how to do it without the json module
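Putting the dictionary approach together with the question's read loop (an inline list stands in for the hosts.txt lines):

```python
import json

# Stand-in for the lines read from hosts.txt.
lines = ["machineA=1.2.3.4.5", "machineB=2.5.7.8.9"]

for line in lines:
    machine, ip = line.strip().split("=")  # strip() drops the trailing newline
    record = {"test": "hello world.", "server": machine, "ip": ip, "reason": "data"}
    print(json.dumps(record))
```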
You don't have to use the JSON library just because the output format is JSON. Try one of these:
output_line = '{"test":"hello world.","server":"%s","ip":"%s","reason":"data"}' % (
    machine, ip)
output_line = '{{"test":"hello world.","server":"{}","ip":"{}","reason":"data"}}'.format(
    machine, ip)
output_line = f'{{"test":"hello world.","server":"{machine}","ip":"{ip}","reason":"data"}}'
Note that the final line only works on Python 3.6 or later, where f-strings were introduced.
I am trying to read a large JSON file (~ 2GB) in python.
The following code works well on small files but doesn't work on large files because of MemoryError on the second line.
in_file = open(sys.argv[1], 'r')
posts = json.load(in_file)
I looked at similar posts and almost everyone suggested to use ijson so I decided to give it a try.
in_file = open(sys.argv[1], 'r')
posts = list(ijson.parse(in_file))
This handled reading the big file, but ijson.parse doesn't return a JSON object the way json.load does, so the rest of my code didn't work:
TypeError: tuple indices must be integers or slices, not str
If I print out "posts" when using json.load, the output looks like normal JSON:
[{"Id": "23400089", "PostTypeId": "2", "ParentId": "23113726", "CreationDate": ... etc
If I print out "posts" after using ijson.parse, the output looks like a list of parse events:
[["", "start_array", null], ["item", "start_map", null],
["item", "map_key", "Id"], ["item.Id", "string ... etc
My question:
I don't want to change the rest of my code, so I am wondering if there is any way to convert the output of ijson.parse(in_file) back to a JSON object so that it's exactly the same as if I had used json.load(in_file)?
Maybe this works for you:
import sys
import ijson

in_file = open(sys.argv[1], 'r')
posts = []
data = ijson.items(in_file, 'item')
for post in data:
    posts.append(post)
I am trying to create a program in python3 (Mac OS X) and tkinter. It takes an incremental id, the datetime.now and a third string as variables. For example,
a window opens displaying : id / date time / "hello world". The user makes a choice and presses a save button. The inputs are being serialised as json and saved in a file.
mytest = dict([('testId', testId), ('testDate', testDate), ('testStyle', testStyle)])
with open('data/test.txt', mode='a', encoding='utf-8') as myfile:
    json.dump(mytest, myfile, indent=2)
    myfile.close()
the result in the file is
{
"testStyle": "blabla",
"testId": "8",
"testDate": "2013-05-09 13:32"
}{
"testDate": "2013-05-09 13:41",
"testId": "9",
"testStyle": "blabla"
}
As a Python newbie, I want to load the file's data and make some checks, like "if the user made another entry on 2013-05-09, display a message saying that you already entered data for today." What is the proper way to load all this JSON data? The list will grow each day and will contain lots of data.
Instead of directly storing the dictionary, you can store a list of dictionaries, which can be loaded back into a list that can be modified and appended to:
import json
mytest1 = dict([('testId','testId1'), ('testDate','testDate1'), ('testStyle','testStyle1')])
json_values = []
json_values.append(mytest1)
s = json.dumps(json_values)
print(s)
json_values = None
mytest2 = dict([('testId','testId2'), ('testDate','testDate2'), ('testStyle','testStyle2')])
json_values = json.loads(s)
json_values.append(mytest2)
s = json.dumps(json_values)
print(s)
You could simply load the file and parse it:
with open(path, mode="r", encoding="utf-8") as myfile:
    data = json.loads(myfile.read())
Now you can do with data whatever you want.
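For the "already entered data today" check the question describes, here is a sketch assuming the file holds one JSON array whose entries carry a testDate string like those in the question:

```python
import json

# Stand-in for json.loads(myfile.read()) on a file storing one array.
data = json.loads(
    '[{"testId": "8", "testDate": "2013-05-09 13:32", "testStyle": "blabla"},'
    ' {"testId": "9", "testDate": "2013-05-09 13:41", "testStyle": "blabla"}]'
)

def entered_on(entries, day):
    # True if any entry's testDate starts with the given "YYYY-MM-DD" day.
    return any(entry["testDate"].startswith(day) for entry in entries)

print(entered_on(data, "2013-05-09"))  # entries already exist for that day
```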
If your file is really big, then I suppose you should use a proper database instead.