Python Dictionary Stored as String in List - python

I have a txt file that contains a dictionary in Python and I have opened it in the following manner:
with open('file') as f:
    test = list(f)
The result when I look at test is a list of one element. This first element is a string of the dictionary (which also contains other dictionaries), so it looks like:
["ID": 1, date: "2016-01-01", "A": {name: "Steve", id: "534", players:{last: "Smith", first: "Joe", job: "IT"}}
Is there any way to store this as the dictionary without having to find a way to determine the indices of the characters where the different keys and corresponding values begin/end? Or is it possible to read in the file in a way that recognizes the data as a dictionary?

If you are reading a json file then you can use the json module.
import json
with open('data.json') as f:
    data = json.load(f)
If you are sure that the file you are reading contains python dictionaries, then you can use the built-in ast.literal_eval to convert those strings to a Python dictionary:
>>> import ast
>>> a = ast.literal_eval("{'a' : '1', 'b' : '2'}")
>>> a
{'a': '1', 'b': '2'}
>>> type(a)
<type 'dict'>
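There is an alternative method, eval, but ast.literal_eval is the better choice: eval executes any expression it is given, while ast.literal_eval only accepts Python literals (strings, numbers, tuples, lists, dicts, sets, booleans and None).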
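If you are not sure which of the two formats the file uses, the two approaches can be combined. The following is only a minimal sketch: parse_dict_text is a hypothetical helper name (not part of any library), and it assumes the text is either valid JSON or a valid Python literal.

```python
import ast
import json


def parse_dict_text(text):
    """Parse text holding either JSON or a Python dict literal (hypothetical helper)."""
    try:
        # JSON requires double-quoted keys and strings.
        return json.loads(text)
    except json.JSONDecodeError:
        # Fall back to Python literal syntax (single quotes, True/None, tuples).
        return ast.literal_eval(text)


print(parse_dict_text('{"ID": 1, "date": "2016-01-01"}'))
print(parse_dict_text("{'a': '1', 'b': '2'}"))
```

If the text is neither valid JSON nor a valid literal, ast.literal_eval raises an exception (typically ValueError or SyntaxError), which the caller would need to handle.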

You can use the json module.
For writing:
import json
data = {"ID": 1, "date": "2016-01-01", "A": {"name": "Steve", "id": "534", "players": {"last": "Smith", "first": "Joe", "job": "IT"}}}
with open("out.json", "w") as f:
    json.dump(data, f)
For reading:
import json
with open("out.json") as f:
    data = json.load(f)
print(data)

Just use eval() when you read it from your file.
for example:
>>> f = open('file.txt', 'r').read()
>>> mydict = eval(f)
>>> type(f)
<class 'str'>
>>> type(mydict)
<class 'dict'>

The Python interpreter thinks you are just trying to read an external file as text. It does not know that your file contains formatted content. One easy way to import it as a dictionary would be to write a second Python file that contains the dictionary:
# mydict.py
myImportedDict = {
    "ID": 1,
    "date": "2016-01-01",
    "A": {
        "name": "Steve",
        "id": "534",
        "players": {
            "last": "Smith",
            "first": "Joe",
            "job": "IT"
        }
    }
}
Then, you can import the dictionary and use it in another file:
#import_test.py
from mydict import myImportedDict
print(type(myImportedDict))
print(myImportedDict)
If mydict.py is in a subfolder that you import as a package, Python versions before 3.3 also require that folder to contain a file called
__init__.py
which can be blank. When the two files above sit in the same folder, no extra file is needed.
If your source file is meant to be in JSON format, you can use the json library instead, which comes packaged with Python: https://docs.python.org/2/library/json.html

Related

convert json to csv and store it in a variable in python

I am using csv module to convert json to csv and store it in a file or print it to stdout.
import csv
import sys

def write_csv(data:list, header:list, path:str=None):
    # data is json format data as list
    output_file = open(path, 'w') if path else sys.stdout
    out = csv.writer(output_file)
    out.writerow(header)
    for row in data:
        out.writerow([row[attr] for attr in header])
    if path: output_file.close()
I want to store the converted csv to a variable instead of sending it to a file or stdout.
say I want to create a function like this:
def json_to_csv(data:list, header:list):
    # convert json data into csv string
    return string_csv
NOTE: the format of data is simple
data is a list of dictionaries with string-to-string mapping
[
{
"username":"srbcheema",
"name":"Sarbjit Singh"
},
{
"username":"testing",
"name":"Test, user"
}
]
I want csv output to look like:
username,name
srbcheema,Sarbjit Singh
testing,"Test, user"
Converting JSON to CSV is not a trivial operation. There is also no standardized way to translate between them...
For example
my_json = {
    "one": 1,
    "two": 2,
    "three": {
        "nested": "structure"
    }
}
Could be represented in a number of ways...
These are all (to my knowledge) valid CSVs that contain all the information from the JSON structure.
data
'{"one": 1, "two": 2, "three": {"nested": "structure"}}'
one,two,three
1,2,'{"nested": "structure"}'
one,two,three__nested
1,2,structure
In essence, you will have to figure out the best translation between the two based on your knowledge of the data. There is no right answer on how to go about this.
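To make the third representation above concrete (the one with the three__nested column), here is a rough sketch of one way to flatten nested keys. The flatten function and the "__" separator are illustrative choices, not a standard, and the sketch assumes only dicts are nested (no lists).

```python
def flatten(obj, parent_key="", sep="__"):
    """Flatten nested dicts into one level, joining keys with `sep`."""
    flat = {}
    for key, value in obj.items():
        full_key = parent_key + sep + key if parent_key else key
        if isinstance(value, dict):
            # Recurse into nested dicts and merge their flattened keys.
            flat.update(flatten(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


my_json = {"one": 1, "two": 2, "three": {"nested": "structure"}}
flat = flatten(my_json)
print(flat)
```

The flattened keys can then serve as the CSV header and the values as the row.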
I'm relatively new to Python so there's probably a better way, but this works:
def get_safe_string(string):
    # Quote values that contain a comma so they stay in one column
    return '"'+string+'"' if "," in string else string
def json_to_csv(data):
    csv_keys = data[0].keys()
    header = ",".join(csv_keys)
    res = list(",".join(get_safe_string(row.get(k)) for k in csv_keys) for row in data)
    res.insert(0,header)
    return "\n".join(r for r in res)
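Another option is to keep using the csv module and point the writer at an in-memory buffer instead of a file. This is only a sketch under the assumptions stated in the question (a list of flat string-to-string dictionaries); passing lineterminator="\n" is my choice so the result matches the desired output, since csv.writer otherwise ends rows with "\r\n".

```python
import csv
import io


def json_to_csv(data, header):
    """Return the rows in `data` (a list of dicts) as a CSV-formatted string."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    for row in data:
        writer.writerow([row[attr] for attr in header])
    return buffer.getvalue()


data = [
    {"username": "srbcheema", "name": "Sarbjit Singh"},
    {"username": "testing", "name": "Test, user"},
]
string_csv = json_to_csv(data, ["username", "name"])
print(string_csv)
```

The csv module takes care of quoting, so values containing commas or quotes do not need special handling.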

Convert Array of JSON Objects to CSV - Python [duplicate]

I have converted a simple JSON file to CSV successfully.
I am facing an issue when the file contains an array of JSON objects on each line.
I am using the csv module, not pandas, for the conversion.
The content below shows which input is processed successfully and which fails:
Success (when the file contains a single list/array of JSON objects):
[{"value":0.97,"key_1":"value1","key_2":"value2","key_3":"value3","key_11":"2019-01-01T00:05:00Z"}]
Fail :
[{"value":0.97,"key_1":"value1","key_2":"value2","key_3":"value3","key_11":"2019-01-01T00:05:00Z"}]
[{"value":0.97,"key_1":"value1","key_2":"value2","key_3":"value3","key_11":"2019-01-01T00:05:00Z"}]
[{"value":0.97,"key_1":"value1","key_2":"value2","key_3":"value3","key_11":"2019-01-01T00:05:00Z"}]
The json.loads function throws the following exception:
Extra data: line 1 column 6789 (char 1234)
How can I process such files?
EDIT:
This file is flushed by Kinesis Firehose and pushed to S3.
I am using Lambda to download the file, load it, and transform it,
so it is not a .json file.
Parse each line like so:
import json
with open('input.json') as f:
    for line in f:
        obj = json.loads(line)
This is needed because your file as a whole is not valid JSON. You have to read the file line by line and convert each line to an object individually.
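Putting the line-by-line approach together with the csv module might look roughly like this. It is a sketch, not a drop-in solution: the raw string stands in for the downloaded file, the sample values and the shortened field list are illustrative, and it assumes every non-blank line holds one JSON array of flat objects.

```python
import csv
import io
import json

# Stand-in for the downloaded file: one JSON array per line (illustrative values).
raw = (
    '[{"value": 0.97, "key_1": "value1", "key_11": "2019-01-01T00:05:00Z"}]\n'
    '[{"value": 0.98, "key_1": "value2", "key_11": "2019-01-01T00:10:00Z"}]\n'
)

records = []
for line in io.StringIO(raw):
    line = line.strip()
    if not line:
        continue  # skip blank lines
    records.extend(json.loads(line))  # each line is a list of objects

out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=["value", "key_1", "key_11"], lineterminator="\n")
writer.writeheader()
writer.writerows(records)
csv_text = out.getvalue()
print(csv_text)
```

With a real file, replace io.StringIO(raw) with the opened file object and out with a file opened for writing.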
Or, you can convert your file structure like this...
[
{
"value": 0.97,
"key_1": "value1",
"key_2": "value2",
"key_3": "value3",
"key_11": "2019-01-01T00:05:00Z"
},
{
"value": 0.97,
"key_1": "value1",
"key_2": "value2",
"key_3": "value3",
"key_11": "2019-01-01T00:05:00Z"
},
{
"value": 0.97,
"key_1": "value1",
"key_2": "value2",
"key_3": "value3",
"key_11": "2019-01-01T00:05:00Z"
}
]
and it will be a valid JSON file.
As tanaydin said, your failing input is not valid json. It should look something like this:
[
{
"value":0.97,
"key_1":"value1",
"key_2":"value2",
"key_3":"value3",
"key_11":"2019-01-01T00:05:00Z"
},
{"value":0.97,"key_1":"value1","key_2":"value2","key_3":"value3","key_11":"2019-01-01T00:05:00Z"},
{"value":0.97,"key_1":"value1","key_2":"value2","key_3":"value3","key_11":"2019-01-01T00:05:00Z"}
]
I assume you're creating the json output by iterating over a list of objects and calling json.dumps on each one. You should create your list of dictionaries, then call json.dumps on the whole list instead.
import json
list_of_dicts_to_jsonify = []  # must be a list so that append() works
object_attributes = ['value', 'key_1', 'key_2', 'key_3', 'key_11']
for item in list_of_objects:
    # Convert object to dictionary
    obj_dict = {}
    for k in object_attributes:
        # Missing attributes become None
        obj_dict[k] = getattr(item, k, None)
    list_of_dicts_to_jsonify.append(obj_dict)
json_output = json.dumps(list_of_dicts_to_jsonify)

Split JSON File into multiple JSONs according to their ID?

I have a JSON File which looks like this:
{"one":"Some data", "two":"Some data",...}
and so on...
I want to split all the ID's into separate files according to the name of the ID, For example:
one.json
{"one":"Some data"}
two.json
{"two":"Some data"}
and so on.
I got a reference from this. But my problem is slightly different. What can I modify to achieve the separate text files?
I won't teach you how to do file I/O and assume you can do that yourself.
Once you have loaded the original file as a dict with the json module, do
>>> org = {"one":"Some data", "two":"Some data"}
>>> dicts = [{k:v} for k,v in org.items()]
>>> dicts
[{'two': 'Some data'}, {'one': 'Some data'}]
which will give you a list of dictionaries that you can dump to a file (or separate files named after the keys), if you wish.
After loading the JSON file you can treat it as a dictionary in python and then save the contents in file by looping through as you would in normal python dictionary.
Here is an example related to what you want to achieve
Data = {"one":"Some data", "two":"Some data"}
for item in Data:
    name = item + '.json'
    with open(name, 'w') as file:
        file.write('{"%s":"%s"}' % (item, Data[item]))
After getting the JSON data into a variable, do:
a = {"one":"Some data", "two":"Some data"}
for k,v in a.items():
    with open(k+".json","w") as f:
        f.write('{"%s" : "%s"}' %(k,v))
and output is :
one.json => {"one":"Some data"}
and
two.json => {"two":"Some data"}
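Building the JSON text with string formatting breaks as soon as a value contains a quote or is not a string. A variation that lets the json module do the serialization could look like the sketch below; the temporary output directory is only there to keep the example self-contained, so substitute your own path.

```python
import json
import os
import tempfile

org = {"one": "Some data", "two": "Some data"}
out_dir = tempfile.mkdtemp()  # hypothetical output location

for key, value in org.items():
    path = os.path.join(out_dir, key + ".json")
    with open(path, "w") as f:
        json.dump({key: value}, f)  # json handles quoting and escaping

# Read one file back to check the result.
with open(os.path.join(out_dir, "one.json")) as f:
    one = json.load(f)
print(one)
```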

How to save a dictionary into a file, keeping nice format?

If I have dictionary like:
{
    "cats": {
        "sphinx": 3,
        "british": 2
    },
    "dogs": {}
}
And try to save it to a text file, I get something like this:
{"cats": {"sphinx": 3}, {"british": 2}, "dogs": {}}
How can I save a dictionary in pretty format, so it will be easy to read by human eye?
You can import json and specify an indent level:
import json
d = {
    "cats": {
        "sphinx": 3,
        "british": 2
    },
    "dogs": {}
}
j = json.dumps(d, indent=4)
print(j)
{
    "cats": {
        "sphinx": 3,
        "british": 2
    },
    "dogs": {}
}
Note that this is a string, however:
>>> j
'{\n    "cats": {\n        "sphinx": 3,\n        "british": 2\n    },\n    "dogs": {}\n}'
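Since the question is about saving to a file, note that json.dump accepts the same indent argument and writes straight to an open file. A small sketch, with a temporary path standing in for whatever file name you actually use:

```python
import json
import os
import tempfile

d = {"cats": {"sphinx": 3, "british": 2}, "dogs": {}}
path = os.path.join(tempfile.mkdtemp(), "pets.json")  # hypothetical file name

# Write the dictionary with 4-space indentation.
with open(path, "w") as f:
    json.dump(d, f, indent=4)

# Load it back to confirm nothing was lost.
with open(path) as f:
    loaded = json.load(f)
print(loaded)
```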
You can use pprint for that:
import pprint
pprint.pformat(thedict)
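pformat only returns the formatted string, so it still has to be written to the file. Roughly, and assuming the dictionary contains only plain Python literals so that ast.literal_eval can read it back (the path and the width value are arbitrary choices for this sketch):

```python
import ast
import os
import pprint
import tempfile

thedict = {"cats": {"sphinx": 3, "british": 2}, "dogs": {}}
path = os.path.join(tempfile.mkdtemp(), "pets.txt")  # hypothetical file name

# A small width forces pformat to wrap the dict over several lines.
with open(path, "w") as f:
    f.write(pprint.pformat(thedict, width=20))

# The file holds a valid Python literal, so it can be parsed safely.
with open(path) as f:
    restored = ast.literal_eval(f.read())
print(restored)
```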
If you want to save it in a more standard format, you can also use, for example, a yaml file (and the related python package http://pyyaml.org/wiki/PyYAMLDocumentation), and the code would look like:
import yaml
dictionary = {"cats": {"sphinx": 3, "british": 2}, "dogs": {}}
with open('dictionary_file.yml', 'w') as yaml_file:
    yaml.dump(dictionary, stream=yaml_file, default_flow_style=False)
dump creates a string in YAML format. If you pass the stream argument, the content is written to the file immediately. If you need the string itself before writing to the file, omit stream and write the returned string with the file's write method.
Note also that default_flow_style=False gives a nicer block-style format; in the example the file looks like this:
cats:
  british: 2
  sphinx: 3
dogs: {}
To load again the yaml file in a dictionary:
import yaml
with open('dictionary_file.yml', 'r') as yaml_file:
    # safe_load only builds plain Python objects and needs no Loader argument
    dictionary = yaml.safe_load(yaml_file)
You can dump it by using the Python Object Notation module (pon: disclaimer I am the author of that module)
from pon import PON, loads
data = {
"cats": {
"sphinx": 3,
"british": 2
},
"dogs": {}
}
pon = PON(obj=data)
pon.dump()
which gives:
dict(
cats=dict(
sphinx=3,
british=2,
),
dogs=dict( ),
)
which is again valid Python, but it avoids the quoted strings needed for keys by using dict() with keyword arguments.
You can load this again with:
read_back = loads(open('file_name.pon').read())
print(read_back)
giving:
{'cats': {'sphinx': 3, 'british': 2}, 'dogs': {}}
Please note that loads() does not evaluate the string, it actually parses it safely using python's built-in parser.
PON also allows you to load Python dictionaries from files that have commented entries, and dump them while preserving the comments. This is where its real usefulness comes into play.
Alternatively, if you would like something arguably more readable, like the YAML format, you can use ruamel.yaml and do:
import ruamel.yaml
ruamel.yaml.round_trip_dump(data, stream=open('file_name.yaml', 'wb'), indent=4)
which gives you a file file_name.yaml with contents:
cats:
    sphinx: 3
    british: 2
dogs: {}
which uses the indent you seem to prefer (and is more efficient than @alberto's version)

Add lines from file as dictionary to list Python

I'm trying to import a file which contains lines like this:
{ "dictitem" : 1, "anotherdictitem" : 2 }
I want to import them in to a list of dictionaries like this:
[{ "dictitem" : "henry", "anotherdictitem" : 2 },{ "dictitem" : "peter", "anotherdictitem" : 4 },{ "dictitem" : "anna", "anotherdictitem" : 6 }]
I tried this: tweetlist = open("sample.out").readlines()
But then they get appended as a string. Does anyone have an idea?
Thanks!
You need to decode each line using json. Example:
import json
dicts = []  # avoid naming variables list or dict, which shadows the built-ins
with open("data.txt", 'r') as file:
    for line in file:
        item = json.loads(line)
        dicts.append(item)
print(dicts)
You can use the AST library's literal_eval function on each line by using a list comprehension like this:
import ast
tweetlist = [ast.literal_eval(x) for x in open("sample.out").readlines()]
ast.literal_eval is a safer version of the eval function: it only accepts Python literals and does not execute function calls or other code.
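Either approach fails on an empty line, which is common at the end of such files. A compact variation that skips blank lines is sketched below; the in-memory sample stands in for sample.out, and the lines are assumed to be valid JSON.

```python
import io
import json

# Stand-in for open("sample.out"): two records with a blank line in between.
sample = io.StringIO(
    '{ "dictitem" : "henry", "anotherdictitem" : 2 }\n'
    '\n'
    '{ "dictitem" : "peter", "anotherdictitem" : 4 }\n'
)

# Parse every non-blank line into a dictionary.
tweetlist = [json.loads(line) for line in sample if line.strip()]
print(tweetlist)
```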
