How to create your own JSON parser library in Python

I was asked this question in an interview:
Write a library in Python that can parse JSON data.
The input can be a JSON object, a JSON string, or any other JSON value.
It should be able to handle all of the JSON data types.
Example JSON data:
{
"name": "JaneDoe",
"age": 42,
"smoking": false,
"education": {
"school": "abcSchool",
"University": "xyz"
},
"certificates": ["ccna", "python", "aws"],
"salary": 4200.0,
"profile_img": "https://ii.abc.com/jpeg/profiles/jane.jpg"
}

https://www.json.org/json-ru.html gives you all you need to start: it has simple block diagrams showing how to parse each of the supported types.
So just rewrite them in Python.
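As a rough illustration of what "rewriting the diagrams" can look like, here is a minimal recursive-descent sketch with one function per diagram (value, object, array, string). All function and variable names are my own, and the sketch is deliberately lenient: it does not reject raw control characters inside strings and does not combine \u surrogate pairs, so treat it as a starting point rather than a complete parser.

```python
import re

NUMBER = re.compile(r"-?(?:0|[1-9]\d*)(?:\.\d+)?(?:[eE][+-]?\d+)?")
ESCAPES = {'"': '"', "\\": "\\", "/": "/", "b": "\b", "f": "\f",
           "n": "\n", "r": "\r", "t": "\t"}
LITERALS = (("true", True), ("false", False), ("null", None))

def parse_json(text):
    """Parse a complete JSON document into the matching Python value."""
    value, pos = parse_value(text, skip_ws(text, 0))
    pos = skip_ws(text, pos)
    if pos != len(text):
        raise ValueError(f"unexpected trailing data at position {pos}")
    return value

def skip_ws(text, pos):
    """Return the index of the first non-whitespace character at or after pos."""
    while pos < len(text) and text[pos] in " \t\n\r":
        pos += 1
    return pos

def expect(text, pos, char):
    """Check that text[pos] is char and return the position just after it."""
    if text[pos:pos + 1] != char:
        raise ValueError(f"expected {char!r} at position {pos}")
    return pos + 1

def parse_value(text, pos):
    """Dispatch on the first character, as the 'value' diagram does."""
    char = text[pos:pos + 1]
    if char == "{":
        return parse_object(text, pos)
    if char == "[":
        return parse_array(text, pos)
    if char == '"':
        return parse_string(text, pos)
    for word, value in LITERALS:
        if text.startswith(word, pos):
            return value, pos + len(word)
    match = NUMBER.match(text, pos)
    if match is None:
        raise ValueError(f"unexpected input at position {pos}")
    literal = match.group()
    # A fraction or exponent makes the number a float; otherwise keep an int.
    is_float = any(c in literal for c in ".eE")
    return (float(literal) if is_float else int(literal)), match.end()

def parse_object(text, pos):
    result = {}
    pos = skip_ws(text, pos + 1)  # step over "{"
    if text[pos:pos + 1] == "}":
        return result, pos + 1
    while True:
        key, pos = parse_string(text, skip_ws(text, pos))
        pos = expect(text, skip_ws(text, pos), ":")
        result[key], pos = parse_value(text, skip_ws(text, pos))
        pos = skip_ws(text, pos)
        if text[pos:pos + 1] == "}":
            return result, pos + 1
        pos = expect(text, pos, ",")

def parse_array(text, pos):
    result = []
    pos = skip_ws(text, pos + 1)  # step over "["
    if text[pos:pos + 1] == "]":
        return result, pos + 1
    while True:
        value, pos = parse_value(text, skip_ws(text, pos))
        result.append(value)
        pos = skip_ws(text, pos)
        if text[pos:pos + 1] == "]":
            return result, pos + 1
        pos = expect(text, pos, ",")

def parse_string(text, pos):
    pos = expect(text, pos, '"')
    chars = []
    while True:
        char = text[pos:pos + 1]
        if char == "":
            raise ValueError("unterminated string")
        if char == '"':
            return "".join(chars), pos + 1
        if char == "\\":
            escape = text[pos + 1:pos + 2]
            if escape == "u":  # four hex digits, e.g. \u0041
                chars.append(chr(int(text[pos + 2:pos + 6], 16)))
                pos += 6
            elif escape in ESCAPES:
                chars.append(ESCAPES[escape])
                pos += 2
            else:
                raise ValueError(f"bad escape at position {pos}")
        else:
            chars.append(char)
            pos += 1

sample = ('{"name": "JaneDoe", "age": 42, "smoking": false, '
          '"certificates": ["ccna", "python", "aws"], "salary": 4200.0}')
profile = parse_json(sample)
```

Every parse function takes the text and a position and returns the parsed value together with the position after it, which keeps the functions free of shared state and easy to test one by one.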

Related

How can I read json file to collect the attribute values in different companies?

As the title explains, I am having difficulty reading a JSON file to collect attribute values from a company database in Python, and I would like to store the values in a numpy.ndarray. What I want to do is loop over all the company numbers and select each company's own values.
For example:
"0000059745": {
"Income": 5928375,
"Assets": 958273479
},
"0000212498": "Empty dictionary.",
"0000310826": {
"Income": 1928474,
"Assets": 2938479
}
However, the related questions I found elsewhere only give basic instructions for reading a JSON file that uses the same name for a single company, and none of them deal with the problem I have.
For instance:
"comp1": {
"Income": 5928375,
"Assets": 958273479
},
"comp1": "Empty dictionary.",
"comp1": {
"Income": 1928474,
"Assets": 2938479
}
Hence, what I would like to do is something like the pseudocode below:
with open("input/company_data.json") as f:
    for comp in companies["number"]:
        for var in company_variables["Assets"]["Income"]:
            # Store both the Assets and Income attribute values in dataNdArr as a numpy.ndarray
            dataNdArr = var
I hope someone can help me with this problem of reading a more deeply nested JSON file.
Thank you.
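One possible way to approach this, sketched under the assumption that the file is a single JSON object keyed by company number and that entries such as "Empty dictionary." should simply be skipped. The helper name collect_rows and the inline raw string (standing in for input/company_data.json) are my own.

```python
import json

# Stand-in for the contents of input/company_data.json (assumed layout).
raw = """
{
  "0000059745": {"Income": 5928375, "Assets": 958273479},
  "0000212498": "Empty dictionary.",
  "0000310826": {"Income": 1928474, "Assets": 2938479}
}
"""

def collect_rows(companies, fields=("Income", "Assets")):
    """Return (company_ids, rows) for every company whose value is a dict."""
    ids, rows = [], []
    for company_id, attributes in companies.items():
        if not isinstance(attributes, dict):
            continue  # skip placeholders such as "Empty dictionary."
        ids.append(company_id)
        # A missing field becomes None rather than raising KeyError.
        rows.append([attributes.get(field) for field in fields])
    return ids, rows

ids, rows = collect_rows(json.loads(raw))
```

If NumPy is installed, numpy.array(rows) should turn the list of rows into a 2-D numpy.ndarray with one row per company and one column per field, while ids keeps track of which row belongs to which company.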

Count unique values in a JSON

I have a JSON file called thefile.json which looks like this:
{
"domain": "Something",
"domain": "Thingie",
"name": "Another",
"description": "Thing"
}
I am trying to write a Python script which would make a set of the values in domain. In this example it would return
{'Something', 'Thingie'}
Here is what I tried:
import json
with open("thefile.json") as my_file:
    data = json.load(my_file)
ids = set(item["domain"] for item in data.values())
print(ids)
I get the error message
unique_ids.add(item["domain"])
TypeError: string indices must be integers
Having looked up answers on stack exchange, I'm stumped. Why can't I have a string as an index, seeing as I am using a json whose data type is a dictionary (I think!)? How do I get it so that I can get the values for "domain"?
So, to start, you can read more about JSON formats here: https://www.w3schools.com/python/python_json.asp
Second, dictionaries must have unique keys. Therefore, having two keys named domain is incorrect. You can read more about python dictionaries here: https://www.w3schools.com/python/python_dictionaries.asp
Now, I recommend the following two designs that should do what you need:
Multiple Names, Multiple Domains: In this design, you can access websites and check the domain of each of its values like ids = set(item["domain"] for item in data["websites"])
{
"websites": [
{
"domain": "Something.com",
"name": "Something",
"description": "A thing!"
},
{
"domain": "Thingie.com",
"name": "Thingie",
"description": "A thingie!"
}
]
}
One Name, Multiple Domains: In this design, each website has multiple domains that can be accessed using JVM_Domains = set(data["domains"])
{
"domains": ["Something.com","Thingie.com","Stuff.com"],
"name": "Me Domains",
"description": "A list of domains belonging to Me"
}
I hope this helps. Let me know if I missed any details.
Your JSON has a problem: duplicate keys. I am not sure whether they are forbidden, but I am sure the file is badly formatted.
Besides that, they will of course cause you a lot of problems.
A dictionary cannot have duplicate keys; what would a lookup of a duplicated key return?
So fix your JSON, with something like this:
{
"domain": ["Something", "Thingie"],
"name": "Another",
"description": "Thing"
}
Guess what: a good format almost solves your problem (you can still have duplicates in the list) :)
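To make the cause of the TypeError concrete: Python's json module keeps only the last value for a repeated key, so data.values() yields plain strings, and item["domain"] then tries to index a string with a string. If the file really cannot be changed, the object_pairs_hook parameter of json.loads / json.load receives every key/value pair, duplicates included. The sketch below uses it; the helper name collect_duplicates is my own.

```python
import json

text = ('{"domain": "Something", "domain": "Thingie", '
        '"name": "Another", "description": "Thing"}')

# Default behaviour: the later duplicate wins and the values are strings,
# which is why item["domain"] fails with "string indices must be integers".
data = json.loads(text)

def collect_duplicates(pairs):
    """Build a dict that maps each key to the list of all values seen for it."""
    merged = {}
    for key, value in pairs:
        merged.setdefault(key, []).append(value)
    return merged

# object_pairs_hook is called with the list of (key, value) pairs of each object.
grouped = json.loads(text, object_pairs_hook=collect_duplicates)
domains = set(grouped["domain"])
```

Note that with this hook every value becomes a list, even for keys that appear only once, so restructuring the JSON as suggested above is still the cleaner fix.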

Write JSON for corresponding XML

I want to write JSON that can be converted into XML of a fixed format:
<function>foo
<return>uint32_t</return>
<param>count
<type>uint32_t</type>
</param>
</function>
I have tried multiple ways to write JSON that converts into the format above, but none of them is quite right, because no separate key is required for foo and count, which are otherwise orphan values.
Tried ways:
Way 1:
{
"function" :
{"foo":
{"return":"uint32_t"},
"param":
{"count":
{"type":"uint32_t"}
}
}
}
Way 2:
{
"function" :
["foo",{"return":"uint32_t"}],
"param":
["count",{"type":"uint32_t"}]
}
Way 3: But I do not need the name tag :(
{
"function":
{"name": "foo",
"return": "uint32_t",
"param": "count",
"type": "uint32_t"
}
}
For generating output and testing please use:
JSON to XML convertor
I would appreciate your help. Later on, I have a script that converts the formatted excel to C header files.
It is very rare for a JSON-to-XML conversion library to give you precise control over the XML that is generated, or conversely, for an XML-to-JSON converter to give you precise control over the JSON that is generated. It's basically not possible because the data models are very different.
Typically you have to accept what the JSON-to-XML converter gives you, and then use XSLT to transform it into the flavour of XML that you actually want.
(Consider using the json-to-xml() conversion function in XSLT 3.0 and then applying template rules to the result.)
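If XSLT is not an option, a different route is a small hand-written converter, which gives full control over the output. The sketch below is not the XSLT approach described above; it uses Python's xml.etree.ElementTree and assumes a made-up convention in which the key "#text" holds an element's leading text (foo, count) instead of becoming a tag. The names TEXT_KEY and build_element are my own.

```python
import xml.etree.ElementTree as ET

TEXT_KEY = "#text"  # hypothetical convention: this key holds the element's own text

def build_element(tag, node):
    """Turn a JSON-like value into an Element named tag."""
    element = ET.Element(tag)
    if isinstance(node, dict):
        element.text = node.get(TEXT_KEY)
        for child_tag, child in node.items():
            if child_tag != TEXT_KEY:
                element.append(build_element(child_tag, child))
    else:
        element.text = str(node)
    return element

spec = {
    "function": {
        "#text": "foo",
        "return": "uint32_t",
        "param": {"#text": "count", "type": "uint32_t"},
    }
}

# The top-level object is assumed to have exactly one key: the root tag.
(root_tag, root_node), = spec.items()
xml_text = ET.tostring(build_element(root_tag, root_node), encoding="unicode")
# xml_text == "<function>foo<return>uint32_t</return><param>count<type>uint32_t</type></param></function>"
```

The result is the same structure as the target XML, just without line breaks; the converter does not handle repeated sibling tags or attributes.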

How to copy a python script which includes dictionaries to a new python script?

I have a python script which contains dictionaries and is used as input from another python script which performs calculations. I want to use the first script which is used as input, to create more scripts with the exact same structure in the dictionaries but different values for the keys.
Original Script: Car1.py
Owner = {
"Name": "Jim",
"Surname": "Johnson",
}
Car_Type = {
"Make": "Ford",
"Model": "Focus",
"Year": "2008"
}
Car_Info = {
"Fuel": "Gas",
"Consumption": 5,
"Max Speed": 190
}
I want to be able to create more input files with identical format but for different cases, e.g.
New Script: Car2.py
Owner = {
"Name": "Nick",
"Surname": "Perry",
}
Car_Type = {
"Make": "BMW",
"Model": "528",
"Year": "2015"
}
Car_Info = {
"Fuel": "Gas",
"Consumption": 10,
"Max Speed": 280
}
So far, I have only seen answers that write just the keys and the values to a new file, but not the actual name of the dictionary as well. Can someone provide some help? Thanks in advance!
If you really want to do it that way (not recommended, both for the reasons stated in the comment by spectras and because good alternatives exist) and import your input Python file:
This question has answers on how to read the dictionary names from the imported module (using the module's __dict__ and filtering for variables that do not start with "__").
Then get the new values for the dictionary entries and construct the new dicts.
Finally, you need to write an exporter that takes care of storing the data in a Python-readable form, just like you would construct a normal text file.
I do not see any advantage over just storing it in a storage format.
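For what it is worth, the three steps could look roughly like the sketch below. It is only an illustration: export_dicts and overrides are names I made up, and the car1 dict stands in for vars(Car1), which is what you would pass after import Car1. pprint.pformat with sort_dicts=False (Python 3.8+) is used so the key order of the original is kept.

```python
import pprint

def export_dicts(namespace, overrides):
    """Render every public dict in namespace as Python source text.

    overrides maps a dictionary name to the entries that should change.
    """
    chunks = []
    for name, value in namespace.items():
        if name.startswith("__") or not isinstance(value, dict):
            continue  # skip module internals and anything that is not a dict
        updated = {**value, **overrides.get(name, {})}
        chunks.append(f"{name} = {pprint.pformat(updated, sort_dicts=False)}\n")
    return "\n".join(chunks)

# Stand-in for vars(Car1); a real module namespace also has "__name__" etc.
car1 = {
    "__name__": "Car1",
    "Owner": {"Name": "Jim", "Surname": "Johnson"},
    "Car_Type": {"Make": "Ford", "Model": "Focus", "Year": "2008"},
}

source = export_dicts(car1, {"Owner": {"Name": "Nick", "Surname": "Perry"}})
# Writing source to Car2.py would then be an ordinary text-file write.
```

This only works for values whose repr is valid Python (strings, numbers, lists, nested dicts), which is one more reason a plain storage format is the simpler choice.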
Read the file with something like
with open('yourfile.py', 'r') as f:
    text = f.read().split('\n')
and then interpret the list of strings you get. After that you can save it with something like
with open('newfile.py', 'w') as new_text:
    # split('\n') removed the newlines, so put them back when writing
    new_text.write('\n'.join(text))
As spectras said earlier, this is not ideal, but if that's what you want to do, go for it.

Python: Mutability and dictionaries in config

I want to keep some large, static dictionaries in config to keep my main application code clean. Another reason for doing that is so the dicts can be occasionally edited without having to touch the application.
I thought a good solution was using a json config a la:
http://www.ilovetux.com/Using-JSON-Configs-In-Python/
JSON is a natural, readable format for this type of data. Example:
{
"search_dsl_full": {
"function_score": {
"boost_mode": "avg",
"functions": [
{
"filter": {
"range": {
"sort_priority_inverse": {
"gte": 200
}
}
},
"weight": 2.4
}
],
"query": {
"multi_match": {
"fields": [
"name^10",
"search_words^5",
"description",
"skuid",
"backend_skuid"
],
"operator": "and",
"type": "cross_fields"
}
},
"score_mode": "multiply"
}
}
}
The big problem is, when I import it into my python app and set a dict equal to it like this:
import json

with open("config.json", "r") as fin:
    config = json.load(fin)
...
def create_query():
    query_dsl = config['search_dsl_full']
    return query_dsl
and then later, only when a certain condition is met, I need to update that dict like this:
if (special condition is met):
    query_dsl['function_score']['query']['multi_match']['operator'] = 'or'
Since query_dsl is a reference, it updates the config dictionary too. So when I call the function again, it reflects the updated-for-special-condition version ("or") rather than the desired config default ("and").
I realize this is a newb issue (yes, I'm a python newb), but I can't seem to figure out a 'pythonic' solution. I'm trying to not be a hack.
Possible options:
When I set query_dsl equal to the config dict, use copy.deepcopy()
Figure out how to make all nested slices of the config dictionary immutable
Maybe find a better way to accomplish what I'm trying to do? I'm totally open to this whole approach being a preposterous newbie mistake.
Any help appreciated. Thanks!
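The first option in the list, copy.deepcopy(), can be sketched as follows. The config here is a cut-down stand-in for the loaded JSON, and the special parameter is a hypothetical placeholder for the "special condition".

```python
import copy

# Cut-down stand-in for the dict loaded from config.json.
config = {
    "search_dsl_full": {
        "function_score": {
            "query": {"multi_match": {"operator": "and", "type": "cross_fields"}}
        }
    }
}

def create_query(special=False):
    """Return a fresh, independent copy of the default query on every call."""
    # deepcopy also copies the nested dicts and lists, so edits to the
    # returned object never reach config.
    query_dsl = copy.deepcopy(config["search_dsl_full"])
    if special:
        query_dsl["function_score"]["query"]["multi_match"]["operator"] = "or"
    return query_dsl

special_query = create_query(special=True)
default_query = create_query()
```

A shallow copy (dict.copy() or copy.copy()) would not be enough here, because the nested multi_match dict would still be shared with config.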
