find string match from json file in python [closed]

{
  "appconfig": {
    "username": "test",
    "password": "testpassword"
  },
  "bot": [
    {
      "contains": [],
      "exact": ["hi", "hy", "hey", "hlw"],
      "response": "hey there"
    },
    {
      "contains": [],
      "exact": ["gm", "good morning", "vgm"],
      "response": "good morning"
    }
  ],
  "blocked": []
}
I am storing this JSON snippet in a .json file and opening the file during execution:
with open('/data.json', 'r') as f:
    data = json.load(f)
I want to match a string against each `exact` array in the JSON, one by one, in Python. What is the best way to do this, for example using a lambda with filter?
For example:
user_msg = 'hi'
I have to check each `exact` array one by one; if the value exists, send the corresponding response.
Thanks in advance.
EDIT : 1

@Tom Robinson pointed out a solution that seems to work well, but I would suggest accounting for the complexity and size of the JSON as well. If it is huge, we need to look for solutions that load the JSON as a stream rather than as a whole file, e.g. https://github.com/henu/bigjson
Secondly, while `in` is the simplest operation and the easiest to write, it is known to have a taxing performance impact on lists. Depending on the size of the list, you may want to convert it into a set and look the value up there.
Lastly, the program seems to work for this dataset, but we need to account for missing keys, so I would change the above to use dict.get() in place of dict['key']. Extending the solution given by Tom, the final version may look like:
def getResponse(user_msg):
    with open('/data.json', 'r') as f:
        data = json.load(f)
    for data_set in data.get("bot", []):
        _extract = data_set.get("exact", None)
        if _extract and user_msg in _extract:
            return data_set.get("response", None)

getResponse(user_msg)
Using this we can avoid doing key checks.
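The set conversion suggested above can be sketched as follows. This is a hypothetical example, not code from the question: the sample data mirrors the JSON snippet, and the lookups are built once at startup so each incoming message is checked against a frozenset (average O(1) membership) instead of scanning every list.

```python
# Sketch: pre-build set lookups once, assuming the "bot" structure from the
# question. Data here is inlined instead of read from /data.json.
sample = {
    "bot": [
        {"exact": ["hi", "hy", "hey", "hlw"], "response": "hey there"},
        {"exact": ["gm", "good morning", "vgm"], "response": "good morning"},
    ]
}

# Build (frozenset, response) pairs once at startup.
lookup = [
    (frozenset(rule.get("exact", [])), rule.get("response"))
    for rule in sample.get("bot", [])
]

def get_response(user_msg):
    # Membership test against a set instead of a linear list scan.
    for exact_set, response in lookup:
        if user_msg in exact_set:
            return response
    return None
```

For a handful of short trigger lists the difference is negligible, but building the lookups once also avoids re-reading the JSON file on every message.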

By iterating over every possible data set, you can run checks each time like this:
def getResponse(user_msg):
    for data_set in data["bot"]:
        if user_msg in data_set["exact"]:
            return data_set["response"]

getResponse(user_msg)
This assumes that data is already defined and in the global scope, whereas the next function reads the file internally:
def getResponse(user_msg):
    with open('/data.json', 'r') as f:
        data = json.load(f)
    for data_set in data["bot"]:
        if user_msg in data_set["exact"]:
            return data_set["response"]

getResponse(user_msg)

Related

How to automate saving constants? [closed]

I am in the process of running experiments, and ideally, once all of my code is working, the only parameters that will need to be changed will all be present in one file. My initial idea was to store these parameters in a JSON file:
{
  "param1": 1,
  "param2": "string parameter"
}
Where, obviously, I have many more than 2 parameters. This turns out to be a nightmare, as my IDE will not guess any of the parameters, which massively slows down my programming as I generally feel obligated to create local variables for every constant that I need in the current function that I'm working in. This results in a lot of unnecessary code (but local to that function, it is significantly more convenient than trying to index the JSON object).
My next thought was then: store the constants in a file like:
PARAM1 = 1
PARAM2 = 'string parameter'
The problem with this is that I'd like to store the parameters with experimental results so that I can look back to see which parameters were specified to produce those results.
Beyond this, my thought is to use a dataclass (probably one with frozen=True), as those can be converted to a dictionary. However, I do not need access to an instance of the class, just the constants within it.
Another thought is to use a class with static variables:
class ExperimentalMetaData:
    param1 = 1
    param2 = "string parameter"
Which can be converted to a dict with vars(ExperimentalMetaData), except this will contain additional keys that should be popped off before I go about storing the data.
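The extra-keys problem can be seen in a minimal sketch (class and values are just the example from above); filtering out dunder entries avoids having to pop them one by one:

```python
import json

class ExperimentalMetaData:
    param1 = 1
    param2 = "string parameter"

# vars() on a class returns its __dict__, which includes entries such as
# __module__ and __doc__; drop everything dunder-prefixed before serialising.
params = {
    k: v
    for k, v in vars(ExperimentalMetaData).items()
    if not k.startswith("__")
}

dumped = json.dumps(params)
```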
My question is: what is the best way to store constants in python such that they can be saved to a JSON file easily, and also be easily accessed within my code?
If you want to be able to recall different versions of inputs, give them a version
This allows you to create JSON-like input files and keep a collection of parsers which can parse them if you make a breaking change
Here's a very simple example which is more sustainable (the stub classes need an __init__ so they can accept the parsed JSON):
class Parser_v1_2:
    def __init__(self, input_json):
        self.input_json = input_json

class Parser_v3_2:
    def __init__(self, input_json):
        self.input_json = input_json

VERSION_PARSER_MAPPING = {
    "1.2": Parser_v1_2,
    "3.2": Parser_v3_2,
}

def parser_map(input_file):
    with open(input_file) as fh:
        input_json = json.load(fh)
    # get version, or optionally provide a default
    version = input_json.get("version", "1.0")
    # dynamically select the parser
    return VERSION_PARSER_MAPPING[version](input_json)
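A quick end-to-end sketch of the same idea, with the parser class and the input file contents made up for illustration:

```python
import json
import os
import tempfile

# Hypothetical versioned parser, mirroring the mapping approach above.
class Parser_v1_0:
    def __init__(self, input_json):
        self.input_json = input_json

VERSION_PARSER_MAPPING = {"1.0": Parser_v1_0}

def parser_map(input_file):
    with open(input_file) as fh:
        input_json = json.load(fh)
    # fall back to "1.0" when the file predates versioning
    version = input_json.get("version", "1.0")
    return VERSION_PARSER_MAPPING[version](input_json)

# Write a sample input file, parse it, and clean up.
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as fh:
    json.dump({"version": "1.0", "param1": 1}, fh)
    path = fh.name

parser = parser_map(path)
os.unlink(path)
```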
Split up your problems.
Storing the data
Serialise it to JSON or YAML (or even csv).
Getting the data
Have a module which reads your json and then sets the right values. Something like:
# constants.py
from json import load

with open("dump.json") as fh:
    data = load(fh)

const1: str = data["const1"]
const2: str = data["const2"]
const3: int = data["const3"]

# some_other_module.py
from constants import const1, const2  # IDE knows what they are
I'd only do this manually with vars in a module for a small (<20) number of vars that I needed a lot and didn't want to wrap in some dictionary or the like. Otherwise I'd just use a dict in the module. Pre-populating the dict with keys and None, and type-hinting it, will do the same job of getting autocomplete working.
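The frozen dataclass the question mentions combines both halves nicely: IDE-friendly attribute access plus an easy JSON round-trip via dataclasses.asdict. A minimal sketch (field names are the question's examples):

```python
import json
from dataclasses import asdict, dataclass

# Frozen so the experiment parameters can't be mutated mid-run.
@dataclass(frozen=True)
class Params:
    param1: int = 1
    param2: str = "string parameter"

params = Params()

# asdict() yields a plain dict with only the declared fields,
# so there are no dunder keys to strip before serialising.
dumped = json.dumps(asdict(params))
```

Saving `dumped` alongside each experiment's results records exactly which parameters produced them.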

How to iterate over a JSON object? [closed]

Sorry, I am a Python beginner coming from a Java background. I am doing something wrong and not finding the answer on Google.
I have config_file.json:
{
  key1: val1,
  key2: val2,
  keyArrayOfImportantVal: ["str1", "str2", "str3"]
}
I am able to read in my json file and create a variable that maps to my json file
config_values = read_config_file('path to file')
#config_values has json as I expect
I need to iterate over the values in keyArrayOfImportantVal. I am just not finding what I need to do this.
I thought this should work, but it doesn't.
for val in config_values.keyArrayOfImportantVal:
    print(val)
nor does
importantVals = config_values.keyArrayOfImportantVal
for val in importantVals:
    print(val)
You can read how to properly read a JSON file here: Reading and Writing JSON to a File in Python. Or you can use this snippet if it helps:
import json

with open('path to file') as json_file:
    data = json.load(json_file)
This will iterate over all keys:
for x in data:
    print(x)
This will iterate over the values in the key "keyArrayOfImportantVal":
for x in data['keyArrayOfImportantVal']:
    print(x)

looping through json python is very slow

Can someone help me understand what I'm doing wrong in the following code:
def matchTrigTohost(gtriggerids, gettriggers):
    mylist = []
    for eachid in gettriggers:
        gtriggerids['params']['triggerids'] = str(eachid)
        hgetjsonObject = updateitem(gtriggerids, processor)
        hgetjsonObject = json.dumps(hgetjsonObject)
        hgetjsonObject = json.loads(hgetjsonObject)
        hgetjsonObject = eval(hgetjsonObject)
        hostid = hgetjsonObject["result"][0]["hostid"]
        hname = hgetjsonObject["result"][0]["name"]
        endval = hostid + "--" + hname
        mylist.append(endval)
    return hgetjsonObject
The variable gettriggers contains a lot of IDs (~3500):
[ "26821", "26822", "26810", ..... ]
I'm looping through the ids in the variable and assigning them to a json object.
gtriggerids = {
    "jsonrpc": "2.0",
    "method": "host.get",
    "params": {
        "output": ["hostid", "name"],
        "triggerids": "26821"
    },
    "auth": mytoken,
    "id": 2
}
When I run the code against the above JSON variable, it is very slow; it takes several minutes to check each ID. I'm sure I'm doing many things wrong here, or at least not in the Pythonic way. Can anyone help me speed this up? I'm very new to Python.
NOTE:
The dumps(), loads(), and eval() calls were used to convert the produced str to JSON.
You asked for help knowing what you're doing wrong. Happy to oblige :-)
At the lowest level—why your function is running slowly—you're running many unnecessary operations. Specifically, you're moving data between formats (python dictionaries and JSON strings) and back again which accomplishes nothing but wasting CPU cycles.
You mentioned this is only way you could get the data in the format you needed. That brings me to the second thing you're doing wrong.
You're throwing code at the wall instead of understanding what's happening.
I'm quite sure (and several of your commenters appear to agree) that your code is not the only way to arrange your data into a usable structure. What you should do instead is:
Understand as much as you can about the data you're being given. I suspect the output of updateitem() should be your first target of learning.
Understand the right/typical way to interact with that data. Your data doesn't have to be a dictionary before you can use it. Maybe it's not the best approach.
Understand what regularities and irregularities the data may have. Part of your problem may not be with types or dictionaries, but with an unpredictable/dirty data source.
Armed with all this new knowledge, manipulate your data as simply as you can.
I can pretty much guarantee the result will run faster.
More detail! Some things you wrote suggest misconceptions:
I'm looping through the ids in the variable and assigning them to a json object.
No, you can't assign to a JSON object. In python, JSON data is always a string. You probably mean that you're assigning to a python dictionary, which (sometimes!) can be converted to a JSON object, represented as a string. Make sure you have all those concepts clear before you move forward.
The dump() , load(), eval() were used to convert the str produced to json.
Again, you don't call dumps() on a string. You use that to convert a python object to a string. Run this code in a REPL, go step by step, and inspect or play with each output to understand what it is.
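To make the point concrete, here is a sketch of the loop with the round-trips removed. This assumes updateitem() already returns a plain Python dict (the stub below stands in for the real API call, and its return value is made up):

```python
# Stand-in for the real updateitem(gtriggerids, processor) API call;
# assumed to return an already-parsed Python dict.
def updateitem(request):
    return {"result": [{"hostid": "100", "name": "host-a"}]}

def match_trig_to_host(gtriggerids, gettriggers):
    mylist = []
    for eachid in gettriggers:
        gtriggerids["params"]["triggerids"] = str(eachid)
        response = updateitem(gtriggerids)  # already a dict: no dumps/loads/eval
        hostid = response["result"][0]["hostid"]
        hname = response["result"][0]["name"]
        mylist.append(hostid + "--" + hname)
    return mylist  # return the accumulated list, not the last response

request = {"params": {"triggerids": None}}
hosts = match_trig_to_host(request, ["26821", "26822"])
```

Note the sketch also returns `mylist` rather than the last `hgetjsonObject`, which the original function built but never returned.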

Python Dict Transform

I've been having some strange difficulty trying to transform a dataset that I have.
I currently have a dictionary coming from a form as follows:
data['content']['answers']
I would like to have the ['answers'] appended to the first element of a list like so:
data['content'][0]['answers']
However when I try to create it as so, I get an empty dataset.
data['content'] = [data['content']['answers']]
I can't for the life of me figure out what I am doing wrong.
EDIT: Here is the opening JSON
I have:
{
  "content" : {
    "answers" : {
      "3" : {
But I need it to be:
{
  "content" : [
    {
      "answers" : {
        "3" : {
thanks
You can do what you want by using a dictionary comprehension (which is one of the most elegant and powerful features in Python.)
In your case, the following should work:
d = {k: [v] for k, v in d.items()}
You mentioned JSON in your question. Rather than rolling your own parser (which it seems like you might be trying to do), consider using the json module.
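A quick runnable check of that comprehension on a structure shaped like the one in the question (the leaf value "stuff" is made up):

```python
# Starting shape: data['content'] is a dict holding 'answers'.
data = {"content": {"answers": {"3": "stuff"}}}

# Wrap each top-level value in a list, turning {'content': {...}}
# into {'content': [{...}]} as the question asks for.
data = {k: [v] for k, v in data.items()}
```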
If I've understood the question correctly, it sounds like you need data['content'] to be equal to a list where each element is a dictionary that was previously contained in data['content']?
I believe this might work (works in Python 2.7 and 3.6):
# assuming that data['content'] is equal to {'answers': {'3': 'stuff'}}
data['content'] = [{key: contents} for key, contents in data['content'].items()]
>>> [{'answers': {'3': 'stuff'}}]
The list comprehension will preserve the dictionary content for each dictionary that was in contents originally and will return the dictionaries as a list.
Python 2 doc: https://docs.python.org/2/tutorial/datastructures.html#list-comprehensions
Python 3 doc:
https://docs.python.org/3/tutorial/datastructures.html#list-comprehensions
It would be best if you gave us a concrete example of 'data' (what the dictionary looks like), what code you tried to run, what result you got, and what you expected. I think I have an idea but can't be sure.
Your question isn't clear and lacks an explicit example.
By the way, would something like this work for you?
data_list = list()
for content in data.keys():
    data_list.append(data[content])

Check if a value in a JSON object from 2 different JSON files are similar and create a list of matching results with python

I'm trying to minimize the data I would send to an application from an API, but some values are null and can be found in another API, so I thought about making a Python script run on a server to add those null results to the original JSON file.
I'm appending the list that has that information stored to the list that matches it in the original JSON file; this can be done using a unique ID that corresponds to a video game title in both files. Here's my code:
import json

games = open('outputgames.json')
releases = open('outputreleases.json')
games_json = json.load(games)
releases_json = json.load(releases)

# This is where all the results are found in the JSON file.
# The results are all stored in a list, so to access the first result
# we would access it like this: games_json['results'][0] or games_data[0]
games_data = games_json['results']
releases_data = releases_json['results']

# Here I iterate through the data to see IF the id in the object 'game'
# (found in releases_data) matches the one in games_data, storing both
# matching results in a dictionary inside a list,
# then I just dump the results to a json file.
grouped_data = [dict(data_releases=x, data_games=i) for x in releases_data for i in games_data if i['id'] == x['game']['id']]

with open('final_results.json', mode='w') as f:
    json.dump(grouped_data, f)
The initial list in games_json['results'] holds about 480 results while the one in releases_json['results'] holds 470. But for some reason, my code seems to be skipping some results: I'm supposed to be receiving about 480 results but I'm only getting about 260. I'm guessing the iteration I am doing with the "IF" statement is skipping some IDs it already passed, but I'm not sure. Can someone help me make the IF statement not resume from where it left off but start from the top, and actually check that ALL IDs match?
If someone can please help me with this issue, or tell me if I am doing something wrong, any help is appreciated. Thanks in advance.
Here's a sample of what grouped_data returns; this is only 1 entry. It returns about 260 entries when run with the JSON files, but as I said, I am supposed to get hundreds more:
[{"data_games": {"deck": "Tri Force Heroes is a co-op game set in The Legend of Zelda franchise. Three Links must work together to rid the land of Hyrule of evil once more.", "image": {"tiny_url": "http://static.giantbomb.com/uploads/square_mini/8/82063/2778000-tloztfh.jpg", "medium_url": "http://static.giantbomb.com/uploads/scale_medium/8/82063/2778000-tloztfh.jpg", "thumb_url": "http://static.giantbomb.com/uploads/scale_avatar/8/82063/2778000-tloztfh.jpg", "small_url": "http://static.giantbomb.com/uploads/scale_small/8/82063/2778000-tloztfh.jpg", "screen_url": "http://static.giantbomb.com/uploads/screen_medium/8/82063/2778000-tloztfh.jpg", "icon_url": "http://static.giantbomb.com/uploads/square_avatar/8/82063/2778000-tloztfh.jpg", "super_url": "http://static.giantbomb.com/uploads/scale_large/8/82063/2778000-tloztfh.jpg"}, "id": 49994}, "data_releases": {"deck": null, "image": null, "platform": {"api_detail_url": "http://www.giantbomb.com/api/platform/3045-138/", "id": 138, "name": "Nintendo 3DS eShop"}, "expected_release_day": 23, "expected_release_month": 10, "game": {"api_detail_url": "http://www.giantbomb.com/api/game/3030-49994/", "id": 49994, "name": "The Legend of Zelda: Tri Force Heroes"}, "expected_release_year": 2015, "id": 142927, "region": {"api_detail_url": "http://www.giantbomb.com/api/region/3075-1/", "id": 1, "name": "United States"}, "expected_release_quarter": null, "name": "The Legend of Zelda: Tri Force Heroes"}}]
Here's an example of 'releases_data' and 'games_data' that wasn't returned in the result but does in fact match IDs :
releases_data:
{"deck":null,"game":{"api_detail_url":"http:\/\/www.giantbomb.com\/api\/game\/3030-50627\/","id":50627,"name":"Orion Trail"},"id":144188,"image":null,"name":"Orion Trail","platform":{"api_detail_url":"http:\/\/www.giantbomb.com\/api\/platform\/3045-94\/","id":94,"name":"PC"}}
games_data:
{"deck":"Orion Trail is a single player choose-your-own-space-adventure.","id":50627,"image":{"icon_url":"http:\/\/static.giantbomb.com\/uploads\/square_avatar\/29\/291401\/2775039-6490638002-heade.jpg","medium_url":"http:\/\/static.giantbomb.com\/uploads\/scale_medium\/29\/291401\/2775039-6490638002-heade.jpg","screen_url":"http:\/\/static.giantbomb.com\/uploads\/screen_medium\/29\/291401\/2775039-6490638002-heade.jpg","small_url":"http:\/\/static.giantbomb.com\/uploads\/scale_small\/29\/291401\/2775039-6490638002-heade.jpg","super_url":"http:\/\/static.giantbomb.com\/uploads\/scale_large\/29\/291401\/2775039-6490638002-heade.jpg","thumb_url":"http:\/\/static.giantbomb.com\/uploads\/scale_avatar\/29\/291401\/2775039-6490638002-heade.jpg","tiny_url":"http:\/\/static.giantbomb.com\/uploads\/square_mini\/29\/291401\/2775039-6490638002-heade.jpg"}}
EDIT: THIS IS INCORRECT. I'm leaving it just as context for the comments.
The problem with the example you posted for releases_data is in the very first field: "deck":null If I try to create a JSON object from this string, I get
builtins.NameError: name 'null' is not defined
There must be some try-catch block somewhere that is ignoring this exception. You could just define
null = None
before processing the files, if this is the only problem. Perhaps you ought to test how many JSON objects you can create from each of the two files, to locate any additional problems, before you go back to merging them.
Just as a debugging tip, this took me perhaps five minutes to analyze, once I got the data to work with from you. All I did was call json.loads on both strings and read the error message. It always (well, almost always) pays to start at the bottom and work up. :-)
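Separately from the parsing question above, one way to make the pairing both complete and fast (a sketch, not taken from the answers; the sample records below are trimmed versions of the ones in the question) is to index the games by id first. Each release is then looked up in a dict, so the pairing runs in linear time and no matching id can be silently skipped:

```python
# Trimmed sample records shaped like the question's data.
games_data = [
    {"id": 49994, "deck": "Tri Force Heroes ..."},
    {"id": 50627, "deck": "Orion Trail ..."},
]
releases_data = [
    {"id": 142927, "game": {"id": 49994}},
    {"id": 144188, "game": {"id": 50627}},
]

# Index games by id once: O(1) lookups instead of a nested loop.
games_by_id = {game["id"]: game for game in games_data}

grouped_data = [
    {"data_releases": release, "data_games": games_by_id[release["game"]["id"]]}
    for release in releases_data
    if release.get("game") and release["game"]["id"] in games_by_id
]
```

The `release.get("game")` guard also tolerates release entries that lack a "game" key, which would raise a KeyError in the original nested comprehension.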
