Python match JSON value with regex

I have a JSON like this in a list agents_json:
[
    {
        'name': 'ip-10-13-28-114 (5)',
        'active': True
    },
    {
        'name': 'ip-10-13-28-127 (6)',
        'active': True
    },
    {
        'name': 'ip-10-13-28-127',
        'active': True
    }
]
I want to delete the objects whose name value matches an entry in my list agents_to_remove, which contains strings like the name value of the third object.
The problem is that my list doesn't contain the number in parentheses, and a lot of objects have names like that.
Can you tell me if it's possible to match the JSON value with a regex here:
for i in range(len(agents_json)):
    for j in agents_to_remove:
        regex = re.search(j*)
        if agents_json[i]["name"] == j* :
            agents_json.pop(i)
            break
Obviously j* isn't working, and after a few hours of searching I still don't have any idea how I could accomplish this.

What you have written looks like JSON, but if it is written in a Python file it is not actually a JSON object, as it might be in JavaScript; it is a list of dictionaries.
It sounds like you want to do some sort of regex or wildcard matching to see if an agent in the list appears in the list of agents to be deleted. I don't know exactly what your data looks like, but you might try:
remaining_agents = []
for agent in agents_json:
    # Skip any agent whose name starts with one of the names to remove
    if any(agent["name"].startswith(x) for x in agents_to_remove):
        continue
    remaining_agents.append(agent)
agents_json = remaining_agents

Here is an alternative to MindOfMetalAndWheels's solution, using a regular expression:
import re
agents_json = [
    {
        'name': 'ip-10-13-28-114 (5)',
        'active': True
    },
    {
        'name': 'ip-10-13-28-127 (6)',
        'active': True
    },
    {
        'name': 'ip-10-13-28-127',
        'active': True
    }
]
agents_to_remove = ['ip-10-13-28-127']
# Iterate through a copy of the list:
for agent in agents_json[:]:
    for regex in agents_to_remove:
        if re.search(regex, agent["name"]):
            agents_json.remove(agent)
            break
print("checking ")
for a in agents_json:
    print(a["name"])
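One caveat with passing the names straight to re.search is that any regex metacharacter in a name (a dot, for example) would be treated as a pattern, and a name would also match inside a longer, unrelated name. A minimal sketch of a stricter variant follows. It assumes the optional suffix is always a space followed by digits in parentheses, and that agents_to_remove is not empty; both are assumptions, not something the question confirms.

```python
import re

agents_json = [
    {'name': 'ip-10-13-28-114 (5)', 'active': True},
    {'name': 'ip-10-13-28-127 (6)', 'active': True},
    {'name': 'ip-10-13-28-127', 'active': True},
]
agents_to_remove = ['ip-10-13-28-127']  # assumed non-empty

# Escape each name so it is matched literally, then allow an optional
# " (N)" suffix. The suffix shape is an assumption based on the sample data.
alternatives = "|".join(re.escape(name) for name in agents_to_remove)
pattern = re.compile(rf"(?:{alternatives})(?: \(\d+\))?")

# Build a new list instead of removing items while iterating.
remaining_agents = [a for a in agents_json if not pattern.fullmatch(a['name'])]
print([a['name'] for a in remaining_agents])
```

Using fullmatch means the whole name has to fit the pattern, so 'ip-10-13-28-1270' would be kept even though it starts with a name from the list.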

Related

Invalid Json using json.dump in python3

My generated JSON output is reported as invalid when I check it with jslint; I get an EOF error.
Here I am using if len(data) != 0: to avoid inserting [] into the final output.json file (this works, but I don't know any other way to avoid writing [] to the file).
with open('output.json', 'a') as jsonFile:
    print(data)
    if len(data) != 0:
        json.dump(data, jsonFile, indent=2)
My input data arrives one batch at a time from another function that is called inside a for loop.
Sample "data" coming from that function:
print(data)
[{'product': 'food'}, {'price': '$100'}]
[{'product': 'clothing'}, {'price': '$40'}]
...
Can I append these data and make a JSON file under "Store"? What would be the proper practice? Please suggest.
Sample output from the output.json file:
[
  {
    "product": "food"
  },
  {
    "price": "$100"
  }
][
  {
    "product": "clothing"
  },
  {
    "price": "$40"
  }
]

Try the jsonlines package; install it with pip install jsonlines.
The JSON Lines format stores one JSON value per line, with no commas between lines, so you can write each batch as it arrives and read the batches back without any extra merging or formatting.
import jsonlines
with jsonlines.open('output.jsonl') as reader:
    for obj in reader:
        print(obj)  # do something with obj
Similarly, you can write with the write method of this module:
with jsonlines.open('output.jsonl', mode='w') as writer:
    writer.write(data)  # one batch per line
output.jsonl would look like this:
[{"product": "food"}, {"price": "$100"}]
[{"product": "clothing"}, {"price": "$40"}]
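The same one-value-per-line layout can also be produced with only the standard library. The sketch below is an illustration under assumptions: the batches list stands in for the data that arrives from the loop in the question, and the temporary file path is hypothetical.

```python
import json
import os
import tempfile

# Stand-in for the batches produced one by one in the question's loop.
batches = [
    [{'product': 'food'}, {'price': '$100'}],
    [{'product': 'clothing'}, {'price': '$40'}],
    [],  # empty batches are skipped, like the len(data) != 0 check
]

# Hypothetical output location for this example.
path = os.path.join(tempfile.mkdtemp(), 'output.jsonl')

for data in batches:
    if data:
        # Append mode plus a trailing newline keeps each batch on its own line.
        with open(path, 'a') as json_file:
            json_file.write(json.dumps(data) + '\n')

# Reading back: every non-empty line is a complete JSON document.
with open(path) as json_file:
    loaded = [json.loads(line) for line in json_file if line.strip()]
print(loaded)
```

Note that the file as a whole is still not a single JSON document, so a validator such as jslint would need to be given one line at a time.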

Yes, you can always group them together under a key named Store, which makes sense as they are all products in the store.
But I think the format below would be much better, as each product in the store then has its name together with its price:
{
  "Store": [
    {
      "product": "food",
      "price": "$100"
    },
    {
      "product": "clothing",
      "price": "$40"
    }
  ]
}
If you do it this way, you do not need to insert each key/value pair into the JSON separately; instead, put the product name and price into a single object and keep appending such objects to the Store list.
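A minimal sketch of that approach: collect everything in memory and serialize once at the end, so the result is a single valid JSON document. The batches list is a stand-in for the data from the question's loop, and merging each pair of one-key dictionaries into one product object is an assumption about what the asker wants.

```python
import json

# Stand-in for the batches produced one by one in the question's loop.
batches = [
    [{'product': 'food'}, {'price': '$100'}],
    [{'product': 'clothing'}, {'price': '$40'}],
]

store = {'Store': []}
for data in batches:
    if not data:
        continue
    # Merge the one-key dictionaries of a batch into a single product object.
    item = {}
    for part in data:
        item.update(part)
    store['Store'].append(item)

# Serialize once. To write a file instead, open it with mode 'w' (not 'a')
# and call json.dump(store, json_file, indent=2) a single time.
output = json.dumps(store, indent=2)
print(output)
```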

Parsing incomplete json array

I have downloaded 5MB of a very large json file. From this, I need to be able to load that 5MB to generate a preview of the json file. However, the file will probably be incomplete. Here's an example of what it may look like:
[{
    "first": "bob",
    "address": {
        "street": 13301,
        "zip": 1920
    }
}, {
    "first": "sarah",
    "address": {
        "street": 13301,
        "zip": 1920
    }
}, {"first" : "tom"
From here, I'd like to "rebuild it" so that it can parse the first two objects (and ignore the third).
Is there a json parser that can infer or cut off the end of the string to make it parsable? Or perhaps to 'stream' the parsing of the json array, so that when it fails on the last object, I can exit the loop? If not, how could the above be accomplished?

If your data will always look somewhat similar, you could do something like this:
import json
json_string = """[{
    "first": "bob",
    "address": {
        "street": 13301,
        "zip": 1920
    }
}, {
    "first": "sarah",
    "address": {
        "street": 13301,
        "zip": 1920
    }
}, {"first" : "tom"
"""
while True:
    if not json_string:
        raise ValueError("Couldn't fix JSON")
    try:
        data = json.loads(json_string + "]")
    except json.decoder.JSONDecodeError:
        json_string = json_string[:-1]
        continue
    break
print(data)
This assumes that the data is a list of dicts. Step by step, the last character is removed and a missing ] appended. If the new string can be interpreted as JSON, the infinite loop breaks. Otherwise the next character is removed and so on. If there are no characters left ValueError("Couldn't fix JSON") is raised.
For the above example, it prints (key order as on Python 3.7+, where dicts keep insertion order):
[{'first': 'bob', 'address': {'street': 13301, 'zip': 1920}}, {'first': 'sarah', 'address': {'street': 13301, 'zip': 1920}}]

For the specific structure in the example we can walk through the string and track occurrences of curly brackets and their closing counterparts. If at the end one or more curly brackets remain unmatched, we know that this indicates an incomplete object. We can then strip any intermediate characters such as commas or whitespace and close the resulting string with a square bracket.
This method ensures that the string is only parsed twice, one time manually and one time by the JSON parser, which might be advantageous for large text files (with incomplete objects consisting of many characters).
brackets = []
for i, c in enumerate(string):
    if c == '{':
        brackets.append(i)
    elif c == '}':
        brackets.pop()
if brackets:
    string = string[:brackets[0]].rstrip(', \n')
if not string.endswith(']'):
    string += ']'
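The question also asks about "streaming" the array and stopping at the first element that fails. One way to sketch that with the standard library is json.JSONDecoder.raw_decode, which parses a single value starting at a given index and returns the value together with the position where it ended. The function name parse_complete_items is hypothetical, and the sketch assumes the top-level value is an array of objects (a truncated bare number such as 34 cut from 345 would be accepted as-is).

```python
import json

def parse_complete_items(text):
    """Return the fully formed leading elements of a possibly truncated JSON array."""
    decoder = json.JSONDecoder()
    items = []
    pos = text.index("[") + 1  # start just after the opening bracket
    while True:
        # raw_decode does not skip leading whitespace, so skip separators here.
        while pos < len(text) and text[pos] in " \t\r\n,":
            pos += 1
        if pos >= len(text) or text[pos] == "]":
            break
        try:
            obj, pos = decoder.raw_decode(text, pos)
        except json.JSONDecodeError:
            break  # the remaining text is an incomplete element
        items.append(obj)
    return items

truncated = '[{"first": "bob"}, {"first": "sarah"}, {"first" : "tom"\n'
preview = parse_complete_items(truncated)
print(preview)
```

Each element is parsed exactly once, and braces inside string values are handled by the decoder itself rather than by manual counting.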

Iterating over JSON list in Python

I'm trying to iterate over a JSON list to print out all of the results of the following:
"examples": [
    {
        "text": "carry all of the blame"
    },
    {
        "text": "she left all her money to him"
    },
    {
        "text": "we all have different needs"
    },
    {
        "text": "he slept all day"
    },
    {
        "text": "all the people I met"
    },
    {
        "text": "10% of all cars sold"
    }
],
I've tried to iterate over it by doing:
iterator = 0
json_example = str(json_data['results'][0]['lexicalEntries'][0]['entries'][0]['senses'][0]['examples'][iterator]['text']).capitalize()
for i in json_example:
    print(i)
    iterator += 1
But this only prints each letter of the first example, as opposed to the entire example, followed by the other entire examples.
Can I iterate over these as I would like to, or do I need to create separate variables with each example?

Following your code and example, it looks like what you need is:
for example in json_data['results'][0]['lexicalEntries'][0]['entries'][0]['senses'][0]['examples']:
    print(example["text"])
In your code, json_data['results'][0]['lexicalEntries'][0]['entries'][0]['senses'][0]['examples'][iterator]['text'] only accesses the item at index iterator, which is always the first one (iterator = 0), and the for loop then iterates over the characters of that item's "text" string.

Only index the json data out to 'examples':
json_example = json_data['results'][0]['lexicalEntries'][0]['entries'][0]['senses'][0]['examples']
then treat each element of 'examples' like a dictionary:
for dictionary in json_example:
    for key in dictionary:
        print(dictionary[key])
This will print out each value associated with the key 'text', like you want.
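For a self-contained illustration, the sketch below builds a small stand-in for json_data (the real response is assumed to have the same nesting, with only two examples kept here) and collects the capitalized texts in a list.

```python
# Abbreviated stand-in for the parsed API response from the question.
json_data = {
    'results': [{
        'lexicalEntries': [{
            'entries': [{
                'senses': [{
                    'examples': [
                        {'text': 'carry all of the blame'},
                        {'text': 'he slept all day'},
                    ]
                }]
            }]
        }]
    }]
}

# Index only as far as the list, then loop over its dictionaries.
examples = json_data['results'][0]['lexicalEntries'][0]['entries'][0]['senses'][0]['examples']
texts = [example['text'].capitalize() for example in examples]
for text in texts:
    print(text)
```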

Accessing a dictionary within a list within a dictionary

user = code.chan["#example"]["Someuser"]
# user would look something like:
{
    'normal': True,
    'voiced': False,
    'op': False,
    'count': 24,
    'messages': [
        {
            'time': 1448847813,
            'message': "This is my mesage"
        },
        {
            'time': 1448847818,
            'message': "And this is another"
        }
    ]
}
I am trying to put the 'message' values from 'messages' into a list to check whether a string matches any of them.
Let me know if more information is needed.

I guess you want this:
print([i['message'] for i in user['messages']])
Or,
print(list(map(lambda x: x['message'], user['messages'])))
Output:
['This is my mesage', 'And this is another']
To print only the last item, you can use negative indexing:
print([i['message'] for i in user['messages']][-1])
Output:
And this is another

You can do that like so:
match = False
for message in user['messages']:
    if some_string == message['message']:
        match = True
        break

If you're looking to search through them, you'd want to do something like:
if any(search_string in i['message'] for i in user['messages']):
    print('found your query')
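Putting the pieces together, here is a small runnable sketch that distinguishes an exact match from a substring match. The user dictionary is trimmed to the 'messages' key from the question, and search_string is a hypothetical query.

```python
# Trimmed copy of the structure shown in the question.
user = {
    'messages': [
        {'time': 1448847813, 'message': "This is my mesage"},
        {'time': 1448847818, 'message': "And this is another"},
    ]
}

# Collect only the message texts.
messages = [entry['message'] for entry in user['messages']]

search_string = 'another'  # hypothetical query

# Exact match: the whole message must equal the string.
exact_match = 'And this is another' in messages
# Substring match: the query may appear anywhere inside a message.
substring_match = any(search_string in text for text in messages)

print(messages, exact_match, substring_match)
```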

OK, I am a bit confused by the question, but if I understood it correctly, here is my answer.
You have a list of dictionaries, and I suspect it comes in JSON format, so you need to access it like this:
import json
import requests
# fetch data
r = requests.get(wherever_you_fetch_them)
json_resp = json.loads(r.content.decode())
succ = [m['message'] for m in json_resp['messages']]
You can then loop over it, but I can't help further without more information about the input data.

How does json determine write/output order

Playing with json in Python's standard library and came up with this..
import json as j
cred = j.dumps({'Name': 'John Doe', 'Occupation': 'Programmer'},
               sort_keys = True,
               indent = 4,
               separators = (',', ': '))
_f = open('credentials', 'w')
_f.write(cred)
_f.close()
The output is below and all is fine..
{
    "Name": "John Doe",
    "Occupation": "Programmer"
}
However, I accidentally typed name in lowercase like this..
cred = j.dumps({'name': 'John Doe', 'Occupation': 'Programmer'},
               sort_keys = True,
               indent = 4,
               separators = (',', ': '))
and the result was this..
{
    "Occupation": "Programmer",
    "name": "John Doe"
}
How does json determine the write/output order of the values passed to it, what precedence does uppercase have over lowercase or vice versa and is there a way to preserve order?

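JSON objects do not have a defined order, and before Python 3.7 plain dictionaries did not guarantee one either. Since Python 3.7, dictionaries keep insertion order, and json.dumps writes keys in that order unless you pass sort_keys=True. If consumers of the JSON must rely on order, use an array instead of an object.
With sort_keys=True the keys are sorted by ordinary string comparison, that is, by Unicode code point. Every uppercase ASCII letter sorts before every lowercase one, which is why "Occupation" comes before "name". To preserve the order in which you wrote the keys, leave out sort_keys. Computers reading JSON shouldn't care about field order.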
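A short sketch of the difference, assuming Python 3.7 or later (where dictionaries keep insertion order); the variable names are illustrative only.

```python
import json

record = {'name': 'John Doe', 'Occupation': 'Programmer'}

# sort_keys=True sorts keys by code point, so 'O' (79) precedes 'n' (110).
sorted_out = json.dumps(record, sort_keys=True)

# Without sort_keys, the keys come out in the dictionary's insertion order.
insertion_out = json.dumps(record)

print(sorted_out)
print(insertion_out)
```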
