Python: Append a parsed string but throw out non-compliant values? - python

Warning: I'm a total newbie; apologies if I didn't search for the right thing before submitting this question. I found lots on how to ignore errors, but nothing quite like what I'm trying to do here.
I have a simple script that I'm using to grab data off a database, parse some fields apart, and re-write the parsed values back to the database. Multiple users are submitting to the database according to a delimited template, but there is some degree of non-compliance, meaning sometimes the string won't contain all/any delimiters. My script needs to be able to handle those instances by throwing them out entirely.
I'm having trouble throwing out non-compliant strings, rather than just ignoring the errors they raise. When I've tried try-except-pass, I've ended up getting errors when my script attempts to append parsed values into the array I'm ultimately writing back to the db.
Originally, my script said:
def parse_comments(comments):
    parts = comments.split("||")
    if len(parts) < 20:
        raise ValueError("Comment didn't have enough || delimiters")
    return Result._make([parts[i].strip() for i in xrange(2, 21, 3)])
Fully compliant uploads would append Result to an array and write back to db.
I've tried try/except:
def parse_comments(comments):
    parts = comments.split("||")
    try:
        Thing._make([parts[i].strip() for i in xrange(2, 21, 3)])
    except:
        pass
    return Thing
But I end up getting an error when I try to append the parsed values to an array -- specifically TypeError: 'type' object has no attribute '__getitem__'
I've also tried:
def parse_comments(comments):
    parts = comments.split("||")
    if len(parts) >= 20:
        Thing._make([parts[i].strip() for i in xrange(2, 21, 3)])
    else:
        pass
    return Thing
but to no avail.
tl;dr: I need to parse stuff and append parsed items. If a string can't be parsed how I want it, I want my code to ignore that string entirely and move on.

But I end up getting an error when I try and append the parsed values to an array -- specifically TypeError: 'type' object has no attribute '__getitem__'
Because Thing means the Thing class itself, not an instance of that class.
You need to think more clearly about what you want to return when the data is invalid. It may be the case that you can't return anything directly usable here, so that the calling code has to explicitly check.
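One way to make the calling code check explicitly is to return None for a non-compliant string and skip it before appending. The following is only a sketch: the field names of Result are hypothetical (the question does not show how it is defined), parse_all is a made-up helper, and range is used in place of the question's Python 2 xrange.

```python
from collections import namedtuple

# Hypothetical field names; the question does not show how Result is defined.
Result = namedtuple("Result", ["f1", "f2", "f3", "f4", "f5", "f6", "f7"])


def parse_comments(comments):
    """Return a Result instance, or None if the string does not fit the template."""
    parts = comments.split("||")
    # Index 20 is the highest index read below, so at least 21 parts are needed.
    if len(parts) < 21:
        return None
    return Result._make(parts[i].strip() for i in range(2, 21, 3))


def parse_all(rows):
    """Collect parsed results, silently dropping non-compliant strings."""
    results = []
    for comments in rows:
        parsed = parse_comments(comments)
        if parsed is not None:
            results.append(parsed)
    return results
```

Note that parse_comments returns the instance built by Result._make, not the Result class itself, which is what the code in the question handed back.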

I am not sure I understand everything you want to do. But I think you are not catching the error at the right place. You said yourself that it arose when you wanted to append the value to an array. So maybe you should do:
try:
    ...  # append the parsed values to an array
except TypeError:
    pass
You should give the exception type to catch after except; otherwise it will catch any exception, even a user's CTRL+C, which raises a KeyboardInterrupt.
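Applied to the original version of the function, which raises ValueError, catching one specific exception type in the calling loop could look roughly like this. The names parse_fields and collect are hypothetical, and a plain list of fields stands in for the namedtuple used in the question.

```python
def parse_fields(comments):
    """Return the template fields, raising ValueError for non-compliant input."""
    parts = comments.split("||")
    # Index 20 is read below, so 21 parts are required.
    if len(parts) < 21:
        raise ValueError("Comment didn't have enough || delimiters")
    return [parts[i].strip() for i in range(2, 21, 3)]


def collect(rows):
    """Parse every row, skipping the ones that raise ValueError."""
    parsed_rows = []
    for comments in rows:
        try:
            parsed_rows.append(parse_fields(comments))
        except ValueError:
            # Only the expected failure is swallowed; anything else still propagates.
            continue
    return parsed_rows
```

Because the append sits inside the try block, a string that fails to parse never reaches the list.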

Related

Python string indices must be integers

I'm reading a Dictionary from an API which has a field called 'price'.
I'm reading it fine for a while (so, the code works) until I get to a point I get the error message: string indices must be integers.
That breaks my code.
So, I would like to find a way to skip it (ignore it) when this happens, and continue with the code. And just print something out so I know something happened.
So, far I don't manage to see what number is causing this error.
If I test this by itself, it works fine.
fill = {'price': 0.00002781 }
price = fill['price'] # OUTPUT: string indices must be integers
print(price)
I've tried many things:
from decimal import Decimal
price = decimal(fill['price'])
also:
price = int(fill['price']) # but it's not really an int
and:
price = float(fill['price']) # but sometimes it's a very big float so I need decimal
It seems that what you get from the API is not exactly what you expect:
The variable fill is a string (at least at the time you get the error).
As strings can't have string indices (like dictionaries can) you get the TypeError exception.
To handle the exception and troubleshoot it, you can use try-except, like so:
try:
    price = fill['price']
except TypeError as e:
    print(f"fill: {fill}, exception: {str(e)}")
This way, when there is an issue, the fill value will be printed as well as the exception.
string indices must be integers tells you that at some point during runtime fill is a str instead of a dict. I suggest that you add type checking or an assertion to your program to make sure fill is of the expected type.
If you want to just ignore it you could use try and except blocks.
try:
    price = fill['price']
except Exception as e:
    print(f"Error reading the price. Error: {e}")
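The type check suggested above might be sketched as follows. The helper name read_price is hypothetical, and converting through str() before Decimal is an assumption about how the precision should be kept; the API's actual payloads are not shown in the question.

```python
from decimal import Decimal, InvalidOperation


def read_price(fill):
    """Return the price as a Decimal, or None if fill is not usable."""
    if not isinstance(fill, dict):
        # Typically the case that triggers "string indices must be integers".
        print(f"Skipping unexpected fill value: {fill!r}")
        return None
    try:
        # Going through str() avoids carrying over binary float noise.
        return Decimal(str(fill["price"]))
    except (KeyError, InvalidOperation) as exc:
        print(f"Could not read price from {fill!r}: {exc!r}")
        return None
```

A caller can then skip any fill for which read_price returns None and carry on with the next one.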

Key 'boot_num' is not recognized when being interpreted from a .JSON file

Currently, I am working on a boot sequence in Python for a larger project. For this specific part of the sequence, I need to access a .JSON file (specs.json) and load it as a dictionary in the main program. I then need to take a value from the .JSON file and add 1 to it, using its key to find the value. Once that's done, I need to write the changes back to the .JSON file. Yet every time I run the code below, I get the error:
bootNum = spcInfDat['boot_num']
KeyError: 'boot_num'
Here's the code I currently have:
(Note: I'm using the Python json library, and have imported dumps, dump, and load.)
# Opening of the JSON files
spcInf = open('mki/data/json/specs.json',) # .JSON file that contains the current system's specifications. Not quite needed, but it may make a nice reference?
spcInfDat = load(spcInf)
This code is later followed by this, where I attempt to assign the value to a variable by using its dictionary key (the for statement was a debug statement, so I could visibly see the key):
for i in spcInfDat['spec']:
    print(CBL + str(i) + CEN)
# Locating and increasing the value of bootNum.
bootNum = spcInfDat['boot_num']
print(str(bootNum))
bootNum = bootNum + 1
(Another Note: CBL and CEN are just variables I use to colour text I send to the terminal.)
This is the interior of specs.json:
{
"spec": [
{
"os":"name",
"os_type":"getwindowsversion",
"lang":"en",
"cpu_amt":"cpu_count",
"storage_amt":"unk",
"boot_num":1
}
]
}
I'm relatively new with .JSON files, as well as using the Python json library; I only have experience with them through some GeeksforGeeks tutorials I found. There is a rather good chance that I just don't know how .JSON files work in conjunction with the library, but I figure that it would still be worth a shot to check here. The GeeksForGeeks tutorial had no documentation about this, as well as there being minimal I know about how this works, so I'm lost. I've tried searching here, and have found nothing.
Issue Number 2
Now, the prior part works. But, when I attempt to run the code on the following lines:
# Changing the values of specDict.
print(CBL + "Changing values of specDict... 50%" + CEN)
specDict = {
    "os":name,
    "os_type":ost,
    "lang":"en",
    "cpu_amt":cr,
    "storage_amt":"unk",
    "boot_num":bootNum
}
# Writing the product of makeSpec to `specs.json`.
print(CBL + "Writing makeSpec() result to `specs.json`... 75%" + CEN)
jsonobj = dumps(specDict, indent = 4)
with open('mki/data/json/specs.json', "w") as outfile:
    dump(jsonobj, outfile)
I get the error:
TypeError: Object of type builtin_function_or_method is not JSON serializable.
Is there a chance that I set up my dictionary incorrectly, or am I using the dump function incorrectly?
You can show the data using:
print(spcInfDat)
This shows it to be a dictionary whose single entry 'spec' is a list, whose zeroth element is a sub-dictionary, whose 'boot_num' entry is an integer.
{'spec': [{'os': 'name', 'os_type': 'getwindowsversion', 'lang': 'en', 'cpu_amt': 'cpu_count', 'storage_amt': 'unk', 'boot_num': 1}]}
So what you are looking for is
boot_num = spcInfDat['spec'][0]['boot_num']
and note that the value obtained this way is already an integer, so it can be incremented directly. str() is only needed for printing.
It's also good practice to guard against file format errors so the program handles them gracefully.
try:
    boot_num = spcInfDat['spec'][0]['boot_num']
except (KeyError, IndexError):
    print('Database is corrupt')
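Putting the lookup and the guard together, the whole read, increment, and write-back step could be sketched like this. The function name increment_boot_num is hypothetical, and the file layout is assumed to match the specs.json shown in the question.

```python
import json


def increment_boot_num(path):
    """Add 1 to spec[0]['boot_num'] in the JSON file at path.

    Returns the new value, or None if the file does not have the expected layout.
    """
    with open(path, encoding="utf-8") as infile:
        data = json.load(infile)
    try:
        entry = data["spec"][0]
        entry["boot_num"] += 1
    except (KeyError, IndexError, TypeError):
        # Unexpected layout: leave the file untouched.
        return None
    with open(path, "w", encoding="utf-8") as outfile:
        json.dump(data, outfile, indent=4)
    return entry["boot_num"]
```

Opening the file with a with statement also closes it again, which the bare open() call in the question does not do.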
Issue Number 2
"Not serializable" means there is something somewhere in your data structure that is not an accepted type and can't be converted to a JSON string.
json.dump() only processes certain types such as strings, dictionaries, and integers. That includes all of the objects that are nested within sub-dictionaries, sub-arrays, etc. See documentation for json.JSONEncoder for a complete list of allowable types.
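The question does not show where name, ost, and cr come from, so the following is only a guess at the cause: a value stored as a function object because the call parentheses were left off. os.cpu_count is used purely as an illustration, and an in-memory buffer stands in for the output file.

```python
import io
import json
import os

# Assumption: a value was stored without calling the function that produces it.
not_called = {"cpu_amt": os.cpu_count}  # a function object, not a number
try:
    json.dumps(not_called)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Calling the function yields a plain value that json can serialize.
spec = {"spec": [{"lang": "en", "cpu_amt": os.cpu_count(), "boot_num": 2}]}

# Pass the dict itself to json.dump. Encoding it with dumps() first and then
# dumping the resulting string would write one quoted string, not an object.
buffer = io.StringIO()  # stands in for the open output file
json.dump(spec, buffer, indent=4)
```

Printing type(value) for each entry of specDict before dumping is a quick way to find which one is not a plain string, number, list, or dictionary.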

Python iteration 'Nonetype' using 'in' statement

Hi so I loaded a JSON file into a list using the following code:
import json
postal_mtl = ['H9W','H4W','H9P','H3B','H3A','H2Z','H3H','H3G','H3X','H9J','H1B','H1G','H1H','H4X','H2Y','H9R','H3Z','H3Y']
data = []
with open('business.json',encoding="utf8") as f:
    for line in f:
        data.append(json.loads(line))
Now I am trying to find the number of restaurants in montreal in this dataset (coming from Yelp). I tried the following code:
compteur3 = 0
for i in range(len(data)):
    if data[i]['postal_code'][0:3] in postal_mtl and 'Restaurants' in data[i]['categories']:
        compteur3 += 1
print(compteur3)
But I am getting an error saying "argument of type 'NoneType' is not iterable". I guess Python considers data[i]['categories'] to be a NoneType? Why is that? If I enter the following, I can see that it's clearly a string:
data[5]['categories']
'Shipping Centers, Couriers & Delivery Services, Local Services, Printing Services'
Now I just want to iterate over all the elements in my data list and find each line where we have the word 'Restaurants' (I got the Montreal stuff fixed)... Any idea ? Thanks !
Based on the code provided, it seems that the error is most likely coming from the if condition. Specifically, it is most likely coming from the statement 'Restaurants' in data[i]['categories']. Under the hood, Python is trying to iterate through data[i]['categories'] to see if 'Restaurants' is in it. If data[i]['categories'] is None, that would cause this error.
This may be caused by the JSON data not being formatted the way you expected. Perhaps, if no categories were listed in the 'categories' field, a null was put there instead of an empty value. To check for this in your code, you can try the following:
compteur3 = 0
for i in range(len(data)):
    is_inmontreal = data[i]['postal_code'][0:3] in postal_mtl
    is_restaurant = data[i]['categories'] and 'Restaurants' in data[i]['categories']
    if is_inmontreal and is_restaurant:
        compteur3 += 1
print(compteur3)
Above, I simply split the condition into two parts. Functionally, this would be the same as having the conditions in one line, it just makes it slightly clearer. However, I also added a check in is_restaurant to see if data[i]['categories'] has a positive truth value. In effect, this will check if the value is not None and it is not an empty list. If you really want to be explicit, you can also do
is_restaurant = data[i]['categories'] is not None and 'Restaurants' in data[i]['categories']
Depending on how dirty the data is, you may need to go a little further than this and use exception handling. However, the above is just speculation as I do not know what the data looks like.
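A slightly more defensive variant, iterating over the records directly and treating missing or null fields as empty strings, might look like this. The function name count_restaurants is hypothetical, and the sample records in the usage note are made up, since the real Yelp data is not shown.

```python
def count_restaurants(businesses, postal_prefixes):
    """Count records whose postal prefix matches and whose categories mention Restaurants."""
    count = 0
    for business in businesses:
        # "or ''" turns both a missing key and an explicit null into an empty string.
        postal_code = business.get("postal_code") or ""
        categories = business.get("categories") or ""
        if postal_code[:3] in postal_prefixes and "Restaurants" in categories:
            count += 1
    return count
```

For example, count_restaurants(data, set(postal_mtl)) would skip a record such as {'postal_code': 'H3B 1A1', 'categories': None} instead of raising.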

Iterate through list and if value doesn't exist, hide error and continue

I've got a List like:
results = ['SDV_GAMMA','SDV_BETA','...','...']
and then a for loop like:
for i in range(len(results)):
    a = instance.elementSets[results[i]]
The strings in the results list are names from a *.odb result file, and if one of them doesn't exist there, an error is raised.
I would like my program not to stop because of that error. It should carry on and check whether the other result values exist.
That way I do not have to sort out every result before I start my program. If a name is not there, there is no problem, and if it exists I get my data.
I hope you know what I mean.
You can use a try..except block.
Ex:
for name in results:
    try:
        a = instance.elementSets[name]
    except:
        pass
You can simply check the presence of results[i] in instance.elementSets before extracting it.
If instance.elementSets is a dictionary, use the dict.get method, which returns None (or a default you supply) instead of raising an error when the key is missing.
https://docs.python.org/3/library/stdtypes.html#dict.get
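Both suggestions can be sketched with a plain dictionary standing in for instance.elementSets. Whether the real object supports in and get the way a dict does is an assumption, and the values below are placeholders.

```python
# Stand-in for instance.elementSets; a plain dict is an assumption here.
element_sets = {"SDV_GAMMA": "gamma data", "SDV_ALPHA": "alpha data"}
results = ["SDV_GAMMA", "SDV_BETA"]

# Variant 1: check for presence before extracting.
found_by_check = {}
for name in results:
    if name in element_sets:
        found_by_check[name] = element_sets[name]

# Variant 2: dict.get returns None for a missing key instead of raising.
found_by_get = {}
for name in results:
    value = element_sets.get(name)
    if value is not None:
        found_by_get[name] = value
```

In both variants the missing name 'SDV_BETA' is skipped and the loop continues with the remaining entries.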

Python 2.7 replace all instances of NULL / NONE in complex JSON object

I have the following code..
.... rest api call >> response
rsp = response.json()
print json2html.convert(rsp)
which results in the following
error: Can't convert NULL!
I therefore started looking into schemes to replace all None / Null's in my JSON response, but I'm having an issue since the JSON returned from the api is complex and nested many levels and I don't know where the NULL will actually appear.
From what I can tell I need to iterate over the dictionary objects recursively and check for any values that are NONE and actually rebuild the object with the values replaced, but I don't really know where to start since dictionary objects are immutable..
If you look at json2html's source it seems like you have a different problem - and the error message is not helping.
Try to use it like this:
print json2html.convert(json=rsp)
btw. because I've already contributed to that project a bit I've opened up the following PR due to this question: https://github.com/softvar/json2html/pull/20
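If replacing the None values is still wanted, the recursive rebuild described in the question could be sketched as below. The function name replace_none and the empty-string default are hypothetical choices, not part of json2html.

```python
def replace_none(obj, replacement=""):
    """Return a copy of a nested JSON-like structure with None values replaced."""
    if obj is None:
        return replacement
    if isinstance(obj, dict):
        return {key: replace_none(value, replacement) for key, value in obj.items()}
    if isinstance(obj, list):
        return [replace_none(item, replacement) for item in obj]
    # Strings, numbers, and booleans are returned unchanged.
    return obj
```

The function builds new dictionaries and lists rather than modifying the response in place, so the original rsp stays as it was returned by the API.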
