Ignoring KeyError while writing to file with dask.to_csv - python

I have a dataframe, which is not loaded into memory (and it should stay like that).
At some point in the script I apply a conversion with a dictionary to one of the dataframe columns in the following manner:
df['identifier'] = df.identifier.map(lambda identifier: alias_dict[str(identifier)],
                                     meta=('identifier', str))
KeyError exceptions are not raised at this stage but only when I call to_csv, so I try to handle them there:
try:
    dd.to_csv(intersection_df, output, header=None, index=None, single_file=True, sep='\t')
except KeyError as err:
    print(f'Unmatched key {err.args[0]}')
In case I encounter a KeyError, the writing to the file is stopped - is there a way to make the writing continue even if I get an exception at that stage?

The best thing to do, if you want to skip or remediate the failing lines but keep writing, is to put your try/except into the mapping function:
def alias(identifier):
    try:
        return alias_dict[str(identifier)]
    except KeyError:
        return identifier

df['identifier'] = df.identifier.map(alias, meta=('identifier', str))
In this case, failures are passed through unchanged. You could turn them into None and filter them out in a second step, or the two steps could be combined with map_partitions.
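The "turn them into None and filter" variant could look roughly like the sketch below. The contents of alias_dict are a made-up stand-in, alias_or_none is a hypothetical helper name, and the dask lines are left as comments because they have not been run against the real dataframe.

```python
# Hypothetical stand-in for the alias_dict from the question.
alias_dict = {"1": "alpha", "2": "beta"}


def alias_or_none(identifier):
    """Return the alias for identifier, or None when there is no match."""
    # dict.get returns None for a missing key instead of raising KeyError.
    return alias_dict.get(str(identifier))


# Untested sketch of the two steps on the dask dataframe:
# df['identifier'] = df.identifier.map(alias_or_none, meta=('identifier', object))
# df = df.dropna(subset=['identifier'])  # drop the rows that had no alias
```

Rows without an alias are then dropped lazily, so nothing has to be loaded into memory before to_csv runs.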

Related

Python string indices must be integers

I'm reading a Dictionary from an API which has a field called 'price'.
I'm reading it fine for a while (so the code works) until at some point I get the error message: string indices must be integers.
That breaks my code.
So I would like to find a way to skip it (ignore it) when this happens and continue with the code, and just print something out so I know something happened.
So far I haven't managed to see which value is causing this error.
If I test this by itself, it works fine.
fill = {'price': 0.00002781 }
price = fill['price'] # OUTPUT: string indices must be integers
print(price)
I've tried many things:
from decimal import Decimal
price = Decimal(fill['price'])
also:
price = int(fill['price']) # but it's not really an int
and:
price = float(fill['price']) # but sometimes it's a very big float so I need decimal
It seems that what you get from the API is not exactly what you expect:
The variable fill is a string (at least at the time you get the error).
As strings can't have string indices (like dictionaries can) you get the TypeError exception.
To handle the exception and troubleshoot it, you can use try-except, like so:
try:
    price = fill['price']
except TypeError as e:
    print(f"fill: {fill}, exception: {str(e)}")
This way, when there is an issue, the fill value will be printed as well as the exception.
"string indices must be integers" tells you that at some point during runtime fill is a str instead of a dict. I suggest that you add type checking or an assertion to your program to make sure fill is of the expected type.
If you want to just ignore it you could use try and except blocks.
try:
    price = fill['price']
except Exception as e:
    print(f"Error reading the price. Error: {e}")
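The type check suggested above might look something like this. It is only a sketch: read_price is a hypothetical helper name, and it assumes that a non-dict payload should be logged and skipped rather than stop the program.

```python
from decimal import Decimal


def read_price(fill):
    """Return the price as a Decimal, or None if fill is not usable."""
    if not isinstance(fill, dict):
        # This is the case that produced "string indices must be integers".
        print(f"Skipping unexpected payload of type {type(fill).__name__}: {fill!r}")
        return None
    if "price" not in fill:
        print(f"Payload has no 'price' field: {fill!r}")
        return None
    # Going through str() keeps binary float noise out of the Decimal.
    return Decimal(str(fill["price"]))
```

Printing the offending payload also shows what the API actually sent when it did not send a dictionary.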

Getting exception "openpyxl.utils.exceptions.IllegalCharacterError"

I am extracting data from an Oracle 11g Database using python and writing it to an Excel file. During extraction, I'm using a python list of tuples (each tuple indicates each row in dataset) and the openpyxl module to write the data into Excel. It's working fine for some datasets but for some, it's throwing the exception:
openpyxl.utils.exceptions.IllegalCharacterError
This is the solution I've already tried:
Openpyxl.utils.exceptions.IllegalcharacterError
Here is my Code:
for i in range(0, len(list)):
    for j in range(0, len(header)):
        worksheet_ntn.cell(row=i+2, column=j+1).value = list[i][j]
Here is the error message:
raise IllegalCharacterError
openpyxl.utils.exceptions.IllegalCharacterError
I got this error because of some hex control characters in some of my strings, for example:
'Suport\x1f_01'
The encode/decode solutions mess with accented words too.
So I resolved this with repr():
value = repr(value)
That gives a safe representation, with quotation marks.
Then I remove the first and last characters:
value = repr(value)[1:-1]
Now you can safely insert the value into your cell.
The exception tells you everything you need to know: you must replace the characters that cause the exception. This can be done using re.sub() but, seeing as only you can decide what you want to replace them with — spaces, empty strings, etc. — only you can do this.
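A re.sub() version could be sketched as follows. The character class is an assumption about which characters get rejected (ASCII control characters other than tab, newline and carriage return), and clean_cell is a hypothetical helper name; adjust both to your data.

```python
import re

# Assumption: the offending characters are the ASCII control characters
# below 0x20, except tab (\x09), newline (\x0a) and carriage return (\x0d).
ILLEGAL_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")


def clean_cell(value, replacement=""):
    """Replace control characters in strings; pass other types through."""
    if isinstance(value, str):
        return ILLEGAL_CHARS.sub(replacement, value)
    return value


print(clean_cell('Suport\x1f_01'))
```

Unlike the repr() trick, this leaves accented letters, quotes and backslashes in the value untouched.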

Python try-except-except

I'm going to include the description of the task this code is supposed to do, in case someone needs it to answer me.
#Write a function called "load_file" that accepts one
#parameter: a filename. The function should open the
#file and return the contents.
#
# - If the contents of the file can be interpreted as
# an integer, return the contents as an integer.
# - Otherwise, if the contents of the file can be
# interpreted as a float, return the contents as a
# float.
# - Otherwise, return the contents of the file as a
# string.
#
#You may assume that the file has only one line.
#
#Hints:
#
# - Don't forget to close the file when you're done!
# - Remember, anything you read from a file is
# initially interpreted as a string.
#Write your function here!
def load_file(filename):
    file=open(filename, "r")
    try:
        return int(file.readline())
    except ValueError:
        return float(file.readline())
    except:
        return str(file.readline())
    finally:
        file.close()
#Below are some lines of code that will test your function.
#You can change the value of the variable(s) to test your
#function with different inputs.
#
#If your function works correctly, this will originally
#print 123, followed by <class 'int'>.
contents = load_file("LoadFromFileInput.txt")
print(contents)
print(type(contents))
When the code is tested with a file which contains "123", then everything works fine. When the website loads in another file to test this code, following error occurs:
[Executed at: Sat Feb 2 7:02:54 PST 2019]
We found a few things wrong with your code. The first one is shown below, and the rest can be found in full_results.txt in the dropdown in the top left:
We tested your code with filename = "AutomatedTest-uwixoW.txt". We expected load_file to return the float -97.88285. However, it instead encountered the following error:
ValueError: could not convert string to float:
So I'm guessing the error occurs inside the first except statement, but I don't understand why. If an error occurs when the value inside a file is being converted to float, shouldn't the code just go to the second except statement? And in the second except it would be converted to a string, which will work anyway? I'm guessing I misunderstand something about how try-except (specified error)-except (no specified error) works.
Sorry for the long post.
shouldn't the code just go to the second except statement?
Nope: this "flat" try/except statement works only for the first try block. If an exception occurs there, the except branches catch this exception and straight away evaluate the appropriate block. If an exception occurs in this block, it's not caught by anything, because there's no try block there.
So, you'd have to do a whole lot of nested try/except statements:
try:
    do_this()
except ValueError:
    try:
        do_that()
    except ValueError:
        do_this()
    except:
        do_that_one()
except:
    # a whole bunch of try/except here as well
    pass
You may need to add an extra level of nesting.
This is terribly inefficient in terms of the amount of code you'll need to write. A better option might be:
data = file.readline()
for converter in (int, float, str):
    try:
        return converter(data)
    except ValueError:
        pass
Note that if you do converter(file.readline()), a new line will be read on each iteration (or, in your case, in any new try/except block), which may not be what you need.
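Put together as a full function, the converter loop might look like this. It is a sketch under the exercise's assumption that the file has a single line; it reads that line once and uses a with block so the file is closed in every case.

```python
def load_file(filename):
    # Read the line exactly once; every converter sees the same text.
    with open(filename, "r") as file:
        data = file.readline()
    for converter in (int, float):
        try:
            return converter(data)
        except ValueError:
            # This converter does not fit; try the next one.
            continue
    # Neither int nor float worked, so return the text itself.
    return data
```

Since the data is already a string, the final fallback needs no conversion and cannot fail.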
No, only one of those except blocks -- the first one matching the exception -- will be executed. The behavior you are describing would correspond to
except ValueError:
    try:
        return float(file.readline())
    except:
        return str(file.readline())
def load_file(filename):
    file = open(filename, "r")
    try:
        # Read once and reuse the same text for every conversion attempt.
        val = file.readline()
        return int(val)
    except ValueError:
        try:
            return float(val)
        except ValueError:
            return str(val)
    finally:
        file.close()

Iterate through list and if value doesn't exist hide error and continue

I've got a List like:
results = ['SDV_GAMMA', 'SDV_BETA', '...', '...']
and then comes a for loop like:
for i in range(len(results)):
    a = instance.elementSets[results[i]]
The strings defined in the results list are part of a *.odb result file, and if they don't exist there, an error is raised.
I would like my program not to stop because of an error. It should go on and check whether the other result values exist.
That way I don't have to sort every result before I start my program. If it's not in the list, there is no problem, and if it exists I get my data.
I hope you know what I mean.
You can use a try..except block.
Ex:
for name in results:
    try:
        a = instance.elementSets[name]
    except KeyError:  # assuming a missing set name raises KeyError
        pass
You can simply check the presence of results[i] in instance.elementSets before extracting it.
If instance.elementSets is a dictionary, use the dict.get method.
https://docs.python.org/3/library/stdtypes.html#dict.get
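The membership check could be sketched like this. A plain dict stands in for instance.elementSets here, so it is an assumption that the real object supports the in operator; collect_existing is a hypothetical helper name.

```python
def collect_existing(repository, names):
    """Return {name: repository[name]} for the names that are present."""
    found = {}
    for name in names:
        if name in repository:
            found[name] = repository[name]
        else:
            print(f"{name} not found, skipping")
    return found


# Plain dict standing in for instance.elementSets (hypothetical data).
element_sets = {"SDV_GAMMA": [1, 2, 3]}
print(collect_existing(element_sets, ["SDV_GAMMA", "SDV_BETA"]))
```

With a real dictionary, repository.get(name) would do the lookup and the check in one call, returning None for missing names.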

Python: Append a parsed string but throw out non-compliant values?

Warning: I'm a total newbie; apologies if I didn't search for the right thing before submitting this question. I found lots on how to ignore errors, but nothing quite like what I'm trying to do here.
I have a simple script that I'm using to grab data off a database, parse some fields apart, and re-write the parsed values back to the database. Multiple users are submitting to the database according to a delimited template, but there is some degree of non-compliance, meaning sometimes the string won't contain all/any delimiters. My script needs to be able to handle those instances by throwing them out entirely.
I'm having trouble throwing out non-compliant strings, rather than just ignoring the errors they raise. When I've tried try-except-pass, I've ended up getting errors when my script attempts to append parsed values into the array I'm ultimately writing back to the db.
Originally, my script said:
def parse_comments(comments):
    parts = comments.split("||")
    if len(parts) < 20:
        raise ValueError("Comment didn't have enough || delimiters")
    return Result._make([parts[i].strip() for i in xrange(2, 21, 3)])
Fully compliant uploads would append Result to an array and write back to db.
I've tried try/except:
def parse_comments(comments):
    parts = comments.split("||")
    try:
        Thing._make([parts[i].strip() for i in xrange(2, 21, 3)])
    except:
        pass
    return Thing
But I end up getting an error when I try and append the parsed values to an array -- specifically TypeError: 'type' object has no attribute '__getitem__'
I've also tried:
def parse_comments(comments):
    parts = comments.split("||")
    if len(parts) >= 20:
        Thing._make([parts[i].strip() for i in xrange(2, 21, 3)])
    else:
        pass
    return Thing
but to no avail.
tl;dr: I need to parse stuff and append parsed items. If a string can't be parsed how I want it, I want my code to ignore that string entirely and move on.
But I end up getting an error when I try and append the parsed values to an array -- specifically TypeError: 'type' object has no attribute '__getitem__'
Because Thing means the Thing class itself, not an instance of that class.
You need to think more clearly about what you want to return when the data is invalid. It may be the case that you can't return anything directly usable here, so that the calling code has to explicitly check.
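One way to do that is to return None for invalid input and let the caller skip it. The sketch below assumes Python 3 (range instead of xrange) and uses made-up field names, since the post does not show how Result is defined. It also requires 21 parts rather than 20, because index 20 is read.

```python
from collections import namedtuple

# Hypothetical field names; the original definition of Result is not shown.
Result = namedtuple("Result", "f1 f2 f3 f4 f5 f6 f7")


def parse_comments(comments):
    """Return a Result instance, or None if the string is non-compliant."""
    parts = comments.split("||")
    # Index 20 is the highest one read below, so at least 21 parts are needed.
    if len(parts) < 21:
        return None
    return Result._make(parts[i].strip() for i in range(2, 21, 3))


def parse_all(rows):
    """Keep only the rows that parsed; non-compliant rows are dropped."""
    parsed = []
    for comments in rows:
        result = parse_comments(comments)
        if result is not None:
            parsed.append(result)
    return parsed
```

The function now returns an instance (or None) instead of the class itself, so the calling code only ever appends real parsed values.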
I am not sure I understand everything you want to do. But I think you are not catching the error at the right place. You said yourself that it arose when you wanted to append the value to an array. So maybe you should do:
try:
    # append the parsed values to an array
except TypeError:
    pass
You should give the exception type to catch after except, otherwise it will catch any exception, even a user's CTRL+C, which raises a KeyboardInterrupt.
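A small illustration of that difference, with hypothetical helper names: a bare except swallows KeyboardInterrupt, while except Exception lets it propagate, because KeyboardInterrupt derives from BaseException rather than Exception.

```python
def run_bare(action):
    try:
        action()
    except:  # catches everything, including KeyboardInterrupt
        return "swallowed"
    return "ok"


def run_narrow(action):
    try:
        action()
    except Exception:  # KeyboardInterrupt is not an Exception subclass
        return "swallowed"
    return "ok"


def interrupt():
    # Simulates the user pressing CTRL+C inside the guarded code.
    raise KeyboardInterrupt
```

Here run_bare(interrupt) hides the interrupt, whereas run_narrow(interrupt) lets it reach the caller.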
