How to open/read a DBF file in Python?

I get an error when I run this code:
from dbfread import DBF
for record in DBF("filename.dbf"):
    print(record)
The error I get is:
UnicodeDecodeError: 'ascii' codec can't decode byte 0xfc in position 1: ordinal not in range(128)

The error message you're encountering suggests that the 'ascii' codec is unable to decode some characters in the file you're trying to read. This is most likely because the file uses a different encoding that contains characters not representable in the ASCII character set.
To resolve this issue, you can try specifying the encoding used in the file when you create the DBF object. For example:
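from dbfread import DBF
for record in DBF("filename.dbf", encoding='utf8'):
    print(record)
Note that 'utf8' is only an example and the actual encoding of your file may differ. Byte 0xfc is never valid in UTF-8, while in Latin-1 and cp1252 it is 'ü', so one of those (or a DOS code page such as cp850) is a more likely candidate. Check how the file was produced and pass the matching encoding.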
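One way to narrow down the encoding is to decode the offending bytes with a few candidates and see which result looks right. The sketch below uses only the standard library; the sample bytes, the candidate list, and the helper name decode_candidates are assumptions for illustration, not taken from the question's file.

```python
def decode_candidates(raw, encodings):
    """Return {encoding: decoded text} for every encoding that can decode raw."""
    results = {}
    for name in encodings:
        try:
            results[name] = raw.decode(name)
        except UnicodeDecodeError:
            pass  # this encoding cannot decode the bytes at all
    return results


# Hypothetical field value with byte 0xfc at position 1, as in the error message.
raw = b"M\xfcller"
candidates = decode_candidates(raw, ["ascii", "utf-8", "cp1252", "latin-1", "cp850"])
for name, text in candidates.items():
    print(name, repr(text))

# The candidate that yields readable text would then be passed to dbfread, e.g.
# DBF("filename.dbf", encoding="cp1252")
```

An encoding that decodes without error is not necessarily the right one, so the output still needs a visual check against what the data should say.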

Related

Trying to import a csv file, containing non-ascii characters, to a dataframe

When trying to import a CSV file into a pandas DataFrame I get a UnicodeEncodeError because some of the characters in the CSV can't be encoded as ASCII. The CSV is originally encoded in UTF-8.
My code:
df1 = pd.read_csv(r'‪F:\data\Housing.csv')
UnicodeEncodeError: 'ascii' codec can't encode character '\u202a' in position 0: ordinal not in range(128)
I have tried several suggestions posted on Stack Overflow to resolve this issue, but nothing has worked so far.
For instance, I saved the CSV file as ASCII and tried the open command, hoping I could work my way to a DataFrame from there:
open('‪F:\data\Housing.csv', mode='r', encoding='ascii', errors='replace')
However, whether I use 'replace' or 'ignore', the error remains. I have also tried the original encoding='utf-8':
UnicodeEncodeError: 'ascii' codec can't encode character '\u202a' in position 0: ordinal not in range(128)
I also tried using codecs.open, but the same result persists.
Perhaps someone here knows how one can solve this issue? Preferably I would replace the characters causing errors with a ? sign.
Thanks in advance!
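The error message names '\u202a' at position 0. That code point is LEFT-TO-RIGHT EMBEDDING, an invisible formatting character, and its position suggests it sits at the start of the pasted path string rather than in the file contents. A minimal sketch of stripping such characters from a path before opening it, assuming that diagnosis is right (strip_format_chars is a hypothetical helper name):

```python
import unicodedata


def strip_format_chars(path):
    """Drop Unicode format characters (category 'Cf') such as U+202A."""
    return "".join(ch for ch in path if unicodedata.category(ch) != "Cf")


# The path as it might look after copy-pasting, with the invisible character in front.
pasted = "\u202aF:\\data\\Housing.csv"
clean = strip_format_chars(pasted)
print(repr(pasted))
print(repr(clean))
```

Retyping the path by hand instead of pasting it should have the same effect.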

Python Encoding Error when writing to file

I want to write some strings to a file. They are not in English but in the Azeri language. Even if I use UTF-8 encoding I get the following error:
UnicodeEncodeError: 'ascii' codec can't encode characters in position 10-12: ordinal not in range(128)
The code that writes to the file is the following:
t_w = text_list[y].encode('utf-8')
new_file.write(t_w.decode('utf-8'))
new_file.write('\n')
EDIT
Even if I change the code to:
t_w = text_list[y].encode('ascii',errors='ignore')
new_file.write(t_w)
new_file.write('\n')
I get the following error:
TypeError: write() argument must be str, not bytes
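From what I can tell, the encode/decode round trip is a no-op: t_w.decode('utf-8') simply returns the original string. The error occurs later, when write() encodes that string with the file's own encoding, which appears to default to ASCII here and cannot represent some Azeri characters. A file opened in text mode only accepts str, which is why passing the bytes returned by encode() raises the TypeError. Open the file with an explicit encoding, for example open(..., 'w', encoding='utf-8'), and write text_list[y] directly without calling encode() or decode().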
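A minimal sketch of writing non-ASCII text in Python 3, assuming the file can be opened with an explicit UTF-8 encoding. The sample strings and the temporary output path are made up for illustration; text_list and new_file mirror the names in the question.

```python
import os
import tempfile

# Hypothetical sample data standing in for the question's text_list.
text_list = ["Azərbaycan", "şəhər"]

path = os.path.join(tempfile.mkdtemp(), "out.txt")

# The encoding is set once on the file object; write() then takes plain str.
with open(path, "w", encoding="utf-8") as new_file:
    for line in text_list:
        new_file.write(line)
        new_file.write("\n")

# Read the file back with the same encoding to confirm the round trip.
with open(path, encoding="utf-8") as check:
    lines = check.read().splitlines()
print(lines)
```

Setting the encoding on open() keeps the result independent of the locale the script happens to run under.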

Python convert csv to xlsx, however getting UnicodeDecodeError

I am trying to convert a csv file to a .xlsx file using PyExcel.
Here is some example data I have in the CSV file.
1.34805E+12,STANDARD,Jose,Sez,,La Pica, 16 o,Renedo de Piélagos,,39470,Spain,,No,No,1231800,2
I am having issues with special characters. If there are none, the line
merge_all_to_a_book(glob.glob("uploadorders.csv"), "uploadorders.xlsx")
has no problems. However, if the data does contain special characters such as
Piélagos
or
Lücht
I get this error:
UnicodeDecodeError: 'utf8' codec can't decode byte 0xe9 in position 26: invalid continuation byte
I am unsure what to do about this. I have resorted to downloading the file and re-saving it in Excel.
You get the UnicodeDecodeError because the encoding Python uses to read the CSV differs from the one used to save the file. Byte 0xe9 is 'é' in Latin-1 and cp1252, so the file was probably saved in one of those rather than in UTF-8.
Either save the file as UTF-8 or read it with the correct encoding: https://docs.python.org/2/howto/unicode.html
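Re-saving as UTF-8 can also be done in code rather than by hand. The following is a sketch only, assuming the source file is cp1252 (a guess based on byte 0xe9 decoding to 'é' there); the function name reencode and the sample file are hypothetical.

```python
import os
import tempfile


def reencode(src, dst, src_encoding="cp1252"):
    """Copy a text file, decoding with src_encoding and writing UTF-8."""
    # newline="" leaves line endings untouched in both directions.
    with open(src, encoding=src_encoding, newline="") as fin, \
         open(dst, "w", encoding="utf-8", newline="") as fout:
        for line in fin:
            fout.write(line)


# Build a small cp1252 file to stand in for uploadorders.csv.
sample = "Renedo de Piélagos,39470,Spain\n"
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "orders_cp1252.csv")
dst = os.path.join(workdir, "orders_utf8.csv")
with open(src, "wb") as fh:
    fh.write(sample.encode("cp1252"))

reencode(src, dst)
with open(dst, "rb") as fh:
    converted = fh.read()
print(converted)
```

The converted file could then be handed to the existing merge_all_to_a_book call in place of the original.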

JSON encoding/decoding issues in Python

I'm trying to read in a response from a REST API, parse it as JSON and write the properties to a CSV file.
It appears some of the characters are in an unknown encoding and can't be converted to strings when they're written out to the CSV file:
'ascii' codec can't encode character u'\xf6' in position 15: ordinal not in range(128)
So, what I've tried to do is follow the answer by "agf" on this question:
UnicodeEncodeError: 'ascii' codec can't encode character u'\xa0' in position 20: ordinal not in range(128)
I added a call to unicode(content).encode("utf-8") when my script reads the contents of the response:
obj = json.loads(unicode(content).encode("utf-8"))
Now I see an exceptions.UnicodeDecodeError on this line.
Is Python attempting to decode "content" before encoding it as utf-8? I don't quite understand what's going on. There is no way to determine the encoding of the response since the API I'm calling doesn't set a Content-Type header.
Not sure how to handle this. Please advise.
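In Python 2, calling unicode(content) on a byte string first decodes it with the default ASCII codec, which would explain the UnicodeDecodeError. A sketch of the explicit alternative in Python 3 syntax: decode the raw bytes with a chosen encoding, then parse. Trying UTF-8 first and falling back to Latin-1 is an assumption, since the API does not declare its encoding; parse_json_bytes and the sample payload are hypothetical.

```python
import json


def parse_json_bytes(content, fallback="latin-1"):
    """Decode response bytes explicitly, then parse them as JSON."""
    try:
        text = content.decode("utf-8")
    except UnicodeDecodeError:
        # Assumed fallback; Latin-1 can decode any byte sequence.
        text = content.decode(fallback)
    return json.loads(text)


# Hypothetical response body containing the character from the error (u'\xf6').
content = '{"city": "Göteborg"}'.encode("utf-8")
obj = parse_json_bytes(content)
print(obj["city"])
```

When writing the parsed values to CSV, opening the output file with an explicit encoding such as UTF-8 avoids the original 'ascii' codec error.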

UnicodeDecodeError in Python with codecs module

I have a text file containing Unicode strings such as "aBiyukÙwa" and "varcasÙva". When I decode them in the Python interpreter using the following code, it works fine and decodes to u'aBiyuk\xd9wa':
"aBiyukÙwa".decode("utf-8")
But when I read it from a file in a python program using the codecs module in the following code it throws a UnicodeDecodeError.
file = codecs.open('/home/abehl/TokenOutput.wx', 'r', 'utf-8')
for row in file:
Following is the error message:
UnicodeDecodeError: 'utf8' codec can't decode byte 0xd9 in position 8: invalid continuation byte
Any ideas what is causing this strange behavior?
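Your file is not encoded in UTF-8: a lone byte 0xd9 followed by an ASCII letter is not a valid UTF-8 sequence. In Latin-1, for instance, 0xd9 is 'Ù'. Find out what the file is actually encoded in, and pass that to codecs.open instead of 'utf-8'.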
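The difference between the interpreter and the file can be seen by comparing bytes. In the sketch below the file content is a hypothetical stand-in containing a single 0xd9 byte, and Latin-1 is only one possible guess at the real encoding.

```python
# The same character takes two bytes in UTF-8 but one byte in Latin-1.
utf8_bytes = "Ù".encode("utf-8")
latin1_bytes = "Ù".encode("latin-1")
print(utf8_bytes, latin1_bytes)

# Hypothetical file content: a lone 0xd9 followed by an ASCII letter.
file_bytes = b"aBiyuk\xd9wa"
try:
    file_bytes.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError as exc:
    utf8_ok = False
    print(exc)

# Decoding with the assumed real encoding succeeds.
decoded = file_bytes.decode("latin-1")
print(decoded)
```

If Latin-1 turns out to be correct, the call in the question would become codecs.open('/home/abehl/TokenOutput.wx', 'r', 'latin-1').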
