Pandas library unable to read csv file


I have just one line of code which reads a CSV file into a variable df, but this gives the following error: No columns to parse from file.
import pandas as pd
df = pd.read_csv("D:\Folder1\train.csv")
The CSV file is at this location (I've checked it more than once) and the CSV file was being correctly read until I updated the pandas library.
Can someone tell me how to remove this error?

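The problem is the backslashes in your path: in a normal Python string, "\t" (as in "\train.csv") is read as a tab character. You have to use forward slashes "/" in your path, or a raw string (r"D:\Folder1\train.csv"), or doubled backslashes.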
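As a minimal sketch of why the path matters, the snippet below compares several spellings of the path from the question. It only builds strings and never touches the disk. The variable names are illustrative, and the first backslash in `broken` is doubled here only to keep the example free of warnings about the invalid "\F" escape.

```python
from pathlib import PureWindowsPath

# "\t" inside a normal string literal is a tab, so this path is silently corrupted.
broken = "D:\\Folder1\train.csv"

forward = "D:/Folder1/train.csv"      # forward slashes
raw = r"D:\Folder1\train.csv"         # raw string: backslashes are kept literally
escaped = "D:\\Folder1\\train.csv"    # doubled backslashes

print("\t" in broken)   # the tab character really is part of the string
print(raw == escaped)   # the raw and escaped spellings are the same string
print(PureWindowsPath(forward) == PureWindowsPath(raw))  # both name the same Windows path
```

Any of the three working spellings can be passed to pd.read_csv.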

Related

How to extract a table from a csv file generated by Database

I have a csv file with comments marked by '#'. I want to select only the table part from this and get it into a pandas dataframe. I can just check the '#' marks and the table header and delete them but it will not be dynamic enough. If the csv file is slightly changed it won't work.
Please help me figure out a way to extract only the table part from this csv file.

read_csv has a comment argument you can use when reading the file, but each metadata line has to start with the comment character, or it will not be treated as a comment.
import pandas as pd
df = pd.read_csv('path/to/file.csv', sep=';', comment='#')

The CSV format has no standard comment syntax, so you have to remove the comment lines yourself. Try scanning from the end of the file, and stop at the first line that contains '#' but no ';'.
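A rough sketch of the manual approach, using only the standard library: drop blank lines and lines that start with '#', then parse what is left. The sample text and the helper name `read_table` are made up for illustration, and the sketch assumes comments always occupy whole lines and that the table is separated by ';'.

```python
import csv

# Hypothetical sample: metadata lines start with '#', the table uses ';'.
sample = """# exported by some database tool
# another metadata line
id;name
1;alpha
2;beta
# end of export
"""

def read_table(text, comment="#", delimiter=";"):
    """Return the table rows as dicts, skipping blank and comment lines."""
    kept = [
        line
        for line in text.splitlines()
        if line.strip() and not line.lstrip().startswith(comment)
    ]
    # DictReader accepts any iterable of lines; the first kept line is the header.
    return list(csv.DictReader(kept, delimiter=delimiter))

rows = read_table(sample)
print(rows)
```

Because the filter looks only at the first non-space character of each line, it keeps working if metadata lines are added or removed, as long as they keep the '#' prefix.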

Python pandas csv file unicode error and stuffs

I'm trying to read a CSV file in Python. The code goes like this:
import pandas as pd
df = pd.read_csv("C:\Users\User\Desktop\Inan")
print(df.head())
However, it keeps showing the unicode error. I tried adding the r prefix and changing the slashes in multiple ways, but it didn't work; it just showed different errors like "file not found". What can I do?

Try this; it may work:
df = pd.read_csv("C:/Users/User/Desktop/Inan.csv", encoding="utf-8")
Include the file extension as well (.csv, .xlsx).
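To illustrate where the unicode error most likely comes from: in a normal string literal, "\U" starts an eight-digit Unicode escape, so "C:\Users" cannot even be compiled. The sketch below holds the question's assignment as text and compiles it, so the failure can be caught and printed instead of stopping the script. The variable names are illustrative only.

```python
# The assignment from the question, kept as source text so it can be compiled safely.
question_source = 'path = "C:\\Users\\User\\Desktop\\Inan"'

try:
    compile(question_source, "<question>", "exec")
    error = None
except SyntaxError as exc:
    # "\U" must be followed by eight hex digits, and "sers\Use" is not that.
    error = exc

print(error)

# Two spellings that compile and describe the same location.
raw_path = r"C:\Users\User\Desktop\Inan.csv"
forward_path = "C:/Users/User/Desktop/Inan.csv"
print(raw_path.replace("\\", "/") == forward_path)
```

Once the literal compiles, a remaining "file not found" error would be consistent with the missing extension mentioned in the answer above.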

I keep getting UnicodeErrors opening a CSV file although adding utf-8 encoding

I know the question sounds generic, but here is my problem.
I have a csv file that will always cause UnicodeErrors and errors like csv.empty, although I am opening the file with utf-8
like this
with open(csv_filename, 'r', encoding='utf-8') as csvfile:
A workaround I found is to open the file I want, copy the lines, and save them to a new file (with Visual Studio Code); then everything works fine.
Someone told me that I have to use pandas. Is it true?
Is there a difference between opening a file with CSV and Pandas?

Pandas will load the contents of the csv file into a dataframe.
The csv module has classes like reader and DictReader that return iterators that let you move through the file row by row.
With Pandas:
import pandas as pd
df=pd.read_csv('file.csv')
df.to_csv('new_file.csv',index=False)
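For comparison, here is a small sketch of the csv module side, reading from an in-memory string rather than a real file. The sample data is made up for illustration.

```python
import csv
import io

# Hypothetical CSV content held in memory.
data = "A,B,C\n1,4,6\n2,5,5\n"

# csv.reader yields each row as a list of strings, header row included.
reader_rows = list(csv.reader(io.StringIO(data)))

# csv.DictReader uses the first row as keys and yields one dict per data row.
dict_rows = list(csv.DictReader(io.StringIO(data)))

print(reader_rows)
print(dict_rows)
```

Note that the csv module leaves every value as a string, whereas pd.read_csv tries to infer numeric column types.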

Python: How to write data to an Excel file using pd.ExcelWriter?

The question:
I'm trying to write data to an Excel file using Python, specifically using the ExcelWriter class provided by pandas as described in the docs. I think I'm onto something here, but I'm only able to achieve one of two outcomes:
1. If the Excel file is open, access permission is denied.
2. If the Excel file is closed, the code seems to run just fine, but the following error message appears when trying to open the Excel file after execution:
Excel cannot open the file excelTest.xlsm because the file format or
file extension is not valid. Verify that the file has not been
corrupted and that the file extension matches the format of the file
Does anyone know what's going on here? Or is there perhaps a better way to do this than using pd.ExcelWriter?
The details:
I've got three files in the directory C:\pythontest:
1. input.txt
2. excelTest.xlsm
3. pythonTest.py
input.txt is a comma separated text file with this content:
A,B,C
1,4,6
2,5,5
3,5,6
excelTest.xlsm is an Excel file that is completely empty with the exception of one empty sheet named Sheet1.
pythonTest.py is a script where I'm trying to read the txt file using Python, and then write a pandas dataframe to the Excel file:
import os
import pandas as pd
os.getcwd()
os.chdir('C:/pythonTest')
os.listdir(os.getcwd())
df = pd.read_csv('C:\\pythonTest\\input.txt')
writer = pd.ExcelWriter('excelTest.xlsm')
df.to_excel(writer,'Sheet2')
writer.save()
But as I've mentioned, it fails spectacularly. Any suggestions?
System info:
Windows 7, 64 bit
Excel Version 1803
Python 3.6.6 | Anaconda custom (64-bit) |
Pandas 0.23.4

Pandas requires that a workbook name ends in .xls or .xlsx. It uses the extension to choose which Excel engine to use.
So the problem you've got is the extension: because of "extension hardening", Excel won't open this file, since it knows that the file doesn't contain a macro and isn't actually an xlsm file. Writing to excelTest.xlsx should work!

Pandas 0.19.2 read_excel IndexError: List index out of range

I am looking to parse an excel spreadsheet. I decided to use pandas but got caught by an error straight off the bat.
I started with the code below but played around with using a full path and also tried setting the sheetname.
import pandas as pd
table = pd.read_excel('ss_12.xlsx')
if __name__ == '__main__':
    pass
The Excel spreadsheet is in the same directory as my script file. I thought it would work the same as open() in this sense, with just a name required if it's in the same directory. I have looked at a few examples online and, going by them, this should work.
I am trying to strip the first column seen in the image above. The full error (not sure how to format it, sorry):
C:\xx\Playpen\ConfigList_V1_0.xlsx
Traceback (most recent call last):
  File "C:\xx\Playpen\getConVars.py", line 12, in <module>
    pd.read_excel(excelFile)
  File "C:\xx\Programs\Python\Python35\lib\site-packages\pandas\io\excel.py", line 200, in read_excel
    **kwds)
  File "C:\xx\Programs\Python\Python35\lib\site-packages\pandas\io\excel.py", line 432, in _parse_excel
    sheet = self.book.sheet_by_index(asheetname)
  File "C:\xx\Programs\Python\Python35\lib\site-packages\xlrd\book.py", line 432, in sheet_by_index
    return self._sheet_list[sheetx] or self.get_sheet(sheetx)
IndexError: list index out of range

Make sure you have the right kind of Excel spreadsheet. I had this same error and realized that I had saved the file as a Strict Open XML Spreadsheet, which still had the .xlsx extension.

If you just want to read the file, it's better to build the path with os.path, as follows:
import os
import pandas as pd
directory = 'path_to_excel_file_directory'
excelFile = os.path.join(directory, 'fileName.xlsx')
pd.read_excel(excelFile)
And if the Excel file is in the same directory as your script, you can use inspect to detect that directory automatically:
import inspect
scriptName = inspect.getframeinfo(inspect.currentframe()).filename
directory = os.path.dirname(os.path.abspath(scriptName))
excelFile = os.path.join(directory, 'fileName.xlsx')
pd.read_excel(excelFile)
One final note: the part
if __name__ == '__main__':
    pass
is not related to the question.

Quick solution: create a brand new file (.xls) with the content of the one that gives you the error. It worked for me.
