Python using .csv files in terminal - python

I wrote the following script, which runs perfectly in PyCharm, but when I run it in a terminal it gives me these errors:
File "/Users/Chris/PycharmProjects/firstfile/trial.py", line 6, in <module>
r = pf.read_csv('python.csv')
File "/usr/local/lib/python2.7/site-packages/pandas/io/parsers.py", line 562, in parser_f
return _read(filepath_or_buffer, kwds)
File "/usr/local/lib/python2.7/site-packages/pandas/io/parsers.py", line 315, in _read
parser = TextFileReader(filepath_or_buffer, **kwds)
File "/usr/local/lib/python2.7/site-packages/pandas/io/parsers.py", line 645, in __init__
self._make_engine(self.engine)
File "/usr/local/lib/python2.7/site-packages/pandas/io/parsers.py", line 799, in _make_engine
self._engine = CParserWrapper(self.f, **self.options)
File "/usr/local/lib/python2.7/site-packages/pandas/io/parsers.py", line 1213, in __init__
self._reader = _parser.TextReader(src, **kwds)
File "pandas/parser.pyx", line 358, in pandas.parser.TextReader.__cinit__ (pandas/parser.c:3427)
File "pandas/parser.pyx", line 628, in pandas.parser.TextReader._setup_parser_source (pandas/parser.c:6861)
IOError: File python.csv does not exist
Could someone point me in the right direction? I am guessing that it has to do with the csv file not being in the right path or directory. Right now I have the csv file saved in the same folder as my .py project. I also checked and made sure I have the right packages installed, so I do not think it is that.
import csv
import pandas as pf
r = pf.read_csv('python.csv')
r.head()
print r.describe()
tradeDates = r['Trade Date'].unique()
r.name = 'Trade Date'
for trades in tradeDates:
    outfilename = trades
    printName = outfilename + ".csv"
    print printName
    r[r['Trade Date'] == trades].to_csv(printName, index=False)

When you run python /Users/Chris/PycharmProjects/firstfile/trial.py, Python looks for the csv file in your current working directory (the directory the terminal is in), not in /Users/Chris/PycharmProjects/firstfile.
You either need to change to that directory before running the code, or use the full path in trial.py like this:
import csv
import pandas as pf
r = pf.read_csv('/Users/Chris/PycharmProjects/firstfile/python.csv')
r.head()
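If you would rather not hard-code the full path, one option is to build it from the script's own location. The sketch below is only an illustration: the helper name resolve_beside is made up for this example, and in a real script you would pass __file__ instead of the literal path.

```python
import os


def resolve_beside(script_path, filename):
    """Return the absolute path of `filename` in the same folder as `script_path`."""
    script_dir = os.path.dirname(os.path.abspath(script_path))
    return os.path.join(script_dir, filename)


# Hypothetical call: inside trial.py you would write resolve_beside(__file__, "python.csv").
csv_path = resolve_beside("/Users/Chris/PycharmProjects/firstfile/trial.py", "python.csv")
print(csv_path)
```

The resulting csv_path could then be passed to pf.read_csv(csv_path), which should behave the same no matter which directory the terminal is in.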

Related

pandas read_excel from ODS file locked by another user

I'm trying to retrieve csv-formatted data with pandas from a .ods file on a shared folder (mounted using NFS on my machine), and I have trouble getting the data when someone else is working on the file.
In that case the file is locked, which makes perfect sense to avoid concurrent editing. One can see it when opening the file with LibreOffice, for example, or just by looking at the folder, where a .~lock file is present.
However, in my case I'm just trying to open the file to read it with pandas, not edit it. LibreOffice offers this possibility, for instance. Why can't pandas provide that functionality?
To be more precise, here is the command:
sheet_df = pd.read_excel(filepath, sheet_name= "Sheet2", engine="odf", skiprows=3)
and the output
File "/Users/user_name/job.py", line 148, in read_file
sheet_df = pd.read_excel(filepath, sheet_name= "Sheet2", engine="odf", skiprows=3)
File "/Users/user_name/.pyenv/versions/virtualenv_prod/lib/python3.9/site-packages/pandas/util/_decorators.py", line 311, in wrapper
return func(*args, **kwargs)
File "/Users/user_name/.pyenv/versions/virtualenv_prod/lib/python3.9/site-packages/pandas/io/excel/_base.py", line 364, in read_excel
io = ExcelFile(io, storage_options=storage_options, engine=engine)
File "/Users/user_name/.pyenv/versions/virtualenv_prod/lib/python3.9/site-packages/pandas/io/excel/_base.py", line 1233, in __init__
self._reader = self._engines[engine](self._io, storage_options=storage_options)
File "/Users/user_name/.pyenv/versions/virtualenv_prod/lib/python3.9/site-packages/pandas/io/excel/_odfreader.py", line 35, in __init__
super().__init__(filepath_or_buffer, storage_options=storage_options)
File "/Users/user_name/.pyenv/versions/virtualenv_prod/lib/python3.9/site-packages/pandas/io/excel/_base.py", line 420, in __init__
self.book = self.load_workbook(self.handles.handle)
File "/Users/user_name/.pyenv/versions/virtualenv_prod/lib/python3.9/site-packages/pandas/io/excel/_odfreader.py", line 46, in load_workbook
return load(filepath_or_buffer)
File "/Users/user_name/.pyenv/versions/virtualenv_prod/lib/python3.9/site-packages/odf/opendocument.py", line 982, in load
z = zipfile.ZipFile(odffile)
File "/Users/user_name/.pyenv/versions/3.9.2/lib/python3.9/zipfile.py", line 1257, in __init__
self._RealGetContents()
File "/Users/user_name/.pyenv/versions/3.9.2/lib/python3.9/zipfile.py", line 1322, in _RealGetContents
raise BadZipFile("File is not a zip file")
zipfile.BadZipFile: File is not a zip file
I'm using Python 3.9.2 on macOS Big Sur, by the way.
Am I missing something, or is pandas.read_excel unable to open a file read-only?
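A quick way to narrow this down, offered as a diagnostic sketch rather than a fix: the traceback ends in zipfile, and an .ods file is a zip container, so it can help to check whether the bytes Python sees on the share form a valid zip archive at all. The helper name describe_file and the demo files below are made up for illustration; only os.path.getsize and zipfile.is_zipfile are standard-library calls.

```python
import os
import tempfile
import zipfile


def describe_file(path):
    """Report the size of `path` and whether Python recognises it as a zip archive."""
    return {
        "size_bytes": os.path.getsize(path),
        "is_zip": zipfile.is_zipfile(path),
    }


# Demo on two throwaway files: a real (tiny) zip archive and a plain byte string.
with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "good.ods")
    with zipfile.ZipFile(good, "w") as archive:
        archive.writestr("content.xml", "<x/>")

    bad = os.path.join(tmp, "bad.ods")
    with open(bad, "wb") as handle:
        handle.write(b"not a zip")

    good_info = describe_file(good)
    bad_info = describe_file(bad)

print(good_info)
print(bad_info)
```

Running describe_file on the locked file and on a local copy of it might show whether the problem is the lock itself or the way the file is being read over NFS.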

How to open a .csv file after user input, using pandas?

I'm very new to Python and this will be an extremely basic question.
I want a user to input the name of a csv file, which I want to open with pandas to easily access its rows and columns.
This is the code that I wrote:
import pandas as pd
DATAFIN = str(raw_input("Name of your data file"))
dataset = pd.read_csv(DATAFIN)
dataset.head()
However, I seem to be making some kind of mistake, because this is the message I get (sorry for the length):
Traceback (most recent call last):
File "c:\Users\File.py", line 34, in <module>
dataset = pd.read_csv(DATAFIN)
File "C:\Python27\lib\site-packages\pandas\io\parsers.py", line 702, in parser_f
return _read(filepath_or_buffer, kwds)
File "C:\Python27\lib\site-packages\pandas\io\parsers.py", line 429, in _read
parser = TextFileReader(filepath_or_buffer, **kwds)
File "C:\Python27\lib\site-packages\pandas\io\parsers.py", line 895, in __init__
self._make_engine(self.engine)
File "C:\Python27\lib\site-packages\pandas\io\parsers.py", line 1122, in _make_engine
self._engine = CParserWrapper(self.f, **self.options)
File "C:\Python27\lib\site-packages\pandas\io\parsers.py", line 1853, in __init__
self._reader = parsers.TextReader(src, **kwds)
File "pandas\_libs\parsers.pyx", line 387, in pandas._libs.parsers.TextReader.__cinit__
File "pandas\_libs\parsers.pyx", line 705, in pandas._libs.parsers.TextReader._setup_parser_source
does not exist: ' maindata.csv\r'csv
Do you have any idea what the problem is?
I am sorry for any mistakes in formatting.
It looks like there is a space character at the start of your string:
' maindata.csv\r'
Try typing the csv name without the space, so that the prompt line looks like this:
Name of your data filemaindata.csv
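The stray characters can also be removed in code. A minimal sketch, assuming the only problem is surrounding whitespace such as the leading space and the trailing \r shown in the error; clean_filename is a hypothetical helper name:

```python
def clean_filename(raw):
    """Drop leading and trailing whitespace (spaces, \\r, \\n, tabs) from user input."""
    return raw.strip()


# The exact string from the error message above.
cleaned = clean_filename(" maindata.csv\r")
print(repr(cleaned))
```

In the original script this would amount to calling .strip() on the value returned by raw_input before handing it to pd.read_csv.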
Try reading your csv file using a raw string for the path: pd.read_csv(r'address of the file.csv')
Use argparse to get the filename from the command line.
Run your script with python script.py --filename file.csv
and use print to see the result:
import pandas as pd
import argparse
parser = argparse.ArgumentParser()
parser.add_argument('--filename')
args = parser.parse_args()
dataset = pd.read_csv(args.filename)
print(dataset.head())

Python : FileNotFoundError: File b'fleet.csv' does not exist

I am getting a FileNotFoundError when I try to read a particular CSV file in the directory.
If I read another CSV file, I can read it properly without any error.
What I have tried
fleet_data=pd.read_csv('data_fleet.csv', sep=',',index_col=0)
fleet_data=pd.read_csv('Users/Ver/Desktop/Processing/data_fleet.csv',sep=',',index_col=0)
fleet_data=pd.read_csv('Users\Ver\Desktop\Processing\data_fleet.csv',sep=',',index_col=0)
fleet_data=pd.read_csv('data_fleet.csv')
I tried changing the name of the file, but it still doesn't work.
Error
fleet_data=pd.read_csv('data_fleet.csv', sep=',',index_col=0)
Traceback (most recent call last):
File "C:\Users\VW3ZTWS\PycharmProjects\Data_Collection_and_learnings\venv\lib\site-packages\IPython\core\interactiveshell.py", line 2869, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-36-9aab06bbbbcc>", line 1, in <module>
fleet_data=pd.read_csv('data_fleet.csv', sep=',',index_col=0)
File "C:\Users\VW3ZTWS\PycharmProjects\Data_Collection_and_learnings\venv\lib\site-packages\pandas\io\parsers.py", line 678, in parser_f
return _read(filepath_or_buffer, kwds)
File "C:\Users\VW3ZTWS\PycharmProjects\Data_Collection_and_learnings\venv\lib\site-packages\pandas\io\parsers.py", line 440, in _read
parser = TextFileReader(filepath_or_buffer, **kwds)
File "C:\Users\VW3ZTWS\PycharmProjects\Data_Collection_and_learnings\venv\lib\site-packages\pandas\io\parsers.py", line 787, in __init__
self._make_engine(self.engine)
File "C:\Users\VW3ZTWS\PycharmProjects\Data_Collection_and_learnings\venv\lib\site-packages\pandas\io\parsers.py", line 1014, in _make_engine
self._engine = CParserWrapper(self.f, **self.options)
File "C:\Users\VW3ZTWS\PycharmProjects\Data_Collection_and_learnings\venv\lib\site-packages\pandas\io\parsers.py", line 1708, in __init__
self._reader = parsers.TextReader(src, **kwds)
File "pandas\_libs\parsers.pyx", line 384, in pandas._libs.parsers.TextReader.__cinit__
File "pandas\_libs\parsers.pyx", line 695, in pandas._libs.parsers.TextReader._setup_parser_source
FileNotFoundError: File b'data_fleet.csv' does not exist
However, if I copy the CSV file from this project to another project folder and open the Python file there (data_fleet.py), I can read it without any issues.
What is the issue with reading the file from the desired folder?
Try giving it the absolute path:
'C:\\Users\\Ver\\Desktop\\Processing\\data_fleet.csv'
Could you try listing the contents of your directory? For example, using the os module and its listdir() function:
>>> import os
>>> contents = os.listdir()
>>> contents
This will let you see whether there are any odd characters or anything else preventing you from "finding it".
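Odd characters such as stray spaces are easier to spot when each name is shown with repr(). A small sketch along those lines; the helper name list_names_verbatim and the demo file with a space in its name are assumptions made for illustration:

```python
import os
import tempfile


def list_names_verbatim(directory):
    """Return the repr() of every entry in `directory`, so hidden characters show up."""
    return [repr(name) for name in sorted(os.listdir(directory))]


# Demo: a file whose name contains an easily missed space before the extension.
with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "data_fleet .csv"), "w").close()
    listing = list_names_verbatim(tmp)

print(listing)
```

Calling list_names_verbatim(os.getcwd()) in the failing session would also confirm which directory the interpreter is actually looking in.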

Python: IO error when importing csv file with pandas

I am trying to import a csv file saved in a local folder. When I use an Anaconda Python notebook I have no problems, but with Zeppelin I do have issues.
The code I am using, that works fine in Anaconda, is:
#import csv data
frequency=pd.read_csv("C:\\Users\\L18938\\Desktop\\Vehicle_to_grid\\analysis\\Frequency_March_2018.csv", nrows=86401)
However, when running it on Zeppelin, I receive:
Traceback (most recent call last):
File "<stdin>", line 2, in <module>
File "/usr/local/lib/python2.7/dist-packages/pandas/io/parsers.py", line 646, in parser_f
return _read(filepath_or_buffer, kwds)
File "/usr/local/lib/python2.7/dist-packages/pandas/io/parsers.py", line 389, in _read
parser = TextFileReader(filepath_or_buffer, **kwds)
File "/usr/local/lib/python2.7/dist-packages/pandas/io/parsers.py", line 730, in __init__
self._make_engine(self.engine)
File "/usr/local/lib/python2.7/dist-packages/pandas/io/parsers.py", line 923, in _make_engine
self._engine = CParserWrapper(self.f, **self.options)
File "/usr/local/lib/python2.7/dist-packages/pandas/io/parsers.py", line 1390, in __init__
self._reader = _parser.TextReader(src, **kwds)
File "pandas/parser.pyx", line 373, in pandas.parser.TextReader.__cinit__ (pandas/parser.c:4025)
File "pandas/parser.pyx", line 667, in pandas.parser.TextReader._setup_parser_source (pandas/parser.c:8031)
IOError: File C:\Users\L18938\Desktop\Vehicle_to_grid\analysis\Frequency_March_2018.csv does not exist
Obviously, the file exists and there are no errors in the path spelling.
I have tried / and double \, but nothing changes. I also tried
os.chdir("C:/Users/L18938/Desktop/Vehicle_to_grid/analysis")
or
os.listdir("C:/Users/L18938/Desktop/Vehicle_to_grid/analysis")
Any ideas? Thank you in advance.
Your traceback shows that the Python interpreter is running with Unix-style file paths (/usr/local/lib/python2.7/dist-packages/pandas/io/parsers.py).
Under Anaconda you are on plain Windows, and your traceback would look something like (C:\ProgramData\Anaconda3\lib\site-packages\pandas\io\parsers.py).
Anaconda reaches the file with a Windows-style file path, while Zeppelin expects a Unix-style file path.
Your issue is definitely related to how you specify the path in Zeppelin. You can't use a Windows path there, but you could try something like this:
frequency=pd.read_csv("file:///C:/Users/L18938/Desktop/Vehicle_to_grid/analysis/Frequency_March_2018.csv", nrows=86401)
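To confirm which kind of system the interpreter is running on, a short check like the following can be run in both Anaconda and Zeppelin. It is only a sketch: the function name interpreter_environment is invented for this example, while platform.system(), os.sep and os.getcwd() are standard-library calls.

```python
import os
import platform


def interpreter_environment():
    """Collect basic facts about where this Python interpreter is running."""
    return {
        "system": platform.system(),   # e.g. "Windows" or "Linux"
        "path_separator": os.sep,      # "\\" on Windows, "/" on Unix-like systems
        "cwd": os.getcwd(),            # directory that relative paths are resolved against
    }


info = interpreter_environment()
print(info)
```

If the two environments report different systems, a path that is valid in one will generally not exist in the other.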

Scan a directory tree and read .csv files into a dataframe using Python

I am trying to walk a directory tree, and for each csv encountered on the walk I would like to open the file and read columns 0 and 15 into a data-frame (after which I'll process it and move on to the next file). I can walk the directory tree using the following:
rootdir = r'C:/Users/stacey/Documents/Alco/auditopt/'
for dirName,sundirList, fileList in os.walk(rootdir):
    print('Found directory: %s' % dirName)
    for fname in fileList:
        print('\t%s' % fname)
        df = pd.read_csv(fname, header=1, usecols=[0,15],parse_dates=[0], dayfirst=True,index_col=[0], names=['date', 'total_pnl_per_pos'])
        print(df)
but I'm getting the error message:
FileNotFoundError: File b'auditopt.os-pnl.BBG_XASX_ARB_S-BBG_XTKS_7240_S.csv' does not exist.
I am trying to read files which do exist. They are in MS Excel .csv format, so I don't know if that is an issue. If it is, would someone let me know how to read an MS Excel .csv into a data-frame, please?
The full stack trace is as follows:
Found directory: C:/Users/stacey/Documents/Alco/auditopt/
Found directory: C:/Users/stacey/Documents/Alco/auditopt/roll_597_oe_2017-03-10
tradeopt.os-pnl.BBG_XASX_ARB_S-BBG_XTKS_7240_S.csv
Traceback (most recent call last):
File "<ipython-input-24-3753e367432d>", line 1, in <module>
runfile('C:/Users/stacey/Documents/scripts/Pair_Results_Code_1.0.py', wdir='C:/Users/stacey/Documents/scripts')
File "C:\Anaconda\lib\site-packages\spyder\utils\site\sitecustomize.py", line 866, in runfile
execfile(filename, namespace)
File "C:\Anaconda\lib\site-packages\spyder\utils\site\sitecustomize.py", line 102, in execfile
exec(compile(f.read(), filename, 'exec'), namespace)
File "C:/Users/stacey/Documents/scripts/Pair_Results_Code_1.0.py", line 49, in <module>
main()
File "C:/Users/stacey/Documents/scripts/Pair_Results_Code_1.0.py", line 36, in main
df = pd.read_csv(fname, header=1, usecols=[0,15],parse_dates=[0], dayfirst=True,index_col=[0], names=['date', 'total_pnl_per_pos'])
File "C:\Anaconda\lib\site-packages\pandas\io\parsers.py", line 646, in parser_f
return _read(filepath_or_buffer, kwds)
File "C:\Anaconda\lib\site-packages\pandas\io\parsers.py", line 389, in _read
parser = TextFileReader(filepath_or_buffer, **kwds)
File "C:\Anaconda\lib\site-packages\pandas\io\parsers.py", line 730, in __init__
self._make_engine(self.engine)
File "C:\Anaconda\lib\site-packages\pandas\io\parsers.py", line 923, in _make_engine
self._engine = CParserWrapper(self.f, **self.options)
File "C:\Anaconda\lib\site-packages\pandas\io\parsers.py", line 1390, in __init__
self._reader = _parser.TextReader(src, **kwds)
File "pandas\parser.pyx", line 373, in pandas.parser.TextReader.__cinit__ (pandas\parser.c:4184)
File "pandas\parser.pyx", line 667, in pandas.parser.TextReader._setup_parser_source (pandas\parser.c:8449)
FileNotFoundError: File b'tradeopt.os-pnl.BBG_XASX_ARB_S-BBG_XTKS_7240_S.csv' does not exist
When reading the file, you need to provide the full path. os.walk yields bare file names, not full paths, so you'll need to build the full path yourself.
Use os.path.join to make this easy.
import os
full_path = os.path.join(dirName, fname)
df = pd.read_csv(full_path, ...)
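Put together, the walk might look like the sketch below. It only collects the full paths and leaves the pd.read_csv call out; the function name find_csv_files and the throwaway demo files are assumptions for illustration, not part of the original script.

```python
import os
import tempfile


def find_csv_files(rootdir):
    """Walk `rootdir` and return the full path of every .csv file found."""
    paths = []
    for dir_name, _subdirs, file_names in os.walk(rootdir):
        for fname in file_names:
            if fname.lower().endswith(".csv"):
                # Join the directory being visited with the bare file name.
                paths.append(os.path.join(dir_name, fname))
    return sorted(paths)


# Demo on a temporary tree with one csv file and one unrelated file.
with tempfile.TemporaryDirectory() as root:
    sub = os.path.join(root, "roll_597_oe_2017-03-10")
    os.makedirs(sub)
    open(os.path.join(sub, "example.csv"), "w").close()
    open(os.path.join(sub, "notes.txt"), "w").close()

    found = find_csv_files(root)
    relative = [os.path.relpath(path, root) for path in found]

print(relative)
```

Each path returned by find_csv_files could then be passed to pd.read_csv with the same keyword arguments as in the question.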
