problem with reading a csv file with pandas in executable

problem with reading a csv file with pandas in executable - python

i'm writing a software that reads a csv file at after some steps creates another csv file as output, the software is working fine but when i try to create an executable with pyinstaller i have an error saying that my software can't find the input csv file. Here is how i am reading the csv file as input, i've also tryed to change the pathname with no luck:
import pandas as pd
def lettore():
RawData = pd.read_csv('rawdata.csv', sep=';')
return RawData
how can i solve the problem?

Your code searches for the file it the same folder where the exe is launched.
It is equivalent to
import os
import pandas
filepath = os.path.join(os.getcwd(), 'filename.csv')
df = pd.read_csv(filepath)
Do not use relative paths when you create an exe.
I can give you two other options:
Use an input to get the right file path when running the exe (or eventually use argparse).
filepath = input("insert your csv: ")
df = pd.read_csv(filepath)
Define an absolute path and build it in your code (you cannot change it after building and the program will read the file only from that path).
Edit: after reading your comment, see also
How to reliably open a file in the same directory as a Python script

Related

How to write dataframe to csv to the current working directory python

My code would be something like this:
import os
import pandas as pd
cwd = os.getcwd()
csv_name = '/CONTCAR_SORTED'
df = pd.read_csv(f"{cwd}{csv_name}",
skiprows=2, nrows=100, names=['X','Y','Z' ],
delimiter='\s+',engine='python')
df=df.to_csv("new")
In this way, the output file is written in the directory where the executed python script is located. I have tried different ways to specify a different folder but no file is written in those cases. I don't know how to pass and change to another path to the destination.

cwd = os.getcwd()
path = cwd + "/new"
df.to_csv(path)
Your file will be stored in the working directory with the name 'new'.

Writing dataframe to CSV file is documented here: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_csv.html
In every python module there is a variable __file__ which has the absolute path for the current python module. You can use it to write CSV to where that python file is located.

You can join the path and pass it to to_csv method.
df.to_csv(os.path.join(cwd, "new"))

How to read a CSV from a folder without file name in Python

I need to read a CSV file from a folder, which is generating from another Module. If that Module fails it won't generate the folder which will have a CSV file.
Example:
path = 'c/files' --- fixed path
When Module successfully runs it will create a folder called output and a file in it.
path =
'c/files/output/somename.csv'
But here is a catch everytime it generates a output folder, CSV file has a different name.
First i need to check if that output folder and a CSV file is there or not.
After that I need to read that CSV.

The following will check for existance of output folder as well as csv file and read the csv file:
import os
import pandas as pd
if 'output' in os.listdir('c/files'):
if len(os.listdir('c/files/output')>0:
x=[i for i in os.listdir('c/files/output') if i[-3:]=='csv][0]
new_file=pd.read_csv(x)

glob.glob can help.
glob.glob('c/files/output/*.csv') returns either an empty list or a list with (hopefully) the path to a single file

You may also try to get the latest file based on creation time, after you have done check on existence of a directory (from above post). Something like below
list_of_files = glob.glob("c/files/*.csv")
latest_file = max(list_of_files, key=os.path.getctime)
latest_file is the latest created file in your directory.

how to get the name of an unknown .XLS file into a variable in Python 3.7

I'm using Python 3.7.
I have to download an excel file (.xls) that has a unique filename every time I download it into a specific downloads folder location.
Then with Python and Pandas, I then have to open the excel file and read/convert it to a dataframe.
I want to automate the process, but I'm having trouble telling Python to get the full name of the XLS file as a variable, which will then be used by pandas:
# add dependencies and set location for downloads folder
import os
import glob
import pandas as pd
download_dir = '/Users/Aaron/Downloads/'
# change working directory to download directory
os.chdir(download_dir)
# get filename of excel file to read into pandas
excel_files = glob.glob('*.xls')
blah = str(excel_files)
blah
So then for example, the output for "blah" is:
"['63676532355861.xls']"
I have also tried just using "blah = print(excel_files)" for the above block, instead of the "str" method, and assigning that to a variable, which still doesn't work.
And then the rest of the process would do the following:
# open excel (XLS) file with unknown filename in pandas as a dataframe
data_df = pd.read_excel('WHATEVER.xls', sheet_name=None)
And then after I convert it to a data frame, I want to DELETE the excel file.
So far, I have spent a lot of time reading about fnames, io, open, os.path, and other libraries.
I still don't know how to get the name of the unknown .XLS file into a variable, and then later deleting that file.
Any suggestions would be greatly appreciated.

This code finds an xls file in your specified path reads the xls file and deletes the file.If your directory contains more than 1 xls file,It reads the last one.You can perform whatever operation you want if you find more than one xls files.
import os
for filename in os.listdir(os.getcwd()):
if filename.endswith(".xls"):
print(filename)
#do your operation
data_df = pd.read_excel(filename, sheet_name=None)
os.remove(filename)

Check this,
lst = os.listdir()
matching = [s for s in lst if '.xls' in s]
matching will have all list of excel files.
As you are having only one excel file, you can save in variable like file_name = matching[0]

Creating a standalone file using Pandas code

I have little to no background in Python or computer science so I’ll try my best to explain what I want to accomplish. I have a Pandas script in Jupyter notebook that edits an Excel .csv file and exports it as an Excel .xlsx file. Basically the reason why we want to do this is because we get these same Excel spreadsheets full of unwanted and disorganized data from the same source. I want other people at my office that don’t have Python to be able to use this script to edit these spreadsheets. From what I understand, this involves creating a standalone file.
Here is my code from Pandas that exports a new spreadsheet:
import pandas as pd
from pandas import ExcelWriter
test = pd.DataFrame.from_csv('J:/SDGE/test.csv', index_col=None)
t = test
for col in ['Bill Date']:
t[col] = t[col].ffill()
T = t[t.Meter.notnull()]
T = T.reset_index(drop=True)
writer = ExcelWriter('PythonExport.xlsx')
T.to_excel(writer,'Sheet5')
writer.save()
How can I make this code into a standalone executable file? I've seen other forums with responses to similar problems, but I still don't understand how to do this.

First, you need to change some parts in your code to make it work for anybody, without the need for them to edit the Python code.
Secondly, you will need to convert your file to an executable (.exe).
There is only one part in your code that needs to be changed to work for everyone: the csv file name and directory
Since your code only works when the file "test.csv" is in the "J:/SDGE/" directory, you can follow one of the following solutions:
Tell everyone who uses the program that the file must be in a precise public directory and named "test.csv" in order to work. (bad)
Change your program to allow for input from the user. This is a little more complex, but is the solution that people probably want:
Add an import for a file selector at the top:
from tkinter.filedialog import askopenfilename
Replace
'J:/SDGE/test.csv'
With
askopenfilename()
This should be the final python script:
import pandas as pd
from pandas import ExcelWriter
from tkinter.filedialog import askopenfilename #added this
test = pd.DataFrame.from_csv(askopenfilename(), index_col=None)
t = test
for col in ['Bill Date']:
t[col] = t[col].ffill()
T = t[t.Meter.notnull()]
T = T.reset_index(drop=True)
writer = ExcelWriter('PythonExport.xlsx')
T.to_excel(writer,'Sheet5')
writer.save()
However, you want this as an executable program, that way others don't have to have python installed and know how to run the script. There are several ways to turn your new .py file into an executable. I would look into this thread.

If you want to run a python script on anyone's system, you will need to have Python installed in that system.
Once you have that, just create a .bat file for the command that you'd be using to execute the python file through CMD.
Step 1: Open Notepad and create a new file
Step 2: Write the command as follows in the file (Just replace the path and filename according to you)
python file.py
Step 3: Save it as script.bat (Select All Types from the list of file types while saving)
Now you can run that batch file as any other program and it will run the code for you. The only thing you need to make while you distribute this batch file and python script is to make sure that both the files are kept in the same location. Or else you will have to add the full path in front of file.py

"CSV file does not exist" for a filename with embedded quotes

I am currently learning Pandas for data analysis and having some issues reading a csv file in Atom editor.
When I am running the following code:
import pandas as pd
df = pd.read_csv("FBI-CRIME11.csv")
print(df.head())
I get an error message, which ends with
OSError: File b'FBI-CRIME11.csv' does not exist
Here is the directory to the file: /Users/alekseinabatov/Documents/Python/"FBI-CRIME11.csv".
When i try to run it this way:
df = pd.read_csv(Users/alekseinabatov/Documents/Python/"FBI-CRIME11.csv")
I get another error:
NameError: name 'Users' is not defined
I have also put this directory into the "Project Home" field in the editor settings, though I am not quite sure if it makes any difference.
I bet there is an easy way to get it to work. I would really appreciate your help!

Have you tried?
df = pd.read_csv("Users/alekseinabatov/Documents/Python/FBI-CRIME11.csv")
or maybe
df = pd.read_csv('Users/alekseinabatov/Documents/Python/"FBI-CRIME11.csv"')
(If the file name has quotes)

Just referring to the filename like
df = pd.read_csv("FBI-CRIME11.csv")
generally only works if the file is in the same directory as the script.
If you are using windows, make sure you specify the path to the file as follows:
PATH = "C:\\Users\\path\\to\\file.csv"

Had an issue with the path, it turns out that you need to specify the first '/' to get it to work!
I am using VSCode/Python on macOS

I also experienced the same problem I solved as follows:
dataset = pd.read_csv('C:\\Users\\path\\to\\file.csv')

Being on jupyter notebook it works for me including the relative path only. For example:
df = pd.read_csv ('file.csv')
But, for example, in vscode I have to put the complete path:
df = pd.read_csv ('/home/code/file.csv')

You are missing '/' before Users. I assume that you are using a MAC guessing from the file path names. You root directory is '/'.

I had the same issue, but it was happening because my file was called "geo_data.csv.csv" - new laptop wasn't showing file extensions, so the name issue was invisible in Windows Explorer.
Very silly, I know, but if this solution doesn't work for you, try that :-)

Just change the CSV file name. Once I changed it for me, it worked fine. Previously I gave data.csv then I changed it to CNC_1.csv.

What worked for me:
import csv
import pandas as pd
import os
base =os.path.normpath(r"path")
with open(base, 'r') as csvfile:
readCSV = csv.reader(csvfile, delimiter='|')
data=[]
for row in readCSV:
data.append(row)
df = pd.DataFrame(data[1:],columns=data[0][0:15])
print(df)
This reads in the file , delimit by |, and appends to list which is converted to a pandas df (taking 15 columns)

Make sure your source file is saved in .csv format. I tried all the steps of adding the full path to the file, including and deleting the header=0, adding skiprows=0 but nothing works as I saved the excel file(data file) in workbook format and not in CSV format. so keep in mind to first check your file extension.

Adnane's answer helped me.
Here's my full code on mac, hope this helps someone. All my csv files are saved in /Users/lionelyu/Documents/Python/Python Projects/
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
plt.style.use('ggplot')
path = '/Users/lionelyu/Documents/Python/Python Projects/'
aapl = pd.read_csv(path + 'AAPL_CLOSE.csv',index_col='Date',parse_dates=True)
cisco = pd.read_csv(path + 'CISCO_CLOSE.csv',index_col='Date',parse_dates=True)
ibm = pd.read_csv(path + 'IBM_CLOSE.csv',index_col='Date',parse_dates=True)
amzn = pd.read_csv(path + 'AMZN_CLOSE.csv',index_col='Date',parse_dates=True)

Run "pwd" command first in cli to find out what is your current project's direction and then add the name of the file to your path!

Try this
import os
cd = os.getcwd()
dataset_train = pd.read_csv(cd+"/Google_Stock_Price_Train.csv")

In my case I just removed .csv from the end. I am using ubuntu.
pd.read_csv("/home/mypc/Documents/pcap/s2csv")

Sometimes we ignore a little bit issue which is not a Python or IDE fault
its logical error
We assumed a file .csv which is not a .csv file its a Excell Worksheet file have a look
When you try to open that file using Import compiler will through the error
have a look
To Resolve the issue
open your Target file into Microsoft Excell and save that file in .csv format
it is important to note that Encoding is important because it will help you to open the file when you try to open it with
with open('YourTargetFile.csv','r',encoding='UTF-8') as file:
So you are set to go
now Try to open your file as this
import csv
with open('plain.csv','r',encoding='UTF-8') as file:
load = csv.reader(file)
for line in load:
print(line)
Here is the Output

What works for me is
dataset = pd.read_csv('FBI_CRIME11.csv')
Highlight it and press enter. It also depends on the IDE you are using. I am using Anaconda Spyder or Jupiter.

I am using a Mac. I had the same problem wherein .csv file was in the same folder where the python script was placed, however, Spyder still was unable to locate the file. I changed the file name from capital letters to all small letters and it worked.

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.

problem with reading a csv file with pandas in executable - python

Related

How to write dataframe to csv to the current working directory python

How to read a CSV from a folder without file name in Python

how to get the name of an unknown .XLS file into a variable in Python 3.7

Creating a standalone file using Pandas code

"CSV file does not exist" for a filename with embedded quotes

Categories

Resources