Creating a standalone file using Pandas code


I have little to no background in Python or computer science so I’ll try my best to explain what I want to accomplish. I have a Pandas script in Jupyter notebook that edits an Excel .csv file and exports it as an Excel .xlsx file. Basically the reason why we want to do this is because we get these same Excel spreadsheets full of unwanted and disorganized data from the same source. I want other people at my office that don’t have Python to be able to use this script to edit these spreadsheets. From what I understand, this involves creating a standalone file.
Here is my code from Pandas that exports a new spreadsheet:
import pandas as pd
from pandas import ExcelWriter
test = pd.DataFrame.from_csv('J:/SDGE/test.csv', index_col=None)
t = test
for col in ['Bill Date']:
    t[col] = t[col].ffill()
T = t[t.Meter.notnull()]
T = T.reset_index(drop=True)
writer = ExcelWriter('PythonExport.xlsx')
T.to_excel(writer,'Sheet5')
writer.save()
How can I make this code into a standalone executable file? I've seen other forums with responses to similar problems, but I still don't understand how to do this.

First, you need to change part of your code so that it works for anybody, without them having to edit the Python code.
Second, you need to convert your file to an executable (.exe).
Only one part of your code needs to change to work for everyone: the CSV file name and directory.
Since your code only works when the file "test.csv" is in the "J:/SDGE/" directory, you can follow one of these solutions:
Tell everyone who uses the program that the file must be in that exact shared directory and named "test.csv" in order to work. (bad)
Change your program to accept input from the user. This is a little more complex, but it is probably the solution people want:
Add an import for a file selector at the top:
from tkinter.filedialog import askopenfilename
Replace
'J:/SDGE/test.csv'
With
askopenfilename()
This should be the final python script:
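import pandas as pd
from tkinter.filedialog import askopenfilename  # added this

# Let the user pick the CSV file instead of hard-coding its path.
# pd.read_csv replaces DataFrame.from_csv, which was removed in pandas 1.0.
t = pd.read_csv(askopenfilename())
for col in ['Bill Date']:
    t[col] = t[col].ffill()  # fill blank bill dates from the row above
T = t[t.Meter.notnull()]  # keep only the rows that have a meter value
T = T.reset_index(drop=True)
# The with-block saves the workbook on exit (writer.save() was removed in pandas 2.0).
with pd.ExcelWriter('PythonExport.xlsx') as writer:
    T.to_excel(writer, sheet_name='Sheet5')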
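One detail the file dialog approach leaves open: askopenfilename() returns an empty value if the user cancels the dialog, and tkinter opens a blank root window alongside it. A minimal sketch of a wrapper that handles both, where choose_csv and its picker parameter are hypothetical names introduced only for illustration:

```python
def choose_csv(picker=None):
    """Ask the user for a CSV file; return its path, or None if cancelled.

    `picker` is a hypothetical hook (any zero-argument callable returning a
    path) so the function can be exercised without opening a real dialog.
    """
    if picker is not None:
        path = picker()
    else:
        # Import tkinter lazily so the module also loads on machines without a display.
        import tkinter as tk
        from tkinter.filedialog import askopenfilename

        root = tk.Tk()
        root.withdraw()  # hide the empty root window that tkinter creates
        try:
            path = askopenfilename(
                title="Select the CSV file to clean",
                filetypes=[("CSV files", "*.csv"), ("All files", "*.*")],
            )
        finally:
            root.destroy()
    # A cancelled dialog yields an empty value; normalise that to None.
    return path or None
```

In the script, the result could be checked before reading, for example by exiting with a short message when choose_csv() returns None instead of passing an empty path to pandas.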
Because you want an executable program, so that others don't need to have Python installed or know how to run a script, the new .py file still has to be converted. There are several ways to turn it into an executable. I would look into this thread.
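One commonly used tool for this step is PyInstaller, which also comes up in a related question further down. A rough sketch, assuming PyInstaller has been installed with pip and the script was saved as clean_spreadsheet.py (a hypothetical file name); the thread referred to above may well recommend a different tool:

```python
def pyinstaller_args(script, onefile=True):
    """Build the argument list that would be handed to PyInstaller."""
    args = [script]
    if onefile:
        args.append("--onefile")  # bundle everything into a single executable
    return args


def build_executable(script):
    """Run PyInstaller from Python instead of from the command line."""
    # Imported here so pyinstaller_args() works even if PyInstaller is missing.
    import PyInstaller.__main__

    PyInstaller.__main__.run(pyinstaller_args(script))


# Example call (not executed here):
# build_executable("clean_spreadsheet.py")
```

PyInstaller writes its result to a dist folder next to the script. It does not cross-compile, so a Windows .exe has to be built on a Windows machine.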

If you want to run a Python script on someone else's system as a script, Python must be installed on that system.
Once it is, create a .bat file containing the command you would use to run the Python file from CMD.
Step 1: Open Notepad and create a new file
Step 2: Write the following command in the file (replace the path and file name with your own)
python file.py
Step 3: Save it as script.bat (choose All Files in the "Save as type" list so Notepad does not append .txt)
Now you can run that batch file like any other program and it will run the code for you. When you distribute the batch file and the Python script, make sure both files are kept in the same location. Otherwise you will have to put the full path in front of file.py.

Related

Python add path of data directory

I want to add a path to my data directory in Python, so that I can read and write files in that directory without including the path every time.
For example, my working directory is /user/working, where I am currently working on the file /user/working/foo.py. All of my data is in the directory /user/data, where I want to access the file /user/data/important_data.csv.
In foo.py, I could now just read the csv with pandas using
import pandas as pd
df = pd.read_csv('../data/important_data.csv')
which totally works. I just want to know if there is a way to include /user/data as a main path for the file so I can just read the file with
import pandas as pd
df = pd.read_csv('important_data.csv')
The only idea I had was adding the path via sys.path.append('/user/data'), which didn't work (I guess it only works for importing modules).
Is anyone able to provide any ideas if this is possible?
PS: My real problem is of course more complex, but this minimal example should be enough to handle my problem.

It looks like you can use os.chdir for this purpose.
import os
os.chdir('/user/data')
See https://note.nkmk.me/en/python-os-getcwd-chdir/ for more details.
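Note that os.chdir changes the working directory for the whole process, so every later relative path is affected too. A minimal sketch of limiting the change to one block of code; working_directory is a hypothetical helper name, and Python 3.11 and later ship a similar contextlib.chdir:

```python
import os
from contextlib import contextmanager


@contextmanager
def working_directory(path):
    """Temporarily switch the current working directory to `path`."""
    previous = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        os.chdir(previous)  # always restore, even if the body raised


# Usage (requires pandas and the data file to exist):
# with working_directory('/user/data'):
#     df = pd.read_csv('important_data.csv')
```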

If you are keeping everything in /user/data, why not use f-strings to make this easy? You could assign the directory to a variable in a config and then use it in the string like so:
In a config somewhere:
data_path = "/user/data"
Reading later...
df = pd.read_csv(f"{data_path}/important_data.csv")
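The same idea can be sketched with pathlib, which takes care of the path separator. DATA_DIR and data_file are hypothetical names chosen for this example; the directory is the one from the question:

```python
from pathlib import Path

DATA_DIR = Path("/user/data")  # data directory from the question


def data_file(name):
    """Return the full path of a file inside DATA_DIR."""
    return DATA_DIR / name


# Usage (requires pandas and the file to exist):
# df = pd.read_csv(data_file("important_data.csv"))
```

Keeping the directory in one constant means only a single line changes if the data moves.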

problem with reading a csv file with pandas in executable

I'm writing a program that reads a CSV file and, after some steps, creates another CSV file as output. The program works fine, but when I try to create an executable with PyInstaller I get an error saying that it can't find the input CSV file. Here is how I am reading the CSV file as input. I've also tried changing the path name, with no luck:
import pandas as pd
def lettore():
    RawData = pd.read_csv('rawdata.csv', sep=';')
    return RawData
How can I solve the problem?

Your code looks for the file in the current working directory, which is normally the folder from which the exe is launched.
It is equivalent to
import os
import pandas as pd
filepath = os.path.join(os.getcwd(), 'filename.csv')
df = pd.read_csv(filepath)
Avoid relying on relative paths when you build an exe.
I can give you two other options:
Use an input to get the right file path when running the exe (or, alternatively, use argparse).
filepath = input("insert your csv: ")
df = pd.read_csv(filepath)
Define an absolute path and build it into your code (you cannot change it after building, and the program will read the file only from that path).
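For the argparse variant mentioned in the first option, a minimal sketch might look like this. parse_csv_path is a hypothetical helper name; the optional positional argument lets the program fall back to the interactive prompt when no path is given:

```python
import argparse


def parse_csv_path(argv=None):
    """Return the CSV path given on the command line, or None if omitted."""
    parser = argparse.ArgumentParser(description="Process a raw-data CSV file.")
    parser.add_argument(
        "csv",
        nargs="?",        # the argument is optional
        default=None,
        help="path to the input CSV file",
    )
    return parser.parse_args(argv).csv


# Usage inside the program:
# filepath = parse_csv_path() or input("insert your csv: ")
```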
Edit: after reading your comment, see also
How to reliably open a file in the same directory as a Python script
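Applied to a PyInstaller build, that idea could be sketched as follows. It assumes the CSV is shipped in the same folder as the executable; app_dir and input_csv_path are hypothetical helper names, while sys.frozen is the attribute PyInstaller sets in a bundled program:

```python
import os
import sys


def app_dir(script_file):
    """Folder containing the running program.

    `script_file` is normally __file__ of the calling script.
    """
    if getattr(sys, "frozen", False):
        # In a PyInstaller bundle, sys.executable is the path of the exe itself.
        return os.path.dirname(os.path.abspath(sys.executable))
    return os.path.dirname(os.path.abspath(script_file))


def input_csv_path(script_file, name="rawdata.csv"):
    """Absolute path of the input CSV located next to the program."""
    return os.path.join(app_dir(script_file), name)


# Usage inside the script:
# RawData = pd.read_csv(input_csv_path(__file__), sep=';')
```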

Reading an excel file into python which is present at different network location

I am trying to write code in a Jupyter notebook that reads an Excel file and creates a dataframe from it. The catch is that the file is not in the same location but on a different network drive. Say my Python runs on the C drive but my Excel file is on the M network drive.
I have tried the pd.read_excel(r'path') command, but it throws a "no such file or directory" error.

Not sure if this is what you are looking for, and the syntax for Jupyter notebooks is a little different, but try changing the working directory before reading the file:
import os
import pandas as pd
os.chdir(r"M:\path")  # os.chdir returns None, so there is nothing to assign
df = pd.read_excel('filename')
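If the error persists, it can help to confirm that Python can see the drive at all, since a mapped network drive may not be visible to the process running Jupyter. A small sketch; require_file is a hypothetical helper, and the folder and file names in the usage comment are the placeholders from the answer above:

```python
from pathlib import Path


def require_file(folder, filename):
    """Return folder/filename as a Path, or raise a descriptive error."""
    folder = Path(folder)
    if not folder.is_dir():
        raise FileNotFoundError(f"Folder not reachable from Python: {folder}")
    target = folder / filename
    if not target.is_file():
        raise FileNotFoundError(f"No such file in {folder}: {filename}")
    return target


# Usage (requires pandas and an Excel engine such as openpyxl):
# df = pd.read_excel(require_file(r"M:\path", "filename"))
```

The two separate messages distinguish an unreachable drive from a wrong file name, which the generic "no such file or directory" error does not.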

Adding a path to pandas to_csv function

I have a small chunk of code using Pandas that reads an incoming CSV, performs some simple computations, adds a column, and then turns the dataframe into a CSV using to_csv.
I was running it all in a Jupyter notebook and it worked great, the output csv file would be there right in the directory when I ran it. I have now changed my code to be run from the command line, and when I run it, I don't see the output CSV files anywhere. The way that I did this was saving the file as a .py, saving it into a folder right on my desktop, and putting the incoming csv in the same folder.
From similar questions on stackoverflow I am gathering that right before I use to_csv at the end of my code I might need to add the path into that line as a variable, such as this.
path = 'C:\Users\ab\Desktop\conversion'
final2.to_csv(path, 'Combined Book.csv', index=False)
However after adding this, I am still not seeing this output CSV file in the directory anywhere after running my pretty simple .py code from the command line.
Does anyone have any guidance? Let me know what other information I could add for clarity. I don't think sample code of the pandas computations is necessary, it is as simple as adding a column with data based on one of my incoming columns.

Join the path and the file name together and pass the result to DataFrame.to_csv:
import os
path = r'C:\Users\ab\Desktop\conversion'  # raw string, so \U is not read as an escape
output_file = os.path.join(path, 'Combined Book.csv')
final2.to_csv(output_file, index=False)

I'm pretty sure that you have mixed up the arguments, as shown here. The path should include the file name.
path = r'C:\Users\ab\Desktop\conversion\Combined_Book.csv'
final2.to_csv(path, index=False)
Otherwise the folder 'conversion' itself is used as the output path, and 'Combined Book.csv' is interpreted as the second parameter, sep, which is the value separator.

I think what you are looking for is an absolute path:
import pandas as pd
# ...
final2.to_csv(r'C:\Users\ab\Desktop\conversion\Combined Book.csv', index=False)
Or, as another example:
path_to_file = r"C:\Users\ab\Desktop\conversion\Combined Book.csv"
final2.to_csv(path_to_file, encoding="utf-8")

Though this is a late answer, it may help someone facing a similar issue. Rather than hardcoding the CSV folder path, you can get it dynamically with os.getcwd(), which returns the current working directory, and then join it with the CSV file name using os.path.join(os.getcwd(), 'csvFileName').
Example:
import os
path = os.getcwd()
export_path = os.path.join(path,'Combined Book.csv')
final2.to_csv(export_path, index=False, header=True)
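Keep in mind that os.getcwd() returns the directory the command was run from, which is not necessarily the folder containing the .py file. A sketch of the alternative, assuming the output should land next to the script; output_next_to is a hypothetical helper name:

```python
from pathlib import Path


def output_next_to(script_file, filename):
    """Build a path for `filename` in the same folder as `script_file`."""
    return Path(script_file).resolve().parent / filename


# Inside the script:
# export_path = output_next_to(__file__, 'Combined Book.csv')
# final2.to_csv(export_path, index=False)
```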

Opening data using pandas in Python 3.7

Hi, I tried to open some data that I downloaded to my Documents folder using pandas with Python 3.7,
but it doesn't work.
This is my code:
import pandas as pd
users = pd.read_csv("ml-100k/u.user", sep="|", names=["User ID", "Age", "Gender",
                                                      "aciation", "zipcode"])
users.head()
The error is:
FileNotFoundError: File b'ml-100k/u.user' does not exist
How can it be that the file doesn't exist if I downloaded it?
Thanks :)

It seems your data file is not in the working directory of your Python session, so the relative path cannot be resolved. There are a few ways to fix this.
Note that pd.read_csv() does not require a .csv extension. It reads any delimited text file, so the .user extension is fine as long as the path and file name are correct. Renaming is optional, but code-friendly file names are easier to work with: use _ or - instead of whitespace and avoid non-alphanumeric characters such as #*/().
One solution is to provide the full path to the pd.read_csv() function.
pd.read_csv("/home/user/folder/file_name.csv",
            sep="|", names=["User ID", "Age", "Gender", "aciation", "zipcode"])
If you are using IPython or Jupyter Notebook, you can navigate to the folder that contains your file with the cd path_to_file_folder command and then pass just the file name:
pd.read_csv("file_name.csv", sep="|",
            names=["User ID", "Age", "Gender", "aciation", "zipcode"])
For more robust solutions check this discussion.
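Before changing anything, it can be worth checking where Python is actually looking. A small diagnostic sketch; describe_path is a hypothetical helper name, and the path in the usage comment is the one from the question:

```python
from pathlib import Path


def describe_path(relative_path):
    """Show how Python resolves a relative path and whether it exists."""
    path = Path(relative_path)
    return {
        "cwd": Path.cwd(),           # directory the Python session runs in
        "resolved": path.resolve(),  # absolute path that would be opened
        "exists": path.exists(),
    }


# Usage:
# print(describe_path("ml-100k/u.user"))
```

If "exists" is False, the "resolved" entry shows the exact location Python tried, which usually makes clear whether the file or the working directory needs to move.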
