I used pyinstaller to generate an .exe file from a Python script I made for converting a PDF into a dataframe that can be manipulated and outputted as a PDF again.
I originally had an issue with tabula not working with pyinstaller, but managed to edit the spec file to solve the issue. However, now I'm dealing with a FileNotFoundError.
Traceback (most recent call last):
File "room_list.py", line 7, in <module>
File "tabula\io.py", line 314, in read_pdf
FileNotFoundError: [Errno 2] No such file or directory: 'room_list.pdf'
[8608] Failed to execute script 'room_list' due to unhandled exception!
I do not have this issue on the PC that I ran pyinstaller on. When I transferred the exe to another PC and tried to run the program using the same file directory setup (desktop with room_list.pdf file), I got the above error. The room_list.pdf file is present where it should be, so I don't know why it's saying that no such file or directory exists.
I ran PyInstaller on a Windows 11 PC, and the other PC runs Windows 10. Not sure if that could be an issue.
Here's the Python code:
import tabula as tb
import pandas as pd
import os
import matplotlib.pyplot as plt
from matplotlib.backends.backend_pdf import PdfPages
df_list = tb.read_pdf('room_list.pdf', pages='all')
concat_list = []
for df in df_list:
    new_df = df.iloc[3:]
    new_df.columns = df.iloc[2]
    concat_list.append(new_df)
df_all = pd.concat(concat_list)
df_all.dropna(inplace=True)
def stay_over_status(row):
    if row['Occ Status'] == 'In-House':
        return 'X'
    else:
        return ' '
def check_out_status(row):
    if row['Occ Status'] == 'Due-Out':
        return 'X'
    else:
        return ' '
df_all['S/O'] = df_all.apply(stay_over_status, axis=1)
df_all['C/O'] = df_all.apply(check_out_status, axis=1)
df_all['OUT'] = ''
result = df_all.drop(columns=['Room Type', 'Occ Status', 'Condition', 'Guest Name', 'Arrives', 'Departs'])
fig, ax =plt.subplots(figsize=(12,4))
ax.axis('tight')
ax.axis('off')
plt.tight_layout()
the_table = ax.table(cellText=result.values,colLabels=result.columns,loc='center', colWidths=[0.1 for x in result.columns], cellLoc='center')
pp = PdfPages("maid_list.pdf")
pp.savefig(fig, bbox_inches='tight')
pp.close()
# os.startfile("maid_list.pdf", "print")
Possible solution:
I experimented on a Windows 11 virtual machine, where I hit the same problem described above. I had more freedom to try things there, so I tested several fixes. First, I replaced 'room_list.pdf' with the full path for that PC, 'C:\Users\User\Desktop\room_list.pdf'. That resolved the FileNotFoundError for room_list.pdf, but I ran into another error. I forgot to note down what it was, but it had to do with tabula requiring Java. Another post online suggested installing Java, which I did. The error went away, but maid_list.pdf was still not being generated. So I went back and used the full path to the virtual machine's Desktop for the output file as well. This solved the problem, and I got maid_list.pdf without errors.
I will eventually deploy this on the other PC I wanted this to work on and will update this post, but I'm assuming it will now work. I was hoping that I would not have to install Java or any other program in the first place, but if there is no alternative I will end up doing that. If there is another way of converting the PDF to a DataFrame without tabula and Java, I would love to hear about it. I'd also like to avoid building a new exe every time I deploy this on another PC that has a different user account folder. If someone has a solution to this, I would greatly appreciate that too. One of the main reasons I tried creating an exe file was so that I could easily deploy this on multiple computers without having to install and configure different things.
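One possible way to avoid hard-coding a per-user path is to resolve both PDFs relative to the folder that holds the executable. This is only a sketch: it relies on PyInstaller setting sys.frozen in the bundled app (sys.executable then points at the .exe), the helper name base_dir is made up for illustration, and it assumes room_list.pdf is kept next to the exe. It does not remove tabula's Java requirement.

```python
import sys
from pathlib import Path


def base_dir() -> Path:
    """Return the folder holding the running .exe (or the script)."""
    if getattr(sys, "frozen", False):
        # Set by PyInstaller; sys.executable is the bundled .exe.
        return Path(sys.executable).resolve().parent
    # Plain "python room_list.py": fall back to the script's folder.
    return Path(sys.argv[0]).resolve().parent


input_pdf = base_dir() / "room_list.pdf"
output_pdf = base_dir() / "maid_list.pdf"

# These would then replace the bare file names, e.g.
#   tb.read_pdf(str(input_pdf), pages='all')
#   PdfPages(str(output_pdf))
print(input_pdf)
```

With this, the same exe should find the files on any PC as long as the PDF is placed in the same folder as the exe, regardless of the user account name or the directory the program is launched from.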
I'm new to Jupyter Notebook (generally new to programming).
I already tried searching for similar problems but haven't found the solution.
I get this error:
FileNotFoundError: [Errno 2] No such file or directory: 'data/folder/filename.csv'
when trying to run
df = pd.read_csv('data/folder/filename.csv')
The file filename.csv is in the same directory as the notebook I'm using.
Other people (co-learners) who used this notebook were able to run this without any error.
My workaround was by removing the "data/folder/" and just running
df = pd.read_csv('filename.csv')
However, there's now a more complicated one that I have to run:
#set keyword
KEYWORD1='rock'
# read and process the playlist data for keyword
df = pd.read_csv('data/folder/'+KEYWORD1+'filename.csv')\
    .merge(pd.read_csv('data/folder/'+KEYWORD1+'filename.csv')\
    [['track_id','playlist_id','playlist_name']],\
    on='track_id',how='left')
I don't know the workaround for this one. Also, the other people who ran this notebook didn't experience any errors I had. We've installed the same requirements and we've been using jupyter notebook for many days and this is the first time I had an error they (the whole other group) didn't have. Any thoughts on how I can resolve this? Thank you!
The error is most probably due to the directory where the jupyter notebook command is running, but a workaround for your code will be:
#set keyword
KEYWORD1='rock'
# read and process the playlist data for keyword
df = pd.read_csv(KEYWORD1+'filename.csv')\
    .merge(pd.read_csv(KEYWORD1+'filename.csv')\
    [['track_id','playlist_id','playlist_name']],\
    on='track_id',how='left')
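To avoid editing every read_csv call, the data directory could be chosen once and reused. The sketch below is only an illustration: the helper name data_dir is hypothetical, and it assumes the CSV files sit either in data/folder (relative to the notebook's working directory) or directly in the working directory.

```python
from pathlib import Path

KEYWORD1 = "rock"


def data_dir(preferred="data/folder"):
    """Use the preferred folder if it exists, else the working directory."""
    candidate = Path(preferred)
    return candidate if candidate.is_dir() else Path.cwd()


csv_path = data_dir() / f"{KEYWORD1}filename.csv"

# Show where Jupyter is actually running and whether the file was found.
print(Path.cwd())
print(csv_path, csv_path.exists())
```

The resulting csv_path can be passed to pd.read_csv in place of the concatenated string, so the same notebook works whether or not the data/folder layout exists.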
I am trying to create various Excel UDFs in python by using xlwings. My UDFs rely on values that are pulled from an HDF file. However, every time I click the "Import Functions" button in Excel, I receive an error. Below is an example.
import pandas as pd
import numpy as np
import xlwings as xw
matrix1 = pd.DataFrame(np.random.random(size = (1000, 1000)))
matrix2 = pd.DataFrame(np.random.random(size = (1000, 100)))
matrix1.to_hdf('matrix.h5', key = 'mat1', mode = 'w')
matrix2.to_hdf('matrix.h5', key = 'mat2', mode = 'a')
arg = pd.read_hdf('matrix.h5', key = 'mat2', mode = 'r')
@xw.func
def dummy(x, y):
    return 17
When I click on the "Import Functions" button in the xlwings ribbon in Excel, I receive the following
If I try to run the program with Spyder, I have no issues and can generate the HDF files just fine.
Interestingly, if I remove the lines where I write the HDF file, and just leave the one where I read it, I get an error saying
FileNotFoundError: File matrix.h5 does not exist ...
Even though I have confirmed that the file does exist. If I run the same code in Spyder, I have no issues, it works fine.
Is there some kind of compatibility issue with xlwings and HDF files, or am I missing something?
I can't see xlwings being used for anything in the example. It is true, however, that PyTables is required. Try running pip install tables to install it.
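A quick way to check whether PyTables is visible to a given interpreter is sketched below. It may be worth running this with the same Python interpreter that Excel launches, since that may not be the one Spyder uses (an assumption, not something confirmed in the question). The helper name has_module is made up for illustration.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(name) is not None


# PyTables is imported under the name "tables".
print("PyTables importable:", has_module("tables"))
```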
When loading a dataset into Jupyter, I know it requires lines of code to load it in:
from tensorflow.contrib.learn.python.learn.datasets import base
# Data files
IRIS_TRAINING = "iris_training.csv"
IRIS_TEST = "iris_test.csv"
# Load datasets.
training_set = base.load_csv_with_header(filename=IRIS_TRAINING,
                                         features_dtype=np.float32,
                                         target_dtype=np.int)
test_set = base.load_csv_with_header(filename=IRIS_TEST,
                                     features_dtype=np.float32,
                                     target_dtype=np.int)
So why is the error NotFoundError: iris_training.csv
still thrown? I feel as though there is more to loading datasets into Jupyter, and would be grateful for any help on this topic.
I'm following a course through AI Adventures and don't know how to add the .csv file; the video mentions nothing about how to add it.
Here is the link: https://www.youtube.com/watch?v=G7oolm0jU8I&list=PLIivdWyY5sqJxnwJhe3etaK7utrBiPBQ2&index=3
You either need to use the file's absolute path, i.e. C:\path_to_csv\iris_training.csv on Windows or /path_to_csv/iris_training.csv on UNIX/Linux, or place the file in your notebook workspace, i.e. the directory listed in the Jupyter web UI at http://localhost:8888/tree. If you are having trouble finding that directory, execute the Python code below and place the file in the printed location:
import os
cwd = os.getcwd()
print(cwd)
Solution A
If you are working with Python, you can use the pandas library to import your .csv file:
import pandas as pd
IRIS_TRAINING = pd.read_csv("../iris_training.csv")
IRIS_TEST = pd.read_csv("../iris_test.csv")
Solution B
import numpy as np
mydata = np.genfromtxt(filename, delimiter=",")
I'm trying to import the shapefile "Metropolin_31Jul_0921.shp" to python using the following code:
import shapefile
stat_area_df = shapefile.Reader("Metropolin_31Jul_0921.shp")
but I keep getting this error:
File "C:\Users\maya\Anaconda3\lib\site-packages\shapefile.py", line 291,
in load
raise ShapefileException("Unable to open %s.dbf or %s.shp." %
(shapeName, shapeName) )
shapefile.ShapefileException: Unable to open Metropolin_31Jul_0921.dbf
or Metropolin_31Jul_0921.shp.
Does anyone know what it means?
I tried adding the directory but it didn't help.
Make sure that the directory containing the shapefile also includes all of the supporting files, such as .dbf, .shx, etc. The .shp will not work without them.
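A small check along these lines can list which companion files are missing before calling shapefile.Reader. This is a sketch: the helper name missing_parts is hypothetical, it only looks for the three mandatory parts of the shapefile format (.shp, .shx, .dbf), and it assumes lowercase extensions.

```python
from pathlib import Path

REQUIRED_SUFFIXES = (".shp", ".shx", ".dbf")


def missing_parts(shp_path):
    """Return the required suffixes with no file next to shp_path."""
    base = Path(shp_path)
    return [s for s in REQUIRED_SUFFIXES if not base.with_suffix(s).exists()]


# An empty list means all three files were found beside each other.
print(missing_parts("Metropolin_31Jul_0921.shp"))
```

If the list contains all three suffixes, the path itself is probably wrong (for example, the working directory is not the one holding the shapefile).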
I am a little new to Python, and I have been using the Jupyter Notebook through Anaconda. I am trying to import a csv file to make a DataFrame, but I am unable to import the file.
Here is an attempt using the local method:
df = pd.read_csv('Workbook1')
---------------------------------------------------------------------------
FileNotFoundError Traceback (most recent call last)
<ipython-input-11-a2deb4e316ab> in <module>()
----> 1 df = pd.read_csv('Workbook1')
After that I tried using the path (I put user for my username)
df = pd.read_csv('Users/user/Desktop/Workbook1.csv')
---------------------------------------------------------------------------
FileNotFoundError Traceback (most recent call last)
<ipython-input-13-3f2bedd6c4de> in <module>()
----> 1 df = pd.read_csv('Users/user/Desktop/Workbook1.csv')
I am using a Mac, which I am also new to, and I am not 100% sure if I am correctly importing the right path. Can anyone offer some insight or solutions that would allow me to open this csv file.
Instead of providing path, you can set a path using the code below:
import os
import pandas as pd
os.chdir("D:/dataset")
data = pd.read_csv("workbook1.csv")
This will surely work.
Are you sure that the file exists in the location you are specifying to the pandas read_csv method? You can check using the os python built in module:
import os
os.path.isfile('/Users/user/Desktop/Workbook1.csv')
Another way of checking if the file of interest is in the current working directory within a Jupyter notebook is by running ls -l within a cell:
ls -l
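Note that 'Users/user/Desktop/Workbook1.csv' in the question has no leading slash, so it is treated as relative to the current working directory. A sketch that builds the Desktop path from the home directory instead, assuming the file really is named Workbook1.csv and lives on the Desktop:

```python
from pathlib import Path

# Path.home() expands to the current user's home directory,
# so no username needs to be typed by hand.
csv_path = Path.home() / "Desktop" / "Workbook1.csv"
print(csv_path, csv_path.is_file())

# The path from the question is relative, not absolute.
relative = Path("Users/user/Desktop/Workbook1.csv")
print(relative.is_absolute())  # False
```

If is_file() prints True, str(csv_path) can be passed straight to pd.read_csv.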
I think the problem is probably in the location of the file:
df1 = pd.read_csv('C:/Users/owner/Desktop/contacts.csv')
Once the file loads, you can start exploring the data, for example with:
df1.head()
The os module in Python provides functions for interacting with the operating system. It is one of Python's standard utility modules.
import os
import pandas as pd
os.chdir(r"c:\Pandas")
df=pd.read_csv("names.csv")
df
This might help. :)
The file name is case sensitive, so check your case.
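To see whether a case mismatch is the cause, the directory can be searched while ignoring case. This is a sketch; the helper name case_insensitive_matches is made up for illustration.

```python
from pathlib import Path


def case_insensitive_matches(directory, name):
    """Return the actual file names in directory that equal name, ignoring case."""
    wanted = name.lower()
    return [p.name for p in Path(directory).iterdir() if p.name.lower() == wanted]


# Prints the exact spelling(s) on disk, e.g. to copy into read_csv.
print(case_insensitive_matches(".", "workbook1.csv"))
```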
I had the same problem on a Mac too, and for some reason it only happened to me there. I tried many tricks, but nothing worked. I recommend going directly to the file in Finder, right-clicking it, and then holding the "alt" (Option) key; an option to copy the file's path will appear. Paste that path into your Jupyter notebook. For some reason that worked for me.
I believe the issue is that you're not using fully qualified paths. Try this:
Move the data into a suitable project directory. You can do this using the %%bash Magic commands.
%%bash
mkdir -p /project/data/
cp data.csv /project/data/data.csv
You can read the file
f = open("/project/data/data.csv","r")
print(f.read())
f.close()
But it might be most useful to load it into a library.
import pandas as pd
data = pd.read_csv("/project/data/data.csv")
I’ve created a runnable Jupyter notebook with more details here: Jupyter Basics: Reading Files.
Try double quotes instead of single quotes. It worked for me.
you can open csv files in Jupyter notebook by following these easy steps-
Step 1 - create a directory or folder (you can also use old created folder)
Step 2 - Change your Jupyter working directory to that created directory -
import os
os.chdir('D:/datascience/csvfiles')
Step 3 - Now your directory is changed in Jupyter Notebook. Store your file(s) in that directory.
Step 4 - Open your file -
import pandas as pd
df = pd.read_csv("workbook1.csv")
Now your file is read and stored in a Data Frame variable df, you can display this file content by following
df.head() - display first five rows of this file
df - display all rows of this file
Happy Data Science!
There was a similar problem for me while reading a CSV file in Jupyter notebook from the computer.
I solved it by substituting the "\" symbol with "/" in the path like this.
This is what I had:
"C:\Users\RAJ\Desktop\HRPrediction\HRprediction.csv"
This is what I changed it for:
"C:/Users/RAJ/Desktop/HRPrediction/HRprediction.csv".
This is what worked for me. I am using Mac OS.
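The likely reason this helps is that in an ordinary Python string the backslash starts an escape sequence; "\U" in particular begins a Unicode escape, so "C:\Users\..." is a syntax error in Python 3. A raw string or forward slashes both avoid that. The sketch below uses pathlib's PureWindowsPath only to show that the two spellings name the same Windows path:

```python
from pathlib import PureWindowsPath

# Raw string: backslashes are kept literally.
raw = PureWindowsPath(r"C:\Users\RAJ\Desktop\HRPrediction\HRprediction.csv")
# Forward slashes: accepted as separators in Windows paths.
fwd = PureWindowsPath("C:/Users/RAJ/Desktop/HRPrediction/HRprediction.csv")

print(raw == fwd)      # True
print(fwd.as_posix())  # C:/Users/RAJ/Desktop/HRPrediction/HRprediction.csv
```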
Save your CSV on a separate folder on your desktop.
When opening Jupyter Notebook, navigate to the folder where your dataset is saved, then click New in the upper right-hand corner to create a notebook there.
After opening the new notebook, code as usual: import pandas as pd, then read your dataset with pd.read_csv.
No need to use anything extra; just put r in front of the location.
df = pd.read_csv(r'C:/Users/owner/Desktop/contacts.csv')