Python: pickle creates module import error - python

I have a python script which uses pickle that I've been running for weeks. I recently installed dropbox so that I could run the script on machine A, pickle the data to dropbox, and then load the data from dropbox onto machine B. So, I used to write to a file in the path of the script, now I write to a separate file synced by dropbox.
When I go to load the data, I get the following error:
data = pickle.load(f)
ModuleNotFoundError: No module named 'pandas._libs'
However, this prior line works fine:
import pandas as pd
In fact, if I run the script that's dumping (rather than loading) data, it also runs successfully.
I've also verified the path is correct using sys.path.
What could be the problem?

As mentioned in the comments, this is a pandas version issue. Your pickle file was created by pickling objects with a newer version of pandas, and the system unpickling that file has an older version of pandas.
To be more precise, pandas._libs first appeared in:
commit 648ae4f03622d8eafe1ca3b833bd6a99f56bece4
Author: Jeff Reback
Date: Tue Mar 7 18:21:18 2017 -0500
BLD: consolidate remaining extensions
moves extensions to pandas/_libs, which holds the extension code
and also the generated builds (as its importable).
... which first appears in version 0.20. It stands to reason that your pickle file was created with a version of pandas >= 0.20, and the unpickle system has version < 0.20.
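One way to confirm this is to compare the pandas version on both machines. The sketch below is only illustrative: the helper names has_libs_subpackage and report_installed_pandas are made up for this example, and the 0.20 threshold is taken from the commit quoted above.

```python
def has_libs_subpackage(version: str) -> bool:
    """Return True if this pandas version string is 0.20 or newer.

    Hypothetical helper: the 0.20 cutoff comes from the answer above,
    where pandas._libs is said to first appear.
    """
    major, minor = (int(part) for part in version.split(".")[:2])
    return (major, minor) >= (0, 20)


def report_installed_pandas() -> str:
    """Describe the pandas seen by the interpreter running this code."""
    try:
        import pandas
    except ImportError:
        return "pandas is not installed in this interpreter"
    status = "available" if has_libs_subpackage(pandas.__version__) else "missing"
    return f"pandas {pandas.__version__}: pandas._libs expected to be {status}"


if __name__ == "__main__":
    # Run this on machine A and machine B and compare the output.
    print(report_installed_pandas())
```

If the two machines report versions on different sides of 0.20, upgrading pandas on the loading machine (or re-creating the pickle with the older version) should bring them back in line.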

Related

Pandas Installed For All Python Versions But Module Can't Be Found

I am trying to modify an AI for a game on the steam store. The AI communicates through the game with the use of a mod called the communication mod. The AI is made using a python project. The package I am trying to modify is https://github.com/ForgottenArbiter/spirecomm and the mod is https://github.com/ForgottenArbiter/CommunicationMod.
I want to add the pandas and joblib packages as imports so I can use a model I have made for the AI. When I try to run the game + mod after adding pandas and joblib as imports, I get this error in the error log.
Traceback (most recent call last):
File "/Users/ross/downloads/spirecomm-master/main.py", line 6, in <module>
from spirecomm.ai.agent import SimpleAgent
File "/Users/ross/Downloads/spirecomm-master/spirecomm/ai/agent.py", line 10, in <module>
import pandas as pd
ModuleNotFoundError: No module named 'pandas'
This issue only happens when the game is running and the mod tries to run. If I just run the file in the terminal, it is able to compile/run and send the ready signal.
I have checked that these modules are installed. I am on an M1 Mac and I have several different versions of python installed, but I have checked them all and the modules are installed for each of them. I have also opened the python package using pycharm and added pandas and joblib to the python interpreter as packages.
Another thing I have tried is modifying the setup.py file to say that pandas and joblib are required. I then ran it again but I am not sure if this had any effect because I have already run it before.
There is limited help that can be provided without knowing the framework that you are using, but hopefully this will give you some starting points.
If you are getting a "No module named 'pandas'" error, it is because you have imported pandas in your code but your python interpreter cannot find it. There are two major reasons this will happen: either it is not installed (which you say is not the case) or it is not in the location the interpreter expects (most likely).
The first thing you can do is make sure the pandas install is in the PYTHONPATH. To do this, see the question "Permanently add a directory to PYTHONPATH?".
Secondly, you say you have several versions of python and have installed the module for all versions but you most likely have several instances of at least one version. Many IDEs, such as PyCharm, create a virtual environment when you create a new project and place in it a new instance of python interpreter, or more accurately a link to one version of python. Within this virtual environment, the IDE then loads the modules it has been told to load and thus makes them available for import to the code using that particular environment.
In your case I suspect you may be using a virtual environment that has not had pandas loaded into it. You need to check your IDE's documentation if you do not know how to load it. Alternatively, you can instruct the IDE to use a different virtual environment (one that does have pandas loaded); again, search the documentation for how to do this.
Finally, if all else fails, you can specifically tell your code where to look for the module using the sys.path.append command in your code.
import sys
sys.path.append('/your/pandas/module/path')
import pandas
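To see which interpreter the game actually launches and where it looks for modules, a small diagnostic can help. This is a minimal sketch, assuming you can place it at the top of the script the mod runs; describe_import is a hypothetical helper name, and it only uses the standard library.

```python
import importlib.util
import sys


def describe_import(module_name):
    """Report which interpreter is running and whether it can locate a module.

    Hypothetical helper: it looks the module up without importing it.
    """
    spec = importlib.util.find_spec(module_name)
    return {
        "executable": sys.executable,          # the python binary in use
        "found": spec is not None,             # can this interpreter see it?
        "location": spec.origin if spec is not None else None,
        "search_path": list(sys.path),         # directories searched, in order
    }


if __name__ == "__main__":
    info = describe_import("pandas")
    # Write to stderr so the report does not mix with the script's normal output.
    print(info["executable"], info["found"], info["location"], file=sys.stderr)
```

If the executable reported when the mod runs the script differs from the one used in the terminal, pandas needs to be installed for that interpreter, or the mod's configured command needs to point at the interpreter that already has it.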

Why is Python importing an older version of my module when a new version is in the working directory?

I have a module I have written called plu, which I have previously called in some directory with import plu.
However, I have since created a new version of plu in a new directory which is my current working directory (though the old one still exists in its old directory).
For some reason, when I call import plu instead of using the version in the current working directory, python is loading the old module.
Why is this, and how can I force python to:
1. Stop remembering module locations and making them global variables?
2. Load the correct version of the module?
Shadowcoder has provided an answer as a comment: delete the __pycache__ directory from the previous folder and then restart the kernel.
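Besides stale bytecode, two other things decide which copy gets loaded: the order of directories on sys.path and the copy already cached in sys.modules. The sketch below illustrates both under stated assumptions: plu_demo is a stand-in module written to temporary directories, and load_from is a hypothetical helper, not part of any library.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def load_from(directory, module_name):
    """Import module_name with `directory` given top priority on sys.path."""
    sys.path.insert(0, str(directory))
    try:
        sys.modules.pop(module_name, None)  # forget any previously loaded copy
        importlib.invalidate_caches()       # make the finders re-scan directories
        return importlib.import_module(module_name)
    finally:
        sys.path.remove(str(directory))


with tempfile.TemporaryDirectory() as tmp:
    old_dir = Path(tmp) / "old"
    new_dir = Path(tmp) / "new"
    # Create two versions of the same module in two different directories.
    for directory, version in ((old_dir, "old"), (new_dir, "new")):
        directory.mkdir()
        (directory / "plu_demo.py").write_text(f"VERSION = {version!r}\n")

    first = load_from(old_dir, "plu_demo")
    second = load_from(new_dir, "plu_demo")
    first_version, second_version = first.VERSION, second.VERSION
    first_file = first.__file__  # shows which file was actually loaded
```

Printing a module's __file__ attribute after import is a quick way to check which copy Python picked up.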

How to read pickle files created by another python version?

I often use pickle files to store my datasets, and I have run into a problem. My local desktop runs python 3.8, but the remote server, where I need to work, runs python 3.7.
How can I read pickle files created by python 3.8?
import pandas as pd
df=pd.read_pickle('FUND_AREACLASS.pkl')
The reported error is:
File "C:\ProgramData\Anaconda3_new\lib\site-packages\pandas\io\pickle.py", line 181, in read_pickle
return pickle.load(f)
ValueError: unsupported pickle protocol: 5
Can anybody help me figure it out? Thanks in advance.
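Pickle protocol 5 was added in Python 3.8, and Python 3.7 supports protocols only up to 4, which is why the load fails. One possible workaround, sketched below with a plain dictionary as a stand-in for the real dataset, is to write the file again on the Python 3.8 machine with protocol=4.

```python
import pickle

# Stand-in data; in practice this would be the object loaded on Python 3.8.
data = {"fund": ["A", "B"], "area": [1, 2]}

# Explicitly request protocol 4, which Python 3.4 and later can read.
payload = pickle.dumps(data, protocol=4)

# For protocol 2 and above, the second byte of the stream is the protocol number.
protocol_byte = payload[1]

# Round trip to confirm the data survives unchanged.
restored = pickle.loads(payload)
```

For a pandas DataFrame, DataFrame.to_pickle accepts a protocol argument as well, so re-saving with protocol=4 on the newer machine should produce a file the older interpreter can open, provided the pandas versions on both sides are also compatible.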

hdf5 made with h5py py2 corrupted after opening with h5py in py3

Problem
I have a file created with h5py in python 2.7.
These steps lead to a corruption:
1. I download a fresh copy of it from a collaborator using scp. It is whole and 286MB.
2. I check that it is readable by opening it with hdfview. This shows all the datasets and groups properly.
3. I exit hdfview.
4. Repeat steps 2 and 3 to ensure hdfview is not corrupting the file.
5. I open ipython 3.6 and run:
import h5py
f = h5py.File(filename,'r')
g = f['/sol000']#one group that should be there
6. I get KeyError: "Unable to open object (Object 'sol000' doesn't exist)"
7. I f.close() and exit ipython. I again open it with hdfview and the entire structure is gone. The file is now 4KB.
I am able to open the file in python 2 hdf5 and access all the datasets, but must use python 3 for my code.
Systems
File created on Fedora 24 64-bit, python 2.7, hdf5 2.7.0
System trying to read it on Fedora 25 64-bit python 3.6, h5py 2.7.0
Minimal code that should work
On system 1:
import h5py
import numpy as np
f = h5py.File("file.hdf5","w")
f.create_dataset("/sol000/data",(100,100),dtype=float)
f["/sol000/data"][...] = np.zeros([100,100],dtype=float)  # write into the existing dataset
f.close()
On system 2: Do steps 1-4.
import h5py
f = h5py.File("file.hdf5","r")
f.visit(lambda *x:print(x))
#(sol000/data,)
f.close()
The solution was to enforce libver=earliest. I.e. the following code worked to open the file:
import h5py
f = h5py.File("file.hdf5","r",libver="earliest")
I've discovered a possible inconsistency in h5py documentation.
It claims that
The “earliest” option means that HDF5 will make a best effort to be
backwards compatible.
The default is “earliest”.
This can't be true if it only works when I explicitly set it.
My collaborator, it turns out, created the corruptible file with an older version of the hdf5 C library.

Issue with using protobufs with python ImportError: cannot import name descriptor_pb2

Context
Steps taken:
Environment Setup
I've installed protobuf via Homebrew
I've also followed the steps in the readme in the protobuf python folder on installing python protobufs, namely running the python setup.py install command
I'm using the protobuf-2.4.1 files
Coding
I have a python file (generated from a .proto file I compiled) that contains several import statements. I believe this one is causing the issue:
from google.protobuf import descriptor_pb2
I import the above python file in another python file, where I want to write the logic for parsing the protobuf data files I receive
Error received
I get this error when running that file:
Steps taken to fix
Searched google for that error - didn't find much
Looked at this question/answer Why do I see "cannot import name descriptor_pb2" error when using Google Protocol Buffers?
I don't really understand the selected answer to the above question. I tried to run the command from that answer, protoc descriptor.proto --python_out=gen/, by copying and pasting it into the terminal in different places, but couldn't get it to work
Question
How do I fix this error?
What is the underlying cause?
How do I check if the rest of my protobuf python compiler/classes are set up correctly?
I've discovered the issue. I had not run the python install instructions the first time I tried to compile this file. I recompiled the file and this issue was fixed.
