I'm writing code in a Jupyter Notebook that cleans and analyzes a large amount of consumer data. I want to save dataframes with thousands of rows so I don't have to rerun the code every time I make an adjustment, and dill seems like the perfect package for that. However, I get this error when attempting to pickle the notebook session:
AttributeError: module 'dill' has no attribute 'dump_session'
Let me know if the program code is necessary - I don't think it should make a difference. The imports are:
import numpy as np
import pandas as pd
import dill
import scipy
from matplotlib import pyplot as plt
from __future__ import division
from collections import OrderedDict
from sklearn.cluster import KMeans
pd.options.display.max_columns = None
and when I run this code I get the error from above:
dill.dump_session('recengine.db')
Is there another package that's interfering with dill's use of pickle vs. cPickle?
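A quick diagnostic can narrow this down before looking at other packages. The sketch below is only an illustration: describe_module is a hypothetical helper, not part of dill. It reports where a module was loaded from, its version, and whether it exposes a given attribute. A common cause of this kind of AttributeError is a local file with the same name as the package (for example a dill.py next to the notebook) or an older installed release, though nothing in the question confirms which applies here.

```python
import importlib


def describe_module(name, attribute):
    """Report where a module was loaded from and whether it has an attribute.

    Hypothetical helper for illustration; not part of any library.
    """
    try:
        module = importlib.import_module(name)
    except ImportError:
        return {"found": False, "file": None, "version": None, "has_attribute": False}
    return {
        "found": True,
        # A path inside the project folder (rather than site-packages)
        # suggests a local file is shadowing the installed package.
        "file": getattr(module, "__file__", None),
        "version": getattr(module, "__version__", None),
        "has_attribute": hasattr(module, attribute),
    }


# Demonstrated on a standard-library module so the snippet runs anywhere.
report = describe_module("json", "dumps")
print(report)
```

In the notebook, the equivalent call would be describe_module("dill", "dump_session"); if "file" points at your own directory or "has_attribute" is False, that is where to look first.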
My attempt to use rpy2 in a Jupyter (IPython) notebook fails at the point where I want to use tidyr::nest() to make a dataframe in which some elements are themselves a series. (This cannot be avoided; it is necessary for the next step of the analysis.)
After trying many things (including advice from the package home), I've managed to get most of the way to the end, but I fail to understand rpy2 with tidyr::nest at the last step. (Some of the following example may be superfluous.)
How can I fix this?
# Import packages
from rpy2 import robjects
from rpy2.robjects import Formula, Environment
from rpy2.robjects.vectors import IntVector, FloatVector
from rpy2.robjects.lib import grid
from rpy2.robjects.packages import importr, data
from rpy2.rinterface_lib.embedded import RRuntimeError
import warnings
base = importr('base')
datasets = importr('datasets')
from functools import partial
import rpy2.robjects.lib.tidyr as tidyr
from collections import OrderedDict
from rpy2.robjects.vectors import (StrVector,
IntVector)
from rpy2.robjects.lib.tidyr import DataFrame
from rpy2.robjects import rl
tidyr = importr('tidyr')
dplyr = importr('dplyr')
# Obtain the R dataset "iris"
iris = data(datasets).fetch('iris')['iris']
# Use only the two columns necessary for the "tidyr::nest" trial
dataf = (
DataFrame(iris)
.select(base.c(1,5))
)
print(dataf.head())
# Failure occurs below
irisNested = tidyr.nest(dataf, data=rl('Sepal.Length'))
EDIT: This isn't a question anymore, as it turns out my original effort was correct. A leftover fragment from another attempt to solve the problem caused it to fail. I asked the question because the various sources of help available didn't provide a clear way to address my issue, and I feel I only made a lucky educated guess. The fragment was:
import rpy2.robjects as ro
from rpy2.robjects import pandas2ri
pandas2ri.activate()
Removing this solves the problem.
In the following code I'm getting errors when trying to call librosa.griffinlim, telling me the attribute does not exist.
import os
from matplotlib import pyplot as plt
import librosa
import librosa.display
import IPython.display as ipd
import numpy as np
import cv2
import soundfile as sf  # needed for sf.write below
S = cv2.imread('spectrograms/CantinaBand60.wav10.jpg')
D = librosa.amplitude_to_db(np.abs(S), ref=np.max)
signal = librosa.griffinlim(D)
sf.write('test.wav', signal, 352000)
I've upgraded librosa, and I still encounter the error. The documentation page for this function no longer seems to exist either. I've also tried importing just that module using librosa.griffinlim, but it continues to tell me this module doesn't exist. Was this function removed in a recent version? If so, is there another function I can use to apply the Griffin-Lim algorithm?
librosa.griffinlim was introduced in librosa 0.7.0, so you need that version or later. You can check your installed version with the following code:
import librosa; print(librosa.__version__)
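If you'd rather compare the version in code than by eye, a small helper can do it. This is a minimal sketch: version_tuple and at_least are hypothetical names, and the parsing only handles plain dotted numeric versions.

```python
def version_tuple(version):
    """Turn '0.7.0' into (0, 7, 0); stops at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def at_least(installed, required):
    """True if the installed version is the required one or newer."""
    return version_tuple(installed) >= version_tuple(required)


# Comparing tuples avoids the trap of string comparison, where "0.10.1" < "0.7.0".
print(at_least("0.6.3", "0.7.0"))
print(at_least("0.10.1", "0.7.0"))
```

With librosa imported, the check would be at_least(librosa.__version__, "0.7.0").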
The following code works:
import sklearn.linear_model
clf = sklearn.linear_model.LogisticRegressionCV()
The following code does not work:
import sklearn
clf = sklearn.linear_model.LogisticRegressionCV()
whereas with NumPy, the following works:
import numpy as np
np.random.randint(10)
Why is that? Please elaborate.
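The general rule behind this is that importing a package does not import its submodules unless the package's own __init__.py does so. As far as I know, NumPy makes numpy.random available from its __init__.py, while the sklearn version in the question evidently did not do the same for linear_model. The sketch below reproduces the behaviour with a throwaway package; the package name demo_pkg_submodule_example and its contents are made up for illustration.

```python
import importlib
import os
import sys
import tempfile

# Build a throwaway package on disk with an empty __init__.py and one submodule.
root = tempfile.mkdtemp()
package_dir = os.path.join(root, "demo_pkg_submodule_example")
os.mkdir(package_dir)
open(os.path.join(package_dir, "__init__.py"), "w").close()
with open(os.path.join(package_dir, "sub.py"), "w") as handle:
    handle.write("VALUE = 42\n")

sys.path.insert(0, root)
importlib.invalidate_caches()

import demo_pkg_submodule_example as demo_pkg

# The empty __init__.py never imported "sub", so the attribute is missing.
before = hasattr(demo_pkg, "sub")

import demo_pkg_submodule_example.sub

# Importing the submodule binds it as an attribute of the parent package.
after = hasattr(demo_pkg, "sub")

print(before, after)  # False True
```

This is why import sklearn.linear_model (or from sklearn import linear_model) is the safe way to write it, regardless of what a particular release happens to import for you.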
When attempting to pass my RNN call, I call tf.nn.rnn_cell and I receive the following error:
AttributeError: module 'tensorflow.python.ops.nn' has no attribute 'rnn_cell'
Which is odd, because I'm sure I imported everything correctly:
from __future__ import print_function, division
from tensorflow.contrib import rnn
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt
But looking at the docs, things have moved around between tensorflow versions.
What would you recommend to fix this?
These are the lines I'm getting the error on:
state_per_layer_list = tf.unstack(init_state, axis=0)
rnn_tuple_state = tuple(
[tf.nn.rnn_cell.LSTMStateTuple(state_per_layer_list[idx][0], state_per_layer_list[idx][1])
for idx in range(num_layers)]
)
Specifically:
tf.nn.rnn_cell
I'm using Anaconda 3 to manage all of this, so the dependencies should all be taken care of. I have already worked around a damn rank/shape error with Tensor shapes, which took ages to resolve.
Cheers in advance.
Replace tf.nn.rnn_cell with tf.contrib.rnn.
Since version 1.0, the RNN cells are implemented as part of the contrib module.
More information can be found here:
https://www.tensorflow.org/api_guides/python/contrib.rnn
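Because the location has moved between releases, one option is to look the module up at whichever path exists instead of hard-coding one. This is only a sketch: resolve_first is a hypothetical helper, and it is demonstrated on the standard-library os module so that it runs without TensorFlow installed. Note that tf.contrib only exists in TensorFlow 1.x.

```python
import os
from functools import reduce


def resolve_first(root, dotted_paths):
    """Return the first attribute path that exists under root, e.g. 'nn.rnn_cell'."""
    for path in dotted_paths:
        try:
            # Walk the dotted path one attribute at a time.
            return reduce(getattr, path.split("."), root)
        except AttributeError:
            continue
    raise AttributeError("none of %r found" % (dotted_paths,))


# "nn.rnn_cell" does not exist on os, so the helper falls through to "path".
found = resolve_first(os, ["nn.rnn_cell", "path"])
print(found is os.path)
```

In a TensorFlow 1.x notebook the same helper could be called as rnn_cell = resolve_first(tf, ["nn.rnn_cell", "contrib.rnn"]) and then used as rnn_cell.LSTMStateTuple(...), assuming one of the two locations exists in the installed release.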
I'm trying to import a series of modules into my Python 3.5 code. I use the following code to import:
# import packages for analysis and modeling
import pandas as pd # data frame operations; use pandas 0.18
from pandas.tools.rplot import RPlot, TrellisGrid, GeomPoint, \
ScaleRandomColour # trellis/lattice plotting
import numpy as np # arrays and math functions
from scipy.stats import uniform # for training-and-test split
import statsmodels.api as sm # statistical models (including regression)
import statsmodels.formula.api as smf # R-like model specification
import matplotlib.pyplot as plt # 2D plotting
When I use this code, I receive the following error:
ImportError Traceback (most recent call last)
/var/folders/zy/snhf2bh51v33ny6nf7fyr4wh0000gn/T/tmpdxMQ0Y.py in <module>()
7 # import packages for analysis and modeling
8 import pandas as pd # data frame operations; use pandas 0.18
----> 9 from pandas.tools.rplot import RPlot, TrellisGrid, GeomPoint, \
10 ScaleRandomColour # trellis/lattice plotting
11 import numpy as np # arrays and math functions
ImportError: No module named 'pandas.tools.rplot'
I tried this code with "pd" and with "pandas" written out. I confirmed that pandas was installed by manually typing in import pandas as pd and then confirming its existence by typing in "pd" and receiving the following message: <module 'pandas' from '/Users/me/Library/Enthought/Canopy/edm/envs/User/lib/python3.5/site-packages/pandas/__init__.py'>
What is causing this to happen?
Renaming a module with "as" during import only creates a local name. It doesn't mean Python will be able to find the original module (pandas) when you use the name pd in a later import statement: Python will look for a module named pd, which it will not find.
Since pd does not correspond to a module while pandas does, you'll need to use from pandas import tools to get it to work.
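The point about aliases can be checked with a standard-library module. In this sketch the alias operating_system is an arbitrary name chosen for illustration; the import through the alias fails because the import system searches for a module with that literal name.

```python
import os as operating_system  # the alias is only a name in this namespace

try:
    # The import system looks for a module literally called "operating_system".
    from operating_system.path import join
    alias_worked = True
    message = ""
except ImportError as error:
    alias_worked = False
    message = str(error)

# Using the real module name in the import statement works.
from os.path import join

print(alias_worked, message)
```

The alias is still fine for attribute access after the import (operating_system.path.join works); it just cannot be used as the module name in another import statement.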