I want to apply the Hough transform to stock prices (an array of numbers).
I have read the OpenCV and scikit-image docs and examples, but found nothing on how to apply the transform to arrays of numbers instead of images.
I created a 2D array from the data: the first dimension is X (simply the index of the data point) and the second dimension is the close price.
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import pywt as wt
from skimage.transform import (hough_line, hough_line_peaks, probabilistic_hough_line)
from matplotlib import cm
path = "22-31May-100Tick.csv"
df = pd.read_csv(path)
y = df.Close.values
x = np.arange(0,len(y),1)
data = []
for i in x:
    a = [i, y[i]]
    data.append(a)
data = np.array(data)
How is it possible to apply the transform with OpenCV or scikit-image?
Thank you
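One possible route (a sketch, not from the original post; the number of price bins is an assumption to tune): hough_line expects a 2-D binary image, so the (index, price) points can first be rasterized onto a grid of zeros and ones and the transform run on that.
# Hedged sketch: rasterize the price series into a binary image,
# then run skimage's straight-line Hough transform on it.
n_price_bins = 200                                   # assumed vertical resolution
price_bins = np.linspace(y.min(), y.max(), n_price_bins)
img = np.zeros((n_price_bins, len(y)), dtype=bool)
rows = np.digitize(y, price_bins) - 1                # map each price to a row index
img[rows, x] = True
h, theta, d = hough_line(img)                        # accumulator, angles, distances
accum, angles, dists = hough_line_peaks(h, theta, d)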
I've seen some answers here, but they didn't help me. I need to reconstruct the initial data from two matrices, using the first ten principal components. The first matrix (Z) (X_reduced_417) is the result of applying sklearn.decomposition.PCA. The second matrix (X_loadings_417) (F) is the weight (loadings) matrix. The answer is: initial data = Z*F + mean_matrix. How do I use sklearn to find Z?
import matplotlib.pyplot as plt
%matplotlib inline
import numpy as np
import pandas as pd
import sklearn.datasets, sklearn.decomposition
df_loadings = pd.read_csv('X_loadings_417.csv', header=None)
df_reduced = pd.read_csv('X_reduced_417.csv', header=None)
import pandas as pd
import numpy as np
# Load the df_loadings and df_reduced matrices from the CSV files
df_loadings = pd.read_csv("X_loadings_417.csv", header=None)
df_reduced = pd.read_csv("X_reduced_417.csv", header=None)
# Convert the DataFrames to numpy arrays
F = df_loadings.values
Z = df_reduced.values
# The mean of the original data is needed to reconstruct it;
# X here is the original data matrix, which has to be loaded separately
mean_matrix = np.mean(X, axis=0)
# Reconstruct the original data using the first ten principal components
X_reconstructed = Z[:,:10].dot(F[:10,:]) + mean_matrix
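For the "how to use sklearn to find Z" part: if the original data matrix is available (called X below, an assumption), sklearn's PCA produces Z directly as the transformed scores, and the fitted object also stores the mean needed for the reconstruction. A minimal sketch:
import numpy as np
from sklearn.decomposition import PCA
# Assumption: X is the original data as a 2-D numpy array (samples x features)
pca = PCA(n_components=10)
Z = pca.fit_transform(X)    # scores, i.e. the reduced representation
F = pca.components_         # loadings, shape (10, n_features)
X_reconstructed = Z.dot(F) + pca.mean_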
import rasterio as rio
from rasterio.plot import show
from sklearn import cluster
import matplotlib.pyplot as plt
import numpy as np
import glob
for filepath in glob.iglob('./dengue3/*.tiff'):
    elhas_raster = rio.open(filepath)
    elhas_arr = elhas_raster.read()  # read the opened image
    vmin, vmax = np.nanpercentile(elhas_arr, (5, 95))  # 5-95% contrast stretch
    # create an empty array with the same dimensions and data type
    imgxyb = np.empty((elhas_raster.height, elhas_raster.width, elhas_raster.count), elhas_raster.meta['dtype'])
    # loop through the raster's bands to fill the empty array
    for band in range(imgxyb.shape[2]):
        imgxyb[:, :, band] = elhas_raster.read(band + 1)
    #print(imgxyb.shape)
    # convert to a 1d array
    img1d = imgxyb[:, :, :7].reshape((imgxyb.shape[0] * imgxyb.shape[1], imgxyb.shape[2]))
    #print(img1d.shape)
I am using the above code to read the TIFF images in a folder and get the arrays. However, the output is:
ValueError: cannot reshape array of size 6452775 into shape (921825,12)
The images have 12 bands. I tried using 12 in place of 7 in the above code, but the code doesn't execute. How do I resolve this? Thank you for your time.
You have changed the size of the slice you're trying to reshape, but not the column count passed to reshape:
img1d=imgxyb[:,:,:7].reshape((imgxyb.shape[0]*imgxyb.shape[1],imgxyb.shape[2]))
This should be:
img1d=imgxyb[:,:,:7].reshape((imgxyb.shape[0]*imgxyb.shape[1],7))
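If the intent is instead to keep all 12 bands (an assumption about what you need), dropping the slice keeps the element count and the column count consistent:
img1d = imgxyb.reshape((imgxyb.shape[0]*imgxyb.shape[1], imgxyb.shape[2]))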
My data file is shared in the following link.
We can plot this data using the following script.
import matplotlib as mpl
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.cbook as cbook
def read_datafile(file_name):
    data = np.loadtxt(file_name, delimiter=',')
    return data
data = read_datafile('mah_data.csv')
x, y = data[:, 0], data[:, 1]  # first column is t, second is s
fig = plt.figure()
ax1 = fig.add_subplot(111)
ax1.set_title("Data")
ax1.set_xlabel('t')
ax1.set_ylabel('s')
ax1.plot(x, y, c='r', label='My data')
leg = ax1.legend()
plt.show()
How can we detect peaks in Python? I can't find a suitable peak-detection algorithm.
You can use the argrelextrema function in scipy.signal to return the indices of the local maxima or local minima of an array. This works for multi-dimensional arrays as well by specifying the axis.
from scipy.signal import argrelextrema
ind_max = argrelextrema(z, np.greater) # indices of the local maxima
ind_min = argrelextrema(z, np.less) # indices of the local minima
maxvals = z[ind_max]
minvals = z[ind_min]
More specifically, one can use argrelmax or argrelmin to find the local maxima or minima; these are convenience wrappers around argrelextrema with the comparator already fixed. They also work for multi-dimensional arrays via the axis argument.
from scipy.signal import argrelmax, argrelmin
ind_max = argrelmax(z)  # indices of the local maxima
ind_min = argrelmin(z)  # indices of the local minima
maxvals = z[ind_max]
minvals = z[ind_min]
For more details, one can refer to this link: https://docs.scipy.org/doc/scipy/reference/signal.html#peak-finding
Try using peakutils (http://pythonhosted.org/PeakUtils/). Here is my solution to your question using peakutils.
import pandas as pd
import peakutils
data = pd.read_csv("mah_data.csv", header=None)
ts = data[0:10000][1] # Get the second column in the csv file
print(ts[0:10]) # Print the first 10 rows, for quick testing
# check peakutils for all the parameters.
# indices are the index of the points where peaks appear
indices = peakutils.indexes(ts, thres=0.4, min_dist=1000)
print(indices)
You should also check out peak finding in scipy (https://docs.scipy.org/doc/scipy/reference/generated/scipy.signal.find_peaks_cwt.html)
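A minimal hedged sketch of that routine, assuming ts is the one-dimensional series loaded above and the width range is a guess to tune for your data:
import numpy as np
from scipy.signal import find_peaks_cwt
# matches the signal against wavelets of the given widths and reports peak indices
peak_indices = find_peaks_cwt(ts, np.arange(1, 100))
print(peak_indices)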
Try the findpeaks library.
pip install findpeaks
I cannot find the attached data, but suppose the data is a vector stored in data:
import pandas as pd
data = pd.read_csv("mah_data.csv", header=None).values
# Import library
from findpeaks import findpeaks
# If the resolution of your data is low, I would recommend the ``lookahead`` parameter, and if your data is "bumpy", also the ``smooth`` parameter.
fp = findpeaks(lookahead=1, interpolate=10)
# Find peaks
results = fp.fit(data)
# Make plot
fp.plot()
# Results with respect to original input data.
results['df']
# Results based on interpolated smoothed data.
results['df_interp']
I am working with a series of images. I read them first, store them in a list, convert the list to a DataFrame, and finally I would like to apply Isomap. When I read the images (I have 84 of them) I get an 84x2303 DataFrame of objects, and each object itself also looks like a DataFrame. I am wondering how to convert all of it to numeric so I can run Isomap on it and then plot the result.
Here is my code:
import pandas as pd
from scipy import misc
from mpl_toolkits.mplot3d import Axes3D
import matplotlib
import matplotlib.pyplot as plt
import glob
from sklearn import manifold
samples = []
path = 'Datasets/ALOI/32/*.png'
files = glob.glob(path)
for name in files:
    img = misc.imread(name)
    img = img[::2, ::2]
    x = (img / 255.0).reshape(-1, 3)
    samples.append(x)
df = pd.DataFrame.from_records(samples)
print(df.dtypes)
print(df.shape)
Thanks!
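A minimal sketch of one possible fix (an assumption, not the original poster's code): flatten each image into a single 1-D numeric vector so the DataFrame holds plain floats rather than nested objects, then fit Isomap on it directly (n_neighbors and n_components below are arbitrary choices to tune).
samples = []
for name in files:
    img = misc.imread(name)[::2, ::2]
    samples.append((img / 255.0).reshape(-1))  # one flat numeric row per image
df = pd.DataFrame(samples)                     # float dtype, not object
iso = manifold.Isomap(n_neighbors=6, n_components=3)
T = iso.fit_transform(df)                      # low-dimensional embedding to plot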
I have a MODIS atmospheric product. I used the code below to read the data.
%matplotlib inline
import numpy as np
from pyhdf import SD
import matplotlib.pyplot as plt
files = ['file1.hdf','file2.hdf','file3.hdf']
for n in files:
    hdf = SD.SD(n)
    lat = (hdf.select('Latitude'))[:]
    lon = (hdf.select('Longitude'))[:]
    sds = hdf.select('Deep_Blue_Aerosol_Optical_Depth_550_Land')
    data = sds.get()
    attributes = sds.attributes()
    scale_factor = attributes['scale_factor']
    data = data * scale_factor
    plt.contourf(lon, lat, data)
The problem is that on some days there are three datasets (as in this case; other days have four), so I cannot simply use hstack or vstack to merge them.
My intention is to get a single array out of the three different data arrays.
I have also attached datafiles along with this link:https://drive.google.com/open?id=0B2rkXkOkG7ExYW9RNERaZU5lam8
Your help will be highly appreciated.
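One hedged way to merge granules of different shapes (a sketch, assuming the lat/lon and AOD arrays inside each file share a shape): flatten every granule to 1-D, concatenate across files, and plot the scattered points with tricontourf instead of contourf.
all_lat, all_lon, all_data = [], [], []
for n in files:
    hdf = SD.SD(n)
    sds = hdf.select('Deep_Blue_Aerosol_Optical_Depth_550_Land')
    scale_factor = sds.attributes()['scale_factor']
    all_lat.append(hdf.select('Latitude')[:].ravel())
    all_lon.append(hdf.select('Longitude')[:].ravel())
    all_data.append(sds.get().ravel() * scale_factor)
# single 1-D arrays covering every granule
lat = np.concatenate(all_lat)
lon = np.concatenate(all_lon)
data = np.concatenate(all_data)
plt.tricontourf(lon, lat, data)  # contours scattered (non-gridded) points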