pandas.DataFrame returns Series not a Dataframe

I am working with a series of images. I first read them into a list, then convert the list to a DataFrame, and finally I would like to apply Isomap. After reading the images (there are 84 of them) I get an 84x2303 DataFrame of objects, and each object itself looks like a DataFrame. How can I convert all of it to numeric values (e.g. with to_numeric) so that I can run Isomap on it and plot the result?
Here is my code:
import pandas as pd
from scipy import misc
from mpl_toolkits.mplot3d import Axes3D
import matplotlib
import matplotlib.pyplot as plt
import glob
from sklearn import manifold

samples = []
path = 'Datasets/ALOI/32/*.png'
files = glob.glob(path)
for name in files:
    img = misc.imread(name)
    img = img[::2, ::2]
    x = (img/255.0).reshape(-1,3)
    samples.append(x)
df = pd.DataFrame.from_records(samples)
print(df.dtypes)
print(df.shape)
Thanks!
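One likely cause, sketched under assumptions: reshape(-1,3) turns every image into a 2-D array of RGB triples, so each DataFrame cell ends up holding a small array and the column dtype becomes object. Flattening each image to a 1-D vector with reshape(-1) should give one numeric row per image. The example below uses random arrays as stand-ins for the PNG files (the image size of 8x8 is hypothetical), so it only demonstrates the shapes and dtypes, not the real data.

```python
import numpy as np
import pandas as pd

# Synthetic stand-ins for the PNG files: 84 RGB images of 8x8 pixels.
# (Assumption: the real images would be read from disk instead.)
rng = np.random.default_rng(0)
images = [rng.integers(0, 256, size=(8, 8, 3), dtype=np.uint8) for _ in range(84)]

samples = []
for img in images:
    img = img[::2, ::2]                        # same downsampling as in the question
    samples.append((img / 255.0).reshape(-1))  # 1-D vector, one row per image

df = pd.DataFrame(samples)
print(df.shape)
print(df.dtypes.unique())
```

A purely numeric frame like this can then be passed to an estimator such as sklearn's manifold.Isomap, which expects a 2-D array of shape (n_samples, n_features).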

Related

How to reconstruct data from 10 components in Python using 2 matrixes after PCA?

Help please!!
I've seen some answers here, but they didn't help me. I need to reconstruct the initial data from two matrices, using the first ten principal components. The first matrix, Z (X_reduced_417), is the result of applying sklearn.decomposition.PCA. The second matrix, F (X_loadings_417), is the weight matrix. The answer is: initial data = Z*F + mean_matrix. How do I use sklearn to find Z?
import matplotlib.pyplot as plt
%matplotlib inline
import numpy as np
import pandas as pd
import sklearn.datasets, sklearn.decomposition
df_loadings = pd.read_csv('X_loadings_417.csv', header=None)
df_reduced = pd.read_csv('X_reduced_417.csv', header=None)

import pandas as pd
import numpy as np
# Load the df_loadings and df_reduced matrices from the CSV files
df_loadings = pd.read_csv("X_loadings_417.csv", header=None)
df_reduced = pd.read_csv("X_reduced_417.csv", header=None)
# Convert the DataFrames to numpy arrays
F = df_loadings.values
Z = df_reduced.values
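# The column means of the original data are needed to reconstruct the data.
# Note: X is not defined in this snippet. It must be the original data matrix,
# because Z and F alone do not determine the mean.
mean_matrix = np.mean(X, axis=0)
# Reconstruct the original data using the first ten principal components
# (this assumes F has shape (n_components, n_features))
X_reconstructed = Z[:,:10].dot(F[:10,:]) + mean_matrix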
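To check the formula Z*F + mean end to end, here is a numpy-only sketch on synthetic data (the matrix X below is made up and stands in for the unknown original data). With sklearn, the same quantities would come from a fitted PCA object: Z from pca.transform(X), F from pca.components_, and the mean from pca.mean_.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic data of rank 10 plus an offset (assumption: stands in for the real X).
X = rng.normal(size=(50, 10)) @ rng.normal(size=(10, 20)) + 5.0

mean = X.mean(axis=0)
Xc = X - mean
# Rows of Vt are the principal directions, i.e. the loadings matrix F.
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
F = Vt
Z = Xc @ F.T            # scores: the reduced representation of X

k = 10                  # number of principal components to keep
X_rec = Z[:, :k] @ F[:k, :] + mean
print(np.allclose(X_rec, X))
```

Because the synthetic data has rank 10, ten components reproduce it up to floating-point error. On real data with more than ten significant directions, the result is an approximation.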

Plotly Heatmap - Giving your header/index names

I am following the tutorial at https://plotly.com/python/heatmaps/
If you run the lines
import plotly.express as px
df = px.data.medals_wide(indexed=True)
you can see that the "header" row is named "medals", which allows it to be used as an id later. The same goes for nations.
When I create my own pandas DataFrame
df = pd.DataFrame(model_data, columns=model_names, index=test_names)
what would I have to add to get the equivalent of "medals" and "nations" from the previous example into my DataFrame?

Assuming you have a 2D array or list of data, it is as simple as building the DataFrame in the way you describe:
import plotly.express as px
import pandas as pd
import numpy as np
models = [f"model {n}" for n in range(4)]
tests = [f"test {n}" for n in range(10)]
data = np.random.uniform(1, 3, (len(tests), len(models)))
px.imshow(pd.DataFrame(index=tests, data=data, columns=models))
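The names the question asks about are the names of the column and row indexes. A minimal sketch of setting them with rename_axis follows; the labels "model" and "test" are illustrative choices, not taken from the question. As far as I know, px.imshow uses these axis names as the default axis labels, but treat that as an assumption to verify.

```python
import numpy as np
import pandas as pd

models = [f"model {n}" for n in range(4)]
tests = [f"test {n}" for n in range(10)]
data = np.random.uniform(1, 3, (len(tests), len(models)))

df = pd.DataFrame(data, index=tests, columns=models)
# Name both axes, like the named axes of medals_wide(indexed=True).
df = df.rename_axis(index="test", columns="model")
print(df.index.name, df.columns.name)
```

Assigning df.index.name and df.columns.name directly has the same effect.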

From numpy array to DICOM

My code reads a DICOM file, copies the pixel data into a numpy array, and then modifies that array. It uses lists because I'm trying to operate on multiple DICOM files at the same time.
I haven't found any information on how to take my modified numpy array and turn it into a DICOM file again, so that I can use it outside Python.
# Imports
import cv2
import numpy as np
import matplotlib.pyplot as plt
import SimpleITK as sitk
from glob import glob
import pydicom as dicom

data_path = "C:\\Users\\oliva\\Desktop\\Py tesis\\dicom\\"
output_path = working_path = "C:\\Users\\oliva\\Desktop\\Py tesis\\dicom1\\"
path = glob(data_path + '/*.dcm')

# Check that we are in the correct path
print("Total of %d DICOM images.\nFirst 5 filenames:" % len(path))
print('\n'.join(path[:5]))

# Functions (defined before they are called; each works on its argument)
def equal(data):
    data_set_eq = []
    for element in data:
        imagen_array_eq = cv2.equalizeHist(element)
        data_set_eq.append(imagen_array_eq)
    return data_set_eq

def median(data):
    data_set_m = []
    for element in data:
        imagen_array_m = cv2.medianBlur(element, 5)
        data_set_m.append(imagen_array_m)
    return data_set_m

# Read every file and normalize its first slice to 8 bits
data_set = []
for element in path:
    imagen = sitk.ReadImage(element)
    #imagen = cv2.imread(element)
    array_imagen = sitk.GetArrayViewFromImage(imagen)
    array2_imagen = array_imagen[0]
    imagen_array_norm = np.uint8(cv2.normalize(array2_imagen, None, 0, 255, cv2.NORM_MINMAX))
    data_set.append(imagen_array_norm)

# Check
print(len(data_set))
print(type(data_set[1]))
plt.imshow(data_set[4], cmap=plt.cm.gray)

# Equalization
data_set_eq = equal(data_set)
print(len(data_set_eq))
print(type(data_set_eq[6]))
plt.imshow(data_set_eq[7], cmap=plt.cm.gray)

# Filtering
data_set_m = median(data_set)
print(len(data_set_m))
print(type(data_set_m[6]))
plt.imshow(data_set_m[8], cmap=plt.cm.gray)
I would like some enlightenment on how to produce a DICOM file from my modified numpy array.

You can convert the numpy array back to a SimpleITK image, and then write it out as DICOM. The code would look something like this:
for i, x in enumerate(data_set):
    img = sitk.GetImageFromArray(x)
    # One output file per image; a single fixed name would be overwritten on every iteration
    sitk.WriteImage(img, "your_image_name_here_%d.dcm" % i)
From the file name suffix, SimpleITK knows to write DICOM.
Note that the filtering you are doing can be accomplished within SimpleITK. You don't really need to use OpenCV. Check out the following filters in SimpleITK: IntensityWindowingImageFilter, AdaptiveHistogramEqualizationImageFilter, and MedianImageFilter.
https://itk.org/SimpleITKDoxygen/html/classitk_1_1simple_1_1IntensityWindowingImageFilter.html
https://itk.org/SimpleITKDoxygen/html/classitk_1_1simple_1_1AdaptiveHistogramEqualizationImageFilter.html
https://itk.org/SimpleITKDoxygen/html/classitk_1_1simple_1_1MedianImageFilter.html

Hough Transform on arrays of coordinates(Stock prices)

I want to apply the Hough transform to stock prices (an array of numbers).
I read the OpenCV and scikit-image docs and examples, but found nothing on how to apply the transformation to arrays of numbers instead of images.
I created a 2D array from the data. The first dimension is X (simply the index of the data) and the second dimension is the close prices.
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import pywt as wt
from skimage.transform import (hough_line, hough_line_peaks,probabilistic_hough_line)
from matplotlib import cm

path = "22-31May-100Tick.csv"
df = pd.read_csv(path)
y = df.Close.values
x = np.arange(0,len(y),1)
data = []
for i in x:
    a = [i,y[i]]
    data.append(a)
data = np.array(data)
How can I apply the transformation with OpenCV or scikit-image?
Thank you
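One possible approach, sketched with numpy only: the Hough transform does not strictly need an image, because every point (x, y) can vote directly for all lines rho = x*cos(theta) + y*sin(theta) that pass through it. The function name, the bin counts, and the rescaling of both axes to [0, 1] are assumptions made for illustration. The synthetic straight line stands in for the price series. An alternative would be to rasterize the points into a binary image and pass that to skimage.transform.hough_line.

```python
import numpy as np

def hough_lines_from_points(x, y, n_theta=180, n_rho=201):
    """Accumulate Hough votes in (rho, theta) space directly from coordinates."""
    # Rescale both axes to [0, 1] so that index and price are comparable.
    xs = (x - x.min()) / (x.max() - x.min())
    ys = (y - y.min()) / (y.max() - y.min())
    thetas = np.linspace(-np.pi / 2, np.pi / 2, n_theta, endpoint=False)
    # rho for every (point, angle) pair, shape (n_points, n_theta)
    rho = np.outer(xs, np.cos(thetas)) + np.outer(ys, np.sin(thetas))
    rho_max = np.sqrt(2.0)  # largest possible |rho| inside the unit square
    rho_idx = np.round((rho + rho_max) / (2 * rho_max) * (n_rho - 1)).astype(int)
    theta_idx = np.broadcast_to(np.arange(n_theta), rho_idx.shape)
    acc = np.zeros((n_rho, n_theta), dtype=int)
    np.add.at(acc, (rho_idx, theta_idx), 1)  # one vote per point and angle
    rhos = np.linspace(-rho_max, rho_max, n_rho)
    return acc, thetas, rhos

# Synthetic example: 100 points on a straight line.
x = np.arange(100, dtype=float)
y = 2.0 * x + 1.0
acc, thetas, rhos = hough_lines_from_points(x, y)
i, j = np.unravel_index(acc.argmax(), acc.shape)
print(acc[i, j], np.degrees(thetas[j]), rhos[i])
```

The strongest accumulator cell identifies the dominant line in the rescaled coordinates. For noisy prices, several peaks would have to be extracted and mapped back to the original scale.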

How to convert Image files to CSV with label

The test folder has subfolders named 0 to 9. Each of these folders contains the handwritten images of the respective digit. I want to convert the images to a single test.csv file such that the first column gives the label of the digit (i.e. 0-9) and the remaining columns give the pixel values of the image.
I created the CSV, but the first column, for the label, is empty.
from scipy.misc import imread
import numpy as np
import pandas as pd
import os
import imageio
import glob

root = './test'
# go through each directory in the root folder given above
for directory, subdirectories, files in os.walk(root):
    # go through each file in that directory
    for file in files:
        # read the image file and extract its pixels
        im = imread(os.path.join(directory,file))
        value = im.flatten()
        value = np.hstack((directory[8:],value))
        df = pd.DataFrame(value).T
        df = df.sample(frac=1) # shuffle the dataset
        with open('test.csv', 'a') as dataset:
            df.to_csv(dataset, header=False, index=False)

from scipy.misc import imread
import numpy as np
import pandas as pd
import os
import imageio
import glob
import pathlib

v = []
for i,files in enumerate(pathlib.Path('./Train').glob('*/**/*.png')):
    im = imread(files.as_posix())
    value = im.flatten()
    value = np.hstack((int(files.parent.name),value))
    v.append(value)
df = pd.DataFrame(v)
df = df.sample(frac=1)
df.to_csv('train.csv',header=False,index=False)
This is how I corrected my code.
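A likely reason the label column was empty in the first version, shown as a small sketch: for a root of './test', os.walk yields directory strings such as './test/0', and slicing from position 8 starts past the last character. Taking the last path component avoids counting characters. The example path is an assumption based on the folder layout described in the question.

```python
import os

directory = "./test/0"   # what os.walk would yield for the folder of digit 0
print(repr(directory[8:]))                # the slice used in the question
label = os.path.basename(directory)       # last path component, no counting needed
print(repr(label))
```

This mirrors what the corrected code does with files.parent.name.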
