Use Spreadsheet Names as Variables in Pandas

I have several models and sets of initial values stored in separate sheets of an Excel file called RateMatrix1. My models are WIN, WSW, WPA, WFR... and my initial values are WI, WS, WP, WF..., and the Excel sheets are named exactly the same way.
I would like to write a function that takes the name of the model and the name of the initial values and uses them as the sheet names in the code below. Is there a way to do this in Python?
import pandas as pd
import numpy as np

def MLA(model, initialValues):
    # read the matrix values from the Excel sheet and convert them to a matrix
    RMatrix = pd.read_excel(r"C:\Anaconda3\RateMatrix1.xlsx", sheetname="model", skiprows=0).as_matrix()
    # read the column matrix (initial values) from the Excel sheet and convert it to a matrix
    initialAmount = pd.read_excel(r"C:\Anaconda3\RateMatrix1.xlsx", sheetname="initialValues", skiprows=0).as_matrix()
    return np.dot(RMatrix, initialAmount)

print(MLA(WIN, WI))

Never mind, I found a solution: pass the function arguments themselves as the sheet names instead of the quoted strings "model" and "initialValues", and call the function with the sheet names as strings.
For anyone else looking to do the same, here's my code (on recent pandas versions the keyword is sheet_name, and as_matrix() has been replaced by to_numpy()):
import pandas as pd
import numpy as np

def MLA(model, initialValues):
    # read the matrix values from the Excel sheet named by `model`
    RMatrix = pd.read_excel(r"C:\Anaconda3\RateMatrix1.xlsx", sheet_name=model, skiprows=0).to_numpy()
    # read the column matrix (initial values) from the sheet named by `initialValues`
    initialAmount = pd.read_excel(r"C:\Anaconda3\RateMatrix1.xlsx", sheet_name=initialValues, skiprows=0).to_numpy()
    return np.dot(RMatrix, initialAmount)

print(MLA("WIN", "WI"))
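If several model and initial-value pairs come from the same workbook, one option is to load every sheet once and look the sheets up by name. The sketch below assumes the workbook was read with pd.read_excel(path, sheet_name=None), which returns a dict mapping sheet names to DataFrames. The helper name mla_from_sheets and the toy numbers are hypothetical and only stand in for the real workbook.

```python
import numpy as np
import pandas as pd

def mla_from_sheets(sheets, model, initial_values):
    """Multiply the rate matrix in sheet `model` by the column in sheet `initial_values`.

    `sheets` is a dict of DataFrames keyed by sheet name.
    """
    rate_matrix = sheets[model].to_numpy()
    initial_amount = sheets[initial_values].to_numpy()
    return np.dot(rate_matrix, initial_amount)

# Toy stand-in for:
#   sheets = pd.read_excel(r"C:\Anaconda3\RateMatrix1.xlsx", sheet_name=None)
sheets = {
    "WIN": pd.DataFrame([[1.0, 2.0], [3.0, 4.0]]),
    "WI": pd.DataFrame([[10.0], [20.0]]),
}
result = mla_from_sheets(sheets, "WIN", "WI")
print(result)
```

Reading the workbook once avoids reopening the file for every model, at the cost of holding all sheets in memory.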

Related

I want to convert a .mat file into excel

The data has type numpy.ndarray. I am trying to convert it to a pandas DataFrame with the following code:
import pandas as pd
import numpy as np
from scipy.io import loadmat
data= loadmat(r"EAOW_FLOW_TimeSeries_1hr_LocationA1_final.mat")
ary = np.array(data)
ser = pd.Series(ary)
df=pd.DataFrame(ser)
df.to_csv(r"data.csv", index=False)
This generates a file, but all the data ends up in a single cell.
I am new to Python. Please help me resolve this and convert the .mat file to CSV.
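A likely cause is that scipy.io.loadmat returns a dict (variable names mapped to arrays, plus bookkeeping keys such as '__header__'), so wrapping the whole dict in np.array yields a single object. A minimal sketch of one way around this, assuming the file holds a 2-D numeric variable; the helper names and the variable name "flow" are hypothetical:

```python
import numpy as np
import pandas as pd

def mat_variables(mat):
    """Names of the user variables in a loadmat() result (skips '__header__' etc.)."""
    return [key for key in mat if not key.startswith("__")]

def mat_to_frame(mat, name):
    """Build a DataFrame from one variable, assumed to be a 1-D or 2-D numeric array."""
    array = np.atleast_2d(np.asarray(mat[name]))
    return pd.DataFrame(array)

# Toy dict shaped like a loadmat() result; with a real file this would be
#   from scipy.io import loadmat
#   mat = loadmat(r"EAOW_FLOW_TimeSeries_1hr_LocationA1_final.mat")
mat = {
    "__header__": b"MATLAB 5.0 MAT-file",
    "__version__": "1.0",
    "__globals__": [],
    "flow": np.arange(6.0).reshape(3, 2),  # hypothetical variable name
}
names = mat_variables(mat)
df = mat_to_frame(mat, names[0])
print(names, df.shape)
```

Calling df.to_csv("data.csv", index=False) on the result then writes one value per cell. Variables that are MATLAB structs or cell arrays need extra unpacking and are not handled here.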

Read a HDF data to a 3d array and save as a dataframe in python

I am currently working with the NASA aerosol optical depth data (MCD19A2), a level-three NASA satellite product. I have uploaded the data. I want to save the data as a dataframe that includes the longitude, latitude, and values. I have successfully converted the 0.47 um band file into a three-dimensional array. How can I convert this array into a dataframe that includes X, Y, and the value?
Below is the code I have tried:
from osgeo import gdal
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
rds = gdal.Open("MCD19A2.A2006001.h26v04.006.2018036214627.hdf")
names=rds.GetSubDatasets()
names[0][0]
# 'HDF4_EOS:EOS_GRID:"MCD19A2.A2006001.h26v04.006.2018036214627.hdf":grid1km:Optical_Depth_047'
aod_047 = gdal.Open(names[0][0])
a47 = aod_047.ReadAsArray()
a47[1].shape
# (1200, 1200)
I would like the result to look like this:
X (n=1200)    Y (n=1200)    AOD_047
8896067       5559289       0.0123
I know that in R this can be done by
require('gdalUtils')
require('raster')
require('rgdal')
file.name<-"MCD19A2.A2006001.h26v04.006.2018036214627.hdf"
sds <- get_subdatasets(file.name)
gdal_translate(sds[1], dst_dataset = paste0('tmp047', basename(file.name), '.tiff'), b = nband)
r.047 <- raster(paste0('tmp047', basename(file.name), '.tiff'))
df.047 <- raster::as.data.frame(r.047, xy = T)
names(df.047)[3] <- 'AOD_047'
However, R holds everything in memory, and writing and re-reading the 'tif' file uses a lot of it, so I want to do this task in Python. Thanks a lot for your help.

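You can try pandas, but note that read_hdf only reads HDF5 files (it relies on PyTables), whereas the file in the question is HDF4-EOS, so this will likely fail on it: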
import pandas as pd
df=pd.read_hdf('filename.hdf')
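Another route is to stay with the array that GDAL already returned and compute the coordinates from the dataset's geotransform. The following is a sketch under stated assumptions, not a tested MCD19A2 workflow: it takes one 2-D slice (such as a47[1] from the question) and the six-number tuple returned by GDAL's GetGeoTransform(), and emits pixel-center coordinates. The function name band_to_frame and the toy geotransform values are made up for illustration.

```python
import numpy as np
import pandas as pd

def band_to_frame(band, geotransform, value_name="AOD_047"):
    """Flatten a 2-D raster into X, Y, value rows using a GDAL-style geotransform.

    geotransform = (x_origin, pixel_width, row_rotation,
                    y_origin, column_rotation, pixel_height)
    """
    x0, dx, rx, y0, ry, dy = geotransform
    rows, cols = np.indices(band.shape)
    # shift by half a pixel so coordinates refer to pixel centers
    c = cols + 0.5
    r = rows + 0.5
    x = x0 + c * dx + r * rx
    y = y0 + c * ry + r * dy
    return pd.DataFrame({"X": x.ravel(), "Y": y.ravel(), value_name: band.ravel()})

# Toy 2x2 raster and made-up geotransform; with the real file this would be
#   band = a47[1]
#   geotransform = aod_047.GetGeoTransform()
band = np.array([[1.0, 2.0], [3.0, 4.0]])
geotransform = (100.0, 10.0, 0.0, 200.0, 0.0, -10.0)
df = band_to_frame(band, geotransform)
print(df)
```

The coordinates come out in the dataset's own projection; any fill values or scale factors in the product would still need to be handled separately.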

How to iterate over rows of .csv file and pass each row to a time-series analysis model?

I want to write a Python program that iterates over each row of a data matrix in a .csv file and passes each row as input to a time-series analysis model. The output for each row, which is a single value, should then be stored as a column.
So far, I have tried iterating over the rows, passing each one through the model, and printing each output:
import pandas as pd
import numpy as np
from statsmodels.tsa.ar_model import AR
from random import random

data = pd.read_csv('EXAMPLEMATRIX.csv', header=None)
for i in data.iterrows():
    df = np.asarray(i)
    model = AR(df)
    model_fit = model.fit()
    yhat = model_fit.predict(len(df), len(df))
    print(yhat)
but I get an error:
ValueError: maxlag should be < nobs
Please help me solve this problem, point out where it is going wrong, or give me a reference for solving it.
Thanks in advance.

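Use this instead. data.iterrows() yields (index, row) tuples, so np.asarray(i) does not contain just the row values; selecting each row with iloc and passing row.values gives the model the full series:
import pandas as pd
import numpy as np
from statsmodels.tsa.ar_model import AR

data = pd.read_csv('EXAMPLEMATRIX.csv', header=None)
for i in range(data.shape[0]):
    row = data.iloc[i]
    model = AR(row.values)
    model_fit = model.fit()
    yhat = model_fit.predict(len(row), len(row))
    print(yhat)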
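The question also asks for the outputs to be stored as a column rather than printed. One way to do that, sketched here, is DataFrame.apply with axis=1, which calls a function once per row and returns a Series of results. To keep the sketch independent of statsmodels, drift_forecast is a hypothetical stand-in for the AR model (last value plus mean step); the helper name forecast_rows and the toy data are also assumptions.

```python
import numpy as np
import pandas as pd

def drift_forecast(values):
    """Stand-in forecaster: last value plus the mean step between observations."""
    values = np.asarray(values, dtype=float)
    return values[-1] + np.diff(values).mean()

def forecast_rows(data, forecaster):
    """Apply `forecaster` to every row and return one forecast per row."""
    return data.apply(lambda row: forecaster(row.to_numpy()), axis=1)

# Toy matrix standing in for pd.read_csv('EXAMPLEMATRIX.csv', header=None)
data = pd.DataFrame([[1, 2, 3, 4], [10, 8, 6, 4]])
forecasts = forecast_rows(data, drift_forecast)

# Store the forecasts as a new column next to the inputs
result = data.copy()
result["yhat"] = forecasts
print(result)
```

To use the AR model from the answer instead, the forecaster would be a function that fits the model on the row values and returns the single predicted value.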

plotting a large matrix in python

I have a data file in Excel (.xlsx). The data represents a 100 micrometer by 100 micrometer area. The number of steps was set to 50 for x and 50 for y, meaning each pixel is 2 micrometers in size. How can I create a 2D image from this data?

Getting data from xlsx files can be achieved using the openpyxl Python module. After installing the module, a simple example is (assuming your workbook has the data in Sheet1, cells B2:G16):
from openpyxl import load_workbook

wb = load_workbook("/path/to/matrix.xlsx")
cell_range = wb['Sheet1']['B2:G16']
for row in cell_range:
    for cell in row:
        # print the cells of one row on the same line
        print(str(cell.value) + " ", end='')
    print("")
This prints all the values in the range; you could also read them into a numpy array and plot it.

If you are willing to plot pixels instead of points using matplotlib, you can convert your dataframe into a numpy array and plot that array with matplotlib's imshow() method, after manipulating the array as needed.
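The imshow() suggestion can be sketched as follows. This is a minimal example assuming the spreadsheet holds a 50 by 50 grid of numbers with no header row; the random grid, the helper names, and the file name data.xlsx in the comment are placeholders. The extent argument tells imshow the physical size of the image so the axes read in micrometers rather than pixel indices.

```python
import numpy as np

def grid_extent(shape, pixel_um):
    """Return (left, right, bottom, top) in micrometers for an image of `shape`."""
    n_rows, n_cols = shape
    return (0.0, n_cols * pixel_um, 0.0, n_rows * pixel_um)

def plot_grid(grid, pixel_um=2.0):
    """Draw `grid` as an image with axes labelled in micrometers."""
    import matplotlib.pyplot as plt  # imported here so grid_extent works without matplotlib

    fig, ax = plt.subplots()
    image = ax.imshow(grid, origin="lower", extent=grid_extent(grid.shape, pixel_um))
    ax.set_xlabel("x (micrometers)")
    ax.set_ylabel("y (micrometers)")
    fig.colorbar(image, ax=ax, label="measured value")
    return fig

# Placeholder data standing in for the spreadsheet values, e.g.
#   grid = pd.read_excel("data.xlsx", header=None).to_numpy()
grid = np.random.default_rng(0).random((50, 50))
extent = grid_extent(grid.shape, 2.0)
print(extent)
```

Calling plot_grid(grid).savefig("scan.png") would then write the image to disk; origin="lower" puts row 0 at the bottom, which may need flipping depending on how the scan was recorded.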

matlab data file to pandas DataFrame [duplicate]

This question already has answers here:
Read .mat files in Python
(15 answers)
Closed 6 years ago.
Is there a standard way to convert MATLAB .mat (MATLAB-formatted data) files to a pandas DataFrame?
I am aware that a workaround is possible by using scipy.io but I am wondering whether there is a straightforward way to do it.

I found two ways: scipy or mat4py.
mat4py
Load data from MAT-file
The function loadmat loads all variables stored in the MAT-file into a
simple Python data structure, using only Python’s dict and list
objects. Numeric and cell arrays are converted to row-ordered nested
lists. Arrays are squeezed to eliminate arrays with only one element.
The resulting data structure is composed of simple types that are
compatible with the JSON format.
Example: Load a MAT-file into a Python data structure:
from mat4py import loadmat
data = loadmat('datafile.mat')
From:
https://pypi.python.org/pypi/mat4py/0.1.0
Scipy:
Example:
import numpy as np
from scipy.io import loadmat # this is the SciPy module that loads mat-files
import matplotlib.pyplot as plt
from datetime import datetime, date, time
import pandas as pd
mat = loadmat('measured_data.mat')  # load mat-file
mdata = mat['measuredData']  # variable in mat file
mdtype = mdata.dtype  # dtypes of structures are "unsized objects"
# * SciPy reads in structures as structured NumPy arrays of dtype object
# * The size of the array is the size of the structure array, not the number
#   of elements in any particular field. The shape defaults to 2-dimensional.
# * For convenience make a dictionary of the data using the names from dtypes
# * Since the structure has only one element, but is 2-D, index it at [0, 0]
ndata = {n: mdata[n][0, 0] for n in mdtype.names}
# Reconstruct the columns of the data table from just the time series
# Use the number of intervals to test if a field is a column or metadata
# (dict.items() replaces the Python 2 iteritems())
columns = [n for n, v in ndata.items() if v.size == ndata['numIntervals']]
# now make a data frame, setting the time stamps as the index
df = pd.DataFrame(np.concatenate([ndata[c] for c in columns], axis=1),
                  index=[datetime(*ts) for ts in ndata['timestamps']],
                  columns=columns)
From:
http://poquitopicante.blogspot.fr/2014/05/loading-matlab-mat-file-into-pandas.html
Finally, you can follow the PyHogs notebook, which still uses scipy:
Reading complex .mat files.
This notebook shows an example of reading a Matlab .mat file,
converting the data into a usable dictionary with loops, a simple plot
of the data.
http://pyhogs.github.io/reading-mat-files.html

Ways to do this:
Using scipy, as you mentioned:
import scipy.io as sio
test = sio.loadmat('test.mat')
Using the matlab engine:
import matlab.engine
eng = matlab.engine.start_matlab()
content = eng.load("example.mat",nargout=1)
