reference functions to array slices in python - python

Is the following possible in Python?
(I am pretty new to Python, so I am not sure what the appropriate search term would be.)
I have a class that stores and manipulates a large numpy array.
Now I would like to access parts of this array via an alias, a kind of 'reference function'.
Here is a dummy example for illustration:
import numpy as np

class Trajectory(object):
    def __init__(self, M=np.random.random((4,4))):
        self.M = M
    def get_second_row(self):
        return self.M[1,:]
    def set_second_row(self, newData):
        self.M[1,:] = newData

t = Trajectory()
print(t.M)
initialData = t.get_second_row()
t.set_second_row(np.random.random(4))
print(t.M)
What I don't like about this is that I have to write separate set and get functions. Is there a simpler way to use just one function to refer to the parts of the array M that would work for both getting and setting values?
In pseudocode, I am looking for something that would allow me to do this:
values=t.nth_row
t.nth_row=values+1
I would like to use t.nth_row as a reference for both getting and setting the value, if that makes sense.

is there a simpler way to use just one function to refer to the parts of the array M that would work for both getting and setting values?
Yes, and you've written it. It is your get function:
initialData=t.get_second_row()
t.get_second_row()[:] = np.random.random(4)
t.get_second_row()[0] = 1997
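This works because basic slicing such as self.M[1,:] returns a view onto M rather than a copy, so writing through the returned array changes M itself. If the attribute-style syntax from the question is preferred, Python's built-in property is one way to get it. The following is only a minimal sketch: the name second_row and the None default for M are illustrative choices, not something taken from the question.

```python
import numpy as np

class Trajectory(object):
    def __init__(self, M=None):
        # Illustrative default: create a fresh array per instance.
        self.M = np.random.random((4, 4)) if M is None else M

    @property
    def second_row(self):
        # Basic slicing returns a view, so the caller can also write through it.
        return self.M[1, :]

    @second_row.setter
    def second_row(self, new_data):
        # Copies new_data into the existing row of M.
        self.M[1, :] = new_data

t = Trajectory(np.arange(16.0).reshape(4, 4))
values = t.second_row        # read through the property
t.second_row = values + 1    # write through the same name
```

Because the getter hands back a view, values keeps tracking the row after the assignment; calling t.second_row.copy() would give an independent snapshot instead.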

Related

Attempting to use np.insert in a created class which has subscripts yields "object does not support item assignment" debug

I have defined my own class which takes in any matrix and is defined in such a way to convert this matrix into three numpy arrays inside a parenthesis (which I assume means it's a tuple). Furthermore, I have added a getitem method which allows output arrays to be subscript-able just like normal arrays.
My class is called MatrixConverter, and say x is some random matrix, then:
q=MatrixConverter(x)
Where q gives:
q=(array[1,2,3,4],array[5,6,7,8],array[9,10,11,12])
(Note that this is just an example, it does not produce three arrays with consecutive numbers)
Then, for example, by my getitem method, it allows for:
q[0] = array[1,2,3,4]
q[0][1] = 2
Now, I'm attempting to design a method to add an element into one of the arrays using the np.insert function such as the following:
class MatrixConverter:
    # some code here
    def __change__(self, n, x):
        self[1] = np.insert(self[1], n, x)
        return self
Then, my desired output for the case where n=2 and x=70 is the following:
In:q.__change__(2,70)
Out:(array[1,2,3,4],array[5,6,70,7,8],array[9,10,11,12])
However, this gives me a TypeError: 'MatrixConverter' object does not support item assignment.
Any help/debugs? Should I perhaps use np.concatenate instead?
Thank you!
Change your method to:
def __change__(self, n, x):
    temp = np.insert(self[1], n, x)
    self[1] = temp
    return self
This will help you distinguish between a problem with the insert and a problem with the self[1] = ... setting.
I don't think the problem is with the insert call, but you need to write code that doesn't confuse you on such matters. That's a basic part of debugging.
Beyond that, you haven't given us enough code to help you. For example, what does your __getitem__ method look like?
Expressions like array[1,2,3,4] tell me that you aren't actually copying from your code. That's not a valid Python expression, or array display.
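The error message itself hints at a likely cause: Python raises "object does not support item assignment" when obj[i] = value is used on an instance whose class does not define __setitem__. Since the original class was not posted, the following is only a hypothetical stand-in that stores the rows in a list and defines both __getitem__ and __setitem__. The method name insert_into is also an assumption, chosen because double-underscore names such as __change__ are by convention reserved for Python itself.

```python
import numpy as np

class MatrixConverter(object):
    """Hypothetical stand-in: stores the rows of a matrix as separate arrays."""

    def __init__(self, matrix):
        # A list rather than a tuple, so individual entries can be replaced.
        self._arrays = [np.array(row) for row in matrix]

    def __getitem__(self, index):
        return self._arrays[index]

    def __setitem__(self, index, value):
        # Without this method, `self[1] = ...` raises
        # TypeError: 'MatrixConverter' object does not support item assignment
        self._arrays[index] = value

    def insert_into(self, which, n, x):
        # np.insert returns a new array; it does not modify its input in place.
        self[which] = np.insert(self[which], n, x)
        return self

q = MatrixConverter([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])
q.insert_into(1, 2, 70)
print(q[1])
```

If the real class keeps its arrays in a tuple, the assignment inside __setitem__ would fail as well, because tuples are immutable; converting to a list as above is one way around that.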

python / pandas - MultiIndexing - eliminate the use of global variables

I am using pandas to import a dataframe from excel in order to sort, make changes and run some simple addition and division on the data.
My code is working but it has global variables throughout. I think this is poor practice and I want to somehow eliminate these global variables but I am confused on how I can go about doing this.
I'm not sure how I can further modify my dataframe with indexing and slicing without declaring global variables.
df = pd.read_excel('data.xlsx')
new_indexes = df.set_index(['apple', 'cherry', 'banana'])
new_indexes['apples and cherries'] = new_indexes['apple'] + new_indexes['cherries']
sliced = new_indexes.loc(axis = 0)[pd.IndexSlice[:, 'fruits']]
total_fruits = sliced.loc[:, ['grapes', 'watermelon', 'orange']].sum(axis=1)
That's a snippet of my code. As you can see I am referring to the global variables in order to further modify my dataframe. I need to eliminate the global variables. I am trying to create functions to help clean up my code.
My main question is how can I refer to my data and changes without assigning global variables to my code?
If I wanted to go about defining a class and reassigning the variables to properties would I be able to do something like this?
class MyDf:
    def __init__(self):
        pass
    def get_df(self):
        return pd.read_excel('data.xlsx')
    def set_index(self):
        self._multi_index = df.set_index(['apple', 'cherry', 'banana'])
    def add_totals(self):
        self.set_indexes['apples and cherries'] = set_indexes['apple'] + new_indexes['cherries']
Thank you
There are several things you could do, depending on the overall structure of your code and your goal. Without knowing more about your case and, for example, seeing how the snippet you provided is embedded into the rest of your code, these are only possible solutions.
You could define a function, make it take a dataframe as an argument, perform operations on it and then return the modified dataframe. The function could also simply take a filename as an argument, so that the respective df is created within the function to begin with. If you do not need to refer to intermediary variables such as new_indexes or sliced later in the code, using a function to perform the operations might be a good way to go.
You could also define a class, make the variables into attributes of objects of that class and write methods to perform the respective operations you want to do. This would have the advantage that you could still access your variables, if necessary.
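As a rough illustration of the function-based option, the sketch below passes the dataframe in and returns a new one, so intermediate results only exist as local variables. The function names, the small in-memory frame and its values are assumptions made so that the example is self-contained; in the question the frame would come from pd.read_excel('data.xlsx').

```python
import pandas as pd

def add_total(df, columns, total_name):
    """Return a copy of df with an extra column holding the row-wise sum."""
    result = df.copy()
    result[total_name] = result[columns].sum(axis=1)
    return result

def build_report(df):
    # Intermediate results are local names and disappear when the function returns.
    with_total = add_total(df, ['apple', 'cherry'], 'apples and cherries')
    return with_total.set_index('banana')

# Hypothetical data standing in for the Excel sheet.
raw = pd.DataFrame({'apple': [1, 2], 'cherry': [10, 20], 'banana': ['a', 'b']})
report = build_report(raw)
print(report)
```

The caller only ever sees raw and report, and raw is left unchanged because add_total works on a copy.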

Generating random numbers for a probability density function in Python

I'm currently working on a project relating to Brownian motion, and trying to simulate some of it using Python (a language I'm admittedly very new to). Currently, my goal is to generate random numbers following a given probability density function. I've been trying to use the scipy library for it.
My current code looks like this:
>>> import math
>>> import scipy.stats as st
>>> class my_pdf(st.rv_continuous):
...     def _pdf(self, x, y):
...         return (1/math.sqrt(4*t*D*math.pi))*(math.exp(-((x**2)/(4*D*t))))*(1/math.sqrt(4*t*D*math.pi))*(math.exp(-((y**2)/(4*D*t))))
...
>>> def get_brown(a, b):
...     D, t = a, b
...     return my_pdf()
...
>>> get_brown(1, 1)
<__main__.my_pdf object at 0x000000A66400A320>
All attempts at launching the get_brown function end up giving me these hexadecimals (always starting at 0x000000A66400A with only the last three digits changing, no matter what parameters I give for D and t). I'm not sure how to interpret that. All I want is to get random numbers following the given PDF; what do these hexadecimals mean?
What you see is the default representation of the object you have created, which includes its memory address. Now you might ask: which object? Your function get_brown ends with return my_pdf(), which creates an instance of the class my_pdf and returns it. If you want to access the _pdf method of that instance and calculate the value of the pdf, you can use this code:
get_brown(1,1)._pdf(x, y)
On the object you have just created you can also use all methods of the scipy.stats.rv_continuous class, which are listed in the SciPy documentation.
For your situation you could also discard your current code and just use the normal distribution included in scipy as Brownian motion is mainly a Normal random process.
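To make that suggestion concrete: the density in the question factorises into two identical one-dimensional Gaussians, each with mean 0 and variance 2*D*t, so the x and y displacements can be drawn independently from a normal distribution. The sketch below uses NumPy's random generator in place of scipy.stats.norm; the function name and the seed parameter are illustrative assumptions.

```python
import math
import numpy as np

def sample_displacements(D, t, n, seed=None):
    """Draw n two-dimensional displacements for diffusion coefficient D and time t."""
    rng = np.random.default_rng(seed)
    # exp(-x**2 / (4*D*t)) is a Gaussian with variance 2*D*t.
    sigma = math.sqrt(2.0 * D * t)
    # One row per sample; column 0 is x, column 1 is y.
    return rng.normal(loc=0.0, scale=sigma, size=(n, 2))

steps = sample_displacements(D=1.0, t=1.0, n=10000, seed=0)
print(steps[:3])
```

Passing a seed makes the draws reproducible; leaving it as None gives different numbers on every run.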
As noted, this is a memory location. Your function get_brown gets an instance of the my_pdf class, but doesn't evaluate the method inside that class.
What you probably want to do is call the _pdf method on that instance, rather than return the instance itself.
def get_brown(a,b):
    D,t = a,b # what is D,t for?
    return my_pdf()._pdf(a,b)
I expect that the code you've posted is a simplification of what you're really doing, but functions don't need to be inside classes, so the _pdf function could live on its own. Alternatively, you don't need to use the get_brown function: just instantiate the my_pdf class and call the calculation method.

Substitute numpy functions with Python only

I have a python function that employs the numpy package. It uses numpy.sort and numpy.array functions as shown below:
def function(group):
    pre_data = np.sort(np.array(
        [c["data"] for c in group[1]],
        dtype = np.float64
    ))
How can I re-write the sort and array functions using only Python in such a way that I no longer need the numpy package?
It really depends on the code after this. pre_data will be a numpy.ndarray, which means that it has array methods which will be really hard to replicate without numpy. If those methods are being called later in the code, you're going to have a hard time and I'd advise you to just bite the bullet and install numpy. Its popularity is a testament to its usefulness...
However, if you really just want to sort a list of floats and put it into a sequence-like container:
def function(group):
    pre_data = sorted(float(c['data']) for c in group[1])
should do the trick.
Well, it's not strictly possible because the return type is an ndarray. If you don't mind using a list instead, try this:
pre_data = sorted(float(c["data"]) for c in group[1])
That's not actually using any useful numpy functions anyway:
def function(group):
    pre_data = sorted(float(c["data"]) for c in group[1])
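For a quick check of the pure-Python version, here is a small runnable sketch. The function name sorted_data and the sample group are hypothetical; the only assumption carried over from the question is that group[1] is an iterable of mappings with a "data" key. The standard-library array module is shown as one option if a compact, typed container of doubles is wanted instead of a list.

```python
from array import array

def sorted_data(group):
    """Return the 'data' values of group[1] as a sorted list of floats."""
    return sorted(float(c["data"]) for c in group[1])

# Hypothetical input: a (key, records) pair.
group = ("sensor-a", [{"data": "3.5"}, {"data": 1}, {"data": 2.25}])
pre_data = sorted_data(group)
print(pre_data)  # [1.0, 2.25, 3.5]

# Optional: pack the values into a typed array of C doubles.
packed = array("d", pre_data)
```

Neither the list nor the array offers ndarray methods such as mean() or reshape(), so any later code relying on those would still need changes.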

Which is more Pythonic way?

I want to write a function to create an empty square matrix of size NxN.
I have 2 ways to write this:
1:
s_matrix = []
create_empty_square_matrix(s_matrix, N)
2:
s_matrix = empty_square_matrix(N)
(Of course, the two functions will differ a bit. Function create_empty_square_matrix is like a procedure: it only manipulates s_matrix. Function empty_square_matrix creates and returns a matrix.)
Which way is more Pythonic & clearer?
Do you have some suggestions about naming style? I'm not sure about empty_square_matrix & create_empty_square_matrix.
I'd always prefer the second way.
The problem with the first is that you pass the object that you want to write to as the parameter (s_matrix), and the caller of the function will have to know that it must be passed an empty list. What happens if the caller passes a dict, or a list that is not empty?
By the way, if you want to do matrix calculations, you should take a look at the NumPy library, it offers many things that standard Python does not.
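A minimal sketch of the second style in plain Python could look like this. The fill parameter and its None default are assumptions, since the question does not say what an "empty" cell should contain.

```python
def empty_square_matrix(n, fill=None):
    """Return a new n-by-n matrix as a list of n independent row lists."""
    # The comprehension builds a fresh row each time. [[fill] * n] * n would
    # repeat one row object n times, so a write to one row would show in all.
    return [[fill] * n for _ in range(n)]

s_matrix = empty_square_matrix(3, fill=0)
s_matrix[0][0] = 1
print(s_matrix)  # [[1, 0, 0], [0, 0, 0], [0, 0, 0]]
```

The caller gets back a value it can name however it likes, and there is no precondition on what was passed in beyond the size.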
