I'm fairly new to some of these concepts. Can someone briefly explain the difference between these two pieces of code?
regressor = LinearRegression()
regressor.fit(train_X, train_Y)

versus

LinearRegression().fit(train_X, train_Y)
The main difference between the two is that the first creates a variable called regressor which you can later access. The second doesn't do this.
Otherwise the two are doing exactly the same thing.
The purpose of fitting (training) the regressor is to use it later for prediction. In your second example (LinearRegression().fit(train_X, train_Y)) you create an anonymous regressor, train it, and then immediately discard it. You cannot use it afterwards because no reference to it remains.
In the first example, you create a regressor, assign it to a variable, and then train it. You can later use it for prediction or any other purpose.
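As a minimal sketch (assuming scikit-learn; the arrays below are placeholders, not your data):

from sklearn.linear_model import LinearRegression
import numpy as np

# Placeholder training data, shapes chosen for illustration only
train_X = np.array([[1.0], [2.0], [3.0]])
train_Y = np.array([2.0, 4.0, 6.0])

# First form: keep a reference, so the fitted model can be reused later
regressor = LinearRegression()
regressor.fit(train_X, train_Y)
print(regressor.predict([[4.0]]))   # the trained model is still reachable

# Second form: the fitted model has no name and is discarded right away
LinearRegression().fit(train_X, train_Y)
# ...there is nothing left to call predict() on afterwards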
import math

def mean(x):
    return sum(x) / len(x)

def variance(x):
    # Sample variance with Bessel's correction (n - 1 in the denominator)
    x_mean = mean(x)
    return sum((xi - x_mean) ** 2 for xi in x) / (len(x) - 1)

def standard_deviation(x):
    return math.sqrt(variance(x))
Each of the functions above builds on the previous one. What is a good way to structure this in Python? Should I put these functions in a class, or are there other options?
Because they are widely applicable, keep them as they are
Many parts of a program may need to calculate these statistics, and it is less verbose to call them directly than to reach into a class for them. Moreover, the functions don't need any class-stored data: they would simply be static methods of a class (which, in the old days, we would have just called "functions"!).
If they needed to store internal information to work correctly, that is a good reason to put them into a class
The advantage in that case is that it is more obvious to the programmer what information is being shared. Moreover, you might want to create two or more instances that had different sets of shared data. That is not the case here.
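As a purely illustrative sketch (the Sample class and its data below are hypothetical, not something the original code requires), wrapping the same statistics in a class mainly buys you per-instance shared data:

import math

class Sample:
    def __init__(self, values):
        self.values = list(values)          # the shared data each instance stores

    def mean(self):
        return sum(self.values) / len(self.values)

    def variance(self):
        m = self.mean()
        return sum((v - m) ** 2 for v in self.values) / (len(self.values) - 1)

    def standard_deviation(self):
        return math.sqrt(self.variance())

# Two instances can hold different data sets side by side:
a = Sample([1, 2, 3, 4])
b = Sample([10, 20, 30])
print(a.standard_deviation(), b.standard_deviation())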
Every machine learning tutorial I have found trains an algorithm on a dataset that has target values and then measures how accurate the algorithm is by checking its predictions on a test set.
What if you then receive all of the data except the target value, and you want to predict target values to see whether they come true in the future? Every tutorial I have seen works with data where the future target values are already known.
A decision tree is a supervised algorithm. That means you must have target values (or labels) to build the tree (nodes are split according to an information-gain rule). Once the tree has been trained on labelled data, you can apply it to new rows that have no target value to obtain predictions.
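As a minimal sketch (assuming scikit-learn; the arrays are placeholders), you train on labelled data and then call predict on new rows that have no target values:

from sklearn.tree import DecisionTreeClassifier
import numpy as np

X_train = np.array([[0, 0], [1, 1], [0, 1], [1, 0]])   # features with known labels
y_train = np.array([0, 1, 1, 0])                        # the known target values

model = DecisionTreeClassifier().fit(X_train, y_train)

X_future = np.array([[1, 1], [0, 0]])    # future data: features only, no targets
print(model.predict(X_future))           # predictions to check against reality later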
My data was modelled with a Cox regression in R; however, I would like to use this model in a Python GUI, as my knowledge of R is very limited. That way non-coders would be able to 'predict' survival rates based on our model.
What is the best way to use this model (a combination of 3 different regressions) in Python?
Do you want to predict values based on your estimates?
In this case you can just copy the R outputs into Python and apply the respective procedures there.
Do you want the user to be able to run "your R regression pipeline" from within Python?
There are Python libraries that help with that. I find this source a useful start.
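As a hypothetical sketch of the first option (all coefficient names, numbers, and baseline survival values below are placeholders, not your actual model): copy the estimated Cox coefficients and a baseline survival curve out of R, then compute predictions in plain Python.

import math

# Coefficients exported from the R Cox fit (e.g. as printed by coef(fit) in R)
beta = {"age": 0.03, "treatment": -0.55}

# Baseline survival S0(t) exported from R; how it is centred depends on how it
# was produced in R, so check that before trusting the numbers.
baseline_survival = {12: 0.90, 24: 0.78, 36: 0.65}   # time -> S0(t)

def predict_survival(covariates, time):
    # Cox model: S(t | x) = S0(t) ** exp(x . beta)
    linear_predictor = sum(beta[name] * value for name, value in covariates.items())
    return baseline_survival[time] ** math.exp(linear_predictor)

print(predict_survival({"age": 60, "treatment": 1}, time=24))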
I am wondering what I should do for the purposes of my project.
I am going to operate on about 100,000 rows each time.
What I wanted to do is create a nested dictionary object ("{}") and then, if I need to look up a value, simply index into it, for example:
data['2018']['09']['Marketing']['AccountName']
The second option is to pull everything into an array ("[]") and, whenever I need a value, use a function that goes through the array and sums the numbers for the specific parameters.
But I don't know which method is faster.
I would be thankful if you could shed some light on this.
Thanks in advance.
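Here is a minimal sketch of the two options I am considering (field names like "AccountName" and the amounts are just placeholders):

# Option 1: nested dictionary -- a direct lookup, roughly constant time per access
data = {"2018": {"09": {"Marketing": {"AccountName": 1250.0}}}}
value = data["2018"]["09"]["Marketing"]["AccountName"]

# Option 2: a flat list of rows -- every query scans all ~100,000 rows
rows = [
    {"year": "2018", "month": "09", "dept": "Marketing", "account": "AccountName", "amount": 1250.0},
    # ... roughly 100,000 rows
]

def sum_for(year, month, dept, account):
    return sum(r["amount"] for r in rows
               if r["year"] == year and r["month"] == month
               and r["dept"] == dept and r["account"] == account)

print(value, sum_for("2018", "09", "Marketing", "AccountName"))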
If performance (speed) is an issue, Python might not be the ideal choice...
Otherwise:
Might I suggest the use of a proper database, such as SQLite (which ships with Python via the sqlite3 module), and maybe SQLAlchemy as an abstraction layer (https://docs.sqlalchemy.org/en/latest/orm/tutorial.html)?
After all, they were made for exactly this kind of task.
If that seems overkill: Have a look at Pandas.
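For example, here is a minimal pandas sketch under the same assumed row layout (the column names and values are placeholders):

import pandas as pd

df = pd.DataFrame([
    {"year": "2018", "month": "09", "dept": "Marketing", "account": "AccountName", "amount": 1250.0},
    {"year": "2018", "month": "09", "dept": "Marketing", "account": "AccountName", "amount": 310.0},
])

mask = ((df["year"] == "2018") & (df["month"] == "09") &
        (df["dept"] == "Marketing") & (df["account"] == "AccountName"))
print(df.loc[mask, "amount"].sum())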
I am working on a pattern recognition program using R/Python. What would be the best way to compare two or more figures and identify/recognize similar or duplicate figures based on pattern recognition?
There are lots of papers on the internet that explain how to extract and process features in a fingerprint. For instance: http://www.cse.unr.edu/~bebis/CS790Q/PaperPresentations/MinutiaeDetection.pdf
Then you can use whatever classifier you want, such as a support vector machine.
If you need more ideas, you can visit http://dermatoglyphics.org/11-basic-patterns-of-fingerprint/ for an overview of the basic fingerprint patterns.
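As a rough sketch of that pipeline (extract_features below is a hypothetical placeholder, not a real library function): turn each figure into a fixed-length feature vector, then train a support vector machine on labelled examples.

import numpy as np
from sklearn.svm import SVC

def extract_features(image):
    # Placeholder: a real implementation would compute minutiae-based or other
    # descriptors (see the paper above); here the image is simply flattened.
    return np.asarray(image, dtype=float).ravel()

# Tiny fake labelled "figures", purely for illustration
images = [np.zeros((4, 4)), np.ones((4, 4))]
labels = [0, 1]

X = np.stack([extract_features(img) for img in images])
clf = SVC(kernel="rbf").fit(X, labels)
print(clf.predict([extract_features(np.ones((4, 4)))]))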