Python/Numpy: Division gives me an unexpected deprecation-Warning - python

I'm reading data from a CSV file, looping over it, and dividing each value by the mean to normalize it, but I get a warning. The code is:
import numpy as np
from numpy import genfromtxt

A = genfromtxt("train.txt", delimiter=';', skip_header=1)
lowid = A[:,1].min(axis=0)
highid = A[:,1].max(axis=0)
X = []
Y = []
for i in np.arange(lowid, highid):
    I = A[A[:,1] == i][:, [0,2,3]]
    meanp = np.mean(I[:,1]);
    meanq = np.mean(I[:,2]);
    for j in np.arange(I[:,0].min(axis=0)+2, I[:,0].max(axis=0)):
        weekday = int(I[j,0]) % 7
        # NORMALIZE:
        P = I[j,1] / meanp
        pP = I[j-1,1] / meanp
        ppP = I[j-2,1] / meanp
        X.append([weekday, P, pP, ppP])
        Y.append(I[j,2])
The train.txt file looks like this:
day;itemID;price;quantity
1;1;4.73;6
1;2;7.23;0
1;3;10.23;1
1;4;17.9;0
1;5;1.81;1
1;6;12.39;1
1;7;7.17;1
1;8;7.03;0
1;9;13.61;0
1;10;36.45;1
1;11;24.67;0
1;12;12.04;0
1;13;11.85;0
The warnings:
weekday = int(I[j,0]) % 7
DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future
P = I[j,1] / meanp
DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future
pP = I[j-1,1] / meanp
DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future
ppP = I[j-2,1] / meanp
DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future
Y.append(I[j,2])
DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future
What is the problem?
Thanks
EDIT: OK, I found a quick fix myself.
j has to be of integer type. I fixed it like this:
for j in range(int(I[:,0].min(axis=0))+2, int(I[:,0].max(axis=0))):
Is this a good solution? I'm new to Python...

OK, I found a quick fix myself: j has to be of integer type, because it is used as an array index.
I fixed it like this:
for j in range(int(I[:,0].min(axis=0))+2, int(I[:,0].max(axis=0))):
using Python's built-in range function, or by explicitly setting the data type for arange like this (thanks @Davidmh). Note that np.int was only an alias for the built-in int and was removed in NumPy 1.24, so pass int directly:
for j in np.arange(I[:,0].min(axis=0)+2, I[:,0].max(axis=0), dtype=int):
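To make the fix concrete, here is a minimal sketch. The small array below is made up for illustration and is not taken from train.txt; it only stands in for the rows of a single item. It shows that np.arange over float bounds produces float indices, while converting the bounds with int() first gives plain integers that are safe to index with.

```python
import numpy as np

# Hypothetical stand-in for the rows of one item: day, price, quantity.
# genfromtxt returns floats, so every column here is float64.
I = np.array([[1.0, 4.73, 6.0],
              [2.0, 4.80, 3.0],
              [3.0, 4.60, 5.0],
              [4.0, 4.90, 2.0],
              [5.0, 4.70, 4.0]])

# Float bounds give a float array, so each j would be a float index.
float_indices = np.arange(I[:, 0].min() + 2, I[:, 0].max())
print(float_indices.dtype)  # float64

# Converting the bounds first gives ordinary Python ints.
start = int(I[:, 0].min()) + 2
stop = int(I[:, 0].max())
meanp = I[:, 1].mean()

X = []
for j in range(start, stop):
    weekday = int(I[j, 0]) % 7
    X.append([weekday, I[j, 1] / meanp, I[j - 1, 1] / meanp, I[j - 2, 1] / meanp])

print(len(X))  # 2
```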

Related

Converting list from string to int but there's a catch

I'll start off by saying that I don't know much about programming and I tried searching for answers but I didn't even know what to type in the search engine. So here goes.
class Point:
    def __init__ (self, x, y):
        self.x = x
        self.y = y
    def __str__ (self):
        return "Members are: %s, %s" % (self.x, self.y)
I have this class which represents a point with its x and y coordinate.
I have a list points = [], and if I manually append a point to that list, e.g. points.append(Point(-1.0, 3)), the output returns (-1.0, 3). I'm doing some calculations with these points, but I don't think the code for that matters here.
Things get tricky because I have to read the numbers from a file. I already added them to another list and appended them using a loop. The problem is that the list contains strings, and if I convert them to int I get an error because of the decimal .0. My assignment says that I have to keep the same format as the input.
What I don't understand is how it keeps the decimal .0 when I enter it like this: points.append(Point(-1.0, 3)). Is it possible to get the same output format with numbers from a file?
I tried converting it to float but then all the coordinates get decimal places.
You can use this code to convert the inputs appropriately. With this try/except mechanism, we first try int and, if that fails, fall back to float.
def float_or_int(inp):
    try:
        return int(inp)
    except ValueError:
        pass
    try:
        return float(inp)
    except ValueError:
        # Neither an int nor a float: report it and return None explicitly
        print("it's not int or float")
        return None
input_1 = '10.3'
input_2 = '10.0'
input_3 = '10'
res1 = float_or_int(input_1)
res2 = float_or_int(input_2)
res3 = float_or_int(input_3)
print(res1, type(res1)) # 10.3 <class 'float'>
print(res2, type(res2)) # 10.0 <class 'float'>
print(res3, type(res3)) # 10 <class 'int'>
I don't know how your inputs are stored in the file or in the other list you are reading from, but this should give you the idea of how to parse a single input.
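As a rough sketch of how the same idea could be applied to a whole file, assume each line holds one point as two whitespace-separated numbers such as "-1.0 3". That layout is an assumption, since the question does not show the file, and parse_number is a hypothetical helper name.

```python
def parse_number(text):
    """Return an int when the text is a whole-number literal, else a float."""
    try:
        return int(text)
    except ValueError:
        return float(text)

# Stand-in for lines read from the input file (assumed format: "x y").
lines = ["-1.0 3", "2 4.5"]

points = [tuple(parse_number(token) for token in line.split()) for line in lines]
print(points)  # [(-1.0, 3), (2, 4.5)]
```

Because "-1.0" cannot be parsed by int(), it stays a float and prints with its .0, while "3" stays an int, which matches the format of the input.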
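You could use this, assuming points holds (x, y) pairs of strings:
proper_points = []
for x,y in points:
    float_x = float(x)
    int_y = int(y)
    # append the converted values, not the original strings
    proper_points.append((float_x, int_y))
For your calculations, you could use the proper_points list instead of points.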
Man, do not reinvent the wheel. If you need to import data from a file, you can use numpy, for example, and its loadtxt function: https://numpy.org/doc/stable/reference/generated/numpy.loadtxt.html
I do not understand how your file is laid out or what format the coordinates have. In the snippet points.append(Point(-1.0, 3)), the first number is a float and the second one is an integer. Which format do you want?
For example in file.dat you have the x,y positions:
1 1
2 3
4 5
where the first column is the x position and the second one is the y position. Then you can use this code:
import numpy as np
points = np.loadtxt('file.dat', dtype = 'int')
points then holds all the positions in the file, and you can use slicing to access them.
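A short sketch of the slicing step, using an in-memory io.StringIO buffer in place of file.dat so that it runs without a file on disk (the buffer is only a stand-in for illustration):

```python
import io
import numpy as np

# Stand-in for file.dat so the example runs without a file on disk.
data = io.StringIO("1 1\n2 3\n4 5\n")
points = np.loadtxt(data, dtype=int)

xs = points[:, 0]  # first column: x positions
ys = points[:, 1]  # second column: y positions
print(xs.tolist())  # [1, 2, 4]
print(ys.tolist())  # [1, 3, 5]
```

Note that a single dtype applies to the whole array here, so this approach does not preserve a mix of int and float values per coordinate.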

KeyError with a poisson process using pandas

I am trying to create a function which will simulate a Poisson process for a changeable dt and total time, and have the following:
def compound_poisson(lamda,mu,sigma,dt,T):
    points = pd.Series(0)
    out = pd.Series(0)
    inds = simple_poisson(lamda,dt,T)
    for ind in inds.index:
        if inds[ind+dt] > inds[ind]:
            points[ind+dt] = np.random.normal(mu,sigma)
        else:
            points[ind+dt] = 0
    out = out.append(np.cumsum(points),ignore_index=True)
    out.index = np.linspace(0,T,int(T/dt + 1))
    return out
However, I receive a "KeyError: 0.010000000000000002", which should not be in the index at all. Is this a result of being lax with float objects?
In short, yes, it's a floating point error. It's quite hard to know how you got there, but probably something like this:
>>> 0.1 * 0.1
0.010000000000000002
Maybe use round?
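Here is a minimal sketch of the rounding idea, under the assumption that the time grid is built from dt as in the question (the values of dt and T are made up). The index is derived from integer step counts and rounded once, and any computed key is rounded the same way before the lookup.

```python
import numpy as np
import pandas as pd

dt, T = 0.01, 0.05
n_steps = int(round(T / dt))

# The product of two floats is not exactly the decimal value one expects.
product = 0.1 * 0.1
print(product == 0.01)             # False
print(round(product, 10) == 0.01)  # True

# Build the time index from integer step counts and round it once.
times = np.round(np.arange(n_steps + 1) * dt, 10)
series = pd.Series(np.zeros(n_steps + 1), index=times)

# Round any computed key the same way before using it as a label.
key = round(product, 10)
series.loc[key] = 1.5
print(series.loc[0.01])  # 1.5
```

An alternative design is to index by the integer step number throughout and convert to time only when presenting results, which avoids float labels entirely.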

How to use the R 'with' operator in rpy2

I am doing an ordinal logistic regression, and following the guide here for the analysis: R Data Analysis Examples: Ordinal Logistic Regression
My dataframe (consult) looks like:
n raingarden es_score consult_case
garden_id
27436 7 0 3 0
27437 1 0 0 1
27439 1 1 1 1
37253 1 0 3 0
37256 3 0 0 0
I am at the part where I need to create a graph to test the proportional odds assumption, with the command in R as follows:
(s <- with(dat, summary(es_score ~ n + raingarden + consult_case, fun=sf)))
(es_score is an ordinal ranked score with values between 0 - 4; n is an integer; raingarden and consult_case, binary values of 0 or 1)
I have the sf function:
sf <- function(y) {
c('Y>=1' = qlogis(mean(y >= 1)),
'Y>=2' = qlogis(mean(y >= 2)),
'Y>=3' = qlogis(mean(y >= 3)))
}
in a utils.r file that I access as follows:
from rpy2.robjects.packages import STAP
with open('/file_path/utils.r', 'r') as f:
    string = f.read()
sf = STAP(string, "sf")
And want to do something along the lines of:
R = ro.r
R.with(work_case_control, R.summary(formula, fun=sf))
The major problem is that the R with operator is seen as a Python keyword, so that even if I access it with ro.r.with it is still recognized as a Python keyword. (As a side note: I tried using R's apply method instead, but got the error TypeError: 'SignatureTranslatedAnonymousPackage' object is not callable ... I assume this is referring to my function sf?)
I also tried using the R assignment methods in rpy2 as follows:
R('sf = function(y) { c(\'Y>=1\' = qlogis(mean(y >= 1)), \'Y>=2\' = qlogis(mean(y >= 2)), \'Y>=3\' = qlogis(mean(y >= 3)))}')
R('s <- with({0}, summary(es_score~raingarden + consult_case, fun=sf)'.format(consult))
but ran into issues where the dataframe column names were somehow causing the error: RRuntimeError: Error in (function (file = "", n = NULL, text = NULL, prompt = "?", keep.source = getOption("keep.source"), :
<text>:1:19: unexpected symbol
1: s <- with( n raingarden
I could of course do this all in R, but I have a very involved ETL script in Python, and would thus prefer to keep everything in Python using rpy2 (I did try using mord for scikit-learn to run my regression, but it is pretty primitive).
Any suggestions would be most welcome right now.
EDIT
I tried various combinations of @Parfait's suggestions, and qualifying the fun argument is syntactically incorrect, as per the PyCharm interpreter. It doesn't matter what the qualifier is, either; I always get the error SyntaxError: keyword can't be an expression.
On the other hand, with no qualifier, there is no syntax error, but I do get the error TypeError: 'SignatureTranslatedAnonymousPackage' object is not callable when using the function sf as obtained:
from rpy2.robjects.packages import STAP
with open('/Users/gregsilverman/development/python/rest_api/rest_api/scripts/utils.r', 'r') as f:
    string = f.read()
sf = STAP(string, "sf")
With that in mind, I created a package in R with the function sf, imported it, and tried various combos with the only one producing no error, being: print(base._with(consult_case_control, R.summary(formula, fun=gms.sf))) (gms is a reference to the package in R I made).
The output though makes no sense:
Length Class Mode
3 formula call
I am expecting a table ala the one on the UCLA site. Interesting. I am going to try recreating my analysis in R, just for the heck of it. I still would like to complete it in python though.
Consider bracketing the with call and be sure to qualify all arguments including fun:
ro.r['with'](work_case_control, ro.r.summary(formula, ro.r.summary.fun=sf))
Alternatively, import R's base package. To avoid a conflict with Python's with keyword, translate the R name:
from rpy2.robjects.packages import importr
base = importr('base', robject_translations={'with': '_with'})
base._with(work_case_control, ro.r.summary(formula, ro.r.summary.fun=sf))
And be sure to properly create your formula. Consider using as.formula from R's stats package to build it from a string. Notice that another translation is needed here because of a naming conflict:
stats = importr('stats', robject_translations={'format_perc': '_format_perc'})
formula = stats.as_formula('es_score ~ n + raingarden + consult_case')

How to assign a variable a float times an integer

Q1 should be able to contain 1.602*10^-19 and Q2: -1.602*10^-19
Instead it gives me a value error: ValueError: invalid literal for float().
What am I doing wrong? I am a beginner, by the way.
import os
Clear = lambda: os.system("cls")
Clear()
Q1 = float(raw_input("What's Q1?\n"))
Q2 = float(raw_input("What's Q2?\n"))
r = float(raw_input("What's radius?\n"))
def calc(Q1, Q2, r):
    k = 8.99*10**9
    return((k((Q1) * Q2))/r**2)
print(calc(Q1, Q2, r))
You did not say what input you are using or what line you are getting the error, so I am going to assume you are trying to do float("1.602*10^-19").
This is not a valid argument; you need to use a different notation to meet the required format:
float("1.602e-19")
Did you input 1.602*10^-19?
If so, please note the correct format is 1.602e-19
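A minimal sketch of the whole calculation with input in e-notation follows. The names parse_charge and coulomb_force are hypothetical, and the inputs are hard-coded strings rather than read with raw_input. It also multiplies by the constant explicitly, since the expression k((Q1) * Q2) in the question would try to call the float k and fail with a TypeError once the input parsing succeeds.

```python
K = 8.99e9  # same value as 8.99*10**9 in the question

def parse_charge(text):
    """Hypothetical helper: convert input such as '1.602e-19' to a float."""
    return float(text)

def coulomb_force(q1, q2, r):
    # Multiply explicitly; writing K(...) would try to call the float K.
    return K * q1 * q2 / r**2

q1 = parse_charge("1.602e-19")
q2 = parse_charge("-1.602e-19")
force = coulomb_force(q1, q2, 1.0)
print(force < 0)  # True: charges of opposite sign give a negative value

# The notation from the question is not accepted by float().
try:
    float("1.602*10^-19")
    rejected = False
except ValueError:
    rejected = True
print(rejected)  # True
```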

Convert Scientific Notation to Float

I encountered a problem where my JSON data gets printed in scientific notation instead of as a plain decimal number.
import urllib2
import json
import sys
url = 'https://bittrex.com/api/v1.1/public/getmarketsummary?market=btc-quid'
json_obj = urllib2.urlopen(url)
QUID_data = json.load(json_obj)
QUID_MarketName_Trex = QUID_data["result"][0]["MarketName"][4:9]
QUID_Last_Trex = QUID_data["result"][0]["Last"]
QUID_High_Trex = QUID_data["result"][0]["High"]
QUID_Low_Trex = QUID_data["result"][0]["Low"]
QUID_Volume_Trex = QUID_data["result"][0]["Volume"]
QUID_BaseVolume_Trex = QUID_data["result"][0]["BaseVolume"]
QUID_TimeStamp_Trex = QUID_data["result"][0]["TimeStamp"]
QUID_Bid_Trex = QUID_data["result"][0]["Bid"]
QUID_Ask_Trex = QUID_data["result"][0]["Ask"]
QUID_OpenBuyOrders_Trex = QUID_data["result"][0]["OpenBuyOrders"]
QUID_OpenSellOrders_Trex = QUID_data["result"][0]["OpenSellOrders"]
QUID_PrevDay_Trex = QUID_data["result"][0]["PrevDay"]
QUID_Created_Trex = QUID_data["result"][0]["Created"]
QUID_Change_Trex = ((QUID_Last_Trex - QUID_PrevDay_Trex)/ QUID_PrevDay_Trex)*100
QUID_Change_Var = str(QUID_Change_Trex)
QUID_Change_Final = QUID_Change_Var[0:5] + '%'
print QUID_Last_Trex
It prints the following value; 1.357e-05.
I need this to be a float with 8 digits after the decimal point (0.00001370).
As you can see here --> http://i.imgur.com/FCVM1UN.jpg, my GUI displays the first row correct (using the exact same code).
You are looking at the default str() formatting of floating point numbers, where scientific notation is used for sufficiently small or large numbers.
You don't need to convert this, the value itself is a proper float. If you need to display this in a different format, format it explicitly:
>>> print(0.00001357)
1.357e-05
>>> print(format(0.00001357, 'f'))
0.000014
>>> print(format(0.00001357, '.8f'))
0.00001357
Here the f format always uses fixed point notation for the value. The default precision is 6 digits; the .8 instructs the f formatter to show 8 digits instead.
In Python 3, the default string format is essentially the same as format(fpvalue, '.16g'); the g format uses either a scientific or fixed point presentation depending on the exponent of the number. Python 2 used '.12g'.
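Applied to the value from the question, the explicit formatting could look like the following sketch. The variable name last_price is made up here; in the question the value comes from the JSON response.

```python
last_price = 1.357e-05  # example value printed in the question

as_default = str(last_price)
as_fixed = format(last_price, '.8f')
as_method = '{:.8f}'.format(last_price)

print(as_default)  # 1.357e-05
print(as_fixed)    # 0.00001357
print(as_method)   # 0.00001357
```

Only the text representation changes; last_price itself remains the same float, so it can still be used in arithmetic.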
You can use print formatting:
x = 1.357e-05
print('%f' % x)
Edit:
print('%.08f' % x)
There are some approaches:
#1 float(...) + optionally round() or .format()
x = float(1.357e-05)
round(x, 6)
"{:.8f}".format(x)
#2 with decimal class
import decimal
tmp = decimal.Decimal('1.357e-05')
print('[0]', tmp)
# [0] 0.00001357
tmp = decimal.Decimal(1.357e-05)
print('[1]', tmp)
# [1] 0.0000135700000000000005188384444299032338676624931395053863525390625
decimal.getcontext().prec = 6
tmp = decimal.getcontext().create_decimal(1.357e-05)
print('[2]', tmp)
# [2] 0.0000135700
#3 with .rstrip(...)
x = ("%.17f" % n).rstrip('0').rstrip('.')
Note: there are counterparts to %f:
%f shows standard notation
%e shows scientific notation
%g picks between the two (scientific notation if the exponent is less than -4 or at least the precision, which defaults to 6; fixed-point otherwise)
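A quick comparison of the three conversions on the same value, plus two more inputs to show where %g switches notation:

```python
x = 1.357e-05

fixed = '%f' % x
scientific = '%e' % x
general_small = '%g' % x
general_plain = '%g' % 0.0001357
general_large = '%g' % 1234567.0

print(fixed)          # 0.000014
print(scientific)     # 1.357000e-05
print(general_small)  # 1.357e-05
print(general_plain)  # 0.0001357
print(general_large)  # 1.23457e+06
```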
