I wanted to make a list of 50 random numbers between 1 and 100, compute the square and the square root of each, and put them into a pd.DataFrame (the random numbers are the row index and the two other lists are the columns).
My code looks like this:
import random
import math
import numpy as np
import pandas as pd
y = random.sample(range(1, 100), 50)
y1 = [f**2 for f in y]
y2 = [round(math.sqrt(f2)) for f2 in y]
whole = y + y1 + y2
whole2 = (np.array(whole)).reshape((50,3))
df = pd.DataFrame([whole2[:,1:]], index=whole2[:,0], columns=['Number', 'Square','Square Root'])
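I would appreciate it if someone could tell me where and how I went wrong. Thanks!
Your logic is a little off here. y + y1 + y2 concatenates the three lists into one flat list of 150 values, so reshaping it to (50, 3) fills each row with three consecutive random numbers instead of a number, its square and its square root. The DataFrame call also wraps the data in an extra pair of brackets and names three columns when only two are passed. I think np.stack is more intuitive here. The code below leaves y out of the stack so that the random numbers appear only once, as the index.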
import random
import math
import numpy as np
import pandas as pd
y = random.sample(range(1, 101), 50)  # range() excludes its stop value, so 101 includes 100
y1 = [f**2 for f in y]
y2 = [round(math.sqrt(f2)) for f2 in y]
whole = np.stack([y1, y2], axis=-1)
df = pd.DataFrame(whole, index=y, columns=['Square','Square Root'])
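As an alternative to stacking arrays, here is a minimal sketch that builds the same table from a dict of columns, which keeps each column next to its name. The variable name numbers and the index label "Number" are my own choices, and the square roots are left unrounded (wrap them in round() if you want the integer values from the question).

```python
import math
import random

import pandas as pd

# 50 distinct integers from 1 to 100 inclusive (range() excludes its stop value).
numbers = random.sample(range(1, 101), 50)

# Each dict key becomes a column name; the random numbers become the row index.
df = pd.DataFrame(
    {
        "Square": [n ** 2 for n in numbers],
        "Square Root": [math.sqrt(n) for n in numbers],
    },
    index=numbers,
)
df.index.name = "Number"
print(df.head())
```

Because every column is built from the same list, row i always holds the square and square root of numbers[i].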
I am very interested in using Python to extract 3-4 dimensions via canonical correlation analysis. I am pasting my very basic code below. It always defaults to extracting only two dimensions, even though each of my input arrays is 10,000+ x 3. Even if I have 4 columns in my X and Y matrices, it always gives just two dimensions. I was hoping for three, and eventually four as I add many more raw features to my X and Y arrays. I am trying to keep it simple for now. Could part of my problem also be that some of my field names have spaces in them?
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
data = r"G:\Shared drives\Data Intelligence\ZF\Segmentation/Data.csv"  # raw string so the backslashes are not read as escapes
df = pd.read_csv(data)
df.head()
print(df.columns)
X = df[['Altcurr Ext Stk Sales Cc',\
'Altcurr Ext Dss Sales Cc',\
'LBM Sales']]
X.head()
X_mc = (X-X.mean())/(X.std())
X_mc.head()
Y = df[['Primary_Supplier_0_org1',\
'Primary_Supplier_1_org2',\
'Primary_Supplier_2_TV']]
Y.head()
Y_mc = (Y-Y.mean())/(Y.std())
Y_mc.head()
from sklearn.cross_decomposition import CCA
ca = CCA()
ca.fit(X_mc, Y_mc)
X_c, Y_c = ca.transform(X_mc, Y_mc)
By default, CCA() sets n_components=2. You can check the documentation:
Parameters:
n_components int, default=2
Number of components to keep. Should be in [1, min(n_samples, n_features, n_targets)].
For your dataset, X and Y both have 3 columns, so you can go up to n_components=3. Using an example dataset:
from sklearn.datasets import make_blobs
from sklearn.cross_decomposition import CCA
X, _ = make_blobs(n_samples=10000, centers=3, n_features=6,random_state=0)
y = X[:,3:]
X = X[:,:3]
ca = CCA(n_components = 3)
ca.fit(X, y)
X_c, Y_c = ca.transform(X, y)
print(X_c.shape)  # (10000, 3)
print(Y_c.shape)  # (10000, 3)
I need a script that returns a list of random numbers from the
range -100 to +100 with a positive-to-negative ratio of 2:1. The current
code returns an arbitrary ratio:
import numpy as np
x=[]
for y in range(10):
    y=np.random.randint(-100,100)
    x.append(y)
print(x)
import numpy as np
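neg = np.random.randint(-100, 0, 10)  # 10 negatives from -100 to -1 (the upper bound is exclusive)
poz = np.random.randint(1, 101, 20)  # 20 positives from 1 to 100; starting at 1 keeps 0 out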
res = np.concatenate((neg, poz), axis=0)
print(res)
np.random.shuffle(res)#If you need to mix
print(res)
As an option.
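The same idea can be wrapped in a small function so that the 2:1 ratio holds for any size. This is only a sketch: the function name signed_sample is hypothetical, and it uses NumPy's Generator API (np.random.default_rng) instead of the legacy np.random.randint calls above.

```python
import numpy as np


def signed_sample(n_negative, rng):
    """Return n_negative negative and 2 * n_negative positive integers, shuffled."""
    neg = rng.integers(-100, 0, size=n_negative)      # -100 .. -1
    pos = rng.integers(1, 101, size=2 * n_negative)   # 1 .. 100
    values = np.concatenate((neg, pos))
    rng.shuffle(values)  # mix the signs in place
    return values


rng = np.random.default_rng()
res = signed_sample(10, rng)
print(res)
```

Zero is excluded from both draws, so every value counts as either positive or negative and the ratio is exact.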
I'm trying to store 60 values in x and the next one in y, then shift up by 1 and store the next 60 values in x and the following one in y. But the for loop only works once for the x values; the y values somehow do get stored properly. The size of my dataset is not the problem, since storing the y values works. Only the x values are not stored: when debugging, the x sets are all empty arrays except the first one. What am I doing wrong?
import math
import numpy as np
import pandas as pd
df = pd.read_csv('dataset.csv')
print(df)
dataset = df.values
data_len = math.ceil(len(dataset) * .1)
data = dataset[0:data_len , :]
print(data)
x = []
y = []
for i in range(60, len(data)):
    x.append(data[60-i:i, 0])
    y.append(data[i, 0])
I think you should try to avoid the for loop; in NumPy it is often faster to create a mask and work with that. If I understand your question correctly, you want to store 0-59 in x, then 60 in y, then 61-119 in x, then 120 in y, and so on.
If that understanding is correct, I would try this instead:
dataset = np.random.randint(1,1000,size=1000) #generating an example
mask = np.zeros(len(dataset), dtype=bool)  # plain bool works on every NumPy version; np.bool is missing in some releases
mask[60::60] = 1 #indices of the ys
y = dataset[mask]
x = dataset[~mask]
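If the intent is instead a sliding window (60 values in x, the next value in y, then shift by one), the empty arrays come from the slice start 60-i. It is 0 when i is 60, so the first window works, but it is negative afterwards: with i = 61 the slice is data[-1:61], which starts at the last row and is therefore empty. A minimal sketch of the corrected loop, using a made-up 1-D array in place of the first column of the dataset:

```python
import numpy as np

# Stand-in for the first column of the dataset (assumption: 1-D numeric data).
series = np.arange(200)
window = 60

x, y = [], []
for i in range(window, len(series)):
    x.append(series[i - window:i])  # the 60 values just before position i
    y.append(series[i])             # the value that follows them

x = np.array(x)
y = np.array(y)
print(x.shape, y.shape)  # (140, 60) (140,)
```

Each row of x is one window and the matching entry of y is the value right after it.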
I am currently stuck on a problem in which I have to generate a curve of best fit, and I am required to use a more precise x array from 250 to 1000 in steps of 10. Here is my code so far:
import numpy as np
from numpy import polyfit, polyval
import matplotlib.pyplot as plt
x = [250,300,350,400,450,500,550,600,700,750,800,900,1000]
x = np.array(x)
y = [0.791, 0.846, 0.895, 0.939, 0.978, 1.014, 1.046, 1.075, 1.102, 1.148, 1.169, 1.204, 1.234]
y= np.array(y)
r = polyfit(x,y,3)
fit = polyval(r, x)
plt.plot(x, fit, 'b')
plt.plot(x,y, color = 'r', marker = 'x')
plt.show()
If I understand correctly, you are trying to create an array of numbers from a to b by steps of c.
With pure python you can use:
list(range(a, b, c))  # in your case list(range(250, 1001, 10)); the stop value itself is excluded
Or, since you are using numpy you can directly make the numpy array:
np.arange(a, b, c)
To create an array in steps you can use numpy.arange([start,] stop[, step]):
import numpy as np
x = np.arange(250, 1001, 10)  # stop is exclusive, so 1001 keeps 1000 in the array
To generate values from 250-1000, use range(start, stop, step):
x = range(250,1001,10)
x = np.array(x)
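To get the smoother curve, the fitted polynomial then has to be evaluated on the finer array rather than on the original x. A minimal sketch using the data from the question; the names coeffs, x_fine and y_fine are my own.

```python
import numpy as np

x = np.array([250, 300, 350, 400, 450, 500, 550, 600, 700, 750, 800, 900, 1000])
y = np.array([0.791, 0.846, 0.895, 0.939, 0.978, 1.014, 1.046,
              1.075, 1.102, 1.148, 1.169, 1.204, 1.234])

# Fit a cubic to the measured points.
coeffs = np.polyfit(x, y, 3)

# Finer grid from 250 to 1000 inclusive in steps of 10.
x_fine = np.arange(250, 1001, 10)

# Evaluate the fitted polynomial on the finer grid.
y_fine = np.polyval(coeffs, x_fine)
print(len(x_fine), y_fine[:3])
```

Passing x_fine and y_fine to plt.plot in place of x and fit draws the curve through all 76 points.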
I need to use savefig in Python to save the plot from each iteration of a while loop, and I want the file name to contain a literal part and a numerical part. The numerical part comes from an array or is the index of the iteration. Here is a simple example:
# index.py
from numpy import *
from pylab import *
from matplotlib import *
from matplotlib.pyplot import *
import os
x=arange(0.12,60,0.12).reshape(100,5)
y=sin(x)
i=0
while i<99
    figure()
    a=x[:,i]
    b=y[:,i]
    c=a[0]
    plot(x,y,label='%s%d'%('x=',c))
    savefig(#???#) #I want the name is: x='a[0]'.png
    #where 'a[0]' is the value of a[0]
thanks a lot.
Well, it should be simply this:
savefig(str(a[0]))
This is a toy example. Works for me.
import pylab as pl
import numpy as np
# some data
x = np.arange(10)
pl.figure()
pl.plot(x)
pl.savefig('x=' + str(10) + '.png')
I had the same need recently and worked out a solution. I modified the given code and corrected several obvious errors.
from pylab import *
import matplotlib.pyplot as plt
x = arange(0.12, 60, 0.12).reshape(100, 5)
y = sin(x)
i = 0
while i < 99:
    figure()
    a = x[i, :] # change each row instead of column
    b = y[i, :]
    i += 1 # make sure to exit the while loop
    flag = 'x=%s' % str(a[0]) # use the first element of list a as the name
    plot(a, b, label=flag)
    plt.savefig("%s.png" % flag)
Hope it helps.
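Since Python 3.6, you can use f-strings to format strings dynamically:
import matplotlib.pyplot as plt
# x and y are the arrays defined in the question
for i in range(99):
    plt.figure()
    a = x[i, :]  # x has only 5 columns, so iterate over its rows
    b = y[i, :]
    c = a[0]
    plt.plot(a, b, label=f'x={c}')
    plt.savefig(f'x={c}.png')
    plt.close()  # release the figure once it has been saved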
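One thing to watch with floats in file names: str() and a bare f-string print the full representation, so a value such as 0.6000000000000001 ends up in the name as is. A format specifier fixes the number of decimals. The helper below is only a sketch, and the name figure_name is hypothetical.

```python
def figure_name(value, digits=2):
    """Build a file name such as 'x=0.12.png' with a fixed number of decimals."""
    return f"x={value:.{digits}f}.png"


names = [figure_name(v) for v in (0.12, 0.6000000000000001, 12.0)]
print(names)  # ['x=0.12.png', 'x=0.60.png', 'x=12.00.png']
```

The result can be passed straight to savefig, for example plt.savefig(figure_name(c)).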