Suppose I have a df as below; how do I add a sum() row to the DataFrame below?
df.columns=['value_a','value_b','name','up_or_down','difference']
df
value_a value_b name up_or_down difference
project_name
# sum 27.56 25.04 sum down -1.31
2021-project11 0.43 0.48 2021-project11 up 0.05
2021-project1 0.62 0.56 2021-project1 down -0.06
2021-project2 0.51 0.47 2021-project2 down -0.04
2021-porject3 0.37 0.34 2021-porject3 down -0.03
2021-porject4 0.64 0.61 2021-porject4 down -0.03
2021-project5 0.32 0.25 2021-project5 down -0.07
2021-project6 0.75 0.81 2021-project6 up 0.06
2021-project7 0.60 0.60 2021-project7 down 0.00
2021-project8 0.85 0.74 2021-project8 down -0.11
2021-project10 0.67 0.67 2021-project10 down 0.00
2021-project9 0.73 0.73 2021-project9 down 0.00
2021-project11 0.54 0.54 2021-project11 down 0.00
2021-project12 0.40 0.40 2021-project12 down 0.00
2021-project13 0.76 0.77 2021-project13 up 0.01
2021-project14 1.16 1.28 2021-project14 up 0.12
2021-project15 1.01 0.94 2021-project15 down -0.07
2021-project16 1.23 1.24 2021-project16 up 0.01
2022-project17 0.40 0.36 2022-project17 down -0.04
2022-project_11 0.40 0.40 2022-project_11 down 0.00
2022-project4 1.01 0.80 2022-project4 down -0.21
2022-project1 0.65 0.67 2022-project1 up 0.02
2022-project2 0.75 0.57 2022-project2 down -0.18
2022-porject3 0.32 0.32 2022-porject3 down 0.00
2022-project18 0.91 0.56 2022-project18 down -0.35
2022-project5 0.84 0.89 2022-project5 up 0.05
2022-project19 0.61 0.48 2022-project19 down -0.13
2022-project6 0.77 0.80 2022-project6 up 0.03
2022-project20 0.63 0.54 2022-project20 down -0.09
2022-project8 0.59 0.55 2022-project8 down -0.04
2022-project21 0.58 0.54 2022-project21 down -0.04
2022-project10 0.76 0.76 2022-project10 down 0.00
2022-project9 0.70 0.71 2022-project9 up 0.01
2022-project22 0.62 0.56 2022-project22 down -0.06
2022-project23 2.03 1.74 2022-project23 down -0.29
2022-project12 0.39 0.39 2022-project12 down 0.00
2022-project24 1.35 1.55 2022-project24 up 0.20
project25 0.45 0.42 project25 down -0.03
project26 0.53 NaN project26 down NaN
project27 0.68 NaN project27 down NaN
I tried
df.sum().columns=['value_a_sun','value_b_sum','difference_sum']
And I would like to add the sum row below to the DataFrame above:
sum 27.56 25.04 sum down -1.31
But I got AttributeError: 'Series' object has no attribute 'column'. How do I fix this? Thanks so much for any advice.
Filter the column names in a subset with [] before calling sum, then assign the result to a new row via DataFrame.loc:
df.loc['sum'] = df[['value_a','value_b','difference']].sum()
To instead add the sums as the first row:
df1 = df[['value_a','value_b','difference']].sum().to_frame().T
df1.index = ['sum']
df = pd.concat([df1, df])   # ignore_index=True would discard the project_name labels
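For reference, here's a minimal, self-contained sketch of both approaches on a hypothetical two-row version of your frame (the values are illustrative):
import pandas as pd

df = pd.DataFrame(
    {'value_a': [0.43, 0.62], 'value_b': [0.48, 0.56],
     'name': ['2021-project11', '2021-project1'],
     'up_or_down': ['up', 'down'], 'difference': [0.05, -0.06]},
    index=pd.Index(['2021-project11', '2021-project1'], name='project_name'))

# append the sums as a new last row; columns not in the subset become NaN
appended = df.copy()
appended.loc['sum'] = df[['value_a', 'value_b', 'difference']].sum()

# prepend the sums as the first row, keeping a labelled index
row = df[['value_a', 'value_b', 'difference']].sum().to_frame().T
row.index = pd.Index(['sum'], name='project_name')
prepended = pd.concat([row, df])
print(prepended)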
How do I fix this code? Do I need to make features_train and features_test DataFrames?
Does anyone have an idea of how to fix this code? I really can't understand the problem...
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.preprocessing import Normalizer
from sklearn.metrics import r2_score
admissions_data = pd.read_csv('admissions_data.csv')
labels = admissions_data.iloc[:, -1]
features = admissions_data.iloc[:, 1:8]
features_train, labels_train, features_test, labels_test = train_test_split(features, labels, test_size=0.2, random_state=13)
sc = StandardScaler()
features_train_scaled = sc.fit_transform(features_train)
features_test_scale = sc.transform(features_test)
features_train_scaled = pd.DataFrame(features_train_scaled)
features_test_scale = pd.DataFrame(features_test_scale)
The error is:
Traceback (most recent call last):
File "script.py", line 26, in <module>
features_test_scale = sc.transform(features_test)
File "/usr/local/lib/python3.6/dist-packages/sklearn/preprocessing/_data.py", line 794, in transform
force_all_finite='allow-nan')
File "/usr/local/lib/python3.6/dist-packages/sklearn/base.py", line 420, in _validate_data
X = check_array(X, **check_params)
File "/usr/local/lib/python3.6/dist-packages/sklearn/utils/validation.py", line 73, in inner_f
return f(**kwargs)
File "/usr/local/lib/python3.6/dist-packages/sklearn/utils/validation.py", line 624, in check_array
"if it contains a single sample.".format(array))
ValueError: Expected 2D array, got 1D array instead:
array=[0.57 0.78 0.59 0.64 0.47 0.63 0.65 0.89 0.84 0.73 0.75 0.64 0.46 0.78
0.62 0.53 0.85 0.67 0.84 0.94 0.64 0.53 0.47 0.86 0.62 0.7 0.77 0.61
0.61 0.63 0.86 0.82 0.65 0.58 0.7 0.7 0.84 0.72 0.71 0.77 0.69 0.8
0.52 0.62 0.79 0.71 0.9 0.84 0.6 0.86 0.67 0.61 0.71 0.52 0.62 0.37
0.73 0.64 0.71 0.8 0.88 0.78 0.45 0.62 0.62 0.86 0.74 0.94 0.58 0.7
0.92 0.64 0.65 0.83 0.34 0.66 0.67 0.7 0.71 0.54 0.68 0.61 0.68 0.79
0.57 0.94 0.59 0.79 0.73 0.91 0.86 0.95 0.9 0.92 0.68 0.84 0.69 0.72
0.94 0.53 0.45 0.77 0.77 0.91 0.61 0.78 0.77 0.82 0.9 0.92 0.54 0.92
0.72 0.5 0.68 0.78 0.72 0.53 0.79 0.49 0.68 0.72 0.73 0.93 0.72 0.52
0.54 0.86 0.65 0.93 0.89 0.72 0.34 0.64 0.96 0.79 0.73 0.49 0.73 0.94
0.7 0.95 0.65 0.86 0.78 0.75 0.89 0.94 0.91 0.87 0.93 0.81 0.94 0.89
0.57 0.77 0.39 0.46 0.78 0.64 0.76 0.58 0.56 0.53 0.79 0.9 0.92 0.96
0.67 0.65 0.64 0.58 0.94 0.76 0.78 0.88 0.84 0.68 0.66 0.42 0.56 0.66
0.46 0.65 0.58 0.72 0.48 0.68 0.89 0.95 0.46 0.71 0.79 0.52 0.57 0.76
0.52 0.8 0.77 0.91 0.75 0.49 0.72 0.72 0.61 0.97 0.8 0.85 0.73 0.64
0.87 0.63 0.97 0.72 0.82 0.54 0.71 0.45 0.8 0.49 0.77 0.93 0.89 0.93
0.81 0.62 0.81 0.66 0.78 0.76 0.48 0.61 0.82 0.68 0.7 0.68 0.62 0.81
0.87 0.94 0.38 0.67 0.64 0.84 0.62 0.7 0.62 0.5 0.79 0.78 0.36 0.77
0.57 0.87 0.74 0.71 0.61 0.57 0.64 0.73 0.81 0.74 0.8 0.69 0.66 0.64
0.93 0.64 0.59 0.71 0.82 0.69 0.69 0.89 0.93 0.74 0.64 0.84 0.91 0.97
0.55 0.74 0.72 0.71 0.93 0.96 0.8 0.8 0.81 0.88 0.64 0.38 0.87 0.73
0.78 0.89 0.56 0.61 0.76 0.46 0.78 0.71 0.81 0.59 0.47 0.7 0.42 0.76
0.8 0.67 0.94 0.65 0.51 0.73 0.9 0.8 0.65 0.7 0.96 0.96 0.73 0.79
0.86 0.89 0.85 0.76 0.76 0.71 0.83 0.76 0.42 0.9 0.58 0.66 0.86 0.71
0.8 0.51 0.65 0.58 0.76 0.8 0.7 0.61 0.71 0.69 0.95 0.72 0.79 0.97
0.74 0.96 0.47 0.56 0.73 0.94 0.76 0.79 0.71 0.58 0.94 0.66 0.75 0.76
0.84 0.59 0.68 0.75 0.76 0.72 0.87 0.78 0.67 0.79 0.91 0.57 0.77 0.69
0.73 0.43 0.93 0.68 0.82 0.67 0.74 0.82 0.85 0.62 0.54 0.71 0.92 0.85
0.79 0.63 0.59 0.73 0.66 0.74 0.9 0.81].
Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.
You have made a mistake when splitting the data: you assigned the 1-D labels to features_test by mistake, and since the transform function does not accept a 1-D array, it raises that error.
train_test_split() returns features_train, features_test, labels_train, labels_test, in that order.
So, change your code like this:
#features_train, labels_train, features_test, labels_test = train_test_split(features, labels, test_size=0.2, random_state=13)
features_train, features_test, labels_train, labels_test = train_test_split(features, labels, test_size=0.2, random_state=13)
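To double-check the order (and that the scaler now sees 2-D input), here's a quick sanity check on hypothetical synthetic data of the same shape:
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

features = pd.DataFrame(np.random.rand(100, 7))   # 2-D features, like iloc[:, 1:8]
labels = pd.Series(np.random.rand(100))           # 1-D labels, like iloc[:, -1]

features_train, features_test, labels_train, labels_test = train_test_split(
    features, labels, test_size=0.2, random_state=13)
print(features_train.shape, features_test.shape)  # (80, 7) (20, 7)
print(labels_train.shape, labels_test.shape)      # (80,) (20,)

sc = StandardScaler()
features_train_scaled = sc.fit_transform(features_train)  # fit on train only
features_test_scaled = sc.transform(features_test)        # 2-D input, no error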
I'm trying to make a contour map with Basemap. My lat, lon and eof1 arrays are all 1-D and 79 items long. When I run this code, I get an error saying:
IndexError: too many indices for array
Any suggestions? I'm guessing it needs a meshgrid or something, but none of the combinations I tried worked.
import matplotlib.pyplot as plt
from mpl_toolkits.basemap import Basemap
import numpy as np
data = np.genfromtxt('/Volumes/NO_NAME/Classwork/Lab3PCAVarimax.txt',usecols=(1,2,3,4,5,6,7),skip_header=1)
eof1 = data[:,6]
locs = np.genfromtxt('/Volumes/NO_NAME/Classwork/OK_vic_grid.txt')
lat = locs[:,1]
lon = locs[:,2]
fig, ax = plt.subplots()
m = Basemap(projection='stere',lon_0=-95,lat_0=35.,lat_ts=40,\
llcrnrlat=33,urcrnrlat=38,\
llcrnrlon=-103.8,urcrnrlon=-94)
X,Y = m(lon,lat)
m.drawcoastlines()
m.drawstates()
m.drawcountries()
m.drawmapboundary(fill_color='lightblue')
m.drawparallels(np.arange(0.,40.,2.),color='gray',dashes=[1,3],labels=[1,0,0,0])
m.drawmeridians(np.arange(0.,360.,2.),color='gray',dashes=[1,3],labels=[0,0,0,1])
m.fillcontinents(color='beige',lake_color='lightblue',zorder=0)
plt.title('Oklahoma PCA-Derived Soil Moisture Regions (Varimax)')
m.contour(X,Y,eof1)
lat and lon data:
1 33.75 -97.75
2 33.75 -97.25
3 33.75 -96.75
4 33.75 -96.25
5 33.75 -95.75
6 33.75 -95.25
7 33.75 -94.75
8 34.25 -99.75
9 34.25 -99.25
10 34.25 -98.75
11 34.25 -98.25
12 34.25 -97.75
13 34.25 -97.25
14 34.25 -96.75
15 34.25 -96.25
16 34.25 -95.75
17 34.25 -95.25
18 34.25 -94.75
19 34.75 -99.75
20 34.75 -99.25
21 34.75 -98.75
22 34.75 -98.25
23 34.75 -97.75
24 34.75 -97.25
25 34.75 -96.75
26 34.75 -96.25
27 34.75 -95.75
28 34.75 -95.25
29 34.75 -94.75
30 35.25 -99.75
31 35.25 -99.25
32 35.25 -98.75
33 35.25 -98.25
34 35.25 -97.75
35 35.25 -97.25
36 35.25 -96.75
37 35.25 -96.25
38 35.25 -95.75
39 35.25 -95.25
40 35.25 -94.75
41 35.75 -99.75
42 35.75 -99.25
43 35.75 -98.75
44 35.75 -98.25
45 35.75 -97.75
46 35.75 -97.25
47 35.75 -96.75
48 35.75 -96.25
49 35.75 -95.75
50 35.75 -95.25
51 35.75 -94.75
52 36.25 -99.75
53 36.25 -99.25
54 36.25 -98.75
55 36.25 -98.25
56 36.25 -97.75
57 36.25 -97.25
58 36.25 -96.75
59 36.25 -96.25
60 36.25 -95.75
61 36.25 -95.25
62 36.25 -94.75
63 36.75 -102.75
64 36.75 -102.25
65 36.75 -101.75
66 36.75 -101.25
67 36.75 -100.75
68 36.75 -100.25
69 36.75 -99.75
70 36.75 -99.25
71 36.75 -98.75
72 36.75 -98.25
73 36.75 -97.75
74 36.75 -97.25
75 36.75 -96.75
76 36.75 -96.25
77 36.75 -95.75
78 36.75 -95.25
79 36.75 -94.75
eof data:
PC5 PC3 PC2 PC6 PC7 PC4 PC1
1 0.21 0.14 0.33 0.39 0.73 0.13 0.03
2 0.19 0.17 0.42 0.24 0.78 0.1 0.04
3 0.17 0.18 0.51 0.18 0.71 0.01 0.1
4 0.18 0.2 0.58 0.19 0.67 0.07 0.11
5 0.15 0.17 0.76 0.2 0.43 0.11 0.13
6 0.12 0.16 0.82 0.17 0.34 0.12 0.15
7 0.1 0.2 0.84 0.14 0.28 0.14 0.13
8 0.16 0.09 0.2 0.73 0.29 0.25 0.1
9 0.18 0.14 0.18 0.68 0.36 0.24 0.14
10 0.23 0.22 0.18 0.63 0.53 0.21 0.14
11 0.19 0.23 0.23 0.52 0.62 0.19 0.14
12 0.2 0.18 0.23 0.43 0.74 0.15 0.11
13 0.21 0.19 0.43 0.24 0.77 0.11 0.11
14 0.15 0.21 0.51 0.15 0.72 0.1 0.15
15 0.14 0.23 0.58 0.19 0.66 0.1 0.12
16 0.13 0.22 0.74 0.19 0.49 0.13 0.12
17 0.08 0.24 0.85 0.19 0.28 0.15 0.1
18 0.1 0.29 0.86 0.15 0.18 0.16 0.07
19 0.26 0.11 0.17 0.77 0.1 0.24 0.06
20 0.36 0.16 0.14 0.74 0.24 0.23 0.12
21 0.32 0.27 0.14 0.65 0.42 0.14 0.14
22 0.39 0.29 0.21 0.58 0.47 0.09 0.21
23 0.3 0.3 0.29 0.47 0.48 0.09 0.33
24 0.25 0.35 0.35 0.42 0.45 0.09 0.45
25 0.25 0.33 0.43 0.29 0.52 0.11 0.46
26 0.24 0.36 0.48 0.26 0.53 0.09 0.4
27 0.18 0.35 0.62 0.24 0.48 0.11 0.28
28 0.13 0.4 0.83 0.12 0.15 0.12 0.06
29 0.13 0.42 0.81 0.1 0.14 0.08 0.05
30 0.45 0.14 0.14 0.7 0.05 0.2 0.04
31 0.52 0.19 0.13 0.68 0.25 0.18 0.06
32 0.53 0.2 0.16 0.66 0.32 0.09 0.08
33 0.48 0.26 0.2 0.56 0.37 0.06 0.21
34 0.41 0.34 0.28 0.44 0.35 0.06 0.43
35 0.37 0.4 0.28 0.37 0.32 0.06 0.54
36 0.24 0.41 0.39 0.27 0.33 0.11 0.56
37 0.29 0.47 0.37 0.28 0.32 0.11 0.54
38 0.3 0.61 0.36 0.25 0.26 0.13 0.47
39 0.21 0.6 0.66 0.13 0.18 0.1 0.12
40 0.13 0.48 0.75 0.1 0.13 0.07 0.06
41 0.55 0.15 0.14 0.63 0.07 0.25 0.1
42 0.55 0.19 0.17 0.65 0.13 0.2 0.11
43 0.6 0.19 0.15 0.62 0.27 0.04 0.06
44 0.63 0.18 0.16 0.53 0.25 0.04 0.16
45 0.69 0.27 0.22 0.36 0.23 -0.01 0.28
46 0.56 0.39 0.25 0.22 0.24 0.06 0.47
47 0.45 0.51 0.28 0.23 0.25 0.11 0.51
48 0.38 0.63 0.3 0.27 0.24 0.14 0.4
49 0.3 0.75 0.34 0.19 0.21 0.13 0.3
50 0.29 0.77 0.44 0.16 0.19 0.12 0.13
51 0.18 0.66 0.63 0.11 0.17 0.1 0.06
52 0.53 0.12 0.08 0.35 0.1 0.52 0.14
53 0.68 0.19 0.14 0.4 0.09 0.36 0.12
54 0.76 0.24 0.14 0.34 0.09 0.29 0.12
55 0.84 0.25 0.12 0.29 0.15 0.1 0.14
56 0.82 0.25 0.11 0.28 0.21 0.03 0.12
57 0.64 0.44 0.22 0.23 0.21 0.06 0.36
58 0.54 0.52 0.27 0.21 0.2 0.09 0.39
59 0.44 0.72 0.26 0.22 0.17 0.17 0.23
60 0.3 0.79 0.28 0.17 0.14 0.11 0.19
61 0.26 0.81 0.35 0.18 0.17 0.12 0.08
62 0.24 0.82 0.37 0.16 0.17 0.1 0.05
63 0.17 0.07 0.22 0.26 0.18 0.75 0.07
64 0.25 0.15 0.24 0.23 0.12 0.82 0.08
65 0.3 0.15 0.16 0.23 0.11 0.82 0.04
66 0.39 0.23 0.15 0.19 0.06 0.77 0.05
67 0.58 0.2 0.09 0.21 0.12 0.55 -0.1
68 0.68 0.17 0.04 0.21 0.11 0.48 -0.07
69 0.59 0.18 0.01 0.14 0.04 0.47 0.07
70 0.75 0.2 0.1 0.29 0.06 0.36 0.11
71 0.75 0.22 0.13 0.26 0.13 0.31 0.07
72 0.82 0.25 0.12 0.2 0.19 0.17 0.06
73 0.79 0.3 0.11 0.15 0.13 0.16 0.03
74 0.76 0.41 0.13 0.16 0.17 0.08 0.13
75 0.65 0.48 0.16 0.14 0.15 0.13 0.15
76 0.52 0.66 0.18 0.16 0.2 0.22 0.05
77 0.45 0.74 0.24 0.16 0.19 0.2 0.06
78 0.38 0.78 0.32 0.17 0.14 0.15 0.02
79 0.28 0.79 0.34 0.15 0.16 0.11 0
AFAICT the essence of your problem is that your x/y grid isn't strictly rectangular. The documentation for matplotlib.pyplot.contour says:
X and Y must both be 2-D with the same shape as Z, or they must both
be 1-D such that len(X) is the number of columns in Z and len(Y) is
the number of rows in Z
see http://matplotlib.org/api/pyplot_api.html
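Here's a toy example (hypothetical data, not yours) of the two shape combinations contour() accepts:
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(-100, -95, 11)    # 1-D, len(x) == number of columns of Z
y = np.linspace(34, 37, 6)        # 1-D, len(y) == number of rows of Z
X, Y = np.meshgrid(x, y)          # both 2-D with shape (6, 11)
Z = np.hypot(X + 97.5, Y - 35.5)  # 2-D values on the grid, shape (6, 11)

plt.contour(x, y, Z)              # 1-D, 1-D, 2-D form
plt.contour(X, Y, Z)              # 2-D, 2-D, 2-D form
plt.show()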
With your unmodified data you can get a quiver plot, e.g.:
# create vectors pointing up and slightly right
v = eof1
u = 0.5 * eof1  # eof1 is a NumPy array, so this scales elementwise
m.quiver(lon, lat, u, v, latlon=True)
plt.show()
So you will have to map your data to the 1-D,1-D,2-D or 2-D,2-D,2-D format required by contour().
It's fairly easy to make your data cover a smaller rectangular lat/lon area by deleting rows 1-7 and 63-68 (or, I suppose, you could pad it out with 0 values to cover your original area). But by the time the lon/lat are projected to your stere projection coordinates they aren't rectangular any more, which I think will also be a problem. How about using a merc projection, just to get things going?
Overall, however, I think you will need more data; in particular, to get contours right up to your Oklahoma boundary you need data up to the boundary. Use the latlon=True parameter in the contour call so it transforms the lon and lat correctly, even with the merc projection. I also tried adding the parameter tri=True, but that seems to place different requirements on the x/y/z data.
As another example, you can get a bubble plot using scatter():
s = eof1 * 500  # scale marker sizes from the data
m.scatter(lon, lat, s=s, latlon=True)
Addition:
Managed to get some contours!
The simplest solution was to hardcode your lat/lon/data for the rectangular region: meshgrid turns the 1-D lon and lat into a full 2-D grid in xx and yy, and the value points are 2-D. Here's the code:
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.basemap import Basemap
data = np.genfromtxt('Lab3PCAVarimax.txt',usecols=(1,2,3,4,5,6,7),skip_header=1)
eof1 = data[:,6]
print(eof1)
eof11= [
[ 0.1 ,0.14 ,0.14 ,0.14 ,0.11 ,0.11 ,0.15 ,0.12 ,0.12 ,0.1 ,0.07]
,[ 0.06 ,0.12 ,0.14 ,0.21 ,0.33 ,0.45 ,0.46 ,0.4 ,0.28 ,0.06 ,0.05]
,[ 0.04 ,0.06 ,0.08 ,0.21 ,0.43 ,0.54 ,0.56 ,0.54 ,0.47 ,0.12 ,0.06]
,[ 0.1 ,0.11 ,0.06 ,0.16 ,0.28 ,0.47 ,0.51 ,0.4 ,0.3 ,0.13 ,0.06]
,[ 0.14 ,0.12 ,0.12 ,0.14 ,0.12 ,0.36 ,0.39 ,0.23 ,0.19 ,0.08 ,0.05]
,[ 0.07 ,0.11 ,0.07 ,0.06 ,0.03 ,0.13 ,0.15 ,0.05 ,0.06 ,0.02 ,0. ]
]
locs = np.genfromtxt('OK_vic_grid.txt')
lat = locs[:,1]
lon = locs[:,2]
lat1 = [34.25 ,34.75,35.25,35.75,36.25,36.75]
lon1 =[-99.75,-99.25, -98.75, -98.25, -97.75, -97.25, -96.75, -96.25, -95.75, -95.25, -94.75]
fig, ax = plt.subplots()
m = Basemap(projection='merc',lon_0=-95,lat_0=35.,lat_ts=40,\
llcrnrlat=33,urcrnrlat=38,\
llcrnrlon=-103.8,urcrnrlon=-94)
#X,Y = m(lon,lat)
m.drawcoastlines()
m.drawstates()
m.drawcountries()
m.drawmapboundary(fill_color='lightblue')
m.drawparallels(np.arange(0.,40.,2.),color='gray',dashes=[1,3],labels=[1,0,0,0])
m.drawmeridians(np.arange(0.,360.,2.),color='gray',dashes=[1,3],labels=[0,0,0,1])
m.fillcontinents(color='beige',lake_color='lightblue',zorder=0)
plt.title('Oklahoma PCA-Derived Soil Moisture Regions (Varimax)')
xx, yy = m(*np.meshgrid(lon1,lat1))
m.contourf(xx,yy,eof11)
plt.show()
Further addition: Actually this still works when the projection is stere :-)
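If you'd rather not hardcode eof11, here's a sketch that rebuilds the 2-D grid from the 1-D columns, assuming (as above) that dropping rows 1-7 and 63-68 leaves a complete regular grid with no missing cells:
import numpy as np

sel = (lat >= 34.0) & (lon >= -100.0)    # keep the rectangular window
lat1 = np.unique(lat[sel])               # 6 grid latitudes (rows)
lon1 = np.unique(lon[sel])               # 11 grid longitudes (columns)
eof11 = np.full((lat1.size, lon1.size), np.nan)
i = np.searchsorted(lat1, lat[sel])      # row index of each point
j = np.searchsorted(lon1, lon[sel])      # column index of each point
eof11[i, j] = eof1[sel]                  # scatter the 1-D values into the grid
xx, yy = m(*np.meshgrid(lon1, lat1))
m.contourf(xx, yy, eof11)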