All non-significant or NaN p-values in Logit - Python

I'm running a logit with statsmodels that has around 25 regressors, a mix of categorical, ordinal and continuous variables.
My code is the following, with its output:
a = np.asarray(data_nobands[[*all 25 columns*]], dtype=float)
mod_logit = sm.Logit(np.asarray(data_nobands['cured'], dtype=float),a)
logit_res = mod_logit.fit(method="nm", cov_type="cluster", cov_kwds={"groups":data_nobands['AGREEMENT_NUMBER']})
"""
Logit Regression Results
==============================================================================
Dep. Variable: y No. Observations: 17316
Model: Logit Df Residuals: 17292
Method: MLE Df Model: 23
Date: Wed, 05 Aug 2020 Pseudo R-squ.: -0.02503
Time: 19:49:27 Log-Likelihood: -10274.
converged: False LL-Null: -10023.
Covariance Type: cluster LLR p-value: 1.000
==============================================================================
coef std err z P>|z| [0.025 0.975]
------------------------------------------------------------------------------
x1 3.504e-05 0.009 0.004 0.997 -0.017 0.017
x2 1.944e-05 nan nan nan nan nan
x3 3.504e-05 2.173 1.61e-05 1.000 -4.259 4.259
x4 3.504e-05 2.912 1.2e-05 1.000 -5.707 5.707
x5 3.504e-05 0.002 0.016 0.988 -0.004 0.004
x6 3.504e-05 0.079 0.000 1.000 -0.154 0.154
x7 3.504e-05 0.003 0.014 0.989 -0.005 0.005
x8 3.504e-05 0.012 0.003 0.998 -0.023 0.023
x9 3.504e-05 0.020 0.002 0.999 -0.039 0.039
x10 3.504e-05 0.021 0.002 0.999 -0.041 0.041
x11 3.504e-05 0.011 0.003 0.997 -0.021 0.022
x12 8.831e-06 5.74e-06 1.538 0.124 -2.42e-06 2.01e-05
x13 4.82e-06 9.23e-06 0.522 0.602 -1.33e-05 2.29e-05
x14 3.504e-05 0.000 0.248 0.804 -0.000 0.000
x15 3.504e-05 4.02e-05 0.871 0.384 -4.38e-05 0.000
x16 1.815e-05 1.58e-05 1.152 0.249 -1.27e-05 4.9e-05
x17 3.504e-05 0.029 0.001 0.999 -0.057 0.057
x18 3.504e-05 0.000 0.190 0.849 -0.000 0.000
x19 9.494e-06 nan nan nan nan nan
x20 1.848e-05 nan nan nan nan nan
x21 3.504e-05 0.026 0.001 0.999 -0.051 0.051
x22 3.504e-05 0.037 0.001 0.999 -0.072 0.072
x23 -0.0005 0.000 -2.596 0.009 -0.001 -0.000
x24 3.504e-05 0.006 0.006 0.995 -0.011 0.011
x25 3.504e-05 0.011 0.003 0.998 -0.022 0.022
==============================================================================
"""
With any other method, such as bfgs, lbfgs, or minimize, the output is the following:
"""
Logit Regression Results
==============================================================================
Dep. Variable: y No. Observations: 17316
Model: Logit Df Residuals: 17292
Method: MLE Df Model: 23
Date: Wed, 05 Aug 2020 Pseudo R-squ.: -0.1975
Time: 19:41:22 Log-Likelihood: -12003.
converged: False LL-Null: -10023.
Covariance Type: cluster LLR p-value: 1.000
==============================================================================
coef std err z P>|z| [0.025 0.975]
------------------------------------------------------------------------------
x1 0 0.152 0 1.000 -0.299 0.299
x2 0 724.618 0 1.000 -1420.225 1420.225
x3 0 20.160 0 1.000 -39.514 39.514
x4 0 23.008 0 1.000 -45.094 45.094
x5 0 0.010 0 1.000 -0.020 0.020
x6 0 1.335 0 1.000 -2.617 2.617
x7 0 0.020 0 1.000 -0.039 0.039
x8 0 0.109 0 1.000 -0.214 0.214
x9 0 0.070 0 1.000 -0.137 0.137
x10 0 0.175 0 1.000 -0.343 0.343
x11 0 0.045 0 1.000 -0.088 0.088
x12 0 1.24e-05 0 1.000 -2.42e-05 2.42e-05
x13 0 2.06e-05 0 1.000 -4.04e-05 4.04e-05
x14 0 0.001 0 1.000 -0.002 0.002
x15 0 5.16e-05 0 1.000 -0.000 0.000
x16 0 1.9e-05 0 1.000 -3.73e-05 3.73e-05
x17 0 0.079 0 1.000 -0.155 0.155
x18 0 0.000 0 1.000 -0.001 0.001
x19 0 1145.721 0 1.000 -2245.573 2245.573
x20 0 nan nan nan nan nan
x21 0 0.028 0 1.000 -0.055 0.055
x22 0 0.037 0 1.000 -0.072 0.072
x23 0 0.000 0 1.000 -0.000 0.000
x24 0 0.005 0 1.000 -0.010 0.010
x25 0 0.015 0 1.000 -0.029 0.029
==============================================================================
"""
As you can see, the p-values are either "nan" or highly non-significant.
What could the problem be?
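No answer is included in this dump, but two things in the summaries stand out: converged is False and the pseudo R-squared is negative, which usually points to badly scaled regressors, a missing intercept, or multicollinearity/(quasi-)separation rather than genuinely null effects. Below is a hedged diagnostic sketch, not from the original thread; predictor_cols is a hypothetical placeholder for the 25 column names used above.

import numpy as np
import statsmodels.api as sm

# predictor_cols is hypothetical -- substitute the actual 25 column names
X = np.asarray(data_nobands[predictor_cols], dtype=float)
y = np.asarray(data_nobands['cured'], dtype=float)

# rough standardization so no single regressor dominates the optimizer
# (guard against constant columns before doing this on real data)
X = (X - X.mean(axis=0)) / X.std(axis=0)

# sm.Logit does not add an intercept automatically
X = sm.add_constant(X)

# a huge condition number here suggests (near-)collinear regressors
print(np.linalg.cond(X))

# default Newton solver with more iterations; the summary header
# reports whether the fit actually converged
res = sm.Logit(y, X).fit(maxiter=200)
print(res.summary())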

Related

Slider X-axis date range for Matplotlib

I could not get a slider working for the date range. I want to plot bar graphs over ranges of dates.
Database details are below:
SDATETIME FE014APV FE011APV FE014BPV FE011BPV FE014CPV FE011CPV FT001PV FT002PV CCC
2022-10-30 00:55:01.000 0 0.018 0 0.063 31.704 64.214 7.844 3.391 23.489 2.142 0.316 0.287 0 0
2022-10-30 07:55:00.000 0 0.012 0 0.042 25.119 47.035 6.336 2.561 18.289 1.41 0.25 0.082 0 0
2022-10-30 15:55:01.000 0 0.023 0 0.053 34.79 60.396 7.855 4.829 23.957 1.887 0.278 10.387 0 0
2022-10-31 00:55:00.000 0 0.024 0 0.062 38.451 67.083 9.175 4.3 18.135 2.145 0.337 0.114 0 0
2022-10-31 07:55:00.000 0 0.015 0 0.041 21.864 44.637 5.636 2.603 13.727 1.411 0.222 0.083 0 0
2022-10-31 15:55:00.000 0 0.021 0 0.058 37.887 66.198 9.805 3.554 20.793 1.898 0.476 10.084 0 0
2022-11-01 00:55:00.000 0 0.024 0 0.063 39.522 72.892 11.107 4.388 24.486 2.138 0.43 0.121 0 0
2022-11-01 07:55:00.000 0 0.014 0 0.042 27.753 49.468 8.069 2.602 20.385 1.412 0.312 0.084 0 0
2022-11-01 15:55:00.000 0 0.021 0 0.053 49.388 73.054 12.142 5.177 19.143 1.9 0.372 10.318 0 0
2022-11-02 00:55:00.000 0 0.024 0 0.062 50.925 84.344 14.387 4.833 30.602 2.139 0.414 0.767 0 0
2022-11-02 07:55:00.000 0 0.016 0 0.043 25.732 49.933 7.521 1.146 23.746 1.413 0.231 0.082 0 0
2022-11-02 15:55:01.000 0 0.016 0 0.061 34.319 59.614 7.979 2.866 18.436 1.9 0.322 11.949 0 0
2022-11-03 00:55:01.000 0 0.017 0 0.073 50.906 80.468 13.649 6.477 29.069 2.148 0.725 0.12 0 0
2022-11-03 07:55:00.000 0 0.015 0 0.042 24.618 52.129 8.839 2.524 20.226 1.415 0.372 0.086 0 0
2022-11-03 15:55:00.000 0 0.023 0 0.055 37.337 59.935 7.956 5.641 19.666 1.896 0.278 11.115 0 0
2022-11-04 00:55:00.000 0 0.031 0 0.067 45.835 88.807 15.842 4.803 30.372 2.147 0.646 0.117 0 0
2022-11-04 07:55:00.000 0 0.018 0 0.043 24.232 45.295 6.252 3.035 18.596 1.415 0.25 0.088 0 0
2022-11-04 15:55:01.000 0 0.029 0 0.057 36.537 57.599 7.607 3.805 17.312 1.826 0.32 9.374 0 0
2022-11-05 00:55:01.000 0 0.03 0 0.065 44.91 70.08 10.222 6.455 19.679 2.082 0.379 2.677 0 0
2022-11-05 07:55:00.000 0 0.018 0 0.042 25.728 45.422 6.617 2.565 18.876 1.367 0.32 0.079 0 0
2022-11-05 15:55:00.000 0.008 0.024 0 0.054 40.846 65.824 9.504 5.214 17.19 1.841 0.326 23.637 0 0
2022-11-06 00:55:01.000 0 0.022 0 0.065 41.552 69.855 9.79 5.334 22.538 2.069 0.296 0.135 0 0
2022-11-06 07:55:01.000 0 0.014 0 0.043 26.126 54.864 9.458 4.002 19.225 1.375 0.452 0.08 0 0
2022-11-06 15:55:01.000 0 0.017 0 0.062 33.386 58.031 7.848 3.679 25.448 1.842 0.331 10.524 0 0
2022-11-07 00:55:00.000 0 0.029 0 0.066 46.823 70.362 10.323 6.417 24.739 2.077 0.317 0.113 0 0
2022-11-07 07:55:00.000 0 0.018 0 0.043 27.748 44.87 6.831 3.879 12.485 1.37 0.199 0.08 0 0
2022-11-07 15:55:00.000 0 0.024 0 0.057 30.679 60.475 8.261 3.831 23.813 1.842 0.262 10.254 0 0
2022-11-08 00:55:00.000 0 0.02 0 0.07 44.283 72.238 12.528 8.434 19.375 2.078 0.555 0.114 0 0
2022-11-08 07:55:01.000 0 0.002 0 0.06 26.667 48.338 7.308 4.564 10.959 1.369 0.191 0.08 0 0
2022-11-08 15:55:00.000 0 0.018 0 0.062 32.208 64.367 9.864 2.346 25.159 1.836 0.441 11.786 0 0
2022-11-09 00:55:01.000 0 0.023 0 0.065 51.771 81.97 15.081 9.926 22.85 2.085 0.766 2.2 0 0
2022-11-09 07:55:01.000 0 0.014 0 0.042 28.192 45.64 7.273 5.396 11.386 1.373 0.219 0.086 0 0
2022-11-09 15:55:01.000 0 0.015 0 0.067 48.331 72.968 11.994 4.817 23.623 1.832 0.274 12.491 0 0
2022-11-10 00:55:02.000 0 0.03 0 0.062 49.688 73.556 11.163 8.909 22.62 2.072 0.267 0.235 0 0
2022-11-10 07:55:02.000 0 0.004 0 0.059 27.595 44.208 6.355 5.009 10.668 1.373 0.165 0.081 0 0
2022-11-10 15:55:02.000 0 0.02 0 0.069 37.923 58.852 8.31 4.97 17.333 1.841 0.222 9.626 0 0
2022-11-11 00:55:02.000 0 0.006 0 0.088 47.117 71.11 10.352 7.718 22.677 2.081 0.261 5.213 0 0
2022-11-11 07:55:02.000 0 0.003 0 0.058 32.351 48.207 7.285 5.51 14.897 1.37 0.178 0.085 0 0
2022-11-11 15:55:00.000 0 0.004 0 0.03 1.772 8.676 0.914 0.512 -0.011 0.5 0.067 -0.004 0 0
2022-11-12 00:55:00.000 0 0 0 0.084 48.271 66.121 14.029 9.502 18.051 1.998 0.882 0.112 0 0
2022-11-12 07:55:01.000 0 0 0 0.059 31.717 45.78 7.625 6.081 13.722 1.374 0.122 0.082 0 0
2022-11-12 15:55:01.000 0 0 0 0.071 61.677 83.471 17.962 11.392 17.152 1.848 0.29 11.262 0 0
2022-11-13 00:55:00.000 0 0.001 0 0.079 52.98 77.556 13.912 6.742 23.841 2.079 0.177 4.706 0 0
2022-11-13 07:55:01.000 0 0.001 0 0.042 31.187 45.321 6.456 7.296 7.965 1.373 0.094 0.08 0 0
2022-11-13 15:55:00.000 0 0.001 0 0.057 53.093 75.026 14.037 7.386 22.119 1.861 0.202 8.901 0 0
2022-11-14 00:55:02.000 0 0.002 0 0.066 49.394 69.717 11.504 9.93 16.727 2.098 0.269 0.123 0 0
2022-11-14 07:55:01.000 0 0.001 0 0.043 27.505 41.813 6.47 5.747 11.561 1.368 0.124 0.082 0 0
2022-11-14 15:55:01.000 0 0.003 0 0.056 43.861 66.129 12.426 5.21 20.432 1.857 0.365 10.862 0 0
2022-11-15 00:55:00.000 0 0.003 0 0.064 47.404 71.234 10.706 7.544 17.227 2.096 0.23 0.114 0 0
2022-11-15 07:55:00.000 0 0.002 0 0.047 30.274 45.558 7.054 5.018 16.077 1.376 0.131 0.09 0 0
2022-11-15 15:55:01.000 0 0.005 0 0.061 46.22 67.617 10.571 8.397 16.724 1.752 0.174 9.08 0 0
2022-11-16 00:55:01.000 0 0.006 1.84 6.091 51.591 76.16 15.078 11.698 22.75 2.016 0.42 6.217 0 0
2022-11-16 07:55:00.000 0 0.002 0 0.041 30.872 46.471 6.652 5.952 12.444 1.366 0.084 0.08 0 0
2022-11-16 15:55:02.000 0 0.004 30.88 32.166 14.41 28.233 10.636 8.152 16.919 1.769 0.196 3.671 0 0
2022-11-17 00:55:02.000 0 0.005 47.63 49.429 0.007 1.372 7.806 7.755 5.613 2.011 0.152 0.126 0 0
2022-11-17 07:55:01.000 0 0.003 32.69 33.582 0.006 0.005 5.213 5.939 5.045 1.321 0.07 0.08 0 0
2022-11-17 15:55:02.000 0.016 0.005 53.65 57.595 0.006 5.877 14.369 8.102 12.324 1.699 0.271 9.452 0 0
2022-11-18 00:55:00.000 0 0.005 66.11 69.54 0.006 0.375 13.636 9.577 11.89 1.762 0.197 1.43 0 0
2022-11-18 07:55:02.000 0 0.002 44.08 46.715 0.004 0.006 8.983 8.774 12.769 1.169 0.125 6.51 0 0
2022-11-18 15:55:01.000 0.003 0.006 57.47 61.427 0.005 0.707 13.819 9.58 13.568 1.561 0.156 10.866 0 0
2022-11-19 00:55:01.000 0.003 0.002 73.69 82.457 0.005 0.337 18.903 12.603 17.728 1.778 0.181 7.083 0 0
2022-11-19 07:55:01.000 0.01 0.004 36.53 63.856 0.006 0.006 14.504 6.993 18.519 1.162 0.535 0.086 0 0
2022-11-19 15:55:02.000 0.011 0.005 28.17 33.162 21.328 27.47 12.38 5.57 18.178 1.735 0.442 13.414 0 0
2022-11-20 00:55:01.000 0 0.01 0 0.064 57.111 74.307 12.316 8.026 24.48 1.881 0.288 1.828 0 0
2022-11-20 07:55:01.000 0 0.006 0 0.043 33.624 43.835 6.248 5.439 11.432 1.238 0.193 0.084 0 0
2022-11-20 15:55:00.000 0 0.008 0 0.056 49.353 61.2 10.707 7.011 17.792 1.674 0.288 15.387 0 0
2022-11-21 00:55:01.000 0 0.015 0 0.125 62.616 75.376 14.667 7.793 19.788 1.891 0.722 0.154 0 0
2022-11-21 07:55:00.000 0 0.009 0 0.042 37.125 45.625 9.399 4.313 13.219 1.246 0.137 0.081 0 0
2022-11-21 15:55:00.000 0 0.008 0 0.062 58.076 71.028 11.602 6.083 19.767 1.689 0.19 15.191 0 0
2022-11-22 00:55:00.000 0 0.003 0 0.084 60.181 89.245 17.017 8.897 25.56 1.869 0.441 7.375 0 0
2022-11-22 07:55:00.000 0 0.006 0 0.042 31.894 43.099 7.043 4.312 12.747 1.249 0.169 0.084 0 0
2022-11-22 15:55:02.000 0 0.008 0 0.056 59.067 79.39 16.03 7.044 25.992 1.684 0.457 9.355 0 0
2022-11-23 00:55:02.000 0 0.005 0 0.063 71.226 93.677 20.037 9.347 26.433 2.078 0.324 2.638 0 0
2022-11-23 07:55:01.000 0 0.006 0 0.042 32.568 42.998 6.799 4.324 17.363 1.367 0.117 1.656 0 0
2022-11-23 15:55:02.000 0 0.005 0 0.062 42.939 62.455 13.313 9.132 13.634 1.835 0.241 8.592 0 0
2022-11-24 00:55:02.000 0 0.018 0 0.061 53.496 71.57 11.27 8.978 25.15 2.091 0.252 3.619 0 0
2022-11-24 07:55:01.000 0 0.008 0 0.041 35.787 46.083 7.314 6.182 14.434 1.359 0.137 6.014 0 0
2022-11-24 15:55:01.000 0 0.013 0 0.052 53.305 67.608 12.84 9.808 19.628 1.834 0.218 4.017 0 0
2022-11-25 00:55:02.000 0 0.013 0 0.061 57.851 91.281 18.689 9.917 29.013 2.08 0.291 2.664 0 0
2022-11-25 07:55:01.000 0 0.011 0 0.041 33.519 42.916 9.603 4.492 12.993 1.365 0.301 0.091 0 0
2022-11-25 15:55:01.000 0 0.018 0 0.052 46.75 59.869 9.527 8.855 14.274 1.838 0.204 11.771 0 0
2022-11-26 00:55:02.000 0 0.018 0 0.061 56.529 71.472 12.531 10.508 19.26 2.079 0.218 1.224 0 0
2022-11-26 07:55:02.000 0 0.014 0 0.043 33.511 43.449 6.56 5.339 14.423 1.363 0.141 0.087 0 0
2022-11-26 15:55:02.000 0 0.02 0 0.056 44.73 57.982 8.594 7.442 14.151 1.839 0.221 10.427 0 0
2022-11-27 00:55:00.000 0 0.014 0 0.071 49.059 64.295 8.403 8.547 21.09 2.072 0.259 0.121 0 0
2022-11-27 07:55:00.000 0 0.008 0 0.044 30.95 40.951 5.789 5.909 13.676 1.374 0.212 0.086 0 0
2022-11-27 15:55:01.000 0 0.006 0 0.059 39.263 63.099 13.329 6.84 18.798 1.805 0.438 9.377 0 0
2022-11-28 00:55:00.000 0 0.009 0 0.065 57.197 94.065 18.395 10.032 29.594 2.088 0.306 0.248 0 0
2022-11-28 07:55:00.000 0 0.006 0 0.039 35.354 59.92 11.049 6.139 16.663 1.367 0.135 0.083 0 0
2022-11-28 15:55:01.000 0 0.01 1.61 3.257 43.694 75.657 16.019 7.061 35.576 1.842 0.749 9.242 0 0
2022-11-29 00:55:00.000 0 0.006 50.28 51.661 5.93 18.87 13.736 9.995 20.507 1.937 0.336 4.385 0 0
2022-11-29 07:55:01.000 0 0.002 33.19 34.42 0.005 0.003 7.187 5.196 13.837 1.199 0.096 0.086 0 0
2022-11-29 15:55:01.000 0 0.003 49.71 51.419 0.006 0.705 8.898 6.902 13.785 1.747 0.142 10.677 0 0
2022-11-30 00:55:00.000 0 0.004 53.57 55.944 0.006 0.891 11.364 8.738 14.981 2.018 0.187 1.616 0 0
2022-11-30 07:55:00.000 0 0.003 37.54 40.799 0.005 0.002 6.801 6.355 12.369 1.323 0.148 2.254 0 0
2022-11-30 15:55:00.000 0 0.006 52.55 67.864 0.005 0.663 12.713 9.065 19.911 1.783 0.206 8.783 0 0
2022-12-01 00:55:00.000 0 0.007 53.3 70.213 0.005 1.389 12.441 7.289 22.754 2.021 0.264 3.189 0 0
2022-12-01 07:55:00.000 0 0.001 31.22 41.052 0.004 0.002 5.979 5.887 12.607 1.324 0.147 5.571 0 0
2022-12-01 15:55:01.000 0 0 41.41 57.296 0.004 0.656 9.649 7.034 20.97 1.785 0.24 8.404 0 0
2022-12-02 00:55:01.000 0 0.002 46.95 63.798 0.004 0.035 9.06 8.873 21.122 2.023 0.228 0.121 0 0
2022-12-02 07:55:01.000 0 0.003 36.67 46.902 0.005 0.004 7.912 8.35 13.829 1.328 0.243 3.472 0 0
2022-12-02 15:55:01.000 0 0.004 42.57 57.335 0.003 0.658 12.986 7.026 18.707 1.782 0.337 7.147 0 0
2022-12-03 00:55:02.000 0 0.003 50.39 66.851 0.005 0.034 13.513 6.718 22.292 2.017 0.349 3.471 0 0
2022-12-03 07:55:01.000 0 0.005 34.45 44.221 0.005 0.003 8.017 7.744 17.182 1.325 0.173 3.24 0 0
2022-12-03 15:55:00.000 0 1.302 44.46 48.749 0.004 0.657 11.326 6.942 15.559 1.67 0.275 1.617 0 0
2022-12-04 00:55:00.000 0 0.005 54.1 56.606 0.006 1.003 10.134 9.463 15.154 1.821 0.188 1.298 0 0
2022-12-04 07:55:00.000 0 0.006 31.65 32.688 0.005 0.004 5.534 4.675 12.947 1.207 0.143 4.31 0 0
2022-12-04 15:55:02.000 0 0.01 50.54 62.561 0.005 0.004 11.465 6.366 20.669 1.608 0.207 10.241 0 0
2022-12-05 00:55:02.000 0 0.011 56.37 69.854 0.006 3.514 12.07 8.455 22.654 1.827 0.27 0.122 0 0
2022-12-05 07:55:02.000 0 0.009 35.44 36.755 0.004 0.003 5.472 5.525 7.212 1.197 0.14 7.182 0 0
2022-12-05 15:55:02.000 0 0.005 46.97 54.578 0.003 0.755 12.283 7.143 21.103 1.613 0.422 4.905 0 0
2022-12-06 00:55:01.000 0 0.006 49.35 58.519 0.005 1.269 9.643 8.041 13.866 1.824 0.206 7.723 0 0
2022-12-06 07:55:00.000 0 0.01 31.71 37.333 0.004 0.004 7.08 4.822 6.315 1.193 0.136 0.082 0 0
2022-12-06 15:55:01.000 0.002 4.843 48.69 59.213 0.004 0.699 15.246 7.262 25.896 1.609 0.429 3.76 0 0
2022-12-07 00:55:00.000 0 0.021 58.6 61.842 0.005 6.846 11.247 8.357 19.981 1.82 0.242 4.737 0 0
2022-12-07 07:55:00.000 0 0.019 37.51 44.291 0.004 0.003 8.446 5.746 13.418 1.197 0.177 1.958 0 0
2022-12-07 15:55:02.000 0.003 0.638 47.06 54.324 0.005 0.004 8.966 6.826 15.307 1.61 0.23 3.997 0 0
2022-12-08 00:55:01.000 0.015 0.033 52.1 54.495 0.007 0.377 7.55 8.041 20.326 1.815 0.216 0.126 0 0
2022-12-08 07:55:01.000 0 0.02 30.73 32.389 0.005 0.003 5.332 6.084 5.549 1.195 0.098 10.254 0 0
2022-12-08 15:55:01.000 0 0.021 48.02 53.836 0.004 0.716 9.885 7.412 12.435 1.609 0.234 5.294 0 0
2022-12-09 00:55:00.000 0 0.007 56.29 89.535 0.003 8.516 20.352 8.103 34.136 1.825 0.646 0.125 0 0
2022-12-09 07:55:01.000 0 0.006 38.06 40.177 0.002 0.334 5.479 7.159 14.988 1.196 0.177 7.942 0 0
2022-12-09 15:55:02.000 0 0.009 44.28 46.154 0.003 0.008 7.751 12.608 12.849 1.606 0.18 7.285 0 0
2022-12-10 00:55:00.000 0 0.008 56.72 58.709 0.002 7.649 9.08 16.702 13.871 1.816 0.227 1.084 0 0
2022-12-10 07:55:01.000 0 0.008 37.02 38.397 0.003 0.393 5.364 9.391 16.812 1.191 0.153 0.086 0 0
2022-12-23 15:55:02.000 0.008 0.006 0 0.053 53.952 64.629 12.176 15.405 -0.717 1.493 0.359 5.793 0 0
2022-12-24 00:55:00.000 0 0.001 0 0.055 59.124 10.539 12.982 17.131 -0.78 1.674 0.319 0.128 0 0
2022-12-24 07:55:02.000 0 0.001 0 0.038 38.513 47.091 10.446 -0.908 -0.52 1.108 0.515 7.178 0 0
2022-12-24 15:55:02.000 0 0.03 0 0.356 10.313 17.462 13.32 -0.708 -0.535 0.842 0.126 0.073 0 0
2022-12-25 00:55:01.000 0 0.025 0 0.05 64.567 3.331 27.403 21.602 -0.851 1.418 0.052 7.327 0 0
2022-12-25 07:55:02.000 0 0.019 0 0.036 41.611 56.151 15.942 5.272 -0.59 0.94 -0.047 0.091 0 0
2022-12-25 15:55:01.000 0 0.031 0 0.046 61.284 73.211 16.096 18.842 -0.798 1.252 1.961 10.967 0 0
2022-12-26 00:55:01.000 0 0.023 0 0.051 63.35 9.48 15.295 16.623 -0.946 1.674 0.176 0.244 0 0
2022-12-26 07:55:00.000 0 0.017 0 0.034 41.022 50.482 11.719 15.404 -0.628 1.109 0.357 4.391 0 0
2022-12-26 15:55:01.000 0 0.024 0 0.047 56.534 78.414 21.556 14.035 -0.856 1.533 0.767 1.067 0 0
2022-12-27 00:55:00.000 0 0.091 0.03 0.087 65.563 0.245 19.674 18.528 -0.946 1.894 0.485 10.931 0 0
2022-12-27 07:55:01.000 0 0.064 0.02 0.618 45.571 54.565 18.331 11.006 -0.524 1.299 0.81 6.934 0 0
2022-12-27 15:55:00.000 0 0.064 0.02 0.071 57.515 70.211 17.059 14.235 -0.771 1.51 0.514 0.105 0 0
2022-12-28 00:55:02.000 0 0.094 0.04 0.087 62.305 0.372 21.511 14.656 -0.872 1.844 0.817 7.452 0 0
2022-12-28 07:55:00.000 0 0.066 0.02 0.053 40.355 49.581 14.841 11.485 -0.624 1.198 0.67 9.739 0 0
2022-12-28 15:55:01.000 0 0.07 0.02 0.065 57.972 73.969 20.895 13.531 -0.797 1.584 0.758 6.097 0 0
2022-12-29 00:55:00.000 0 0.072 0.01 0.074 59.842 0.115 16.199 16.929 -0.856 1.779 0.308 0.121 0 0
2022-12-29 07:55:00.000 0 0.041 0 0.038 38.367 48.239 14.247 12.936 -0.582 1.144 0.429 8.694 0 0
2022-12-29 15:55:01.000 0 0.061 0.01 0.058 39.298 51.174 11.494 11.62 -0.788 1.624 0.301 3.379 0 0
2022-12-30 00:55:01.000 0 0.061 0 0.066 35.638 0.332 8.86 13.917 -0.792 1.735 0.544 0.125 0 0
2022-12-30 07:55:01.000 0 0.042 0 0.037 29.028 37.143 7.629 10.613 -0.573 1.152 0.129 4.441 0 0
2022-12-30 15:55:02.000 0 0.07 0 0.277 42.957 53.598 13.005 16.461 -0.731 1.628 0.337 8.33 0 0
2022-12-31 00:55:00.000 0 0.087 0 0.059 43.298 0.039 10.491 15.678 -0.776 1.769 0.126 0.124 0 0
2022-12-31 07:55:01.000 0 0.061 0 0.038 26.431 34.748 7.926 11.273 -0.526 1.15 0.097 4.05 0 0
2022-12-31 15:55:02.000 0 0.07 1.15 2.032 41.48 53.34 11.752 15.155 -0.717 1.597 0.244 7.564 0 0
2023-01-01 00:55:00.000 0 0.09 79.34 88.672 0.006 0.076 32.249 17.163 -0.702 1.77 1.123 2.947 0 0
2023-01-01 07:55:00.000 0 0.067 49.59 57.26 0.005 0.003 20.069 9.754 -0.463 1.068 1.005 9.849 0 0
2023-01-01 15:55:01.000 0 0.082 59.43 76.773 0.005 0.795 27.802 13.36 -0.615 1.48 1.29 7.65 0 0
2023-01-02 00:55:00.000 0 0.094 59.31 71.289 0.007 0.039 34.918 19.324 -0.718 1.674 -0.283 6.493 0 0
2023-01-02 07:55:01.000 0 0.066 38.66 46.33 0.005 0.003 24.675 12.532 -0.476 1.1 -0.191 5.82 0 0
2023-01-02 15:55:00.000 0 0.086 79.43 93.011 0.004 1.228 41.029 14.371 -0.589 1.46 5.52 9.072 0 0
2023-01-03 00:55:02.000 0 0.103 114.38 126.388 0.006 0.04 40.385 16.676 -0.664 1.619 1.731 6.348 0 0
2023-01-03 07:55:01.000 0 0.075 64.96 71.262 0.005 0.001 21.111 7.754 -0.474 1.103 1.007 5.306 0 0
import os
from tkinter import *
import pandas as pd
from PIL import Image, ImageTk
from statistics import *
import tkinter as tk
from datetime import *
import matplotlib
matplotlib.use("TkAgg")
from matplotlib.backends.backend_tkagg import FigureCanvasTkAgg
from numpy import arange, sin, pi
import numpy as np
from matplotlib.figure import Figure
import matplotlib.dates as mdates
from matplotlib.dates import date2num
from dateutil.relativedelta import relativedelta
import pyodbc
#import datetime
from matplotlib.widgets import Slider
from datetime import date, timedelta
from matplotlib.dates import date2num, num2date
#import MySQLdb
# Take the Icon same path of this app
directory = os.path.dirname(__file__)
print(datetime.now().strftime('%A, %d %B %Y\n'))
#*************************************** SQL Connection and Data *******************************************************
conn = pyodbc.connect("Driver={SQL Server Native Client 11.0};" "Server=RUSHABH\RUSHABHPC;" "Database=SCADA;" "Trusted_Connection=yes;")
#WIN-EVP0PK28QU3 bans dairy ql server details
daily_data = pd.read_sql_query('SELECT * FROM [SCADA].[dbo].[TOTALIZER] order by SDATETIME ASC', conn)
monthSQL = pd.read_sql_query('SELECT SDATETIME,max(FE014CPV) as flow,max(FE011CPV) as steam, max(CCC) as coal FROM [SCADA].[dbo].[TOTALIZER] GROUP BY SDATETIME ORDER BY SDATETIME ASC', conn)
print(daily_data)
#************************************Group by date and last data taken by group date*******************************
class Dashboard:
    def __init__(self, window):
        self.window = window
        self.window.title('Dashboard')
        self.window.geometry('1366x768')
        self.window.state('zoomed')
        self.window.config(background='#eff5f6')
        #********************** ICON ********************************************
        #icon = PhotoImage(file=directory+'/images/dashboard-icon.png')
        #self.window.iconphoto(True, icon)
        #*************************************************************************
        #********************** Window Header ********************************************
        #*************************************************************************
        self.header = Frame(self.window, bg='#CDCD9B')
        self.header.place(x=0, y=1, width=1366, height=40)
        self.heading = Label(self.window, text='Steam Generation vs. Coal Consumption', font=("", 12, "bold"), fg='#000000', bg='#CDCD9B')
        self.heading.place(x=210, y=6)
        #*************************************************************************
        #********************** SideBar *******************************************
        #*************************************************************************
        self.sidebar =Frame(self.window, bg='#FFFFEB')
        self.sidebar.place(x=0, y=0, width=150, height=750)
        self.coal_text = Button(self.window, text=' COAL ', bg='#32cf8e',font= ("", 13, "bold"), bd=0, fg='white', cursor='hand2', activebackground='#32cf8e')
        self.coal_text.place(x=20, y=260)
        self.coal_text = Button(self.window, text='WATER', bg='#32cf8e',font= ("", 13, "bold"), bd=0, fg='white', cursor='hand2', activebackground='#32cf8e')
        self.coal_text.place(x=20, y=300)
        # Logo
        #self.logo_Image = Image.open(directory+'/images/SPSolution1.png')
        #photo = ImageTk.PhotoImage(self.logo_Image)
        #self.logo = Label(self.sidebar,image=photo, bg='#ffffff')
        #self.logo.image = photo
        #self.logo.place(x=0, y=0)
        #self.exit_Image = Image.open(directory+'/images/exit-icon.png')
        #photo = ImageTk.PhotoImage(self.exit_Image)
        #self.exit = Label(self.sidebar,image=photo, bg='#ffffff')
        #self.exit.image = photo
        #self.exit.place(x=5, y=652)
        self.exit_text = Button(self.window, text='Exit', command = self.window.destroy, bg='#ffffff',font= ("", 13, "bold"), bd=0, cursor='hand2', activebackground='#ffffff')
        self.exit_text.place(x=50, y=662)
        #******************************* Bar Chart ******************************
        #******************************* Header Of Bar Chart ********************************
        self.header = Frame(self.window, bg='#808080')
        self.header.place(x=155, y=45, width=600, height=25)
        # *********************************** Global/Common BarChart Data***********************************************
        BarWidth = 0.8
        #******************************* Yearly Data Bar Configuration ******************************
        yearData = monthSQL.groupby([pd.Grouper(key="SDATETIME", freq="AS")]).sum()
        yearData.index = yearData.index.strftime("%Y")
        Yearavg = mean(yearData['steam']) #Average taken from a value
        Yearavg1 = round(Yearavg, 3) #display 3 decimals values
        Yearstr = datetime.now().strftime('%Y')
        #************************************* Header Calculation for Year *********************************************
        self.heading = Label(self.window, text=Yearstr, font=("", 8, "bold"), fg='Black', bg='#808080')
        self.heading.place(x=160, y=46)
        self.heading = Label(self.window, text='Yearly Prod AVG :-', font=("", 8, "bold"), fg='Black', bg='#808080')
        self.heading.place(x=190, y=46)
        self.heading = Label(self.window, text=Yearavg1, font=("", 8, "bold"), fg='Black', bg='#808080')
        self.heading.place(x=300, y=46)
        # *********************************** BarChart Configurations****************************************************
        figure1 = Figure(figsize=(10, 4), dpi=100)
        ax1 = figure1.add_subplot(111)
        yearXlength = np.arange(len(yearData.index)) #Lengthforx-axisBarsize
        ax1.set_title('Yearly Steam Generation (Ton) Vs Coal Consumption (Ton) ')
        ax1.bar(yearXlength - BarWidth/4, yearData['steam'], width=BarWidth/2, facecolor='indianred',align='center')
        ax1.bar(yearXlength + BarWidth / 4, yearData['coal'], width=BarWidth/2, facecolor='#7eb54e',align='center')
        ax1.set_ylabel('Steam Generation(Ton)')
        ax1.set_xlabel('Year')
        ax1.legend(['Steam Generation', 'Coal Consumption'])
        ax1.set_xticks(yearXlength)
        ax1.set_xticklabels(yearData.index, rotation=0, fontsize=10)
        ax4 = ax1.twinx() # Create another axes that shares the same x-axis as ax.
        ax4.set_ylim(*ax1.get_ylim())
        ax4.set_ylabel('Coal Consumption(Ton)')
        bar1 = FigureCanvasTkAgg(figure1, self.window)
        bar1.get_tk_widget().pack(side=tk.TOP, fill=tk.BOTH, expand=1)
        bar1.get_tk_widget().place(x=155, y=75, width=600, height=655)
        #******************************* Monthy Data Bar Chart Calculation *********************************************
        monthdata = monthSQL.groupby([pd.Grouper(key="SDATETIME", freq="MS")]).sum()
        monthdata.index = monthdata.index.strftime("%b\n%Y")
        Monthavg = mean(monthdata['steam']) #Average taken from a value
        Monthavg1 = round(Monthavg, 3) #display 3 decimals values
        Monthstr = datetime.now().strftime('%b-%Y')
        #******************************* Monthly Header Of Bar Chart ********************************
        self.header = Frame(self.window, bg='#808080')
        self.header.place(x=760, y=46, width=600, height=25)
        #************************************* Header Calculation for Year *********************************************
        self.heading = Label(self.window, text=Monthstr, font=("", 8, "bold"), fg='Black', bg='#808080')
        self.heading.place(x=760, y=46)
        self.heading = Label(self.window, text='Yearly Prod AVG :-', font=("", 8, "bold"), fg='Black', bg='#808080')
        self.heading.place(x=820, y=46)
        self.heading = Label(self.window, text=Monthavg1, font=("", 8, "bold"), fg='Black', bg='#808080')
        self.heading.place(x=930, y=46)
        # *********************************** BarChart Configurations****************************************************
        x = np.arange(len(monthdata.index))
        figure2 = Figure(dpi=100)
        ax2 = figure2.add_subplot(111)
        ax2.set_title('Monthly Steam Generation (Ton) Vs Coal Consumption (Ton)')
        ax2.bar(x - BarWidth/4, monthdata['steam'], width=BarWidth/2, facecolor='indianred')
        ax2.bar(x + BarWidth/4, monthdata['coal'], width=BarWidth/2, facecolor='#7eb54e')
        ax2.set_ylabel('Steam Generation(Ton)')
        ax2.legend(['Steam Generation', 'Coal Consumption'])
        ax2.set_xticks(x)
        ax2.set_xticklabels(monthdata.index, rotation=0, fontsize=8)
        ax5 = ax2.twinx() # Create another axes that shares the same x-axis as ax.
        ax5.set_ylim(*ax2.get_ylim())
        ax5.set_ylabel('Coal Consumption(Ton)')
        bar2 = FigureCanvasTkAgg(figure2, self.window)
        bar2.get_tk_widget().pack(side=tk.TOP, fill=tk.BOTH, expand=1)
        bar2.get_tk_widget().place(x=760, y=75, width=600, height=300)
        #******************************* Daily Data Bar Chart Calculation ******************************
        datadaily = daily_data.groupby([pd.Grouper(key="SDATETIME", freq="D")]).sum()#daily_data.groupby(pd.to_datetime(daily_data['SDATETIME']).dt.strftime('%y-%m-%d'))['FE014CPV', 'CCC'].last().reset_index()
        datadaily.index = datadaily.index.strftime("%d-%m")#lastdayfrom = daily_data['SDATETIME'].max()
        print(datadaily)
        #critDate = datadaily - pd.Timedelta(days=30)
        #print(critDate)
        #df_selected = datadaily.loc[datadaily ['SDATETIME'] > critDate]
        #print(df_selected)
        Dailyavg = mean(datadaily['FE014CPV']) #Average taken from a value
        Dailyavg1 = round(Dailyavg, 3) #display 3 decimals values
        Daystr = datetime.now().strftime('%d-%b-%Y')
        #******************************* Monthly Header Of Bar Chart ********************************
        self.header = Frame(self.window, bg='#808080')
        self.header.place(x=760, y=375, width=600, height=25)
        #************************************* Header Calculation for Year *********************************************
        self.heading = Label(self.window, text=Daystr, font=("", 8, "bold"), fg='Black', bg='#808080')
        self.heading.place(x=760, y=375)
        self.heading = Label(self.window, text='Yearly Prod AVG :-', font=("", 8, "bold"), fg='Black', bg='#808080')
        self.heading.place(x=830, y=375)
        self.heading = Label(self.window, text=Dailyavg1, font=("", 8, "bold"), fg='Black', bg='#808080')
        self.heading.place(x=930, y=375)
        #*********************************** BarChart Configurations****************************************************
        figure3 = Figure(dpi=100)
        ax3 = figure3.add_subplot(111)
        ax3.set_title('Daily Steam Generation (Ton) Vs Coal Consumption (Ton)')
        z = np.arange(len(datadaily.index))#date2num(datadaily.index)
        ax3.bar(z - BarWidth/4, datadaily['FE014CPV'], width=BarWidth/2, facecolor='indianred', align='center')
        ax3.bar(z + BarWidth/4, datadaily['CCC'], width=BarWidth/2, facecolor='#7eb54e', align='center')
        ax3.set_ylabel('Steam Generation(Ton)')
        ax3.set_xticks(z)
        ax6 = ax3.twinx() # Create another axes that shares the same x-axis as ax.
        ax6.set_ylim(*ax3.get_ylim())
        ax6.set_ylabel('Coal Consumption(Ton)')
        ax3.set_xticklabels(datadaily.index, rotation=90, fontsize=6)
        ax3.legend(['Steam Generation', 'Coal Consumption'])
        #ax3.xaxis_date()
        valmin = date2num(date.today() - timedelta(days=366))
        valmax = date2num(date.today())
        ax3_Slider = Slider(ax3, datadaily.index.any(), valmin, valmax, valstep=1, color='w', initcolor='none', track_color='g')
        # adding and formatting of the date ticks
        ax3.add_artist(ax3.xaxis)
        x_tick_nums = np.linspace(valmin, valmax, 10)
        ax3.set_xticks(x_tick_nums, [num2date(s).strftime("%m-%d") for s in x_tick_nums])
        # convert slider value to date
        def changed_slider(s):
            ax3_Slider.valtext.set_text(num2date(s).date())
            # ...
            # other things that should happen when the slider value changes
        # initiate the correct date display
        changed_slider(valmin)
        ax3_Slider.on_changed(changed_slider)
        #ax3.autoscale(tight=True)
        bar3 = FigureCanvasTkAgg(figure3, self.window)
        bar3.get_tk_widget().pack(side=tk.TOP, fill=tk.BOTH, expand=1)
        bar3.get_tk_widget().place(x=760, y=400, width=600, height=300)

def win():
    window = Tk()
    Dashboard(window)
    window.mainloop()

if __name__=='__main__':
    win()
I have a number of dates and I want to show bars for only the most recent five dates. I need a slider to view the dates other than the current five.
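Not from the original post: below is a minimal, standalone matplotlib sketch of the idea, with the tkinter/SQL parts stripped out and hypothetical data. One detail worth flagging: matplotlib's Slider needs its own small Axes; passing the bar chart's Axes (ax3 in the code above) makes the slider replace the chart.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.widgets import Slider

# hypothetical data standing in for the daily totals
dates = pd.date_range('2022-10-30', periods=40, freq='D')
steam = np.random.rand(len(dates)) * 50
x = np.arange(len(dates))

fig, ax = plt.subplots()
fig.subplots_adjust(bottom=0.25)                    # leave room for the slider
ax.bar(x, steam, width=0.8, facecolor='indianred')
ax.set_xticks(x)
ax.set_xticklabels(dates.strftime('%d-%m'), rotation=90, fontsize=6)

window = 5                                          # show five dates at a time
start0 = len(x) - window                            # begin at the latest five
ax.set_xlim(start0 - 0.5, start0 + window - 0.5)

slider_ax = fig.add_axes([0.15, 0.08, 0.7, 0.04])   # the slider's own Axes
pos = Slider(slider_ax, 'start', 0, start0, valinit=start0, valstep=1)

def update(val):
    # pan the visible window of five bars as the slider moves
    start = int(pos.val)
    ax.set_xlim(start - 0.5, start + window - 0.5)
    fig.canvas.draw_idle()

pos.on_changed(update)
plt.show()

Inside the tkinter dashboard the same callback pattern should work, with fig.canvas.draw_idle() redrawing the embedded FigureCanvasTkAgg.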

Groupby: how to compute a transformation and division on every value by group

I have a dataframe like this:
participant time1 time2 ... time27
1 0.003 0.001 0.003
1 0.003 0.002 0.001
1 0.006 0.003 0.003
1 0.003 0.001 0.003
2 0.003 0.003 0.001
2 0.003 0.003 0.001
3 0.006 0.003 0.003
3 0.007 0.044 0.006
3 0.000 0.005 0.007
I need to apply a transformation using np.log1p() per participant and divide every value by that participant's maximum:
(log [X + 1]) / Xmax
How can I do this?
You can use:
df.join(df.groupby('participant')
          .transform(lambda s: np.log1p(s)/s.max())
          .add_suffix('_trans')
       )
Output (as new columns):
participant time1 time2 time27 time1_trans time2_trans time27_trans
0 1 0.003 0.001 0.003 0.499251 0.333167 0.998503
1 1 0.003 0.002 0.001 0.499251 0.666001 0.333167
2 1 0.006 0.003 0.003 0.997012 0.998503 0.998503
3 1 0.003 0.001 0.003 0.499251 0.333167 0.998503
4 2 0.003 0.003 0.001 0.998503 0.998503 0.999500
5 2 0.003 0.003 0.001 0.998503 0.998503 0.999500
6 3 0.006 0.003 0.003 0.854582 0.068080 0.427930
7 3 0.007 0.044 0.006 0.996516 0.978625 0.854582
8 3 0.000 0.005 0.007 0.000000 0.113353 0.996516
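For reference, a self-contained version of the same idea (the frame below is rebuilt from the question's sample, keeping only three of the 27 time columns):

import numpy as np
import pandas as pd

df = pd.DataFrame({
    'participant': [1, 1, 1, 1, 2, 2, 3, 3, 3],
    'time1': [0.003, 0.003, 0.006, 0.003, 0.003, 0.003, 0.006, 0.007, 0.000],
    'time2': [0.001, 0.002, 0.003, 0.001, 0.003, 0.003, 0.003, 0.044, 0.005],
    'time27': [0.003, 0.001, 0.003, 0.003, 0.001, 0.001, 0.003, 0.006, 0.007],
})

# log1p every value, then divide by that participant's column-wise maximum
trans = (df.groupby('participant')
           .transform(lambda s: np.log1p(s) / s.max())
           .add_suffix('_trans'))
print(df.join(trans))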

Winsorize a dataframe with percentile values

I'd like to replicate this method of winsorizing a dataframe over specified percentile regions in Python. I tried scipy's winsorize function but it didn't give the results I was looking for.
Example expected output for a dataframe winsorized at the 0.01 lower and 0.99 upper quantiles across each date:
Original df:
A B C D E
2020-06-30 0.033 -0.182 -0.016 0.665 0.025
2020-07-31 0.142 -0.175 -0.016 0.556 0.024
2020-08-31 0.115 -0.187 -0.017 0.627 0.027
2020-09-30 0.032 -0.096 -0.022 0.572 0.024
Winsorized data:
A B C D E
2020-06-30 0.033 -0.175 -0.016 0.64 0.025
2020-07-31 0.142 -0.169 -0.016 0.54 0.024
2020-08-31 0.115 -0.18 -0.017 0.606 0.027
2020-09-30 0.032 -0.093 -0.022 0.55 0.024
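No answer is included in this dump. A minimal sketch of one way to get there with plain pandas: clip each date's row to that row's 0.01 and 0.99 quantiles. On the sample above this reproduces the expected output (after rounding to 3 decimals); the helper name winsorize_row is mine, not from the thread.

import pandas as pd

df = pd.DataFrame(
    {'A': [0.033, 0.142, 0.115, 0.032],
     'B': [-0.182, -0.175, -0.187, -0.096],
     'C': [-0.016, -0.016, -0.017, -0.022],
     'D': [0.665, 0.556, 0.627, 0.572],
     'E': [0.025, 0.024, 0.027, 0.024]},
    index=pd.to_datetime(['2020-06-30', '2020-07-31', '2020-08-31', '2020-09-30']))

def winsorize_row(row, lower=0.01, upper=0.99):
    # clip every value in the row to that row's 1%/99% quantiles
    return row.clip(row.quantile(lower), row.quantile(upper))

print(df.apply(winsorize_row, axis=1).round(3))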

Values disappear in dataframe multiindex after set_index()

I have a dataframe that looks like this:
scn cl_name lqd_mp lqd_wp gas_mp gas_wp res_mp res_wp
12 C6 Hexanes 3.398 1.723 2.200 5.835 2.614 2.775
13 NaN Me-Cyclo-pentane 1.193 0.591 0.439 1.146 0.707 0.733
14 NaN Benzene 0.037 0.017 0.013 0.030 0.021 0.020
15 NaN Cyclo-hexane 1.393 0.690 0.697 1.820 0.944 0.979
16 C7 Heptanes 6.359 3.748 1.122 3.477 2.980 3.679
17 NaN Me-Cyclo-hexane 4.355 2.515 0.678 2.068 1.985 2.401
18 NaN Toluene 0.407 0.220 0.061 0.174 0.183 0.208
19 C8 Octanes 10.277 6.901 0.692 2.438 4.092 5.759
20 NaN Ethyl-benzene 0.146 0.091 0.010 0.032 0.058 0.076
21 NaN Meta/Para-xylene 0.885 0.553 0.029 0.095 0.333 0.436
22 NaN Ortho-xylene 0.253 0.158 0.002 0.007 0.091 0.119
23 C9 Nonanes 8.683 6.552 0.280 1.113 3.266 5.160
24 NaN Tri-Me-benzene 0.496 0.351 0.000 0.000 0.176 0.261
25 C10 Decanes 8.216 6.877 0.108 0.451 2.985 5.233
I'd like to replace all the NaN values with the values from the previous row in 'scn' column and then to reindex the dataframe using multiindex on two columns 'scn' and 'cl_name'.
I do it with those two lines of code:
df['scn'] = df['scn'].ffill()
df.set_index(['scn', 'cl_name'], inplace=True)
The first line with ffill() does what I want, replacing the NaNs with the values above. But after set_index() those values disappear, leaving blank cells.
lqd_mp lqd_wp gas_mp gas_wp res_mp res_wp
scn cl_name
C6 Hexanes 3.398 1.723 2.200 5.835 2.614 2.775
Me-Cyclo-pentane 1.193 0.591 0.439 1.146 0.707 0.733
Benzene 0.037 0.017 0.013 0.030 0.021 0.020
Cyclo-hexane 1.393 0.690 0.697 1.820 0.944 0.979
C7 Heptanes 6.359 3.748 1.122 3.477 2.980 3.679
Me-Cyclo-hexane 4.355 2.515 0.678 2.068 1.985 2.401
Toluene 0.407 0.220 0.061 0.174 0.183 0.208
C8 Octanes 10.277 6.901 0.692 2.438 4.092 5.759
Ethyl-benzene 0.146 0.091 0.010 0.032 0.058 0.076
Meta/Para-xylene 0.885 0.553 0.029 0.095 0.333 0.436
Ortho-xylene 0.253 0.158 0.002 0.007 0.091 0.119
C9 Nonanes 8.683 6.552 0.280 1.113 3.266 5.160
Tri-Me-benzene 0.496 0.351 0.000 0.000 0.176 0.261
C10 Decanes 8.216 6.877 0.108 0.451 2.985 5.233
I'd like no blanks in the 'scn' part of the index. What am I doing wrong?
Thanks
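Not part of the original post, but worth noting: the ffill'ed values are not actually gone; pandas just sparsifies repeated MultiIndex labels when it prints the frame. A quick check, assuming the frame above:

# the labels are still there even though the printout hides the repeats
print(df.index.get_level_values('scn'))

# make pandas print every label instead of the sparsified view
pd.set_option('display.multi_sparse', False)
print(df)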

Why is pandas groupby filter slower than merge?

I've noticed that pandas groupby().filter() is slow for large datasets, much slower than the equivalent merge. Here's my example:
import numpy as np
import pandas as pd

size = 50000000
df = pd.DataFrame({'M': np.random.randint(10, size=size),
                   'A': np.random.randn(size),
                   'B': np.random.randn(size)})
%%time
gb = df.groupby('M').filter(lambda x : x['A'].count()%2==0)
Wall time: 14 s
%%time
gb_int = df.groupby('M').count()%2==0
gb_int = gb_int[gb_int['A'] == True]
gb = df.merge(gb_int, left_on='M', right_index=True)
Wall time: 8.39 s
Can anyone help me understand why groupby filter is so slow?
Using %%prun, you can see that the faster merge relies on inner_join and pandas.hashtable.Int64Factorizer, whereas the slower filter uses groupby_indices and sort (showing only calls consuming more than 0.02s):
`merge`: 3361 function calls (3285 primitive calls) in 5.420 seconds
Ordered by: internal time
ncalls tottime percall cumtime percall filename:lineno(function)
1 1.092 1.092 1.092 1.092 {pandas.algos.inner_join}
4 0.768 0.192 0.768 0.192 {method 'factorize' of 'pandas.hashtable.Int64Factorizer' objects}
1 0.578 0.578 0.578 0.578 {pandas.algos.take_2d_axis1_float64_float64}
4 0.512 0.128 0.512 0.128 {method 'take' of 'numpy.ndarray' objects}
1 0.425 0.425 0.425 0.425 {method 'get_labels' of 'pandas.hashtable.Int64HashTable' objects}
1 0.381 0.381 0.381 0.381 {pandas.algos.take_2d_axis0_float64_float64}
1 0.296 0.296 0.296 0.296 {pandas.algos.take_2d_axis1_int64_int64}
1 0.203 0.203 1.563 1.563 groupby.py:3730(count)
1 0.194 0.194 0.194 0.194 merge.py:746(_get_join_keys)
1 0.130 0.130 5.420 5.420 <string>:2(<module>)
2 0.109 0.054 0.109 0.054 common.py:250(_isnull_ndarraylike)
3 0.099 0.033 0.107 0.036 internals.py:4768(needs_filling)
2 0.099 0.050 0.875 0.438 merge.py:687(_factorize_keys)
2 0.094 0.047 0.200 0.100 groupby.py:3740(<genexpr>)
2 0.083 0.041 0.083 0.041 {pandas.algos.take_2d_axis1_bool_bool}
1 0.081 0.081 0.772 0.772 algorithms.py:156(factorize)
7 0.058 0.008 1.406 0.201 common.py:733(take_nd)
1 0.049 0.049 2.521 2.521 merge.py:322(_get_join_info)
1 0.035 0.035 2.196 2.196 merge.py:516(_get_join_indexers)
1 0.030 0.030 0.030 0.030 {built-in method numpy.core.multiarray.putmask}
1 0.030 0.030 0.033 0.033 merge.py:271(_maybe_add_join_keys)
1 0.028 0.028 3.725 3.725 merge.py:26(merge)
28 0.021 0.001 0.021 0.001 {method 'reduce' of 'numpy.ufunc' objects}
And the slower filter:
3751 function calls (3694 primitive calls) in 9.110 seconds
Ordered by: internal time
ncalls tottime percall cumtime percall filename:lineno(function)
1 2.158 2.158 2.158 2.158 {pandas.algos.groupby_indices}
2 1.214 0.607 1.214 0.607 {pandas.algos.take_2d_axis1_float64_float64}
1 1.017 1.017 1.017 1.017 {method 'sort' of 'numpy.ndarray' objects}
4 0.859 0.215 0.859 0.215 {method 'take' of 'numpy.ndarray' objects}
2 0.586 0.293 0.586 0.293 {pandas.algos.take_2d_axis1_int64_int64}
1 0.534 0.534 0.534 0.534 {pandas.algos.take_1d_int64_int64}
1 0.420 0.420 0.420 0.420 {built-in method pandas.algos.ensure_object}
1 0.395 0.395 0.395 0.395 {method 'get_labels' of 'pandas.hashtable.Int64HashTable' objects}
1 0.349 0.349 0.349 0.349 {pandas.algos.groupsort_indexer}
2 0.324 0.162 0.340 0.170 indexing.py:1794(maybe_convert_indices)
2 0.223 0.112 3.109 1.555 internals.py:3625(take)
1 0.129 0.129 0.129 0.129 {built-in method numpy.core.multiarray.concatenate}
1 0.124 0.124 9.109 9.109 <string>:2(<module>)
1 0.124 0.124 0.124 0.124 {method 'copy' of 'numpy.ndarray' objects}
1 0.086 0.086 0.086 0.086 {pandas.lib.generate_slices}
31 0.083 0.003 0.083 0.003 {method 'reduce' of 'numpy.ufunc' objects}
1 0.076 0.076 0.710 0.710 algorithms.py:156(factorize)
5 0.074 0.015 2.415 0.483 common.py:733(take_nd)
1 0.067 0.067 0.068 0.068 numeric.py:2476(array_equal)
1 0.063 0.063 8.985 8.985 groupby.py:3523(filter)
1 0.062 0.062 2.640 2.640 groupby.py:4300(_groupby_indices)
10 0.059 0.006 0.059 0.006 common.py:250(_isnull_ndarraylike)
1 0.030 0.030 0.030 0.030 {built-in method numpy.core.multiarray.putmask}
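Not from the original answer: if speed is the only concern, a common workaround is to express the group-level condition with transform and select rows with a boolean mask, which avoids both filter()'s per-group reassembly and the merge:

# keep rows whose 'M' group has an even count of non-null 'A' values
mask = df.groupby('M')['A'].transform('count') % 2 == 0
gb = df[mask]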
