Index Error after for loop has already completed one loop - python

I'm trying to plot the last 30 days of sst data using a for loop. My code will run through the first loop fine but then give this error on the second:
Traceback (most recent call last):
File "sstt.py", line 20, in <module>
Temp = Temp[i,:,:]
IndexError: too many indices for array
It doesn't matter what index I start on; the second loop always gives this error. If I start on -29, then -28 fails. If I start on -28, -27 fails, etc.
Code:
import numpy as np
import math as m
import urllib2
from pydap.client import open_url
from pydap.proxy import ArrayProxy
data_url_mean = 'http://www.esrl.noaa.gov/psd/thredds/dodsC/Datasets/noaa.oisst.v2.highres/sst.day.mean.2015.v2.nc'
dataset1 = open_url(data_url_mean)
# Daily Mean
Temp = dataset1['sst']
timestep = [-29,-28,-27,-26,-25,-24,-23,-22,-21,-20,-19,-18,-17,-16,-15,-14,-13,-12,-11,-10,-9,-8,-7,-6,-5,-4,-3,-2,-1]
for i in timestep:
    # Daily Mean
    Temp = Temp[i,:,:]
    Temp = Temp.array[:]
    Temp = Temp * (9./5.) + 32.
    Temp = Temp.squeeze()
    print i

You're assigning every result to the same variable, Temp. After the first pass of the loop, Temp is no longer the dataset but the squeezed 2-D array for a single day, so indexing it with three indices on the second pass fails.
You need to use different names for the values you compute inside the loop, so that the name holding the dataset is never overwritten.
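A minimal sketch of that fix, using a small NumPy array as a hypothetical stand-in for the remote dataset. The names sst, temp_f and daily_f are illustrative and not from the original code, and the pydap-specific .array[:] step is left out because the stand-in is already an in-memory array:

```python
import numpy as np

# Hypothetical stand-in for dataset1['sst']: 40 days on a 3 x 4 grid, in deg C.
sst = np.arange(40 * 3 * 4, dtype=float).reshape(40, 3, 4)

timestep = range(-29, 0)  # same offsets as in the question: -29 ... -1

daily_f = {}
for i in timestep:
    # Read from `sst`, but store the result under a different name,
    # so `sst` is still the full 3-D array on the next pass.
    temp_f = sst[i, :, :] * (9.0 / 5.0) + 32.0
    daily_f[i] = temp_f.squeeze()

print(len(daily_f), daily_f[-1].shape)
```

Because sst is only read inside the loop, every pass sees the same 3-D array and the three-index lookup stays valid.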

Related

How to call a function into a file

I'm trying to import a function that I have in a file in another folder, with this structure:
Data_Analytics
|---Src
| ---__init.py__
| ---DEA_functions.py
| ---importing_modules.py
|---Data Exploratory Analysis
| ----File.ipynb
So, from File.ipynb (from now I'm working in notebook) I want to call a function that I have in the file DEA_functions.py. To do that I typed:
import sys
sys.path.insert(1, "../")
from Src.Importing_modules import *
import Src.DEA_functions as DEA
No errors during the importing process but when I want to call the function I got this error:
AttributeError: module 'Src.DEA_functions' has no attribute 'getIndexes'
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-29-534dff78ff93> in <module>
4
5 #There is a negative value that we want to delete
----> 6 DEA.getIndexes(df,df['Y3'].min())
7 df['Y3'].min()
8 df['Y3'].iloc[14289]=0
AttributeError: module 'Src.DEA_functions' has no attribute 'getIndexes'
And the function is defined in the file, this way:
def getIndexes(dfObj, value):
    ''' Get index positions of value in dataframe i.e. dfObj.'''
    listOfPos = list()
    # Get bool dataframe with True at positions where the given value exists
    result = dfObj.isin([value])
    # Get list of columns that contains the value
    seriesObj = result.any()
    columnNames = list(seriesObj[seriesObj == True].index)
    # Iterate over list of columns and fetch the rows indexes where value exists
    for col in columnNames:
        rows = list(result[col][result[col] == True].index)
        for row in rows:
            listOfPos.append((row, col))
    # Return a list of tuples indicating the positions of value in the dataframe
    return listOfPos
I hope I made myself clear, but if not, do not hesitate to ask for whatever you need. I just want to use the functions I have defined in my file DEA_functions.py in my File.ipynb.
Thank you!
I found the error: I had assigned DEA as the short name for calling my functions. It looks like I had to use lowercase letters, so:
import Src.DEA_funct as dea

Key error when looping through netCDF data dates

I often work with satellite/model data, and a common task I need to perform is creating an array where every element is one of the months of the year. This generally works, but when I run the following code on an ECCO .nc file I get a key error that looks like a string of numbers (for example: KeyError: 727185600000000000)
import numpy as np
import xarray as xr
import pandas as pd
import ecco_v4_py as ecco
import os
grid = ecco.load_ecco_grid_nc(grid_dir,'ECCO-GRID.nc')
ecco1992 = xr.merge((grid,xr.open_dataset('oceFWflx_1992.nc'))).load()
ts = []
zero = 0
for time in ecco1992.oceFWflx.time:
    ts.append(ecco1992.oceFWflx.sel(time=ecco1992.variables['time'][zero]))
    zero = zero+1
However, when I do this manually, it works fine, for example:
jan = ecco1992.oceFWflx.sel(time = '1992-01-16T12:00:00.000000000')
feb = ecco1992.oceFWflx.sel(time = '1992-02-15T12:00:00.000000000')
(rest of months)
ts.append(jan)
ts.append(feb)
ts.append(rest of months)
This yields the desired array, but isn't practical with large quantities of data.
What could be the cause of this key error, and how might I avoid it?

Why is `if string[0] == '0':` working but also giving an error?

I'm really new to programming and I'm trying to make my first "project";
the project is pretty much taking my bank report Excel files and summarizing them using openpyxl.
Anyway, I'm trying to work with dates and extract the month from a date string, and it's kind of working, but it gives me a weird error in the output:
Traceback (most recent call last):
File "D:/Avi...Bank/Python App/app.py", line 22, in <module>
if month[0] == '0':
IndexError: string index out of range
this is all the code:
from pathlib import PureWindowsPath
import openpyxl as xl
import datetime as dt
wb_inputs = PureWindowsPath('D:/Avi...Inputs.xlsx')
wb_outputs = PureWindowsPath('D:/Aviv...Outputs.xlsx')
wb_inputs = xl.load_workbook(wb_inputs)
wb_outputs = xl.load_workbook(wb_outputs)
input1_sheet = wb_inputs['Input1']
input2_sheet = wb_inputs['Input2']
input3_sheet = wb_inputs['Input3']
output1_sheet = wb_outputs['Output1 - Summary']
output2_sheet = wb_outputs['Output2 - data']
for row in range(23, input3_sheet.max_row + 1):
    cell = input3_sheet.cell(row, 1)
    price_for_cell = input3_sheet.cell(row, 6)
    date_cell = str(cell.value)
    month = date_cell[5:7]
    if month[0] == '0':
        print(month[1])
    else:
        print(month)
This is a runtime error.
The value of month depends indirectly on row: it is sliced from the text of the cell in that row.
Whenever that slice comes out as an empty string, there is no index 0 to access, which raises the IndexError. The rows before it print normally, which is why the code appears to work and fail at the same time.
I would suggest two things:
Check the range of the for loop to avoid logic errors.
Use exception handling to keep the application running normally.
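A minimal sketch of guarding against the empty slice, as an alternative to catching the exception. The helper name month_label is hypothetical, and the assumption that the failing rows are empty cells (whose value would be None, so str(cell.value) is the text 'None') is a guess about the spreadsheet, not something the question confirms:

```python
def month_label(date_text):
    """Return the month of a 'YYYY-MM-DD ...' string without a leading zero,
    or None when the text is too short to contain a month."""
    month = date_text[5:7]
    if len(month) < 2:
        # e.g. 'None'[5:7] == '' for an empty cell converted with str()
        return None
    return month[1] if month[0] == '0' else month


for text in ["2020-03-15 00:00:00", "2020-11-02 00:00:00", "None"]:
    print(repr(text), "->", month_label(text))
```

If the cells hold datetime objects rather than text, reading the month attribute of cell.value directly would avoid the string slicing altogether.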

Is there a way to loop through a matrix/array/df every 30 rows to return scipy.stats.describe

I want a loop that goes over every 30th row of a (1095, 10000) array, computes scipy.stats.describe for that row (e.g. scipy.stats.describe(matrix[30])), and writes the results to a list.
I have tried doing it manually and it works, but I'm trying to optimise my code:
stats150 = scipy.stats.describe(matrix[150])
list_for_stats +=['150:', stats150]
stats180 = scipy.stats.describe(matrix[180])
list_for_stats += ['180:', stats180]
statsOut = open("myOutputStatsFile.txt", "w")
for line in list_for_stats:
    # write line to output file
    statsOut.write(str(line))
    statsOut.write("\n")
statsOut.close()
I'm looking for a for loop that is more intuitive than what I already have.
Assuming your matrix is a numpy array, this loop goes through every 30th row of a (1095, 10000) matrix of ones and stores the scipy.stats.describe results, along with the row number as a string, in a list:
import numpy as np
import scipy.stats  # import the submodule explicitly; a bare "import scipy" may not load it
matrix = np.ones(shape=(1095,10000))
list_for_stats=[]
for i in range(0,matrix.shape[0],30):
    list_for_stats +=[str(i)+':', scipy.stats.describe(matrix[i])]
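To check which rows a step of 30 actually visits, here is a small sketch that needs only NumPy. The 5-column matrix is a hypothetical stand-in for the real (1095, 10000) data, filled so that each row contains its own row number:

```python
import numpy as np

# Hypothetical stand-in: 1095 rows, 5 columns, every entry equal to its row index.
matrix = np.repeat(np.arange(1095)[:, None], 5, axis=1)

rows = list(range(0, matrix.shape[0], 30))  # 0, 30, 60, ..., 1080
every_30th = matrix[::30]                   # the same rows selected by slicing

print(len(rows), rows[-1], every_30th.shape)
```

Both forms select the same 37 rows, starting at row 0 and ending at row 1080; to start at row 30 instead, begin the range (or the slice) at 30.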

Why am I getting this error about "string indices must be integer"?

I am looking to find the total number of players by counting the unique screen names.
# Dependencies
import pandas as pd
# Save path to data set in a variable
df = "purchase_data.json"
# Use Pandas to read data
data_file_pd = pd.read_json(df)
data_file_pd.head()
# Find total numbers of players
player_count = len(df['SN'].unique())
TypeError Traceback (most recent call last)
<ipython-input-26-94bf0ee04d7b> in <module>()
1 # Find total numbers of players
----> 2 player_count = len(df['SN'].unique())
TypeError: string indices must be integers
Without access to the original data, this is guesswork. In your code, df is the file path (a string), not the dataframe, so df['SN'] tries to index a string. I think you want something like this:
# Save path variable (?)
json_data = "purchase_data.json"
# convert json data to Pandas dataframe
df = pd.read_json(json_data)
df.head()
len(df['SN'].unique())
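The error itself can be reproduced without pandas or the data file. A minimal sketch, where path is a hypothetical name for the variable that holds the file name:

```python
path = "purchase_data.json"  # a plain str, not a DataFrame

try:
    path["SN"]  # a str only accepts integer positions or slices
    error = None
except TypeError as exc:
    error = exc

print(type(error).__name__, "-", error)
```

Indexing the string with a column label raises the same TypeError as in the question, which is why the lookup has to be done on the object returned by pd.read_json instead.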
If you are getting this error while connecting to a schema, simply close the web browser, kill the pgAdmin server, and restart it. It should then work.
