How to store arrays as variables in a dictionary - Python

I'm new to python and stuck with a folder of files that I can't structure.
I have several files (.gra) that I'm importing as numpy arrays using Python. These arrays hold meteorological variables (GFS), both 3D and 2D.
The 3D variables are for different altitudes over a polygon (a location). I know that each variable is stored contiguously until the next one starts; the 3D variables come first and the 2D ones after.
I would like to create a function that iterates over a folder, reads each file and, given a step, stores each slice of the array under a key.
My final purpose is to have a dictionary with all data of each variable stored by date.
The output I want is a dataframe whose columns are an id (the date) and, for each date, the selected meteorological data by variable name.
I have tried to create a dictionary with all variables and set up a json file to help me determine the range of elements that contain each variable in the array.
gfs_info = {
    "HGTprs": (0, 3_042), "CLWMRprs": (3_042, 6_084), "RHprs": (6_084, 9_126),
    "Velprs": (9_126, 12_168), "UGRDprs": (12_168, 15_210), "VGRDprs": (15_210, 18_252),
    "TMPprs": (18_252, 21_294), "HGTsfc": (21_294, 21_411), "MSLETmsl": (21_411, 21_528),
    "PWATclm": (21_528, 21_645), "RH2m": (21_645, 21_762), "Vel100m": (21_762, 21_879),
    "UGRD100m": (21_879, 21_996), "VGRD100m": (21_996, 22_113), "Vel80m": (22_113, 22_230),
    "UGRD80m": (22_230, 22_347), "VGRD80m": (22_347, 22_464), "Vel10m": (22_464, 22_581),
    "UGRD10m": (22_581, 22_698), "VGRD10m": (22_698, 22_815), "GUSTsfc": (22_815, 22_932),
    "TMPsfc": (22_932, 23_049), "TMP2m": (23_049, 23_166), "no4LFTXsfc": (23_166, 23_283),
    "CAPEsfc": (23_283, 23_400), "SPFH2m": (23_400, 23_517), "SPFH80m": (23_517, 23_634),
}
After the 7th key, the step is 117 instead of 3,042 (those are the 2D variables).
Thanks in advance!
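One way to approach this is a plain dict comprehension over the (start, stop) ranges. The sketch below uses only a trimmed-down version of the gfs_info dict; the function names (split_variables, load_folder) are illustrative, and the assumption that .gra files are raw little-endian float32 may need adjusting to the actual format:

```python
import numpy as np
from pathlib import Path

# a trimmed-down version of the ranges dict from the question;
# the full gfs_info mapping works the same way
gfs_info = {"HGTprs": (0, 3_042), "HGTsfc": (21_294, 21_411)}

def split_variables(flat, ranges):
    """Slice a flat 1-D array into {variable_name: sub-array} using (start, stop) pairs."""
    return {name: flat[start:stop] for name, (start, stop) in ranges.items()}

def load_folder(folder):
    """Read every .gra file in a folder, keying the sliced variables by filename (the date)."""
    data = {}
    for path in sorted(Path(folder).glob("*.gra")):
        # assumption: raw little-endian float32; adjust dtype to your .gra format
        flat = np.fromfile(path, dtype="<f4")
        data[path.stem] = split_variables(flat, gfs_info)
    return data

# demo with synthetic data sized to cover the largest stop index
flat = np.arange(21_411, dtype=np.float32)
parts = split_variables(flat, gfs_info)
```

From a nested dict like the one load_folder returns, building the final dataframe is then a matter of passing it to pandas.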

Related

How to parallelize calculating with odb files in ABAQUS?

I calculate the sum of the volume at all integration points like this:
volume = f.steps['Step-1'].frames[-1].fieldOutputs['IVOL'].values
# calculate total volume at integration points
V = 0
for i in range(len(volume)):
    V = V + volume[i].data
The length of volume in my problem is about 27,000,000, so this takes too long.
I am trying to parallelize this process with Python's multiprocessing module.
As far as I know, the data should be split into several parts for that.
Could you give me some advice about splitting the data in odb files into several parts, or about parallelizing that code?
You can make use of the bulkDataBlocks command available on field output objects.
Using bulkDataBlocks, you can access all the field data as an array (an Abaqus array, actually, but numpy arrays and Abaqus arrays are largely the same), and hence manipulation becomes very easy.
For example:
# accessing the bulkDataBlocks object
ivolObj = f.steps['Step-1'].frames[-1].fieldOutputs['IVOL'].bulkDataBlocks
# accessing the data as an array --> note it is an Abaqus array type, not a numpy array
ivolData = ivolObj[0].data
# now you can sum it up
ivolSum = sum(ivolData)
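On an array of ~27 million entries, numpy's vectorized sum should be much faster than the built-in sum(), which loops at the Python level. A minimal sketch, with a stand-in array since Abaqus itself isn't available here:

```python
import numpy as np

# stand-in for ivolObj[0].data; Abaqus arrays convert cleanly via np.asarray
ivolData = np.ones(1_000_000)

# np.sum runs in compiled code, avoiding a Python-level loop over every element
ivolSum = np.asarray(ivolData).sum()
```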

(python) using a csv file to store details of a map

I am creating a text-based game in Python. In it, I will be using a CSV file to store the different tiles on the map. I would like to know what code I would need to request the 'co-ordinates' of a map tile.
For example, if I created a tile with the co-ordinates x = 5, y = 6, it would store the information (GRASS1S2s1w, for example) in the 5th column and the 6th row.
I would also like to know how to call the specific cell in which the data is stored.
Any alternate ways of doing this (not CSV) will be ignored. This is for a school project and I am too far through to change from CSV (I would have to change a lot of words in my plan).
Note: GRASS1S2I3Sc means 'grass tile' (GRASS), 'stone' (1S), 'scrap' (2S) and 'wood' (1W)
Make a 2D list containing all the information. That way you can access the value at a specific coordinate like
grid[x][y]
(avoid calling it list, which shadows the built-in). Then save the list with csv.writer.
You can read the existing CSV file back into a list in the same way to access the info.
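A minimal sketch of that approach, with rows indexed by y and columns by x to match "5th column, 6th row" for x = 5, y = 6 (the grid size, tile codes, and file name are placeholders):

```python
import csv

WIDTH, HEIGHT = 10, 10

# grid[row][col]: row = y, column = x
grid = [["EMPTY" for _ in range(WIDTH)] for _ in range(HEIGHT)]
grid[6][5] = "GRASS1S2s1w"   # place a tile at x = 5, y = 6

# save the whole map with csv.writer
with open("map.csv", "w", newline="") as f:
    csv.writer(f).writerows(grid)

# read it back as a list of rows, then index a specific cell
with open("map.csv", newline="") as f:
    loaded = list(csv.reader(f))
```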

Create dynamic names for lists [duplicate]

This question already has answers here:
Dictionary use instead of dynamic variable names in Python
(3 answers)
Closed 7 years ago.
From a shapefile I create a number of csv files, but I don't know how many will be created each time. The csv files are named road_1, road_2, etc.
These files contain coordinates. I would like to put the coordinates of every csv file into lists.
So for road_1 I would like 3 lists:
x_1, y_1, z_1
For road_2:
x_2, y_2, z_2 etc.
I tried to name the lists in the loop where I read the coordinates with list+'_'+i, where i iterates over the number of files created, but I cannot concatenate a list and a string.
EDIT
Ok, some marked this topic as a duplicate, fair enough. But just saying that I have to use a dictionary doesn't answer all of my question. I had thought of using a dictionary, but my main issue is creating the name (either the name of the list or the key of the dictionary). I have to create this in a loop, not knowing how many I have to create. Thus, the name of the list (or key) should contain a variable, which must be the number of the road. And that is where I have a problem.
As I said before, in my loop I tried to use a variable from the iteration to name my list, but this didn't work, since it's not possible to concatenate a list with a string. I could create an empty dictionary with many empty key:value pairs, but I would still have to go through the key names in the loop to add the values from the csv file to the dict.
Since it has been asked many times, I won't write the code but only point you in the right direction (and maybe a different approach).
Use the glob module, which returns the file names. Something like:
import glob
for csvFileName in glob.glob("dir/*.csv"):
will put each filename into the variable csvFileName.
Then you simply open your csv files with something like:
with open(csvFileName, "r") as fileToRead:
    for row in fileToRead:
        # the structure of your csv file is unknown, so cannot say more here
Then it's simple: find the columns you're interested in and create a dict with the variables you need as keys. Use a counter to increment. All the information is there!
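Putting those pieces together, one way to build the dict is to pull the road number out of each filename and use it as the key, so nothing needs to be known in advance. A sketch, assuming each csv row holds x, y, z columns in that order (the demo file and the helper name load_roads are illustrative):

```python
import csv
import glob
import os
import re

def load_roads(pattern):
    """Return {road_number: {"x": [...], "y": [...], "z": [...]}} for each matching file."""
    roads = {}
    for path in glob.glob(pattern):
        # pull the road number out of the filename, e.g. "road_2.csv" -> "2"
        number = re.search(r"road_(\d+)", os.path.basename(path)).group(1)
        cols = {"x": [], "y": [], "z": []}
        with open(path, newline="") as f:
            for row in csv.reader(f):          # assumption: columns are x, y, z
                for key, value in zip(("x", "y", "z"), row):
                    cols[key].append(float(value))
        roads[number] = cols
    return roads

# demo: write a tiny road_1.csv, then load everything that matches
with open("road_1.csv", "w", newline="") as f:
    csv.writer(f).writerows([[0.0, 1.0, 2.0], [3.0, 4.0, 5.0]])
roads = load_roads("road_*.csv")
```

The road number, not a dynamically built variable name, becomes the dictionary key, which is exactly the substitution the duplicate link suggests.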

Creating and Storing Multi-Dimensional Array in a netCDF File

This question potentially has two parts, but maybe only one if the first part is encapsulated by the second. I am using Python with numpy and netCDF4.
First:
I have four lists of variable values (hereafter referred to as elevation values), each of length 28. These four lists form one set for each of 5 different latitude values, which in turn form one set for each of 24 different time values.
So: 24 times... each time with 5 latitudes... each latitude with four lists... each list with 28 values.
I want to create an array with the following dimensions (elevation, latitude, time, variable)
In words, I want to be able to specify which of the four lists I access, which index in the list, and a specific time and latitude. So an index into this array would look like this:
array(0, 1, 2, 3), where 0 specifies the first index of the 4th list (selected by the 3), 1 specifies the 2nd latitude, and 2 specifies the 3rd time; the output is the value at that point.
I won't include my code for this part, since literally the only things worth mentioning are the lists
list1=[...]
list2=[...]
list3=[...]
list4=[...]
How can I do this? Is there an easier structure for the array, or is there anything else I am missing?
Second:
I have created a netCDF file with variables having these four dimensions. I need to set those variables to the array structure made above. I have no idea how to do this, and the netCDF4 documentation covers only a 1-D array, in a fairly cryptic way. If the arrays can be written directly into the netCDF file, bypassing the need to use numpy first, by all means show me how.
Thanks!
After talking to a few people where I work we came up with this solution:
First we made an array of zeroes using the following argument:
array1=np.zeros((28,5,24,4))
Then we assigned values by specifying where in the array we wanted to change them:
array1[:,0,0,0]=list1
This inserts the values of the list into the first entry of the array.
Next to write the array to a netCDF file, I created a netCDF in the same program I made the array, made a single variable and gave it values like this:
netcdfvariable[:]=array1
Hope that helps anyone who finds this.
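The fill step from that answer can be sketched end to end in numpy. The lists here are synthetic stand-ins for the elevation values, and the final write assumes a netCDF4 variable was already created with the same four dimensions:

```python
import numpy as np

# dimensions from the question: 28 elevations, 5 latitudes, 24 times, 4 variables
array1 = np.zeros((28, 5, 24, 4))

# synthetic stand-ins for two of the four lists of 28 elevation values
list1 = list(range(28))
list2 = [v * 2.0 for v in list1]

# fill the first latitude/time slot: list1 goes into variable 0, list2 into variable 1
array1[:, 0, 0, 0] = list1
array1[:, 0, 0, 1] = list2

# writing to netCDF then reduces to a single assignment, e.g.:
#   netcdfvariable[:] = array1
# assuming the variable was created with dimensions (elevation, latitude, time, variable)
```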

Accessing Data from .mat (version 8.1) structure in Python

I have a Matlab (.mat, version >7.3) file that contains a structure (data) that itself contains many fields. Each field is a single column array. Each field represents an individual sensor and the array is the time series data. I am trying to open this file in Python to do some more analysis. I am using PyTables to read the data in:
import tables
impdat = tables.openFile('data_file.mat')
This reads the file in and I can enter the fileObject and get the names of each field by using:
impdat.root.data.__members__
This prints a list of the fields:
['rdg', 'freqlabels', 'freqbinsctr',... ]
Now, what I would like is a method to take each field in data and make a python variable (perhaps dictionary) with the field name as the key (if it is a dictionary) and the corresponding array as its value. I can see the size of the array by doing, for example:
impdat.root.data.rdg
which returns this:
/data/rdg (EArray(1, 1286920), zlib(3))
atom := Int32Atom(shape=(), dflt=0)
maindim := 0
flavor := 'numpy'
byteorder := 'little'
chunkshape := (1, 16290)
My question is how do I access some of the data stored in that large array (1, 1286920). How can I read that array into another Python variable (list, dictionary, numpy array, etc.)? Any thoughts or guidance would be appreciated.
I have come up with a working solution. It is not very elegant, as it requires an eval. First I create a new variable (alldata) pointing to the data I want to access, then I create an empty dictionary datastruct, and then I loop over all the members of data and assign the arrays to the appropriate key in the dictionary:
alldata = impdat.root.data
datastruct = {}
for names in impdat.root.data.__members__:
    datastruct[names] = eval('alldata.' + names + '[0][:]')
The '[0]' could be superfluous depending on the structure of the data you are trying to access. In my case the data is stored as an array of arrays, and I just want the first one. If you come up with a better solution, please feel free to share.
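The eval can be avoided entirely with getattr, which looks an attribute up by name. A minimal sketch with stand-in data, since the original .mat file isn't available here (SimpleNamespace and the small arrays merely mimic the PyTables node layout the post describes):

```python
import numpy as np
from types import SimpleNamespace

# stand-in for impdat.root.data: two "fields", each stored as an
# array-of-arrays the way the post describes (hence the [0] below)
data = SimpleNamespace(rdg=np.array([[1, 2, 3]]), freqbinsctr=np.array([[4, 5]]))
members = ["rdg", "freqbinsctr"]          # plays the role of data.__members__

# getattr fetches each field by name, so no eval is needed
datastruct = {name: getattr(data, name)[0][:] for name in members}
```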
I can't seem to replicate your code: I get an error when trying to open a file I made in version 8.0 using tables.
What if you took the variables within the structure and saved them to a new .mat file that contains only a plain collection of variables? That would make it much easier to deal with, and it has already been answered quite eloquently here.
That answer states that such .mat files are simply HDF5 files, which can be read with:
import numpy as np, h5py
f = h5py.File('somefile.mat','r')
data = f.get('data/variable1')
data = np.array(data) # For converting to numpy array
I'm not sure of the size of the data set you're working with. If it's large, I'm sure I could come up with a script to pull the fields out of the structures. I did find this tool, which may be helpful: it recursively gets all of the structure field names.
