Filling an array with data from dat files in python

I have a folder of .dat files, each of which contains data that should be placed on a 360 x 181 grid. How can I populate an array of that size with the data? The data comes out as a strip, that is, 1 x (360*181), so it needs to be reshaped and then placed into the array.
Try as I might, I cannot get this to work correctly. I was able to read the data into an array, but the values seemed to be placed into elements pseudo-randomly: the elements did not match the placement I had previously obtained in MATLAB. I also have the data in txt format, if that makes this easier.
Here is what I have so far, without much luck (I am very new to Python):
#!/usr/bin/python
############################################
#
import csv
import sys
import numpy as np
import scipy as sp
#
#############################################
level = input("Enter a level: ");
LEVEL = str(level);
MODEL = raw_input("Enter a model: ");
NX = 360;
NY = 181;
date = 201409060000;
DATE = str(date);
#############################################
FileList = [];
data = [];
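for j in range(24,384,24):
    J = str(j);
    for i in range(1,51,1):
        I = str(i);
        fileName = '/Users/alexg/ECMWF_DATA/DAT_FILES/'+MODEL+'_'+LEVEL+'_h_'+I+'_FT0'+J+'_'+DATE+'.dat';
        # Record the path, then open that file (FileList(i) tried to call the list like a function)
        FileList.append(fileName);
        fo = open(fileName, "r");
        data.append(fo);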
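The reshape step described in the question could be sketched as follows. This assumes each file holds 360*181 whitespace-separated numbers (the txt variant mentioned above); the helper name `strip_to_grid` and the in-memory stand-in file are hypothetical. One possible cause of values landing in unexpected elements is fill order: MATLAB's reshape fills column by column, while NumPy fills row by row unless `order="F"` is passed.

```python
import io
import numpy as np

NX, NY = 360, 181  # grid size from the question

def strip_to_grid(strip, shape, order="C"):
    """Reshape a 1-D strip of values into a 2-D grid of the given shape."""
    strip = np.asarray(strip).ravel()
    expected = shape[0] * shape[1]
    if strip.size != expected:
        raise ValueError("expected %d values, got %d" % (expected, strip.size))
    return strip.reshape(shape, order=order)

# Tiny demonstration of the two fill orders on a 2 x 3 grid
demo = np.arange(6)
row_major = strip_to_grid(demo, (2, 3), order="C")  # NumPy default: fill rows first
col_major = strip_to_grid(demo, (2, 3), order="F")  # MATLAB-style: fill columns first

# Stand-in for one text file; np.loadtxt also accepts a real file path
fake_file = io.StringIO(" ".join(str(v) for v in range(NX * NY)))
strip = np.loadtxt(fake_file)
grid = strip_to_grid(strip, (NX, NY), order="F")
```

If the files are binary rather than text, `np.fromfile(path, dtype=...)` could take the place of `np.loadtxt`, but the correct dtype depends on how the files were written, which the question does not state.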

Related

How to write csv inside a loop python

I have computed the outputs I want, but I don't know how to write them to a CSV file because each result is a NumPy array.
import glob
import cv2
import numpy as np
def find_mode(np_array):
    # Return the most frequent value in the array
    vals, counts = np.unique(np_array, return_counts=True)
    index = np.argmax(counts)
    return vals[index]
folder = "C:/Users/ROG FLOW/Desktop/Untuk SIDANG TA/Sudah Aman/testbikincsv/folderdatacitra/*.jpg"
for file in glob.glob(folder):
    a = cv2.imread(file)
    rows = a.shape[0]
    cols = a.shape[1]
    # Centre of the image
    middlex = cols/2
    middley = rows/2
    middle = [middlex,middley]
    # Corners of a 20 x 20 pixel window around the centre
    titikawalx = middlex - 10
    titikawaly = middley - 10
    titikakhirx = middlex + 10
    titikakhiry = middley + 10
    crop = a[int(titikawaly):int(titikakhiry), int(titikawalx):int(titikakhirx)]
    c = cv2.cvtColor(crop, cv2.COLOR_BGR2HSV)
    H,S,V = cv2.split(c)
    hsv_split = np.concatenate((H,S,V),axis=1)
    Modus_citra = find_mode(H) #how to put this in csv
My output is Modus_citra, a value of type np.uint8. I am trying to write it to a CSV file, but I am confused about how to do that because the result is produced inside a loop.
Can someone help me write it to a CSV file? I appreciate any help.

Run your loop and collect the results in a list,
e.g. mydata = [[result1], [result2], [result3]] (one inner list per CSV row).
Then create a writer with csv.writer(f) and call writer.writerows(mydata) to write the list as CSV rows.
https://docs.python.org/3/library/csv.html#csv.csvwriter.writerows
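A minimal sketch of that approach, using small hypothetical arrays in place of the hue channel of each cropped image (the file names and the output path are made up for illustration):

```python
import csv
import os
import tempfile
import numpy as np

def find_mode(np_array):
    # Most frequent value in the array
    vals, counts = np.unique(np_array, return_counts=True)
    return vals[np.argmax(counts)]

# Hypothetical stand-ins for the H channel of each cropped image
fake_hues = {
    "a.jpg": np.array([[10, 10], [20, 10]], dtype=np.uint8),
    "b.jpg": np.array([[7, 8], [8, 8]], dtype=np.uint8),
}

rows = []
for name, hue in fake_hues.items():
    # int() turns the np.uint8 scalar into a plain Python int
    rows.append([name, int(find_mode(hue))])

# Write everything once, after the loop has finished
out_path = os.path.join(tempfile.mkdtemp(), "modes.csv")
with open(out_path, "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["file", "hue_mode"])  # header row
    writer.writerows(rows)
```

Collecting the rows first and writing once keeps the file handling out of the image loop; opening the file before the loop and calling `writer.writerow` per image would work as well.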

You can save your NumPy arrays to CSV files using the savetxt() function. This function takes a filename and array as arguments and saves the array into CSV format. You must also specify the delimiter; this is the character used to separate each variable in the file, most commonly a comma. For example:
import numpy as np
my_array = np.array([1,2,3,4,5,6,7,8,9,10])
# savetxt writes the file and returns None, so there is nothing to assign or print
np.savetxt('randomtext.csv', my_array, delimiter = ',', fmt = '%d')

Approach to merge a template with header and items with data for each entry

I'm trying to learn Python and find a solution for my business.
I work with SAP and need to merge data into a template.
The merge works in Excel VBA, but filling a file with 10,000 entries takes a very long time.
My template is available here:
https://docs.google.com/spreadsheets/d/1FXc-4zUYx0fjGRvPf0FgMjeTm9nXVfSt/edit?usp=sharing&ouid=113964169462465283497&rtpof=true&sd=true
And a sample of data is here
https://drive.google.com/file/d/105FP8ti0xKbXCFeA2o5HU7d2l3Qi-JqJ/view?usp=sharing
I need to merge each record from my data file into the Excel template, which has a header and 2 lines (it's an FI posting, so I need to fill the debit and the credit).
In VBA, I proceeded like this:
Fix the cell.
Copy data from the template with ActiveCell.Offset(x, y) ...
Fill the different records from my data file based on the technical name.
Now I'm trying to do the same in Python.
Using pandas or openpyxl I can open the file, but I can't see how to proceed to merge the header data (which must be copied for each posting I have to book) with the data.
from tkinter import Tk, filedialog  # "from tkinter import *" does not import the filedialog submodule
import pandas as pd
import datetime
from openpyxl import load_workbook
import numpy as np
def sap_line_item(ligne):
    ledger = ligne
    print(ligne)
    return
# Constante
c_dir = '/Users/sapfinance/PycharmProjects/SAP'
C_FILE_SEP = ';'
root = Tk()
root.withdraw()
# folder_selected = filedialog.askdirectory(initialdir=c_dir)
fiori_selected = filedialog.askopenfile(initialdir=c_dir)
data_selected = filedialog.askopenfile(initialdir=c_dir)
# read data
pd.options.display.float_format = '{:,.2f}'.format
fichier_cible = str(data_selected.name)
target_filename = fichier_cible + '_' + datetime.datetime.now().strftime("%Y%m%d-%H%M%S") + '.xlsx'
# target = pd.ExcelWriter(target_filename, engine='xlsxwriter')
df_full_data = pd.read_csv(data_selected.name, sep=C_FILE_SEP, encoding='unicode_escape', dtype='unicode')
nb_ligne_data = int(len(df_full_data))
print(nb_ligne_data)
#df_fiori = pd.read_excel(fiori_selected.name)
print(fiori_selected.name)
df_fiori = load_workbook(fiori_selected.name)
df_fiori_data = df_fiori.active
Any tips on how to approach this and find a solution would be appreciated.
Have a great day
Philippe
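One possible direction, sketched under loud assumptions: the column names `doc_id`, `amount`, `debit_account` and `credit_account` and the function `expand_postings` are hypothetical, since the real template and data layout are only available through the links above. The idea is to build all debit and credit lines in one vectorized step with pandas instead of filling cells one by one.

```python
import pandas as pd

def expand_postings(records):
    """Turn one row per posting into two rows per posting (debit, credit)."""
    debit = pd.DataFrame({
        "doc_id": records["doc_id"],
        "line": 1,
        "account": records["debit_account"],
        "debit": records["amount"],
        "credit": 0.0,
    })
    credit = pd.DataFrame({
        "doc_id": records["doc_id"],
        "line": 2,
        "account": records["credit_account"],
        "debit": 0.0,
        "credit": records["amount"],
    })
    # Stack both halves, then order them so each posting's lines sit together
    return (pd.concat([debit, credit], ignore_index=True)
              .sort_values(["doc_id", "line"], kind="stable")
              .reset_index(drop=True))

# Hypothetical sample data standing in for the real data file
records = pd.DataFrame({
    "doc_id": [1, 2],
    "amount": [100.0, 250.5],
    "debit_account": ["400000", "400100"],
    "credit_account": ["113100", "113100"],
})
postings = expand_postings(records)
```

Header fields that must be repeated for every posting could be added as extra columns in the same way, and the result could then be written out in one call (for example with `DataFrame.to_excel`), which tends to be much faster than cell-by-cell copying.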

how to edit part of an hdf5 file

I'm trying to edit precipitation rate values in an existing hdf5 file such that values >= 10 get rewritten as 1 and values < 10 get rewritten as 0. This is what I have so far. The code runs without errors, but after checking the hdf5 files it appears that the changes to the precipitation rate dataset weren't made. I'd appreciate any ideas on how to make it work.
import h5py
import numpy as np
import glob
filenames = []
filenames += glob.glob("/IMERG/Exceedance/2014_E/3B-HHR.MS.MRG.3IMERG.201401*")
for file in filenames:
    f = h5py.File(file,'r+')
    new_value = np.zeros((3600, 1800))
    new_value = new_value.astype(int)
    precip = f['Grid/precipitationCal'][0][:][:]
    for i in precip:
        for j in i:
            if j >= 10.0:
                new_value[...] = 1
            else:
                pass
    precip[...] = new_value
    f.close()

It seems like you are not writing the new values into the file, but only storing them in an array.

It seems like you're only changing the values of the array, not actually updating anything in the file object. Also, I'd get rid of that for loop - it's slow! Try this:
import h5py
import numpy as np
import glob
filenames = []
filenames += glob.glob("/IMERG/Exceedance/2014_E/3B-HHR.MS.MRG.3IMERG.201401*")
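for file in filenames:
    # The context manager closes the file, which flushes the changes to disk
    with h5py.File(file, 'r+') as f:
        dset = f['Grid/precipitationCal']
        # Read the first time slice into memory as a NumPy array
        precip = dset[0, :, :]
        # Replacing the for loop: 1 where precip >= 10, otherwise 0
        flags = np.where(precip >= 10.0, 1, 0).astype(dset.dtype)
        # Assign through the dataset itself; dset[0][:][:] = ... would only change an in-memory copy
        dset[0, :, :] = flags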

Pandas ValueError: Shape of passed values

In the following code I iterate through a list of images and count the frequencies of a given number, in this case zeros and ones. I then write this out to a csv. This works fine when I write out the list of frequencies only, but when I try to add the filename then I get the error:
ValueError: Shape of passed values is (1, 2), indices imply (2, 2)
When I try to write out one list of frequencies (number of ones) and the filenames it works fine.
My code is as follows:
import os
from osgeo import gdal
import pandas as pd
import numpy as np
# Input directory to the .kea files
InDir = "inDirectory"
# Make a list of the files
files = [file for file in os.listdir(InDir) if file.endswith('.kea')]
# Create empty list to store the counts
ZeroValues = []
OneValues = []
# Iterate through each kea file and open it
for file in files:
    print('opening ' + file)
    # Open file
    ds = gdal.Open(os.path.join(InDir, file))
    # Specify the image band
    band = ds.GetRasterBand(1)
    # Read the pixel values as an array
    arr = band.ReadAsArray()
    # Select the pixels equal to 0 (no data) and the pixels equal to 1
    ZeroPixels = arr[arr==0]
    OnePixels = arr[arr==1]
    print('Number of 0 pixels = ' + str(len(ZeroPixels)))
    print('Number of 1 pixels = ' + str(len(OnePixels)))
    # Count the number of values in the array (length) and add to the list
    ZeroValues.append(len(ZeroPixels))
    OneValues.append(len(OnePixels))
    # Close file
    ds = None
# Pandas datagram and out to csv
out = pd.DataFrame(ZeroValues, OneValues, files)
# Write the pandas dataframe to a csv
out.to_csv("out.csv", header=False, index=files)

Pandas thinks you're trying to pass OneValues and files as positional index and columns arguments. See docs.
Try wrapping your fields in a dict:
import pandas as pd
ZeroValues = [2,3,4]
OneValues = [5,6,7]
files = ["A.kea","B.kea","C.kea"]
df = pd.DataFrame(dict(zero_vals=ZeroValues, one_vals=OneValues, fname=files))
Output (the column order may differ between pandas versions):
   fname  one_vals  zero_vals
0  A.kea         5          2
1  B.kea         6          3
2  C.kea         7          4
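To finish the job from the question, the DataFrame can then be written to CSV. A minimal sketch using the same sample values; the temporary output path is only for illustration:

```python
import os
import tempfile
import pandas as pd

ZeroValues = [2, 3, 4]
OneValues = [5, 6, 7]
files = ["A.kea", "B.kea", "C.kea"]

# One column per list; all lists must have the same length
df = pd.DataFrame({"fname": files, "zero_vals": ZeroValues, "one_vals": OneValues})

out_path = os.path.join(tempfile.mkdtemp(), "out.csv")
# index=False leaves out the 0, 1, 2 row labels
df.to_csv(out_path, index=False)

# Read the file back to check what was written
round_trip = pd.read_csv(out_path)
```

Note that the `index` argument of `to_csv` is a boolean switch, so passing the list of file names to it, as in the question, does not add them as a column.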

Add scalars to vtk file (vtk 3.0 legacy) in Python

I am trying to add a calculated scalar to an existing VTK file.
A simplified version of my code is the following
import vtk
import os
import numpy as np
reader = vtk.vtkDataSetReader()
reader.SetFileName(vtk_file_name)
reader.ReadAllScalarsOn()
reader.Update()
data = reader.GetOutput() #This contains all data from the VTK
cell_data = data.GetCellData() #This contains just the cells data
scalar_data1 = cell_data.GetArray('scalar1')
scalar_data2 = cell_data.GetArray('scalar2')
scalar1 = np.array([scalar_data1.GetValue(i) for i in range(data.GetNumberOfCells())])
scalar2 = np.array([scalar_data2.GetValue(i) for i in range(data.GetNumberOfCells())])
scalar3 = scalar1 - scalar2
writer = vtk.vtkDataSetWriter()
At this point I assume that I need to add a vtkArray to data by using data.SetCell.
The problem is that SetCell asks for a vtkCellArray, and I have not yet managed to convert my array scalar3 to a vtkCellArray.
Is this the right approach? Any suggestions?

You actually need to use cell_data.AddArray() to add your array. SetCell() would actually modify the topology of your data set.
Rojj is correct about using vtk.util.numpy_support to convert back and forth between vtkArrays and numpy arrays. You can use something like the following:
import vtk
from vtk.util import numpy_support
...
scalar3_array = numpy_support.numpy_to_vtk(scalar3)
scalar3_array.SetName('scalar3')
cell_data.AddArray(scalar3_array)  # pass the converted vtk array, not the NumPy array
