How can I restore a two-dimensional array that was saved to a CSV with pandas and then read back as a string?
print(data_target_test['texts'][1])
print(type(data_target_test['labels'][1]))
data_target_test.to_csv('test.csv')
data_target_test_2 = pd.read_csv('test.csv')
a = data_target_test_2['texts'][1]
I have tried string processing, but it is too complex and time-consuming because the number of spaces in front of each number varies. Is there a quick way to get the original array back?
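One possible approach, as a minimal sketch: it assumes the CSV cell holds the text that str() produces for a numeric 2-D NumPy array (nested brackets, values separated by a varying number of spaces). The helper name parse_2d_array_string is made up for this illustration. Splitting on whitespace removes the problem of uneven spacing, and counting the closing brackets gives the number of rows.

```python
import numpy as np

def parse_2d_array_string(text):
    """Rebuild a 2-D array from the text that str() produces for it.

    Hypothetical helper; assumes numeric values and exactly two bracket levels.
    """
    if "..." in text:
        # NumPy abbreviates large arrays when printing; those values cannot be recovered.
        raise ValueError("array text was truncated, values are lost")
    # One ']' closes each row and one more closes the whole array.
    n_rows = text.count("]") - 1
    # split() without arguments collapses any run of spaces or newlines.
    numbers = text.replace("[", " ").replace("]", " ").split()
    return np.array(numbers, dtype=float).reshape(n_rows, -1)

original = np.array([[1, 22, 3], [4, 5, 666]])
as_text = str(original)          # stand-in for the string read back from the CSV
restored = parse_2d_array_string(as_text)
print(restored.shape)
```

The result is always a float array here; cast it with astype if the original held integers. If the arrays are large, the printed text may already be abbreviated with "...", in which case a binary format such as np.save is a safer way to store them.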
I'm working on using AI to give me better odds at winning Keno. (don't laugh lol)
My issue is that when I gather my data, each drawing arrives as a 1-D array. I have separate files that gather the data, format it, and perform simple maths on the data set. Now I'm trying to get the data into the right shape for my neural network layers and am having issues.
formatted_list = file.readlines()
#remove newline chars
formatted_list = list(filter(("\n").__ne__, formatted_list))
#iterate through each drawing, format the ends and split into list of ints
for i in formatted_list:
    i = i[1:]
    i = i[:-2]
    i = [int(j) for j in i.split(",")]
    #convert to numpy array
    temp = np.array(i)
    #t1 = np.reshape(temp, (-1, len(temp)))
    #print(np.shape(t1))
    #append to master list
    master_list.append(temp)
print(np.shape(master_list))
This gives output of "(292,)", which is correct in that there are 292 rows of data; however, each row also contains 20 columns. If I uncomment the lines "#t1 = np.reshape(temp, (-1, len(temp)))" and "#print(np.shape(t1))", the output is "(1,20)(1,20)(1,20)(1,20)(1,20)(1,20)(1,20)(1,20)", etc. I want all of those rows stacked together with the columns kept the same, giving (292,20). How can this be accomplished?
I've tried reshaping the final list and many other things with no luck. In one attempt every number in each row ended up in the first dimension, i.e. (5840,). I was expecting to be able to append each new drawing to a master list, convert it to a numpy array, and reshape it to 292 rows of 20 columns. It just appears that it wants to keep the single dimension. I've also tried numpy.concat with no luck. Thank you.
You can use vstack to concatenate your master_list.
master_list = []
for array in formatted_list:
    master_list.append(array)
master_array = np.vstack(master_list)
Alternatively, if you know how many arrays formatted_list contains and the length of each array, you can preallocate master_array.
import numpy as np
formatted_list = [np.random.rand(20)]*292
master_array = np.zeros((len(formatted_list), len(formatted_list[0])))
for i, array in enumerate(formatted_list):
    master_array[i,:] = array
** Edit **
As mentioned by hpaulj in the comments, np.array(), np.stack() and np.vstack() worked with this input and produced a numpy array with shape (7,20).
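For illustration, here is a minimal end-to-end sketch of the parsing and stacking steps. The raw_lines data is invented and much shorter than a real drawing. The sketch also checks that every row has the same length before stacking; a 1-D shape such as (292,) can be a sign that some rows differ in length, although the question does not confirm that this was the cause.

```python
import numpy as np

# Hypothetical raw lines in the same "[n,n,...]\n" layout, with a blank line mixed in.
raw_lines = ["[1,2,3,4]\n", "\n", "[5,6,7,8]\n", "[9,10,11,12]\n"]

rows = []
for line in raw_lines:
    stripped = line.strip()
    if not stripped:
        continue  # skip blank lines
    # Drop the surrounding brackets, then split on commas.
    values = [int(v) for v in stripped.strip("[]").split(",")]
    rows.append(np.array(values))

# Stacking only yields a 2-D array when all rows have equal length.
row_lengths = {len(r) for r in rows}
if len(row_lengths) != 1:
    raise ValueError(f"rows have different lengths: {sorted(row_lengths)}")

master_array = np.vstack(rows)
print(master_array.shape)
```

With equal-length rows, np.array(rows) and np.stack(rows) give the same 2-D result as np.vstack(rows).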
I am having the following trouble in Python. Assume a numpy.matrix A whose entries have dtype complex128. I want to export A in CSV format so that the entries are separated by commas and each line of the output file corresponds to a row of A. I also need 18 decimal places of precision for both the real and imaginary parts, and no spaces within an entry. For example, I need this
`6.103515626000000000e+09+1.712134684679831166e+05j`
instead of
`6.103515626000000000e+09 + 1.712134684679831166e+05j`
The following command works, but only for a 1-by-1 matrix:
numpy.savetxt('A.out', A, fmt='%.18e%+.18ej', delimiter=',')
If I use:
numpy.savetxt('A.out', A, delimiter=',')
there are two problems. First, I don't know how many decimal places are preserved by default. Second, each complex entry is put in parentheses like
(6.103515626000000000e+09+1.712134684679831166e+05j)
and I cannot read the file in Matlab.
What do you suggest?
This is probably not the most efficient way to convert the data in a large matrix, and I am sure a more efficient one-line solution exists, but you can try the code below and see if it works. Here I use pandas to save the data to a csv file. The two columns of the generated csv file are the real and imaginary parts, respectively. I also assume that the input matrix has dimension Nx1.
import pandas as pd
import numpy as np
def to_csv(t, nr_of_decimal = 18):
    # Flatten the Nx1 complex matrix so every entry is handled, not only the first
    flat = np.asarray(t).ravel()
    # Column 0 holds the real parts, column 1 the imaginary parts
    t_new = np.round(np.column_stack((flat.real, flat.imag)), decimals=nr_of_decimal)
    pd.DataFrame(t_new).to_csv('out.csv', index = False, header = False)
#Assume t is your complex matrix
t = np.matrix([[6.103515626000000000e+09+1.712134684679831166e+05j], [6.103515626000000000e+09+1.712134684679831166e+05j]])
to_csv(t)
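Another option stays within numpy.savetxt. For complex input, its documentation allows fmt to be one full string that specifies every real and imaginary part of a row, so the single-entry format from the question can be repeated once per column and joined with commas. The sketch below uses a small invented 2-by-2 array and writes to an in-memory buffer instead of 'A.out'.

```python
import io
import numpy as np

# Small example array; the first entry is the value from the question.
A = np.array([[6.103515626e+09 + 1.712134684679831166e+05j, 1 - 2j],
              [3.5 + 0j, -4 - 0.25j]])

entry_fmt = "%.18e%+.18ej"                      # one complex entry, no spaces
row_fmt = ",".join([entry_fmt] * A.shape[1])    # one full format string per row

buffer = io.StringIO()                          # stand-in for a file name such as 'A.out'
np.savetxt(buffer, A, fmt=row_fmt)

lines = buffer.getvalue().splitlines()
print(lines[0])
```

Because the full row format already contains the commas, the delimiter argument is not needed. The same call should work for a numpy.matrix, since savetxt converts its input to an array first.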
I am filling a numpy array in Python (I could change this to a list if necessary). I want to fill it with column headings, then enter a loop and fill the table with values, but I am struggling with which type to use for the array. I have something like this so far...
info = np.zeros(shape=(no_of_label+1,19),dtype = np.str) #Creates array to store coordinates of particles
info[0,:] = ['Xpos','Ypos','Zpos','NodeNumber','BoundingBoxTopX','BoundingBoxTopY','BoundingBoxTopZ','BoundingBoxBottomX','BoundingBoxBottomY','BoundingBoxBottomZ','BoxVolume','Xdisp','Ydisp','Zdisp','Xrot','Yrot','Zrot','CC','Error']
for i in np.arange(1,no_of_label+1,1):
    info[i,:] = [C[0],C[1],C[2],i,int(round(C[0]-b)),int(round(C[1]-b)),int(round(C[2]-b)),int(round(C[0]+b)),int(round(C[1]+b)),int(round(C[2]+b)),volume,0,0,0,0,0,0,0,0] # Fills an array with label.No., size of box, and co-ords
np.savetxt(save_path+Folder+'/Data_'+Folder+'.csv',info,fmt = '%10.5f' ,delimiter=",")
There are other things in the loop, but they are irrelevant. C is an array of floats and b is an int.
I also need to be able to save it as a csv file, as shown in the last line, and open it in Excel.
What I have now returns all the values as integers, when I need C[0], C[1], and C[2] to be floating point.
Thanks in advance!
It depends on what you want to do with this array, but I think you want 'dtype=object' instead of 'np.str'. You can do that explicitly by changing 'np.str' to 'object', or here is how I would write the first part of your code:
import numpy as np
labels = ['Xpos','Ypos','Zpos','NodeNumber','BoundingBoxTopX','BoundingBoxTopY',
'BoundingBoxTopZ','BoundingBoxBottomX','BoundingBoxBottomY','BoundingBoxBottomZ',
'BoxVolume','Xdisp','Ydisp','Zdisp','Xrot','Yrot','Zrot','CC','Error']
no_of_label = len(labels)
#make a list of length ((no_of_label+1)*19) and convert it to an array and reshape it
info = np.array([None]*((no_of_label+1)*19)).reshape(no_of_label+1, 19)
info[0] = labels
Again, there is probably a better way of doing this if you have a specific application in mind, but this should let you store different types of data in the same 2D array.
I have solved it as follows:
# The header is written by savetxt, so the array only needs one row per label
info = np.zeros(shape=(no_of_label,19),dtype=float)
for i in np.arange(1,no_of_label+1,1):
    info[i-1] = [C[0],C[1],C[2],i,int(round(C[0]-b)),int(round(C[1]-b)),int(round(C[2]-b)),int(round(C[0]+b)),int(round(C[1]+b)),int(round(C[2]+b)),volume,0,0,0,0,0,0,0,0]
np.savetxt(save_path+Folder+'/Data_'+Folder+'.csv',info,fmt = '%10.5f' ,delimiter=",",header='Xpos,Ypos,Zpos,NodeNumber,BoundingBoxTopX,BoundingBoxTopY,BoundingBoxTopZ,BoundingBoxBottomX,BoundingBoxBottomY,BoundingBoxBottomZ,BoxVolume,Xdisp,Ydisp,Zdisp,Xrot,Yrot,Zrot,CC,Error',comments='')
This uses the header argument built into numpy.savetxt; comments='' stops numpy from prefixing the header line with '# '. Thanks everyone!
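A reduced, self-contained sketch of the same idea, with only four of the columns and invented values, writing to an in-memory buffer rather than a file. It also passes one format per column, so the coordinates keep their decimals while NodeNumber is written as an integer; that per-column fmt list is an addition for illustration and not part of the solution above.

```python
import io
import numpy as np

# Invented example data: three float coordinates and an integer-valued node number.
column_names = ["Xpos", "Ypos", "Zpos", "NodeNumber"]
rows = np.array([[1.5, 2.25, 3.125, 1],
                 [4.0, 5.5, 6.75, 2]])

# One format per column: floats with 5 decimals, node number as an integer.
column_formats = ["%.5f", "%.5f", "%.5f", "%d"]

buffer = io.StringIO()  # stand-in for the .csv path
np.savetxt(buffer, rows, fmt=column_formats, delimiter=",",
           header=",".join(column_names), comments="")

lines = buffer.getvalue().splitlines()
print(lines[0])
print(lines[1])
```

Keeping the numeric data in a float array and the headings in the header argument avoids mixing strings and numbers in one array.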
A bit of context: I am writing code to save the data I plot to a text file. The data should be stored so that a script can load it back and display it again (this time without performing any calculations). The initial idea was to store the data in columns with the format x1,y1,x2,y2,x3,y3...
I am using code that simplifies to something like this (incidentally, I am not sure whether using a list to group my arrays is the most efficient approach):
import numpy as np
MatrixResults = []
x1 = np.array([1,2,3,4,5,6])
y1 = np.array([7,8,9,10,11,12])
x2 = np.array([0,1,2,3])
y2 = np.array([0,1,4,9])
MatrixResults.append(x1)
MatrixResults.append(y1)
MatrixResults.append(x2)
MatrixResults.append(y2)
MatrixResults = np.array(MatrixResults)
TextFile = open('/Users/UserName/Desktop/Datalog.txt',"w")
np.savetxt(TextFile, np.transpose(MatrixResults))
TextFile.close()
However, this code gives an error when the data sets have different lengths. I have read similar questions:
Can numpy.savetxt be used on N-dimensional ndarrays with N>2?
Table, with the different length of columns
However, these solutions require breaking the format (either by flattening or by adding filler strings to pad the shorter columns).
My issue can be summarised as:
1) Is there a method that saves the arrays individually as consecutive columns at the same time as transposing them?
2) Or is there any way to append columns to a text file (given a certain number of rows and columns to skip)?
3) Should I try this with another library such as pandas?
Thank you very much for any advice.
Edit 1:
After looking a bit more, it seems that leaving blank spaces is more inefficient than filling the lists.
In the end I wrote my own function (I'm not sure whether numpy has one for this) that pads the arrays to the same length with "nan" values.
To get the data back I use the genfromtxt method and then I use this line:
x = x[~np.isnan(x)]
to remove these cells from the arrays.
If I find a better solution I will post it :)
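The NaN-padding approach described in the edit could look roughly like this. It is a sketch, not the author's actual function: the helper name pad_with_nan is invented, and an in-memory buffer stands in for the text file.

```python
import io
import numpy as np

def pad_with_nan(arrays):
    """Place each 1-D array in its own column, padding short ones with NaN."""
    longest = max(len(a) for a in arrays)
    table = np.full((longest, len(arrays)), np.nan)
    for col, a in enumerate(arrays):
        table[:len(a), col] = a
    return table

x1 = np.array([1, 2, 3, 4, 5, 6])
y1 = np.array([7, 8, 9, 10, 11, 12])
x2 = np.array([0, 1, 2, 3])
y2 = np.array([0, 1, 4, 9])

table = pad_with_nan([x1, y1, x2, y2])

buffer = io.StringIO()          # stand-in for Datalog.txt
np.savetxt(buffer, table)       # columns are x1, y1, x2, y2
buffer.seek(0)

loaded = np.genfromtxt(buffer)
# Strip the NaN padding from each column to recover the original lengths.
columns = [col[~np.isnan(col)] for col in loaded.T]
print([len(c) for c in columns])
```

One caveat: any genuine NaN values in the data would be removed together with the padding, so this only suits data that contains no NaNs of its own.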
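To save your arrays you can use np.savez and read them back with np.load:
# Write to file; matrixResults is the list of arrays, each stored under its own key (arr_0, arr_1, ...)
np.savez(filename, *matrixResults)
# Read back
data = np.load(filename + '.npz')
matrixResults = [data[key] for key in data.files]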
As a side note, you should follow naming conventions, i.e. only class names start with upper-case letters.
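As a runnable illustration of the np.savez route, here is a small sketch that stores the four example arrays under explicit names instead of the default arr_0, arr_1, ... keys. The file name datalog.npz and the temporary directory are assumptions made so the example cleans up after itself.

```python
import os
import tempfile
import numpy as np

x1 = np.array([1, 2, 3, 4, 5, 6])
y1 = np.array([7, 8, 9, 10, 11, 12])
x2 = np.array([0, 1, 2, 3])
y2 = np.array([0, 1, 4, 9])

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "datalog.npz")
    # Keyword arguments become the names under which the arrays are stored.
    np.savez(path, x1=x1, y1=y1, x2=x2, y2=y2)
    # Read everything back while the archive is open.
    with np.load(path) as data:
        loaded = {name: data[name] for name in data.files}

print(sorted(loaded))
```

Each array keeps its own length and dtype, so no padding or transposing is needed. The trade-off is that the .npz file is binary and cannot be inspected in a text editor.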