Extract array from a string - python

I have a pandas series that contains:
[-3.86932793e+02 1.82297039e+01 -5.80108624e+01 3.60803151e+00\n -2.23173279e+01 -1.61694102e+01 -1.91569713e+01 -9.71229354e+00\n 1.04943316e+00 -2.32231360e+00 -1.40624006e+01 -7.31842760e+00\n 9.68115460e+00 2.42948531e+01 5.64715091e+00 2.08459357e+00\n -8.29193170e+00 -5.98514877e+00 -5.60237828e+00 5.11533863e+00\n 4.24665522e+00 2.44113892e+00 -9.27428068e-01 2.42668658e+00\n -1.29403291e+00 -6.17909507e+00 3.12809650e+00 8.99939129e+00\n 8.94010048e+00 8.05541832e+00 5.60370916e+00 -6.52764019e+00\n -9.95711382e+00 -2.02809827e-01 2.57034145e+00 -3.20973926e+00\n -9.36473473e+00 -2.29672003e+00 1.43961641e+00 6.63567513e+00]
How do I turn this into an array I can use for sklearn?

You can call .tolist() on a pandas Series to get a list object.
Since you will use it with sklearn, you most likely want a NumPy array, so you could call .to_numpy() instead, which returns an np.ndarray.

You can parse the string like this:
import numpy as np
# Adjacent string literals are joined, so each '\n' stays inside the string.
str_ = ('[-3.86932793e+02 1.82297039e+01 -5.80108624e+01 3.60803151e+00\n'
        ' -2.23173279e+01 -1.61694102e+01 -1.91569713e+01 -9.71229354e+00\n'
        ' 1.04943316e+00 -2.32231360e+00 -1.40624006e+01 -7.31842760e+00\n'
        ' 9.68115460e+00 2.42948531e+01 5.64715091e+00 2.08459357e+00\n'
        ' -8.29193170e+00 -5.98514877e+00 -5.60237828e+00 5.11533863e+00\n'
        ' 4.24665522e+00 2.44113892e+00 -9.27428068e-01 2.42668658e+00\n'
        ' -1.29403291e+00 -6.17909507e+00 3.12809650e+00 8.99939129e+00\n'
        ' 8.94010048e+00 8.05541832e+00 5.60370916e+00 -6.52764019e+00\n'
        ' -9.95711382e+00 -2.02809827e-01 2.57034145e+00 -3.20973926e+00\n'
        ' -9.36473473e+00 -2.29672003e+00 1.43961641e+00 6.63567513e+00]')
# Drop the brackets; split() with no argument splits on any whitespace, including '\n'.
str_ = str_.replace('[', '').replace(']', '')
array = np.array([float(value) for value in str_.split()])
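If every element of the Series holds a string like the one above, the same parsing can be applied per element and the results stacked into the 2-D array that sklearn expects. This is a minimal sketch: parse_array_string is a hypothetical helper name, the two short strings are made-up sample data, and it assumes every row contains the same number of values.

```python
import numpy as np
import pandas as pd


def parse_array_string(text):
    """Parse a bracketed, whitespace-separated string into a 1-D float array."""
    # strip("[]") removes the brackets; split() handles spaces and newlines alike.
    return np.array(text.strip().strip("[]").split(), dtype=float)


# Made-up sample data in the same printed-array format as the question.
series = pd.Series([
    "[-3.86932793e+02 1.82297039e+01\n -5.80108624e+01]",
    "[ 1.04943316e+00 -2.32231360e+00\n -1.40624006e+01]",
])

# One row per Series element, one column per value.
X = np.vstack([parse_array_string(s) for s in series])
print(X.shape)
```

X can then be passed to an estimator's fit method like any other feature matrix.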

Related

String type to array or list pandas column [duplicate]

I have pandas dataframe as below:
id emb
0 529581720 [-0.06815625727176666, 0.054927315562963486, 0...
1 663817504 [-0.05805483087897301, 0.031277190893888474, 0...
2 507084910 [-0.07410381734371185, -0.03922194242477417, 0...
3 1774950548 [-0.09088297933340073, -0.04383128136396408, -...
4 725573369 [-0.06329705566167831, 0.01242107804864645, 0....
The data type of the emb column is object. I want to convert those values into a NumPy array, so I tried the following:
embd = df['emb'].values
But because the values are stored as strings, I get the following output:
embd[0]
out:
array('[-0.06815625727176666, 0.054927315562963486, 0.056555990129709244, -0.04559280723333359, -0.025042753666639328, -0.06674829870462418, -0.027613995596766472,
0.05307046324014664, 0.020159300416707993, 0.012015435844659805, 0.07048438489437103,
-0.020022081211209297, -0.03899797052145004, -0.03358669579029083, -0.06369364261627197,
-0.045727960765361786, -0.05619484931230545, -0.07043793052434921, -0.07021039724349976,
2.8020248282700777E-4, -0.04271571710705757, -0.04004468396306038, 0.01802503503859043, -0.0553901381790638, 0.0068290019407868385, -0.021117383614182472, -0.06583991646766663]',
dtype='<U11190')
Can someone tell me how I can convert this into an array of float32 values?
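If each element is a single numeric string, you can use numpy.array() with dtype=np.float32 to convert an array of strings to an array of float32 values (this does not parse a string that holds a whole bracketed list, as in the emb column). Here is an example: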
import numpy as np
string_array = ["1.0", "2.5", "3.14"]
float_array = np.array(string_array, dtype=np.float32)
Alternatively, you can use the pandas function pandas.to_numeric() to convert the values of a column of a dataframe from string to float32. Here is an example:
import pandas as pd
df = pd.DataFrame({"A": ["1.0", "2.5", "3.14"]})
df["A"] = pd.to_numeric(df["A"], downcast='float')
You can also pass errors='coerce' to pd.to_numeric(), which replaces strings that cannot be parsed with NaN instead of raising an error.
df['A'] = pd.to_numeric(df['A'], errors='coerce')
Use ast.literal_eval:
import ast
df['emb'] = df['emb'].apply(ast.literal_eval)
Output:
>>> df['emb'].values
array([list([-0.06815625727176666, 0.054927315562963486]),
list([-0.05805483087897301, 0.031277190893888474]),
list([-0.07410381734371185, -0.03922194242477417]),
list([-0.09088297933340073, -0.04383128136396408]),
list([-0.06329705566167831, 0.01242107804864645])], dtype=object)
>>> np.stack(df['emb'].values)
array([[-0.06815626, 0.05492732],
[-0.05805483, 0.03127719],
[-0.07410382, -0.03922194],
[-0.09088298, -0.04383128],
[-0.06329706, 0.01242108]])
Alternatively, to store each list as a NumPy array:
df['emb'] = df['emb'].apply(lambda x: np.array(ast.literal_eval(x)))
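Since the question asks for float32, one way to go from the string column straight to a single 2-D float32 array is sketched below. The two-element rows are made-up stand-ins for the truncated embeddings shown in the question, emb_matrix is a name chosen here for illustration, and the sketch assumes every row has the same length.

```python
import ast

import numpy as np
import pandas as pd

# Made-up stand-in for the real frame: each emb cell is the string form of a list.
df = pd.DataFrame({
    "id": [529581720, 663817504],
    "emb": [
        "[-0.06815625727176666, 0.054927315562963486]",
        "[-0.05805483087897301, 2.8020248282700777E-4]",
    ],
})

# literal_eval safely turns each string into a Python list of floats.
parsed = df["emb"].apply(ast.literal_eval)

# tolist() gives a list of equal-length lists, which NumPy packs into one 2-D array.
emb_matrix = np.array(parsed.tolist(), dtype=np.float32)
print(emb_matrix.shape, emb_matrix.dtype)
```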

Trying to create Numpy matrix/array but my output adds a \t and \n in python after splitting the comma separated array of numbers

I am trying to build a NumPy array from a file of numbers that has no header, splitting each line on spaces (my objective is to remove the arrays that are ALL zeroes). This is the code I have:
with open('/Users/name/Desktop/PDB/test_d3psm_misc/d3ps.profile','r') as f:
    for line in f:
        r = line.split(" ")
        print(r)
My output:
['0.1\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.36\t0.54\t0.0\t0.0\t0.0\t\n']
['0.0\t0.06\t0.0\t0.0\t0.0\t0.0\t0.03\t0.0\t0.0\t0.0\t0.0\t0.0\t0.91\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t\n']
['0.0\t0.0\t0.0\t0.0\t0.0\t0.02\t0.02\t0.0\t0.0\t0.51\t0.16\t0.06\t0.07\t0.0\t0.0\t0.0\t0.03\t0.0\t0.03\t0.1\t\n']
['0.02\t0.0\t0.05\t0.74\t0.0\t0.0\t0.12\t0.0\t0.0\t0.0\t0.0\t0.01\t0.0\t0.0\t0.03\t0.03\t0.0\t0.0\t0.0\t0.0\t\n']
['0.18\t0.1\t0.05\t0.13\t0.01\t0.0\t0.02\t0.04\t0.05\t0.0\t0.02\t0.13\t0.0\t0.09\t0.1\t0.06\t0.01\t0.0\t0.0\t0.0\t\n']
['0.04\t0.01\t0.07\t0.27\t0.04\t0.0\t0.12\t0.0\t0.0\t0.0\t0.26\t0.08\t0.0\t0.0\t0.01\t0.01\t0.03\t0.0\t0.01\t0.06\t\n']
['0.0\t0.04\t0.01\t0.02\t0.0\t0.01\t0.02\t0.46\t0.0\t0.0\t0.0\t0.05\t0.0\t0.0\t0.21\t0.13\t0.02\t0.0\t0.03\t0.0\t\n']
['0.02\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.11\t0.02\t0.04\t0.03\t0.0\t0.15\t0.0\t0.0\t0.0\t0.0\t0.64\t0.0\t\n']
['0.0\t0.79\t0.0\t0.0\t0.0\t0.03\t0.0\t0.0\t0.01\t0.0\t0.02\t0.12\t0.0\t0.0\t0.0\t0.0\t0.02\t0.0\t0.0\t0.0\t\n']
['0.05\t0.02\t0.01\t0.0\t0.02\t0.07\t0.01\t0.0\t0.0\t0.05\t0.04\t0.09\t0.01\t0.0\t0.46\t0.02\t0.09\t0.0\t0.0\t0.05\t\n']
['0.03\t0.11\t0.31\t0.0\t0.24\t0.0\t0.0\t0.21\t0.02\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.03\t0.02\t0.0\t0.0\t0.04\t\n']
['0.08\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.09\t0.0\t0.0\t0.0\t0.0\t0.05\t0.02\t0.0\t0.0\t0.01\t0.74\t\n']
['0.2\t0.0\t0.02\t0.0\t0.01\t0.0\t0.0\t0.59\t0.02\t0.02\t0.0\t0.0\t0.0\t0.01\t0.0\t0.06\t0.03\t0.0\t0.0\t0.01\t\n']
['0.17\t0.0\t0.02\t0.0\t0.04\t0.0\t0.0\t0.05\t0.0\t0.45\t0.04\t0.0\t0.06\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.16\t\n']
['0.07\t0.0\t0.0\t0.0\t0.09\t0.0\t0.0\t0.01\t0.0\t0.33\t0.06\t0.0\t0.08\t0.04\t0.0\t0.0\t0.0\t0.0\t0.0\t0.31\t\n']
['0.07\t0.0\t0.0\t0.0\t0.01\t0.0\t0.0\t0.0\t0.0\t0.2\t0.33\t0.0\t0.01\t0.03\t0.02\t0.0\t0.01\t0.0\t0.02\t0.29\t\n']
['0.07\t0.06\t0.0\t0.01\t0.1\t0.03\t0.02\t0.01\t0.0\t0.14\t0.18\t0.0\t0.07\t0.11\t0.0\t0.01\t0.04\t0.0\t0.02\t0.12\t\n']
['0.0\t0.09\t0.48\t0.13\t0.01\t0.01\t0.04\t0.0\t0.01\t0.02\t0.02\t0.02\t0.0\t0.0\t0.01\t0.07\t0.05\t0.0\t0.0\t0.04\t\n']
['0.07\t0.22\t0.03\t0.08\t0.0\t0.06\t0.12\t0.05\t0.05\t0.01\t0.01\t0.07\t0.02\t0.0\t0.04\t0.1\t0.03\t0.0\t0.02\t0.01\t\n']
['0.03\t0.1\t0.09\t0.16\t0.01\t0.12\t0.16\t0.04\t0.02\t0.0\t0.0\t0.12\t0.0\t0.01\t0.04\t0.03\t0.03\t0.0\t0.01\t0.03\t\n']
['0.01\t0.05\t0.13\t0.09\t0.01\t0.04\t0.02\t0.44\t0.0\t0.0\t0.0\t0.14\t0.01\t0.0\t0.02\t0.02\t0.0\t0.0\t0.0\t0.0\t\n']
['0.0\t0.1\t0.02\t0.03\t0.01\t0.23\t0.23\t0.01\t0.03\t0.01\t0.06\t0.19\t0.0\t0.0\t0.01\t0.01\t0.02\t0.0\t0.0\t0.02\t\n']
['0.01\t0.0\t0.0\t0.0\t0.01\t0.0\t0.05\t0.0\t0.0\t0.23\t0.15\t0.0\t0.01\t0.01\t0.0\t0.0\t0.0\t0.0\t0.0\t0.53\t\n']
['0.01\t0.0\t0.0\t0.0\t0.0\t0.0\t0.01\t0.0\t0.0\t0.05\t0.6\t0.01\t0.03\t0.21\t0.0\t0.0\t0.01\t0.05\t0.01\t0.01\t\n']
['0.03\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.13\t0.42\t0.0\t0.06\t0.0\t0.0\t0.0\t0.02\t0.15\t0.02\t0.17\t\n']
['0.28\t0.0\t0.0\t0.0\t0.01\t0.04\t0.01\t0.26\t0.0\t0.03\t0.06\t0.01\t0.01\t0.0\t0.01\t0.02\t0.11\t0.0\t0.0\t0.16\t\n']
['0.02\t0.4\t0.02\t0.0\t0.02\t0.19\t0.01\t0.02\t0.03\t0.0\t0.0\t0.21\t0.0\t0.0\t0.0\t0.07\t0.0\t0.0\t0.0\t0.0\t\n']
['0.02\t0.69\t0.02\t0.0\t0.0\t0.01\t0.05\t0.05\t0.01\t0.0\t0.01\t0.05\t0.0\t0.01\t0.0\t0.05\t0.0\t0.01\t0.01\t0.02\t\n']
['0.07\t0.02\t0.01\t0.01\t0.01\t0.05\t0.01\t0.0\t0.05\t0.11\t0.16\t0.08\t0.03\t0.09\t0.03\t0.08\t0.04\t0.01\t0.08\t0.05\t\n']
['0.07\t0.16\t0.05\t0.11\t0.02\t0.0\t0.02\t0.24\t0.06\t0.01\t0.01\t0.1\t0.01\t0.0\t0.05\t0.04\t0.0\t0.0\t0.01\t0.04\t\n']
['0.04\t0.05\t0.09\t0.08\t0.0\t0.15\t0.07\t0.04\t0.02\t0.02\t0.01\t0.12\t0.01\t0.0\t0.07\t0.1\t0.09\t0.0\t0.02\t0.01\t\n']
['0.01\t0.01\t0.11\t0.13\t0.0\t0.02\t0.04\t0.27\t0.19\t0.0\t0.01\t0.08\t0.0\t0.0\t0.04\t0.04\t0.01\t0.0\t0.04\t0.0\t\n']
['0.12\t0.06\t0.01\t0.01\t0.03\t0.04\t0.0\t0.03\t0.02\t0.04\t0.11\t0.02\t0.05\t0.03\t0.0\t0.19\t0.11\t0.01\t0.05\t0.08\t\n']
['0.0\t0.03\t0.0\t0.0\t0.0\t0.0\t0.0\t0.03\t0.01\t0.01\t0.03\t0.0\t0.01\t0.03\t0.01\t0.0\t0.0\t0.78\t0.06\t0.01\t\n']
['0.05\t0.0\t0.02\t0.03\t0.0\t0.33\t0.08\t0.07\t0.03\t0.03\t0.01\t0.06\t0.0\t0.03\t0.0\t0.11\t0.06\t0.0\t0.02\t0.06\t\n']
['0.03\t0.0\t0.0\t0.0\t0.02\t0.01\t0.0\t0.03\t0.0\t0.05\t0.2\t0.0\t0.13\t0.31\t0.05\t0.02\t0.02\t0.0\t0.02\t0.11\t\n']
['0.01\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.01\t0.0\t0.01\t0.06\t0.0\t0.0\t0.01\t0.82\t0.04\t0.0\t0.0\t0.0\t0.03\t\n']
['0.13\t0.02\t0.01\t0.0\t0.0\t0.28\t0.0\t0.34\t0.0\t0.0\t0.0\t0.15\t0.0\t0.0\t0.0\t0.02\t0.05\t0.0\t0.0\t0.0\t\n']
['0.0\t0.0\t0.03\t0.0\t0.0\t0.0\t0.0\t0.97\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t\n']
['0.05\t0.08\t0.0\t0.0\t0.0\t0.0\t0.0\t0.39\t0.14\t0.0\t0.03\t0.12\t0.0\t0.12\t0.0\t0.02\t0.01\t0.0\t0.01\t0.02\t\n']
['0.03\t0.02\t0.0\t0.0\t0.05\t0.0\t0.0\t0.0\t0.0\t0.29\t0.1\t0.01\t0.08\t0.0\t0.04\t0.01\t0.01\t0.02\t0.0\t0.35\t\n']
['0.02\t0.01\t0.07\t0.23\t0.03\t0.02\t0.41\t0.03\t0.01\t0.01\t0.02\t0.09\t0.03\t0.0\t0.0\t0.02\t0.0\t0.0\t0.0\t0.01\t\n']
['0.07\t0.02\t0.01\t0.05\t0.0\t0.02\t0.12\t0.01\t0.01\t0.08\t0.03\t0.12\t0.0\t0.03\t0.27\t0.08\t0.01\t0.0\t0.02\t0.03\t\n']
['0.01\t0.01\t0.05\t0.17\t0.01\t0.0\t0.03\t0.63\t0.02\t0.0\t0.0\t0.01\t0.01\t0.0\t0.0\t0.05\t0.01\t0.0\t0.0\t0.01\t\n']
['0.01\t0.0\t0.0\t0.02\t0.0\t0.0\t0.9\t0.0\t0.01\t0.0\t0.05\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t\n']
['0.01\t0.02\t0.04\t0.11\t0.01\t0.01\t0.11\t0.0\t0.0\t0.0\t0.0\t0.02\t0.0\t0.0\t0.01\t0.31\t0.37\t0.0\t0.0\t0.0\t\n']
['0.1\t0.0\t0.01\t0.04\t0.0\t0.01\t0.01\t0.0\t0.0\t0.11\t0.15\t0.0\t0.01\t0.07\t0.37\t0.01\t0.02\t0.01\t0.03\t0.04\t\n']
['0.06\t0.05\t0.0\t0.01\t0.0\t0.04\t0.44\t0.02\t0.03\t0.04\t0.11\t0.03\t0.02\t0.01\t0.02\t0.02\t0.02\t0.04\t0.01\t0.03\t\n']
['0.11\t0.0\t0.01\t0.16\t0.0\t0.25\t0.3\t0.01\t0.01\t0.01\t0.01\t0.02\t0.0\t0.0\t0.01\t0.01\t0.05\t0.0\t0.01\t0.04\t\n']
['0.63\t0.0\t0.05\t0.0\t0.13\t0.0\t0.0\t0.07\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.02\t0.1\t0.0\t0.0\t0.0\t\n']
['0.44\t0.0\t0.0\t0.0\t0.01\t0.0\t0.0\t0.03\t0.0\t0.04\t0.11\t0.0\t0.12\t0.04\t0.0\t0.01\t0.0\t0.0\t0.0\t0.19\t\n']
['0.13\t0.06\t0.01\t0.0\t0.01\t0.03\t0.05\t0.0\t0.02\t0.08\t0.2\t0.05\t0.03\t0.05\t0.0\t0.01\t0.0\t0.01\t0.12\t0.14\t\n']
['0.0\t0.95\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.05\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t\n']
['0.0\t0.05\t0.0\t0.0\t0.0\t0.0\t0.91\t0.0\t0.0\t0.0\t0.03\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t\n']
['0.06\t0.0\t0.0\t0.0\t0.01\t0.0\t0.0\t0.01\t0.0\t0.1\t0.4\t0.0\t0.03\t0.02\t0.0\t0.02\t0.09\t0.0\t0.01\t0.25\t\n']
['0.02\t0.1\t0.02\t0.01\t0.01\t0.12\t0.08\t0.01\t0.04\t0.01\t0.07\t0.17\t0.03\t0.08\t0.03\t0.02\t0.0\t0.08\t0.1\t0.01\t\n']
['0.03\t0.0\t0.0\t0.0\t0.0\t0.0\t0.91\t0.0\t0.03\t0.0\t0.0\t0.0\t0.0\t0.03\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t\n']
['0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t1.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t\n']
['0.09\t0.0\t0.0\t0.0\t0.01\t0.0\t0.0\t0.01\t0.0\t0.11\t0.12\t0.0\t0.0\t0.0\t0.0\t0.09\t0.36\t0.0\t0.0\t0.22\t\n']
['0.03\t0.02\t0.03\t0.01\t0.0\t0.01\t0.0\t0.86\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.03\t0.01\t0.0\t0.0\t0.0\t\n']
['0.02\t0.0\t0.0\t0.0\t0.02\t0.0\t0.01\t0.0\t0.01\t0.25\t0.32\t0.0\t0.02\t0.04\t0.0\t0.02\t0.03\t0.0\t0.07\t0.18\t\n']
['0.03\t0.08\t0.06\t0.17\t0.01\t0.04\t0.07\t0.02\t0.03\t0.02\t0.04\t0.13\t0.01\t0.01\t0.04\t0.09\t0.09\t0.0\t0.01\t0.03\t\n']
['0.15\t0.03\t0.01\t0.01\t0.03\t0.03\t0.02\t0.07\t0.0\t0.07\t0.06\t0.08\t0.0\t0.01\t0.17\t0.03\t0.07\t0.0\t0.01\t0.16\t\n']
['0.04\t0.07\t0.04\t0.08\t0.01\t0.06\t0.27\t0.05\t0.03\t0.01\t0.03\t0.17\t0.0\t0.0\t0.02\t0.05\t0.04\t0.01\t0.0\t0.01\t\n']
['0.01\t0.03\t0.11\t0.16\t0.0\t0.05\t0.04\t0.01\t0.1\t0.02\t0.07\t0.08\t0.01\t0.05\t0.07\t0.12\t0.02\t0.0\t0.02\t0.04\t\n']
['0.04\t0.02\t0.0\t0.02\t0.01\t0.01\t0.02\t0.04\t0.0\t0.14\t0.13\t0.03\t0.01\t0.02\t0.02\t0.06\t0.03\t0.01\t0.02\t0.38\t\n']
['0.04\t0.21\t0.02\t0.02\t0.01\t0.08\t0.26\t0.01\t0.02\t0.02\t0.04\t0.07\t0.0\t0.0\t0.0\t0.07\t0.07\t0.0\t0.0\t0.04\t\n']
['0.01\t0.01\t0.03\t0.0\t0.0\t0.0\t0.0\t0.01\t0.0\t0.27\t0.22\t0.07\t0.01\t0.07\t0.01\t0.03\t0.0\t0.01\t0.1\t0.15\t\n']
['0.02\t0.01\t0.01\t0.01\t0.0\t0.0\t0.0\t0.01\t0.01\t0.2\t0.46\t0.0\t0.03\t0.07\t0.0\t0.02\t0.01\t0.01\t0.01\t0.13\t\n']
['0.28\t0.02\t0.02\t0.01\t0.0\t0.01\t0.04\t0.31\t0.01\t0.0\t0.02\t0.03\t0.0\t0.0\t0.02\t0.09\t0.06\t0.02\t0.02\t0.01\t\n']
['0.05\t0.1\t0.01\t0.02\t0.08\t0.06\t0.17\t0.04\t0.03\t0.03\t0.01\t0.08\t0.0\t0.04\t0.01\t0.14\t0.03\t0.0\t0.0\t0.11\t\n']
['0.05\t0.04\t0.01\t0.01\t0.02\t0.06\t0.02\t0.02\t0.01\t0.11\t0.06\t0.01\t0.02\t0.06\t0.02\t0.13\t0.23\t0.0\t0.04\t0.08\t\n']
['0.03\t0.22\t0.03\t0.03\t0.0\t0.08\t0.06\t0.01\t0.02\t0.01\t0.08\t0.14\t0.0\t0.02\t0.18\t0.03\t0.03\t0.0\t0.01\t0.02\t\n']
['0.03\t0.04\t0.06\t0.17\t0.04\t0.03\t0.1\t0.12\t0.02\t0.01\t0.03\t0.05\t0.0\t0.01\t0.14\t0.07\t0.02\t0.0\t0.05\t0.01\t\n']
['0.02\t0.02\t0.01\t0.03\t0.0\t0.01\t0.05\t0.01\t0.06\t0.01\t0.04\t0.03\t0.01\t0.04\t0.02\t0.04\t0.03\t0.52\t0.02\t0.02\t\n']
['0.02\t0.03\t0.01\t0.01\t0.01\t0.02\t0.02\t0.02\t0.02\t0.1\t0.34\t0.01\t0.02\t0.14\t0.01\t0.02\t0.01\t0.02\t0.05\t0.08\t\n']
['0.05\t0.24\t0.05\t0.02\t0.01\t0.02\t0.07\t0.03\t0.02\t0.03\t0.04\t0.06\t0.0\t0.02\t0.02\t0.13\t0.13\t0.0\t0.03\t0.03\t\n']
['0.01\t0.01\t0.01\t0.01\t0.0\t0.01\t0.03\t0.01\t0.01\t0.02\t0.01\t0.02\t0.0\t0.1\t0.04\t0.03\t0.01\t0.05\t0.61\t0.01\t\n']
['0.09\t0.15\t0.02\t0.26\t0.0\t0.02\t0.11\t0.02\t0.04\t0.02\t0.02\t0.1\t0.0\t0.0\t0.06\t0.04\t0.01\t0.0\t0.0\t0.03\t\n']
['0.02\t0.03\t0.0\t0.05\t0.0\t0.0\t0.01\t0.01\t0.0\t0.11\t0.31\t0.06\t0.01\t0.18\t0.0\t0.03\t0.01\t0.0\t0.07\t0.08\t\n']
['0.02\t0.04\t0.0\t0.06\t0.0\t0.0\t0.04\t0.01\t0.0\t0.0\t0.04\t0.01\t0.01\t0.01\t0.71\t0.01\t0.01\t0.0\t0.0\t0.02\t\n']
['0.03\t0.07\t0.04\t0.07\t0.01\t0.08\t0.1\t0.04\t0.06\t0.03\t0.01\t0.17\t0.0\t0.01\t0.1\t0.08\t0.07\t0.0\t0.0\t0.02\t\n']
['0.01\t0.36\t0.02\t0.06\t0.0\t0.08\t0.06\t0.01\t0.08\t0.03\t0.01\t0.08\t0.0\t0.03\t0.01\t0.11\t0.04\t0.0\t0.01\t0.01\t\n']
['0.07\t0.02\t0.0\t0.02\t0.0\t0.04\t0.03\t0.02\t0.06\t0.04\t0.34\t0.02\t0.07\t0.04\t0.01\t0.04\t0.03\t0.03\t0.06\t0.07\t\n']
['0.04\t0.05\t0.01\t0.0\t0.0\t0.0\t0.03\t0.01\t0.01\t0.25\t0.04\t0.01\t0.1\t0.01\t0.02\t0.03\t0.06\t0.1\t0.0\t0.2\t\n']
['0.05\t0.24\t0.04\t0.08\t0.01\t0.01\t0.05\t0.09\t0.07\t0.02\t0.04\t0.18\t0.01\t0.01\t0.01\t0.02\t0.02\t0.02\t0.0\t0.03\t\n']
['0.04\t0.07\t0.03\t0.01\t0.01\t0.02\t0.03\t0.3\t0.01\t0.01\t0.01\t0.08\t0.01\t0.01\t0.21\t0.05\t0.07\t0.0\t0.0\t0.02\t\n']
['0.04\t0.13\t0.06\t0.01\t0.0\t0.01\t0.03\t0.04\t0.03\t0.04\t0.12\t0.09\t0.03\t0.01\t0.01\t0.03\t0.08\t0.0\t0.06\t0.17\t\n']
['0.08\t0.02\t0.02\t0.01\t0.16\t0.01\t0.01\t0.08\t0.02\t0.03\t0.03\t0.03\t0.02\t0.09\t0.05\t0.03\t0.03\t0.01\t0.24\t0.04\t\n']
['0.01\t0.19\t0.01\t0.11\t0.01\t0.02\t0.02\t0.03\t0.01\t0.24\t0.04\t0.1\t0.0\t0.07\t0.02\t0.02\t0.01\t0.0\t0.03\t0.06\t\n']
['0.02\t0.01\t0.02\t0.01\t0.0\t0.01\t0.03\t0.66\t0.01\t0.01\t0.02\t0.02\t0.0\t0.02\t0.02\t0.02\t0.03\t0.0\t0.04\t0.04\t\n']
['0.02\t0.06\t0.02\t0.02\t0.0\t0.42\t0.04\t0.02\t0.06\t0.01\t0.05\t0.12\t0.01\t0.02\t0.01\t0.02\t0.04\t0.0\t0.02\t0.04\t\n']
['0.05\t0.08\t0.06\t0.03\t0.0\t0.03\t0.04\t0.02\t0.03\t0.02\t0.02\t0.36\t0.03\t0.01\t0.01\t0.07\t0.1\t0.03\t0.01\t0.01\t\n']
['0.02\t0.05\t0.01\t0.09\t0.0\t0.44\t0.11\t0.02\t0.03\t0.04\t0.02\t0.03\t0.01\t0.01\t0.01\t0.02\t0.01\t0.0\t0.03\t0.05\t\n']
['0.02\t0.09\t0.01\t0.01\t0.0\t0.02\t0.06\t0.02\t0.08\t0.13\t0.07\t0.27\t0.03\t0.01\t0.01\t0.03\t0.03\t0.0\t0.02\t0.1\t\n']
['0.01\t0.01\t0.01\t0.01\t0.0\t0.01\t0.01\t0.01\t0.01\t0.08\t0.06\t0.01\t0.0\t0.08\t0.01\t0.01\t0.01\t0.48\t0.11\t0.04\t\n']
['0.01\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.02\t0.02\t0.0\t0.0\t0.6\t0.0\t0.0\t0.0\t0.04\t0.24\t0.02\t\n']
['0.13\t0.0\t0.0\t0.0\t0.01\t0.0\t0.0\t0.03\t0.02\t0.06\t0.49\t0.03\t0.09\t0.03\t0.01\t0.02\t0.02\t0.0\t0.0\t0.06\t\n']
['0.19\t0.0\t0.0\t0.0\t0.08\t0.0\t0.0\t0.01\t0.0\t0.13\t0.28\t0.0\t0.06\t0.1\t0.0\t0.02\t0.0\t0.0\t0.03\t0.08\t\n']
['0.04\t0.35\t0.01\t0.05\t0.0\t0.12\t0.08\t0.02\t0.02\t0.01\t0.04\t0.12\t0.03\t0.01\t0.01\t0.02\t0.01\t0.0\t0.05\t0.04\t\n']
['0.04\t0.01\t0.0\t0.01\t0.0\t0.0\t0.01\t0.02\t0.0\t0.04\t0.42\t0.01\t0.02\t0.21\t0.01\t0.04\t0.01\t0.0\t0.08\t0.07\t\n']
['0.05\t0.05\t0.02\t0.03\t0.02\t0.02\t0.05\t0.01\t0.01\t0.11\t0.12\t0.12\t0.02\t0.0\t0.03\t0.04\t0.17\t0.0\t0.01\t0.12\t\n']
['0.06\t0.0\t0.02\t0.03\t0.03\t0.0\t0.01\t0.41\t0.01\t0.02\t0.02\t0.02\t0.0\t0.05\t0.1\t0.11\t0.04\t0.0\t0.01\t0.04\t\n']
['0.05\t0.08\t0.07\t0.15\t0.01\t0.07\t0.08\t0.11\t0.04\t0.01\t0.04\t0.08\t0.0\t0.0\t0.04\t0.05\t0.06\t0.0\t0.03\t0.03\t\n']
['0.01\t0.0\t0.05\t0.34\t0.04\t0.03\t0.3\t0.01\t0.0\t0.01\t0.04\t0.0\t0.0\t0.0\t0.0\t0.12\t0.03\t0.0\t0.01\t0.0\t\n']
['0.09\t0.02\t0.03\t0.05\t0.03\t0.02\t0.09\t0.05\t0.0\t0.03\t0.07\t0.05\t0.0\t0.08\t0.02\t0.21\t0.05\t0.01\t0.02\t0.09\t\n']
['0.07\t0.04\t0.04\t0.24\t0.01\t0.08\t0.25\t0.01\t0.03\t0.0\t0.01\t0.04\t0.01\t0.0\t0.0\t0.09\t0.04\t0.0\t0.01\t0.04\t\n']
['0.06\t0.0\t0.0\t0.0\t0.0\t0.04\t0.01\t0.02\t0.01\t0.52\t0.06\t0.0\t0.01\t0.03\t0.07\t0.0\t0.03\t0.0\t0.0\t0.12\t\n']
['0.03\t0.15\t0.33\t0.11\t0.04\t0.1\t0.02\t0.02\t0.0\t0.03\t0.01\t0.06\t0.0\t0.0\t0.0\t0.04\t0.01\t0.01\t0.0\t0.03\t\n']
['0.01\t0.05\t0.02\t0.02\t0.01\t0.0\t0.0\t0.0\t0.0\t0.14\t0.34\t0.02\t0.06\t0.07\t0.08\t0.02\t0.02\t0.0\t0.0\t0.12\t\n']
['0.02\t0.05\t0.14\t0.18\t0.0\t0.16\t0.11\t0.03\t0.03\t0.02\t0.07\t0.07\t0.01\t0.01\t0.02\t0.02\t0.03\t0.0\t0.0\t0.02\t\n']
['0.16\t0.03\t0.06\t0.08\t0.03\t0.04\t0.11\t0.05\t0.04\t0.02\t0.08\t0.01\t0.02\t0.01\t0.03\t0.04\t0.09\t0.0\t0.02\t0.07\t\n']
['0.07\t0.05\t0.02\t0.03\t0.09\t0.02\t0.02\t0.12\t0.04\t0.01\t0.04\t0.04\t0.02\t0.01\t0.06\t0.14\t0.2\t0.0\t0.0\t0.02\t\n']
['0.06\t0.02\t0.09\t0.12\t0.0\t0.07\t0.12\t0.06\t0.1\t0.07\t0.06\t0.06\t0.02\t0.02\t0.02\t0.04\t0.03\t0.0\t0.01\t0.03\t\n']
['0.03\t0.06\t0.06\t0.04\t0.0\t0.06\t0.05\t0.03\t0.15\t0.02\t0.02\t0.19\t0.0\t0.01\t0.08\t0.07\t0.07\t0.0\t0.02\t0.03\t\n']
['0.16\t0.02\t0.04\t0.03\t0.03\t0.02\t0.11\t0.02\t0.0\t0.02\t0.02\t0.02\t0.0\t0.01\t0.38\t0.04\t0.02\t0.0\t0.03\t0.02\t\n']
['0.0\t0.0\t0.0\t0.02\t0.0\t0.02\t0.89\t0.0\t0.0\t0.0\t0.0\t0.01\t0.0\t0.01\t0.0\t0.0\t0.02\t0.0\t0.0\t0.0\t\n']
['0.01\t0.01\t0.0\t0.01\t0.01\t0.0\t0.01\t0.01\t0.06\t0.15\t0.07\t0.02\t0.0\t0.52\t0.01\t0.01\t0.02\t0.0\t0.0\t0.08\t\n']
['0.06\t0.03\t0.02\t0.37\t0.01\t0.07\t0.11\t0.04\t0.02\t0.03\t0.05\t0.05\t0.0\t0.0\t0.0\t0.11\t0.02\t0.0\t0.0\t0.01\t\n']
['0.22\t0.04\t0.03\t0.22\t0.02\t0.06\t0.11\t0.12\t0.02\t0.0\t0.01\t0.05\t0.0\t0.0\t0.0\t0.07\t0.01\t0.0\t0.01\t0.01\t\n']
['0.17\t0.01\t0.0\t0.0\t0.02\t0.0\t0.0\t0.0\t0.0\t0.06\t0.04\t0.0\t0.0\t0.03\t0.0\t0.0\t0.0\t0.52\t0.11\t0.0\t\n']
['0.08\t0.4\t0.01\t0.03\t0.02\t0.09\t0.07\t0.03\t0.0\t0.0\t0.01\t0.16\t0.02\t0.0\t0.0\t0.03\t0.01\t0.0\t0.0\t0.03\t\n']
['0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.04\t0.0\t0.0\t0.0\t0.95\t0.0\t0.0\t\n']
['0.01\t0.01\t0.03\t0.01\t0.01\t0.03\t0.03\t0.01\t0.03\t0.06\t0.1\t0.01\t0.09\t0.11\t0.0\t0.02\t0.01\t0.0\t0.04\t0.41\t\n']
['0.01\t0.04\t0.05\t0.22\t0.0\t0.02\t0.06\t0.05\t0.03\t0.0\t0.0\t0.05\t0.0\t0.0\t0.15\t0.27\t0.07\t0.0\t0.0\t0.0\t\n']
['0.06\t0.01\t0.0\t0.0\t0.01\t0.0\t0.0\t0.0\t0.0\t0.05\t0.19\t0.01\t0.06\t0.08\t0.15\t0.01\t0.01\t0.05\t0.24\t0.07\t\n']
['0.06\t0.01\t0.02\t0.22\t0.0\t0.05\t0.25\t0.02\t0.02\t0.0\t0.0\t0.05\t0.01\t0.0\t0.01\t0.02\t0.01\t0.24\t0.0\t0.0\t\n']
['0.04\t0.04\t0.08\t0.15\t0.0\t0.08\t0.28\t0.0\t0.01\t0.0\t0.02\t0.03\t0.01\t0.0\t0.0\t0.03\t0.02\t0.0\t0.15\t0.05\t\n']
['0.12\t0.0\t0.0\t0.0\t0.0\t0.0\t0.01\t0.0\t0.0\t0.04\t0.46\t0.0\t0.01\t0.01\t0.25\t0.01\t0.01\t0.0\t0.03\t0.05\t\n']
['0.1\t0.0\t0.0\t0.0\t0.03\t0.0\t0.01\t0.0\t0.0\t0.08\t0.22\t0.0\t0.04\t0.02\t0.24\t0.02\t0.01\t0.0\t0.02\t0.21\t\n']
['0.09\t0.16\t0.06\t0.16\t0.0\t0.07\t0.16\t0.06\t0.01\t0.0\t0.02\t0.08\t0.0\t0.0\t0.02\t0.09\t0.01\t0.0\t0.0\t0.0\t\n']
['0.12\t0.06\t0.03\t0.01\t0.0\t0.3\t0.03\t0.01\t0.03\t0.05\t0.19\t0.05\t0.01\t0.0\t0.0\t0.06\t0.0\t0.0\t0.01\t0.03\t\n']
['0.07\t0.0\t0.01\t0.0\t0.02\t0.0\t0.0\t0.0\t0.0\t0.22\t0.16\t0.0\t0.0\t0.0\t0.15\t0.0\t0.0\t0.0\t0.0\t0.34\t\n']
['0.1\t0.03\t0.03\t0.01\t0.0\t0.0\t0.01\t0.01\t0.0\t0.16\t0.05\t0.01\t0.03\t0.02\t0.01\t0.01\t0.04\t0.03\t0.0\t0.47\t\n']
['0.05\t0.0\t0.04\t0.06\t0.0\t0.06\t0.07\t0.02\t0.02\t0.0\t0.01\t0.02\t0.03\t0.05\t0.29\t0.18\t0.05\t0.0\t0.05\t0.0\t\n']
['0.01\t0.0\t0.0\t0.0\t0.0\t0.02\t0.0\t0.02\t0.02\t0.0\t0.01\t0.02\t0.0\t0.81\t0.0\t0.0\t0.0\t0.0\t0.06\t0.0\t\n']
['0.0\t0.0\t0.0\t0.0\t0.0\t0.02\t0.05\t0.0\t0.0\t0.0\t0.0\t0.86\t0.0\t0.0\t0.05\t0.0\t0.0\t0.0\t0.0\t0.0\t\n']
['0.0\t0.58\t0.1\t0.03\t0.0\t0.03\t0.11\t0.0\t0.0\t0.0\t0.01\t0.13\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t\n']
['0.03\t0.05\t0.05\t0.29\t0.0\t0.1\t0.15\t0.04\t0.05\t0.0\t0.01\t0.06\t0.0\t0.0\t0.12\t0.04\t0.0\t0.0\t0.0\t0.0\t\n']
['0.03\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.01\t0.0\t0.09\t0.15\t0.0\t0.03\t0.0\t0.0\t0.01\t0.02\t0.03\t0.0\t0.61\t\n']
['0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.11\t0.11\t0.0\t0.0\t0.0\t0.0\t0.75\t0.0\t\n']
['0.03\t0.4\t0.0\t0.0\t0.0\t0.05\t0.23\t0.0\t0.01\t0.01\t0.04\t0.17\t0.02\t0.0\t0.0\t0.0\t0.02\t0.01\t0.0\t0.02\t\n']
['0.11\t0.24\t0.0\t0.07\t0.01\t0.15\t0.11\t0.01\t0.01\t0.0\t0.02\t0.17\t0.03\t0.04\t0.0\t0.0\t0.02\t0.0\t0.0\t0.0\t\n']
['0.33\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.14\t0.04\t0.0\t0.01\t0.0\t0.0\t0.0\t0.03\t0.0\t0.0\t0.45\t\n']
['0.02\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.15\t0.48\t0.01\t0.14\t0.01\t0.0\t0.0\t0.0\t0.0\t0.0\t0.19\t\n']
['0.08\t0.1\t0.08\t0.03\t0.01\t0.15\t0.12\t0.0\t0.01\t0.0\t0.0\t0.28\t0.01\t0.0\t0.01\t0.05\t0.07\t0.0\t0.0\t0.01\t\n']
['0.13\t0.04\t0.0\t0.02\t0.0\t0.02\t0.6\t0.0\t0.06\t0.0\t0.0\t0.01\t0.01\t0.0\t0.0\t0.0\t0.02\t0.0\t0.09\t0.0\t\n']
['0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.07\t0.0\t0.0\t0.28\t0.0\t0.0\t0.64\t0.0\t0.0\t0.0\t0.0\t0.0\t0.01\t\n']
['0.3\t0.1\t0.01\t0.0\t0.01\t0.03\t0.06\t0.03\t0.1\t0.05\t0.0\t0.04\t0.0\t0.01\t0.0\t0.17\t0.0\t0.0\t0.0\t0.09\t\n']
['0.07\t0.15\t0.06\t0.03\t0.0\t0.11\t0.07\t0.05\t0.04\t0.02\t0.0\t0.01\t0.01\t0.0\t0.17\t0.19\t0.02\t0.0\t0.0\t0.0\t\n']
['0.01\t0.02\t0.0\t0.0\t0.0\t0.02\t0.0\t0.0\t0.05\t0.07\t0.31\t0.0\t0.0\t0.27\t0.0\t0.03\t0.03\t0.01\t0.08\t0.08\t\n']
['0.19\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.01\t0.0\t0.15\t0.48\t0.05\t0.0\t0.03\t0.0\t0.01\t0.02\t0.0\t0.0\t0.07\t\n']
['0.01\t0.11\t0.01\t0.01\t0.19\t0.16\t0.0\t0.01\t0.0\t0.05\t0.1\t0.03\t0.1\t0.15\t0.0\t0.01\t0.0\t0.0\t0.0\t0.05\t\n']
['0.23\t0.04\t0.05\t0.09\t0.0\t0.02\t0.0\t0.04\t0.02\t0.0\t0.0\t0.0\t0.0\t0.0\t0.24\t0.29\t0.0\t0.0\t0.0\t0.0\t\n']
['0.13\t0.0\t0.0\t0.0\t0.0\t0.06\t0.05\t0.02\t0.02\t0.04\t0.24\t0.0\t0.05\t0.1\t0.04\t0.0\t0.11\t0.0\t0.0\t0.15\t\n']
['0.44\t0.02\t0.0\t0.0\t0.0\t0.18\t0.09\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.19\t0.02\t0.02\t0.0\t0.0\t0.03\t\n']
['0.51\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.3\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.0\t0.19\t0.0\t0.0\t0.0\t0.0\t\n']
I don't understand why this is happening. If there is an easier way to parse a bunch of files with numbers in them and find those that are ALL zeroes, that would be better :)
import pandas as pd
df = pd.read_csv("/Users/name/Desktop/PDB/test_d3psm_misc/d3ps.profile", sep="\t", header=None).astype(float)
df = df.loc[df.sum(axis=1)>0]
Then, if you want to convert it to a NumPy array:
df = df.to_numpy()
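The \t characters appear because the file is tab-separated, so line.split(" ") finds no spaces and returns the whole line as one item. A minimal plain-Python sketch of the same filtering follows. The io.StringIO sample is a hypothetical stand-in for the real file, and line.split() with no argument splits on any whitespace, which also absorbs the trailing tab and newline.

```python
import io

import numpy as np

# Hypothetical stand-in for the real file: tab-separated, with a trailing tab per line.
sample = io.StringIO(
    "0.1\t0.0\t0.36\t\n"
    "0.0\t0.0\t0.0\t\n"
    "0.02\t0.74\t0.0\t\n"
)

# split() with no argument splits on tabs, spaces and newlines and drops empty fields.
rows = [[float(tok) for tok in line.split()] for line in sample if line.strip()]
data = np.array(rows)

# Keep only the rows that contain at least one nonzero value.
nonzero_rows = data[np.any(data != 0, axis=1)]
print(nonzero_rows)
```

With a real file, the with open(...) block from the question can replace the io.StringIO object.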

How to solve the FunctionError and MapError

Python 3.6 pycharm
import prettytable as pt
import numpy as np
import pandas as pd
a=np.random.randn(30,2)
b=a.round(2)
df=pd.DataFrame(b)
df.columns=['data1','data2']
tb = pt.PrettyTable()
def func1(columns):
    def func2(column):
        return tb.add_column(column,df[column])
    return map(func2,columns)
column1=['data1','data2']
print(column1)
print(func1(column1))
The result I want is the effect of:
tb.add_column('data1',df['data1'])
tb.add_column('data2',df['data2'])
But the actual output is:
<map object at 0x000001E527357828>
I have been searching Stack Overflow for an answer for a long time. Some answers suggest using list(func1(column1)), but the result is [None, None].
Based on the tutorial at https://ptable.readthedocs.io/en/latest/tutorial.html, PrettyTable.add_column modifies the PrettyTable in-place. Such functions generally return None, not the modified object.
You're also overcomplicating the problem by trying to use map and a fancy wrapper function. The code below is much simpler and produces the desired result.
import prettytable as pt
import numpy as np
import pandas as pd
column_names = ['data1', 'data2']
a = np.random.randn(30, 2)
b = a.round(2)
df = pd.DataFrame(b)
df.columns = column_names
tb = pt.PrettyTable()
for col in column_names:
    tb.add_column(col, df[col])
print(tb)
If you're still interested in learning about the thing that map returns, I suggest reading about iterables and iterators. map returns an iterator over the results of calling the function, and does not actually do any work until you iterate over it.
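The lazy behaviour of map can be seen without PrettyTable at all. In this small sketch, record is a hypothetical stand-in for a function like add_column that works by side effect and returns None.

```python
calls = []


def record(name):
    # Works by side effect and implicitly returns None, like an in-place method.
    calls.append(name)


lazy = map(record, ['data1', 'data2'])

# Nothing has run yet: map only built an iterator.
calls_before = list(calls)

# Consuming the iterator triggers the calls and collects their return values.
results = list(lazy)

print(calls_before)  # []
print(results)       # [None, None]
print(calls)         # ['data1', 'data2']
```

This is why list(func1(column1)) prints [None, None]: the columns are added, but the list only holds the return values.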

convert pandas df to multi-dimensional numpy array

I have very sparse data in a pandas dataframe with 25 million+ records. It has to be converted into a multi-dimensional NumPy array. I have written this the straightforward way using a for loop, and was wondering if there is a more efficient way.
import numpy as np
import pandas as pd
facts_pd = pd.DataFrame.from_records(columns=['name','offset','code'],
data=[('John', -928, 'dx_434'), ('Steve',-757,'dx_5859'), ('Jack',-800,'dx_250'),
('John',-919,'dx_401'),('John',-956,'dx_5859')])
name_lu = pd.DataFrame(sorted(facts_pd['name'].unique()), columns=['name'])
name_lu["nameid"] = name_lu.index
offset_lu = pd.DataFrame(sorted(facts_pd['offset'].unique(), reverse=True), columns=['offset'])
offset_lu["offsetid"] = offset_lu.index
code_lu = pd.DataFrame(sorted(facts_pd['code'].unique()), columns=['code'])
code_lu["codeid"] = code_lu.index
facts_pd = pd.merge(pd.merge(pd.merge(facts_pd, name_lu, how="left", on="name")
, offset_lu, how="left", on="offset"), code_lu, how="left", on="code")
facts_pd.drop(["name","offset","code"], inplace=True, axis=1)
facts_np = np.zeros((len(name_lu),len(offset_lu),len(code_lu)))
for row in facts_pd.iterrows():
    i,j,k = row[1]
    facts_np[i][j][k] = 1
The method you are probably looking for is DataFrame.to_numpy(). Older pandas versions called it DataFrame.as_matrix(), which was deprecated in pandas 0.23 and removed in 1.0; despite that name, it returned a NumPy array, not a matrix.
Refurbished code:
import numpy as np
import pandas as pd
facts_pd = pd.DataFrame.from_records(columns=['name','offset','code'],
    data=[('John', -928, 'dx_434'), ('Steve',-757,'dx_5859'), ('Jack',-800,'dx_250'),
          ('John',-919,'dx_401'),('John',-956,'dx_5859')])
facts_np = facts_pd.to_numpy()
print(facts_np)  # Displays the data frame as a 2-D NumPy array.
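For the loop in the question itself, a vectorized sketch is possible, assuming the goal is only to set a 1 at each (name, offset, code) position. It swaps the three merges for np.unique with return_inverse=True, which returns the sorted unique values together with each row's index into them, and fills the array with one fancy-indexing assignment. The names i, j and k mirror the loop variables in the question.

```python
import numpy as np
import pandas as pd

facts_pd = pd.DataFrame.from_records(
    columns=['name', 'offset', 'code'],
    data=[('John', -928, 'dx_434'), ('Steve', -757, 'dx_5859'), ('Jack', -800, 'dx_250'),
          ('John', -919, 'dx_401'), ('John', -956, 'dx_5859')])

# Sorted unique values plus, for every row, the position of its value among them.
names, i = np.unique(facts_pd['name'].to_numpy(), return_inverse=True)
offsets, j = np.unique(facts_pd['offset'].to_numpy(), return_inverse=True)
codes, k = np.unique(facts_pd['code'].to_numpy(), return_inverse=True)

# The question numbers offsets in descending order, so flip the ascending index.
j = len(offsets) - 1 - j

facts_np = np.zeros((len(names), len(offsets), len(codes)))
# One assignment sets every (i, j, k) cell at once, with no Python-level loop.
facts_np[i, j, k] = 1
print(facts_np.shape)
```

With 25 million very sparse records the dense array may be too large for memory; the index arrays i, j and k already describe the nonzero positions in coordinate form and may be enough on their own.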

How to set precision of numpy float array when converting from string array?

I am following the linked question "Convert String to float array" to convert an array of strings to an array of floats.
The data that I am getting is in a weird format:
535. 535. 535. 534.68 534.68 534.68
NumPy is able to convert the string array to float, but another step fails when the data is in the format 535.
Is there a way to convert all 535. to 535.00 in one go?
I am using the following code for the conversion:
import numpy as np
strarray = ["535.","535.","534.68"]
# filter(None, ...) drops empty strings; list() is needed on Python 3, where filter is lazy.
floatarray = np.array(list(filter(None,strarray)),dtype='|S10').astype(float)
print(floatarray)
Convert the strings to float128 (np.float128 is not available on every platform, for example on Windows builds of NumPy).
Try this:
import numpy as np
strarray = ["535.","535.","534.68"]
floatarray = np.array(list(filter(None,strarray)),dtype='|S10').astype(np.float128)
print(floatarray)
Output:
[ 535.0 535.0 534.68]
Or use the recommended, portable longdouble:
import numpy as np
strarray = ["535.","535.","534.68"]
floatarray = np.array(list(filter(None,strarray)),dtype='|S10').astype(np.longdouble)
print(floatarray)
Output:
[ 535.0 535.0 534.68]
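Note that 535. and 535.00 are the same float; the trailing dot is only how NumPy prints it. If a later step needs exactly two decimals in text form, one option is to format the values when turning them back into strings. This is a sketch under that assumption, with the empty string added to the sample to show the filtering step.

```python
import numpy as np

strarray = ["535.", "535.", "", "534.68"]

# Drop empty strings, then let NumPy parse the rest as floats.
floatarray = np.array([s for s in strarray if s], dtype=float)

# Format each value with two decimals when text output is needed.
formatted = [f"{x:.2f}" for x in floatarray]
print(formatted)

# The same formatting applied to the printed form of the whole array.
text = np.array2string(floatarray, formatter={"float_kind": lambda x: f"{x:.2f}"})
print(text)
```

np.set_printoptions accepts the same formatter argument if the display should change globally instead.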
