How do I change the CSV data to a float?

I have a piece of code that is meant to read a CSV file that has data in it. When I run the program I get this error message: "ValueError: could not convert string to float:". How can I convert my strings to floats?
a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r = np.loadtxt('car.txt', delimiter = '\s', unpack = True)
plt.plot(o,r, label='Loaded from file!')
plt.xlabel('o')
plt.ylabel('r')
plt.title('Interesting Graph\nCheck it out')
plt.legend()
plt.show()
ValueError: could not convert string to float:

Python uses the float() function to convert a string into a float. A ValueError is raised when the string is not a valid number.
For example:
float("1.234") # returns 1.234
float("1.45 ") # returns 1.45 (surrounding whitespace is ignored)
But if you do
float("abc")
this raises a ValueError, because "abc" cannot be interpreted as a number.
Back to your question: try printing the values you read and check whether each one can be converted to a float.
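One way to do that check is to try float() on every token and collect the ones that fail. This is only a minimal sketch: the helper name find_bad_tokens and the sample rows are made up for illustration, and it assumes the values are separated by whitespace.

```python
def find_bad_tokens(lines):
    """Return (line_number, token) pairs that float() cannot parse."""
    bad = []
    for line_number, line in enumerate(lines, start=1):
        for token in line.split():  # split on any run of whitespace
            try:
                float(token)
            except ValueError:
                bad.append((line_number, token))
    return bad


# Hypothetical sample rows standing in for the contents of car.txt
sample = ["mpg hp", "21.0 110", "22.8 n/a"]
bad_tokens = find_bad_tokens(sample)
print(bad_tokens)
```

For a real file you could pass the open file object instead of the list, for example find_bad_tokens(open('car.txt')).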

If all your data are supposed to be floats, there are two things to check:
1. The first row may contain column names rather than numbers.
2. There may be tab characters in your data, which are read as '\t'. Select one in your editor, open find and replace (Ctrl+R), and replace the tabs with spaces.
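Both points can probably be handled in np.loadtxt itself, without editing the file. The sketch below assumes a header row and columns separated by spaces or tabs; the two-column sample is invented and only stands in for car.txt. As far as I know, the delimiter argument is taken literally and not as a regular expression, so '\s' does not mean "any whitespace"; leaving delimiter unset does.

```python
import io

import numpy as np

# Hypothetical stand-in for car.txt: one header row, then numbers
# separated by a mix of spaces and tabs.
sample = io.StringIO("speed weight\n1.0 2.5\n3.0\t4.5\n")

# skiprows=1 skips the header row; with no delimiter given,
# loadtxt splits each line on any run of whitespace.
speed, weight = np.loadtxt(sample, skiprows=1, unpack=True)
print(speed, weight)
```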

Related

How to convert a string array to float array?

I basically first converted a multidimensional array to a string array in order to set the values as my dictionary key, and now I need to convert the string array back to a regular float array. For example, what I have is:
str_array = ['[0.25 0.2916666666666667]', '[0.5833333333333334 0.2916666666666667]',
'[0.5555555555555555 0.3333333333333332]']
And I literally just need it back as a regular array
array = [[0.25 0.2916666666666667], [0.5833333333333334 0.2916666666666667],
[0.5555555555555555 0.3333333333333332]]
I have tried each of the following, independently:
for i in str_array:
    i.strip("'")
    np.array(i)
    float(i)
Yet none of them work. They either cannot convert str --> float or they still keep the type as a str. Please help.

Use ast.literal_eval to convert a str to another data type:
import ast
str_array = ['[0.25 0.2916666666666667]', '[0.5833333333333334 0.2916666666666667]',
'[0.5555555555555555 0.3333333333333332]']
result = [ast.literal_eval(i.replace(" ", ",")) for i in str_array]
print(result) # [[0.25, 0.2916666666666667], [0.5833333333333334, 0.2916666666666667], [0.5555555555555555, 0.3333333333333332]]
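If the strings might contain more than one space between numbers (NumPy sometimes pads its printed output), replacing every space with a comma would produce invalid syntax. A possible alternative, sketched here with a made-up helper name parse_row, is to strip the brackets and split on whitespace:

```python
str_array = ['[0.25 0.2916666666666667]', '[0.5833333333333334 0.2916666666666667]',
             '[0.5555555555555555 0.3333333333333332]']


def parse_row(text):
    """Turn a string like '[0.25 0.5]' into a list of floats."""
    # strip("[]") removes the surrounding brackets; split() with no
    # argument splits on any run of whitespace.
    return [float(token) for token in text.strip("[]").split()]


result = [parse_row(s) for s in str_array]
print(result)
```

Passing result to np.array() should then give a regular 2-D float array.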

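You can also use the built-in function eval, although it executes arbitrary code, so ast.literal_eval is the safer choice for strings you do not control.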
[eval(x.replace(" ",",")) for x in str_array]

pd.to_numeric could not convert string to float

def openfiles():
    file1 = tkinter.filedialog.askopenfilename(filetypes=(("Text Files",".csv"),("All files","*")))
    read_text=pd.read_csv(file1)
    displayed_file.insert(tk.END,read_text)
    read_text['OPCODE'] = pd.to_numeric(read_text['OPCODE'],errors = 'coerce').fillna(0.0)
    read_text['ADDRESS'] = pd.to_numeric(read_text['ADDRESS'],errors = 'coerce').fillna(0.0)
    classtype1=np.argmax(model.predict(read_text), axis=-1)
    tab2_display_text.insert(tk.END,read_text)
When I run this code, it shows "could not convert string to float".
Link to the CSV file that is used as the DataFrame: https://github.com/Yasir1515/Learning/blob/main/Book2%20-%20Copy.csv
Link to the complete code (the problematic code is at lines 118-119): https://github.com/Yasir1515/Learning/blob/main/PythonApplication1.py

In your data, ADDRESS is a hexadecimal number and OPCODE is a list of hexadecimal numbers. I don't know why you would want to convert hex numbers to floats; you should convert them to integers.
The to_numeric method is not suitable for converting a hex string to an integer, nor for handling a list of hex numbers. You need to write helper functions:
def hex2int(x):
    try:
        return int(x, 16)
    except (ValueError, TypeError):
        # not a valid hex string (or not a string at all, e.g. NaN)
        return 0
def hex_list2int_list(zz):
    return [hex2int(el) for el in zz.split()]
Now replace the relevant lines:
read_text['OPCODE'] = read_text['OPCODE'].apply(hex_list2int_list)
read_text['ADDRESS'] = read_text['ADDRESS'].apply(hex2int)
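Before applying the helpers to the DataFrame, it may be worth checking them on a few plain strings. The sample values below are invented, not taken from the linked CSV, and the helpers are repeated (with narrower exception handling) so the snippet runs on its own.

```python
def hex2int(x):
    """Parse a hex string such as 'ff' or '0x1A'; return 0 if it is not valid hex."""
    try:
        return int(x, 16)
    except (ValueError, TypeError):
        return 0


def hex_list2int_list(zz):
    """Parse a whitespace-separated string of hex numbers into a list of ints."""
    return [hex2int(el) for el in zz.split()]


# Hypothetical cell values, for illustration only
address = hex2int("0x1A")
opcodes = hex_list2int_list("88 99 7f zz")
print(address, opcodes)
```

Note that invalid entries silently become 0 here, which may or may not be what you want as model input.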

I looked at your CSV file. The OPCODE column contains a row with a long string of numbers separated by spaces (' '). You therefore cannot cast that kind of value to a numeric type (the string '88 99 77 66' is not a number). One option is to split the values in the OPCODE column into separate rows and then apply to_numeric; afterwards you can do your manipulation and return the data to its previous form.
What I suggest is:
read_text=pd.read_csv(file1)
# one row per opcode: the Series index holds the opcodes, the values hold the address
new_df = pd.concat([pd.Series(row['ADDRESS'], row['OPCODE'].split(' '))
                    for _, row in read_text.iterrows()]).reset_index()
new_df.columns = ['OPCODE', 'ADDRESS']
new_df['OPCODE'] = pd.to_numeric(new_df['OPCODE'],errors = 'coerce').fillna(0.0)

How to properly use binascii.crc32 on a byte array

I have what should be an array of hex values, but binascii.crc32() always sees it as a 'str'.
For example: data = ['aa', 'bb', 'cc'].
This is for building a frame to write to a txt file that Wireshark can open in a specific format (not the problem here, this works fine).
As shown in the documentation,
print(binascii.crc32(b"hello world")) works.
I tried to convert the data into binary with bin(), which gave me
data = ['10101010', '10111011', '11001100']
However, it is never seen as binary.
I tried to convert it using the bytes() method, but only managed to convert it into ASCII again.
def toBin(data):
    data2=[]
    for iBcl in range (1,len(data)):
        if iBcl%2!=0:
            binary=bin(int(data[iBcl-1]+data[iBcl],16))[2:]
            data2.append(binary)
    print(data2)
    return data2
data="aabbcc"
data2=toBin(data)
print(binascii.crc32(data2[0]+data2[1]+data2[2]))
According to an online CRC32 calculator, the result should be
0xBE4DF84C, but I get the following error:
TypeError: a bytes-like object is required, not 'str'
I don't get the error when using the bytes() method, but the CRC32 is then calculated on the ASCII characters, which gives me an incorrect CRC.

You have a list of hexadecimal data. You can convert each byte with binascii.unhexlify and then join all:
b = b''.join((binascii.unhexlify(i) for i in data))
print(b)
gives as expected
b'\xaa\xbb\xcc'
You can then check the CRC32:
print(hex(binascii.crc32(b)))
which gives:
0xbe4df84c
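In Python 3, bytes.fromhex should give the same bytes in one step. This is a small sketch of that alternative, not part of the answer above; it assumes every item in data is a two-character hex string.

```python
import binascii

data = ['aa', 'bb', 'cc']

# Join the hex pairs into one string and decode it into raw bytes
payload = bytes.fromhex(''.join(data))

# crc32 needs a bytes-like object, which payload now is
crc = binascii.crc32(payload)
print(payload, hex(crc))
```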

how to convert string to float in for loop in python?

I want to calculate the average flow, but I am having a hard time converting a string to a float in Python.
Here is my code in Notepad++:
import cookielib,urllib2,urllib,string
import cx_Oracle
import datetime
import string
cj = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
total_flow = 0
count = 0
page = opener.open("http://cdec.water.ca.gov/cgi-progs/queryCSV?station_id=vns&sensor_num=20&dur_code=E&start_date=2016-04-14&end_date=2016-04-15&data_wish=View+CSV+Data")
for line in page:
    a=line.split(',')
    b = float(a)
    count+=1
    total_flow= total_flow+b[-1]
    # here a=[date,time,flow]; so b[-1] is flow data
ave_flow = total_flow/count
print ave_flow
When I ran this script, I got the error message:
b=float(a)
TypeError: float() argument must be a string or a number
However, when I converted a string to a float directly in Python, it worked. I don't know what is going on.

Here a is a list, not a string. You need another for loop to handle each element in the list.

a = line.split(',')
This returns a list. That's why you are getting an error.

a=line.split(',')
a is a list here, try something like:
b=float(a[0])

OK, I figured it out. There are some invisible characters along with the strings in the list, so I can't simply convert the string to a number.
I changed it to
a=line.strip().split(",")
b=float(a[-1])
and it works.
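Putting the fix together, the whole loop might look like the Python 3 sketch below. The sample lines are invented stand-ins for the downloaded CSV (the real column layout is assumed to be date, time, flow), and rows whose last field is not a number are skipped instead of raising an error.

```python
# Hypothetical lines standing in for the response from the CSV URL
sample_lines = [
    "date,time,flow\n",
    "20160414,0000,1520\r\n",
    "20160414,0015,1540\r\n",
    "20160414,0030,m\n",
]

total_flow = 0.0
count = 0
for line in sample_lines:
    # strip() removes the trailing newline and carriage return
    fields = line.strip().split(",")
    try:
        flow = float(fields[-1])  # the last field is the flow value
    except ValueError:
        continue  # header row or missing reading
    total_flow += flow
    count += 1

ave_flow = total_flow / count if count else float("nan")
print(ave_flow)
```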

xlrd read number as string

I am trying to read a long number (6425871003976) from an xls file, but Python keeps truncating it because it reads it as a number, not a string (6.42587100398e+12). Is there any way to read it directly as a string, even though it is a number in the xls file?
values = sheet.row_values(rownum)
In values it appears correctly (6425871003976.0), but when I try values[0] it has already switched to the incorrect value.
Solution:
This was my solution using repr():
if type(values[1]) is float:
    code_str = repr(values[1]).split(".")[0]
else:
    code_str = values[1]
product_code = code_str.strip(' \t\n\r')

It's the same value. All that's different is how the value is printed to the screen. You get scientific notation because, to print the number, str is called on it. When you print the list of values, the list's internal __str__ method calls repr on each of its elements. Try print(repr(values[0])) instead.

Use print "%d" % (values[0])
%d formats the value as an integer.
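In Python 3 the same idea could be written as a small helper that turns whole-number floats into digit strings. The name cell_to_text is made up for this sketch, and it assumes the cell values arrive as float or str, as xlrd returned them in the question.

```python
def cell_to_text(value):
    """Return a cell value as text, without '.0' or scientific notation."""
    if isinstance(value, float) and value.is_integer():
        # int() keeps every digit of a whole-number float
        return str(int(value))
    return str(value).strip()


product_code = cell_to_text(6425871003976.0)
print(product_code)  # 6425871003976
```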

This is an example that reads the value of each cell (in your case an int); you need to convert it to a string using the str function:
from openpyxl import load_workbook
wb = load_workbook(filename='xls.xlsx', read_only=True)
ws = wb['Sheet1']
for row in ws.rows:
    for cell in row:
        cell_str=str(cell.value)
