Convert str to float python - python

Why do these lines throw ValueError: could not convert string to float?
frequencies.append(float(l[start+1:stop1].strip()))
losses.append(float(l[stop1+5:stop2].strip()))
Doesn't float() parse values into the float type? Where am I going wrong? Both frequencies and losses are lists.
This is the code:
def Capture():
    impedance = 0
    losses = []
    frequencies = []
    Xtalk = []
    start = 0
    stop1 = 0
    stop2 = 0
    for filename in glob.glob(os.path.join(user_input, '*.txt')):
        with open(filename, 'r') as f:
            for l in f.readlines():
                if l.startswith(' Impedance'):
                    v = l[12:-7]
                    impedance = float(v)
                if l.startswith(' Xtalk'):
                    Xtalk.append(l[7:].strip())
                if l.startswith(' Loss per inch'):
                    start = l.find('#')
                    stop1 = l.find('GHz', start)
                    stop2 = l.find('dB', start)
                    frequencies.append(float(l[start+1:stop1].strip()))
                    losses.append(float(l[stop1+5:stop2].strip()))
    print(impedance, frequencies, losses, Xtalk)
It basically takes values from text files and prints them to the console.
The text files look like this:
Impedance = 71.28 ohms
Begin Post processing
Frequency multiplier = 1Hz
number of ports = 12
Start Frequency = 0
End Frequency = 40000000000
Number of Frequency points = 4001
Touchstone Output file = C:\Users\Aravind_Sampathkumar\Desktop\IMLC\BO\Output_TW_3.5-TS_3-core_h_2.xml_5000mil.s12p
Output format = Real - Imaginary
Loss per inch # 2.500000e+00 GHz = -0.569 dB
Loss per inch # 5 GHz = -0.997 dB
Xtalk #1 (Conductor 1 2):
Step response Next= -0.56 mV
Step response Fext peak # 5 inches= 0.11 mV
Xtalk #2 (Conductor 5 6):
Step response Next= -0.56 mV
Step response Fext peak # 5 inches= 0.11 mV
Finished post processing

First check the exact format of the string you are passing.
A string that uses a comma as the decimal separator cannot be converted by float():
>>> a = "1,2345"
>>> float(a)
Traceback (most recent call last):
  File "<input>", line 1, in <module>
ValueError: could not convert string to float: '1,2345'
>>> a = "1.2345"
>>> float(a)
1.2345
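When the cause is not obvious, it can help to look at exactly what float() receives; an empty or whitespace-only slice is another possible culprit besides a decimal comma. The sketch below is one possible way to parse the "Loss per inch" lines from the question with a regular expression instead of fixed offsets. The names LOSS_LINE, to_float and parse_loss_line are hypothetical, and treating a comma as a decimal point is an assumption about the input, not something the question confirms.

```python
import re

# Hypothetical pattern for lines such as "Loss per inch # 5 GHz = -0.997 dB".
LOSS_LINE = re.compile(r"Loss per inch\s*#\s*(\S+)\s*GHz\s*=\s*(\S+)\s*dB")


def to_float(text):
    """Convert text to float, reporting the exact input on failure."""
    # Assumption: a decimal comma should be read as a decimal point.
    cleaned = text.strip().replace(",", ".")
    try:
        return float(cleaned)
    except ValueError:
        # repr() makes empty strings and stray characters visible.
        raise ValueError("cannot convert %r to float" % text)


def parse_loss_line(line):
    """Return (frequency_ghz, loss_db), or None if the line does not match."""
    match = LOSS_LINE.search(line)
    if match is None:
        return None
    freq_text, loss_text = match.groups()
    return to_float(freq_text), to_float(loss_text)


print(parse_loss_line(" Loss per inch # 2.500000e+00 GHz = -0.569 dB"))
```

Lines that do not match the pattern return None, so the caller can skip them instead of slicing with offsets that find() may have returned as -1.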

Related

How can I solve "killed"?

I'm trying to plot clusters for data stored in a .data file, using the density peak clustering algorithm with the code below, but the process gets killed: the file is 8 GB and I have 32 GB of RAM. How can I solve this problem?
The core problem is that this method loads the whole file:
def density_and_distance(self, distance_file, dc = None):
    print("Begin")
    distance, num, max_dis, min_dis = load_data(distance_file)
    print("end")
    if dc == None:
        dc = auto_select_dc(distance, num, max_dis, min_dis)
    rho = local_density(distance, num, dc)
    delta, nearest_neighbor = min_distance(distance, num, max_dis, rho)
    self.distance = distance
    self.rho = rho
    self.delta = delta
    self.nearest_neighbor = nearest_neighbor
    self.num = num
    self.dc = dc
    return rho, delta
"Begin" is printed, then the process is killed after a few minutes.
The file contains lines like:
1 2 19.86
1 3 36.66
1 4 87.94
1 5 11.07
1 6 36.94
1 7 52.04
1 8 173.68
1 9 28.10
1 10 74.00
1 11 85.36
1 12 40.04
1 13 95.24
1 14 67.29
....
The method that reads the file is:
def load_data(distance_file):
    distance = {}
    min_dis, max_dis = sys.float_info.max, 0.0
    num = 0
    with open(distance_file, 'r', encoding = 'utf-8') as infile:
        for line in infile:
            content = line.strip().split(' ')
            assert(len(content) == 3)
            idx1, idx2, dis = int(content[0]), int(content[1]), float(content[2])
            num = max(num, idx1, idx2)
            min_dis = min(min_dis, dis)
            max_dis = max(max_dis, dis)
            distance[(idx1, idx2)] = dis
            distance[(idx2, idx1)] = dis
        for i in range(1, num + 1):
            distance[(i, i)] = 0.0
        infile.close()
    return distance, num, max_dis, min_dis
I tried changing it to the following, using dask:
import dask.dataframe as dd
def load_data(distance_file):
    distance = {}
    min_dis, max_dis = sys.float_info.max, 0.0
    num = 0
    #with open(distance_file, 'r', encoding = 'utf-8') as infile:
    df_dd = dd.read_csv("ex3.csv")
    print("df_dd",df_dd.head())
    #for line in df_dd:
    #content = df_dd.strip().split(' ')
    #print(content)
    idx1, idx2, dis = df_dd.partitions[0], df_dd.partitions[1], df_dd.partitions[2]
    print("df_dd.partitions[0]",df_dd.partitions[0])
    num = max(num, idx1, idx2)
    min_dis = min(min_dis, dis)
    max_dis = max(max_dis, dis)
    distance[(idx1, idx2)] = dis
    distance[(idx2, idx1)] = dis
    for i in range(1, num + 1):
        distance[(i, i)] = 0.0
    return distance, num, max_dis, min_dis
You are using native Python integers and floats: these alone take tens of bytes for each number in your data (28 bytes for a small integer in 64-bit CPython).
If you simply use NumPy or pandas for that, your memory consumption might be cut by a factor of 4 or more, without further adjustments.
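A rough way to see the difference, sketched with the standard library only: the array module is used here purely as a dependency-free stand-in for the fixed-width storage that NumPy and pandas provide, and the helper names are hypothetical. Exact object sizes depend on the Python build, so no specific numbers are claimed.

```python
import sys
from array import array


def bytes_per_value_as_objects(values):
    """Approximate bytes per value when every number is its own Python object."""
    # Object sizes plus the list's own storage (mostly one pointer per element).
    total = sum(sys.getsizeof(v) for v in values) + sys.getsizeof(values)
    return total / float(len(values))


def bytes_per_value_packed(values, typecode):
    """Bytes per value when the numbers sit in a fixed-width typed array."""
    return array(typecode, values).itemsize


indices = list(range(1000, 2000))          # sample point indices
distances = [19.86, 36.66, 87.94, 11.07]   # sample distances from the question

# 'h' is a signed 16-bit integer, 'f' a 32-bit float.
print(bytes_per_value_as_objects(indices), bytes_per_value_packed(indices, "h"))
print(bytes_per_value_as_objects(distances), bytes_per_value_packed(distances, "f"))
```

The dict in the question adds further overhead on top of this, since every entry also stores a tuple key and a hash table slot.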
Your lines average about 10 bytes in this early part of the file, so an 8 GB file should hold fewer than 800 million records. If you use 16-bit integers and 32-bit floats, your data might fit in 10 GB of memory. It is still a tight call, as the default pandas behavior is to copy everything on changes to a column. There are other options:
Since your code depends on indexing the rows as you've done there, you could offload your data to an SQLite database and use SQLite indices instead of the dict you are using, as well as its MIN and MAX functions: this would move the memory usage out of your process, and SQLite would do its job with minimal fuss.
Another option would be to use dask instead of pandas: it will take care of offloading to disk the data that would not fit in memory.
TL;DR: the way your problem is arranged, moving to SQLite might be the option that requires the fewest changes to what you have already designed.
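A minimal sketch of the SQLite route, assuming the file format shown in the question (two integer indices and a distance per line). The table layout and the function names load_distances, summarize and lookup are hypothetical; for the real 8 GB file, db_path would be a file on disk rather than ":memory:".

```python
import sqlite3


def load_distances(distance_file, db_path):
    """Stream 'idx1 idx2 distance' lines into an indexed SQLite table."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS distance ("
        "idx1 INTEGER, idx2 INTEGER, dis REAL, PRIMARY KEY (idx1, idx2))"
    )
    with open(distance_file, "r", encoding="utf-8") as infile:
        # Generator: rows are parsed one at a time, never all held in memory.
        rows = (
            (int(a), int(b), float(d))
            for a, b, d in (line.split() for line in infile if line.strip())
        )
        conn.executemany("INSERT OR REPLACE INTO distance VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


def summarize(conn):
    """Return (num, max_dis, min_dis) computed by SQLite aggregates."""
    max1, max2, max_dis, min_dis = conn.execute(
        "SELECT MAX(idx1), MAX(idx2), MAX(dis), MIN(dis) FROM distance"
    ).fetchone()
    return max(max1, max2), max_dis, min_dis


def lookup(conn, i, j):
    """Distance between points i and j; each pair is stored only once."""
    if i == j:
        return 0.0
    row = conn.execute(
        "SELECT dis FROM distance "
        "WHERE (idx1 = ? AND idx2 = ?) OR (idx1 = ? AND idx2 = ?)",
        (i, j, j, i),
    ).fetchone()
    return None if row is None else row[0]
```

Storing each pair once and checking both orders in lookup avoids the duplicated (idx2, idx1) entries and the explicit zero diagonal that the dict version keeps in memory.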

Write data stream to .txt, average values after n seconds, then clear text and repeat

I am trying to write a program that writes my hyp_ft variable (continuously streamed once every second) to a text file, waits until 60 lines (1 minute) have been written, averages those 60 values, and then clears the text file and repeats.
I found some code from another post here and tried to incorporate this into mine, with an added if statement.
Calculating average of the numbers in a .txt file using Python
Can someone look at this for me and tell me if this is correct? Thank you!
hyp_ft = (hyp_m*3.2800839) #hyp_ft is the value that is constantly publishing
f = open( 'hyp_ft.txt', 'a' )
f.write(str(hyp_ft) + '\n' )
f.close()
while 1:
    with open('hyp_ft.txt') as fh:
        sum = 0 # initialize here, outside the loop
        count = 0 # and a line counter
        for line in fh:
            count += 1 # increment the counter
            if count == 59
                sum += float(line) #convert the lines to floats
        average = sum / count
    print average
    fh.truncate()
    fh.close()
Your loop is missing a time.sleep() call, which might be the cause of your problem. Try something like this:
import time
f = open('hyp_ft.txt', 'a')
count = 0
while True:
    # hyp_ft is the value that is constantly publishing
    hyp_ft = (hyp_m*3.2800839)
    # opening in append mode creates the file if it does not exist yet
    f.write(str(hyp_ft) + '\n')
    count += 1
    if count == 60:
        f.close()
        total = 0 # initialize here, outside the loop
        with open('hyp_ft.txt') as fh:
            for line in fh:
                total += float(line) # convert the lines to floats
        average = total / count # count is 60 here
        print(average)
        # reopening in write mode empties the file for the next batch
        f = open('hyp_ft.txt', 'w')
        count = 0
    time.sleep(1) # wait for the next one-second sample
This updated version keeps appending each value to the file and incrementing count; as soon as count reaches 60 it computes the average, prints it, and starts over.
If you don't need the file outside of this code, there is no need to create it at all; you can simply keep a running total in memory:
def calc_distance(gps1, gps2):
    count = 0
    total = 0
    while True:
        gps1_lat = round(gps1.latitude, 6)
        gps1_lon = round(gps1.longitude, 6)
        gps2_lat = round(gps2.latitude, 6)
        gps2_lon = round(gps2.longitude, 6)
        delta_lat = (gps1_lat*(108000) - gps2_lat*(108000))
        delta_lon = (gps1_lon*(108000) - gps2_lon*(108000))
        hyp_m = (delta_lat**2 + delta_lon**2)**0.5
        hyp_ft = (hyp_m*3.2800839)
        total += hyp_ft
        pub = rospy.Publisher("gps_dist", Float32, queue_size=10)
        rate = rospy.Rate(1)
        rospy.loginfo("Distance is %s in ft.", hyp_ft)
        # rospy.loginfo(hyp_ft)
        pub.publish(hyp_ft)
        rate.sleep()
        count += 1
        if(count == 60):
            print (total/count)
            total = 0
            count = 0
This is much more efficient; check whether it helps.
@warl0ck, here is the entire function. What's happening here is that I'm trying to get GPS latitude and longitude values to calculate the distance between two GPS coordinates. It makes one calculation once every second and gives me the hyp_ft value. I'm trying to write 60 hyp_ft values, take the average of those 60 values, then clear the file and continue doing that for the next 60 values, repeatedly. As for the rospy lines, those are from the Robot Operating System (ROS) framework that this code uses to help it run.
When I run the code, it only saves the hyp_ft value, then after 60 seconds it gives me this error.
The error that I'm getting is:
fh.truncate()
"I/O operation on closed file"
def calc_distance(gps1, gps2):
    while 1:
        gps1_lat = round(gps1.latitude, 6)
        gps1_lon = round(gps1.longitude, 6)
        gps2_lat = round(gps2.latitude, 6)
        gps2_lon = round(gps2.longitude, 6)
        delta_lat = (gps1_lat*(108000) - gps2_lat*(108000))
        delta_lon = (gps1_lon*(108000) - gps2_lon*(108000))
        hyp_m = (delta_lat**2 + delta_lon**2)**0.5
        hyp_ft = (hyp_m*3.2800839)
        f = open( 'hyp_ft.txt', 'a' )
        try:
            f.write(str(hyp_ft) + '\n' )
        finally:
            f.close()
        pub = rospy.Publisher("gps_dist", Float32, queue_size=10)
        rate = rospy.Rate(1)
        rospy.loginfo("Distance is %s in ft.", hyp_ft)
        # rospy.loginfo(hyp_ft)
        pub.publish(hyp_ft)
        rate.sleep()
        while True:
            time.sleep(60)
            with open('hyp_ft.txt') as fh:
                sum = 0 # initialize here, outside the loop
                count = 0 # and a line counter
                for line in fh:
                    count += 1 # increment the counter
                    if count == 60:
                        sum += float(line) #convert the lines to floats
                        break # to come out of the for loop
            average = sum / count
            print average
            fh.truncate()
            fh.close()
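The "I/O operation on closed file" error is what Python raises when a method such as truncate() is called on a file object after its with block has ended or after close(). Separately, a file opened in the default read-only mode cannot be truncated. A minimal sketch of reading, averaging and emptying the file in one step, assuming one number per line; the function name average_and_clear is hypothetical:

```python
def average_and_clear(path):
    """Average all numeric lines in the file at path, then empty the file."""
    # 'r+' opens for reading and writing; plain 'r' would not allow truncate().
    with open(path, "r+") as fh:
        values = [float(line) for line in fh if line.strip()]
        fh.seek(0)
        fh.truncate()  # must run while the file is still open
    if not values:
        return None  # nothing was written since the last call
    return sum(values) / len(values)
```

Calling this once per 60 samples from the publishing loop would replace the nested while True loop, which as written never returns control to the code that appends new values.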

Extracting frequencies from a wav file python

I am familiar with python but new to numpy, so please pardon me if I am wrong.
I am trying to read a .wav file containing multiple frequencies (separated by silence). So far I've been able to read the values and find the parts of the file where there is sound. Next, I am trying to compute the Discrete Fourier Transform of each part and calculate the frequencies from it (ref: how to extract frequency associated with fft values in python)
However, I'm getting an error:
index 46392 is out of bounds for axis 0 with size 25
Here's my code:
import wave
import struct
import numpy as np
def isSilence(windowPosition):
    sumVal = sum( [ x*x for x in sound[windowPosition:windowPosition+windowSize+1] ] )
    avg = sumVal/(windowSize)
    if avg <= 0.0001:
        return True
    else:
        return False
#read from wav file
sound_file = wave.open('test.wav', 'r')
file_length = sound_file.getnframes()
data = sound_file.readframes(file_length)
sound_file.close()
#data = struct.unpack("<h", data)
data = struct.unpack('{n}h'.format(n=file_length), data)
sound = np.array(data)
#sound is now a list of values
#detect silence and notes
i=0
windowSize = 2205
windowPosition = 0
listOfLists = []
listOfLists.append([])
maxVal = len(sound) - windowSize
while True:
    if windowPosition >= maxVal:
        break
    if not isSilence(windowPosition):
        while not isSilence(windowPosition):
            listOfLists[i].append(sound[windowPosition:windowPosition+ windowSize+1])
            windowPosition += windowSize
        listOfLists.append([]) #empty list
        i += 1
    windowPosition += windowSize
frequencies = []
#Calculating the frequency of each detected note by using DFT
for signal in listOfLists:
    if not signal:
        break
    w = np.fft.fft(signal)
    freqs = np.fft.fftfreq(len(w))
    l = len(signal)
    #imax = index of first peak in w
    imax = np.argmax(np.abs(w))
    fs = freqs[imax]
    freq = imax*fs/l
    frequencies.append(freq)
print frequencies
Edit: Here is the traceback:
Traceback (most recent call last):
File "final.py", line 61, in <module>
fs = freqs[imax]
IndexError: index 46392 is out of bounds for axis 0 with size 21
The problem was that I assumed listOfLists was a list of lists, but it was actually a list of lists of lists. The line:
listOfLists[i].append(sound[windowPosition:windowPosition+ windowSize+1])
was appending a whole list every time, but I assumed it was appending the elements to the existing list.
For instance, if listOfLists was:
[ [] ]
Then listOfLists[0].append([1,2,3]) followed by listOfLists[0].append([4,5,6]) would give:
[ [ [1,2,3],[4,5,6] ] ]
But I was expecting:
[ [1,2,3,4,5,6] ]
Replacing the problematic line with the code below worked for me:
for v in sound[windowPosition:windowPosition+windowSize+1]:
    listOfLists[i].append(v)
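The same effect can be had with list.extend(), which adds each element of its argument instead of the argument itself. A small sketch with plain lists; the function name flatten_windows is hypothetical:

```python
def flatten_windows(windows):
    """Join consecutive sample windows into one flat list of samples."""
    note = []
    for window in windows:
        # extend() adds each element of window; append() would add window itself.
        note.extend(window)
    return note


nested = []
nested.append([1, 2, 3])
nested.append([4, 5, 6])
print(nested)                   # the nested structure that append() builds
print(flatten_windows(nested))  # the flat structure that extend() builds
```

With a flat list per note, np.fft.fft receives a one-dimensional signal, so the index returned by np.argmax stays within the bounds of the frequency array.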

unpack requires a string argument of length 2

I am unable to figure out what the problem with my code is:
import wave,struct,math
waveFile = wave.open("F:/mayaSound/ChillingMusic.wav", "rb")
volume = []
frames = waveFile.getnframes()
rate = waveFile.getframerate()
length = frames / int(rate)
for i in range (0,length):
    waveData = waveFile.readframes(1)
    data = struct.unpack("<h",waveData)
    temp = (int(data[0])-128) / 128
    volume.append(20*math.log10(abs(temp) + 1))
print volume
waveFile.close()
It gives me:
Error: error: file <maya console> line 14: unpack requires a string argument of length 2 #

sum( array, 1) giving 'nan' in Python

First of all, I know nan stands for "not a number", but I am not sure how I am getting an invalid number in my code. I am using a Python script that reads a list of vectors (x, y, z) from a file and converts it to a long array of values. If I don't use the file and instead generate random numbers in a for loop, I don't get any nans.
After this I use Newton's law of gravity, F = GMm/r^2, to calculate the positions of the stars, and that data gets sent through a socket server to the C# visualization software that I developed for watching simulations. Unfortunately, the Python script that does the calculating has been nothing but troublesome to get working.
poslist = []
plist = []
mlist = []
lineList = []
coords = []
with open("Hyades Vectors.txt", "r") as text_file:
    content = text_file.readlines()
#remove /n
for i in range(len(content)):
    for char in "\n":
        line = content[i].replace(char,"")
        lineList.append(line)
lines = array(lineList)
#split " " within each line
for i in range(len(lines)):
    coords.append(lines[i].split(" "))
coords = array(coords)
#convert coords string to integer
for i in range(len(coords)):
    x = np.float(coords[i,0])
    y = np.float(coords[i,1])
    z = np.float(coords[i,2])
    poslist.append((x,y,z))
pos = array(poslist)
Quite often it sends nans after the second pass through this loop:
vcm = sum(p)/sum(m) #velocity of centre mass
p = p-m*vcm #make total initial momentum equal zero
Myr = 8.4
dt = 1
pos = pos-(p/m)*(dt/2.) #initial half-step
finished = False
while not finished: # or NBodyVis.Oppenned() == False
    r = pos-pos[:,newaxis] #all pairs of star-to-star vectors
    for n in range(Nstars):
        r[n,n] = 1e6 #otherwise the self-forces are infinite
    rmag = sqrt(sum(square(r),-1)) #star-to star scalar distances
    F = G*m*m[:,newaxis]*r/rmag[:,:,newaxis]**3 # all force pairs
    for n in range(Nstars):
        F[n,n] = 5 # no self-forces
    p = p+sum(F,1)*dt #sum(F,1) is where i get a nan!!!!!!!!!!!!!!!!
    pos -= (p/m)*dt
    if Time <= 0:
        finished = True
    else:
        Time -= 1
What am I doing wrong? I don't fully understand nans, but I can't have them: if my visualization software reads a nan, nothing appears on screen. I know the nan comes from sum(F,1): I printed everything step by step until I got a nan, and that is where it appears. But how can summing produce a nan? Here is part of the text file that I am reading:
51.48855 4.74229 -85.24499
121.87149 11.44572 -140.79644
59.81673 68.8417 18.76767
31.95567 37.23007 6.59515
29.81066 34.76371 6.18374
41.35333 49.52844 14.12314
32.10481 38.46982 7.96628
48.13239 60.4019 37.45474
26.37793 34.53385 15.9054
76.02468 103.98826 25.96607
51.52072 71.17618 32.09829
please help
