Python: Create coordinate list (convert string to int) - python

I want to import several coordinates (possibly up to 20,000) from a text file.
These coordinates need to be added to a list that looks like the following:
coords = [[0,0],[1,0],[2,0],[0,1],[1,1],[2,1],[0,2],[1,2],[2,2]]
However, when I try to import the coordinates I get the following error:
invalid literal for int() with base 10
I can't figure out how to import the coordinates correctly.
Does anyone have any suggestions as to why this does not work?
I think there's some problem with creating the integers.
I use the following script:
Bronbestand = open("D:\\Documents\\SkyDrive\\afstuderen\\99 EEM - Abaqus 6.11.2\\scripting\\testuitlezen4.txt", "r")
headerLine = Bronbestand.readline()
valueList = headerLine.split(",")
xValueIndex = valueList.index("x")
#xValueIndex = int(xValueIndex)
yValueIndex = valueList.index("y")
#yValueIndex = int(yValueIndex)
coordList = []
for line in Bronbestand.readlines():
    segmentedLine = line.split(",")
    coordList.extend([segmentedLine[xValueIndex], segmentedLine[yValueIndex]])
coordList = [x.strip(' ') for x in coordList]
coordList = [x.strip('\n') for x in coordList]
coordList2 = []
#CoordList3 = [map(int, x) for x in coordList]
for i in coordList:
    coordList2 = [coordList[int(i)], coordList[int(i)]]
print "coordList = ", coordList
print "coordList2 = ", coordList2
#print "coordList3 = ", coordList3
The coordinates to be imported look like this (this is the file opened as "Bronbestand" in the script):
id,x,y,
1, -1.24344945, 4.84291601
2, -2.40876842, 4.38153362
3, -3.42273545, 3.6448431
4, -4.22163963, 2.67913389
5, -4.7552824, 1.54508495
6, -4.99013376, -0.313952595
7, -4.7552824, -1.54508495
8, -4.22163963, -2.67913389
9, -3.42273545, -3.6448431
Thus the script should result in:
[[-1.24344945, 4.84291601],[-2.40876842, 4.38153362],[-3.42273545, 3.6448431],[-4.22163963, 2.67913389],[-4.7552824, 1.54508495],[-4.99013376,-0.313952595],[-4.7552824, -1.54508495],[-4.22163963, -2.67913389],[-3.42273545, -3.6448431]]
I also tried importing the coordinates with the native python csv parser but this didn't work either.
Thank you all in advance for the help!

Your numbers are not integers so the conversion to int fails.
Try using float(i) instead of int(i) to convert into floating point numbers instead.
>>> int('1.5')
Traceback (most recent call last):
  File "<pyshell#1>", line 1, in <module>
    int('1.5')
ValueError: invalid literal for int() with base 10: '1.5'
>>> float('1.5')
1.5
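Applied to the file layout from the question, the same idea could look like the sketch below. It assumes the header names the columns `x` and `y` and that fields are comma-separated; the helper name `parse_coords` and the inline sample text are made up for illustration.

```python
def parse_coords(lines):
    """Return [[x, y], ...] from comma-separated lines with a header row."""
    lines = iter(lines)
    # Header such as "id,x,y," -> find the column positions by name.
    header = [name.strip() for name in next(lines).split(",")]
    x_idx, y_idx = header.index("x"), header.index("y")
    coords = []
    for line in lines:
        fields = line.split(",")
        if len(fields) <= max(x_idx, y_idx):
            continue  # skip blank or short lines
        # float() tolerates surrounding whitespace and newlines.
        coords.append([float(fields[x_idx]), float(fields[y_idx])])
    return coords


sample = """id,x,y,
1, -1.24344945, 4.84291601
2, -2.40876842, 4.38153362
"""
coords = parse_coords(sample.splitlines())
print(coords)
```

With a real file, passing the open file object instead of `sample.splitlines()` should work the same way, since the function only iterates over lines.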

Other answers have explained why your script fails. However, there is another issue here: you are massively reinventing the wheel.
This whole thing can be done in a couple of lines using the csv module and a list comprehension:
import csv
with open("test.csv") as file:
    data = csv.reader(file)
    next(data)
    print([[float(x) for x in line[1:]] for line in data])
Gives us:
[[-1.24344945, 4.84291601], [-2.40876842, 4.38153362], [-3.42273545, 3.6448431], [-4.22163963, 2.67913389], [-4.7552824, 1.54508495], [-4.99013376, -0.313952595], [-4.7552824, -1.54508495], [-4.22163963, -2.67913389], [-3.42273545, -3.6448431]]
We open the file, make a csv.reader() to parse the csv file, skip the header row, then make a list of the numbers parsed as floats, ignoring the first column.
As pointed out in the comments, as you are dealing with a lot of data, you may wish to iterate over the data lazily. While making a list is good to test the output, in general, you probably want a generator rather than a list. E.g:
([float(x) for x in line[1:]] for line in data)
Note that the file will need to remain open while you utilize this generator (remain inside the with block).
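A rough sketch of that lazy approach follows. It uses an in-memory `io.StringIO` as a stand-in for the real file, and the generator function name `iter_coords` is made up here; the point is only that the generator is consumed while the handle is still open.

```python
import csv
import io

# Stand-in for an open file; with a real file, use open(...) in the with block.
sample = io.StringIO("id,x,y,\n1, -1.5, 4.25\n2, -2.0, 4.5\n")


def iter_coords(handle):
    """Yield [x, y] pairs one row at a time instead of building a full list."""
    reader = csv.reader(handle)
    next(reader)  # skip the header row
    for row in reader:
        yield [float(row[1]), float(row[2])]


with sample as handle:
    # Consume the generator while the handle is still open.
    total_x = sum(x for x, _ in iter_coords(handle))

print(total_x)
```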

Related

Elliptic curve addition - how to add point coordinates from a file in Python

I'm just a tech newbie learning how EC cryptography works, and I stumbled into a problem with my Python code.
I'm testing basic elliptic curve operations like adding points, multiplying by G, etc. Let's say I have:
Ax = 0xbc46aa75e5948daa08123b36f2080d234aac274bf62fca8f9eb0aadf829c744a
Ay = 0xe5f28c3a044b1cac54a9b4bf719f02dfae93a0bae73832301e786104f43255a5
A = (Ax,Ay)
f = open('B-coordinates.txt', 'r')
data = f.read()
f.close()
print (data)
B = 'data'
where B-coordinates.txt contains lines like (0xe7e6bd3424a1e92abb45846c82d570f0596850661d1c952f9fe3564567d9b9e8,0x59c9e0bba945e45f40c0aa58379a3cb6a5a2283993e90c58654af4920e37f5)
Then I perform basic point addition A+B with add(A,B).
Because of B = 'data', I obviously get this error:
TypeError: unsupported operand type(s) for -: 'int' and 'str'
And if I use int(data) instead, I get:
Error: invalid literal for int() with base 10, because of the letters in the input (i.e. in the point coordinates).
So my question is: can someone knowledgeable in Python and elliptic curve calculations tell me how to load the coordinates of a point from a file into my .py script so as to avoid these int problems? I would be very grateful for an answer! I've been trying to figure out how to do it right for many hours now, and maybe I'm just goofing up, but I'd appreciate any hints.
You can load B from B-coordinates.txt by evaluating its content as Python code:
B = eval(data)
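However, eval leads to arbitrary code execution, so only use it if you trust the content of B-coordinates.txt. If you don't, parse the hexadecimal tuple manually (strip() removes a trailing newline before the parentheses are sliced off):
B = tuple(int(z, 16) for z in data.strip()[1:-1].split(','))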
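To add A and B component-wise in native Python 3 and keep a tuple, zip the two tuples and sum each pair of coordinates (note that this is plain integer addition, not elliptic-curve point addition, for which you would pass the parsed B to your own add(A, B) function):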
print(tuple([a + b for (a, b) in zip(A, B)]))
UPDATE:
Assume B-coordinates.txt looks like the following, as described in the OP's comment:
(0x1257e93a78a5b7d8fe0cf28ff1d8822350c778ac8a30e57d2acfc4d5fb8c192,0x1124ec11c77d356e042dad154e1116eda7cc69244f295166b54e3d341904a1a7)
(0x754e3239f325570cdbbf4a87deee8a66b7f2b33479d468fbc1a50743bf56cc18,0x673fb86e5bda30fb3cd0ed304ea49a023ee33d0197a695d0c5d98093c536683)
...
You can load the Bs from this file by doing:
f = open('B-coordinates.txt', 'r')
lines = f.read().splitlines()
f.close()
Bs = [eval(line) for line in lines]
As described above, to avoid arbitrary code execution, use the following instead:
Bs = [tuple([int(z, 16) for z in line[1:-1].split(',')]) for line in lines]
That way you can use, for instance, the first B pair as Bs[0], defined by the first line of B-coordinates.txt, that is:
(0x1257e93a78a5b7d8fe0cf28ff1d8822350c778ac8a30e57d2acfc4d5fb8c192,0x1124ec11c77d356e042dad154e1116eda7cc69244f295166b54e3d341904a1a7)
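A middle ground between eval and manual splitting is ast.literal_eval from the standard library, which accepts only Python literals (hex integers and tuples included) and raises an error on anything else. A minimal sketch, using short made-up values in place of real curve points and an inline string in place of the file:

```python
import ast

# Hypothetical file content: one "(0x..., 0x...)" tuple per line.
text = "(0x1f,0x2a)\n(0xff,0x10)\n"

points = []
for line in text.splitlines():
    line = line.strip()
    if not line:
        continue  # ignore blank lines
    point = ast.literal_eval(line)  # parses literals only, never runs code
    points.append(point)

print(points)
```

Each entry in `points` is then a tuple of two ints that can be handed to a point-addition function.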
You probably don't want to set B equal to 'data' (a string) but to data (the variable):
replace B = 'data' with B = data in the last line.
Your data seems to be a tuple of hex strings.
Use int(hex_string, 16) to convert them (since hex is base 16, not 10).
EDIT based on comment:
Assuming your file has one tuple per line, you can parse it like this:
with open("B-coordinates.txt", "r") as file:
    raw = file.read()
data = [tuple(int(hex_str, 16) for hex_str in item[1:-1].split(",")) for item in raw.splitlines() if item]
You can then get the first Bx, By like this:
Bx, By = data[0]

Python - live update graphs; to plot Time on x-axis

I have a Python script that collects data from a server in the form of
<hh-mm-ss>,<ddd>
Here, the first field is a timestamp and the second field is an integer. This data is written to a file.
I have another thread running that plots a live graph from that file.
So this file has data like:
<hh-mm-ss>,<ddd>
<hh-mm-ss>,<ddd>
<hh-mm-ss>,<ddd>
<hh-mm-ss>,<ddd>
Now I want to plot a time series graph in Matplotlib with the data shown above.
But when I try, it throws an error saying:
ValueError: invalid literal for int() with base 10: '15:53:09'
When I have plain numeric data like that shown below, things are fine:
<ddd>,<ddd>
<ddd>,<ddd>
<ddd>,<ddd>
<ddd>,<ddd>
UPDATE
My code that generates the graph from the file described above is shown below:
def animate(i):
    pullData = open("sampleText.txt","r").read()
    dataArray = pullData.split('\n')
    xar = []
    yar = []
    for eachLine in dataArray:
        if len(eachLine)>1:
            x,y = eachLine.split(',')
            xar.append(int(x))
            yar.append(int(y))
    ax1.clear()
    ax1.plot(xar,yar)
UPDATED CODE
def animate(i):
    print("inside animate")
    pullData = open("sampleText.txt","r").read()
    dataArray = pullData.split('\n')
    xar = []
    yar = []
    for eachLine in dataArray:
        if len(eachLine)>1:
            x,y = eachLine.split(',')
            timeX=datetime.strptime(x, "%H:%M:%S")
            xar.append(timeX.strftime("%H:%M:%S"))
            yar.append(float(y))
    ax1.clear()
    ax1.plot(xar,yar)
Now I am getting the error at this line (ax1.plot(xar,yar)).
How can I get past this?
You are trying to parse an integer from a string representing a timestamp. Of course it fails.
In order to be able to use the timestamps in a plot, you need to parse them to the proper type, e.g., datetime.time or datetime.datetime. You can use datetime.datetime.strptime(), dateutil.parser.parse() or maybe also time.strptime() for this.
Plotting the data is straight-forward, then. Have a look at the interactive plotting mode: matplotlib.pyplot.ion().
For reference/further reading:
https://pypi.python.org/pypi/python-dateutil
http://dateutil.readthedocs.org/en/latest/parser.html#dateutil.parser.parse
https://docs.python.org/2/library/datetime.html#datetime.datetime.strptime
https://docs.python.org/2/library/time.html#time.strptime
http://matplotlib.org/api/pyplot_api.html#matplotlib.pyplot.ion
Plotting time in Python with Matplotlib
How to iterate over the file in python
Based on your code I have created an example. I have inlined some notes as to why I think it's better to do it this way.
from datetime import datetime
import matplotlib.pyplot as plt

# use with-statement to make sure the file is eventually closed
with open("sampleText.txt") as f:
    data = []
    # iterate the file using the file object's iterator interface
    for line in f:
        try:
            # use a name other than f so the file handle is not shadowed
            t, v = line.split(",")
            # parse timestamp and number and append it to data list
            data.append((datetime.strptime(t, "%H:%M:%S"), float(v)))
        except ValueError:
            # something went wrong: inspect later and continue for now
            print("failed to parse line:", line)
# split columns to separate variables
x,y = zip(*data)
# plot
plt.plot(x,y)
plt.show()
plt.close()
For further reading:
https://docs.python.org/2/reference/datamodel.html#context-managers
https://docs.python.org/2/library/stdtypes.html#file-objects
The error tells you the cause of the problem: You're trying to convert a string, such as '15:53:09', into an integer. This string is not a valid number.
Instead, you should either look into using a datetime object from the datetime module to work with date/time values, or at least split the string into fields using ':' as the delimiter and then use each field separately.
Consider this brief demo:
>>> time = '15:53:09'
>>> time.split(':')
['15', '53', '09']
>>> [int(v) for v in time.split(':')]
[15, 53, 9]
>>> int(time) # expect exception
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: invalid literal for int() with base 10: '15:53:09'
>>>
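If a plain numeric x-axis is enough, the parsed time can be collapsed into seconds since midnight. A small sketch (the helper name `seconds_since_midnight` is made up here); it lets strptime validate the format rather than splitting by hand:

```python
from datetime import datetime


def seconds_since_midnight(stamp):
    """Convert an 'HH:MM:SS' string to a number of seconds."""
    t = datetime.strptime(stamp, "%H:%M:%S").time()
    return t.hour * 3600 + t.minute * 60 + t.second


secs = seconds_since_midnight("15:53:09")
print(secs)
```

A malformed string such as '15-53-09' raises ValueError here, so bad lines can be caught and skipped in the reading loop.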

Python, convert all entries of list from string to float

I am brand new to Python and looking up examples for what I want to do. I am not sure what is wrong with this loop. What I would like to do is read a CSV file line by line and, for each line:
Split by comma
Remove the first entry (which is a name) and store it as name
Convert all other entries to floats
Store name and the float entries in my Community class
This is what I am trying at the moment:
class Community:
    num = 0
    def __init__(self, inName, inVertices):
        self.name = inName
        self.vertices = inVertices
        Community.num += 1
allCommunities = []
f = open("communityAreas.csv")
for i, line in enumerate(f):
    entries = line.split(',')
    name = entries.pop(0)
    for j, vertex in entries: entries[j] = float(vertex)
    print name+", "+entries[0]+", "+str(type(entries[0]))
    allCommunities.append(Community(name, entries))
f.close()
The error I am getting is:
>>>>> PYTHON ERROR!!! Traceback (most recent call last):
  File "alexChicago.py", line 86, in <module>
    for j, vertex in entries: entries[j] = float(vertex)
ValueError: too many values to unpack
It may be worth pointing out that this is running in omegalib, a library for a visual cluster that runs in C and interprets Python.
I think you forgot the enumerate() function on line 86; should be
for j, vertex in enumerate(entries): entries[j] = float(vertex)
If there's always a name and then a variable number of float values, it sounds like you need to split twice: the first time with a maxsplit of 1, and the other as many times as possible. Example:
name, float_values = line.split(',',1)
float_values = [float(x) for x in float_values.split(',')]
I am not entirely sure what you want to achieve here, but if the goal is to convert all the elements in entries to float, this should be sufficient on line 86 (the list() wrapper is needed in Python 3, where map() returns an iterator):
entries = list(map(float, entries))
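Combining the suggestions above, one possible shape for the per-line parsing is sketched below. The function name `parse_line` and the sample row are made up for illustration; the real file's contents are not shown in the question.

```python
def parse_line(line):
    """Split 'name,v1,v2,...' into (name, [floats])."""
    # maxsplit=1 separates the name from the rest of the row.
    name, rest = line.strip().split(",", 1)
    # Skip empty fields such as the one left by a trailing comma.
    values = [float(x) for x in rest.split(",") if x.strip()]
    return name, values


name, vertices = parse_line("Loop,41.88,-87.63,41.89,-87.62\n")
print(name, vertices)
```

The returned pair could then be passed straight to the Community constructor from the question.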

Reading in multiple hdf5 files and appending them to a new dictionary

I have a list of hdf5 files which I would like to open and read in the appropriate values into a new dictionary and eventually write to a text file. I don't necessarily know the values, so the user defines them in an array as an input into the code. The number of files needed is defined by the number of days worth of data the user wants to look at.
new_data_dic = {}
for j in range(len(values)):
    new_data_dic[values[j]] = rbsp_ephm[values[j]]
for i in (np.arange(len(filenames_a)-1)+1):
    rbsp_ephm = h5py.File(filenames_a[i])
    for j in range(len(values)):
        new_data_dic[values[j]].append(rbsp_ephm[values[j]])
This works fine if I only have one file, but if I have two or more it seems to close the key? I'm not sure if this is exactly what is happening, but when I ask what new_data_dic is, for values it gives {'Bfs_geo_a': <Closed HDF5 dataset>,... which will not write to a text file. I've tried closing the hdf5 file before opening the next (rbsp_ephm.close()) but I get the same error.
Thanks for any and all help!
I don't really understand your problem... Are you trying to create a list of HDF5 datasets?
Or did you just forget the [()] to access the values in the dataset itself?
Here is a simple standalone example that works just fine:
import h5py
# File creation
filenames_a = []
values = ['values/toto', 'values/tata', 'values/tutu']
nb_file = 5
tmp = 0
for i in range(nb_file):
    fname = 'file%s.h5' % i
    filenames_a.append(fname)
    file = h5py.File(fname, 'w')
    grp = file.create_group('values')
    for value in values:
        file[value] = tmp
        tmp += 1
    file.close()
# the thing you want
new_data_dict = {value: [] for value in values}
for fname in filenames_a:
    rbsp_ephm = h5py.File(fname, 'r')
    for value in values:
        new_data_dict[value].append(rbsp_ephm[value][()])
print new_data_dict
It returns :
{'values/tutu': [2, 5, 8, 11, 14], 'values/toto': [0, 3, 6, 9, 12], 'values/tata': [1, 4, 7, 10, 13]}
Does it answer your question?
This may not be the exact solution, but you could try extracting the data as NumPy arrays, which are a more flexible format than h5py datasets. See below how to do it:
>>> print type(file['Average/u'])
<class 'h5py.highlevel.Dataset'>
>>> print type(file['Average/u'][:])
<type 'numpy.ndarray'>
And just in case, you should try to use a more "pythonic" way for your loop, that is:
for j in values:
    new_data_dic[j] = rbsp_ephm[j]
instead of:
for j in range(len(values)):
    new_data_dic[values[j]] = rbsp_ephm[values[j]]

Writing a random amount of random numbers to a file and returning their squares

So, I'm trying to write a random amount of random whole numbers (in the range of 0 to 1000), square these numbers, and return these squares as a list. Initially, I started off writing to a specific txt file that I had already created, but it didn't work properly. I looked for some methods I could use that might make things a little easier, and I found the tempfile.NamedTemporaryFile method that I thought might be useful. Here's my current code, with comments provided:
# This program calculates the squares of numbers read from a file, using several functions
# reads file- or writes a random number of whole numbers to a file -looping through numbers
# and returns a calculation from (x * x) or (x**2);
# the results are stored in a list and returned.
# Update 1: after errors and logic problems, found Python method tempfile.NamedTemporaryFile:
# This function operates exactly as TemporaryFile() does, except that the file is guaranteed to have a visible name in the file system, and creates a temprary file that can be written on and accessed
# (say, for generating a file with a list of integers that is random every time).
import random, tempfile
# Writes to a temporary file for a length of random (file_len is >= 1 but <= 100), with random numbers in the range of 0 - 1000.
def modfile(file_len):
    with tempfile.NamedTemporaryFile(delete = False) as newFile:
        for x in range(file_len):
            newFile.write(str(random.randint(0, 1000)))
        print(newFile)
    return newFile
# Squares random numbers in the file and returns them as a list.
def squared_num(newFile):
    output_box = list()
    for l in newFile:
        exp = newFile(l) ** 2
        output_box[l] = exp
    print(output_box)
    return output_box
print("This program reads a file with numbers in it - i.e. prints numbers into a blank file - and returns their conservative squares.")
file_len = random.randint(1, 100)
newFile = modfile(file_len)
output = squared_num(file_name)
print("The squared numbers are:")
print(output)
Unfortunately, now I'm getting this error in line 15, in my modfile function: TypeError: 'str' does not support the buffer interface. As someone who's relatively new to Python, can someone explain why I'm having this, and how I can fix it to achieve the desired result? Thanks!
EDIT: now fixed code (many thanks to unutbu and Pedro)! Now: how would I be able to print the original file numbers alongside their squares? Additionally, is there any minimal way I could remove decimals from the outputted float?
By default tempfile.NamedTemporaryFile creates a binary file (mode='w+b'). To open the file in text mode and be able to write text strings (instead of byte strings), you need to change the temporary file creation call to not use the b in the mode parameter (mode='w+'):
tempfile.NamedTemporaryFile(mode='w+', delete=False)
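The difference between the two modes can be seen side by side in the short sketch below. It only uses the standard library; the variable names are illustrative, and the files are removed at the end because delete=False leaves cleanup to the caller.

```python
import os
import tempfile

# Default mode is binary ('w+b'): write() expects bytes.
with tempfile.NamedTemporaryFile(delete=False) as binary_file:
    binary_file.write(b"42\n")
    binary_name = binary_file.name

# Text mode ('w+'): write() expects str.
with tempfile.NamedTemporaryFile(mode="w+", delete=False) as text_file:
    text_file.write("42\n")
    text_name = text_file.name

# Both files end up with the same content on disk.
with open(binary_name) as f1, open(text_name) as f2:
    contents = (f1.read(), f2.read())

print(contents)

# delete=False leaves cleanup to the caller.
os.remove(binary_name)
os.remove(text_name)
```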
You need to put newlines after each int, lest they all run together creating a huge integer:
newFile.write(str(random.randint(0, 1000))+'\n')
(Also set the mode, as explained in PedroRomano's answer):
with tempfile.NamedTemporaryFile(mode = 'w+', delete = False) as newFile:
modfile returns a closed filehandle. You can still get a filename out of it, but you can't read from it. So in modfile, just return the filename:
return newFile.name
And in the main part of your program, pass the filename on to the squared_num function:
filename = modfile(file_len)
output = squared_num(filename)
Now inside squared_num you need to open the file for reading.
with open(filename, 'r') as f:
    for l in f:
        exp = float(l)**2 # `l` is a string. Convert to float before squaring
        output_box.append(exp) # build output_box with append
Putting it all together:
import random, tempfile
def modfile(file_len):
    with tempfile.NamedTemporaryFile(mode = 'w+', delete = False) as newFile:
        for x in range(file_len):
            newFile.write(str(random.randint(0, 1000))+'\n')
        print(newFile)
    return newFile.name
# Squares random numbers in the file and returns them as a list.
def squared_num(filename):
    output_box = list()
    with open(filename, 'r') as f:
        for l in f:
            exp = float(l)**2
            output_box.append(exp)
    print(output_box)
    return output_box
print("This program reads a file with numbers in it - i.e. prints numbers into a blank file - and returns their conservative squares.")
file_len = random.randint(1, 100)
filename = modfile(file_len)
output = squared_num(filename)
print("The squared numbers are:")
print(output)
PS. Don't write lots of code without running it. Write little functions, and test that each works as expected. For example, testing modfile would have revealed that all your random numbers were being concatenated. And printing the argument sent to squared_num would have shown it was a closed filehandle.
Testing the pieces gives you firm ground to stand on and lets you develop in an organized way.
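As for the follow-up in the question's EDIT (showing each original number next to its square, without decimals), one possible approach is to convert with int() instead of float() and return pairs. This is only a sketch; the function names `write_numbers` and `number_square_pairs` are made up and are not part of the code above.

```python
import os
import random
import tempfile


def write_numbers(count):
    """Write `count` random integers, one per line, and return the file name."""
    with tempfile.NamedTemporaryFile(mode="w+", delete=False) as tmp:
        for _ in range(count):
            tmp.write("%d\n" % random.randint(0, 1000))
    return tmp.name


def number_square_pairs(filename):
    """Return (n, n*n) tuples; int() keeps the squares free of decimals."""
    with open(filename) as f:
        return [(int(line), int(line) ** 2) for line in f if line.strip()]


filename = write_numbers(5)
pairs = number_square_pairs(filename)
os.remove(filename)  # delete=False leaves cleanup to the caller

for n, sq in pairs:
    print(n, sq)
```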
