I'm trying to read data from stdin. I'm actually using Ctrl+C, Ctrl+V to paste the values into cmd, but the process stops at some point, and it's always the same point. The input is a .in file: the first row contains a single number and the next 3 rows each contain a set of numbers separated by spaces. I'm using Python 3.9.9. The problem occurs only with longer files (more than 10000 elements per set); with short input everything is fine. It seems like the memory just runs out. I tried the following approach:
def readData():
    # Read input
    for line in range(5):
        x = list(map(int, input().rsplit()))
        if(line == 0):
            nodes_num = x[0]
        if(line == 1):
            masses_list = x
        if(line == 2):
            init_seq_list = x
        if(line == 3):
            fin_seq_list = x
    return nodes_num, masses_list, init_seq_list, fin_seq_list
and the data which works:
6
2400 2000 1200 2400 1600 4000
1 4 5 3 6 2
5 3 2 4 6 1
and the long input file:
https://pastebin.com/atAcygkk
It stops at the sequence ... 2421 1139 322], which is part of the 4th row.
To read input from "standard input", you just need to use the stdin stream. Since your data is all on lines you can just read until the EOL delimiter, not having to track lines yourself with some index number. This code will work when run as python3.9 sowholeinput.py < atAcygkk.txt, or cat atAcygkk.txt| python3.9 sowholeinput.py.
import sys

def read_data():
    stream = sys.stdin
    # First row: a single number; next three rows: space-separated integers
    num = int(stream.readline())
    masses = [int(t) for t in stream.readline().split()]
    init_seq = [int(t) for t in stream.readline().split()]
    fin_seq = [int(t) for t in stream.readline().split()]
    return num, masses, init_seq, fin_seq
Interestingly, as you describe, it does not work when the text is pasted with the terminal's cut-and-paste. This points to a limitation of that input method, not of Python itself.
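As a minimal sketch of the same idea, the parsing can be written against any text stream, which makes it easy to try without a terminal at all. It assumes the four-row layout from the question; the helper name read_all_tokens is made up for illustration and is not part of any library.

```python
import io
import sys


def read_all_tokens(stream=sys.stdin):
    """Parse the four-row input format from any text stream (hypothetical helper)."""
    lines = stream.read().splitlines()
    num = int(lines[0])
    # The next three rows each hold a whitespace-separated list of integers.
    masses, init_seq, fin_seq = (
        [int(t) for t in row.split()] for row in lines[1:4]
    )
    return num, masses, init_seq, fin_seq


# Usage with an in-memory stream standing in for a redirected file or pipe.
sample = io.StringIO("6\n2400 2000 1200 2400 1600 4000\n1 4 5 3 6 2\n5 3 2 4 6 1\n")
print(read_all_tokens(sample))
```

Called with no argument, the function reads from sys.stdin, so it can be used with the same redirection or pipe shown above.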
I am not a Python programmer, but I need to use a function from the SciPy library. I just want to repeat the inner loop a couple of times, each time with a different column index. Here is my code so far:
from scipy.stats import pearsonr
fileName = open('ILPDataset.txt', 'r')
attributeValue, classValue = [], []
for index in range(0, 10, 1):
    for line in fileName.readlines():
        data = line.split(',')
        attributeValue.append(float(data[index]))
        classValue.append(float(data[10]))
    print(index)
    print(pearsonr(attributeValue, classValue))
And I am getting the following output:
0
(-0.13735062681256097, 0.0008840631556260505)
1
(-0.13735062681256097, 0.0008840631556260505)
2
(-0.13735062681256097, 0.0008840631556260505)
3
(-0.13735062681256097, 0.0008840631556260505)
4
(-0.13735062681256097, 0.0008840631556260505)
5
(-0.13735062681256097, 0.0008840631556260505)
6
(-0.13735062681256097, 0.0008840631556260505)
7
(-0.13735062681256097, 0.0008840631556260505)
8
(-0.13735062681256097, 0.0008840631556260505)
9
(-0.13735062681256097, 0.0008840631556260505)
As you can see, the index is changing, but the result of the function is always the same, as if the index were 0.
When I run the script several times, changing the index value by hand like this:
attributeValue.append(float(data[0]))
attributeValue.append(float(data[1]))
...
attributeValue.append(float(data[9]))
everything is OK and I get correct results, but I can't do it in a single loop. What am I doing wrong?
EDIT:
Test file:
62,1,6.8,3,542,116,66,6.4,3.1,0.9,1
40,1,1.9,1,231,16,55,4.3,1.6,0.6,1
63,1,0.9,0.2,194,52,45,6,3.9,1.85,2
34,1,4.1,2,289,875,731,5,2.7,1.1,1
34,1,4.1,2,289,875,731,5,2.7,1.1,1
34,1,6.2,3,240,1680,850,7.2,4,1.2,1
20,1,1.1,0.5,128,20,30,3.9,1.9,0.95,2
84,0,0.7,0.2,188,13,21,6,3.2,1.1,2
57,1,4,1.9,190,45,111,5.2,1.5,0.4,1
52,1,0.9,0.2,156,35,44,4.9,2.9,1.4,1
57,1,1,0.3,187,19,23,5.2,2.9,1.2,2
38,0,2.6,1.2,410,59,57,5.6,3,0.8,2
38,0,2.6,1.2,410,59,57,5.6,3,0.8,2
30,1,1.3,0.4,482,102,80,6.9,3.3,0.9,1
17,0,0.7,0.2,145,18,36,7.2,3.9,1.18,2
46,0,14.2,7.8,374,38,77,4.3,2,0.8,1
Expected results of pearsonr for the 10 script runs (data[0] to data[9]):
data[0] (0.06050513030608389, 0.8238536636813034)
data[1] (-0.49265895172303803, 0.052525691067199995)
data[2] (-0.5073312383613632, 0.0448647312201305)
data[3] (-0.4852842899321005, 0.056723468068371544)
data[4] (-0.2919584357031029, 0.27254138535817224)
data[5] (-0.41640591455640696, 0.10863082761524119)
data[6] (-0.46954072465442487, 0.0665061785375443)
data[7] (0.08874739193909209, 0.7437895010751641)
data[8] (0.3104260624799073, 0.24193152445774302)
data[9] (0.2943030868699842, 0.26853066217221616)
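Your inner loop only does any work on the first pass: after that, fileName.readlines() returns an empty list because the file has already been read to its end, so the two lists never change. Instead, read the file once and turn each line into a list of floats:
data = []
with open('ILPDataset.txt') as fileName:
    for line in fileName:
        line = line.strip()
        line = line.split(',')
        line = [float(item) for item in line[:11]]
        data.append(line)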
Transpose the data so that each list in data has the column values from the original file. data --> [[column 0 items], [column 1 items],[column 2 items],...]
data = zip(*data) # for Python 2.7x
#data = list(zip(*data)) # for python 3.x
Correlate:
for n in [0,1,2,3,4,5,6,7,8,9]:
    corr = pearsonr(data[n], data[10])
    print('data[{}], {}'.format(n, corr))
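To make the transposition step concrete, here is a small sketch on three rows from the test file, truncated to their first three columns for brevity. The variable names rows and columns are only for illustration.

```python
# Three sample rows from the question's test file (first three columns only).
rows = [
    [62.0, 1.0, 6.8],
    [40.0, 1.0, 1.9],
    [63.0, 1.0, 0.9],
]

# zip(*rows) groups the i-th item of every row together, i.e. it yields the columns.
columns = list(zip(*rows))

print(columns[0])  # all values from column 0
print(columns[2])  # all values from column 2
```

Each entry of columns is a tuple holding one column of the file, which is the shape pearsonr needs for its two arguments.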
@wwii's answer is very good.
Only one suggestion: list(zip(*data)) seems a bit overkill to me. zip is a general tool for combining iterables, possibly of different types and lengths, into tuples, which in this case then have to be turned back into a list with list().
So why not just use a plain transpose operation, which is what this is?
import numpy
# ...
data = numpy.transpose(data)
which does the same job, probably faster (not measured) and more deterministically.
I hope you are having pleasant holidays so far!
I am trying to read a .txt file in which values are stored one per line, and then do calculations with those values.
I am trying to figure out how to do this with a Python script.
Let's say this is the content of my text file:
0.1 #line(0)
1.0
2.0
0.2 #line(3)
1.1
2.1
0.3 #line(6)
1.2
2.2
...
Basically, I would like to implement an operation that calculates
line(0)*line(1)*line(2) in the first step, writes the result into another .txt file, and then continues with line(3)*line(4)*line(5) and so on:
with open('/filename.txt') as file_:
    for line in file_:
        for i in range(0,999,1):
            file = open('/anotherfile.txt')
            file.write(str(line(i)*line(i+1)*line(i+2) + '\n')
            i += 3
Does anyone have an idea how to get this working?
Any tips would be appreciated!
Thanks,
Steve
This would multiply three numbers at a time and write the product of the three into another file:
with open('numbers_in.txt') as fobj_in, open('numbers_out.txt', 'w') as fobj_out:
    while True:
        try:
            numbers = [float(next(fobj_in)) for _ in range(3)]
            product = numbers[0] * numbers[1] * numbers[2]
            fobj_out.write('{}\n'.format(product))
        except StopIteration:
            break
Here next(fobj_in) always tries to read the next line.
If there are no more lines, a StopIteration exception is raised.
The except StopIteration: clause catches this exception and terminates the loop.
The list comprehension [float(next(fobj_in)) for _ in range(3)]
converts the three numbers read from three lines into floating-point numbers.
Now, multiplying the three numbers is a matter of indexing into the list numbers.
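For comparison, here is a sketch of an alternative way to group the lines in threes, using zip over a single shared iterator instead of next() and try/except. This is a different technique from the code above, not a restatement of it, and the function name products_of_triples is hypothetical.

```python
import io


def products_of_triples(lines):
    """Yield the product of every three consecutive numeric lines (hypothetical helper)."""
    numbers = (float(line) for line in lines)
    # zip pulls from the same iterator three times per step, so each tuple
    # holds three consecutive values; a trailing incomplete group is dropped.
    for a, b, c in zip(numbers, numbers, numbers):
        yield a * b * c


# Usage with an in-memory stream standing in for the input file.
sample = io.StringIO("0.5\n2.0\n4.0\n1.0\n3.0\n5.0\n")
for product in products_of_triples(sample):
    print(product)
```

Like the loop above, this silently ignores leftover lines when the line count is not a multiple of three.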
You can do this:
file = open('/anotherfile.txt', 'w')
i = 0
temp = 1
with open('/filename.txt') as file_:
    for line in file_:
        temp = temp * float(line)  # the values are decimals, so use float
        if i % 3 == 2:  # every third line completes a group
            file.write(str(temp) + '\n')
            temp = 1
        i += 1
file.close()
I have three lists:
alist=[1,2,3,4,5]
blist=['a','b','c','d','e']
clist=['#','#','$','&','*']
I want my output in this format:
1 2 3 4 5
a b c d e
# # $ & *
I am able to print them in the correct format, but when the lists have many elements the output actually looks like this:
1 2 3 4 5 6 ..........................................................................
................................................................................
a b c d e ............................................................................
......................................................................................
# # $ & * .............................................................................
.......................................................................................
but I want my output like this:
12345....................................................................
abcde...................................................................
##$&*...................................................................
............................................................... {this line is from alist}
................................................................ {this line is from blist}
................................................................ {this line is from clist}
Try the following:
term_width = 80
all_lists = (alist, blist, clist)
length = max(map(len, all_lists))
for offset in range(0, length, term_width):
    print('\n'.join(''.join(map(str, l[offset:offset+term_width])) for l in all_lists))
This assumes the terminal width is 80 characters, which is the default. You might want to detect its actual width with the curses library or something based on it.
Either way, to adapt to any output width you only need to change the term_width value and the code will use it.
It also assumes all elements are 1 character long. If that's not the case, please clarify.
If you need to detect terminal width, you may find some solutions here: How to get Linux console window width in Python
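One possible way to pick up the width automatically, as a sketch: the standard library's shutil.get_terminal_size reports the current terminal size and falls back to a given default when no terminal is attached. The helper wrap_rows is a hypothetical name that wraps the loop from above in a function, and it still assumes 1-character elements.

```python
import shutil


def wrap_rows(rows, width):
    """Return output lines: one line per row for each chunk of `width` items (hypothetical helper)."""
    longest = max(map(len, rows))
    lines = []
    for offset in range(0, longest, width):
        for row in rows:
            lines.append(''.join(map(str, row[offset:offset + width])))
    return lines


alist = [1, 2, 3, 4, 5]
blist = ['a', 'b', 'c', 'd', 'e']
clist = ['#', '#', '$', '&', '*']

# Falls back to 80 columns when the size cannot be determined (e.g. output is piped).
term_width = shutil.get_terminal_size(fallback=(80, 24)).columns
print('\n'.join(wrap_rows((alist, blist, clist), term_width)))
```

Keeping the chunking in a function that takes the width as a parameter makes it easy to test with a small width, independent of the real terminal.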
Firstly, here is the relevant part of the code:
stokes_list = np.zeros(shape=(numrows,1024)) # 'numrows' defined earlier
for i in range(numrows):
    epoch_name = y['filename'][i] # 'y' is an array from earlier
    os.system('pdv -t {0} > temp.txt '.format(epoch_name)) # 'pdv' is a command from another piece of software - here I copy the output into a temporary file
    stokes_line = np.genfromtxt('temp.txt', usecols=3, dtype=[('stokesI','float')], skip_header=1)
    stokes_list = np.vstack((stokes_line,stokes_line))
So, basically, every time the code loops around, stokes_line pulls one of the columns (4th one) from the file temp.txt, and I want it to add a line to stokes_list each time.
For example, if the first stokes_line is
1.1 2.2 3.3
and the second is
4.4 5.5 6.6
then stokes_list will be
1.1 2.2 3.3
4.4 5.5 6.6
and will keep growing...
It's not working at the moment, because I think that the line:
stokes_list = np.vstack((stokes_line,stokes_line))
is not correct. It's only stacking 2 lists - which makes sense as I only have 2 arguments. I basically would like to know how I keep stacking again and again.
Any help would be very gratefully received!
If it is needed, here is an example of the format of the temp.txt file:
File: t091110_065921.SFTC Src: J1903+0925 Nsub: 1 Nch: 1 Npol: 4 Nbin: 1024 RMS: 0.00118753
0 0 0 0.00148099 -0.00143755 0.000931365 -0.00296775
0 0 1 0.000647476 -0.000896698 0.000171287 0.00218597
0 0 2 0.000704697 -0.00052846 -0.000603842 -0.000868739
0 0 3 0.000773361 -0.00234724 -0.0004112 0.00358033
0 0 4 0.00101559 -0.000691062 0.000196023 -0.000163109
0 0 5 -0.000220367 -0.000944024 0.000181002 -0.00268215
0 0 6 0.000311783 0.00191545 -0.00143816 -0.00213856
Calling vstack again and again is not a good idea, because every call copies the whole array.
Create a normal Python list, .append to it, and then pass the whole list to np.vstack to create the new array once.
stokes_list = []
for i in range(numrows):
    ...
    stokes_line = ...
    stokes_list.append(stokes_line)
big_stokes = np.vstack(stokes_list)
You already know the final size of the stokes_list array since you know numrows. So it seems you don't need to grow an array (which is very inefficient). You can simply assign the correct row at each iteration.
Simply replace your last line with:
stokes_list[i] = stokes_line
By the way, regarding your non-working line, I think you meant:
stokes_list = np.vstack((stokes_list, stokes_line))
where you're replacing stokes_list by its new value.
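The two suggestions above (collect in a list and stack once, or preallocate and assign rows) can be compared in a small sketch. The rows here are made-up stand-ins for the columns that np.genfromtxt would return, using plain float arrays rather than the structured dtype from the question.

```python
import numpy as np

# Stand-ins for the per-epoch stokes_line arrays (assumed to be plain float arrays).
rows = [np.array([1.1, 2.2, 3.3]), np.array([4.4, 5.5, 6.6])]

# Approach 1: collect the rows in a Python list, then stack once at the end.
collected = []
for row in rows:
    collected.append(row)
stacked = np.vstack(collected)

# Approach 2: preallocate the final array and assign one row per iteration.
filled = np.zeros((len(rows), 3))
for i, row in enumerate(rows):
    filled[i] = row

print(stacked.shape)
print(np.array_equal(stacked, filled))
```

Both approaches build the same 2-D array; the second only applies when the number of rows and the row length are known up front, as they are in the question.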