Why is .readlines() making a list of individual characters? - python

I have a text file of this format:
EFF 3500. GRAVITY 0.00000 SDSC GRID [+0.0] VTURB 2.0 KM/S L/H 1.25
wl(nm) Inu(ergs/cm**2/s/hz/ster) for 17 mu in 1221 frequency intervals
1.000 .900 .800 .700 .600 .500 .400 .300 .250 .200 .150 .125 .100 .075 .050 .025 .010
9.09 0.000E+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
9.35 0.000E+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
9.61 0.000E+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
9.77 0.000E+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
9.96 0.000E+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
10.20 0.000E+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
10.38 0.000E+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
...more numbers
I'm trying to make it so File[0][0] will print the word "EFF" and so on.
import sys
import numpy as np
from math import *
import matplotlib.pyplot as plt
print 'Number of arguments:', len(sys.argv), 'arguments.'
print 'Argument List:', str(sys.argv)
z = np.array(sys.argv) #store all of the file names into array
i = len(sys.argv) #the length of the filenames array
File = open(str(z[1])).readlines() #load spectrum file
for n in range(0, len(File)):
    File[n].split()
for n in range(0, len(File[1])):
    print File[1][n]
However, it keeps outputting individual characters, as if each list index were a single character. This includes whitespace too. I have split() in a loop because readlines().split() gives an error.
Output:
E
F
F
3
5
0
0
.
G
R
A
V
I
...etc.
What am I doing wrong?

>>> text = """some
... multiline
... text
... """
>>> lines = text.splitlines()
>>> for i in range(len(lines)):
...     lines[i].split() # split *returns* the list of tokens
...                      # it does *not* modify the string in place
...
['some']
['multiline']
['text']
>>> lines #strings unchanged
['some', 'multiline', 'text']
>>> for i in range(len(lines)):
...     lines[i] = lines[i].split() # you have to modify the list
...
>>> lines
[['some'], ['multiline'], ['text']]
If you want a one-liner do:
>>> words = [line.split() for line in text.splitlines()]
>>> words
[['some'], ['multiline'], ['text']]
Using a file object it should be:
with open(z[1]) as f:
    File = [line.split() for line in f]
By the way, you are using an anti-idiom when looping. If you want to loop over an iterable simply do:
for element in iterable:
    #...
If you also need the index of the element, use enumerate:
for index, element in enumerate(iterable):
    #...
In your case:
for i, line in enumerate(File):
    File[i] = line.split()
for word in File[1]:
    print word

You want something like this:
for line in File:
    fields = line.split()
    #fields[0] is "EFF", fields[1] is "3500.", etc.
The split() method returns a list of strings; it does not modify the object that it is called on.
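As a minimal sketch of the point made in the answers above (Python 3 syntax, with a made-up, abbreviated sample standing in for the real spectrum file), the difference between indexing a raw line and indexing a stored split() result looks like this:

```python
import io

# Abbreviated stand-in for the spectrum file (made-up sample, not the real data).
sample = (
    "EFF 3500. GRAVITY 0.00000\n"
    "wl(nm) Inu(ergs/cm**2/s/hz/ster)\n"
    "9.09 0.000E+00 0 0\n"
)

# Indexing a plain string yields single characters, which is what the question saw.
raw_lines = sample.splitlines()
first_char = raw_lines[0][0]

# Storing the result of split() gives a list of tokens per line instead.
with io.StringIO(sample) as f:
    File = [line.split() for line in f]

print(first_char)   # E
print(File[0][0])   # EFF
print(File[2][:2])  # ['9.09', '0.000E+00']
```

With a real file, io.StringIO(sample) would be replaced by open(path); the list comprehension stays the same.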

Related

Mimicking 'n' Nested For Loops Using Recursion

I would like to create some number of for-loops equal to the length of a list, and iterate through the values in that list. For example, if I had the list:
[1,2,3,4]
I would like the code to function like:
for i in range(1):
    for j in range(2):
        for k in range(3):
            for l in range(4):
                myfunc(inputs)
I understand I would need to do this recursively, but I'm not quite sure how. Ideally, I would even be able to iterate through these list values by a variable step; perhaps I want to count by two's for one loop, by .8's for another, etc. In that case, I would probably deliver the information in a format like this:
[[value,step],[value,step] ... [value,step],[value,step]]
So, how could I do this?
Not quite sure what you want at the very end, but here's a way to recursively set-up your loops:
test = [1,2,3,4]
def recursive_loop(test):
    if len(test) == 1:
        for i in range(test[0]):
            print('hi') # Do whatever you want here
    elif len(test) > 1:
        for i in range(test[0]):
            recursive_loop(test[1:])
recursive_loop(test)
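The recursive version above runs the innermost body the right number of times but does not hand the loop indices to it. One possible variant, sketched here with a hypothetical func callback and indices accumulator that are not part of the original answer, carries the indices down the recursion:

```python
def recursive_loop(bounds, func, indices=()):
    """Call func once per index combination, like len(bounds) nested for loops."""
    if not bounds:
        # No loops left: every index is fixed, so run the innermost body.
        func(*indices)
        return
    for i in range(bounds[0]):
        # Fix the current loop variable and recurse into the remaining loops.
        recursive_loop(bounds[1:], func, indices + (i,))

calls = []
recursive_loop([1, 2, 3], lambda *idx: calls.append(idx))
print(calls)
```

Each call receives one tuple of indices, outermost loop first, in the same order the hand-written nested loops would produce them.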
You can certainly do it with recursion, but there's already a library function for that:
itertools.product
from itertools import product
def nested_loops(myfunc, l):
    for t in product(*(range(n) for n in l)):
        myfunc(*t)
## OR EQUIVALENTLY
# def nested_loops(myfunc, l):
#     for t in product(*map(range, l)):
#         myfunc(*t)
nested_loops(print, [1, 2, 3, 4])
# 0 0 0 0
# 0 0 0 1
# 0 0 0 2
# 0 0 0 3
# 0 0 1 0
# 0 0 1 1
# ...
# 0 1 2 1
# 0 1 2 2
# 0 1 2 3
You can of course include steps too. The built-in zip function is useful here.
def nested_loops_with_steps_v1(myfunc, upperbounds, steps):
    for t in product(*(range(0, n, s) for n,s in zip(upperbounds, steps))):
        myfunc(*t)
nested_loops_with_steps_v1(print, [1,2,8,10], [1,1,4,5])
# 0 0 0 0
# 0 0 0 5
# 0 0 4 0
# 0 0 4 5
# 0 1 0 0
# 0 1 0 5
# 0 1 4 0
# 0 1 4 5
Or if your steps and upperbounds are already zipped together:
def nested_loops_with_steps_v2(myfunc, l):
    for t in product(*(range(0, n, s) for n,s in l)):
        myfunc(*t)
nested_loops_with_steps_v2(print, [(1,1),(2,1),(8,4),(10,5)])
# 0 0 0 0
# 0 0 0 5
# 0 0 4 0
# 0 0 4 5
# 0 1 0 0
# 0 1 0 5
# 0 1 4 0
# 0 1 4 5
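The question also mentions fractional steps such as .8, which range does not accept. A rough sketch of one way around that, using a hypothetical frange helper (not a standard library function) in place of range:

```python
from itertools import product

def frange(stop, step):
    """Yield 0, step, 2*step, ... while the value stays below stop.

    Hypothetical helper: multiplying by a counter avoids the drift that
    repeatedly adding a float step would accumulate.
    """
    i = 0
    while i * step < stop:
        yield i * step
        i += 1

def nested_loops_float_steps(myfunc, bounds_and_steps):
    # bounds_and_steps is a list of (upper bound, step) pairs, one per loop.
    for t in product(*(frange(stop, step) for stop, step in bounds_and_steps)):
        myfunc(*t)

calls = []
nested_loops_float_steps(lambda *args: calls.append(args), [(1, 0.5), (4, 2)])
print(calls)
```

Floating-point rounding can still make the last value land just under or over the bound for some step sizes, so the bounds may need a small tolerance in practice.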

Using itertools to generate an exponential binary space

I am interested in generating all binary combinations of N variables without having to implement a manual loop of iterating N times over N and each time looping over N/2 and so on.
Do we have such functionality in python?
E.g:
I have N binary variables:
pool=['A','B','C',...,'I','J']
len(pool)=10
I would like to generate 2^10=1024 space out of these such as:
[A B C ... I J]
iter0 = 0 0 0 ... 0 0
iter1 = 0 0 0 ... 0 1
iter2 = 0 0 0 ... 1 1
...
iter1022 = 1 1 1 ... 1 0
iter1023 = 1 1 1 ... 1 1
You see that I don't have repetitions here, each variable is enabled once per each of these iter's sequences. How can I do that using Python's itertools?
itertools.product with the repeat parameter is the simplest answer:
for A, B, C, D, E, F, G, H, I, J in itertools.product((0, 1), repeat=10):
The values of each variable will cycle fastest on the right, and slowest on the left, so you'll get:
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 1
0 0 0 0 0 0 0 0 1 0
0 0 0 0 0 0 0 0 1 1
0 0 0 0 0 0 0 1 0 0
etc. This may be recognizable to you: It's just the binary representation of an incrementing 10 bit number. Depending on your needs, you may actually want to just do:
for i in range(1 << 10):
then mask i with 1 << 9 to get the value of A, 1 << 8 for B, and so on down to 1 << 0 (that is, 1) for J. If the goal is just to print them, you can even get more clever, by binary stringifying and then using join to insert the separator:
for i in range(1 << 10):
    print(' '.join('{:010b}'.format(i)))
    # Or letting print insert the separator:
    print(*'{:010b}'.format(i)) # If separator isn't space, pass sep='sepstring'
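The masking approach described above can be sketched as follows. The bits helper and the shortened three-variable pool are illustrative assumptions, not part of the original answer:

```python
from itertools import product

def bits(i, n):
    """Return the n bits of i, most significant first, by masking with 1 << k."""
    return [1 if i & (1 << k) else 0 for k in range(n - 1, -1, -1)]

pool = ['A', 'B', 'C']  # shortened, hypothetical pool of 3 variables
n = len(pool)

# One row of 0/1 values per integer from 0 to 2**n - 1.
rows = [bits(i, n) for i in range(1 << n)]

# The rows come out in the same order as itertools.product((0, 1), repeat=n).
same_as_product = rows == [list(t) for t in product((0, 1), repeat=n)]

for row in rows:
    print(dict(zip(pool, row)))
```

The first variable in the pool takes the highest bit, so it changes slowest, matching the ordering that product gives.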

Sum all values in CSV

I have a CSV file with 0s and 1s and need to determine the sum total of the entire file. The file looks like this when opened in Excel:
0 1 1 1 0 0 0 1 0 1
1 0 1 0 0 1 1 0 0 0
0 0 1 0 0 0 0 1 0 1
0 1 1 1 1 1 1 0 1 1
0 0 1 0 1 0 1 1 0 1
0 0 0 0 0 0 0 0 1 0
0 0 1 0 0 1 1 0 1 1
0 0 1 1 0 0 1 1 0 1
1 0 1 0 1 0 1 1 1 0
0 1 0 0 1 0 0 0 1 1
Using this script I can sum the values of each row and they print out in a single column:
import csv
import numpy as np
path = r'E:\myPy\one_zero.csv'
infile = open(path, 'r')
with infile as file_in:
    fin = csv.reader(file_in, delimiter = ',')
    for line in fin:
        print line.count('1')
I need to be able to sum up the resulting column, but my experience with this is mild. Looking for suggestions. Thanks.
If the file may hold values other than 1s and 0s, map each field to int and sum all rows:
with open( r'E:\myPy\one_zero.csv') as f:
    r = csv.reader(f, delimiter = ',')
    count = sum(sum(map(int,row)) for row in r)
Or just count the 1's:
with open( r'E:\myPy\one_zero.csv' ) as f:
    r = csv.reader(f, delimiter = ',')
    count = sum(row.count("1") for row in r)
Just use with open(r'E:\myPy\one_zero.csv'), you don't need to and should not open and then pass the file handle to with.
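A small self-contained sketch of both approaches, in Python 3 syntax and using an in-memory sample (made up for illustration) instead of the file on disk:

```python
import csv
import io

# Made-up two-row sample standing in for one_zero.csv.
sample = "0,1,1\n1,0,1\n"

rows = list(csv.reader(io.StringIO(sample), delimiter=','))

# csv.reader yields every field as a string, e.g. ['0', '1', '1'].
total_by_int = sum(sum(map(int, row)) for row in rows)
total_by_count = sum(row.count("1") for row in rows)

# Counting the integer 1 finds nothing, because the fields are strings.
total_wrong_type = sum(row.count(1) for row in rows)

print(total_by_int, total_by_count, total_wrong_type)  # 4 4 0
```

The int-based sum keeps working if the file later contains values other than 0 and 1, while the count-based version only tallies exact "1" fields.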
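path = r'E:\myPy\one_zero.csv'
infile = open(path, 'r')
answer = 0
with infile as file_in:
    fin = csv.reader(file_in, delimiter = ',')
    for line in fin:
        # csv.reader yields each row as a list of strings, so count '1'
        a = line.count('1')
        answer += a
print answer
Example, using rows of integers instead of a file:
answer = 0
lines = [[1, 0, 0, 1],[1,1,1,1],[0,0,0,1]]
for line in lines:
    a = line.count(1)
    answer += a
print answer
7
Note the difference between
line.count('1')
and
line.count(1)
The first looks for a string, the second for a number. Rows read by csv.reader hold strings, so use '1' for the file; 1 only matches when the rows hold integers, as in the example.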
Why use the CSV module at all? You have a file full of 0s, 1s, commas and newlines. Just open the file, read() it and count the 1s:
>>> with open(filename, 'r') as fin: print fin.read().count('1')
That should get you what you want, no?

How do I load both Strings and floats into a numpy array?

I need to somehow make numpy load in both text and numbers.
I am getting this error:
Traceback (most recent call last):
File "ip00ktest.py", line 13, in <module>
File = np.loadtxt(str(z[1])) #load spectrum file
File "/usr/lib64/python2.6/site-packages/numpy/lib/npyio.py", line 805, in loadtxt
items = [conv(val) for (conv, val) in zip(converters, vals)]
ValueError: invalid literal for float(): EFF
because my file I'm loading in has text in it. I need each word to be stored in an array index as well as the data below it. How do I do that?
Edit: Sorry for not giving an example. Here is what my file looks like.
EFF 3500. GRAVITY 0.00000 SDSC GRID [+0.0] VTURB 2.0 KM/S L/H 1.25
wl(nm) Inu(ergs/cm**2/s/hz/ster) for 17 mu in 1221 frequency intervals
1.000 .900 .800 .700 .600 .500 .400 .300 .250 .200 .150 .125 .100 .075 .050 .025 .010
9.09 0.000E+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
9.35 0.000E+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
9.61 0.000E+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
9.77 0.000E+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
9.96 0.000E+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
There are thousands of numbers below the ones shown here. Also, there are different datasets within the file such that the header you see on top repeats, followed by a new set of new numbers.
Code that Fails:
import sys
import numpy as np
from math import *
print 'Number of arguments:', len(sys.argv), 'arguments.'
print 'Argument List:', str(sys.argv)
z = np.array(sys.argv) #store all of the file names into array
i = len(sys.argv) #the length of the filenames array
File = np.loadtxt(str(z[1])) #load spectrum file
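If the line that messes it up always begins with EFF, then you can ignore that line quite easily:
np.loadtxt(str(z[1]), comments='EFF')
This treats 'EFF' as a comment marker, so any line beginning with 'EFF' is ignored. The second header line (the one starting with wl(nm)) is not numeric either, so it still has to be skipped some other way.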
To read the numbers, use the skiprows parameter of numpy.loadtxt to skip the header. Write custom code to read the header, because it seems to have an irregular format.
NumPy is most useful with homogeneous numerical data -- don't try to put the strings in there.
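As a rough sketch of the "custom code to read the header" idea, written in plain Python 3: it assumes that every dataset starts with a line whose first token is EFF and that the header is exactly three lines long, which is what the sample suggests but is not confirmed for the whole file. The read_datasets name and the abbreviated sample are hypothetical.

```python
import io

def read_datasets(lines, header_lines=3):
    """Split an iterable of lines into datasets of header tokens and float rows."""
    datasets = []
    for line in lines:
        tokens = line.split()
        if not tokens:
            continue  # skip blank lines
        if tokens[0] == 'EFF':
            # A new header begins: start a new dataset.
            datasets.append({'header': [tokens], 'rows': []})
        elif datasets and len(datasets[-1]['header']) < header_lines:
            # Still inside the (assumed) three-line header: keep tokens as strings.
            datasets[-1]['header'].append(tokens)
        elif datasets:
            # Everything else is numeric data for the current dataset.
            datasets[-1]['rows'].append([float(t) for t in tokens])
    return datasets

# Made-up, abbreviated sample with two datasets.
sample = """EFF 3500. GRAVITY 0.00000
wl(nm) Inu for 3 mu
1.000 .900 .800
9.09 0.000E+00 0 0
9.35 0.000E+00 0 0
EFF 3750. GRAVITY 0.00000
wl(nm) Inu for 3 mu
1.000 .900 .800
9.09 1.000E+00 0 0
"""

datasets = read_datasets(io.StringIO(sample))
print(datasets[0]['header'][0][0])  # EFF
print(len(datasets), len(datasets[0]['rows']))  # 2 2
```

Each rows list is homogeneous numeric data, so it could then be passed to np.array if NumPy arrays are needed, while the header tokens stay as ordinary Python strings.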

print comma function not working

I put in this code:
def game():
    down, right = 0, 0
    while 1:
        for i in range(down):
            print "00000000000000000000000000000000000000000000000000000000000000000000000000000000"
        for j in range(right):
            print "0",
        print "S",
        for j in range(80 - (right + 1)):
            print "0",
        for i in range(38 - (down + 1)):
            print "00000000000000000000000000000000000000000000000000000000000000000000000000000000"
        direction = raw_input("Direction?")
        if(direction.upper() == "W"):
            down -= 1
        elif(direction.upper() == "S"):
            down += 1
        elif(direction.upper() == "A"):
            right -= 1
        elif(direction.upper() == "D"):
            right += 1
game()
The 'print "S",' is supposed to have a comma at the end, to stop it printing on the next line the next time I call the print function.
The actual thing came up with something like this: (I've chopped some lines containing "000000")
S 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 00000000000000000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000000000000000
Why do the first zeroes have spaces in-between? There are no spaces in my print function.
print "0",
means: print "0" without a trailing newline, and put a space before whatever is printed next.
The comma basically causes print to output a space instead of a newline. If you want to circumvent this behavior, use sys.stdout.write() instead.
You can set softspace to False after every print statement that ends with a comma:
import sys
print "S",
sys.stdout.softspace = False
but then you might as well use sys.stdout.write().
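One way to apply the sys.stdout.write() suggestion is to build each row as a single string and write it in one call, which sidesteps the separator question entirely. This is only a sketch in Python 3 syntax (the thread itself uses Python 2); render_row and draw are hypothetical helper names, and the 80 by 38 grid size is taken from the question's code.

```python
import sys

def render_row(right, width=80):
    """Return one grid row with 'S' at column `right` and '0' everywhere else."""
    return "0" * right + "S" + "0" * (width - right - 1)

def draw(down, right, width=80, height=38, out=sys.stdout):
    """Write the whole grid, one row per line, with no spaces between cells."""
    for y in range(height):
        row = render_row(right, width) if y == down else "0" * width
        out.write(row + "\n")

# Small grid for demonstration: 'S' on the second row, third column.
draw(down=1, right=2, width=10, height=3)
```

Because write() emits exactly the string it is given, the row contains no spaces and ends with the newline that the comma-terminated print statements were missing.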
