TypeError in for loop - python

I'm having trouble with some code, where I have a text file with 633,986 tuples, each with 3 values (example: the first line is -0.70,0.34,1.05). I want to create an array where I take the magnitude of the 3 values in the tuple, so for elements a,b,c, I want magnitude = sqrt(a^2 + b^2 + c^2).
However, I'm getting an error in my code. Any advice?
import math
fname = '\\pathname\\GerrysTenHz.txt'
open(fname, 'r')
Magn1 = [];
for i in range(0, 633986):
Magn1[i] = math.sqrt((fname[i,0])^2 + (fname[i,1])^2 + (fname[i,2])^2)
TypeError: string indices must be integers, not tuple

You need to open the file properly (use the open file object and the csv module to parse the comma-separated values), read each row and convert the strings into float numbers, then apply the correct formula:
import math, csv
fname = '\\pathname\\GerrysTenHz.txt'
magn1 = []
with open(fname, 'rb') as inputfile:
reader = csv.reader(inputfile)
for row in reader:
magn1.append(math.sqrt(sum(float(c) ** 2 for c in row)))
which can be simplified with a list comprehension to:
import math, csv
fname = '\\pathname\\GerrysTenHz.txt'
with open(fname, 'rb') as inputfile:
reader = csv.reader(inputfile)
magn1 = [math.sqrt(sum(float(c) ** 2 for c in row)) for row in reader]
The with statement assigns the open file object to inputfile and makes sure it is closed again when the code block is done.
We add up the squares of the column values with sum(), which is fed a generator expression that converts each column to float() before squaring it.

You need to use the lines of the file and the csv module (as Martijn Pieters points out) to examine each value. This can be done with a list comprehension and with:
with open(fname) as f:
reader = csv.reader(f)
magn1 = [math.sqrt(sum(float(i)**2 for i in row)) for row in reader]
just make sure you import csv as well
To explain the issues your having (there are quite a few) I'll walk through a more drawn out way to do this.
you need to use what openreturns. open takes a string and returns a file object.
f = open(fname)
I'm assuming the range in your for loop is suppose to be the number of lines in the file. You can instead iterate over each line of the file one by one
for line in f:
Then to get the numbers on each line, use the str.split method of to split the line on the commas
x, y, z = line.split(',')
convert all three to floats so you can do math with them
x, y, z = float(x), float(y), float(z)
Then use the ** operator to raise to a power, and take the sqrt of the sum of the three numbers.
n = math.sqrt(x**2 + y**2 + z**2)
Finally use the append method to add to the back of the list
Magn1.append(n)

Let's look at fname. That's a string. So if you try to subscript it (i.e., fname[i, 0]), you should use an integer, and you'll get back the character at index i. Since you're using [i, 0] as the string indices, you're passing a tuple. That's no integer!
Really, you should be reading a line from the file, then doing things with that. So,
with(open(fname, 'r')) as f: # You're also opening the file and doing nothing with it
for line in f:
print('doing something with %s' % line)

Related

Calculating with data from a txt file in python

Detailed:
I have a set of 200 or so values in a txt file and I want to select the first value b[0] and then go through the list from [1] to [199] and add them together.
So, [0]+[1]
if that's not equal to a certain number, then it would go to the next term i.e. [0]+[2] etc etc until it's gone through every term. Once it's done that it will increase b[0] to b[1] and then goes through all the values again
Step by step:
Select first number in list.
Add that number to the next number
Check if that equals a number
If it doesn't, go to next term and add to first term
Iterate through these until you've gone through all terms/ found
a value which adds to target value
If gone through all values, then go to the next term for the
starting add value and continue
I couldn't get it to work, if anyone can maybe provide a solution or give some advice? Much appreciated. I've tried looking at videos and other stack overflow problems but I still didn't get anywhere. Maybe I missed something, let me know! Thank you! :)
I've attempted it but gotten stuck. This is my code so far:
b = open("data.txt", "r")
data_file = open("data.txt", "r")
for i, line in enumerate(data_file):
if (i+b)>2020 or (i+b)<2020:
b=b+1
else:
print(i+b)
print(i*b)
Error:
Traceback (most recent call last):
File "c:\Users\███\Desktop\ch1.py", line 11, in <module>
if (i+b)>2020 or (i+b)<2020:
TypeError: unsupported operand type(s) for +: 'int' and '_io.TextIOWrapper'
PS C:\Users\███\Desktop>
I would read the file into an array and then convert it into ints before
actually dealing with the problem. files are messy and the less we have to deal with them the better
with open("data.txt", "r") as data_file:
lines = data_file.readlines() # reads the file into an array
data_file.close
j = 0 # you could use a better much more terse solution but this is easy to understand
for i in lines:
lines[j] = int(i.strip().replace("\n", ""))
j += 1
i, j = 0
for i in lines: # for every value of i we go through every value of j
# so it would do x = [0] + [0] , [0] + [1] ... [1] + [0] .....
for j in lines:
x = j + i
if x == 2020:
print(i * j)
Here are some things that you can fix.
You can't add the file object b to the integer i. You have to convert the lines to int by using something like:
integer_in_line = int(line.strip())
Also you have opened the same file twice in read mode with:
b = open("data.txt", "r")
data_file = open("data.txt", "r")
Opening it once is enough.
Make sure that you close the file after you used it:
data_file.close()
To compare each number in the list with each other number in the list you'll need to use a double for loop. Maybe this works for you:
certain_number = 2020
data_file = open("data.txt", "r")
ints = [int(line.strip()) for line in data_file] # make a list of all integers in the file
for i, number_at_i in enumerate(ints): # loop over every integer in the list
for j, number_at_j in enumerate(ints): # loop over every integer in the list
if number_at_i + number_at_j == certain_number: # compare the integers to your certain number
print(f"{number_at_i} + {number_at_j} = {certain_number}")
data_file.close()
Your problem is the following: The variables b and data_file are not actually the text that you are hoping they are. You should read something about reading text files in python, there are many tutorials on that.
When you call open("path.txt", "r"), the open function returns a file object, not your text. If you want the text from the file, you should either call read or readlines. Also it is important to close your file after reading the content.
data_file = open("data.txt", "r") # b is a file object
text = data_file.read() # text is the actual text in the file in a single string
data_file.close()
Alternatively, you could also read the text into a list of strings, where each string represents one line in the file: lines = data_file.readlines().
I assume that your "data.txt" file contains one number per line, is that correct? In that case, your lines variable will be a list of the numbers, but they will be strings, not integers or floats. Therefore, you can't simply use them to perform calculation. You would need to call int() on them.
Here is an example how to do it. I assumed that your textfile looks like this (with arbitary numbers):
1
2
3
4
...
file = open("data.txt", "r")
lines = file.readlines()
file.close()
# This creates a new list where the numbers are actual numbers and not strings
numbers = []
for line in lines:
numbers.append(int(line))
target_number = 2020
starting_index = 0
found = False
for i in range(starting_index, len(numbers)):
temp = numbers[i]
for j in range(i + 1, len(numbers)):
temp += numbers[j]
if temp == target_number:
print(f'Target number reached by adding nubmers from {i} to {j}')
found = True
break #This stops the inner loop.
if found:
break #This stops the outer loop

Reading a numbers off a list from a txt file, but only upto a comma

This is data from a lab experiment (around 717 lines of data). Rather than trying to excell it, I want to import and graph it on either python or matlab. I'm new here btw... and am a student!
""
"Test Methdo","exp-l Tensile with Extensometer.msm"
"Sample I.D.","Sample108.mss"
"Speciment Number","1"
"Load (lbf)","Time (s)","Crosshead (in)","Extensometer (in)"
62.638,0.900,0.000,0.00008
122.998,1.700,0.001,0.00012
more numbers : see Screenshot of more data from my file
I just can't figure out how to read the line up until a comma. Specifically, I need the Load numbers for one of my arrays/list, so for example on the first line I only need 62.638 (which would be the first number on my first index on my list/array).
How can I get an array/list of this, something that iterates/reads the list and ignores strings?
Thanks!
NOTE: I use Anaconda + Jupyter Notebooks for Python & Matlab (school provided software).
EDIT: Okay, so I came home today and worked on it again. I hadn't dealt with CSV files before, but after some searching I was able to learn how to read my file, somewhat.
import csv
from itertools import islice
with open('Blue_bar_GroupD.txt','r') as BB:
BB_csv = csv.reader(BB)
x = 0
BB_lb = []
while x < 7: #to skip the string data
next(BB_csv)
x+=1
for row in islice(BB_csv,0,758):
print(row[0]) #testing if I can read row data
Okay, here is where I am stuck. I want to make an arraw/list that has the 0th index value of each row. Sorry if I'm a freaking noob!
Thanks again!
You can skip all lines till the first data row and then parse the data into a list for later use - 700+ lines can be easily processd in memory.
Therefor you need to:
read the file line by line
remember the last non-empty line before number/comma/dot ( == header )
see if the line is only number/comma/dot, else increase a skip-counter (== data )
seek to 0
skip enough lines to get to header or data
read the rest into a data structure
Create test file:
text = """
""
"Test Methdo","exp-l Tensile with Extensometer.msm"
"Sample I.D.","Sample108.mss"
"Speciment Number","1"
"Load (lbf)","Time (s)","Crosshead (in)","Extensometer (in)"
62.638,0.900,0.000,0.00008
122.998,1.700,0.001,0.00012
"""
with open ("t.txt","w") as w:
w.write(text)
Some helpers and the skipping/reading logic:
import re
import csv
def convert_row(row):
"""Convert one row of data into a list of mixed ints and others.
Int is the preferred data type, else string is used - no other tried."""
d = []
for v in row:
try:
# convert to int && add
d.append(float(v))
except:
# not an int, append as is
d.append(v)
return d
def count_to_first_data(fh):
"""Count lines in fh not consisting of numbers, dots and commas.
Sideeffect: will reset position in fh to 0."""
skiplines = 0
header_line = 0
fh.seek(0)
for line in fh:
if re.match(r"^[\d.,]+$",line):
fh.seek(0)
return skiplines, header_line
else:
if line.strip():
header_line = skiplines
skiplines += 1
raise ValueError("File does not contain pure number rows!")
Usage of helpers / data conversion:
data = []
skiplines = 0
with open("t.txt","r") as csvfile:
skip_to_data, skip_to_header = count_to_first_data(csvfile)
for _ in range(skip_to_header): # skip_to_data if you do not want the headers
next(csvfile)
reader = csv.reader(csvfile, delimiter=',',quotechar='"')
for row in reader:
row_data = convert_row(row)
if row_data:
data.append(row_data)
print(data)
Output (reformatted):
[['Load (lbf)', 'Time (s)', 'Crosshead (in)', 'Extensometer (in)'],
[62.638, 0.9, 0.0, 8e-05],
[122.998, 1.7, 0.001, 0.00012]]
Doku:
re.match
csv.reader
Method of file objekts (i.e.: seek())
With this you now have "clean" data that you can use for further processing - including your headers.
For visualization you can have a look at matplotlib
I would recommend reading your file with python
data = []
with open('my_txt.txt', 'r') as fd:
# Suppress header lines
for i in range(6):
fd.readline()
# Read data lines up to the first column
for line in fd:
index = line.find(',')
if index >= 0:
data.append(float(line[0:index]))
leads to a list containing your data of the first column
>>> data
[62.638, 122.998]
The MATLAB solution is less nice, since you have to know the number of data lines in your file (which you do not need to know in the python solution)
n_header = 6
n_lines = 2 % Insert here 717 (as you mentioned)
M = csvread('my_txt.txt', n_header, 0, [n_header 0 n_header+n_lines-1 0])
leads to:
>> M
M =
62.6380
122.9980
For the sake of clarity: You can also use MATLABs textscan function to achieve what you want without knowing the number of lines, but still, the python code would be the better choice in my opinion.
Based on your format, you will need to do 3 steps. One, read all lines, two, determine which line to use, last, get the floats and assign them to a list.
Assuming you file name is name.txt, try:
f = open("name.txt", "r")
all_lines = f.readlines()
grid = []
for line in all_lines:
if ('"' not in line) and (line != '\n'):
grid.append(list(map(float, line.strip('\n').split(','))))
f.close()
The grid will then contain a series of lists containing your group of floats.
Explanation for fun:
In the "for" loop, i searched for the double quote to eliminate any string as all strings are concocted between quotes. The other one is for skipping empty lines.
Based on your needs, you can use the list grid as you please. For example, to fetch the first line's first number, do
grid[0][0]
as python's list counts from 0 to n-1 for n elements.
This is super simple in Matlab, just 2 lines:
data = dlmread('data.csv', ',', 6,0);
column1 = data(:,1);
Where 6 and 0 should be replaced by the row and column offset you want. So in this case, the data starts at row 7 and you want all the columns, then just copy over the data in column 1 into another vector.
As another note, try typing doc dlmread in matlab - it brings up the help page for dlmread. This is really useful when you're looking for matlab functions, as it has other suggestions for similar functions down the bottom.

Python Error When Attempting to Iterate Over a List of File Names

I have a list of file names, all of which have the .csv ending. I am trying to use the linecache.getline function to get 2 parts of each csv - the second row, 5th item and the 46th row, 5th item and comparing the two values (they're stock returns).
import csv
import linecache
d = open('successful_scrapes.csv')
csv = csv.reader(d)
k = []
for row in csv:
k.append(row)
x =linecache.getline('^N225.csv',2)
y = float(x.split(",")[4])
for c in k:
g = linecache.getline(c,2)
t = float(g.split(",")[4])
Everything works until the for loop over the k list. It keeps returning the error "Unhashable type: list." I've tried including quotation marks before and after each .csv file name in the list. Additionally, all the files are included in the same directory. Any thoughts?
Thanks!
You're misusing linecache, which is for working with files. There is no point in using it at all if you are going to pull the whole file into memory first.
In this case, since you have the whole CSV copied into k, just do the comparison:
yourComparisonFunction(k[1][4],k[45][4])
Alternately, you could use linecache instead of csv, and do it like so:
import linecache
file_list = ['file1','file2','file3','etc']
for f in file_list:
line2 = linecache.getline(f,2)
line2val = float(line2.split(",")[4])
line46 = linecache.getline(f,46)
line46val = float(line46.split(",")[4])
And the, I assume, add some comparison logic.
You can read the file and then just append the values to a list depending on the row number.
import csv
with open("C/a.csv", "rb") as f:
reader = csv.reader(f)
lst = [x[4] for i, x in enumerate(reader) if i == 1 or i == 45]
Then you can do the comparison with the lst's items

Convert first 2 elements of tuple to integer

I have a csv file with the following structure:
1234,5678,"text1"
983453,2141235,"text2"
I need to convert each line to a tuple and create a list. Here is what I did
with open('myfile.csv') as f1:
mytuples = [tuple(line.strip().split(',')) for line in f1.readlines()]
However, I want the first 2 columns to be integers, not strings. I was not able to figure out how to continue with this, except by reading the file line by line once again and parsing it. Can I add something to the code above so that I transform str to int as I convert the file to list of tuples?
This is a csv file. Treat it as such.
import csv
with open("test.csv") as csvfile:
reader = csv.reader(csvfile)
result = [(int(a), int(b), c) for a,b,c in reader]
If there's a chance your input may not be what you think it is:
import csv
with open('test.csv') as csvfile:
reader = csv.reader(csvfile)
result = []
for line in reader:
this_line = []
for col in line:
try:
col = int(col)
except ValueError:
pass
this_line.append(col)
result.append(tuple(this_line))
Instead of trying to cram all of the logic in a single line, just spread it out so that it is readable.
with open('myfile.csv') as f1:
mytuples = []
for line in f1:
tokens = line.strip().split(',')
mytuples.append( (int(tokens[0]), int(tokens[1]), tokens[2]) )
Real python programmers aren't afraid of using multiple lines.
You can use isdigit() to check if all letters within element in row is digit so convert it to int , so replace the following :
tuple(line.strip().split(','))
with :
tuple(int(i) if i.isdigit() else i for i in (line.strip().split(','))
You can cram this all into one line if you really want, but god help me I don't know why you'd want to. Try giving yourself room to breathe:
def get_tuple(token_list):
return (int(token_list[0]), int(token_list[1]), token_list[2])
mytuples = []
with open('myfile.csv') as f1:
for line in f1.readlines():
token_list = line.strip().split(',')
mytuples.append(get_tuple(token_list))
Isn't that way easier to read? I like list comprehension as much as the next guy, but I also like knowing what a block of code does when I sit down three weeks later and start reading it!

calculations with floats in a txt file

I have a txt file that is composed of three columns, the first column is integers and the second and third column are floats. I want to do a calculation with each float and separate by line. My pseudocode is below:
def first_function(file):
pogt = 0.
f=open(file, 'r')
for line in f:
pogt += otherFunction(first float, second float)
f.close
Also, would the "for line in f" guarantee that my pogt will be the sum of my otherFunction calculation of all the lines in the txt file?
Assuming that you get the values for first float and second float correctly, your code is close to correct, you'll need to dedent (the inverse of indent) the f.close line, or even better, use with, it will handle the close for you (btw, you should do f.close() instead of f.close)
And do not use file as variable name, it's reserved word in Python.
Also use better names for your variables.
Assuming your file is separated by spaces, you can define get_numbers as follows:
def get_numbers(line):
[the_integer, first_float, second_float] = line.strip().split()
return (first_float, second_float)
def first_function(filename):
the_result = 0
with open(filename, 'r') as f:
for line in f:
(first_float, second_float) = get_numbers(line)
the_result += other_function(first_float, second_float)

Categories