I have a program that saves a .txt log with multiple values and text. The time values can differ each time the program is run, but their place in the .txt file is always the same. I need to grab those values and check whether they are smaller than 6 seconds. I was trying to read my .txt file into a string, and then use:
x = string(filename[50:55])
float(x)
But it doesn't seem to work. How can I extract values from a fixed place in a .txt file and then convert them to float numbers?
//EDIT:
Photo of my log below, I want to check those values marked with blue line:
//EDIT2:
Photo of another log, how would I extract those percentages, those must be below 70 to pass.
//EDIT3: Photo of the error I got and a fragment of the code:
with open(r'C:\Users\Time_Log.txt') as f:
    print(f)
    lines = f.readlines()
    print(r'Lines: ')
    print(lines)
    print(r'Type of lines:', type(lines))
    # start at 1 to avoid the header
    for line in range(1, len(lines)):
        print(r'Type of line:', type(line))
        splits = line.split("|")
        print(r'Type of splits:', type(splits))
        t = splits[3].split(".")
        if t[0] < 6:
            print(r'Test completed. Startup time is shorter than 6 seconds')
With an example it would be much simpler, but assuming your values are separated by a fixed delimiter (say a space or a tab), you'd split the string on that delimiter and look at the elements you want to compare. For instance, if the time is your 6th item, you'd call string.split(" ") and pick splitted_str[5]. You can go further if your time format follows a regular pattern, e.g. hours:minutes:seconds, and do the math yourself, or you could use packages like datetime or time to convert the values into a time object, which allows more useful comparisons.
So the question is basically how well formatted your values are.
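If the time stamps do follow a regular pattern, the datetime route mentioned above could look roughly like this (the stamp format here is an assumption, not taken from your log):

```python
from datetime import datetime

# hypothetical stamp in hours:minutes:seconds.milliseconds format
stamp = "00:00:05.320"
t = datetime.strptime(stamp, "%H:%M:%S.%f")
# collapse the parsed fields into a plain number of seconds
seconds = t.hour * 3600 + t.minute * 60 + t.second + t.microsecond / 1e6
print(seconds < 6)  # True: 5.32 s is under the 6 s limit
```

Adjust the format string to whatever your log actually contains; strptime raises ValueError on a mismatch, which is also a handy sanity check.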
Edit:
So given the example you could:
with open("filename.txt") as f:
    lines = f.readlines()

# start at 3 to skip the header lines
for i in range(3, len(lines)):
    splits = lines[i].split("|")
    t = splits[3].split(".")
    if int(t[0]) < 6:
        ...  # do something with that line
Related
This is data from a lab experiment (around 717 lines of data). Rather than putting it in Excel, I want to import and graph it in either Python or MATLAB. I'm new here, btw... and am a student!
""
"Test Methdo","exp-l Tensile with Extensometer.msm"
"Sample I.D.","Sample108.mss"
"Speciment Number","1"
"Load (lbf)","Time (s)","Crosshead (in)","Extensometer (in)"
62.638,0.900,0.000,0.00008
122.998,1.700,0.001,0.00012
more numbers : see Screenshot of more data from my file
I just can't figure out how to read the line up until a comma. Specifically, I need the Load numbers for one of my arrays/list, so for example on the first line I only need 62.638 (which would be the first number on my first index on my list/array).
How can I get an array/list of this, something that iterates/reads the list and ignores strings?
Thanks!
NOTE: I use Anaconda + Jupyter Notebooks for Python & Matlab (school provided software).
EDIT: Okay, so I came home today and worked on it again. I hadn't dealt with CSV files before, but after some searching I was able to learn how to read my file, somewhat.
import csv
from itertools import islice

with open('Blue_bar_GroupD.txt', 'r') as BB:
    BB_csv = csv.reader(BB)
    x = 0
    BB_lb = []
    while x < 7:  # to skip the string data
        next(BB_csv)
        x += 1
    for row in islice(BB_csv, 0, 758):
        print(row[0])  # testing if I can read row data
Okay, here is where I am stuck. I want to make an array/list that has the 0th index value of each row. Sorry if I'm a freaking noob!
Thanks again!
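For reference, the 0th-column collection that the edit above is aiming at can be sketched with a list comprehension. This uses an in-memory stand-in for the file, since only the layout (7 header lines, then comma-separated numbers) matters here:

```python
import csv
import io

# hypothetical stand-in for Blue_bar_GroupD.txt
sample = "\n".join(['"hdr"'] * 7 + ["62.638,0.900", "122.998,1.700"])

reader = csv.reader(io.StringIO(sample))
for _ in range(7):
    next(reader)  # skip the string/header rows
BB_lb = [float(row[0]) for row in reader if row]  # 0th value of each data row
print(BB_lb)  # [62.638, 122.998]
```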
You can skip all the lines up to the first data row and then parse the data into a list for later use - 700+ lines can easily be processed in memory.
Therefore you need to:
read the file line by line
remember the last non-empty line before the number/comma/dot lines (== header)
check if the line consists only of numbers/commas/dots, else increase a skip counter (== data)
seek to 0
skip enough lines to get to the header or the data
read the rest into a data structure
Create test file:
text = """
""
"Test Methdo","exp-l Tensile with Extensometer.msm"
"Sample I.D.","Sample108.mss"
"Speciment Number","1"
"Load (lbf)","Time (s)","Crosshead (in)","Extensometer (in)"
62.638,0.900,0.000,0.00008
122.998,1.700,0.001,0.00012
"""
with open("t.txt", "w") as w:
    w.write(text)
Some helpers and the skipping/reading logic:
import re
import csv

def convert_row(row):
    """Convert one row of data into a list of mixed floats and strings.
    Float is the preferred data type, else the string is kept as is."""
    d = []
    for v in row:
        try:
            # convert to float && add
            d.append(float(v))
        except ValueError:
            # not a float, append as is
            d.append(v)
    return d

def count_to_first_data(fh):
    """Count lines in fh not consisting purely of numbers, dots and commas.
    Side effect: will reset the position in fh to 0."""
    skiplines = 0
    header_line = 0
    fh.seek(0)
    for line in fh:
        if re.match(r"^[\d.,]+$", line):
            fh.seek(0)
            return skiplines, header_line
        else:
            if line.strip():
                header_line = skiplines
            skiplines += 1
    raise ValueError("File does not contain pure number rows!")
Usage of helpers / data conversion:
data = []

with open("t.txt", "r") as csvfile:
    skip_to_data, skip_to_header = count_to_first_data(csvfile)
    for _ in range(skip_to_header):  # use skip_to_data if you do not want the headers
        next(csvfile)
    reader = csv.reader(csvfile, delimiter=',', quotechar='"')
    for row in reader:
        row_data = convert_row(row)
        if row_data:
            data.append(row_data)

print(data)
Output (reformatted):
[['Load (lbf)', 'Time (s)', 'Crosshead (in)', 'Extensometer (in)'],
[62.638, 0.9, 0.0, 8e-05],
[122.998, 1.7, 0.001, 0.00012]]
Docs:
re.match
csv.reader
Methods of file objects (e.g. seek())
With this you now have "clean" data that you can use for further processing - including your headers.
For visualization you can have a look at matplotlib.
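As a minimal sketch of that (assuming matplotlib is installed, and inlining the small `data` result from above), the Load column could be plotted against Time:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, just writes an image file
import matplotlib.pyplot as plt

# `data` as produced above: header row, then numeric rows
data = [['Load (lbf)', 'Time (s)', 'Crosshead (in)', 'Extensometer (in)'],
        [62.638, 0.9, 0.0, 8e-05],
        [122.998, 1.7, 0.001, 0.00012]]

times = [row[1] for row in data[1:]]  # column 1 = Time (s)
loads = [row[0] for row in data[1:]]  # column 0 = Load (lbf)
plt.plot(times, loads, marker="o")
plt.xlabel(data[0][1])
plt.ylabel(data[0][0])
plt.savefig("load_vs_time.png")
```

With the headers kept in `data[0]`, the axis labels come straight from the file.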
I would recommend reading your file with python
data = []
with open('my_txt.txt', 'r') as fd:
    # Suppress header lines
    for i in range(6):
        fd.readline()
    # Read the first column of each data line
    for line in fd:
        index = line.find(',')
        if index >= 0:
            data.append(float(line[0:index]))
leads to a list containing your data of the first column
>>> data
[62.638, 122.998]
The MATLAB solution is less nice, since you have to know the number of data lines in your file (which you do not need to know in the Python solution):
n_header = 6
n_lines = 2 % Insert here 717 (as you mentioned)
M = csvread('my_txt.txt', n_header, 0, [n_header 0 n_header+n_lines-1 0])
leads to:
>> M
M =
62.6380
122.9980
For the sake of clarity: You can also use MATLABs textscan function to achieve what you want without knowing the number of lines, but still, the python code would be the better choice in my opinion.
Based on your format, you will need three steps: one, read all lines; two, determine which lines to use; last, get the floats and assign them to a list.
Assuming your file name is name.txt, try:
grid = []
with open("name.txt", "r") as f:
    for line in f:
        if ('"' not in line) and (line != '\n'):
            grid.append(list(map(float, line.strip('\n').split(','))))
The grid will then contain a series of lists containing your group of floats.
Explanation for fun:
In the "for" loop, I checked for the double quote to eliminate any string line, as all strings are enclosed in quotes. The other condition skips empty lines.
Based on your needs, you can use the list grid as you please. For example, to fetch the first number of the first line, do
grid[0][0]
as Python's lists count from 0 to n-1 for n elements.
This is super simple in Matlab, just 2 lines:
data = dlmread('data.csv', ',', 6,0);
column1 = data(:,1);
Where 6 and 0 should be replaced by the row and column offset you want. So in this case, the data starts at row 7 and you want all the columns, then just copy over the data in column 1 into another vector.
As another note, try typing doc dlmread in MATLAB - it brings up the help page for dlmread. This is really useful when you're looking for MATLAB functions, as it also suggests similar functions at the bottom.
I have two text files with numbers that I want to do some very easy calculations on (for now). I thought I would go with Python. I have two file readers for the two text files:
with open('one.txt', 'r') as one:
    one_txt = one.readline()
    print(one_txt)

with open('two.txt', 'r') as two:
    two_txt = two.readline()
    print(two_txt)
Now to the fun (and for me hard) part. I would like to loop through all the numbers in the second text file and then subtract each from the corresponding number in the first text file.
I have done this (extending the code above):
with open('two.txt') as two_txt:
    for line in two_txt:
        print(line)
I don't know how to proceed now, because I think the second text file needs some parsing so I can get the numbers I want. The text file (two.txt) looks like this:
Start,End
2432009028,2432009184,
2432065385,2432066027,
2432115011,2432115211,
2432165329,2432165433,
2432216134,2432216289,
2432266528,2432266667,
I want to loop through this, ignore the first line (Start,End), and then pick only the first value before each comma; the result would be:
2432009028
2432065385
2432115011
2432165329
2432216134
2432266528
Which I would then subtract from the second value in one.txt (which contains numbers only and no strings whatsoever) and print the result.
There are many ways to do string operations and I feel lost; for instance, I don't know whether methods that read everything into memory are good or not.
Any examples of how to solve this problem would be very appreciated (I am open to different solutions)!
Edit: Forgot to point out, one.txt has values without any commas, like this:
102582
205335
350365
133565
Something like this:
with open('one.txt', 'r') as one, open('two.txt', 'r') as two:
    next(two)  # skip the "Start,End" header line in two.txt
    for line_one, line_two in zip(one, two):
        one_a = int(line_one)                # one.txt: one plain number per line
        two_a = int(line_two.split(",")[0])  # first value before the comma
        print(two_a - one_a)
Try this:
onearray = []
file = open("one.txt", "r")
for line in file:
    onearray.append(int(line.replace("\n", "")))
file.close()

twoarray = []
file = open("two.txt", "r")
for line in file:
    if line != "Start,End\n":
        twoarray.append(int(line.split(",")[0]))
file.close()

for i in range(0, len(onearray)):
    print(twoarray[i] - onearray[i])
It should do the job!
I have a csv that I am not able to read using read_csv
Opening the csv with sublime text shows something like:
col1,col2,col3
text,2,3
more text,3,4
HELLO
THIS IS FUN
,3,4
As you can see, the text HELLO THIS IS FUN takes three lines, and pd.read_csv is confused as it thinks these are three new observations. How can I parse that correctly in Pandas?
Thanks!
It looks like you'll have to preprocess the data manually:
with open('data.csv', 'r') as f:
    lines = f.read().splitlines()

processed = []
buffer = ''
for line in lines:
    buffer += line                # append the current line to a buffer
    cum_c = buffer.count(',')     # cumulative comma count in the buffer
    if cum_c == 2:
        processed.append(buffer)  # a complete row has exactly 2 commas
        buffer = ''
    elif cum_c > 2:
        raise ValueError('row with too many fields')  # this should never happen
This assumes that your data only contains unwanted newlines, e.g. if you had data with say, 3 elements in one row, 2 elements in the next, then the next row should either be blank or contain only 1 element. If it has 2 or more, i.e. it's missing a necessary newline, then an error is thrown. You can accommodate this case if necessary with a minor modification.
Actually, it might be more efficient to remove newlines instead, but it shouldn't matter unless you have a lot of data.
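Once the rows are repaired, they can be handed back to pandas through an in-memory buffer; a sketch, assuming `processed` holds the re-joined lines:

```python
import io
import pandas as pd

# hypothetical repaired rows after the preprocessing step above
processed = ['col1,col2,col3',
             'text,2,3',
             'more text,3,4',
             'HELLO THIS IS FUN,3,4']

# read_csv accepts any file-like object, so no temp file is needed
df = pd.read_csv(io.StringIO('\n'.join(processed)))
print(df.shape)  # (3, 3)
```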
I want to take a number, in my case 0, and add 1, then replace it back into the file. This is what I have so far:
def caseNumber():
    caseNumber = open('caseNumber.txt', "r")
    lastCase = caseNumber.read().splitlines()[0]
    Case = []
    Case.append(lastCase)
    newCase = [int(x) + 1 for x in Case]
    with open('caseNumber.txt', mode='a', encoding='utf-8') as my_file:
        my_file.write('{}'.format(newCase))
    print('Thankyou, your case number is {}, Write it down!'.format(newCase))
After this is run, I get:
this is what is added to the file: 0000[1] (the number in the file was 0000 to start off with, but it added [1] as well)
Basically, the part I am stuck on is adding 1 to the number without the brackets.
newCase is a list, which gets printed with its values enclosed in brackets. If you just want the value inside the list written to the file, you'll need to say so.
You don't need a list comprehension since you only have one item.
Since you're converting a list to a string, you get the list representation: with brackets.
Note that that's not the only problem: you're appending to your text file (mode 'a'), so you don't replace the number. You have to rewrite the file from scratch. For that, you have to save the full file contents when reading the first time. My proposal:
with open("file.txt") as f:
    number, _, rest = f.read().partition(" ")  # split off the leading value only
number = str(int(number) + 1)  # increment and convert back to string
with open("file.txt", "w") as f:
    f.write(" ".join([number, rest]))  # rewrite the file fully
So if the file contains:
0 this is a file
hello
each time you run the code above, the leading number is incremented, but the trailing text is kept:
1 this is a file
hello
and so on...
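If the file holds nothing but the counter itself (as caseNumber.txt seems to), the read/increment/rewrite cycle is even shorter. A sketch, assuming the file contains a single zero-padded integer like 0000:

```python
def next_case_number(path):
    # read the current counter
    with open(path) as f:
        current = f.read().strip()
    # increment and keep the original zero padding
    new = str(int(current) + 1).zfill(len(current))
    # "w" truncates the file, so the old value is fully replaced
    with open(path, "w") as f:
        f.write(new)
    return new
```

Calling `next_case_number('caseNumber.txt')` on a file containing `0000` writes `0001` back and returns it, with no brackets because a plain string, not a list, is written.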
I am trying to scan a textfile for a specific keyword. When this keyword is found there's a numeric value on the line that I need to compare to see if it's less than a set value. If it is, the following lines in the file need to be printed/saved until the next keyword is reached.
I hope this makes sense.
Example of the textfile:
"saleAmount","500",
text text text
etc etc etc
text text text
"saleAmount","1200",
text text text
etc etc etc
text text text
My python file is as follows:
import re

info = open("results.txt", "r")
for line in info:
    if re.match("(.*)saleAmount(.*)", line):
        for s in line:
            result = re.findall('\d+', s)
            if (result < 1000)
                print(result)
In this case the line with the amount of 500 should be compared to 1000 and printed, as should the following 2 lines.
The line with the amount of 1200 and its following lines will be ignored.
I can get this to print out the values in a weird one digit a line result but when I add in the comparison I can't get that.
I'm sure this is simple but I'm new to python.
Thanks
Here's one possible solution:
import re

ls = []
with open('results.txt') as f:
    for line in f:
        if "saleAmount" in line:
            ls.append(line.strip('\n'))

for num in range(len(ls)):
    for amnt in re.findall(r'\d+', ls[num]):
        if int(amnt) < 1000:
            print(amnt)
What I did was add the lines that contain saleAmount to a list ls, then extract the numbers from that list and compare them to see if they are less than 1000.
In your case, result gets a value either way: a one-digit list when the character is a digit, or an empty list when it isn't.
In your original code, try print(result) right after you define it, without the if statement and what's below it. Then you'll get a clearer understanding of why you can't compare it to 1000.
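To illustrate why (a sketch with a made-up line): iterating a string yields single characters, so re.findall sees one character at a time and returns a list for each, and in Python 3 a list cannot be compared to an int:

```python
import re

line = '"saleAmount","500",'
# one findall call per character, exactly as the inner loop does
pieces = [re.findall(r'\d+', ch) for ch in line]
print(pieces[:3])  # [[], [], []] -- mostly empty lists

try:
    [] < 1000
except TypeError as exc:
    print(exc)  # comparing a list to an int raises TypeError in Python 3
```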
Edit: Include the "saleAmount" line and the following lines:
import re

with open('data.csv') as f:
    ls = f.readlines()

for i, w in enumerate(ls):
    if "saleAmount" in w and int(re.findall(r'\d+', w)[0]) < 1000:
        print(w)
        for j in range(1, 4):
            print(ls[i + j])
enumerate is used instead of ls.index(w) so that two identical lines can't make the lookup point at the wrong place.