I just started to learn Python. I have a CSV file that contains three columns: date, time and temperature. Now I want to select all rows with a temperature > 80, put those rows into a list and print them.
import csv
date_array = []
time_array = []
temp_array = []
output = []
with open("/Users/Fay/Documents/GitHub/warning_system/temp.csv") as csvfile:
    csv_reader = csv.reader(csvfile, delimiter=",")
    next(csv_reader, None)
    for row in csv_reader:
        date_array.append(row[0])
        time_array.append(row[1])
        temp_array.append(row[2])
    #why to disassemble the data vertically instead of horizontally, line by line.
    #print(data_array[1])
    #print(time_array[1])
    #print(temp_array[1])
    for row in csv_reader:
        output= ['data_array','time_array','temp_array']
        if temp_array > '80':
            print(output)
Could you help me to fix it? Thanks.
Make a list of dictionaries, not 3 separate lists.
The second loop should iterate over the list that you filled in, not csv_reader. There's nothing left to process in csv_reader, because the previous loop reached the end of it.
You should also convert the temperature to a number, so that the comparison with 80 is numeric rather than a string comparison.
import csv
data_array = []
output = []
with open("/Users/Fay/Documents/GitHub/warning_system/temp.csv") as csvfile:
    csv_reader = csv.reader(csvfile, delimiter=",")
    next(csv_reader, None)
    for row in csv_reader:
        data_array.append({"date": row[0], "time": row[1], "temp": float(row[2])})
for item in data_array:
    if item['temp'] > 80:
        output.append(item)
print(output)
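The same filter can also be written with csv.DictReader and a list comprehension. This is only a minimal sketch: the sample data, the header names date, time and temp, and the helper name rows_above are assumptions made for illustration, and io.StringIO stands in for the real file.

```python
import csv
import io

# Hypothetical sample data standing in for temp.csv (header names are assumed).
SAMPLE = """date,time,temp
2020-01-01,10:00,75.5
2020-01-01,11:00,82.1
2020-01-01,12:00,90
"""

def rows_above(csvfile, threshold):
    """Return the rows whose 'temp' column is greater than threshold."""
    reader = csv.DictReader(csvfile)
    # Compare numbers, not strings: float("9") > 80 is False, but "9" > "80" is True.
    return [row for row in reader if float(row["temp"]) > threshold]

output = rows_above(io.StringIO(SAMPLE), 80)
for row in output:
    print(row["date"], row["time"], row["temp"])
```

With a real file, pass the object returned by open(path, newline='') instead of the StringIO.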
I have Python code for appending data to the same CSV file, but when I append the data it skips rows and starts from row 15 instead of row 4.
import csv
with open('csvtask.csv', 'r') as csv_file:
    csv_reader = csv.DictReader(csv_file)
    ls = []
    for line in csv_reader:
        if len(line['Values'])!= 0:
            ls.append(int(line['Values']))
    new_ls = ['','','']
    for i in range(len(ls)-1):
        new_ls.append(ls[i+1]-ls[i])
    print(new_ls)
with open('csvtask.csv','a',newline='') as new_file:
    csv_writer = csv.writer(new_file)
    for i in new_ls:
        csv_writer.writerow(('','','','',i))
    new_file.close()
It's not really feasible to update a file at the same time you're reading it, so a common workaround is to create a new file. The following does that while preserving the field names of the original file. The new column will be named Diff.
There is no previous value from which to calculate a difference for the first row. The rows are therefore processed with the built-in enumerate() function, which yields the index of each item along with the item itself as the sequence is iterated. The index tells you whether the current row is the first one, so you can handle it specially.
import csv

# Read csv file and calculate values of new column.
with open('csvtask.csv', 'r', newline='') as file:
    reader = csv.DictReader(file)
    fieldnames = reader.fieldnames  # Save for later.
    diffs = []
    prev_value = 0
    for i, row in enumerate(reader):
        row['Values'] = int(row['Values']) if row['Values'] else 0
        diff = row['Values'] - prev_value if i > 0 else ''
        prev_value = row['Values']
        diffs.append(diff)

# Read file again and write an updated file with the column added to it.
fieldnames.append('Diff')  # Name of new field.
with open('csvtask.csv', 'r', newline='') as inp:
    reader = csv.DictReader(inp)
    with open('csvtask_updated.csv', 'w', newline='') as outp:
        writer = csv.DictWriter(outp, fieldnames)
        writer.writeheader()
        for i, row in enumerate(reader):
            row.update({'Diff': diffs[i]})  # Add new column.
            writer.writerow(row)
print('Done')
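If the file fits in memory, a single pass may be enough: read all rows into a list, compute each difference, and write the new file as you go. The following is a sketch under that assumption. The function name add_diff_column and the sample data are made up for illustration; only the Values column comes from the question.

```python
import csv
import io

def add_diff_column(infile, outfile, column="Values", new_column="Diff"):
    """Copy infile to outfile, adding a column with row-to-row differences."""
    reader = csv.DictReader(infile)
    rows = list(reader)  # Reads everything; assumes the file fits in memory.
    writer = csv.DictWriter(outfile, fieldnames=list(reader.fieldnames) + [new_column])
    writer.writeheader()
    prev = None
    for row in rows:
        value = int(row[column]) if row[column] else 0  # Empty cells count as 0.
        # The first row has no predecessor, so its difference is left blank.
        row[new_column] = "" if prev is None else value - prev
        prev = value
        writer.writerow(row)

# Hypothetical data; with real files, open both with newline=''.
src = io.StringIO("Name,Values\na,10\nb,15\nc,12\n")
dst = io.StringIO()
add_diff_column(src, dst)
print(dst.getvalue())
```

The output still goes to a separate file object, so the original file is never modified while it is being read.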
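You can use the DictWriter class like this:
import csv
header = ["data", "values"]
with open("output.csv", "w", newline="") as file:  # "output.csv" is a placeholder name
    writer = csv.DictWriter(file, fieldnames=header)
    writer.writeheader()
    data = [{"data": 1, "values": 2}, {"data": 4, "values": 6}]  # DictWriter expects dicts, not lists
    writer.writerows(data)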
I have the following code:
with open('data.csv') as csvfile:
    data = csv.reader(csvfile, delimiter=' ')
    print(data)
    row_count = row_count = sum(1 for lines in data)
    print(row_count)
    for row in data:
        print(row)
It prints:
<_csv.reader object at 0x00000295CB6933C8>
505
So it prints data as an object and prints row_count as 505, but it does not print any row in the for loop. Why is nothing being passed to the variable row?
This is particularly frustrating because if I get rid of row_count, it works! Why?
data = csv.reader(csvfile, delimiter=' ')
print(data)
row_count = row_count = sum(1 for lines in data)
You just read the entire file; you've exhausted the input. There is nothing left for your second for to find. You have to reset the reader. The most obvious way is to close the file and reopen. Less obvious ... and less flexible ... is to reset the file pointer to the beginning, with
csvfile.seek(0)
This only works for seekable files; a regular file on disk, such as this CSV, is seekable.
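A small sketch of the rewind approach, using io.StringIO with made-up data in place of an open file (both are assumptions for illustration; the calls themselves are the same on a real file object):

```python
import csv
import io

# io.StringIO stands in for an open, seekable file (sample data is made up).
csvfile = io.StringIO("a b\nc d\ne f\n")
data = csv.reader(csvfile, delimiter=' ')

row_count = sum(1 for _ in data)  # Consumes every row.
print(row_count)

csvfile.seek(0)    # Rewind the underlying file, not the reader.
rows = list(data)  # The reader now yields the rows again.
for row in rows:
    print(row)
```

Note that seek() is called on the file object; the csv reader itself has no method for rewinding.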
Even better, simply count the lines as you print them:
with open('data.csv') as csvfile:
    data = csv.reader(csvfile, delimiter=' ')
    row_count = 0
    for row in data:
        print(row)
        row_count += 1
    print(row_count)
You already consumed the rows from data with your generator expression:
with open('data.csv') as csvfile:
    data = csv.reader(csvfile, delimiter=' ')
    print(data)
    row_count = row_count = sum(1 for lines in data) # This consumes all of the rows
    print(row_count)
    for row in data: # no more rows at this point.
        print(row) # doesn't run because there are no more rows left
You'll have to save all of the rows in memory or create a second CSV reader object if you want to print the count before printing each row.
If you have memory to spare, the simplest solution may be to load the data into a list and iterate over that:
with open('your_csv.csv','r') as f_:
    reader=csv.reader(f_)
    new_list=list(reader)
for row in new_list:
    print(row) #or whatever else you want to do afterwards
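Once the rows are in a list, the count from the question is just len() of that list, so it can be printed before the rows. A minimal sketch, with io.StringIO and made-up data standing in for data.csv:

```python
import csv
import io

# Made-up sample standing in for data.csv from the question.
csvfile = io.StringIO("1 2 3\n4 5 6\n")
rows = list(csv.reader(csvfile, delimiter=' '))

row_count = len(rows)  # No second pass over the file is needed.
print(row_count)
for row in rows:
    print(row)
```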
Can anyone help me out? I am new to Python and I have some problems assigning variables and skipping some lines.
My code looks like this:
import csv
with open('sample.csv', "r") as csvfile:
    # Set up CSV reader and process the header
    reader = csv.reader(csvfile, delimiter=' ')
    skip_lines = csvfile.readlines()[4:] #skip the first four lines
    capacity = []
    voltage = []
    temperature = []
    impedance = []
    # Loop through the lines in the file and get each coordinate
    for row in reader:
        capacity.append(row[0])
        voltage.append(row[1])
        temperature.append(row[2])
        impedance.append(row[3])
    print(capacity, voltage, temperature, impedance)
Your problem is here:
skip_lines = csvfile.readlines()[4:]
This reads the entire file into memory, and puts everything but the first 4 lines in skip_lines. By the time you get to:
for row in reader:
the file object is already exhausted, so the reader has nothing left to yield.
To skip a single line in the file object, use:
next(csvfile)
Since you want to skip the first four:
for _ in range(4):
    next(csvfile)
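An alternative way to skip the lines is itertools.islice, which can be wrapped around the file object before it is handed to csv.reader. This is a sketch only: the sample contents are invented, io.StringIO stands in for sample.csv, and converting the fields with float() is an assumption about what the columns hold.

```python
import csv
import io
from itertools import islice

# Made-up sample: four preamble lines, then space-separated data.
csvfile = io.StringIO(
    "line 1\nline 2\nline 3\nline 4\n"
    "1.0 3.7 25 0.05\n"
    "2.0 3.6 26 0.06\n"
)

# Skip the first four lines; islice starts yielding at index 4.
reader = csv.reader(islice(csvfile, 4, None), delimiter=' ')

capacity, voltage, temperature, impedance = [], [], [], []
for row in reader:
    capacity.append(float(row[0]))
    voltage.append(float(row[1]))
    temperature.append(float(row[2]))
    impedance.append(float(row[3]))

print(capacity, voltage, temperature, impedance)
```

Unlike readlines(), islice does not load the whole file into memory; it just discards the first four lines as they are read.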
Right now, data and filenameVariable are printing the final row when I need all rows. I tried .append but that didn't work. What else could I use?
Here is the data I'm working with:
someCSVfile.csv|cust_no,0|streetaddr,1|city,2|state,3|zip,4|phone_home,5|firstname,6|lastname,7|status,9|
someCSVfile1.csv|cust_no,0|streetaddr,1|city,2|state,3|zip,4|phone_home,5|firstname,6|lastname,7|status,9|
And here's the code so far:
import csv
reader = csv.reader(open('match_log.txt','rb'), dialect='excel', delimiter='|')
data = {}
for row in reader:
    filenameVariable = row[0]
    data = dict(item.split(',') for item in row[1:])
print data
print filenameVariable
#right now its taking the final row. I need all rows
The problem is that you are overwriting data on each line of the CSV. Instead, use row[0] as a key in the data dict:
import csv
reader = csv.reader(open('match_log.txt','rb'), dialect='excel', delimiter='|')
filenameVariable = []
data = {}
for row in reader:
    filenameVariable.append(row[0])
    data[row[0]] = dict(item.split(',') for item in row[1:])
print data
print filenameVariable
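The code above is Python 2. A Python 3 sketch of the same idea might look like the following. The sample lines are shortened versions of the ones in the question, io.StringIO stands in for match_log.txt, and the name filenames is illustrative. The sample lines end with a trailing |, which produces an empty last field, so the sketch skips empty fields; that handling is an assumption about the real data.

```python
import csv
import io

# Shortened sample lines from the question; io.StringIO stands in for match_log.txt.
log = io.StringIO(
    "someCSVfile.csv|cust_no,0|streetaddr,1|city,2|\n"
    "someCSVfile1.csv|cust_no,0|streetaddr,1|city,2|\n"
)

data = {}
filenames = []
for row in csv.reader(log, delimiter='|'):
    filenames.append(row[0])
    # One inner dict per file name, so earlier rows are not overwritten.
    # "if item" drops the empty field left by the trailing "|".
    data[row[0]] = dict(item.split(',') for item in row[1:] if item)

print(data)
print(filenames)
```

With a real file in Python 3, open it in text mode with open('match_log.txt', newline='') rather than 'rb'.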
Why is unique[1] never accessed in the second for loop?
unique is an array of strings.
import csv
with open('file.csv', 'rb') as f:
    reader = csv.reader(f)
    for i in range(len(unique)):
        # print unique[i] #prints all the items in the array
        for row in reader:
            print unique[i] # always prints the first item unique[0]
            if row[1]==unique[i]:
                print row[1], row[0] # prints only the unique[0] stuff
Thank you
I think it would be useful to go through the program flow.
First, it assigns i=0 and reads the entire CSV file, printing unique[0] for each line. When the file has been read, it moves to the second iteration and assigns i=1. Since the reader has already reached the end of the file, the body of for row in reader: never runs, so the inner loop exits immediately for every remaining value of i.
Further Clarification
csv.reader(f) doesn't actually read the file until you iterate over it with for row in reader, and after that it has nothing more to read. If you want to go through the rows multiple times, read them into a list first, like this:
import csv
with open('file.csv', 'rb') as f:
    reader = csv.reader(f)
    rows = [row for row in reader]
for i in range(len(unique)):
    for row in rows:
        print unique[i]
        if row[1]==unique[i]:
            print row[1], row[0]
I think you might have better luck if you change your nested structure to:
import csv
res = {}
for x in unique:
    res[x] = []
with open('file.csv', 'rb') as f:
    reader = csv.reader(f)
    for row in reader:
        for i in range(len(unique)):
            # print unique[i] #prints all the items in the array
            if row[1]==unique[i]:
                res[unique[i]].append([row[1],row[0]])
                #print row[1], row[0] # prints only the unique[0] stuff
for x in unique:
    print res[x]
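The same single-pass grouping can be sketched in Python 3 with a dictionary lookup in place of the inner loop. The contents of unique and of the file are hypothetical here, and io.StringIO stands in for file.csv.

```python
import csv
import io

# Hypothetical inputs: `unique` and the file contents are made up for illustration.
unique = ["red", "blue"]
f = io.StringIO("1,red\n2,blue\n3,red\n4,green\n")

res = {key: [] for key in unique}
for row in csv.reader(f):
    # A dict lookup replaces the inner loop over `unique`.
    if row[1] in res:
        res[row[1]].append([row[1], row[0]])

for key in unique:
    print(key, res[key])
```

The file is read exactly once, so the exhausted-reader problem from the question cannot occur.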