Data not being passed to variable in for-loop - python

I have this following code
with open('data.csv') as csvfile:
data = csv.reader(csvfile, delimiter=' ')
print(data)
row_count = row_count = sum(1 for lines in data)
print(row_count)
for row in data:
print(row)
It prints:
<_csv.reader object at 0x00000295CB6933C8>
505
So it prints data as an object and prints the row_count as 505. Just does not seem to print row in the for-loop. I am not sure why there is nothing being passed to the variable row?
This is particularly frustrating because if i get rid of row_count it works! Why?

data = csv.reader(csvfile, delimiter=' ')
print(data)
row_count = row_count = sum(1 for lines in data)
You just read the entire file; you've exhausted the input. There is nothing left for your second for to find. You have to reset the reader. The most obvious way is to close the file and reopen. Less obvious ... and less flexible ... is to reset the file pointer to the beginning, with
csvfile.seek(0)
This doesn't work for all file subtypes, but does work for CSV.
Even better, simply count the lines as you print them:
with open('data.csv') as csvfile:
data = csv.reader(csvfile, delimiter=' ')
row_count = 0
for row in data:
print(row)
row_count += 1
print(row_count)

You consumed the rows from data already with your set comprehension:
with open('data.csv') as csvfile:
data = csv.reader(csvfile, delimiter=' ')
print(data)
row_count = row_count = sum(1 for lines in data) # This consumes all of the rows
print(row_count)
for row in data: # no more rows at this point.
print(row) # doesn't run because there are no more rows left
You'll have to save all of the rows in memory or create a second CSV reader object if you want to print the count before printing each row.

It could be the best solution for you to make that data into a list and itterate through it to save yourself all the troubles if you have some memory to spend
with open('your_csv.csv','r') as f_:
reader=csv.reader(f_)
new_list=list(reader)
for row in new_list:
print(row) #or whatever else you want to do afterwards

Related

If condition in Python and create list

I just started to learn python. I have a csv file contains three row: date, time and temperature. Now I want to screen all temperature data > 80 and then put the screened lines into a list and print them.
import csv
date_array = []
time_array = []
temp_array = []
output = []
with open("/Users/Fay/Documents/GitHub/warning_system/temp.csv") as csvfile:
csv_reader = csv.reader(csvfile, delimiter=",")
next(csv_reader, None)
for row in csv_reader:
date_array.append(row[0])
time_array.append(row[1])
temp_array.append(row[2])
#why to disassemble the data vertically instead of horizontally, line by line.
#print(data_array[1])
#print(time_array[1])
#print(temp_array[1])
for row in csv_reader:
output= ['data_array','time_array','temp_array']
if temp_array > '80':
print(output)
Could you help me to fix it? Thanks.
Make an array of dictionaries, not 3 separate arrays.
The second loop should iterate over the array that you filled in, not csv_reader. There's nothing left to process in csv_reader, because the previous loop reached the end of it.
You should also convert the temperature to a number.
import csv
data_array = []
output = []
with open("/Users/Fay/Documents/GitHub/warning_system/temp.csv") as csvfile:
csv_reader = csv.reader(csvfile, delimiter=",")
next(csv_reader, None)
for row in csv_reader:
data_array.append({"date": row[0], "time": row[1], "temp": float(row[2])})
for item in data_array:
if item['temp'] > 80:
output.append(item)
print(output)

Loop over rows of csv.DictReader more than once

I open a file and read it with csv.DictReader. I iterate over it twice, but the second time nothing is printed. Why is this, and how can I make it work?
with open('MySpreadsheet.csv', 'rU') as wb:
reader = csv.DictReader(wb, dialect=csv.excel)
for row in reader:
print row
for row in reader:
print 'XXXXX'
# XXXXX is not printed
You read the entire file the first time you iterated, so there is nothing left to read the second time. Since you don't appear to be using the csv data the second time, it would be simpler to count the number of rows and just iterate over that range the second time.
import csv
from itertools import count
with open('MySpreadsheet.csv', 'rU') as f:
reader = csv.DictReader(f, dialect=csv.excel)
row_count = count(1)
for row in reader:
next(count)
print(row)
for i in range(row_count):
print('Stack Overflow')
If you need to iterate over the raw csv data again, it's simple to open the file again. Most likely, you should be iterating over some data you stored the first time, rather than reading the file again.
with open('MySpreadsheet.csv', 'rU') as f:
reader = csv.DictReader(f, dialect=csv.excel)
for row in reader:
print(row)
with open('MySpreadsheet.csv', 'rU') as f:
reader = csv.DictReader(f, dialect=csv.excel)
for row in reader:
print('Stack Overflow')
If you don't want to open the file again, you can seek to the beginning, skip the header, and iterate again.
with open('MySpreadsheet.csv', 'rU') as f:
reader = csv.DictReader(f, dialect=csv.excel)
for row in reader:
print(row)
f.seek(0)
next(reader)
for row in reader:
print('Stack Overflow')
You can create a list of dictionaries, each dictionary representing a row in your file, and then count the length of the list, or use list indexing to print each dictionary item.
Something like:
with open('YourCsv.csv') as csvfile:
reader = csv.DictReader(csvfile)
rowslist = list(reader)
for i in range(len(rowslist))
print(rowslist[i])
add a wb.seek(0) (goes back to the start of the file) and next(reader) (skips the header row) before your second loop.
You can try store the dict in list and output
input_csv = []
with open('YourCsv.csv', 'r', encoding='UTF-8') as csvfile:
reader = csv.DictReader(csvfile)
for row in reader:
input_csv.append(row)
for row in input_csv:
print(row)
for row in input_csv:
print(row)

Python not entering in for

Why the unique[1] is never accessed in the second for???
unique is an array of strings.
import csv
with open('file.csv', 'rb') as f:
reader = csv.reader(f)
for i in range(len(unique)):
# print unique[i] #prints all the items in the array
for row in reader:
print unique[i] # always prints the first item unique[0]
if row[1]==unique[i]:
print row[1], row[0] # prints only the unique[0] stuff
Thank you
I think it would be useful to go through the program flow.
First, it will assign i=0, then it will read the entire CSV file, printing unique[0] for each line in the CSV file, then after it finishes reading the CSV file, it will go to the second iteration, assigning i=1, and then since the program has finished reading the file, it won't enter for row in reader:, hence it exits the loop.
Further Clarification
The csv.reader(f) won't actually read the file until you do for row in reader, and after that it has nothing more to read. If you want to read the file multiple times, then read it into a list first beforehand, like this:
import csv
with open('file.csv', 'rb') as f:
reader = csv.reader(f)
rows = [row for row in reader]
for i in range(len(unique)):
for row in rows:
print unique[i]
if row[1]==unique[i]:
print row[1], row[0]
I think you might have better luck if you change your nested structure to:
import csv
res = {}
for x in unique:
res[x] = []
with open('file.csv', 'rb') as f:
reader = csv.reader(f)
for row in reader:
for i in range(len(unique)):
# print unique[i] #prints all the items in the array
if row[1]==unique[i]:
res[unique[i]].append([row[1],row[0]])
#print row[1], row[0] # prints only the unique[0] stuff
for x in unique:
print res[x]

Counting using DictReader

Still new to Python, this is how far I've managed to get:
import csv
import sys
import os.path
#VARIABLES
reader = None
col_header = None
total_rows = None
rows = None
#METHODS
def read_csv(csv_file):
#Read and display CSV file w/ HEADERS
global reader, col_header, total_rows, rows
#Open assign dictionaries to reader
with open(csv_file, newline='') as csv_file:
#restval = blank columns = - /// restkey = extra columns +
reader = csv.DictReader(csv_file, fieldnames=None, restkey='+', restval='-', delimiter=',',
quotechar='"')
try:
col_header = reader.fieldnames
print('The headers: ' + str(reader.fieldnames))
for row in reader:
print(row)
#Calculate number of rows
rows = list(reader)
total_rows = len(rows)
except csv.Error as e:
sys.exit('file {}, line {}: {}'.format(csv_file, reader.line_num, e))
def calc_total_rows():
print('\nTotal number of rows: ' + str(total_rows))
My issue is that, when I attempt to count the number of rows, it comes up as 0 (impossible because csv_file contains 4 rows and they print on screen.
I've placed the '#Calculate number of rows' code above my print row loop and it works, however the rows then don't print. It's as if each task is stealing the dictionary from one another? How do I solve this?
The problem is that the reader object behaves like a file as its iterating through the CSV. Firstly you iterate through in the for loop, and print each row. Then you try to create a list from whats left - which is now empty as you've iterated through the whole file. The length of this empty list is 0.
Try this instead:
rows = list(reader)
for row in rows:
print(row)
total_rows = len(rows)

How to read one single line of csv data in Python?

There is a lot of examples of reading csv data using python, like this one:
import csv
with open('some.csv', newline='') as f:
reader = csv.reader(f)
for row in reader:
print(row)
I only want to read one line of data and enter it into various variables. How do I do that? I've looked everywhere for a working example.
My code only retrieves the value for i, and none of the other values
reader = csv.reader(csvfile, delimiter=',', quotechar='"')
for row in reader:
i = int(row[0])
a1 = int(row[1])
b1 = int(row[2])
c1 = int(row[2])
x1 = int(row[2])
y1 = int(row[2])
z1 = int(row[2])
To read only the first row of the csv file use next() on the reader object.
with open('some.csv', newline='') as f:
reader = csv.reader(f)
row1 = next(reader) # gets the first line
# now do something here
# if first row is the header, then you can do one more next() to get the next row:
# row2 = next(f)
or :
with open('some.csv', newline='') as f:
reader = csv.reader(f)
for row in reader:
# do something here with `row`
break
you could get just the first row like:
with open('some.csv', newline='') as f:
csv_reader = csv.reader(f)
csv_headings = next(csv_reader)
first_line = next(csv_reader)
You can use Pandas library to read the first few lines from the huge dataset.
import pandas as pd
data = pd.read_csv("names.csv", nrows=1)
You can mention the number of lines to be read in the nrows parameter.
Just for reference, a for loop can be used after getting the first row to get the rest of the file:
with open('file.csv', newline='') as f:
reader = csv.reader(f)
row1 = next(reader) # gets the first line
for row in reader:
print(row) # prints rows 2 and onward
From the Python documentation:
And while the module doesn’t directly support parsing strings, it can easily be done:
import csv
for row in csv.reader(['one,two,three']):
print row
Just drop your string data into a singleton list.
The simple way to get any row in csv file
import csv
csvfile = open('some.csv','rb')
csvFileArray = []
for row in csv.reader(csvfile, delimiter = '.'):
csvFileArray.append(row)
print(csvFileArray[0])
To print a range of line, in this case from line 4 to 7
import csv
with open('california_housing_test.csv') as csv_file:
data = csv.reader(csv_file)
for row in list(data)[4:7]:
print(row)
I think the simplest way is the best way, and in this case (and in most others) is one without using external libraries (pandas) or modules (csv). So, here is the simple answer.
""" no need to give any mode, keep it simple """
with open('some.csv') as f:
""" store in a variable to be used later """
my_line = f.nextline()
""" do what you like with 'my_line' now """

Categories