Right now, data and filenameVariable are printing the final row when I need all rows. I tried .append but that didn't work. What else could I use?
Here is the data I'm working with:
someCSVfile.csv|cust_no,0|streetaddr,1|city,2|state,3|zip,4|phone_home,5|firstname,6|lastname,7|status,9|
someCSVfile1.csv|cust_no,0|streetaddr,1|city,2|state,3|zip,4|phone_home,5|firstname,6|lastname,7|status,9|
And here's the code so far:
import csv
reader = csv.reader(open('match_log.txt','rb'), dialect='excel', delimiter='|')
data = {}
for row in reader:
    filenameVariable = row[0]
    data = dict(item.split(',') for item in row[1:])
print data
print filenameVariable
#right now its taking the final row. I need all rows
The problem is that you overwrite data on every line of the CSV. Instead, use row[0] as a key in the data dict and collect the file names in a list:
import csv
reader = csv.reader(open('match_log.txt','rb'), dialect='excel', delimiter='|')
filenameVariable = []
data = {}
for row in reader:
    filenameVariable.append(row[0])
    data[row[0]] = dict(item.split(',') for item in row[1:])
print data
print filenameVariable
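A minimal Python 3 sketch of the same idea, assuming the log lines look like the sample in the question. The helper name parse_match_log and the in-memory sample are illustrative and not from the original post. Empty fields are skipped because the trailing "|" in each sample line would otherwise produce an empty item that dict() cannot turn into a key/value pair.

```python
import csv
import io

# Illustrative sample in the same pipe-delimited layout as the question.
SAMPLE = (
    "someCSVfile.csv|cust_no,0|streetaddr,1|city,2|\n"
    "someCSVfile1.csv|cust_no,0|streetaddr,1|city,2|\n"
)


def parse_match_log(lines):
    """Map each file name (first field) to its {column: index} pairs."""
    data = {}
    for row in csv.reader(lines, delimiter="|"):
        if not row:
            continue  # skip blank lines
        # The trailing "|" yields an empty last field, so filter it out.
        data[row[0]] = dict(item.split(",", 1) for item in row[1:] if item)
    return data


data = parse_match_log(io.StringIO(SAMPLE))
filenames = list(data)  # dict keys keep insertion order on Python 3.7+
print(data)
print(filenames)
```

With a real file, the same function could be called as parse_match_log(open('match_log.txt', newline='')), since csv.reader accepts any iterable of lines.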
Related
I have the following code:
with open('data.csv') as csvfile:
    data = csv.reader(csvfile, delimiter=' ')
    print(data)
    row_count = row_count = sum(1 for lines in data)
    print(row_count)
    for row in data:
        print(row)
It prints:
<_csv.reader object at 0x00000295CB6933C8>
505
So it prints data as an object and prints the row_count as 505, but it does not print any row in the for loop. I am not sure why nothing is being passed to the variable row.
This is particularly frustrating because if I get rid of row_count, it works! Why?
data = csv.reader(csvfile, delimiter=' ')
print(data)
row_count = row_count = sum(1 for lines in data)
You just read the entire file; you've exhausted the input. There is nothing left for your second for to find. You have to reset the reader. The most obvious way is to close the file and reopen it. A less obvious, and less flexible, way is to reset the file pointer to the beginning with
csvfile.seek(0)
This doesn't work for every file-like object (pipes and standard input, for example, are not seekable), but it does work for a regular CSV file on disk.
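A small sketch of the rewind approach, using io.StringIO as a stand-in for a seekable file; the sample contents are made up. Creating a fresh reader after seek(0) is a cautious choice here, not something the answer above requires.

```python
import csv
import io

# In-memory stand-in for an open, seekable file (hypothetical contents).
csvfile = io.StringIO("a b\nc d\ne f\n")

data = csv.reader(csvfile, delimiter=" ")
row_count = sum(1 for _ in data)  # consumes every row
leftover = list(data)             # nothing remains at this point

csvfile.seek(0)                             # rewind the underlying file
data = csv.reader(csvfile, delimiter=" ")   # fresh reader over the rewound file
rows = list(data)

print(row_count)
print(leftover)
print(rows)
```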
Even better, simply count the lines as you print them:
with open('data.csv') as csvfile:
    data = csv.reader(csvfile, delimiter=' ')
    row_count = 0
    for row in data:
        print(row)
        row_count += 1
    print(row_count)
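The same single pass can also be written with enumerate, which keeps the counter in step with the loop. This is only a sketch: the in-memory sample stands in for data.csv, and row_count is initialised first so it is still defined when the input is empty.

```python
import csv
import io

csvfile = io.StringIO("a b\nc d\n")  # hypothetical stand-in for data.csv
data = csv.reader(csvfile, delimiter=" ")

row_count = 0  # stays 0 if the file has no rows
for row_count, row in enumerate(data, start=1):
    print(row)
print(row_count)
```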
You already consumed the rows from data with the generator expression inside sum():
with open('data.csv') as csvfile:
    data = csv.reader(csvfile, delimiter=' ')
    print(data)
    row_count = row_count = sum(1 for lines in data) # This consumes all of the rows
    print(row_count)
    for row in data: # no more rows at this point.
        print(row) # doesn't run because there are no more rows left
You'll have to save all of the rows in memory or create a second CSV reader object if you want to print the count before printing each row.
If you have some memory to spare, the simplest solution may be to turn the data into a list and iterate through that instead:
with open('your_csv.csv','r') as f_:
    reader=csv.reader(f_)
    new_list=list(reader)
    for row in new_list:
        print(row) #or whatever else you want to do afterwards
I just started to learn Python. I have a CSV file that contains three columns: date, time and temperature. Now I want to select all rows with a temperature above 80, put those rows into a list and print them.
import csv
date_array = []
time_array = []
temp_array = []
output = []
with open("/Users/Fay/Documents/GitHub/warning_system/temp.csv") as csvfile:
    csv_reader = csv.reader(csvfile, delimiter=",")
    next(csv_reader, None)
    for row in csv_reader:
        date_array.append(row[0])
        time_array.append(row[1])
        temp_array.append(row[2])
    #why to disassemble the data vertically instead of horizontally, line by line.
    #print(data_array[1])
    #print(time_array[1])
    #print(temp_array[1])
    for row in csv_reader:
        output= ['data_array','time_array','temp_array']
        if temp_array > '80':
            print(output)
Could you help me to fix it? Thanks.
Make an array of dictionaries, not 3 separate arrays.
The second loop should iterate over the array that you filled in, not csv_reader. There's nothing left to process in csv_reader, because the previous loop reached the end of it.
You should also convert the temperature to a number.
import csv
data_array = []
output = []
with open("/Users/Fay/Documents/GitHub/warning_system/temp.csv") as csvfile:
    csv_reader = csv.reader(csvfile, delimiter=",")
    next(csv_reader, None)
    for row in csv_reader:
        data_array.append({"date": row[0], "time": row[1], "temp": float(row[2])})
for item in data_array:
    if item['temp'] > 80:
        output.append(item)
print(output)
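As a more compact variant, csv.DictReader can build the dictionaries directly and a list comprehension can do the filtering. This is a sketch under assumptions: the header names date, time and temp and the sample rows are invented for illustration, since the real file's header is not shown.

```python
import csv
import io

# Hypothetical sample: a header line, then date, time, temperature.
SAMPLE = (
    "date,time,temp\n"
    "2019-01-01,10:00,79.5\n"
    "2019-01-01,11:00,80.5\n"
    "2019-01-01,12:00,91\n"
)

reader = csv.DictReader(io.StringIO(SAMPLE))  # first line supplies the keys
# Compare as numbers, not strings, so "9" is not treated as greater than "80".
output = [row for row in reader if float(row["temp"]) > 80]

for row in output:
    print(row["date"], row["time"], row["temp"])
```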
I am scraping data, but I want the CSV to be written to columns 2 to 12 (B to L) rather than 1 to 4. So far I have simply been writing langs_text to the column, though this is slow. Is there a better method that does not take so long and lets me start at column 2?
I have tried to include the code below, but it simply does not write any values to the CSV and the job continues.
E.g
langs11 = ("potato")
langs11_text = []
langs11 = []
langs11_text = []
time.sleep(0)
FILE_LOCATION = 'C:\\Users\\Bain3\\Aperture.csv'
with open(FILE_LOCATION, 'a', newline='', encoding="utf-8") as outfile:
    writer = csv.writer(outfile)
    for row in zip(langs11_text, langs_text, langs11_text, langs11_text, langs11_text, langs11_text, langs1_text, langs2_text, elem_href, langs11_text):
        print(row)
        writer.writerow(row)
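zip() stops at its shortest argument, so passing the empty langs11_text produces no rows at all. What you need is something like the following: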
for row in zip(langs_text, langs2_text, langs3_text):
    data = ["","","","","","","","","","","",""]
    data[1] = row[0]
    data[4] = row[1]
    data[6] = href
    data[7] = row[2]
    writer.writerow(data)
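If the only goal is to leave column A blank and keep the remaining values contiguous, a simpler sketch is to prepend one empty cell to every row. The list contents below are made up, and io.StringIO stands in for the real output file.

```python
import csv
import io

langs_text = ["en", "fr"]            # hypothetical scraped values
langs2_text = ["English", "French"]  # hypothetical scraped values
empty = []

# zip() stops at its shortest argument, so one empty list yields no rows.
no_rows = list(zip(empty, langs_text))

outfile = io.StringIO()              # stand-in for the real output file
writer = csv.writer(outfile)
for row in zip(langs_text, langs2_text):
    writer.writerow([""] + list(row))  # blank column A, data starts at column B

print(no_rows)
print(outfile.getvalue())
```

Adding more empty strings to the prefix shifts the data further right in the same way.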
Why is unique[1] never accessed in the second for loop?
unique is an array of strings.
import csv
with open('file.csv', 'rb') as f:
    reader = csv.reader(f)
    for i in range(len(unique)):
        # print unique[i] #prints all the items in the array
        for row in reader:
            print unique[i] # always prints the first item unique[0]
            if row[1]==unique[i]:
                print row[1], row[0] # prints only the unique[0] stuff
Thank you
I think it would be useful to go through the program flow.
First, it assigns i=0 and reads the entire CSV file, printing unique[0] for each line. When it finishes reading the file, it moves to the second iteration and assigns i=1. Since the reader has already reached the end of the file, the body of for row in reader: never runs again, so the inner loop does nothing for unique[1] or any later item.
Further Clarification
The csv.reader(f) won't actually read the file until you do for row in reader, and after that it has nothing more to read. If you want to go through the rows multiple times, read them into a list first, like this:
import csv
with open('file.csv', 'rb') as f:
    reader = csv.reader(f)
    rows = [row for row in reader]
for i in range(len(unique)):
    for row in rows:
        print unique[i]
        if row[1]==unique[i]:
            print row[1], row[0]
I think you might have better luck if you change your nested structure to:
import csv
res = {}
for x in unique:
    res[x] = []
with open('file.csv', 'rb') as f:
    reader = csv.reader(f)
    for row in reader:
        for i in range(len(unique)):
            # print unique[i] #prints all the items in the array
            if row[1]==unique[i]:
                res[unique[i]].append([row[1],row[0]])
                #print row[1], row[0] # prints only the unique[0] stuff
for x in unique:
    print res[x]
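For reference, a Python 3 sketch of the same single-pass grouping, where a dictionary lookup replaces the inner loop over unique. The contents of unique and the sample rows are hypothetical. In Python 3 a real file would be opened in text mode with newline='' rather than 'rb'.

```python
import csv
import io

unique = ["x", "y"]                # hypothetical values to match in column 1
SAMPLE = "1,x\n2,y\n3,x\n4,z\n"    # hypothetical stand-in for file.csv

res = {value: [] for value in unique}
for row in csv.reader(io.StringIO(SAMPLE)):
    if row[1] in res:              # dict lookup replaces the inner loop
        res[row[1]].append([row[1], row[0]])

for value in unique:
    print(res[value])
```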
Still new to Python, this is how far I've managed to get:
import csv
import sys
import os.path
#VARIABLES
reader = None
col_header = None
total_rows = None
rows = None
#METHODS
def read_csv(csv_file):
    #Read and display CSV file w/ HEADERS
    global reader, col_header, total_rows, rows
    #Open assign dictionaries to reader
    with open(csv_file, newline='') as csv_file:
        #restval = blank columns = - /// restkey = extra columns +
        reader = csv.DictReader(csv_file, fieldnames=None, restkey='+', restval='-', delimiter=',',
                                quotechar='"')
        try:
            col_header = reader.fieldnames
            print('The headers: ' + str(reader.fieldnames))
            for row in reader:
                print(row)
            #Calculate number of rows
            rows = list(reader)
            total_rows = len(rows)
        except csv.Error as e:
            sys.exit('file {}, line {}: {}'.format(csv_file, reader.line_num, e))
def calc_total_rows():
    print('\nTotal number of rows: ' + str(total_rows))
My issue is that, when I attempt to count the number of rows, it comes up as 0 (impossible, because csv_file contains 4 rows and they print on screen).
I've placed the '#Calculate number of rows' code above my print row loop and it works, however the rows then don't print. It's as if each task is stealing the dictionary from the other. How do I solve this?
The problem is that the reader object behaves like a file: it can only be iterated through once. First you iterate through it in the for loop and print each row. Then you try to create a list from what's left, which is nothing, because you've already iterated through the whole file. The length of this empty list is 0.
Try this instead:
rows = list(reader)
for row in rows:
    print(row)
total_rows = len(rows)
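Put together, the function could return its results instead of updating globals. This is only a sketch: the sample data is invented, and read_csv here takes an iterable of lines rather than a file name so the example is self-contained.

```python
import csv
import io


def read_csv(csv_lines):
    """Return (headers, rows) from an iterable of CSV lines."""
    # restval fills missing columns with "-", restkey collects extras under "+".
    reader = csv.DictReader(csv_lines, restkey="+", restval="-")
    rows = list(reader)  # read once; reuse the list afterwards
    return reader.fieldnames, rows


SAMPLE = "name,age\nAnn,30\nBob\nCy,41,extra\n"  # hypothetical contents
headers, rows = read_csv(io.StringIO(SAMPLE))

print("The headers:", headers)
for row in rows:
    print(row)
print("Total number of rows:", len(rows))
```

Because rows is an ordinary list, it can be printed, counted and looped over as many times as needed.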