I am using Python to parse a CSV file, but I can't work out how to extract the "Davies" element from the second row.
The CSV looks like this:
"_submissionusersID","_submissionresponseID","username","firstname","lastname","userid","phone","emailaddress","load_date"
"b838b35d-ca18-4c7c-874a-828298ae3345","e9cde2ff-33a7-477e-b3b9-12ceb0d214e0","DAVIESJO","John","Davies","16293","","john_davies#test2.com","2019-08-30 15:37:03"
"00ec3205-6fcb-4d6d-b806-25579b49911a","e9cde2ff-11a7-477e-b3b9-12ceb0d934e0","MORANJO","John","Moran","16972","+1 (425) 7404555","brian_moran2#test2.com","2019-08-30 15:37:03"
"cc44e6bb-af76-4165-8839-433ed8cf6036","e9cde2ff-33a7-477e-b3b9-12ceb0d934e0","TESTNAN","Nancy","Test","75791","+1 (412) 7402344","nancy_test#test2.com","2019-08-30 15:37:03"
"a8ecd4db-6c8d-453c-a2a7-032553e2f0e6","e9cde2ff-33a7-477e-b3b9-12ceb0d234e0","SMITHJO","John","Smith","197448","+1 (415) 5940445","john_smith#test2.com","2019-08-30 15:37:03"
I'm stuck here:
import csv
with open('Docs/CSV/submis/submis.csv') as csv_file:
    csv_reader = csv.DictReader(csv_file)
    for row in csv_reader:
        pass  # stuck here: how do I get "Davies" out of row?
Your code is correct so far. DictReader returns every row as a dict, so you look up the values you want by column name,
as shown below.
import csv
with open('/home/liferay172/Documents/Sundeep/stackoverflow/text.csv') as csv_file:
    csv_reader = csv.DictReader(csv_file)
    for row in csv_reader:
        print(row)
        print("Username :: " + row['username'])
        print("Firstname :: " + row['firstname'])
        print("Lastname :: " + row['lastname'])
For a specific row:
import csv
rowNumber = 1
with open('/home/liferay172/Documents/Sundeep/stackoverflow/text.csv') as csv_file:
    csv_reader = csv.DictReader(csv_file)
    print(list(csv_reader)[rowNumber - 1]['lastname'])  # -1 because list indexes start at 0
This prints: Davies
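Converting the reader to a list loads the whole file into memory. For large files, a possible alternative is itertools.islice, which stops reading as soon as the requested row is reached. This is only a sketch: the nth_row helper and the SAMPLE data are made up for illustration and stand in for the real file.

```python
import csv
import io
from itertools import islice

# Hypothetical sample data standing in for the real file.
SAMPLE = (
    '"username","firstname","lastname"\n'
    '"DAVIESJO","John","Davies"\n'
    '"MORANJO","John","Moran"\n'
)

def nth_row(csv_file, row_number):
    """Return the 1-based data row as a dict, or None if the file is too short."""
    reader = csv.DictReader(csv_file)
    # islice skips row_number - 1 rows and stops after yielding one row.
    return next(islice(reader, row_number - 1, row_number), None)

row = nth_row(io.StringIO(SAMPLE), 1)
print(row['lastname'])  # Davies
```

With a real file, pass the object returned by open(path, newline='') instead of the StringIO.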
Here's how to store, for example, the "Davies" record in a match variable and print its data if it is found.
import csv
match = None  # stays None if no row matches
with open('/home/liferay172/Documents/Sundeep/stackoverflow/text.csv') as csv_file:
    csv_reader = csv.DictReader(csv_file)
    for row in csv_reader:
        # "Davies" is in the lastname column; the username there is "DAVIESJO".
        if row['lastname'] == "Davies":
            match = row
            print("Username:\t" + row['username'])
            print("Firstname:\t" + row['firstname'])
            print("Lastname:\t" + row['lastname'])
            break
print(match)
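The same search can be written more compactly with next() and a generator expression, which returns a default when nothing matches. This is a sketch under the assumption that the column is called lastname, as in the sample CSV; find_by_lastname and SAMPLE are illustrative names, not part of the original answer.

```python
import csv
import io

# Hypothetical sample data standing in for the real file.
SAMPLE = (
    '"username","firstname","lastname"\n'
    '"DAVIESJO","John","Davies"\n'
    '"MORANJO","John","Moran"\n'
)

def find_by_lastname(csv_file, lastname):
    """Return the first row whose lastname column matches, or None."""
    reader = csv.DictReader(csv_file)
    # next() stops at the first match; None is returned if no row matches.
    return next((row for row in reader if row['lastname'] == lastname), None)

match = find_by_lastname(io.StringIO(SAMPLE), 'Davies')
print(match['username'])  # DAVIESJO
```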
You can convert the CSV reader object to a list and then access it by index.
import csv
with open('Docs/CSV/submis/submis.csv') as csv_file:
    csv_reader = list(csv.reader(csv_file))
# 2nd row
print(csv_reader[1])
# 2nd row, 3rd column
print(csv_reader[1][2])
Related
How can I remove a row from a CSV file? This is my code, and I want to delete row 1 of the file. I added del row[1], but it does nothing: the program runs without error but row 1 is not deleted.
import csv
with open('grades.csv', 'r') as file:
    grades_reader = csv.reader(file, delimiter=',')
    row_num = 1
    for row in grades_reader:
        print('Row #{}:'.format(row_num), row)
        row_num += 1
        del row[1]
One approach is to write the content to a temporary file and then rename it.
Example:
import csv
import os
with open('grades.csv', 'r') as file, open('grades_out.csv', 'w', newline='') as outfile:
    grades_reader = csv.reader(file, delimiter=',')
    grades_reader_out = csv.writer(outfile, delimiter=',')
    header = next(grades_reader)  # Header
    next(grades_reader)  # Skip first row
    grades_reader_out.writerow(header)  # Write header
    for row in grades_reader:
        grades_reader_out.writerow(row)
# Rename
os.rename(..., ...)
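The temp-file approach can be wrapped in a small function. The sketch below is one possible way to do it, not the answer's exact method: delete_data_row is a hypothetical helper, it assumes the file has a header line, and it uses os.replace (rather than os.rename) so that the original file is overwritten in one step.

```python
import csv
import os
import tempfile

def delete_data_row(path, row_number):
    """Rewrite the CSV at `path` without the given 1-based data row.

    Assumes the first line of the file is a header.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file lives in the same directory as the original.
    with open(path, newline='') as src, tempfile.NamedTemporaryFile(
            'w', newline='', dir=directory, delete=False) as tmp:
        reader = csv.reader(src)
        writer = csv.writer(tmp)
        writer.writerow(next(reader))  # copy the header unchanged
        for index, row in enumerate(reader, start=1):
            if index != row_number:  # skip only the row to delete
                writer.writerow(row)
    # Both files are closed here; swap the new file in place of the old one.
    os.replace(tmp.name, path)

# Example call (file name taken from the question):
# delete_data_row('grades.csv', 1)
```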
I have Python code for appending data to the same CSV file, but when I append the data it skips rows and starts at row 15 instead of row 4.
import csv
with open('csvtask.csv', 'r') as csv_file:
    csv_reader = csv.DictReader(csv_file)
    ls = []
    for line in csv_reader:
        if len(line['Values']) != 0:
            ls.append(int(line['Values']))
    new_ls = ['', '', '']
    for i in range(len(ls) - 1):
        new_ls.append(ls[i + 1] - ls[i])
    print(new_ls)
with open('csvtask.csv', 'a', newline='') as new_file:
    csv_writer = csv.writer(new_file)
    for i in new_ls:
        csv_writer.writerow(('', '', '', '', i))
    new_file.close()
It's not really feasible to update a file while you're reading it, so a common workaround is to create a new file. The following does that while preserving the field names of the original file. The new column will be named Diff.
Since the first row has no previous value from which to calculate a difference, the rows are processed with the built-in enumerate() function, which yields the index of each item along with the item itself. You can use the index to tell whether the current row is the first one and handle it specially.
import csv
# Read csv file and calculate values of new column.
with open('csvtask.csv', 'r', newline='') as file:
    reader = csv.DictReader(file)
    fieldnames = reader.fieldnames  # Save for later.
    diffs = []
    prev_value = 0
    for i, row in enumerate(reader):
        row['Values'] = int(row['Values']) if row['Values'] else 0
        diff = row['Values'] - prev_value if i > 0 else ''
        prev_value = row['Values']
        diffs.append(diff)
# Read file again and write an updated file with the column added to it.
fieldnames.append('Diff')  # Name of new field.
with open('csvtask.csv', 'r', newline='') as inp:
    reader = csv.DictReader(inp)
    with open('csvtask_updated.csv', 'w', newline='') as outp:
        writer = csv.DictWriter(outp, fieldnames)
        writer.writeheader()
        for i, row in enumerate(reader):
            row.update({'Diff': diffs[i]})  # Add new column.
            writer.writerow(row)
print('Done')
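If the file fits in memory, the two reads can probably be folded into one. The following is a sketch of that variant, not the answer's code: add_diff_column is a hypothetical helper, it assumes a non-empty file with a Values column, and like the code above it treats an empty cell as 0.

```python
import csv
import io

def add_diff_column(infile, outfile, column='Values'):
    """Copy infile to outfile, adding a Diff column with row-to-row differences."""
    reader = csv.DictReader(infile)
    rows = list(reader)  # read everything once
    fieldnames = list(reader.fieldnames) + ['Diff']
    writer = csv.DictWriter(outfile, fieldnames)
    writer.writeheader()
    prev_value = None
    for row in rows:
        value = int(row[column]) if row[column] else 0  # empty cell counts as 0
        # The first row has no predecessor, so its Diff stays empty.
        row['Diff'] = '' if prev_value is None else value - prev_value
        prev_value = value
        writer.writerow(row)

# In-memory demonstration with made-up data.
source = io.StringIO('Name,Values\r\na,10\r\nb,15\r\nc,12\r\n')
target = io.StringIO()
add_diff_column(source, target)
print(target.getvalue())
```

With real files, open the input and a separate output file with newline='' and pass both file objects in.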
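You can use the DictWriter class like this:
import csv
header = ["data", "values"]
# DictWriter expects each row as a dict keyed by the field names.
data = [{"data": 1, "values": 2}, {"data": 4, "values": 6}]
with open('output.csv', 'w', newline='') as file:  # output.csv is a placeholder name
    writer = csv.DictWriter(file, fieldnames=header)
    writer.writeheader()
    writer.writerows(data)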
I have a CSV with two fields, 'positive' and 'negative'. I am trying to add the positive words from the CSV to a list using the DictReader class. Here is my code.
import csv
with open('pos_neg_cleaned.csv', 'r') as csv_file:
    csv_reader = csv.DictReader(csv_file)
    positive_list = []
    for n in csv_reader:
        if n == 'positive' and csv_reader[n] != None :
            positive_list.append(csv_reader[n])
However the program returns an empty list. Any idea how to get around this issue? Or what am I doing wrong?
That's because you can only iterate over the csv_reader once. In this case you do that with the print statement, so nothing is left for the loop.
With a little rearranging it should work fine:
import csv
with open('pos_neg_cleaned.csv', 'r') as csv_file:
    csv_reader = csv.DictReader(csv_file)
    positive_list = []
    for n in csv_reader:
        # Put your print statement inside the loop over the reader.
        # Otherwise the reader will be exhausted by the time you run the logic.
        print(n)
        # n is a dict, so grab the right value from that dict.
        # If it contains a value, then do something with it.
        if n['positive']:
            # Here you want to read the value from your dict.
            # Don't index the csv_reader; use the row it gave you.
            positive_list.append(n['positive'])
Every row from DictReader is a dictionary, so you can retrieve column values by using the column name as the key, like this:
positive_column_values = []
for row in csv_dict_reader:
    positive_column_value = row["positive"]
    positive_column_values.append(positive_column_value)
After this code runs, positive_column_values holds all the values from the "positive" column.
You can replace your code with the following to get the desired result:
import csv
with open('pos_neg_cleaned.csv', 'r') as csv_file:
    csv_reader = csv.DictReader(csv_file)
    positive_list = []
    for row in csv_reader:
        positive_list.append(row["positive"])
print(positive_list)
Here's a short way with a list comprehension. It assumes there is a column called header that holds either positive or negative values.
import csv
with open('pos_neg_cleaned.csv', 'r') as csv_file:
    csv_reader = csv.DictReader(csv_file)
    positive_list = [line for line in csv_reader if line.get('header') == 'positive']
print(positive_list)
Alternatively, if your CSV's header is positive:
    positive_list = [line for line in csv_reader if line.get('positive')]
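Note that the comprehensions above keep whole rows (dicts). If only the words themselves are wanted, a sketch along these lines may help; column_values is a hypothetical helper and SAMPLE is made-up data with the two columns described in the question.

```python
import csv
import io

# Made-up data: the second row has no positive word, the third no negative word.
SAMPLE = 'positive,negative\ngood,bad\n,awful\ngreat,\n'

def column_values(csv_file, column):
    """Collect the non-empty values of one column."""
    return [row[column] for row in csv.DictReader(csv_file) if row[column]]

print(column_values(io.StringIO(SAMPLE), 'positive'))  # ['good', 'great']
```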
I open a file and read it with csv.DictReader. I iterate over it twice, but the second time nothing is printed. Why is this, and how can I make it work?
import csv
with open('MySpreadsheet.csv', newline='') as wb:
    reader = csv.DictReader(wb, dialect=csv.excel)
    for row in reader:
        print(row)
    for row in reader:
        print('XXXXX')
# XXXXX is not printed
You read the entire file the first time you iterated, so there is nothing left to read the second time. Since you don't appear to be using the CSV data the second time, it would be simpler to count the number of rows and just iterate over that range the second time.
import csv
with open('MySpreadsheet.csv', newline='') as f:
    reader = csv.DictReader(f, dialect=csv.excel)
    row_count = 0
    for row in reader:
        row_count += 1  # count rows during the first pass
        print(row)
for i in range(row_count):
    print('Stack Overflow')
If you need to iterate over the raw csv data again, it's simple to open the file again. Most likely, you should be iterating over some data you stored the first time, rather than reading the file again.
with open('MySpreadsheet.csv', newline='') as f:
    reader = csv.DictReader(f, dialect=csv.excel)
    for row in reader:
        print(row)
with open('MySpreadsheet.csv', newline='') as f:
    reader = csv.DictReader(f, dialect=csv.excel)
    for row in reader:
        print('Stack Overflow')
If you don't want to open the file again, you can seek to the beginning, skip the header, and iterate again.
with open('MySpreadsheet.csv', newline='') as f:
    reader = csv.DictReader(f, dialect=csv.excel)
    for row in reader:
        print(row)
    f.seek(0)     # go back to the start of the file
    next(reader)  # skip the header row
    for row in reader:
        print('Stack Overflow')
You can create a list of dictionaries, each dictionary representing a row in your file, and then take the length of the list or use list indexing to print each dictionary.
Something like:
import csv
with open('YourCsv.csv') as csvfile:
    reader = csv.DictReader(csvfile)
    rowslist = list(reader)
for i in range(len(rowslist)):
    print(rowslist[i])
Add a wb.seek(0) (goes back to the start of the file) and a next(reader) (skips the header row) before your second loop.
You can also store the row dicts in a list and then loop over that list as often as you like:
import csv
input_csv = []
with open('YourCsv.csv', 'r', encoding='UTF-8') as csvfile:
    reader = csv.DictReader(csvfile)
    for row in reader:
        input_csv.append(row)
for row in input_csv:
    print(row)
for row in input_csv:
    print(row)
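The behaviour described in these answers can be reproduced without a file. The sketch below uses io.StringIO and made-up data to show that a second pass over the reader yields nothing, while a list built from the reader can be looped over repeatedly.

```python
import csv
import io

SAMPLE = 'name,score\nann,1\nbob,2\n'  # made-up data

reader = csv.DictReader(io.StringIO(SAMPLE))
first_pass = [row['name'] for row in reader]
second_pass = [row['name'] for row in reader]  # reader is already exhausted

rows = list(csv.DictReader(io.StringIO(SAMPLE)))  # materialize the rows once
names = [row['name'] for row in rows]
names_again = [row['name'] for row in rows]  # a list can be reused freely

print(first_pass, second_pass)  # ['ann', 'bob'] []
print(names, names_again)       # ['ann', 'bob'] ['ann', 'bob']
```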
There are a lot of examples of reading CSV data using Python, like this one:
import csv
with open('some.csv', newline='') as f:
    reader = csv.reader(f)
    for row in reader:
        print(row)
I only want to read one line of data and enter it into various variables. How do I do that? I've looked everywhere for a working example.
My code only retrieves the value for i, and none of the other values
reader = csv.reader(csvfile, delimiter=',', quotechar='"')
for row in reader:
    i = int(row[0])
    a1 = int(row[1])
    b1 = int(row[2])
    c1 = int(row[2])
    x1 = int(row[2])
    y1 = int(row[2])
    z1 = int(row[2])
To read only the first row of the CSV file, use next() on the reader object.
import csv
with open('some.csv', newline='') as f:
    reader = csv.reader(f)
    row1 = next(reader)  # gets the first line
    # now do something here
    # if the first row is the header, call next() once more to get the next row:
    # row2 = next(reader)
Or:
with open('some.csv', newline='') as f:
    reader = csv.reader(f)
    for row in reader:
        # do something here with `row`
        break
You could get just the first row after the headings like this:
with open('some.csv', newline='') as f:
    csv_reader = csv.reader(f)
    csv_headings = next(csv_reader)
    first_line = next(csv_reader)
You can use the pandas library to read only the first few lines of a huge dataset.
import pandas as pd
data = pd.read_csv("names.csv", nrows=1)
The nrows parameter sets the number of rows to read.
Just for reference, a for loop can be used after getting the first row to get the rest of the file:
with open('file.csv', newline='') as f:
    reader = csv.reader(f)
    row1 = next(reader)  # gets the first line
    for row in reader:
        print(row)  # prints rows 2 and onward
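Putting the next() approach together with the original goal of filling several variables from one line, a sketch might look like this. first_data_row is a hypothetical helper, the three-column sample is made up, and the default of None passed to next() is there so an empty file does not raise StopIteration.

```python
import csv
import io

def first_data_row(csv_file, has_header=True):
    """Return the first data row as a list of strings, or None if there is none."""
    reader = csv.reader(csv_file)
    if has_header:
        next(reader, None)  # discard the header line if present
    return next(reader, None)

row = first_data_row(io.StringIO('i,a1,b1\n1,2,3\n'))
# Unpack the single row into separate variables, converting each to int.
i, a1, b1 = (int(value) for value in row)
print(i, a1, b1)  # 1 2 3
```

Unpacking raises ValueError if the number of columns does not match the number of variables, which can serve as a cheap sanity check.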
From the Python documentation:
And while the module doesn’t directly support parsing strings, it can easily be done:
import csv
for row in csv.reader(['one,two,three']):
    print(row)
Just drop your string data into a singleton list.
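For a string that spans several lines, one option is to wrap it in io.StringIO so that csv.reader can treat it like a file. This is a sketch with made-up data; it also shows that a quoted field containing a comma stays in one piece.

```python
import csv
import io

text = 'one,two,three\n"a, b",c,d\n'  # made-up multi-line CSV text

# StringIO gives csv.reader a file-like object to iterate over line by line.
rows = list(csv.reader(io.StringIO(text)))
print(rows)  # [['one', 'two', 'three'], ['a, b', 'c', 'd']]
```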
A simple way to get any row in a CSV file:
import csv
csvFileArray = []
# Open in text mode; the csv module cannot read a file opened with 'rb' in Python 3.
with open('some.csv', newline='') as csvfile:
    for row in csv.reader(csvfile, delimiter=','):
        csvFileArray.append(row)
print(csvFileArray[0])
To print a range of lines, in this case lines 4 to 7 (counting the first line of the file as line 1):
import csv
with open('california_housing_test.csv') as csv_file:
    data = csv.reader(csv_file)
    for row in list(data)[3:7]:  # indexes 3..6 are lines 4..7
        print(row)
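For a large file, building a list of every row just to slice it can be wasteful. A possible alternative is itertools.islice, which stops reading once the last requested line has been seen. In this sketch rows_between is a hypothetical helper and line numbers count from 1 at the top of the file, header included.

```python
import csv
import io
from itertools import islice

def rows_between(csv_file, first, last):
    """Return lines first..last inclusive, counting the first line of the file as 1."""
    return list(islice(csv.reader(csv_file), first - 1, last))

# Made-up data: eight single-column lines, r1 through r8.
sample = io.StringIO(''.join('r{}\n'.format(n) for n in range(1, 9)))
print(rows_between(sample, 4, 7))  # [['r4'], ['r5'], ['r6'], ['r7']]
```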
I think the simplest way is the best way, and in this case (as in most others) that means not using external libraries (pandas) or modules (csv). So, here is the simple answer.
# no need to give any mode, keep it simple
with open('some.csv') as f:
    # store the first line in a variable to be used later
    my_line = f.readline()
# do what you like with my_line now (it is the raw text of the line, not a list of fields)