I want to write some data to a CSV file. Writing itself works fine. The only issue is that I want the header row (the "title") written just once, but it is written every two lines.
Here is my code:
rows = [['IVE_PATH','FPS moyen','FPS max','FPS min','MEDIAN'],[str(listFps[k]),statistics.mean(numberList), max(numberList), min(numberList), statistics.median(numberList)]]
with open("C:\ProgramData\OutilTestObjets3D\MaquetteCB-2019\DataSet\doc.csv", 'a', newline='') as csvfile:
    csv_writer = csv.writer(csvfile, delimiter=';')
    for row in rows:
        csv_writer.writerow(row)
k += 1
I want to have this:
['IVE_PATH','FPS moyen','FPS max','FPS min','MEDIAN']
written only once at the top of the file, and not every two lines.
The solution is to add the Not keyword in the loop:
with open("C:\ProgramData\OutilTestObjets3D\MaquetteCB-2019\DataSet\doc.csv", 'a', newline='') as csvfile:
csv_writer = csv.writer(csvfile, delimiter=';')
for row Not in rows:
csv_writer.writerow(row)
k += 1
It's because you opened the file in append mode ('a') and you are iterating over all the rows each time you write to the file. This means every time you write, you will add both the header and the data to the existing file.
The solution is to separate the writing of the header and the data rows.
One way is to check first if you are writing to an empty file with tell(), and if you are, that's the only time to write the header. Then proceed with iterating over all the rows except for the header.
import csv
rows = [
    ['IVE_PATH','FPS moyen','FPS max','FPS min','MEDIAN'],  # header
    [1,2,3,4,5],  # sample data
    [6,7,8,9,0]   # sample data
]
with open("doc.csv", 'a', newline='') as csvfile:
    csv_writer = csv.writer(csvfile, delimiter=';')
    # Check if we are at the top of an empty file.
    # If yes, then write the header.
    # If no, then assume that the header was already written earlier.
    if csvfile.tell() == 0:
        csv_writer.writerow(rows[0])
    # Iterate over only the data, skip rows[0]
    for row in rows[1:]:
        csv_writer.writerow(row)
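The tell() check can also be wrapped in a small helper so that repeated calls never duplicate the header. This is only a sketch: the function name append_rows, the sample values and the temporary demo path are made up for illustration and are not part of the original code.

```python
import csv
import os
import tempfile


def append_rows(path, header, data_rows, delimiter=';'):
    """Append data_rows to path; write header only if the file is still empty."""
    with open(path, 'a', newline='') as csvfile:
        writer = csv.writer(csvfile, delimiter=delimiter)
        if csvfile.tell() == 0:  # empty or newly created file
            writer.writerow(header)
        writer.writerows(data_rows)


# Demo in a temporary directory (hypothetical sample values).
header = ['IVE_PATH', 'FPS moyen', 'FPS max', 'FPS min', 'MEDIAN']
demo_path = os.path.join(tempfile.mkdtemp(), 'doc.csv')
append_rows(demo_path, header, [['a.ive', 30.5, 60, 12, 31]])
append_rows(demo_path, header, [['b.ive', 28.0, 55, 10, 29]])

with open(demo_path, newline='') as f:
    lines = f.read().splitlines()
print(lines)
```

Calling the helper twice leaves a single header line followed by the two data rows.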
Another way is to check first if the output CSV file exists. If it does not exist yet, create it and write the header row. Then succeeding runs of your code should only append the data rows.
import csv
import os
rows = [
    ['IVE_PATH','FPS moyen','FPS max','FPS min','MEDIAN'],  # header
    [1,2,3,4,5],  # sample data
    [6,7,8,9,0]   # sample data
]
csvpath = "doc.csv"
# If the output file does not exist yet, create it.
# Then write the header row.
if not os.path.exists(csvpath):
    # newline='' avoids blank lines between rows on Windows
    with open(csvpath, "w", newline='') as csvfile:
        csv_writer = csv.writer(csvfile, delimiter=';')
        csv_writer.writerow(rows[0])
with open(csvpath, 'a', newline='') as csvfile:
    csv_writer = csv.writer(csvfile, delimiter=';')
    # Iterate over only the data, skip rows[0]
    for row in rows[1:]:
        csv_writer.writerow(row)
Related
How can I remove a row from a CSV file? This is my code, and I want to delete row 1 of the file. I added del row[1], but it does not do anything. The program runs without error but does not delete row 1.
import csv
with open('grades.csv', 'r') as file:
    grades_reader = csv.reader(file, delimiter=',')
    row_num = 1
    for row in grades_reader:
        print('Row #{}:'.format(row_num), row)
        row_num += 1
        del row[1]
One approach is to write the content to a temp file and then rename it
Ex:
import csv
import os
with open('grades.csv', 'r') as file, open('grades_out.csv', 'w', newline='') as outfile:
    grades_reader = csv.reader(file, delimiter=',')
    grades_reader_out = csv.writer(outfile, delimiter=',')
    header = next(grades_reader)  # Header
    next(grades_reader)  # Skip first row
    grades_reader_out.writerow(header)  # Write header
    for row in grades_reader:
        grades_reader_out.writerow(row)
# Rename
os.rename(..., ...)
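As a rough sketch of the same temp-file-then-rename idea in one self-contained function: the name drop_first_data_row and the sample grades are hypothetical, and os.replace is swapped in for the rename step because it overwrites an existing target on every platform.

```python
import csv
import os
import tempfile


def drop_first_data_row(path):
    """Rewrite the CSV at path without its first data row (the header is kept)."""
    directory = os.path.dirname(os.path.abspath(path))
    # Keep the temp file in the same directory so the final replace
    # does not have to cross filesystems.
    fd, tmp_path = tempfile.mkstemp(suffix='.csv', dir=directory)
    with open(path, newline='') as src, os.fdopen(fd, 'w', newline='') as dst:
        reader = csv.reader(src)
        writer = csv.writer(dst)
        header = next(reader, None)
        if header is not None:
            writer.writerow(header)
            next(reader, None)        # skip the first data row, if there is one
            writer.writerows(reader)  # copy everything else unchanged
    os.replace(tmp_path, path)


# Demo with made-up data in a temporary directory.
demo_path = os.path.join(tempfile.mkdtemp(), 'grades.csv')
with open(demo_path, 'w', newline='') as f:
    csv.writer(f).writerows([['name', 'grade'], ['Ann', '90'], ['Bob', '85']])

drop_first_data_row(demo_path)

with open(demo_path, newline='') as f:
    remaining = list(csv.reader(f))
print(remaining)
```

Both files are closed before the replace, which matters on Windows where an open file cannot be replaced.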
I am trying to add 2 new columns to an existing file in the same program. The csv is generated by the previous function.
After looking at many answers here, I tried this, but it doesn't work because I couldn't find any answers using the csv dict writer in them, they were all about csv writer. This just creates a new file with these 2 columns in them. Can I get some help with this?
for me, sp in zip(meds, specs):
    print(me.text, sp.text)
    dict2 = {"Medicines": me.text, "Specialities": sp.text}
    with open(f'Infusion_t{zip_add}.csv', 'r') as read, \
            open(f'(Infusion_final{zip_add}.csv', 'a+', encoding='utf-8-sig', newline='') as f:
        reader = csv.reader(read)
        w = csv.DictWriter(f, dict2.keys())
        for row in reader:
            if not header_added:
                w.writeheader()
                header_added = True
            row.append(w.writerow(dict2))
You need to append the new columns to row, then write row to the output file. You don't need the dictionary or DictWriter.
You can also open the output file just once before the loop, and write the header there, rather than each time through the main loop.
with open(f'(Infusion_final{zip_add}.csv', 'w', encoding='utf-8-sig', newline='') as f:
    w = csv.writer(f)
    w.writerow(['col1', 'col2', 'col3', ..., 'Medicines', 'Specialities'])  # replace colX with the names of the original columns
    for me, sp in zip(meds, specs):
        print(me.text, sp.text)
        with open(f'Infusion_t{zip_add}.csv', 'r') as read:
            reader = csv.reader(read)
            for row in reader:
                row.append(me.text)
                row.append(sp.text)
                w.writerow(row)
I have Python code for appending data to the same CSV file, but when I append the data, it skips rows and starts from row 15 instead of from row 4.
import csv
with open('csvtask.csv', 'r') as csv_file:
    csv_reader = csv.DictReader(csv_file)
    ls = []
    for line in csv_reader:
        if len(line['Values'])!= 0:
            ls.append(int(line['Values']))
    new_ls = ['','','']
    for i in range(len(ls)-1):
        new_ls.append(ls[i+1]-ls[i])
    print(new_ls)
with open('csvtask.csv','a',newline='') as new_file:
    csv_writer = csv.writer(new_file)
    for i in new_ls:
        csv_writer.writerow(('','','','',i))
    new_file.close()
It's not really feasible to update a file at the same time you're reading it, so a common workaround is to create a new file. The following does that while preserving the fieldnames of the original file. The new column will be named Diff.
There is no previous value from which to calculate a difference for the first row, so the rows are processed with the built-in enumerate() function, which yields the index of each item along with the item itself as the sequence is iterated. You can use the index to tell whether the current row is the first one and handle it specially.
import csv
# Read csv file and calculate values of new column.
with open('csvtask.csv', 'r', newline='') as file:
    reader = csv.DictReader(file)
    fieldnames = reader.fieldnames  # Save for later.
    diffs = []
    prev_value = 0
    for i, row in enumerate(reader):
        row['Values'] = int(row['Values']) if row['Values'] else 0
        diff = row['Values'] - prev_value if i > 0 else ''
        prev_value = row['Values']
        diffs.append(diff)
# Read file again and write an updated file with the column added to it.
fieldnames.append('Diff')  # Name of new field.
with open('csvtask.csv', 'r', newline='') as inp:
    reader = csv.DictReader(inp)
    with open('csvtask_updated.csv', 'w', newline='') as outp:
        writer = csv.DictWriter(outp, fieldnames)
        writer.writeheader()
        for i, row in enumerate(reader):
            row.update({'Diff': diffs[i]})  # Add new column.
            writer.writerow(row)
print('Done')
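The first-row handling with enumerate() can be seen in isolation with a plain list. The values here are made up for illustration and stand in for the Values column.

```python
values = [10, 13, 19, 18]  # hypothetical stand-in for the 'Values' column

diffs = []
prev_value = 0
for i, value in enumerate(values):
    # The first item (index 0) has no predecessor, so it gets an empty cell.
    diffs.append(value - prev_value if i > 0 else '')
    prev_value = value

print(diffs)  # ['', 3, 6, -1]
```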
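You can use the DictWriter class like this (it expects each row to be a dictionary keyed by the field names):
import csv
header = ["data", "values"]
data = [{"data": 1, "values": 2}, {"data": 4, "values": 6}]
with open("out.csv", "w", newline="") as file:  # "out.csv" is a placeholder name
    writer = csv.DictWriter(file, fieldnames=header)
    writer.writeheader()
    writer.writerows(data)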
I am new to Python and coding. I have a large amount of data like the sample below and want to save it in a CSV file with the field names as the header. All fields are separated by ',' and each field has its value on the right-hand side.
For example, in LAIGCINAME="LocalLA", LAIGCINAME is the field and "LocalLA" is the value. My problem is that every line is missing some fields. Can anyone help me handle this in Python, given that the lines do not all have the same fields?
ZXWN:GCI="12345",LAIGCINAME="LocalLA",PROXYLAI=NO,MSCN="11223344",VLRN="11223344",MSAREANAME="0"
ZWGA:GCI="13DADC12",PROXYLAI=NO,MSCVLRTYPE=MSCVLRNUM,MSCN="33223344",VLRN="22334455",MSAREANAME="0",NONBCLAI=NO;
As your data has lots of possible column names, you will first need to parse the whole file to determine a suitable list of names. Once this is done, the header for the output file can be written, followed by all of the data.
By making use of a csv.DictWriter() object, missing entries will be written as empty cells. A restval parameter could be added if another value is needed for missing values, e.g. "N/A".
import csv
header = set()
input_filename = 'input.csv'
output_filename = 'output.csv'
with open(input_filename, newline='') as f_input:
    csv_input = csv.reader(f_input)
    # First determine all possible column names
    for row in csv_input:
        header.update({entry.split('=')[0] for entry in row})
with open(input_filename, newline='') as f_input, open(output_filename, 'w', newline='') as f_output:
    csv_input = csv.reader(f_input)
    csv_output = csv.DictWriter(f_output, fieldnames=sorted(header))
    csv_output.writeheader()
    for row in csv_input:
        output_row = {}
        for entry in row:
            key, value = entry.split('=')
            output_row[key] = value.strip('"')
        csv_output.writerow(output_row)
For the two lines you have given, this would give you an output file as:
LAIGCINAME,MSAREANAME,MSCN,MSCVLRTYPE,NONBCLAI,PROXYLAI,VLRN,ZWGA:GCI,ZXWN:GCI
LocalLA,0,11223344,,,NO,11223344,,12345
,0,33223344,MSCVLRNUM,NO;,NO,22334455,13DADC12,
A csv.DictWriter writes each row from a dictionary, whereas a csv.writer takes a list of items.
The code creates a single dictionary for each row called output_row and then writes it to the output file. By working one row at a time, the script will be able to handle files of any size without running into memory problems.
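A minimal sketch of how restval fills in missing fields, using an in-memory buffer instead of a real file. The shortened field names and the two sample rows are made up for illustration.

```python
import csv
import io

# Two rows that each lack one of the three fields (hypothetical data).
rows = [
    {'GCI': '12345', 'LAIGCINAME': 'LocalLA'},
    {'GCI': '13DADC12', 'NONBCLAI': 'NO'},
]
fieldnames = ['GCI', 'LAIGCINAME', 'NONBCLAI']

buffer = io.StringIO()
# restval is written for any key that a row dictionary does not contain.
writer = csv.DictWriter(buffer, fieldnames=fieldnames, restval='N/A')
writer.writeheader()
writer.writerows(rows)

output_lines = buffer.getvalue().splitlines()
print(output_lines)
```

Without restval, the missing cells would simply be empty strings.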
An alternative approach would be to read the whole file into memory and create a list of dictionaries, one for each row. The header values could be calculated at the same time. This list of dictionaries could then be written in one go.
For example:
import csv
input_filename = 'input.csv'
output_filename = 'output.csv'
header = set()  # Use a set to create unique header values from all rows
output_rows = []  # list of dictionary rows
with open(input_filename, newline='') as f_input:
    csv_input = csv.reader(f_input)
    for row in csv_input:
        output_row = {}
        for entry in row:
            key, value = entry.split('=')
            output_row[key] = value.strip('"')
            header.add(key)
        output_rows.append(output_row)
with open(output_filename, 'w', newline='') as f_output:
    csv_output = csv.DictWriter(f_output, fieldnames=sorted(header))
    csv_output.writeheader()
    csv_output.writerows(output_rows)
Note, this approach would fail if the file is too big (your question mentions that you have large data).
There are a lot of examples of reading CSV data using Python, like this one:
import csv
with open('some.csv', newline='') as f:
    reader = csv.reader(f)
    for row in reader:
        print(row)
I only want to read one line of data and enter it into various variables. How do I do that? I've looked everywhere for a working example.
My code only retrieves the value for i, and none of the other values
reader = csv.reader(csvfile, delimiter=',', quotechar='"')
for row in reader:
    i = int(row[0])
    a1 = int(row[1])
    b1 = int(row[2])
    c1 = int(row[2])
    x1 = int(row[2])
    y1 = int(row[2])
    z1 = int(row[2])
To read only the first row of the csv file use next() on the reader object.
with open('some.csv', newline='') as f:
    reader = csv.reader(f)
    row1 = next(reader)  # gets the first line
    # now do something here
    # if first row is the header, then you can do one more next() to get the next row:
    # row2 = next(reader)
Or:
with open('some.csv', newline='') as f:
    reader = csv.reader(f)
    for row in reader:
        # do something here with `row`
        break
You could get just the header and the first data row like this:
with open('some.csv', newline='') as f:
    csv_reader = csv.reader(f)
    csv_headings = next(csv_reader)
    first_line = next(csv_reader)
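To get that one row into separate variables, as the question asks, the row can be unpacked directly. This is a sketch that assumes a header line followed by a row of exactly seven integer fields. The sample text is made up, and io.StringIO stands in for the opened file.

```python
import csv
import io

# Hypothetical file contents: a header line and one row of seven integers.
sample = io.StringIO("i,a1,b1,c1,x1,y1,z1\n1,2,3,4,5,6,7\n")

reader = csv.reader(sample)
next(reader)  # skip the header row
# Unpacking raises ValueError if the row does not have exactly seven fields.
i, a1, b1, c1, x1, y1, z1 = (int(field) for field in next(reader))

print(i, a1, b1, c1, x1, y1, z1)  # 1 2 3 4 5 6 7
```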
You can use the pandas library to read only the first few rows of a huge dataset.
import pandas as pd
data = pd.read_csv("names.csv", nrows=1)
The nrows parameter sets the number of rows to read.
Just for reference, a for loop can be used after getting the first row to get the rest of the file:
with open('file.csv', newline='') as f:
    reader = csv.reader(f)
    row1 = next(reader)  # gets the first line
    for row in reader:
        print(row)  # prints rows 2 and onward
From the Python documentation:
And while the module doesn’t directly support parsing strings, it can easily be done:
import csv
for row in csv.reader(['one,two,three']):
    print(row)
Just drop your string data into a singleton list.
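When the string holds several lines, one option is to wrap it in io.StringIO so that csv.reader sees it as a file-like object. A small sketch with made-up sample text:

```python
import csv
import io

# Hypothetical multi-line CSV text, including a quoted field with a comma.
text = 'one,two,three\n"a, quoted",b,c\n'

rows_from_string = list(csv.reader(io.StringIO(text)))
print(rows_from_string)
# [['one', 'two', 'three'], ['a, quoted', 'b', 'c']]
```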
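A simple way to get any row of a CSV file:
import csv
csvFileArray = []
# Open in text mode with newline=''; binary mode ('rb') only works with the Python 2 csv module.
with open('some.csv', newline='') as csvfile:
    for row in csv.reader(csvfile, delimiter = '.'):
        csvFileArray.append(row)
print(csvFileArray[0])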
To print a range of lines, in this case the rows at zero-based indices 4 to 6 (the fifth through seventh lines of the file):
import csv
with open('california_housing_test.csv') as csv_file:
    data = csv.reader(csv_file)
    for row in list(data)[4:7]:
        print(row)
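For large files, itertools.islice gives the same slice without first loading every row into a list. A sketch using generated sample data in place of a real file:

```python
import csv
import io
from itertools import islice

# Ten made-up rows: row0,0 ... row9,9
sample = io.StringIO("\n".join(f"row{n},{n}" for n in range(10)) + "\n")

reader = csv.reader(sample)
# islice(reader, 4, 7) yields the rows at zero-based indices 4, 5 and 6
# and stops reading after that.
selected = list(islice(reader, 4, 7))
print(selected)
# [['row4', '4'], ['row5', '5'], ['row6', '6']]
```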
I think the simplest way is the best way, and in this case (as in most others) that means not using external libraries (pandas) or modules (csv). So, here is the simple answer.
# No need to give any mode; keep it simple.
with open('some.csv') as f:
    # Store the first line in a variable to be used later.
    my_line = f.readline()
# Do what you like with my_line now.