add row to a csv file with python

I have two csv files that each have 2 columns, one of them being the date. I want to add the second column of the second file to the first file, resulting in a file with 3 columns.
I did it by creating a new file and appending the data to it this way:
import csv
coinsfile = open('total-bitcoins.csv', newline='')
pricesfile = open('market-price.csv', newline='')
coins = csv.reader(coinsfile, delimiter=',')
prices = csv.reader(pricesfile, delimiter=',')
with open('result.csv', 'w') as res:
    for coin_row, price_row in zip(coins, prices):
        line = str(coin_row[0]) + ',' + str(coin_row[1]) + ',' + str(price_row[1])
        res.append(line)
The code runs without any errors but the result is a csv file which is completely empty.
Where am I making the mistake, or is there a better way to do this job?

res is a file handle, and file objects have no append method. So an AttributeError is raised while the output file is open, which results in an empty output file. (Or, yes, one of the input files is empty, ending zip immediately, but this answer explains how to fix the next issues.)
A quickfix would be:
res.write(line+"\n")
but the best way would be to flatten the result of zip and feed it to a csv.writer object (using a comprehension to generate each row by addition of both input csv rows)
import csv
with open('result.csv', 'w', newline='') as res, \
     open('total-bitcoins.csv', newline='') as coinsfile, \
     open('market-price.csv', newline='') as pricesfile:
    coins = csv.reader(coinsfile)
    prices = csv.reader(pricesfile)
    cw = csv.writer(res)
    cw.writerows(coin_row + price_row for coin_row, price_row in zip(coins, prices))
Note that newline="" is required when writing your files (Python 3) to avoid the infamous blank-line "bug" on Windows.
I have added the input files to the with statement to ensure that the inputs are closed when exiting it, and removed the delimiter parameter since comma is the default.

The easiest way to satisfy this need would be using a library like pandas. Using pandas, adding a column to an existing file would be as easy as loading the file into a dataframe, and adding the required column to it in just one line.
Adding can be done by mere assignment, or through join/merge methods.
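A rough sketch of that pandas approach; the column names and the tiny sample inputs below are assumptions for illustration, not from the original question:

```python
import pandas as pd

# Tiny sample inputs standing in for the real files
with open('total-bitcoins.csv', 'w') as f:
    f.write('2018-01-01,100\n2018-01-02,110\n')
with open('market-price.csv', 'w') as f:
    f.write('2018-01-01,9000\n2018-01-02,9500\n')

# Load both two-column files, naming the columns ourselves
coins = pd.read_csv('total-bitcoins.csv', names=['date', 'coins'])
prices = pd.read_csv('market-price.csv', names=['date', 'price'])

# Merge on the shared date column and write the three-column result
merged = coins.merge(prices, on='date')
merged.to_csv('result.csv', index=False)
```

Merging on the date column (rather than blindly zipping rows) also keeps the data aligned if the two files ever list dates in a different order.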

Related

Replace comma with semicolon when creating Csv Dataframe

I have a code that creates a csv file. When I first open it, everything is in one column, so I have to do the usual: go to Data and split the text into columns manually. The data is then split into columns.
I work with Office 365, and recently I was told that if I replace the commas with semicolons, then when I open the newly created csv file, Excel will automatically open it already separated into columns.
I'm asking for some advice here, since having to do this process for every created csv file is really time consuming.
I'm looking for a way to alter my code so it does this automatically: instead of separating columns with commas, do it with semicolons in this case, just to see if this works out.
with open('created.csv', 'w', newline='') as f:
    writer = csv.writer(f)
If you want to transform an already existing file, you can do it like this:
with open('created.csv', 'r', encoding='utf-8') as f_in, open("outfile.csv", 'w') as f_out:
    for line in f_in:
        line = line.split(",")
        line = ";".join(line)
        f_out.write(line)
In case you already have a dataframe, you can do it like @jezrael said in the comment with:
df.to_csv('created.csv', sep=';')
As mentioned in the comment, you are already using the csv module to write your file. You have to change this line in your code:
writer = csv.writer(f)
to
writer = csv.writer(f, delimiter=';')
For me, if I open a csv delimited with "," I have to do the steps you described in your question, but if I open a csv delimited with ";" it's already in the right columns.
This is (for Windows users at least) dependent on your region settings, so it can be different for everyone depending on their language settings.
You can check them here and also change it if you want:
https://www.itsupportguides.com/knowledge-base/office-2013/excel-20132016-how-to-change-csv-delimiter-character/

CSV.writer each set entry on a new line

Environment: Python 3.7 on Windows
Goal: Write out a set to a .csv file, with each set entry on a new line.
Problem: Each set entry is not on a new line... when I open the CSV file in Excel, every set entry is in a separate column, rather than a separate row.
Question: What do I need to do to get each set entry written on a new line?
import csv
test_set = {'http://www.apple.com', 'http://www.amazon.com', 'http://www.microsoft.com', 'https://www.ibm.com'}
with open('output.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerows([test_set])
f.close()
You passed writer.writerows() a list with a single element, and so it wrote a single row.
You need to convert your set to a series of rows; each row a list with the row contents. You could use a generator expression to produce the rows:
writer.writerows([value] for value in test_set)
However, you are not really producing a CSV here. With a single column, you may as well just write the set contents directly to a file with newlines in between. The print() function can be co-opted for this task:
with open('output.csv', 'w') as f:
    print(*test_set, sep='\n', file=f)

Combining multiple csv files into one csv file

I am trying to combine multiple csv files into one, and have tried a number of methods but I am struggling.
I import the data from multiple csv files, and when I compile them together into one csv file, the first few rows get filled out nicely, but then it starts randomly inserting a variable number of blank rows between the data rows. It also never finishes filling out the combined csv file; data just keeps being added to it, which does not make sense to me because I am compiling a finite amount of data.
I have already tried writing close statements for the file, and I still get the same result: my designated combined csv file never stops receiving data, and the data is randomly spaced throughout the file. I just want a normally compiled csv.
Is there an error in my code? Is there any explanation as to why my csv file is behaving this way?
csv_file_list = glob.glob(Dir + '/*.csv') #returns the file list
print (csv_file_list)
with open(Avg_Dir + '.csv', 'w') as f:
    wf = csv.writer(f, delimiter=',')
    print(f)
    for files in csv_file_list:
        rd = csv.reader(open(files, 'r'), delimiter=',')
        for row in rd:
            print(row)
            wf.writerow(row)
Your code works for me.
Alternatively, you can merge files as follows:
csv_file_list = glob.glob(Dir + '/*.csv')
with open(Avg_Dir + '.csv', 'w') as wf:
    for file in csv_file_list:
        with open(file) as rf:
            for line in rf:
                if line.strip():  # if line is not empty
                    if not line.endswith("\n"):
                        line += "\n"
                    wf.write(line)
Or, if the files are not too large, you can read each file at once. But in this case all empty lines and headers will be copied:
csv_file_list = glob.glob(Dir + '/*.csv')
with open(Avg_Dir + '.csv', 'w') as wf:
    for file in csv_file_list:
        with open(file) as rf:
            wf.write(rf.read().strip() + "\n")
Consider several adjustments:
Use a context manager, with, for both the read and write processes. This avoids the need to close() the file objects, which you currently do not do for the read objects.
For the skipped-lines issue: use either the argument newline='' in open() or the lineterminator="\n" argument in csv.writer(). See SO answers for the former and the latter.
Use os.path.join() to properly concatenate folder and file paths. This method is os-agnostic, so it accounts for Windows and Unix machines using backslashes or forward slashes.
Adjusted script:
import os
import csv, glob
Dir = r"C:\Path\To\Source"
Avg_Dir = r"C:\Path\To\Destination\Output"
csv_file_list = glob.glob(os.path.join(Dir, '*.csv')) # returns the file list
print (csv_file_list)
with open(os.path.join(Avg_Dir, 'Output.csv'), 'w', newline='') as f:
    wf = csv.writer(f, lineterminator='\n')
    for files in csv_file_list:
        with open(files, 'r') as r:
            next(r)  # SKIP HEADERS
            rr = csv.reader(r)
            for row in rr:
                wf.writerow(row)

How do I make the filename of the output csv file equal to the content of a column

I have a huge csv file with all our student rosters inside of it. So, 1) I want to separate the rosters into smaller csv files based on the course name. 2) If I can have the output csv file's name be equal to the course name (example: Algebra1.csv), that would make my life so much better. Is it possible to iterate through the courses column of the csv file and, when the name of the course changes, make a new csv file for that course? I think I could read the keys of the dictionary read_rosters and then do a while loop?
An example of the csv input file would look like this:
Student firstname, Student lastname, Class Instructor, Course name, primary learning center
johnny, doe, smith, algebra1, online
jane, doe, austin, geometry, campus
Here is what I have so far:
import os
import csv
path = "/PATH/TO/FILE"
with open(os.path.join(path, "student_rosters.csv"), "rU") as rosters:
    read_rosters = csv.DictReader(rosters)
    for row in read_rosters:
        course_name = row['COURSES_COLUMN_HEADER']
        csv_file = os.path.join(course_name, ".csv")
        course_csv = csv.writer(open(csv_file, 'wb').next()
In your current code, you're opening an output csv file for each line you read. This will be slow, and, as you've currently written it, it won't work. That's because using the "wb" mode when you open the file erases everything that was in the file before. You might use an "a" mode, but this will still be slow.
How you can best solve the problem depends a bit on your data. If you can rely upon the input always having the rows with the same course next to one another, you could use groupby from the itertools module to easily write the appropriate lines out together:
from itertools import groupby
from operator import itemgetter
with open(os.path.join(path, "student_rosters.csv"), "rb") as rosters:
    reader = csv.DictReader(rosters)
    for course, rows in groupby(reader, itemgetter('COURSES_COLUMN_HEADER')):
        with open(os.path.join(path, course + ".csv"), "wb") as outfile:
            writer = csv.DictWriter(outfile, reader.fieldnames)
            writer.writerows(rows)
If you can't rely upon the organization of the rows, you have a couple options. One would be to read all the rows into a list, then sort them by course and use itertools.groupby like in the code above.
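A minimal sketch of that sort-then-group option, written in Python 3 style with a hypothetical in-memory roster (the column names here are illustrative, not the asker's actual headers):

```python
import csv
import io
from itertools import groupby
from operator import itemgetter

# Hypothetical roster standing in for student_rosters.csv
data = io.StringIO(
    "name,course\n"
    "jane,geometry\n"
    "johnny,algebra1\n"
    "jake,geometry\n"
)

reader = csv.DictReader(data)
# Sort all rows by course so groupby sees each course as one contiguous run
rows = sorted(reader, key=itemgetter('course'))

outputs = {}
for course, group in groupby(rows, itemgetter('course')):
    # In a real script: open(course + '.csv', 'w', newline='') instead of StringIO
    out = io.StringIO()
    writer = csv.DictWriter(out, ['name', 'course'])
    writer.writerows(group)
    outputs[course] = out.getvalue()
```

The sort costs memory proportional to the file, but afterwards the grouping pass is the same one-file-at-a-time loop as in the first snippet.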
Another option would be to keep reading just one line at a time, with each output row going into an appropriate file. I'd suggest keeping a dictionary of writer objects, indexed by course name. Here's what that could look like:
writers = {}
with open(os.path.join(path, "student_rosters.csv"), "rb") as rosters:
    reader = csv.DictReader(rosters)
    for row in reader:
        course = row['COURSES_COLUMN_HEADER']
        if course not in writers:
            outfile = open(os.path.join(path, course + ".csv"), "wb")
            writers[course] = csv.DictWriter(outfile, reader.fieldnames)
        writers[course].writerow(row)
If you were using this in production, you'd probably want to add some code to close the files after you were done with them, since you can't use with statements to close them automatically.
In my example codes above, I've made the code write out the full rows, just as they were in the input. If you don't want that, you can change the second argument to DictWriter to a sequence of the column names you want to write. You'll also want to include the parameter extrasaction="ignore" so that the extra values in the row dicts will be ignored when the columns you do want are written.
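For instance, a small sketch with hypothetical column names, keeping only two of the three columns (here writing to an in-memory buffer rather than a file):

```python
import csv
import io

# Hypothetical row with one column we don't want in the output
rows = [{'name': 'johnny', 'course': 'algebra1', 'center': 'online'}]

out = io.StringIO()
# Only the listed fieldnames are written; extra dict keys are silently dropped
writer = csv.DictWriter(out, ['name', 'course'], extrasaction='ignore')
writer.writeheader()
writer.writerows(rows)
```

Without extrasaction='ignore', DictWriter raises a ValueError when a row dict contains keys missing from the fieldnames list.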
First, this is not what you want:
csv_file = os.path.join(course_name, ".csv")
It will create a file named .csv in a subdirectory named course_name. You likely want something like:
csv_file = os.path.join(path, course_name + ".csv")
Also, the following has two issues: (a) unbalanced parens and (b) writer objects don't have a next method:
course_csv = csv.writer(open(csv_file, 'wb').next()
Try instead:
course_csv = csv.writer(open(csv_file, 'wb'))
And then you need to write something of your choosing to the new file, probably using the writerow or writerows method (note that plain csv.writer objects have no writeheader; that is a DictWriter method):
course_csv.writerow(header_of_your_choosing)
course_csv.writerow(something_else_of_your_choosing)

Trying to import a list of words using csv (Python 2.7)

import csv, Tkinter

with open('most_common_words.csv') as csv_file:  # Opens the file in a context manager so that when it's finished it's automatically closed
    csv_reader = csv.reader(csv_file)  # Create a csv reader instance
    for row in csv_reader:  # Read each line in the csv file into 'row' as a list
        print row[0]  # Print the first item in the list
I'm trying to import this list of most common words using csv. It continues to give me the same error
for row in csv_reader: # Read each line in the csv file into 'row' as a list
Error: new-line character seen in unquoted field - do you need to open the file in universal-newline mode?
I've tried a couple different ways to do it as well, but they didn't work either. Any suggestions?
Also, where does this file need to be saved? Is it okay just being in the same folder as the program?
You should always open a CSV file in binary mode under Python 2 (or, under Python 3, in text mode with newline=''). Also, make sure that the delimiter and quote characters are , and ", or you'll need to specify otherwise:
with open('most_common_words.csv', 'rb') as csv_file:
    csv_reader = csv.reader(csv_file, delimiter=';', quotechar='"')  # for EU CSV
You can save the file in the same folder as your program. If you don't, you can provide the correct path to open() as well. Be sure to use raw strings if you're on Windows, otherwise the backslashes may trick you: open(r"C:\Python27\data\table.csv")
It seems you have a file with one column as you say here:
It is a simple list of words. When I open it up, it opens into Excel
with one column and 500 rows of 500 different words.
If so, you don't need the csv module at all:
with open('most_common_words.csv') as f:
    rows = list(f)
Note in this case, each item of the list will have the newline appended to it, so if your file is:
apple
dog
cat
rows will be ['apple\n', 'dog\n', 'cat\n']
If you want to strip the end of line, then you can do this:
with open('most_common_words.csv') as f:
    rows = list(i.rstrip() for i in f)
