Create and append CSV row by row - python

I would just like to create a csv file and at the same time add my data row by row with a for loop.
for x in y:
    newRow = "\n%s,%s\n" % (sentence1, sentence2)
    with open('Mydata.csv', "a") as f:
        f.write(newRow)
After the above process, I tried to read the csv file but I can't separate the columns. It seems that there is only one column. Maybe I did something wrong in the csv creation process?
colnames = ['A_sentence', 'B_sentence']
Mydata = pd.read_csv(Mydata, names=colnames, delimiter=";")
print(Mydata['A_sntence']) #output Nan

When you are writing the file, it looks like you are using commas as separators, but when reading the file you are using semicolons (probably just a typo). Change delimiter=";" to delimiter="," and it should work.
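A minimal sketch of the whole round trip, assuming two placeholder sentence pairs and a temporary file path (both hypothetical, not from the question). It uses the standard csv module so that the writer and the reader agree on the delimiter; it also avoids the leading "\n" in the format string, which would otherwise put an empty line before every row.

```python
import csv
import os
import tempfile

# Hypothetical data standing in for the (sentence1, sentence2) pairs.
pairs = [("first A", "first B"), ("second A", "second B")]

# Temporary location used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "Mydata.csv")

# Append one row per iteration; mode "a" creates the file if it does not exist.
for sentence1, sentence2 in pairs:
    with open(path, "a", newline="") as f:
        csv.writer(f, delimiter=",").writerow([sentence1, sentence2])

# Read back with the SAME delimiter that was used for writing.
with open(path, newline="") as f:
    rows = list(csv.reader(f, delimiter=","))

print(rows)
```

The same file should also load with pd.read_csv(path, names=colnames, delimiter=","), since the delimiter now matches the one used when writing.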

Related

Write a Nested python list into csv file

I have below code to write my nested list into a csv file. The nested list looks like this
[['19181011', '13041519', '22121605', '11142007', '23000114'],
['1523141612', '2403051513', '0806022324', '1614012422', '0516121805'],
['23201621', '24171811', '08231524', '16011022', '17131220'],
['2317241822', '2220112421', '1124052211', '1010192318', '2108231524'],
['11220215', '24240507', '19180423', '07081422', '21201224']]
with open('MLpredictions.csv', 'w') as f:
    writer = csv.writer(f, delimiter=';', lineterminator='\n')
    writer.writerows(high5_pred)
But when I execute this code, I get the following in the csv file:
19181011;13041519;22121605;11142007;23000114
1523141612;2403051513;0806022324;1614012422;0516121805....
I changed the delimiter to ',' but then I get 5 different columns.
I want each list to be one row, separated by ',' and not ';'.
Expected output, a single column:
19181011,13041519,22121605,11142007,23000114
1523141612,2403051513,0806022324,1614012422,0516121805
Any ideas how to do this?
Assuming that there is a specific reason why you want the data all in one column:
You're getting separate columns because you're using the csv format and your data is not escaped. Your raw file looks like this:
19181011,13041519,22121605,11142007,23000114
1523141612,2403051513,0806022324,1614012422,0516121805
but you need it to look like this:
"19181011,13041519,22121605,11142007,23000114"
"1523141612,2403051513,0806022324,1614012422,0516121805"
You're probably best to create a string object for each "row" of your output file. I'd do the following:
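import csv

with open('MLpredictions.csv', 'w') as f:
    # Default delimiter is ','; a field that itself contains commas is quoted automatically
    writer = csv.writer(f, lineterminator='\n')
    # Each row must be a one-element list, otherwise the string is split into single characters
    rows = [[','.join(str(number) for number in row)] for row in high5_pred]
    writer.writerows(rows)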
Note: unless you have a good reason why you don't want these numbers in different columns, I'd leave your code as is. It will be a lot easier to deal with the native csv format.
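To see the quoting in action without touching the disk, here is a small sketch that writes to an in-memory buffer. The two-row high5_pred is a shortened stand-in for the list in the question, and the variable names buffer, text and parsed are only illustrative.

```python
import csv
import io

# Shortened stand-in for the nested list from the question.
high5_pred = [['19181011', '13041519'], ['1523141612', '2403051513']]

buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator='\n')
# One-element rows: the joined string is a single field, so csv quotes it
# because it contains the delimiter.
writer.writerows([[','.join(row)] for row in high5_pred])

text = buffer.getvalue()
print(text)

# Reading the text back yields exactly one column per row.
parsed = list(csv.reader(io.StringIO(text)))
print(parsed)
```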

How to replace one value with another in a csv file?

I have a CSV file with information and want to replace the information in a specific location with a new value.
For example if my CSV file looks like this:
example1,example2,0
example3,example4,0
example5,example6,0
Note that each row is labelled for example:
test = row[0]
test1 = row[1]
test2 = row[2]
If I want to replace
test[0]
with a new value how would I go about doing it?
The simplest way, without installing any additional package, is to use the built-in csv module to read the whole file into a list of rows and replace the desired element.
Here is code that would do just that:
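import csv

# newline='' lets the csv module handle line endings itself (Python 3)
with open('test.csv', 'r', newline='') as in_file, open('test_out.csv', 'w', newline='') as out_file:
    data = [row for row in csv.reader(in_file)]  # whole file as a list of rows
    data[0][0] = 'new value'  # first row, first column
    writer = csv.writer(out_file)
    writer.writerows(data)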
There are a handful of ways to do this, but personally I'm a big fan of pandas. With pandas, you can read a csv file that has no header row with df = pd.read_csv('path_to_file.csv', header=None). Then make whatever changes you want; to set row 1, column 1, use df.iloc[0, 0] = new_val. When you are done, save to the same file with df.to_csv('path_to_file.csv', index=False, header=False).

Only outputting a few lines into a text file, instead of all of them

I've made a Python script that grabs information from a .csv archive, and outputs it into a text file as a list. The original csv file has over 200,000 fields to input and output from, yet when I run my program it only outputs 36 into the .txt file.
Here's the code:
import csv
with open('OriginalFile.csv', 'r') as csvfile:
    emailreader = csv.reader(csvfile)
    f = open('text.txt', 'a')
    for row in emailreader:
        f.write(row[1] + "\n")
And the text file only lists up to 36 strings. How can I fix this? Is maybe the original csv file too big?
After many comments, it turned out that the original problem was the character encoding of the csv file. If you specify the encoding in pandas, it will read the file just fine.
Any time you are dealing with a csv file (or Excel, SQL or R data) I would use pandas DataFrames. The syntax is shorter and it is easier to see what is going on.
import pandas as pd
csvframe = pd.read_csv('OriginalFile.csv', encoding='utf-8')
with open('text.txt', 'a') as output:
    # I think what you wanted was the 2nd column from each row.
    # iloc selects by position: ":" means all rows, 1 is the second column (0-based).
    output.write('\n'.join(csvframe.iloc[:, 1].astype(str)))
You might have luck with something like the following:
import csv

with open('OriginalFile.csv', 'r') as csvfile:
    emailreader = csv.reader(csvfile)
    with open('text.txt', 'w') as output:
        for line in emailreader:
            output.write(line[1] + '\n')
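Since the root cause turned out to be character encoding, the same loop can also be written with the encoding stated explicitly. This is a sketch under assumptions: the helper name extract_column, the 'utf-8' default and the errors='replace' policy are illustrative choices, not something from the thread, and the demo file is created only for this example. Use whatever encoding the file was actually saved in.

```python
import csv
import os
import tempfile

def extract_column(src, dst, column=1, encoding='utf-8'):
    """Write one column of a CSV file to a text file, one value per line."""
    count = 0
    # errors='replace' substitutes undecodable bytes instead of raising mid-file.
    with open(src, 'r', encoding=encoding, errors='replace', newline='') as csvfile, \
         open(dst, 'w', encoding='utf-8') as output:
        for row in csv.reader(csvfile):
            if len(row) > column:  # skip rows that are too short or empty
                output.write(row[column] + '\n')
                count += 1
    return count

# Small demonstration with a temporary file containing a non-ASCII character.
tmp = tempfile.mkdtemp()
src = os.path.join(tmp, 'OriginalFile.csv')
dst = os.path.join(tmp, 'text.txt')
with open(src, 'w', encoding='utf-8', newline='') as f:
    f.write('1,a@example.com\n2,jos\u00e9@example.com\n')

written = extract_column(src, dst)
with open(dst, encoding='utf-8') as f:
    lines = f.read().splitlines()
print(written, lines)
```

Returning the number of written rows makes it easy to compare against the expected row count, which is the kind of check that would have exposed the "only 36 lines" symptom early.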

How to copy multiple rows and one column from one CSV file to another CSV Excel?

I am extremely new to Python (and to coding, for that matter).
Could I please get some help with how to achieve this? I have gone through numerous threads but nothing helped.
My input file looks like this:
I want my output file to look like this:
I just want the first column replicated twice in the second Excel sheet, with a blank line after every 5 rows.
A .csv file can be opened with a normal text editor. Do this and you'll see that the entries for each column are comma-separated (csv = comma-separated values). In your file the separator is most likely a semicolon (;), though.
Since you're new to coding, I recommend trying it manually with a text editor first until you have the desired output, and then try to replicate it with python.
Also, you should post code examples here and ask specific questions about why it doesn't work like you expected it to work.
Below is the solution. Don't forget to configure input/output files and the delimiter:
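# raw strings, so the backslashes in the Windows paths are not treated as escapes
input_file = r'c:\Temp\input.csv'
output_file = r'c:\Temp\output.csv'
delimiter = ';'
i = 0
output_data = ''
with open(input_file) as f:
    for line in f:
        i += 1
        value = line.strip()
        # write the value twice, separated by the delimiter
        output_data += value + delimiter + value + '\n'
        if i == 5:
            # empty line after every 5 rows
            output_data += '\n'
            i = 0
with open(output_file, 'w') as file_:
    file_.write(output_data)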
Python has a csv module for doing this. It is able to automatically read each row into a list of columns. It is then possible to simply take the first element and replicate it into the second column in an output file.
import csv

# newline='' lets the csv module handle line endings itself (Python 3)
with open('input.csv', 'r', newline='') as f_input:
    csv_input = csv.reader(f_input)
    input_rows = list(csv_input)

with open('output.csv', 'w', newline='') as f_output:
    csv_output = csv.writer(f_output)
    for line, row in enumerate(input_rows, start=1):
        csv_output.writerow([row[0], row[0]])
        if line % 5 == 0:
            csv_output.writerow([])  # empty row after every 5 rows
Note that it is not advisable to write the updated data directly over the input file: if there were a problem, you would lose your original file.
If your input file has multiple columns, this script will remove them and simply duplicate the first column.
By default, the csv format separates columns with a comma. This can be changed by specifying the desired delimiter as follows:
csv_output = csv.writer(f_output, delimiter=';')
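If the result does have to end up under the original file name, one common pattern is to write to a temporary file in the same directory first and swap it in only once writing has succeeded. A sketch, assuming the hypothetical helper name duplicate_first_column and a small demo file created just for illustration:

```python
import csv
import os
import tempfile

def duplicate_first_column(path, delimiter=','):
    """Rewrite path so each row holds its first column twice,
    with an empty row after every 5 rows."""
    directory = os.path.dirname(os.path.abspath(path))
    # Temporary file in the same directory, so os.replace stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix='.tmp')
    try:
        with os.fdopen(fd, 'w', newline='') as f_output, \
             open(path, newline='') as f_input:
            writer = csv.writer(f_output, delimiter=delimiter)
            reader = csv.reader(f_input, delimiter=delimiter)
            for line, row in enumerate(reader, start=1):
                writer.writerow([row[0], row[0]])
                if line % 5 == 0:
                    writer.writerow([])
        # Both files are closed here; swap in the new file only after success.
        os.replace(tmp_path, path)
    except Exception:
        os.remove(tmp_path)  # leave the original untouched on failure
        raise

# Demo with six rows of two columns each.
demo = os.path.join(tempfile.mkdtemp(), 'input.csv')
with open(demo, 'w', newline='') as f:
    f.write('\n'.join('v%d,x' % n for n in range(1, 7)) + '\n')

duplicate_first_column(demo)
with open(demo, newline='') as f:
    result = f.read().splitlines()
print(result)
```

The sketch assumes every input row has at least one column; an empty line in the input would need an extra check before row[0] is accessed.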

remove selected csv column in python

I have a variable that contains a string of:
fruit_wanted = 'banana,apple'
I also have a csv file
fruit,'orange','grape','banana','mango','apple','strawberry'
number,1,2,3,4,5,6
value,3,2,2,4,2,1
price,3,2,1,2,3,4
Now how do I delete the columns whose 'fruit' is not listed in the 'fruit_wanted' variable?
So that the outfile would look like
fruit,'banana','apple'
number,3,5
value,2,2
price,1,3
Thank you.
Read the csv file using the DictReader() class, and ignore the columns you don't want:
import csv

fruit_wanted = ['fruit'] + ["'%s'" % f for f in fruit_wanted.split(',')]
with open(inputfile, newline='') as f_in, open(outputfile, 'w', newline='') as f_out:
    outfile = csv.DictWriter(f_out, fieldnames=fruit_wanted)
    outfile.writeheader()  # the header row is not written automatically
    wanted = set(fruit_wanted)
    for row in csv.DictReader(f_in):
        # keep only the wanted columns
        row = {k: row[k] for k in row if k in wanted}
        outfile.writerow(row)
Here's some pseudocode:
open the original CSV for input, and the new one for output
read the first row of the original CSV and figure out which columns you want to delete
write the modified first row to the output CSV
for each row in the input CSV:
    delete the columns you figured out before
    write the modified row to the output CSV
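The pseudocode above could be turned into something like the following sketch, which uses the plain csv.reader and keeps columns by index. The function name keep_columns and the in-memory buffers are illustrative assumptions; the sample rows are the ones from the question.

```python
import csv
import io

def keep_columns(reader, writer, wanted):
    """Copy rows from reader to writer, keeping column 0 (the row label)
    and every column whose header name is in wanted."""
    header = next(reader)
    # Header names in the sample file are wrapped in single quotes, so strip them.
    keep = [i for i, name in enumerate(header)
            if i == 0 or name.strip("'") in wanted]
    writer.writerow([header[i] for i in keep])
    for row in reader:
        writer.writerow([row[i] for i in keep])

source = io.StringIO(
    "fruit,'orange','grape','banana','mango','apple','strawberry'\n"
    "number,1,2,3,4,5,6\n"
    "value,3,2,2,4,2,1\n"
    "price,3,2,1,2,3,4\n"
)
target = io.StringIO()
fruit_wanted = 'banana,apple'

keep_columns(csv.reader(source),
             csv.writer(target, lineterminator='\n'),
             set(fruit_wanted.split(',')))
output = target.getvalue()
print(output)
```

Note that this keeps the columns in their original file order rather than in the order given in fruit_wanted; for the sample data the two happen to coincide.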
