I have a CSV file rsvps1.csv:
_id | event_id | comments
1   | x        | hello..
2   | y        | bye
3   | y        | hey
4   | z        | hi
My question is:
For each event e how can I get the comments written to a separate text file?
There is some error with the following code:
import csv

with open('rsvps1.csv', 'rU') as f:
    reader = csv.DictReader(f, delimiter=',')
    rows = list(reader)

fi = open('rsvp.txt', 'wb')
k = 0
for row in rows:
    if k == row['event_id']:
        fi.write(row['comment'] + "\n")
    else:
        fi.write(row['event_id'] + "\t")
        fi.write(row['comment'] + "\n")
        k = row['event_id']
f.close()
fi.close()
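For reference, a minimal corrected sketch of the snippet above (assuming the column is named 'comments' as in the header, and that plain text-mode output is intended) might look like this:
import csv

with open('rsvps1.csv', 'r') as f:
    rows = list(csv.DictReader(f, delimiter=','))

with open('rsvp.txt', 'w') as fi:  # text mode, since we write str
    k = None
    for row in rows:
        if k != row['event_id']:
            fi.write(row['event_id'] + "\t")
            k = row['event_id']
        fi.write(row['comments'] + "\n")  # the header says 'comments', not 'comment'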
I think it is best to just forget you're working with a csv file and think of it as a normal file in which you can do the following.
with open('file.csv', 'r') as f:
    lines = f.readlines()

for line in lines:
    if not line.startswith('_id'):
        line_values = line.split(',')
        with open('%s.txt' % line_values[1], 'a') as fp:
            fp.write(line_values[2] + '\n')
Splitting the csv File
Given a file rsvps1.csv with this content:
_id,event_id,comments
1,x,hello
2,y,bye
3,y,hey
4,z,hi
This:
import csv
import itertools as it
from operator import itemgetter

with open('rsvps1.csv') as fin:
    fieldnames = next(csv.reader(fin))
    fin.seek(0)
    rows = list(csv.DictReader(fin))

for event_id, event in it.groupby(rows, key=itemgetter('event_id')):
    with open('event_{}.txt'.format(event_id), 'w') as fout:
        csv_out = csv.DictWriter(fout, fieldnames)
        csv_out.writeheader()
        csv_out.writerows(event)
splits it into three files:
event_x.txt
_id,event_id,comments
1,x,hello
event_y.txt
_id,event_id,comments
2,y,bye
3,y,hey
and event_z.txt
_id,event_id,comments
4,z,hi
Adapt the output to your needs.
Only Comments
If you do not want a csv as output, this becomes simpler:
import csv
import itertools as it
from operator import itemgetter

with open('rsvps1.csv') as fin:
    rows = list(csv.DictReader(fin))

for event_id, event in it.groupby(rows, key=itemgetter('event_id')):
    with open('event_{}_comments.txt'.format(event_id), 'w') as fout:
        for item in event:
            fout.write('{}\n'.format(item['comments']))
Now event_y_comments.txt has this content:
bye
hey
I would suggest that you use pandas as your import tool. It creates a clear data structure from your CSV file, similar to a spreadsheet in MS Excel. You could then use iterrows to loop over your event_ids and process your comments.
import pandas as pd

data = pd.read_csv('rsvps1.csv', sep=',')
for index, row in data.iterrows():
    print(row['event_id'], row['comments'])  # Python 3.x
However, I am not sure what you want to write to the file. Just the comments for all event_ids? The complete 'comments' column can be exported to a separate file with
data.to_csv('output.csv', columns=['comments'])
Additional information in response to the comment:
When you want to save only the comments that share the same event_id, you first have to select the corresponding rows. This is done by
selected_data = data[data['event_id'] == 'x']
for the event_id 'x'. selected_data now contains a DataFrame that only holds rows that have 'x' in the 'event_id' column. You can then loop through this DataFrame as shown above.
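For example, a minimal sketch (assuming the column names 'event_id' and 'comments' from the sample data) that uses groupby to write one comments file per event could look like this:
import pandas as pd

data = pd.read_csv('rsvps1.csv', sep=',')
# groupby collects all rows that share the same event_id
for event_id, group in data.groupby('event_id'):
    with open('event_{}_comments.txt'.format(event_id), 'w') as fout:
        for comment in group['comments']:
            fout.write('{}\n'.format(comment))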
Related
I have several CSV files that look like this:
Input
Name Code
blackberry 1
wineberry 2
rasberry 1
blueberry 1
mulberry 2
I would like to add a new column to all CSV files so that it would look like this:
Output
Name Code Berry
blackberry 1 blackberry
wineberry 2 wineberry
rasberry 1 rasberry
blueberry 1 blueberry
mulberry 2 mulberry
The script I have so far is this:
import csv
with open(input.csv, 'r') as csvinput:
    with open(output.csv, 'w') as csvoutput:
        writer = csv.writer(csvoutput)

        for row in csv.reader(csvinput):
            writer.writerow(row + ['Berry'])
(Python 3.2)
But in the output, the script skips every line and the new column has only Berry in it:
Output
Name Code Berry
blackberry 1 Berry
wineberry 2 Berry
rasberry 1 Berry
blueberry 1 Berry
mulberry 2 Berry
This should give you an idea of what to do:
>>> v = open('C:/test/test.csv')
>>> r = csv.reader(v)
>>> row0 = r.next()
>>> row0.append('berry')
>>> print row0
['Name', 'Code', 'berry']
>>> for item in r:
... item.append(item[0])
... print item
...
['blackberry', '1', 'blackberry']
['wineberry', '2', 'wineberry']
['rasberry', '1', 'rasberry']
['blueberry', '1', 'blueberry']
['mulberry', '2', 'mulberry']
>>>
Edit, note in py3k you must use next(r)
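For reference, the same session in Python 3 (same assumed test.csv) would look like this:
>>> import csv
>>> v = open('C:/test/test.csv')
>>> r = csv.reader(v)
>>> row0 = next(r)          # r.next() no longer exists in Python 3
>>> row0.append('berry')
>>> print(row0)
['Name', 'Code', 'berry']
>>> for item in r:
...     item.append(item[0])
...     print(item)         # same per-row output as above
...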
Thanks for accepting the answer. Here you have a bonus (your working script):
import csv

with open('C:/test/test.csv', 'r') as csvinput:
    with open('C:/test/output.csv', 'w') as csvoutput:
        writer = csv.writer(csvoutput, lineterminator='\n')
        reader = csv.reader(csvinput)

        all = []
        row = next(reader)
        row.append('Berry')
        all.append(row)

        for row in reader:
            row.append(row[0])
            all.append(row)

        writer.writerows(all)
Please note:
the lineterminator parameter in csv.writer. By default it is set to '\r\n', and this is why you have double spacing.
the use of a list to append all the lines and to write them in one shot with writerows. If your file is very, very big this probably is not a good idea (RAM), but for normal files I think it is faster because there is less I/O.
As indicated in the comments to this post, note that instead of nesting the two with statements, you can do it in the same line:
with open('C:/test/test.csv', 'r') as csvinput, open('C:/test/output.csv', 'w') as csvoutput:
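If the file really is too big to keep in memory, a row-by-row sketch along these lines (same paths as above) avoids building the all list:
import csv

with open('C:/test/test.csv', 'r') as csvinput, open('C:/test/output.csv', 'w') as csvoutput:
    writer = csv.writer(csvoutput, lineterminator='\n')
    reader = csv.reader(csvinput)

    header = next(reader)
    writer.writerow(header + ['Berry'])

    # stream the remaining rows one at a time instead of collecting them first
    for row in reader:
        writer.writerow(row + [row[0]])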
I'm surprised no one suggested Pandas. Although using a set of dependencies like Pandas might seem more heavy-handed than is necessary for such an easy task, it produces a very short script and Pandas is a great library for doing all sorts of CSV (and really all data types) data manipulation. Can't argue with 4 lines of code:
import pandas as pd
csv_input = pd.read_csv('input.csv')
csv_input['Berries'] = csv_input['Name']
csv_input.to_csv('output.csv', index=False)
Check out Pandas Website for more information!
Contents of output.csv:
Name,Code,Berries
blackberry,1,blackberry
wineberry,2,wineberry
rasberry,1,rasberry
blueberry,1,blueberry
mulberry,2,mulberry
import csv

with open('input.csv', 'r') as csvinput:
    with open('output.csv', 'w') as csvoutput:
        writer = csv.writer(csvoutput)

        for row in csv.reader(csvinput):
            if row[0] == "Name":
                writer.writerow(row + ["Berry"])
            else:
                writer.writerow(row + [row[0]])
Maybe something like that is what you intended?
Also, csv stands for comma separated values. So, you kind of need commas to separate your values like this I think:
Name,Code
blackberry,1
wineberry,2
rasberry,1
blueberry,1
mulberry,2
I used pandas and it worked well. While I was using it, I had to open a file, add some columns to it, and then save it back to the same file. This code adds multiple column entries; edit it as much as you need.
import pandas as pd
csv_input = pd.read_csv('testcase.csv') #reading my csv file
csv_input['Phone1'] = csv_input['Name'] #this would also copy the cell value
csv_input['Phone2'] = csv_input['Name']
csv_input['Phone3'] = csv_input['Name']
csv_input['Phone4'] = csv_input['Name']
csv_input['Phone5'] = csv_input['Name']
csv_input['Country'] = csv_input['Name']
csv_input['Website'] = csv_input['Name']
csv_input.to_csv('testcase.csv', index=False) #this writes back to your file
If you don't want the cell values to be copied, first create an empty column in your csv file manually (say you named it Hours); then you can add this line to the code above:
csv_input['New Value'] = csv_input['Hours']
or, more simply, without adding the manual column:
csv_input['New Value'] = ''  # simple and easy
I hope it helps.
Yes, it's an old question, but it might help someone.
import csv
import uuid

# read and write csv files
with open('in_file', 'r') as r_csvfile:
    with open('out_file', 'w', newline='') as w_csvfile:
        dict_reader = csv.DictReader(r_csvfile, delimiter='|')
        # add new column to the existing ones
        fieldnames = dict_reader.fieldnames + ['ADDITIONAL_COLUMN']

        writer_csv = csv.DictWriter(w_csvfile, fieldnames, delimiter='|')
        writer_csv.writeheader()

        for row in dict_reader:
            row['ADDITIONAL_COLUMN'] = str(uuid.uuid4().int >> 64)[0:6]
            writer_csv.writerow(row)
I don't see where you're adding the new column, but try this:
import csv

i = 0
Berry = open("newcolumn.csv", "r").readlines()
with open('input.csv', 'r') as csvinput:
    with open('output.csv', 'w') as csvoutput:
        writer = csv.writer(csvoutput)
        for row in csv.reader(csvinput):
            # append the i-th value from the new column, stripping its trailing newline
            writer.writerow(row + [Berry[i].strip()])
            i += 1
This code should cover your request; I have tested it on the sample data.
import csv

with open(in_path, 'r') as f_in, open(out_path, 'w') as f_out:
    csv_reader = csv.reader(f_in, delimiter=';')
    writer = csv.writer(f_out)

    for row in csv_reader:
        writer.writerow(row + [row[0]])
In case of a large file you can use pandas.read_csv with the chunksize argument, which allows reading the dataset chunk by chunk:
import pandas as pd

INPUT_CSV = "input.csv"
OUTPUT_CSV = "output.csv"
CHUNKSIZE = 1_000  # Maximum number of rows in memory

header = True
mode = "w"
for chunk_df in pd.read_csv(INPUT_CSV, chunksize=CHUNKSIZE):
    chunk_df["Berry"] = chunk_df["Name"]
    # You apply any other transformation to the chunk
    # ...
    chunk_df.to_csv(OUTPUT_CSV, header=header, mode=mode)
    header = False  # Do not save the header for the other chunks
    mode = "a"  # 'a' stands for append mode, all the other chunks will be appended
If you want to update the file in place, you can use a temporary file and replace the original at the end:
import os

import pandas as pd

INPUT_CSV = "input.csv"
TMP_CSV = "tmp.csv"
CHUNKSIZE = 1_000  # Maximum number of rows in memory

header = True
mode = "w"
for chunk_df in pd.read_csv(INPUT_CSV, chunksize=CHUNKSIZE):
    chunk_df["Berry"] = chunk_df["Name"]
    # You apply any other transformation to the chunk
    # ...
    chunk_df.to_csv(TMP_CSV, header=header, mode=mode)
    header = False  # Do not save the header for the other chunks
    mode = "a"  # 'a' stands for append mode, all the other chunks will be appended

os.replace(TMP_CSV, INPUT_CSV)
For adding a new column to an existing CSV file (with headers), if the column to be added has a small enough number of values, here is a convenient function (somewhat similar to @joaquin's solution). The function takes:
the existing CSV filename,
the output CSV filename (which will have the updated content), and
a list with the header name and column values.
import csv

def add_col_to_csv(csvfile, fileout, new_list):
    with open(csvfile, 'r') as read_f, \
         open(fileout, 'w', newline='') as write_f:
        csv_reader = csv.reader(read_f)
        csv_writer = csv.writer(write_f)
        i = 0
        for row in csv_reader:
            row.append(new_list[i])
            csv_writer.writerow(row)
            i += 1
Example:
new_list1 = ['test_hdr',4,4,5,5,9,9,9]
add_col_to_csv('exists.csv','new-output.csv',new_list1)
Existing CSV file and output (updated) CSV file: shown as screenshots in the original post.
Append a new column to an existing csv file using Python, without a header name for the new column:
import csv

default_text = 'Some Text'

# Open the input file in read mode and the output file in write mode
with open('problem-one-answer.csv', 'r') as read_obj, \
     open('output_1.csv', 'w', newline='') as write_obj:
    # Create a csv.reader object from the input file object
    csv_reader = csv.reader(read_obj)
    # Create a csv.writer object from the output file object
    csv_writer = csv.writer(write_obj)
    # Read each row of the input csv file as a list
    for row in csv_reader:
        # Append the default text to the row / list
        row.append(default_text)
        # Add the updated row / list to the output file
        csv_writer.writerow(row)
Thank you.
You may just write:
import pandas as pd

df = pd.read_csv('csv_name.csv')
df['Berry'] = df['Name']
df.to_csv("csv_name.csv", index=False)
Then you are done. To check it, you may run:
h = pd.read_csv('csv_name.csv')
print(h)
If you want to add a column with some arbitrary new elements (a, b, c), you may replace the df['Berry'] = df['Name'] line with:
df['Berry'] = ['a','b','c']
I have a csv file looking like this
34512340,1
12395675,30
56756777,30
90673412,45
12568673,25
22593672,25
I want to be able to edit the data after the comma from python and then save the csv.
Does anybody know how I would be able to do this?
This bit of code below will write a new line, but not edit:
f = open("stockcontrol","a")
f.write(code)
Here is a sample, which adds 1 to the second column:
import csv

with open('data.csv') as infile, open('output.csv', 'wb') as outfile:
    reader = csv.reader(infile)
    writer = csv.writer(outfile)
    for row in reader:
        # Transform the second column, which is row[1]
        row[1] = int(row[1]) + 1
        writer.writerow(row)
Notes
The csv module correctly parses the CSV file; highly recommended.
By default, each field is parsed as text, which is why I converted it to an integer: int(row[1]).
Update
If you really want to edit the file "in place", then use the fileinput module:
import fileinput

for line in fileinput.input('data.csv', inplace=True):
    fields = line.strip().split(',')
    fields[1] = str(int(fields[1]) + 1)  # "Update" second column
    line = ','.join(fields)
    print line  # Write the line back to the file, in place
You can use pandas to edit the column you want, e.g. to increase the values in the second column by n:
import pandas

data_df = pandas.read_csv('input.csv')
# assign back to the column, so data_df stays a DataFrame
data_df['column2'] = data_df['column2'].apply(lambda x: x + n)
print(data_df)
For adding 1, replace n by 1.
I have a file.dat which looks like:
id | user_id | venue_id | latitude | longitude | created_at
---------+---------+----------+-----------+-----------+-----------------
984301 |2041916 |5222 | | |2012-04-21 17:39:01
984222 |15824 |5222 |38.8951118 |-77.0363658|2012-04-21 17:43:47
984315 |1764391 |5222 | | |2012-04-21 17:37:18
984234 |44652 |5222 |33.800745 |-84.41052 | 2012-04-21 17:43:43
I need to get a csv file with the empty latitude and longitude rows deleted, like:
id,user_id,venue_id,latitude,longitude,created_at
984222,15824,5222,38.8951118,-77.0363658,2012-04-21T17:43:47
984234,44652,5222,33.800745,-84.41052,2012-04-21T17:43:43
984291,105054,5222,45.5234515,-122.6762071,2012-04-21T17:39:22
I try to do that using the following code:
with open('file.dat', 'r') as input_file:
    lines = input_file.readlines()
    newLines = []
    for line in lines:
        newLine = line.strip('|').split()
        newLines.append(newLine)

with open('file.csv', 'w') as output_file:
    file_writer = csv.writer(output_file)
    file_writer.writerows(newLines)
But all the same I get a csv file with "|" symbols and empty latitude/longitude rows.
Where is the mistake?
In general I need to use the resulting csv file in a DataFrame, so maybe there is some way to reduce the number of steps.
str.strip() removes leading and trailing characters from a string.
You want to split the lines on "|", then strip each element of the resulting list:
import csv

with open('file.dat') as dat_file, open('file.csv', 'w') as csv_file:
    csv_writer = csv.writer(csv_file)

    for line in dat_file:
        row = [field.strip() for field in line.split('|')]
        if len(row) == 6 and row[3] and row[4]:
            csv_writer.writerow(row)
Use this:
data = pd.read_csv('file.dat', sep='|', header=0, skipinitialspace=True)
data.dropna(inplace=True)
I used standard Python features without pre-processing the data. I got the idea from one of the previous answers and improved it. If the data headers contain spaces (as is often the case with CSV), we should set the column names ourselves and skip line 1 with the headers. After that, we can remove NaN values based only on specific columns.
data = pd.read_csv("checkins.dat", sep='|', header=None, skiprows=1,
low_memory = False, skipinitialspace=True,
names=['id','user_id','venue_id','latitude','longitude','created_at'])
data.dropna(subset=['latitude', 'longitude'], inplace = True)
Using split() without parameters splits on whitespace.
For example, "test1 test2".split() results in ["test1", "test2"].
Instead, try this:
newLine = line.split("|")
Maybe it's better to use the map() function instead of a list comprehension, as it may be faster. Also, writing a csv file is easy with the csv module.
import csv

with open('file.dat', 'r') as fin:
    with open('file.csv', 'w') as fout:
        writer = csv.writer(fout)
        for line in fin:
            newline = list(map(str.strip, line.split('|')))  # list() so len() and indexing work in Python 3
            if len(newline) == 6 and newline[3] and newline[4]:
                writer.writerow(newline)
with open("filename.dat") as f:
with open("filename.csv", "w") as f1:
for line in f:
f1.write(line)
This can be used to copy a .dat file to a .csv file; note that it copies the lines as-is, without changing the delimiter.
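If the goal is the pipe-delimited file from this question, a small variation that also swaps the delimiter might be (a sketch; it does not filter out the rows with empty fields):
with open("filename.dat") as f:
    with open("filename.csv", "w") as f1:
        for line in f:
            # replace the pipe delimiter with commas and drop the surrounding spaces
            f1.write(",".join(field.strip() for field in line.split("|")) + "\n")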
Combining previous answers I wrote my code for Python 2.7:
import csv

lat_index = 3
lon_index = 4
fields_num = 6
csv_counter = 0

with open("checkins.dat") as dat_file:
    with open("checkins.csv", "w") as csv_file:
        csv_writer = csv.writer(csv_file)

        for dat_line in dat_file:
            new_line = map(str.strip, dat_line.split('|'))
            if len(new_line) == fields_num and new_line[lat_index] and new_line[lon_index]:
                csv_writer.writerow(new_line)
                csv_counter += 1

print("Done. Total rows written: {:,}".format(csv_counter))
This has worked for me:
data = pd.read_csv('file.dat',sep='::',names=list_for_names_of_columns)
I need a way to get a specific item (field) of a CSV. Say I have a CSV with 100 rows and 2 columns (comma separated). First column emails, second column passwords. For example I want to get the password of the email in row 38. So I need only the item from the 2nd column, row 38...
Say I have a csv file:
aaaaa#aaa.com,bbbbb
ccccc#ccc.com,ddddd
How can I get only 'ddddd' for example?
I'm new to the language and tried some stuff with the csv module, but I don't get it...
import csv

mycsv = csv.reader(open(myfilepath))
for row in mycsv:
    text = row[1]
Following the comments to the SO question here, a better, more robust version would be:
import csv

with open(myfilepath, 'rb') as f:
    mycsv = csv.reader(f)
    for row in mycsv:
        text = row[1]
        ............
Update: If what the OP actually wants is the last string in the last row of the csv file, there are several approaches that do not necessarily need csv. For example:
fulltxt = open(myfilepath, 'rb').read()
laststring = fulltxt.split(',')[-1]
This is not good for very big files because you load the complete text in memory but could be ok for small files. Note that laststring could include a newline character so strip it before use.
And finally if what the OP wants is the second string in line n (for n=2):
Update 2: This is now the same code as the one in the answer from J.F.Sebastian (the credit is his):
import csv

line_number = 2
with open(myfilepath, 'rb') as f:
    mycsv = csv.reader(f)
    mycsv = list(mycsv)
    text = mycsv[line_number][1]
    ............
#!/usr/bin/env python
"""Print a field specified by row, column numbers from given csv file.

USAGE:
    %prog csv_filename row_number column_number
"""
import csv
import sys

filename = sys.argv[1]
row_number, column_number = [int(arg, 10) - 1 for arg in sys.argv[2:]]

with open(filename, 'rb') as f:
    rows = list(csv.reader(f))
    print rows[row_number][column_number]
Example
$ python print-csv-field.py input.csv 2 2
ddddd
Note: list(csv.reader(f)) loads the whole file in memory. To avoid that you could use itertools:
import itertools
# ...

with open(filename, 'rb') as f:
    row = next(itertools.islice(csv.reader(f), row_number, row_number + 1))
    print row[column_number]
import csv

def read_cell(x, y):
    with open('file.csv', 'r') as f:
        reader = csv.reader(f)
        y_count = 0
        for n in reader:
            if y_count == y:
                cell = n[x]
                return cell
            y_count += 1

print(read_cell(4, 8))
This example prints cell 4, 8 in Python 3.
There is an interesting point to note about the csv.reader() object: it is not a list and it is not subscriptable.
This works:
for r in csv.reader(file_obj):  # file not closed
    print r
This does not:
r = csv.reader(file_obj)
print r[0]
So, you first have to convert to list type in order to make the above code work.
r = list( csv.reader(file_obj) )
print r[0]
Finally I got it!!!
import csv

def select_index(index):
    csv_file = open('oscar_age_female.csv', 'r')
    csv_reader = csv.DictReader(csv_file)
    for line in csv_reader:
        l = line['Index']
        if l == index:
            print(line[' "Name"'])

select_index('11')
"Bette Davis"
The following may be what you are looking for:
import pandas as pd
df = pd.read_csv("table.csv")
print(df["Password"][row_number])
#where row_number is 38 maybe
import csv

inf = csv.reader(open('yourfile.csv', 'r'))
for row in inf:
    print row[1]