I started learning Python and was wondering whether there is a way to create multiple files from the unique values of a column. I know there are hundreds of ways to do it with pandas, but I would like to do it with built-in libraries only. I couldn't find a single example where it's done that way.
Here is the sample csv file data:
uniquevalue|count
a|123
b|345
c|567
d|789
a|123
b|345
c|567
Sample output file:
a.csv
uniquevalue|count
a|123
a|123
b.csv
b|345
b|345
I am struggling with looping over the unique values in a column and then writing them out. Can someone explain the logic of how to do it? That would be much appreciated. Thanks.
import csv
from collections import defaultdict

header = []
data = defaultdict(list)
DELIMITER = "|"

with open("inputfile.csv", newline="") as csvfile:
    reader = csv.reader(csvfile, delimiter=DELIMITER)
    for i, row in enumerate(reader):
        if i == 0:
            header = row
        else:
            key = row[0]
            data[key].append(row)

for key, value in data.items():
    filename = f"{key}.csv"
    with open(filename, "w", newline="") as f:
        writer = csv.writer(f, delimiter=DELIMITER)
        rows = [header] + value
        writer.writerows(rows)
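The logic of the code above is: read the header once, bucket every other row under the value in its first column, then write one file per bucket. Here is a minimal sketch of just the grouping step, run on an in-memory copy of the sample data so that nothing touches the disk. The helper name `group_rows` and the `SAMPLE` string are illustrative assumptions, not part of the original answer.

```python
import csv
import io
from collections import defaultdict

SAMPLE = """uniquevalue|count
a|123
b|345
c|567
d|789
a|123
b|345
c|567
"""


def group_rows(lines, delimiter="|"):
    """Return (header, groups), where groups maps first-column value -> rows."""
    reader = csv.reader(lines, delimiter=delimiter)
    header = next(reader)            # the first row is the header
    groups = defaultdict(list)
    for row in reader:
        if row:                      # ignore blank lines
            groups[row[0]].append(row)
    return header, dict(groups)


header, groups = group_rows(io.StringIO(SAMPLE))
for key, rows in groups.items():
    print(key, rows)
```

Each key in `groups` then corresponds to one output file, with `header` written first and the bucketed rows after it.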
import csv

with open('sample.csv', newline='') as csvfile:
    reader = csv.reader(csvfile, delimiter='|')
    next(reader)  # skip the header row
    for row in reader:
        # Append mode: each row is added to the file named after its key.
        with open(f"{row[0]}.csv", 'a', newline='') as inner:
            writer = csv.writer(inner, delimiter='|')
            writer.writerow(row)
The task can also be done without the csv module. The file is read in one go, and read_file.read().splitlines()[1:] strips the newline characters and skips the header line. A set gives the distinct lines; for each one, items.count(line) counts its duplicates, and the line is written that many times to the output file named after its first field.
with open("unique_sample.csv", "r") as read_file:
    items = read_file.read().splitlines()[1:]

for line in set(items):
    with open(line[:line.index('|')] + '.csv', 'w') as output:
        output.write((line + '\n') * items.count(line))
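One caveat with the snippet above: `set(items)` deduplicates whole lines and each file is opened in `'w'` mode, so two rows that share a key but differ in the count would overwrite each other, and the header is not written. Below is a sketch of a variant that groups by key instead, still without the csv module. The function name `split_by_first_column` and its `out_dir` parameter are hypothetical, and the input is assumed to have a header line.

```python
import os
import tempfile


def split_by_first_column(path, out_dir, sep="|"):
    """Write one <key>.csv per distinct first-column value; return the keys."""
    with open(path, "r") as read_file:
        header, *lines = read_file.read().splitlines()

    groups = {}
    for line in lines:
        if not line:
            continue                           # skip empty lines
        key = line.split(sep, 1)[0]            # text before the first separator
        groups.setdefault(key, []).append(line)

    for key, rows in groups.items():
        with open(os.path.join(out_dir, key + ".csv"), "w") as output:
            output.write("\n".join([header] + rows) + "\n")
    return list(groups)


# Demo in a temporary directory so no real files are touched.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "unique_sample.csv")
    with open(src, "w") as f:
        f.write("uniquevalue|count\na|123\nb|345\na|999\n")
    keys = split_by_first_column(src, tmp)
    with open(os.path.join(tmp, "a.csv")) as f:
        a_contents = f.read()
```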
I have a CSV with two columns of data. I want to extract the data from one column and write it to a text file, with each element in single quotes and the elements separated by commas. For example, I have this:
taxable_entity_id,id
45efc167-9254-406c-b5a8-6aef91a73dd9,331999
5ae97680-f489-4182-9dcb-eb07a73fab15,103507
00018d93-ae71-4367-a0da-f252cea4dfa2,32991
I want all the taxable_entity_ids in a text file like this
'45efc167-9254-406c-b5a8-6aef91a73dd9','5ae97680-f489-4182-9dcb-eb07a73fab15','00018d93-ae71-4367-a0da-f252cea4dfa2'
without any space between two elements, separated by a comma.
Edit:
This is what I tried:
import csv

with open("Taxable_entity_those_who_filed_G1_M_July_but_not_in_Aug.csv", 'r') as csv_File:
    reader = csv.DictReader(csv_File)
    with open("te_id.csv", 'w') as text_file:
        writer = csv.writer(text_file, quotechar='\'', quoting=csv.QUOTE_MINIMAL)
        for row in reader:
            writer.writerow(row["taxable_entity_id"])
            # print(row["taxable_entity_id"])
        text_file.close()
    csv_File.close()
And this is what I got:
4,5,e,f,c,1,6,7,-,9,2,5,4,-,4,0,6,c,-,b,5,a,8,-,6,a,e,f,9,1,a,7,3,d,d,9
5,a,e,9,7,6,8,0,-,f,4,8,9,-,4,1,8,2,-,9,d,c,b,-,e,b,0,7,a,7,3,f,a,b,1,5
0,0,0,1,8,d,9,3,-,a,e,7,1,-,4,3,6,7,-,a,0,d,a,-,f,2,5,2,c,e,a,4,d,f,a,2
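You were close. writerow expects a sequence of fields, so passing it a single string writes each character as its own field. Since you want one single line in the output file, write it in one call by passing a generator expression: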
import csv

with open("Taxable_entity_those_who_filed_G1_M_July_but_not_in_Aug.csv", 'r') as csv_File:
    reader = csv.DictReader(csv_File)
    with open("te_id.csv", 'w') as text_file:
        # use QUOTE_ALL to force the quoting
        writer = csv.writer(text_file, quotechar='\'', quoting=csv.QUOTE_ALL)
        writer.writerow((row["taxable_entity_id"] for row in reader))
And there is no need to call close, since you have (correctly) used with.
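The character-by-character output in the question comes from how `csv.writer.writerow` treats its argument: it iterates over whatever it is given, and iterating over a string yields single characters. A small in-memory demonstration (the buffer and the `"abc"` value are only for illustration):

```python
import csv
import io

buf = io.StringIO()
writer = csv.writer(buf)

writer.writerow("abc")        # a string is iterated character by character
writer.writerow(["abc"])      # a one-element list is a single field

lines = buf.getvalue().splitlines()
print(lines)  # ['a,b,c', 'abc']
```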
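Try this:
import pandas as pd

df = pd.read_csv('nameoffile.csv', delimiter=',')
# Select the column by name; df[0] would raise a KeyError here.
X = df['taxable_entity_id'].values
with open('newfile.txt', 'w') as f:
    # Wrap each value in single quotes and join with commas, no spaces.
    f.write(','.join("'" + str(i) + "'" for i in X))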
It seems a little odd that you basically want a one-row CSV file for the taxable_entity_ids, but it's certainly possible. You also don't need to explicitly close() the open files, because the with context manager does it for you automatically.
You should also open the CSV file with newline='', as shown in the examples in the csv module's documentation.
Lastly, if you want all the fields to be quoted, you need to use quoting=csv.QUOTE_ALL instead of quoting=csv.QUOTE_MINIMAL.
import csv

inp_filename = "Taxable_entity_those_who_filed_G1_M_July_but_not_in_Aug.csv"
outp_filename = "te_id.csv"

with open(outp_filename, 'w', newline='') as text_file, \
     open(inp_filename, 'r', newline='') as csv_File:
    reader = csv.DictReader(csv_File)
    writer = csv.writer(text_file, quotechar="'", quoting=csv.QUOTE_ALL)
    taxable_entity_ids = (row["taxable_entity_id"] for row in reader)
    writer.writerow(taxable_entity_ids)

print('done')
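If the target is a plain text line rather than a CSV record, the same result can be built with `str.join`, which avoids the line terminator that `csv.writer` appends. The sketch below runs on an in-memory copy of the sample from the question. The helper name `quoted_ids` is hypothetical, and values that themselves contain a single quote are not escaped here.

```python
import csv
import io

SAMPLE = """taxable_entity_id,id
45efc167-9254-406c-b5a8-6aef91a73dd9,331999
5ae97680-f489-4182-9dcb-eb07a73fab15,103507
00018d93-ae71-4367-a0da-f252cea4dfa2,32991
"""


def quoted_ids(lines, column="taxable_entity_id"):
    """Return the column's values as one 'a','b','c' string."""
    reader = csv.DictReader(lines)
    return ",".join("'{}'".format(row[column]) for row in reader)


line = quoted_ids(io.StringIO(SAMPLE))
print(line)
```

To use it on a real file, pass the open file object instead of the `io.StringIO` buffer and write `line` to the output file.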
I have a CSV file and I want to write it to another CSV file.
It's a bit more complicated than it seems. I'm hoping someone can correct my code and rewrite it so that I get the desired CSV file. I am using both Python 2 and Python 3.
mycsvfile:
id,field1,point_x,point_y,point_z
a1,x1,10,12,3
b1,x2,20,22,5
a2,x1,25,17,7
a1,x2,35,13,3
a1,x5,15,19,9
b1,x1,65,11,2
b2,x5,50,23,1
b2,x1,75,17,7
c1,x2,70,87,2
c2,x1,80,67,4
c3,x2,85,51,6
Figure: mycsvfile
My code:
import os
import csv
import collections
from csv import DictWriter

with open(r'C:\Users\Desktop\kar_csv_test\workfiles\incsv.csv', 'r') as csvfile:
    reader = csv.reader(csvfile, delimiter=',')
    my_dict = collections.defaultdict(dict)
    for row in reader:
        my_dict[row[0]][row[1]] = [row[2],row[3],row[4]]

print (my_dict)

with open(r'C:\Users\Desktop\kar_csv_test\workfiles\outcsv.csv','w', newline='') as wf:
    fieldnames = ['id', 'x1(point_x)', 'x1(point_y)', 'x1(point_z)', 'x2(point_x)', 'x2(point_y)', 'x2(point_z)']  # etc., up to x20(point_x), x20(point_y), x20(point_z)
    my_write = csv.DictWriter(wf, fieldnames = fieldnames, delimiter = ',')
    my_write.writeheader()
Desired output as csv file:
id,x1(point_x),x1(point_y),x1(point_z),x2(point_x),x2(point_y),x2(point_z)
a1,10,12,3,35,13,3,
a2,25,17,7,,,,
b1,65,11,2,20,22,5,
b2,75,17,7,,,,
c1,,,,70,87,2,
c2,80,67,4,,,,
c3,,,,85,51,6,
Figure: Desiredcsvfile
This answer is for Python3 only. The csv module has a very different interface between Python2 and Python3 and writing compatible code is beyond what I am ready to do here.
Here, I would compute the fieldnames list, and compute each row on the same pattern:
...
with open(r'C:\Users\Desktop\kar_csv_test\workfiles\outcsv.csv','w', newline='') as wf:
    fieldnames = ['id'] + ['x{}(point_{})'.format(i, j)
                           for i in range(1, 6) for j in "xyz"]  # only up to x5 here
    my_write = csv.DictWriter(wf, fieldnames = fieldnames, delimiter = ',')
    my_write.writeheader()
    for k, v in my_dict.items():
        row = {'x{}(point_{})'.format(i, axis):
               v.get('x{}'.format(i), ('','',''))[j]  # get returns a blank triple if the field is absent
               for i in range(1,6) for j, axis in enumerate("xyz")}
        row['id'] = k  # do not forget the id...
        my_write.writerow(row)
With your example data, it gives:
id,x1(point_x),x1(point_y),x1(point_z),x2(point_x),x2(point_y),x2(point_z),x3(point_x),x3(point_y),x3(point_z),x4(point_x),x4(point_y),x4(point_z),x5(point_x),x5(point_y),x5(point_z)
a1,10,12,3,35,13,3,,,,,,,15,19,9
b1,65,11,2,20,22,5,,,,,,,,,
a2,25,17,7,,,,,,,,,,,,
b2,75,17,7,,,,,,,,,,50,23,1
c1,,,,70,87,2,,,,,,,,,
c2,80,67,4,,,,,,,,,,,,
c3,,,,85,51,6,,,,,,,,,
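The snippet above starts with `...` and relies on `my_dict` from the question; note that `csv.reader` does not skip the header line, so `my_dict` as built there also gains an `'id'` entry. For reference, here is a self-contained sketch of the same pivot on an in-memory copy of the sample data. It uses `csv.DictReader` to consume the header, sorts the ids, and is restricted to `x1` and `x2` to match the desired output. The names `pivot`, `AXES`, and `SAMPLE` are hypothetical.

```python
import csv
import io
from collections import defaultdict

SAMPLE = """id,field1,point_x,point_y,point_z
a1,x1,10,12,3
b1,x2,20,22,5
a2,x1,25,17,7
a1,x2,35,13,3
a1,x5,15,19,9
b1,x1,65,11,2
b2,x5,50,23,1
b2,x1,75,17,7
c1,x2,70,87,2
c2,x1,80,67,4
c3,x2,85,51,6
"""

AXES = ("point_x", "point_y", "point_z")


def pivot(lines, fields=("x1", "x2")):
    """Return CSV text with one row per id and three columns per field."""
    points = defaultdict(dict)
    for row in csv.DictReader(lines):          # DictReader consumes the header
        points[row["id"]][row["field1"]] = [row[a] for a in AXES]

    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["id"] + ["{}({})".format(f, a) for f in fields for a in AXES])
    for key in sorted(points):
        cells = [key]
        for f in fields:
            cells.extend(points[key].get(f, ["", "", ""]))  # blanks if absent
        writer.writerow(cells)
    return out.getvalue()


result = pivot(io.StringIO(SAMPLE))
print(result)
```

Passing a longer `fields` tuple, for example `("x1", "x2", "x5")`, widens the output in the same way the `range(1, 6)` loop does in the answer.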
After reviewing a few of the recommended similar topics about comparing values, I have not found much that helps.
car.csv
tittle1,tittle2
bmw,2000
mercedes,2000
toyota,1000
honda,1500
geely,500
price.csv
ori_price1,new_price2
2000,5000
1000,2500
The result should look like this:
tittle1,tittle2
bmw,5000
mercedes,5000
toyota,2500
honda,1500
geely,500
I have found that the code below comes very close to the result:
import csv

with open('car.csv', 'r') as csv_file, open('price.csv', 'r', newline='') as csv_file2 \
        ,open('result.csv', 'w', newline='') as new_file:
    csv_reader = csv.DictReader(csv_file)
    csv_reader2 = csv.DictReader(csv_file2)
    csv_writer = csv.writer(new_file)
    csv_writer.writerow([ 'tittle1', 'title2'])
    for row1,row2 in zip(csv_reader,csv_reader2):
        csv_writer.writerow([row1['tittle1'],row1['tittle2'],row2['new_price2']])
With pandas, it can be rather simple:
import pandas as pd

cars=pd.read_csv("./car.csv")
prices=pd.read_csv("./price.csv")

### iterate over the price changes as rows
for i,row in prices.iterrows():
    ### find where cars["tittle2"]==row["ori_price1"]
    # and update column ['tittle2'] to row["new_price2"]
    cars.loc[cars["tittle2"]==row["ori_price1"], ['tittle2']] = row["new_price2"]

cars.to_csv("cars_updated.csv",index=False)
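The same lookup can be done with the standard library only. In the question's attempt, `zip` pairs the rows of the two files by position and stops at the shorter one, so prices are never matched by value. Here is a sketch that first builds an old-price to new-price dictionary and then rewrites each car row, run on in-memory copies of the two sample files. The function name `update_prices` is hypothetical.

```python
import csv
import io

CARS = """tittle1,tittle2
bmw,2000
mercedes,2000
toyota,1000
honda,1500
geely,500
"""

PRICES = """ori_price1,new_price2
2000,5000
1000,2500
"""


def update_prices(car_lines, price_lines):
    """Return CSV text of the cars with prices replaced where a mapping exists."""
    # Map each old price to its replacement.
    new_price = {row["ori_price1"]: row["new_price2"]
                 for row in csv.DictReader(price_lines)}

    reader = csv.DictReader(car_lines)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        # Keep the original price when it has no entry in the mapping.
        row["tittle2"] = new_price.get(row["tittle2"], row["tittle2"])
        writer.writerow(row)
    return out.getvalue()


result = update_prices(io.StringIO(CARS), io.StringIO(PRICES))
print(result)
```

Prices are compared as strings here, which works for the sample because both files spell the numbers identically.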
import csv

with open("somecities.csv") as f:
    reader = csv.DictReader(f)
    data = [r for r in reader]
Contents of somecities.csv:
Country,Capital,CountryPop,AreaSqKm
Canada,Ottawa,35151728,9984670
USA,Washington DC,323127513,9833520
Japan,Tokyo,126740000,377972
Luxembourg,Luxembourg City,576249,2586
I'm new to Python and I'm trying to read and append to a CSV file. I've spent some time experimenting with answers to similar questions, with no luck, which is why I believe the code above to be pretty useless.
What I am essentially trying to achieve is to store each row from the CSV in memory using a dictionary, with the country names as keys and, as values, tuples containing the other information in the table in the order it appears in the CSV file.
From there, I am trying to add three more cities to the CSV (Country, Capital, CountryPop, AreaSqKm) and view the updated CSV. How should I go about doing all of this?
The desired additions to the updated csv are:
Brazil, Brasília, 211224219, 8358140
China, Beijing, 1403500365, 9388211
Belgium, Brussels, 11250000, 30528
EDIT:
import csv

with open("somecities.csv", "r") as csvinput:
    with open("somecities_update.csv", "w") as csvresult:
        writer = csv.writer(csvresult, lineterminator='\n')
        reader = csv.reader(csvinput)
        all = []
        headers = next(reader)
        for row in reader:
            all.append(row)
        # Now we write to the new file
        writer.writerow(headers)
        for record in all:
            writer.writerow(record)
        #row.append(Brazil, Brasília, 211224219, 8358140)
        #row.append(China, Beijing, 1403500365, 9388211)
        #row.append(Belgium, Brussels, 11250000, 30528)
So assuming you can use pandas for this I would go about it this way:
import pandas as pd
df1 = pd.read_csv('your_initial_file.csv', index_col='Country')
df2 = pd.read_csv('your_second_file.csv', index_col='Country')
dfs = [df1,df2]
final_df = pd.concat(dfs)
DictReader only represents each row as a dictionary, e.g.:
{
    "Country": "Canada",
    ...,
    "AreaSqKm": "9984670"
}
If you want to store the whole CSV as a dictionary you'll have to create your own:
import csv

all_data = {}
with open("somecities.csv", "r") as f:
    reader = csv.DictReader(f)
    for row in reader:
        # Key = country, values = tuple containing the rest of the data.
        all_data[row["Country"]] = (row["Capital"], row["CountryPop"], row["AreaSqKm"])

# Add the new cities to the dictionary here...

# To write the data to a new CSV
with open("newcities.csv", "w") as f:
    writer = csv.writer(f)
    for key, values in all_data.items():
        writer.writerow([key] + list(values))
As others have said, though, the pandas library could be a good choice. Check out its read_csv and to_csv functions.
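For completeness, here is a sketch of the dictionary-of-tuples approach with the three additions from the question filled in and the header row written back out. It works on an in-memory copy of somecities.csv, so the `SAMPLE` string and the `io.StringIO` buffers are assumptions made for the demonstration; with real files, open them with `newline=""` instead.

```python
import csv
import io

SAMPLE = """Country,Capital,CountryPop,AreaSqKm
Canada,Ottawa,35151728,9984670
USA,Washington DC,323127513,9833520
Japan,Tokyo,126740000,377972
Luxembourg,Luxembourg City,576249,2586
"""

reader = csv.reader(io.StringIO(SAMPLE))
header = next(reader)                                   # keep the header for writing
all_data = {row[0]: tuple(row[1:]) for row in reader}   # country -> (capital, pop, area)

# The three additions listed in the question.
all_data["Brazil"] = ("Brasília", "211224219", "8358140")
all_data["China"] = ("Beijing", "1403500365", "9388211")
all_data["Belgium"] = ("Brussels", "11250000", "30528")

out = io.StringIO()
writer = csv.writer(out, lineterminator="\n")
writer.writerow(header)
for country, values in all_data.items():
    writer.writerow([country, *values])

print(out.getvalue())
```

Because dictionaries keep insertion order, the original rows come out first and the new ones follow in the order they were added.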
Just another idea: build a list and append the new values to it, as below (not tested):
import csv

# New rows to add, one list per row.
new_rows = [
    ["Brazil", "Brasília", 211224219, 8358140],
    ["China", "Beijing", 1403500365, 9388211],
]

with open("somecities.csv", "r", newline="") as csvinput:
    with open("result.csv", "w", newline="") as csvresult:
        writer = csv.writer(csvresult, lineterminator='\n')
        reader = csv.reader(csvinput)
        all_rows = [next(reader)]  # start with the header row
        for row in reader:
            all_rows.append(row)
        all_rows.extend(new_rows)
        writer.writerows(all_rows)
The simplest form I see, tested in Python 3.6:
Opening a file in 'a' mode lets you append to the end of the file instead of overwriting the existing content. Try that.
>>> with open("somecities.csv", "a") as fd:
...     fd.write("Brazil,Brasília,211224219,8358140\n")
OR
#!/usr/bin/python3.6
import csv

fields=['Brazil', 'Brasília', '211224219','8358140']
with open(r'somecities.csv', 'a', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(fields)
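One thing to watch with append mode: if the existing file does not end with a line break, the first appended row is glued onto the last existing line. Here is a sketch of a helper that checks for this before appending, demonstrated on a temporary file. The name `append_rows` is hypothetical, and UTF-8 encoding is an assumption about the file.

```python
import csv
import os
import tempfile


def append_rows(path, rows):
    """Append rows to an existing CSV file, adding a line break first if needed."""
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        needs_newline = False
        if f.tell() > 0:                       # non-empty file: inspect the last byte
            f.seek(-1, os.SEEK_END)
            needs_newline = f.read(1) != b"\n"
    with open(path, "a", newline="", encoding="utf-8") as f:
        if needs_newline:
            f.write("\n")
        csv.writer(f, lineterminator="\n").writerows(rows)


# Demo on a temporary file whose last line lacks a trailing newline.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "somecities.csv")
    with open(path, "w", newline="", encoding="utf-8") as f:
        f.write("Country,Capital,CountryPop,AreaSqKm\nCanada,Ottawa,35151728,9984670")
    append_rows(path, [["Brazil", "Brasília", 211224219, 8358140]])
    with open(path, newline="", encoding="utf-8") as f:
        contents = f.read()
```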