Open a csv file in 'rb' mode and iterate over it

Open a csv file in 'rb' mode and iterate over it - python

I've seen an older answer for the post, Inline CSV File Editing with Python, about how to modify a csv file and save it. It uses the tempfile module. I have used the same code in Python 3.6 but I get an error because I read the file in binary mode and I cannot iterate over it.
The code below creates a simple CSV file
import csv
with open('proglanguages.csv', mode='w', newline='') as csv_file:
fieldnames = ['lang', 'value']
writer = csv.DictWriter(csv_file, fieldnames=fieldnames)
writer.writerow({'lang': 'Java', 'value': '90'})
writer.writerow({'lang': 'c', 'value': '80'})
writer.writerow({'lang': 'Perl', 'value': '78'})
writer.writerow({'lang': 'C++', 'value': '92'})
writer.writerow({'lang': 'Python', 'value': '0'})
writer.writerow({'lang': 'Fortran', 'value': '70'})
The code below modifies the previous generated CSV file,
from tempfile import NamedTemporaryFile
import shutil
import csv
filename = 'proglanguages.csv'
tempfile = NamedTemporaryFile(delete=False)
with open(filename, mode='rb') as csvFile, tempfile:
reader = csv.reader(csvFile, delimiter=',', quotechar='"')
writer = csv.writer(tempfile, delimiter=',', quotechar='"')
for row in reader:
if row[0] == 'Python':
row[1] = '100'
writer.writerow(row)
else:
writer.writerow(row)
shutil.move(tempfile.name, filename)
In which way, could I use the for-loop for iteration, modify the item and write in the tempfile

You are not lucky here. Answers from Martijn Pieters are always high quality ones. This one was but was targetted at Python 2 while you use Python 3. And the csv semantics have changed... You should no longer use binary mode in Python 3 and do what you did for the initial csv file:
...
tempfile = NamedTemporaryFile(delete=False, newline='', mode='w')
with open(filename, mode='r', newline='') as csvFile, tempfile:
reader = csv.reader(csvFile)
writer = csv.writer(tempfile)
...

Related

Why is my code not working while converting bulk csv to json?

There are two CSV files. I need to convert to JSON. Code is below
import csv
import json
import os
import glob
os.chdir(r'C:\Users\user\Desktop\test' )
result = glob.glob( '*.csv' )
print (result)
def make_json(csvFile, jsonFile):
csvFile, jsonFile = '',''
for i in result:
data = {}
with open(csvFile, encoding='utf-8') as csvf:
csvReader = csv.DictReader(csvf)
for rows in csvReader:
key = rows['id']
data[key] = rows
with open(jsonFile, 'w', encoding='utf-8') as jsonf:
jsonf.write(json.dumps(data, indent=4))
csvFilePath =f"{i}"
jsonFilePath =f"{i.split('.')[-2]}.json"
make_json(csvFile, jsonFile)
I got error > csvFile is not mentioned. But the third line from the end mentions the CSV file.
Disclaimer. Please find the error in the code. I already know of the working code which is in pandas

Below is the correct code, but I would recommend you learn to use the python debugger so you can resolve any logic flaws in your code next time. Documentation on the python debugger can be found here:
https://docs.python.org/3/library/pdb.html
Your code was structured in a way that meant for each csv file, you were not setting the file name until after you attempted to open it. The immediate error you saw was caused because you tried to call make_json() before you defined the values for csvFile and jsonFile.
I would recommend changing the code to:
import csv
import json
import glob
def make_json(csvList):
for csvFile in csvList:
data = {}
with open(csvFile, encoding='utf-8') as csvf:
csvReader = csv.DictReader(csvf)
for rows in csvReader:
key = rows['id']
data[key] = rows
jsonFile =f"{csvFile.split('.')[-2]}.json"
with open(jsonFile, 'w', encoding='utf-8') as jsonf:
jsonf.write(json.dumps(data, indent=4))
make_json(glob.glob('*.csv'))

You should try this
import csv, json, os, glob
os.chdir(r'C:\Users\user\Desktop\test' )
result = glob.glob( '*.csv' )
print(result)
def make_json():
for i in result:
with open(i, encoding='utf-8') as csvf:
data = [row for row in csv.DictReader(csvf)]
with open(f"{i.split('.')[-2]}.json", 'w', encoding='utf-8') as jsonf:
json.dump(data, jsonf)
make_json()

You did not initialize both the arguments of make_json() - (csvFilePath & jsonFilePath)

Python 3 Encoding Issues when converting to CSV from JSON

Python3
I have looked at the other solutions but they havent seem to have covered the situation I am. I have been charged with writing a script to take JSON and convert it to a CSV file.
I have a good chunk of this done but have encountered an issue when I write the data. The data I received does not match what was written. Below is an example. I am lost on how I can get this to preserve the encoding.
I should mention that the default encoding is UTF-8
Input: necesitará
Output: necesitarÃ¡
import csv
import json
import sys
import sys
print(sys.getdefaultencoding())
stuff = open('data.json')
jsonStuff = json.loads(stuff.read(), encoding="utf-8")
with open('output.csv', 'w', newline='\n', encoding='utf-8') as csvfile:
writer = csv.writer(csvfile, delimiter=",",quotechar='"',quoting=csv.QUOTE_MINIMAL)
for element in jsonStuff:
row = ""
key = element['key']
values = element['valuesRow']
row = element['key']
# values[0]['value'], values[1]['value'], values[2]['value'], values[3]['value'],
writer.writerow([element['key'], values[3]['value']])

Remove encoding='utf-8' in open('output.csv', 'w', newline='\n', encoding='utf-8') should fix it.
data.json (utf-8): {"first": "necesitará", "second": "bodø"}
The following ...
import csv
import json
with open('data.json') as stuff, open('output.csv', 'w', newline='\n', encoding='utf-8') as csvfile:
jsonStuff = json.loads(stuff.read(), encoding="utf-8")
writer = csv.writer(csvfile, delimiter=",", quotechar='"', quoting=csv.QUOTE_MINIMAL)
first = jsonStuff['first']
second = jsonStuff['second']
writer.writerow([first, second])
... gives output.csv: necesitarÃ¡,bodÃ¸
However ...
import csv
import json
with open('data.json') as stuff, open('output.csv', 'w', newline='\n') as csvfile:
jsonStuff = json.loads(stuff.read(), encoding="utf-8")
writer = csv.writer(csvfile, delimiter=",", quotechar='"', quoting=csv.QUOTE_MINIMAL)
first = jsonStuff['first']
second = jsonStuff['second']
writer.writerow([first, second])
... produces output.csv: necesitará,bodø
That said. There is no reason to use json.loads() when you have json.load(), and most of what you've defined are the defaults. I'd simply do ...
import csv
import json
with open('data.json') as jsonfile, open('output.csv', 'w') as csvfile:
json_data = json.load(jsonfile)
writer = csv.writer(csvfile, quoting=csv.QUOTE_MINIMAL)
first = json_data['first']
second = json_data['second']
writer.writerow([first, second])

Python csv register_dialect delimiter is not working

I have the written the code below to read in a large csv file with many variables and then just print 1 variable for every row in the outfile. It is working except that the delimiter is not being picked up.
import csv
fieldnames = ['tag']
outfile = open('ActiveTags.txt', 'w')
csv.register_dialect('me', delimiter=',', quotechar="'", quoting=csv.QUOTE_ALL, lineterminator='')
writer = csv.DictWriter(outfile, fieldnames=fieldnames, dialect='me')
with open('ActiveList_16.csv', 'r', newline='') as f:
reader = csv.DictReader(f)
for row in reader:
Tag = row['Tag']
writer.writerow({'tag': Tag})
outfile.close()
What am I missing here? I do not understand why the delimiter is not working on the outfile.

Python CSV: Can I do this with one 'with open' instead of two?

I am a noobie.
I have written a couple of scripts to modify CSV files I work with.
The scripts:
1.) change the headers of a CSV file then save that to a new CSV file,.
2.) Load that CSV File, and change the order of select columns using DictWriter.
from tkinter import *
from tkinter import filedialog
import os
import csv
root = Tk()
fileName = filedialog.askopenfilename(filetypes=(("Nimble CSV files", "*.csv"),("All files", "*.*")))
outputFileName = os.path.splitext(fileName)[0] + "_deleteme.csv" #my temp file
forUpload = os.path.splitext(fileName)[0] + "_forupload.csv"
#Open the file - change the header then save the file
with open(fileName, 'r', newline='') as infile, open(outputFileName, 'w', newline='') as outfile:
reader = csv.reader(infile)
writer = csv.writer(outfile, delimiter=',', lineterminator='\n')
row1 = next(reader)
#new header names
row1[0] = 'firstname'
row1[1] = 'lastname'
row1[4] = 'phone'
row1[5] = 'email'
row1[11] = 'address'
row1[21] = 'website'
#write the temporary CSV file
writer.writerow(row1)
for row in reader:
writer.writerow(row)
#Open the temporary CSV file - rearrange some columns
with open(outputFileName, 'r', newline='') as dInFile, open(forUpload, 'w', newline='') as dOutFile:
fieldnames = ['email', 'title', 'firstname', 'lastname', 'company', 'phone', 'website', 'address', 'twitter']
dWriter = csv.DictWriter(dOutFile, restval='', extrasaction='ignore', fieldnames=fieldnames, lineterminator='\n')
dWriter.writeheader()
for row in csv.DictReader(dInFile):
dWriter.writerow(row)
My question is: Is there a more efficient way to do this?
It seems like I shouldn't have to make a temporary CSV file ("_deleteme.csv") I then delete.
I assume making the temporary CSV file is a rookie move -- is there a way to do this all with one 'With open' statement?
Thanks for any help, it is greatly appreciated.
--Luke

csvfile can be any object with a write() method. You could craft a custom element, or use StringIO. You'd have to verify efficiency yourself.

python csv, writing headers only once

So I have a program that creates CSV from .Json.
First I load the json file.
f = open('Data.json')
data = json.load(f)
f.close()
Then I go through it, looking for a specific keyword, if I find that keyword. I'll write everything related to that in a .csv file.
for item in data:
if "light" in item:
write_light_csv('light.csv', item)
This is my write_light_csv function :
def write_light_csv(filename,dic):
with open (filename,'a') as csvfile:
headers = ['TimeStamp', 'light','Proximity']
writer = csv.DictWriter(csvfile, delimiter=',', lineterminator='\n',fieldnames=headers)
writer.writeheader()
writer.writerow({'TimeStamp': dic['ts'], 'light' : dic['light'],'Proximity' : dic['prox']})
I initially had wb+ as the mode, but that cleared everything each time the file was opened for writing. I replaced that with a and now every time it writes, it adds a header. How do I make sure that header is only written once?.

You could check if file is already exists and then don't call writeheader() since you're opening the file with an append option.
Something like that:
import os.path
file_exists = os.path.isfile(filename)
with open (filename, 'a') as csvfile:
headers = ['TimeStamp', 'light', 'Proximity']
writer = csv.DictWriter(csvfile, delimiter=',', lineterminator='\n',fieldnames=headers)
if not file_exists:
writer.writeheader() # file doesn't exist yet, write a header
writer.writerow({'TimeStamp': dic['ts'], 'light': dic['light'], 'Proximity': dic['prox']})

Just another way:
with open(file_path, 'a') as file:
w = csv.DictWriter(file, my_dict.keys())
if file.tell() == 0:
w.writeheader()
w.writerow(my_dict)

You can check if the file is empty
import csv
import os
headers = ['head1', 'head2']
for row in interator:
with open('file.csv', 'a') as f:
file_is_empty = os.stat('file.csv').st_size == 0
writer = csv.writer(f, lineterminator='\n')
if file_is_empty:
writer.writerow(headers)
writer.writerow(row)

I would use some flag and run a check before writing headers! e.g.
flag=0
def get_data(lst):
for i in lst:#say list of url
global flag
respons = requests.get(i)
respons= respons.content.encode('utf-8')
respons=respons.replace('\\','')
print respons
data = json.loads(respons)
fl = codecs.open(r"C:\Users\TEST\Desktop\data1.txt",'ab',encoding='utf-8')
writer = csv.DictWriter(fl,data.keys())
if flag==0:
writer.writeheader()
writer.writerow(data)
flag+=1
print "You have written % times"%(str(flag))
fl.close()
get_data(urls)

Can you change the structure of your code and export the whole file at once?
def write_light_csv(filename, data):
with open (filename, 'w') as csvfile:
headers = ['TimeStamp', 'light','Proximity']
writer = csv.DictWriter(csvfile, delimiter=',', lineterminator='\n',fieldnames=headers)
writer.writeheader()
for item in data:
if "light" in item:
writer.writerow({'TimeStamp': item['ts'], 'light' : item['light'],'Proximity' : item['prox']})
write_light_csv('light.csv', data)

You can use the csv.Sniffer Class and
with open('my.csv', newline='') as csvfile:
if csv.Sniffer().has_header(csvfile.read(1024))
# skip writing headers

While using Pandas: (for storing Dataframe data to CSV file)
just add this check before setting header property if you are using an index to iterate over API calls to add data in CSV file.
if i > 0:
dataset.to_csv('file_name.csv',index=False, mode='a', header=False)
else:
dataset.to_csv('file_name.csv',index=False, mode='a', header=True)

Here's another example that only depends on Python's builtin csv package. This method checks that the header is what's expected or it throws an error. It also handles the case where the file doesn't exist or does exist but is empty by writing the header. Hope this helps:
import csv
import os
def append_to_csv(path, fieldnames, rows):
is_write_header = not os.path.exists(path) or _is_empty_file(path)
if not is_write_header:
_assert_field_names_match(path, fieldnames)
_append_to_csv(path, fieldnames, rows, is_write_header)
def _is_empty_file(path):
return os.stat(path).st_size == 0
def _assert_field_names_match(path, fieldnames):
with open(path, 'r') as f:
reader = csv.reader(f)
header = next(reader)
if header != fieldnames:
raise ValueError(f'Incompatible header: expected {fieldnames}, '
f'but existing file has {header}')
def _append_to_csv(path, fieldnames, rows, is_write_header: bool):
with open(path, 'a') as f:
writer = csv.DictWriter(f, fieldnames=fieldnames)
if is_write_header:
writer.writeheader()
writer.writerows(rows)
You can test this with the following code:
file_ = 'countries.csv'
fieldnames_ = ['name', 'area', 'country_code2', 'country_code3']
rows_ = [
{'name': 'Albania', 'area': 28748, 'country_code2': 'AL', 'country_code3': 'ALB'},
{'name': 'Algeria', 'area': 2381741, 'country_code2': 'DZ', 'country_code3': 'DZA'},
{'name': 'American Samoa', 'area': 199, 'country_code2': 'AS', 'country_code3': 'ASM'}
]
append_to_csv(file_, fieldnames_, rows_)
If you run this once you get the following in countries.csv:
name,area,country_code2,country_code3
Albania,28748,AL,ALB
Algeria,2381741,DZ,DZA
American Samoa,199,AS,ASM
And if you run it twice you get the following (note, no second header):
name,area,country_code2,country_code3
Albania,28748,AL,ALB
Algeria,2381741,DZ,DZA
American Samoa,199,AS,ASM
Albania,28748,AL,ALB
Algeria,2381741,DZ,DZA
American Samoa,199,AS,ASM
If you then change the header in countries.csv and run the program again, you'll get a value error, like this:
ValueError: Incompatible header: expected ['name', 'area', 'country_code2', 'country_code3'], but existing file has ['not', 'right', 'fieldnames']

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.

Open a csv file in 'rb' mode and iterate over it - python

Related

Why is my code not working while converting bulk csv to json?

Python 3 Encoding Issues when converting to CSV from JSON

Python csv register_dialect delimiter is not working

Python CSV: Can I do this with one 'with open' instead of two?

python csv, writing headers only once

Categories

Resources