I am trying to "clean up" some data - I'm creating a dictionary of the channels that I need to keep and then I've got an if block to create a second dictionary with the correct rounding.
The dictionary looks like this:
{'time, s': (imported array), 'x temp, C': (imported array),
 'x pressure, kPa': (different imported array), ...etc}
Each imported array is 1-d.
I was looking at this example, but I didn't quite see how to adapt it to end up with what I want.
My desired output is a CSV file (I don't care whether the delimiter is spaces, commas, or anything else) with the first row being the keys and the subsequent rows being the values.
I feel like what I'm missing is how to use the map function properly.
Also, I'm wondering if I'm using DictWriter when I should be using DictReader.
This is what I originally tried:
with open((filename), 'wb') as outfile:
    write = csv.DictWriter(outfile, Fieldname_order)
    write.writer.writerow(Fieldname_order)
    write.writerows(data)
DictWriter's API doesn't match the data structure you have: DictWriter requires a list of dictionaries, but you have a dictionary of lists.
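To make the mismatch concrete (values made up):
# what DictWriter expects: a list of dictionaries, one per row
rows = [{'time, s': 0, 'x temp, C': 0},
        {'time, s': 1, 'x temp, C': 10}]

# what you have: a dictionary of lists, one per column
columns = {'time, s': [0, 1], 'x temp, C': [0, 10]}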
You can use the ordinary csv.writer:
my_data = {'time, s': [0, 1, 2, 3], 'x temp, C': [0, 10, 20, 30],
           'x pressure, kPa': [0, 100, 200, 300]}
import csv
with open('outfile.csv', 'w', newline='') as outfile:
    writer = csv.writer(outfile)
    writer.writerow(my_data.keys())
    writer.writerows(zip(*my_data.values()))
That will write the columns in arbitrary order, and the order may change from run to run. One way to make the order consistent is to replace the last two lines with:
    writer.writerow(sorted(my_data.keys()))
    writer.writerows(zip(*(my_data[k] for k in sorted(my_data.keys()))))
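With the my_data above, the sorted version should produce an outfile.csv like this (the header fields are quoted because the keys themselves contain commas):
"time, s","x pressure, kPa","x temp, C"
0,0,0
1,100,10
2,200,20
3,300,30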
Edit: in that example, data is a list of dictionaries, and each row in the csv contains one value for each key.
To write your dictionary with a header row and then data rows:
with open(filename, 'w', newline='') as outfile:  # use 'wb' on Python 2
    writer = csv.DictWriter(outfile, fieldnames)
    writer.writeheader()
    writer.writerows(data)
To read the data back in as dictionaries, you do need to use DictReader:
with open(filename, 'r', newline='') as infile:
    reader = csv.DictReader(infile)
    data = [row for row in reader]
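And if you want to get back to the original dict-of-lists shape after reading, you can invert the rows; a small sketch using the reader's fieldnames:
with open(filename, 'r', newline='') as infile:
    reader = csv.DictReader(infile)
    rows = list(reader)
    # one list per column, keyed by header
    data = {name: [row[name] for row in rows] for name in reader.fieldnames}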
Related
I can read a text file of names and print them to the console in ascending order. Now I simply want to write the sorted names to a single column in a CSV file. Can't I take the printed output and send it to the CSV?
Thanks!
import csv
with open('/users/h/documents/pyprojects/boy-names.txt', 'r') as file:
    for file in sorted(file):
        print(file, end='')

# the following isn't working
with open('/users/h/documents/pyprojects/boy-names.csv', 'w', newline='') as csvFile:
    names = ['Column1']
    writer = csv.writer(names)
    print(file)
You can do something like this:
import csv

with open('boy-names.txt', 'rt') as file, open('boy-names.csv', 'w', newline='') as csv_file:
    csv_writer = csv.writer(csv_file, quoting=csv.QUOTE_MINIMAL)
    csv_writer.writerow(['Column1'])
    for boy_name in sorted(file.readlines()):
        boy_name = boy_name.rstrip('\n')
        print(boy_name)
        csv_writer.writerow([boy_name])
This is covered in the documentation.
The only tricky part is converting the lines from the file to a list of 1-element lists.
import csv

with open('/users/h/documents/pyprojects/boy-names.txt', 'r') as file:
    names = [[k.strip()] for k in sorted(file.readlines())]

with open('/users/h/documents/pyprojects/boy-names.csv', 'w', newline='') as csvFile:
    writer = csv.writer(csvFile)
    writer.writerow(['Column1'])
    writer.writerows(names)
So, names will contain (for example):
[['Able'],['Baker'],['Charlie'],['Delta']]
The CSV writer expects to write a row or a set of rows, and each row has to be a list (or tuple). That's why I created the data the way I did: the outer list passed to writerows contains the set of rows to be written, each element of the outer list is one row, and since I want each row to contain a single item, each row is a one-element list.
If I had created this:
['Able','Baker','Charlie','Delta']
then writerows would have treated each string as a sequence, resulting in a CSV file like this:
A,b,l,e
B,a,k,e,r
C,h,a,r,l,i,e
D,e,l,t,a
which is amusing but not very useful. And I know that because I did it while I was creating your answer.
I have the code below to write my nested list to a csv file. The nested list looks like this:
[['19181011', '13041519', '22121605', '11142007', '23000114'],
['1523141612', '2403051513', '0806022324', '1614012422', '0516121805'],
['23201621', '24171811', '08231524', '16011022', '17131220'],
['2317241822', '2220112421', '1124052211', '1010192318', '2108231524'],
['11220215', '24240507', '19180423', '07081422', '21201224']]
with open('MLpredictions.csv', 'w') as f:
    writer = csv.writer(f, delimiter=';', lineterminator='\n')
    writer.writerows(high5_pred)
But when I execute this code, I get the following in the csv file:
19181011;13041519;22121605;11142007;23000114
1523141612;2403051513;0806022324;1614012422;0516121805....
I changed the delimiter to ',' but then I get 5 different columns.
I want each list to be one row, separated by ',' and not ';'.
Expected output, a single column:
19181011,13041519,22121605,11142007,23000114
1523141612,2403051513,0806022324,1614012422,0516121805
Any ideas how to do this?
Assuming that there is a specific reason why you want the data all in one column:
The reason you're getting separate columns is that you're using the csv format and your data is not escaped. Your raw file looks like this:
19181011,13041519,22121605,11142007,23000114
1523141612,2403051513,0806022324,1614012422,0516121805
but for a comma-delimited file to parse as a single column, it would need to look like this:
"19181011,13041519,22121605,11142007,23000114"
"1523141612,2403051513,0806022324,1614012422,0516121805"
You're probably best off building a single string for each "row" of the output file, then wrapping each string in a one-element list so the writer treats it as one field rather than as a sequence of characters. I'd do the following:
with open('MLpredictions.csv', 'w', newline='') as f:
    writer = csv.writer(f, delimiter=';', lineterminator='\n')
    # join each inner list into one comma-separated string, then wrap it in a
    # one-element list; a bare string would be split into individual characters
    rows = [[','.join(str(number) for number in row)] for row in high5_pred]
    writer.writerows(rows)
Note: unless you have a good reason why you don't want these numbers in different columns, I'd leave your code as is. It will be a lot easier to deal with the native csv format
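If you do keep the native format, a short sketch of getting the nested list back from the ';'-delimited file your original code writes:
import csv

with open('MLpredictions.csv', newline='') as f:
    reader = csv.reader(f, delimiter=';')
    high5_pred = [row for row in reader]  # each row is recovered as a list of strings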
Most of the samples here show hard-coded columns rather than iterating over them. I have 73 columns that I want iterated over and expressed properly in the JSON.
import csv
import json

CSV_yearly = r'C:\path\yearly.csv'
JSON_yearly = r'C:\path\json_yearly.json'

with open(CSV_yearly, 'r') as csv_file:
    reader = csv.DictReader(csv_file)
    with open(JSON_yearly, 'w') as json_file:
        for row in reader:
            json_file.write(json.dumps(row) + ',' + '\n')
print("done")
Though this creates a JSON file, it does so improperly. I saw examples where an argument passed inside the reader took a list, but I don't want to type out all 73 column names from the csv. My guess is the missing line of code goes between the start of the with block and the reader.
Your code creates each line in the file as a separate JSON object (sometimes called JSON Lines or jsonl format). Collect the rows in a list and then serialise the whole list as JSON:
with open(CSV_yearly, 'r') as csv_file:
    reader = csv.DictReader(csv_file)
    with open(JSON_yearly, 'w') as json_file:
        rows = list(reader)
        json.dump(rows, json_file)
Note that some consumers of JSON expect an object rather than a list as an outer container, in which case your data would have to be
rows = {'data': list(reader)}
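With a couple of hypothetical csv rows, that wrapped version would serialise to something like:
{"data": [{"year": "2019", "total": "41"}, {"year": "2020", "total": "47"}]}
(Note that every value is a string; see the second comment below.)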
Update: questions from comments
Do you know why the result did not order my columns accordingly?
csv.DictReader uses a standard Python dictionary to create rows, so the order of keys is arbitrary in Python versions before 3.7. If key order must be preserved, try using an OrderedDict:
from collections import OrderedDict

out = []
with open('mycsv.csv', 'r', newline='') as f:  # use 'rb' on Python 2
    reader = csv.reader(f)
    headings = next(reader)  # assumes the first row is headings; otherwise supply your own list
    for row in reader:
        od = OrderedDict(zip(headings, row))
        out.append(od)
# dump out to file using the json module
Be aware that while this may output json with the required key order, consumers of the json are not required to respect it.
Do you also know why the values in the json were converted into strings and did not remain numbers, i.e. why they have quotes around them?
All values read from a csv file are strings. If you want different types, you need to perform the necessary conversions after reading from the csv file.
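For example, a minimal sketch of converting after reading, assuming any field that parses as a number should become a float (adjust per column if you know the schema):
import csv

rows = []
with open(CSV_yearly, 'r') as csv_file:
    reader = csv.DictReader(csv_file)
    for row in reader:
        for key, value in row.items():
            try:
                row[key] = float(value)  # numeric-looking strings become numbers
            except ValueError:
                pass  # everything else stays a string
        rows.append(row)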
I have a CSV that looks something like this:
Point ID,Northing,Easting,Elevation,Attributes,Feature Code,Combined Scale Factor
1000,5000000,400000,100,Pipe_ln:pipeline_ln:DEPTH:2.5|Pipe_ln:pipeline_ln:COVER_TYPE:DIRT/GRASS|Pipe_ln:pipeline_ln:COMPANY:ABC|Pipe_ln:pipeline_ln:SIZE:10|Pipe_ln:pipeline_ln, pipe_ln,0.999665
As you can see, the Attributes column contains
Pipe_ln:pipeline_ln:DEPTH:2.5|Pipe_ln:pipeline_ln:COVER_TYPE:DIRT/GRASS|Pipe_ln:pipeline_ln:COMPANY:ABC|Pipe_ln:pipeline_ln:SIZE:10|Pipe_ln:pipeline_ln
from which I would like to extract DEPTH:2.5. (There are ~50 rows, and the depth varies from row to row.)
So far my code is as follows:
field_list = []
with open(infile, 'rb') as raw_csv:
    reader = DictReader(raw_csv, delimiter=',')
    fieldnames = reader.fieldnames + ['DEPTH']
    with open(outfile, 'wb') as out_csv:
        writer = DictWriter(out_csv, delimiter=',', fieldnames=fieldnames)
        headers = {}
        for header in writer.fieldnames:
            headers[header] = header
        writer.writerow(headers)
        for row in reader:
            if 'Pipe_ln' in row['Feature Code']:
                field_list.append(row['Attributes'].replace("|", ":").split(":"))
            writer.writerow(row)
DEPT_vals = header_tuple(field_list)
After creating the DictReader, I manually add the DEPTH field; then, on the last line of my code, header_tuple is a function I wrote that produces a list of tuples of the form [('DEPTH', 2.5), ('DEPTH', 2.1), ...]
I can't wrap my head around the proper way, or any way, to get the tuple depth values written under the DEPTH header. Any suggestions? Am I taking unnecessary steps? (Someone suggested extracting the information I wanted from Attributes into tuples, which is why I have them as such.)
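For what it's worth, here is a minimal sketch of one way to skip the tuples entirely: parse the depth out of Attributes while iterating and store it in the row dict before writing. It is untested against the full file, and it assumes 'DEPTH' is always immediately followed by its value in the Attributes string:
from csv import DictReader, DictWriter

with open(infile, 'rb') as raw_csv, open(outfile, 'wb') as out_csv:
    reader = DictReader(raw_csv, delimiter=',')
    fieldnames = reader.fieldnames + ['DEPTH']
    writer = DictWriter(out_csv, delimiter=',', fieldnames=fieldnames)
    writer.writeheader()
    for row in reader:
        # 'a:b:DEPTH:2.5|c:d' -> ['a', 'b', 'DEPTH', '2.5', 'c', 'd']
        tokens = row['Attributes'].replace('|', ':').split(':')
        if 'DEPTH' in tokens:
            row['DEPTH'] = tokens[tokens.index('DEPTH') + 1]
        writer.writerow(row)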
I'm somewhat new to Python and still trying to learn all its tricks and idioms.
I'm looking to see if it's possible to collect column data from two separate files to create a single dictionary, rather than two distinct dictionaries. The code that I've used to import files before looks like this:
import csv
from collections import defaultdict
columns = defaultdict(list)
with open("myfile.txt") as f:
reader = csv.DictReader(f,delimiter='\t')
for row in reader:
for (header,variable) in row.items():
columns[header].append(variable)
f.close()
This code makes each element of the first line of the file into a header for the columns of data below it. What I'd like to do now is to import a file that only contains one line which I'll use as my header, and import another file that only contains data that I'll match the headers up to. What I've tried so far resembles this:
columns = defaultdict(list)
with open("headerData.txt") as g:
    reader1 = csv.DictReader(g, delimiter='\t')
    for row in reader1:
        for (h, v) in row.items():
            columns[h].append(v)

with open("variableData.txt") as f:
    reader = csv.DictReader(f, delimiter='\t')
    for row in reader:
        for (h, v) in row.items():
            columns[h].append(v)
Is nesting the open statements the right way to attempt this? Honestly I am totally lost on what to do. Any help is greatly appreciated.
You can't use DictReader like that if the headers are not in the file. But you can create a fake file object that would yield the headers and then the data, using itertools.chain:
from itertools import chain
with open('headerData.txt') as h, open('variableData.txt') as data:
    f = chain(h, data)
    reader = csv.DictReader(f, delimiter='\t')
    # proceed with your code from the first snippet;
    # no close() calls are needed when files are opened in a with statement
Another way of course would be to just read the headers into a list and use regular csv.reader on variableData.txt:
with open('headerData.txt') as h:
    # read the single header line and strip the trailing newline
    names = next(h).rstrip('\n').split('\t')

with open('variableData.txt') as f:
    reader = csv.reader(f, delimiter='\t')
    for row in reader:
        for name, value in zip(names, row):
            columns[name].append(value)
By default, DictReader will take the first line in your csv file and use that as the keys for the dict. However, according to the docs, you can also pass it a fieldnames parameter, which is a sequence containing the names of the keys to use for the dict. So you could do this:
columns = defaultdict(list)
with open("headerData.txt") as f, open("variableData.txt") as data:
reader = csv.DictReader(data,
fieldnames=f.read().rstrip().split('\t'),
delimiter='\t')
for row in reader:
for (h,v) in row.items():
columns[h].append(v)
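For instance, with hypothetical files where headerData.txt contains the single line name<TAB>age and variableData.txt contains the lines Alice<TAB>30 and Bob<TAB>25, columns should end up as:
defaultdict(<class 'list'>, {'name': ['Alice', 'Bob'], 'age': ['30', '25']})
(All values are strings, as always with the csv module.)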