Possibility of writing dictionary items in columns - python

I have a dictionary in which the keys are tuples and the values are lists, like:
{('c4:7d:4f:53:24:be', 'ac:81:12:62:91:df'): [5.998999999999998,0.0013169999,
4.0000000000000972], ('a8:5b:4f:2e:fe:09', 'de:62:ef:4e:21:de'): [7.89899999,
0.15647999999675390, 8.764380000972, 9.200000000]}
I want to write this dictionary to a csv file in the column format like:
('c4:7d:4f:53:24:be', 'ac:81:12:62:91:df') ('a8:5b:4f:2e:fe:09', 'de:62:ef:4e:21:de')
5.998999999999998 7.89899999
0.0013169999 0.15647999999675390
4.0000000000000972 8.764380000972
9.200000000
I know how to write the same thing in row format, using this code:
writer = csv.writer(open('dict.csv', 'wb'))
for key, value in mydict.items():
    writer.writerow([key, value])
How do I write the same thing in columns? Is it even possible? Thanks in advance.
I referred to the Python docs for csv here: http://docs.python.org/2/library/csv.html. There is no information on column-wise writing.

import csv
mydict = {('c4:7d:4f:53:24:be', 'ac:81:12:62:91:df'):
          [5.998999999999998, 0.0013169999, 4.0000000000000972],
          ('a8:5b:4f:2e:fe:09', 'de:62:ef:4e:21:de'):
          [7.89899999, 0.15647999999675390, 8.764380000972, 9.200000000]}
with open('dict.csv', 'wb') as file:
    writer = csv.writer(file, delimiter='\t')
    writer.writerow(mydict.keys())
    for row in zip(*mydict.values()):
        writer.writerow(list(row))
Output file dict.csv:
('c4:7d:4f:53:24:be', 'ac:81:12:62:91:df') ('a8:5b:4f:2e:fe:09', 'de:62:ef:4e:21:de')
5.998999999999998 7.89899999
0.0013169999 0.1564799999967539
4.000000000000097 8.764380000972
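Note that zip stops at the shortest list, which is why the 9.2 from the second column is missing from the output above. A minimal sketch of one way to keep it, assuming Python 3 and that padding short columns with empty cells is acceptable. It uses itertools.zip_longest; the write_columns helper name and the in-memory buffer (standing in for the real file) are only for illustration:

```python
import csv
import io
from itertools import zip_longest

mydict = {
    ('c4:7d:4f:53:24:be', 'ac:81:12:62:91:df'):
        [5.998999999999998, 0.0013169999, 4.0000000000000972],
    ('a8:5b:4f:2e:fe:09', 'de:62:ef:4e:21:de'):
        [7.89899999, 0.15647999999675390, 8.764380000972, 9.200000000],
}


def write_columns(mapping, out_file, fill=''):
    """Write each key as a column header and its list as that column."""
    writer = csv.writer(out_file, delimiter='\t')
    writer.writerow(mapping.keys())
    # zip_longest pads the shorter columns with `fill` instead of
    # stopping at the shortest list the way zip does.
    writer.writerows(zip_longest(*mapping.values(), fillvalue=fill))


# In real use: with open('dict.csv', 'w', newline='') as buffer: ...
buffer = io.StringIO()
write_columns(mydict, buffer)
rows = buffer.getvalue().splitlines()
```

The last row then has an empty first cell followed by 9.2, so the longer column is written in full.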

I am sure you can figure out the formatting:
>>> d.keys() #gives list of keys for first row
[('c4:7d:4f:53:24:be', 'ac:81:12:62:91:df'), ('a8:5b:4f:2e:fe:09', 'de:62:ef:4e:21:de')]
>>> for i in zip(*d.values()): #gives rows with tuple structure for columns
...     print i
(5.998999999999998, 7.89899999)
(0.0013169999, 0.1564799999967539)
(4.000000000000097, 8.764380000972)
For your code, do this:
writer = csv.writer(open('dict.csv', 'wb'))
writer.writerow(mydict.keys())
for values in zip(*mydict.values()):
    writer.writerow(values)
The ()'s and such will not be added to the csv file.

Related

Exporting a complicated dictionary into a csv file

I have a dictionary in which the values are lists. Here is an example:
d = {'20190606_CFMPB576run3_CF33456_12.RCC': [1.0354477611940298, '0.51'],
'20190606_CFMPB576run3_CF33457_05.RCC': [1.0412757973733584, '1.09'],
'20190606_CFMPB576run3_CF33505_06.RCC': [1.0531309297912714, '0.81']}
I am trying to export this dictionary to a csv file. The expected output is:
file_name,citeria1,criteria2
20190606_CFMPB576run3_CF33456_12.RCC,1.0354477611940298,0.51
20190606_CFMPB576run3_CF33457_05.RCC,1.0412757973733584,1.09
20190606_CFMPB576run3_CF33505_06.RCC,1.0531309297912714,0.81
To do so, I wrote the following code:
import csv
with open('mycsvfile.csv', 'w') as f:
    header = ["file_name","citeria1","criteria2"]
    w = csv.DictWriter(f, d.keys())
    w.writeheader()
    w.writerow(d)
But it does not return what I want. Do you know how to fix it?
Change as follows:
import csv
with open('mycsvfile.csv', 'w') as f:
    header = ["file_name", "citeria1", "criteria2"]
    w = csv.writer(f)
    w.writerow(header)
    for key, lst in d.items():
        w.writerow([key] + lst)
A DictWriter is given the field/column names and assumes the rows to be provided as dictionaries with keys corresponding to the given field names. In your case, the data structure is different. You can use a simple csv.writer as your rows are a mixture of keys and values of your given dictionary.
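For comparison, here is a sketch of how DictWriter could be made to work with this data, assuming Python 3: each row is first turned into a dictionary keyed by the field names. The in-memory buffer is only a stand-in for the real file, and the sample is shortened to two entries.

```python
import csv
import io

d = {'20190606_CFMPB576run3_CF33456_12.RCC': [1.0354477611940298, '0.51'],
     '20190606_CFMPB576run3_CF33457_05.RCC': [1.0412757973733584, '1.09']}
header = ["file_name", "citeria1", "criteria2"]

# In real use: with open('mycsvfile.csv', 'w', newline='') as out: ...
out = io.StringIO()
w = csv.DictWriter(out, fieldnames=header)
w.writeheader()
for key, lst in d.items():
    # Build one dict per row whose keys match the field names exactly.
    w.writerow(dict(zip(header, [key] + lst)))

lines = out.getvalue().splitlines()
```

This gives the same file as the plain csv.writer version; it is mostly worthwhile when the rows already exist as dictionaries.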

Convert list of dictionaries to csv file

I have a list of dictionaries of this kind :
[
{'site1':'data1'},
{'site2':'data2'}
]
What would be the proper way to generate a csv file with the data in this order?
row1 row2
site1 data1
site2 data2
Loop through the dictionaries and write them to the file.
list_of_dicts = [{'site1':'data1'},{'site2':'data2'}]
with open('sites.csv', 'w') as file:
    file.write('row1\trow2\n')
    for dictionary in list_of_dicts:
        file.write('\t'.join(list(dictionary.items())[0]) + '\n')
output:
row1 row2
site1 data1
site2 data2
Note that this requires each dictionary to have only one entry. If a dictionary has more, only the first entry (in the dictionary's iteration order) is written and the others are ignored. There are many different ways to handle more than one entry per dictionary, so you would need to add the expected behaviour to the question for those cases to be accommodated.
This should do the trick :)
data = [{'site1': 'data1'}, {'site2': 'data2'}]
with open('list.csv', 'w') as f:
    for entry in data:  # named entry to avoid shadowing the built-in dict
        for key, value in entry.items():
            f.write(key + ',' + value + '\n')
I like to use a pandas DataFrame to hold my data and write it to a csv file:
import pandas as pd
a = [{'site1':'data1'},{'site2':'data2'}]
# Get each key and value from each dictionary in the list
keys = []
vals = []
for a1 in a:
    for k, v in a1.items():
        keys.append(k)
        vals.append(v)
# make the dataframe from the keys and values
result = pd.DataFrame({'row1': keys, 'row2': vals})
# write the data to csv; use index=False to not write the row numbers
result.to_csv("mydata.csv", index=False)
You should use a CSV writer to make sure that any embedded metacharacters such as commas and quotes are escaped properly; otherwise, data such as {'site3':'data, data and more data'} will corrupt the file.
import csv
my_list = [{'site1':'data1'}, {'site2':'data2'}]
with open('test.csv', 'w', newline='') as out_fp:
    writer = csv.writer(out_fp)
    for d in my_list:
        writer.writerows(d.items())
You could shorten that up a bit with itertools if you want:
import itertools
with open('test.csv', 'w', newline='') as out_fp:
    csv.writer(out_fp).writerows(itertools.chain.from_iterable(
        d.items() for d in my_list))

Creating a single dictionary from two tab delimited files

I'm somewhat new to Python and still trying to learn all its tricks and exploitations.
I'm looking to see if it's possible to collect column data from two separate files to create a single dictionary, rather than two distinct dictionaries. The code that I've used to import files before looks like this:
import csv
from collections import defaultdict
columns = defaultdict(list)
with open("myfile.txt") as f:
    reader = csv.DictReader(f,delimiter='\t')
    for row in reader:
        for (header,variable) in row.items():
            columns[header].append(variable)
f.close()
This code makes each element of the first line of the file into a header for the columns of data below it. What I'd like to do now is to import a file that only contains one line which I'll use as my header, and import another file that only contains data that I'll match the headers up to. What I've tried so far resembles this:
columns = defaultdict(list)
with open("headerData.txt") as g:
    reader1 = csv.DictReader(g,delimiter='\t')
    for row in reader1:
        for (h,v) in row.items():
            columns[h].append(v)
    with open("variableData.txt") as f:
        reader = csv.DictReader(f,delimiter='\t')
        for row in reader:
            for (h,v) in row.items():
                columns[h].append(v)
Is nesting the open statements the right way to attempt this? Honestly I am totally lost on what to do. Any help is greatly appreciated.
You can't use DictReader like that if the headers are not in the file. But you can create a fake file object that yields the headers and then the data, using itertools.chain:
from itertools import chain
with open('headerData.txt') as h, open('variableData.txt') as data:
    f = chain(h, data)
    reader = csv.DictReader(f,delimiter='\t')
    # proceed with your code from the first snippet
    # no close() calls needed when using open() with "with" statements
Another way of course would be to just read the headers into a list and use a regular csv.reader on variableData.txt:
with open('headerData.txt') as h:
    # strip the trailing newline so the last header name is clean
    names = next(h).rstrip('\n').split('\t')
with open('variableData.txt') as f:
    reader = csv.reader(f, delimiter='\t')
    for row in reader:
        for name, value in zip(names, row):
            columns[name].append(value)
By default, DictReader will take the first line in your csv file and use that as the keys for the dict. However, according to the docs, you can also pass it a fieldnames parameter, which is a sequence containing the names of the keys to use for the dict. So you could do this:
columns = defaultdict(list)
with open("headerData.txt") as f, open("variableData.txt") as data:
    reader = csv.DictReader(data,
                            fieldnames=f.read().rstrip().split('\t'),
                            delimiter='\t')
    for row in reader:
        for (h,v) in row.items():
            columns[h].append(v)
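To try the fieldnames approach without creating the two files, here is a small self-contained sketch, assuming Python 3. The header and data strings are made-up stand-ins for the contents of headerData.txt and variableData.txt:

```python
import csv
import io
from collections import defaultdict

# Hypothetical file contents, held in memory for the demonstration.
header_text = "name\tage\tcity\n"
data_text = "Ann\t31\tOslo\nBob\t45\tLima\n"

columns = defaultdict(list)
# The header line supplies the keys; the data stream has no header row.
names = header_text.rstrip('\n').split('\t')
reader = csv.DictReader(io.StringIO(data_text),
                        fieldnames=names,
                        delimiter='\t')
for row in reader:
    for h, v in row.items():
        columns[h].append(v)
```

Because fieldnames is given explicitly, DictReader treats the first line of the data as data rather than consuming it as a header. All values come back as strings.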

Syntax - saving a dictionary as a csv file

I am trying to "clean up" some data - I'm creating a dictionary of the channels that I need to keep and then I've got an if block to create a second dictionary with the correct rounding.
Dictionary looks like this:
{'time, s': (imported array), 'x temp, C':(imported array),
'x pressure, kPa': (diff. imported array).....etc}
Each imported array is 1-d.
I was looking at this example, but I didn't quite get the way to parse it so that I ended up with what I want.
My desired output is a csv file (do not care if the delimiter is spaces or commas or whatever) with the first row being the keys and the subsequent rows simply being the values.
I feel like what I'm missing is how to use the map function properly.
Also, I'm wondering if I'm using DictWriter when I should be using DictReader.
This is what I originally tried:
with open((filename), 'wb') as outfile:
    write = csv.DictWriter(outfile, Fieldname_order)
    write.writer.writerow(Fieldname_order)
    write.writerows(data)
DictWriter's API doesn't match the data structure you have. DictWriter requires a list of dictionaries. You have a dictionary of lists.
You can use the ordinary csv.writer:
my_data = {'time, s': [0,1,2,3], 'x temp, C':[0,10,20,30],
           'x pressure, kPa': [0,100,200,300]}
import csv
with open('outfile.csv', 'w') as outfile:
    writer = csv.writer(outfile)
    writer.writerow(my_data.keys())
    writer.writerows(zip(*my_data.values()))
That writes the columns in the dictionary's iteration order, which is arbitrary in Python versions before 3.7 and may change from run to run. One way to make the order consistent is to replace the last two lines with:
    writer.writerow(sorted(my_data.keys()))
    writer.writerows(zip(*(my_data[k] for k in sorted(my_data.keys()))))
Edit: in this example data is a list of dictionaries. Each row in the csv contains one value for each key.
To write your dictionary with a header row and then data rows:
with open(filename, 'wb') as outfile:
    writer = csv.DictWriter(outfile, fieldnames)
    writer.writeheader()
    writer.writerows(data)
To read in data as a dictionary then you do need to use DictReader:
with open(filename, 'r') as infile:
    reader = csv.DictReader(infile)
    data = [row for row in reader]

Write Python dictionary to CSV where keys = columns, values = rows

I have a list of dictionaries that I want to be able to open in Excel, formatted correctly. This is what I have so far, using csv:
list_of_dicts = [{'hello': 'goodbye'}, {'yes': 'no'}]
out_path = "/docs/outfile.txt"
out_file = open(out_path, 'wb')
writer = csv.writer(out_file, dialect='excel')
for items in list_of_dicts:
    for k, v in items.items():
        writer.writerow([k, v])
Obviously, when I open the output in Excel, it's formatted like this:
key value
key value
What I want is this:
key key key
value value value
I can't figure out how to do this, so help would be appreciated. Also, I want the column names to be the dictionary keys, instead of the default 'A, B, C' etc. Sorry if this is stupid.
Thanks
The csv module has a DictWriter class for this, which is covered quite nicely in another SO answer. The critical point is that you need to know all your column headings when you instantiate the DictWriter. You could construct the list of field names from your list_of_dicts; if so, your code becomes:
list_of_dicts = [{'hello': 'goodbye'}, {'yes': 'no'}]
out_path= "/docs/outfile.txt"
out_file = open(out_path, 'wb')
fieldnames = sorted(list(set(k for d in list_of_dicts for k in d)))
writer = csv.DictWriter(out_file, fieldnames=fieldnames, dialect='excel')
writer.writeheader() # Assumes Python >= 2.7
for row in list_of_dicts:
    writer.writerow(row)
out_file.close()
The way I've constructed fieldnames scans the entire list_of_dicts, so it will slow down as the size increases. You should instead construct fieldnames directly from the source of your data e.g. if the source of your data is also a csv file you can use a DictReader and use fieldnames = reader.fieldnames.
You can also replace the for loop with a single call to writer.writerows(list_of_dicts) and use a with block to handle file closure, in which case your code would become
list_of_dicts = [{'hello': 'goodbye'}, {'yes': 'no'}]
out_path= "/docs/outfile.txt"
fieldnames = sorted(list(set(k for d in list_of_dicts for k in d)))
with open(out_path, 'wb') as out_file:
    writer = csv.DictWriter(out_file, fieldnames=fieldnames, dialect='excel')
    writer.writeheader()
    writer.writerows(list_of_dicts)
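To see what DictWriter actually produces for dictionaries that do not share keys, here is a small sketch, assuming Python 3.7+ (text mode instead of 'wb', and dict.fromkeys to keep the keys in first-seen order rather than sorting them). The in-memory buffer only stands in for the output file:

```python
import csv
import io

list_of_dicts = [{'hello': 'goodbye'}, {'yes': 'no'}]

# dict.fromkeys drops duplicate keys while keeping first-seen order.
fieldnames = list(dict.fromkeys(k for d in list_of_dicts for k in d))

# In real use: with open(out_path, 'w', newline='') as out: ...
out = io.StringIO()
# restval is what gets written for keys a given row does not have.
writer = csv.DictWriter(out, fieldnames=fieldnames, restval='',
                        dialect='excel')
writer.writeheader()
writer.writerows(list_of_dicts)

lines = out.getvalue().splitlines()
```

Each dictionary becomes its own row, with empty cells under the columns it has no value for, so the result is a header row followed by `goodbye,` and `,no` rather than a single row of values.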
You need to write 2 separate rows, one with the keys, one with the values, instead:
writer = csv.writer(out_file, dialect='excel')
writer.writerow([k for d in list_of_dicts for k in d])
writer.writerow([v for d in list_of_dicts for v in d.itervalues()])
The two list comprehensions extract first all the keys, then all the values, from the dictionaries in your input list, combining these into one list each to write to the CSV file.
I think the most useful approach is to write the data column by column, so that each key is a column (good for later data processing, e.g. for ML).
I had some trouble figuring it out yesterday, but I came up with a solution based on one I saw on another website. However, from what I can see it is not possible to go through the whole dictionary at once, so we have to split it into smaller dictionaries (my csv file had 20k rows in the end: surveyed people, their data and answers). I did it like this:
# writing dict to csv
# 'cleaned' is the output file object, already opened for writing
# 'd' is the dictionary of lists
# 1 header: the field names are going to be the column names
# 2 create writer
writer = csv.DictWriter(cleaned, d.keys())
# 3 attach header
writer.writeheader()
# write separate dictionaries, one per row
for i in range(len(list(d.values())[0])):
    writer.writerow({key: d[key][i] for key in d.keys()})
I see my solution has one more for loop, but on the other hand, I think it takes less memory (but I am not sure!).
Hope it helps somebody ;)
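For what it is worth, here is a small sketch, assuming Python 3 and made-up sample data, that runs the row-by-row DictWriter approach next to the csv.writer plus zip approach on the same dictionary of lists. Both write to in-memory buffers so the outputs can be compared directly:

```python
import csv
import io

d = {'age': [31, 45], 'city': ['Oslo', 'Lima']}  # hypothetical sample data

# Row-by-row approach: build one small dict per row for DictWriter.
out_a = io.StringIO()
dict_writer = csv.DictWriter(out_a, fieldnames=list(d))
dict_writer.writeheader()
n_rows = len(next(iter(d.values())))  # length of the first column
for i in range(n_rows):
    dict_writer.writerow({key: d[key][i] for key in d})

# Alternative: zip walks all columns in lockstep, one row tuple at a time.
out_b = io.StringIO()
plain_writer = csv.writer(out_b)
plain_writer.writerow(d.keys())
plain_writer.writerows(zip(*d.values()))

same_output = out_a.getvalue() == out_b.getvalue()
```

Both produce the same file. In Python 3 zip is lazy, so it also yields one row at a time instead of building every row up front.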
