Storing a list in a CSV file using Python

I'm trying to store a list-type variable in a CSV file using Python. Here is what I got after hours on Stack Overflow and the Python documentation:
Code:
import csv

row = {'SgDescription': 'some group', 'SgName': 'sku', 'SgGroupId': u'sg-abcdefgh'}
new_csv_file = open("new_file.csv", 'wb')
ruleswriter = csv.writer(new_csv_file, dialect='excel', delimiter=',')
ruleswriter.writerows(row)
new_csv_file.close()
Result:
$ more new_file.csv
S,g,D,e,s,c,r,i,p,t,i,o,n
S,g,N,a,m,e
S,g,G,r,o,u,p,I,d
Can anyone please advise how to store the values to the file like this:
some group,sku,sg-abcdefgh
Thanks a ton!

writerows() expects a sequence of sequences, for example a list of lists. You're passing in a dict, and a dict happens to be iterable: iterating over it yields the dictionary's keys. Each key, a string, happens to be iterable as well. So each cell ends up holding one element of the inner iterable, which is a single character. You got exactly what you asked for :-)
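A quick illustration of that chain of iterations, using the names from the question:

```python
row = {'SgDescription': 'some group', 'SgName': 'sku', 'SgGroupId': 'sg-abcdefgh'}

# Iterating over a dict yields its keys...
print(list(row))       # ['SgDescription', 'SgName', 'SgGroupId']

# ...and iterating over a key (a string) yields single characters,
# which is exactly what ended up in the CSV cells.
print(list('SgName'))  # ['S', 'g', 'N', 'a', 'm', 'e']
```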
What you want to do is write one row with the keys in it, and then maybe another with the values, e.g.:
import csv
row = {
    'SgDescription': 'some group',
    'SgName': 'sku',
    'SgGroupId': u'sg-abcdefgh',
}
with open("new_file.csv", 'wb') as f:
    ruleswriter = csv.writer(f)
    ruleswriter.writerows([row.keys(), row.values()])
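A sketch of what that produces, using io.StringIO in place of a real file so the result can be shown inline (in Python 3, open the file with 'w' and newline='' rather than 'wb'):

```python
import csv
import io

row = {'SgDescription': 'some group', 'SgName': 'sku', 'SgGroupId': 'sg-abcdefgh'}

buf = io.StringIO()  # stands in for the opened file
csv.writer(buf).writerows([row.keys(), row.values()])

print(buf.getvalue())
# SgDescription,SgName,SgGroupId
# some group,sku,sg-abcdefgh
```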
If order is important, use collections.OrderedDict.

Extract the desired data before writing it to the CSV file:
row = [row['SgDescription'], row['SgName'], row['SgGroupId']]  # ['some group', 'sku', u'sg-abcdefgh']

# write to a csv file
with open("new_file.csv", 'wb') as f:
    ruleswriter = csv.writer(f)
    ruleswriter.writerow(row)
PS: if you don't care about the order, just use row.values().
Or use csv.DictWriter,
import csv
row = {'SgDescription': 'some group', 'SgName': 'sku', 'SgGroupId': u'sg-abcdefgh'}
with open('new_file.csv', 'w') as csvfile:
    fieldnames = ['SgDescription', 'SgName', 'SgGroupId']
    writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
    writer.writerow(row)
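If you also want a header row, DictWriter's writeheader() pairs naturally with writerow(). A minimal sketch, again using io.StringIO so the output can be shown:

```python
import csv
import io

row = {'SgDescription': 'some group', 'SgName': 'sku', 'SgGroupId': 'sg-abcdefgh'}
fieldnames = ['SgDescription', 'SgName', 'SgGroupId']

buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=fieldnames)
writer.writeheader()  # column names first
writer.writerow(row)  # then the values, in fieldname order

print(buf.getvalue())
# SgDescription,SgName,SgGroupId
# some group,sku,sg-abcdefgh
```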

Related

CSV to JSON array only numbers

import csv
import json

myDict = {}
jsonStr = json.dumps(myDict)
print(jsonStr)
with open('test - Cópia.csv', 'rb', encoding=) as csvdata:
    reader = csv.DictReader(csvdata, fieldnames=['Time', 'Yaw', 'Pitch', 'Roll', 'Ax', 'Ay', 'Az', 'Gx', 'Gy', 'Gz', 'Mx', 'My', 'Mz'])
    json.dump([row for row in reader], open('output.json', 'w+'))
I have a problem that I can't figure out yet. I have a CSV containing only numbers, with no header; it has 14 columns, and the intended column names are the ones listed in the fieldnames above. I have to create a JSON file that maps each column name to an array of all the numbers in that column of the CSV.
CSV is like this :
1364.00,0.15,0.36,-0.13,-3.24,-0.42,-0.15,0.90,0.00,-0.01,0.02,0.26,0.01,-0.04
1374.00,0.30,0.76,-0.25,-3.25,-0.41,-0.13,0.91,0.00,-0.00,0.02,0.26,0.01,-0.04
1384.00,0.45,1.08,-0.35,-3.17,-0.41,-0.10,1.00,-0.00,-0.01,0.02,0.26,0.01,-0.07
1394.00,0.61,1.44,-0.49,-3.21,-0.40,-0.10,1.01,-0.00,-0.01,0.02,0.26,0.01,-0.07
1404.00,0.77,1.81,-0.65,-3.25,-0.40,-0.11,1.00,-0.01,-0.01,0.02,0.26,0.01,-0.07
1414.00,0.92,2.12,-0.83,-3.29,-0.38,-0.14,0.98,-0.00,-0.01,0.02,0.26,0.01,-0.07
1424.00,1.05,2.43,-1.01,-3.34,-0.37,-0.14,0.96,-0.00,-0.01,0.02,0.26,0.01,-0.07
1434.00,1.21,2.78,-1.15,-2.95,-0.38,-0.10,0.91,-0.00,-0.01,0.02,0.26,0.01,-0.05
1444.00,1.35,3.10,-1.27,-2.97,-0.37,-0.09,0.90,-0.00,-0.01,0.02,0.26,0.01,-0.05
1454.00,1.49,3.42,-1.39,-2.99,-0.37,-0.10,0.90,-0.00,-0.01,0.02,0.26,0.01,-0.05
1464.00,1.62,3.74,-1.57,-3.02,-0.37,-0.14,0.90,-0.00,-0.01,0.02,0.26,0.01,-0.05
1474.00,1.74,4.08,-1.77,-3.05,-0.38,-0.16,0.87,-0.00,-0.01,0.02,0.26,0.01,-0.05
2054.00,8.39,14.06,-10.55,-0.08,-0.05,0.06,1.20,-0.01,0.02,-0.00,0.24,-0.01,-0.04
and I want to create a JSON file like this
session 1 { "Time":  [an array with all the numbers that are in column 0 of the csv],
            "Pitch": [an array with all the numbers that are in column 1 of the csv],
            ...
          }
Here's how to do it using only built-in functions and modules included in Python's standard library.
As I mentioned in a comment, you will need to read in the entire CSV file first. This is because its row and columns need to be transposed in order to output them as an array the way you want. Fortunately doing that is easy using the built-in zip() function.
Also note the use of csv.reader instead of csv.DictReader. This change was made because the zip() function couldn't (easily) be used on a list of dictionaries. The field names are still used, but not until the dictionary is created, as described next. Note that this will ignore the extra value in each row that has no fieldname.
You can use a dictionary comprehension to create one formatted the way you want before calling json.dump() to write it to the output file.
import csv
import json
fieldnames = 'Time', 'Yaw', 'Pitch', 'Roll', 'Ax', 'Ay', 'Az', 'Gx', 'Gy', 'Gz', 'Mx', 'My', 'Mz'
csv_filepath = 'test - Cópia.csv'
json_filepath = 'output.json'

with open(csv_filepath, 'r', newline='') as csv_file:
    rows = (map(float, row) for row in csv.reader(csv_file))  # Expr to read entire CSV file.
    data = tuple(zip(*rows))  # Transpose the CSV file's rows and cols.

# Pair each fieldname with its (transposed) column.
my_dict = {'session 1': {fieldname: column
                         for fieldname, column in zip(fieldnames, data)}}

with open(json_filepath, 'w') as json_file:
    json.dump(my_dict, json_file, indent=4)

print('done')
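The transpose step is the heart of this answer: zip(*rows) flips rows and columns, which a tiny example (two shortened rows from the sample data) makes clear:

```python
rows = [[1364.0, 0.15, 0.36],
        [1374.0, 0.30, 0.76]]

# Unpacking the rows into zip() pairs up the i-th element of every row,
# i.e. it yields the columns.
columns = list(zip(*rows))
print(columns)  # [(1364.0, 1374.0), (0.15, 0.3), (0.36, 0.76)]
```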
I recommend using Pandas to take care of the details of reading CSVs. It also has convenient features for manipulating the data.
import pandas as pd
import json

# Note that your column names include one less column than the sample CSV
df = pd.read_csv('test.csv', names=['Time', 'Yaw', 'Pitch', 'Roll', 'Ax', 'Ay', 'Az', 'Gx', 'Gy', 'Gz', 'Mx', 'My', 'Mz', 'extra'])

# Note that this can be simplified if you don't need the top-level "session 1"
output = {'session 1': df.to_dict(orient='list')}
with open('output.json', 'w') as f:
    json.dump(output, f)
If for some reason you need a solution using only the stdlib, here's one which uses a defaultdict to build the inner output. An important note is that the csv module reads everything as strings, so you need to convert the numbers to floats; otherwise they will appear as quoted strings in the output JSON.
import csv
import json
from collections import defaultdict

inner_output = defaultdict(list)
fieldnames = ['Time', 'Yaw', 'Pitch', 'Roll', 'Ax', 'Ay', 'Az', 'Gx', 'Gy', 'Gz', 'Mx', 'My', 'Mz', 'extra']

with open('test.csv') as csv_file:
    for row in csv.reader(csv_file):
        for name, value in zip(fieldnames, row):
            # Note that the csv module only reads strings
            inner_output[name].append(float(value))

output = {'session 1': inner_output}
with open('output.json', 'w') as json_file:
    json.dump(output, json_file)
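The float conversion matters for the JSON output; without it the numbers come out as quoted strings:

```python
import json

# Strings straight from the csv module would be serialized with quotes...
print(json.dumps({'Time': ['1364.00', '1374.00']}))  # {"Time": ["1364.00", "1374.00"]}

# ...while converted floats are serialized as JSON numbers.
print(json.dumps({'Time': [1364.0, 1374.0]}))        # {"Time": [1364.0, 1374.0]}
```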

Output nested array with variable length to CSV

I've got a data sample that I would like to output to a CSV file. The data structure is a nested list of German terms (dicts) and the corresponding possible English translations (lists):
all_terms = [{'Motor': ['engine', 'motor']},
             {'Ziel': ['purpose', 'goal', 'aim', 'destination']}]
As you can see, one German term can have a variable number of English translations. I want to output each German term and each of its corresponding translations into separate columns of one row, so "Motor" is in column 1, "engine" in column 2 and "motor" in column 3.
I just don't know how to loop correctly through the data.
So far, my code to output:
with open(filename, 'a') as csv_file:
    writer = csv.writer(csv_file)
    # The for loop
    for x in all_terms:
        for i in x:
            for num in i:
                writer.writerow([i, x[i][num]])
But this error is thrown:
writer.writerow([i, x[i][num]])
TypeError: list indices must be integers, not unicode
Any hint appreciated, and maybe there's even a smarter way than 3 nested for loops.
How about the following solution:
import csv
all_terms = [{'Motor': ['engine', 'motor']},
             {'Ziel': ['purpose', 'goal', 'aim', 'destination']}]

with open('test.csv', 'a') as csv_file:
    writer = csv.writer(csv_file)
    # The for loop
    for small_dict in all_terms:
        for key in small_dict:
            output = [key, *small_dict[key]]
            writer.writerow(output)
Output in test.csv:
Motor,engine,motor
Ziel,purpose,goal,aim,destination
I used the * operator to unpack all the items in the dictionary's value list into the row that writerow writes. This also takes care of the case where a dictionary inside all_terms has multiple entries.
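The unpacking step in isolation, with one of the question's entries:

```python
key = 'Ziel'
translations = ['purpose', 'goal', 'aim', 'destination']

# The * splices the list's items into the new list, however many there are.
row = [key, *translations]
print(row)  # ['Ziel', 'purpose', 'goal', 'aim', 'destination']
```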
Here's a way to do it:
import csv
all_terms = [{'Motor': ['engine', 'motor']},
             {'Ziel': ['purpose', 'goal', 'aim', 'destination']}]

filename = 'translations.csv'
with open(filename, 'a', newline='') as csv_file:
    writer = csv.writer(csv_file)
    for term in all_terms:
        word, translations = term.popitem()
        row = [word] + translations
        writer.writerow(row)
CSV file's contents afterwards:
Motor,engine,motor
Ziel,purpose,goal,aim,destination

writing to a single CSV file from multiple dictionaries

Background
I have multiple dictionaries of different lengths. I need to write the values of the dictionaries to a single CSV file. I figured I could loop through each dictionary one by one and write the data to the CSV, but I ran into a small formatting issue.
Problem/Solution
I realized that after I loop through the first dictionary, the data of the second gets written starting in the row where the first dictionary ended, as displayed in the first image. I would ideally want my data to print as shown in the second image.
My Code
import csv
e = {'Jay': 10, 'Ray': 40}
c = {'Google': 5000}

def writeData():
    with open('employee_file20.csv', mode='w') as csv_file:
        fieldnames = ['emp_name', 'age', 'company_name', 'size']
        writer = csv.DictWriter(csv_file, fieldnames=fieldnames)
        writer.writeheader()
        for name in e:
            writer.writerow({'emp_name': name, 'age': e.get(name)})
        for company in c:
            writer.writerow({'company_name': company, 'size': c.get(company)})

writeData()
PS: I will have more than 2 dictionaries, so I am looking for a generic way to print the data in the rows under the header for all the dictionaries. I am open to all solutions and suggestions.
If all dictionaries are of equal size, you could use zip to iterate over them in parallel. If they aren't of equal size, and you want the iteration to pad to the longest dict, you could use itertools.zip_longest
For example:
import csv
from itertools import zip_longest
e = {'Jay': 10, 'Ray': 40}
c = {'Google': 5000}

def writeData():
    with open('employee_file20.csv', mode='w') as csv_file:
        fieldnames = ['emp_name', 'age', 'company_name', 'size']
        writer = csv.writer(csv_file)
        writer.writerow(fieldnames)
        for employee, company in zip_longest(e.items(), c.items()):
            row = list(employee)
            row += list(company) if company is not None else ['', '']  # Write empty fields if no company
            writer.writerow(row)

writeData()
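To see why zip_longest is needed here, compare it with plain zip on the example dicts: zip stops at the shorter input, while zip_longest pads the shorter one with None (which the answer's code then replaces with empty fields):

```python
from itertools import zip_longest

e = {'Jay': 10, 'Ray': 40}
c = {'Google': 5000}

print(list(zip(e.items(), c.items())))
# [(('Jay', 10), ('Google', 5000))]  -- 'Ray' is dropped

print(list(zip_longest(e.items(), c.items())))
# [(('Jay', 10), ('Google', 5000)), (('Ray', 40), None)]
```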
If the dicts are of equal size, it's simpler:
import csv
e = {'Jay': 10, 'Ray': 40}
c = {'Google': 5000, 'Yahoo': 3000}

def writeData():
    with open('employee_file20.csv', mode='w') as csv_file:
        fieldnames = ['emp_name', 'age', 'company_name', 'size']
        writer = csv.writer(csv_file)
        writer.writerow(fieldnames)
        for employee, company in zip(e.items(), c.items()):
            writer.writerow(employee + company)

writeData()
A little side note: if you use Python 3.7+, dictionaries preserve insertion order. This isn't the case in Python 2, so if you use Python 2 you should use collections.OrderedDict instead of the standard dictionary.
There might be a more pythonic solution, but I'd do something like this:
I haven't used your .csv writer thing before, so I just made my own comma separated output.
e = {'Jay': 10, 'Ray': 40}
c = {'Google': 5000}

dict_list = [e, c]  # add more dicts here.
max_dict_size = max(len(d) for d in dict_list)

output = ""
# Add header information here.
for i in range(max_dict_size):
    for j in range(len(dict_list)):
        key, value = dict_list[j].popitem() if len(dict_list[j]) else ("", "")
        output += f"{key},{value},"
    output += "\n"

# Now output should contain the full text of the .csv file.
# Do the file manipulation here.
# You could also do it after each row,
# where I currently have the output += "\n".
Edit: A little more thinking and I found something that might polish this a bit. You could first map each dictionary to a list of its keys using the .keys() method and append those lists to an empty list.
The advantage is that you'd be able to go "forward" through the items instead of popping them off the back. It also wouldn't destroy the dictionaries.
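A sketch of that non-destructive variant (my own illustration of the idea, not the answerer's code): snapshot each dict's items up front and index forward, leaving the original dicts untouched:

```python
e = {'Jay': 10, 'Ray': 40}
c = {'Google': 5000}
dict_list = [e, c]

# Snapshot each dict's items so the originals are left untouched.
items_list = [list(d.items()) for d in dict_list]
max_dict_size = max(len(items) for items in items_list)

output = ""
for i in range(max_dict_size):
    for items in items_list:
        # Pad with empty cells once a shorter dict runs out of items.
        key, value = items[i] if i < len(items) else ("", "")
        output += f"{key},{value},"
    output += "\n"

print(output)
# Jay,10,Google,5000,
# Ray,40,,,
```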

Possibility of writing dictionary items in columns

I have a dictionary in which the keys are tuples and the values are lists, like:
{('c4:7d:4f:53:24:be', 'ac:81:12:62:91:df'): [5.998999999999998, 0.0013169999,
                                              4.0000000000000972],
 ('a8:5b:4f:2e:fe:09', 'de:62:ef:4e:21:de'): [7.89899999, 0.15647999999675390,
                                              8.764380000972, 9.200000000]}
I want to write this dictionary to a csv file in the column format like:
('c4:7d:4f:53:24:be', 'ac:81:12:62:91:df') ('a8:5b:4f:2e:fe:09', 'de:62:ef:4e:21:de')
5.998999999999998 7.89899999
0.0013169999 0.15647999999675390
4.0000000000000972 8.764380000972
9.200000000
I know how to write the same thing in row format using this code:
writer = csv.writer(open('dict.csv', 'wb'))
for key, value in mydict.items():
    writer.writerow([key, value])
How do I write the same thing in columns? Is it even possible? Thanks in advance.
I referred to the Python docs for csv here: http://docs.python.org/2/library/csv.html. There is no information on column-wise writing.
import csv
mydict = {('c4:7d:4f:53:24:be', 'ac:81:12:62:91:df'):
              [5.998999999999998, 0.0013169999, 4.0000000000000972],
          ('a8:5b:4f:2e:fe:09', 'de:62:ef:4e:21:de'):
              [7.89899999, 0.15647999999675390, 8.764380000972, 9.200000000]}

with open('dict.csv', 'wb') as file:
    writer = csv.writer(file, delimiter='\t')
    writer.writerow(mydict.keys())
    for row in zip(*mydict.values()):
        writer.writerow(list(row))
Output file dict.csv:
('c4:7d:4f:53:24:be', 'ac:81:12:62:91:df') ('a8:5b:4f:2e:fe:09', 'de:62:ef:4e:21:de')
5.998999999999998 7.89899999
0.0013169999 0.1564799999967539
4.000000000000097 8.764380000972
I am sure you can figure out the formatting:
>>> d.keys()  # gives list of keys for first row
[('c4:7d:4f:53:24:be', 'ac:81:12:62:91:df'), ('a8:5b:4f:2e:fe:09', 'de:62:ef:4e:21:de')]
>>> for i in zip(*d.values()):  # gives rows with tuple structure for columns
...     print i
...
(5.998999999999998, 7.89899999)
(0.0013169999, 0.1564799999967539)
(4.000000000000097, 8.764380000972)
For your code, do this:
writer = csv.writer(open('dict.csv', 'wb'))
writer.writerow(mydict.keys())
for values in zip(*mydict.values()):
    writer.writerow(values)
The ()'s and such will not be added to the csv file.
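One caveat worth noting: zip() stops at the shortest value list, which is why the 9.200000000 from the longer list never appears in the output above even though the desired output includes it. If you want ragged columns padded with empty cells instead, itertools.zip_longest (izip_longest in Python 2) does it; a sketch with shortened numbers:

```python
from itertools import zip_longest  # izip_longest in Python 2

mydict = {('c4:7d:4f:53:24:be', 'ac:81:12:62:91:df'): [5.999, 0.001, 4.0],
          ('a8:5b:4f:2e:fe:09', 'de:62:ef:4e:21:de'): [7.899, 0.156, 8.764, 9.2]}

# fillvalue='' pads the shorter column, so the extra value gets its own row.
rows = list(zip_longest(*mydict.values(), fillvalue=''))
print(rows[-1])  # ('', 9.2)
```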

Write Python dictionary to CSV where where keys= columns, values = rows

I have a list of dictionaries that I want to be able to open in Excel, formatted correctly. This is what I have so far, using csv:
list_of_dicts = [{'hello': 'goodbye'}, {'yes': 'no'}]
out_path = "/docs/outfile.txt"
out_file = open(out_path, 'wb')
writer = csv.writer(out_file, dialect='excel')
for items in list_of_dicts:
    for k, v in items.items():
        writer.writerow([k, v])
Obviously, when I open the output in Excel, it's formatted like this:
key value
key value
What I want is this:
key key key
value value value
I can't figure out how to do this, so help would be appreciated. Also, I want the column names to be the dictionary keys, instead of the default 'A, B, C' etc. Sorry if this is stupid.
Thanks
The csv module has a DictWriter class for this, which is covered quite nicely in another SO answer. The critical point is that you need to know all your column headings when you instantiate the DictWriter. You could construct the list of field names from your list_of_dicts; if so, your code becomes:
list_of_dicts = [{'hello': 'goodbye'}, {'yes': 'no'}]
out_path = "/docs/outfile.txt"
out_file = open(out_path, 'wb')
fieldnames = sorted(set(k for d in list_of_dicts for k in d))
writer = csv.DictWriter(out_file, fieldnames=fieldnames, dialect='excel')
writer.writeheader()  # Assumes Python >= 2.7
for row in list_of_dicts:
    writer.writerow(row)
out_file.close()
The way I've constructed fieldnames scans the entire list_of_dicts, so it will slow down as the size increases. You should instead construct fieldnames directly from the source of your data e.g. if the source of your data is also a csv file you can use a DictReader and use fieldnames = reader.fieldnames.
You can also replace the for loop with a single call to writer.writerows(list_of_dicts) and use a with block to handle file closure, in which case your code would become
list_of_dicts = [{'hello': 'goodbye'}, {'yes': 'no'}]
out_path = "/docs/outfile.txt"
fieldnames = sorted(set(k for d in list_of_dicts for k in d))
with open(out_path, 'wb') as out_file:
    writer = csv.DictWriter(out_file, fieldnames=fieldnames, dialect='excel')
    writer.writeheader()
    writer.writerows(list_of_dicts)
You need to write 2 separate rows instead, one with the keys and one with the values:
writer = csv.writer(ofile, dialect='excel')
writer.writerow([k for d in list_of_dicts for k in d])
writer.writerow([v for d in list_of_dicts for v in d.itervalues()])
The two list comprehensions extract first all the keys, then all the values, from the dictionaries in your input list, combining these into one list to write to the CSV file.
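Concretely, for the example list those comprehensions produce (using .values(), since .itervalues() is Python 2 only):

```python
list_of_dicts = [{'hello': 'goodbye'}, {'yes': 'no'}]

keys = [k for d in list_of_dicts for k in d]
values = [v for d in list_of_dicts for v in d.values()]  # .itervalues() in Python 2

print(keys)    # ['hello', 'yes']
print(values)  # ['goodbye', 'no']
```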
I think the most useful layout is column by column, so each key becomes a column (good for later data processing and use, e.g. for ML).
I had some trouble yesterday figuring this out, but I came up with a solution I saw on another website. From what I can see it is not possible to go through the whole dictionary at once, so we have to split it into smaller dictionaries (my CSV file had 20k rows at the end: surveyed people, their data and answers). I did it like this:
# Writing a dict of lists to a CSV file.
# 'cleaned' is the output file object; the dict's keys become the column names.
writer = csv.DictWriter(cleaned, d.keys())
# Attach the header.
writer.writeheader()
# Write the rows as separate dictionaries, one index at a time.
for i in range(len(list(d.values())[0])):
    writer.writerow({key: d[key][i] for key in d.keys()})
My solution has one more for loop, but on the other hand I think it takes less memory (I am not sure, though!).
Hope it helps somebody ;)
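A minimal self-contained sketch of that approach, with made-up data ('name'/'age' are illustrative only) and io.StringIO standing in for the 'cleaned' file object:

```python
import csv
import io

# A dict of columns: each key holds a full column of values.
d = {'name': ['Ann', 'Bob'], 'age': [30, 25]}

cleaned = io.StringIO()  # stands in for the output file
writer = csv.DictWriter(cleaned, d.keys())
writer.writeheader()
# Slice the columns into one small dict per row.
for i in range(len(list(d.values())[0])):
    writer.writerow({key: d[key][i] for key in d.keys()})

print(cleaned.getvalue())
# name,age
# Ann,30
# Bob,25
```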
