Write array of dictionaries to csv in Python 3? - python

I have been wrestling with this for a day or two now, and I can't seem to get it right.
project_index = [
{A: ['1', '2', '3']},
{B: ['4', '5', '6']},
{C: ['7', '8', '9']},
{D: ['10', '11', '12']},
{E: ['13', '14', '15']},
{F: ['16', '17', '18']}
]
I have tried so many different things to get this into a .CSV table, but it keeps coming out in a ridiculously incorrect format, e.g. with the values tiling down diagonally, or as a bunch of rows of just the keys over and over (e.g.:
A B C D E F
A B C D E F
A B C D E F
A B C D E F )
Also, even if I get the values to show up, the entire array of strings shows up in one cell.
Is there any way I can get it to make each dictionary a column, with each string in the array value as its own cell in said column?
Example:
Thank you in advance!

Assuming all your keys are unique... then this (Modified Slightly):
project_index = [
{'A': ['1', '2', '3']},
{'B': ['4', '5', '6']},
{'C': ['7', '8', '9']},
{'D': ['10', '11', '12', '20']},
{'E': ['13', '14', '15']},
{'F': ['16', '17', '18']}
]
Should probably look like this:
project_index_dict = {}
for x in project_index:
    project_index_dict.update(x)
print(project_index_dict)
# Output:
{'A': ['1', '2', '3'],
'B': ['4', '5', '6'],
'C': ['7', '8', '9'],
'D': ['10', '11', '12', '20'],
'E': ['13', '14', '15'],
'F': ['16', '17', '18']}
At this point, rather than re-invent the wheel... you could just use pandas.
import pandas as pd
# Work-around for uneven lengths:
df = pd.DataFrame.from_dict(project_index_dict, 'index').T.fillna('')
df.to_csv('file.csv', index=False)
Output file.csv:
A,B,C,D,E,F
1,4,7,10,13,16
2,5,8,11,14,17
3,6,9,12,15,18
,,,20,,
csv module method:
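import csv
from itertools import zip_longest, chain
# Header: every key, in order of appearance
header = []
for d in project_index:
    header.extend(list(d))
# Transpose the columns into rows, padding short columns with ''
project_index_rows = [dict(zip(header, x)) for x in
                      zip_longest(*chain(list(*p.values())
                                         for p in project_index),
                                  fillvalue='')]
# newline='' stops the csv module from writing blank lines between rows on Windows
with open('file.csv', 'w', newline='') as f:
    writer = csv.DictWriter(f, fieldnames=header)
    writer.writeheader()
    writer.writerows(project_index_rows)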

My solution does not use Pandas. Here is the plan:
For the header row, grab all the keys from the dictionaries
For the data row, use zip to transpose columns -> rows
import csv
def first_key(d):
    """Return the first key in a dictionary."""
    return next(iter(d))
def first_value(d):
    """Return the first value in a dictionary."""
    return next(iter(d.values()))
# newline="" stops the csv module from writing blank lines between rows on Windows
with open("output.csv", "w", encoding="utf-8", newline="") as stream:
    writer = csv.writer(stream)
    # Write the header row
    writer.writerow(first_key(d) for d in project_index)
    # Write the rest
    rows = zip(*[first_value(d) for d in project_index])
    writer.writerows(rows)
Contents of output.csv:
A,B,C,D,E,F
1,4,7,10,13,16
2,5,8,11,14,17
3,6,9,12,15,18
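One thing to watch with the zip-based transpose: zip stops at the shortest column, so with the modified data from the first answer the extra '20' in column D would be silently dropped. A minimal sketch of the same idea using itertools.zip_longest to pad short columns instead. The helper name columns_to_csv is made up for illustration, and it writes to an in-memory buffer rather than a real file so it can be run as-is:

```python
import csv
import io
from itertools import zip_longest

project_index = [
    {'A': ['1', '2', '3']},
    {'B': ['4', '5', '6']},
    {'C': ['7', '8', '9']},
    {'D': ['10', '11', '12', '20']},
    {'E': ['13', '14', '15']},
    {'F': ['16', '17', '18']},
]

def columns_to_csv(columns, stream):
    """Write a list of single-key dicts as CSV columns (hypothetical helper)."""
    writer = csv.writer(stream)
    # Header row: every key, in order of appearance
    writer.writerow(key for d in columns for key in d)
    # Data rows: transpose the value lists, padding short ones with ''
    values = [v for d in columns for v in d.values()]
    writer.writerows(zip_longest(*values, fillvalue=''))

buffer = io.StringIO()
columns_to_csv(project_index, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

To write a real file, pass a stream opened with open("output.csv", "w", newline="") instead of the buffer.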

Related

How can I create a list of dictionaries from a csv file?

I want to create a "dictionary of dictionaries" for each row of the following csv file
name,AGATC,AATG,TATC
Alice,2,8,3
Bob,4,1,5
Charlie,3,2,5
So the idea is, that mydict["Alice"] should be {'AGATC': 2, 'AATG': 8, 'TATC': 3} etc.
I really do not understand the .reader and .DictReader functions sufficiently. https://docs.python.org/3/library/csv.html#csv.DictReader
Because I am a newbie and cannot quite understand the docs. Do you have other 'easier' resources, that you can recommend?
First, I have to get the first column, i.e. names and put them as keys. How can I access that first column?
Second, I want to create a dictionary inside that name (as the value), with the keys being AGATC,AATG,TATC. Do you understand what I mean? Is that possible?
Edit, made progress:
# Open the CSV file and read its contents into memory.
with open(argv[1]) as csvfile:
    reader = list(csv.reader(csvfile))
# Each row read from the csv file is returned as a list of strings.
# Establish dicts.
mydict = {}
for i in range(1, len(reader)):
    print(reader[i][0])
    mydict[reader[i][0]] = reader[i][1:]
print(mydict)
Out:
{'Alice': ['2', '8', '3'], 'Bob': ['4', '1', '5'], 'Charlie': ['3', '2', '5']}
But how to implement nested dictionaries as described above?
Edit #3:
# Open the CSV file and read its contents into memory.
with open(argv[1]) as csvfile:
    reader = list(csv.reader(csvfile))
# Each row read from the csv file is returned as a list of strings.
# Establish dicts.
mydict = {}
for i in range(1, len(reader)):
    print(reader[i][0])
    mydict[reader[i][0]] = reader[i][1:]
print(mydict)
print(len(reader))
dictlist = [dict() for x in range(1, len(reader))]
#for i in range(1, len(reader))
for i in range(1, len(reader)):
    dictlist[i-1] = dict(zip(reader[0][1:], mydict[reader[i][0]]))
#dictionary = dict(zip(reader[0][1:], mydict[reader[1][0]]))
print(dictlist)
Out:
[{'AGATC': '2', 'AATG': '8', 'TATC': '3'}, {'AGATC': '4', 'AATG': '1', 'TATC': '5'}, {'AGATC': '3', 'AATG': '2', 'TATC': '5'}]
{'AGATC': 1, 'AATG': 1, 'TATC': 5}
So I solved it for myself:)
The following code will give you what you've asked for in terms of dict structure.
import csv
with open('file.csv', newline='') as csvfile:
    mydict = {}
    reader = csv.DictReader(csvfile)
    # Iterate through each line of the csv file
    for row in reader:
        # Create the dictionary structure as desired.
        # This uses a comprehension:
        # for each item in the row, get the key and the value, except if the key
        # is 'name' (k != 'name')
        mydict[row['name']] = {k: v for k, v in row.items() if k != 'name'}
print(mydict)
This will give you
{
'Alice': {'AGATC': '2', 'AATG': '8', 'TATC': '3'},
'Bob': {'AGATC': '4', 'AATG': '1', 'TATC': '5'},
'Charlie': {'AGATC': '3', 'AATG': '2', 'TATC': '5'}
}
There are plenty of videos and articles covering comprehensions on the net if you need more information on these.
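Note that the values above are still strings, while the question asked for integers (mydict["Alice"] should be {'AGATC': 2, 'AATG': 8, 'TATC': 3}). A small sketch of one way to get there, converting inside the comprehension. The helper name load_counts is made up, and the CSV is read from an in-memory string standing in for the real file:

```python
import csv
import io

# Stand-in for the file from the question (assumption: same columns)
csv_text = """name,AGATC,AATG,TATC
Alice,2,8,3
Bob,4,1,5
Charlie,3,2,5
"""

def load_counts(stream):
    """Map each name to a dict of integer counts (hypothetical helper)."""
    result = {}
    for row in csv.DictReader(stream):
        # Take the name out of the row; what remains are the count columns
        name = row.pop('name')
        result[name] = {key: int(value) for key, value in row.items()}
    return result

mydict = load_counts(io.StringIO(csv_text))
print(mydict['Alice'])
```

int() will raise a ValueError on an empty or non-numeric cell, so this assumes every count column is filled in.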

Taking a column from a csv file and putting it into a list in python

I need to write some code in Python that takes a column from a csv file and makes it a list. Here is my code so far.
import csv
from collections import defaultdict
columns = defaultdict(list)
with open('Team1_BoMInput.csv') as f:
    reader = csv.DictReader(f)
    for row in reader:
        for (k,v) in row.items():
            columns[k].append(v)
y = (columns['Quantity'])
x = (columns[('Actual Price')])
b = ['2', '2', '1', '1', '1', '1', '1', '1', '1', '2', '1', '1', '3', '4', '1', '1', '1', '8', '2', '2', '1', '1', '1', '1', '4', '1', '2', '2', '2', '1', '2', '2', '2', '1', '2', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '3', '2', '1', '1', '1']
a = ['$6.41', '$14.97', '$6.78', '$11.44', '$22.61', '$1.58', '$11.68', '$19.99', '$12.99', '$3.66', '$24.99', '$1.04', '$0.09', '$1.92', '$4.80', '$1.50', '$17.92', '$1.36', '$65.52', '$24.38', '$1.91', '$3.40', '$13.79', '$39.55', '$1.94', '$3.38', '$11.34', '$18.33', '$21.13', '$8.24', '$30.14', '$125.97', '$26.54', '$8.58', '$12.77', '$11.42', '$1.32', '$2.63', '$8.58', '$0.40', '$0.57', '$2.54', '$2.83', '$1.41', '$9.03', '$3.38', '$5.98', '$4.51', '$2.54', '$6.76', '$4.51', '$1.13', '$14.24']
for i in range(0, len(b)):
    b[i] = float(b[i])
print(b)
x = ([s.strip('$') for s in a])
for i in range(0, len(x)):
    x[i] = float(x[i])
print(x)
Instead of having the values of a and b listed in the program, I want it to take the columns from the csv file and use those values.
Thanks in advance
Try this:
import pandas as pd
df=pd.read_csv("Team1_BoMInput.csv")
y=list(df['Quantity'])
x=list(df['Actual Price'])
See the code below, which leverages the pandas library for faster computation and less code:
import pandas as pd
df=pd.read_csv("Team1_BoMInput.csv")
quantity_list_value=list(df.loc[:,"Quantity"].astype(float).values)
price_list_value=list(df.loc[:,"Actual Price"].apply(lambda x: x.replace("$","")).astype(float).values)
I think your code will not run unless you change it to this:
import csv
from collections import defaultdict
columns = defaultdict(list)
with open('Team1_BoMInput.csv') as f:
    reader = csv.DictReader(f)
    for row in reader:
        for (k,v) in row.items():
            if k not in columns:
                columns[k] = list()
            columns[k].append(v)
# Rest of your code
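If you'd rather stay with the csv module, here is a rough sketch of reading the columns and converting them in one go. The sample data and the 'Part' column are made up (only 'Quantity' and 'Actual Price' come from the question), and read_columns is a hypothetical helper name:

```python
import csv
import io
from collections import defaultdict

# Made-up sample standing in for Team1_BoMInput.csv
sample = """Part,Quantity,Actual Price
Bolt,2,$6.41
Nut,2,$14.97
Washer,1,$6.78
"""

def read_columns(stream):
    """Collect every CSV column into a list, keyed by header (hypothetical helper)."""
    columns = defaultdict(list)
    for row in csv.DictReader(stream):
        for key, value in row.items():
            columns[key].append(value)
    return columns

columns = read_columns(io.StringIO(sample))
# Convert the string cells to floats; prices lose their '$' sign first
quantities = [float(q) for q in columns['Quantity']]
prices = [float(p.strip('$')) for p in columns['Actual Price']]
print(quantities)
print(prices)
```

For the real file, replace io.StringIO(sample) with the object returned by open('Team1_BoMInput.csv', newline='').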

Getting "keys" from list of dicts [duplicate]

This question already has answers here:
How to return dictionary keys as a list in Python?
(13 answers)
Closed 4 years ago.
I have the following data:
[{'id': ['132605', '132750', '132772', '132773', '133065', '133150', '133185', '133188', '133271', '133298']},
{'number': ['1', '2', '3', '4', '5', '6', '7', '8', '9', '10']},
{'id': ['1', '1', '1', '1', '1', '1', '1', '1', '1', '1']}]
What would be the best way to get a list of the keys (as if it was a dict and not an array)? Currently I'm doing:
>>> [list(i.keys())[0] for i in e.original_column_data]
['id', 'number', 'id']
But that feels a bit hackish
What is hacky about it? It's a bit inelegant. You just need to do the following:
>>> keys = []
>>> data = [{'id': ['132605', '132750', '132772', '132773', '133065', '133150', '133185', '133188', '133271', '133298']},
... {'number': ['1', '2', '3', '4', '5', '6', '7', '8', '9', '10']},
... {'id': ['1', '1', '1', '1', '1', '1', '1', '1', '1', '1']}]
>>> for d in data:
... keys.extend(d)
...
>>> keys
['id', 'number', 'id']
Or if you prefer one-liners:
>>> [k for d in data for k in d]
['id', 'number', 'id']
First way:
Iterating over a dictionary gives you its keys, so a simple
>>> [key for key in dict]
gives you a list of keys, and you can get what you want with
>>> [key for dict in dict_list for key in dict]
Second way (Python 2 only):
Use .keys() (as in your code), but there is no need to wrap it in list() in Python 2.
Here's what it will look like:
>>> [dict.keys()[0] for dict in dict_list]
In your code each dictionary has only one key, so these two give the same result.
But I prefer the first one, since it gives all the keys of all the dictionaries.
This is simpler and does the same thing:
[k for d in e.original_column_data for k in d]
=> ['id', 'number', 'id']
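Another spelling of the same flattening, if you like itertools: since iterating a dict yields its keys, chain.from_iterable over the list of dicts gives the keys directly. A small sketch with shortened data (the value lists are trimmed for readability):

```python
from itertools import chain

data = [
    {'id': ['132605', '132750']},
    {'number': ['1', '2']},
    {'id': ['1', '1']},
]

# Iterating each dict yields its keys; chain.from_iterable concatenates them
keys = list(chain.from_iterable(data))
print(keys)  # ['id', 'number', 'id']
```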

converting strings into integers in list comprehension

I am trying to write a function that reads a file, splits it on newlines, and then splits each line on the comma delimiter (,). After that, I want to convert each string in the resulting list to an integer, using only list comprehensions.
# My code but it's not converting the splitted list into integers.
def read_csv(filename):
    string_list = open(filename, "r").read().split('\n')
    string_list = string_list[1:len(string_list)]
    splitted = [i.split(",") for i in string_list]
    final_list = [int(i) for i in splitted]
    return final_list
read_csv("US_births_1994-2003_CDC_NCHS.csv")
Output:
TypeError: int() argument must be a string, a bytes-like object or a number, not 'list'
How the data looks after splitting with comma delimiter(,)
us = open("US_births_1994-2003_CDC_NCHS.csv", "r").read().split('\n')
splitted = [i.split(",") for i in us]
print(splitted)
Output:
[['year', 'month', 'date_of_month', 'day_of_week', 'births'],
['1994', '1', '1', '6', '8096'],
['1994', '1', '2', '7', '7772'],
['1994', '1', '3', '1', '10142'],
['1994', '1', '4', '2', '11248'],
['1994', '1', '5', '3', '11053'],
['1994', '1', '6', '4', '11406'],
['1994', '1', '7', '5', '11251'],
['1994', '1', '8', '6', '8653'],
['1994', '1', '9', '7', '7910'],
['1994', '1', '10', '1', '10498']]
How do I convert each string inside this output to an integer and collect the results in a single list using a list comprehension?
str.split() produces a new list; so splitted is a list of lists. You'd want to convert the contents of each contained list:
[[int(v) for v in row] for row in splitted]
Demo:
>>> csvdata = '''\
... year,month,date_of_month,day_of_week,births
... 1994,1,1,6,8096
... 1994,1,2,7,7772
... '''
>>> string_list = csvdata.splitlines() # better way to split lines
>>> string_list = string_list[1:] # you don't have to specify the second value
>>> splitted = [i.split(",") for i in string_list]
>>> splitted
[['1994', '1', '1', '6', '8096'], ['1994', '1', '2', '7', '7772']]
>>> splitted[0]
['1994', '1', '1', '6', '8096']
>>> final_list = [[int(v) for v in row] for row in splitted]
>>> final_list
[[1994, 1, 1, 6, 8096], [1994, 1, 2, 7, 7772]]
>>> final_list[0]
[1994, 1, 1, 6, 8096]
Note that you could just loop directly over the file to get separate lines too:
string_list = [line.strip().split(',') for line in openfileobject]
and skipping an entry in such an object could be done with next(iterableobject, None).
Rather than read the whole file into memory and manually split the data, you could just use the csv module:
import csv
def read_csv(filename):
    with open(filename, 'r', newline='') as csvfile:
        reader = csv.reader(csvfile)
        next(reader, None)  # skip first row
        for row in reader:
            yield [int(c) for c in row]
The above is a generator function, producing one row at a time as you loop over it:
for row in read_csv("US_births_1994-2003_CDC_NCHS.csv"):
    print(row)
You can still get a list with all rows with list(read_csv("US_births_1994-2003_CDC_NCHS.csv")).
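One edge case worth knowing about: csv.reader returns an empty list for a blank line, so a file with blank lines would produce empty rows from the generator. A minimal sketch of a variant that skips them. The name read_int_rows is made up, and it takes an open stream (here an in-memory sample) rather than a filename so it runs without the real file:

```python
import csv
import io

def read_int_rows(stream):
    """Yield each data row as a list of ints, skipping blank lines (hypothetical helper)."""
    reader = csv.reader(stream)
    next(reader, None)  # skip the header row
    for row in reader:
        if row:  # csv.reader yields [] for a blank line
            yield [int(cell) for cell in row]

# Short sample in the shape of the births file, with a blank line in the middle
sample = (
    "year,month,date_of_month,day_of_week,births\n"
    "1994,1,1,6,8096\n"
    "\n"
    "1994,1,2,7,7772\n"
)
rows = list(read_int_rows(io.StringIO(sample)))
print(rows)
```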

Sorting list with strings & ints in numerical order

I'm currently dealing with the below list:
[['John', '1', '2', '3'], ['Doe', '1', '2', '3']]
I'm incredibly new to Python. I want to order this list numerically (high to low) but keep the string at the beginning of each list, like this:
[['John', '3', '2', '1'], ['Doe', '3', '2', '1']]
There will always be one name, with integers thereafter.
I collect this list from a csv file like so:-
import csv
with open('myCSV.csv', 'r') as f:
    reader = csv.reader(f)
    your_list = list(reader)
print(sorted(your_list))
Any help is much appreciated. Thanks in advance..
Iterate over the list and sort only the slice of each sublist without the first item. To sort strings as numbers pass key=int to sorted. Use reverse=True since you need a reversed order:
>>> l = [['John', '1', '2', '3'], ['Doe', '1', '2', '3']]
>>>
>>> [[sublist[0]] + sorted(sublist[1:], key=int, reverse=True) for sublist in l]
[['John', '3', '2', '1'], ['Doe', '3', '2', '1']]
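To tie this back to the CSV-reading code in the question, here is a rough sketch that sorts each row as it comes off the reader. The helper name sort_scores and the sample data are made up; the second row uses a two-digit number to show why key=int matters (sorted as plain strings, '9' would come before '10'):

```python
import csv
import io

# Made-up sample standing in for myCSV.csv
sample = "John,1,2,3\nDoe,10,9,2\n"

def sort_scores(stream):
    """Keep the name first and sort the rest numerically, high to low (hypothetical helper)."""
    return [[name] + sorted(scores, key=int, reverse=True)
            for name, *scores in csv.reader(stream)]

result = sort_scores(io.StringIO(sample))
print(result)  # [['John', '3', '2', '1'], ['Doe', '10', '9', '2']]
```

This assumes every row has a name in the first cell and no blank lines; for the real file, pass the object returned by open('myCSV.csv', newline='').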
