Add to Values in An Array in a CSV File - python

I imported my CSV File and made the data into an array. Now I was wondering, what can I do so that I'm able to print a specific value in the array? For instance if I wanted the value in the 2nd row, 2nd column.
Also how would I go about adding the two values together? Thanks.
import csv
import numpy as np
f = open("Test.csv")
csv_f = csv.reader(f)
for row in csv_f:
    print(np.array(row))
f.close()

There is no need to use the csv module.
This code reads the CSV file and prints the value of the cell in the second row and second column. I am assuming that the fields are separated by commas.
with open("Test.csv") as fo:
    table = [row.split(",") for row in fo.read().replace("\r", "").split("\n")]
print(table[1][1])
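One caveat with splitting on commas by hand, offered as a minimal sketch: it breaks as soon as a field is quoted and contains a comma itself. The sample text below is made up for illustration (it is not the asker's Test.csv), and it compares the manual split against csv.reader:

```python
import csv
import io

# Hypothetical sample data, not the asker's file: the second row holds a
# quoted field that contains a comma.
sample = 'name,amount\n"Smith, John",42\n'

# Manual split: the quoted field is cut in two.
manual = [line.split(",") for line in sample.splitlines()]

# csv.reader understands the quoting and keeps the field whole.
parsed = list(csv.reader(io.StringIO(sample)))

print(manual[1])  # three pieces
print(parsed[1])  # two fields
```

For files that are known to contain no quoted fields, the manual split is fine; otherwise the csv module is the safer choice.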

So, I grabbed a dataset ("Company Funding Records") from here. Then, I just rewrote a little...
#!/usr/bin/env python3
import csv
# import numpy as np
csvaslist = []
f = open("TechCrunchcontinentalUSA.csv")
csv_f = csv.reader(f)
for row in csv_f:
    # print(np.array(row))
    csvaslist.append(row)
f.close()
# Now your data is in a list of lists. Everything past this point is just playing
# Add together a couple of arbitrary values...
print(int(csvaslist[2][7]) + int(csvaslist[11][7]))
# Add using a conditional...
print("\nNow let's see what Facebook has received...")
fbsum = 0
for sublist in csvaslist:
    if sublist[0] == "facebook":
        print(sublist)
        fbsum += int(sublist[7])
print("Facebook has received", fbsum)
I've commented lines at a couple points to show what's being used and what was unneeded. Notice at the end that referring to a particular datapoint is simply a matter of referencing what is, effectively, original_csv_file[line_number][field_on_that_line], and then recasting as int, float, whatever you need. This is because the csv file has been changed to a list of lists.
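The indexing pattern described above can be sketched on a small in-memory table. The column layout and values here are invented for illustration and do not come from the TechCrunch dataset:

```python
import csv
import io

# Invented sample rows: company, round, amount raised.
text = "company,round,amount\nacme,a,1500\nacme,b,2500.5\nglobex,a,300\n"
rows = list(csv.reader(io.StringIO(text)))

# rows[line_number][field_on_that_line] is always a string...
cell = rows[1][2]
# ...so cast it before doing arithmetic.
total = int(rows[1][2]) + float(rows[2][2])

# Conditional sum, skipping the header row.
acme_sum = sum(float(r[2]) for r in rows[1:] if r[0] == "acme")
print(cell, total, acme_sum)
```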

To get specific values within your array/file, and add together:
import csv
with open("Test.csv") as f:
    csv_f = list(csv.reader(f))
# prints the value in the second row, second column of your file
print(csv_f[1][1])
# sums two specific values (in this example, the value in the second row, second column and the value in the first row, first column)
total = int(csv_f[1][1]) + int(csv_f[0][0])
print(total)

Related

How can I create an endless array?

I'm trying to create an array in Python, so I can access the last cell in it without defining how many cells there are in it.
Example:
from csv import reader
a = []
i = -1
with open("ccc.csv","r") as f:
    csv_reader = reader(f)
    for row in csv_reader:
        a[i] = row
        i = i-1
Here I'm trying to take the first row in the CSV file and put it in the last cell of the array, so that I can write the rows in reverse order to another file.
In this case, I don't know how many rows are in the CSV file, so I cannot size the array to the number of rows in the file.
I tried to use f.append(row), but it inserts the values into the first cell of the array, and I want it to insert the values into the last cell of the array.
Read all the rows in the normal order, and then reverse the list:
from csv import reader
with open('ccc.csv') as f:
    a = list(reader(f))
a.reverse()
First up, your current code is going to raise an IndexError because the list has no elements, so a[-1] points to nothing at all.
The function you're looking for is list.insert, which lists provide along with the other mutable sequence operations. list.insert takes two arguments: the index to insert a value at and the value to be inserted.
To rewrite your current code for this, you'd end up with something like:
from csv import reader
a = []
with open("ccc.csv", "r") as f:
    csv_reader = reader(f)
    for row in csv_reader:
        a.insert(0, row)
This would reverse the contents of the CSV file, which you can then write to a new file or use as you need.
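If the goal is to write the rows to another file in reverse order, a sketch along these lines may be enough. It uses in-memory io.StringIO buffers as stand-ins for ccc.csv and the output file, and the sample rows are made up:

```python
import csv
import io

# Stand-in for ccc.csv; the rows are invented for illustration.
source = io.StringIO("1,one\n2,two\n3,three\n")
rows = list(csv.reader(source))

# Stand-in for the output file. With a real file, open it with newline=""
# as the csv module documentation recommends.
target = io.StringIO()
writer = csv.writer(target, lineterminator="\n")
writer.writerows(reversed(rows))  # reversed() leaves rows itself untouched

print(target.getvalue())
```

Reading everything in normal order and reversing once avoids the repeated shifting of elements that a.insert(0, row) performs on every call.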

Python read csv single specific cell

I'm trying to read just a single cell, in order to bring in the date to use elsewhere.
Using pandas I get an error when I try to do this: the file can't be read because pandas expects a well-formed table, and the single cell value sits above the actual tabular data, which starts far below the first line. How can I get just one cell, e.g. [A2]?
CSV example
Yes, pandas is not very good with files that have an inconsistent format (like a varying number of columns). For that purpose, I recommend using csv from the standard library.
The code below should give you the desired value.
import csv
row = 2; col = 1 # A2 = (2,1) cell
with open("yourfile.csv") as f:
    reader = csv.reader(f)
    for i in range(row):
        row = next(reader)
value = row[col-1] # because python index starts at zero
print(value)
Demo (using a list of strings instead of a file):
import csv
row = 2; col = 1 # A2 = (2,1) cell
input_str = ["a,b", "c,d,e", "f,g,h,i"]
reader = csv.reader(input_str)
for i in range(row):
    row = next(reader)
value = row[col-1] # because python index starts at zero
print(value)
To access a CSV like a two-dimensional object, you first need to convert it to a 2D Python list:
import csv
with open("imdb.csv") as f:
    csv_as_list = list(csv.reader(f, delimiter=","))
print(csv_as_list[3][2])
The format for accessing element is csv_as_list[row_index][column_index].
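When only one cell is needed, the file does not have to be loaded completely. The following sketch uses itertools.islice to skip ahead to the wanted row; read_cell is a hypothetical helper name, and the sample lines are the ones from the demo above:

```python
import csv
from itertools import islice

def read_cell(lines, row, col):
    """Return the cell at 1-based (row, col), or None if it does not exist."""
    reader = csv.reader(lines)
    # islice skips the first row-1 records without storing them.
    record = next(islice(reader, row - 1, None), None)
    if record is None or not 1 <= col <= len(record):
        return None
    return record[col - 1]

sample = ["a,b", "c,d,e", "f,g,h,i"]
print(read_cell(sample, 2, 1))  # A2
print(read_cell(sample, 9, 1))  # past the end of the data
```

An open file object can be passed in place of the list, since csv.reader accepts any iterable of lines.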

Can I print lines randomly from a csv in Python?

I'm trying to print lines randomly from a CSV.
Let's say the CSV has the 10 lines below:
1,One
2,Two
3,Three
4,Four
5,Five
6,Six
7,Seven
8,Eight
9,Nine
10,Ten
If I write code like the one below, it prints each line as a list, in the same order as in the CSV:
import csv
with open("MyCSV.csv") as f:
    reader = csv.reader(f)
    for row_num, row in enumerate(reader):
        print(row)
Instead, I'd like it to be random.
It's just a print for now. Later I'll pass each line as a list to a function.
This should work. You can reuse the lines list in your code as it is shuffled.
import random
with open("tmp.csv", "r") as f:
    lines = f.readlines()
random.shuffle(lines)
print(lines)
import csv
import random
csv_elems = []
with open("MyCSV.csv") as f:
    reader = csv.reader(f)
    for row_num, row in enumerate(reader):
        csv_elems.append(row)
random.shuffle(csv_elems)
print(csv_elems[0])
As you can see, I'm just printing the first element; you can iterate over the list, keep shuffling, and print.
Well, you can define a list, append all the rows of the CSV file to it, then shuffle it and print them. Assume that the name of this list is temp:
import csv
import random
temp = []
with open("your csv file.csv") as file:
    reader = csv.reader(file)
    for row_num, row in enumerate(reader):
        temp.append(row)
random.shuffle(temp)
for i in range(len(temp)):
    print(temp[i])
Why not use pandas to handle the CSV?
import pandas as pd
data = pd.read_csv("MyCSV.csv", header=None) # header=None because the sample file has no header row
And to get the samples you are looking for, just write:
data.sample() # returns one random row
data.sample(5) # returns 5 random rows
Also, if you want to pass each line to a function:
data_after_function = data.apply(function_name, axis=1) # axis=1 passes rows rather than columns
Inside the function you can cast the line into a list with list().
Hope this helps!
Couple of things to do:
1. Store CSV into a sequence of some sort
2. Get the data randomly
For 1, it’s probably best to use some form of sequence comprehension (I’ve gone for nested tuples in a list, as it seems you want the row numbers and we can’t use dictionaries with shuffle).
We can use the random module for number 2.
import random
import csv
with open("MyCSV.csv") as f:
    reader = csv.reader(f)
    my_csv = [(row_num, row) for row_num, row in enumerate(reader)]
# get only 1 item from the list at random
random_row = random.choice(my_csv)
# randomise the order of all the rows
# (random.shuffle works in place and returns None, so take a shuffled copy instead)
shuffled_csv = random.sample(my_csv, len(my_csv))
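random.sample is also an option when only a few random rows are wanted and the original order should stay intact. A small sketch, with the ten rows from the question built in memory and a fixed seed (an assumption made only so that repeated runs behave the same):

```python
import csv
import io
import random

names = ["One", "Two", "Three", "Four", "Five",
         "Six", "Seven", "Eight", "Nine", "Ten"]
text = "".join(f"{i},{name}\n" for i, name in enumerate(names, start=1))
rows = list(csv.reader(io.StringIO(text)))

rng = random.Random(0)  # seeded generator, for repeatable runs only

three_rows = rng.sample(rows, 3)        # 3 distinct rows; rows is unchanged
shuffled = rng.sample(rows, len(rows))  # a shuffled copy of every row

for row in shuffled:
    print(row)  # each row is a list such as ['4', 'Four']
```

Each element of shuffled is already a list, so it can be passed straight to a function that expects one.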

Python - CSV to Matrix

Can you help me with this problem?
I'm new to programming and want to find out how to create a matrix that looks like this:
matrix = {"hello": ["one","two","three"],
          "world": ["five","six","seven"],
          "goodbye": ["one","two","three"]}
I want to import a CSV that has all the strings (one, two, three, ...) in it. I tried the split method, but I'm not getting there...
Another problem is the names of the categories (hello, world, goodbye).
Do you have any suggestions?
Have you looked into the csv module?
https://docs.python.org/2/library/csv.html
import csv
TEST_TEXT = """\
hello,one,two,three
world,four,five,six
goodbye,one,two,three"""
TEST_FILE = TEST_TEXT.split("\n")
#file objects iterate over newlines anyway
#so this is how it would be when opening a file
#this would be the minimum needed to use the csv reader object:
for row in csv.reader(TEST_FILE):
print(row)
#or to get a list of all the rows you can use this:
as_list = list(csv.reader(TEST_FILE))
#splitting off the first element and using it as the key in a dictionary
dict_I_call_matrix = {row[0]:row[1:] for row in csv.reader(TEST_FILE)}
print(dict_I_call_matrix)
without_csv = [row.split(",") for row in TEST_FILE] #...row in TEST_TEXT.split("\n")]
matrix = [row[1:] for row in without_csv]
labels = [row[0] for row in without_csv]

Perform a calculation, then insert the results into a CSV in Python

This is my first post, but I am hoping you can tell me how to perform a calculation and insert the value into a CSV data file.
For each row I want to be able to take each 'uniqueclass' and sum the scores achieved in column 12. See example data below:
text1,Data,Class,Uniqueclass1,data1,data,2,data2,data3,data4,data5,175,12,data6,data7
text1,Data,Class,Uniqueclass1,data1,data,2,data2,data3,data4,data5,171,18,data6,data7
text1,Data,Class,Uniqueclass2,data1,data,4,data2,data3,data4,data5,164,5,data6,data7
text1,Data,Class,Uniqueclass2,data1,data,4,data2,data3,data4,data5,121,21.5,data6,data7
text2,Data,Class,Uniqueclass2,data1,data,4,data2,data3,data4,data5,100,29,data6,data7
text2,Data,Class,Uniqueclass2,data1,data,4,data2,data3,data4,data5,85,21.5,data6,data7
text3,Data,Class,Uniqueclass3,data1,data,3,data2,data3,data4,data5,987,35,data6,data7
text3,Data,Class,Uniqueclass3,data1,data,3,data2,data3,data4,data5,286,18,data6,data7
text3,Data,Class,Uniqueclass3,data1,data,3,data2,data3,data4,data5,003,5,data6,data7
So, for instance, the first Uniqueclass spans the first two rows. I would therefore like to append a value to each of those rows, which would be '346' (the sum of 175 and 171). The result would look like this:
text1,Data,Class,Uniqueclass1,data1,data,2,data2,data3,data4,data5,175,12,data6,data7,346
text1,Data,Class,Uniqueclass1,data1,data,2,data2,data3,data4,data5,171,18,data6,data7,346
I would like to be able to do this for each of the unique classes.
Thanks, SMNALLY
I always like the defaultdict class for this type of thing.
Here would be my attempt:
from collections import defaultdict
class_col = 3
data_col = 11
# Read in the data
with open('path/to/your/file.csv', 'r') as f:
    # if you have a header on the file
    # header = f.readline().strip().split(',')
    data = [line.strip().split(',') for line in f]
# Sum the data for each unique class.
# assuming integers, replace int with float if needed
count = defaultdict(int)
for row in data:
    count[row[class_col]] += int(row[data_col])
# Append the relevant sum to the end of each row
for row in data:
    row.append(str(count[row[class_col]]))
# Write the results to a new csv file
with open('path/to/your/new_file.csv', 'w') as nf:
    nf.write('\n'.join(','.join(row) for row in data))
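The same idea can be written with the csv module, which also takes care of quoting when writing. This is a sketch rather than a drop-in replacement: it reads the first four rows of the sample data from an in-memory buffer instead of a file, and append_class_totals is a made-up helper name:

```python
import csv
import io
from collections import defaultdict

CLASS_COL = 3   # the 'Uniqueclass' column
SCORE_COL = 11  # the score column (12th field, zero-based index 11)

def append_class_totals(rows):
    """Return copies of rows with the per-class score total appended."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[CLASS_COL]] += int(row[SCORE_COL])
    return [row + [str(totals[row[CLASS_COL]])] for row in rows]

sample = """\
text1,Data,Class,Uniqueclass1,data1,data,2,data2,data3,data4,data5,175,12,data6,data7
text1,Data,Class,Uniqueclass1,data1,data,2,data2,data3,data4,data5,171,18,data6,data7
text1,Data,Class,Uniqueclass2,data1,data,4,data2,data3,data4,data5,164,5,data6,data7
text1,Data,Class,Uniqueclass2,data1,data,4,data2,data3,data4,data5,121,21.5,data6,data7
"""
rows = list(csv.reader(io.StringIO(sample)))
result = append_class_totals(rows)

out = io.StringIO()
csv.writer(out, lineterminator="\n").writerows(result)
print(out.getvalue())
```

With real files, both the input and the output would be opened with newline="" and passed to csv.reader and csv.writer in place of the StringIO buffers.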
