I am trying to open two CSV files and subtract the values of one column from the corresponding column in the other file. I read the data into lists and then use map with operator.sub, but I am getting stuck on outputting the result, either printed or written to another CSV.
Each file holds two columns: a1, b1 in the first and a2, b2 in the second. I want to subtract b1(i) from b2(i) and write a new CSV file with the difference "b". The real data is too large to copy here, but as an example, if the first file contains 1,10 \n 2,15 \n 3,20 and the second contains 1,20 \n 2,20 \n 3,30, I should get 10, 5, 10 as a list, an array, or an output file.
My problem: I am not getting any output, only an error saying 'list' object is not callable. I have read a lot about the built-in functions involved, but I still can't see where I am going wrong.
import csv
try:
    from itertools import imap
except ImportError:
    # Python 3...
    imap = map
from operator import sub

a = []
b = []
c = []

with open('1.csv') as csvDataFile:
    csvReader = csv.reader(csvDataFile)
    for row in csvReader:
        a.append(row[1])

with open('2.csv') as csvDataFile:
    csvReader = csv.reader(csvDataFile)
    for row in csvReader:
        b.append(row[1])

c = list(imap(sub, a, b))
print(c)
Just create an output file and write the data into it. You can use pandas or the standard csv library:
with open('my_output.csv', mode='w', newline='') as output_file:
    writer = csv.writer(output_file, delimiter=',', quotechar='"', quoting=csv.QUOTE_MINIMAL)
    writer.writerow(['John Smith', 'Accounting', 'November'])
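Applied to the question, a minimal sketch (assuming the two input files are 1.csv and 2.csv as above, and that differences.csv is a placeholder output name) could convert the column values to numbers before subtracting and then write one difference per row:

import csv
from operator import sub

b1 = []
b2 = []

# Read the second column of each input file; float() is needed because
# csv.reader yields strings, and operator.sub cannot subtract strings.
with open('1.csv') as f:
    for row in csv.reader(f):
        b1.append(float(row[1]))

with open('2.csv') as f:
    for row in csv.reader(f):
        b2.append(float(row[1]))

# b2 - b1, element by element
differences = list(map(sub, b2, b1))

# Write one difference per row; 'differences.csv' is just a placeholder name.
with open('differences.csv', mode='w', newline='') as output_file:
    writer = csv.writer(output_file)
    for value in differences:
        writer.writerow([value])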
Based on the following link:
I can easily format a single MAC, but I'm having an issue formatting multiple addresses from a CSV file. When I run the script, it converts them, but it converts each one about six times. If I add a return, it only converts the first one, again six times.
def readfile_csv():
    with open('ap_macs.csv', 'r', encoding='utf-8-sig') as csv_file:
        csv_reader = csv.reader(csv_file, delimiter=',')
        for lines in csv_reader:
            data = lines[0]
            for i in range(0, 12, 2):
                format_mac = ':'.join(data[i:i + 2] for i in range(0, 12, 2))
                print(format_mac.swapcase())
Ideally, I'd love to be able to do this with pandas and Excel, but the indexing is killing me. I'd appreciate any help. Thank you.
ap_macs
A1B2C3D4E5F6
A1B2C3D4E5F7
A1B2C3D4E5F8
A1B2C3D4E5F9
a1b2c3d4e5f6
a1b2c3d4e5f7
a1b2c3d4e5f8
a1b2c3d4e5f9
You could use pandas for this. Note that pandas is overkill if all you're using it for is to read the csv.
import pandas as pd

df = pd.read_csv('ap_macs.csv')

# Slice the mac addresses into chunks
# This list will contain one `pd.Series` each for the second through last chunks
chunks = [df["ap_macs"].str[i:i+2] for i in range(2, 12, 2)]

# Then concatenate all the chunks, with a separator, to the first chunk
df["MAC"] = df['ap_macs'].str[0:2].str.cat(chunks, ":")
which gives:
ap_macs MAC
0 A1B2C3D4E5F6 A1:B2:C3:D4:E5:F6
1 A1B2C3D4E5F7 A1:B2:C3:D4:E5:F7
2 A1B2C3D4E5F8 A1:B2:C3:D4:E5:F8
3 A1B2C3D4E5F9 A1:B2:C3:D4:E5:F9
4 a1b2c3d4e5f6 a1:b2:c3:d4:e5:f6
5 a1b2c3d4e5f7 a1:b2:c3:d4:e5:f7
6 a1b2c3d4e5f8 a1:b2:c3:d4:e5:f8
7 a1b2c3d4e5f9 a1:b2:c3:d4:e5:f9
Of course, you can overwrite the ap_macs column if you want, but I created a new column for this demonstration.
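If you then want the formatted addresses back on disk, or as a plain Python list, a short follow-up sketch (formatted_macs.csv is just a placeholder name):

# Write the dataframe, including the new MAC column, back to a CSV file.
# index=False keeps pandas from adding an extra index column.
df.to_csv('formatted_macs.csv', index=False)

# Or, if you only need the formatted addresses as a Python list:
macs = df["MAC"].tolist()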
If you want to use your csv.reader approach, you need to create your string first, and then print it.
def readfile_csv():
    # csv_data = []
    with open('ap_macs.csv', 'r', encoding='utf-8-sig') as csv_file:
        csv_reader = csv.reader(csv_file, delimiter=',')
        for row in csv_reader:
            data = row[0]
            # build the whole formatted string first, then print it once per row
            format_mac = ':'.join(data[i:i + 2] for i in range(0, 12, 2)).swapcase()
            print(format_mac)
            # csv_data.append(format_mac)
    # return csv_data
which will print:
a1:b2:c3:d4:e5:f6
a1:b2:c3:d4:e5:f7
a1:b2:c3:d4:e5:f8
a1:b2:c3:d4:e5:f9
A1:B2:C3:D4:E5:F6
A1:B2:C3:D4:E5:F7
A1:B2:C3:D4:E5:F8
A1:B2:C3:D4:E5:F9
Note that printing is not the same as returning data; if you actually want to use this data outside the function, you'll have to return it (uncomment the commented lines).
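For example, once the function returns the list (with the commented lines above uncommented), the caller could write the formatted addresses to a new file; a small sketch, with formatted_macs.csv as a placeholder name:

import csv

macs = readfile_csv()

with open('formatted_macs.csv', 'w', newline='') as out_file:
    writer = csv.writer(out_file)
    for mac in macs:
        # one formatted MAC address per row
        writer.writerow([mac])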
I'm trying to create a CSV file from a number of different lists. The data in the lists are all either integers or floating-point numbers. I can write the field names correctly into my CSV file, but when I try to add the data from my lists as new rows, I get TypeError: 'int' object is not subscriptable. My code is as follows:
import csv

fileName = str(input("input a file name > "))

with open(fileName + '.csv', 'w') as csvfile:
    fieldnames = ['Generation', 'Juviniles', 'Adults', 'Seniles', 'Total']
    writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
    writer.writeheader()
    i = 0
    for n in range(gens):
        writer.writerow({'Generation': str(i), 'Juviniles': str(juviniles[i]), 'Adults': str(adults[i]), 'Seniles': str(seniles[i]), 'Total': str(total[i])})
        i = i + 1
It worked fine when I only entered the first two columns (Generation and Juviniles), but when I tried to extend it to five columns, it throws this error.
My first thought was that I had to convert the data in the lists to strings (hence the str() calls), but that made no difference.
Any help would be greatly appreciated.
It seems you are trying to zip together separate lists, each representing a column in the CSV file. Here is an easy way to do it:
import csv
from itertools import count
try:
    from itertools import izip
except ImportError:
    # Python 3: izip was removed; the built-in zip is already lazy
    izip = zip

# Some other details here

with open(fileName + '.csv', 'w', newline='') as csvfile:
    fieldnames = ['Generation', 'Juviniles', 'Adults', 'Seniles', 'Total']
    writer = csv.writer(csvfile)
    writer.writerow(fieldnames)
    for row in izip(count(), juviniles, adults, seniles, total):
        writer.writerow(row)
Discussion
You don't need DictWriter; a regular writer will do, since we are not dealing with dictionaries
count() is an iterator that yields 0, 1, 2, 3, ...
izip (or zip in Python 3) combines all of the columns to make the rows, as the sketch below shows
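As a tiny illustration of that zipping step, with made-up lists standing in for the question's juviniles and adults:

from itertools import count

juviniles = [10, 12]
adults = [5, 6]

# zip pairs the i-th element of every input, so each tuple becomes one CSV row;
# count() supplies the generation number 0, 1, 2, ...
rows = list(zip(count(), juviniles, adults))
print(rows)  # [(0, 10, 5), (1, 12, 6)]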
I have two files, the first one is called book1.csv, and looks like this:
header1,header2,header3,header4,header5
1,2,3,4,5
1,2,3,4,5
1,2,3,4,5
The second file is called book2.csv, and looks like this:
header1,header2,header3,header4,header5
1,2,3,4
1,2,3,4
1,2,3,4
My goal is to copy the column that contains the 5's in book1.csv to the corresponding column in book2.csv.
The problem with my code seems to be that it is not appending correctly, nor is it selecting just the column that I want to copy. It also gives an error that I have selected an incorrect index position. The output is as follows:
header1,header2,header3,header4,header5
1,2,3,4
1,2,3,4
1,2,3,41,2,3,4,5
Here is my code:
import csv

with open('C:/Users/SAM/Desktop/book2.csv', 'a') as csvout:
    write = csv.writer(csvout, delimiter=',')
    with open('C:/Users/SAM/Desktop/book1.csv', 'rb') as csvfile1:
        read = csv.reader(csvfile1, delimiter=',')
        header = next(read)
        for row in read:
            row[5] = write.writerow(row)
What should I do to get this to append properly?
Thanks for any help!
What about something like this? I read in both books, append the last element of each book1 row to the corresponding book2 row, store the combined rows in a list, and then write the contents of that list to a new .csv file.
import csv

with open('book1.csv', 'r') as book1:
    with open('book2.csv', 'r') as book2:
        reader1 = csv.reader(book1, delimiter=',')
        reader2 = csv.reader(book2, delimiter=',')
        both = []
        fields = next(reader1)  # read header row
        next(reader2)           # read and ignore header row
        for row1, row2 in zip(reader1, reader2):
            row2.append(row1[-1])
            both.append(row2)

with open('output.csv', 'w', newline='') as output:
    writer = csv.writer(output, delimiter=',')
    writer.writerow(fields)  # write a header row
    writer.writerows(both)
Although some of the code above will work, it is not really scalable, and a vectorised approach is needed. Getting comfortable with numpy or pandas will make tasks like this easier, so it is worth learning a bit of them.
You can download pandas from the pandas website.
# Load pandas
import pandas as pd

# Load each file into a pandas DataFrame (DataFrame.from_csv has been removed
# from recent pandas versions, so use read_csv instead)
data1 = pd.read_csv('csv1.csv', sep=',')
data2 = pd.read_csv('csv2.csv', sep=',')

# Now add 'header5' from data1 to data2
data2['header5'] = data1['header5']

# Save it back to csv (index=False avoids writing an extra index column)
data2.to_csv('output.csv', index=False)
Regarding the "error that I have selected an incorrect index position," I suspect this is because you're using row[5] in your code. Indexing in Python starts from 0, so if you have A = [1, 2, 3, 4, 5] then to get the 5 you would do print(A[4]).
Assuming the two files have the same number of rows and the rows are in the same order, I think you want to do something like this:
import csv

# Open the two input files, which I've renamed to be more descriptive,
# and also an output file that we'll be creating
with open("four_col.csv", mode='r') as four_col, \
     open("five_col.csv", mode='r') as five_col, \
     open("five_output.csv", mode='w', newline='') as outfile:

    four_reader = csv.reader(four_col)
    five_reader = csv.reader(five_col)
    five_writer = csv.writer(outfile)

    _ = next(four_reader)  # Ignore headers for the 4-column file
    headers = next(five_reader)
    five_writer.writerow(headers)

    for four_row, five_row in zip(four_reader, five_reader):
        last_col = five_row[-1]  # Or use five_row[4]
        four_row.append(last_col)
        five_writer.writerow(four_row)
Why not read the files line by line and use the -1 index to find the last item?
endings = []
with open('book1.csv') as book1:
    for line in book1:
        # if not header line:
        endings.append(line.rstrip('\n').split(',')[-1])

linecounter = 0
with open('book2.csv') as book2:
    for line in book2:
        # if not header line:
        print(line.rstrip('\n') + ',' + endings[linecounter])  # or write to file
        linecounter += 1
You should also handle the case where the two files do not have the same number of rows.
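One simple guard (just a sketch, reusing the endings list built above) is to compare the row counts before combining the files:

with open('book2.csv') as book2:
    book2_lines = [line.rstrip('\n') for line in book2]

# Refuse to combine the files if the row counts differ, instead of
# silently producing a short or misaligned output.
if len(book2_lines) != len(endings):
    raise ValueError(
        f"row count mismatch: book2 has {len(book2_lines)} rows, "
        f"book1 endings has {len(endings)}"
    )

for line, ending in zip(book2_lines, endings):
    print(line + ',' + ending)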
I am trying to read the lines of a text file into a list or array in python. I just need to be able to individually access any item in the list or array after it is created.
The text file is formatted as follows:
0,0,200,0,53,1,0,255,...,0.
Where the ... is above, there actual text file has hundreds or thousands more items.
I'm using the following code to try to read the file into a list:
text_file = open("filename.dat", "r")
lines = text_file.readlines()
print lines
print len(lines)
text_file.close()
The output I get is:
['0,0,200,0,53,1,0,255,...,0.']
1
Apparently it is reading the entire file into a list of just one item, rather than a list of individual items. What am I doing wrong?
You will have to split your string into a list of values using split()
So,
lines = text_file.read().split(',')
EDIT:
I didn't realise there would be so much traction to this. Here's a more idiomatic approach.
import csv
with open('filename.csv', 'r') as fd:
    reader = csv.reader(fd)
    for row in reader:
        pass  # do something with row
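For the question's file, which is a single line of comma-separated numbers, one way to fill in that loop (a sketch; float() is used because the sample values include a trailing "0.") is:

import csv

values = []
with open('filename.dat', 'r') as fd:
    for row in csv.reader(fd):
        # each row is a list of strings; convert them to numbers
        values.extend(float(item) for item in row)

print(values[2])  # individual items are now directly accessible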
You can also use numpy's loadtxt, like this:
from numpy import loadtxt
lines = loadtxt("filename.dat", comments="#", delimiter=",", unpack=False)
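With the one-line file from the question, loadtxt returns a 1-D array of floats, so individual items are indexable right away; a small usage sketch:

from numpy import loadtxt

lines = loadtxt("filename.dat", comments="#", delimiter=",", unpack=False)

# For a single-row file this is a 1-D array, so lines[2] is the third value
# (200.0 for the sample data in the question).
print(lines[2])
print(lines.shape)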
So you want to create a list of lists... We need to start with an empty list
list_of_lists = []
next, we read the file content, line by line
with open('data') as f:
    for line in f:
        inner_list = [elt.strip() for elt in line.split(',')]
        # alternatively, if you need to use the file content as numbers:
        # inner_list = [int(elt.strip()) for elt in line.split(',')]
        list_of_lists.append(inner_list)
A common use case is columnar data, but our units of storage are the rows of the file, which we have read one by one, so you may want to transpose your list of lists. This can be done with the following idiom
by_cols = zip(*list_of_lists)
Another common use is to give a name to each column
col_names = ('apples sold', 'pears sold', 'apples revenue', 'pears revenue')
by_names = {}
for i, col_name in enumerate(col_names):
    by_names[col_name] = by_cols[i]
so that you can operate on homogeneous data items
mean_apple_prices = [money / fruits for money, fruits in
                     zip(by_names['apples revenue'], by_names['apples sold'])]
Most of what I've written can be sped up using the csv module from the standard library. Another third-party module is pandas, which lets you automate most aspects of a typical data analysis (but has a number of dependencies).
Update: While in Python 2 zip(*list_of_lists) returns a different (transposed) list of lists, in Python 3 the situation has changed and zip(*list_of_lists) returns a zip object that is not subscriptable.
If you need indexed access you can use
by_cols = list(zip(*list_of_lists))
that gives you a list of lists in both versions of Python.
On the other hand, if you don't need indexed access and what you want is just to build a dictionary indexed by column names, a zip object is just fine...
file = open('some_data.csv')
names = get_names(next(file))
columns = zip(*((x.strip() for x in line.split(',')) for line in file))
d = {}
for name, column in zip(names, columns): d[name] = column
This question is asking how to read the comma-separated value contents from a file into an iterable list:
0,0,200,0,53,1,0,255,...,0.
The easiest way to do this is with the csv module as follows:
import csv
with open('filename.dat', newline='') as csvfile:
    spamreader = csv.reader(csvfile, delimiter=',')
Now, you can easily iterate over spamreader like this:
for row in spamreader:
    print(', '.join(row))
See documentation for more examples.
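If you need random access afterwards, as the question asks, you can pull the reader's rows into a list first; a small sketch using the question's filename.dat:

import csv

with open('filename.dat', newline='') as csvfile:
    spamreader = csv.reader(csvfile, delimiter=',')
    rows = list(spamreader)  # the reader is consumed into a list of rows

first_row = rows[0]          # the sample file has a single row
print(first_row[2])          # -> '200' (values are read as strings)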
I'm a bit late, but you can also read the text file into a dataframe and then convert the corresponding column to a list.
import pandas as pd

lista = pd.read_csv('path_to_textfile.txt', sep=",", header=None)[0].tolist()
For example:
lista=pd.read_csv('data/holdout.txt',sep=',',header=None)[0].tolist()
Note: the column names of the resulting dataframe will be integers; I chose 0 because I was extracting only the first column.
Better this way:
def txt_to_lst(file_path):
    try:
        with open(file_path, "r") as stopword:
            lines = stopword.read().split('\n')
        print(lines)
        return lines
    except Exception as e:
        print(e)
I am trying to read in a table from a .CSV file which should have 5 columns.
But some rows have corrupt data, giving them more than 5 columns.
How do I reject those rows and continue reading?
Using
temp = read_table(folder + r'\temp.txt', sep=r'\t')
just gives an error and stops the program.
I am new to Python, so please help.
Thanks
Look into using Python's csv module.
Without testing the damaged file it is difficult to say whether this will do the trick; however, csv.reader reads each row of a CSV file as a list of strings, so you could check whether the list has 5 elements and proceed that way.
A code example:
import csv

out = []
with open('file.csv', newline='') as csvfile:
    # use the delimiter that actually separates your fields
    # (',' for a CSV, '\t' for a tab-separated file)
    reader = csv.reader(csvfile, delimiter=',')
    for row in reader:
        if len(row) == 5:
            out.append(row)
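If you would rather stay with pandas, as in the question, recent pandas versions (1.3 and later) can skip malformed lines for you; a sketch, assuming a tab-separated temp.txt as in the question:

import pandas as pd

# on_bad_lines='skip' drops any row whose field count doesn't match the header
# instead of raising an error (older pandas used error_bad_lines=False instead).
temp = pd.read_table('temp.txt', sep='\t', on_bad_lines='skip')
print(temp.shape)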