At work, I need to carry out this process every month, which involves downloading some files from a server, copying them into a new folder, deleting the existing files, etc, which is quite mundane. I tasked myself with writing a Python script to do this.
One step of this process is opening an Excel file and saving it as a CSV file. Is there any way of doing this in Python?
EDIT:
The main difficulty I have with this is two-fold. I know how to write a CSV file using Python's csv library, but:
1. How do I read Excel files into Python?
2. Does reading an Excel file and then writing it out as a CSV file give the same result as opening the file in Excel and performing Save As CSV manually?
Is there a better way of doing this than the way suggested here?
I guess I really want an answer for 1. I can find out 2 myself by using something like WinMerge.
To manipulate Excel files in Python, take a look at this question and answer, which use the xlrd package. Example:
from xlrd import open_workbook
book = open_workbook('example.xlsx')
sheet = book.sheet_by_index(1)  # sheet indices are zero-based, so this is the second sheet
To manipulate CSV files in Python, take a look at this question and answer and the documentation, which use the csv library. Example:
import csv
# Python 2 syntax; in Python 3, use open('example.csv', newline='') and print(', '.join(row)).
with open('example.csv', 'rb') as csvfile:
    spamreader = csv.reader(csvfile, delimiter=' ', quotechar='|')
    for row in spamreader:
        print ', '.join(row)
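Putting the two halves of this answer together, here is a minimal sketch of copying one worksheet into a CSV file. It assumes `sheet` is an xlrd sheet object like the one above and relies only on its `nrows` attribute and `row_values()` method; the function name `sheet_to_csv` and the file names are made up for illustration. Keep in mind that xlrd returns numbers and dates as floats, so the output will not necessarily match what Excel's own Save As CSV produces, and that xlrd 2.0 and later reads only .xls files, not .xlsx.

```python
import csv


def sheet_to_csv(sheet, csv_path):
    """Write every row of an xlrd-style sheet to csv_path.

    `sheet` only needs an `nrows` attribute and a `row_values(index)` method,
    which xlrd sheet objects provide.
    """
    # newline="" lets the csv module control the line endings (Python 3).
    with open(csv_path, "w", newline="", encoding="utf-8") as csvfile:
        writer = csv.writer(csvfile)
        for row_index in range(sheet.nrows):
            writer.writerow(sheet.row_values(row_index))
```

With the workbook opened as above, `sheet_to_csv(book.sheet_by_index(0), 'example.csv')` would write the first sheet.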
I had to do this for a project in a course I took.
You don't need to involve Excel in the process, as you can simply create a CSV file and then open it in any program you like.
If you know how to write to a file as txt, then you can create a CSV. I am not sure if this is the most efficient way, but it works.
When setting up the file, instead of something like:
data = [["Name", "Age", "Gender"], ["Joe", 13, "M"], ["Jill", 14, "F"]]
filename = input("What do you want to save the file as? ")
filename = filename + ".txt"
with open(filename, "w") as file:
    for row in data:
        # Join the values with commas; this avoids a trailing comma on each line.
        file.write(",".join(str(value) for value in row))
        file.write("\n")
simply replace the file extension from txt to csv like this:
filename = filename + ".csv"
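Joining values with commas by hand works only while no value itself contains a comma, a quote, or a line break. As a hedged alternative, the sketch below formats the same kind of `data` list with the standard csv module, which adds quoting where it is needed. The helper name `rows_to_csv_text` is hypothetical, and it returns the CSV text as a string rather than writing a file purely to keep the example short.

```python
import csv
import io


def rows_to_csv_text(rows):
    """Return the rows formatted as CSV text, quoted where necessary."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


# "Smith, Joe" contains a comma, so the csv module wraps it in double quotes.
data = [["Name", "Age", "Gender"], ["Smith, Joe", 13, "M"], ["Jill", 14, "F"]]
print(rows_to_csv_text(data))
```

To write to a file instead, pass a file opened with `open(filename, "w", newline="")` to `csv.writer`.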
My work is in the process of switching from SAS to Python and I'm trying to get a head start on it. I'm trying to write separate .txt files, each one representing all of the values of an Excel file column.
I've been able to upload my Excel sheet and create a .txt file fine, but for practical uses, I need to find a way to create a loop that will go through and make each column into its own .txt file, and name the file "ColumnName.txt".
Uploading Excel Sheet:
import pandas as pd
wb = pd.read_excel('placements.xls')
Creating single .txt file: (Named each column A-Z for easy reference)
with open("A.txt", "w") as f:
    for item in wb['A']:
        f.write("%s\n" % item)
Trying my hand at a for loop (to no avail):
import glob
for file in glob.glob("*.txt"):
    f = open(( file.rsplit( ".", 1 )[ 0 ] ) + ".txt", "w")
    f.write("%s\n" % item)
    f.close()
The first portion worked like a charm and gave me a .txt file with all of the relevant data.
When I used the glob command to attempt making some iterations, it doesn't error out, but only gives me one output file (A.txt) and the only data point in A.txt is the letter A. I'm sure my inputs are way off... after scrounging around forever it's what I found that made sense and ran, but I don't think I'm understanding the inputs going in to the command, or if what I'm running is just totally inaccurate.
Any help anyone would give would be much appreciated! I'm sure it's a simple loop, just hard to wrap your head around when you're so new to python programming.
Thanks again!
I suggest using pandas to write the files with to_csv, changing only the extension to .txt:
# Uploading Excel Sheet:
import pandas as pd
df = pd.read_excel('placements.xls')
# Creating one .txt file per column, named after the column
for col in df.columns:
    print(col)
    # Python 3.6+
    df[col].to_csv(f"{col}.txt", index=False, header=False)
    # Earlier Python versions:
    # df[col].to_csv("{}.txt".format(col), index=False, header=False)
I'm attempting to rewrite specific cells in a csv file using Python.
However, whenever I try to modify any part of the csv file, the file ends up being emptied (its contents become blank).
Minimal code example:
import csv
ReadFile = open("./Resources/File.csv", "rt", encoding = "utf-8-sig")
Reader = csv.reader(ReadFile)
WriteFile = open("./Resources/File.csv", "wt", encoding = "utf-8-sig")
Writer = csv.writer(WriteFile)
for row in Reader:
    row[3] = 4
    Writer.writerow(row)
ReadFile.close()
WriteFile.close()
'File.csv' looks like this:
1,2,3,FOUR,5
1,2,3,FOUR,5
1,2,3,FOUR,5
1,2,3,FOUR,5
1,2,3,FOUR,5
In this example, I'm attempting to change 'FOUR' to '4'.
Upon running this code, the csv file becomes empty instead.
So far, the only other question related to this that I've managed to find is this one, which does not seem to be dealing with rewriting specific cells in a csv file but instead deals with writing new rows to a csv file.
I'd be very grateful for any help anyone reading this could provide.
The following should work:
import csv
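# Read everything into memory first, so the file can be closed before it is reopened for writing.
with open("./Resources/File.csv", "rt", encoding = "utf-8-sig", newline = "") as ReadFile:
    lines = list(csv.reader(ReadFile))
# newline = "" stops the csv module from writing extra blank lines on Windows.
with open("./Resources/File.csv", "wt", encoding = "utf-8-sig", newline = "") as WriteFile:
    Writer = csv.writer(WriteFile)
    for line in lines:
        line[3] = 4
        Writer.writerow(line)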
When you open a file with the w mode, its contents are deleted and writing starts from scratch. The file is therefore already empty at the point when you start to read it.
Try writing to another file (like FileTemp.csv) and at the end of the program renaming FileTemp.csv to File.csv.
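The temporary-file approach suggested above could look roughly like this. It is a sketch, assuming Python 3 and a file in which every row has enough columns, as in the question; the function name `replace_column` and its parameters are invented for illustration. `os.replace` swaps the finished temporary file in for the original only after all rows have been written.

```python
import csv
import os
import tempfile


def replace_column(path, column_index, new_value):
    """Rewrite the CSV at `path`, setting one column to new_value in every row."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file next to the original so the final rename
    # stays on the same filesystem.
    fd, temp_path = tempfile.mkstemp(suffix=".csv", dir=directory)
    try:
        with os.fdopen(fd, "w", newline="", encoding="utf-8-sig") as target, \
                open(path, "r", newline="", encoding="utf-8-sig") as source:
            writer = csv.writer(target)
            for row in csv.reader(source):
                row[column_index] = new_value
                writer.writerow(row)
        os.replace(temp_path, path)  # swap the new file in for the original
    except BaseException:
        os.remove(temp_path)  # clean up the partial file on failure
        raise
```

For the file in the question, the call would be `replace_column("./Resources/File.csv", 3, 4)`.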
I am trying to append several csv files into a single csv file using python while adding the file name (or, even better, a sub-string of the file name) as a new variable. All files have headers. The following script does the trick of merging the files, but does not cover the file name as variable issue:
import glob
filenames=glob.glob("/filepath/*.csv")
outputfile=open("out.csv","a")
for line in open(str(filenames[1])):
    outputfile.write(line)
for i in range(1,len(filenames)):
    f = open(str(filenames[i]))
    f.next()
    for line in f:
        outputfile.write(line)
outputfile.close()
I was wondering if there are any good suggestions. I have about 25k small size csv files (less than 100KB each).
You can use Python's csv module to parse the CSV files for you, and to format the output. Example code (untested):
import csv
with open(output_filename, "wb") as outfile:
    writer = None
    for input_filename in filenames:
        with open(input_filename, "rb") as infile:
            reader = csv.DictReader(infile)
            if writer is None:
                field_names = ["Filename"] + reader.fieldnames
                writer = csv.DictWriter(outfile, field_names)
                writer.writeheader()
            for row in reader:
                row["Filename"] = input_filename
                writer.writerow(row)
A few notes:
Always use with to open files. This makes sure they will get closed again when you are done with them. Your code doesn't correctly close the input files.
CSV files should be opened in binary mode. (This applies to Python 2; in Python 3, open them in text mode with newline='' instead.)
Indices start at 0 in Python. Your code skips the first file, and includes the lines from the second file twice. If you just want to iterate over a list, you don't need to bother with indices in Python. Simply use for x in my_list instead.
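A Python 3 variant of the example above might look like this. It is a sketch that assumes every input file is non-empty and shares the same header; the function name `merge_csv_files` and the choice to store only the file's base name in the new column are illustrative, not part of the original answer.

```python
import csv
import os


def merge_csv_files(filenames, output_filename):
    """Concatenate CSV files with identical headers, adding a Filename column."""
    with open(output_filename, "w", newline="") as outfile:
        writer = None
        for input_filename in filenames:
            with open(input_filename, "r", newline="") as infile:
                reader = csv.DictReader(infile)
                if writer is None:
                    # Build the output header once, from the first file.
                    field_names = ["Filename"] + list(reader.fieldnames)
                    writer = csv.DictWriter(outfile, field_names)
                    writer.writeheader()
                for row in reader:
                    # Store only the base name, e.g. "a.csv" rather than "/filepath/a.csv".
                    row["Filename"] = os.path.basename(input_filename)
                    writer.writerow(row)
```

It could be called as `merge_csv_files(sorted(glob.glob("/filepath/*.csv")), "out.csv")`. Writing the output outside the input folder keeps it from being picked up by the glob on a later run.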
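Simple changes will achieve what you want. Each line read from a file still ends with a newline, so strip it before appending the extra column.
For the first line
outputfile.write(line) -> outputfile.write(line.rstrip('\n') + ',file\n')
and later
outputfile.write(line.rstrip('\n') + ',' + filenames[i] + '\n')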
I have an Excel sheet with IDs such as je2456, je2645, je2893, ...
I would like to save them in a list in Python.
But it's throwing errors while importing, like
'No such file or directory exists.'
Make sure that the file you are reading from is in the same folder.
import csv
def csv_reader(input_file_name):
    with open(input_file_name, newline='') as csvfile:
        return list(csv.reader(csvfile))
Now call the function and save the output:
# Make sure to add the extension for the file name, whatever it may be.
my_data_list = csv_reader("Your_Input_File_Here.csv")
Now your my_data_list is a list containing all the rows from the CSV file.
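A "No such file or directory" error usually means the relative file name is being resolved against the current working directory, which is not necessarily the folder the script lives in. The sketch below, which assumes the sheet has been saved as a CSV file, checks the path up front and flattens the rows into a single list of IDs. The function name `read_ids` and the file name in the usage note are placeholders.

```python
import csv
from pathlib import Path


def read_ids(csv_path):
    """Return every non-empty cell of the CSV file as one flat list."""
    path = Path(csv_path).resolve()
    if not path.is_file():
        # Report the absolute path that was tried, which makes a wrong
        # working directory easy to spot.
        raise FileNotFoundError(f"No CSV file found at {path}")
    with path.open(newline="") as csvfile:
        return [
            cell.strip()
            for row in csv.reader(csvfile)
            for cell in row
            if cell.strip()
        ]
```

To load a file that sits next to the script regardless of where the script is started from, one option is `read_ids(Path(__file__).parent / "Your_Input_File_Here.csv")`.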
I'm using Python's csv module to do some reading and writing of csv files.
I've got the reading fine and appending to the csv fine, but I want to be able to overwrite a specific row in the csv.
For reference, here's my reading and then writing code to append:
#reading
b = open("bottles.csv", "rb")
bottles = csv.reader(b)
bottle_list = []
bottle_list.extend(bottles)
b.close()
#appending
b=open('bottles.csv','a')
writer = csv.writer(b)
writer.writerow([bottle,emptyButtonCount,100, img])
b.close()
And I'm using basically the same for the overwrite mode (which isn't correct; it just overwrites the whole csv file):
b=open('bottles.csv','wb')
writer = csv.writer(b)
writer.writerow([bottle,btlnum,100,img])
b.close()
In the second case, how do I tell Python I need a specific row overwritten? I've scoured Google and other Stack Overflow posts to no avail. I assume my limited programming knowledge is to blame rather than Google.
I will add to Steven's answer:
import csv
bottle_list = []
# Read all data from the csv file.
with open('a.csv', 'rb') as b:
    bottles = csv.reader(b)
    bottle_list.extend(bottles)
# data to override in the format {line_num_to_override:data_to_write}.
line_to_override = {1:['e', 'c', 'd'] }
# Write data to the csv file and replace the lines in the line_to_override dict.
with open('a.csv', 'wb') as b:
    writer = csv.writer(b)
    for line, row in enumerate(bottle_list):
        data = line_to_override.get(line, row)
        writer.writerow(data)
You cannot overwrite a single row in the CSV file. You'll have to write all the rows you want to a new file and then rename it back to the original file name.
Your pattern of usage may fit a database better than a CSV file. Look into the sqlite3 module for a lightweight database.
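To illustrate the database suggestion, here is a small sketch using sqlite3 in which a single row is overwritten in place. The table name `bottles` and its columns (`name`, `quantity`, `level`, `img`) are guesses based on the four values written in the question, not something the question specifies.

```python
import sqlite3


def create_bottles_table(connection):
    """Create the bottles table if it does not exist yet."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS bottles ("
        "name TEXT PRIMARY KEY, quantity INTEGER, level INTEGER, img TEXT)"
    )


def save_bottle(connection, name, quantity, level, img):
    """Insert a bottle, or overwrite the existing row with the same name."""
    with connection:  # commits on success, rolls back on error
        connection.execute(
            "INSERT OR REPLACE INTO bottles (name, quantity, level, img) "
            "VALUES (?, ?, ?, ?)",
            (name, quantity, level, img),
        )


connection = sqlite3.connect(":memory:")  # pass a file name to keep the data
create_bottles_table(connection)
save_bottle(connection, "cola", 3, 100, "cola.png")
save_bottle(connection, "cola", 4, 100, "cola.png")  # replaces the first row
print(connection.execute("SELECT * FROM bottles").fetchall())
```

After the second call only one row for "cola" remains, holding the new quantity; no other rows are rewritten.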