Using output .csv file from one python script in another - python

I am currently using two different types of python scripts. One extracts data and saves it as a CSV file, and the other characterizes data. Both work perfectly separately, but I am trying to find a way to characterize the data from the outputted CSV file without having to run them separately. Importing script1 into script2 is easy, but reading the CSV file from script1 is what I can't figure out. I am going to provide the output of script1 and where I am trying to insert it in script2:
# create file or append to file
filename = '%s.csv' % gps
if os.path.exists(filename):
    append_write = 'a' # append if already exists
else:
    append_write = 'w' # make a new file if not
# save file
with open('!/usr/bin/env python/%s.csv' % gps, mode=append_write) as features_file:
    features_writer = csv.writer(features_file, delimiter=' ', quotechar='"', quoting=csv.QUOTE_MINIMAL)
(!/usr/bin/env python has replaced the directory I am actually saving this CSV file in due to privacy reasons.)
I am trying to then place the file from this output into the following command:
x_new = pd.read_csv('filename %s.csv gps' , names = attributes)
I have tried a variety of ways to input the script1 output into this command, but can't find the correct way to do this. Please help me out. If any further information is needed please let me know.

This line is definitely wrong:
x_new = pd.read_csv('filename %s.csv gps' , names = attributes)
probably you mean:
x_new = pd.read_csv(filename , names = attributes)
Or
x_new = pd.read_csv('%s.csv' % gps , names = attributes)
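In either case, the literal word filename does not belong inside the string.
But you don't have to write the data to a file just to pass it along: you can keep manipulating it in memory. And if you want to save the file anyway, you still don't have to read it back; you can keep working with the array/frame you already have.
Also, open(filename, "a") is all you need: if the file doesn't exist it will be created, so the os.path.exists check is unnecessary. (open(filename, "w") also creates a missing file, but it erases an existing one.)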

You could use a formatted string literal (f-string) and put the variable directly into the file path (this assumes you're using Python 3.6+):
x_new = pd.read_csv(f'{gps}.csv')
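Here is a minimal sketch of the whole round trip, assuming script1 can expose its saving step as a function that returns the path it wrote. The names save_features and load_features, the sample rows, and the temporary directory are made up for illustration. Note that the writer in the question uses a space as delimiter, so whatever reads the file has to use the same one; with pandas that would be pd.read_csv(path, sep=' ', names=attributes).

```python
import csv
import os
import tempfile


def save_features(gps, rows, directory):
    """Append rows to <directory>/<gps>.csv and return the path written."""
    path = os.path.join(directory, '%s.csv' % gps)
    # Mode 'a' creates the file if it is missing and appends otherwise.
    with open(path, mode='a', newline='') as features_file:
        writer = csv.writer(features_file, delimiter=' ', quotechar='"',
                            quoting=csv.QUOTE_MINIMAL)
        writer.writerows(rows)
    return path


def load_features(path):
    """Read the file back using the same delimiter it was written with."""
    with open(path, newline='') as features_file:
        return list(csv.reader(features_file, delimiter=' ', quotechar='"'))


# Hypothetical usage: the "second script" reuses the path returned by the first.
workdir = tempfile.mkdtemp()
csv_path = save_features(1234567890, [[1.5, 'glitch a'], [2.5, 'clean']], workdir)
loaded = load_features(csv_path)
```

Returning the path from the function means the second script never has to rebuild the file name on its own.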

Related

Creating Multiple .txt files from an Excel file with Python Loop

My work is in the process of switching from SAS to Python and I'm trying to get a head start on it. I'm trying to write separate .txt files, each one representing all of the values of an Excel file column.
I've been able to upload my Excel sheet and create a .txt file fine, but for practical uses, I need to find a way to create a loop that will go through and make each column into its own .txt file, and name the file "ColumnName.txt".
Uploading Excel Sheet:
import pandas as pd
wb = pd.read_excel('placements.xls')
Creating single .txt file: (Named each column A-Z for easy reference)
with open("A.txt", "w") as f:
    for item in wb['A']:
        f.write("%s\n" % item)
Trying my hand at a for loop (to no avail):
import glob
for file in glob.glob("*.txt"):
    f = open(( file.rsplit( ".", 1 )[ 0 ] ) + ".txt", "w")
    f.write("%s\n" % item)
    f.close()
The first portion worked like a charm and gave me a .txt file with all of the relevant data.
When I used the glob command to attempt making some iterations, it doesn't error out, but only gives me one output file (A.txt) and the only data point in A.txt is the letter A. I'm sure my inputs are way off... after scrounging around forever it's what I found that made sense and ran, but I don't think I'm understanding the inputs going in to the command, or if what I'm running is just totally inaccurate.
Any help anyone would give would be much appreciated! I'm sure it's a simple loop, just hard to wrap your head around when you're so new to python programming.
Thanks again!

I suggest using pandas to write the files with to_csv, changing only the extension to .txt:
# Uploading Excel sheet:
import pandas as pd
df = pd.read_excel('placements.xls')
# Creating one .txt file per column, named after the column
for col in df.columns:
    print(col)
    # Python 3.6+
    df[col].to_csv(f"{col}.txt", index=False, header=False)
    # Older Python versions
    # df[col].to_csv("{}.txt".format(col), index=False, header=False)
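The glob attempt in the question loops over .txt files that already exist on disk, which is why only A.txt was touched. What you want to loop over is the columns. Here is a small sketch of that loop without pandas, where a plain dict stands in for the spreadsheet (the column names and values are made up); with a DataFrame, df.columns and df[col] play the same roles.

```python
import os
import tempfile

# Hypothetical stand-in for the spreadsheet: column name -> column values.
columns = {'A': [1, 2, 3], 'B': ['x', 'y', 'z']}

out_dir = tempfile.mkdtemp()
written = []
for name, values in columns.items():
    # One output file per column, named after the column.
    path = os.path.join(out_dir, name + '.txt')
    with open(path, 'w') as f:
        for item in values:
            f.write('%s\n' % item)
    written.append(path)
```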

How Do I Change the Data Stored in a Text File in Python?

I am trying to create a basic mathematical quiz and need to be able to store the name of the user next to their score. To ensure that I could edit the data dynamically regardless of the length of the user's name or the number of digits in their score, I decided to split up the name and score with a comma and use the split function. I'm new to file handling in python so don't know if I am using the wrong mode ("r+") but when I complete the quiz, my score is not recorded at all, nothing is added to the file. Here is my code:
for line in class_results.read():
    if student_full_name in line:
        student = line.split(",")
        student[1] = correct
        line.replace(line, "{},{}".format(student_full_name, student[1]))
    else:
        class_results.write("{},{}".format(student_full_name, correct))
Please let me know how I can get this system to work. Thank you in advance.

Yes, r+ opens the file for both reading and writing. To summarize the modes:
r when the file will only be read
w for only writing (an existing file with the same name will be erased)
a opens the file for appending; any data written to the file is automatically added to the end.
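As a quick illustration of the modes listed above, here is a small sketch; the file name and the sample lines are made up.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'modes.txt')

with open(path, 'w') as f:      # creates the file (or erases an existing one)
    f.write('alice,3\n')

with open(path, 'a') as f:      # always writes at the end
    f.write('bob,5\n')

with open(path, 'r+') as f:     # read and write; existing content is kept
    before = f.read()           # reading moves the position to the end
    f.write('carol,7\n')        # so this write lands after the old data

with open(path, 'r') as f:
    after = f.read()
```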
Instead of comma separation, I would recommend using JSON or YAML syntax; it fits this case better.
scores.json:
{
    "student1": 12,
    "student2": 798
}
The solution:
import json
with open(filename, "r+") as data:
    scores_dict = json.loads(data.read())
    scores_dict[student_full_name] = correct # updated if the name already exists, otherwise added
    data.seek(0)
    data.write(json.dumps(scores_dict))
    data.truncate()
scores.yml will look as follows:
student1: 45
student2: 7986
Solution:
import yaml
with open(filename, "r+") as data:
    scores_dict = yaml.safe_load(data.read()) # PyYAML has no yaml.loads; safe_load parses the text
    scores_dict[student_full_name] = correct # updated if the name already exists, otherwise added
    data.seek(0)
    data.write(yaml.dump(scores_dict, default_flow_style=False))
    data.truncate()
To install the YAML package for Python: pip install pyyaml
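The JSON variant can be tried end to end with a sketch like the following. The function name record_score, the temporary file, and the sample scores are made up for illustration. The second call writes a shorter file than before, which is the case where truncate() matters.

```python
import json
import os
import tempfile


def record_score(filename, student_full_name, correct):
    """Add or update one student's score in a JSON file of name -> score."""
    with open(filename, 'r+') as data:
        scores = json.load(data)
        scores[student_full_name] = correct
        data.seek(0)                 # go back to the start before rewriting
        json.dump(scores, data)
        data.truncate()              # drop leftover text if the new content is shorter


# Hypothetical usage with a throwaway file.
path = os.path.join(tempfile.mkdtemp(), 'scores.json')
with open(path, 'w') as f:
    json.dump({'student1': 12, 'student2': 798000}, f)

record_score(path, 'student2', 7)    # existing student: score is replaced
record_score(path, 'student3', 45)   # new student: entry is added

with open(path) as f:
    final_scores = json.load(f)
```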
Modifying a file in place is generally a poor way to do this. It risks errors causing the resulting file to be half new data, half old, with the split point being corrupted. The usual pattern is to write to a new file, then atomically replace the old file with the new file, so either you have the entire original old file and a partial new file, or the new file, not a mish-mash of both.
Given your example code, here is how you would fix it up to do that:
import csv
import os
from tempfile import NamedTemporaryFile

origfile = '...'
origdir = os.path.dirname(origfile)
# Open original file for read, and tempfile in same directory for write.
# delete=False keeps the temp file when it is closed, so it can replace the
# original afterwards (setting outf.delete later does not reliably do that).
with open(origfile, newline='') as inf, NamedTemporaryFile('w', dir=origdir, newline='', delete=False) as outf:
    old_results = csv.reader(inf)
    new_results = csv.writer(outf)
    for name, oldscore in old_results:
        if name == student_full_name:
            # Found our student, replace their score
            new_results.writerow((name, correct))
            # Then write out the rest of the lines unchanged
            new_results.writerows(old_results)
            # and we're done
            break
        else:
            new_results.writerow((name, oldscore))
    else:
        # else block on for loop executes if loop ran without break-ing
        new_results.writerow((student_full_name, correct))
# If we got here, no exceptions were raised and both files are closed.
# Atomically replace the original file with the temp file with updated data
os.replace(outf.name, origfile)
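Here is a self-contained sketch of the same write-then-replace pattern wrapped in a function, with cleanup of the temp file if anything goes wrong. The function name update_score, the file name, and the sample students are made up; the file is assumed to contain exactly two columns (name, score) per line.

```python
import csv
import os
import tempfile


def update_score(path, student, score):
    """Rewrite a name,score CSV with one score replaced or appended."""
    directory = os.path.dirname(os.path.abspath(path))
    # Same directory as the original, so os.replace stays on one filesystem.
    tmp = tempfile.NamedTemporaryFile('w', dir=directory, newline='', delete=False)
    try:
        with tmp, open(path, newline='') as inf:
            writer = csv.writer(tmp)
            found = False
            for name, old_score in csv.reader(inf):
                if name == student:
                    writer.writerow((name, score))
                    found = True
                else:
                    writer.writerow((name, old_score))
            if not found:
                writer.writerow((student, score))
        # Both files are closed here; swap the new file into place.
        os.replace(tmp.name, path)
    except BaseException:
        # Something failed: discard the partial temp file, keep the original.
        if os.path.exists(tmp.name):
            os.remove(tmp.name)
        raise


# Hypothetical usage with a throwaway file.
demo = os.path.join(tempfile.mkdtemp(), 'scores.csv')
with open(demo, 'w', newline='') as f:
    f.write('Ann Lee,3\r\nBo Chen,5\r\n')
update_score(demo, 'Bo Chen', 9)   # existing student
update_score(demo, 'Cy Diaz', 4)   # new student
with open(demo, newline='') as f:
    result = list(csv.reader(f))
```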

Write to a specific Excel worksheet using python

I want to overwrite specific cells in an already existing excel file. I've searched and found this answer, writing to existing workbook using xlwt. I've applied it as the following,
def wrtite_to_excel (self):
    # First I must open the specified Excel file; since open_file is in the same class, we can get it using self.sheet.
    bookwt = copy(self.workbook)
    sheetwt = bookwt.get_sheet(0)
    # Now I must know which column was estimated so that I can overwrite it
    colindex = self.columnBox.current() # returns the index of the estimated column
    for i in range (1, self.grid.shape[0]):
        if (str (self.sheet.cell_value(i,colindex)).lower () == self.missingBox.get().lower()):
            # write the estimated value:
            sheetwt.write (i, colindex, self.grid[i])
    bookwt.save(self.filename + '.out' + os.path.splitext(self.filename)[-1])
Notice that, self.workbook already exists in another method in the same class this way,
def open_file (self, file_name):
    try:
        self.workbook = xlrd.open_workbook(file_name)
I really don't know what this means, '.out' + os.path.splitext(self.filename)[-1], but it seems that it causes the modified file to be saved in the same path of the original one with a different name.
After running the program, a new Excel file gets saved in the same path as the original one; however, it is saved with a weird name, data.xlsx.out.xlsx, and it doesn't open. I think it's caused by '.out' + os.path.splitext(self.filename)[-1]. I removed that part in order to overwrite the original file instead of saving a copy, but after running the program I can no longer open the original file: I get an error message saying that the file can't be opened because the file format or extension is not valid.
What I really want is to modify the original file not to create a modified copy.
EDIT: SiHa's answer could modify the existing file without creating a copy if only the file name is specified like this,
bookwt.save(self.filename)
And, it could save a new copy this way,
filepath, fileext = os.path.splitext(self.filename)
bookwt.save(filepath + '_out' + fileext)
Or with the line from my code in the question. However, with all of these methods the same problem remains: after the file is modified, it can't be opened. After searching, I found that the problem could be solved by changing the extension of the original file from .xlsx to .xls. After making this change, the problem was solved. This is the link where I found the solution: http://www.computing.net/answers/office/the-file-formatfile-extension-is-not-valid/19454.html
Thank you.

To explain the line in question:
self.filename + '.out' concatenates '.out' onto the end of the original filename.
+ os.path.splitext(self.filename)[-1] splits the filename into a (root, extension) tuple and then concatenates the last element (the extension) back onto the end again.
So you end up with data.xlsx.out.xlsx
You should just be able to use bookwt.save(self.filename), although you may run into errors if you still have the file open for reading. It may be safer to create a copy in a similar manner to the above:
filepath, fileext = os.path.splitext(self.filename)
bookwt.save(filepath + '_out' + fileext)
Which should give you data_out.xlsx
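The two naming schemes can be compared without touching any workbook. This is just a sketch; the file name data.xlsx is taken from the question and everything else is plain os.path.

```python
import os.path

filename = 'data.xlsx'

# The expression from the question: name + '.out' + extension
from_question = filename + '.out' + os.path.splitext(filename)[-1]

# Splitting first lets the suffix go in front of the extension instead
root, ext = os.path.splitext(filename)
suggested = root + '_out' + ext
```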
You can save Excel files as CSV files. When Python opens them, they show the values as plain text separated by commas. For example, a spreadsheet with columns A to B and rows 1 to 2 would look like this:
A1,B1
A2,B2
This means that you can edit them like normal text files, and Excel can still open them.

Save Outfile with Python Loop in SPSS

Ok so I've been playing with python and spss to achieve almost what I want. I am able to open the file and make the changes, however I am having trouble saving the files (and those changes). What I have (using only one school in the schoollist):
begin program.
import spss, spssaux
import os
schoollist = ['brow']
for x in schoollist:
    school = 'brow'
    school2 = school + '06.sav'
    filename = os.path.join("Y:\...\Data", school2) #In this instance, Y:\...\Data\brow06.sav
    spssaux.OpenDataFile(filename)
    #--This block are the changes and not particularly relevant to the question--#
    cur=spss.Cursor(accessType='w')
    cur.SetVarNameAndType(['name'],[8])
    cur.CommitDictionary()
    for i in range(cur.GetCaseCount()):
        cur.fetchone()
        cur.SetValueChar('name', school)
        cur.CommitCase()
    cur.close()
    #-- What am I doing wrong here? --#
    spss.Submit("save outfile = filename".)
end program.
Any suggestions on how to get the save outfile to work with the loop? Thanks. Cheers

In your save call, you are not resolving filename to its actual value. It should be something like this:
spss.Submit("""save outfile="%s".""" % filename)
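To see what the substitution does without needing SPSS installed, here is a sketch that only builds the command strings. The directory C:\data and the second school 'oakh' are made up; in a real program block each string would be passed to spss.Submit.

```python
import ntpath  # Windows path rules, regardless of the OS this runs on

data_dir = r'C:\data'           # hypothetical directory
schoollist = ['brow', 'oakh']   # 'oakh' is a made-up second school

commands = []
for school in schoollist:
    filename = ntpath.join(data_dir, school + '06.sav')
    # %s is replaced by the value of filename before SPSS sees the command
    commands.append('SAVE OUTFILE="%s".' % filename)
```

Note that the loop uses the loop variable for the school name; the code in the question assigns school = 'brow' inside the loop, so it would process the same file for every entry in schoollist.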
I'm unfamiliar with spssaux.OpenDataFile and can't find any documentation on it (besides references to working with SPSS data files in unicode mode). But what I am going to guess is the problem is that it grabs the SPSS data file for use in the Python program block, but it isn't actually opened to further submit commands.
Here I make a test case that, instead of using spssaux.OpenDataFile to grab the file, does it all with SPSS commands and just inserts the necessary parts via Python. So first let's create some fake data to work with.
*Prepping the example data files.
FILE HANDLE save /NAME = 'C:\Users\andrew.wheeler\Desktop\TestPython'.
DATA LIST FREE / A .
BEGIN DATA
1
2
3
END DATA.
SAVE OUTFILE = "save\Test1.sav".
SAVE OUTFILE = "save\Test2.sav".
SAVE OUTFILE = "save\Test3.sav".
DATASET CLOSE ALL.
Now here is a pared-down version of what your code is doing. I have inserted the LIST ALL. command so you can check in the output that it is adding the variable of interest to the file.
*Sequential opening the data files and appending data name.
BEGIN PROGRAM.
import spss
import os
schoollist = ['1','2','3']
for x in schoollist:
    school2 = 'Test' + x + '.sav'
    filename = os.path.join("C:\\Users\\andrew.wheeler\\Desktop\\TestPython", school2)
    #opens the SPSS file and makes a new variable for the school name
    spss.Submit("""
GET FILE = "%s".
STRING Name (A20).
COMPUTE Name = "%s".
LIST ALL.
SAVE OUTFILE = "%s".
""" %(filename, x,filename))
END PROGRAM.

How to rename files and change the file type as well?

I have a list with .dbf files which I want to change to .csv files. By hand I open them in excel and re-save them as .csv, but this takes too much time.
Now I made a script which changes the file name, but when I open it, it is still a .dbf file type (although it is called .csv). How can I rename the files in such a way that the file type also changes?
My script uses the following (the dbf and csv file names are listed in a separate csv file):
IN = dbffile name
OUT = csvfile name
for output_line in lstRename:
    shutil.copyfile(IN,OUT)

Changing the name of a file (and the extension is just part of the complete name) has absolutely no effect on the contents of the file. You need to somehow convert the contents from one format to the other.
Using my dbf module and python it is quite simple:
import dbf
IN = 'some_file.dbf'
OUT = 'new_name.csv'
dbf.Table(IN).export(filename=OUT)
This will create a .csv file that is actually in csv format.
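The point about renaming can be checked directly: copying a file under a new extension leaves its bytes exactly as they were. A small sketch, where the payload is made-up binary data standing in for real dBASE content:

```python
import os
import shutil
import tempfile

workdir = tempfile.mkdtemp()
src = os.path.join(workdir, 'table.dbf')
dst = os.path.join(workdir, 'table.csv')

# Made-up binary payload standing in for real dBASE content.
with open(src, 'wb') as f:
    f.write(b'\x03\x15\x01\x01binary-not-csv')

# Same operation as in the question: copy under a name ending in .csv
shutil.copyfile(src, dst)

with open(src, 'rb') as a, open(dst, 'rb') as b:
    same_bytes = a.read() == b.read()
```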
If you have ever used VB or looked into VBA, you can write a simple excel script to open each file, save it as csv and then save it with a new name.
Use the macro recorder to record you once doing it yourself and then edit the resulting script.
I have now created an application that automates this. It's called xlsto (look for the xls.exe release file). It lets you pick a folder and convert all xls files to csv (or any other type).
You need a converter. Search for dbf2csv on Google.
It depends on what you want to do. It seems like you want to convert files to other types. There are many converters out there, but a computer alone doesn't know every file type. For that you will need to download some software. If all you want to do is change the file extension (e.g. .png, .doc, .wav), then you can set your computer to let you change both the name and the extension. I hope I helped in some way :)
Download the dbfpy library from http://sourceforge.net/projects/dbfpy/?source=dlp
# Python 2 code
import csv,glob
from dbfpy import dbf
entrada = raw_input(" enter your dbf folder ::")
lisDbf = glob.glob(entrada + "\\*dbf")
for db in lisDbf:
    print db
    try:
        dbfFile = dbf.Dbf(open(db,'r'))
        csvFile = csv.writer(open(db[:-3] + "csv", 'wb'))
        headers = range(len(dbfFile.fieldNames))
        allRows = []
        for row in dbfFile:
            rows = []
            for num in headers:
                rows.append(row[num])
            allRows.append(rows)
        csvFile.writerow(dbfFile.fieldNames)
        for row in allRows:
            print row
            csvFile.writerow(row)
    except Exception,e:
        print e
It might be that the new file name is "xzy.csv.dbf". Usually in C# I put quotes in the filename. This forces the OS to change the filename. Try something like "Xzy.csv" in quotes.
