Opening xlsx workbook in Python, automating file name

I'm a heavy R user but very new to Python. I'm trying to edit a Python script that is part of a larger workflow. The script starts off by opening an .xlsx document. This document is produced in an earlier part of the workflow and the naming convention is always YYYYMMDD_location (ex: 20191101_Potomac). I would like to set up the Python script so that it automatically pastes today's date and that location variable into the path.
In R, to not have to manually update the path name each time I run the script, I would do something like:
#R
library(openxlsx)
dir_name <- 'C:/path/to/file/'
location_selection <- 'Potomac'
date <- format(Sys.Date(), "%Y%m%d")
read.xlsx(paste0(dir_name, date, "_", location_selection, ".xlsx"))
I've looked up how to set up something similar in Python (ex: Build the full path filename in Python), but I'm not making much progress in producing something that works.
# Python
import datetime
dir_name = 'C:\path\to\file'
location_selection = todays_date.strftime('%Y%m%d')+'_Potomac'
suffix = '.xlsx'
file = os.path.join(dir_name, location_selection + suffix)
book = xlrd.open_workbook(file)

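There is no need for the os module here, since you already supply the full directory path. An f-string can assemble the file name:
import xlrd  # note: xlrd 2.0+ reads only .xls; .xlsx needs xlrd < 2.0 or another reader
from datetime import date
dir_name = r'C:\path\to\file'
location_selection = f"{date.today().strftime('%Y%m%d')}_Potomac"
suffix = '.xlsx'
# raw f-string, so the backslash separator is not treated as an escape
file_name = rf'{dir_name}\{location_selection}{suffix}'
book = xlrd.open_workbook(file_name)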
We can add more variables as well:
import xlrd
from datetime import date
dir_name = r'C:\path\to\file'
today = date.today().strftime('%Y%m%d')
location = "Potomac"
suffix = '.xlsx'
# raw f-string, so the backslash separator is not treated as an escape
file_name = rf'{dir_name}\{today}_{location}{suffix}'
book = xlrd.open_workbook(file_name)
Finally, we can create a function that can be reused:
import xlrd
from datetime import date
def get_filename(location, dir_name=r'C:\path\to\file', suffix=".xlsx", date_format='%Y%m%d'):
    today = date.today().strftime(date_format)
    # raw f-string, so the backslash separator is not treated as an escape
    return rf'{dir_name}\{today}_{location}{suffix}'
book = xlrd.open_workbook(get_filename("Potomac"))
book2 = xlrd.open_workbook(get_filename("Montreal"))
book3 = xlrd.open_workbook(get_filename("NewYork"))
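For comparison, here is a minimal sketch of the same idea using pathlib instead of string concatenation, which avoids hand-written backslashes altogether. The function name build_workbook_path and the optional today parameter are hypothetical; they are not part of the answer above and only make the sketch easy to try with a fixed date.

```python
from datetime import date
from pathlib import Path


def build_workbook_path(location, dir_name, suffix=".xlsx", today=None):
    """Return <dir_name>/<YYYYMMDD>_<location><suffix> as a Path (hypothetical helper)."""
    today = today or date.today()
    # date objects accept strftime codes directly inside an f-string
    return Path(dir_name) / f"{today:%Y%m%d}_{location}{suffix}"


# Fixed date used only to make the example reproducible
example = build_workbook_path("Potomac", "C:/path/to/file", today=date(2019, 11, 1))
print(example.name)  # 20191101_Potomac.xlsx
```

The result can be turned into a plain string with str(example) before handing it to whichever workbook reader the script uses.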

Related

How can I know the path of a directory created by os.makedirs() and store it in a variable for later use?

I want to build a function that fills names from a CSV file into a Word document using the docx library. I also want to create an empty folder with os.makedirs(). The folder is created, but I can't get its path to later join it with the Word document's name and save the document in that folder.
Here is my code:
import docx
import pandas as pd
from datetime import datetime
import os
from docx2pdf import convert
from pathlib import Path
def auto_fill(x,y):
    database=pd.read_csv(x)
    df=pd.DataFrame(database)
    df=df.dropna(axis=0)
    targeted_doc=docx.Document(y)
    date = datetime.date(datetime.now())
    strdate = date.strftime("%m-%d-%Y")
    path = strdate
    newfile = os.makedirs(path)
    newfile_path = path(newfile)
    for i in range(len(df.Name)+1):
        for n in range (len(targeted_doc.paragraphs)+1):
            if targeted_doc.paragraphs[n].text=="Name of trainee":
                name=targeted_doc.paragraphs[n].runs[0].text=df.at[i,'Name']
        for m in range(len(targeted_doc.paragraphs) + 1):
            if targeted_doc.paragraphs[m].text == "tissue date":
                date = targeted_doc.paragraphs[n].runs[0].text = strdate
        for l in range(len(targeted_doc.paragraphs) + 1):
            if targeted_doc.paragraphs[n].text == "tserial number":
                sr_num = targeted_doc.paragraphs[l].runs[0].text = df.at[i, 'serial number']
        name_of_file = (f"{df.at[i, 'Name']}.docx")
        outputdoc=targeted_doc.save(name_of_file)
        path_of_document=path(outputdoc)
        completesave = os.path.join(path_of_document, name_of_file)
        convert(path_of_document,newfile_path+f"{name_of_file}.pdf")
auto_fill("database.csv","01.docx")

If I'm understanding what you're trying to accomplish, just reuse the path variable you made earlier. os.makedirs(path) returns None, so there is nothing useful to capture from the call; the directory it created is simply whatever path holds.

If you don't change the working directory, you can keep using the same relative path. If you do change it, get the script's current directory with os.getcwd() and join it with the path using os.path.join().
You can store the result of os.path.join(os.getcwd(), path) in a variable and use it later. You can compose that absolute path before creating the directory, so you have the entire path from the start.
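The two answers above can be sketched as follows: build the absolute path first, create the directory from it, and keep the path rather than the return value of os.makedirs(). The helper name make_dated_dir and its today parameter are hypothetical; the MM-DD-YYYY format is taken from the question.

```python
import os
from datetime import date


def make_dated_dir(base_dir, today=None):
    """Create <base_dir>/<MM-DD-YYYY> and return its absolute path (hypothetical helper)."""
    today = today or date.today()
    # Compose the absolute path before creating anything on disk
    target = os.path.abspath(os.path.join(base_dir, today.strftime("%m-%d-%Y")))
    # os.makedirs() returns None, so keep `target` as the path to reuse
    os.makedirs(target, exist_ok=True)
    return target
```

A caller could then write out_dir = make_dated_dir(os.getcwd()) and build each output file name with os.path.join(out_dir, name_of_file).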

Python: How do I add files to a new directory with a time-stamp?

I'm new to Python. I am trying to create a script that adds imported files to a new directory with a timestamp, as a daily backup. How do I point to the new directory when its name changes every day? Here is my script:
from arcgis.gis import GIS
gis = GIS("https://arcgis.com", "xxx", "xxx")
items = gis.content.search(query="type:Feature Service, owner:xxx", max_items=5000,)
import datetime
import shutil
import os
now = datetime.datetime.today()
nTime = now.strftime("%d-%m-%Y")
source = r"C:\Users\Bruger\xxx\xxx\xxx\Backup\Backup"
dest = os.path.join(source+nTime)
if not os.path.exists(dest):
    os.makedirs(dest) # create dest dir
source_files = os.listdir(source)
for f in source_files:
    source_file = os.path.join(source,f)
    if os.path.isfile(source_file): # check if source file is a file, not a dir
        shutil.move(source_file,dest) # move only the files (not dirs) to dest dir
for item in items:
    service_title = item.title
    if service_title == "Bøjninger_09_06":
        try:
            service_title = item.title
            version = "1"
            fgdb_title = service_title+version
            result = item.export(fgdb_title, "File Geodatabase")
            result.download(r"C:\Users\Bruger\xxx\xxx\xxx\Backup\?????")  # what shall I write here in order to point to the new folder?
            result.delete()
        except:
            print("An error occurred downloading"+" "+service_title)

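You could try an f-string instead of a regular string. Note that dest in your script already holds the full path of the new folder, so interpolate only the date part (nTime), like this:
rf"C:\Users\Bruger\xxx\xxx\xxx\Backup\Backup{nTime}"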

You can use something like this, with a raw string so that \U and \x are not treated as escape sequences:
result.download(r"C:\Users\Bruger\xxx\xxx\xxx\Backup\Backup{}".format(nTime))
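As a rough sketch of the point behind both answers: the dated folder path only needs to be computed once, and the same variable can then be reused wherever the folder is needed. The helper name dated_backup_dir and its now parameter are hypothetical; the source + DD-MM-YYYY naming follows the question's script.

```python
import os
from datetime import datetime


def dated_backup_dir(source, now=None):
    """Return source + DD-MM-YYYY, creating the folder if it is missing (hypothetical helper)."""
    now = now or datetime.today()
    # Same naming scheme as the question: the date is appended to the source path
    dest = source + now.strftime("%d-%m-%Y")
    os.makedirs(dest, exist_ok=True)
    return dest
```

With dest = dated_backup_dir(source), the question's download call could presumably receive dest directly instead of a hand-typed path; that call is taken from the question and is not exercised here.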

python zip extract with timestamp under Windows [duplicate]

I'm trying to extract files from a zip file using Python 2.7.1 (on Windows, fyi) and each of my attempts shows extracted files with Modified Date = time of extraction (which is incorrect).
import os,zipfile
outDirectory = 'C:\\_TEMP\\'
inFile = 'test.zip'
fh = open(os.path.join(outDirectory,inFile),'rb')
z = zipfile.ZipFile(fh)
for name in z.namelist():
    z.extract(name,outDirectory)
fh.close()
I also tried using the .extractall method, with the same results.
import os,zipfile
outDirectory = 'C:\\_TEMP\\'
inFile = 'test.zip'
zFile = zipfile.ZipFile(os.path.join(outDirectory,inFile))
zFile.extractall(outDirectory)
Can anyone tell me what I'm doing wrong?
I'd like to think this is possible without having to post-correct the modified time per How do I change the file creation date of a Windows file?.

Well, it does take a little post-processing, but it's not that bad:
import os
import zipfile
import time
outDirectory = 'C:\\TEMP\\'
inFile = 'test.zip'
fh = open(os.path.join(outDirectory,inFile),'rb')
z = zipfile.ZipFile(fh)
for f in z.infolist():
    name, date_time = f.filename, f.date_time
    name = os.path.join(outDirectory, name)
    with open(name, 'wb') as outFile:
        outFile.write(z.open(f).read())
    date_time = time.mktime(date_time + (0, 0, -1))
    os.utime(name, (date_time, date_time))
Okay, maybe it is that bad.

Based on Jia103's answer, I have developed a function (using Python 2.7.14) which preserves directory and file dates AFTER everything has been extracted. This isolates any ugliness in the function, and you can also use zipfile.ZipFile.extractall() or whatever zip extract method you want:
import time
import zipfile
import os
# Restores the timestamps of zipfile contents.
def RestoreTimestampsOfZipContents(zipname, extract_dir):
    for f in zipfile.ZipFile(zipname, 'r').infolist():
        # path to this extracted f-item
        fullpath = os.path.join(extract_dir, f.filename)
        # still need to adjust the dt o/w item will have the current dt
        date_time = time.mktime(f.date_time + (0, 0, -1))
        # update dt
        os.utime(fullpath, (date_time, date_time))
To preserve dates, just call this function after your extract is done.
Here's an example, from a script I wrote to zip/unzip game save directories:
z = zipfile.ZipFile(zipname, 'r')
print 'I have opened zipfile %s, ready to extract into %s' \
    % (zipname, gamedir)
try: os.makedirs(gamedir)
except: pass # Most of the time dir already exists
z.extractall(gamedir)
RestoreTimestampsOfZipContents(zipname, gamedir) #<-- USED
print '%s zip extract done' % GameName[game]
Thanks everyone for your previous answers!

Based on Ethan Fuman's answer, I have developed this version (using Python 2.6.6) which is a little more concise:
import os
import time
from zipfile import ZipFile
zf = ZipFile('archive.zip', 'r')
for zi in zf.infolist():
    zf.extract(zi)
    date_time = time.mktime(zi.date_time + (0, 0, -1))
    os.utime(zi.filename, (date_time, date_time))
zf.close()
This extracts to the current working directory and uses the ZipFile.extract() method to write the data instead of creating the file itself.

Based on Ber's answer, I have developed this version (using Python 2.7.11), which also accounts for directory mod dates.
from os import path, utime
from sys import exit
from time import mktime
from zipfile import ZipFile
def unzip(zipfile, outDirectory):
    dirs = {}
    with ZipFile(zipfile, 'r') as z:
        for f in z.infolist():
            name, date_time = f.filename, f.date_time
            name = path.join(outDirectory, name)
            z.extract(f, outDirectory)
            # still need to adjust the dt o/w item will have the current dt
            date_time = mktime(f.date_time + (0, 0, -1))
            if (path.isdir(name)):
                # changes to dir dt will have no effect right now since files are
                # being created inside of it; hold the dt and apply it later
                dirs[name] = date_time
            else:
                utime(name, (date_time, date_time))
    # done creating files, now update dir dt
    for name in dirs:
        date_time = dirs[name]
        utime(name, (date_time, date_time))
if __name__ == "__main__":
    unzip('archive.zip', 'out')
    exit(0)
Since directories are being modified as the extracted files are being created inside them, there appears to be no point in setting their dates with os.utime until after the extraction has completed, so this version caches the directory names and their timestamps till the very end.
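The answers above target Python 2. A minimal Python 3 sketch of the same approach (extract each member, then set its modification time from the archive entry, deferring directories to the end) might look like this. The function name extract_with_timestamps is hypothetical, and the sketch assumes the archive's stored times are local times, as the answers above do.

```python
import os
import time
import zipfile


def extract_with_timestamps(zip_path, out_dir):
    """Extract all members and restore their modification times (hypothetical helper)."""
    dir_times = {}
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            # extract() returns the path that was actually written
            target = archive.extract(info, out_dir)
            # date_time is (Y, M, D, h, m, s); pad it to the 9 fields
            # mktime() expects, with -1 meaning "DST unknown"
            mtime = time.mktime(info.date_time + (0, 0, -1))
            if info.is_dir():
                # writing files into a directory changes its time, so apply later
                dir_times[target] = mtime
            else:
                os.utime(target, (mtime, mtime))
    # all files are in place; now set the directory times
    for path, mtime in dir_times.items():
        os.utime(path, (mtime, mtime))
```

Calling extract_with_timestamps('test.zip', 'out') would replace the separate extractall() and timestamp-restoring steps with one pass.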

Change CSV name to CSV date time python

I want to change the CSV file name (in this case Example.csv) to a date-based name. I have already imported datetime with from datetime import datetime.
This is the statement I use to create the CSV file:
with open('Example.csv', 'w') as csvFile:
I want my output to be:
20180820.csv
20180821.csv
20180822.csv ... etc
And if I run the script more than once on the same day, I want my output to be:
20180820.csv (first time that I run the script)
20180820(2).csv (second run)
... etc

Something like this:
import pandas as pd
import datetime
current_date = datetime.datetime.now()
# zero-padded YYYYMMDD, e.g. 20180820
filename = current_date.strftime('%Y%m%d')
# df is assumed to be your existing DataFrame
df.to_csv(filename + '.csv')

Since you know how to create the file name, you just have to check whether it already exists:
def exists(filename):
    try:
        with open(filename) as f:
            file_exists = True
    except FileNotFoundError:
        file_exists = False
    return file_exists
name = 'some_date.csv'
c = 0
while exists(name):
    c += 1
    name = 'some_date({}).csv'.format(c)
# do stuff with name

Here is a solution for the case where you can keep a 'progressive' counter that tracks the files already written. Otherwise you need to check the disk content, which is somewhat more complex.
import datetime
progressive = 0
today = datetime.date.today()
todaystr = str(today)
rootname = todaystr
progressive += 1
if progressive > 1:
    rootname = todaystr + '_' + str(progressive)
filename = rootname + '.csv'

Count the number of files in the directory with the same date in their names and use that information to create the file name. Here is a solution for both your problems.
import datetime
import os
now = datetime.datetime.now().strftime("%Y%m%d")
# count the number of files already in the output dir with date (now)
count = len([name for name in os.listdir('./output/') if (os.path.isfile(os.path.join('./output/', name)) and now in name)])
csv_name = './output/' + now
if count > 0:
    csv_name = csv_name + "(" + str(count+1) +")"
csv_name = csv_name + ".csv"
with open(csv_name, 'w') as csvFile:
    pass
Good Luck.
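Combining the ideas from the answers above, a small sketch that returns the first unused name in the requested pattern (YYYYMMDD.csv, then YYYYMMDD(2).csv, YYYYMMDD(3).csv, and so on) could look like this. The helper name dated_csv_name and its today parameter are hypothetical, and the sketch assumes the output directory already exists.

```python
import os
from datetime import date


def dated_csv_name(out_dir, today=None):
    """Return the first unused path: YYYYMMDD.csv, then YYYYMMDD(2).csv, ... (hypothetical helper)."""
    stem = (today or date.today()).strftime("%Y%m%d")
    candidate = os.path.join(out_dir, stem + ".csv")
    run = 1
    # keep bumping the counter until the name is not taken on disk
    while os.path.exists(candidate):
        run += 1
        candidate = os.path.join(out_dir, f"{stem}({run}).csv")
    return candidate
```

The returned path can be passed straight to open(..., 'w'), so each run on the same day writes to a new file.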

I found the solution:
Just store the current time in a variable and concatenate it with .csv (I also put the CSV output in a specific folder called output). Finally, I open the csvFile using that variable as the name.
now = datetime.now().strftime('%Y%m%d-%Hh%M')
csvname = './output/' + now + '.csv'
with open(csvname, 'w') as csvFile:
I still can't solve the second part of my problem: if I run the code more than once, I want the date-time name to change or to get (2), (3), etc. appended.

How to use python to turn a .dbf into a shapefile

I have been scouring the internet trying to find a Pythonic way to process this data.
Every day we will receive a load of data in .dbf format (hopefully); we then need to save this data as a shapefile.
Does anyone have any links or suggestions for this process?

To append the file's creation_date to its name, you need to obtain the creation date with os.stat() and then rename the file with os.rename(). You can format the date string with date.strftime().
import datetime, os
filename = 'original.ext'
fileinfo = os.stat(filename)
creation_date = datetime.date.fromtimestamp(fileinfo.st_ctime)
os.rename(filename, filename + '-' + creation_date.strftime('%Y-%m-%d'))

Off the top of my head:
import os
import datetime
myfile = "test.txt"
creationdate = os.stat(myfile).st_ctime
timestamp = datetime.datetime.fromtimestamp(creationdate)
datestr = datetime.datetime.strftime(timestamp, "%Y%m%d")
os.rename(myfile, os.path.splitext(myfile)[0] + datestr + os.path.splitext(myfile)[1])
renames test.txt to test20110221.txt.
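A small variation on the two snippets above, sketched with pathlib so that the date lands before the extension and the new name is returned. The function name add_date_to_name and the underscore separator are hypothetical choices. Note that on Unix st_ctime is the time of the last metadata change, not the creation time.

```python
import datetime
import os
from pathlib import Path


def add_date_to_name(filename):
    """Rename e.g. test.txt to test_YYYYMMDD.txt and return the new Path (hypothetical helper)."""
    src = Path(filename)
    # On Unix, st_ctime is the last metadata change rather than the creation time
    stamp = datetime.date.fromtimestamp(src.stat().st_ctime)
    dst = src.with_name(f"{src.stem}_{stamp:%Y%m%d}{src.suffix}")
    os.rename(src, dst)
    return dst
```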

It was in model builder all along!
# (generated by ArcGIS/ModelBuilder)
# Usage: DBF2SHAPEFILE <XY_Table> <Y_Field> <X_Field> <Output_Feature_Class>
# ---------------------------------------------------------------------------
# Import system modules
import sys, string, os, arcgisscripting, datetime
# Adds the creation date to all of the previous shapefiles in that folder
filename = 'D:/test.txt'
fileinfo = os.stat(filename)
creation_date = datetime.date.fromtimestamp(fileinfo.st_ctime)
os.rename(filename, filename + '-' + creation_date.strftime('%Y-%m-%d'))
# Create the Geoprocessor object
gp = arcgisscripting.create()
# Load required toolboxes...
gp.AddToolbox("C:/Program Files/ArcGIS/ArcToolbox/Toolboxes/Data Management Tools.tbx")
# Script arguments...
XY_Table = sys.argv[1]
Y_Field = sys.argv[2]
X_Field = sys.argv[3]
Output_Feature_Class = sys.argv[4]
# Local variables...
Layer_Name_or_Table_View = ""
# Process: Make XY Event Layer...
gp.MakeXYEventLayer_management(XY_Table, X_Field, Y_Field, Layer_Name_or_Table_View, "")
# Process: Copy Features...
gp.CopyFeatures_management(Layer_Name_or_Table_View, Output_Feature_Class, "", "0", "0", "0")

If you want to do it without ArcGIS, you could use OGR's Python bindings or the ogr2ogr utility through a subprocess. You could also call the utility from a Windows batch file, which would be a lot faster than starting the Arc process for every file if you have many to do...
As you know, it's not a question of changing the extension; a shapefile is a specific format.
