Python: finding files (png) with a creation date between two set dates - python

I am relatively new to Python and I'm trying to write a script that finds files (photos) created between two dates and puts them into a folder.
For that, I need to get the creation date of each file somehow (I'm on Windows).
I already have everything else coded; I just need the date of each picture. It would also be interesting to see in which form the date is returned. Ideally it would be m/d/y or d/m/y (d = day, m = month, y = year).
Thank you all in advance! I am new to this forum.

I imagine you are already listing the files somehow. If so, use
os.stat(path).st_ctime to get the creation time on Windows, then format it as a string with the datetime module.
https://docs.python.org/2/library/stat.html#stat.ST_CTIME
https://stackoverflow.com/a/39359270/928680
That example shows how to convert the mtime (modification time), but the same applies to the ctime (creation time on Windows).
Once you have the ctime, it's relatively simple to check whether it falls within a range:
https://stackoverflow.com/a/5464465/928680
You will need to do your date logic before converting to a string.
Here is one solution. It is not very efficient; it just shows one way this can be done.
import os
from datetime import datetime
def filter_files(path, start_date, end_date, date_format="%Y"):
    result = []
    start_time_obj = datetime.strptime(start_date, date_format)
    end_time_obj = datetime.strptime(end_date, date_format)
    for file in os.listdir(path):
        # stat the full path; a bare file name only resolves in the current directory
        full_path = os.path.join(path, file)
        c_time = datetime.fromtimestamp(os.stat(full_path).st_ctime)
        if start_time_obj <= c_time <= end_time_obj:
            result.append("{}, {}".format(full_path, c_time))
    return result
if __name__ == "__main__":
    print("\n".join(filter_files("/Users/Jagadish/Desktop", "2017-05-31", "2017-06-02", "%Y-%m-%d")))
cheers!
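For Python 3, the same idea can be sketched with pathlib. This is only a minimal sketch, not the answerer's code: the names `files_created_between` and `as_day_month_year` and the `*.png` default pattern are illustrative assumptions. Keep in mind that `st_ctime` is the creation time on Windows but the time of the last metadata change on most Unix systems.

```python
from datetime import datetime
from pathlib import Path


def files_created_between(folder, start, end, pattern="*.png"):
    """Return (path, created) pairs for files whose st_ctime lies in [start, end].

    On Windows st_ctime is the creation time; on most Unix systems it is
    the time of the last metadata change, so results differ there.
    """
    matches = []
    for path in Path(folder).glob(pattern):
        if not path.is_file():
            continue
        created = datetime.fromtimestamp(path.stat().st_ctime)
        if start <= created <= end:
            matches.append((path, created))
    return matches


def as_day_month_year(moment):
    """Format a datetime as d/m/y, for example 31/05/2017."""
    return moment.strftime("%d/%m/%Y")
```

The comparison is done on datetime objects and the d/m/y string is produced only at the end, which matches the advice to do the date logic before converting to a string.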

See the Python os package for basic system commands, including directory listings with options. You'll be able to extract the file date. See the Python datetime package for date manipulation.
Also, check the available Windows commands on your version: most of them have search functions with a date parameter; you could simply have an OS system command return the needed file names.

You can use subprocess to run a shell command that returns the metadata of a file.
import re
from subprocess import check_output
meta_data = check_output('wmic datafile where Name="C:\\\\Users\\\\username\\\\Pictures\\\\xyz.jpg"', shell=True)
# Note that you have to use '\\\\' instead of '\\' for specifying path of the file
pattern = re.compile(r'\b(\d{14})\b.*')
re.findall(pattern,meta_data.decode())
=> ['20161007174858'] # This is the created date of your file in format - YYYYMMDDHHMMSS
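The 14-digit string can then be turned into a datetime for comparisons or reformatting. A small sketch, assuming the value always has the YYYYMMDDHHMMSS shape shown above; the helper name `parse_wmic_timestamp` is made up for illustration:

```python
from datetime import datetime


def parse_wmic_timestamp(stamp):
    """Parse a 14-digit YYYYMMDDHHMMSS string into a datetime."""
    return datetime.strptime(stamp, "%Y%m%d%H%M%S")


created = parse_wmic_timestamp("20161007174858")
print(created.strftime("%d/%m/%Y"))  # prints 07/10/2016 (day/month/year)
```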

Here is my solution. Pillow's Image module can read the metadata of the image file. We then look up tag 36867 in that metadata, which is DateTimeOriginal. Finally, I convert the returned string to a datetime object, which gives you the flexibility to do whatever you need with it. Here is the code:
from PIL import Image
from datetime import datetime
# Get the creationTime
creationTime = Image.open('myImage.PNG')._getexif()[36867]
# Convert creationTime to datetime object
creationTime = datetime.strptime(creationTime, '%Y:%m:%d %H:%M:%S')

Related

Adding changing dates in a string

I'm using xlwings to pull in an Excel file from a shared drive.
The file names change daily based on the date, e.g.:
dailysummary_20220429.xlsx
dailysummary_20220428.xlsx
dailysummary_20220427.xlsx
dailysummary_20220426.xlsx
I'm trying to make the code dynamic so that it pulls in today's file each day, but I'm struggling with the syntax to make this work. Any help would be much appreciated. So far I have:
from datetime import date
workbook = xw.Book(r'I:\Analytics\dailysummary_{date.today()}.xlsx')
sheet1 = workbook.sheets['OutputTable'].used_range.value
dailydata = pd.DataFrame(sheet1)
Thanks so much!
As suggested by MattR above, you need to format the date the way you want. Your approach will work, but you are using the wrong type of string literal for your purposes.
workbook = xw.Book(rf'I:\Analytics\dailysummary_{date.today().strftime("%Y%m%d")}.xlsx')
An f-string lets you do the interpolation. A raw string (prefixed with an r) only stops backslashes from being treated as escape characters; on its own it does no interpolation at all. Combining both prefixes as rf keeps the backslashes in the Windows path literal while still filling in the date.
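The r and f prefixes can also be combined, which is handy for Windows paths. A minimal sketch, with a hypothetical helper `daily_summary_path` and the folder taken from the question:

```python
from datetime import date


def daily_summary_path(day):
    """Build the workbook path for a given date (folder taken from the question)."""
    # rf'' = raw (backslashes are kept literally) + f-string ({...} is evaluated)
    return rf'I:\Analytics\dailysummary_{day:%Y%m%d}.xlsx'


print(daily_summary_path(date(2022, 4, 29)))
# prints I:\Analytics\dailysummary_20220429.xlsx
```

The `{day:%Y%m%d}` form passes the format code straight to the date object, so no separate strftime call is needed.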
I like breaking things up a little more, might look like overkill though. It allows for easier refactoring. The pathlib module will help you in the future if files start to move, or you get into wanting to use the pathlib.Path.cwd() or .home() to get base paths without needing to change the code all of the time.
The today_str allows you to override the date if you need an old one or something. Just pass '20220425' or whatever.
import datetime as dt
import pathlib
import pandas as pd
import xlwings as xw
def get_dailydata_df(today_str: str = None) -> pd.DataFrame:
    base_path = pathlib.Path('I:/Analytics')
    if today_str is None:
        today_str = dt.datetime.now().strftime('%Y%m%d')
    file_str = 'dailysummary_'
    file_str = file_str + today_str + '.xlsx'
    today_path = pathlib.Path(base_path, file_str)
    wb = xw.Book(today_path)
    sheet1 = wb.sheets['OutputTable'].used_range.value
    dailydata = pd.DataFrame(sheet1)
    return dailydata

Add current date to end of a filename when exporting using to_excel

When I run my script I would like to export the result as an Excel file with the current date tagged onto the end of its name. I could put the date in manually, but as I run this each day I would like it to use the current date automatically.
So, to just output a normal Excel file via Python/pandas I use:
df.to_excel('myfile.xlsx')
And in my working directory I get an Excel file called "myfile.xlsx".
But I would like today's current date added to the end, so if I ran the script today the file would be called "myfile 24/09/2019.xlsx".
This will get you there and employs string formatting for clean / readable code:
from datetime import datetime as dt
# Create filename from current date.
mask = '%d%m%Y'
dte = dt.now().strftime(mask)
fname = "myfile_{}.xlsx".format(dte)
df.to_excel(fname)
As mentioned in a comment above, some operating systems use / as a path separator, so I suggest using a dmY (24092019) date format instead of one containing slashes.
Output:
myfile_24092019.xlsx
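As a variation, an ISO-style date (year first) also avoids slashes and has the side effect that the files sort chronologically by name. This is a sketch of that alternative, not the answer's method; `dated_filename` is a hypothetical helper:

```python
from datetime import date


def dated_filename(stem, day, suffix=".xlsx"):
    """Return a name such as 'myfile_2019-09-24.xlsx' for the given date."""
    return f"{stem}_{day.isoformat()}{suffix}"


names = [dated_filename("myfile", date(2019, 9, 24)),
         dated_filename("myfile", date(2018, 12, 31))]
print(sorted(names))  # oldest first, because the year comes before month and day
```

With pandas this would be used as `df.to_excel(dated_filename("myfile", date.today()))`.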

using pytz to get local timezone name [duplicate]

How do I get the Olson timezone name (such as Australia/Sydney) corresponding to the value given by C's localtime call?
This is the value overridden via TZ, by symlinking /etc/localtime, or setting a TIMEZONE variable in time-related system configuration files.
I think the best bet is to go through all pytz timezones and check which ones match the local timezone. Each pytz timezone object contains info about the UTC offset and a tzname such as CDT or EST. The same info about local time can be obtained from time.timezone/time.altzone and time.tzname, and I think that is enough to match the local timezone against the pytz database, e.g.:
import time
import pytz
import datetime
local_names = []
# Note: time.daylight only says whether a DST zone is defined, not whether DST is in effect now
if time.daylight:
    local_offset = time.altzone
    localtz = time.tzname[1]
else:
    local_offset = time.timezone
    localtz = time.tzname[0]
local_offset = datetime.timedelta(seconds=-local_offset)
for name in pytz.all_timezones:
    timezone = pytz.timezone(name)
    if not hasattr(timezone, '_tzinfos'):
        continue  # skip timezones that carry no transition info
    # go through tzinfo and see if a short name like EDT and the offset match
    for (utcoffset, daylight, tzname), _ in timezone._tzinfos.items():
        if utcoffset == local_offset and tzname == localtz:
            local_names.append(name)
print(local_names)
output:
['America/Atikokan', 'America/Bahia_Banderas',
'America/Bahia_Banderas', 'America/Belize', 'America/Cambridge_Bay',
'America/Cancun', 'America/Chicago', 'America/Chihuahua',
'America/Coral_Harbour', 'America/Costa_Rica', 'America/El_Salvador',
'America/Fort_Wayne', 'America/Guatemala',
'America/Indiana/Indianapolis', 'America/Indiana/Knox',
'America/Indiana/Marengo', 'America/Indiana/Marengo',
'America/Indiana/Petersburg', 'America/Indiana/Tell_City',
'America/Indiana/Vevay', 'America/Indiana/Vincennes',
'America/Indiana/Winamac', 'America/Indianapolis', 'America/Iqaluit',
'America/Kentucky/Louisville', 'America/Kentucky/Louisville',
'America/Kentucky/Monticello', 'America/Knox_IN',
'America/Louisville', 'America/Louisville', 'America/Managua',
'America/Matamoros', 'America/Menominee', 'America/Merida',
'America/Mexico_City', 'America/Monterrey',
'America/North_Dakota/Beulah', 'America/North_Dakota/Center',
'America/North_Dakota/New_Salem', 'America/Ojinaga',
'America/Pangnirtung', 'America/Rainy_River', 'America/Rankin_Inlet',
'America/Resolute', 'America/Resolute', 'America/Tegucigalpa',
'America/Winnipeg', 'CST6CDT', 'Canada/Central', 'Mexico/General',
'US/Central', 'US/East-Indiana', 'US/Indiana-Starke']
In production you can create such a mapping beforehand and save it instead of iterating always.
Testing script after changing timezone:
$ export TZ='Australia/Sydney'
$ python get_tz_names.py
['Antarctica/Macquarie', 'Australia/ACT', 'Australia/Brisbane',
'Australia/Canberra', 'Australia/Currie', 'Australia/Hobart',
'Australia/Lindeman', 'Australia/Melbourne', 'Australia/NSW',
'Australia/Queensland', 'Australia/Sydney', 'Australia/Tasmania',
'Australia/Victoria']
This is kind of cheating, I know, but doesn't reading '/etc/localtime' work for you?
Like this:
>>> import os
>>> '/'.join(os.readlink('/etc/localtime').split('/')[-2:])
'Australia/Sydney'
Hope it helps.
Edit: I liked @A.H.'s idea, in case '/etc/localtime' isn't a symlink. Translating that into Python:
#!/usr/bin/env python
from hashlib import sha224
import os
def get_current_olsonname():
    # read in binary mode: hashlib needs bytes, and tz files are not text
    with open('/etc/localtime', 'rb') as tzfile:
        tzfile_digest = sha224(tzfile.read()).hexdigest()
    for root, dirs, filenames in os.walk("/usr/share/zoneinfo/"):
        for filename in filenames:
            fullname = os.path.join(root, filename)
            with open(fullname, 'rb') as f:
                digest = sha224(f.read()).hexdigest()
            if digest == tzfile_digest:
                return '/'.join((fullname.split('/'))[-2:])
    return None
if __name__ == '__main__':
    print(get_current_olsonname())
One problem is that there are multiple "pretty names", like "Australia/Sydney", which point to the same time zone (e.g. CST).
So you will need to get all the possible names for the local time zone, and then select the name you like.
E.g. for Australia, there are 5 time zones, but far more time zone identifiers:
"Australia/Lord_Howe", "Australia/Hobart", "Australia/Currie",
"Australia/Melbourne", "Australia/Sydney", "Australia/Broken_Hill",
"Australia/Brisbane", "Australia/Lindeman", "Australia/Adelaide",
"Australia/Darwin", "Australia/Perth", "Australia/Eucla"
You should check whether there is a library that wraps TZinfo to handle the time zone API.
E.g. for Python, check the pytz library:
http://pytz.sourceforge.net/
and
http://pypi.python.org/pypi/pytz/
In Python you can do:
from pytz import timezone
import pytz
In [56]: pytz.country_timezones('AU')
Out[56]:
[u'Australia/Lord_Howe',
u'Australia/Hobart',
u'Australia/Currie',
u'Australia/Melbourne',
u'Australia/Sydney',
u'Australia/Broken_Hill',
u'Australia/Brisbane',
u'Australia/Lindeman',
u'Australia/Adelaide',
u'Australia/Darwin',
u'Australia/Perth',
u'Australia/Eucla']
but the API for Python seems to be pretty limited, e.g. it doesn't seem to have a call like Ruby's all_linked_zone_names -- which can find all the synonym names for a given time zone.
If evaluating /etc/localtime is OK for you, the following trick might work - after translating it to python:
> md5sum /etc/localtime
abcdefabcdefabcdefabcdefabcdefab /etc/localtime
> find /usr/share/zoneinfo -type f |xargs md5sum | grep abcdefabcdefabcdefabcdefabcdefab
abcdefabcdefabcdefabcdefabcdefab /usr/share/zoneinfo/Europe/London
abcdefabcdefabcdefabcdefabcdefab /usr/share/zoneinfo/posix/Europe/London
...
The duplicates could be filtered using only the official region names "Europe", "America" ... If there are still duplicates, you could take the shortest name :-)
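A possible Python translation of this trick, including the duplicate filtering and the "shortest name" tie-break, is sketched below. It is an illustration under assumptions: the function names are invented, SHA-256 is swapped in for md5sum (only equality of digests matters), and the default paths are the usual Linux locations.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def matching_zone_names(localtime="/etc/localtime",
                        zoneinfo="/usr/share/zoneinfo",
                        skip=("posix", "right")):
    """List zone names whose file content equals `localtime`, shortest first."""
    target = file_digest(localtime)
    root = Path(zoneinfo)
    names = []
    for path in root.rglob("*"):
        if not path.is_file():
            continue
        relative = path.relative_to(root)
        if relative.parts[0] in skip:
            continue  # duplicate trees such as posix/Europe/London
        if file_digest(path) == target:
            names.append(relative.as_posix())
    # Shortest name first; ties are broken alphabetically.
    return sorted(names, key=lambda name: (len(name), name))
```

Calling `matching_zone_names()` with no arguments uses the system files; the first element of the result is the shortest matching name.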
Install pytz
import pytz
import time
#import locale
from urllib.request import urlopen
yourOlsonTZ = None
#yourCountryCode = locale.getdefaultlocale()[0].split('_')[1]
yourCountryCode = urlopen('http://api.hostip.info/country.php').read().decode()
for olsonTZ in [pytz.timezone(olsonTZ) for olsonTZ in pytz.all_timezones]:
    if (olsonTZ._tzname in time.tzname) and (str(olsonTZ) in pytz.country_timezones[yourCountryCode]):
        yourOlsonTZ = olsonTZ
        break
print(yourOlsonTZ)
This code will take a best-guess crack at your Olson timezone based on both your timezone name (according to Python's time module) and your country code (according to the hostip.info project, which geolocates you from your IP address; the commented-out lines use Python's locale module instead).
For example, simply matching the timezone names could result in America/Moncton, America/Montreal, or America/New_York for EST (GMT-5). If your country is the US, however, it will limit the answer to America/New_York.
However, if your country is Canada, the script will simply default to the topmost Canadian result (America/Moncton). If there is a way to further refine this, please feel free to leave suggestions in comments.
The tzlocal module for Python is aimed at exactly this problem. It produces consistent results under both Linux and Windows, properly converting from Windows time zone ids to Olson using the CLDR mappings.
This will get you the time zone name, according to what's in the TZ variable, or the localtime file if TZ is unset:
#!/usr/bin/env python
import time
time.tzset()
print(time.tzname)
Here's another possibility, using PyICU instead, which is working for my purposes:
>>> from PyICU import ICUtzinfo
>>> from datetime import datetime
>>> datetime(2012, 1, 1, 12, 30, 18).replace(tzinfo=ICUtzinfo.getDefault()).isoformat()
'2012-01-01T12:30:18-05:00'
>>> datetime(2012, 6, 1, 12, 30, 18).replace(tzinfo=ICUtzinfo.getDefault()).isoformat()
'2012-06-01T12:30:18-04:00'
Here it is interpreting naive datetimes (as would be returned by a database query) in the local timezone.
I prefer the following, which is slightly better than poking around in _xxx values:
import time, pytz, os
cur_name = time.tzname
cur_TZ = os.environ.get("TZ")
def is_current(name):
    os.environ["TZ"] = name
    time.tzset()
    return time.tzname == cur_name
print("Possible choices:", list(filter(is_current, pytz.all_timezones)))
# optional tz restore
if cur_TZ is None:
    del os.environ["TZ"]
else:
    os.environ["TZ"] = cur_TZ
time.tzset()
I changed tcurvelo's script to find the right form of the time zone name (Continent/.../City) in most cases, and to return all matches if that fails:
#!/usr/bin/env python
from hashlib import sha224
import os
from os import listdir
from os.path import join, isfile, isdir
infoDir = '/usr/share/zoneinfo/'
def file_digest(filepath):
    # read bytes: hashlib cannot hash text-mode strings in Python 3
    with open(filepath, 'rb') as f:
        return sha224(f.read()).hexdigest()
def get_current_olsonname():
    result = []
    tzfile_digest = file_digest('/etc/localtime')
    test_match = lambda filepath: file_digest(filepath) == tzfile_digest
    def walk_over(dirpath):
        for root, dirs, filenames in os.walk(dirpath):
            for fname in filenames:
                fpath = join(root, fname)
                if test_match(fpath):
                    result.append(tuple(root.split('/')[4:]+[fname]))
    for dname in listdir(infoDir):
        if dname in ('posix', 'right', 'SystemV', 'Etc'):
            continue
        dpath = join(infoDir, dname)
        if not isdir(dpath):
            continue
        walk_over(dpath)
    if not result:
        walk_over(join(infoDir))
    return result
if __name__ == '__main__':
    print(get_current_olsonname())
This JavaScript project attempts to solve the same issue in the browser client-side. It works by playing "twenty questions" with the locale, asking for the UTC offset of certain past times (to test for summer time boundaries, etc.) and using those results to deduce what the local time zone must be. I am not aware of any equivalent Python package unfortunately, so if someone wanted to use this solution it would have to be ported to Python.
While this formula will require updating every time (at worst) the TZ database is updated, a combination of this algorithm and the solution proposed by Anurag Uniyal (keeping only possibilities returned by both methods) sounds to me like the surest way to compute the effective local timezone. As long as there is some difference between the UTC offset of at least one local time in any two time zones, such a system can correctly choose between them.
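The "twenty questions" idea can be sketched in Python with the standard library alone. This is a rough illustration, not a port of the JavaScript project: the probe instants and all function names are assumptions chosen for the example.

```python
from datetime import datetime, timezone


# Hypothetical probe instants: one mid-winter and one mid-summer per year.
PROBES = [datetime(year, month, 15, 12, tzinfo=timezone.utc)
          for year in (1990, 2005, 2020)
          for month in (1, 7)]


def offset_signature(tz, probes=PROBES):
    """Return the UTC offsets `tz` yields at each probe instant."""
    return tuple(probe.astimezone(tz).utcoffset() for probe in probes)


def local_signature(probes=PROBES):
    """Return the offsets the operating system reports for the local zone."""
    # astimezone() with no argument converts to the system local time zone.
    return tuple(probe.astimezone().utcoffset() for probe in probes)


def zones_matching(signature, candidates, probes=PROBES):
    """Keep the named candidates whose offsets equal `signature`."""
    return [name for name, tz in candidates.items()
            if offset_signature(tz, probes) == signature]
```

On Python 3.9+ the candidates could be built as `{name: ZoneInfo(name) for name in zoneinfo.available_timezones()}` and filtered with `zones_matching(local_signature(), candidates)`. Zones that differ only at instants not covered by the probes will still collide, so more probes give a sharper answer.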

Python - how to read Windows "Media Created" date (not file creation date)

I have several old video files that I'm converting to save space. Since these files are personal videos, I want the new files to have the old files' creation time.
Windows has an attribute called "Media created" which has the actual time recorded by the camera. The files' modification times are often incorrect so there are hundreds of files where this won't work.
How can I access this "Media created" date in Python? I've been googling like crazy and can't find it. Here's a sample of the code that works if the creation date and modify date match:
import glob
import os
files = []
for file in glob.glob("*.AVI"):
    files.append(file)
for orig in files:
    origmtime = os.path.getmtime(orig)
    origatime = os.path.getatime(orig)
    mark = (origatime, origmtime)
    for target in glob.glob("*.mp4"):
        firstroot = target.split(".mp4")[0]
        if firstroot in orig:
            os.utime(target, mark)
As Borealid noted, the "Media created" value is not filesystem metadata. The Windows shell gets this value as metadata from within the file itself. It's accessible in the API as a Windows Property. You can easily access Windows shell properties if you're using Windows Vista or later and have the Python extensions for Windows installed. Just call SHGetPropertyStoreFromParsingName, which you'll find in the propsys module. It returns a PyIPropertyStore instance. The property that's labelled "Media created" is System.Media.DateEncoded. You can access this property using the property key PKEY_Media_DateEncoded, which you'll find in propsys.pscon. In Python 3 the returned value is a datetime.datetime subclass, with the time in UTC. In Python 2 the value is a custom time type that has a Format method that provides strftime style formatting. If you need to convert the value to local time, the pytz module has the IANA database of time zones.
For example:
import pytz
import datetime
from win32com.propsys import propsys, pscon
properties = propsys.SHGetPropertyStoreFromParsingName(filepath)
dt = properties.GetValue(pscon.PKEY_Media_DateEncoded).GetValue()
if not isinstance(dt, datetime.datetime):
    # In Python 2, PyWin32 returns a custom time type instead of
    # using a datetime subclass. It has a Format method for strftime
    # style formatting, but let's just convert it to datetime:
    dt = datetime.datetime.fromtimestamp(int(dt))
    dt = dt.replace(tzinfo=pytz.timezone('UTC'))
dt_tokyo = dt.astimezone(pytz.timezone('Asia/Tokyo'))
If the attribute you're talking about came from the camera, it's not a filesystem attribute: it's metadata inside the videos themselves, which Windows is reading out and presenting to you.
An example of this type of metadata would be a JPEG image's EXIF data: what type of camera took the photo, what settings were used, and so forth.
You would need to open up the .mp4 files and parse the metadata, preferably using some existing library for doing that. You wouldn't be able to get the information from the filesystem because it's not there.
Now if, on the other hand, all you want is the file creation date (which didn't actually come from the camera, but was set when the file was first put onto the current computer, and might have been initialized to some value that was previously on the camera)... That can be gotten with os.path.getctime(orig).
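Whichever way the original date is recovered, it still has to be written onto the converted file. A minimal sketch, assuming the date is available as a datetime; `set_file_times` is a hypothetical helper, and naive datetimes are treated as UTC here purely as an assumption of the example. `os.utime` sets the access and modification times only, not the Windows creation time.

```python
import os
from datetime import datetime, timezone


def set_file_times(path, moment):
    """Set both access and modification time of `path` to `moment`."""
    if moment.tzinfo is None:
        # Assumption: naive datetimes are treated as UTC here.
        moment = moment.replace(tzinfo=timezone.utc)
    stamp = moment.timestamp()
    os.utime(path, (stamp, stamp))
```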

Rename a file with standard date and time stored in Variables in Python on Ubuntu/Windows

I asked this question before, but some people pointed me in the wrong direction and I haven't got the right answer yet.
I know how to rename a file, but I am struggling to add the date and time to the new file name.
Can you please guide me on how I can do that?
import os
os.rename('mark.txt', 'steve.txt')
Try this:
import os
import time
timestamp = time.strftime('%H%M-%Y%m%d')
os.rename('oldname.txt', 'oldname_%s.txt' % (timestamp))
The code above will append the timestamp to the file name. You can use this example as a starting point and expand on it. This is better than using datetime.datetime.now() directly because, unformatted, that string contains a space, which is not recommended in file names on Linux.
I think this will help you
print('renaming archive...')
import datetime
dt = str(datetime.datetime.now())
import os
newname = 'danish_'+dt+'.txt'
os.rename('danish.txt', newname)
print('renaming complete...')
from datetime import datetime
import os
current_time = str(datetime.utcnow())
current_time = "_".join(current_time.split()).replace(":","-")
current_time = current_time[:-7]
os.rename('orfile.txt', 'orfile_'+current_time+'.txt')
This will rename the file to include the exact timestamp, e.g.:
orfile_2015-01-02_16-17-41.txt
Please use appropriate variable names; it is a bad habit to give variables names that don't make sense.
import datetime
import os
current_time = datetime.datetime.now()
os.rename('mark.txt', 'mark_' + str(current_time) + '.txt')
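Note that `str(datetime.now())` contains a space and colons, and colons are not allowed in Windows file names. A sketch that works on both Ubuntu and Windows formats the timestamp explicitly; `timestamped_name` is a hypothetical helper and the format string is just one reasonable choice:

```python
from datetime import datetime
from pathlib import Path


def timestamped_name(path, moment=None):
    """Return `path` with _YYYY-MM-DD_HH-MM-SS inserted before the suffix."""
    path = Path(path)
    moment = moment or datetime.now()
    stamp = moment.strftime("%Y-%m-%d_%H-%M-%S")
    return path.with_name(f"{path.stem}_{stamp}{path.suffix}")


new_path = timestamped_name("mark.txt", datetime(2015, 1, 2, 16, 17, 41))
print(new_path)  # prints mark_2015-01-02_16-17-41.txt
# To rename for real: Path("mark.txt").rename(timestamped_name("mark.txt"))
```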
