I have two different scripts that I wrote.
The first one just gets an MD5 hash for every file that is a .exe.
The other script is an agent that checks every 3 seconds whether there are new files
in the directory.
Now I need to make the agent check the files and also print every MD5.
These are my scripts:
import os, time

path_to_watch = "/root/PycharmProjects/untitled1"
before = dict([(f, None) for f in os.listdir(path_to_watch)])
while 1:
    time.sleep(3)
    after = dict([(f, None) for f in os.listdir(path_to_watch)])
    added = [f for f in after if not f in before]
    removed = [f for f in before if not f in after]
    if added: print "Added: ", ", ".join(added)
    if removed: print "Removed: ", ", ".join(removed)
    before = after
And the second one, which computes the MD5s:
import glob
import os
import hashlib

work_path = '/root/PycharmProjects/untitled1/'
filenames = glob.glob("/root/PycharmProjects/untitled1/*.exe")
if len(os.listdir(work_path)) > 0:
    for filename in filenames:
        with open(filename, 'rb') as inputfile:
            data = inputfile.read()
            print hashlib.md5(data).hexdigest()
else:
    print '0'
Thanks for the help!
How about reducing the iteration by wrapping the hash generation in a function and calling it only when a new file is found:
import time
import glob
import os
import hashlib

def md5(filename):
    with open(filename, 'rb') as inputfile:
        data = inputfile.read()
        print filename, hashlib.md5(data).hexdigest()

path_to_watch = "."
before = os.listdir(path_to_watch)
while 1:
    time.sleep(3)
    after = os.listdir(path_to_watch)
    added = [f for f in after if not f in before]
    removed = [f for f in before if not f in after]
    if added:
        print "Added: ", ", ".join(added)
        for filename in added:
            md5(filename)
    if removed:
        print "Removed: ", ", ".join(removed)
    before = after
I also stripped some unnecessary dict usage from the code.
I suggest you take it as a challenge to reduce the number of statements and the number of data transformations to a minimum while keeping the function of the script. At the same time it might be worth a look at the Python Style Guide ;)
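Along those lines, here is a compact Python 3 sketch of the same idea using sets; the function names and the `rounds` parameter (so the loop can terminate) are my additions, not part of the original scripts:

```python
import hashlib
import os
import time

def md5_of(path):
    """Return the MD5 hex digest of a file's contents."""
    with open(path, 'rb') as f:
        return hashlib.md5(f.read()).hexdigest()

def watch(path_to_watch, rounds, interval=3):
    """Poll a directory and print the name and MD5 of newly added files."""
    before = set(os.listdir(path_to_watch))
    for _ in range(rounds):
        time.sleep(interval)
        after = set(os.listdir(path_to_watch))
        for name in sorted(after - before):
            print(name, md5_of(os.path.join(path_to_watch, name)))
        before = after
```

Using sets makes the added/removed computation a simple difference instead of list comprehensions over dicts.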
I am working on a renaming function which indexes video files based on their media creation date. Since the media creation date is not part of the regular file metadata, I am using the win32com.propsys module. It works completely as expected until the last element of the files list, but then loops on the one remaining file. I am unable to catch the issue and would be grateful for suggestions.
import os
import pytz
import datetime
from win32com.propsys import propsys, pscon

os.chdir(r'H:\Study material\Python\practice')
current_path = r'H:\Study material\Python\practice'
files = os.listdir(current_path)
fi = []
li = []
for f in files:
    properties = propsys.SHGetPropertyStoreFromParsingName(r'H:\Study material\Python\practice' + '\\' + f)
    d = properties.GetValue(pscon.PKEY_Media_DateEncoded).GetValue()
    fi.append([str(d), f])
fi.sort()
l = [s[1] for s in fi]
for f in files:
    i = l.index(f) + 1
    new_name = str(i) + '-' + f
    li.append(new_name)
i = 0
for f in files:
    os.rename(f, li[i])
    i += 1
I guess the last item in the sorted file list is a directory (maybe '__pycache__'?). Try checking whether it is really a file:
...
for f in files:
    if not os.path.isfile(f):
        print(f'not a file: {f}')
        continue
    properties = propsys.SHGetPropertyStoreFromParsingName(
        os.path.join(r'H:\Study material\Python\practice', f))
    d = properties.GetValue(pscon.PKEY_Media_DateEncoded).GetValue()
    fi.append([str(d), f])
...
Or try printing every filename and new name and check that everything is correct:
...
for i, f in enumerate(files):
    print(f, 'rename to', li[i])
    os.rename(f, li[i])
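For reference, the same indexing scheme can be sketched portably with the file modification time standing in for the Windows-only media property; the helper name and the mtime fallback are my assumptions, not part of the original code:

```python
import os

def index_by_mtime(folder):
    """Rename every regular file in folder to '<index>-<name>',
    ordered by modification time, skipping subdirectories."""
    names = [f for f in os.listdir(folder)
             if os.path.isfile(os.path.join(folder, f))]  # directories would crash the rename
    names.sort(key=lambda f: os.path.getmtime(os.path.join(folder, f)))
    renamed = []
    for i, name in enumerate(names, start=1):
        new_name = '{}-{}'.format(i, name)
        os.rename(os.path.join(folder, name),
                  os.path.join(folder, new_name))
        renamed.append(new_name)
    return renamed
```

The isfile filter at the top is exactly the guard suggested above: directories never reach the property lookup or the rename.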
I'm writing a simple script which loops over some text files and uses a function that replaces certain strings by looking them up in a .csv file (every row has the word to replace and the replacement word).
Here is my simple code:
import os
import re
import csv
from colorama import Fore  # needed for the colored output below

def substitute_tips(table, tree_content):
    count = 0
    for l in table:
        print("element of the table", l[1])
        reg_tree = re.search(l[1], tree_content)
        if reg_tree is not None:
            #print("match in the tree: ", reg_tree.group())
            tree_content = tree_content.replace(reg_tree.group(), l[0])
            count = count + 1
        else:
            print("Not found: ", l[1])
            tree_content = tree_content
    print("Substitutions done: ", count)
    return tree_content

path = os.getcwd()
table_name = "162_table.csv"
table = open(table_name)
csv_table = csv.reader(table, delimiter='\t')
for root, dirs, files in os.walk(path, topdown=True):
    for name in files:
        if name.endswith(".tree"):
            print(Fore.GREEN + "Working on treefile", name)
            my_tree = open(name, "r")
            my_tree_content = my_tree.read()
            output_tree = substitute_tips(csv_table, my_tree_content)
            output_file = open(name.rstrip("tree") + "SPECIES_NAME.tre", "w")
            output_file.write(output_tree)
            output_file.close()
        else:
            print(Fore.YELLOW + name, Fore.RED + "doesn't end in .tree")
It's probably very easy, but I'm a newbie.
Thanks!
The files list returned by os.walk contains only the file names, not the full paths. Join root with the file names to be able to open them:
Change:
my_tree = open(name, "r")
...
output_file = open(name.rstrip("tree") + "SPECIES_NAME.tre", "w")
to:
my_tree = open(os.path.join(root, name), "r")
...
output_file = open(os.path.join(root, name.rstrip("tree") + "SPECIES_NAME.tre"), "w")
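To see why the join matters, here is a minimal self-contained sketch (the helper name is mine) that collects openable full paths for every .tree file under a directory tree:

```python
import os

def collect_tree_files(top):
    """Walk top and return full, openable paths of all .tree files."""
    found = []
    for root, dirs, files in os.walk(top):
        for name in files:
            if name.endswith(".tree"):
                # os.walk yields bare names; joining with root makes them openable
                found.append(os.path.join(root, name))
    return sorted(found)
```

Opening the bare name only works by accident when the file happens to sit in the current working directory, which is why the original script failed for files in subdirectories.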
I'm using a Raspberry Pi 3 with Python 3.5.
For a project I need to automatically print every new JPEG image that arrives in a folder on my Raspberry Pi.
I did a lot of research but did not come across the final answer yet.
What I do have is a printer connected to the Pi with USB and CUPS (that is working properly).
What I now need is a Python script that checks the folder for new files and, if one appears, prints it automatically.
What I did find is a lot about FindFirstChangeNotification.
I tried this script (with the path to watch changed to the folder that needs watching):
import os
import time

path_to_watch = '/home/pi/jebenter/'
before = dict([(f, None) for f in os.listdir(path_to_watch)])
while 1:
    time.sleep(10)
    after = dict([(f, None) for f in os.listdir(path_to_watch)])
    added = [f for f in after if not f in before]
    removed = [f for f in before if not f in after]
    if added: print "Added: ", ", ".join(added)
    if removed: print "Removed: ", ", ".join(removed)
    before = after
It is not finished, but I'm not sure how to complete it,
and I need a way to send the new files to my printer.
Does somebody know how to help?
As I wrote in my comment above, you could print the files using lp or something similar.
I haven't tested this, but it should run:
import os
import time
import subprocess

printer_name = "my_printer"  # the queue name configured in CUPS
path_to_watch = '/home/pi/jebenter/'
before = dict([(f, None) for f in os.listdir(path_to_watch)])
while 1:
    time.sleep(10)
    after = dict([(f, None) for f in os.listdir(path_to_watch)])
    added = [f for f in after if not f in before]
    removed = [f for f in before if not f in after]
    if added:
        print "Added: ", ", ".join(added)
        paths = [os.path.join(path_to_watch, f) for f in added]
        subprocess.Popen(["lp", "-d", printer_name, "--"] + paths).communicate()
    if removed: print "Removed: ", ", ".join(removed)
    before = after
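A Python 3 variant that separates the polling loop from the printing makes the logic easier to reason about and to test; the `poll` hook and function name are my additions, not part of the original script:

```python
import os
import time

def watch_added(path, interval=10, poll=time.sleep):
    """Yield the list of newly added file names after each polling round."""
    before = set(os.listdir(path))
    while True:
        poll(interval)  # injectable so tests need not really sleep
        after = set(os.listdir(path))
        added = sorted(after - before)
        before = after
        if added:
            yield added
```

Each yielded batch could then be handed to `subprocess` and `lp` exactly as in the snippet above.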
So, I wrote this to monitor a folder for new pictures and print any that are found. It works, but I assume there is a more robust/efficient way to tackle this problem, as I want it to run for 5-6 hours at a time.
My main problem is that I don't like using "open" while loops like this...
Would anyone tackle this differently? If so, would you be willing to explain?
import os
import glob
import win32com.client
import time
from pywinauto.findwindows import find_window
from pywinauto.win32functions import SetForegroundWindow

printed = []
i = 10
while i < 1000000000000000:
    files = glob.glob("C://Users//pictures/*.jpg")
    for filename in files:
        print filename
        try:
            if printed.index(str(filename)) >= 0:
                print printed.index(filename)
                print "Image found"
        except ValueError:
            printed.append(filename)
            os.startfile(filename, "print")
            shell = win32com.client.Dispatch("WScript.Shell")
            time.sleep(2)
            SetForegroundWindow(find_window(title='Print Pictures'))
            shell.AppActivate("Print Pictures")
            shell.SendKeys("{ENTER}")
    i = i + 1
    time.sleep(5)
The link below is a related post. Instead of using a long while loop, you can use a watcher to trigger your operation.
How to detect new or modified files
Big thanks to scope for his comment; I have added my printing lines to the example and it works well. Code posted below for anyone who wants it; the commented version is in the linked post. Now to tidy up a few other things...
import os
import win32file
import win32event
import win32con
import glob
import win32com.client
import time
from pywinauto.findwindows import find_window
from pywinauto.win32functions import SetForegroundWindow

def print_photo(filename):
    print filename
    filename = path_to_watch + "\\" + filename[0]
    os.startfile(filename, "print")
    shell = win32com.client.Dispatch("WScript.Shell")
    time.sleep(2)
    SetForegroundWindow(find_window(title='Print Pictures'))
    shell.AppActivate("Print Pictures")
    shell.SendKeys("{ENTER}")

path_to_watch = os.path.abspath("C:\\Users\\Ciaran\\Desktop\\")
change_handle = win32file.FindFirstChangeNotification(
    path_to_watch,
    0,
    win32con.FILE_NOTIFY_CHANGE_FILE_NAME
)
try:
    old_path_contents = dict([(f, None) for f in os.listdir(path_to_watch)])
    while 1:
        result = win32event.WaitForSingleObject(change_handle, 500)
        if result == win32con.WAIT_OBJECT_0:
            new_path_contents = dict([(f, None) for f in os.listdir(path_to_watch)])
            added = [f for f in new_path_contents if not f in old_path_contents]
            deleted = [f for f in old_path_contents if not f in new_path_contents]
            if added:
                print "Added: ", ", ".join(added)
                print_photo(added)
            if deleted: print "Deleted: ", ", ".join(deleted)
            old_path_contents = new_path_contents
            win32file.FindNextChangeNotification(change_handle)
finally:
    win32file.FindCloseChangeNotification(change_handle)
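Independent of the notification mechanism, the `printed` list with its `index`/`ValueError` dance in the first version can be replaced by a set; a small sketch (the helper name is mine):

```python
def select_unprinted(seen, current):
    """Return files from current not handled yet, in order,
    and record them in the seen set."""
    fresh = [f for f in current if f not in seen]
    seen.update(fresh)
    return fresh
```

Set membership tests are O(1), so this also scales better over a 5-6 hour run than `list.index` on an ever-growing list.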
I'm new to Python and want to find, among 60k text files, which files have identical content, and list each distinct content together with the number of files that share it, something like uniq -c but at the file level rather than the line level.
So far, I have:
from os import listdir
from os.path import isfile, join

mypath = r"C:\Users\daniel.schneider\Downloads\Support"  # my filepath
onlyfiles = [f for f in listdir(mypath) if isfile(join(mypath, f))]

for currentFile in onlyfiles:
    currentPath = mypath + '\\' + currentFile
    f = open(currentPath)
    print currentPath
    for currentLine in f:
        print currentLine[24:]
    f.close()
    break
I haven't tested it thoroughly, but you could use Python's hashlib to get an MD5 hash of each file, and store the filenames in a list associated with each hash in a dictionary.
Then, to get the unique contents with a count of how many files each appears in, iterate over the dictionary:
import os
import hashlib

mypath = 'testdup'
onlyfiles = [f for f in os.listdir(mypath)
             if os.path.isfile(os.path.join(mypath, f))]
files = {}
for filename in onlyfiles:
    filehash = hashlib.md5(open(os.path.join(mypath, filename), 'rb')
                           .read()).hexdigest()
    try:
        files[filehash].append(filename)
    except KeyError:
        files[filehash] = [filename]
for filehash, filenames in files.items():
    print('{0} files have this content:'.format(len(filenames)))
    print(open(os.path.join(mypath, filenames[0])).read())
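To get output even closer to uniq -c, you can also sort the groups by size. A sketch of the grouping step operating on in-memory content (the helper name and the dict-of-contents input are my assumptions):

```python
from collections import defaultdict

def count_identical(contents):
    """Map {filename: content} to (count, filenames) pairs, largest group first."""
    groups = defaultdict(list)
    for name in sorted(contents):          # sort so group members are ordered
        groups[contents[name]].append(name)
    return sorted(((len(v), v) for v in groups.values()), reverse=True)
```

In the script above the MD5 digest plays the role of the dictionary key; grouping by the content itself, as here, gives the same result without any (astronomically unlikely) hash-collision caveat, at the cost of keeping contents in memory.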