Python: Parse terminal response

I want to write a script to automatically back up some documents from my Raspberry Pi to Google Drive, so I installed rclone and it seems to work well.
For organisational purposes I want to create a new folder for every upload, named with a 3-digit number, e.g. 001, 002, 003, ...
This is my code so far:
import os
print("Exisiting folders:")
print(os.system("rclone lsf backup_account:backup"))
print("Create new folder...")
createFolder = os.system("rclone mkdir backup_account:backup/003")
print("Exisiting folders:")
folders = str(os.system("rclone lsf backup_account:backup"))
print(type(folders))
print(len(folders))
First, I print the already existing folders in the Google Drive directory "backup".
Second, I create a new folder (in this example it is a static number and will be changed to a dynamic one once the rest is working).
Third, I print the existing folders once again to check that everything worked fine.
Up to here, everything indeed works well and I get a printout like this:
Existing folders:
001/
002/
0
Create new folder...
Existing folders:
001/
002/
003/
<type 'str'>
1
As you can see, it gives the folders as a string; if I leave out the str() it returns an int.
What I don't understand is why len(folders) is 1.
What I want is: check at the beginning which folders already exist, create a new one (following the numbering scheme), and then copy the backup files to this new folder.
As the script won't be running all the time, I cannot store anything in a variable.
Any hints on how to put the existing folders into a list, array, ... to find the last element/highest number/...?
Running Raspbian Buster.
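For what it's worth, os.system only returns the command's exit status; the listing you see is printed straight to the terminal by rclone itself, so str(os.system(...)) gives "0", whose length is 1. Below is a rough sketch of one way to capture the output instead, assuming rclone is on the PATH and reusing the remote and path names from above:

import subprocess

# Capture the listing itself rather than the exit status
output = subprocess.check_output(["rclone", "lsf", "backup_account:backup"],
                                 universal_newlines=True)
folders = [line.rstrip("/") for line in output.splitlines() if line.strip()]

# Pick the next number in the 3-digit scheme, e.g. "003" after "001" and "002"
numbers = [int(name) for name in folders if name.isdigit()]
next_name = "{:03d}".format((max(numbers) if numbers else 0) + 1)

subprocess.check_call(["rclone", "mkdir", "backup_account:backup/" + next_name])

check_output returns the command's stdout as a string, so the existing folders end up in a real list and the highest number can be read off on every run without storing anything in between.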

Related

How do I copy subfolders into another location

I'm making a program to back up files and folders to a destination.
The problem I'm currently facing is that if I have a folder inside a folder, and so on, with files in between them, I can't sync them at the destination.
e.g.:
The source contains folder 1 and file 2. Folder 1 contains folder 2, folder 2 contains folder 3 and files etc...
The backup only contains folder 1 and file 2.
If the backup doesn't exist I simply use shutil.copytree(path, path_backup), but in the case where I need to sync, I can't get the files and folders, or at least I'm not seeing a way to do it. I have walked the directory with for path, dir, files in os.walk(directory) and even used what someone suggested in another post:
def walk_folder(target_path, path_backup):
    for files in os.scandir(target_path):
        if os.path.isfile(files):
            file_name = os.path.abspath(files)
            print(file_name)
            os.makedirs(path_backup)
        elif os.path.isdir(files):
            walk_folder(files, path_backup)
Is there a way to make the directories in the backup folder from the ground up and then add the files alongside, or is the only way to just delete the whole folder and use shutil.copytree(path, path_backup)?
With makedirs, all it does is say it can't create the folder because it already exists; this is understandable, as it's trying to write in the source folder and not in the backup. Is there a way to rewrite the path so that the source is replaced with the backup?
If any more code is needed feel free to ask!
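One way to make the path "replace source with backup" is to rebuild each relative path under the backup root while walking the source. A rough sketch, where sync_tree and its arguments are just illustrative names:

import os
import shutil

def sync_tree(source, backup):
    """Mirror source into backup, creating missing folders and copying files."""
    for path, dirs, files in os.walk(source):
        # Same relative location, but under the backup root instead of the source
        target_dir = os.path.join(backup, os.path.relpath(path, source))
        os.makedirs(target_dir, exist_ok=True)  # no error if it already exists
        for name in files:
            shutil.copy2(os.path.join(path, name), os.path.join(target_dir, name))

Because exist_ok=True is passed to os.makedirs, backup folders that already exist are simply reused instead of raising an error, so the whole tree does not have to be deleted and recreated with copytree.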

Get top level folders only using Python Box SDK

I am trying to run a search query in the Box root folder to find folder names that contain a particular string. However, I only want the folders that are one level below (similar to an ls command), whereas get_items() will return folders matching the string even deeper down.
For example, if I search for "AA" in the folder structure below, it should only return Folder1AA, Folder2AA and Folder3AA, and not Folder4AA and Folder5AA:
StartingFolder
    Folder1AA
        File1B
        Folder4AA
        Folder1C
        File1D
    Folder2AA
        Folder5AA
        File1C
        Folder2B
        File1D
    Folder3AA
        File1B
Any ideas on how to do that?
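A rough sketch of one way to do this with the Python SDK, assuming an already-authenticated client (the credentials below are placeholders): list the starting folder's entries and keep only the folders whose names contain the string, so anything deeper down is never touched.

from boxsdk import Client, OAuth2

# Placeholder credentials -- replace with your own app's values
auth = OAuth2(client_id="CLIENT_ID", client_secret="CLIENT_SECRET",
              access_token="DEVELOPER_TOKEN")
client = Client(auth)

starting_folder_id = "0"   # "0" is the Box root; use your StartingFolder's id instead
needle = "AA"

top_level_matches = [
    item.name
    for item in client.folder(folder_id=starting_folder_id).get_items()
    if item.type == "folder" and needle in item.name
]
print(top_level_matches)   # expected: ['Folder1AA', 'Folder2AA', 'Folder3AA']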

Attempting to delete files in s3 folder but the command is removing the entire directory itself

I have an S3 bucket which has 4 folders, one of which is input/.
After my Airflow DAG runs, a few lines at the end of the .py code attempt to delete all the files in input/:
response_keys = self._s3_hook.delete_objects(bucket=self.s3_bucket, keys=s3_input_keys)
deleted_keys = [x['Key'] for x in response_keys.get("Deleted", []) if x['Key'] not in ['input/']]
self.log.info("Deleted: %s", deleted_keys)

if "Errors" in response_keys:
    errors_keys = [x['Key'] for x in response_keys.get("Errors", [])]
    raise AirflowException("Errors when deleting: {}".format(errors_keys))
Now, this sometimes deletes all the files and sometimes deletes the directory itself. I am not sure why it is deleting the directory even though I have specifically excluded it.
Is there any other way I can try to achieve the deletion?
PS: I tried using boto, but AWS has a security restriction which will not let both access the buckets, so the hook is all I've got. Please help.
Directories do not exist in Amazon S3. Instead, the Key (filename) of an object includes the full path. For example, the Key might be invoices/january.xls, which includes the path.
When an object is created in a path, the directory magically appears. If all objects in a directory are deleted, then the directory magically disappears (because it never actually existed).
However, if you click the Create Folder button in the Amazon S3 management console, a zero-byte object is created with the name of the directory. This forces the directory to 'appear' since there is an object in that path. However, the directory does not actually exist!
So, your Airflow job might be deleting all the objects in a given path, which causes the directory to disappear. This is quite okay and nothing to be worried about. However, if the Create Folder button was used to create the folder, then the folder will still exist when all objects are deleted (assuming that the delete operation does not also delete the zero-length object).
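Building on that, one option is to filter the key list before calling delete_objects, so the zero-byte folder marker (and anything else ending in /) is never part of the request. A small sketch reusing self._s3_hook and s3_input_keys from the snippet above:

# Drop "directory" marker keys such as 'input/' before deleting
keys_to_delete = [k for k in s3_input_keys if not k.endswith("/")]

if keys_to_delete:
    response_keys = self._s3_hook.delete_objects(bucket=self.s3_bucket,
                                                 keys=keys_to_delete)
    self.log.info("Deleted: %s", [x["Key"] for x in response_keys.get("Deleted", [])])

If the folder was created with the Create Folder button, that marker object stays behind and the input/ prefix keeps showing in the console even after every real file has been removed.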

Track Directory or file changes when you run the code

I created a tool that goes through a certain path using os.walk to check whether each folder is empty or not and, if it isn't, lists the folders and files inside it. This is how the result looks:
...\1.This folder\My Folder\Recored
['My Text 1.txt', 'My Text 2.txt']
OR
...\1.My Pic
This Folder is empty :(
What I want to do is track changes and color in red the new folders or the files that have been modified since the last run.
I don't want to keep watching for changes; I only want to see what has changed since the last run.
I was trying to keep something like a text log so I could compare the current listing against it, with no success:
for path, directory, files in os.walk(r'C:\Users\J\MyFolder'):
    if files or directory:
        print(path)
        print("\n")
        print((os.listdir(path)))
The folder's modification timestamp (os.path.getmtime("folder_path")) will give you a timestamp for the last change to that folder.
Your tool can record when it previously ran, get the timestamp of each folder you are scanning, compare the two, and color it accordingly.
To get the change time as a datetime:
from datetime import datetime
datetime.fromtimestamp(os.path.getmtime(folder_path))
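A rough sketch of how the two ideas fit together: store the time of the previous run in a small text file (LAST_RUN_FILE is just an illustrative name), then on the next run compare each folder's mtime against it and print the changed paths in red using ANSI escape codes.

import os
import time
from datetime import datetime

LAST_RUN_FILE = 'last_run.txt'   # remembers when the tool last ran
RED, RESET = '\033[91m', '\033[0m'

# Load the timestamp of the previous run (0 on the very first run)
try:
    with open(LAST_RUN_FILE) as f:
        last_run = float(f.read().strip())
except FileNotFoundError:
    last_run = 0.0

for path, directory, files in os.walk(r'C:\Users\J\MyFolder'):
    mtime = os.path.getmtime(path)
    line = path if mtime <= last_run else RED + path + RESET
    print(line, datetime.fromtimestamp(mtime))

# Remember this run for the next comparison
with open(LAST_RUN_FILE, 'w') as f:
    f.write(str(time.time()))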

HTCondor output files: obtain created directory

I am using HTCondor to generate some data (txt, png). When my program runs, it creates a directory next to the .sub file, named Datasets, where the datasets are stored. Unfortunately, Condor does not give me back this created data when it has finished. In other words, my goal is to get the created data in a "Datasets" subfolder next to the .sub file.
I tried:
1) Not putting the data under the Datasets subfolder, and I obtained the files as expected. However, this is not a smooth solution, since I generate around 100 files which then end up mixed in with the .sub file and everything else.
2) I also tried to set this up in the sub file, leading to this:
notification = Always
should_transfer_files = YES
RunAsOwner = True
When_To_Transfer_Output = ON_EXIT_OR_EVICT
getenv = True
transfer_input_files = main.py
transfer_output_files = Datasets
universe = vanilla
log = log/test-$(Cluster).log
error = log/test-$(Cluster)-$(Process).err
output = log/test-$(Cluster)-$(Process).log
executable = Simulation.bat
queue
This time I get the error that Datasets was not found. The spelling was already checked.
3) Another option would be to pack everything into a zip, but since I have to run hundreds of jobs, I do not want to unpack all these files afterwards.
I hope somebody comes up with a good idea on how to solve this.
Just for the record here: HTCondor does not transfer directories created during the run, or their contents, at the end of the run. The best way to get the content back is to write a wrapper script that runs your executable and then compresses the created directory at the root of the working directory. This file will be transferred with all the other files. For example, create run.exe:
./Simulation.bat
tar zcf Datasets.tar.gz Datasets
and in your condor submission script put:
executable = run.exe
However, if you do not want to do this and if HTCondor is using a common shared space like an AFS you can simply copy the whole directory out:
./Simulation.bat
cp -r Datasets <AFS location>
The other alternative is to define an initialdir as described at the end of: https://research.cs.wisc.edu/htcondor/manual/quickstart.html
But one must create the directory structure by hand.
Also, look around page 65 of: https://indico.cern.ch/event/611296/contributions/2604376/attachments/1471164/2276521/TannenbaumT_UserTutorial.pdf
This document is, in general, a very useful one for beginners.
