I have a project on GitHub. A few people are using it. I want to create a system that will allow those people to update to the current version. I was thinking about creating a script that fetches a *.zip of my project from GitHub, unpacks it, and then open()s every file and checks every line (if x == line). If a line differs, I want to swap the files.
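Roughly this sketch (the repository URL is a placeholder, and I compare file hashes rather than individual lines, which amounts to the same check):

import hashlib
import io
import shutil
import urllib.request
import zipfile
from pathlib import Path

REPO_ZIP = "https://github.com/user/project/archive/refs/heads/main.zip"  # placeholder

def file_hash(path):
    return hashlib.sha256(path.read_bytes()).hexdigest()

def update(project_dir):
    # Download the latest zip and unpack it into a temporary folder
    data = urllib.request.urlopen(REPO_ZIP).read()
    tmp = project_dir / "_update_tmp"
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        zf.extractall(tmp)
    # GitHub archives contain a single top-level folder
    new_root = next(tmp.iterdir())
    # Swap in any file whose contents differ
    for new_file in new_root.rglob("*"):
        if new_file.is_file():
            old_file = project_dir / new_file.relative_to(new_root)
            if not old_file.exists() or file_hash(old_file) != file_hash(new_file):
                old_file.parent.mkdir(parents=True, exist_ok=True)
                shutil.copy2(new_file, old_file)
    shutil.rmtree(tmp)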
My question is: is this the right way to do this? Or is it better to just grab the *.zip and force-swap every project file?
I am using Robocorp, an RPA platform.
In my bot, I have to click a link that automatically downloads a file. (the file name is generated randomly by the site)
I then need to rename the file and move it to a specific directory.
I have two questions on the best way to do this:
Should I simply change the Chrome settings via the RPA.Browser.Selenium library so it asks for the download location upon downloading? (Note that I have not been able to get this to work in Robocorp.)
Or should I wait for a new file to appear in the bot's "Downloads" folder and then manage that file from there?
Above is an image of one of my many attempts at editing the 'options' argument of the Open Available Browser keyword to ask me where to save the file prior to downloading. I noticed there is also a 'download' argument, but based on some research I do not believe it is the right one to edit.
https://robocorp.com/docs/libraries/rpa-framework/rpa-browser/keywords#open-available-browser
Here is the docs page for the 'Open Available Browser' keyword in RPA.Browser.Selenium, which shows all of its arguments.
Is there a more reliable way to do this?
This is a bit of a tricky scenario. First, I will say up front that I have no experience with the Robocorp tool, but this is a generic problem in test automation and RPA.
There are a few things to consider while waiting for a file to download, including:
Is the file fully downloaded?
What is the max time I need to allow for download?
What if there are previous files in the folder with a similar name?
So to overcome this I would take the following approach (tested and working in more than four production-level automation frameworks with >20k scenarios):
Maintain a unique "Downloads" folder for each execution. Do this by creating a new folder with a unique name and setting it as the downloads folder at the beginning of the run.
In a loop, check the downloaded file's size continuously and wait for it to stop increasing, which indicates the file is fully downloaded (see the sketch after this list).
??? profit ???
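A minimal Python sketch of that approach (folder names, the poll interval, and the timeout are illustrative):

import time
import uuid
from pathlib import Path

def make_download_dir(base="downloads"):
    # One unique downloads folder per execution
    path = Path(base) / uuid.uuid4().hex
    path.mkdir(parents=True)
    return path

def wait_for_download(folder, timeout=120, poll=1.0):
    # Wait for a file to appear, then wait until its size stops growing.
    # ".crdownload" is Chrome's suffix for in-progress downloads.
    deadline = time.time() + timeout
    while time.time() < deadline:
        files = [f for f in folder.iterdir()
                 if f.is_file() and not f.name.endswith((".crdownload", ".tmp"))]
        if files:
            newest = max(files, key=lambda f: f.stat().st_mtime)
            size = -1
            while newest.stat().st_size != size:
                size = newest.stat().st_size
                time.sleep(poll)
            return newest
        time.sleep(poll)
    raise TimeoutError("Download did not finish in time")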
I am currently working on a script that automatically syncs files from the Documents and Pictures directories to a USB stick that I use as a sort of "essentials backup". In practice, it should identify filenames and some information about them (like the last modification time) in the directories that I choose to sync.
If a file exists in one directory, but not in the other (i.e. it's on my computer but not on my USB drive), it should automatically copy that file to the USB as well. Likewise, if a file exists in both directories, but has different mod-times, it should replace the older with the newer one.
However, I have some issues with storing that information for the purpose of comparing those files. I initially thought about a file class that stores all that information, and through which I can compare objects with the same name.
Problem 1 with that approach is: if I create an object, how do I name it? Do I name it after the file? I would then have to strip the file extension like .txt or .py, because otherwise I'd run into trouble with my code. But I might have a notes.odt and a notes.jpg, which would be problem 2.
I am pretty new to Python, so my imagination is probably limited by my lack of knowledge. Any pointers on how I could make that work?
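To illustrate what I'm after, here is a rough sketch of the intended logic (paths are placeholders; this version just keys everything by the file's relative path instead of a class per file):

import shutil
from pathlib import Path

def collect(root):
    # Map relative path -> modification time; the path acts as the "name",
    # so notes.odt and notes.jpg can never collide
    return {p.relative_to(root): p.stat().st_mtime
            for p in root.rglob("*") if p.is_file()}

def sync(src, dst):
    src_files, dst_files = collect(src), collect(dst)
    for rel, mtime in src_files.items():
        # Copy if missing on the other side, or replace if this copy is newer
        if rel not in dst_files or mtime > dst_files[rel]:
            target = dst / rel
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src / rel, target)

docs = Path.home() / "Documents"
usb = Path("E:/essentials_backup/Documents")  # placeholder drive letter
sync(docs, usb)
sync(usb, docs)  # run both directions for a two-way sync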
This is a kind of open question, but please bear with me.
I am working on several projects (mainly with pandas) and I have created my standard approach to manage them:
1. create a main folder for all files in a project
2. create a data folder
3. have all the output in another folder
and so on.
One of my main activities is data cleaning, and in order to standardize it I have created a dictionary file where I store the various translations of the same entity (e.g. USA, US, United States, and so on), so that the files I produce are consistent.
Every time I create a new project, I copy the dictionary file into the data directory and then:
xls = pd.ExcelFile(r"data/dictionary.xlsx")
df_area = xls.parse("area")
and afterwards, to translate the country names into my standard, I call:
join_column, how_join = "country", "inner"
df_ct = pd.concat([
    df_ct.merge(df_area, left_on=join_column, right_on="country_name", how=how_join),
    df_ct.merge(df_area, left_on=join_column, right_on="alternative01", how=how_join),
])
and finally I check that I am not losing any records to a missed join.
Over and over the same thing.
I would like a way to remove all this unnecessary copy and paste (of the file and of the code). Also, the files I used in the first projects are already deprecated, and I need to update them (and sometimes the code) whenever I process new data. Sometimes I also lose track of where the latest dictionary file is! Overall it's a lot of maintenance, which I believe could be saved.
Is creating my own package the way to go, or is that a little too ambitious?
Is there another shortcut? Overall it's not a lot of code, but it is multiplied across several projects.
Thanks for any insight, your time going through this is appreciated.
In the end I decided to create my own package.
It required some time, so I am happy to share the details of the process (I run Python in Jupyter on Windows).
The first step is to decide where to store the code.
In my case it was C:\Users\my_user\Documents
You need to add this directory to the list of directories where Python looks for packages. This is achieved by running the following statements:
import sys
sys.path.append("C:\\Users\\my_user\\Documents")
In order to run the above statements each time you start Python, they must be included in a file in the startup directory (this directory may vary depending on your installation):
C:\Users\my_user\.ipython\profile_default\startup
The file can be named "00-first.py" ("50-middle.py" or "99-last.py" will also work).
To verify everything is working, restart Python and run:
print(sys.path)
You should be able to see your directory in the output at this point.
Create a folder with the package name in your directory, plus a subfolder inside it (I prefer not to have code in the main package folder):
C:\Users\my_user\Documents\my_package\my_subfolder
Put an empty file named __init__.py in each of the two folders, my_package and my_subfolder. At this point you should already be able to import your empty package from Python:
import my_package as my_pack
Inside my_subfolder, create a file (my_code.py) which will store the actual code:
def my_function(name):
    print("Hallo " + name)
Modify the outer __init__.py file to include shortcuts. Add the following:
from my_package.my_subfolder.my_code import my_function
You should now be able to run the following in Python:
my_pack.my_function("World!")
Hope you find it useful!
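As a sketch of what such a package might contain for the dictionary problem from the question (the path, the function name, and the rough record-count check are my own assumptions):

import pandas as pd

# Single shared copy of the dictionary, instead of one copy per project
DICTIONARY_PATH = "C:\\Users\\my_user\\Documents\\my_package\\data\\dictionary.xlsx"

def translate_countries(df_ct, join_column="country", how_join="inner"):
    df_area = pd.ExcelFile(DICTIONARY_PATH).parse("area")
    merged = pd.concat([
        df_ct.merge(df_area, left_on=join_column, right_on="country_name", how=how_join),
        df_ct.merge(df_area, left_on=join_column, right_on="alternative01", how=how_join),
    ])
    # Rough check that the join is not silently dropping records
    if len(merged) < len(df_ct):
        print("Warning: some records did not match the dictionary")
    return merged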
OK, so I wrote a Python program which creates backups of virtual machines on a virtual machine server and saves them onto an NFS share. I want to make it so that only the 10 most recent backups are kept, so after 10 backups it starts overwriting the first, second, third, and so on. What is the best approach for this? All I could think of is to have a text file which contains all the log information and the current state. Is there a better route? This is for Xen Server, which uses Python. Thanks.
What about writing a simple txt file each time a new backup is made?
Could be something like this: backup_ddmmyy_h_m.txt in a backup_cache directory?
And then, before you make a new backup, you simply check whether you already have 10 backup txt files and delete the oldest one.
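A minimal sketch of that rotation (the directory name and the limit of 10 follow the idea above):

from pathlib import Path

CACHE_DIR = Path("backup_cache")
MAX_BACKUPS = 10

def rotate(new_backup_name):
    CACHE_DIR.mkdir(exist_ok=True)
    # Existing marker files, oldest first by modification time
    existing = sorted(CACHE_DIR.glob("backup_*.txt"),
                      key=lambda p: p.stat().st_mtime)
    # Delete the oldest until there is room for the new one
    while len(existing) >= MAX_BACKUPS:
        existing.pop(0).unlink()
    (CACHE_DIR / new_backup_name).touch()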
I am currently working on an app that syncs one specific folder in a user's Google Drive. I need to find out when any of the files/folders in that specific folder have changed. The actual syncing process is easy, but I don't want to do a full sync every few seconds.
I am considering one of these methods:
1) Monitor the changes feed and look for any file changes.
This method is easy, but it will trigger a sync if ANY file in the drive changes.
2) Frequently request all files in the whole drive, e.g. service.files().list().execute(), and look for changes within the specific tree. This is a brute-force approach; it will be too slow if the user has thousands of files in their drive.
3) Start at the specific folder, and move down the folder tree looking for changes.
This method will be fast if there are only a few directories in the specific tree, but it will still lead to numerous API requests.
Are there any better ways to find whether a specific folder and its contents have changed?
Are there any optimisations I could apply to methods 1, 2 or 3?
As you have correctly stated, you will need to keep (or work out) the file hierarchy for a changed file to know whether a file has changed within a folder tree.
There is no way of knowing directly from the changes feed whether a deeply nested file within a folder has been changed. Sorry.
There are a couple of tricks that might help.
Firstly, if your app is using drive.file scope, then it will only see its own files. Depending on your specific situation, this may equate to your folder hierarchy.
Secondly, files can have multiple parents. So when creating a file in folder-top/folder-1/folder-1a/folder-1ai, you could declare both folder-1ai and folder-top as parents. Then you simply need to check for folder-top.
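A sketch of that trick with the Drive v2 Python client (assuming service is an authorized client and the FOLDER_*_ID constants hold real folder IDs; note that newer Drive API versions restrict files to a single parent):

# Create the file with both the watched top folder and its "real" nested
# location as parents
body = {
    "title": "notes.txt",
    "parents": [
        {"id": FOLDER_TOP_ID},  # the folder your app watches
        {"id": FOLDER_1AI_ID},  # the deeply nested location
    ],
}
new_file = service.files().insert(body=body).execute()

# Later, a changed file can be tested against the watched folder directly
changed = service.files().get(fileId=changed_file_id).execute()
in_watched_tree = any(p["id"] == FOLDER_TOP_ID
                      for p in changed.get("parents", []))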