Clone GitLab repo & modify selected files then push with Python

Clone GitLab repo & modify selected files then push with Python - python

I am trying to clone all the projects of a specific group of GitLab and then trying to modify all the .yaml files & .txt files by adding a comment in the top of the selected files.
However I am not sure if I am going through the right approach.
I am able to clone all the gitlab repositories but not sure how I can write comments in the beginning of the selected files.
#!/usr/bin/python3
import os
import sys
import gitlab
import subprocess
glab = gitlab.Gitlab(f'http://GitLabInstance', f'PAT')
groups = glab.groups.list()
groupname = 'MY_GROUP'
for group in groups:
if group.name == groupname:
projects = group.projects.list(all=True)
for repo in projects:
command = f'git clone {repo.ssh_url_to_repo} -b master'
process = subprocess.Popen(command, stdout=subprocess.PIPE, shell=True)
output, _ = process.communicate()
process.wait()
text = 'sample comment'
os.system('find . -name ".txt"')
f = open("textfile.txt","a")
f.write(text)
f.close()
However, with the current script which I am trying, I am only able to write modify only one file where I am writing the comment in the end of the line.
Is there any way we can make this script as generic python script so that we can make use of it for normal git too and not for gitlab? What changes needs to be made for that?
How can I pass all the files which are coming as output of find . -name ".txt" , because I want to add that comment on top for all such files.
How to push it back to the master branch automatically as part of script itself.
I went through so much of documentation regarding writing in files of python but looks like we can only append in the end and not in beginning, The only possible solutions I am getting is where I need to rewrite the contents of a file again by deleting it or by cutting and pasting it which I find very disastrous.
I am new to the world of python and finding it little difficult to manage things from here.
Thanks

Let me try to answer your questions -
There is nothing specific to gitlab in the script except that the repositories reside in gitlab. You are still using git to clone. Only the URL of the repo will change and that is non specific to git.
Instead of doing find, use os.listdir() which will return all the fiels in an array whicch you can then filter for the extension .txt
In [1]: fs = os.listdir('/home/ninan/try/')
In [2]: fs
Out[2]: ['not.xls', 'somefile.txt']
In [3]: all_txt = [file for file in fs if os.path.splitext(file)[1] == '.txt']
In [4]: all_txt
Out[4]: ['somefile.txt']
You cloned them in a loop, push to the same repos in a loop.
Writing in beginning of a file:
❯ cat somefile.txt
1 │ I am a file
2 │ blah blah
3 │ check this out
~/try
❯
From my pyconsole:
Just put the In statements to your script
In [1]: f = open(path, 'r+')
In [1]: lines = f.readlines()
In [2]: f.seek(0)
Out[2]: 0
In [3]: f.write("I am a line at the beginning. I am the alpha and not the omega\n\n")
Out[3]: 62
In [4]: for line in lines: # write old content after new
...: f.write(line)
...:
In [5]: f.close()
Catting the file after the above execution:
1 │ I am a line at the beginning. I am the alpha and not the omega
2 │
3 │ I am a file
4 │ blah blah
5 │ check this out
Another simple way is :
with open(path, 'r') as f: lines = f.read()
with open(path, 'w') as f: f.write("I am a line at the beginning. I am the alpha and not the omega\n" + lines)
To recursively search in directories:
Check out fnmatch or you can do this with plain python also using os.walk() -
[os.path.join(root, file) for root, dirname, dir in os.walk(folder) for file in dir if file.endswith(".txt")]

Related

Loop through many files in 2 directories

Given: Folder 1 with A.txt and B.txt and Folder 2 with A.txt. and B.txt
How would I be able to run them concurrently such as file A.txt from folder 1 should run with file from folder 2 A.txt and so on.
What I have so far loops through all of the second folders files and then loops through the first folders files, which throws it out of order. Some stuff will be done such as merging parts of the files together (which has been done so far).
My main question is how would I be able to run through 2 directories simultaneously and do stuff inside them.
Note there are many files in Folder 1 and Folder 2 so I need to find a way that utilizes directory schema of some sort
patha=/folder1
pathb=/folder2
import os,glob
for filename in glob.glob(os.path.join(patha,'*.txt'):
for filenamez in glob.glob(os.path.join(pathb,'*.txt'):
MY FUNCTION THAT DOES OTHER STUFF

You can open files with the same name in both folders simultaneously using context managers and do whatever needs to be done from both input streams:
import os
my_folders = ['Folder1', 'Folder2']
common_files = set(os.listdir('Folder1')) & set(os.listdir('Folder2'))
non_common_files = set(os.listdir('Folder1')) ^ set(os.listdir('Folder2'))
print(f'common_files" {common_files}')
print(f'files without matches: {non_common_files}')
for f_name in common_files:
with open(os.path.join(my_folders[0], f_name)) as src_1:
with open(os.path.join(my_folders[1], f_name)) as src_2:
# do the stuff on both sources... for instance print first line of each:
print(f'first line of src_1: {src_1.readline()}')
print(f'first line of src_2: {src_2.readline()}')
Output
common_files" {'A.txt'}
files without matches: set()
first line of src_1: some txt
first line of src_2: text in folder 2's A

Is zip what you're looking for?
import glob
import os
files_a = glob.glob(os.path.join(path_a, "*.txt")
files_b = glob.glob(os.path.join(path_b, "*.txt")
for file_a, file_b in zip(files_a, files_b):
pass

You could maybe do something like this:
from threading import Thread
import os,glob
def dir_iterate(path: str):
for filename in glob.glob(os.path.join(path,'*.txt'):
# Other stuff ..
path1 = "./directory1"
path2 = "./directory2"
Thread(target = dir_iterate, args=(path1,)).start()
Thread(target = dir_iterate, args=(path2,)).start()

This should work,
import glob
import os
files_a = sorted(glob.glob(os.path.join(path_a, "*.txt")))
files_b = sorted(glob.glob(os.path.join(path_b, "*.txt")))
for file_a, file_b in zip(files_a, files_b):
# Add code to concat

Python: Looping through files in a different directory and scanning data

I am having a hard time looping through files in a directory that is different from the directory where the script was written. I also ideally would want my script through go to through all files that start with sasa. There are a couple of files in the folder such as sasa.1, sasa.2 etc... as well as other files such as doc1.pdf, doc2.pdf
I use Python Version 2.7 with windows Powershell
Locations of Everything
1) Python Script Location ex: C:Users\user\python_project
2) Main_Directory ex: C:Users\user\Desktop\Data
3) Current_Working_Directory ex: C:Users\user\python_project
Main directory contains 100 folders (folder A, B, C, D etc..)
Each of these folders contains many files including the sasa files of interest.
Attempts at running script
For 1 file the following works:
Script is run the following way: python script1.py
file_path = 'C:Users\user\Desktop\Data\A\sasa.1
def writing_function(file_path):
with open(file_path) as file_object:
lines = file_object.readlines()
for line in lines:
print(lines)
writing_function(file_path)
However, the following does not work
Script is run the following way: python script1.py A sasa.1
import os
import sys
from os.path import join
dr = sys.argv[1]
file_name = sys.argv[2]
file_path = 'C:Users\user\Desktop\Data'
new_file_path = os.path.join(file_path, dr)
new_file_path2 = os.path.join(new_file_path, file_name)
def writing_function(paths):
with open(paths) as file_object:
lines = file_object.readlines()
for line in lines:
print(line)
writing_function(new_file_path2)
I get the following error:
with open(paths) as file_object:
IO Error: [Errno 2] No such file or directory:
'C:Users\\user\\Desktop\\A\\sasa.1'
Please note right now I am just working on one file, I want to be able to loop through all of the sasa files in the folder.

It can be something in the line of:
import os
from os.path import join
def function_exec(file):
code to execute on each file
for root, dirs, files in os.walk('path/to/your/files'): # from your argv[1]
for f in files:
filename = join(root, f)
function_exec(filename)
Avoid using the variable dir. it is a python keyword. Try print(dir(os))
dir_ = argv[1] # is preferable

No one mentioned glob so far, so:
https://docs.python.org/3/library/glob.html
I think you can solve your problem using its ** magic:
If recursive is true, the pattern “**” will match any files and zero
or more directories and subdirectories. If the pattern is followed by
an os.sep, only directories and subdirectories match.

Also note you can change directory location using
os.chdir(path)

Add files to empty directory within Tar in Python

In Python, I am trying to create a tar with two empty directories in it, and then add a list of files to each empty directory within the tar. I have tried doing it this way below, but It does not work.
def ISIP_tar_files():
with tarfile.open("eeg_files.tar", "w") as f:
ep_dir = tarfile.TarInfo("Eplilepsy Reports")
not_ep_dir = tarfile.TarInfo("Non Epilepsy Reports")
ep_dir.type = not_ep_dir.type = tarfile.DIRTYPE
f.addfile(ep_dir)
f.addfile(not_ep_dir)
with ep_dir.open():
for name in ep_list:
f.tarfile.add(name)
I honestly did not believe it would work, but it was worth a try because I couldn't find any other solutions on Google. This is just one module of the code, and it does not include the main program or imports. ep_list is a list of files with paths, it looks similar to:
ep_list = [/data/foo/bar/file.txt, /data/foo/bar2/file2.txt, ...]
Any Sugegstions?

import tarfile
import os
ep_list = ['./foo/bar/file.txt', './foo/bar/file2.txt']
def ISIP_tar_files():
with tarfile.open("eeg_files.tar", "w") as f:
ep_dir = tarfile.TarInfo("Eplilepsy Reports")
not_ep_dir = tarfile.TarInfo("Non Epilepsy Reports")
ep_dir.type = not_ep_dir.type = tarfile.DIRTYPE
ep_dir.mode = not_ep_dir.mode = 0o777
f.addfile(ep_dir)
f.addfile(not_ep_dir)
for name in ep_list:
f.add(name, arcname="Eplilepsy Reports/" + os.path.basename(name), recursive=False)
The directory file permission mode should be made executable at least for the owner. Otherwise it cannot be extracted.
arcname is the alternative name for the file in the archive.
recursive means whether or not keep the original directories added recursively, its default value is True.

Getting sub directories list in a text file and append that txt file with new subdirectory name

I am trying to write a script which will list down all subdirectories in a directory into a txt file.
this script will run every 1 hour through cron job so that i can append to the txt file already created in previous run and add new subdir names.
For eg:
/Directory
/subdir1
/subdir2
/subdir3
txt.file should have following columns:
subdir_name timestamp first_filenamein_thatSUBDIR
subdir1 2015-23-12 abc.dcm
subdir2 2014-23-6 ghj.nii
.
.
.
I know to get list of directories using os.listdir but don't know how to approach this problem as i want to write same txt file with new names. ANy idea how should i do that in python?
EDit: With os.listdir i am getting sub directories name but not the time stamp. And other problem is how can i create two columns one with sub directory name and other with its time stamp as shown above?
With #Termi's help i got this code working:
import time
import os
from datetime import datetime
parent_dir = '/dicom/'
sub_dirs = os.walk(parent_dir).next()[1]
with open('exam_list.txt','a+') as f:
lines = f.readlines()
present_dirs = [line.split('\t')[0] for line in lines]
for sub in sub_dirs[1:len(sub_dirs)]:
sub = sub + '/0001'
latest_modified = os.path.getctime(os.path.join(parent_dir,sub))
if sub not in present_dirs and time.time() - latest_modified < 4600 :
created = datetime.strftime(datetime.fromtimestamp(latest_modified),'%Y-%d-%m')
file_in_subdir = os.walk(os.path.join(parent_dir,sub)).next()[2][1]
f.write("%s\t%s\t%s\n"%(sub,created,file_in_subdir))
This code, when typed on python terminal, works well with all the variables sub, created, file_in_subdir holding some value, however, is not able to write it in a file mentioned at the beginning of the code.
I also tried if file writing is a problem using following code:
with open('./exam_list.txt','a+') as f:
f.write("%s\t%s\n"%(sub,file_in_subdir))
Above two lines creates file properly as i intended..
Not able to point out what is the error.

To get the immediate sub-directories in the parent directory use os.walk('path/to/parent/dir').next()[1].
os.walk().next() gives a list of lists as [current_dir, [sub-dirs], [files] ] so next()[1] gives sub-directories
opening the file with 'a+' will allow you to both read and append to the file. Then store the sub-directories that are already in the file
with open('dirname.txt','a+') as f:
lines = f.readlines()
present_dirs = [line.split('\t')[0] for line in lines]
Now for each sub-directory check whether it is already present in the list and if not, add it to the file. If you execute it every hour you can even check for new files created(or modified in linux systems) in the last hour by using getctime
time.time() - os.path.getctime(os.path.join(parent_dir,sub)) < 3600
Now for any new sub-directory use os.walk('path/to/subdir').next[2] and get the filenames inside
import time
import os
from datetime import datetime
parent_dir = '/path/to/parent/directory'
sub_dirs = os.walk(parent_dir).next()[1]
with open('dirname.txt','a+') as f:
lines = f.readlines()
present_dirs = [line.split('\t')[0] for line in lines]
for sub in sub_dirs:
latest_modified = os.path.getctime(os.path.join(parent_dir,sub))
if sub not in present_dirs and time.time() - latest_modified < 3600 :
created = datetime.strftime(datetime.fromtimestamp(latest_modified),'%Y-%d-%m')
file_in_subdir = os.walk(os.path.join(parent_dir,sub)).next()[2][0]
f.write("%s\t%s\t%s\n"%(sub,created,file_in_subdir))

with open('some.txt', 'a') as output:
output.write('whatever you want to add')
Opening a file with 'a' as a parameter appends everything you write to it to its end.

You can use walk from os package.
It's more better than listdir.
You can read more about it here
En Example:
import os
from os.path import join, getctime
with open('output.txt', 'w+') as output:
for root, dirs, files in os.walk('/Some/path/'):
for name in files:
create_time = getctime(join(root, name))
output.write('%s\t%s\t%s\n' % (root, name, create_time))

Editing file names and saving to new directory in python

I would like to edit the file name of several files in a list of folders and export the entire file to a new folder. While I was able to rename the file okay, the contents of the file didn't migrate over. I think I wrote my code to just create a new empty file rather than edit the old one and move it over to a new directory. I feel that the fix should be easy, and that I am missing a couple of important lines of code. Below is what I have so far:
import libraries
import os
import glob
import re
directory
directory = glob.glob('Z:/Stuff/J/extractions/test/*.fsa')
The two files in the directory look like this when printed out
Z:/Stuff/J/extractions/test\c2_D10.fsa
Z:/Stuff/J/extractions/test\c3_E10.fsa
for fn in directory:
print fn
this script was designed to manipulate the file name and export the manipulated file to a another folder
for fn in directory:
output_directory = 'Z:/Stuff/J/extractions/test2'
value = os.path.splitext(os.path.basename(fn))[0]
matchObj = re.match('(.*)_(.*)', value, re.M|re.I)
new_fn = fn.replace(str(matchObj.group(0)), str(matchObj.group(2)) + "_" + str(matchObj.group(1)))
base = os.path.basename(new_fn)
v = open(os.path.join(output_directory, base), 'wb')
v.close()
My end result is the following:
Z:/Stuff/J/extractions/test2\D10_c2.fsa
Z:/Stuff/J/extractions/test2\E10_c3.fsa
But like I said the files are empty (0 kb) in the output_directory

As Stefan mentioned:
import shutil
and replace:
v = open(os.path.join(output_directory, base), 'wb')
v.close()
with:
shutil.copyfile (fn, os.path.join(output_directory, base))

If I'am not wrong, you are only opening the file and then you are immediately closing it again?
With out any writing to the file it is surely empty.
Have a look here:
http://docs.python.org/2/library/shutil.html
shutil.copyfile(src, dst) ;)

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.

Clone GitLab repo & modify selected files then push with Python - python

Related

Loop through many files in 2 directories

Python: Looping through files in a different directory and scanning data

Add files to empty directory within Tar in Python

Getting sub directories list in a text file and append that txt file with new subdirectory name

Editing file names and saving to new directory in python

Categories

Resources