How to move files in a Drive folder from Google Colab? - python

I'm using this code to read paths from a txt file; it changes each path's extension from jpg to json.
%cd /content
eliminados = 0
with open('vuelo1.txt') as b:
    for o in b:
        o = o.replace("jpg", "json")
        print('path:', o)
        eliminados = eliminados + 1
        !mv $o /content/drive/MyDrive/Banano/etiquetas_eval/Datasets_originales_inferidos/etiquetas/malas/$etiquetas/
print(eliminados)
Then I need to move the json files to another folder, for which I use the following line:
!mv $o /content/drive/MyDrive/Banano/etiquetas_eval/Datasets_originales_inferidos/etiquetas/malas/$etiquetas/
where $o is a json path read from the txt file and the second argument is the destination folder.
However, I get this error:
mv: missing destination file operand after '/content/drive/MyDrive/Banano/etiquetas_eval/Datasets_originales_inferidos/ric/vuelo1/DJI_0338_2-1.json'
Try 'mv --help' for more information.
/bin/bash: line 1: /content/drive/MyDrive/Banano/etiquetas_eval/Datasets_originales_inferidos/etiquetas/malas/vuelo1/: Is a directory
path: /content/drive/MyDrive/Banano/etiquetas_eval/Datasets_originales_inferidos/ric/vuelo1/DJI_0332_1-2.json
Any idea what I'm doing wrong?

There's a newline at the end of $o, so your !mv $o /content/drive/.. is broken into 2 lines / commands:
mv /content/drive/.../DJI_0338_2-1.json
/content/drive/.../malas/vuelo1/
That's why you see 2 separate error messages.
Try replacing o = o.replace("jpg", "json") with o = o.rstrip().replace("jpg", "json") to strip the trailing newline.
Debugging tip: using something like print(f'path: "{o}"') makes it far easier to spot such issues. And if you are not quite sure what exactly gets sent to Bash or how the variables are evaluated, test your commands with echo first:
!echo mv $o /content/drive/.../malas/$etiquetas/
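If shell quoting keeps getting in the way, the move can also be done in pure Python with shutil.move(), which sidesteps the newline/quoting issue entirely. A minimal sketch, using a throwaway temp directory in place of the Drive paths from the question:

```python
import os
import shutil
import tempfile

# Pure-Python sketch of the move step: shutil.move() needs no shell quoting,
# so a trailing newline becomes a visible bug instead of a split command.
# A throwaway directory stands in for the Drive paths from the question.
root = tempfile.mkdtemp()
src_dir = os.path.join(root, "vuelo1")
dst_dir = os.path.join(root, "malas", "vuelo1")
os.makedirs(src_dir)
os.makedirs(dst_dir)

# One fake file, plus a listing line with a trailing newline as in vuelo1.txt
open(os.path.join(src_dir, "DJI_0332_1-2.json"), "w").close()
listing = os.path.join(src_dir, "DJI_0332_1-2.jpg") + "\n"

moved = 0
for line in listing.splitlines():
    path = line.rstrip().replace("jpg", "json")  # strip newline, then swap extension
    if os.path.isfile(path):
        shutil.move(path, dst_dir)
        moved += 1
print(moved)  # 1
```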


Python throws an error for a file path inside the project folder where it shouldn't, and it seems like a Python glitch

Python is throwing an inconsistent error for a file reference inside the project folder, depending on the 'Working Directory' in the script configuration, which is very weird.
My project structure is as follows:
config_utils.py
f = ''
def config_init():
    global f
    txt_dir = '../files/sample.txt'
    f = open(txt_dir, "r")
    f = f.read()
mycode.py
import config_ru.config_utils as cu
cu.config_init()
print(cu.f)
On executing mycode.py, it throws the below error w.r.t. "sample.txt" in the "files" package,
but if I change the working directory of "mycode.py" in the script configuration from "level2" to "level1", mycode.py gets executed successfully.
This is very weird because in both cases the location of "sample.txt" remains unchanged, and both the error and being forced to change the working directory seem unacceptable. Please clarify.
The work-around is to get the path of the module you are in and apply the relative path of the resource file to that:
from pathlib import Path
f = ''
def config_init():
    global f
    p = Path(__file__).parent.absolute()
    txt_dir = (p / '../files/sample.txt').resolve()
    f = open(txt_dir, "r")
    f = f.read()
Looks like normal behavior. In the line
txt_dir = '../files/sample.txt'
the .. means 'go one directory up'. So, when you are in level2, it will go up one level (to level1) and look for files/sample.txt, which does not exist. However, when you are in level1, then the .. will bring you to the pythonProject dir, where it can find files/sample.txt
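The effect is easy to see with pathlib alone; a small sketch (the /pythonProject/level1/level2 layout is illustrative, following the question):

```python
from pathlib import Path

# Sketch: '..' resolves against the directory you run from, not the script's
# own location.  The /pythonProject/level1/level2 layout mirrors the question.
from_level2 = (Path("/pythonProject/level1/level2") / "../files/sample.txt").resolve()
from_level1 = (Path("/pythonProject/level1") / "../files/sample.txt").resolve()

print(from_level2)  # /pythonProject/level1/files/sample.txt  (does not exist)
print(from_level1)  # /pythonProject/files/sample.txt         (the real file)
```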

Clone GitLab repo & modify selected files then push with Python

I am trying to clone all the projects of a specific GitLab group and then modify all the .yaml and .txt files by adding a comment at the top of the selected files.
However, I am not sure if I am going about it the right way.
I am able to clone all the GitLab repositories, but I'm not sure how to write comments at the beginning of the selected files.
#!/usr/bin/python3
import os
import sys
import gitlab
import subprocess

glab = gitlab.Gitlab('http://GitLabInstance', 'PAT')
groups = glab.groups.list()
groupname = 'MY_GROUP'
for group in groups:
    if group.name == groupname:
        projects = group.projects.list(all=True)
        for repo in projects:
            command = f'git clone {repo.ssh_url_to_repo} -b master'
            process = subprocess.Popen(command, stdout=subprocess.PIPE, shell=True)
            output, _ = process.communicate()
            process.wait()

text = 'sample comment'
os.system('find . -name "*.txt"')
f = open("textfile.txt", "a")
f.write(text)
f.close()
However, with my current script I am only able to modify one file, and the comment is written at the end instead of the beginning.
Is there any way to make this a generic Python script, so we can use it with plain git too and not just GitLab? What changes need to be made for that?
How can I pass all the files which come out of find . -name "*.txt"? I want to add the comment at the top of all such files.
How do I push the changes back to the master branch automatically as part of the script itself?
I went through a lot of documentation on writing to files in Python, but it looks like we can only append at the end and not at the beginning. The only solutions I found involve rewriting the contents of the file by deleting it, or by cutting and pasting, which I find risky.
I am new to the world of Python and finding it a little difficult to manage things from here.
Thanks
Let me try to answer your questions:
There is nothing GitLab-specific in the script except that the repositories reside in GitLab. You are still using git to clone; only the URL of the repo changes, and that is not specific to GitLab.
Instead of doing find, use os.listdir(), which returns all the files in a list that you can then filter for the extension .txt:
In [1]: fs = os.listdir('/home/ninan/try/')
In [2]: fs
Out[2]: ['not.xls', 'somefile.txt']
In [3]: all_txt = [file for file in fs if os.path.splitext(file)[1] == '.txt']
In [4]: all_txt
Out[4]: ['somefile.txt']
You cloned them in a loop, push to the same repos in a loop.
Writing at the beginning of a file:
❯ cat somefile.txt
1 │ I am a file
2 │ blah blah
3 │ check this out
From my Python console (copy only the statements, without the In/Out markers, into your script):
In [1]: f = open(path, 'r+')
In [2]: lines = f.readlines()
In [3]: f.seek(0)
Out[3]: 0
In [4]: f.write("I am a line at the beginning. I am the alpha and not the omega\n\n")
Out[4]: 64
In [5]: for line in lines:  # write old content after new
   ...:     f.write(line)
   ...:
In [6]: f.close()
Catting the file after the above execution:
1 │ I am a line at the beginning. I am the alpha and not the omega
2 │
3 │ I am a file
4 │ blah blah
5 │ check this out
Another simple way:
with open(path, 'r') as f: lines = f.read()
with open(path, 'w') as f: f.write("I am a line at the beginning. I am the alpha and not the omega\n" + lines)
To recursively search in directories, check out fnmatch, or you can do it in plain Python with os.walk() (which yields (dirpath, dirnames, filenames) tuples):
[os.path.join(root, name) for root, dirs, files in os.walk(folder) for name in files if name.endswith(".txt")]
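Putting the answer's pieces together, here is a sketch of a generic prepend-and-push helper. The function names (target_files, prepend, commit_and_push) are made up for illustration; the git calls are defined but not run in the demo, which uses a throwaway directory standing in for a cloned repo:

```python
import os
import subprocess
import tempfile

COMMENT = "# sample comment\n"

def target_files(repo_dir):
    """Collect every .txt and .yaml file under repo_dir, recursively."""
    return [os.path.join(root, name)
            for root, _dirs, files in os.walk(repo_dir)
            for name in files
            if name.endswith((".txt", ".yaml"))]

def prepend(path, text):
    """Write text at the top of path, keeping the old content below it."""
    with open(path) as fh:
        old = fh.read()
    with open(path, "w") as fh:
        fh.write(text + old)

def commit_and_push(repo_dir, message="add header comment"):
    """Stage, commit and push -- plain git, nothing GitLab-specific."""
    subprocess.run(["git", "-C", repo_dir, "add", "-A"], check=True)
    subprocess.run(["git", "-C", repo_dir, "commit", "-m", message], check=True)
    subprocess.run(["git", "-C", repo_dir, "push", "origin", "master"], check=True)

# Demo on a throwaway directory standing in for a cloned repo:
repo = tempfile.mkdtemp()
with open(os.path.join(repo, "a.txt"), "w") as fh:
    fh.write("body\n")
for path in target_files(repo):
    prepend(path, COMMENT)
with open(os.path.join(repo, "a.txt")) as fh:
    print(fh.read())  # comment line first, then the original body
```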

How to change the path name according to my system

The code below is from GitHub and I want to change it according to my system's path.
with open("./output/cifar_inception_plot.pkl", 'rb') as f:
    dat = pickle.load(f)
total_inception = dict({})
for item in dat:
    allis = dat[item]
    allis = [x[0] for x in allis]
    total_inception[os.path.basename(item)] = np.array(allis)
When I tried to change it like the code below:
with open("./Users/Amulya/Desktop/cifar_inception.pkl", 'rb') as f:
    dat = pickle.load(f)
total_inception = dict({})
for item in dat:
    allis = dat[item]
    allis = [x[0] for x in allis]
    total_inception[os.path.basename(item)] = np.array(allis)
I got this error:
---------------------------------------------------------------------------
FileNotFoundError Traceback (most recent call last)
in
77 }
78
---> 79 with open("./Users/Amulya/Desktop/cifar_inception.pkl", 'rb') as f:
80 dat = pickle.load(f)
81 total_inception = dict({})
FileNotFoundError: [Errno 2] No such file or directory: './Users/Amulya/Desktop/cifar_inception.pkl'
I am still getting the error. Any solution on how to write the filename correctly?
If the file is in the same directory as the Python script you're running, you can just use the name of the file itself: open("cifar_inception.pkl"). Otherwise you can use various utils in the os library. Ultimately you need to know where the file is on your system and what the full path is; when in doubt, use the full path from root. Assuming you're on macOS based on the path you provided, it might just be "/Users/Amulya/Desktop/cifar_inception.pkl".
./Users/Amulya/Desktop/cifar_inception.pkl does not exist.
The ./ means "in this directory". For example:
I am in my home folder:
/home/me/
and I want to access my Download folder. Instead of typing "/home/me/Download" I could just use "./Download".
Your program is trying to look for a folder named Users inside its current path.
Just remove the single dot from the absolute path, and it will probably work.
The symbol . indicates you are providing a relative path, which is resolved against your current working directory. I suggest you change the string representing the path of the file you are trying to open to the file's absolute path.
If you can access the file via a file manager with a GUI (e.g. Windows Explorer on Windows, Nautilus on Ubuntu), you can easily check the absolute path of the file in its properties. If you can access the file's directory via the command line, PWD is an environment variable holding the current directory's absolute path; check it with echo $PWD and append the filename (cifar_inception.pkl) to the output to obtain the file's absolute path.
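Alternatively, pathlib can build the absolute path explicitly instead of guessing at ./ prefixes; a small sketch (on the asker's Mac, Path.home() would be /Users/Amulya; here it is whatever the current user's home is):

```python
from pathlib import Path

# Sketch: build the absolute path explicitly instead of prefixing './'.
# Path.home() would be /Users/Amulya on the asker's Mac (illustrative here).
pkl = Path.home() / "Desktop" / "cifar_inception.pkl"
print(pkl)
print(pkl.is_absolute())  # True
```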

awk: fatal: cannot open file 'file' for reading (Permission denied)

The piece of code below is part of a larger program which I am running on a remote server via a batch script with #!/bin/bash -l as its first line.
On my local machine it runs normally, but on the remote server a permission issue arises. What may be wrong?
The description of the code may not be important to the problem; basically, the code uses awk to process the contents of the files based on their names.
Why is awk denied permission to operate on the files? When I run awk directly at a shell prompt on the remote server, it works normally.
#!/usr/bin/env python
import subprocess

list_of_files = ["file1", "file2", "file3"]
for file in list_of_files:
    awk_cmd = '''awk '/^>/{print ">" substr(FILENAME,1,length(FILENAME)) ++i; next} 1' ''' + file + " > tmp && mv tmp " + file + \
              " | cat files > 'pooled_file' "
    exitcode = subprocess.call(awk_cmd, shell=True)
Any help would be appreciated.
I am pretty sure it is a path issue: when you log in to the remote machine you are NOT landing in the directory where your Input_file(s) are present; you land in the HOME directory of the logged-in user on the remote server. So it is good practice to mention file names with complete paths (make sure the files are actually present at the target location too, or write a wrapper that checks whether they exist). Could you please try the following.
#!/usr/bin/env python
import subprocess

list_of_files = ["/full/path/file1", "/full/path/file2", "/full/path/file3"]
for file in list_of_files:
    awk_cmd = '''awk '/^>/{num=split(FILENAME,array,"/");print ">" substr(array[num],1,length(array[num])) ++i; next} 1' ''' + file + " > tmp$$ && mv tmp$$ " + file + \
              " | cat files > 'pooled_file' "
    exitcode = subprocess.call(awk_cmd, shell=True)
I haven't tested it, but I have changed it to use full paths. Since awk would then print the complete path with the filename, I changed FILENAME in your code to the last element of the split array; I also changed the tmp temporary file to tmp$$ to be safer.
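A minimal sketch of the existence-checking wrapper mentioned above: verify that every full path actually exists before invoking awk (the paths are placeholders from the answer):

```python
import os

# Sketch of the suggested wrapper: check that every full path exists on the
# machine before handing it to awk (the paths are placeholders from the answer).
list_of_files = ["/full/path/file1", "/full/path/file2", "/full/path/file3"]

missing = [f for f in list_of_files if not os.path.isfile(f)]
if missing:
    print("not running awk, missing inputs:", missing)
```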

Delete certain files from a directory using regex regarding their file names

Here I am attempting to write code that deletes files in a folder according to a mask. All files that include 17 should be removed. The file name format is ??_????17*.*, where ? is any symbol (1..n, A..z); _ and 17 appear in the files to delete (others contain 18 instead), and the extension doesn't matter. A concrete example of such a file: AB_DEFG17Something.Anything - Copy (2).txt
import os
import re

dir_name = "/Python/Test_folder"  # open the folder and read files
testfolder = os.listdir(dir_name)

def matching(r, s):  # condition if there's nothing to match
    match = re.search(r, s)
    if match:
        return match.group()
    return "Files don't exist!"

matching(r'^\w\w\[_]\w\w\w\w\[1]\[7]\w+\[.]\w+', testfolder)  # matching the file's mask

for item in testfolder.index(matching):
    if item.name(matching, s):
        os.remove(os.path.join(dir_name, item))

# format of filenames not converted : ??_????17*.*
# convert for python separately : [\w][\w][_\w][\w][\w][\w]\[1]\[7][\w]+[\.][\w]+
# ? - Any symbol 1..n,A..z; repeated \w is *
# * - Any number of symbols 1..n, A..z
# _ and 17 - in any files
There are a few error messages as well:
File "D:\Python\Test_folder\Remover v2.py", line 14, in
matching(r'\w\w[_]\w\w\w\w[1][7]\w+[.]\w+', testfolder) # matching the file's mask
 File "D:\Python\Test_folder\Remover v2.py", line 9, in matching
match = re.search(r, s)
File "c:\Program Files (x86)\Wing IDE Personal 6.0\bin\runtime-python2.7\Lib\re.py", line 146, in search
return _compile(pattern, flags).search(string)
I'm a beginner with an amateurish approach and would like to get experience in Python while learning the details. What am I doing wrong? Any help would be useful. Thanks
Don't reinvent the wheel, rather use glob() instead:
import os
from glob import glob

for file in glob('/Python/Test_folder/AB_CDEF17*.*'):
    os.remove(file)
Using glob.glob
for filename in glob.glob(os.path.join(dirname, "AB_CDEF17*.*")):
    try:
        # glob already returns the path joined with dirname
        os.remove(filename)
    except EnvironmentError:
        # You don't have permission to do it
        pass
Using os.scandir and re.match
pattern = re.compile(r"AB_CDEF17\w+\.\w+")
for entry in os.scandir(dirname):
    if pattern.match(entry.name):  # scandir yields DirEntry objects, so match on .name
        try:
            os.remove(entry.path)
        except EnvironmentError:
            pass
You can use the following command directly from your shell:
cd $PATH; for inode in $(ls -il AB_CDEF17*.* | awk '{print $1}'); do find . -type f -inum $inode -exec rm -i {} \;; done
cd $PATH: go to the folder in question (here $PATH stands for your target directory; in a real shell, pick another variable name, since PATH is reserved)
$(ls -il AB_CDEF17*.* | awk '{print $1}') will print all the inode numbers of the files in your current directory. I am using this detour since it looks like there are spaces inside the filenames, so the rm command will not work properly on them directly.
find . -type f -inum $inode -exec rm -i {} \;: find the files based on their inode number and delete them after asking your permission.
If you are sure about what you are doing and you really want to embed it in some Python code:
from subprocess import call
call("cd $PATH; for inode in $(ls -il AB_CDEF17*.* | awk '{print $1}'); do find . -type f -inum $inode -exec rm -f {} \;; done", shell=True)
Watch out: with rm -f the files will be deleted without asking for your confirmation.
You can try a glob solution.
For example, these are the files in the folder:
~/Test-folder$ ls *.txt -1
AB_DEFG17Sitanything.n.txt
AB_DEFG17SOManything.copy(2).txt
AB_DEFG17SOManything.nis.txt
AB_DEFG17SOManything.n.txt
AB_DEFG18SOManything.n.txt
AB_DEFG28SOManything.n.txt
AB_PIZG17SOManything.piz.txt
AB_PIZG28SOManything.n.txt
AB_PIZG76SOManything.n.txt
My code:
import glob

r = [f for f in glob.glob("*.txt") if "AB_DEFG" in f or "17" in f]
for f in r:
    print(f)
You will get
AB_DEFG17SOManything.n.txt
AB_DEFG17SOManything.nis.txt
AB_PIZG17SOManything.piz.txt
AB_DEFG17Sitanything.n.txt
AB_DEFG28SOManything.n.txt
AB_DEFG17SOManything.copy(2).txt
AB_DEFG18SOManything.n.txt
I forgot to add the remove solution:
import glob, os

r = [f for f in glob.glob("*.txt") if "AB_DEFG" in f or "17" in f]
for f in r:
    os.remove(f)
Only two files will stay
AB_PIZG28SOManything.n.txt
AB_PIZG76SOManything.n.txt
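For completeness, the question's original mask ??_????17*.* can also be used directly with fnmatch, with no hand-written regex; a sketch (filenames taken from the listings above):

```python
import fnmatch

# Sketch: the question's mask ??_????17*.* works directly with fnmatch,
# no hand-written regex needed (filenames taken from the listings above).
names = [
    "AB_DEFG17Something.Anything - Copy (2).txt",
    "AB_DEFG18SOManything.n.txt",
    "AB_PIZG17SOManything.piz.txt",
]
to_delete = [n for n in names if fnmatch.fnmatch(n, "??_????17*.*")]
print(to_delete)  # only the two names containing 17
# for n in to_delete: os.remove(os.path.join(dir_name, n))  # actual deletion
```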
