How to delete files using the '*' wildcard syntax with Python 3? - python

There are some files named like percentxxxx.csv and percentyyyy.csv in a directory. I want to delete the files whose names begin with percent.
I found the os.remove function, which may help, but I don't know how to apply it to this problem.
Is there any other function that can delete files using a pattern such as percent*.csv?
This is my current method:
import os
system_dir = os.getcwd()
for fname in os.listdir(system_dir):
    # print(fname)
    # match the "percent" prefix described above
    if fname.startswith('percent'):
        os.remove(os.path.join(system_dir, fname))
I mainly want to know whether there is an easier method, for example one that uses the * syntax.

Use glob:
import os
import glob
for csv_file in glob.glob("percent*.csv"):
    os.remove(csv_file)
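If it helps to see what would be deleted before anything is removed, the same idea can be wrapped in a small helper. This is only a minimal sketch using pathlib; the function name delete_matching and its dry_run parameter are made up for illustration and are not part of any library.

```python
from pathlib import Path


def delete_matching(directory, pattern, dry_run=True):
    """Delete regular files in `directory` whose names match `pattern`.

    Returns the sorted list of matched paths. With dry_run=True nothing
    is deleted. (Hypothetical helper, not a standard-library function.)
    """
    matched = sorted(p for p in Path(directory).glob(pattern) if p.is_file())
    if not dry_run:
        for path in matched:
            path.unlink()  # same effect as os.remove(path)
    return matched
```

Calling delete_matching(".", "percent*.csv") first lists the candidates; passing dry_run=False performs the deletion.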

Related

How to get the names of the files using glob function in python?

I have some files in a directory :
SRR01231_1.fastq
SRR01231_2.fastq
SRR01232_1.fastq
SRR01232_2.fastq
SRR01233_1.fastq
SRR01233_2.fastq
I am writing a Snakemake workflow to analyze these files, and for that I need the names of the files in this directory. I am trying to get them with the glob function, but I can't get it to work.
Sample code I wrote:
import glob
srr, fr = glob.glob({id}+'_'+{int}+'fastq')
The output I expect is the id (e.g., SRR01231) saved to srr and the integer that follows it saved to fr.
Is it possible to use some other function to do the same?
Any suggestions or help is appreciated.
You can use pathlib.Path and its glob method to parse such info:
import pathlib
fastq_paths = pathlib.Path("/path/to/your/fastq-files").glob("*.fastq")
for path in fastq_paths:
    srr, fr = path.stem.split("_")
    print(srr, fr)
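To collect the parsed names rather than print them, the loop can build a mapping from each sample ID to its read numbers. The following is a sketch that assumes every file is named <id>_<read>.fastq as in the question; group_reads is a hypothetical helper name.

```python
from collections import defaultdict
from pathlib import Path


def group_reads(directory):
    """Map each sample ID to the sorted list of its read numbers.

    Assumes file names of the form <id>_<read>.fastq.
    (group_reads is a hypothetical helper name.)
    """
    reads = defaultdict(list)
    for path in Path(directory).glob("*_*.fastq"):
        # split from the right so an ID that itself contains "_" stays intact
        srr, fr = path.stem.rsplit("_", 1)
        reads[srr].append(fr)
    return {srr: sorted(frs) for srr, frs in reads.items()}
```

The read numbers are kept as strings, which is usually what a workflow needs when it rebuilds the file names later.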

Finding latest file in a folder using python

I've searched for an answer to this, but the answers I found still gave me an error message, and I wasn't allowed to ask there, so I had to post a new question. So here it goes...
I need my python script to use the latest file in a folder.
I tried several things, currently the piece of code looks like this:
list_of_files = glob.glob('/my/path/*.csv')
latest_file = max(list_of_files, key=os.path.getmtime)
But the code fails with the following error:
ValueError: max() arg is an empty sequence
Does anyone have an idea why?
max() raises this error only when the list is empty, so glob.glob apparently found no matching files. First check whether the list is empty, for example by printing it.
I tested this code and it worked fine:
import os
import glob
mypath = "C:/Users/<Your username>/Downloads/*.*"
print(min(glob.glob(mypath), key=os.path.getmtime))
print(max(glob.glob(mypath), key=os.path.getmtime))
By default, glob.glob wildcards do not match files whose names start with a dot (.).
To match such files, put the leading dot in the pattern explicitly (assume a directory that contains .picture.png):
import glob
glob.glob('.p*')  # assuming you're already in the directory
It is also a good idea to check how many files matched before operating on them.
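The empty-list check can be folded into a small function so that the caller gets None instead of a ValueError. This is a sketch, not the only way to do it; latest_file is a hypothetical helper name.

```python
import glob
import os


def latest_file(pattern):
    """Return the most recently modified file matching `pattern`, or None.

    (latest_file is a hypothetical helper name.)
    """
    candidates = [p for p in glob.glob(pattern) if os.path.isfile(p)]
    if not candidates:
        return None  # avoids "max() arg is an empty sequence"
    return max(candidates, key=os.path.getmtime)
```

A caller can then write `path = latest_file('/my/path/*.csv')` and handle the `None` case explicitly, which also makes a wrong directory or pattern easy to spot.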

apply command to list of files in python

I have a tricky problem. I need to apply a command called xRITDecompress to a list of files whose names end in -C_, and I have to do this from Python.
Unfortunately, this command doesn't work with wildcards and I can't do something like:
os.system("xRITDecompress *-C_")
In principle, I could write an auxiliary bash script with a for loop and call it from my Python program. However, I'd rather not rely on auxiliary files...
What would be the best way to do this within a python program?
You can use glob.glob() to get the list of files you want to process and then run the command on each file in that list:
import glob
import os
for f in glob.glob('*-C_'):
    os.system('xRITDecompress {}'.format(f))
From the documentation:
The glob module finds all the pathnames matching a specified pattern according to the rules used by the Unix shell.
If you meant the _ (underscore) to match any single character, use ? instead, like this:
glob.glob('*-C?')
Note that a pattern without a directory part is matched only against the current working directory, but judging from your original attempt, that seems to be what you want.
You may also want to look at the subprocess module, which is a more powerful way to run commands (spawn processes). Example:
import subprocess
import glob
for f in glob.glob('*-C_'):
    subprocess.call(['xRITDecompress', f])
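With subprocess it is also possible to stop as soon as one invocation fails. The sketch below uses subprocess.run with check=True; the helper name run_on_matches is hypothetical, and the command is passed as a list so that file names containing spaces need no shell quoting.

```python
import glob
import subprocess


def run_on_matches(command, pattern):
    """Run `command` once per file matching `pattern`.

    `command` is a list such as ["xRITDecompress"]; each file name is
    appended as the final argument. Raises subprocess.CalledProcessError
    if a run exits with a non-zero status.
    (run_on_matches is a hypothetical helper name.)
    """
    processed = []
    for name in sorted(glob.glob(pattern)):
        subprocess.run(command + [name], check=True)
        processed.append(name)
    return processed
```

For the case in the question, the call would be `run_on_matches(["xRITDecompress"], "*-C_")`, assuming the tool takes the file name as its only argument.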
You can use glob.glob or glob.iglob to get files that match the given pattern:
import glob
import os
files = glob.iglob('*-C_')
for f in files:
    os.system("xRITDecompress %s" % f)
Just use glob.glob to search and os.system to execute:
import os
from glob import glob
for file in glob('*-C_'):
    os.system("xRITDecompress %s" % file)
I hope this answers your question.

python open file matching pattern excluding substring

I need to open some files inside a folder in Python.
Say, I have the following files in the folder:
text_pbs.fna
text_pdom_fo_oo.fna
text_pdom_fo_oo_aa.fna
text_pdom_fo_oo.ali
text_pdom_ba_ar.fna
text_pdom_ba_ar_aa.fna
text_pdom_ba_ar.ali
text_pdom_ba_az.fna
text_pdom_ba_az_aa.fna
text_pdom_ba_az.ali
I want to open:
text_pdom_fo_oo.fna
text_pdom_ba_ar.fna
text_pdom_ba_az.fna
only.
I tried with glob:
glob.glob('*_pdom_*[^aa].fna')
But it doesn't work.
Could someone point out the problem in the above pattern? Is there any other workaround for this? Many thanks.
glob does not treat ^ as negation inside brackets; use ! instead. Try this code:
import glob
glob.glob('*_pdom_*[!aa].fna')
With the files listed above, this gives (the order may vary):
['text_pdom_fo_oo.fna', 'text_pdom_ba_ar.fna', 'text_pdom_ba_az.fna']
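Note that [!aa] is a character class, equivalent to [!a]: it rejects any name whose last character before .fna is an a, not specifically the _aa suffix. A hypothetical file such as text_pdom_ba_la.fna would be skipped as well. If only the _aa files should be excluded, one option is to glob broadly and filter in Python. This is a sketch that assumes the file layout from the question; pdom_fna_files is a hypothetical helper name.

```python
import glob
import os


def pdom_fna_files(directory="."):
    """Return *_pdom_*.fna files whose names do not end in '_aa.fna'.

    (pdom_fna_files is a hypothetical helper name.)
    """
    pattern = os.path.join(directory, "*_pdom_*.fna")
    return sorted(
        path for path in glob.glob(pattern)
        if not path.endswith("_aa.fna")
    )
```

Filtering with str.endswith keeps the exclusion rule explicit, which is easier to adjust than a longer glob pattern.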

open file in python without file's full name?

I am trying to execute f = open('filename') in python.
However, I don't know the full name of the file. All I know is that it starts with 's12' and ends with '.ka'. I know the folder where it's located, and I know it is the only file in that folder whose name starts with 's12' and ends with '.ka'. Is there a way to do this?
Glob is your friend:
from glob import glob
filename = glob('s12*.ka')[0]
Careful though: glob returns a list of all files matching the pattern, so you may want to verify that you actually got the file you want. If nothing matches, the list is empty and indexing it with [0] raises an IndexError.
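One way to make that check explicit is to require exactly one match before opening the file. This is a minimal sketch; find_single is a hypothetical helper name, and raising ValueError is an arbitrary choice for illustration.

```python
from glob import glob


def find_single(pattern):
    """Return the one path matching `pattern`; raise if there are 0 or 2+.

    (find_single is a hypothetical helper name.)
    """
    matches = glob(pattern)
    if len(matches) != 1:
        raise ValueError(
            f"expected exactly one match for {pattern!r}, found {len(matches)}"
        )
    return matches[0]
```

The file can then be opened with `f = open(find_single('s12*.ka'))`.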
