Limit os.walk to look only at files and not directories - python

I have a little question: I have a directory which contains directories (with files in them) and files. Can I use os.walk to treat the top-level files one by one, but not the files inside the subdirectories?
Thank you for your answers

Do you want to only list the files in the highest-level directory without going into sub-dirs? os.listdir should do it for you.
You can easily add a check to skip dirs this way:
import os

for f in os.listdir(path):
    if os.path.isdir(os.path.join(path, f)):  # skip directories
        continue
    print(f)

What about
os.walk("/path/to/dir").next()[2]

Well, for example I have a directory like this:
boite_noire/
.....helloworld/
.....test1.txt
.....test2.txt
I would like something like this at the end of the script:
boite_noire/
.....helloworld/
.....test1/
.....test2/
And in the test1 directory I put test1.txt, and the same for test2.
I tried listdir but without success, and yes, os.walk().next()[2] should be a good idea, because my problem is that when I run my script, os.walk scans the directories and the files inside them, and I don't want that; I only want it to scan the files at the source.
My code with os.walk and my code with os.listdir were posted as screenshots (not reproduced here).
I think there is something easier for both, but I don't know what :/
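If the goal is really to move each top-level file into a new directory named after it (test1.txt into test1/, and so on), a minimal sketch with os.listdir and shutil might look like this; boite_noire comes from the example above, everything else is an assumption:
import os
import shutil

src = "boite_noire"
for name in os.listdir(src):
    full = os.path.join(src, name)
    if not os.path.isfile(full):      # skip helloworld/ and any other directory
        continue
    stem, _ = os.path.splitext(name)  # "test1.txt" -> "test1"
    target_dir = os.path.join(src, stem)
    os.makedirs(target_dir, exist_ok=True)
    shutil.move(full, os.path.join(target_dir, name))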

Related

os.walk isn't showing all the files in the given path

I'm trying to make my own backup program, but to do so I need to be able to give a directory and get every file that is somewhere deep down in subdirectories, so that I can copy them. I tried making a script, but it doesn't give me all the files that are in that directory. I used Documents as a test and my list has 3600 items, but the number of files should be 17000. Why isn't os.walk showing everything?
import os

data = []
for mdir, dirs, files in os.walk('C:/Users/Name/Documents'):
    data.append(files)
print(data)
print(len(data))
Use data.extend(files) instead of data.append(files).
files is a list of files in a directory. It looks like ["a.txt", "b.html"] and so on. If you use append, you end up with data looking like
[..., ["a.txt", "b.html"]]
whereas I suspect you're after
[..., "a.txt", "b.html"]
Using extend will provide the second behaviour.
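Putting the two together, a corrected version of the script from the question might look like this (same path as in the question):
import os

data = []
for mdir, dirs, files in os.walk('C:/Users/Name/Documents'):
    data.extend(files)  # add the file names themselves, not one list per directory
print(len(data))        # counts individual files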

How to count the number of files with a specific extension in all folders and subfolders

Quick question. I have a folder with 20 sub-folders. I want to count all the files with a .xlsx extension in all the folders and sub-folders, etc. I need to use os.walk to make sure my code literally walks through every folder/sub-folder possible. This is the code I have right now. However, I get an Invalid Syntax error.
a = os.getcwd()
list1 = []
for root, dirs, files in os.walk(a):
    for file in files:
        if file.endswith('.txt'):
            list1 = (os.path.join(root, file)
a = sum([len(list1)])
print(a)
Does someone maybe have easier or prettier code to fix this problem?
You have one parenthesis missing and you need to append the path to the list.
So I would try something like:
import os

a = os.getcwd()
list1 = []
for root, dirs, files in os.walk(a):
    for file in files:
        if file.endswith('.txt'):
            list1.append(os.path.join(root, file))
print(len(list1))
(You guys answered while I was working on my contribution; I still post it in case it can help other users understand the issues in the question post.)
Your approach using os.walk is, to me, the right one. However, I found a few issues with your code:
after entering the if statement, you have a (useless) opening parenthesis that is never closed;
you need to use the list method .append to add the path of each .xlsx file to the list; what you are doing right now is replacing the content of list1 on every iteration;
you don't need to apply sum() to len(), as len() already returns the number of elements in your list, and you need to run len() outside your loops, after your list is complete.
This all results in the code given as an answer by @rbeucher.
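As a side note, pathlib can do the recursive walk and the extension filter in one go; a minimal sketch (counting .xlsx files, as mentioned in the question text; swap in '.txt' to match the posted code):
from pathlib import Path

# rglob("*.xlsx") walks every sub-folder below the current working directory
count = sum(1 for p in Path.cwd().rglob("*.xlsx") if p.is_file())
print(count)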

Iterating over .wav files in subdirectories of parent directory

Cheers everybody,
I need help with something in Python 3.6 exactly. So I have a structure of data like this:
|main directory
| |subdirectory's(plural)
| | |.wav files
I'm currently working from the directory where the main directory is placed, so I don't need to specify paths before that. So firstly, I want to iterate over my main directory and find all subdirectories. Then in each of them I want to find the .wav files, and when done processing them I want to go to the next subdirectory, and so on until all of them are opened and all .wav files are processed. What exactly I want to do with those .wav files is input them into my program and process them so I can convert them to numpy arrays, and then convert each numpy array into some other object (working with TensorFlow, to be exact, and I want to convert to a TF object). I wrote about the whole process in case anybody has any quick advice on doing that too, so why not.
I tried doing it with for loops like:
for subdirectorys in open(data_path, "r"):
    for files in subdirectorys:
        # doing some processing stuff with the file
The problem is that it always raises error 13, Permission denied, on the data_path I gave it, but when I go to the properties there, everything seems okay and all permissions are fine.
I tried some other ways, like with os.open, or I replaced the for loop with:
with open(data_path, "r") as data:
and it always raises the permission denied error.
os.walk works in some way, but it's not what I need, and when I tried to modify it, it didn't give errors, but it also didn't do anything.
Just to say, I'm not any kind of pro programmer in Python, so I may be missing an obvious thing, but ehh, I'm here to ask and learn. I also saw a lot of similar questions, but they mainly focus on .txt files and not specifically on my case, so I need to ask it here.
Anyway, thanks for the help in advance.
Edit: If you want an example for glob (more sane), here it is:
from pathlib import Path

# The pattern "**" means all subdirectories recursively,
# with "*.wav" meaning all files with any name ending in ".wav".
for file in Path(data_path).glob("**/*.wav"):
    if not file.is_file():  # Skip directories
        continue
    with open(file, "rb") as f:  # "rb": read the .wav data as bytes
        pass  # do stuff
For more info see Path.glob() in the documentation. Glob patterns are a useful thing to know.
Previous answer:
Try using either glob or os.walk(). Here is an example for os.walk().
from os import walk, path

# Recursively walk the directory data_path
for root, _, files in walk(data_path):
    # files is a list of files in the current root, so iterate over them
    for file in files:
        # Skip the file if it is not *.wav
        if not file.endswith(".wav"):
            continue
        # os.path.join() will create the full path for the file
        file = path.join(root, file)
        # Do what you need with the file
        # You can also use a context manager to open the files, like this
        with open(file, "rb") as f:  # "rb" reads in binary mode; use "wb" to write
            pass  # Do stuff
Note that you may be confused about what open() does. It opens a file for reading, writing, and appending. Directories are not files, and therefore cannot be opened.
I suggest that you Google for documentation and do more reading about the functions used. The documentation will help more than I can.
Another good answer explaining in more detail can be seen here.
import glob
import os
main = '/main_wavs'
wavs = [w for w in glob.glob(os.path.join(main, '*/*.wav')) if os.path.isfile(w)]
In terms of permissions on a path A/B/C... A, B and C must all be accessible. For files that means read permission. For directories, it means read and execute permissions (listing contents).
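A small illustrative sketch of checking that from Python with os.access (data_path is the directory from the question; this is only a diagnostic aid, not part of the answer above):
import os

# Walk up from data_path and report whether each level is accessible:
# read permission for files, read + execute for directories.
p = os.path.abspath(data_path)
while True:
    needed = os.R_OK if os.path.isfile(p) else os.R_OK | os.X_OK
    print(p, "ok" if os.access(p, needed) else "MISSING PERMISSION")
    parent = os.path.dirname(p)
    if parent == p:  # reached the filesystem root
        break
    p = parent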

how to loop through certain directories in a directory structure in python?

I have the following directory structure
As you can see in the pics, there are a lot of .0 files in different directories. This directory structure exists for 36 folders (Human_C1 to C36), and each Human_C[num] folder has a 1_image_contours folder, which has a contours folder with all the related .0 files.
These .0 files contain some coordinates (x, y). I wish to loop through all these files, take the data in them and put it in an Excel sheet (I am using pandas for this).
The problem is: how do I loop through only this set of files and no others? (There can be .0 files in contour_image folders also.)
Thanks in advance
Since your structure is not recursive I would recommend this:
import glob
zero_files_list = glob.glob("spinux/generated/Human_C*/*/contours/*.0")
for f in zero_files_list:
print("do something with "+f)
Run it from the parent directory of spinux or you'll have no match!
It will expand the pattern for the fixed directory tree above, just as if you used ls or echo in a Linux shell.
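If the depth between Human_C* and contours ever varies, recursive globbing is a possible alternative; a minimal sketch under that assumption (same spinux/generated layout):
import glob

# With recursive=True, "**" matches any number of intermediate directories (including none)
zero_files_list = glob.glob("spinux/generated/Human_C*/**/contours/*.0", recursive=True)
for f in zero_files_list:
    print("do something with " + f)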

Find a file in a tree in python-3.4 on windows

I want to search a file in a tree.
I know the filename and the root of the tree, but I want to find the file path. I use python-3.4 on windows 7.
I have:
# I have a list with C header file names => my_headers_list, which looks like:
for part in my_headers_list:
    print(part)
# shows this:
common.h
extern.h
something.h
else.h
I also have a tree (something/ is a folder):
Software/
something.c
Component1/
extern.c
extern.h
Component2/
Core/
common.h
common.c
Config/
common_m.c
common_m.h
Component3/
Managment/
Core/
Config/
etc. It's an example, but it's really close to my real case.
I want:
# a list like the first one, but with the full path of the files
for part in my_headers_list:
    print(part)
# shows this:
Software/Component2/Core/common.h
Software/Component1/extern.h
Software/Component3/Core/something.h
Software/Component3/Management/else.h
To be precise, it should be generic, so I can't hard-code any links or paths.
For the moment I've tried some tricky scripts with os.listdir etc., but it seems messy and doesn't always work.
# I tried something like this
dirs = os.listdir("Software/")
for part in dirs:
    # don't know how to correctly manage this information to get what I want
Do you guys have some way to do this?
Keep in mind that I don't want to add any lib or plugin to my python.
Thank you,
Regards,
Question resolved in comments.
I'll use os.walk / glob.
Thanks to @kevin and @Lafexlos.
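For the record, a minimal sketch of the os.walk approach mentioned above, building a name-to-path lookup for the headers (the header names and the Software/ root come from the question; everything else is illustrative):
import os

my_headers_list = ["common.h", "extern.h", "something.h", "else.h"]
found = {}
for root, dirs, files in os.walk("Software"):
    for name in files:
        if name in my_headers_list and name not in found:
            found[name] = os.path.join(root, name)

for part in my_headers_list:
    print(found.get(part, part + " (not found)"))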
