I have around 1500 folders, each containing relevant data and some other irrelevant data.
Every folder is in my data directory. For example, one of those folders, 'folder_00', contains an irrelevant folder, CSV files (the actual data) and other CSV files.
Now I am trying to iterate over my data folder, enter each folder, count the data files, copy them to another path, and rename the copies by their folder name and file number.
For now I am just trying to count the main folders of the data (like data/folder_00, but not the folder inside folder_00) and count the data files inside each folder.
dir_count = 0
file_count = 0
for subdir, dirs, files in os.walk(src_path):
    for dir_name in dirs:
        print(os.path.join(src_path, dir_name))
        dir_count = dir_count + 1
    print(dir_count)
    for file_name in files:
        if file_name.startswith("relevant_keyword") and not file_name.startswith("irrelevant_keyword"):
            file_count += 1
            print(file_name)
    print(file_count)
I tried to debug this code, but it still doesn't work the way I want it to.
It seems like the loop only accesses one folder and counts the data. It starts at 200 for some reason but counts correctly.
The for loop where I want to count the folders doesn't work correctly either. Something is really wrong here.
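One likely cause is that os.walk visits every directory recursively, so dirs also includes the nested folders, and the counters are never reset per folder. A minimal sketch of a non-recursive count follows; the function name count_relevant_files and its keyword parameter are hypothetical, and "relevant_keyword" is the placeholder from the question.

```python
import os


def count_relevant_files(src_path, keyword="relevant_keyword"):
    """Return {folder_name: number of matching files} for top-level folders only."""
    counts = {}
    for folder_name in sorted(os.listdir(src_path)):
        folder_path = os.path.join(src_path, folder_name)
        if not os.path.isdir(folder_path):
            continue  # skip loose files lying directly in the data directory
        # Count only files directly inside this folder, not in nested folders
        counts[folder_name] = sum(
            1
            for file_name in os.listdir(folder_path)
            if file_name.startswith(keyword)
            and os.path.isfile(os.path.join(folder_path, file_name))
        )
    return counts
```

With this, len(counts) is the number of main folders and counts["folder_00"] is the number of data files in that folder.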
Related
I'm new to Python and having some trouble looping all the files in my directory.
I am trying to import data from all Excel files from all of the subfolders I have in one single directory. For example, I have a directory named "data" which has five different subfolders and each subfolder contains Excel files from which I want to extract data.
I guess my current code is not working because it just loops all the files in a directory without considering the subfolders. How do I modify my current code to extract data from all the subfolders in my directory?
data_location = "data/"
for file in os.listdir(data_location):
    df_file = pd.read_excel(data_location + file)
    df_file.set_index(df_file.columns[0], inplace=True)
    selected_columns = df_file.loc["Country":"Score", :'Unnamed: 1']
    selected_columns.dropna(inplace=True)
    df_total = pd.concat([selected_columns, df_total], ignore_index=True)
Also, I've been trying to create a new variable using each file name as I import them. For example, if there are 5 files (file1 to file5) in a directory, I want to create a new variable called "Source" whose values would be file1, file2, file3, file4, file5. I want Python to append this value to the new variable as it imports each file in the loop. Could anyone please help me with this?
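To go through subdirectories recursively, try something like this:
import os
import pandas as pd
data_location = 'C:/path/to/data'
for subdir, dirs, files in os.walk(data_location):
    for file in files:
        # subdir is the folder currently being walked, so build the path from it
        df_file = pd.read_excel(os.path.join(subdir, file))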
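For the "Source" part of the question, here is a rough sketch that uses only the standard library to pair each Excel file with its file name. The helper name find_excel_files and the extensions tuple are assumptions for illustration, not part of the original answer.

```python
import os


def find_excel_files(data_location, extensions=(".xlsx", ".xls")):
    """Yield (full_path, source_name) for every Excel file under data_location."""
    for subdir, dirs, files in os.walk(data_location):
        dirs.sort()  # visit subfolders in a predictable order
        for file in sorted(files):
            name, ext = os.path.splitext(file)
            if ext.lower() in extensions:
                # subdir is the folder currently being visited, so join with it
                yield os.path.join(subdir, file), name
```

Inside the import loop, each full_path can be passed to pd.read_excel and source_name can be written to a new "Source" column of that file's DataFrame before it is concatenated into df_total.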
I'm trying to combine all .txt files in a directory and output them after the last one.
I have the following filesystem structure:
objects
    object1
        attribute1.txt
        attribute2.txt
        attribute3.txt
    object2
        attribute1.txt
        attribute2.txt
        attribute3.txt
    and so on...
I've looked up this code:
for subdir, dirs, files in os.walk(rootdir):
    for file in files:
        # collect the information
I am looking for a for loop like
for subdir, dirs, files in os.walk(rootdir): # <- what do I need to change here?
    for object in objects: # <- how to implement this line correctly?
        for file in files:
            # collect the information
        print(information)
But I have no idea how to do that, since I am very new to Python.
EDIT:
Python concatenate text files does not answer my question, since there is not an actual loop but only an Array with file names.
The first loop you show will already go through all the files, and the path to the folder containing them is stored in subdir.
If you run:
for subdir, dirs, files in os.walk(rootdir):
    block = ""
    for file in files:
        block += subdir + "/" + file
    print(block)
You'll see that you get all your files. So now, instead of the print call, put the code that reads the files and stores the content in a variable (you should read each file using the path subdir+"/"+file).
You need to store the output of your for loop somewhere. This is an important thing to learn =)
my_files = []  # empty list
for file in directory:
    my_files.append(file)  # each iteration adds to the list
print(my_files)
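Putting both answers together, a minimal sketch could collect the text of every object folder into a dictionary. The function name collect_text is hypothetical, and the sketch assumes the files are small enough to read into memory.

```python
import os


def collect_text(rootdir):
    """Return {object_folder_name: combined contents of its .txt files}."""
    combined = {}
    for subdir, dirs, files in os.walk(rootdir):
        txt_files = sorted(f for f in files if f.endswith(".txt"))
        if not txt_files:
            continue  # e.g. the root folder, which only holds subfolders
        parts = []
        for file in txt_files:
            # The file lives in subdir, so the path must be built from it
            with open(os.path.join(subdir, file)) as handle:
                parts.append(handle.read())
        combined[os.path.basename(subdir)] = "".join(parts)
    return combined
```

Each os.walk iteration already corresponds to one object folder, so no separate "for object in objects" loop is needed; the combined text for object1 is then available as collect_text(rootdir)["object1"].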
I use os.rename to rename files and move them about, but I am failing at the following task.
I have a main folder containing sub-folders with the structure below.
The main folder "Back" contains sub-folders named with letters and numbers, e.g. A01, A02, B01, B02, etc. Inside each of those folders is a set of files, among them a file called "rad", so the file paths look something like this:
Back/A01/rad
/A02/rad
/B01/rad
.../rad
I have another sub-folder called "rads" inside the main "Back"
Back/rads
What I want to do is copy only the rad files from each of the folders in Back into the folder "rads", and name each rad file based on the folder it came from,
e.g. rad_A01, rad_A02, rad_B01, etc.
I couldn't really figure out how to increase the folder number when I move the files.
os.rename ("Back//rad, Back/rads/rad_")
I thought of making a list of all the names of the files and then doing something like for x in list, do os.rename(), but I didn't know how to tell Python to name the file according to the subfolder it came from, as they are not a continuous series.
Any help is appreciated.
Thanks in advance
import os
for subdir, dirs, files in os.walk('./Back/'):
    for file in files:
        filepath = subdir + os.sep + file
        if filepath.endswith("rad.txt"):
            par_dir = os.path.split(os.path.dirname(filepath))[1]
            os.system('cp ' + filepath + ' ./Back/rads/rad_' + par_dir)
Save this Python file beside the Back directory and it should work fine.
This code iterates over every file in each subdirectory of Back, picks out the files named rad.txt, appends the name of the parent directory, and copies each one to the rads folder.
Note: I saved the rad files with a .txt extension; change it accordingly.
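As a portable alternative to calling cp through os.system (which is not available on Windows), the same idea can be sketched with shutil.copy. The function name collect_rad_files and its parameters are hypothetical, and the sketch assumes the file is named exactly "rad" as in the question.

```python
import os
import shutil


def collect_rad_files(back_dir, file_name="rad", target_name="rads"):
    """Copy <back_dir>/<folder>/<file_name> to <back_dir>/<target_name>/<file_name>_<folder>."""
    target_dir = os.path.join(back_dir, target_name)
    os.makedirs(target_dir, exist_ok=True)
    copied = []
    for folder in sorted(os.listdir(back_dir)):
        source = os.path.join(back_dir, folder, file_name)
        # Skip the target folder itself and any folder without a rad file
        if folder == target_name or not os.path.isfile(source):
            continue
        destination = os.path.join(target_dir, "{}_{}".format(file_name, folder))
        shutil.copy(source, destination)
        copied.append(destination)
    return copied
```

Because the new name is built from the folder name itself, it does not matter that the folder names are not a continuous series.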
I have files like:
00001.jpg
00002.jpg
.
.
.
01907.jpg
I want to add some files to this directory which are named the same. But their names should continue like
01908.jpg
01909.jpg
.
.
12906.jpg
I couldn't manage to do that. How can I make this happen?
Thanks a lot:)
I tried
import os
files = []
files = sorted(os.listdir('directory'))
b = len(files)
for i in range(0, b):
    a = files[i]
    os.rename(a, (a + 1))
print(files)
You have a source dir (which contains the badly/identically named files) and a target dir (which contains files that should not be overwritten).
I would:
list the target dir and sort it like you did (the rest of your attempt is clearly off...)
get the last item and parse its name (without extension) as an integer: adding 1 gives the next free index.
loop over the source dir
generate a new name for the current file using the newly computed index
use shutil.move or shutil.copy to move/copy the files under their new names
like this:
import os, shutil
s = "source_directory"
d = "target_directory"
files = sorted(os.listdir(d))
# Number of the last file in the target, plus one: the next free index
highest_index = int(os.path.splitext(files[-1])[0]) + 1
for i, f in enumerate(sorted(os.listdir(s)), highest_index):
    new_name = "{:05}.jpg".format(i)  # .jpg to match the files in the question
    shutil.copy(os.path.join(s, f), os.path.join(d, new_name))
You can do this:
import os
directory1 = 'path to the directory you want to move the files to'
directory2 = 'path to the directory you want to move the files from'
# Count the existing files once, before moving anything, so the offset stays fixed
counter = len(os.listdir(directory1))
for file in sorted(os.listdir(directory2)):
    name, ext = os.path.splitext(file)
    file_number = int(name)  # Get the file number
    new_name = '{:05}{}'.format(file_number + counter, ext)
    os.rename(os.path.join(directory2, file), os.path.join(directory1, new_name))
What I have done:
Counted the files already in the main directory (the one the files are moved to) once, before the loop. I assumed this count equals the number in the name of the last file there; it is taken before anything is moved so that the offset stays fixed and the new names continue without gaps or overwrites.
Looped over the files that I wanted to rename and move.
Got the number of the current file in the loop and added the offset to it.
Finally, used os.rename to rename and move the file from the source directory to the main directory.
I'm trying to export all of my maps that are in my subdirectories.
I have the code to export, but I cannot figure out where to add the loop that will make it do this for all subdirectories. As of right now, it is exporting the maps in the directory, but not the subfolders.
import arcpy, os
arcpy.env.workspace = ws = r"C:\Users\162708\Desktop\Burn_Zones"
for subdir, dirs, files in os.walk(ws):
    for file in files:
        mxd_list = arcpy.ListFiles("*.mxd")
        for mxd in mxd_list:
            current_mxd = arcpy.mapping.MapDocument(os.path.join(ws, mxd))
            pdf_name = mxd[:-4] + ".pdf"
            arcpy.mapping.ExportToPDF(current_mxd, pdf_name)
del mxd_list
What am I doing wrong that it isn't able to iterate through the subfolders?
Thank you!
Iterating over the result of os.walk yields tuples of (path, dirs, files); the first element is the directory currently being visited, which contains the files (which is why I tend to name it path). The current directory does not change automatically, so you need to incorporate path into the pattern you give to arcpy.ListFiles, like this:
arcpy.ListFiles(os.path.join(path, "*.mxd"))
You should also remove the loop for file in files. It seems you are exporting the files per directory, so why export the whole directory again for each file?
Also, you should change arcpy.mapping.MapDocument(os.path.join(ws, mxd)) to arcpy.mapping.MapDocument(os.path.join(path, mxd)), where path is again the first element from os.walk.
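The path handling described here can be sketched with the standard library alone. The helper name find_map_documents is hypothetical, and the arcpy calls are left out on purpose so that the sketch only shows how the full paths are built from the first element of each os.walk tuple.

```python
import fnmatch
import os


def find_map_documents(workspace, pattern="*.mxd"):
    """Return (mxd_path, pdf_path) pairs for every match under workspace."""
    pairs = []
    for path, dirs, files in os.walk(workspace):
        # files holds the names in the folder currently being visited (path)
        for name in sorted(fnmatch.filter(files, pattern)):
            mxd_path = os.path.join(path, name)
            pdf_path = os.path.splitext(mxd_path)[0] + ".pdf"
            pairs.append((mxd_path, pdf_path))
    return pairs
```

Each pair could then be handed to the arcpy.mapping.MapDocument and arcpy.mapping.ExportToPDF calls from the question, so that every PDF lands next to its map document.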