For loop over multiple files in a folder - python

I'm attempting to extract a single value from each file contained in folder A. The code runs without throwing an error but returns an empty array for Mfinal. Does anyone see where things might be going wrong?
Mfinal=[]
path = r'C:Desktop/thesis/hrfiles/A'
all_files = glob.glob(path + '/*.csv')
for filename in all_files:
    df=pd.dataframe(filename)
    mass=df[9]
    m=mass[-1]
    Mfinal.append(m)

Even if m were None, Mfinal could not be empty if the for loop ran at least once without errors.
So the reasonable suspicion here is that all_files is empty,
that is, glob.glob finds nothing.
If you are on Windows, try
path = r'C:\Desktop\thesis\hrfiles\A'
all_files = glob.glob(path + r'\*.csv')

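I agree with ghchoi on the initial path here. I'm fairly sure there needs to be a \ following C:. Without it, C:Desktop is resolved relative to the current directory on drive C: rather than from the drive root.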
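A minimal sketch of the diagnosis above, assuming the goal is the last value of one column in each CSV file. It raises an error when the glob pattern matches nothing, instead of silently returning an empty list. The helper name last_values is hypothetical, and the standard-library csv module is used here in place of pandas only to keep the example self-contained.

```python
import csv
import glob
import os


def last_values(folder, column):
    """Return the value in `column` (0-based) from the last row of each CSV in `folder`."""
    pattern = os.path.join(folder, '*.csv')
    all_files = sorted(glob.glob(pattern))
    if not all_files:
        # An empty match is the most likely reason for an empty result list.
        raise FileNotFoundError(f'no files match {pattern!r}')
    results = []
    for filename in all_files:
        with open(filename, newline='') as handle:
            rows = [row for row in csv.reader(handle) if row]
        if rows:  # skip files that contain no rows at all
            results.append(rows[-1][column])
    return results
```

Printing all_files (or its length) right after the glob call is the quickest way to confirm whether the path is the problem.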

Related

How to count the number of files with a specific extension in all folders and subfolders

Quick question. I have a folder with 20 sub-folders. I want to count all the files with a .xlsx extension in all the folders, sub-folders, and so on. I need to use os.walk to make sure my code literally walks through every possible folder and sub-folder. This is the code I have right now. However, I get an "invalid syntax" error.
a = os.getcwd()
list1 = []
for root, dirs, files in os.walk(a):
    for file in files:
        if file.endswith('.txt'):
            list1 = (os.path.join(root, file)
            a = sum([len(list1)])
            print(a)
Does someone maybe have an easier or cleaner way to fix this problem?
You have one parenthesis missing and you need to append the path to the list.
So I would try something like:
import os

a = os.getcwd()
list1 = []
for root, dirs, files in os.walk(a):
    for file in files:
        # change '.txt' to '.xlsx' to count Excel files instead
        if file.endswith('.txt'):
            list1.append(os.path.join(root, file))
print(len(list1))
(You answered while I was working on my contribution. I am still posting it in case it can help other users understand the issues in the question.)
Your approach using os.walk is, to me, the right one. However, I found a few issues with your code:
after entering the if statement, you have a (useless) opening parenthesis that is never closed;
you need to use the list method .append to add the path of your .xlsx file to the list; what you are doing right now is replacing the content of list1 for each file;
you don't need to apply sum to len, since len() already returns the number of elements in your list, and you need to call len outside your loops, after the list is complete.
This all results in the code given as an answer by @rbeucher.
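The three points above can be folded into one small function. This is only a sketch: the name count_files and the case-insensitive extension match are assumptions, not part of the original answers.

```python
import os


def count_files(top, extension):
    """Count files under `top` (recursively) whose name ends with `extension`."""
    matches = []
    for root, dirs, files in os.walk(top):
        for name in files:
            # Compare in lower case so '.XLSX' and '.xlsx' both match.
            if name.lower().endswith(extension.lower()):
                matches.append(os.path.join(root, name))
    # len() is called once, after the walk has finished.
    return len(matches)
```

For the question as asked, the call would be count_files(os.getcwd(), '.xlsx').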

Python makes additional folder

I have a simple question, but I can't really solve it. I have the following code in a groupby loop. For each file, Python creates the folders \user\Desktop\CO\Sites in my destination folder, but I just want to find the existing path and put my zip file there, not create it again.
Can you please advise?
for n,g in groupby:
    csv=g.to_csv(index=False)
    filename = '{}{}'.format(r'C:/Users/Desktop/CO/Sites/Site_',n)
    os.chdir(r'C:\Users\Desktop')
    filename_csv = filename + '_Co_'+ '.csv'
    filename_zip = filename + '_Co_' +'.zip'
    with open(filename_csv,'w') as out_file:
        out_file.write(csv)
    zip_all_zips.append(filename_zip)
    zip_all_csvs.append(filename_csv)
Maybe you can check whether the directory is present, and create it only if it is not:
if not os.path.exists("filepath/"):
    os.makedirs("filepath/")
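As a rough sketch of that suggestion, os.makedirs with exist_ok=True creates the directory only when it is missing, so an explicit existence check is not needed. The helper name write_site_csv and its parameters are hypothetical; the file naming follows the pattern in the question.

```python
import os


def write_site_csv(dest_dir, site, csv_text):
    """Write `csv_text` to dest_dir/Site_<site>_Co_.csv, reusing dest_dir if it exists."""
    # Does nothing if dest_dir is already there; creates it otherwise.
    os.makedirs(dest_dir, exist_ok=True)
    filename_csv = os.path.join(dest_dir, 'Site_{}_Co_.csv'.format(site))
    # newline='' writes the text exactly as given, without newline translation.
    with open(filename_csv, 'w', newline='') as out_file:
        out_file.write(csv_text)
    return filename_csv
```

Building the full path with os.path.join also removes the need for os.chdir inside the loop.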

How to handle long path with spaces in Windows with Python

In the following code, I need to iterate through files in a directory with long names and spaces in paths.
def avg_dmg_acc(path):
    for d in os.listdir(path):
        sub_path = path + '/' + d
        if os.path.isdir(sub_path):
            if d.startswith('Front'):
                for f in os.listdir(sub_path):
                    fpath = r"%s" % sub_path + '/' + f
                    print(fpath)
                    print(os.path.exists(fpath))
                    df = pd.read_csv(fpath)
Then I ran the function providing the argument path:
path = r"./Mid-Con Master dd3d5c56-581c-42e0-acde-04e7feed3bb8/620138 91852327-e08d-4ed1-9774-383c888cb04e/Power End 2d41ba63-dfb9-4984-a5a5-153997fea43a"
avg_dmg_acc(path)
However, I am getting a "file does not exist" error:
File b'./Mid-Con Master dd3d5c56-581c-42e0-acde-04e7feed3bb8/620138 91852327-e08d-4ed1-9774-383c888cb04e/Power End 2d41ba63-dfb9-4984-a5a5-153997fea43a/Front c41f42ce-7158-4371-8cf6-82d1bcf04787/Damage Accumulation f907a97a-6d2d-40f6-ba02-0bc0599b773b.csv' does not exist
As you can see, I am already using r"path" since I read it somewhere it handles spaces in path. Also the path was constructed manually in this version, e.g. sub_path = path + '/' + d but I tried to use os.path.join(path, d) originally and it didn't work. I also tried Path from pathlib since it is the recommended way in Python 3 and still the same. At one point I tried to use os.path.abspath instead of the relative path I am using now with ./ but it still says file not exist.
Why is it not working? Is it because the path is too long or spaces are still not dealt with correctly?
It turns out it is the length of the path that is causing this problem. I shortened the name of the lowest-level folder one character at a time and reached the point where os.path.exists(fpath) changed from False to True. I think I will need to rename all the folders before processing.
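If renaming the folders is impractical, and assuming the failure really is the classic Windows path-length limit (MAX_PATH, 260 characters), one commonly used workaround is to pass an absolute path with the extended-length \\?\ prefix. The sketch below only builds such a string; to_extended_path is a hypothetical helper, it has not been tried against the data in the question, and the input must already be an absolute Windows path.

```python
import ntpath


def to_extended_path(abs_path):
    """Return `abs_path` with the Windows extended-length prefix added."""
    prefix = '\\\\?\\'
    if abs_path.startswith(prefix):
        return abs_path  # already in extended-length form
    if not ntpath.isabs(abs_path):
        raise ValueError('expected an absolute Windows path')
    # The prefix disables normalization by Windows, so normalize here:
    # this also converts forward slashes to backslashes.
    normalized = ntpath.normpath(abs_path)
    if normalized.startswith('\\\\'):
        # UNC share: \\server\share\... becomes \\?\UNC\server\share\...
        return prefix + 'UNC\\' + normalized[2:]
    return prefix + normalized
```

On Windows, a relative path such as the one in the question would first go through os.path.abspath, and the result of to_extended_path would then be handed to os.path.exists or pd.read_csv.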

How does one rename multiple files using python?

I have lots of programming experience but this is my first python script. I am trying to add the prefix "00" to all the files in a specific folder. First I read the names of all the files and save them in an array. Then I sort through the array and add the prefix "00" then use the os.rename function but somewhere along the way I've messed up something.
import sys, os
file_list = []
for file in os.listdir(sys.argv[1]):
file_list.append(file)
for i in file_list:
file_list[i] = prevName
newName = '00' + file_list[i]
os.rename(prevName, newName)
I have a .py file in the folder with all the files I want to rename. The .py file contains the script above. When I double-click the .py file, a cmd window flashes and disappears, and none of the file names have been changed. Any help would be appreciated. Sorry if this is a very obvious mistake; my Python level is quite n00b at the moment.
In addition to the answer by @Padraic, also make the following changes to your code.
import sys, os
file_list = []
for f in os.listdir(sys.argv[1]):
    file_list.append(f)
for i in range(len(file_list)):
    prevName = file_list[i]
    if prevName != 'stackoverflow.py': # Mention the .py file so that it doesn't get renamed
        newName = '00' + file_list[i]
        os.rename(prevName, newName)
Check your indentation. The second for loop is not indented correctly.
for i in file_list:
    file_list[i] = prevName
You are not iterating correctly. for loops in Python are like foreach loops you may know from other programming languages. i in for i in file_list actually gives you the list's elements, so you should be doing
for i in range(len(file_list)):
    file_list[i] = ......
although it is not very pythonic nor generally a good idea to modify the collection that you're currently iterating over.
Your code fails because you provide no arguments, so sys.argv[1] raises an IndexError. You need to call the script with the directory name from a cmd prompt instead of double-clicking it:
python your_script directory <- argv[1]
Or change the code to specify the path directly. You also need to join the path to the filename.
import os

path = "full_path"
for f in os.listdir(path):
    # build full paths so the rename works from any working directory
    curr,new = os.path.join(path,f), os.path.join(path,"00{}".format(f))
    os.rename(curr,new)
os.listdir returns a list so just iterate over that, you don't need to create a list and append to it.
for i in file_list: would also make each i a filename not an index so that would cause another error but as above you don't need to do it anyway.
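Putting the answers together, a hedged sketch of the rename step might look like this. The function name add_prefix and its skip parameter are assumptions introduced for illustration; skip is there so the script itself is not renamed, and sub-directories are left alone.

```python
import os


def add_prefix(folder, prefix='00', skip=()):
    """Prepend `prefix` to every regular file in `folder`, except names in `skip`."""
    renamed = []
    # os.listdir returns a list, so it is a snapshot taken before any rename.
    for name in sorted(os.listdir(folder)):
        old_path = os.path.join(folder, name)
        if name in skip or not os.path.isfile(old_path):
            continue  # leave the script and any sub-directories untouched
        new_path = os.path.join(folder, prefix + name)
        os.rename(old_path, new_path)
        renamed.append(prefix + name)
    return renamed
```

A call such as add_prefix(sys.argv[1], skip=('stackoverflow.py',)) would match the usage discussed above, with the directory still supplied on the command line.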

Why would Python think a file doesn't exist when I think it does?

I'm trying to import some files to plot, and all was going well until I moved my program to the directory above where it was before. The relevant piece of code that seems to be problematic is below:
import os
import pandas as pd
path = os.getcwd() + '/spectrum_scan/'
files = os.listdir(path)
dframefiles = pd.DataFrame(files)
up = pd.read_csv(dframefiles.ix[i][0])
If I type directly into the shell os.path.exists(path) it returns True.
The first file in the directory spectrum_scan is foo.csv.
When I type os.path.exists(path + 'foo.csv') it returns True but os.path.isfile('foo.csv') returns False.
Also, asking for files and dframefiles returns everything as it should, but when the code is run I get Exception: File foo.csv does not exist.
Is there something obvious I'm missing?
You are using os.listdir(), which returns filenames without a path. You'll need to add the path to these:
files = [os.path.join(path, f) for f in os.listdir(path)]
otherwise Python will look for 'foo.csv' in the current directory, and not in the spectrum_scan sub-directory where the files are actually located.
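The difference is easy to see by checking each name both ways. This is an illustrative sketch only; compare_lookups is a hypothetical helper, not part of the original answer.

```python
import os


def compare_lookups(folder):
    """Map each name in `folder` to (found as bare name, found when joined to folder)."""
    results = {}
    for name in os.listdir(folder):
        bare = os.path.isfile(name)                          # relative to the current directory
        joined = os.path.isfile(os.path.join(folder, name))  # relative to `folder`
        results[name] = (bare, joined)
    return results
```

When the script runs from the directory above spectrum_scan, the bare lookup is False for each file while the joined lookup is True, which matches the behaviour described in the question.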
