Updating and opening filenames in a loop - python

Basically, the problem I'm having is trying to open multiple files in a for loop. The filenames have this format:
filename = 'mms1_fgm_srvy_l2_20160104_v4.18.0.cdf'
With '20160104' being the date, which I know how to update in the loop. The problem is that the '18' attached at the end isn't constant for every file, and I don't know how it changes, unlike the dates. I was wondering if there is a way to update the number and check if the file exists in my directory.
As always, any help would be greatly appreciated. Thanks.

You can use the glob.glob() function with a suitable filename pattern to get a list of files (that exist) which match the pattern.
For example:
import glob
pattern = 'mms1_fgm_srvy_l2_*_v4.*.0.cdf'
for filename in glob.glob(pattern):
    with open(filename) as file:
        process(file)
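For instance, if you already know the date and only the version number varies, you could build the pattern from the date. This is just a minimal sketch, with process() standing in for whatever you do with each file:
import glob

date = '20160104'  # the date part you already know how to update
pattern = 'mms1_fgm_srvy_l2_{}_v4.*.0.cdf'.format(date)
matches = glob.glob(pattern)
if matches:
    for filename in matches:
        with open(filename) as file:
            process(file)  # stand-in for your own handling of each file
else:
    print('No file found for', date)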

import os
BASE_NAME = 'mms1_fgm_srvy_l2_20160104_v4.{}.0'
EXT = '.cdf'
attempts = int(input('Check file up to: '))
for num in range(attempts):
    file_name = BASE_NAME.format(num) + EXT
    if os.path.isfile(file_name):
        # open file here
        print("Opened File")
    else:
        print("File does not exist")
This checks whether the file exists; if it does, you can load it and save it however you want, otherwise it prints that the file doesn't exist.


User input only works for the last file extension in a folder

I am struggling to fix a bug in my code. The variable (fext) only ever ends up checking against the last file in the folder. So if by chance the last file is a 'jpg' then my code will continue as planned. But if by chance the last file is a 'gpx' or a 'csv' then the else branch will run even though there is a 'jpg' file in the folder.
Can somebody please help me refine my code so that this work if all file types are in the folder? I am still quite new to Python and stuck on how to proceed.
Here is my code below:
import os, string
from os.path import isfile, join
file_path = input("Enter the folder link: ")
print("")
TF = False
path_it = (os.path.join(root, filename)
           for root, _, filenames in os.walk(file_path)
           for filename in filenames)

for path in path_it:
    fext = os.path.splitext(os.path.basename(path))[1]
    fname = os.path.splitext(os.path.basename(path))[0]

while True:
    file_type = input("Enter file extention (e.g. txt, wav, jpg, gpx, pdf): ")
    print(file_type)
    if file_type in fext:
        TF = True
        break
    else:
        print("\n*** There is no '" + file_type + "' file extension in this folder, please try again.\n")
Other code...
Thanks
A list comprehension is likely your best solution to get your desired result. This will store all the filetypes in the directory passed in a list.
fext = [os.path.splitext(os.path.basename(path))[1] for path in path_it]
fname = [os.path.splitext(os.path.basename(path))[0] for path in path_it]
Note that path_it as defined in the question is a generator, so it can only be iterated once; build it as a list first (for example path_it = list(path_it)) if you want to run both comprehensions.
But, you also need to make sure that the input file type matches the format. The above will give you (for example) ['.csv', '.pdf', '.gpx'], so you need to make sure that the format of the input is the same, in other words, not just 'csv' but '.csv' otherwise there will be no match.
The while loop can also be changed to while not TF; once TF is set to True the loop ends on its own, instead of breaking out with break.
The fext and fname variables should return an iterable if you are to check against all extensions contained within the folder. Try the following list comprehensions.
fext = [os.path.splitext(os.path.basename(path))[1] for path in path_it]
fname = [os.path.splitext(os.path.basename(path))[0] for path in path_it]
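Putting those pieces together, a minimal sketch of the corrected flow might look like the following: it collects the set of extensions once, then keeps prompting until the user names an extension that actually occurs in the folder (the input is normalised so 'jpg' and '.jpg' both work):
import os

file_path = input("Enter the folder link: ")

# collect every extension that occurs anywhere under the folder, once
extensions = {os.path.splitext(filename)[1].lower()
              for _, _, filenames in os.walk(file_path)
              for filename in filenames}

TF = False
while not TF:
    file_type = input("Enter file extension (e.g. txt, wav, jpg, gpx, pdf): ")
    if '.' + file_type.lstrip('.').lower() in extensions:
        TF = True
    else:
        print("\n*** There is no '" + file_type + "' file extension in this folder, please try again.\n")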

How to access the last filename in a directory in python

I'm trying to loop through some files in a directory. If the filename has two specific strings together, then I'm supposed to open and read those files for information. However, if none of the files have those two strings, I want to print an error message only once.
for filename in os.listdir(directory):
    if filename.find("<string1>") != -1 and filename.find("<string2>") != -1:
        #open file
    else:
        #print error message
I know doing this will print as many error messages as there are files in the directory (i.e. if there's 15 files with no matches, I'll get 15 error messages). But what I want is to only print an error message once after there aren't any matches in any of the N files in directory. I figured I could do something like this:
for filename in os.listdir(directory):
    if filename.find("<string1>") != -1 and filename.find("<string2>") != -1:
        #open file
    else:
        if filename[-1]: #if filename is last in directory
            #print error message
But I've discovered this doesn't work. How would I get an error message to print only after the last filename has been read and doesn't match?
A simple solution would be to initialize some boolean flag before your for loop, e.g. found = False.
If you find a file, set found = true. Then you can check the value of found after your for loop finishes and print the appropriate message based on its value.
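A minimal sketch of that idea, assuming directory holds the path from the question and with the file-handling body left as a placeholder:
import os

found = False
for filename in os.listdir(directory):
    if "<string1>" in filename and "<string2>" in filename:
        found = True
        with open(os.path.join(directory, filename)) as f:
            pass  # process the file here
if not found:
    print("No matching files found")  # printed once, after the whole loop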
Filter the list of files before the for-loop:
filenames = [fname for fname in os.listdir(directory)
             if '<string1>' in fname and '<string2>' in fname]
if filenames:
    for filename in filenames:
        #open file
else:
    #print error message
You can probably also use the glob module to get the filenames:
import glob
filenames = glob.glob(directory + '/*string1*string2*')
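If directory may or may not end in a slash, joining the pattern with os.path.join is a slightly safer variant of the same idea:
import glob
import os

filenames = glob.glob(os.path.join(directory, '*string1*string2*'))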
Another way is to use a variable to check if all the files have been processed. I checked this and found it working in Python 2.7.
import os
directory = "E:\\test\\"
files_count = len(os.listdir(directory))
files_processed = 0
for filename in os.listdir(directory):
    if 'string1' in filename and 'string2' in filename:
        #open file
        print ("Opening file")
    else:
        files_processed = files_processed + 1
        if (files_processed >= files_count):
            print ("error message")
Not sure if this is extreme, but I'd make it a function and raise IOError.
Plus, I'd always use absolute paths. Try the pathlib module too.
import os
def get_files(directory):
    found = False
    for filename in os.listdir(directory):
        if "string1" in filename and "string2" in filename:
            found = True
            yield filename
    if not found:
        # only raise after the whole directory has been scanned with no match
        raise IOError("No such file")
for file in get_files('.'):
    print(file)
    # do stuff with file
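If you would rather handle the missing-file case than let the exception propagate, the iteration can be wrapped in a try/except; a small usage sketch:
try:
    for file in get_files('.'):
        print(file)
        # do stuff with file
except IOError as exc:
    print(exc)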

using "With open" in python to write to another directory

I want to do the following:
1) Ask the user for the file path they want a directory listing for.
2) Take this file path and write the results, as a list, to a text file in the directory they input, NOT the current directory.
I am very nearly there, but for the last step I can't seem to save the file to the directory the user has input, only to the current directory. I have set out the current code below (which works for the current directory). I have tried various variations to try and save it to the directory input by the user but to no avail - any help would be much appreciated.
CODE BELOW
import os
filenames = os.path.join(input('Please enter your file path: '))
with open ("files.txt", "w") as a:
    for path, subdirs, files in os.walk(str(filenames)):
        for filename in files:
            f = os.path.join(path, filename)
            a.write(str(f) + os.linesep)
I came across this link https://cmdlinetips.com/2012/09/three-ways-to-write-text-to-a-file-in-python/. I think your issue has something to do with needing to provide the full path name and/or the way you are using the close() method.
with open(out_filename, 'w') as out_file:
    ..
    ..
    .. parsed_line
    out_file.write(parsed_line)
You have to alter the with open ("files.txt", "w") as a: statement to not only include the filename, but also the path. This is where you should use os.path.join(). It could be handy to first check the user input for existence with os.path.exists(filepath).
os.path.join(input(...)) does not really make sense for the input, since it returns a single str, so there are no separate things to be joined.
import os
filepath = input('Please enter your file path: ')
if os.path.exists(filepath):
    with open (os.path.join(filepath, "files.txt"), "w") as a:
        for path, subdirs, files in os.walk(filepath):
            for filename in files:
                f = os.path.join(path, filename)
                a.write(f + os.linesep)
Notice that your file listing will always include a files.txt entry, since the file is created before os.walk() gets the file list.
As ShadowRanger kindly points out, this LBYL (look before you leap) approach is unsafe, since the existence check could pass, although the file system is changed later while the process is running, leading to an exception.
The mentioned EAFP (it's easier to ask for forgiveness than permission) approach would use a try... except block to handle all errors.
This approach could look like this:
import os
filepath = input('Please enter your file path: ')
try:
    with open (os.path.join(filepath, "files.txt"), "w") as a:
        for path, subdirs, files in os.walk(filepath):
            for filename in files:
                f = os.path.join(path, filename)
                a.write(f + os.linesep)
except:
    print("Could not generate directory listing file.")
You should further refine it by catching specific exceptions. The more code is in the try block, the more errors unrelated to the directory reading and file writing are also caught and suppressed.
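For instance, a sketch of the same block that only catches the OS-level errors expected from the directory walk and the file write (in Python 3, OSError also covers the old IOError):
try:
    with open(os.path.join(filepath, "files.txt"), "w") as a:
        for path, subdirs, files in os.walk(filepath):
            for filename in files:
                a.write(os.path.join(path, filename) + os.linesep)
except OSError as exc:
    print("Could not generate directory listing file:", exc)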
Move to the selected directory then do things.
Extra tip: in Python 2 use raw_input to avoid errors with special characters like : or \ (just use input in Python 3).
import os
filenames = raw_input('Please enter your file path: ')
if not os.path.exists(filenames):
    print 'BAD PATH'
else:
    # 'return' is only valid inside a function, so branch with else instead
    os.chdir(filenames)
    with open ("files.txt", "w") as a:
        for path, subdirs, files in os.walk('.'):
            for filename in files:
                f = os.path.join(path, filename)
                a.write(str(f) + os.linesep)

python check if the folder content existed

The purpose of this code is:
Read a csv file which contains a column for a list of file names
here is the csv file:
https://drive.google.com/open?id=0B5bJvxM9TZkhVGI5dkdLVzAyNTA
Then check a specific folder to see whether the files exist or not
If it finds a file that is not in the list, delete it
here is the code:
import pandas as pd
import os.path
data = pd.read_csv('data.csv')
names = data['title']
path = "C:\\Users\\Sayed\\Desktop\\Economic Data"
for file in os.listdir(path):
    os.path.exists(file)
    print(file)
    file = os.path.join(path, file)
    fileName = os.path.splitext(file)
    if fileName not in names:
        print('error')
        os.remove(file)
I modified the first code, and this is the new code. I got no error, but it simply deletes all the files in the directory.
os.chdir does not return anything, so assigning the result to path means that path has None, which causes the error.
Since you're using pandas, here's a little trick to speed this up using pd.Series.isin.
root = "C:\Users\Sayed\Desktop\Economic Data"
files = os.listdir(root)
for f in data.loc[~data['title'].isin(files), 'title'].tolist():
try:
os.remove(os.path.join(root, f))
except OSError:
pass
Added a try-except check in accordance with EAFP (since I'm not doing an os.path.exists check here). Alternatively, you could add a filter based on existence using pd.Series.apply:
m = ~data['title'].isin(files) & data['title'].apply(os.path.exists)
for f in data.loc[m, 'title'].tolist():
    os.remove(os.path.join(root, f))
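Note that os.path.exists as used above checks the bare title relative to the current working directory; if the files live under root, a joined path is probably what you want, e.g.:
m = ~data['title'].isin(files) & data['title'].apply(
    lambda f: os.path.exists(os.path.join(root, f)))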
Your path is the return value of the os.chdir() call, which is obviously None.
You want to set path to the string representing the path ... leave the chdir out.

pipe one file at a time python

I have more than 10000 JSON files which I have to convert to CSV for further processing. I am using the following code:
import json
import time
import os
import csv
import fnmatch
tweets = []
count = 0
search_folder = ('/Volumes/Transcend/Axiom/IPL/test/')
for root, dirs, files in os.walk(search_folder):
    for file in files:
        pathname = os.path.join(root, file)
    for file in open(pathname):
        try:
            tweets.append(json.loads(file))
        except:
            pass
    count = count + 1
This iterates over just one file and stops. I tried adding while True: before for file in open(pathname): and it just doesn't stop, nor does it create the csv files. I want to read one file at a time, convert it to csv, then move on to the next file. I tried adding count = count + 1 at the end, after converting the csv. Still it stops after converting the first file. Can someone help please?
Your indentation is off; you need to put the second for loop inside the first one.
Separate from your main problem, you should use a with statement to open the file. Also, you were reusing the variable name file, which you shouldn't be using anyway since it's the name of a built-in. I also made a few other minor edits.
import json
import os
tweets = []
count = 0
search_folder = '/Volumes/Transcend/Axiom/IPL/test/'
for root, dirs, filenames in os.walk(search_folder):
    for filename in filenames:
        pathname = os.path.join(root, filename)
        with open(pathname, 'r') as infile:
            for line in infile:
                try:
                    tweets.append(json.loads(line))
                except:  # Don't use bare except clauses
                    pass
        count += 1
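Since the stated goal is converting to CSV, here is a minimal sketch of how the collected tweets could then be written out with the csv module; the field names ('id', 'text') are only placeholders, since the actual structure of the JSON files isn't shown:
import csv

fieldnames = ['id', 'text']  # placeholder columns; adjust to your JSON structure
with open('tweets.csv', 'w', newline='') as outfile:
    writer = csv.DictWriter(outfile, fieldnames=fieldnames, extrasaction='ignore')
    writer.writeheader()
    for tweet in tweets:
        writer.writerow(tweet)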
