Naming and writing different files in a for loop (python) - python

I have what I think is a basic python error. I am building several graphs with the networkx module, and I need to write their edgelists in different gexf files (for gephi). Since I have to perform these operations multiple times I am doing this in a for loop, and I get an error while writing the files.
I need a graph (therefore, a different output file) for each element of the owner column of a dataframe.
for owner in df.owner.unique():
    sdf=df[df['owner']==owner]
    sG=nx.Graph()
    sG.add_nodes_from(sdf['col1'])
    sG.add_nodes_from(sdf['col2'])
    i=0
    while i < len(sdf):
        sG.add_edge(sdf.iloc[i,0],sdf.iloc[i,1], weight=sdf.iloc[i,2])
        i+=1
    with open('com_{}.gexf'.format(owner),'x') as f:
        nx.write_gexf(sG,f)
On the first iteration I get a
FileNotFoundError: [Errno 2] No such file or directory
error, suggesting that this is not the right way to create, name and write the files in a loop. What is the right way to do this?

If owner contains a slash, for example "foo/bar", then open will treat com_foo as a directory and try to create the file bar.gexf inside it. If the directory com_foo doesn't exist, this exception occurs.
One possible solution is to replace all slashes in owner with a less objectionable character.
with open('com_{}.gexf'.format(owner.replace("/", "_")),'x') as f:
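A slightly more general version of the same idea, as a minimal sketch. The helper name `safe_filename` is hypothetical, and the set of characters it replaces is an assumption (path separators plus a few characters that Windows rejects in file names). `str(owner)` guards against non-string values in the column.

```python
import re

def safe_filename(owner):
    """Return an output file name for this owner with path-hostile characters replaced."""
    # Assumption: slashes, backslashes and the characters : * ? " < > | are
    # the ones worth replacing; adjust the set for your platform.
    cleaned = re.sub(r'[\\/:*?"<>|]', "_", str(owner))
    return "com_{}.gexf".format(cleaned)

print(safe_filename("foo/bar"))  # prints com_foo_bar.gexf
```

Note that two different owners (for example "foo/bar" and "foo_bar") would map to the same file name with this approach, so it is worth checking for collisions if the owner values are not under your control.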

Related

How to run through a list of variables as part of a file path?

I need to go through a series of identical directories and combine two .txt files from each into a single file.
I tried using a list (partial list included, in total ~1000 directories) but Python keeps interpreting my list variable as text in the file path.
import os
for subject in ['100307', '100408', '101107']:
    os.chdir("/Users/me/Desktop/SubjPerformance/(subject)")
    filenames = ['0bk_nlr.txt', '2bk_nlr.txt']
    with open('all_bk_nlr', 'w') as outfile:
        for fname in filenames:
            with open(fname) as infile:
                outfile.write(infile.read())
The error I keep getting is:
FileNotFoundError: [Errno 2] No such file or directory: '/Users/me/Desktop/SubjPerformance/(subject)'
Obviously, (subject) is not part of the file path. I want it to take an item from my list, but it is interpreting what I want to be a list variable as text instead.
I'm sure this could also be done with a wildcard character that runs through every subdirectory within /SubjPerformance but I don't know how to build that loop.
Thanks for your help, and sorry for the ignorant question--I'm a neuroscience researcher, not a developer!
Your syntax is incorrect: (subject) inside a string literal is plain text, not a reference to the variable. Use
"/Users/me/Desktop/SubjPerformance/%s" % subject
or
"/Users/me/Desktop/SubjPerformance/{}".format(subject)
instead.
In order to use a variable inside a string, you can use "/Users/me/Desktop/SubjPerformance/%s" % subject. The Python documentation on string formatting covers this in more detail.
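Putting the pieces together, here is a minimal sketch that builds each path with os.path.join instead of changing directory. The function name `combine_subject_files` and its parameters are hypothetical; the file names and output name are taken from the question.

```python
import os

def combine_subject_files(base_dir, subjects, filenames, out_name="all_bk_nlr"):
    """For each subject directory, concatenate the given files into out_name."""
    for subject in subjects:
        # Build the path from the variable's value rather than writing
        # the variable's name inside the string.
        subject_dir = os.path.join(base_dir, subject)
        with open(os.path.join(subject_dir, out_name), "w") as outfile:
            for fname in filenames:
                with open(os.path.join(subject_dir, fname)) as infile:
                    outfile.write(infile.read())
```

It could be called as combine_subject_files("/Users/me/Desktop/SubjPerformance", ['100307', '100408', '101107'], ['0bk_nlr.txt', '2bk_nlr.txt']). If the base directory contains nothing but subject directories, sorted(os.listdir(base_dir)) could supply the subject list instead of typing out all of them.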

How to have multiple programs access the same file without manually giving them all the file path?

I'm writing several related Python programs that need to access the same file. However, this file will be updated or replaced intermittently, and I need them all to access the new file. My current idea is to have a specific folder where the latest file is placed whenever it needs to be replaced, and I was curious how I could have Python select whatever text file is in the folder.
Or would I be better off creating a program that has a class entirely dedicated to holding the information about the file, and have each program reference the file through that class? I could have the class use tkinter.filedialog to select a new file whenever necessary, and perhaps have a text file that holds the path or name of the file that I need to access and have the other programs reference that.
Edit: I don't need to write to the file at all just read from it. However, I would like to have it so that I do not need to manually update the path to the file every time I run the program or update the file path.
Edit2: Changed title to suit the question more
If the requirement is to get the most recently modified file in a specific directory:
import os
mypath = r'C:\path\to\wherever'
myfiles = [(f,os.stat(os.path.join(mypath,f)).st_mtime) for f in os.listdir(mypath)]
mysortedfiles = sorted(myfiles,key=lambda x: x[1],reverse=True)
print('Most recently updated: %s'%mysortedfiles[0][0])
Basically, get a list of files in the directory, together with their modified time as a list of tuples, sort on modified date, then get the one you want.
It sounds like you're looking for a singleton pattern, which is a neat way of hiding a lot of logic into an 'only one instance' object.
This means the logic for identifying, retrieving, and delivering the file is all in one place, and your programs interact with it by saying 'give me the one instance of that thing'. If you need to alter how it identifies, retrieves, or delivers what that one thing is, you can keep that hidden.
It's worth noting that the singleton pattern can be considered an antipattern, as it's a form of global state. Whether that is a deal breaker depends on the context of the program.
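As a rough illustration of that idea, here is a minimal sketch. The class name `SharedFile`, its methods, and the "newest file in the folder" rule are all assumptions made for the example, not something the question specifies.

```python
import os

class SharedFile:
    """One shared object per process that hides how the data file is found."""
    _instance = None

    def __init__(self, folder):
        self.folder = folder

    @classmethod
    def instance(cls, folder="."):
        # Build the object on the first request, then hand back the same one.
        # The folder argument is only used on that first call.
        if cls._instance is None:
            cls._instance = cls(folder)
        return cls._instance

    def path(self):
        """Return the newest regular file in the folder, or None if there is none."""
        entries = (os.path.join(self.folder, name) for name in os.listdir(self.folder))
        files = [p for p in entries if os.path.isfile(p)]
        return max(files, key=os.path.getmtime) if files else None
```

Each program would import this class from one shared module and call SharedFile.instance(...).path(). Separate programs are separate processes, so they do not share the object itself; what they share is the single place where the lookup logic lives.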
To "have python select whatever text file is in the folder", you could use the glob library to get a list of file(s) in the directory, see: https://docs.python.org/2/library/glob.html
You can also use os.listdir() to list all of the files in a directory, without matching pattern names.
Then, open() and read() whatever file or files you find in that directory.
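A minimal sketch of the glob approach, assuming the folder may briefly hold more than one .txt file and that the most recently modified one is the one wanted. The function name `latest_text_file` is hypothetical.

```python
import glob
import os

def latest_text_file(folder):
    """Return the path of the most recently modified .txt file in folder, or None."""
    candidates = glob.glob(os.path.join(folder, "*.txt"))
    if not candidates:
        return None
    # os.path.getmtime gives the last-modification time in seconds since the epoch.
    return max(candidates, key=os.path.getmtime)
```

Each program can then call latest_text_file on the agreed folder and open whatever path comes back, so no path needs to be edited when the file is replaced.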

How do you delete empty files with a specific ending (.gff)?

I am pretty new to Python and have been given a problem to solve in my lab: how could one remove files ending with .gff (or another ending) if the file is empty? The files were all just created and are all in the same directory.
For getting files of a particular kind out of a directory, glob works very well: glob.glob("./file/path/*.gff") will return a list of files ending in .gff.
As for finding the size of a file, use os.stat("./file/path/blah.gff").st_size; an empty file has a size of 0.
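Combining the two calls, a minimal sketch might look like the following. The function name `remove_empty_files` is hypothetical, and it uses os.path.getsize, which reports the same size as os.stat(...).st_size.

```python
import glob
import os

def remove_empty_files(directory, suffix=".gff"):
    """Delete zero-byte files in directory whose names end with suffix."""
    removed = []
    for path in glob.glob(os.path.join(directory, "*" + suffix)):
        # Only delete regular files whose size is exactly zero bytes.
        if os.path.isfile(path) and os.path.getsize(path) == 0:
            os.remove(path)
            removed.append(path)
    return removed
```

The returned list makes it easy to print or log what was deleted. It may be worth running the loop once with the os.remove line commented out to confirm which files would be affected.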

Python - [Errno 2] No such file or directory

I am trying to make a minor modification to a python script made by my predecessor and I have bumped into a problem. I have studied programming, but coding is not my profession.
The python script processes SQL queries and writes them to an excel file, there is a folder where all the queries are kept in .txt format. The script creates a list of the queries found in the folder and goes through them one by one in a for cycle.
My problem is that if I rename or add a query in the folder, I get a "[Errno 2] No such file or directory" error. The script uses a relative path, so I am puzzled why it keeps raising errors for files that no longer exist.
queries_pathNIC = "./queriesNIC/"
def queriesDirer():
    global filelist
    l = 0
    filelist = []
    for file in os.listdir(queries_pathNIC):
        if file.endswith(".txt"):
            l+=1
            filelist.append(file)
    return(l)
Where the problem arises in the main function:
for round in range(0,queriesDirer()):
    print ("\nQuery :",filelist[round])
    file_query = open(queries_pathNIC+filelist[round],'r'); # problem on this line
    file_query = str(file_query.read())
Contents of queriesNIC folder
00_1_Hardware_WelcomeNew.txt
00_2_Software_WelcomeNew.txt
00_3_Software_WelcomeNew_AUTORENEW.txt
The script runs without a problem, but if I change the first query name to
"00_1_Hardware_WelcomeNew_sth.txt" or anything different, I get the following error message:
FileNotFoundError: [Errno 2] No such file or directory: './queriesNIC/00_1_Hardware_WelcomeNew.txt'
I have also tried adding new text files to the folder (example: "00_1_Hardware_Other.txt") and the script simply skips processing the ones I added altogether and only goes with the original files.
I am using Python 3.4.
Does anyone have any suggestions what might be the problem?
Thank you
The following approach would be an improvement. The glob module can produce a list of files ending with .txt quite easily, without needing to build the list by hand or keep it in a global variable.
import glob, os
queries_pathNIC = "./queriesNIC/"
def queriesDirer(directory):
    return glob.glob(os.path.join(directory, "*.txt"))
for file_name in queriesDirer(queries_pathNIC):
    print ("Query :", file_name)
    with open(file_name, 'r') as f_query:
        file_query = f_query.read()
From the sample you have given, it is not clear if you need further access to the round variable or the file list.

Delete multiple directories in python

In Python, I understand that I can delete multiple files with similar names using a loop like the following, for example:
for f in glob.glob("file_name_*.txt"):
    os.remove(f)
I also understand that a single directory can be deleted with shutil.rmtree('/path/to/dir'), and that this command will delete the directory even if it is not empty. On the other hand, os.rmdir() requires the directory to be empty.
I actually want to delete multiple directories with the same name, and they are not empty. So, I am looking for something like
shutil.rmtree('directory_*')
Is there a way to do this with python?
You have all of the pieces: glob() iterates, and rmtree() deletes:
for path in glob.glob("directory_*"):
    shutil.rmtree(path)
This will throw OSError if one of the globbed paths names a file, or for any other reason that rmtree() can fail. You can add error handling as you see fit, once you decide how you want to handle the errors. It doesn't make sense to add error handling unless you know what you want to do with the error, so I have left error handling out.
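For readers who do want some handling, here is one possible sketch. The function name `remove_matching_dirs` is hypothetical, and the policy it applies (skip plain files, record failures and carry on) is only one assumption about what "handling" could mean.

```python
import glob
import os
import shutil

def remove_matching_dirs(pattern):
    """Delete every directory matching pattern; return (removed, failed) lists."""
    removed, failed = [], []
    for path in glob.glob(pattern):
        if not os.path.isdir(path):
            continue  # skip plain files that happen to match the pattern
        try:
            shutil.rmtree(path)
            removed.append(path)
        except OSError as exc:
            # Keep going, but remember which path failed and why.
            failed.append((path, exc))
    return removed, failed
```

Calling remove_matching_dirs("directory_*") returns both lists, so the caller can decide whether a non-empty failed list should stop the program or just be reported.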
