Create files in a specific directory [closed] - python

I am trying to create a file (.txt) in the Data directory, but my code creates a folder instead. This is the code I am using. How can I create the file?
lenID = abs(len(id) - 5)
nameid = ""
for i in range(lenID):
    nameid += "0"
nameid += id
self.pathID = os.getcwd() + "\\Backup\\Data\\" + nameid
self.pathimages = os.getcwd() + "\\Backup\\Data\\" + nameid + "\\Contacts"
pathlogo = os.getcwd() + "\\Backup\\Data\\" + nameid + "\\Logo"
pathimeeting = os.getcwd() + "\\Backup\\Data\\" + nameid + "\\Meeting"
pathnote = os.getcwd() + "\\Backup\\Data\\" + nameid + "\\Notes.txt"
pathID = os.path.join(os.getcwd() + "\\Backup\\Data\\" + nameid)
####### CREATE FOLDER
if not os.path.exists(pathID):
    os.mkdir(pathID)
if not os.path.exists(self.pathimages):
    os.mkdir(self.pathimages)
if not os.path.exists(pathlogo):
    os.mkdir(pathlogo)
if not os.path.exists(pathimeeting):
    os.mkdir(pathimeeting)
if not os.path.exists(pathnote):
    os.mkdir(pathnote)
self.ui.label_2.setText(self.pathID)
self.Cargar(self.pathimages)
self.Logo(pathlogo)
self.Notes(self.pathID)

os.mkdir() creates a directory, whereas os.mknod() creates a new filesystem node (a plain file), so you should change the applicable function call (the one for pathnote) to that.
Alternatively, since os.mknod() is not great cross-platform, you can open a file for writing and then immediately close it again, thus creating a blank file:
with open(pathnote, 'w'): pass
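For the directory layout in the question, a minimal sketch of the whole setup might look like this (create_backup_tree is a hypothetical helper; the paths mirror the question's structure, and os.makedirs with exist_ok=True needs Python 3.2+):
import os

def create_backup_tree(nameid):
    # base folder for this id, mirroring the paths from the question
    base = os.path.join(os.getcwd(), "Backup", "Data", nameid)
    for sub in ("Contacts", "Logo", "Meeting"):
        # makedirs creates intermediate folders too, and with
        # exist_ok=True does not raise if the folder already exists
        os.makedirs(os.path.join(base, sub), exist_ok=True)
    # opening for writing creates an empty file; mkdir would create a folder
    notes = os.path.join(base, "Notes.txt")
    if not os.path.exists(notes):
        with open(notes, 'w'):
            pass
    return base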

Related

How to print the exception from exec? My code is in a string; it works fine on a correct code snippet, but not on one with an error [closed]

import os

path = "C:\\Users\\Abdul Rafay\\Downloads\\Compressed\\day3_t1\\day3_t1"
file_name = os.listdir(path)
word1 = "email"
word2 = "return"
word3 = "def"
x = ""
y = "5"
for i in file_name:
    path1 = os.path.join(path, i)
    with open(path1, 'r') as fp:
        lines = fp.readlines()
        for line in lines:
            if line.find(word1) != -1:
                print("File: ", path1)
                print("Email: ", line.strip("email= "))
            elif line.find(word2) != -1 or line.find(word3) != -1:
                x += line
            if 'def' in x and 'return' in x:
                print("Solution(5): ")
                exec(x + """
try:
    print(solution(""" + str(y) + """))
except Exception as err:
    print(err)
""")
                print("=========================")
                x = ""
# The End---------------------------------------The End
[Attached examples: Type 1 (with error), Type 2 (no error)]
I am reading the "solution" function from these files, passing the parameter using exec, and executing the function.
The problem is that when there is no error in the code it works fine, but if there is an error it doesn't show the exception.
In the output, when there is an error, it prints the particular function multiple times.
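One plausible cause, though the post does not confirm it: if the code read from a file contains a syntax error, exec() raises SyntaxError while compiling the string, before the embedded try/except ever runs, so the exception must also be caught around the exec call itself. A minimal sketch using the x and y variables from the question:
code = x + """
try:
    print(solution(""" + str(y) + """))
except Exception as err:
    print(err)
"""
try:
    exec(code)
except Exception as err:
    # compile-time failures (e.g. SyntaxError) surface here,
    # not inside the embedded try/except
    print("Failed to execute:", err)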

Optimize the performance of retrieving file sizes with pysftp

I have a requirement to get the file details for certain locations (on the local system and on SFTP) and to get the file size for some locations on SFTP, which can be achieved using the code shared below.
def getFileDetails(location: str):
    filenames: list = []
    if location.find(":") != -1:
        for file in glob.glob(location):
            filenames.append(getFileNameFromFilePath(file))
    else:
        with pysftp.Connection(host=myHostname, username=myUsername, password=myPassword) as sftp:
            remote_files = [x.filename for x in sorted(sftp.listdir_attr(location), key=lambda f: f.st_mtime)]
            if location == LOCATION_SFTP_A:
                for filename in remote_files:
                    filenames.append(filename)
                    sftp_archive_d_size_mapping[filename] = sftp.stat(location + "/" + filename).st_size
            elif location == LOCATION_SFTP_B:
                for filename in remote_files:
                    filenames.append(filename)
                    sftp_archive_e_size_mapping[filename] = sftp.stat(location + "/" + filename).st_size
            else:
                for filename in remote_files:
                    filenames.append(filename)
        sftp.close()
    return filenames
There are more than 10,000 files in LOCATION_SFTP_A and LOCATION_SFTP_B. For each file, I need to get the file size. To get the size I am using:
sftp_archive_d_size_mapping[filename] = sftp.stat(location + "/" + filename).st_size
sftp_archive_e_size_mapping[filename] = sftp.stat(location + "/" + filename).st_size
# Time taken: 5 min+
sftp_archive_d_size_mapping[filename] = 1  # sftp.stat(location + "/" + filename).st_size
sftp_archive_e_size_mapping[filename] = 1  # sftp.stat(location + "/" + filename).st_size
# Time taken: 20-30 s
If I comment out sftp.stat(location + "/" + filename).st_size and assign a static value, the entire code takes only 20-30 seconds to run. I am looking for a way to optimize the time taken to get the file size details.
The Connection.listdir_attr already gives you the file size in SFTPAttributes.st_size.
There's no need to call Connection.stat for each file to get the size (again).
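A sketch of that approach, reusing the connection details and location variable from the question (size_mapping stands in for the per-location dictionaries):
with pysftp.Connection(host=myHostname, username=myUsername, password=myPassword) as sftp:
    # a single directory listing returns names and attributes together,
    # so no per-file stat() round trip over the network is needed
    attrs = sorted(sftp.listdir_attr(location), key=lambda f: f.st_mtime)
    filenames = [a.filename for a in attrs]
    size_mapping = {a.filename: a.st_size for a in attrs}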
See also:
With pysftp or Paramiko, how can I get a directory listing complete with attributes?
How to fetch sizes of all SFTP files in a directory through Paramiko

rename file that already exists [duplicate]

I'm learning Python and also English. I have a problem that might be easy, but I can't solve it. I have a folder of .txt files, and I was able to extract a sequence of numbers from each one by regular expression. I rename each file with the sequence I extracted from the .txt:
path_txt = r'''C:\Users\user\Desktop\Doc_Classifier\TXT'''
for TXT in name_files3:
    with open(path_txt + '\\' + TXT, "r") as content:
        search = re.search(r'(([0-9]{4})(/)(([1][9][0-9][0-9])|([2][0-9][0-9][0-9])))', content.read())
        if search is not None:
            name3 = search.group(0)
            name3 = name3.replace("/", "")
            os.rename(os.path.join(path_txt, TXT),
                      os.path.join("Processos3", name3 + "_" + str(random.randint(100, 999)) + ".txt"))
I need to check whether the file already exists and, if so, rename it by adding an increment. Currently, to differentiate the files, I am adding a random number to the name (random.randint(100, 999)).
PS: Currently the script finds "7526/2016" in the .txt by regular expression, removes the "/", and renames the file to "75262016" plus a random number (example: 75262016_111). Instead of renaming using a random number, I would like to check whether the file already exists and rename it using an increment (example: 75262016_copy1, 75262016_copy2).
Replace:
os.rename(
    os.path.join(path_txt, TXT),
    os.path.join("Processos3", name3 + "_" + str(random.randint(100, 999)) + ".txt")
)
With:
fp = os.path.join("Processos3", name3 + "_%d.txt")
postfix = 0
while os.path.exists(fp % postfix):
    postfix += 1
os.rename(
    os.path.join(path_txt, TXT),
    fp % postfix
)
The code below iterates through the files found in the current working directory and looks for a base filename and its increments. As soon as it finds an unused increment, it opens a file with that name and writes to it. So if you already have the files "foo.txt", "foo1.txt", and "foo2.txt", the code will make a new file named "foo3.txt".
import os

filenames = os.listdir()
our_filename = "foo"
extension = ".txt"
cur = 0
cur_filename = our_filename + extension
while True:
    if cur_filename in filenames:
        # name taken, try the next increment
        cur += 1
        cur_filename = our_filename + str(cur) + extension
    else:
        # found a filename that doesn't exist
        f = open(cur_filename, 'w')
        f.write(stuff)  # 'stuff' is whatever content you want to write
        f.close()
        break
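One caveat with both snippets above: another process could create the file between the os.listdir() check and the open() call. If that matters, the exclusive-creation mode 'x' (Python 3.3+) closes the gap; a hedged sketch, with open_next_free as a hypothetical helper:
import os

def open_next_free(base, ext=".txt"):
    # try base.txt, base1.txt, base2.txt, ... using exclusive creation,
    # so a concurrent writer cannot grab the same name between the
    # existence check and the open
    cur = 0
    name = base + ext
    while True:
        try:
            return open(name, "x")
        except FileExistsError:
            cur += 1
            name = base + str(cur) + ext

with open_next_free("foo") as f:
    f.write("contents go here")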

Why does Python program execution slow down when using functions?

So I have a rather general question I was hoping to get some help with. I put together a Python program that runs through and automates workflows at the state level for all the different counties. The entire program was created for research at school, not actual state work. Anyway, I have two designs, shown below. The first is an updated version; it takes about 40 minutes to run. The second shows the original work; note that it is not a well-structured design, yet it takes about five minutes to run the entire program. Could anybody give any insight into why there is such a difference between the two? The updated version is still ideal, as it is much more reusable (it can run and grab any dataset at the URL) and easier to understand. Furthermore, 40 minutes to get about a hundred workflows completed is still a plus. Also, this is still a work in progress; a couple of minor issues still need to be addressed in the code, but it is still a pretty cool program.
Updated Design
import os, sys, urllib2, urllib, zipfile, arcpy
from arcpy import env

path = os.getcwd()

def pickData():
    myCount = 1
    path1 = 'path2URL'
    response = urllib2.urlopen(path1)
    print "Enter the name of the files you need"
    numZips = raw_input()
    numZips2 = numZips.split(",")
    myResponse(myCount, path1, response, numZips2)

def myResponse(myCount, path1, response, numZips2):
    myPath = os.getcwd()
    for each in response:
        eachNew = each.split(" ")
        eachCounty = eachNew[9].strip("\n").strip("\r")
        try:
            myCountyDir = os.mkdir(os.path.expanduser(myPath + "\\counties" + "\\" + eachCounty))
        except:
            pass
        myRetrieveDir = myPath + "\\counties" + "\\" + eachCounty
        os.chdir(myRetrieveDir)
        myCount += 1
        response1 = urllib2.urlopen(path1 + eachNew[9])
        for all1 in response1:
            allNew = all1.split(",")
            allFinal = allNew[0].split(" ")
            allFinal1 = allFinal[len(allFinal)-1].strip(" ").strip("\n").strip("\r")
            numZipsIter = 0
            path8 = path1 + eachNew[9][0:len(eachNew[9])-2] + "/" + allFinal1
            downZip = eachNew[9][0:len(eachNew[9])-2] + ".zip"
            while numZipsIter < len(numZips2):
                if (numZips2[numZipsIter][0:3].strip(" ") == "NWI") and ("remap" not in allFinal1):
                    numZips2New = numZips2[numZipsIter].split("_")
                    if (numZips2New[0].strip(" ") in allFinal1 and numZips2New[1] != "remap" and numZips2New[2].strip(" ") in allFinal1) and (allFinal1[-3:] == "ZIP" or allFinal1[-3:] == "zip"):
                        urllib.urlretrieve(path8, allFinal1)
                        zip1 = zipfile.ZipFile(myRetrieveDir + "\\" + allFinal1)
                        zip1.extractall(myRetrieveDir)
                # maybe just have numZips2 (raw input) as the values before the county number
                # numZips2[numZipsIter][0:-7].strip(" ") in allFinal1 or numZips2[numZipsIter][0:-7].strip(" ").lower() in allFinal1) and (allFinal1[-3:]=="ZIP" or allFinal1[-3:]=="zip"
                elif (numZips2[numZipsIter].strip(" ") in allFinal1 or numZips2[numZipsIter].strip(" ").lower() in allFinal1) and (allFinal1[-3:] == "ZIP" or allFinal1[-3:] == "zip"):
                    urllib.urlretrieve(path8, allFinal1)
                    zip1 = zipfile.ZipFile(myRetrieveDir + "\\" + allFinal1)
                    zip1.extractall(myRetrieveDir)
                numZipsIter += 1

pickData()

# client picks shapefiles to add to map
# section for geoprocessing operations
# get the data frames
# add new data frame, title
# check spaces in ftp crawler
os.chdir(path)
env.workspace = path + "\\symbology\\"
zp1 = os.listdir(path + "\\counties\\")

def myGeoprocessing(layer1, layer2):
    # the code in this function is used for geoprocessing operations
    # it returns whatever output is generated from the tools used in the map
    try:
        arcpy.Clip_analysis(path + "\\symbology\\Stream_order.shp", layer1, path + "\\counties\\" + layer2 + "\\Streams.shp")
    except:
        pass
    streams = arcpy.mapping.Layer(path + "\\counties\\" + layer2 + "\\Streams.shp")
    arcpy.ApplySymbologyFromLayer_management(streams, path + '\\symbology\\streams.lyr')
    return streams

def makeMap():
    # original wetlands layers need to be entered as NWI_line or NWI_poly
    print "Enter the layer or layers you wish to include in the map"
    myInput = raw_input()
    counter1 = 1
    for each in zp1:
        print each
        print path
        zp2 = os.listdir(path + "\\counties\\" + each)
        for eachNew in zp2:
            #print eachNew
            if (eachNew[-4:] == ".shp") and ((myInput in eachNew[0:-7] or myInput.lower() in eachNew[0:-7]) or ((eachNew[8:12] == "poly" or eachNew[8:12] == 'line') and eachNew[8:12] in myInput)):
                print eachNew[0:-7]
                theMap = arcpy.mapping.MapDocument(path + '\\map.mxd')
                df1 = arcpy.mapping.ListDataFrames(theMap, "*")[0]
                # this is where we add our layers
                layer1 = arcpy.mapping.Layer(path + "\\counties\\" + each + "\\" + eachNew)
                if eachNew[7:11] == "poly" or eachNew[7:11] == "line":
                    arcpy.ApplySymbologyFromLayer_management(layer1, path + '\\symbology\\' + myInput + '.lyr')
                else:
                    arcpy.ApplySymbologyFromLayer_management(layer1, path + '\\symbology\\' + eachNew[0:-7] + '.lyr')
                # Assign legend variable for map
                legend = arcpy.mapping.ListLayoutElements(theMap, "LEGEND_ELEMENT", "Legend")[0]
                # add wetland layer to map
                legend.autoAdd = True
                try:
                    arcpy.mapping.AddLayer(df1, layer1, "AUTO_ARRANGE")
                    # geoprocessing steps
                    streams = myGeoprocessing(layer1, each)
                    # more geoprocessing options, add the layers to map and assign if they should appear in legend
                    legend.autoAdd = True
                    arcpy.mapping.AddLayer(df1, streams, "TOP")
                    df1.extent = layer1.getExtent(True)
                    arcpy.mapping.ExportToJPEG(theMap, path + "\\counties\\" + each + "\\map.jpg")
                    # Save map document to path
                    theMap.saveACopy(path + "\\counties\\" + each + "\\map.mxd")
                    del theMap
                    print "done with map " + str(counter1)
                except:
                    print "issue with map or already exists"
                counter1 += 1

makeMap()
Original Design
import os, sys, urllib2, urllib, zipfile, arcpy
from arcpy import env

response = urllib2.urlopen('path2URL')
path1 = 'path2URL'
myCount = 1
for each in response:
    eachNew = each.split(" ")
    myCount += 1
    response1 = urllib2.urlopen(path1 + eachNew[9])
    for all1 in response1:
        #print all1
        allNew = all1.split(",")
        allFinal = allNew[0].split(" ")
        allFinal1 = allFinal[len(allFinal)-1].strip(" ")
        if allFinal1[-10:-2] == "poly.ZIP":
            response2 = urllib2.urlopen('path2URL')
            zipcontent = response2.readlines()
            path8 = 'path2URL' + eachNew[9][0:len(eachNew[9])-2] + "/" + allFinal1[0:len(allFinal1)-2]
            downZip = str(eachNew[9][0:len(eachNew[9])-2]) + ".zip"
            urllib.urlretrieve(path8, downZip)

# Set the path to the directory where your zipped folders reside
zipfilepath = 'F:\\Misc\\presentation'
# Set the path to where you want the extracted data to reside
extractiondir = 'F:\\Misc\\presentation\\counties'
# List all data in the main directory
zp1 = os.listdir(zipfilepath)
# Loop over each zipped folder automatically;
# concatenates the zipped folder to the original directory in the variable done
for each in zp1:
    print each[-4:]
    if each[-4:] == ".zip":
        done = zipfilepath + "\\" + each
        zip1 = zipfile.ZipFile(done)
        extractiondir1 = extractiondir + "\\" + each[:-4]
        zip1.extractall(extractiondir1)

path = os.getcwd()
counter1 = 1
# get the data frames
# Create new layer for all files to be added to map document
env.workspace = "E:\\Misc\\presentation\\symbology\\"
zp1 = os.listdir(path + "\\counties\\")
for each in zp1:
    zp2 = os.listdir(path + "\\counties\\" + each)
    for eachNew in zp2:
        if eachNew[-4:] == ".shp":
            wetlandMap = arcpy.mapping.MapDocument('E:\\Misc\\presentation\\wetland.mxd')
            df1 = arcpy.mapping.ListDataFrames(wetlandMap, "*")[0]
            #print eachNew[-4:]
            wetland = arcpy.mapping.Layer(path + "\\counties\\" + each + "\\" + eachNew)
            #arcpy.Clip_analysis(path + "\\symbology\\Stream_order.shp", wetland, path + "\\counties\\" + each + "\\Streams.shp")
            streams = arcpy.mapping.Layer(path + "\\symbology\\Stream_order.shp")
            arcpy.ApplySymbologyFromLayer_management(wetland, path + '\\symbology\\wetland.lyr')
            arcpy.ApplySymbologyFromLayer_management(streams, path + '\\symbology\\streams.lyr')
            # Assign legend variable for map
            legend = arcpy.mapping.ListLayoutElements(wetlandMap, "LEGEND_ELEMENT", "Legend")[0]
            # add the layers to map and assign if they should appear in legend
            legend.autoAdd = True
            arcpy.mapping.AddLayer(df1, streams, "TOP")
            legend.autoAdd = True
            arcpy.mapping.AddLayer(df1, wetland, "AUTO_ARRANGE")
            df1.extent = wetland.getExtent(True)
            # Export the map to a JPEG
            arcpy.mapping.ExportToJPEG(wetlandMap, path + "\\counties\\" + each + "\\wetland.jpg")
            # Save map document to path
            wetlandMap.saveACopy(path + "\\counties\\" + each + "\\wetland.mxd")
            del wetlandMap
            print "done with map " + str(counter1)
            counter1 += 1
Have a look at this guide:
https://wiki.python.org/moin/PythonSpeed/PerformanceTips
Let me quote:
Function call overhead in Python is relatively high, especially compared with the execution speed of a builtin function. This strongly suggests that where appropriate, functions should handle data aggregates.
So, effectively, this suggests not factoring something out into a function that is going to be called hundreds of thousands of times.
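As an illustrative micro-benchmark of that advice (hypothetical functions, standard library only):
import timeit

def add_one(x):
    return x + 1

def per_item(data):
    # one Python function call per element
    return [add_one(x) for x in data]

def aggregate(data):
    # the whole aggregate is handled inside a single call
    return [x + 1 for x in data]

data = list(range(100000))
print(timeit.timeit(lambda: per_item(data), number=100))   # slower: call overhead per element
print(timeit.timeit(lambda: aggregate(data), number=100))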
In Python, functions won't be inlined, and calling them is not cheap. If in doubt, use a profiler to find out how many times each function is called and how long it takes on average; then optimize.
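For example, the standard library's cProfile shows call counts and cumulative times without any extra installs (slow_workflow here is a hypothetical stand-in for the map-making code above):
import cProfile
import pstats

def slow_workflow():
    # stand-in for the real per-county processing
    total = 0
    for i in range(1000000):
        total += i
    return total

cProfile.run("slow_workflow()", "profile.out")
stats = pstats.Stats("profile.out")
stats.sort_stats("cumulative").print_stats(10)  # show the 10 most expensive entries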
You might also give PyPy a shot, as they have certain optimizations built in. Reducing the function call overhead in some cases seems to be one of them:
Python equivalence to inline functions or macros
http://pypy.org/performance.html

Copy a file to a new location and increment filename, python

I am trying to:
Loop through a bunch of files
Make some changes
Copy the old file to a subdirectory. Here's the kicker: I don't want to overwrite the file in the new directory if it already exists (e.g. if "Filename.mxd" already exists, then copy and rename to "Filename_1.mxd"; if "Filename_1.mxd" exists, then copy the file as "Filename_2.mxd", and so on...)
Save the file (but do a save, not a save-as, so that it overwrites the existing file)
It goes something like this:
for filename in glob.glob(os.path.join(folderPath, "*.mxd")):
    fullpath = os.path.join(folderPath, filename)
    mxd = arcpy.mapping.MapDocument(filename)
    if os.path.isfile(fullpath):
        basename, filename2 = os.path.split(fullpath)
        # Make some changes to my file here
        # Copy the in-memory file to a new location. If the file name already exists,
        # then rename the file with the next instance of i (e.g. filename + "_" + i)
        for i in range(50):
            if i > 0:
                print "Test1"
                if arcpy.Exists(draftloc + "\\" + filename2) or arcpy.Exists(draftloc + "\\" + shortname + "_" + str(i) + extension):
                    print "Test2"
                    pass
                else:
                    print "Test3"
                    arcpy.Copy_management(filename2, draftloc + "\\" + shortname + "_" + str(i) + extension)
                    mxd.save()
So, two things I decided to do. The first was to just set the range well beyond what I expect to ever occur (50); I'm sure there's a better way of doing this by just incrementing to the next number without setting a range.
The second thing, as you may see, is that the script saves everything in the range. I just want to save it once, at the next instance of i that does not occur.
Hope this makes sense,
Mike
Use a while loop instead of a for loop. Use the while loop to find the appropriate i, and then save afterwards.
The code/pseudocode would look like:
result_name = original name
i = 0
while arcpy.Exists(result_name):
    i += 1
    result_name = draftloc + "\\" + shortname + "_" + str(i) + extension
save as result_name
This should fix both issues.
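In concrete Python, that might look like the following (assuming the draftloc, shortname, extension and filename2 variables from the question):
result_name = draftloc + "\\" + filename2
i = 0
# keep incrementing until a name is found that does not exist yet
while arcpy.Exists(result_name):
    i += 1
    result_name = draftloc + "\\" + shortname + "_" + str(i) + extension
mxd.saveACopy(result_name)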
Thanks to Maty's suggestion above, I've come up with my answer. For those who are interested, my code is:
result_name = filename2
print result_name
i = 0
# Check if the original filename already exists
if arcpy.Exists(draftloc + "\\" + result_name):
    # If it does, increment i until a filename (including i) that does not exist yet is found
    i += 1
    while arcpy.Exists(draftloc + "\\" + shortname + "_" + str(i) + extension):
        i += 1
    # Save under the first unused incremented name
    mxd.saveACopy(draftloc + "\\" + shortname + "_" + str(i) + extension)
# Else the original filename is free, so save under it
else:
    mxd.saveACopy(draftloc + "\\" + result_name)
