I have recently started learning more about Python packages and modules. I'm currently updating my existing modules so that they can be run as a script or imported as a module into my other code. I'm not sure how to construct the input arguments within my module and pass them to the main() function.
I have written my main() function and called it under if __name__ == '__main__', passing the input arguments. The inputs are currently hard-coded to show what I'm trying to achieve. Any help with correctly constructing the input arguments that the user supplies, so that they are passed on to the main function, would be appreciated.
As mentioned, I want to be able to use the following either directly as a script or imported as a module into my other code and run from there. If I import it as a module, would I call the main() function after importing it? Is the structure below correct? Any advice is appreciated.
'''
Created on March 12, 2017
Create a new ArcHydro Schema
File Geodatabase and Rasters
Folder
#author: PeterW
'''
# import site-packages and modules
import re
from pathlib import Path
import arcpy
# set environment settings
arcpy.env.overwriteOutput = True
def archydro_rasters_folder(workspace):
    """Create rasters folder directory
    if it doesn't already exist"""
    model_name = Path(workspace).name
    layers_name = re.sub(r"\D+", "Layers", model_name)
    layers_folder = Path(workspace, layers_name)
    if layers_folder.exists():
        arcpy.AddMessage("Rasters folder: {0} exists".format(layers_name))
    else:
        layers_folder.mkdir(parents=True)
        arcpy.AddMessage("Rasters folder {0} created".format(layers_name))
def archydro_fgdb_schema(workspace, schema, dem):
    """Create file geodatabase using XML
    schema and set coordinate system based
    on input DEM if it doesn't already exist"""
    model_name = Path(workspace).name
    fgdb = "{0}.gdb".format(model_name)
    if arcpy.Exists(str(Path(workspace, fgdb))):
        arcpy.AddMessage("{0} file geodatabase exists".format(fgdb))
    else:
        new_fgdb = arcpy.CreateFileGDB_management(str(workspace), fgdb)
        import_type = "SCHEMA_ONLY"
        config_keyword = "DEFAULTS"
        arcpy.AddMessage("New {0} file geodatabase created".format(fgdb))
        arcpy.ImportXMLWorkspaceDocument_management(new_fgdb, schema,
                                                    import_type,
                                                    config_keyword)
        arcpy.AddMessage("ArcHydro schema imported")
        projection = arcpy.Describe(dem).spatialReference
        projection_name = projection.PCSName
        feature_dataset = Path(workspace, fgdb, "Layers")
        arcpy.DefineProjection_management(str(feature_dataset),
                                          projection)
        arcpy.AddMessage("Changed projection to {0}".format(projection_name))
def main(workspace, dem, schema):
    """main function to create rasters folder
    and file geodatabase"""
    archydro_rasters_folder(workspace)
    # pass by keyword so each value lands in the matching parameter
    archydro_fgdb_schema(workspace=workspace, schema=schema, dem=dem)
if __name__ == '__main__':
    main(workspace=r"E:\Projects\2016\01_Bertrand_Small_Projects\G113268\ArcHydro\Model04",
         dem=r"E:\Projects\2016\01_Bertrand_Small_Projects\G113268\ArcHydro\DEM2\raw",
         schema=r"E:\Python\Masters\Schema\ESRI_UC12\ModelBuilder\Schema\Model01.xml")
Updated: 17/03/13
The following is my updated Python module based on Jonathan's suggestions:
'''
Created on March 12, 2017
Create a new ArcHydro Schema
File Geodatabase and Rasters
Folder
#author: PeterW
'''
# import site-packages and modules
import re
from pathlib import Path
import arcpy
import argparse
# set environment settings
arcpy.env.overwriteOutput = True
def rasters_directory(workspace):
    """Create rasters folder directory
    if it doesn't already exist"""
    model_name = Path(workspace).name
    layers_name = re.sub(r"\D+", "Layers", model_name)
    layers_folder = Path(workspace, layers_name)
    if layers_folder.exists():
        arcpy.AddMessage("Rasters folder: {0} exists".format(layers_name))
    else:
        layers_folder.mkdir(parents=True)
        arcpy.AddMessage("Rasters folder {0} created".format(layers_name))
def fgdb_schema(workspace, schema, dem):
    """Create file geodatabase using XML
    schema and set coordinate system based
    on input DEM if it doesn't already exist"""
    model_name = Path(workspace).name
    fgdb = "{0}.gdb".format(model_name)
    if arcpy.Exists(str(Path(workspace, fgdb))):
        arcpy.AddMessage("{0} file geodatabase exists".format(fgdb))
    else:
        new_fgdb = arcpy.CreateFileGDB_management(str(workspace), fgdb)
        import_type = "SCHEMA_ONLY"
        config_keyword = "DEFAULTS"
        arcpy.AddMessage("New {0} file geodatabase created".format(fgdb))
        arcpy.ImportXMLWorkspaceDocument_management(new_fgdb, schema,
                                                    import_type,
                                                    config_keyword)
        arcpy.AddMessage("ArcHydro schema imported")
        projection = arcpy.Describe(dem).spatialReference
        projection_name = projection.PCSName
        feature_dataset = Path(workspace, fgdb, "Layers")
        arcpy.DefineProjection_management(str(feature_dataset),
                                          projection)
        arcpy.AddMessage("Changed projection to {0}".format(projection_name))
def model_schema(workspace, schema, dem):
    """Create model schema: rasters folder
    and file geodatabase"""
    rasters_directory(workspace)
    # pass by keyword so each value lands in the matching parameter
    fgdb_schema(workspace=workspace, schema=schema, dem=dem)
if __name__ == '__main__':
    parser = argparse.ArgumentParser(description='Create an ArcHydro schema')
    parser.add_argument('--workspace', metavar='path', required=True,
                        help='the path to workspace')
    parser.add_argument('--schema', metavar='path', required=True,
                        help='path to schema')
    parser.add_argument('--dem', metavar='path', required=True,
                        help='path to dem')
    args = parser.parse_args()
    model_schema(workspace=args.workspace, schema=args.schema, dem=args.dem)
This looks correct to me, and yes, if you want to use this as a module you would import main and call it yourself. It would probably be better to give it a more descriptive name, though.
To clarify how __main__ and the function main() work: when a module is loaded, its name is stored in __name__. If you run the module standalone as a script, __name__ is '__main__'. If you import it into another module, __name__ is the module's own name.
The function main() can be named anything you like without affecting your program. It's commonly named main in small scripts, but that is not a particularly good name if it's part of a larger body of code.
To let a user supply arguments when running it as a script, I would look into using either argparse or click.
Here is an example of how argparse would work:
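To make the __name__ behaviour concrete, here is a small sketch that loads the same throwaway file twice, once as an import and once as a script, and records what __name__ was each time. The file name demo_mod.py and the helper observe_names are made up for this illustration.

```python
import importlib.util
import runpy
import tempfile
from pathlib import Path

# A one-line module that simply remembers the name it was loaded under.
SOURCE = "seen_name = __name__\n"


def observe_names():
    """Return (__name__ when imported, __name__ when run as a script)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp, "demo_mod.py")  # hypothetical file name
        path.write_text(SOURCE)

        # Load it the way `import demo_mod` would.
        spec = importlib.util.spec_from_file_location("demo_mod", path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        imported_name = module.seen_name

        # Load it the way `python demo_mod.py` would.
        script_globals = runpy.run_path(str(path), run_name="__main__")
        script_name = script_globals["seen_name"]
    return imported_name, script_name
```

Calling observe_names() should give the module's own name for the import and '__main__' for the script run, which is exactly the distinction the if __name__ == '__main__' guard relies on.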
if __name__ == '__main__':
    import argparse
    parser = argparse.ArgumentParser(description='Create an ArcHydro schema')
    parser.add_argument('--workspace', metavar='path', required=True,
                        help='the path to workspace')
    parser.add_argument('--schema', metavar='path', required=True,
                        help='path to schema')
    parser.add_argument('--dem', metavar='path', required=True,
                        help='path to dem')
    args = parser.parse_args()
    main(workspace=args.workspace, schema=args.schema, dem=args.dem)
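One possible refinement, offered as a sketch rather than the only way to do it: keep the parsing in a small main(argv=None) function so the command-line path can be exercised from other code and from tests. The names build_parser and main are my own choice here, and model_schema below is only a stand-in that returns its inputs so the example runs without arcpy.

```python
import argparse


def build_parser():
    """Build the command-line parser (same three options as above)."""
    parser = argparse.ArgumentParser(description="Create an ArcHydro schema")
    for name in ("workspace", "schema", "dem"):
        parser.add_argument("--" + name, metavar="path", required=True,
                            help="path to " + name)
    return parser


def model_schema(workspace, schema, dem):
    """Stand-in for the real worker; it only echoes what it received."""
    return {"workspace": workspace, "schema": schema, "dem": dem}


def main(argv=None):
    """Parse argv (sys.argv[1:] when argv is None) and run the worker."""
    args = build_parser().parse_args(argv)
    return model_schema(workspace=args.workspace, schema=args.schema,
                        dem=args.dem)
```

With this layout the guard at the bottom of the file shrinks to a single call to main(), other code can import model_schema and pass plain Python values, and a test can call main(["--workspace", "w", "--schema", "s", "--dem", "d"]) without touching sys.argv.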
Related
I have to run the main file that depends on the rest of a relatively large project. Project structure can be seen as such
main.py
opts.py
--models \
--model1.py
--model2.py
...
--schedulers
--scheduler1.py
--scheduler2.py
...
...
The problem is when I have to pass arguments to each component (using argparse). A simple way would be to define all parameters in a single place for each component. This worked so far for me (in opts.py), but I would like to make something more elegant. In my parse function for each component parse_models or parse_scheduler I would like to iterate through each submodule of models and schedulers and let them define their own arguments by calling a function define_arguments that each of them has where they create their own sub parser.
All in all, how do I iterate through the submodules and call their define_arguments function from within opts.py?
You can iterate over all the python files using the glob module. You can find the correct path with the parent module's __path__ attribute. Import the modules using importlib.import_module. The imported module then contains the define_arguments function that you can pass a parser per submodule to define the arguments on:
from glob import glob
import os, importlib
def load_submodule_parsers(parent_module, parser, help=None):
    if help is None:
        help = parent_module.__name__ + " modules"
    # __path__ is a list of directories; a regular package has exactly one
    modules = glob(os.path.join(parent_module.__path__[0], "*.py"))
    subparsers = parser.add_subparsers(help=help)
    for module_file in modules:
        module_name = os.path.basename(module_file)[:-3]
        if module_name == "__init__":
            continue
        # the leading dot makes the import relative to the parent package
        module = importlib.import_module("." + module_name, package=parent_module.__name__)
        if "define_arguments" not in module.__dict__:
            raise ImportError(parent_module.__name__ + " submodule '" + module_name + "' does not have required 'define_arguments' function.")
        # use a separate name so the parent parser is not shadowed
        module_parser = subparsers.add_parser(module_name)
        module.define_arguments(module_parser)
Pass the function the parent module object:
import argparse, models, schedulers
parser = argparse.ArgumentParser()
subparsers = parser.add_subparsers()
models_parser = subparsers.add_parser("models")
load_submodule_parsers(models, models_parser)
schedulers_parser = subparsers.add_parser("schedulers")
load_submodule_parsers(schedulers, schedulers_parser)
Untested code, but I think you can refine it from here.
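As an alternative to globbing for *.py files, the standard library's pkgutil.iter_modules can list a package's submodules directly. The sketch below swaps that in; add_submodule_parsers, demo and the throwaway demo_models package are names invented for the example, and the only thing assumed about the real submodules is that each exposes define_arguments(parser) as described in the question.

```python
import argparse
import importlib
import pkgutil
import sys
import tempfile
from pathlib import Path


def add_submodule_parsers(package, subparsers):
    """Give every submodule of `package` that defines define_arguments
    its own sub-command, and return the names that were registered."""
    registered = []
    for info in pkgutil.iter_modules(package.__path__):
        module = importlib.import_module("." + info.name,
                                         package=package.__name__)
        define = getattr(module, "define_arguments", None)
        if define is None:
            continue  # skip helpers that do not take part
        define(subparsers.add_parser(info.name))
        registered.append(info.name)
    return registered


def demo():
    """Build a throwaway package on disk and parse one command line."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg_dir = Path(tmp, "demo_models")  # hypothetical package
        pkg_dir.mkdir()
        (pkg_dir / "__init__.py").write_text("")
        (pkg_dir / "model1.py").write_text(
            "def define_arguments(parser):\n"
            "    parser.add_argument('--layers', type=int, default=2)\n")
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            package = importlib.import_module("demo_models")
            parser = argparse.ArgumentParser()
            subparsers = parser.add_subparsers(dest="model")
            add_submodule_parsers(package, subparsers)
            return parser.parse_args(["model1", "--layers", "5"])
        finally:
            sys.path.remove(tmp)
```

In opts.py the same helper would be called once for models and once for schedulers, each with its own subparsers object. Whether a submodule without define_arguments should be skipped, as here, or treated as an error, as in the answer above, is a design choice.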
The main problem I have is accessing variables defined in a configuration file as a module in other scripts when those variables are defined by argparse inputs. If I encapsulate this code within a function, I will not have access to it in other scripts. Similarly, if I don't use a function, the configuration file will expect inputs whenever it is called as a module. I think global variables might be the answer, but I don't know how to set them up in this use case?
import shutil
import inspect
import configparser
import os
import sys
import argparse
def cmd_inputs():
    parser = argparse.ArgumentParser()
    parser.add_argument('-dir','--geofloodhomedir',
                        help="File path to directory that will hold the cfg file "
                             "and the inputs and outputs folders. Default is the GeoNet3 directory.",
                        type=str)
    parser.add_argument('-p','--project',
                        help="Folder within GIS, Hydraulics, and NWM directories "
                             "of Inputs and Outputs. Default is 'my_project'.",
                        type=str)
    parser.add_argument('-n','--DEM_name',
                        help="Name of Input DEM (without extension) and the prefix used for all "
                             "project outputs. Default is 'dem'",
                        type=str)
    parser.add_argument('--no_chunk',
                        help="Don't batch process DEMs > 1.5 GB for Network Extraction. "
                             "Default is to chunk DEMs larger than 1.5 GB.",
                        action='store_true')
    global home_dir
    args = parser.parse_args()
    if args.geofloodhomedir:
        home_dir = args.geofloodhomedir
        if os.path.abspath(home_dir) == os.path.dirname(os.path.abspath(__file__)):
            home_dir = os.getcwd()
            print("hello")
        print('GeoNet and GeoFlood Home Directory: ')
        print(home_dir)
    else:
        ab_path = os.path.abspath(__file__)
        home_dir = os.path.dirname(ab_path)
        print('Using default GeoNet and GeoFlood home directory:')
        print(home_dir)
    if args.project:
        project_name=args.project
        print(f"Project Name: {project_name}")
    else:
        project_name='my_project'
        print(f"Default Project Name: {project_name}")
    if args.DEM_name:
        dem_name = args.DEM_name
        print(f'DEM Name: {dem_name}')
    else:
        dem_name='dem'
        print(f'Default DEM: {dem_name}')
    if not args.no_chunk:
        print("Chunking DEMs > 1.5 GBs in Network Extraction script")
        chunk = 1
    else:
        print("Not chunking input DEM or its products in Network Extraction script")
        chunk = 0
    config = configparser.ConfigParser()
    config['Section']={'geofloodhomedir':home_dir,
                       'projectname':project_name,
                       'dem_name':dem_name,
                       'Chunk_DEM':chunk}
    with open(os.path.join(home_dir,'GeoFlood.cfg'),'w') as configfile:
        config.write(configfile)
if __name__=='__main__':
    cmd_inputs()
    print ("Configuration Complete")
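One way to avoid globals here, sketched under my own assumptions about the desired layout: let the parsing function return plain values, write them to the .cfg file only when the script is run directly, and have every other script call a small reader instead of importing variables. The function names parse_settings, write_config, read_config and round_trip are invented for this example, and only a subset of the options is shown.

```python
import argparse
import configparser
import os
import tempfile


def parse_settings(argv=None):
    """Turn command-line options into a plain dict of strings."""
    parser = argparse.ArgumentParser()
    parser.add_argument("-p", "--project", default="my_project")
    parser.add_argument("-n", "--DEM_name", default="dem")
    parser.add_argument("--no_chunk", action="store_true")
    args = parser.parse_args(argv)
    return {"projectname": args.project,
            "dem_name": args.DEM_name,
            "chunk_dem": str(int(not args.no_chunk))}


def write_config(settings, cfg_path):
    """Persist the settings; only the configuration script calls this."""
    config = configparser.ConfigParser()
    config["Section"] = settings
    with open(cfg_path, "w") as handle:
        config.write(handle)


def read_config(cfg_path):
    """What other scripts call: no argparse involved, just the file."""
    config = configparser.ConfigParser()
    config.read(cfg_path)
    return dict(config["Section"])


def round_trip(argv):
    """Demonstrate write-then-read using a temporary directory."""
    with tempfile.TemporaryDirectory() as tmp:
        cfg_path = os.path.join(tmp, "GeoFlood.cfg")
        write_config(parse_settings(argv), cfg_path)
        return read_config(cfg_path)
```

Because parse_settings is only called under the __main__ guard, importing the module never triggers argument parsing, and the values stay reachable by other scripts through read_config.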
I am trying to create a Python program that will create a given number of identical virtual machines. I have used the community sample scripts to get as much as I can running, but I am completely stuck now.
#!/usr/bin/env python
"""
vSphere SDK for Python program for creating tiny VMs (1vCPU/128MB)
"""
import atexit
import hashlib
import json
import random
import time
import requests
from pyVim import connect
from pyVmomi import vim
from tools import cli
from tools import tasks
from add_nic_to_vm import add_nic, get_obj
def get_args():
"""
Use the tools.cli methods and then add a few more arguments.
"""
parser = cli.build_arg_parser()
parser.add_argument('-c', '--count',
type=int,
required=True,
action='store',
help='Number of VMs to create')
parser.add_argument('-d', '--datastore',
required=True,
action='store',
help='Name of Datastore to create VM in')
parser.add_argument('--datacenter',
required=True,
help='Name of the datacenter to create VM in.')
parser.add_argument('--folder',
required=True,
help='Name of the vm folder to create VM in.')
parser.add_argument('--resource-pool',
required=True,
help='Name of resource pool to create VM in.')
parser.add_argument('--opaque-network',
help='Name of the opaque network to add to the new VM')
# NOTE (hartsock): as a matter of good security practice, never ever
# save a credential of any kind in the source code of a file. As a
# matter of policy we want to show people good programming practice in
# these samples so that we don't encourage security audit problems for
# people in the future.
args = parser.parse_args()
return cli.prompt_for_password(args)
def create_dummy_vm(vm_name, service_instance, vm_folder, resource_pool,
                    datastore):
    """Creates a dummy VirtualMachine with 1 vCpu, 128MB of RAM.
    :param vm_name: String Name for the VirtualMachine
    :param service_instance: ServiceInstance connection
    :param vm_folder: Folder to place the VirtualMachine in
    :param resource_pool: ResourcePool to place the VirtualMachine in
    :param datastore: DataStore to place the VirtualMachine on
    """
    datastore_path = '[' + datastore + '] ' + vm_name
    # bare minimum VM shell, no disks. Feel free to edit
    vmx_file = vim.vm.FileInfo(logDirectory=None,
                               snapshotDirectory=None,
                               suspendDirectory=None,
                               vmPathName=datastore_path)
    config = vim.vm.ConfigSpec(name=vm_name, memoryMB=128, numCPUs=1,
                               files=vmx_file, guestId='dosGuest',
                               version='vmx-07')
    print("Creating VM {}...".format(vm_name))
    task = vm_folder.CreateVM_Task(config=config, pool=resource_pool)
    tasks.wait_for_tasks(service_instance, [task])
A=1
def main():
"""
Simple command-line program for creating Dummy VM based on Marvel character
names
"""
name = "computer" + str(A)
args = get_args()
service_instance = connect.SmartConnectNoSSL(host=args.host,
user=args.user,
pwd=args.password,
port=int(args.port))
if not service_instance:
print("Could not connect to the specified host using specified "
"username and password")
return -1
atexit.register(connect.Disconnect, service_instance)
content = service_instance.RetrieveContent()
datacenter = get_obj(content, [vim.Datacenter], args.datacenter)
vmfolder = get_obj(content, [vim.Folder], args.folder)
resource_pool = get_obj(content, [vim.ResourcePool], args.resource_pool)
vm_name = name
create_dummy_vm(vm_name, service_instance, vmfolder, resource_pool,
args.datastore)
A + 1
if args.opaque_network:
vm = get_obj(content, [vim.VirtualMachine], vm_name)
add_nic(service_instance, vm, args.opaque_network)
return 0
# Start program
if __name__ == "__main__":
main()
The error I get when running it is:
Creating VM computer1...
Traceback (most recent call last):
File "create_vm.py", line 142, in <module>
main()
File "create_vm.py", line 133, in main
args.datastore)
File "create_vm.py", line 98, in create_dummy_vm
task = vm_folder.CreateVM_Task(config=config, pool=resource_pool)
AttributeError: 'NoneType' object has no attribute 'CreateVM_Task'
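I know the error means something is None when CreateVM_Task is called, but I can't seem to figure out why.
The problem was with the lookups that feed the config parameters. With the current code, the datacenter and vmfolder objects come back as None when printed, which is why vm_folder.CreateVM_Task fails. To fix this I replaced the lookups with the following block, which takes the first datacenter under the root folder and the resource pool of its first host: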
content = service_instance.RetrieveContent()
datacenter = content.rootFolder.childEntity[0]
vmfolder = datacenter.vmFolder
hosts = datacenter.hostFolder.childEntity
resource_pool = hosts[0].resourcePool
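Two smaller points from the question can be sketched without a vSphere connection. The statement A + 1 in main() computes a value and discards it, so the name never changes, and nothing loops over the requested count. And a lookup that returns None is easier to diagnose if it fails immediately. The helpers vm_names and require below are hypothetical and not part of pyVmomi or the sample tools.

```python
def vm_names(prefix, count, start=1):
    """Return `count` sequential names: prefix1, prefix2, ..."""
    return ["{}{}".format(prefix, number)
            for number in range(start, start + count)]


def require(obj, kind, name):
    """Fail early with a readable message when a lookup returned None."""
    if obj is None:
        raise LookupError("{} '{}' was not found".format(kind, name))
    return obj
```

In main() this would look roughly like wrapping each get_obj(...) call in require(...), then iterating over vm_names("computer", args.count) and calling create_dummy_vm once per name.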
I've been trying to find the answer to this, but everything I look at is about reading a file from another directory or running a file from another directory. I'm sure what I want to do isn't that hard, but I am new at this and don't know how to describe what it's called.
I have a Python script run.py that is in the /src directory. I run that script from the /src directory. run.py uses two files (configuration.py and server.py). These two files are in a folder called lib (src/lib). All folders have an empty __init__.py file.
When I take these files out of lib and put them directly in src, I can run the script when it looks like it does below.
import os
import inspect
import sys
import configuration
import server
# Initialize Parameters
f_path = os.path.abspath(inspect.getfile(inspect.currentframe()))
absolute_path = os.path.dirname(f_path)
if __name__ == "__main__":
from optparse import OptionParser, OptionGroup
parser = OptionParser()
parser.usage = "usage: %prog [options] "
parser.description = "Primary application entry point."
parser.add_option("-v", "--verbose", dest="verbose", action="store_true",
default=False, help="Run verbose.")
group = OptionGroup(parser, "Node operations")
group.add_option("--do-queue-job", dest="do_queue_job", action="store_true",
help="Run the next job in the quasar-server queue.")
parser.add_option_group(group)
(options, args) = parser.parse_args()
# Clear argv to prevent issues with web.py automatically using arguments to
# bind ip addresses.
sys.argv = []
configuration = configuration.Configuration("/home/mscarpa/PhpstormProjects/quasar-node/quasar-node/quasar-node/src/config.yml")
if (options.do_queue_job):
# Get server instance
server_connection = server.QuasarConnection(configuration)
#return server_connection
# Get next job from server
next_job = server_connection.get_next_job()
#return next_job
The two parts of the code I know I have to change if I move the two files to /src/lib are the following:
configuration = configuration.Configuration("/home/mscarpa/PhpstormProjects/quasar-node/quasar-node/quasar-node/src/config.yml")
server_connection = server.QuasarConnection(configuration)
I am thinking that I would just have to put lib. before them like so, but every time I try it, it says lib is not defined.
configuration = lib.configuration.Configuration("/home/mscarpa/PhpstormProjects/quasar-node/quasar-node/quasar-node/src/config.yml")
server_connection = lib.server.QuasarConnection(configuration)
This is probably a noob question, but does anyone know how to target these files if they are in the src/lib directory as opposed to just the src directory?
You just need to change your import statement to reflect the module's new location:
import lib.configuration as configuration
import lib.server as server
And the rest of your script doesn't really need to change.
I got it. I think your answer may have worked in certain cases, but I think my problem, being new at this, is figuring out what to search for.
It was a sys.path thing, so I had to add the path to that lib folder before I imported the files.
sys.path.insert(0, '/home/mscarpa/PhpstormProjects/quasar-node/quasar-node/quasar-node/src/lib')
import configuration
import server
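If the sys.path route is kept, the hard-coded absolute path can be derived from the script's own location instead, so the project still works after being moved. This is only a sketch; add_lib_to_path is a made-up helper and "lib" is the folder name from the question.

```python
import sys
from pathlib import Path


def add_lib_to_path(script_file, folder="lib"):
    """Put <directory of script_file>/<folder> at the front of sys.path
    and return that directory as a string."""
    lib_dir = str(Path(script_file).resolve().parent / folder)
    if lib_dir not in sys.path:
        sys.path.insert(0, lib_dir)
    return lib_dir
```

In run.py this would be called as add_lib_to_path(__file__) before the import configuration and import server lines.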
I started using Python a few days ago and I think I have a very basic question where I am stuck. Maybe I am not doing it correctly in Python, so I wanted some advice from the experts:
I have a config.cfg & a class test in one package lib as follows:
myProj/lib/pkg1/config.cfg
[api_config]
url = https://someapi.com/v1/
username=sumitk
myProj/lib/pkg1/test.py
import ConfigParser
class test(object):
    def __init__(self, **kwargs):
        config = ConfigParser.ConfigParser()
        config.read('config.cfg')
        print config.get('api_config', 'username')
        # just printing here but will be using this as a class variable
    # ... some other methods ...
Now I want to create an object of test in some other module in a different package
myProj/example/useTest.py
from lib.pkg1.test import test
def temp(a, b, c):
    var = test()
def main():
    temp("","","")
if __name__ == '__main__':
    main()
Running useTest.py gives me this error:
...
print config.get('api_config', 'username')
File "C:\Python27\lib\ConfigParser.py", line 607, in get
raise NoSectionError(section)
ConfigParser.NoSectionError: No section: 'api_config'
Now if I place this useTest.py in the same package, it runs perfectly fine:
myProj/lib/pkg1/useTest.py
myProj/lib/pkg1/test.py
myProj/lib/pkg1/config.cfg
I guess there is some very basic package access concept in Python that I am not aware of or is there something I am doing wrong here?
The issue here is that you have a different working directory depending on which module is your main script. You can check the working directory by adding the following lines to the top of each script:
import os
print os.getcwd()
Because you just provide 'config.cfg' as your file name, it will attempt to find that file inside of the working directory.
To fix this, give an absolute path to your config file.
You should be able to figure out the absolute path with the following method since you know that config.cfg and test.py are in the same directory:
# inside of test.py
import os
config_path = os.path.join(os.path.dirname(os.path.abspath(__file__)),
                           'config.cfg')
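For readers on Python 3, the same idea can be sketched with configparser and pathlib. Note that ConfigParser.read silently skips files it cannot find, which is why the original failure showed up as NoSectionError rather than a file-not-found error; the sketch checks the return value to make that case explicit. The helper names load_config and demo are my own.

```python
import configparser
import os
import tempfile
from pathlib import Path


def load_config(module_file, filename="config.cfg"):
    """Read `filename` from the directory that contains module_file,
    regardless of the current working directory."""
    config_path = Path(module_file).resolve().parent / filename
    config = configparser.ConfigParser()
    # read() returns the list of files it actually managed to parse
    if not config.read(str(config_path)):
        raise FileNotFoundError("no config file at {}".format(config_path))
    return config


def demo():
    """Create a config next to a pretend test.py and read one value."""
    with tempfile.TemporaryDirectory() as tmp:
        with open(os.path.join(tmp, "config.cfg"), "w") as handle:
            handle.write("[api_config]\nusername=sumitk\n")
        config = load_config(os.path.join(tmp, "test.py"))
        return config.get("api_config", "username")
```

Inside test.py the call would simply be load_config(__file__).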