Rerun maxent using python - python

I'm trying to create a script that reruns maxent for different inputs. I have around 1500 species that need to be processed separately. My idea is to use a python loop for this program. But I can't seem to find the right information to start.
Right now I have 3 simple lines which tells python to open the program.
import subprocess
subprocess.call(['java', '-jar', r'C:\Program Files (x86)\Maxent\maxent.jar'])
subprocess.call([r'C:\Program Files (x86)\Maxent\maxent.bat'])
Now I want to tell python which input to use. However, I can't seem to find any documentation on a function which specifies the input for a program.
Does anyone have any ideas on how to approach the next step?
-------------------Edit------------------------------------
Right now I have the following code:
import glob
import subprocess
insect = glob.glob('D:\Maxent\samples\*.csv')
for species in insect:
subprocess.call(['java', '-jar', r'D:\Maxent\maxent.jar', 'environmentallayers=D:\Maxent\layers',
species, 'outputdirectory= D:\Maxent\outputs', 'redoifexists', 'autorun'])
This gives me the following error in maxent:
Initialization flags not understood: D:\Maxent\samples\Aeshna_juncea.csv
and the folowing error in pyhton
C:\Users\merel\PycharmProjects\untitled\venv\Scripts\python.exe "C:/Users/merel/PycharmProjects/untitled/maxent python.py"
Error: Initialization flags not understood: species
Error: No species selected
I also tried it with the ' around species. This gave me the following error:
C:\Users\merel\PycharmProjects\untitled\venv\Scripts\python.exe "C:/Users/merel/PycharmProjects/untitled/maxent python.py"
Error: Initialization flags not understood: species
Error: No species selected
I don't know why the program doesn't understand the argument. I also tried it with x instead of species to make sure that the word species didn't already exist in the library.

You need to pass arguments/flags to Maxent's jar file in order to achieve your goals, if I understood it correctly.
I've downloaded the Maxen and found the necessary arguments/flags. When you start Maxent, click help and scroll down to Batch mode, you can find all the arguments/flags there also an example usage as well; java -mx512m -jar maxent.jar environmentallayers=layers samplesfile=samples\bradypus.csv outputdirectory=outputs togglelayertype=ecoreg redoifexists autorun
You can add those arguments/flags after your path such like this:
subprocess.call(['java', '-jar', r'C:\Program Files (x86)\Maxent\maxent.jar', 'environmentallayers=layers', 'samplesfile=samples\bradypus.csv', 'outputdirectory=outputs', 'togglelayertype=ecoreg', 'redoifexists', 'autorun'])
I hope this helps you on your project. I have not tried any of this since I do not know anything about your field.
Edit:
You don't have to call the .bat file since it also executes the maxent.jar wtih the given arguments/flags.

Related

ValueError: need more than 0 values to unpack (Python 2)

I am trying to replicate another researcher's findings by using the Python file that he added as a supplement to his paper. It is the first time I am diving into Python, so the error might be extremely simple to fix, yet after two days I haven't still. For context, in the Readme file there's the following instruction:
"To run the script, make sure Python2 is installed. Put all files into one folder designated as “cf_dir”.
In the script I get an error at the following lines:
if __name__ == '__main__':
cf_dir, cf_file, cf_phys_file = sys.argv[1:4]
os.chdir(cf_dir)
cf = pd.read_csv(cf_file)
cf_phys = pd.read_csv(cf_phys_file)
ValueError: need more than 0 values to unpack
The "cf_file" and "cf_phys_file" are two major components of all files that are in the one folder named "cf_dir". The "cf_phys_file" relates only to two survey question's (Q22 and Q23), and the "cf_file" includes all other questions 1-21. Now it seems that the code is meant to retrieve those two files from the directory? Only for the "cf_phys_file" the columns 1:4 are needed. The current working directory is already set at the right location.
The path where I located "cf_dir" is as follows:
C:\Users\Marc-Marijn Ossel\Documents\RSM\Thesis\Data\Suitable for ML\Data en Artikelen\Per task Suitability for Machine Learning score readme\cf_dir
Alternative option in readme file,
In the readme file there's this option, but also here I cannot understand how to direct the path to the right location:
"Run the following command in an open terminal (substituting for file names
below): python cfProcessor_AEAPnP.py cf_dir cf_file cf_phys_file task_file jobTaskRatingFile
jobDataFile OESfile
This should generate the data and plots as necessary."
When I run that in "Command Prompt", I get the following error, and I am not sure how to set the working directory correctly.
- python: can't open file 'cfProcessor_AEAPnP.py': [Errno 2] No such file or directory
Thanks for the reading, and I hope there's someone who could help me!
Best regards & stay safe out there during Corona!!
Marc
cf_dir, cf_file, cf_phys_file = sys.argv[1:4]
means, the python file expects few arguments when called.
In order to run
python cfProcessor_AEAPnP.py cf_dir cf_file cf_phys_file task_file jobTaskRatingFile jobDataFile OESfile
the command prompt should be in that folder.
So, open command prompt and type
cd path_to_the_folder_where_ur_python_file_is_located
Now, you would have reached the path of the python file.
Also, make sure you give full path in double quotes for the arguments.

Executing subprocess cannot find specified file on Windows

I'm working inside a system that has Jython2.5 but I need to be able to call some of Google's apis so I wrote an offline script that I wanted to call from my Jython environment and return to me small pieces of data. Like a JobID or a sheet URL or something from Google.
I've tried a number of things but I always get an error back from Windows, saying that it cannot find the file specified.
Path is done in two ways.
The first way using a string
stringPath = r"‪C:\GooglePipes\Scripts\filetobq.py C:\GooglePipes\Keys\DEV-BigQueryKey.json nofile C:\GooglePipes\BQ_Downtime\TESTFILE.CSV dataset1 table1"
And the second way, as a sequence (per the docs, using shell=false supply a sequence)
seqPath = [r"‪C:\GooglePipes\Scripts\filetobq.py",r"C:\GooglePipes\Keys\DEV-BigQueryKey.json","nofile",r"C:\GooglePipes\BQ_Downtime\TESTFILE.CSV","dataset1","table1"]
Called with
data, err = Popen(seqPath, shell=True, stderr=PIPE, stdout=PIPE).communicate()
#Read values back in
print data
print err
Replacing seqPath with stringPath to try it either way.
I've been at this all weekend, every time I run it I get from Windows
The system cannot find the path specified.
from the err print. I've been unable to debug much further than this. I'm not really sure what's happening. When I paste the stringPath variable directly into my computer's command window it executes.
I've also called subprocess.list2cmdline(seqPath) to see what it's outputting. It's giving me a ? in front of the string, but I haven't been able to figure out what that means. I can paste the rest of the string, starting after the question mark into the command window and it executes.
?C:\GooglePipes\Scripts\filetobq.py C:\GooglePipes...
I've tried a number of different combinations of true and false on shell, passing different args into Popen, double slashes, and I have no less than 30 tabs open from stack overflow and other help forums. I just have no idea what to do at this point and any help is appreciated.
Edit
The ? at the start of the sting is actually a NULL character when I did some additional logging. This seems to be the root of my problem. I can't figure out why it shows up, but it was present in my copy pastes. I started manually typing, and I got it working. When I feed the path with my Jython program it is present again.
Ultimately the error was the ?/NULL character.
I went back to the source value where the program was grabbing the path and it was present there. After I hand-re keyed it in, everything started working.
If you copy and paste what I put in the question, you can see the NULL character in the string if you run it through a string->ASCII converter.
>C:
>NULL 67 58
What a bunch of bullsh***.

Can't get working command line on prompt to work on subprocess

I need to extract text from a PDF. I tried the PyPDF2, but the textExtract method returned an encrypted text, even though the pdf is not encrypted acoording to the isEncrypted method.
So I moved on to trying accessing a program that does the job from the command prompt, so I could call it from python with the subprocess module. I found this program called textExtract, which did the job I wanted with the following command line on cmd:
"textextract.exe" "download.pdf" /to "download.txt"
However, when I tried running it with subprocess I couldn't get a 0 return code.
Here is the code I tried:
textextract = shlex.split(r'"textextract.exe" "download.pdf" /to "download.txt"')
subprocess.run(textextract)
I already tried it with shell=True, but it didn't work.
Can anyone help me?
I was able to get the following script to work from the command line after installing the PDF2Text Pilot application you're trying to use:
import shlex
import subprocess
args = shlex.split(r'"textextract.exe" "download.pdf" /to "download.txt"')
print('args:', args)
subprocess.run(args)
Sample screen output of running it from a command line session:
> C:\Python3\python run-textextract.py
args: ['textextract.exe', 'download.pdf', '/to', 'download.txt']
Progress:
Text from "download.pdf" has been successfully extracted...
Text extraction has been completed!
The above output was generated using Python 3.7.0.
I don't know if your use of spyder on anaconda affects things or not since I'm not familiar with it/them. If you continue to have problems with this, then, if it's possible, I suggest you see if you can get things working directly—i.e. running the the Python interpreter on the script manually from the command line similar to what's shown above. If that works, but using spyder doesn't, then you'll at least know the cause of the problem.
There's no need to build a string of quoted strings and then parse that back out to a list of strings. Just create a list and pass that:
command=["textextract.exe", "download.pdf", "/to", "download.txt"]
subprocess.run(command)
All that shlex.split is doing is creating a list by removing all of the quotes you had to add when creating the string in the first place. That's an extra step that provides no value over just creating the list yourself.

Testing 7-Zip archives from a python script

So I've got a python script that, at it's core, makes .7z archives of selected directories for the purpose of backing up data. For simplicty sake I've simply invoked 7-zip through the windows command line, like so:
def runcompressor(target, contents):
print("Compressing {}...".format(contents))
archive = currentmodule
archive += "{}\\{}.7z".format(target, target)
os.system('7z u "{}" "{}" -mx=9 -mmt=on -ssw -up1q0r2x2y2z1w2'.format(archive, contents))
print("Done!")
Which creates a new archive if one doesn't exist and updates the old one if it does, but if something goes wrong the archive will be corrupted, and if this command hits an existing, corrupted archive, it just gives up. Now 7zip has a command for testing the integrity of an archive, but the documentation says nothing about giving an output, and then comes the trouble of capturing that output in python.
Is there a way I can test the archives first, to determine if they've been corrupted?
The 7z executable returns a value of two or greater if it encounters a problem. In a batch script, you would generally use errorlevel to detect this. Unfortunately, os.system() under Windows gives the return value of the command interpreter used to run your program, not the exit value of your program itself.
If you want the latter, you'll probably going to have to get your hands a little dirtier with the subprocess module, rather than using the os.system() call.
If you have version 3.5 (or better), this is as simple as:
import subprocess as sp
x = sp.run(['7z', 'a', 'junk.7z', 'junk.txt'], stdout=sp.PIPE, stderr=sp.STDOUT)
print(x.returncode)
That junk.txt in my case is a real file but junk.7z is just a copy of one of my text files, hence an invalid archive. The output from the program is 2 so it's easily detectable if something went wrong.
If you print out x rather than just x.returncode, you'll see something like (reformatted and with \r\n sequences removed for readability):
CompletedProcess(
args=['7z', 'a', 'junk.7z', 'junk.txt'],
returncode=2,
stdout=b'
7-Zip [64] 9.20 Copyright (c) 1999-2010 Igor Pavlov 2010-11-18
Error: junk.7z is not supported archive
System error:
Incorrect function.
'
)

running python code, to use sextractor with more than one image file

I have written piece of code which runs sextractor from python, however I only know how to do this for one file, and i need to loop it over 62 files. Im not sure how i would go about doing this. I have attached my code bellow:
#!/usr/bin/env python
# build a catalog using sextractor on des image here
sys.path.append('/home/fitsfiles') #not sure if this does anything/is correct
def sex(image, output, sexdir='/home/sextractor-2.5.0', check_img=None,config=None, l=None) :
'''Construct a sextractor command and run it.'''
#creates a sextractor line e.g sex img.fits -catalog_name -checkimage_name
q="/home/fitsfiles/"+ "01" +".fits"
com = [ "sex ", q, " -CATALOG_NAME " + output]
s0=''
com = s0.join(com)
res = os.system(com)
return res
img_name=sys.argv[0]
output=img_name[0:1]+'_star_catalog.fits'
t=sex(img_name,output)
print '----done !---'
so this code produces a command in my main terminal of, sex /home/fitsfiles/01.fits -CATALOG_NAME g_star_catalog.fits
which successfully produces a star catalogue as I want.
However I want my code to to this for 62 fits files and change the name of star_catalog.fits depending upon which fitsfile is being used. any help would be appreciated.
There are many ways you could approach this. Let's assume you want to run your script as something like
python extract_stars.py /home/fitsfiles/*.fits
Then, you could try something like this:
for arg in len(sys.argv):
filename = arg.split('/')[-1].strip('.fits')
t = sex(arg, filename +'_star_catalog.fits')
# Whatever else
This assumes that you remove the line in sex that reformats the input filename. Also, you do not need to append the fits directory to your path.
The alternative approach is, if you do not plan to do anything else in python, you could write a bash script which would really simplify the task.
And, as a side note, you if you had asked this question more generally (ie, I wish to apply a function I wrote to a number of input files) and without reference to a rather uncommonly used application, you would have likely received an answer much more quickly.
The community has now developed some python wrappers which allow you to run sextractor as if it was a python command. These are: pysex, sewpy and astromatic_wrapper.
The good thing about sextractor wrappers is that allow you to write much cleaner code without the need of defining extra functions, invoking os commands or having the configuration files and the outputfiles on your machine. Moreover, the output can be an astropy table, a pandas dataframe or a numpy array.
For your specific case, you could use pysex and do:
import pysex
import glob
filelist = glob.glob('/directory/*.fits')
for fitsfile in filelist:
cat = pysex.run(fitsfile, params=['X_IMAGE', 'Y_IMAGE', 'FLUX_APER'],
conf_args={'PHOT_APERTURES':5})
print cat['FLUX_APER']

Categories