NCO/pynco: ncea can't find files from within Python - python

I am trying to run ncea from within python to make monthly averages from daily files over many years of data.
The command:
ncea -v analysed_sst,sea_ice_fraction /mnt/r01/data/goes-poes_ghrsst/daily/200301*.nc 200301-gp-monthly.nc
runs fine in the terminal.
But in Python, I get the following error:
call(["ncea","-v","analysed_sst,sea_ice_fraction","/mnt/r01/data/goes-poes_ghrsst/daily/200301*.nc",monthly_file])
ncea: ERROR file /mnt/r01/data/goes-poes_ghrsst/daily/200301*.nc neither exists locally nor matches remote filename patterns
I also tried:
nco.ncea(input="/mnt/r01/data/goes-poes_ghrsst/daily/200301*.nc",output=monthly_file).variables['analysed_sst','sea_ice_fraction']
and get the same error.
I can't figure out if this is an NCO problem or a Python thing.
I get the same error when I use only two files to see if the issue comes from the wildcard.
For example:
input_string="/mnt/r01/data/goes-poes_ghrsst/daily/20030201000000-STAR-L4_GHRSST-SSTfnd-Geo_Polar_Blended_Night-GLOB-v02.0-fv01.0-0-360.nc /mnt/r01/data/goes-poes_ghrsst/daily/20030202000000-STAR-L4_GHRSST-SSTfnd-Geo_Polar_Blended_Night-GLOB-v02.0-fv01.0-0-360.nc"
call(["ncea","-v","analysed_sst,sea_ice_fraction",input_string,monthly_file])
ncea: ERROR file /mnt/r01/data/goes-poes_ghrsst/daily/20030201000000-STAR-L4_GHRSST-SSTfnd-Geo_Polar_Blended_Night-GLOB-v02.0-fv01.0-0-360.nc,/mnt/r01/data/goes-poes_ghrsst/daily/20030202000000-STAR-L4_GHRSST-SSTfnd-Geo_Polar_Blended_Night-GLOB-v02.0-fv01.0-0-360.nc neither exists locally nor matches remote filename patterns
I can't figure out what the syntax should be.
I get the same error if I do:
input_string="file1,file2"
input_string="file1 file2"
input_string="file1\ file2"
And if I try a list instead, like what glob.glob would return:
input_string=["file1","file2"]
I get:
TypeError: expected str, bytes or os.PathLike object, not list
Thanks!

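After finding this question: "Using all elements of a list as argument to a system command (netCDF operator) in a python code", I finally figured it out.
In the terminal, the shell expands the wildcard into a list of file names before ncea runs. call() with an argument list does not go through a shell, so ncea received the literal pattern as a single file name. Each input file has to be its own element of the argument list: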
import glob
import subprocess

input_string="/mnt/r01/data/goes-poes_ghrsst/daily/200301*.nc"
monthly_file="200301-gp-monthly.nc"
list1=['ncea','-v','analysed_sst,sea_ice_fraction']
list2=glob.glob(input_string)  # expand the wildcard in Python
command=list1+list2+[monthly_file]
subprocess.run(command)
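The same idea can be wrapped in a small helper. This is only a minimal sketch: the name build_ncea_command is hypothetical, and the only ncea usage it assumes is the -v option and the input/output order from the command above. It fails early when the pattern matches nothing instead of handing ncea an empty file list.

```python
import glob
import subprocess


def build_ncea_command(pattern, output_file, variables):
    """Return an argv list for ncea with the wildcard expanded in Python.

    Hypothetical helper: it only mirrors the command shown above.
    """
    # subprocess does not go through a shell, so expand the pattern here.
    input_files = sorted(glob.glob(pattern))
    if not input_files:
        raise FileNotFoundError("no files match {!r}".format(pattern))
    # Every input file must be its own element of the argument list.
    return ["ncea", "-v", ",".join(variables)] + input_files + [output_file]


def run_ncea(pattern, output_file, variables):
    """Build the command and run it; raises if ncea exits non-zero."""
    subprocess.run(build_ncea_command(pattern, output_file, variables), check=True)
```

For the January 2003 case this would be called as run_ncea("/mnt/r01/data/goes-poes_ghrsst/daily/200301*.nc", "200301-gp-monthly.nc", ["analysed_sst", "sea_ice_fraction"]).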

Related

ValueError: need more than 0 values to unpack (Python 2)

I am trying to replicate another researcher's findings by using the Python file that he added as a supplement to his paper. It is the first time I am diving into Python, so the error might be extremely simple to fix, yet after two days I still haven't solved it. For context, the Readme file contains the following instruction:
"To run the script, make sure Python2 is installed. Put all files into one folder designated as “cf_dir”."
In the script I get an error at the following lines:
if __name__ == '__main__':
    cf_dir, cf_file, cf_phys_file = sys.argv[1:4]
    os.chdir(cf_dir)
    cf = pd.read_csv(cf_file)
    cf_phys = pd.read_csv(cf_phys_file)
ValueError: need more than 0 values to unpack
The "cf_file" and "cf_phys_file" are the two major files among all the files in the one folder named "cf_dir". The "cf_phys_file" relates to only two survey questions (Q22 and Q23), and the "cf_file" includes all the other questions, 1-21. It seems that the code is meant to retrieve those two files from the directory? Only for the "cf_phys_file" are the columns 1:4 needed. The current working directory is already set to the right location.
The path where I located "cf_dir" is as follows:
C:\Users\Marc-Marijn Ossel\Documents\RSM\Thesis\Data\Suitable for ML\Data en Artikelen\Per task Suitability for Machine Learning score readme\cf_dir
Alternative option in the readme file:
The readme file also offers this option, but here too I cannot work out how to point the path to the right location:
"Run the following command in an open terminal (substituting for file names below):
python cfProcessor_AEAPnP.py cf_dir cf_file cf_phys_file task_file jobTaskRatingFile jobDataFile OESfile
This should generate the data and plots as necessary."
When I run that in "Command Prompt", I get the following error, and I am not sure how to set the working directory correctly.
- python: can't open file 'cfProcessor_AEAPnP.py': [Errno 2] No such file or directory
Thanks for reading, and I hope there's someone who can help me!
Best regards & stay safe out there during Corona!!
Marc
cf_dir, cf_file, cf_phys_file = sys.argv[1:4]
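means that the Python file expects at least three command-line arguments when it is called. If you run it without arguments, sys.argv[1:4] is an empty list, and unpacking it into three names raises the ValueError you see.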
In order to run
python cfProcessor_AEAPnP.py cf_dir cf_file cf_phys_file task_file jobTaskRatingFile jobDataFile OESfile
the command prompt must be in the folder that contains cfProcessor_AEAPnP.py.
So, open a command prompt and type
cd path_to_the_folder_where_ur_python_file_is_located
Now the prompt is in the folder of the Python file, and python can find the script.
Also, make sure you give the full path in double quotes for each argument, because your paths contain spaces.
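To make the failure mode concrete, here is a minimal sketch of a guarded version of that unpacking line. The function name parse_args is hypothetical and not part of the researcher's script; it only shows where the ValueError comes from and how a usage message could replace it.

```python
def parse_args(argv):
    """Return (cf_dir, cf_file, cf_phys_file) from an argv-style list."""
    args = argv[1:4]  # argv[0] is the script name itself
    if len(args) < 3:
        # Unpacking a too-short list is what raises the ValueError.
        raise SystemExit(
            "usage: python cfProcessor_AEAPnP.py cf_dir cf_file cf_phys_file ..."
        )
    cf_dir, cf_file, cf_phys_file = args
    return cf_dir, cf_file, cf_phys_file


# In the script this would be used as:
#     cf_dir, cf_file, cf_phys_file = parse_args(sys.argv)
```

Called with no arguments, this exits with the usage line instead of a traceback.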

Rerun maxent using python

I'm trying to create a script that reruns maxent for different inputs. I have around 1500 species that need to be processed separately. My idea is to use a python loop for this program. But I can't seem to find the right information to start.
Right now I have 3 simple lines which tell Python to open the program.
import subprocess
subprocess.call(['java', '-jar', r'C:\Program Files (x86)\Maxent\maxent.jar'])
subprocess.call([r'C:\Program Files (x86)\Maxent\maxent.bat'])
Now I want to tell python which input to use. However, I can't seem to find any documentation on a function which specifies the input for a program.
Does anyone have any ideas on how to approach the next step?
-------------------Edit------------------------------------
Right now I have the following code:
import glob
import subprocess
insect = glob.glob('D:\Maxent\samples\*.csv')
for species in insect:
    subprocess.call(['java', '-jar', r'D:\Maxent\maxent.jar', 'environmentallayers=D:\Maxent\layers',
                     species, 'outputdirectory= D:\Maxent\outputs', 'redoifexists', 'autorun'])
This gives me the following error in maxent:
Initialization flags not understood: D:\Maxent\samples\Aeshna_juncea.csv
and the following error in Python:
C:\Users\merel\PycharmProjects\untitled\venv\Scripts\python.exe "C:/Users/merel/PycharmProjects/untitled/maxent python.py"
Error: Initialization flags not understood: species
Error: No species selected
I also tried it with quotes (') around species. This gave me the following error:
C:\Users\merel\PycharmProjects\untitled\venv\Scripts\python.exe "C:/Users/merel/PycharmProjects/untitled/maxent python.py"
Error: Initialization flags not understood: species
Error: No species selected
I don't know why the program doesn't understand the argument. I also tried it with x instead of species to make sure that the word species didn't already exist in the library.
You need to pass arguments/flags to Maxent's jar file in order to achieve your goals, if I understood it correctly.
I've downloaded Maxent and found the necessary arguments/flags. When you start Maxent, click Help and scroll down to "Batch mode"; all the arguments/flags are listed there, along with an example usage:
java -mx512m -jar maxent.jar environmentallayers=layers samplesfile=samples\bradypus.csv outputdirectory=outputs togglelayertype=ecoreg redoifexists autorun
You can add those arguments/flags after the jar path like this (note the samplesfile= prefix, which is missing in front of species in your loop; the raw string keeps \b from being read as a backspace escape):
subprocess.call(['java', '-jar', r'C:\Program Files (x86)\Maxent\maxent.jar', 'environmentallayers=layers', r'samplesfile=samples\bradypus.csv', 'outputdirectory=outputs', 'togglelayertype=ecoreg', 'redoifexists', 'autorun'])
I hope this helps you on your project. I have not tried any of this since I do not know anything about your field.
Edit:
You don't have to call the .bat file, since it also executes maxent.jar with the given arguments/flags.
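Applied to the loop over species files, this could look like the sketch below. The function names are hypothetical, and the only Maxent flags used are the ones from the batch-mode example above; I have not run this against Maxent itself.

```python
import glob
import os
import subprocess


def build_maxent_command(jar_path, layers_dir, samples_file, output_dir):
    """Return the argv list for one batch-mode Maxent run."""
    return [
        "java", "-jar", jar_path,
        "environmentallayers=" + layers_dir,
        "samplesfile=" + samples_file,  # key=value, not the bare path
        "outputdirectory=" + output_dir,
        "redoifexists",
        "autorun",
    ]


def run_all_species(jar_path, layers_dir, samples_dir, output_dir):
    """Run Maxent once per CSV file found in samples_dir."""
    for samples_file in sorted(glob.glob(os.path.join(samples_dir, "*.csv"))):
        subprocess.call(
            build_maxent_command(jar_path, layers_dir, samples_file, output_dir)
        )
```

With the paths from the question, the call would be run_all_species(r'D:\Maxent\maxent.jar', r'D:\Maxent\layers', r'D:\Maxent\samples', r'D:\Maxent\outputs').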

Can't get working command line on prompt to work on subprocess

I need to extract text from a PDF. I tried PyPDF2, but the extractText method returned what looks like encrypted text, even though the PDF is not encrypted according to isEncrypted.
So I moved on to trying accessing a program that does the job from the command prompt, so I could call it from python with the subprocess module. I found this program called textExtract, which did the job I wanted with the following command line on cmd:
"textextract.exe" "download.pdf" /to "download.txt"
However, when I tried running it with subprocess I couldn't get a 0 return code.
Here is the code I tried:
textextract = shlex.split(r'"textextract.exe" "download.pdf" /to "download.txt"')
subprocess.run(textextract)
I already tried it with shell=True, but it didn't work.
Can anyone help me?
I was able to get the following script to work from the command line after installing the PDF2Text Pilot application you're trying to use:
import shlex
import subprocess
args = shlex.split(r'"textextract.exe" "download.pdf" /to "download.txt"')
print('args:', args)
subprocess.run(args)
Sample screen output of running it from a command line session:
> C:\Python3\python run-textextract.py
args: ['textextract.exe', 'download.pdf', '/to', 'download.txt']
Progress:
Text from "download.pdf" has been successfully extracted...
Text extraction has been completed!
The above output was generated using Python 3.7.0.
I don't know if your use of Spyder on Anaconda affects things or not, since I'm not familiar with them. If you continue to have problems with this, then, if it's possible, I suggest you see if you can get things working directly, i.e. by running the Python interpreter on the script manually from the command line, similar to what's shown above. If that works but using Spyder doesn't, then you'll at least know the cause of the problem.
There's no need to build a string of quoted strings and then parse that back out to a list of strings. Just create a list and pass that:
command=["textextract.exe", "download.pdf", "/to", "download.txt"]
subprocess.run(command)
All that shlex.split is doing is creating a list by removing all of the quotes you had to add when creating the string in the first place. That's an extra step that provides no value over just creating the list yourself.
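A short sketch of both points: the split string and the hand-written list are the same argument list, and the return code can be inspected from the object that subprocess.run returns. The helper name run_and_report is hypothetical.

```python
import shlex
import subprocess

from_string = shlex.split(r'"textextract.exe" "download.pdf" /to "download.txt"')
as_list = ["textextract.exe", "download.pdf", "/to", "download.txt"]
print(from_string == as_list)  # True: both forms are the same list


def run_and_report(command):
    """Run command (a list of strings) and return its exit code."""
    result = subprocess.run(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    if result.returncode != 0:
        # stderr usually explains a non-zero exit code
        print(result.stderr.decode(errors="replace"))
    return result.returncode
```

If run_and_report(as_list) returns a non-zero code, the printed stderr is the first place to look; a FileNotFoundError instead means the executable itself was not found from the current working directory or PATH.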

Windows Python2.7 path parsing error

I'm attempting to use the python 010 editor template parser
The doc specifically states (to get started):
import pfp
pfp.parse(data_file="C:\path2File\file.SWF",template_file="C:\path2File\SWFTemplate.bt")
However, it throws:
RuntimeError: Unable to invoke 'cpp'. Make sure its path was passed correctly
Original error: [Error 2] The system cannot find the file specified
I've tried everything, from using raw strings:
df = r"C:\path2File\file.swf"
tf = r"C:\path2File\SWFTemplate.bt"
To single and then double '\'s or '/'s in the string. However, it keeps throwing the above error message.
I checked the files are in the path and ensured everything is properly spelled, case sensitively.
To test my paths, I've used the Windows "type" command (equivalent to *nix cat) and passed the strings as args in a subprocess.Popen, which worked.
The problem is not your file paths. The library is trying to invoke cpp, the C preprocessor, and it can't find one on your system.
You'll need to install one, or make sure that your PATH has a cpp.exe on it somewhere.
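A quick way to check whether cpp is reachable from Python, as a minimal sketch. The helper name find_executable is hypothetical. Note that shutil.which needs Python 3.3 or later; on Python 2.7, distutils.spawn.find_executable does roughly the same job.

```python
import shutil


def find_executable(name="cpp"):
    """Return the full path of name if it is on PATH, else None."""
    return shutil.which(name)


path = find_executable("cpp")
if path is None:
    print("cpp is not on PATH, so it cannot be invoked by name")
else:
    print("cpp found at", path)
```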

Python 3 FTPLIB, NoneType Errors, and Uploads/Downloads

In the end, I want my script to download all files in a directory and all its sub-directories, so I am trying ftplib. I'm trying to call dir on my FTP server and put the result into a variable, but I get NoneType. I can connect to the server, and when I call directory = session.dir() it displays a kind of matrix-style output in the console with files, read/write permissions, dates, etc. But when I then try to print directory, all I get is "None". My initial idea was to download each item in the directory to my computer, but I can't seem to get a list of the directory!
directory = session.dir()
print(str(directory))
Sorry for the long and probably trivial explanation, but I have become a little bit too frustrated.
Any help would be very much appreciated!
-Clem
First, read this. http://docs.python.org/library/ftplib.html#ftplib.FTP.nlst
Then, try this:
directory = session.nlst()
print(directory)
Note.
You don't need to do print(str(...)). The print function gets the string representation for you.
In the official docs, the very first example shows how to do what you need: use .retrlines('LIST') to read the output of the LIST command.
Another way is to use .nlst().
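The reason for the None is that dir() prints the listing to standard output and returns nothing unless it is given a callback. A minimal sketch of both approaches follows; the function names are hypothetical, and the host and credentials in the usage comment are placeholders.

```python
from ftplib import FTP


def list_names(session):
    """Return the entry names in the current remote directory."""
    return session.nlst()


def list_details(session):
    """Return the raw LIST lines instead of printing them."""
    lines = []
    # retrlines calls the callback once per line of server output.
    session.retrlines("LIST", lines.append)
    return lines


# Usage (placeholder host and credentials):
#     session = FTP("ftp.example.com")
#     session.login("user", "password")
#     for name in list_names(session):
#         print(name)
```

list_names gives bare names that can be looped over for downloading; list_details keeps the permissions and dates, which helps when telling directories apart from files.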
