I have a problem with files I generate with Python. I have some .sh files that I want to create dynamically. The original files themselves execute properly and do what they are supposed to.
The Python-generated files, however, are IDENTICAL to them: the diff command on Linux returns an empty result! But when I execute the generated .sh scripts, they give me random errors.
Here is my normal file (for example):
OBJDUMP=`which riscv32-unknown-elf-objdump`
OBJCOPY=`which riscv32-unknown-elf-objcopy`
COMPILER=`which riscv32-unknown-elf-gcc`
RANLIB=`which riscv32-unknown-elf-ranlib`
VSIM=`which vsim`
echo $VSIM
TARGET_C_FLAGS="-O3 -m32 -g"
#TARGET_C_FLAGS="-O2 -g -falign-functions=16 -funroll-all-loops"
# if you want to have compressed instructions, set this to 1
RVC=0
# if you are using zero-riscy, set this to 1, otherwise it uses RISCY
USE_ZERO_RISCY=0
# set this to 1 if you are using the Floating Point extensions for riscy only
RISCY_RV32F=0
# zeroriscy with the multiplier
ZERO_RV32M=0
# zeroriscy with only 16 registers
ZERO_RV32E=0
# riscy with PULPextensions, it is assumed you use the ETH GCC Compiler
GCC_MARCH="IMXpulpv2"
#compile arduino lib
ARDUINO_LIB=1
PULP_GIT_DIRECTORY=../../
SIM_DIRECTORY="$PULP_GIT_DIRECTORY/vsim"
#insert here your post-layout netlist if you are using IMPERIO
PL_NETLIST=""
cmake "$PULP_GIT_DIRECTORY"/sw/ \
-DPULP_MODELSIM_DIRECTORY="$SIM_DIRECTORY" \
-DCMAKE_C_COMPILER="$COMPILER" \
-DVSIM="$VSIM" \
-DRVC="$RVC" \
-DRISCY_RV32F="$RISCY_RV32F" \
-DUSE_ZERO_RISCY="$USE_ZERO_RISCY" \
-DZERO_RV32M="$ZERO_RV32M" \
-DZERO_RV32E="$ZERO_RV32E" \
-DGCC_MARCH="$GCC_MARCH" \
-DARDUINO_LIB="$ARDUINO_LIB" \
-DPL_NETLIST="$PL_NETLIST" \
-DCMAKE_C_FLAGS="$TARGET_C_FLAGS" \
-DCMAKE_OBJCOPY="$OBJCOPY" \
-DCMAKE_OBJDUMP="$OBJDUMP"
And here is the Python-generated one:
OBJDUMP=`which riscv32-unknown-elf-objdump`
OBJCOPY=`which riscv32-unknown-elf-objcopy`
COMPILER=`which riscv32-unknown-elf-gcc`
RANLIB=`which riscv32-unknown-elf-ranlib`
VSIM=`which vsim`
TARGET_C_FLAGS="-O3 -m32 -g"
RVC=0
USE_ZERO_RISCY=0
RISCY_RV32F=0
ZERO_RV32M=0
ZERO_RV32E=0
GCC_MARCH="IMXpulpv2"
ARDUINO_LIB=1
PULP_GIT_DIRECTORY=../../
SIM_DIRECTORY="$PULP_GIT_DIRECTORY/vsim"
PL_NETLIST=""
cmake "$PULP_GIT_DIRECTORY"/sw/ \
-DPULP_MODELSIM_DIRECTORY="$SIM_DIRECTORY" \
-DCMAKE_C_COMPILER="$COMPILER" \
-DVSIM="$VSIM" \
-DRVC="$RVC" \
-DRISCY_RV32F="$RISCY_RV32F" \
-DUSE_ZERO_RISCY="$USE_ZERO_RISCY" \
-DZERO_RV32M="$ZERO_RV32M" \
-DZERO_RV32E="$ZERO_RV32E" \
-DGCC_MARCH="$GCC_MARCH" \
-DARDUINO_LIB="$ARDUINO_LIB" \
-DPL_NETLIST="$PL_NETLIST" \
-DCMAKE_C_FLAGS="$TARGET_C_FLAGS" \
-DCMAKE_OBJCOPY="$OBJCOPY" \
-DCMAKE_OBJDUMP="$OBJDUMP"
I don't know how much this will help you, but that's how it is.
Now I execute these scripts with ./script, and the first one runs fine. The second one gives me the error:
CMake Error: The source directory "../sw/build/ " does not exist.
And the path where the script resides is exactly ../sw/build/
What's going on here?
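To make any invisible difference visible, this is the kind of check I would run (a minimal sketch; normal.sh and generated.sh are placeholder names for the hand-written and the generated script):
# Compare the two scripts byte for byte and print any lines that differ.
# repr() makes otherwise invisible characters (trailing spaces, tabs,
# carriage returns) show up in the output.
with open('normal.sh', 'rb') as f1, open('generated.sh', 'rb') as f2:
    lines1 = f1.readlines()
    lines2 = f2.readlines()
for i, (a, b) in enumerate(zip(lines1, lines2), start=1):
    if a != b:
        print('line %d differs: %r vs %r' % (i, a, b))
if len(lines1) != len(lines2):
    print('different number of lines: %d vs %d' % (len(lines1), len(lines2)))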
I am working to implement a snakemake pipeline on our university's HPC. I am doing so in an activated conda environment and with the following script submitted using sbatch:
snakemake --dryrun --summary --jobs 100 --use-conda -p \
--configfile config.yaml --cluster-config cluster.yaml \
--profile /path/to/conda/env --cluster "sbatch --parsable \
--qos=unlim --partition={cluster.queue} \
--job-name=username.{rule}.{wildcards} --mem={cluster.mem}gb \
--time={cluster.time} --ntasks={cluster.threads} \
--nodes={cluster.nodes}"
config.yaml
metaG_accession: PRJNA766694
metaG_ena_table: /home/etucker5/miniconda3/envs/s-niv-MAGs/data/input/ENA_tables/PRJNA766694_metaG_wenv.txt
inputDIR: /home/etucker5/miniconda3/envs/s-niv-MAGs/data/input
outputDIR: /home/etucker5/miniconda3/envs/s-niv-MAGs/data/output
scratch: /home/etucker5/miniconda3/envs/s-niv-MAGs/data/scratch
adapters: /home/etucker5/miniconda3/envs/s-niv-MAGs/data/input/adapters/illumina-adapters.fa
metaG_sample_list: /home/etucker5/miniconda3/envs/s-niv-MAGs/data/input/SampleList_ForAssembly_metaG.txt
megahit_other: --continue --k-list 29,39,59,79,99,119
megahit_cpu: 80
megahit_min_contig: 1000
megahit_mem: 0.95
restart-times: 0
max-jobs-per-second: 1
max-status-checks-per-secon: 10
local-cores: 1
rerun-incomplete: true
keep-going: true
Snakefile
configfile: "config.yaml"
import io
import os
import pandas as pd
import numpy as np
import pathlib
from snakemake.exceptions import print_exception, WorkflowError
#----SET VARIABLES----#
METAG_ACCESSION = config["metaG_accession"]
METAG_SAMPLES = pd.read_table(config["metaG_ena_table"])
INPUTDIR = config["inputDIR"]
ADAPTERS = config["adapters"]
SCRATCHDIR = config["scratch"]
OUTPUTDIR = config["outputDIR"]
METAG_SAMPLELIST = pd.read_table(config["metaG_sample_list"], index_col="Assembly_group")
METAG_ASSEMBLYGROUP = list(METAG_SAMPLELIST.index)
ASSEMBLYGROUP = METAG_ASSEMBLYGROUP
#----COMPUTE VAR----#
MEGAHIT_CPU = config["megahit_cpu"]
MEGAHIT_MIN_CONTIG = config["megahit_min_contig"]
MEGAHIT_MEM = config["megahit_mem"]
MEGAHIT_OTHER = config["megahit_other"]
and the Slurm error output:
snakemake: error: unrecognized arguments: --metaG_accession=PRJNA766694
--metaG_ena_table=/home/etucker5/miniconda3/envs/s-niv-MAGs/data/input/ENA_tables/PRJNA766694_metaG_wenv.txt
--inputDIR=/home/etucker5/miniconda3/envs/s-niv-MAGs/data/input
--outputDIR=/home/etucker5/miniconda3/envs/s-niv-MAGs/data/output
--scratch=/home/etucker5/miniconda3/envs/s-niv-MAGs/data/scratch
--adapters=/home/etucker5/miniconda3/envs/s-niv-MAGs/data/input/adapters/illumina-adapters.fa
--metaG_sample_list=/home/etucker5/miniconda3/envs/s-niv-MAGs/data/input/SampleList_ForAssembly_metaG.txt
--megahit_cpu=80 --megahit_min_contig=1000 --megahit_mem=0.95
On execution, it fails to recognize arguments from my config.yaml file, for example:
snakemake: error: unrecognized arguments: --inputDIR=[path\to\dir]
In my understanding, the Snakefile should be able to pick up any values stated in config.yaml using:
INPUTDIR = config["inputDIR"]
when:
configfile: "config.yaml"
is input in my Snakefile.
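For example, my understanding is that a rule should then be able to consume those values like this (a minimal sketch with a hypothetical rule and file names, just to illustrate the pattern I expect to work):
configfile: "config.yaml"
# Values from config.yaml become entries of the global `config` dict
INPUTDIR = config["inputDIR"]
OUTPUTDIR = config["outputDIR"]
# Hypothetical rule, only to show how the config values are used
rule copy_example:
    input:
        INPUTDIR + "/example.txt"
    output:
        OUTPUTDIR + "/example_copy.txt"
    shell:
        "cp {input} {output}"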
Also, non-custom arguments in my config.yaml are recognized properly, such as:
max-jobs-per-second: 1
Is there some custom library setup that I need to initiate for this particular config.yaml? This is my first time using Snakemake and I am still learning how to properly work with config files.
Also, when I swapped the paths directly into the Snakefile, I was able to get the summary output for my dry run without the unrecognized-arguments error.
The issue was the way the workflow was being executed with Slurm: I had been submitting snakemake itself with sbatch as a bash script.
Instead, snakemake should be executed directly from the terminal (or with bash). While I'm not exactly sure why, submitting it via sbatch caused my jobs to run on the cluster's local node, which has tight memory limits, rather than on the HPC partitions that have appropriate capacity. Executing it that way also caused snakemake not to recognize the paths set in my config.yaml.
The lesson learned is that snakemake has a built-in way of communicating with the Slurm manager, and it seems that executing snakemake through sbatch causes conflicts.
("Conda Env") [user#log001 "Main Directory"]$ snakemake \
> --jobs 100 --use-conda -p -s Snakefile \
> --cluster-config cluster.yaml --cluster "sbatch \
> --parsable --qos=unlim --partition={cluster.queue} \
> --job-name=TARA.{rule}.{wildcards} --mem={cluster.mem}gb \
> --time={cluster.time} --ntasks={cluster.threads} --nodes={cluster.nodes}"
I am having an issue trying to run this code https://github.com/google/e3d_lstm.
I followed the instructions to cd into the directory and then run:
python -u run.py \
--is_training True \
--dataset_name mnist \
--train_data_paths ~/data/moving-mnist-example/moving-mnist-train.npz \
--valid_data_paths ~/data/moving-mnist-example/moving-mnist-valid.npz \
--pretrained_model pretrain_model/moving_mnist_e3d_lstm/model.ckpt-80000 \
--save_dir checkpoints/_mnist_e3d_lstm \
--gen_frm_dir results/_mnist_e3d_lstm \
--model_name e3d_lstm \
--allow_gpu_growth True \
--img_channel 1 \
--img_width 64 \
--input_length 10 \
--total_length 20 \
--filter_size 5 \
--num_hidden 64,64,64,64 \
--patch_size 4 \
--layer_norm True \
--sampling_stop_iter 50000 \
--sampling_start_value 1.0 \
--sampling_delta_per_iter 0.00002 \
--lr 0.001 \
--batch_size 4 \
--max_iterations 1 \
--display_interval 1 \
--test_interval 1 \
--snapshot_interval 10000
but immediately get this error
Traceback (most recent call last):
File "run.py", line 22, in <module>
from src.data_provider import datasets_factory
ImportError: No module named data_provider
I am on Linux using Python 2.7 and am in the correct directory, so I don't understand why Python cannot import the package. I have also tried adding an __init__.py file inside the src folder, but I am still unable to import it.
Here is my Python module search path (sys.path):
['', '/home/kong/anaconda3/envs/tf/lib/python27.zip',
'/home/kong/anaconda3/envs/tf/lib/python2.7',
'/home/kong/anaconda3/envs/tf/lib/python2.7/plat-linux2',
'/home/kong/anaconda3/envs/tf/lib/python2.7/lib-tk',
'/home/kong/anaconda3/envs/tf/lib/python2.7/lib-old',
'/home/kong/anaconda3/envs/tf/lib/python2.7/lib-dynload',
'/home/kong/anaconda3/envs/tf/lib/python2.7/site-packages',
'/home/kong/anaconda3/envs/tf/lib/python2.7/site-packages']
However, shouldn't the directory I executed the script from be automatically added to the list of locations Python searches for modules?
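To double-check my assumptions, this is the diagnostic I would run from the repository root (a sketch only; it assumes data_provider is a sub-directory of src, as the import suggests, and under Python 2.7 both directories need an __init__.py to be importable as packages):
import os
import sys

# The directory of the executed script (shown as '' when it is the
# working directory) should appear on the module search path.
print(os.getcwd())
print('' in sys.path or os.getcwd() in sys.path)

# Python 2.7 only treats directories containing an __init__.py as packages.
for path in ('src/__init__.py', 'src/data_provider/__init__.py'):
    print('%s exists: %s' % (path, os.path.exists(path)))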
I want to process data with Flink's Python API on Windows, but when I use the following command to submit a job to the local cluster, it throws a NullPointerException.
bin/flink run -py D:\workspace\python-test\flink-test.py
flink-test.py:
from pyflink.dataset import ExecutionEnvironment
from pyflink.table import TableConfig, DataTypes, BatchTableEnvironment
from pyflink.table.descriptors import Schema, OldCsv, FileSystem
exec_env = ExecutionEnvironment.get_execution_environment()
exec_env.set_parallelism(1)
t_config = TableConfig()
t_env = BatchTableEnvironment.create(exec_env, t_config)
t_env.connect(FileSystem().path('D:\\workspace\\python-test\\data.txt')) \
.with_format(OldCsv()
.line_delimiter(' ')
.field('word', DataTypes.STRING())) \
.with_schema(Schema()
.field('word', DataTypes.STRING())) \
.register_table_source('mySource')
t_env.connect(FileSystem().path('D:\\workspace\\python-test\\result.txt')) \
.with_format(OldCsv()
.field_delimiter('\t')
.field('word', DataTypes.STRING())
.field('count', DataTypes.BIGINT())) \
.with_schema(Schema()
.field('word', DataTypes.STRING())
.field('count', DataTypes.BIGINT())) \
.register_table_sink('mySink')
t_env.scan('mySource') \
.group_by('word') \
.select('word, count(1)') \
.insert_into('mySink')
t_env.execute("tutorial_job")
Does anyone know why?
I have solved this problem. Guided by the error message, I read the source code.
The NullPointerException is caused by flinkOptPath being empty. I used flink.bat to submit the job, and flink.bat does not set the flinkOptPath, so I added some code to flink.bat to set it. flink.bat is incomplete for now; we should run Flink on Linux instead.
I am new to programming and am working from already-created scripts. I am trying to update my RRD database in Python. I have managed to create the code below, which does not come back with any errors, but when I try to generate a graph it does not contain any data.
#!/usr/bin/python
#modules
import sys
import os
import time
import rrdtool
import Adafruit_DHT as dht
#assign data
h,t = dht.read_retry(dht.DHT22, 22)
#display data
print 'Temp={0:0.1f}*C'.format(t, h)
print 'Humidity={1:0.1f}%'.format(t,h)
#update database
data = "N:h:t"
ret = rrdtool.update("%s/humidity.rrd" % (os.path.dirname(os.path.abspath(__file__))),data)
if ret:
print rrdtool.error()
time.sleep(300)
Below is my database specification:
#! /bin/bash
rrdtool create humidity.rrd \
--start "01/01/2015" \
--step 300 \
DS:th_dht22:GAUGE:1200:-40:100 \
DS:hm_dht22:GAUGE:1200:-40:100 \
RRA:AVERAGE:0.5:1:288 \
RRA:AVERAGE:0.5:6:336 \
RRA:AVERAGE:0.5:24:372 \
RRA:AVERAGE:0.5:144:732 \
RRA:MIN:0.5:1:288 \
RRA:MIN:0.5:6:336 \
RRA:MIN:0.5:24:372 \
RRA:MIN:0.5:144:732 \
RRA:MAX:0.5:1:288 \
RRA:MAX:0.5:6:336 \
RRA:MAX:0.5:24:372 \
RRA:MAX:0.5:144:732 \
rrdtool will silently ignore updates that are either too far apart or lie outside the predefined input range. I would add a logging feature to your code to see what you are trying to feed to rrdtool.
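For reference, a minimal sketch of what the update could look like once the actual readings are interpolated into the value string (this assumes the value order matches the DS order the RRD was created with, i.e. th_dht22 for temperature first, then hm_dht22 for humidity):
import os
import rrdtool
import Adafruit_DHT as dht

# read humidity and temperature from the DHT22 on GPIO pin 22
h, t = dht.read_retry(dht.DHT22, 22)

rrd = "%s/humidity.rrd" % os.path.dirname(os.path.abspath(__file__))

# "N" means "now"; the numeric readings (not the literal letters h and t)
# are formatted into the update string, temperature first, humidity second
rrdtool.update(rrd, "N:%.1f:%.1f" % (t, h))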
I am using the boto library to create a job flow in Amazon's Elastic MapReduce web service (EMR). The following code should create a step:
step2 = JarStep(name='Find similiar items',
jar='s3n://recommendertest/mahout-core/mahout-core-0.5-SNAPSHOT.jar',
main_class='org.apache.mahout.cf.taste.hadoop.similarity.item.ItemSimilarityJob',
step_args=['s3n://bucket/output/' + run_id + '/aggregate_watched/',
's3n://bucket/output/' + run_id + '/similiar_items/',
'SIMILARITY_PEARSON_CORRELATION'
])
When I run the job flow, it always fails throwing this error:
java.lang.NoClassDefFoundError: org/apache/hadoop/mapreduce/JobContext
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:247)
at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
at org.apache.hadoop.mapred.JobShell.run(JobShell.java:54)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
at org.apache.hadoop.mapred.JobShell.main(JobShell.java:68)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.mapreduce.JobContext
This is the line in the EMR logs invoking the java code:
2011-01-24T22:18:54.491Z INFO Executing /usr/lib/jvm/java-6-sun/bin/java \
-cp /home/hadoop/conf:/usr/lib/jvm/java-6-sun/lib/tools.jar:/home/hadoop:/home/hadoop \
/hadoop-0.18-core.jar:/home/hadoop/hadoop-0.18-tools.jar:/home/hadoop/lib/*:/home/hadoop/lib/jetty-ext/* \
-Xmx1000m \
-Dhadoop.log.dir=/mnt/var/log/hadoop/steps/3 \
-Dhadoop.log.file=syslog \
-Dhadoop.home.dir=/home/hadoop \
-Dhadoop.id.str=hadoop \
-Dhadoop.root.logger=INFO,DRFA \
-Djava.io.tmpdir=/mnt/var/lib/hadoop/steps/3/tmp \
-Djava.library.path=/home/hadoop/lib/native/Linux-i386-32 \
org.apache.hadoop.mapred.JobShell \
/mnt/var/lib/hadoop/steps/3/mahout-core-0.5-SNAPSHOT.jar \
org.apache.mahout.cf.taste.hadoop.similarity.item.ItemSimilarityJob \
s3n://..../output/job_2011-01-24_23:09:29/aggregate_watched/ \
s3n://..../output/job_2011-01-24_23:09:29/similiar_items/ \
SIMILARITY_PEARSON_CORRELATION
What is wrong with the parameters? The Java class definition can be found here:
https://hudson.apache.org/hudson/job/Mahout-Quality/javadoc/org/apache/mahout/cf/taste/hadoop/similarity/item/ItemSimilarityJob.html
I found the solution to the problem:
You need to specify Hadoop version 0.20 in the job flow parameters.
You need to run the JAR step with mahout-core-0.5-SNAPSHOT-job.jar, not with mahout-core-0.5-SNAPSHOT.jar.
If you have an additional streaming step in your job flow, you need to fix a bug in boto:
Open boto/emr/step.py
Change line 138 to "return '/home/hadoop/contrib/streaming/hadoop-streaming.jar'"
Save and reinstall boto
This is how run_jobflow should be invoked to run with Mahout:
jobid = emr_conn.run_jobflow(name = name,
log_uri = 's3n://'+ main_bucket_name +'/emr-logging/',
enable_debugging=1,
hadoop_version='0.20',
steps=[step1,step2])
The fix to boto described above (i.e. using the non-versioned hadoop-streaming.jar file) has been incorporated into the GitHub master in this commit:
https://github.com/boto/boto/commit/a4e8e065473b5ff9af554ceb91391f286ac5cac7
For reference, here is how to do this from boto:
import boto.emr.connection as botocon
import boto.emr.step as step
con = botocon.EmrConnection(aws_access_key_id='', aws_secret_access_key='')
step = step.JarStep(
    name='Find similar items',
    jar='s3://mahout-core-0.6-job.jar',
    main_class='org.apache.mahout.cf.taste.hadoop.similarity.item.ItemSimilarityJob',
    action_on_failure='CANCEL_AND_WAIT',
    step_args=['--input', 's3://', '--output', 's3://',
               '--similarityClassname', 'SIMILARITY_PEARSON_CORRELATION'])
con.add_jobflow_steps('jflow', [step])
Obviously, you need to upload mahout-core-0.6-job.jar to an accessible S3 location, and the input and output locations have to be accessible as well.
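To verify that the step actually ran, something along these lines should work with the legacy boto EMR API (a sketch under the assumption that describe_jobflow and its state attribute behave as in older boto releases; con and the 'jflow' placeholder job flow id are taken from the snippet above):
import time

# poll the job flow until it reaches a terminal state
while True:
    flow = con.describe_jobflow('jflow')
    print(flow.state)
    if flow.state in ('COMPLETED', 'FAILED', 'TERMINATED'):
        break
    time.sleep(30)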