executing long running hive queries from remote machine

I have to execute long-running (~10 hours) Hive queries from my local server using a Python script. My target Hive server is in an AWS cluster.
I've tried to execute them using
pyhs2, execute('<command>')
and
paramiko, exec_command('hive -e "<command>"')
In both cases the query runs on the Hive server and completes successfully. The issue is that even after the query has completed, my parent Python script keeps waiting for a return value and stays in the interruptible sleep (Sl) state indefinitely!
Is there any way I can make my script work using pyhs2 or paramiko? Or is there a better option available in Python?

I faced a similar issue in my performance environment.
My use case was running queries through the pyhs2 module with the Hive Tez execution engine. Tez generates a lot of logs (basically on a per-second scale). The logs are captured in the stdout variable and are only handed to the output once the query completes successfully.
The way to overcome this is to stream the output as and when it is generated, as shown below:
for line in iter(lambda: stdout.readline(2048), ""):
    print(line)
But for this you will have to use a native connection to the cluster using paramiko or Fabric, and then issue the hive command via the CLI or beeline.
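A minimal sketch of that approach, assuming key-based SSH access to a cluster node where the hive CLI is on the PATH. The function names, the host, user and key file parameters, and the `2>&1` redirection (so that log output written to stderr arrives on the same stream) are illustrative assumptions, not part of the original setup:

```python
import shlex


def build_hive_command(query):
    """Return a shell command that runs `query` with the hive CLI.

    `2>&1` merges stderr into stdout so a single reader sees all log output.
    """
    return "hive -e " + shlex.quote(query) + " 2>&1"


def stream_lines(stream, limit=2048):
    """Yield lines from a file-like object as soon as they are available."""
    # iter(callable, sentinel) keeps calling readline until it returns "" (EOF).
    for line in iter(lambda: stream.readline(limit), ""):
        yield line


def run_remote_hive(host, user, key_file, query):
    """Run a hive query over SSH and print its output while it runs."""
    import paramiko  # third-party; imported here so the helpers above work without it

    client = paramiko.SSHClient()
    client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    client.connect(host, username=user, key_filename=key_file)
    try:
        _stdin, stdout, _stderr = client.exec_command(build_hive_command(query))
        for line in stream_lines(stdout):
            print(line, end="")
        # Blocks until the remote command exits, then returns its exit code.
        return stdout.channel.recv_exit_status()
    finally:
        client.close()
```

Reading the stream continuously keeps the output from piling up until the end of the query, and the exit status tells the calling script whether the query succeeded.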

Related

Provide input to background running process using PID on server machine

I need to run a batch script on my Windows server machine. I am running it with a WMIC command from my local system, like this: wmic /node:ipaddress /user:\username /password:password process call create "E:\Hello\start.bat". This gives me the process ID and status.
That batch file starts a service on the server machine, and I can access the service from my local system. Once the operations on my local system are done, I have to press Q in that batch file's console to generate a report. But I have to send the Q key press to the server from my local machine, using the PID.
Is there a way to do that? I have to execute these two commands from a Python script to automate the process.
Thanks in advance
I tried providing input to the WMIC process using subprocess.PIPE.stdin("Q"), but it did not work.
The two commands are: 1) wmic /node:ipaddress /user:\username /password:password process call create "E:\Hello\start.bat", which gives me the process ID and status; 2) a command to send the key "Q" to the process with the ID returned by the first command. Another solution would also be fine for me, but the problem is that the server runs Windows Server 2016, and communicating with that server from Python code is a little difficult.

Python Script gets ORA-03150 but same procedure runs fine from sqldeveloper

My Python Script fails while running a stored procedure that accesses a view that uses a DB link to access a remote table.
That stored procedure works fine when run from Oracle SQL Developer but not when run from the python script.
When run from SQL Developer the stored procedure takes about a minute to run but fails in the python script after about 16 minutes.
It throws this error:
ORA-03150: end-of-file on communication channel for database link
Why would it fail from python and not sql developer? Why is it slower in python? It is logging in with the same userid in both.

How to run a single process multiple times in parallel using a UNIX shell script?

I am working on an MPP database and would like to run a single SQL query in multiple sessions from Python or a UNIX shell script. Can somebody share thoughts on spawning a SQL query from Python or a UNIX utility? Any input would be appreciated. Thank you.
Code:
# In bash, {1..$n} is not expanded when n is a variable, so use seq instead.
for i in $(seq 1 "$n")
do
( sh run_sql.sh test.sql touchstone_test & )
done

For Python you could install the MySQLdb module. MySQLdb is an interface for connecting to a MySQL database server from Python. It implements the Python Database API v2.0 and is built on top of the MySQL C API.

Depending on scale, you can choose either option.
If you have small tasks to accomplish, running a shell script is fine. Note that you can also pipe the query to the mysql CLI client, e.g.
mysql_cmd="mysql -h<host> -u<user> -p<pwd> <db>"
echo "SELECT name, id FROM myobjects WHERE ...." | $mysql_cmd
For a larger-scale project, I would go with Python and the MySQLdb interface that was already mentioned.
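If the goal is only to launch the same command in several sessions at once, the standard library may be enough. The following is a rough sketch, not a tuned solution: `run_parallel` is a hypothetical helper, and the command in the example is a stand-in for something like `sh run_sql.sh test.sql touchstone_test` from the question.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_parallel(command, sessions):
    """Run the same command `sessions` times concurrently; return the exit codes."""
    def run_once(_index):
        # Each call starts its own child process, i.e. its own client session.
        return subprocess.run(command, stdout=subprocess.DEVNULL).returncode

    with ThreadPoolExecutor(max_workers=sessions) as pool:
        return list(pool.map(run_once, range(sessions)))


if __name__ == "__main__":
    # Stand-in command; swap in e.g. ["sh", "run_sql.sh", "test.sql", "touchstone_test"].
    print(run_parallel([sys.executable, "-c", "print('session done')"], 4))
```

Threads are sufficient here because each worker only waits on a child process; the returned exit codes show which sessions failed.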

Insert into DB using remote server

I am using a search API to fetch some results, which are then inserted into the database. The only issue is that it takes too much time. I want my script to keep running and inserting records into the remote database even when I close my laptop. Would running the script on the server like this help:
python myscript.py &

Execute MySQL script from Arduino Yun (without Python)

I want to retrieve data from a MySQL database that is running directly on the Arduino Yun itself.
I want to do it without Python, directly using MySQL commands and Process. Is that possible?
Every post I found on the internet is about how to retrieve data using Python. But I don't want to use Python, because the routine of connecting to the database and so on slows down my queries.
I'm retrieving data from the database multiple times, and from different tables. So I moved the data retrieval logic into a MySQL function, and now want to just call this function using the Process class. The problem is that it works from the mysql console and works using Python, but doesn't work directly. When I say directly, I mean this:
Process process;
process.begin("python");
process.addParameter("/mnt/sda1/arduino/python/read.py");
process.addParameter(msg);
process.run();
This code works perfectly and uses Python. Inside the read.py file I have the database call routine. So I want to do the same thing, but without Python:
Process process;
process.begin("mysql");
process.addParameter("-u root -parduino --database arduino -e \"select process('my_text')\"");
process.run();
As you might expect, this code sample doesn't work. But the same MySQL statement works perfectly, and much faster than with Python, if I run it from the console.

Try Process.runShellCommand. It will accept your complete statement as an argument:
Process process;
process.runShellCommand("mysql -u root -parduino --database arduino -e \"select process('my_text')\"");

You can use the subprocess module.
https://docs.python.org/2/library/subprocess.html
With it you can send your mysql commands to the shell directly from a Python script.
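As a rough illustration, assuming Python 3.7+ and the mysql client on the PATH, the call could look like the sketch below. The helper names are made up for this example, and the credentials in the usage note are the ones from the question. Each option is passed as its own list item, so no shell quoting is needed.

```python
import subprocess


def mysql_args(user, password, database, statement):
    """Build the mysql client argument list, one list item per argument."""
    return [
        "mysql",
        "--user=" + user,
        "--password=" + password,
        "--database=" + database,
        "--execute=" + statement,
    ]


def run_statement(user, password, database, statement):
    """Run one statement with the mysql client and return its text output."""
    result = subprocess.run(
        mysql_args(user, password, database, statement),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if mysql exits with an error
    )
    return result.stdout
```

With the values from the question, this would be called as run_statement("root", "arduino", "arduino", "select process('my_text')").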
