Streaming objects from Python to PowerShell?

After some searching and checking previous answers such as Passing objects from python to powershell, it appears that the best way to send objects from a Python script to a PowerShell script or command is as JSON.
However, with something like this (dir_json.py):
from json import dumps
from pathlib import Path

for fn in Path('.').glob('**/*'):
    print(dumps({'name': str(fn)}))
You can do this:
python .\dir_json.py | ConvertFrom-JSON
And the result is OK, but the problem I'm hoping to solve is that ConvertFrom-JSON seems to wait until the script has completed before reading any of the JSON, even though each individual JSON object is complete on its own line. This can easily be verified by adding a line like time.sleep(1) after the print.
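For instance, a delayed variant along the lines of the dir_json_slow.py used below might look like this (a minimal sketch; flush=True rules out Python's own output buffering as the culprit):
from json import dumps
from pathlib import Path
from time import sleep

for fn in Path('.').glob('**/*'):
    print(dumps({'name': str(fn)}), flush=True)  # one complete JSON object per line
    sleep(1)                                     # makes any downstream buffering visible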
Is there a better way to send objects from Python to PowerShell than using JSON objects? And is there a way to actually stream them as they are written, instead of passing the entire output of the Python script after the script completes?
I ran into jq, which was recommended by "people on the internet" as a solution to my type of problem, with the claim that ConvertFrom-JSON doesn't allow streaming but jq does. However, this did nothing to improve my situation:
python .\dir_json_slow.py | jq -cn --stream 'fromstream(1|truncate_stream(inputs))' | ConvertFrom-JSON
To make jq play nice, I did change the script to write a list of objects instead of separate objects:
from sys import stdout
from time import sleep
from json import dumps
from pathlib import Path

first = True
stdout.write('[\n')
for fn in Path('.').glob('**/*'):
    if first:
        stdout.write(dumps({'name': str(fn)}))
        first = False
    else:
        stdout.write(',\n' + dumps({'name': str(fn)}))
    stdout.flush()
    sleep(.1)
stdout.write('\n]')
(Note that the problem isn't ConvertFrom-JSON holding things up at the end; jq itself only starts writing output once the Python script completes.)
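(jq does have an --unbuffered flag, which flushes its output after each JSON object is printed; it may be worth trying in cases like this.)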

As long as each line[1] that your Python script outputs is a complete JSON object by itself, you can use a ForEach-Object call to process each output line as it is received by PowerShell and call ConvertFrom-Json on each:
python .\dir_json.py | ForEach-Object { ConvertFrom-JSON $_ }
A simplified example that demonstrates that streaming occurs, pausing between lines processed (waiting for a keypress):
# Prompts for a keystroke after each line emitted by the Python command.
python -c 'from json import dumps; print(dumps({''name'': ''foo''})); print(dumps({''name'': ''bar''}))' |
ForEach-Object { ConvertFrom-Json $_ | Out-Host; pause }
Note: The Out-Host call is only used to work around a display bug in PowerShell, still present as of PowerShell 7.2: Out-Host forces synchronous printing of the implicit table-formatting that is applied - see this answer.
ConvertFrom-Json - atypically for PowerShell cmdlets - collects all input up front before emitting the object(s) that the JSON input has been parsed into, which can be demonstrated as follows:
# Prompts for a keystroke first, and only after *both*
# strings have been emitted does ConvertFrom-Json produce output.
& { '{ "name": "foo" }'; pause; '{ "name": "bar" }' } |
ConvertFrom-Json | Out-Host
[1] PowerShell invariably relays output from external programs such as Python line by line. By contrast, a PowerShell-native command is free to emit any object to the pipeline, including multiline strings.

Related

dbus-monitor output reading

I am trying to trigger a Python script or shell script whenever a desktop notification arrives, using dbus-monitor.
I am using the command this way:
dbus-monitor "interface='org.freedesktop.Notifications'" | grep --line-buffered "string" | xargs -I '{}' python3 ./test.py {}
After that, I send a desktop notification from another terminal using:
notify-send "hello" "world"
The output for the above custom notification is:
string "notify-send"
string ""
string "hello"
string "world "
string "urgency"
string "notify-send"
string ""
string "hello"
string "world "
string "urgency"
But if the output of this command is 10 lines, then the Python script gets called for every line.
My expectation is to call the Python script once per notification, with all of the output passed in a single line as a parameter to the script.
It is wise to take advantage of systemd's integration with D-Bus.
Using the systemd integration, the programmer has better control over, and visibility into, the D-Bus integration, and can also take advantage of systemd's logging and monitoring mechanisms.
There is a good article here about systemd and D-Bus with Python.
There is also a closely related answer to your question in this answer.
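That said, to address the grouping problem directly: below is a minimal sketch of a middleman script (group_notify.py is a hypothetical name), assuming dbus-monitor's output is piped to its stdin and that each notification is introduced by a header line containing member=Notify, which matches typical dbus-monitor formatting. Note its limitation: a notification is only handed off when the next one's header arrives (or at end of input).
# group_notify.py - hypothetical helper; usage:
#   dbus-monitor "interface='org.freedesktop.Notifications'" | python3 group_notify.py
import sys

def handle(strings):
    # placeholder: run your real per-notification processing here
    print('notification:', strings, flush=True)

current = None
for raw in sys.stdin:
    line = raw.strip()
    if 'member=Notify' in line:
        # a new header line: the previous notification, if any, is complete
        if current:
            handle(current)
        current = []
    elif current is not None and line.startswith('string '):
        current.append(line[len('string '):].strip('"'))
if current:
    handle(current)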

Get the PID from a process at the start

I have a question about how to get the PID of a process at the start, not when the process has finished.
This is because I want to be able to kill the FFmpeg process (not the Python script) if necessary, so it doesn't help to only learn its PID at the end.
FYI: This script is getting the PIDs from FFmpeg processes.
Below you will see how I coded this script, which is working fine, but I get the PID at the end, as I mentioned before.
Any idea about how to do it?
import json, base64, sys, subprocess

# argv[1]: base64-encoded JSON list holding the FFmpeg command line
# argv[2]: base64-encoded JSON string holding the log file path
thisList = json.loads(base64.b64decode(sys.argv[1]))
logFileName = json.loads(base64.b64decode(sys.argv[2]))

p = subprocess.Popen(thisList, stderr=open(logFileName, 'w'))
print(p.pid)
As you can see, I am decoding an encoded base64 string (the FFmpeg command line) to protect it, because it comes from a URL.
Also, I need to write the FFmpeg output to a file, so I am capturing stderr to write it externally. I encoded its path in base64 too.
Finally, note that there will be many concurrent FFmpeg processes working at the same time, so something like searching for the PID by the FFmpeg name is too generic.
It would be possible to launch a second script that searches for the entire FFmpeg command line and gets its PID, which works well on Unix (I already have a PHP script doing that), but not on Windows. I would like to be compatible with both OSes.
That PHP script working in Unix is ...
function getPidByCommand($myString)
{
    $pid = 0;
    $myString = str_replace('"', '', $myString);
    exec("ps aux | grep \"${myString}\" | grep -v grep | awk '{ print $2 }' | head -1", $out);
    if (isset($out[0]))
    {
        $pid = intval($out[0]);
    }
    return $pid;
}
Thank you very much in advance.
Mapg
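For illustration, a minimal cross-platform sketch of the idea described above: subprocess.Popen returns as soon as the child is spawned, so the PID can be recorded immediately and used later to kill that specific FFmpeg instance (the pid-file naming scheme here is hypothetical).
import json, base64, sys, subprocess

cmd = json.loads(base64.b64decode(sys.argv[1]))
log_path = json.loads(base64.b64decode(sys.argv[2]))

p = subprocess.Popen(cmd, stderr=open(log_path, 'w'))

# The PID is known here, long before FFmpeg finishes; persist it so
# another process can find and terminate this specific instance.
with open('ffmpeg_%d.pid' % p.pid, 'w') as f:  # hypothetical naming
    f.write(str(p.pid))
print(p.pid, flush=True)
A later os.kill(pid, signal.SIGTERM) then works on both Unix and Windows (on Windows, Python implements signals other than the CTRL events via TerminateProcess).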

Python not printing output

I am learning to use Electron JS with Python, and I am using python-shell. I have the following simple Python script:
import sys, json
# simple JSON echo script
for line in sys.stdin:
    print(json.dumps(json.loads(line)))
and in my main.js:
let {PythonShell} = require('python-shell')

let pyshell = new PythonShell('/home/bassel/electron_app/pyapp/name.py', {mode: 'json'});
pyshell.send({name: "mark"})
pyshell.on('message', function (message) {
    // received a message sent from the Python script (a simple "print" statement)
    console.log("hi");
});
But the hi is not getting printed. What is wrong?
This problem can also occur when trying to suppress the newline from the end of print output. See Why doesn't print output show up immediately in the terminal when there is no newline at the end?.
Output is often buffered in order to preserve system resources. This means that in this case, the system holds back the Python output until there's enough to release together.
To overcome this, you can explicitly "flush" the output:
import sys, json
# simple JSON echo script
for line in sys.stdin:
    print(json.dumps(json.loads(line)))
    sys.stdout.flush() # <--- added line to flush output
If you're using Python 3.3 or higher, you may alternatively use:
import sys, json
# simple JSON echo script
for line in sys.stdin:
    print(json.dumps(json.loads(line)), flush=True) # <--- added keyword
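Another option worth knowing about is running the interpreter in unbuffered mode (python -u, or setting the PYTHONUNBUFFERED=1 environment variable), which avoids having to flush after every print.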

Read a Python variable in a shell script?

My Python file has these two variables:
week_date = "01/03/16-01/09/16"
cust_id = "12345"
How can I read these into a shell script that takes in these two variables?
My current shell script requires manual editing of "dt" and "id". I want to read the Python variables into the shell script so I can just edit my Python parameter file and not so many files.
shell file:
#!/bin/sh
dt="01/03/16-01/09/16"
cust_id="12345"
In a new Python file I could just import the parameter file.
Consider something akin to the following:
#!/bin/bash
# ^^^^ NOT /bin/sh, which doesn't have process substitution available.
# note: the embedded script below uses Python 2 syntax (print statement, "exec ... in")
python_script='
import sys
d = {}                                      # create a context for variables
exec(open(sys.argv[1], "r").read()) in d    # execute the Python code in that context
for k in sys.argv[2:]:
    print "%s\0" % str(d[k]).split("\0")[0] # ...and extract your strings NUL-delimited
'
read_python_vars() {
  local python_file=$1; shift
  local varname
  for varname; do
    IFS= read -r -d '' "${varname#*:}"
  done < <(python -c "$python_script" "$python_file" "${@%%:*}")
}
You might then use this as:
read_python_vars config.py week_date:dt cust_id:id
echo "Customer id is $id; date range is $dt"
...or, if you didn't want to rename the variables as they were read, simply:
read_python_vars config.py week_date cust_id
echo "Customer id is $cust_id; date range is $week_date"
Advantages:
Unlike a naive regex-based solution (which would have trouble with some of the details of Python parsing -- try teaching sed to handle both raw and regular strings, and both single and triple quotes without making it into a hairball!) or a similar approach that used newline-delimited output from the Python subprocess, this will correctly handle any object for which str() gives a representation with no NUL characters that your shell script can use.
Running content through the Python interpreter also means you can determine values programmatically -- for instance, you could have some Python code that asks your version control system for the last-change-date of relevant content.
Think about scenarios such as this one:
start_date = '01/03/16'
end_date = '01/09/16'
week_date = '%s-%s' % (start_date, end_date)
...using a Python interpreter to parse Python means you aren't restricting how people can update/modify your Python config file in the future.
Now, let's talk caveats:
If your Python code has side effects, those side effects will obviously take effect (just as they would if you chose to import the file as a module in Python). Don't use this to extract configuration from a file whose contents you don't trust.
Python strings are Pascal-style: they can contain literal NULs. Strings in shell languages are C-style: they're terminated by the first NUL character. Thus, some variables that can exist in Python cannot be represented in shell without nonliteral escaping. To prevent an object whose str() representation contains NULs from spilling forward into other assignments, this code terminates strings at their first NUL.
Now, let's talk about implementation details.
${@%%:*} is an expansion of $@ which trims all content after and including the first : in each argument, thus passing only the Python variable names to the interpreter. Similarly, ${varname#*:} is an expansion which trims everything up to and including the first : from the variable name passed to read. See the bash-hackers page on parameter expansion.
Using <(python ...) is process substitution syntax: The <(...) expression evaluates to a filename which, when read, will provide output of that command. Using < <(...) redirects output from that file, and thus that command (the first < is a redirection, whereas the second is part of the <( token that starts a process substitution). Using this form to get output into a while read loop avoids the bug mentioned in BashFAQ #24 ("I set variables in a loop that's in a pipeline. Why do they disappear after the loop terminates? Or, why can't I pipe data to read?").
The IFS= read -r -d '' construct has a series of components, each of which makes the behavior of read more true to the original content:
Clearing IFS for the duration of the command prevents whitespace from being trimmed from the end of the variable's content.
Using -r prevents literal backslashes from being consumed by read itself rather than represented in the output.
Using -d '' sets the first character of the empty string '' to be the record delimiter. Since C strings are NUL-terminated and the shell uses C strings, that character is a NUL. This ensures that variables' content can contain any non-NUL value, including literal newlines.
See BashFAQ #001 ("How can I read a file (data stream, variable) line-by-line (and/or field-by-field)?") for more on the process of reading record-oriented data from a string in bash.
Other answers give a way to do exactly what you ask for, but I think the idea is a bit crazy. There's a simpler way to satisfy both scripts - move those variables into a config file. You can even preserve the simple assignment format.
Create the config itself: (ini-style)
dt="01/03/16-01/09/16"
cust_id="12345"
In python:
config_vars = {}
with open('the/file/path', 'r') as f:
    for line in f:
        if '=' in line:
            k, v = line.split('=', 1)
            # strip the newline and the surrounding quotes from the value
            config_vars[k.strip()] = v.strip().strip('"')
week_date = config_vars['dt']
cust_id = config_vars['cust_id']
In bash:
source "the/file/path"
And you don't need to do crazy source parsing anymore. Alternatively, you can just use JSON for the config file, and then use the json module in Python and jq in the shell for parsing.
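For example, with a hypothetical config.json containing {"dt": "01/03/16-01/09/16", "cust_id": "12345"}, the Python side of that sketch reduces to:
import json

# read the shared configuration (config.json is a hypothetical path)
with open('config.json') as f:
    config = json.load(f)

week_date = config['dt']
cust_id = config['cust_id']
and the shell side can use, for example, dt=$(jq -r '.dt' config.json).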
I would do something like this. You may want to modify it a little bit for minor changes to include/exclude quotes, as I haven't really tested it for your scenario:
#!/bin/sh
exec <$python_filename
while read line
do
    match=`echo $line|grep "week_date ="`
    if [ $? -eq 0 ]; then
        dt=`echo $line|cut -d '"' -f 2`
    fi
    match=`echo $line|grep "cust_id ="`
    if [ $? -eq 0 ]; then
        cust_id=`echo $line|cut -d '"' -f 2`
    fi
done

use two pipelines for python input file argument and stdin streaming

Is there a one-liner approach to running the following Python script in Linux bash, without saving any temporary file (except /dev/std*)?
My Python script test.py takes a filename as an argument, but also reads sys.stdin as a streaming input.
#!/usr/bin/python
# test.py
import sys

fn = sys.argv[1]
checkofflist = []
with open(fn, 'r') as f:
    for line in f.readlines():
        checkofflist.append(line)
for line in sys.stdin:
    if line in checkofflist:
        pass  # do something to line
I would like to do something like
hadoop fs -cat inputfile.txt > /dev/stdout | cat streamingfile.txt | python test.py /dev/stdin
But of course this doesn't work, since the middle cat corrupts the intended /dev/stdin content. Being able to do this would be nice, since then I wouldn't need to save HDFS files locally every time I need to work with them.
I think what you're looking for is:
python test.py <( hadoop fs -cat inputfile.txt ) <streamingfile.txt
In bash, <( ... ) is Process Substitution. The command inside the parentheses is run with its output connected to a fifo or equivalent, and the name of the fifo (or /dev/fd/n if bash is able to use an unnamed pipe) is substituted as an argument. The tool sees a filename, which it can just open and use normally. (>(...) is also available, with input connected to a fifo, in case you want a named streaming output.)
Without relying on bash process substitution, you might also try
hadoop fs -cat inputfile.txt | python test.py streamingfile.txt
This provides streamingfile.txt as a command-line argument for test.py to use as a file name to open, as well as providing the contents of inputfile.txt on standard input.
