CPP-like feature for Python text processing - python

I have a Python script that emits text files which are inputs to a third-party black-box application; it constructs the files according to input data and command-line options. Depending on the options, a couple of these big text files differ in only one or two lines (I'll create an example below). Sadly, there is no good way to implement this on the black-box side of things: it only accepts straight directives, and if statements inside its inner loops make it very slow.
What I would like is a Python module that contains the text with some cpp-like directives to choose the right lines in the text file it outputs. E.g., a file like blackboxText.py:
__txt="""
...
bunch of code
...
#if OPTION1
<command sequence one>
#else
<command sequence two>
#endif
...
bunch of code
...
"""
def get(options):
    return cpp(__txt, options)
What I do not want is to actually have to run cpp; I need to stick to Python as the only executable at this stage.
I don't need the whole suite of 'cpp' commands like '#include', but it's not bad if it's complete. And it doesn't have to be cpp, it could be anything approximately similar in functionality.
Is there a Python module that can parse cpp-like directives like this in a text block?
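For illustration, the behaviour described above could be sketched in plain Python roughly as follows. This is only a minimal sketch, not an existing module: it assumes options is a dict mapping option names to truthy or falsy values, it handles only #if, #ifdef, #ifndef, #else and #endif with a single name (no expressions, no #include), and the name cpp is simply taken from the example in the question.

```python
import re

# Matches only the directives this sketch supports; any other line
# (including other lines starting with '#') is treated as ordinary text.
_DIRECTIVE = re.compile(r"^\s*#\s*(if|ifdef|ifndef|else|endif)\b\s*(\w*)\s*$")


def cpp(text, options):
    """Return text with lines kept or dropped according to the directives.

    options is assumed to be a dict; names missing from it count as false.
    """
    out = []
    stack = []      # one (parent_active, condition) pair per open #if
    active = True   # whether lines are currently being emitted
    for line in text.splitlines(keepends=True):
        match = _DIRECTIVE.match(line)
        if match is None:
            if active:
                out.append(line)
            continue
        keyword, name = match.groups()
        if keyword in ("if", "ifdef", "ifndef"):
            cond = bool(options.get(name))
            if keyword == "ifndef":
                cond = not cond
            stack.append((active, cond))
            active = active and cond
        elif keyword == "else":
            if not stack:
                raise ValueError("#else without matching #if")
            parent_active, cond = stack[-1]
            active = parent_active and not cond
        else:  # endif
            if not stack:
                raise ValueError("#endif without matching #if")
            parent_active, _ = stack.pop()
            active = parent_active
    if stack:
        raise ValueError("missing #endif")
    return "".join(out)
```

With the text from the question, cpp(__txt, {"OPTION1": True}) would keep command sequence one and drop command sequence two; an empty dict would do the opposite.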

Related

Refactor only selection black [duplicate]

We are not ready to automatically format the whole source code with black.
But from time to time I would like to execute black -S on a region via PyCharm.
There is a hint in the docs how to run black (or black -S (what I like)) on the whole file. But ...
How to run black only on a selected region?
Using Python Black on a code region in the PyCharm IDE can be done by implementing it as an external tool. Currently Black has two main options to choose the code to format:
1. Run Black on the whole module, specifying it on the CLI as the [SRC]...
2. Pass the code region as a string on the CLI using the -c, --code TEXT option.
The following implementation shows how to do this using the 2nd option. The reason is that applying Black to the whole module is likely to change the number of lines thus making the job of selecting the code region by choosing start and end line numbers more complicated.
Implementing the 1st option can be done, but it would require mapping the initial code region to the final code region after Black formats the entire module.
Let's take as an example the following code, which has a number of obvious PEP 8 violations (missing whitespace and empty lines):
"""
long multi-line
comment
"""
def foo(token:int=None)->None:
    a=token+1
class bar:
    foo:int=None
def the_simple_test():
    """the_simple_test"""
    pass
Step 1.
Using Black as an external tool in the IDE can be configured by going to File > Settings > Tools > External Tools and clicking the Add or Edit icons.
What is of interest is passing the right macros (see point 3 "Parameter with macros") from the PyCharm IDE to the custom script that calls Black and does the necessary processing. Namely, you'll need the macros:
FilePath - File Path
SelectionStartLine - Selected text start line number
SelectionEndLine - Selected text end line number
PyInterpreterDirectory - The directory containing the Python interpreter selected for the project
Any additional Black CLI options you want to pass as arguments are best placed at the end of the parameter list.
Since you may have Black installed on a specific venv, the example also uses the PyInterpreterDirectory macro.
The screenshot illustrates the above:
Step 2.
You'll need to implement a script to call Black and interface with the IDE. The following is a working example. It should be noted:
Four lines are OS/shell specific as commented (it should be trivial to adapt them to your environment).
Some details could be further tweaked; for the purposes of the example, the implementation makes simplistic choices.
import os
import pathlib
import subprocess
import sys
import tempfile


def region_to_str(file_path: pathlib.Path, start_line: int, end_line: int) -> str:
    """Return lines start_line..end_line (1-based, inclusive) of file_path."""
    str_build = []
    with open(file_path, encoding="utf-8") as file:
        for line_number, line in enumerate(file, start=1):
            if line_number > end_line:
                break
            if line_number >= start_line:
                str_build.append(line)
    return "".join(str_build)


def black_to_clipboard(py_interpreter, black_cli_options, code_region_str):
    py_interpreter_path = pathlib.Path(py_interpreter) / "python.exe"  # OS specific, .exe for Windows.
    proc = subprocess.Popen(
        [str(py_interpreter_path), "-m", "black", *black_cli_options, "-c", code_region_str],
        stdout=subprocess.PIPE,
    )
    try:
        outs, errs = proc.communicate(timeout=15)
    except subprocess.TimeoutExpired:
        proc.kill()
        outs, errs = proc.communicate()
    # Black writes bytes to stdout; decode them as UTF-8, the default Python source encoding.
    result = outs.decode("utf-8").replace("\r", "")  # OS specific, remove \r from \r\n Windows new-line.
    tmp_file = tempfile.gettempdir() + "\\__run_black_tmp.txt"  # OS specific, escaped path separator.
    with open(tmp_file, mode="w+", encoding="utf-8", errors="strict") as out_file:
        out_file.write(result + "\n")
    command = "clip < " + str(tmp_file)  # OS specific, send result to clipboard for copy-paste.
    os.system(command)


def main(argv: list[str] = sys.argv[1:]) -> int:
    """External tool script to run black on a code region.

    Args:
        argv[0] (str): Path to module containing code region.
        argv[1] (str): Code region start line.
        argv[2] (str): Code region end line.
        argv[3] (str): Path to venv /Scripts directory.
        argv[4:] (str): Black CLI options.
    """
    # print(argv)
    lines_as_str = region_to_str(argv[0], int(argv[1]), int(argv[2]))
    black_to_clipboard(argv[3], argv[4:], lines_as_str)
    return 0


if __name__ == "__main__":
    sys.exit(main(sys.argv[1:]))
Step 3.
The hard part is done. Let's use the new functionality.
Normally select the lines you want as your code region in the editor. This has to be emphasized because the previous SelectionStartLine and SelectionEndLine macros need the selection to work. (See the next screenshot).
Step 4.
Run the external tool previously implemented. This can be done by right clicking in the editor and choosing External Tools > the_name_of_your_external_tool.
Step 5.
Simply paste (the screenshot shows the result after running the external tool and pressing Ctrl + v). The implementation in Step 2 copies Black's output to your OS's clipboard. This seemed like the preferable solution because you change the file inside the editor, so Undo (Ctrl + z) will also work. Changing the file by overwriting it programmatically outside the editor would be less seamless and might require refreshing it inside the editor.
Step 6.
You can record a macro of the previous steps and associate it with a keyboard shortcut to have the above functionality in one keystroke (similar to copy-paste Ctrl + c + Ctrl + v).
End Notes.
If you need to debug the functionality in Step 2, a Run Configuration can also be set up using the same macros as the external tool configuration.
It's important to note that character encodings can change across the layers when using the clipboard. I decided to use clip and have it read directly from a temporary file; this was to avoid passing the code string to Black on the command line, because the Windows CMD encoding is not UTF-8 by default. (For Linux users this should be simpler but can depend on your system settings.)
One important note is that you can choose a code region without the broader context of its indentation level. For example, if you select only 2 methods inside a class, they will be passed to Black and formatted with the indentation level of module-level functions. This shouldn't be a problem if you are careful to select code regions with their proper scope. It could also easily be solved by passing the additional macro SelectionStartColumn - Selected text start column number from Step 1 and prepending that number of spaces to each line in the Step 2 script. (Ideally such functionality would be implemented by Black as a CLI option.) In any case, if needed, using Tab to put the region in its proper indentation level is easy enough.
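The indentation workaround mentioned above might look roughly like the sketch below. The helper names dedent_region and reindent_region are hypothetical, and indent_columns stands for the number of spaces derived from the SelectionStartColumn macro, assuming the selection starts at the region's own indentation. The region would be dedented before being handed to Black, and the formatted result indented again before it goes to the clipboard.

```python
import textwrap


def dedent_region(code_region_str):
    """Strip the common leading whitespace so the region parses as module-level code."""
    return textwrap.dedent(code_region_str)


def reindent_region(formatted_str, indent_columns):
    """Prefix every non-blank line with indent_columns spaces (hypothetical helper)."""
    return textwrap.indent(formatted_str, " " * indent_columns)
```

textwrap.indent leaves whitespace-only lines untouched by default, so blank lines inside the region do not pick up trailing spaces.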
The main topic of the question is how to integrate Black with the PyCharm IDE for a code region, so demonstrating the 2nd option should be enough to address the problem, because the 1st option would, for the most part, only add implementation-specific complexity. (The answer is long enough as it is. The specifics of implementing the 1st option would make a good Feature/Pull Request for the Black project.)
I have looked into this because it actually looks interesting, and I've come to the conclusion that you can maybe use:
black -S and_your_file_path
or:
black -c and_a_string
to format the code passed in as a string.
I will also follow this thread because it looks interesting.
And I'm also going to do more research on this and if I find something I will let you know.

Getting all pods for a container, storing them in text files and then using those files as args in single command

The picture above shows the list of all kubernetes pods I need to save to a text file (or multiple text files).
I need a command which:
1. Stores the logs of multiple pods in text files (or in a single text file). So far I have this command, which stores one pod's logs in one text file, but this is not enough since I would have to spell out each pod name individually:
$ kubectl logs ipt-prodcat-db-kp-kkng2 -n ho-it-sst4-i-ie-enf > latest.txt
2. Sends these files to a Python script, which checks them for various strings. So far this works, but it would be extremely useful if it could be combined with the command above:
python CheckLogs.py latest.txt latest2.txt
Is it possible to do either (1) or both (1) and (2) in a single command?
The simplest solution is to create a shell script that does exactly what you are looking for:
#!/bin/sh
FILE="text1.txt"
for p in $(kubectl get pods -o jsonpath="{.items[*].metadata.name}"); do
    kubectl logs "$p" >> "$FILE"
done
With this script you will get the logs of all the pods in your namespace in the file named by FILE.
You can even add python CheckLogs.py "$FILE" as the last line of the script.
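If you would rather stay in Python for the whole job, the same loop could be sketched as below. This is only a sketch: the function names and the out_dir parameter are made up for illustration, it writes one file per pod rather than a single file, and it assumes kubectl is on the PATH and already configured for the cluster.

```python
import os
import subprocess


def run_kubectl(args):
    """Run kubectl with the given arguments and return its stdout as text."""
    return subprocess.run(
        ["kubectl", *args], check=True, capture_output=True, text=True
    ).stdout


def save_pod_logs(namespace, out_dir=".", run=run_kubectl):
    """Write each pod's logs to <out_dir>/<pod>.txt and return the file paths."""
    names = run(
        ["get", "pods", "-n", namespace, "-o", "jsonpath={.items[*].metadata.name}"]
    ).split()
    files = []
    for pod in names:
        path = os.path.join(out_dir, pod + ".txt")
        with open(path, "w", encoding="utf-8") as out:
            out.write(run(["logs", pod, "-n", namespace]))
        files.append(path)
    return files
```

The list returned by save_pod_logs("ho-it-sst4-i-ie-enf") can then be passed as the arguments to CheckLogs.py, which covers both parts of the question.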
There are various tools that could help here. Some of these are commonly available, and some of these are shortcuts that I create my own scripts for.
xargs: This is used to run multiple command lines in various combinations, based on the input. For instance, if you piped text output containing three lines, you could potentially execute three commands using the content of those three lines. There are many possible variations.
arg1: This is a shortcut that I wrote that simply takes stdin and produces the first argument. The simplest form of this would just be "awk '{print $1}'", but I designed mine to take optional parameters, for instance, to override the argument number, separator, and to take a filename instead. I often use "-i{}" to specify a substitution marker for the value.
skipfirstline: Another shortcut I wrote, that simply takes some multiline text input and omits the first line. It is just "sed -n '1!p'".
head/tail: These print some of the first or last lines of stdin. Interesting forms of this take negative numbers. Read the man page and experiment.
sed: Often a part of my pipelines, for making inline replacements of text.

Securely sending information between two python scripts

Brief summary:
I have two files: foo1.pyw and foo2.py
I need to send large amounts of sensitive information to foo2.py from foo1.pyw, and then back again.
Currently, I am doing this by writing to a .txt file and then opening it with foo2.py using: os.system('foo2.py [text file here] [other arguments passing information]'). The problem here is that the .txt file leaves a trace even after it is removed. I need to send information to foo2.py and back without having to write to a temp file.
The information will be formatted text, containing only ASCII characters, including letters, digits, symbols, returns, tabs, and spaces.
I can give more detail if needed.
You could use encryption like AES with Python: http://eli.thegreenplace.net/2010/06/25/aes-encryption-of-files-in-python-with-pycrypto or use a transport layer: https://docs.python.org/2/library/ssl.html.
If what you're worrying about is the traces left on the HD, and real time interception is not the issue, why not just shred the temp file afterwards?
Alternatively, for a lot more work, you can setup a ramdisk and hold the file in memory.
The right way to do this is probably with a sub-process and a pipe, accessible via subprocess.Popen. You can then pipe information directly between the scripts.
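A minimal sketch of the pipe approach, using subprocess.run (a convenience wrapper around Popen) so that the text goes to the child over stdin and comes back over stdout without touching the disk. The child command here is only a stand-in for foo2.py, and send_through_pipe is a hypothetical helper name; in practice the command would be something like [sys.executable, "foo2.py"] with foo2.py reading sys.stdin.

```python
import subprocess
import sys


def send_through_pipe(command, secret_text):
    """Send secret_text to the child's stdin and return what it writes to stdout."""
    completed = subprocess.run(
        command,
        input=secret_text,
        capture_output=True,
        text=True,
        check=True,
    )
    return completed.stdout


# Stand-in for foo2.py (assumption): reads stdin, writes an upper-cased copy to stdout.
CHILD = [sys.executable, "-c", "import sys; sys.stdout.write(sys.stdin.read().upper())"]

reply = send_through_pipe(CHILD, "user=alice\tkey=abc123\n")
```

Note that this only avoids the temp file; the data still exists in the memory of both processes.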
I think the simplest solution would be to just call the function within foo2.py from foo1.py:
# foo1.py
import foo2
result = foo2.do_something_with_secret("hi")

# foo2.py
def do_something_with_secret(s):
    print(s)
    return 'yeah'
Obviously, this wouldn't work if you wanted to replace foo2.py with an arbitrary executable.
This may be a little tricky if the two are in different directories, run under different versions of Python, etc.

passing variables to bowtie from python

I want to pass an input fasta file stored in a variable say inp_a from python to bowtie and write the output into another out_a. I want to use
os.system ('bowtie [options] inp_a out_a')
Can you help me out?
Your question asks for two things, as far as I can tell: writing data to disk, and calling an external program from within Python. Without more detailed requirements, here's what I would write:
import subprocess

# Bytes literal, because the file is opened in binary mode below.
data_for_bowtie = b"some genome data, lol"
with open("input.fasta", "wb") as input_file:
    input_file.write(data_for_bowtie)
subprocess.call(["bowtie", "input.fasta", "output.something"])
There are some fine details here which I have assumed. I'm assuming that you mean bowtie, the read aligner. I'm assuming that your file is a binary, non-human-readable one (which is why there's that b in the second argument to open) and I'm making baseless assumptions about how to call bowtie on the command line because I'm not motivated enough to spend the time learning it.
Hopefully, that provides a starting point. Good luck!
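On the "passing variables" part specifically: os.system('bowtie [options] inp_a out_a') sends the literal names inp_a and out_a to the shell, not the values of the Python variables. A sketch of one way to pass the values is shown below. The argument layout simply mirrors the one written in the question and has not been checked against bowtie's actual command line, and the helper names are hypothetical.

```python
import subprocess


def bowtie_command(inp_a, out_a, options=()):
    """Build the argument list for: bowtie [options] inp_a out_a (layout assumed from the question)."""
    return ["bowtie", *options, str(inp_a), str(out_a)]


def run_bowtie(inp_a, out_a, options=()):
    """Run bowtie and raise CalledProcessError if it exits with a non-zero status."""
    subprocess.run(bowtie_command(inp_a, out_a, options), check=True)
```

Passing a list instead of a single command string also avoids shell quoting problems when the file names contain spaces.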

Multiple lines user input in command-line Python application

Is there any easy way to handle multiple lines user input in command-line Python application?
I was looking for an answer without any result, because I don't want to:
read data from a file (I know, it's the easiest way);
create any GUI (let's stay with just a command line, OK?);
load text line by line (it should be pasted at once, not typed and not pasted line by line);
work with each of lines separately (I'd like to have whole text as a string).
What I would like to achieve is to let the user paste a whole text (containing multiple lines) and capture the input as one string in an entirely command-line tool. Is it possible in Python?
It would be great, if the solution worked both in Linux and Windows environments (I've heard that e.g. some solutions may cause problems due to the way cmd.exe works).
import sys
text = sys.stdin.read()
After pasting, you have to tell python that there is no more input by sending an end-of-file control character (ctrl+D in Linux, ctrl+Z followed by enter in Windows).
This method also works with pipes. If the above script is called paste.py, you can do
$ echo "hello" | python paste.py
and text will be equal to "hello\n". It's the same in windows:
C:\Python27>dir | python paste.py
The above command will save the output of dir to the text variable. There is no need to manually type an end-of-file character when the input is provided using pipes -- python will be notified automatically when the program creating the input has completed.
You could read the text straight from the clipboard, which spares the user the extra paste step that raw_input() requires for multiline text:
import Tkinter
root = Tkinter.Tk()
root.withdraw()
text = root.clipboard_get()
root.destroy()
See also How do I copy a string to the clipboard on Windows using Python?
Use:
input = raw_input("Enter text")
This returns all of the input as a single string, so if you paste a whole text, all of it ends up in the input variable.
EDIT: Apparently, this works only with Python Shell on Windows.
