Capture output from pexpect - python

I am having trouble with pexpect. I'm trying to grab output from tralics which reads in latex equations and emits the MathML representation, like this:
1 ~/ % tralics --interactivemath
This is tralics 2.14.5, a LaTeX to XML translator, running on tlocal
Copyright INRIA/MIAOU/APICS/MARELLE 2002-2012, Jos\'e Grimm
Licensed under the CeCILL Free Software Licensing Agreement
Starting translation of file texput.tex.
No configuration file.
> $x+y=z$
<formula type='inline'><math xmlns='http://www.w3.org/1998/Math/MathML'><mrow><mi>x</mi> <mo>+</mo><mi>y</mi><mo>=</mo><mi>z</mi></mrow></math></formula>
>
So I try to get the formula using pexpect:
import pexpect
c = pexpect.spawn('tralics --interactivemath')
c.expect('>')
c.sendline('$x+y=z$')
s = c.read_nonblocking(size=2000)
print s
The output has the formula, but with the original input at the beginning and some control chars at the end:
"x+y=z$\r\n<formula type='inline'><math xmlns='http://www.w3.org/1998/Math/MathML'><mrow><mi>x</mi><mo>+</mo><mi>y</mi><mo>=</mo><mi>z</mi></mrow></math></formula>\r\n\r> \x1b[K"
I can clean the output string, but I must be missing something basic. Is there a cleaner way to get the MathML?

From what I understand you are trying to get this from pexpect:
<formula type='inline'><math xmlns='http://www.w3.org/1998/Math/MathML'><mrow><mi>x</mi> <mo>+</mo><mi>y</mi><mo>=</mo><mi>z</mi></mrow></math></formula>
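You can use a regexp instead of ">" for the matching in order to get the expected result. This is the simplest example:
c.expect("<formula.*formula>")
After that, you can access the matched text through the match attribute of the spawn object, which holds a regular-expression match object:
print c.match.group(0)
You might also try different regexps, because the one I posted is greedy: if several formulas are already buffered it matches from the first opening tag to the last closing tag, and it might hurt your execution time if the formulas are big. A non-greedy pattern such as "<formula.*?</formula>" avoids this.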
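As a rough sketch of the non-greedy idea, the snippet below applies such a pattern to the exact string shown in the question, using only the standard re module so it runs without tralics installed. The names captured, FORMULA and extract_formula are made up for this illustration; the same pattern string could be handed to c.expect instead.

```python
import re

# The raw text read from the child process, copied from the question.
captured = (
    "x+y=z$\r\n"
    "<formula type='inline'><math xmlns='http://www.w3.org/1998/Math/MathML'>"
    "<mrow><mi>x</mi><mo>+</mo><mi>y</mi><mo>=</mo><mi>z</mi></mrow></math></formula>"
    "\r\n\r> \x1b[K"
)

# Non-greedy: stop at the first closing tag instead of the last one.
FORMULA = re.compile(r"<formula.*?</formula>", re.DOTALL)


def extract_formula(text):
    """Return the first <formula>...</formula> element, or None if absent."""
    match = FORMULA.search(text)
    return match.group(0) if match else None


mathml = extract_formula(captured)
print(mathml)
```

The echoed input at the start and the control characters at the end are simply left outside the match, so no separate clean-up step is needed.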

Related

Line terminator adds dot at the end of a line in npm test of Python code

I wanted to learn command line programming using Python.
I saw a to-do challenge on the internet and started to work on it by learning from the web. The challenge is to create a command line interface of a to-do app.
The challenge is titled CoronaSafe Engineering Fellowship Test Problem. Here is the challenge material on Google Drive: https://drive.google.com/drive/folders/1SyLcxnEBNRecIyFAuL5kZqSg8Dw4xnTG?usp=sharing
and there is a GitHub project at https://github.com/nseadlc-2020/package-todo-cli-task/
In the README.md I was instructed to create symbolic link for the batch file todo.bat with the name todo. Now, my first condition is that, when the symbolic link is called from the command prompt without any arguments, it must print some usage tips for the program. Finally, I have to use the npm test command to test the execution.
At the very beginning I got this trouble, whenever I use a print statement, I see a dot • at the end of every string which ends with a new line. For instance,
import sys
import random
args = sys.argv[1:]
if len(args) == 0:
    print('Usage :-', end='\n')
    print('$ ./todo help # Show usage', end='')
The above statements when executed without arguments gives the output,
Usage :-.
$ ./todo help # Show usage
Here, I noticed that because the first print statement ends with a newline, the string ends with what looks like a middle dot (•). For the second print statement, since I override the end parameter with an empty string, no newline character is output, and so the dot is not printed.
What's wrong, and how can I pass the test? My program does not print a middle dot at all.
The problem seems to be squarely inside the todo.test.js file.
In brief, Windows and Unix-like platforms have different line ending conventions (printing a line in Windows adds two control characters at the end, whilst on Unix-like systems only one is printed) and it looks like the test suite is only prepared to cope with results from Unix-like systems.
Try forcing your Python to only print Unix line feeds, or switch to a free Unix-like system for running the tests.
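To make the difference concrete, here is a small sketch that captures the bytes print() produces under each newline convention. It uses an in-memory stream rather than the real console, and the helper name printed_bytes is invented for this example.

```python
import io


def printed_bytes(text, newline):
    """Return the raw bytes print() emits for a given newline setting."""
    raw = io.BytesIO()
    stream = io.TextIOWrapper(raw, encoding="utf-8", newline=newline)
    print(text, file=stream)
    stream.flush()
    return raw.getvalue()


unix_bytes = printed_bytes("Usage :-", "\n")       # one control character
windows_bytes = printed_bytes("Usage :-", "\r\n")  # two control characters

print(unix_bytes)
print(windows_bytes)

# In a real script on Python 3.7+, the console stream itself can be switched
# with: sys.stdout.reconfigure(newline="\n")
```

The extra "\r" in the second result is the character the test suite ends up displaying as a dot.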
Alternatively, rename todo.test.js and replace it with a copy with DOS line endings. In many Windows text editors, you should be able to simply open the file as a Unix text file, then "Save As..." and select Windows text file (maybe select "ANSI" if it offers that, though the term is horribly wrong and they should know better); see e.g. Windows command to convert Unix line endings? for many alternative solutions (many of which vividly illustrate some of the other issues with Windows; proceed with caution).
This seems to be a known issue, as noted in the README.md you shared: https://github.com/nseadlc-2020/package-todo-cli-task/issues/12 (though it imprecisely labels this as "newline UTF encoding issues"; the problem has nothing to do with UTF-8 or UTF-16).
See also the proposed duplicate Line endings (also known as Newlines) in JS strings
I had exactly the same problem.
I replaced:
print(variable_name) # Or print("Your text here")
With:
sys.stdout.buffer.write(variable_name.encode('utf-8')) # Or sys.stdout.buffer.write("Your text here".encode('utf-8'))
Now it works fine on Windows.
First write your help string like this
help_string='Usage :-\n$ ./task add 2 hello world # Add a new item with priority 2 and text "hello world" to the list\n$ ./task ls # Show incomplete priority list items sorted by priority in ascending order\n$ ./task del INDEX # Delete the incomplete item with the given index\n$ ./task done INDEX # Mark the incomplete item with the given index as complete\n$ ./task help # Show usage\n$ ./task report # Statistics'
Then print it on the console using
sys.stdout.buffer.write(help_string.encode('utf8'))
This problem occurs because Windows and the npm tests expect different line endings: print() on Windows turns "\n" into "\r\n", while writing encoded bytes to sys.stdout.buffer bypasses that translation. Also make sure to avoid any spaces before or after "\n".
Why have multiple prints when a single Python print can include the newline itself? Follow the example below:
print("Usage :- \n$ ./todo help #Show usage")
Output:
Usage :-
$ ./todo help #Show usage

How do I parse the output of getmac command?

After doing some research I found out that the best way to get the Ethernet MAC address under windows is the "getmac" command. (getmac module of python does not produce the same!!). Now I want to use this command from within a python code to get the MAC address. I figured out that my code should start something like this:
import os
import sys
if sys.platform == 'win32':
    os.system("getmac")
    # do something here to get the first MAC address that appears in the results
Here is an example output:
Physical Address    Transport Name
=================== ==========================================================
1C-69-7A-3A-E3-40   Media disconnected
54-8D-5A-CE-21-1A   \Device\Tcpip_{82B01094-C274-418F-AB0A-BC4F3660D6B4}
I finally want to get 1C-69-7A-3A-E3-40 preferably without the dashes.
Thanks in advance.
Two things. First of all, I recommend you find ways of getting the mac address more elegantly. This question's answer seems to use the uuid module, which is perhaps a good cross-platform solution.
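For reference, a minimal sketch of the uuid route looks like this. The function name mac_without_dashes is invented here. Note that uuid.getnode() is not guaranteed to pick the same adapter that getmac lists first, and it falls back to a random 48-bit number if no hardware address can be found.

```python
import uuid


def mac_without_dashes():
    """Format the 48-bit node value as 12 uppercase hex digits."""
    node = uuid.getnode()
    return "{:012X}".format(node)


print(mac_without_dashes())
```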
Having said that, if you want to proceed with parsing the output of a system call, I recommend the use of Python's subprocess module. For example:
import subprocess
output_of_command = subprocess.check_output("getmac")
This will run getmac and the output of that command will go into a variable. On Python 3 that value is a bytes object, so decode it (or pass text=True on Python 3.7+) before parsing it as a string.
Here's how you might extract the mac address from that string:
# I'm setting this directly to provide a clear example of the parsing, separate
# from the first part of this answer.
my_string = r"""Physical Address    Transport Name
=================== ==========================================================
1C-69-7A-3A-E3-40   Media disconnected
54-8D-5A-CE-21-1A   \Device\Tcpip_{82B01094-C274-418F-AB0A-BC4F3660D6B4}"""
my_mac_address = my_string.rsplit('=', 1)[-1].split(None, 1)[0]
The first split is a right split. It's breaking up the string by the '=' character, once, starting from the end of the string. Then, I'm splitting the output of that by whitespace, limiting to one split, and taking the first string value.
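An alternative sketch, if you would rather not depend on the position of the '=' separator: search for the first thing that looks like a MAC address and strip the dashes, as the question asked. The names SAMPLE_OUTPUT, MAC_PATTERN and first_mac are made up for this example, and the sample text is the output from the question rather than a live getmac call.

```python
import re

# Sample getmac output copied from the question (raw string keeps backslashes).
SAMPLE_OUTPUT = r"""Physical Address    Transport Name
=================== ==========================================================
1C-69-7A-3A-E3-40   Media disconnected
54-8D-5A-CE-21-1A   \Device\Tcpip_{82B01094-C274-418F-AB0A-BC4F3660D6B4}"""

# Six groups of two hex digits separated by dashes.
MAC_PATTERN = re.compile(r"\b(?:[0-9A-Fa-f]{2}-){5}[0-9A-Fa-f]{2}\b")


def first_mac(getmac_output):
    """Return the first MAC address in the text without dashes, or None."""
    match = MAC_PATTERN.search(getmac_output)
    if match is None:
        return None
    return match.group(0).replace("-", "")


print(first_mac(SAMPLE_OUTPUT))  # 1C697A3AE340
```

This still parses human-readable output, so the caveat below applies to it just as much.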
Again, however, I would discourage this approach to getting a mac address. Parsing the human-readable output of command line scripts is seldom advisable because the output can unexpectedly be different than what your script is expecting. You can assuredly get the mac address in a more robust way.

Save sentence as server filename

I'm saving the recording of a set of sentences to a corresponding set of audio files.
Sentences include:
Ich weiß es nicht!
¡No lo sé!
Ég veit ekki!
How would you recommend I convert each sentence to a human-readable filename that will later be served from an online server? I'm not sure right now which languages I might be dealing with in the future.
UPDATE:
Please note that two sentences can't clash with each other. For example:
É bär icke dej.
E bår icke dej.
can't resolve to the same filename as these will overwrite each other. This is the problem with the slugify function mentioned here: Turn a string into a valid filename?
The best I have come up with is to use urllib.parse.quote. However, I think the resulting output is harder to read than I would have hoped. Any suggestions?
Ich%20wei%C3%9F%20es%20nicht%21
%C2%A1No%20lo%20s%C3%A9%21
%C3%89g%20veit%20ekki%21
What about unidecode?
import unidecode
a = [u'Ich weiß es nicht!', u'¡No lo sé!', u'Ég veit ekki!']
for s in a:
    print(unidecode.unidecode(s).replace(' ', '_'))
This gives pure ASCII strings that can readily be processed if they still contain unwanted characters. Keeping spaces distinct in the form of underscores helps with readability.
Ich_weiss_es_nicht!
!No_lo_se!
Eg_veit_ekki!
If uniqueness is a problem, a hash or something like that might be added to the strings.
Edit:
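Some clarification seems to be required with respect to the hashing. Many hash functions are explicitly designed to give very different outputs for close inputs. For example, the built-in hash function of Python gives the following (note that Python 3 randomizes str hashes per process unless PYTHONHASHSEED is set, so these values are not stable between runs):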
In [1]: hash('¡No lo sé!')
Out[1]: 6428242682022633791
In [2]: hash('¡No lo se!')
Out[2]: 4215591310983444451
With that you can do something like
unidecode.unidecode(s).replace(' ', '_') + '_' + str(hash(s))[:10]
in order to get not too long strings. Even with such shortened hashes, clashes are pretty unlikely.
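Because the built-in hash() is not stable across interpreter runs on Python 3, a variant of the same idea can use hashlib instead. The sketch below is only an illustration: readable_filename is an invented name, and unicodedata from the standard library stands in for unidecode so the snippet runs without extra packages. It is cruder than unidecode, since characters without an ASCII decomposition (such as ß) are simply dropped.

```python
import hashlib
import unicodedata


def readable_filename(sentence, digest_chars=10):
    """Build an ASCII slug plus a short, stable digest of the original text."""
    # Stand-in for unidecode: split accents off and drop what is not ASCII.
    decomposed = unicodedata.normalize("NFKD", sentence)
    ascii_text = decomposed.encode("ascii", "ignore").decode("ascii")
    slug = "_".join(ascii_text.split())
    # The digest is computed from the ORIGINAL sentence, so two sentences
    # that collapse to the same slug still get different names.
    digest = hashlib.sha256(sentence.encode("utf-8")).hexdigest()[:digest_chars]
    return slug + "_" + digest


first = readable_filename("É bär icke dej.")
second = readable_filename("E bår icke dej.")
print(first)
print(second)
```

Both names start with the same readable slug, and the digest suffix keeps them from overwriting each other.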
You should probably convert spaces into another symbol, making your string look like É-bär-icke-dej.
If you're using Python, I would do it like this.
Replace spaces with another symbol like (-) or (_):
mystring = mystring.replace(' ', '-')
Detect your character encoding using chardet, a Python package that detects encodings.
Decode your byte string using Python's
mystring = mystring.decode(*the detected encoding*)
Check whether the file name is already in your directory using Python's os module. Something like
files = os.listdir(*path to directory*)
# count how many times the file name has been repeated
redundance = 0
for name in files:
    if mystring in name:
        redundance += 1
Append redundance to your string:
if redundance != 0:
    mystring = mystring + str(redundance)
Use your string as a file name!
Hope this helps!
The only disallowed characters in traditional Unix / Linux file names are slash (/ U+002F) and the null character (U+0000). There is no need to convert your example human-readable strings to anything else.
If you need to make the files available to systems which do not use the same file name encoding, such as for downloading over FTP or from a web server, perhaps you want to expose them as explicitly UTF-8. On most modern U*xes, this should be the default out of the box anyway. This would correspond to the results you get from urllib quoting, where the percent-encoding is a safe and reasonably standard way of producing a machine-readable and unambiguous representation of the encoding. If you embed these in a snippet of HTML or something, you can keep the display text human-readable, and just keep the link machine-readable.
<a href="%C3%89g%20veit%20ekki%21">Ég veit ekki!</a>
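A small sketch of that last idea: percent-encode the sentence for the link target and keep the readable sentence as the link text. The helper name audio_link and the ".mp3" extension are assumptions made for this example.

```python
import html
from urllib.parse import quote


def audio_link(sentence, extension=".mp3"):
    """Return an HTML link with a percent-encoded href and readable text."""
    href = quote(sentence) + extension    # machine-readable part
    label = html.escape(sentence)         # human-readable part
    return '<a href="{}">{}</a>'.format(href, label)


print(audio_link("Ég veit ekki!"))
```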

Is it possible to access the source code of a python script passed to python on standard in?

This is a bit of a random question that is more out of curiosity than any specific need.
Is it possible to write some python code that will print some stuff out, including the source code itself, without having the python code stored in a file? For example, doing something like this at the Bash prompt:
$ echo '
> print "The Code:"
> PrintScript() # What would this function look like?
> for i in range(5):
>     print i,
> print "!"
> ' | python
and get an output like this:
The Code:
print "The Code:"
PrintScript() # What would this function look like?
for i in range(5):
    print i,
print "!"
0 1 2 3 4 !
I suspect that this probably can't be done, but given python's introspection capabilities, I was curious to know whether it extended to this level.
That's the closest I'm getting:
echo 'import __main__,inspect;print inspect.getsource(__main__)' | python
which fails... In any case, the original code is eaten up (read from stdin) by the interpreter at startup. At most you may be able to get to the compiled code, again through the __main__ module.
Update:
The dis module is supposed to give you a disassembly of all functions in a module, but even that one isn't seeing any code:
$ echo -e 'import __main__,dis;print dis.dis(__main__)' | python
None
And even when I throw in a function:
$ echo -e "import __main__,dis;print dis.dis(__main__)\ndef x():\n pass" | python
None
Yes, it is indeed possible to write a program which outputs its own source. You don't even need introspection for this task; you just need to be able to print computed strings (this works in every language).
Such a program is called a quine, and here is a rather short example in Python:
quine = 'quine = %r\r\nprint quine %% quine'
print quine % quine
But quines aren't limited to such simple programs. They can do much more, for example printing their own source backwards and so on... :)
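The example above is Python 2. As a hedged sketch, the same trick in Python 3 syntax is shown below, held in a string so that it can be executed and compared against its own output. The names QUINE_SOURCE and run_and_capture exist only for this check and are not part of the quine.

```python
import contextlib
import io

# The two-line Python 3 quine, stored as text (note the escaped backslash).
QUINE_SOURCE = (
    "quine = 'quine = %r\\nprint(quine %% quine)'\n"
    "print(quine % quine)\n"
)


def run_and_capture(source):
    """Execute source code and return everything it printed."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(source, {})
    return buffer.getvalue()


output = run_and_capture(QUINE_SOURCE)
print(output == QUINE_SOURCE)  # True: the program printed its own source
```

Since the program carries its source inside itself, it works the same whether it is read from a file or piped in on standard input.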
print open(__file__).read(),
This will work on UNIX systems I think, but I'm not sure about Windows. The trailing comma makes sure that the source code is printed exactly, without an extra trailing newline.
Just realized (based on the comments below) that this does not work if your source code comes from sys.stdin, which is exactly what you were asking for. In that case, you might take advantage of some of the ideas outlined on this page about quines (programs printing their own source codes) in Python, but none of the solutions would be a single function that just works. A language-independent discussion is here.
So, in short, no, I don't think this is possible with a single function if your source code comes from the standard input. There might be a possibility to access the interpreted form of your program as a Python code object and translate that back into source form, but the translated form will almost surely not match the original file contents exactly. (For instance, the comments and the shebang line would definitely be stripped away).
The closest you can get is using readline to interrogate the command history, if available. From what I can see, though, I suspect this may not contain input piped into the session and would only work for interactive sessions anyway.

Reverse a word in Vim

How can I reverse a word in Vim? Preferably with a regex or normal-mode commands, but other methods are welcome too:
word => drow
Thanks for your help!
PS: I'm on Windows XP.
Python support is built into my Vim, but Perl is not.
Here is another (pythonic) solution based on how this works:
:echo join(reverse(split('hello', '.\zs')), '')
olleh
If you want to replace all words in the buffer,
:%s/\(\<.\{-}\>\)/\=join(reverse(split(submatch(1), '.\zs')), '')/g
This works by first creating a list of characters in the word, which is reversed and joined back to form the word. The substitute command finds each word and then passes the word to the expressions and uses the result as replacement.
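The same split, reverse, join idea can be sketched in plain Python for comparison. The function names are invented here, and \w+ is only an approximation of what Vim treats as a word (Vim uses the 'iskeyword' option).

```python
import re


def reverse_word(word):
    """Split into characters, reverse the list, join it back."""
    chars = list(word)
    chars.reverse()
    return "".join(chars)


def reverse_all_words(text):
    """Replace every word in the text with its reversed form."""
    return re.sub(r"\w+", lambda m: reverse_word(m.group(0)), text)


print(reverse_word("hello"))              # olleh
print(reverse_all_words("word by word"))  # drow yb drow
```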
This Tip might help: http://vim.wikia.com/wiki/Reverse_letters
It says:
Simply enable visual mode (v), highlight the characters you want inverted, and hit \is. For a single word you can use vw (or viw): viw\is
vnoremap <silent> <Leader>is :<C-U>let old_reg_a=@a<CR>
\:let old_reg=@"<CR>
\gv"ay
\:let @a=substitute(@a, '.\(.*\)\@=',
\ '\=@a[strlen(submatch(1))]', 'g')<CR>
\gvc<C-R>a<Esc>
\:let @a=old_reg_a<CR>
\:let @"=old_reg<CR>
There are more solutions in the comments.
Assuming you've got perl support built in to vim, you can do this:
command! ReverseWord call ReverseWord()
function! ReverseWord()
perl << EOF
$curword = VIM::Eval('expand("<cword>")');
$reversed = reverse($curword);
VIM::Msg("$curword => $reversed");
VIM::DoCommand("norm lbcw$reversed");
EOF
endfun
And potentially bind that to a keystroke like so:
nmap ,r :ReverseWord<CR>
I don't have Python supported on my VIM, but it looks like it would be pretty simple to do it with Python. This article seems like a good explanation of how to use Python in VIM and I'm guessing you'd do something like this:
:python print('word'[::-1])
The article indicates that the result will appear in the status bar, which would be non-optimal if you were trying to replace the string in a document, but if you just want to check that your girlfriend is properly reversing strings in her head, this should be fine.
If you have rev installed (e.g. via MSys or Cygwin) then it's really not this difficult.
Select what you want to reverse and filter (%! <cmd>) it:
:%! rev
This pipes the text through your shell command. Note that % is the whole-buffer range and rev reverses every line; to filter only the lines of a visual selection, use :'<,'>!rev instead.
If your version of Vim supports it, you can do vw\is or viw\is (put your cursor on the first letter of the word before typing the command), but I have had a lot of compatibility issues with that. I'm not sure what has to be compiled in or turned on, but this only works sometimes.
EDIT:
\is is:
:<C-U>let old_reg_a=@a<CR>
\ :let old_reg=@"<CR>
\ gv"ay :let @a=substitute(@a, '.\(.*\)\@=', '\=@a[strlen(submatch(1))]', 'g')<CR>
\ gvc<C-R>a<Esc> :let @a=old_reg_a<CR>
\ :let @"=old_reg<CR>
I didn't remember where it came from, but a Google search turned up this article on vim.wikia.com, which shows the same thing, so I guess that's it.
Well you could use python itself to reverse the line through the filter command. Say the text you had written was:
Python
You could reverse it by issuing:
:1 ! python -c "print raw_input()[::-1]"
And your text will be replaced to become:
nohtyP
The "1" in the command tells vi to send line 1 to the python statement which we are executing: "print raw_input()[::-1]". So if you wanted some other line reversed, you would send that line number as argument. The python statement then reverses the line of input.
There is a tricky way to do this if you have Vim compiled with +rightleft. You set 'allowrevins', which lets you hit Ctrl+_ in insert mode to start Reverse Insert mode. It was originally made for inserting bidirectional scripts.
Type your desired word in Insert mode, or move your cursor to the end of an already typed word. Hit Ctrl+_ and then pick a completion (i_Ctrl-x) method which is the most likely not to return any results for your word. Using Ctrl+e to cancel in-place completion does not seem to work in this case.
I.e. for an unsyntactic text file you can hit in insert mode Ctrl+x Ctrl+d, which is guaranteed to fail to find any macro/function names in the current file (see :h i_CTRL-X_CTRL-D and :h complete for more information).
And voila! Completion lookup in reverse mode makes the looked up word reverse. Notice that the cursor will move to the beginning of that word (it's reversed direction of writing, remember?)
You should then hit Ctrl+_ again to get back to regular insert mode and keyboard layout and go on with editing.
Tip: You can set 'complete' exclusively (for the buffer, at least) to a completion option that is guaranteed to return no result. Just go over the options in :h 'complete'. This will make the easy i_Ctrl-N / i_Ctrl-P bindings available for a handy word reversal session. You can of course further automate this with a macro, a function or a binding.
Note: Setting/resetting 'paste' and 'compatible' can set/reset 'allowrevins'. See :h allowrevins.
If you have some time on your hands, you can bubble your way there by iteratively transposing characters (xp)...
I realize I'm a little late to the game, but I thought I'd just add what I think is the simplest method.
It's two things:
Vim's expression register
pyeval (py3eval on recent vim releases) function
So to reverse a word you would do the following:
"ayiw yank word into register a
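<C-r>=py3eval('"".join(reversed("' . @a . '"))') in insert mode, use Vim's = (expression) register to call the py3eval function, which evaluates Python code and returns the result, which is then fed via the expression register into our document. (The yanked word is wrapped in double quotes so that Python sees a string; this assumes the word itself contains no double quotes or backslashes.)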
For more info on the expression register see https://www.brianstorti.com/vim-registers/#the-expression-and-the-search-registers
You can use revins mode to do it:
At the beginning type :set revins. From now on every letter you type will be inserted in reverse order, until you type :set norevins to turn it off. I.e., while revins is set, typing word will output drow.
To change an existing word once revins is set and the cursor is at the beginning of the word, type:
dwi<C-r>"<ESC>
Explanation:
dw deletes a word.
i enters insert mode.
<C-r>" pastes the last deleted or yanked text in insert mode; <ESC> exits insert mode.
Remember to :set norevins at the end!
