Vim obtain string between visual selection range with Python - python

Here is some text
here is line two of text
I visually select from is to is in Vim: (brackets represent the visual selection [ ])
Here [is some text
here is] line two of text
Using Python, I can obtain the range tuples of the selection:
function! GetRange()
python << EOF
import vim
buf = vim.current.buffer # the buffer
start = buf.mark('<') # start selection tuple: (1,5)
end = buf.mark('>') # end selection tuple: (2,7)
EOF
endfunction
I source this file: :so %, select the text visually, run :<,'>call GetRange() and
now that I have (1,5) and (2,7). In Python, how can I compile the string that is the following:
is some text\nhere is
Would be nice to:
Obtain this string for future manipulation
then replace this selected range with the updated/manipulated string

Try this:
fun! GetRange()
python << EOF
import vim
buf = vim.current.buffer
(lnum1, col1) = buf.mark('<')
(lnum2, col2) = buf.mark('>')
lines = vim.eval('getline({}, {})'.format(lnum1, lnum2))
lines[0] = lines[0][col1:]
lines[-1] = lines[-1][:col2]
print "\n".join(lines)
EOF
endfun
You can use vim.eval to get python values of vim functions and variables.

This would probably work if you used pure vimscript
function! GetRange()
let #" = substitute(#", '\n', '\\n', 'g')
endfunction
vnoremap ,r y:call GetRange()<CR>gvp
This will convert all newlines into \n in the visual selection and replace the selection with that string.
This mapping yanks the selection into the " register. Calls the function (isn't really necessary since its only one command). Then uses gv to reselect the visual selection and then pastes the quote register back onto the selected region.
Note: in vimscript all user defined functions must start with an Uppercase letter.

Here's another version based on Conner's answer. I took qed's suggestion and also added a fix for when the selection is entirely within one line.
import vim
def GetRange():
buf = vim.current.buffer
(lnum1, col1) = buf.mark('<')
(lnum2, col2) = buf.mark('>')
lines = vim.eval('getline({}, {})'.format(lnum1, lnum2))
if len(lines) == 1:
lines[0] = lines[0][col1:col2 + 1]
else:
lines[0] = lines[0][col1:]
lines[-1] = lines[-1][:col2 + 1]
return "\n".join(lines)

Related

What causes this return() to create a SyntaxError?

I need this program to create a sheet as a list of strings of ' ' chars and distribute text strings (from a list) into it. I have already coded return statements in python 3 but this one keeps giving
return(riplns)
^
SyntaxError: invalid syntax
It's the return(riplns) on line 39. I want the function to create a number of random numbers (randint) inside a range built around another randint, coming from the function ripimg() that calls this one.
I see clearly where the program declares the list I want this return() to give me. I know its type. I see where I feed variables (of the int type) to it, through .append(). I know from internet research that SyntaxErrors on python's return() functions usually come from mistype but it doesn't seem the case.
#loads the asciified image ("/home/userX/Documents/Programmazione/Python projects/imgascii/myascify/ascimg4")
#creates a sheet "foglio1", same number of lines as the asciified image, and distributes text on it on a randomised line
#create the sheet foglio1
def create():
ref = open("/home/userX/Documents/Programmazione/Python projects/imgascii/myascify/ascimg4")
charcount = ""
field = []
for line in ref:
for c in line:
if c != '\n':
charcount += ' '
if c == '\n':
charcount += '*' #<--- YOU GONNA NEED TO MAKE THIS A SPACE IN A FOLLOWING FUNCTION IN THE WRITER.PY PROGRAM
for i in range(50):#<------- VALUE ADJUSTMENT FROM WRITER.PY GOES HERE(default : 50):
charcount += ' '
charcount += '\n'
break
for line in ref:
field.append(charcount)
return(field)
#turn text in a list of lines and trasforms the lines in a list of strings
def poemln():
txt = open("/home/gcg/Documents/Programmazione/Python projects/imgascii/writer/poem")
arrays = []
for line in txt:
arrays.append(line)
txt.close()
return(arrays)
#rander is to be called in ripimg()
def rander(rando, fldepth):
riplns = []
for i in range(fldepth):
riplns.append(randint((rando)-1,(rando)+1)
return(riplns) #<---- THIS RETURN GIVES SyntaxError upon execution
#opens a rip on the side of the image.
def ripimg():
upmost = randint(160, 168)
positions = []
fldepth = 52 #<-----value is manually input as in DISTRIB function.
positions = rander(upmost,fldepth)
return(positions)
I omitted the rest of the program, I believe these functions are enough to get the idea, please tell me if I need to add more.
You have incomplete set of previous line's parenthesis .
In this line:-
riplns.append(randint((rando)-1,(rando)+1)
You have to add one more brace at the end. This was causing error because python was reading things continuously and thought return statement to be a part of previous uncompleted line.

Ultisnips python interpolation snippet, extracting number from filename

I am trying to make a snippet that will help me choose the right revision
number for migration, by reading all migration files from
application/migrations.
What I managed to do myself is that my filenames are being filtered while I am
typing, and when only one match left insert its revision number at the cursor
position (which are first 14 chars of filename always).
The problem is that when I hit TAB to select, I am also left with what I have
typed so far to search for that revision number, meaning something like this
remo20160812110447.
Question is, how to get rid of that remo in this case!?
NOTE: Example uses hardcoded values, for easier testing, those will be later
replaced by # lst = os.listdir('application/migrations') line.
Also an added bonus effect would be to present those 20160710171947 values as
human readable date format while choosing, but after hitting TAB insert their
original source version.
global !p
import datetime
def complete(t, opts):
if t:
opts = [ m for m in opts if t in m ]
if len(opts) == 1:
return opts[0][:14]
return "(" + '|'.join(opts) + ')'
endglobal
snippet cimigration "Inserts desired migration number, obtained via filenames"
$1`!p import os
# lst = os.listdir('application/migrations')
lst = [
'20160710171947_create.php',
'20160810112347_delete.php',
'20160812110447_remove.php'
]
snip.rv = complete(t[1], lst)`
endsnippet
This can definitely be performed in pure vimscript.
Here is a working prototype. It does work but has some issues with portability: global variables, reliance on iskeyword and uses two keybindings instead of one. But it was put together in an hour or so:
set iskeyword=#,48-57,_,-,.,192-255
let g:wordidx = 0
let g:word = ''
let g:match = 0
function! Suggest()
let l:glob = globpath('application/migrations', '*.php')
let l:files = map(split(l:glob), 'fnamemodify(v:val, ":t")')
let l:char = getline('.')[col('.')-1]
let l:word = ''
let l:suggestions = []
if l:char =~# '[a-zA-Z0-9_]'
if g:word ==# ''
let g:word = expand('<cword>')
let g:match = matchadd('ErrorMsg', g:word)
endif
let l:word = g:word
"let l:reg = '^' . l:word
let l:suggestions = filter(l:files, 'v:val =~ l:word')
if !empty(l:suggestions)
call add(l:suggestions, l:word)
"echo l:suggestions
let l:change = l:suggestions[g:wordidx]
let g:wordidx = (g:wordidx + 1) % len(l:suggestions)
"echo g:wordidx + 10
execute "normal! mqviwc" . l:change . "\<esc>`q"
endif
endif
"echo [l:word, l:suggestions]
endfunction
function! SuggestClear()
call matchdelete(g:match)
let g:wordidx = 0
let g:word = ''
let g:match = 0
endfunction
nnoremap <leader><tab> :call Suggest()<cr>
nnoremap <leader><cr> :call SuggestClear()<cr>
Adding this to your ~/.vimrc will allow you to steps through search matches with <leader><tab>. It will highlight the part that is being matched, to drop the highlight you need to type <leader><cr>.
You should always drop the highlight after use because the original search word is kept internally until you destroy it. Using <leader><tab> before clearing the match will substitute for the suggestions from the previous match.
Screencast (my leader is -):
If you have more vim questions join the vi.SE subsection of the website. You can probably get better answers there.
This can be achieved using post-expand-actions: https://github.com/SirVer/ultisnips/blob/master/doc/UltiSnips.txt#L1602

Multi-line Matching in Python

I've read all of the articles I could find, even understood a few of them but as a Python newb I'm still a little lost and hoping for help :)
I'm working on a script to parse items of interest out of an application specific log file, each line begins with a time stamp which I can match and I can define two things to identify what I want to capture, some partial content and a string that will be the termination of what I want to extract.
My issue is multi-line, in most cases every log line is terminated with a newline but some entries contain SQL that may have new lines within it and therefore creates new lines in the log.
So, in a simple case I may have this:
[8/21/13 11:30:33:557 PDT] 00000488 SystemOut O 21 Aug 2013 11:30:33:557 [WARN] [MXServerUI01] [CID-UIASYNC-17464] BMXAA6720W - USER = (ABCDEF) SPID = (2526) app (ITEM) object (ITEM) : select * from item where ((status != 'OBSOLETE' and itemsetid = 'ITEMSET1') and (exists (select 1 from maximo.invvendor where (exists (select 1 from maximo.companies where (( contains(name,' $AAAA ') > 0 )) and (company=invvendor.manufacturer and orgid=invvendor.orgid))) and (itemnum = item.itemnum and itemsetid = item.itemsetid)))) and (itemtype in (select value from synonymdomain where domainid='ITEMTYPE' and maxvalue = 'ITEM')) order by itemnum asc (execution took 2083 milliseconds)
This all appears as one line which I can match with this:
re.compile('\[(0?[1-9]|[12][0-9]|3[01])(\/)(0?[1-9]|[12][0-9]|3[01])(\/)([0-9]{2}).*(milliseconds)')
However in some cases there may be line breaks in the SQL, as such I want to still capture it (and potentially replace the line breaks with spaces). I am currently reading the file a line at a time which obviously isn't going to work so...
Do I need to process the whole file in one go? They are typically 20mb in size. How do I read the entire file and iterate through it looking for single or multi-line blocks?
How would I write a multi-line RegEx that would match either the whole thing on one line or of it is spread across multiple lines?
My overall goal is to parameterize this so I can use it for extracting log entries that match different patterns of the starting string (always the start of a line), the ending string (where I want to capture to) and a value that is between them as an identifier.
Thanks in advance for any help!
Chris.
import sys, getopt, os, re
sourceFolder = 'C:/MaxLogs'
logFileName = sourceFolder + "/Test.log"
lines = []
print "--- START ----"
lineStartsWith = re.compile('\[(0?[1-9]|[12][0-9]|3[01])(\/)(0?[1-9]|[12][0-9]|3[01])(\/)([0-9]{2})(\ )')
lineContains = re.compile('.*BMXAA6720W.*')
lineEndsWith = re.compile('(?:.*milliseconds.*)')
lines = []
with open(logFileName, 'r') as f:
for line in f:
if lineStartsWith.match(line) and lineContains.match(line):
if lineEndsWith.match(line) :
print 'Full Line Found'
print line
print "- Record Separator -"
else:
print 'Partial Line Found'
print line
print "- Record Separator -"
print "--- DONE ----"
Next step, for my partial line I'll continue reading until I find lineEndsWith and assemble the lines in to one block.
I'm no expert so suggestions are always welcome!
UPDATE - So I have it working, thanks to all the responses that helped direct things, I realize it isn't pretty and I need to clean up my if / elif mess and make it more efficient but IT's WORKING! Thanks for all the help.
import sys, getopt, os, re
sourceFolder = 'C:/MaxLogs'
logFileName = sourceFolder + "/Test.log"
print "--- START ----"
lineStartsWith = re.compile('\[(0?[1-9]|[12][0-9]|3[01])(\/)(0?[1-9]|[12][0-9]|3[01])(\/)([0-9]{2})(\ )')
lineContains = re.compile('.*BMXAA6720W.*')
lineEndsWith = re.compile('(?:.*milliseconds.*)')
lines = []
multiLine = False
with open(logFileName, 'r') as f:
for line in f:
if lineStartsWith.match(line) and lineContains.match(line) and lineEndsWith.match(line):
lines.append(line.replace("\n", " "))
elif lineStartsWith.match(line) and lineContains.match(line) and not multiLine:
#Found the start of a multi-line entry
multiLineString = line
multiLine = True
elif multiLine and not lineEndsWith.match(line):
multiLineString = multiLineString + line
elif multiLine and lineEndsWith.match(line):
multiLineString = multiLineString + line
multiLineString = multiLineString.replace("\n", " ")
lines.append(multiLineString)
multiLine = False
for line in lines:
print line
Do I need to process the whole file in one go? They are typically 20mb in size. How do I read the entire file and iterate through it looking for single or multi-line blocks?
There are two options here.
You could read the file block by block, making sure to attach any "leftover" bit at the end of each block to the start of the next one, and search each block. Of course you will have to figure out what counts as "leftover" by looking at what your data format is and what your regex can match, and in theory it's possible for multiple blocks to all count as leftover…
Or you could just mmap the file. An mmap acts like a bytes (or like a str in Python 2.x), and leaves it up to the OS to handle paging blocks in and out as necessary. Unless you're trying to deal with absolutely huge files (gigabytes in 32-bit, even more in 64-bit), this is trivial and efficient:
with open('bigfile', 'rb') as f:
with mmap.mmap(f.fileno(), length=0, access=mmap.ACCESS_READ) as m:
for match in compiled_re.finditer(m):
do_stuff(match)
In older versions of Python, mmap isn't a context manager, so you'll need to wrap contextlib.closing around it (or just use an explicit close if you prefer).
How would I write a multi-line RegEx that would match either the whole thing on one line or of it is spread across multiple lines?
You could use the DOTALL flag, which makes the . match newlines. You could instead use the MULTILINE flag and put appropriate $ and/or ^ characters in, but that makes simple cases a lot harder, and it's rarely necessary. Here's an example with DOTALL (using a simpler regexp to make it more obvious):
>>> s1 = """[8/21/13 11:30:33:557 PDT] 00000488 SystemOut O 21 Aug 2013 11:30:33:557 [WARN] [MXServerUI01] [CID-UIASYNC-17464] BMXAA6720W - USER = (ABCDEF) SPID = (2526) app (ITEM) object (ITEM) : select * from item where ((status != 'OBSOLETE' and itemsetid = 'ITEMSET1') and (exists (select 1 from maximo.invvendor where (exists (select 1 from maximo.companies where (( contains(name,' $AAAA ') > 0 )) and (company=invvendor.manufacturer and orgid=invvendor.orgid))) and (itemnum = item.itemnum and itemsetid = item.itemsetid)))) and (itemtype in (select value from synonymdomain where domainid='ITEMTYPE' and maxvalue = 'ITEM')) order by itemnum asc (execution took 2083 milliseconds)"""
>>> s2 = """[8/21/13 11:30:33:557 PDT] 00000488 SystemOut O 21 Aug 2013 11:30:33:557 [WARN] [MXServerUI01] [CID-UIASYNC-17464] BMXAA6720W - USER = (ABCDEF) SPID = (2526) app (ITEM) object (ITEM) : select * from item where ((status != 'OBSOLETE' and itemsetid = 'ITEMSET1') and
(exists (select 1 from maximo.invvendor where (exists (select 1 from maximo.companies where (( contains(name,' $AAAA ') > 0 )) and (company=invvendor.manufacturer and orgid=invvendor.orgid))) and (itemnum = item.itemnum and itemsetid = item.itemsetid)))) and (itemtype in (select value from synonymdomain where domainid='ITEMTYPE' and maxvalue = 'ITEM')) order by itemnum asc (execution took 2083 milliseconds)"""
>>> r = re.compile(r'\[(.*?)\].*?milliseconds\)', re.DOTALL)
>>> r.findall(s1)
['8/21/13 11:30:33:557 PDF']
>>> r.findall(s2)
['8/21/13 11:30:33:557 PDF']
As you can see the second .*? matched the newline just as easily as a space.
If you're just trying to treat a newline as whitespace, you don't need either; '\s' already catches newlines.
For example:
>>> s1 = 'abc def\nghi\n'
>>> s2 = 'abc\ndef\nghi\n'
>>> r = re.compile(r'abc\s+def')
>>> r.findall(s1)
['abc def']
>>> r.findall(s2)
['abc\ndef']
You can read an entire file into a string and then you can use re.split to make a list of all the entries separated by times. Here's an example:
f = open(...)
allLines = ''.join(f.readlines())
entries = re.split(regex, allLines)

How to re-indent Python code after changing indent width in Emacs?

I changed python-indent from 3 to 4. I then mark-whole-buffer and indent-for-tab-command. It gave me garbage.
There is the indent-region function. So I'd try mark the whole buffer, then M-x and type indent-region. It's usually bound to C-M-\, as far as I know.
Edit
Re-indentation does not work for a tab-width change. As I wrote in the comments changing spaces to tabs and then altering the tab-width is a solution:
"Guessing you are indenting with space and not tabs, you'd first do tabify on the buffer content with your tab-width set to 3. Then change tab-width to 4 and run untabify."
This is kind of a hack, but it won't give you the garbage that indent-region is giving you
1) Make sure tabs as spaces are set to 4 spaces. In a scratch buffer type:
(setq tab-width 4)
And then evaluate it by marking it and using M-x eval-region
2) Globally replace all sets of three spaces with a tab character
M-x replace-regexp [SPC][SPC][SPC][RET] C-q[TAB][RET]
3) Highlight the whole buffer and untabify
M-x mark-whole-buffer M-x untabify
This will convert all tabs into four spaces.
Try indent-region instead on the buffer. Initially bounded to C-M-\
This is a bit of a hack, but it worked for me as a quick work-around: to a "M-X replace-string", " " -> " ". Then you have to close and re-open if your emacs does an automatic idnent-detection on the file. Then you have to go through and fix mult-line code (with tab), and strings that have lots of spaces.
This might also help:
http://www.emacswiki.org/emacs/IndentingPython
In particular, PythonTidy is very effective for restructuring messy code, with minor hiccups (unfortunately the tool is not easy to configure):
http://www.emacswiki.org/emacs/PythonProgrammingInEmacs#toc17
May be it will not be useful or not by theme, but I use such script.
Run it from command line. (python reindent.py some.py)
Change string_equal, and replace_to.
import sys
file_name = sys.argv[1]
string_equal = " "
replace_to = " "
with open(file_name) as f:
data = f.readlines()
f.close()
def create_new_line(i):
new_line = ""
flag = True
cur_s = ""
for k in i:
if flag and k == " ":
cur_s += k
if cur_s == string_equal:
new_line += replace_to
cur_s = ""
else:
flag = False
new_line += k
return new_line
with open(file_name, "w") as f:
for i in data:
l = create_new_line(i)
f.write(l)
f.close()

difflib python formatting

I am using this code to find difference between two csv list and hove some formatting questions. This is probably an easy fix, but I am new and trying to learn and having alot of problems.
import difflib
diff=difflib.ndiff(open('test1.csv',"rb").readlines(), open('test2.csv',"rb").readlines())
try:
while 1:
print diff.next(),
except:
pass
the code works fine and I get the output I am looking for as:
Group,Symbol,Total
- Adam,apple,3850
? ^
+ Adam,apple,2850
? ^
bob,orange,-45
bob,lemon,66
bob,appl,-56
bob,,88
My question is how do I clean the formatting up, can I make the Group,Symbol,Total into sperate columns, and the line up the text below?
Also can i change the ? to represent a text I determine? such as test 1 and test 2 representing which sheet it comes from?
thanks for any help
Using difflib.unified_diff gives much cleaner output, see below.
Also, both difflib.ndiff and difflib.unified_diff return a Differ object that is a generator object, which you can directly use in a for loop, and that knows when to quit, so you don't have to handle exceptions yourself. N.B; The comma after line is to prevent print from adding another newline.
import difflib
s1 = ['Adam,apple,3850\n', 'bob,orange,-45\n', 'bob,lemon,66\n',
'bob,appl,-56\n', 'bob,,88\n']
s2 = ['Adam,apple,2850\n', 'bob,orange,-45\n', 'bob,lemon,66\n',
'bob,appl,-56\n', 'bob,,88\n']
for line in difflib.unified_diff(s1, s2, fromfile='test1.csv',
tofile='test2.csv'):
print line,
This gives:
--- test1.csv
+++ test2.csv
## -1,4 +1,4 ##
-Adam,apple,3850
+Adam,apple,2850
bob,orange,-45
bob,lemon,66
bob,appl,-56
So you can clearly see which lines were changed between test1.csv and test1.csv.
To line up the columns, you must use string formatting.
E.g. print "%-20s %-20s %-20s" % (row[0],row[1],row[2]).
To change the ? into any text test you like, you'd use s.replace('any text i like').
Your problem has more to do with the CSV format, since difflib has no idea it's looking at columnar fields. What you need is to figure out into which field the guide is pointing, so that you can adjust it when printing the columns.
If your CSV files are simple, i.e. they don't contain any quoted fields with embedded commas or (shudder) newlines, you can just use split(',') to separate them into fields, and figure out where the guide points as follows:
def align(line, guideline):
"""
Figure out which field the guide (^) points to, and the offset within it.
E.g., if the guide points 3 chars into field 2, return (2, 3)
"""
fields = line.split(',')
guide = guideline.index('^')
f = p = 0
while p + len(fields[f]) < guide:
p += len(fields[f]) + 1 # +1 for the comma
f += 1
offset = guide - p
return f, offset
Now it's easy to show the guide properly. Let's say you want to align your columns by printing everything 12 spaces wide:
diff=difflib.ndiff(...)
for line in diff:
code = line[0] # The diff prefix
print code,
if code == '?':
fld, offset = align(lastline, line[2:])
for f in range(fld):
print "%-12s" % '',
print ' '*offset + '^'
else:
fields = line[2:].rstrip('\r\n').split(',')
for f in fields:
print "%-12s" % f,
print
lastline = line[2:]
Be warned that the only reliable way to parse CSV files is to use the csv module (or a robust alternative); but getting it to play well with the diff format (in full generality) would be a bit of a headache. If you're mainly interested in readability and your CSV isn't too gnarly, you can probably live with an occasional mix-up.

Categories