Incorrect syntax near GO in SQL - python

I am concatenating many SQL statements and am running into the following errors:
"Incorrect syntax near GO" and "Incorrect syntax near ".
Oddly, when I delete the trailing space, the GO, and the space after the GO, and then press CTRL+Z to put the GO back, the error goes away. It's pretty weird.
Why does this happen?
How could I handle this in Python? Thanks.
')
END TRY
BEGIN CATCH
print ERROR_MESSAGE()
END CATCH
GO

As already mentioned in the comments, GO is not part of the SQL syntax; it is a batch delimiter used by Management Studio.
You can work around it in two ways: use subprocess to call sqlcmd, or cut the script into batches within Python. The subprocess + sqlcmd route only really works if you don't care about query results, since you would need to parse console output to get them.
I needed to build a database from SSMS-generated scripts in the past and wrote the function below as a result (updated, as I now have a better version that leaves comments in):
import re

def partition_script(sql_script: str) -> list:
    """ Function will take the string provided as parameter and cut it on every line that contains only a "GO" string.
    Contents of the script are also checked for GO's inside multi-line comments; these are rewritten as '-- GO' if found.
    If a GO was left in a multi-line comment,
    the cutting step would generate invalid code missing a multi-line comment marker in each part.
    :param sql_script: str
    :return: list
    """
    # Regex for finding GO's that are the only entry in a line
    find_go = re.compile(r'^\s*GO\s*$', re.IGNORECASE | re.MULTILINE)
    # Regex to find multi-line comments
    find_comments = re.compile(r'/\*.*?\*/', flags=re.DOTALL)
    # Get a list of multi-line comments that also contain lines with only GO
    go_check = [comment for comment in find_comments.findall(sql_script) if find_go.search(comment)]
    for comment in go_check:
        # Change the 'GO' entry to '-- GO', making it invisible for the cutting step
        sql_script = sql_script.replace(comment, re.sub(find_go, '-- GO', comment))
    # Removing single line comments, uncomment if needed
    # sql_script = re.sub(r'--.*$', '', sql_script, flags=re.MULTILINE)
    # Returning everything besides empty strings
    return [part for part in find_go.split(sql_script) if part != '']
Using this function, you can run scripts containing GO like this:
import pymssql
conn = pymssql.connect(server, user, password, "tempdb")
cursor = conn.cursor()
for part in partition_script(your_script):
    cursor.execute(part)
conn.close()
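To see what the cutting step yields without a database connection, here is a minimal sketch that applies the same find_go pattern to a small script. The sample script is made up for illustration and is not taken from the question.

```python
import re

# Same pattern as in partition_script: a line that contains only GO.
find_go = re.compile(r'^\s*GO\s*$', re.IGNORECASE | re.MULTILINE)

# Hypothetical sample script, invented for this illustration.
script = """BEGIN TRY
    PRINT 'first batch'
END TRY
BEGIN CATCH
    PRINT ERROR_MESSAGE()
END CATCH
GO
PRINT 'go on with the second batch'
GO
"""

# Split on GO lines and drop the empty remainder after the last GO.
batches = [part.strip() for part in find_go.split(script) if part.strip()]
for number, batch in enumerate(batches, start=1):
    print(f"-- batch {number} --")
    print(batch)
```

The word "go" inside the second PRINT statement is left alone because the pattern only matches when GO is the sole content of a line.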
I hope this helps.

Related

How do I remove unwanted additional empty lines after adding page break?

I am trying to reformat this .docx document using the python docx module. Each question ends with the specific expression "-- ans end --". I want to insert a page break after the expression with the following code:
import docx, re
from pathlib import Path
from docx.enum.text import WD_BREAK
filename = Path("DOCUMENT_NAME")
doc = docx.Document(filename)
for para in doc.paragraphs:
    match = re.search(r"-- ans end --", para.text)
    if match:
        run = para.add_run()
        run.add_break(WD_BREAK.PAGE)
After each page break there seem to be 2 empty lines,
which I tried to remove with:
para.text = para.text.strip("\n")
Stripping the empty lines before adding the page break does nothing, while stripping them after adding the page break removes the page break.
Please tell me how to eliminate, or avoid adding, the 2 empty lines. Thanks.
Update:
The page break should be added to the start of the next paragraph/section instead of after -- ans end -- (the end of this section) as the page break creates a new line when it is added to the end of a paragraph (try it on Word). Therefore I used this:
run = para.runs[0]
run._element.addprevious(new_run_element)
new_run = Run(new_run_element, run._parent)
new_run.text = ""
new_run.add_break(WD_BREAK.PAGE)
to add a page break to the start of next paragraph instead, which does not create a new line.
Have you looked at the contents of your doc before and after altering it? E.g.:
for para in doc.paragraphs:
    print(repr(para.text))  # the call to repr() makes your `\n`s show up
This is helpful for figuring out what is going on.
Prior to altering your doc, there are no \ns in the -- ans end -- paragraphs, so it makes sense that stripping the empty lines before adding your page break doesn't do anything. Also, prior to altering your doc, there is an empty string in a paragraph right after -- ans end --:
'-- ans --'
'-- ans end --'
''
is what stuff looks like before you edit the doc. (Except there is one case where -- ans end -- is followed by two ''s, which is annoyingly different from all the others.)
After editing the doc, those sections look like this.
'-- ans end --\n'
''
When I run this code, as I mentioned in my comment above, the page break actually shows up in the wrong spot: right after -- ans end -- instead of right before. I think that can be worked around in a fairly straightforward way; I'll leave it to you if you're also having that issue.
If you remove those '' paragraphs I think that solves your problem. It is annoying to remove a paragraph from a document, but see this GitHub answer for an incantation which does it.

Python : correct use of set_completion_display_matches_hook

I'm trying to write a function to display a custom view when users press the tab button. Apparently "set_completion_display_matches_hook" function is what I need, I can display a custom view, but the problem is that I have to press Enter to get a prompt again.
The solution in Python 2 seems to be this (solution here):
def match_display_hook(self, substitution, matches, longest_match_length):
    print ''
    for match in matches:
        print match
    print self.prompt.rstrip(),
    print readline.get_line_buffer(),
    readline.redisplay()
But it doesn't work with Python3. I made these syntax changes :
def match_display_hook(self, substitution, matches, longest_match_length):
    print('\n----------------------------------------------\n')
    for match in matches:
        print(match)
    print(self.prompt.rstrip() + readline.get_line_buffer())
    readline.redisplay()
Any ideas please ?
First, the Python 2 code uses trailing commas to leave the line unfinished. In Python 3, that is done with the end keyword argument:
print(self.prompt.rstrip(), readline.get_line_buffer(), sep='', end='')
Then, a flush is required to actually display the unfinished line (due to line buffering):
sys.stdout.flush()
The redisplay() call does not seem to be needed.
The final code:
def match_display_hook(self, substitution, matches, longest_match_length):
    print()
    for match in matches:
        print(match)
    print(self.prompt.rstrip(), readline.get_line_buffer(), sep='', end='')
    sys.stdout.flush()
The redisplay() function:
void rl_redisplay (void)
Change what's displayed on the screen to reflect the current contents of rl_line_buffer.
In your example you have written to stdout, but not changed that buffer.
Printing and flushing as described in the other answer should work.
One issue you will have, however, is cursor position. Say you have this scenario:
$ cmd some_file
^
+---- User has back-tracked here and want to insert an option.
<TAB> completion with print and flush will put cursor
at end of `some_file' and the line will get an extra 15
spaces after that ...
One way to remedy this is to first get the cursor position, then use ANSI sequences to reposition the cursor.
buf = readline.get_line_buffer()
x = readline.get_endidx()
print(self.prompt + buf, end='')
if x < len(buf):
    # Set cursor at old column position
    print("\r\033[%dC" % (x + len(self.prompt)), end='')
sys.stdout.flush()
Now, of course, you get another issue if the prompt itself contains ANSI sequences, typically for color or the like. Then you cannot use len(prompt) but have to find the printed / visible length.
When the prompt is passed to readline, non-printing sequences have to be wrapped in open and close guard bytes, \x01 and \x02 respectively.
So one typically gets:
prompt = '\001\033[31;1m\002VISIBLE_TEXT\001\033[0m\002 '
instead of:
prompt = '\033[31;1mVISIBLE_TEXT\033[0m '
With those guards it should be easy enough to strip the prompt down to the visible text.
Typically something like:
clean_prompt = re.sub(r'\001[^\002]*\002', '', prompt)
Cache the length of that and use it when printing the readline line manually. Note that you also have to remove the guards when printing the prompt manually, as in the hook function. (They are still needed in input(prompt), though.)
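As a rough sketch of the two clean-up steps just described, the helpers below compute the visible prompt and a version that is safe to print by hand. The function names visible_prompt and printable_prompt are made up for this illustration; only the guard bytes and the sample prompt come from the text above.

```python
import re

# Guard bytes \001 ... \002 mark non-printing spans in a readline prompt.
GUARDED = re.compile(r'\001[^\002]*\002')

def visible_prompt(prompt):
    """Return the prompt with every guarded (non-printing) span removed."""
    return GUARDED.sub('', prompt)

def printable_prompt(prompt):
    """Return the prompt with only the guard bytes removed, escapes kept."""
    return prompt.replace('\001', '').replace('\002', '')

prompt = '\001\033[31;1m\002VISIBLE_TEXT\001\033[0m\002 '
visible_length = len(visible_prompt(prompt))
print(visible_length)
```

The visible length is what belongs in the cursor-column calculation, while printable_prompt(prompt) is what one would write to stdout inside the hook.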
This one worked for me in Python 3, redisplaying the substitution at the end of the matches display:
def match_display_hook(self, substitution, matches, longest_match_length):
    print("")
    for match in matches:
        print(match)
    print("")
    sys.stdout.write(substitution)
    sys.stdout.flush()
    return None
The previous ones, which print the prompt, didn't work for me. (I didn't get to the bottom of the problem.)

Python search and replace in a file

I was trying to write a script to automate some cleanups in the Linux kernel. The first thing on my agenda was to remove the braces ({}) around C-style if statements where they aren't necessary, i.e. around single-statement blocks. With my limited knowledge of regex in Python I got the code to a working state, so that for example:
if (!buf || !buf_len) {
    TRACE_RET(chip, STATUS_FAIL);
}
is turned by the script into:
if (!buf || !buf_len)
    TRACE_RET(chip, STATUS_FAIL);
That's what I want, but when I try it on real source files it seems to pick an arbitrary if statement that has a multi-statement block, delete its opening brace, and then remove a closing brace much further down the program, usually on an else statement or a long if statement.
Can someone please help me make the script touch an if statement only if it has a single-statement block, and correctly delete its corresponding opening and closing braces?
The current script looks like:
from sys import argv
import os
import sys
import re
get_filename = argv[1]
target = open(get_filename)
rename = get_filename + '.tmp'
temp = open(rename, 'w')
def if_statement():
    look = target.read()
    pattern = r'''if (\([^.)]*\)) (\{)(\n)([^>]+)(\})'''
    replacement = r'''if \1 \3\4'''
    pattern_obj = re.compile(pattern, re.MULTILINE)
    outtext = re.sub(pattern_obj, replacement, look)
    temp.write(outtext)
    temp.close()
    target.close()
if_statement()
Thanks in advance
In theory, this would mostly work:
re.sub(r'(if\s*\([^{]+\)\s*){([^;]*;)\s*}', r'\1\2', yourstring)
Note that this will fail on nested single-statement blocks and on semicolons inside string or character literals.
In general, trying to parse C code with regex is a bad idea, and you really shouldn't get rid of those braces anyway. It's good practice to have them and they're not hurting anything.
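For anyone who wants to try the pattern anyway, here is a minimal sketch that wraps it in a function and runs it on a single-statement and a multi-statement block. The function name and the multi-statement sample are made up for illustration, and the braces are escaped only for readability.

```python
import re

# Same idea as the pattern above, with the braces escaped for readability.
SINGLE_STATEMENT_IF = re.compile(r'(if\s*\([^{]+\)\s*)\{([^;]*;)\s*\}')

def strip_single_statement_braces(source):
    """Drop the braces around if-blocks that contain exactly one statement."""
    return SINGLE_STATEMENT_IF.sub(r'\1\2', source)

single = "if (!buf || !buf_len) {\n\tTRACE_RET(chip, STATUS_FAIL);\n}\n"
multi = "if (a) {\n\tfoo();\n\tbar();\n}\n"

print(strip_single_statement_braces(single))
# The multi-statement block is returned unchanged.
print(strip_single_statement_braces(multi))
```

Note that the space before the removed opening brace is kept, so the rewritten if line ends with trailing whitespace that a real cleanup script would probably want to trim.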

A Python script I've written for correcting table names of the SQL dumps from Windows. Any comments?

As a newbie in Python, I thought about writing a quick and dirty script for correcting the table name capitalization in a MySQL dump file (created by phpMyAdmin).
The idea is that, since the correct capitalization of the table names is in the comments, I'm going to use it.
e.g.:
-- --------------------------------------------------------
--
-- Table structure for table `Address`
--
The reason I'm asking here is that I don't have a mentor on Python programming and I was hoping you guys could steer me to the right direction. It feels like there's a lot of stuff I'm doing wrong (maybe it's not pythonic) I'd really appreciate your help, thanks in advance!
Here's what I've written (and it works):
#!/usr/bin/env python
import re
filename = 'dump.sql'
def get_text_blocks(filename):
    text_blocks = []
    text_block = ''
    separator = '-- -+'
    for line in open(filename, 'r'):
        text_block += line
        if re.match(separator, line):
            if text_block:
                text_blocks.append(text_block)
                text_block = ''
    return text_blocks

def fix_text_blocks(text_blocks):
    f = open(filename + '-fixed', 'w')
    for block in text_blocks:
        table_pattern = re.compile(r'Table structure for table `(.+)`')
        correct_table_name = table_pattern.search(block)
        if correct_table_name:
            replacement = 'CREATE TABLE IF NOT EXISTS `' + correct_table_name.groups(0)[0] + '`'
            block = re.sub(r'CREATE TABLE IF NOT EXISTS `(.+)`', replacement, block)
        f.write(block)

if __name__ == '__main__':
    fix_text_blocks(get_text_blocks(filename))
Looks fairly good, so the following are relatively minor:
get_text_blocks basically splits the entire text by the separator, correct? If so, I think this can be done with a single regex with a re.MULTILINE flag. Something like r'(.*?)\n-- -+' (warning: untested).
If you don't want to use a single regex but prefer to parse the file in a loop, you can ditch the regex for str.startswith. You should also not concatenate strings the way you do with text_block, since every concatenation creates a new string. You can use either the StringIO class, or keep a list of lines and then join them with '\n'.join.
The nested 'if' can be dropped: use the 'and' operator instead.
In any case, working with files (and other objects which have a 'finally' logic) is now done with the 'with [object] as [name]:' clause. Look it up, it's nifty.
If you don't do that - always close your files when you finish working with them, preferably in a 'finally' clause.
I prefer opening files with the 'b' flag as well. Prevents '\r\n' magic in Windows.
In fix_text_blocks, the pattern should be compiled outside the for loop.
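Putting several of these suggestions together, one possible sketch is shown below. It is an illustration under assumptions rather than a drop-in replacement: the function names are made up, the separator is split with a single multiline regex, the patterns are compiled once, and the files are handled with a with statement.

```python
import re

# Compiled once, outside any loop, as suggested above.
SEPARATOR = re.compile(r'(^-- -+[ \t]*$)', re.MULTILINE)
TABLE_COMMENT = re.compile(r'Table structure for table `(.+)`')
CREATE_TABLE = re.compile(r'CREATE TABLE IF NOT EXISTS `.+`')

def fix_table_names(dump_text):
    """Copy each table's capitalization from its comment into CREATE TABLE."""
    fixed = []
    # The capturing group makes split() keep the separator lines themselves.
    for block in SEPARATOR.split(dump_text):
        found = TABLE_COMMENT.search(block)
        if found:
            statement = 'CREATE TABLE IF NOT EXISTS `' + found.group(1) + '`'
            # A function replacement avoids backslash handling in the name.
            block = CREATE_TABLE.sub(lambda match: statement, block)
        fixed.append(block)
    return ''.join(fixed)

def fix_dump_file(source_path, target_path):
    """File wrapper: both files are closed even if an error occurs."""
    with open(source_path) as source, open(target_path, 'w') as target:
        target.write(fix_table_names(source.read()))
```

Keeping the text transformation separate from the file handling also makes the fixing logic easy to try on a plain string.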

regex for parsing SQL statements

I've got an IronPython script that executes a bunch of SQL statements against a SQL Server database. The statements are large strings that actually contain multiple statements, separated by the "GO" keyword. That works when they're run from SQL Server Management Studio and some other tools, but not in ADO. So I split up the strings using the 2.5 "re" module like so:
splitter = re.compile(r'\bGO\b', re.IGNORECASE)
for script in splitter.split(scriptBlob):
    if(script):
        [... execute the query ...]
This breaks in the rare case that there's the word "go" in a comment or a string. How in the heck would I work around that? i.e. correctly parse this string into two scripts:
-- this is a great database script! go team go!
INSERT INTO myTable(stringColumn) VALUES ('go away!')
/*
here are some comments that go with this script.
*/
GO
INSERT INTO myTable(stringColumn) VALUES ('this is the next script')
EDIT:
I searched more and found this SQL documentation:
http://msdn.microsoft.com/en-us/library/ms188037(SQL.90).aspx
As it turns out, GO must be on its own line, as some answers suggested. However, it can be followed by a "count" integer, which will actually execute the statement batch that many times (has anybody actually used that before??), and it can be followed by a single-line comment on the same line (but not a multi-line one; I tested this). So the magic regex would look something like:
"(?m)^\s*GO\s*\d*\s*$"
Except this doesn't account for:
a possible single-line comment ("--" followed by any character except a line break) at the end.
the whole line being inside a larger multi-line comment.
I'm not concerned about capturing the "count" argument and using it. Now that I have some technical documentation, I'm tantalizingly close to writing this "to spec" and never having to worry about it again.
Is "GO" always on a line by itself? You could just split on "^GO$".
Since you can have comments inside comments, nested comments, comments inside queries, etc., there is no sane way to do it with regexes.
Just imagine the following script:
INSERT INTO table (name) VALUES (
-- GO NOW GO
'GO to GO /* GO */ GO' +
/* some comment 'go go go'
-- */ 'GO GO' /*
GO */
)
Not to mention:
INSERT INTO table (go) values ('xxx') GO
The only way would be to build a stateful parser instead. One that reads a char at a time, and has a flag that will be set when it is inside a comment/quote-delimited string/etc and reset when it ends, so the code can ignore "GO" instances when inside those.
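A minimal sketch of such a stateful splitter is shown below. It rests on several assumptions that are not from the answer: the function name split_batches is made up, only single-quoted strings, -- comments and non-nested /* */ comments are tracked, and a GO line may carry an optional count and trailing -- comment.

```python
import re

# A batch separator: GO alone on a line, optional count, optional -- comment.
GO_LINE = re.compile(r'^\s*GO(?:\s+\d+)?\s*(?:--.*)?$', re.IGNORECASE)

def split_batches(script):
    """Split a script on GO lines, skipping GO inside comments and strings."""
    batches, current = [], []
    in_block_comment = False  # currently inside /* ... */
    in_string = False         # currently inside '...'
    for line in script.splitlines(keepends=True):
        # GO only counts when no comment or string is open at line start.
        if not in_block_comment and not in_string and GO_LINE.match(line):
            batches.append(''.join(current))
            current = []
            continue
        current.append(line)
        i = 0
        while i < len(line):
            two = line[i:i + 2]
            if in_block_comment:
                if two == '*/':
                    in_block_comment = False
                    i += 1
            elif in_string:
                # A doubled '' closes and reopens, so the state stays right.
                if line[i] == "'":
                    in_string = False
            elif two == '--':
                break  # rest of the line is a comment
            elif two == '/*':
                in_block_comment = True
                i += 1
            elif line[i] == "'":
                in_string = True
            i += 1
    batches.append(''.join(current))
    return [batch for batch in batches if batch.strip()]
```

Because the flags carry over from one line to the next, a GO line that sits inside a multi-line comment or a multi-line string literal is treated as ordinary text.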
If GO is always on a line by itself you can use split like this:
#!/usr/bin/python
import re
sql = """-- this is a great database script! go team go!
INSERT INTO myTable(stringColumn) VALUES ('go away!')
/*
here are some comments that go with this script.
*/
GO 5 --this is a test
INSERT INTO myTable(stringColumn) VALUES ('this is the next script')"""
statements = re.split(r"(?m)^\s*GO\s*(?:[0-9]+)?\s*(?:--.*)?$", sql)
for statement in statements:
    print("the statement is\n%s\n" % statement)
(?m) turns on multiline matching, that is, ^ and $ will match at the start and end of each line (instead of only at the start and end of the string).
^ matches at the start of a line
\s* matches zero or more whitespaces (space, tab, etc.)
GO matches a literal GO
\s* matches as before
(?:[0-9]+)? matches an optional integer number (with possible leading zeros)
\s* matches as before
(?:--.*)? matches an optional end-of-line comment
$ matches at the end of a line
The split will consume the GO line, so you won't have to worry about it. This will leave you with a list of statements.
This modified split has a problem: it will not give you back the number after the GO. If that is important, I would say it is time to move to a parser of some form.
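If the count does matter and a full parser feels like too much, one possible middle ground is to capture the number with a group and walk the matches with finditer instead of split. This is only a sketch: the function name split_with_counts is made up, and it shares the limitation that a GO line inside a multi-line comment or string would still be treated as a separator.

```python
import re

# Same shape as the pattern above, with the count captured in group 1.
GO_LINE = re.compile(r"^[ \t]*GO[ \t]*([0-9]+)?[ \t]*(?:--.*)?$",
                     re.IGNORECASE | re.MULTILINE)

def split_with_counts(sql):
    """Return (batch_text, repeat_count) pairs; the count defaults to 1."""
    results = []
    start = 0
    for match in GO_LINE.finditer(sql):
        batch = sql[start:match.start()].strip()
        count = int(match.group(1)) if match.group(1) else 1
        if batch:
            results.append((batch, count))
        start = match.end()
    # Whatever follows the last GO is a final batch that runs once.
    tail = sql[start:].strip()
    if tail:
        results.append((tail, 1))
    return results

sql = ("INSERT INTO myTable(stringColumn) VALUES ('go away!')\n"
       "GO 5 --this is a test\n"
       "INSERT INTO myTable(stringColumn) VALUES ('this is the next script')\n")
for batch, count in split_with_counts(sql):
    print(count, batch)
```

Each batch can then be executed count times by the caller.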
This won't detect if GO ever is used as a variable name inside some statement, but should take care of those inside comments or strings.
EDIT: This now works if GO is part of the statement, as long as it is not on its own line.
import re
line_comment = r'(?:--|#).*$'
block_comment = r'/\*[\S\s]*?\*/'
single_quote_string = r"'(?:\\.|[^'\\])*'"
double_quote_string = r'"(?:\\.|[^"\\])*"'
go_word = r'^[^\S\n]*(?P<GO>GO)[^\S\n]*\d*[^\S\n]*(?:(?:--|#).*)?$'
full_pattern = re.compile(r'|'.join((
    line_comment,
    block_comment,
    single_quote_string,
    double_quote_string,
    go_word,
)), re.IGNORECASE | re.MULTILINE)

def split_sql_statements(statement_string):
    last_end = 0
    for match in full_pattern.finditer(statement_string):
        # Comments and strings are matched only so that any GO inside them is skipped
        if match.group('GO'):
            yield statement_string[last_end:match.start()]
            last_end = match.end()
    yield statement_string[last_end:]
Example usage:
statement_string = r"""
-- this is a great database script! go team go!
INSERT INTO go(go) VALUES ('go away!')
go 7 -- foo
INSERT INTO go(go) VALUES (
'I have to GO " with a /* comment to GO inside a /* GO string /*'
)
/*
here are some comments that go with this script.
*/
GO
INSERT INTO go(go) VALUES ('this is the next script')
"""
for statement in split_sql_statements(statement_string):
    print('=======')
    print(statement)
Output:
=======
-- this is a great database script! go team go!
INSERT INTO go(go) VALUES ('go away!')
=======
INSERT INTO go(go) VALUES (
'I have to GO " with a /* comment to GO inside a /* GO string /*'
)
/*
here are some comments that go with this script.
*/
=======
INSERT INTO go(go) VALUES ('this is the next script')
