I want to see how the page is interacted with during my tests, e.g. what elements currently have the focus and where the interaction happens (similar to what the Cypress UI does).
How can I most conveniently achieve this in Selenium for Python?
Heavily inspired by the helper function from https://developer.mozilla.org/en-US/docs/Web/API/CSSStyleSheet/insertRule#Function_to_add_a_stylesheet_rule, which adds global stylesheet rules to a page, I created add_css.js (in the subfolder helper_js):
/* global arguments */
(function (rules) {
var styleEl = document.createElement("style");
styleEl.classList.add("PART-OF-SELENIUM-TESTING");
// Append <style> element to <head>
document.head.appendChild(styleEl);
// Grab style element's sheet
var styleSheet = styleEl.sheet;
for (var i = 0; i < rules.length; i++) {
var j = 1,
rule = rules[i],
selector = rule[0],
propStr = "";
// If the second argument of a rule is an array of arrays, correct our variables.
if (Array.isArray(rule[1][0])) {
rule = rule[1];
j = 0;
}
for (var pl = rule.length; j < pl; j++) {
var prop = rule[j];
propStr += prop[0] + ": " + prop[1] + (prop[2] ? " !important" : "") + ";\n";
}
// Insert CSS Rule
styleSheet.insertRule(selector + "{" + propStr + "}", styleSheet.cssRules.length);
}
}).apply(null, arguments);
which I then loaded and injected into the page using:
import pkgutil
add_css = pkgutil.get_data("helper_js", "add_css.js").decode("utf8")
# ...
driver.execute_script(
add_css,
[
[":active", ["outline", "3px dashed red", True]],
[":focus", ["outline", "3px dashed yellow", True]],
[":active:focus", ["outline", "3px dashed orange", True]],
]
)
in order to add global styles for elements which are active or have focus.
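For comparison, here is a minimal sketch of a variant that needs no separate .js file. It is not the approach above: instead of insertRule it builds the CSS text in Python and sets it as the textContent of a new style element. The function name build_style_script and the dict-based rule format are assumptions made for this illustration only.

```python
import json


def build_style_script(rules):
    """Return JavaScript that appends one <style> element built from `rules`.

    `rules` maps a CSS selector to a dict of property -> value. Every
    declaration is marked !important so it wins over the page's own styles.
    (Hypothetical helper, not part of Selenium.)
    """
    css = "\n".join(
        "%s { %s }" % (
            selector,
            " ".join("%s: %s !important;" % item for item in props.items()),
        )
        for selector, props in rules.items()
    )
    # json.dumps turns the CSS text into a safely quoted JavaScript string.
    return (
        "var el = document.createElement('style');"
        "el.classList.add('PART-OF-SELENIUM-TESTING');"
        "el.textContent = %s;"
        "document.head.appendChild(el);" % json.dumps(css)
    )


script = build_style_script({
    ":active": {"outline": "3px dashed red"},
    ":focus": {"outline": "3px dashed yellow"},
    ":active:focus": {"outline": "3px dashed orange"},
})
# With a live WebDriver instance: driver.execute_script(script)
```

The injected styles are lost on every page load, so the script has to be executed again after each navigation.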
Edited overview and Scope
This problem boils down to the following: given a source file, automatically place opening and closing braces around optional control blocks in C/C++. As far as I know, these blocks are if, else, do, while, and for.
Overview
I am attempting to trace and analyze various loops, statements, and the like in a massive code repository that I have not written myself. My end goal is to collect timing statistics on all loops in a given piece of source code (this will be expanded to other constructs in the future, but that is out of scope for this problem). The trace functions do various things, but they all share the same requirement: they must be placed before and after the block of interest is executed.
In essence, I want to transform the code:
for (i = 0; i < some_condition; i++) {
some_code = goes(here);
}
for (i = 0; i < some_condition; i++)
{
some_code = goes(here);
}
for (i = 0; i < some_condition; i++) { some_code = goes(here); }
for (i = 0; i < some_condition; i++)
some_code = goes(here);
for (i = 0; i < some_condition; i++)
for (i = 0; i < some_condition; i++)
some_code = goes(here);
to the following:
S_TRACE(); for (i = 0; i < some_condition; i++) {
some_code = goes(here);
} E_TRACE();
S_TRACE(); for (i = 0; i < some_condition; i++)
{
some_code = goes(here);
} E_TRACE();
S_TRACE(); for (i = 0; i < some_condition; i++) { some_code = goes(here); } E_TRACE();
S_TRACE(); for (i = 0; i < some_condition; i++) {
some_code = goes(here); } E_TRACE();
S_TRACE(); for (i = 0; i < some_condition; i++) {
S_TRACE(); for (i = 0; i < some_condition; i++) {
some_code = goes(here); } E_TRACE(); } E_TRACE();
Basically, without new lines of code added, I want to insert a function before the statement begins (easy) and after the statement (which can be hard). For example, the following code is actually in the repository of code:
for( int i = 0; names[i]; i++ )
if( !STRCMP( arg, names[i] ) )
{
*dst = names[i];
return 0;
}
return -1;
Terrible readability aside, I'd like to place braces on this type of loop, and insert my tracing functions. Arguments to the function (to account for nesting) I have omitted.
Current Implementation
My current implementation uses regex in Python, as I'm fairly comfortable and quick in this language. Relevant segments of implementation are as follows:
import re

source = []
loops = [r"^\s*(for\s*\(.*\))\s*($|{\s*$|\s*)", r"^\s*(while\s*\(.*\))\s*($|{\s*$|\s*)", r"^\s*(do)\s*({?)$"]

def analyize_line(out_file):
    lnum, lstr = source.pop(0)
    for index, loop_type in enumerate(loops):
        match = re.findall(loop_type, lstr)
        if match:
            print(lnum + 1, ":", match[0][0])
            if '{' in match[0][1]:
                out_file.write(lstr.replace(match[0][0], "S_TRACE(); {}".format(match[0][0])))
                look_ahead_place()
                return
            else:
                last_chance = lstr + source[0][1]
                last_match = re.findall(loop_type, last_chance)
                if last_match and '{' in last_match[0][1]:
                    # same as above
                    out_file.write(lstr.replace(match[0][0], "S_TRACE(); {}".format(match[0][0])))
                    lcnum, lcstr = source.pop(0)
                    out_file.write(lcstr)
                    look_ahead_place()
                else:
                    # No matching bracket, make one
                    out_file.write(lstr.replace(match[0][0], "S_TRACE(); {} {{".format(match[0][0])))
                    look_ahead_and_place_bracket()
                return
    # if we did not match, just a normal line
    out_file.write(lstr)

def look_ahead_place():
    depth = 1
    for idx, nl in enumerate(source):
        substr = ""
        for c in nl[1]:
            substr += c
            if depth > 0:
                if c == '{':
                    depth += 1
                elif c == '}':
                    depth -= 1
                    if depth == 0:
                        substr += " E_TRACE(); "
        if depth == 0:
            source[idx][1] = substr
            return
    print("Error finding closing bracket here!")
    exit()

def look_ahead_and_place_bracket():
    for idx, nl in enumerate(source):
        # Is the next line a scopable? how to handle multiline? ???
        # TODO
        return

def trace_loops():
    global source
    src_filename = "./example.c"
    src_file = open(src_filename)
    out_file = open(src_filename + ".tr", 'w')
    source = [[number, line] for number, line in enumerate(src_file.readlines())]
    while len(source) > 0:
        analyize_line(out_file)

trace_loops()
The example.c is the example provided above for demonstration purposes. I am struggling to come up with an algorithm that will handle inline loops, loops with no matching braces, and loops that contain no braces but have multiline bodies.
Any help in the development of my algorithm would be much appreciated. Let me know in the comments if there is something that needs to be addressed more.
EDIT :: Further Examples and Expected Results
Characters that are added are surrounded by < and > tokens for visibility.
Nested Brace-less:
for( int i = 0; i < h->fdec->i_plane; i++ )
for( int y = 0; y < h->param.i_height >> !!i; y++ )
fwrite( &h->fdec->plane[i][y*h->fdec->i_stride[i]], 1, h->param.i_width >> !!i, f );
<S_TRACE(); >for( int i = 0; i < h->fdec->i_plane; i++ )< {>
<S_TRACE(); >for( int y = 0; y < h->param.i_height >> !!i; y++ )< {>
fwrite( &h->fdec->plane[i][y*h->fdec->i_stride[i]], 1, h->param.i_width >> !!i, f );< } E_TRACE();>< } E_TRACE();>
Nested Mixed:
for( int i = 0; i < h->fdec->i_plane; i++ ) {
for( int y = 0; y < h->param.i_height >> !!i; y++ )
fwrite( &h->fdec->plane[i][y*h->fdec->i_stride[i]], 1, h->param.i_width >> !!i, ff );
}
<S_TRACE(); >for( int i = 0; i < h->fdec->i_plane; i++ ) {
<S_TRACE(); >for( int y = 0; y < h->param.i_height >> !!i; y++ )< {>
fwrite( &h->fdec->plane[i][y*h->fdec->i_stride[i]], 1, h->param.i_width >> !!i, ff );< } E_TRACE();>
}< E_TRACE();>
Large Multiline Nested Brace-less:
for( int i = 0; i < h->sh.i_mmco_command_count; i++ )
for( int j = 0; h->frames.reference[j]; j++ )
if( h->frames.reference[j]->i_poc == h->sh.mmco[i].i_poc )
x264_frame_push_unused(
h,
x264_frame_shift( &h->frames.reference[j] )
);
<S_TRACE(); >for( int i = 0; i < h->sh.i_mmco_command_count; i++ )< {>
<S_TRACE(); >for( int j = 0; h->frames.reference[j]; j++ )< {>
if( h->frames.reference[j]->i_poc == h->sh.mmco[i].i_poc )
x264_frame_push_unused(
h,
x264_frame_shift( &h->frames.reference[j] )
);< } E_TRACE();>< } E_TRACE();>
This Gross Multiliner:
for( int j = 0;
j < ((int) offsetof(x264_t,stat.frame.i_ssd) - (int) offsetof(x264_t,stat.frame.i_mv_bits)) / (int) sizeof(int);
j++ )
((int*)&h->stat.frame)[j] += ((int*)&t->stat.frame)[j];
for( int j = 0; j < 3; j++ )
h->stat.frame.i_ssd[j] += t->stat.frame.i_ssd[j];
h->stat.frame.f_ssim += t->stat.frame.f_ssim;
<S_TRACE(); >for( int j = 0;
j < ((int) offsetof(x264_t,stat.frame.i_ssd) - (int) offsetof(x264_t,stat.frame.i_mv_bits)) / (int) sizeof(int);
j++ )< {>
((int*)&h->stat.frame)[j] += ((int*)&t->stat.frame)[j];< } E_TRACE();>
<S_TRACE(); >for( int j = 0; j < 3; j++ )< {>
h->stat.frame.i_ssd[j] += t->stat.frame.i_ssd[j];< } E_TRACE();>
h->stat.frame.f_ssim += t->stat.frame.f_ssim;
If Statement Edgecase:
Perhaps my implementation requires an inclusion of if statements to account for this?
if( h->sh.i_type != SLICE_TYPE_I )
for( int i_list = 0; i_list < 2; i_list++ )
for( int i = 0; i < 32; i++ )
h->stat.i_mb_count_ref[h->sh.i_type][i_list][i] += h->stat.frame.i_mb_count_ref[i_list][i];
You are going down a rabbit hole. The more cases you handle, the more cases you will run into, until you have to write an actual parser for C++, which will require learning a whole technology toolchain.
Instead I would strongly recommend that you simplify your life by using a formatting tool like clang-format that already knows how to parse C++ to first rewrite with consistent formatting (so braces are now always there), and then you just need to worry about balanced braces.
(If this is part of a build process, you can copy code, reformat it, then analyze reformatted code.)
Note, if the code makes interesting use of templates, this might not be enough. But it will hopefully get you most of the way there.
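To illustrate the "then you just need to worry about balanced braces" step, here is a minimal sketch in Python. It assumes the source has already been reformatted so that every for/while body is braced, and that no parentheses or braces appear inside string literals, character literals, or comments. S_TRACE and E_TRACE are the macros from the question; the names add_traces and _match are made up for this example. A do loop itself is left untraced here.

```python
import re

# A loop keyword followed by its opening parenthesis.
LOOP_HEAD = re.compile(r"\b(for|while)\s*\(")


def _match(src, i, open_ch, close_ch):
    """Return the index of the delimiter that closes the one at src[i]."""
    depth = 0
    for j in range(i, len(src)):
        if src[j] == open_ch:
            depth += 1
        elif src[j] == close_ch:
            depth -= 1
            if depth == 0:
                return j
    raise ValueError("unbalanced %r at offset %d" % (open_ch, i))


def add_traces(src):
    """Wrap every braced for/while loop in S_TRACE(); ... E_TRACE();"""
    # (offset, rank, text): rank 0 sorts a closing trace before an opening
    # trace that happens to land on the same offset.
    inserts = []
    for m in LOOP_HEAD.finditer(src):
        close_paren = _match(src, m.end() - 1, "(", ")")
        k = close_paren + 1
        while k < len(src) and src[k].isspace():
            k += 1
        if k >= len(src) or src[k] != "{":
            continue  # e.g. the trailing "while (...);" of a do-while
        close_brace = _match(src, k, "{", "}")
        inserts.append((m.start(), 1, "S_TRACE(); "))
        inserts.append((close_brace + 1, 0, " E_TRACE();"))

    out, pos = [], 0
    for offset, _, text in sorted(inserts):
        out.append(src[pos:offset])
        out.append(text)
        pos = offset
    out.append(src[pos:])
    return "".join(out)
```

Nested loops need no special handling: each loop head is located independently and its own closing brace is found by counting depth. Note that a return or goto inside the loop body would still skip the E_TRACE() call.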
After extensive research, numerous applications, and many implementations, I've gotten just what I needed.
There is an existing solution called Uncrustify. The documentation is a bit lacking, but with some probing today the following config will do as I requested above.
$ cat .uncrustify
# Uncrustify-0.70.1
nl_if_brace = remove
nl_brace_else = force
nl_elseif_brace = remove
nl_else_brace = remove
nl_else_if = remove
nl_before_if_closing_paren = remove
nl_for_brace = remove
nl_while_brace = remove
nl_do_brace = remove
nl_brace_while = remove
nl_multi_line_sparen_open = remove
nl_multi_line_sparen_close = remove
nl_after_vbrace_close = true
mod_full_brace_do = force
mod_full_brace_for = force
mod_full_brace_function = force
mod_full_brace_if = force
mod_full_brace_while = force
You can run this using the command:
$ uncrustify -c /path/to/.uncrustify --no-backup example.c
For the future dwellers out there looking at similar issues:
clang-format is essentially a white-space only formatter.
clang-tidy can do some of what uncrustify can do, to a lesser extent; however, it requires direct integration with your compilation database or a full list of compiler options, which can be cumbersome.
indent is similar to clang-format
C++ Resharper does not support bracket formatting as of 2019.3, though planned for 2020.1.
VS Code does not support auto/forced bracket insertion
All these claims are made as of today and will hopefully be out of date soon, so that there is a plethora of tools for us to use and abuse :P
I have a problem with updating the position of vertical lines simultaneously on several plots using Chart.js. What I want is to draw a vertical line at a specific x position on one graph while the mouse pointer is over another graph. The problem is that, with the current code, after I move the mouse pointer over one plot, the line is drawn on the second plot but that plot doesn't refresh, so after moving the pointer again there are a bunch of stale lines.
I tried calling update() before drawing the vertical lines, which does solve the problem, but then the whole chart is refreshed and it's very slow.
Thx for the help!
Chart.defaults.LineWithLine = Chart.defaults.scatter
Chart.controllers.LineWithLine = Chart.controllers.scatter.extend({
draw: function(ease) {
Chart.controllers.scatter.prototype.draw.call(this, ease);
if (this.chart.tooltip._active && this.chart.tooltip._active.length) {
var activePoint = this.chart.tooltip._active[0],
ctx = this.chart.ctx,
x = activePoint.tooltipPosition().x,
topY = this.chart.scales['y-axis-1'].top,
bottomY = this.chart.scales['y-axis-1'].bottom;
// draw line
ctx.save();
ctx.beginPath();
ctx.moveTo(x, topY);
ctx.lineTo(x, bottomY);
ctx.lineWidth = 1.5;
ctx.strokeStyle = 'black';
ctx.stroke();
ctx.restore();
// get x value
var xValue = map(x, this.chart.chartArea.left, this.chart.chartArea.right, chainage_min, chainage_max);
if (this.chart == graph2) {
try {
// graph1.update() // drastically slows down
} finally {
//
}
var activePoint = graph2.tooltip._active[0],
ctx2 = graph1.ctx,
x = graph1.scales['x-axis-1'].getPixelForValue(xValue),
topY = graph1.scales['y-axis-1'].top,
bottomY = graph1.scales['y-axis-1'].bottom;
// draw line
ctx2.save();
ctx2.beginPath();
ctx2.moveTo(x, topY);
ctx2.lineTo(x, bottomY);
ctx2.lineWidth = 2.0;
ctx2.strokeStyle = 'black';
ctx2.stroke();
ctx2.restore();
} else if (this.chart == graph1) {
try {
//graph2.update() // drastically slows down
} finally {
//
}
var activePoint = graph1.tooltip._active[0],
ctx2 = graph2.ctx,
x = graph2.scales['x-axis-1'].getPixelForValue(xValue),
topY = graph2.scales['y-axis-1'].top,
bottomY = graph2.scales['y-axis-1'].bottom;
// draw line
ctx2.save();
ctx2.beginPath();
ctx2.moveTo(x, topY);
ctx2.lineTo(x, bottomY);
ctx2.lineWidth = 2.0;
ctx2.strokeStyle = 'black';
ctx2.stroke();
ctx2.restore();
}
}
}
})
function map(value, start1, stop1, start2, stop2) {
return start2 + (stop2 - start2) * ((value - start1) / (stop1 - start1))
}
I am deploying some JavaScript on a page with selenium's driver.execute_script function.
I prepare my JavaScript but if I drop the code into another line like so:
script = 'line one code' +
'line two code'
driver.execute_script(script)
It gives me an error.
I've also tried doing:
script = [
'line one code',
'line two code'
]
script = ';'.join(script)
But that gave me same error.
To construct a multi-line script, you can use triple quotes, i.e. """ ... """.
Here is an example of a multi-line script that is invoked through execute_script() using Selenium:
def wheel_element(element, deltaY = 120, offsetX = 0, offsetY = 0):
    error = element._parent.execute_script("""
        var element = arguments[0];
        var deltaY = arguments[1];
        var box = element.getBoundingClientRect();
        var clientX = box.left + (arguments[2] || box.width / 2);
        var clientY = box.top + (arguments[3] || box.height / 2);
        var target = element.ownerDocument.elementFromPoint(clientX, clientY);
        for (var e = target; e; e = e.parentElement) {
            if (e === element) {
                target.dispatchEvent(new MouseEvent('mouseover', {view: window, bubbles: true, cancelable: true, clientX: clientX, clientY: clientY}));
                target.dispatchEvent(new MouseEvent('mousemove', {view: window, bubbles: true, cancelable: true, clientX: clientX, clientY: clientY}));
                target.dispatchEvent(new WheelEvent('wheel', {view: window, bubbles: true, cancelable: true, clientX: clientX, clientY: clientY, deltaY: deltaY}));
                return;
            }
        }
        return "Element is not interactable";
        """, element, deltaY, offsetX, offsetY)
You can call the method as:
wheel_element(elm, -120)
Add \ after the + sign
script = 'line one code ' + \
'line two code'
Or use round brackets
script = ('line one code '
'line two code')
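To make the options concrete, here is a small sketch that builds the same two-statement script three ways. The JavaScript itself is a placeholder invented for this illustration, and the driver call is only shown as a comment because it needs a live browser session.

```python
# 1. Explicit line continuation: the backslash must be the last character
#    on the line, directly after the + sign.
joined_with_backslash = "var a = 1; " + \
    "return a + 1;"

# 2. Inside parentheses, adjacent string literals are concatenated
#    automatically, so neither + nor a backslash is needed.
joined_in_parentheses = (
    "var a = 1; "
    "return a + 1;"
)

# 3. A triple-quoted string keeps the line breaks, which JavaScript ignores
#    between statements.
triple_quoted = """
var a = 1;
return a + 1;
"""

# With a live WebDriver instance, any of the three can be passed on:
# result = driver.execute_script(triple_quoted)
```

The first two produce exactly the same string. The third differs only in whitespace, which is usually the most readable choice for anything longer than a couple of statements.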
I need to scrape emails from the website.
It's visible in a browser but when I try to scrape it with requests\BeautifulSoup I get this: "[email protected]"
I can do this with Selenium, but it takes more time. Is it possible to scrape these emails with requests\BeautifulSoup? Maybe I need some library for working with JS.
The email tag:
<span id="signature_email"><a class="__cf_email__" href="/cdn-cgi/l/email-protection" data-cfemail="30425f5e70584346515c5c531e535f5d">[email protected]</a><script data-cfhash='f9e31' type="text/javascript">/* <![CDATA[ */!function(t,e,r,n,c,a,p){try{t=document.currentScript||function(){for(t=document.getElementsByTagName('script'),e=t.length;e--;)if(t[e].getAttribute('data-cfhash'))return t[e]}();if(t&&(c=t.previousSibling)){p=t.parentNode;if(a=c.getAttribute('data-cfemail')){for(e='',r='0x'+a.substr(0,2)|0,n=2;a.length-n;n+=2)e+='%'+('0'+('0x'+a.substr(n,2)^r).toString(16)).slice(-2);p.replaceChild(document.createTextNode(decodeURIComponent(e)),c)}p.removeChild(t)}}catch(u){}}()/* ]]> */</script></span></span> <span class="separator">|</span>
From the CF tag in your supplied HTML, I assume you are scraping a Cloudflare site. Cloudflare offers a feature that obfuscates listed email addresses: it encodes them in the HTML and decodes them in the browser using JavaScript. Hence, with Selenium you'll see the email addresses, but with requests you won't.
Since the decryption method can be easily taken from the JavaScript, you can write your own decryption method in Python.
In JavaScript,
(function () {
try {
var s, a, i, j, r, c, l = document.getElementById("__cf_email__");
a = l.className;
if (a) {
s = '';
r = parseInt(a.substr(0, 2), 16);
for (j = 2; a.length - j; j += 2) {
c = parseInt(a.substr(j, 2), 16) ^ r;
s += String.fromCharCode(c);
}
s = document.createTextNode(s);
l.parentNode.replaceChild(s, l);
}
} catch (e) {}
})();
In Python,
def decodeEmail(e):
    de = ""
    k = int(e[:2], 16)
    for i in range(2, len(e)-1, 2):
        de += chr(int(e[i:i+2], 16)^k)
    return de
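To tie this back to the scraping question, here is a minimal sketch that pulls the data-cfemail attribute out of the HTML and decodes it, using only the standard library instead of BeautifulSoup. The names cf_decode, cf_encode, CfEmailParser and extract_emails are made up for this example, the encoder exists only to produce test input without using a real address, and the sketch assumes the attribute layout shown in the question (first byte is the XOR key).

```python
from html.parser import HTMLParser


def cf_decode(encoded):
    """Decode a data-cfemail hex string: first byte is the XOR key."""
    data = bytes.fromhex(encoded)
    key = data[0]
    return bytes(b ^ key for b in data[1:]).decode("utf-8")


def cf_encode(address, key=0x5A):
    """Inverse of cf_decode, used here only to build sample input."""
    raw = address.encode("utf-8")
    return bytes([key] + [b ^ key for b in raw]).hex()


class CfEmailParser(HTMLParser):
    """Collect the decoded value of every data-cfemail attribute."""

    def __init__(self):
        super().__init__()
        self.emails = []

    def handle_starttag(self, tag, attrs):
        encoded = dict(attrs).get("data-cfemail")
        if encoded:
            self.emails.append(cf_decode(encoded))


def extract_emails(html):
    parser = CfEmailParser()
    parser.feed(html)
    return parser.emails


sample = '<span><a class="__cf_email__" data-cfemail="%s">[email protected]</a></span>' % cf_encode("user@example.com")
print(extract_emails(sample))
```

With requests, the page text returned by the response would be passed to extract_emails in place of the sample string.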
Here is the same decoder in several languages:
Javascript
function cfDecodeEmail(encodedString) {
var email = "", r = parseInt(encodedString.substr(0, 2), 16), n, i;
for (n = 2; encodedString.length - n; n += 2){
i = parseInt(encodedString.substr(n, 2), 16) ^ r;
email += String.fromCharCode(i);
}
return email;
}
console.log(cfDecodeEmail("543931142127353935313e352e7a373b39")); // usage
Python
def cfDecodeEmail(encodedString):
    r = int(encodedString[:2],16)
    email = ''.join([chr(int(encodedString[i:i+2], 16) ^ r) for i in range(2, len(encodedString), 2)])
    return email
print(cfDecodeEmail('543931142127353935313e352e7a373b39')) # usage
PHP
function cfDecodeEmail($encodedString){
$k = hexdec(substr($encodedString,0,2));
for($i=2,$email='';$i<strlen($encodedString)-1;$i+=2){
$email.=chr(hexdec(substr($encodedString,$i,2))^$k);
}
return $email;
}
echo cfDecodeEmail('543931142127353935313e352e7a373b39'); // usage
GO
package main
import (
"bytes"
"strconv"
)
func cf(a string) (s string) {
var e bytes.Buffer
r, _ := strconv.ParseInt(a[0:2], 16, 0)
for n := 4; n < len(a)+2; n += 2 {
i, _ := strconv.ParseInt(a[n-2:n], 16, 0)
e.WriteString(string(rune(i ^ r)))
}
return e.String()
}
func main() {
email := cf("543931142127353935313e352e7a373b39") // usage
print(email)
print("\n")
}
C++
#include <iostream>
#include <string>
using namespace std;
string cfDecodeEmail(string encodedString);
int main()
{
cout << cfDecodeEmail("543931142127353935313e352e7a373b39") << endl;
}
string cfDecodeEmail(string encodedString)
{
string email;
char xorKey = stoi( encodedString.substr(0, 2), nullptr, 16);
for( unsigned i = 2; i < encodedString.length(); i += 2)
email += stoi( encodedString.substr(i, 2), nullptr, 16) ^ xorKey;
return email;
}
C#
using System;
public class Program
{
public static string cfDecodeEmail(string encodedString)
{
string email = "";
int r = Convert.ToInt32(encodedString.Substring(0, 2), 16), n, i;
for (n = 2; encodedString.Length - n > 0; n += 2)
{
i = Convert.ToInt32(encodedString.Substring(n, 2), 16) ^ r;
char character = (char)i;
email += Convert.ToString(character);
}
return email;
}
public static void Main(string[] args)
{
Console.WriteLine(cfDecodeEmail("543931142127353935313e352e7a373b39")); // usage
}
}
Following the algorithm above, I wrote this Ruby code to decode the protected email when parsing with Nokogiri:
def decode_email(e)
r = Integer(e[0,2], 16)
(2..e.length - 2).step(2).map do |j|
c = Integer(e[j,2], 16) ^ r
c.chr
end.join('')
end
I am trying to generate an output file by reading either a SQL script or a shell script on a Unix box. The output should list each statement type (CREATE, DROP, UPDATE, DELETE, MERGE, INSERT) followed by the table name. I would like to do this in a generic way, so that it can read any code and generate the output. Can this be achieved using awk?
OUTPUT
MERGE|temp_st_rx_wk_str_ip_rpt
SELECT|rx_ov_ord_excep_str_sku
SELECT|ndc
SELECT|fiscal_week
SELECT|store
SELECT|dss_saf_user01.rx_ov_ord_exclu_str
SELECT|rx_osv_invoice_str_ndc
DROP|temp_extract
CREATE|temp_build_extract
SELECT|temp_st_rx_wk_str_ip_rpt
CODE
merge into temp_st_rx_wk_str_ip_rpt s
USING (SELECT b.week_nbr,
b.store_nbr,
SUM (NVL (a.orig_on_ord_qty, 0)) AS mnd_ov_ord_orig_qty,
SUM (NVL (b.inv_qty, 0)) AS mnd_ov_inv_qty
FROM (SELECT /*+ PARALLEL (s,8) */ w.week_nbr, s.store_nbr, s.ndc_nbr,
SUM (s.orig_on_ord_qty) AS orig_on_ord_qty
FROM rx_ov_ord_excep_str_sku s,
ndc n,
fiscal_week w,
store st
WHERE s.ndc_nbr = n.ndc_nbr
AND s.store_nbr = st.store_nbr
AND s.ord_dt BETWEEN w.start_dt AND w.end_dt
AND n.schd_drug_cd NOT IN (''02'', ''07'')
AND n.gen_brand_ind <> ''Y''
AND s.orig_on_ord_qty < 1000 -- Arbitrary value used to exclude bad data
AND w.week_nbr = &P_WEEK_NBR
AND st.area_nbr NOT IN (0, 10, 11)
AND st.pharm_ind = ''Y''
AND s.store_nbr NOT IN
(SELECT store_nbr
FROM dss_saf_user01.rx_ov_ord_exclu_str
WHERE rx_ov_ord_exclu_cd = ''CP'')
GROUP BY w.week_nbr, s.store_nbr, s.ndc_nbr) a,
(SELECT /*+ INDEX (s,RX_OSV_INVOICE_STR_NDC_PK) */
w.week_nbr, s.store_nbr, s.ndc_nbr,
SUM (s.inv_qty) AS inv_qty
FROM rx_osv_invoice_str_ndc s,
ndc n,
store st,
fiscal_week w
WHERE s.ndc_nbr = n.ndc_nbr
AND s.store_nbr = st.store_nbr
AND s.ord_dt BETWEEN w.start_dt AND w.end_dt
AND s.ord_type_cd <> ''F''
AND n.schd_drug_cd NOT IN (''02'', ''07'')
AND n.gen_brand_ind <> ''Y''
AND s.inv_qty > 0
AND w.week_nbr = &P_WEEK_NBR
AND st.area_nbr NOT IN (0, 10, 11)
AND st.pharm_ind = ''Y''
AND s.store_nbr NOT IN
(SELECT store_nbr
FROM dss_saf_user01.rx_ov_ord_exclu_str
WHERE rx_ov_ord_exclu_cd = ''CP'')
GROUP BY w.week_nbr, s.store_nbr, s.ndc_nbr) b
WHERE a.week_nbr (+) = b.week_nbr
AND a.store_nbr (+) = b.store_nbr
AND a.ndc_nbr (+) = b.ndc_nbr
GROUP BY b.week_nbr, b.store_nbr) t
ON (t.week_nbr = s.week_nbr
AND t.store_nbr = s.store_nbr)
WHEN NOT MATCHED
THEN
INSERT (week_nbr, store_nbr, mnd_ov_ord_orig_qty, mnd_ov_inv_qty)
VALUES (t.week_nbr, t.store_nbr, t.mnd_ov_ord_orig_qty, t.mnd_ov_inv_qty)
WHEN MATCHED
THEN
UPDATE SET
s.mnd_ov_ord_orig_qty = t.mnd_ov_ord_orig_qty,
s.mnd_ov_inv_qty = t.mnd_ov_inv_qty';
commit;
drop table temp_extract;
create table temp_build_extract as select * from temp_st_rx_wk_Str_ip_rpt;
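You can try the following with GNU awk (the script uses the three-argument form of match(), which is a gawk extension):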
awk -f e.awk input.txt
where input.txt is your input file (CODE), and e.awk is:
/^merge / {
if (match($0,/merge into ([^[:blank:]]+)/,a)) {
print "MERGE|"a[1]
next
}
}
/FROM [^(]/ {
getFromTabs()
if (match(from,/FROM ([^[:blank:]]+)/,a)) {
printKey(a[1])
do {
ind=index(from,",")
if (ind) {
from=substr(from,ind+1)
match(from,/[[:space:]]*([[:alnum:]]+)/,a)
printKey(a[1])
}
}
while (ind)
}
}
/^drop/ {
if (match($0,/drop table ([^[:blank:]]+)/,a)) {
print "DROP|"a[1]
next
}
}
/^create/ {
if (match($0,/create table ([^[:blank:]]+)/,a)) {
print "CREATE|"a[1]
}
if (match($0,/select.*[[:blank:]]([^[:blank:]]+);/,a)) {
print "SELECT|"a[1]
}
}
function printKey(key) {
if (!(key in T)) {
print "SELECT|"key
T[key]++
}
}
function getFromTabs(p) {
p=0
from=""
do {
from=(p++==0)?$0:(from ORS $0)
getline
}
while (!/WHERE/)
}
For your sample code above this produces output:
MERGE|temp_st_rx_wk_str_ip_rpt
SELECT|rx_ov_ord_excep_str_sku
SELECT|ndc
SELECT|fiscal
SELECT|store
SELECT|dss_saf_user01.rx_ov_ord_exclu_str
SELECT|rx_osv_invoice_str_ndc
DROP|temp_extract;
CREATE|temp_build_extract
SELECT|temp_st_rx_wk_Str_ip_rpt
(Note that I know nothing about SQL, so you must check if this looks ok to you.)
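For comparison, here is a rough sketch of the same idea in Python rather than awk. It is a different technique, not a port of the script above, and it is regex-based, so it is just as heuristic. It covers only MERGE, DROP, CREATE and the tables in FROM lists, lowercases the names, and allows underscores and dots in them. The function name list_tables and all patterns are assumptions for this example.

```python
import re

NAME = r"([A-Za-z_][\w.]*)"
STATEMENTS = [
    ("MERGE", re.compile(r"\bmerge\s+into\s+" + NAME, re.I)),
    ("DROP", re.compile(r"\bdrop\s+table\s+" + NAME, re.I)),
    ("CREATE", re.compile(r"\bcreate\s+table\s+" + NAME, re.I)),
]
# A FROM list that does not start with a subquery, up to WHERE/GROUP/)/;/end.
FROM_LIST = re.compile(r"\bfrom\b\s*([^();]*?)(?=\bwhere\b|\bgroup\b|\)|;|\Z)", re.I)
COMMENT = re.compile(r"/\*.*?\*/|--[^\n]*", re.S)


def list_tables(sql):
    """Return 'KIND|table' strings in order of first appearance."""
    sql = COMMENT.sub(" ", sql)  # drop hints and line comments first
    hits = []
    for kind, pattern in STATEMENTS:
        for m in pattern.finditer(sql):
            hits.append((m.start(), kind, m.group(1).lower()))
    for m in FROM_LIST.finditer(sql):
        for item in m.group(1).split(","):
            words = item.split()
            if words:  # first word is the table, the rest is its alias
                hits.append((m.start(), "SELECT", words[0].lower()))
    seen, out = set(), []
    # sorted() is stable, so tables of one FROM list keep their order.
    for _, kind, name in sorted(hits, key=lambda h: h[0]):
        if (kind, name) not in seen:
            seen.add((kind, name))
            out.append("%s|%s" % (kind, name))
    return out
```

Calling print("\n".join(list_tables(open("input.txt").read()))) would print one entry per line. Like the awk version, this will miss or misread anything the patterns do not anticipate, so the result needs checking against real scripts.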