I'm trying to find a regex expression in Python that will be able to handle most of the UPDATE queries that I throw at it from my DB. I can't use sqlparse or any other libraries that may be useful for this; I can only use Python's built-in modules or cx_Oracle, in case it has a method I'm not aware of that could do something like this.
Most update queries look like this:
UPDATE TABLE_NAME SET COLUMN_NAME=2, OTHER_COLUMN=to_date('31-DEC-202023:59:59','DD-MON-YYYYHH24:MI:SS'), COLUMN_STRING='Hello, thanks for your help', UPDATED_BY=-100 WHERE CODE=9999;
Most update queries I use are a variation of these types of updates. The output has to be a list including each separate SQL keyword (UPDATE, SET, WHERE), each separate update statement (i.e. COLUMN_NAME=2) and the final identifier (CODE=9999).
Ideally, the result would look something like this:
result = ['UPDATE', 'TABLE_NAME', 'SET', 'COLUMN_NAME=2', 'OTHER_COLUMN=("31-DEC-2020 23:59:59","DD-MON-YYYY HH24:MI:SS")', "COLUMN_STRING='Hello, thanks for your help'", 'UPDATED_BY=-100', 'WHERE', 'CODE=9999']
Initially I tried doing this with string.split(), splitting on spaces, but for slightly more complex queries like the one above, the split method doesn't deal well with string updates such as the one in COLUMN_STRING, or those in OTHER_COLUMN, because of the blank spaces in those values.
Let's use the shlex module:
import shlex
test="UPDATE TABLE_NAME SET COLUMN_NAME=2, OTHER_COLUMN=to_date('31-DEC-202023:59:59','DD-MON-YYYYHH24:MI:SS'), COLUMN_STRING='Hello, thanks for your help', UPDATED_BY=-100 WHERE CODE=9999;"
t=shlex.split(test)
Up to here we haven't gotten rid of the comma delimiters and the trailing semicolon, so maybe we can do this:
for idx, token in enumerate(t):
    # reassigning the loop variable would not change the list,
    # so update the element by index instead
    if token[-1] in (',', ';'):
        t[idx] = token[:-1]
If we print every element of that list we'll get:
UPDATE
TABLE_NAME
SET
COLUMN_NAME=2
OTHER_COLUMN=to_date(31-DEC-202023:59:59,DD-MON-YYYYHH24:MI:SS)
COLUMN_STRING=Hello, thanks for your help
UPDATED_BY=-100
WHERE
CODE=9999
Not a fully generic answer, but I hope it serves the purpose.
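Putting it all together, a minimal standard-library-only sketch of the whole tokenizer (the function name is mine, for illustration):
import shlex

def tokenize_update(query):
    # shlex keeps quoted strings (spaces and commas included) as one token
    tokens = shlex.split(query)
    # strip trailing commas and the final semicolon from each token
    return [t.rstrip(',;') for t in tokens]

query = ("UPDATE TABLE_NAME SET COLUMN_NAME=2, "
         "OTHER_COLUMN=to_date('31-DEC-202023:59:59','DD-MON-YYYYHH24:MI:SS'), "
         "COLUMN_STRING='Hello, thanks for your help', "
         "UPDATED_BY=-100 WHERE CODE=9999;")
print(tokenize_update(query))
Note that shlex drops the quote characters themselves, as the printed output above shows.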
I'm using the click package to get input for one or more variables, which get loaded in as a combined dictionary. Each entry is then joined, and the combined string is added to the end of a base URL and sent through the requests package to receive some XML data.
Earlier I had an issue with one of the variables that lets you search through a range, such as
[value1, value2]
Python added double quotes around it, so the search function didn't operate correctly. To fix it I used
.replace('"', '')
on the joined string before combining it with the base URL, and that seemed to fix the problem. The issue now is that individual input containing more than one word no longer produces the same output as the actual search engine online. I have to use quotes when I enter the information to keep it as a single argument, but then the quotes get removed by the call above, and I believe that is what is causing the issue.
I think that if I have a way to access individual entries of this dictionary and remove the double quotes from only certain entries, that should get the job done. But if I am overlooking something, please let me know.
Help is appreciated.
Code added below:
import click
import requests

@click.command()
@click.option('--variable1')
@click.option('--variable2')
def main(variable1, variable2):
    query_list = [variable1, variable2]
    query = ''.join(query_list)
    base_url = "abc.com...."
    response = requests.get(base_url + query)

if __name__ == '__main__':
    main()
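A hedged sketch of the selective cleanup described in the question: strip the double quotes only from entries whose keys are known to take range values, and leave quoted multi-word phrases intact. RANGE_KEYS, clean_value, and the sample data are illustrative assumptions, not part of the original code:
# Keys whose values are ranges like [value1, value2] and must not carry
# double quotes in the final URL (assumed set, for illustration only)
RANGE_KEYS = {'variable1'}

def clean_value(key, value):
    # strip quotes from range-style entries only; other entries keep their
    # quotes so multi-word input stays a single argument
    if key in RANGE_KEYS:
        return value.replace('"', '')
    return value

params = {'variable1': '["a", "b"]', 'variable2': '"two words"'}
query = ''.join(clean_value(k, v) for k, v in params.items())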
I am trying to apply a model filter on a JSONField, but the keys in the JSON are UUIDs.
So when I do something like...
MyModel.objects.filter(data__8d8dd642-32cb-48fa-8d71-a7d6668053a7='bob')
... I get a syntax error. The hyphens in the UUID are the issue.
Any clues if there is an escape char or another behaviour to use? My database is PostgreSQL.
Update 1 - now with added JSON
{
    '8d8dd642-32cb-48fa-8d71-a7d6668053a7': '8d8dd642-32cb-48fa-8d71-a7d6668053a7',
    '9a2678c4-7a49-4851-ab5d-6e7fd6d33d72': 'John Smith',
    '9933ae39-1a27-4477-a9f4-3d1839f93fb4': 'Employee'
}
I was having this same issue where I couldn't use __contains, and found that you can use **kwargs unpacking to get it to work, which lets you pass the filter as a string (this is also useful if you need a dynamic filter):
kwargs = {
    'data__8d8dd642-32cb-48fa-8d71-a7d6668053a7': 'bob'
}
MyModel.objects.filter(**kwargs)
That looks difficult, and I'm not 100% convinced that what I have will work.
You can try using the JSONField contains lookup. The lookup is explained in more detail under the docs for HStoreField, since the functionality is shared.
This would look like this:
MyModel.objects.filter(data__contains={'8d8dd642-32cb-48fa-8d71-a7d6668053a7': 'bob'})
I think this will allow you to circumvent the fact that the lookups need to be valid Python variable names.
If you also want to do a wildcard search (using the example from above), this will basically search LIKE %bob%:
kwargs = {
    'data__8d8dd642-32cb-48fa-8d71-a7d6668053a7__icontains': 'bob'
}
MyModel.objects.filter(**kwargs)
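Building on the **kwargs idea, a hedged sketch of a dynamic version where the UUID key is only known at runtime (the helper name and its arguments are illustrative assumptions):
import uuid

def filter_by_json_key(queryset, key, value, lookup=''):
    # build the lookup string from the runtime key, e.g.
    # 'data__8d8dd642-32cb-48fa-8d71-a7d6668053a7__icontains'
    field = 'data__{}{}'.format(key, lookup)
    return queryset.filter(**{field: value})

key = uuid.UUID('8d8dd642-32cb-48fa-8d71-a7d6668053a7')
results = filter_by_json_key(MyModel.objects.all(), key, 'bob', '__icontains')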
I am a little new to Django.
My question is: how do I do the %LIKE% of MySQL in a Django filter?
Want something like this
myModel.objects.filter(myField__like="xyz")
as we can do
myModel.objects.filter(myField__startswith="xyz")
for strings that start with 'xyz', but I want to match anywhere in the myField content.
What I know:
it can be done with REGEX and .extra(), but I want something very straightforward.
Thanks in advance.
You can do it like this:
myModel.objects.filter(myField__contains="xyz")
Note: __contains is case sensitive. You can use __icontains if you don't care about the case of the text.
Use the contains operator: my_model.objects.filter(my_field__contains='xyz'), and icontains if you want case insensitivity.
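If you want to check what SQL a lookup produces, you can print the compiled query (myModel and myField are the names from the question; this assumes a configured Django project):
qs = myModel.objects.filter(myField__icontains="xyz")
# str(qs.query) shows the generated SQL, including the LIKE clause
print(qs.query)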
I am trying to pass a list of regexes to the columns attribute in my happybase scan calls. This is because my column names are made by dynamically appending IDs which I don't have access to at scan time.
Is this possible?
HappyBase author here.
According to the Thrift API you can pass regular expressions in the columns argument for the ScannerOpen() API family (see http://svn.apache.org/viewvc/hbase/trunk/hbase-thrift/src/main/resources/org/apache/hadoop/hbase/thrift/Hbase.thrift?view=markup#l717). However, the Thrift API used by HappyBase is ScannerOpenWithScan(), which uses the TScan struct (see http://svn.apache.org/viewvc/hbase/trunk/hbase-thrift/src/main/resources/org/apache/hadoop/hbase/thrift/Hbase.thrift?view=markup#l141), which does not contain any remark about regular expressions. Actually I don't know (without testing) whether this works.
A more flexible and powerful way is to specify a filter string using the filter argument to happybase.Table.scan(). See http://hbase.apache.org/book/thrift.html for the filter string syntax. In your case, something like "ColumnPrefixFilter('theprefix')" should do the trick. See http://happybase.readthedocs.org/en/latest/api.html#happybase.Table.scan for the HappyBase API.
I am not familiar with HBase's syntax. Here is the happybase Python code I used, and it works for me. Thanks to Wouter Bolsterlee! Unlike the 'columns' argument, you don't have to put the column family in 'ColumnPrefixFilter'.
import happybase

pool = happybase.ConnectionPool(size=3, host='172.xx.xx.xx')
with pool.connection() as conn1:
    hbaseTable = conn1.table('HBase_table_name_here')
    for rowKey, rowData in hbaseTable.scan(row_prefix='year-2015-',
                                           filter="ColumnPrefixFilter('month-06')",
                                           limit=6):
        print(rowData)
I am using haystack within a project with Solr as the backend. I want to be able to perform a contains search, similar to Django's .filter(something__contains="...").
The __startswith option does not suit our needs because, as the name suggests, it looks for words that start with the string.
I tried to use something like *keyword*, but Solr does not allow the * to be used as the first character.
Thanks.
To get "contains" functionallity you can use:
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="100" side="back"/>
<filter class="solr.LowerCaseFilterFactory" />
as index analyzer.
This will create n-grams for every whitespace-separated word in your field. For example:
"Index this!" => x, ex, dex, ndex, index, !, s!, is!, his!, this!
As you can see, this will expand your index greatly, but if you now enter a query like:
"nde*"
it will match "ndex", giving you a hit.
Use this approach carefully to make sure that your index doesn't get too large. If you increase minGramSize or decrease maxGramSize, it will not expand the index as much, but it will reduce the "contains" functionality. For instance, setting minGramSize="3" will require at least 3 characters in your contains query.
You can achieve the same behavior without having to touch the Solr schema. In your index, make your text field an EdgeNgramField instead of a CharField. Under the hood this will generate a schema similar to what lindstromhenrik suggested.
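For illustration, a minimal haystack index using EdgeNgramField might look like this (MyModel and the app path are assumptions, not from the question):
from haystack import indexes
from myapp.models import MyModel  # assumed app and model, for illustration

class MyModelIndex(indexes.SearchIndex, indexes.Indexable):
    # EdgeNgramField makes haystack emit an edge n-gram field type in the
    # generated schema, so schema.xml needs no manual edits
    text = indexes.EdgeNgramField(document=True, use_template=True)

    def get_model(self):
        return MyModel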
I am using an expression like:
.filter(something__startswith='...')
.filter_or(name=''+s'...')
as it seems Solr does not like expressions like '...*', but combined with OR it will do.
None of the answers here does a real substring search (*keyword*).
They don't find a keyword that is part of a bigger string (not a prefix or suffix).
Using EdgeNGramFilterFactory or the EdgeNgramField in the indexes can only do a "startswith" or an "endswith" type of filtering.
The solution is to use a NgramField like this:
class MyIndex(indexes.SearchIndex, indexes.Indexable):
    ...
    field_to_index = indexes.NgramField(model_attr='field_name')
    ...
This is very elegant, because you don't need to manually add anything to the schema.xml