Say I have a SQL Server 2008 database that contains the inventory data for my business. I need to trigger a Python script once an item's quantity changes. This can happen under several conditions: it could be a sales order or simply a manual quantity change. The Python script will transform the data and upload it to Google Sheets.
I need this to trigger in real time: when the specified columns change or new records are created, I need to fire off the script.
It is preferred that the solution runs on the DB server itself, without paying for integration tools such as Zapier (besides, Zapier won't help here).
Constraints:
I cannot move the database to the cloud (business restriction)
Upgrading the database to a newer version is not possible either (budget)
Switching the database to an open-source alternative is not possible either (other applications depend on it)
It's a real pickle, but I'm trying to find a solution for a real-time trigger.
Failing that, I could fall back to a periodic scanning method, but that would create new problems.
I haven't tried anything yet, because I have no idea what to try here.
I've done some Google searches but wasn't able to find a solution.
Source: https://www.sqlshack.com/use-xp-cmdshell-extended-procedure/
The answer is to create the required trigger and call xp_cmdshell from it; xp_cmdshell can then run a .bat or .py file.
Be aware of the account xp_cmdshell runs under; most likely this will be the SQL Server service account (or its configured proxy account). You will have to ensure this account has the OS-level privileges to execute the file and to write to any directories that need to be written to.
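For illustration only, the script that xp_cmdshell launches could look something like the sketch below. It assumes the trigger passes the changed item's ID as a command-line argument and that pyodbc and gspread are available to the account xp_cmdshell runs under; every table, path and sheet name here is a placeholder, not something from the original question.

    # upload_item.py - minimal sketch, not production code.
    # Assumes the trigger passes the item ID as argv[1]; all names below are placeholders.
    import sys
    import pyodbc
    import gspread

    item_id = sys.argv[1]

    # Read the current quantity for the changed item from SQL Server 2008.
    cnxn = pyodbc.connect(
        'DRIVER={SQL Server};SERVER=localhost;DATABASE=Inventory;Trusted_Connection=yes;'
    )
    row = cnxn.cursor().execute(
        'SELECT ItemCode, Qty FROM dbo.Items WHERE ItemID = ?', item_id
    ).fetchone()
    cnxn.close()

    # Transform and push the value to Google Sheets via a service account.
    gc = gspread.service_account(filename=r'C:\creds\service_account.json')
    ws = gc.open('Inventory Sync').sheet1
    ws.append_row([row.ItemCode, row.Qty])

Keep in mind that a trigger which shells out synchronously holds the transaction open until the script finishes, which is another reason to prefer the scheduled-task approach mentioned below.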
If anyone is in a similar situation, they could look into upgrading to SQL Server Express (which is free). I'm not sure whether this would break the application, but there is a good chance it will not (e.g. SQL Server 2008 R2 Express upgraded to SQL Server 2012 Express).
It goes without saying that this is certainly not best practice; if you can avoid it at all, it would be better to run a scheduled task instead.
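If you do end up with periodic scanning, a minimal sketch of a scheduled-task poller might look like this. It assumes you can add (or already have) a rowversion column on the inventory table; the table, column and file names are all hypothetical.

    # poll_inventory.py - minimal polling sketch, run from Windows Task Scheduler.
    # Assumes a rowversion column named RowVer on dbo.Items; all names are placeholders.
    import pyodbc

    cnxn = pyodbc.connect(
        'DRIVER={SQL Server};SERVER=localhost;DATABASE=Inventory;Trusted_Connection=yes;'
    )
    cur = cnxn.cursor()

    # High-water mark saved by the previous run (0 on the first run).
    try:
        last_seen = int(open('last_rowver.txt').read())
    except (IOError, ValueError):
        last_seen = 0

    # Fetch only the rows that changed since the last run.
    cur.execute(
        'SELECT ItemID, Qty, CAST(RowVer AS BIGINT) AS RowVer '
        'FROM dbo.Items WHERE CAST(RowVer AS BIGINT) > ? ORDER BY RowVer',
        last_seen,
    )
    for row in cur.fetchall():
        print('changed:', row.ItemID, row.Qty)   # transform / upload to Sheets here
        last_seen = row.RowVer

    open('last_rowver.txt', 'w').write(str(last_seen))
    cnxn.close()

This avoids xp_cmdshell entirely, at the cost of the polling delay the question is trying to avoid.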
Related
I have a question regarding the Python ibm_db package (though I think it applies to any package that uses the connection/cursor pattern, e.g. pyodbc).
When the cursor.execute() method is called, it executes an SQL query on the database. However, to access the data you need to use fetchall() or one of the other fetch methods. I want to time the hit on the database.
Does the query finish running entirely at the execute() stage, with the result held in memory just for Python to fetch? Or do the fetch methods keep going back to the database? I have scoured the documentation and am unable to find anything definitive on this subject.
Most or all of the Db2 open source drivers are based on the Call Level Interface (CLI). The CLI functions and details are part of the overall Db2 documentation. A Fetch() on a result set retrieves one more row.
AFAIK the result set can either be cached client-side or each fetch can go back to the engine. It makes sense to prefetch a few (dozen) rows, but not several million.
You would need insight into how drivers and database query processing work in order to measure something useful and interpret it correctly.
BTW: There is some form of CLI tracing available.
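Within those limits, a practical starting point is simply to time the two calls separately; a rough pyodbc sketch, with a hypothetical DSN and table name:

    # Rough sketch: time execute() and fetchall() separately.
    # The split depends on the driver's prefetch/caching behaviour, so treat
    # the numbers as indicative rather than definitive.
    import time
    import pyodbc

    cnxn = pyodbc.connect('DSN=MYDB')               # placeholder DSN
    cur = cnxn.cursor()

    t0 = time.perf_counter()
    cur.execute('SELECT * FROM some_large_table')   # hypothetical table
    t1 = time.perf_counter()
    rows = cur.fetchall()
    t2 = time.perf_counter()

    print('execute : %.3f s' % (t1 - t0))
    print('fetchall: %.3f s' % (t2 - t1))
    print('rows    : %d' % len(rows))

If most of the elapsed time shows up in the fetch phase for a large result set, that is consistent with the driver going back to the engine for more rows rather than everything being materialised at execute() time.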
So I have a Google Sheet that maintains a lot of data. I also have a MySQL DB with a huge chunk of data. There is a vital piece of information in the Sheet that is also present in the DB, and both need to be in sync. The information always enters the Sheet first. I previously had a Python script with MySQL queries to update my database separately.
Now the workflow has changed: data will enter the Sheet, and whenever that happens the database has to be updated automatically.
After some research, I found that using the onEdit function of Google Apps Script (which I learned about here), I can pick up when the file has changed.
The next step is to fetch the data from the relevant cell, which I can do using this.
Now I need to connect to the DB and send some queries. This is where I am stuck.
Approach 1:
Have a Python web app running live and send the data to it via UrlFetchApp. I have yet to try this; a rough sketch of what I have in mind is below, after approach 2.
Approach 2:
Connect to MySQL remotely through Apps Script. But after 2-3 hours of reading the docs, I am not sure this is possible.
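For approach 1, the receiving end could be a small web endpoint that the onEdit handler posts to with UrlFetchApp. A minimal sketch, assuming Flask and mysql-connector-python and entirely hypothetical table/column names:

    # sync_endpoint.py - minimal sketch of the Python web app for approach 1.
    # Apps Script would POST JSON like {"key": "...", "value": "..."} via UrlFetchApp.
    # Flask, mysql-connector-python and every name below are assumptions.
    from flask import Flask, request, jsonify
    import mysql.connector

    app = Flask(__name__)

    @app.route('/sheet-update', methods=['POST'])
    def sheet_update():
        payload = request.get_json(force=True)
        cnxn = mysql.connector.connect(
            host='localhost', user='sync', password='secret', database='inventory'
        )
        cur = cnxn.cursor()
        # Upsert the value coming from the Sheet into the matching DB row.
        cur.execute(
            'INSERT INTO items (item_key, item_value) VALUES (%s, %s) '
            'ON DUPLICATE KEY UPDATE item_value = VALUES(item_value)',
            (payload['key'], payload['value']),
        )
        cnxn.commit()
        cnxn.close()
        return jsonify({'status': 'ok'})

    if __name__ == '__main__':
        app.run(port=8080)

The endpoint would need to be reachable from Google's servers and protected somehow (a shared secret header at minimum), since anything Apps Script can call is on the public internet.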
So this is my scenario. Any viable solution you can think of or a better approach?
Connect directly to MySQL. You likely missed reading this part of the docs: https://developers.google.com/apps-script/guides/jdbc
Using JDBC within Apps Script will work if you have the time to build this yourself.
If you don't want to roll your own solution, check out SeekWell. It allows you to connect to databases and write SQL queries directly in Sheets. You can create a “Run Sheet” that will run multiple queries at once and schedule those queries to run without you even opening the Sheet.
Disclaimer: I made this.
I am working on a system where a bunch of modules connect to an MS SQL Server DB to read/write data. Each of these modules is written in a different language (C#, Java, C++), as each language serves the purpose of that module best.
My question, however, is about the DB connectivity. As of now, all these modules use their language-specific SQL connectivity APIs to connect to the DB. Is this a good way of doing it?
Or, alternatively, is it better to have a Python (or some other scripting language) script take over the responsibility of connecting to the DB? The modules would then send the input parameters and the name of a stored procedure to the Python script, and the script would run it on the database and send the output back to the respective module.
Are there any advantages of the second method over the first?
Thanks for helping out!
If we assume that each language you use will have an optimized set of classes to interact with databases, then there shouldn't be a real need to pass all database calls through a centralized module.
Using a "middle-ware" for database manipulation does offer a very significant advantage. You can control, monitor and manipulate your database calls from a central and single location. So, for example, if one day you wake up and decide that you want to log certain elements of the database calls, you'll need to apply the logical/code change only in a single piece of code (the middle-ware). You can also implement different caching techniques using middle-ware, so if the different systems share certain pieces of data, you'd be able to keep that data in the middle-ware and serve it as needed to the different modules.
The above is quite an advanced setup and not commonly used in small applications, so please evaluate the need for it in your specific application and decide whether it's the best approach.
Doing things the way you do them now is fine (if we follow the above assumption) :)
Does anyone have, or know of, a good template/plan for doing automated server upgrades? In this case I am upgrading a Python/Django server, but I am going to have to apply this update to many machines, and I want to be sure that the operation is fully testable and recoverable should anything go wrong.
I am picturing something along the lines of:
remotely fetch new code
verify code download (e.g. hash of files)
take down server, display "you are upgrading" dialog
backup database(s)
backup code directory
apply new code updates
verify code update (e.g. hash of files)
apply database update (if necessary)
run tests
if success
    start up server
    verify server update
else
    restore old database
    restore old code
    report error
    start up server
    verify server restore
I'm sure this isn't exhaustive, and there are many other error conditions to consider, but I'm wondering if something like this already exists as a formalized process or best-practices checklist to follow. Ideally the whole thing should of course be run by a single script call.
Once you have a plan (and yours looks pretty good), the Fabric site should be your next stop.
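For reference, a simplified sketch of the plan above using Fabric (2.x); the hosts, paths, database and service names and the individual commands are all placeholders, and the rollback branch is deliberately naive:

    # fabfile.py - simplified sketch of the upgrade plan using Fabric 2.x.
    # All hosts, paths, database and service names are placeholders.
    from fabric import Connection

    APP_DIR = '/srv/myapp'
    BACKUP_DIR = '/srv/backups'

    def upgrade(host):
        c = Connection(host)
        # 1. fetch new code (download verification, e.g. hashing, omitted for brevity)
        c.run('cd %s && git fetch origin' % APP_DIR)
        # 2. take the site down / show the "you are upgrading" page
        c.sudo('systemctl stop myapp')
        # 3. back up database and code
        c.run('pg_dump myapp > %s/db_backup.sql' % BACKUP_DIR)
        c.run('tar czf %s/code_backup.tgz %s' % (BACKUP_DIR, APP_DIR))
        # 4. apply code and database updates
        c.run('cd %s && git checkout origin/main' % APP_DIR)
        c.run('cd %s && ./manage.py migrate --noinput' % APP_DIR)
        # 5. run tests, then start up or roll back
        result = c.run('cd %s && ./manage.py test' % APP_DIR, warn=True)
        if result.ok:
            c.sudo('systemctl start myapp')
            c.run('curl -fsS http://localhost/healthz')           # verify server update
        else:
            c.run('cd %s && git checkout -' % APP_DIR)            # restore old code
            c.run('psql myapp < %s/db_backup.sql' % BACKUP_DIR)   # restore old database
            c.sudo('systemctl start myapp')
            raise RuntimeError('upgrade failed on %s, rolled back' % host)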
I think you're pretty much covering everything. Identify what's important to you and your business practices: that's what counts.
I am working on a program to automate parsing data from XML files and storing it into several databases. (Specifically the USGS real-time water quality service, if anyone's interested, at http://waterservices.usgs.gov/rest/WaterML-Interim-REST-Service.html) It's written in Python 2.5.1 using lxml and pyodbc. The databases are in Microsoft Access 2000.
The connection function is as follows:
    import pyodbc

    def get_AccessConnection(db):
        # Open an ODBC connection to the Access .mdb file and return it with a cursor.
        connString = 'DRIVER={Microsoft Access Driver (*.mdb)};DBQ=' + db
        cnxn = pyodbc.connect(connString, autocommit=False)
        cursor = cnxn.cursor()
        return cnxn, cursor
where db is the filepath to the database.
The program:
a) opens the connection to the database
b) parses 2 to 8 XML files for that database and builds the values from them into a series of records to insert into the database (using a nested dictionary structure, not a user-defined type)
c) loops through the series of records, cursor.execute()-ing an SQL query for each one
d) commits and closes the database connection
If the cursor.execute() call throws an error, it writes the traceback and the query to the log file and moves on.
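In simplified form (with the real table and column names replaced by placeholders), steps (c) and (d) look roughly like this:

    # Simplified sketch of the insert loop (step c) and the commit/close (step d).
    # Table and column names are placeholders, not the real schema.
    import traceback

    def insert_records(cnxn, cursor, records, logfile):
        sql = ('INSERT INTO Results (SiteCode, ParamCode, SampleTime, Value) '
               'VALUES (?, ?, ?, ?)')
        for rec in records:
            params = (rec['site'], rec['param'], rec['time'], rec['value'])
            try:
                cursor.execute(sql, params)
            except Exception:
                # Log the traceback and the offending query, then move on.
                logfile.write(traceback.format_exc())
                logfile.write(sql + ' ' + repr(params) + '\n')
        cnxn.commit()
        cnxn.close()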
When my coworker runs it on his machine, for one particular database, specific records will simply not be there, with no errors recorded. When I run the exact same code on the exact same copy of the database over the exact same network path from my machine, all the data that should be there is there.
My coworker and I are both on Windows XP computers with Microsoft Access 2000 and the same versions of Python, lxml, and pyodbc installed. I have no idea how to check whether we have the same version of the Microsoft ODBC drivers. I haven't been able to find any difference between the records that are there and the records that aren't. I'm in the process of testing whether the same problem happens with the other databases, and whether it happens on a third coworker's computer as well.
What I'd really like to know is ANYTHING anyone can think of that would cause this, because it doesn't make sense to me. To summarize: Python code executing SQL queries will silently fail half of them on one computer and work perfectly on another.
Edit:
The problem has gone away. I just had my coworker run it again, and the database was updated completely with no missing records. I still have no idea why it failed in the first place, or whether it will happen again, but "problem solved."
"I have no idea how to check whether we have the same version of the Microsoft ODBC drivers."
I think you're looking for Control Panel | Administrative Tools | Data Sources (ODBC). Click the "Drivers" tab.
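If a newer pyodbc is ever available, its drivers() helper can also list the installed driver names straight from Python (names only, so the ODBC Administrator dialog is still needed for the file versions):

    # Requires a reasonably recent pyodbc; lists installed ODBC driver names.
    import pyodbc

    for name in pyodbc.drivers():
        print(name)   # e.g. 'Microsoft Access Driver (*.mdb)'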
I think either Access 2000 or Office 2000 shipped with a desktop edition of SQL Server called "MSDE". Might be worth installing that for testing. (Or production, for that matter.)