Importing a Python script into Zabbix?

So I've made a simple Python script to monitor DB sizes in my Postgres instance, but now I'd like to graph the results so they can be monitored. However, I cannot find a single way to get this script into the web GUI to be used in Zabbix/Grafana. We use Zabbix with Grafana on top simply because Grafana looks much better.
# pip install psycopg2-binary
import psycopg2

connection = psycopg2.connect(user="postgres", password="password",
                              host="server", port="5432", database="postgres")
cursor = connection.cursor()
cursor.execute("SELECT datname FROM pg_database WHERE datistemplate = false")
records = cursor.fetchall()
for record in records:
    cursor.execute("SELECT pg_size_pretty(pg_database_size('{}'))".format(record[0]))
    row = cursor.fetchone()
    print("DB:{} Size:{}".format(record[0], row[0]))
I've been googling around the entire morning but I can't find any information about this. I've found that the script should go into the /zabbix/externalscripts folder, and I did that, but now I have no clue how to access it and add it to a graph.

You are dealing with multiple databases, and for each of them you want the size: you need to implement Low Level Discovery (LLD) for your target host.
The discovery rule should produce JSON like:
{
    "data": [
        {
            "{#DBNAME}": "Database 1",
            "{#SOMEOTHERPROPERTY}": "XXX"
        },
        {
            "{#DBNAME}": "Database 2",
            "{#SOMEOTHERPROPERTY}": "YYY"
        }
    ]
}
Then you have to create an item prototype which uses {#DBNAME} as a reference, to query the db size.
You can create both the LLD and the item prototype using the ODBC support.
For instance, your ODBC discovery rule could be:
Key = db.odbc.discovery[get_databases,{HOST.NAME}]
Params = SELECT datname FROM pg_database WHERE datistemplate = false
And your item prototype:
Key = db.odbc.select[Used size on {#datname},{HOST.NAME}]
Params = SELECT pg_size_pretty( pg_database_size('{#datname}'))
After this setup, you will have an item for each database (and new databases will be discovered dynamically): you can plot them with latest data, with Grafana or by defining Graph Prototypes.
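Alternatively, if you would rather keep your Python script instead of using ODBC, a minimal sketch of a discovery script that prints the JSON above (reusing the connection details from your original script; how you wire it into Zabbix, e.g. as an external script or UserParameter, still has to be set up on your side) could look like this:

#!/usr/bin/env python
# Prints the low level discovery JSON that Zabbix expects.
import json
import psycopg2

connection = psycopg2.connect(user="postgres", password="password",
                              host="server", port="5432", database="postgres")
cursor = connection.cursor()
cursor.execute("SELECT datname FROM pg_database WHERE datistemplate = false")

data = [{"{#DBNAME}": row[0]} for row in cursor.fetchall()]
print(json.dumps({"data": data}))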

Related

Google Cloud SQL - Can I directly access database with SQLAlchemy - not locally

I'm trying to directly access Google Cloud SQL and create a table there. I want to use as few services as possible (keep it simple), so I really don't want to use the Cloud SDK or anything like that.
I want to use something similar to what I saw here. I tried to replicate it, but I ended up with an error.
AttributeError: module 'socket' has no attribute 'AF_UNIX'
For all of this I'm using Python with sqlalchemy and pymysql.
I really don't know how to debug it since I've only been using these tools for a few hours, but I think the problem could be with the URL or the environment variables (the app.yaml file, which I created).
I think I have already installed all the dependencies I need.
import os
import sqlalchemy

db_user = os.environ.get("db_user")
db_pass = os.environ.get("db_pass")
db_name = os.environ.get("db_name")
cloud_sql_connection_name = os.environ.get("cloud_sql_connection_name")

db = sqlalchemy.create_engine(
    # Equivalent URL:
    # mysql+pymysql://<db_user>:<db_pass>@/<db_name>?unix_socket=/cloudsql/<cloud_sql_instance_name>
    sqlalchemy.engine.url.URL(
        drivername='mysql+pymysql',
        username=db_user,
        password=db_pass,
        database=db_name,
        query={
            'unix_socket': '/cloudsql/{}'.format(cloud_sql_connection_name)
        }
    ),
    pool_size=5,
    max_overflow=2,
    pool_timeout=30,
    pool_recycle=1800,
)

with db.connect() as conn:
    conn.execute(
        "CREATE TABLE IF NOT EXISTS votes "
        "( vote_id SERIAL NOT NULL, time_cast timestamp NOT NULL, "
        "candidate CHAR(6) NOT NULL, PRIMARY KEY (vote_id) );"
    )
I do not use db_user etc. as real values. These are just examples.
It should run successfully and create a table in Cloud SQL.
You are specifying a unix socket /cloudsql/{}. This requires that you set up the Cloud SQL Proxy on your local machine.
To access Cloud SQL directly, you will need to specify the Public IP address for Cloud SQL. In your call to the function sqlalchemy.engine.url.URL, specify the host and port parameters and remove the query parameter.
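For example, a minimal sketch of that change (this assumes the SQLAlchemy 1.3-style URL constructor; newer versions use sqlalchemy.engine.url.URL.create, and the public IP below is a placeholder):

import os
import sqlalchemy

db_user = os.environ.get("db_user")
db_pass = os.environ.get("db_pass")
db_name = os.environ.get("db_name")

# Placeholder public IP of the Cloud SQL instance; the instance must also
# allow your client's address in its "Authorized networks" settings.
db = sqlalchemy.create_engine(
    sqlalchemy.engine.url.URL(
        drivername='mysql+pymysql',
        username=db_user,
        password=db_pass,
        host='203.0.113.10',  # placeholder
        port=3306,
        database=db_name,
    ),
    pool_size=5,
    max_overflow=2,
    pool_timeout=30,
    pool_recycle=1800,
)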

How to query a (Postgres) RDS DB through an AWS Jupyter Notebook?

I'm trying to query an RDS (Postgres) database through Python, more specifically from a Jupyter Notebook. This is what I've been trying so far:
import boto3

client = boto3.client('rds-data')

response = client.execute_sql(
    awsSecretStoreArn='string',
    database='string',
    dbClusterOrInstanceArn='string',
    schema='string',
    sqlStatements='string'
)
The error I've been receiving is:
BadRequestException: An error occurred (BadRequestException) when calling the ExecuteSql operation: ERROR: invalid cluster id: arn:aws:rds:us-east-1:839600708595:db:zprime
In the end, it was much simpler than I thought, nothing fancy or specific. It was basically a solution I had used before when accessing one of my local DBs: simply import the library for your database type (Postgres, MySQL, etc.) and then connect to the database in order to execute queries through Python.
I don't know if it is the best solution, since making queries through Python will probably be slower than doing them directly, but it's what works for now.
import psycopg2

conn = psycopg2.connect(database='database_name',
                        user='user',
                        password='password',
                        host='host',
                        port='port')
cur = conn.cursor()
cur.execute('''
    SELECT *
    FROM table;
''')
cur.fetchall()
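As a small aside, psycopg2 connections and cursors can also be used as context managers, so a sketch of the same query with explicit cleanup (the connection details are placeholders) could look like:

import psycopg2

conn = psycopg2.connect(database='database_name', user='user',
                        password='password', host='host', port=5432)

# The connection context manager wraps a transaction (it does not close the
# connection); the cursor context manager closes the cursor when done.
with conn:
    with conn.cursor() as cur:
        cur.execute('SELECT * FROM table;')
        rows = cur.fetchall()

conn.close()
print(rows)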

RuntimeError: OperationalError: (2003, Can't connect to MySQL server on 'IPaddress of the instance'

I'm trying to run a Python (version 2.7.1) script where I use the pymysql package to create a table in a database from a CSV file.
It runs correctly on my local system; however, the problem appears when running the same script as part of a pipeline in Google Cloud Dataflow.
My Python function is the following one:
class charge_to_db(beam.DoFn):
    def process(self, element):
        import pymysql
        with open(element, 'r') as f:
            data = f.read().decode("UTF-8")
        datalist = []
        for line in data.split('\n'):
            datalist.append(line.split(','))

        db = pymysql.connect(host='IPaddress', user='root',
                             password='mypassword', database='stack_model')
        cursor = db.cursor()
        cursor.execute("DROP TABLE IF EXISTS stack_convergence")
        # create column names from the first line in fList
        up = "upper_bnd"
        primal = "primal"
        d = "dualit"
        gap = "gap_rel"
        teta = "teta"
        alpha = "alpha"
        imba = "imba_avg"
        price = "price_avg"
        # create the convergence table // place a comma after each new column except the last
        queryCreateConvergenceTable = """CREATE TABLE stack_convergence(
            {} float not null,
            {} float not null,
            {} float not null,
            {} float not null,
            {} float not null,
            {} float not null,
            {} float not null,
            {} float not null )""".format(up, primal, d, gap, teta, alpha, imba, price)
        cursor.execute(queryCreateConvergenceTable)
When running this function in the cloud I'm obtaining the following error:
RuntimeError: OperationalError: (2003, 'Can\'t connect to MySQL server on \'35.195.1.40\' (110 "Connection timed out")')
I don't know why this error occurs, because the script runs correctly on my local system: from the local system I have access to my Cloud SQL instance, but not from Dataflow in the cloud.
Why is this error occurring?
On Dataflow you cannot whitelist an IP to give Dataflow access to a SQL instance. If you were using Java, the easiest way would be to use JdbcIO / the JDBC socket factory.
Since you're using Python, mimicking the implementation of JdbcIO.read() with Python-specific database connectivity facilities would help. There is a related question with a workaround that involves changing some Cloud SQL settings and adding the related Python code.
If this seems complex, you can alternatively export data from Cloud SQL to Cloud Storage and then load it from Cloud Storage, as in the sketch below.
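For the export/load route, a minimal sketch of the pipeline side (this assumes you have already exported the table to a CSV in a bucket you control, e.g. with gcloud sql export csv; the bucket and file name are placeholders):

import apache_beam as beam

# Placeholder path to the CSV exported from Cloud SQL.
EXPORT_PATH = "gs://my-bucket/stack_convergence.csv"

with beam.Pipeline() as p:
    rows = (
        p
        | "ReadExport" >> beam.io.ReadFromText(EXPORT_PATH)
        | "SplitFields" >> beam.Map(lambda line: line.split(','))
    )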

connecting to and using the same database in MongoDB python

I tried to connect to a MongoDB server remotely using Python's pymongo, but when I tried to display documents from a collection I got this error message:
"pymongo.errors.OperationFailure: not authorized on pt to execute command { find: "devices", filter: {} }" .
Also, when I tried to get a single record from Mongo, it did not display the record details; instead it displayed
"pymongo.cursor.Cursor object at 0x000001E883A14F98".
Mongo Server Details: Host: Someth-pt-ved-01
user: uname
pwd: mypass
authenticationDatabase: pt
collection: devices
My python code for connection is:
from pymongo import MongoClient

uri = "mongodb://uname:mypass@Someth-pt-ved-01:27017"
client = MongoClient(uri)
db = client.pt
collection = db.devices

# to get single record details
cursor = collection.find({'ID': 1490660})
print(cursor)

# to get all documents from collection-devices
for document in cursor:
    print(document)
Note: I am working on Windows 10.

Create a table from query results in Google BigQuery

We're using Google BigQuery via the Python API. How would I create a table (new one or overwrite old one) from query results? I reviewed the query documentation, but I didn't find it useful.
We want to simulate:
"SELEC ... INTO ..." from ANSI SQL.
You can do this by specifying a destination table in the query. You would need to use the Jobs.insert API rather than the Jobs.query call, and you should specify writeDisposition=WRITE_APPEND and fill out the destination table.
Here is what the configuration would look like, if you were using the raw API. If you're using Python, the Python client should give accessors to these same fields:
"configuration": {
"query": {
"query": "select count(*) from foo.bar",
"destinationTable": {
"projectId": "my_project",
"datasetId": "my_dataset",
"tableId": "my_table"
},
"createDisposition": "CREATE_IF_NEEDED",
"writeDisposition": "WRITE_APPEND",
}
}
The accepted answer is correct, but it does not provide Python code to perform the task. Here is an example, refactored out of a small custom client class I just wrote. It does not handle exceptions, and the hard-coded query should be customised to do something more interesting than just SELECT * ...
import time

from google.cloud import bigquery
from google.cloud.bigquery.table import Table
from google.cloud.bigquery.dataset import Dataset


class Client(object):

    def __init__(self, origin_project, origin_dataset, origin_table,
                 destination_dataset, destination_table):
        """
        A Client that performs a hardcoded SELECT and INSERTS the results in a
        user-specified location.

        All init args are strings. Note that the destination project is the
        default project from your Google Cloud configuration.
        """
        self.project = origin_project
        self.dataset = origin_dataset
        self.table = origin_table
        self.dest_dataset = destination_dataset
        self.dest_table_name = destination_table
        self.client = bigquery.Client()

    def run(self):
        query = ("SELECT * FROM `{project}.{dataset}.{table}`;".format(
            project=self.project, dataset=self.dataset, table=self.table))

        job_config = bigquery.QueryJobConfig()

        # Set configuration.query.destinationTable
        destination_dataset = self.client.dataset(self.dest_dataset)
        destination_table = destination_dataset.table(self.dest_table_name)
        job_config.destination = destination_table

        # Set configuration.query.createDisposition
        job_config.create_disposition = 'CREATE_IF_NEEDED'

        # Set configuration.query.writeDisposition
        job_config.write_disposition = 'WRITE_APPEND'

        # Start the query
        job = self.client.query(query, job_config=job_config)

        # Wait for the query to finish
        job.result()
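A hypothetical usage of this class (all project, dataset and table names are placeholders) would be:

client = Client(
    origin_project='my_project',
    origin_dataset='my_dataset',
    origin_table='my_table',
    destination_dataset='my_dest_dataset',
    destination_table='my_dest_table',
)
client.run()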
Assuming you are using a Jupyter Notebook with Python 3, I am going to explain the following steps:
How to create a new dataset on BQ (to save the results)
How to run a query and save the results in a new dataset in table format on BQ
Create a new DataSet on BQ: my_dataset
from google.cloud import bigquery

bigquery_client = bigquery.Client()  # Create a BigQuery service object
dataset_id = 'my_dataset'
dataset_ref = bigquery_client.dataset(dataset_id)  # Create a DatasetReference using a chosen dataset ID.
dataset = bigquery.Dataset(dataset_ref)  # Construct a full Dataset object to send to the API.

# Specify the geographic location where the new dataset will reside.
# This should be the same location as the source dataset we are querying.
dataset.location = 'US'

# Send the dataset to the API for creation. Raises
# google.api_core.exceptions.AlreadyExists if the Dataset already exists within the project.
dataset = bigquery_client.create_dataset(dataset)  # API request
print('Dataset {} created.'.format(dataset.dataset_id))
Run a query on BQ using Python:
There are 2 types here:
Allowing Large Results
Query without allowing large results
I am using the public dataset bigquery-public-data:hacker_news and the table comments to run a query.
Allowing Large Results
DestinationTableName='table_id1' #Enter new table name you want to give
!bq query --allow_large_results --destination_table=project_id:my_dataset.$DestinationTableName 'SELECT * FROM [bigquery-public-data:hacker_news.comments]'
This query will allow large query results if required.
Without mentioning --allow_large_results:
DestinationTableName='table_id2' #Enter new table name you want to give
!bq query --destination_table=project_id:my_dataset.$DestinationTableName 'SELECT * FROM [bigquery-public-data:hacker_news.comments] LIMIT 100'
This will work for queries whose results do not exceed the limit mentioned in the Google BigQuery documentation.
Output:
A new dataset on BQ with the name my_dataset
Results of the queries saved as tables in my_dataset
Note:
These are bq commands which you can also run in a terminal (without the ! at the beginning). Since we are running them from a Python notebook, the ! lets us run shell commands from within the notebook.
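If you prefer to stay in Python rather than shelling out to bq, a rough equivalent of the first command using the google-cloud-bigquery client (the project, dataset and table names are placeholders) might look like:

from google.cloud import bigquery

client = bigquery.Client()
table_ref = bigquery.TableReference.from_string('project_id.my_dataset.table_id1')

job_config = bigquery.QueryJobConfig()
job_config.destination = table_ref
job_config.allow_large_results = True  # requires a destination table
job_config.use_legacy_sql = True       # the bq queries above use legacy SQL syntax

job = client.query(
    'SELECT * FROM [bigquery-public-data:hacker_news.comments]',
    job_config=job_config,
)
job.result()  # wait for the job to finish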
