How to expose data to zabbix

How to expose data to zabbix - python

Here is my goal: I would like to be able to report various metrics to zabbix so that we can display the graphs on a web page.
These metrics include:
latency per soap service submission
various query results from one or more databases.
What things do I need to write and/or expose? Or is the zabbix server going to go and get it from an exposed service somewhere?
I've been advised that a script that returns a single value will work, but I'm wondering if that's the right way.

I can offer 2 suggestions to get the metrics into Zabbix:
Use the zabbix_sender binary to feed the data from your script directly to the Zabbix server. This allows your script to call on it's own interval and set all the parameters needed. You really only need to know the location to the zabbix_sender binary. Inside the Zabbix server interface, you would create items with the type of Zabbix trapper. This is the item type which receives values send from the zabbix_sender. You make up the key name and it has to match.
The second way you could do this is to specify a key name and script/binary inside the zabbix_agentd.conf file. Every time the Zabbix server requests this item the script would be called and the data from the script recorded. This allows you to set the intervals in the Zabbix item configuration rather than forcing you to run your script on its own intervals. However, you would need to add this extra bit of information to your zabbix_agentd.conf file for every host.
There may be other ways to do this directly from Python (zabbix_sender bindings for Python maybe?). But these are the 2 ways I have used before which work well. This isn't really Python specific. But you should be able to use zabbix_sender in your Python scripting. Hope this information helps!
Update: I also remembered that Zabbix was working on/has a API (JSON/RPC style). But the documentation site is down at the moment and I am not sure if the API is for submitting item data or not. Here is the Wiki on the API: http://www.zabbix.com/wiki/doc/api
And a project for Python API: https://github.com/gescheit/scripts/tree/master/zabbix/
There seems to be little documentation on the API as it is new as of Zabbix version 1.8

Actually there is a python binding for zabbix_sender. http://pypi.python.org/pypi/zbxsend

Related

Best approach to an API update project

I'm working in a personal project to segment some data from the Sendinblue Api (CRM Service). Basically what I try to achieve is generate a new score attribute to each user base on his emailing behavior. For that proposed, the process I've plan is as follows:
Get data from the API
Store in database
Analysis and segment the data with Python
Create and update score attribute in Sendin every 24 hours
The Api has a Rate limiting 400 request per minute, we are talking about 100k registers right now which means I have to spend like 3 hours to get all the initial data (currently I'm using concurrent futures to multiprocessing). After that I'll plan to store and update only the registers who present changes. I'm wondering if this is the best way to do it and which combinations of tools is better for this job.
Right now I have all my script in Jupyter notebooks and I recently finished my first Django project, so I don't know if I need a django app for this one or just simple connect the notebook to a database (PostgreSQL?), and if this last one is possible which library I have to learn to run my script every 24 hours. (i'm a beginner). Thanks!

I don't think you need Django except you want a web to view your data. Even so you can write any web application to view your statistic data with any framework/language. So I think the approach is simpler:
Create your python project, entry point main function will execute logic to fetch data from API. Once it's done, you can start logic to analyze and statistic then save result in database.
If you can query to view your final result by SQL, you don't need to build web application. Otherwise you might want to build a small web application to pull data from database to view statistic in charts or export in any prefer format.
Setup a linux cron job to execute python code at #1 and let it run every 24 at paticular time you want. Link: https://phoenixnap.com/kb/set-up-cron-job-linux

How to get filtered VM list on vcenter with python

I'm trying to get a list of registered virtual machines - by their name - in my vcenter. The problem that I have many vms (~5K), and I am doing it a lot of times (O(1000)/hour).
The SDKs I'm using causing it a lot of traffic (1-2MB/request):
pysphere: which ask for all vms, and filters on client side.
pyVmomi, which need to use recursion to list all vms (I saw SI.content.searchIndex.FindByDnsName on reboot_vm.py, but my machines' DNS configuration is not true)
Looking into SOAP documentation didn't help (got into RetrievePropertiesEx.objectSet but it doesn't looks to filter anything), and the new REST (v6.5) didn't help too (since I need to get its "datastore path", and all I can get is the name)

Have you tried a using a property collector, as in pyVmomi.vmodl.query.PropertyCollector.FilterSpec?
The pyVmomi Community Samples contain examples of using this API, such as
https://github.com/vmware/pyvmomi-community-samples/blob/master/samples/filter_vms.py.

vAPI is based on REST, even more, it does network requests every time, so it will be slow. Another way is to use VCDB, where are stored all VMs (there us such one, but I can not help you here more), see https://pubs.vmware.com/vsphere-4-esx-vcenter/index.jsp?topic=/com.vmware.vsphere.installclassic.doc_41/install/prep_db/c_preparing_the_vmware_infrastructure_databases.html or https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1025914

POST method for webhooks dropbox

I simply want to receive notifications from dropbox that a change has been made. I am currently following this tutorial:
https://www.dropbox.com/developers/reference/webhooks#tutorial
The GET method is done, verification is good.
However, when trying to mimic their implementation of POST, I am struggling because of a few things:
I have no idea what redis_url means in the def_process function of the tutorial.
I can't actually verify if anything is really being sent from dropbox.
Also any advice on how I can debug? I can't print anything from my program since it has to be ran on a site rather than an IDE.

Redis is a key-value store; it's just a way to cache your data throughout your application.
For example, access token that is received after oauth callback is stored:
redis_client.hset('tokens', uid, access_token)
only to be used later in process_user:
token = redis_client.hget('tokens', uid)
(code from https://github.com/dropbox/mdwebhook/blob/master/app.py as suggested by their documentation: https://www.dropbox.com/developers/reference/webhooks#webhooks)
The same goes for per-user delta cursors that are also stored.
However there are plenty of resources how to install Redis, for example:
https://www.digitalocean.com/community/tutorials/how-to-install-and-use-redis
In this case your redis_url would be something like:
"redis://localhost:6379/"
There are also hosted solutions, e.g. http://redistogo.com/
Possible workaround would be to use database for such purpose.
As for debugging, you could use logging facility for Python, it's thread safe and capable of writing output to file stream, it should provide you with plenty information if properly used.
More info here:
https://docs.python.org/2/howto/logging.html

How to interface with opentsdb from Python

I'd like to interface with an opentsdb data store with Python. I only see a java client library for it. How would I go about this?

Unless you want a standalone client (in which case the Twisted Python OpenTSDB Client looks great), the easiest way is to run tcollector, and simply drop your Python script under /usr/local/tcollector/collector/0 – your script is expected to never return and print one data point per line in that format: metric timestamp value tag1=value1 tag2=value2 ....
tcollector takes care of connecting to OpenTSDB, pushing your data points out, etc. So you can focus on collecting the data you want to collect and write your data collection script in Python or any other scripting language you may like.

You can use Python request module and OpenTSDB HTTP API, too.

Give that library a try.
Twisted Python OpenTSDB Client
http://code.google.com/p/totsdb/source/browse/tostdb.py

Google AppEngine and Threaded Workers

I am currently trying to develop something using Google AppEngine, I am using Python as my runtime and require some advise on setting up the following.
I am running a webserver that provides JSON data to clients, The data comes from an external service in which I have to pull the data from.
What I need to be able to do is run a background system that will check the memcache to see if there are any required ID's, if there is an ID I need to fetch some data for that ID from the external source and place the data in the memecache.
If there are multiple id's, > 30 I need to be able to pull all 30 request as quickly and efficiently as possible.
I am new to Python Development and AppEngine so any advise you guys could give would be great.
Thanks.

You can use "backends" or "task queues" to run processes in the background. Tasks have a 10-minute run time limit, and backends have no run time limit. There's also a cronjob mechanism which can trigger requests at regular intervals.
You can fetch the data from external servers with the "URLFetch" service.

Note that using memcache as the communication mechanism between front-end and back-end is unreliable -- the contents of memcache may be partially or fully erased at any time (and it does happen from time to time).
Also note that you can't query memcache of you don't know the exact keys ahead of time. It's probably better to use the task queue to queue up requests instead of using memcache, or using the datastore as a storage mechanism.

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.