How to disable logging from the Marqo python client - python

The newest release of Marqo seems to have added additional logging to the Marqo Python client. How do I turn it off?
>>> mq.index("new").add_documents([{"field1": "hello"}])
2023-01-27 14:27:33,178 logger:'marqo' INFO add_documents pre-processing: took 0.000s for 1 docs, for an average of 0.000s per doc.
2023-01-27 14:27:34,561 logger:'marqo' INFO add_documents roundtrip: took 1.383s to send 1 docs to Marqo (roundtrip, unbatched), for an average of 1.383s per doc.
2023-01-27 14:27:34,561 logger:'marqo' INFO add_documents Marqo index: took 1.087s for Marqo to process & index 1 docs (server unbatched), for an average of 1.087s per doc.
2023-01-27 14:27:34,561 logger:'marqo' INFO add_documents completed. total time taken: 1.385s.

Use this to set the log level to warning; it should make the INFO log messages disappear:
marqo.set_log_level('WARN')
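The timing lines also come from a standard Python logger named 'marqo' (visible in the output above), so adjusting it through the logging module should work as well; a minimal sketch:
import logging

# Raise the 'marqo' logger's threshold so INFO-level timing messages are dropped.
logging.getLogger('marqo').setLevel(logging.WARNING)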

Related

Confluent-Kafka Python - Describe consumer groups (to get the lag of each consumer group)

I want to get the details of a consumer group using confluent-kafka. The CLI equivalent is:
./kafka-consumer-groups.sh --bootstrap-server XXXXXXXXX:9092 --describe --group my-group
My end goal is to get the lag value from that output. Is there any method in the confluent-kafka Python API to get these details? There is a method in the Java API, but I couldn't find one in the Python API.
I tried using the describe_configs method in the AdminClient API, but it threw a KafkaException with the following details:
This most likely occurs because of a request being malformed by the client library or the message was sent to an incompatible broker. See the broker logs for more details.
For now I have come up with the following solution. It's a workaround to get the combined lag of a consumer group:
from confluent_kafka import Consumer, TopicPartition

# Assumes a Consumer configured with the group's 'group.id' exists in scope.
def get_lag(topic, numPartitions):
    diff = list()
    for i in range(numPartitions):
        topic_partition = TopicPartition(topic, partition=i)
        # high is the next offset to be written to the partition.
        low, high = consumer.get_watermark_offsets(topic_partition)
        # Last committed offset of the group for this partition.
        currentList = consumer.committed([topic_partition])
        current = currentList[0].offset
        diff.append(high - current)
    return sum(diff)  # Combined Lag
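If it helps, here is one way the workaround might be wired up; the broker address and group id are the placeholders from the question, and 'my-topic' with 3 partitions is purely illustrative:
from confluent_kafka import Consumer

consumer = Consumer({
    'bootstrap.servers': 'XXXXXXXXX:9092',  # placeholder broker from the question
    'group.id': 'my-group',                 # the group whose lag we measure
    'enable.auto.commit': False,
})
print(get_lag('my-topic', 3))  # combined lag across 3 partitions (illustrative)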

Specifying types of text from stdout in subprocess.Popen in Python

First of all, sorry if the title is not as descriptive as possible; it's hard to explain.
I run a batch file via subprocess and redirect all its output to the Python console with print(). The batch prints informational messages, and I need to print a custom message only when the batch output matches a specific line.
But for some reason, Python decides it is not the same string.
I don't know if it's an encoding issue (I decode each readline as iso-8859-1 to avoid the encoding errors of utf-8).
Here's my code.
from tkinter import *
from tkinter import ttk
from subprocess import Popen, PIPE
import os

gui = Tk()
text = Text(gui)
lb = ttk.Label(text="My bat")
cmd = 'C:\\Users\\User\\Desktop\\BAT1.bat'

def runbat():
    proc = Popen(cmd, shell=False, stdout=PIPE)
    while True:
        line = proc.stdout.readline().decode('iso-8859-1').rstrip('\n')
        if line != '':
            myline = 'INFO Starting: Initialize communication activity (InitializeCommunication).'
            if line == myline:
                print("That's the line!")
            print(line)
        else:
            break

bt = ttk.Button(text="Run bat", command=runbat).grid(row=4, column=5)
text.grid(row=6, column=5)
lb.grid(row=3, column=5)
prog.grid(row=7, column=5)  # `prog` is not defined in the posted snippet
mainloop()
And this is the output:
C:\Python34\python.exe "C:/Users/User/Documents/Desarrollo/Boos Production Manager/preferences/multioneproof.py"
C:\Users\User\Documents\Desarrollo\Boos Production Manager\preferences>cd /D C:
C:\Users\User\Documents\Desarrollo\Boos Production Manager\preferences>cd C:\Program Files\Philips MultiOne Workflow
C:\Program Files\Philips MultiOne Workflow>MultiOneWorkflow.exe /f "C:/Users/User/Desktop/A.xml" /w "Z:\Spain Factory\multione configuration\verify.txt" /p S /lu true /v info /c Halt
WARN Parameter IncludeUniqueIdOfDeviceInLabelData is provided without the GenerateAndExportLabelData parameter.
INFO Philips MultiOne Workflow version 3.11.91.28
INFO OS: Microsoft Windows 10 Home. Computer name: PC-DAVID. Application path: C:\Program Files\Philips MultiOne Workflow\MultiOneWorkflow.exe. Running as administrator: no. Format: Espa¤ol (Espa¤a) [es-ES], date format: dd/MM/yyyy, right to left: no, decimal separator: [,].
INFO Key: N/A. Profile: Debug. TwelveNc: N/A.
INFO Privileges: 0-10V / 1-10V: All. 0-10V / 1-10V (LED Driver): All. ActiLume: All. ActiLume wired: All. ActiLume wireless: All. Active Cooling: All. Adjustable Light Output: All. Adjustable Light Output Minimum: All. Adjustable Output Current: All. Adjustable Output Current Multi-Channel: All. Adjustable Startup Time: All. AmpDim: All. Coded Light: All. Coded Light Pwm: All. Coded Light Randomize: All. Coded Mains Scene Settings: All. Coded Mains Standalone Receiver: All. ComBox: All. Constant Light Output: All. Constant Light Output LITE: All. Constant Light Output Multi-Channel: All. Correlated Color Temperature Dual Channel: All. Correlated color temperature: All. Corridor Mode: All. DALI 102 variables: All. DALI Power Supply: All. DC Emergency: All. Dali 202 variables: All. Daylight override / Daylight switching: All. Device Info: All. Diagnostics: All. Diagnostics Emergency: All. Diagnostics Motor Controller: All. Dimming Interface: All. Driver Addressing: All. Driver Temperature Limit: All. Dwell Time: All. Dynadimmer: All. Emergency: All. End Of Life indication: All. Energy Meter: All. FCC Test Mode Settings: All. Factory link: All. Field Task Tuning: All. Field Task Tuning/Occupancy Sensing/Daylight Harvesting: All. Lamp Burn-in: All. Lamp selection: All. Late Stage Configuration: All. Light Source Age: All. LineSwitch: All. Load Fault Indicator Thresholds: All. Logical Signal Input: All. Lumen Level: All. LumiStep: All. Luminaire (Fixture) Information: All. Luminaire Production Test: All. Min dim level: All. Module Temperature Protection: All. Motor Control: All. NTC on LedSet: All. OEM Write Protection: All. Occupancy / Daylight: All. Occupancy sharing / Group light behavior: All. PowerBox: All. Push Button Unit LCU2070: All. Push Button Unit LCU2071: All. Quick Lamp Start: All. Relay Switched Output: All. SR Power Supply: All. Set Lamp uptime: All. Step Dimming: All. Touch and Dim: All.
INFO On warnings: halt
INFO Using Write&Verify.
INFO Multiple device configuring: Disabled
INFO Commission all: Disabled
INFO Check device model: Enabled
INFO DALI factory new: Disabled
INFO Starting: Prepare system activity (PrepareSystem).
INFO Success: Prepare system activity (PrepareSystem).
INFO Starting: Select feature file activity (OpenFile).
INFO Opening features file
INFO Provided file: c:/users/user/desktop/a.xml
INFO Success: Select feature file activity (OpenFile).
INFO Starting: Initialize communication activity (InitializeCommunication).
INFO Success: Initialize communication activity (InitializeCommunication).
INFO Starting: Identify device activity (IdentifyDevice).
INFO Devices identified: 0
ERROR No connected devices were found
ERROR Failure: Identify device activity (IdentifyDevice).
INFO Starting: Stop activity (Stop).
INFO Success: Stop activity (Stop).
INFO End
C:\Program Files\Philips MultiOne Workflow>echo 500 1>"Z:\Spain Factory\multione configuration\log.txt"
Process finished with exit code 0
So I think it should print my custom sentence when it reaches the correct line, but for some reason it doesn't.
if line != '':
    myline = 'INFO Starting: Initialize communication activity (InitializeCommunication).'
    if str(myline) in str(line):
        print("That's the line!")
    print(line)
else:
    break
Here your line value has extra trailing whitespace that myline does not, which is why your equality check fails; using the in operator instead solves the issue.
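An alternative sketch, assuming the trailing characters are whitespace (such as the '\r' left over from Windows line endings after stripping only '\n'): strip all trailing whitespace so the exact comparison works.
line = proc.stdout.readline().decode('iso-8859-1').rstrip()  # rstrip() with no args removes '\r', spaces, etc.
if line == 'INFO Starting: Initialize communication activity (InitializeCommunication).':
    print("That's the line!")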

I need to scrape logs from CloudWatch Logs and load them into S3, and from S3 into a data warehouse

I have several Lambda functions. I need to scrape the logs generated by all of them and load them into our internal data warehouse. I thought of these solutions.
Have a Lambda function subscribed to my Lambda functions' CloudWatch log groups that polishes the log messages and pushes them to S3.
Pros: Works and is simple to implement.
Cons: There is no way for me to "replay". Say my exporter failed for some reason; I wouldn't be able to replay this action.
Have a Lambda function that runs every 10 minutes or so, creates an export task, scrapes logs from CloudWatch, and loads them into S3.
import boto3

client = boto3.client('logs')

response = client.create_export_task(
    taskName='export_task',
    logGroupName='/aws/lambda/<lambda_function_1>',
    fromTime=from_time,
    to=to_time,
    destination='<application_logs>',
    destinationPrefix='<lambda_function_1>'
)
response = client.create_export_task(
    taskName='export_task',
    logGroupName='/aws/lambda/<lambda_function_2>',
    fromTime=from_time,
    to=to_time,
    destination='<application_logs>',
    destinationPrefix='<lambda_function_2>'
)
The second create_export_task call fails with:
An error occurred (LimitExceededException) when calling the CreateExportTask operation: Resource limit exceeded.
I can't create multiple export tasks. Is there a way to address this?
From AWS docs: One active (running or pending) export task at a time, per account. This limit cannot be changed.
You can use the loop below to poll until the task's status changes to 'COMPLETED' before creating the next export task:
response = client.create_export_task(
    taskName='export_cw_to_s3',
    logGroupName='/ecs/',
    logStreamNamePrefix=org_id,
    fromTime=int((yesterday - unix_start).total_seconds() * 1000),
    to=int((today - unix_start).total_seconds() * 1000),
    destination='test-bucket',
    destinationPrefix=f'random-string/{today.year}/{today.month}/{today.day}/{org_id}')
taskId = response['taskId']
status = 'RUNNING'
while status in ['RUNNING', 'PENDING']:
    response_desc = client.describe_export_tasks(taskId=taskId)
    status = response_desc['exportTasks'][0]['status']['code']
I came across the same error message, and the reason is that you can only have one running or pending export task per account at a given time, hence this task fails. From the AWS docs: One active (running or pending) export task at a time, per account. This limit cannot be changed.
https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/cloudwatch_limits_cwl.html
Sometimes an export task stays in the PENDING state for a long time, preventing other Lambda functions that create the same task from running. You can find that task with describe_export_tasks and cancel it (cancel_export_task), allowing the other functions to run.
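Putting both answers together, a minimal sketch that serializes the two exports (the export_and_wait helper is illustrative; the <application_logs> and <lambda_function_N> placeholders are the ones from the question):
import time
import boto3

client = boto3.client('logs')

def export_and_wait(log_group, prefix, from_time, to_time):
    # Start one export task, then poll until it finishes, since only one
    # task may be RUNNING or PENDING per account at a time.
    task_id = client.create_export_task(
        taskName='export_task',
        logGroupName=log_group,
        fromTime=from_time,
        to=to_time,
        destination='<application_logs>',
        destinationPrefix=prefix,
    )['taskId']
    while True:
        desc = client.describe_export_tasks(taskId=task_id)
        code = desc['exportTasks'][0]['status']['code']
        if code not in ('RUNNING', 'PENDING'):
            return code
        time.sleep(5)

export_and_wait('/aws/lambda/<lambda_function_1>', '<lambda_function_1>', from_time, to_time)
export_and_wait('/aws/lambda/<lambda_function_2>', '<lambda_function_2>', from_time, to_time)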

Checking STOP time of EC2 instance with boto3

Python 2.7
Boto3
I'm trying to get a timestamp of when the instance was stopped OR the time the last state transition took place OR a duration of how long the instance has been in the current state.
My goal is to test if an instance has been stopped for x hours.
For example,
instance = ec2.Instance('myinstanceID')
if int(instance.state['Code']) == 80:
    stop_time = instance.state_change_time()  # Dummy method.
Or something similar to that.
I see that boto3 has a launch_time method. And lots of ways to analyze state changes using state_transition_reason and state_reason but I'm not seeing anything regarding the state transition timestamp.
I've got to be missing something.
Here is the Boto3 docs for Instance "state" methods...
state
(dict) --
The current state of the instance.
Code (integer) --
The low byte represents the state. The high byte is an opaque internal value and should be ignored.
0 : pending
16 : running
32 : shutting-down
48 : terminated
64 : stopping
80 : stopped
Name (string) --
The current state of the instance.
state_reason
(dict) --
The reason for the most recent state transition.
Code (string) --
The reason code for the state change.
Message (string) --
The message for the state change.
Server.SpotInstanceTermination : A Spot instance was terminated due to an increase in the market price.
Server.InternalError : An internal error occurred during instance launch, resulting in termination.
Server.InsufficientInstanceCapacity : There was insufficient instance capacity to satisfy the launch request.
Client.InternalError : A client error caused the instance to terminate on launch.
Client.InstanceInitiatedShutdown : The instance was shut down using the shutdown -h command from the instance.
Client.UserInitiatedShutdown : The instance was shut down using the Amazon EC2 API.
Client.VolumeLimitExceeded : The limit on the number of EBS volumes or total storage was exceeded. Decrease usage or request an increase in your limits.
Client.InvalidSnapshot.NotFound : The specified snapshot was not found.
state_transition_reason
(string) --
The reason for the most recent state transition. This might be an empty string.
The EC2 instance has an attribute StateTransitionReason which also contains the time the transition happened. Use boto3 to get the time the instance was stopped.
print status['StateTransitionReason']
...
User initiated (2016-06-23 23:39:15 GMT)
The code below prints the stopped time and the current time. Use Python to parse the times and find the difference; not very difficult if you know Python.
import boto3
import re

client = boto3.client('ec2')
rsp = client.describe_instances(InstanceIds=['i-03ad1f27'])
if rsp:
    status = rsp['Reservations'][0]['Instances'][0]
    if status['State']['Name'] == 'stopped':
        stopped_reason = status['StateTransitionReason']
        current_time = rsp['ResponseMetadata']['HTTPHeaders']['date']
        stopped_time = re.findall(r'.*\((.*)\)', stopped_reason)[0]
        print 'Stopped time:', stopped_time
        print 'Current time:', current_time
Output
Stopped time: 2016-06-23 23:39:15 GMT
Current time: Tue, 20 Dec 2016 20:33:22 GMT
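Continuing from the snippet above, a minimal sketch of the remaining step, testing whether the instance has been stopped for more than x hours (Python 2.7, matching the question; the 4-hour threshold is illustrative):
from datetime import datetime

# stopped_time and current_time come from the snippet above; strip the
# trailing ' GMT' since both timestamps are already in the same zone.
stopped_dt = datetime.strptime(stopped_time.replace(' GMT', ''), '%Y-%m-%d %H:%M:%S')
current_dt = datetime.strptime(current_time.replace(' GMT', ''), '%a, %d %b %Y %H:%M:%S')
hours_stopped = (current_dt - stopped_dt).total_seconds() / 3600.0
print hours_stopped > 4  # e.g., has the instance been stopped over 4 hours?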
You might consider using AWS Config to view the configuration history of the instances.
AWS Config is a fully managed service that provides you with an AWS resource inventory, configuration history, and configuration change notifications to enable security and governance
The get-resource-config-history command can return information about an instance, so it probably has Stop & Start times. It will take a bit of parsing to extract the details.
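A sketch of that approach with boto3 (assumes AWS Config is enabled and recording EC2 instances; the instance id is the one from the earlier snippet):
import boto3

config = boto3.client('config')
history = config.get_resource_config_history(
    resourceType='AWS::EC2::Instance',
    resourceId='i-03ad1f27',
)
# Each configuration item carries the time the change was captured.
for item in history['configurationItems']:
    print item['configurationItemCaptureTime']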

GAE: Task on backend instance killed without warning

TL;DR:
How can I work around this bug in Appengine: sometimes is_shutting_down returns False, and in a second or two, the instance is shut down?
Details
I have a backend instance on a Google Appengine application (Python). The backend instance is used to generate reports, which sometimes takes minutes or even hours to finish.
To deal with unexpected shutdowns, I am watching for runtime.is_shutting_down() and store the report's intermediate state into DB when is_shutting_down returns True.
Here's the portion of code where I check it:
from google.appengine.api import runtime
#...

def my_report_function():
    #...
    # Check if we should interrupt and reschedule to avoid timeout error.
    duration_sec = time.time() - start
    too_long = MAX_SEC < duration_sec
    is_shutting_down = runtime.is_shutting_down()
    log.debug('Does this report iteration need to wrap it up soon? '
              'Too long? %s (%s sec). Shutting down? %s'
              % (too_long, duration_sec, is_shutting_down))
    if too_long or is_shutting_down:
        # save the state of report, reschedule next iteration, and return
Sometimes it works, but sometimes I see the following in the Appengine log:
D 2013-06-20 18:41:56.893 Does this report iteration need to wrap it up soon? Too long? False (348.865950108 sec). Shutting down? False
E 2013-06-20 18:42:00.248 Process terminated because the backend took too long to shutdown.
Clearly, the 30-second timeout has not passed between the time when I checked the value returned by runtime.is_shutting_down(), and when Appengine killed the backend.
Does anybody know why this is happening, and whether there is a workaround for this?
Thank you in advance!
There is demo code from Google I/O here: http://backends-io.appspot.com/
The included counter_v3_with_write_behind.py demonstrates a pattern: on '/_ah/start', set a shutdown hook via
runtime.set_shutdown_hook(something_to_save_progress_and_requeue_task)
Your code currently asks "are you shutting down right now? If not, go do something that may take a while." This pattern instead listens for "shut down ASAP or you lose everything"; see the sketch below.
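A minimal sketch of that pattern, assuming a placeholder save_state_and_requeue function that persists the report's progress:
from google.appengine.api import runtime

def save_state_and_requeue():
    # Placeholder: persist the report's intermediate state and re-enqueue
    # the task so a fresh instance can resume it.
    pass

# Register the hook on backend startup (the '/_ah/start' handler).
runtime.set_shutdown_hook(save_state_and_requeue)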
