I'm looking at this in the redis stream documentation, which says:
It is time to try reading something using the consumer group:
> XREADGROUP GROUP mygroup Alice COUNT 1 STREAMS mystream >
1) 1) "mystream"
   2) 1) 1) 1526569495631-0
         2) 1) "message"
            2) "apple"
XREADGROUP replies are just like XREAD replies. Note however the GROUP
provided above, it states that I want to
read from the stream using the consumer group mygroup and I'm the
consumer Alice. Every time a consumer performs an operation with a
consumer group, it must specify its name uniquely identifying this
consumer inside the group.
There is another very important detail in the command line above,
after the mandatory STREAMS option the ID requested for the key
mystream is the special ID >. This special ID is only valid in the
context of consumer groups, and it means: messages never delivered to
other consumers so far.
I am trying to specify the ">" parameter in redis-py.
When I look at the documentation here, I don't see any parameter in streams that seems to let me do this. Specifically, I'm trying:
>>> r.xreadgroup(mygroupname,myconsumer,{mystream : ">"},1)
[] # oh no, empty. WHY?!
#
# even though
>>> r.xread({mystream: '1561950326849-0'}, count=1)
[[b'stuff-returned-successfully.']]
What am I missing? Why can't I specify a ">" to indicate unseen messages?
Your question assumes that the group still has /unseen/ messages. The command itself is correct, but it returns an empty list once every message in the stream has already been delivered to the group.
Try
# make sure you have not seen anything in your stream by resetting last seen to 0
>>> r.xgroup_setid(mystream,mygroupname,0) # RESET ALL
Now
r.xreadgroup(mygroupname,myconsumer,{mystream : ">"},1)
works fine.
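The two steps above can be wrapped in small helpers. This is only a minimal sketch, assuming `r` is a connected redis-py client and the group already exists; the helper names `read_unseen` and `rewind_group` are made up for illustration, while `xreadgroup` and `xgroup_setid` are the same redis-py calls used above.

```python
def read_unseen(r, stream, group, consumer, count=1):
    """Ask for entries never delivered to any consumer of `group`.

    Returns an empty list when the group has already been handed
    every entry in the stream.
    """
    return r.xreadgroup(group, consumer, {stream: ">"}, count=count)


def rewind_group(r, stream, group):
    """Reset the group's last-delivered ID so that '>' starts from the beginning."""
    return r.xgroup_setid(stream, group, "0")


# Usage (needs a running Redis server and an existing consumer group):
#   import redis
#   r = redis.Redis()
#   rewind_group(r, "mystream", "mygroup")
#   read_unseen(r, "mystream", "mygroup", "Alice")
```

Note that rewinding the group makes every consumer in it see old entries again, so it is probably only appropriate for debugging or reprocessing.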
When I create the consumer
consumer = pulsar.Client(
    PULSAR_URL,
    authentication=AuthenticationOauth2(params)
).subscribe(
    topic=PULSAR_TOPIC,
    subscription_name=PULSAR_SUBSCRIPTION_NAME
)
I cannot read all messages from the beginning, or all unread messages; I can only read messages created after the consumer is created.
The question is how to configure the consumer so that it reads all previously unread messages.
Thanks
You can specify the initial_position in the subscribe method to set the initial position of a consumer when subscribing to the topic. It could be either: InitialPosition.Earliest or InitialPosition.Latest. Default: Latest
So in your case, if you wanted to start at the oldest available message then you would want something like:
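from pulsar import InitialPosition

consumer = pulsar.Client(
    PULSAR_URL,
    authentication=AuthenticationOauth2(params)
).subscribe(
    topic=PULSAR_TOPIC,
    subscription_name=PULSAR_SUBSCRIPTION_NAME,
    # start from the oldest available message instead of the default (Latest)
    initial_position=InitialPosition.Earliest
)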
Hope this helps!
I am trying to create a list of topics using a regex in a Python shell.
I have three topics: topic-1, topic-2 and topic-3. I want to create a single consumer object subscribed to all three, so that a message published to any of them is received from the right topic.
I am following the code below, but it has an issue:
import pulsar
import re
client = pulsar.Client('pulsar://localhost:6650')
topic = 'my-topic'
topic = ['topic-1', 'topic-2', 'topic-3']
topic = re.compile('topic-.*')
print(topic)
# <_sre.SRE_Pattern object at 0x7f13314e7210>
consumer = client.subscribe(topic, "my-subscription")
2019-04-26 07:05:02.956 INFO ConnectionPool:72 | Created connection for
pulsar://localhost:6650
2019-04-26 07:05:02.957 INFO ClientConnection:300 | [127.0.0.1:55874 ->
127.0.0.1:6650] Connected to broker
The consumer object is created, but it does not seem to be subscribed to topic-1, topic-2 and topic-3, because in the next step I cannot receive any messages.
Is there a problem with the syntax?
I can't find anything overtly wrong in your syntax. Are you sure you have those topics in your namespace? Try using pulsar's command line tools, e.g.:
pulsar-admin tenants list
pulsar-admin namespaces list <>
pulsar-admin topics list tenant/cluster/namespace
See here for more options: https://pulsar.apache.org/docs/latest/reference/CliTools/
Recently I wrote code that should forward every message from a certain user to all groups that I joined but it doesn't.
Here is my code:
for message in client.iter_messages('aliakhtari78'):
    try:
        dialogs = client.get_dialogs()
        for dialog in dialogs:
            id_chat = dialog.message.to_id.channel_id
            entity = client.get_entity(id_chat)
            client.forward_messages(
                entity,      # to which entity you are forwarding the messages
                message.id,  # the IDs of the messages (or message) to forward
                'somebody'   # who sent the messages?
            )
    except:
        pass
This code first takes every message sent to me by 'aliakhtari78', then gets the entity of each group I have joined, and finally should forward the message to all of those groups, but it doesn't. When I replaced the entity with a user entity it worked, so I know the problem lies with the entity, but I can't figure out what it is.
Also, I apologize for any writing mistakes in my question.
In order to send messages to any entities in Telegram, you need two pieces of information:
the constant unique ID of the entity (It's an integer. It's NOT username string)
the access_hash which is different for each user for each entity
You can only pass @username to client.get_entity, and Telethon automatically resolves the @username to an entity with an id and access_hash. That's why it works when you change your code like that. However, in your code you passed channel_id (the constant unique ID of the entity) to client.get_entity, not a username.
Note that client.get_dialogs returns entities along with dialogs. You have just ignored the entities! This is how you can get an array of all entities:
dialogs, entities = client.get_dialogs()
Then simply pass the corresponding entity from the entities array to client.forward_messages.
I'm subscribing to Kafka using a pattern with a wildcard, as shown below. The wildcard represents a dynamic customer id.
consumer.subscribe(pattern='customer.*.validations')
This works well, because I can pluck the customer Id from the topic string. But now I need to expand on the functionality to listen to a similar topic for a slightly different purpose. Let's call it customer.*.additional-validations. The code needs to live in the same project because so much functionality is shared, but I need to be able to take a different path based on the type of queue.
In the Kafka documentation I can see that it is possible to subscribe to an array of topics. However, these are hard-coded strings, not patterns that allow for flexibility.
>>> # Deserialize msgpack-encoded values
>>> consumer = KafkaConsumer(value_deserializer=msgpack.loads)
>>> consumer.subscribe(['msgpackfoo'])
>>> for msg in consumer:
...     assert isinstance(msg.value, dict)
So I'm wondering if it is possible to somehow do a combination of the two? Kind of like this (non-working):
consumer.subscribe(pattern=['customer.*.validations', 'customer.*.additional-validations'])
The KafkaConsumer code supports either a list of topics or a pattern:
https://github.com/dpkp/kafka-python/blob/68c8fa4ad01f8fef38708f257cb1c261cfac01ab/kafka/consumer/group.py#L717
def subscribe(self, topics=(), pattern=None, listener=None):
    """Subscribe to a list of topics, or a topic regex pattern
    Partitions will be dynamically assigned via a group coordinator.
    Topic subscriptions are not incremental: this list will replace the
    current assignment (if there is one).
So you can write a single regex that combines both patterns with the OR operator |. This should subscribe you to both sets of dynamic topics, because kafka-python uses the re module internally for matching:
(customer.*.validations)|(customer.*.additional-validations)
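One thing to watch for: an unescaped `.` in a regex matches any character, so the loose `customer.*.validations` already matches `customer.42.additional-validations` as well. Below is a sketch of a stricter combined pattern plus a small router that tells the two topic families apart, using only the standard `re` module. The function name `route` and the sample topic names are made up for illustration, and the customer id is assumed to contain no dots.

```python
import re

# Escaped dots; [^.]+ limits the customer id to one dot-free segment.
COMBINED = r"^customer\.([^.]+)\.(additional-validations|validations)$"
_TOPIC = re.compile(COMBINED)


def route(topic):
    """Return (customer_id, kind) for a matching topic, else None."""
    m = _TOPIC.match(topic)
    if m is None:
        return None
    return m.group(1), m.group(2)


# The loose pattern from the question already matches both topic families:
loose = re.compile("customer.*.validations")
assert loose.match("customer.42.additional-validations") is not None

# With kafka-python the combined string would be passed as
#   consumer.subscribe(pattern=COMBINED)
# and each record dispatched on route(msg.topic).
```

Capturing the customer id and the topic kind in the same pattern means the subscription and the dispatch logic cannot drift apart.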
In the Confluent Kafka library, subscribe has no pattern keyword; instead, any topic string that starts with ^ is treated as a regex pattern.
def subscribe(self, topics, on_assign=None, *args, **kwargs):
    """
    Set subscription to a supplied list of topics
    This replaces a previous subscription.

    Regexp pattern subscriptions are supported by prefixing the topic string with ``"^"``, e.g.::

        consumer.subscribe(["^my_topic.*", "^another[0-9]-?[a-z]+$", "not_a_regex"])
    """
I've configured postfix on the email server with .forward file which saves a copy of email and invokes a python script. These emails are stored in Maildir format.
I want to use this Python script to send a reply to the sender acknowledging that the email has been received. I was wondering if there is any way I can open/access that e-mail, get the header info and sender address and send email back.
I looked at several examples of Maildir functions of Python, but they mostly add/delete e-mails. How can I open the latest e-mail received in Maildir/new and get the required information?
The program I have so far:
import mailbox

md = mailbox.Maildir('/home/abcd/Maildir')
# first message in iteration order, which is not necessarily the newest
message = next(md.itervalues())
#print(message)
#for msg in md:
#    subject = msg.get('Subject', "")
#    print(subject)
print(message)
sender = message.get('From', "")
print(sender)
When I execute this, I do get the sender name, but it comes from the oldest email in the Maildir/new folder, not the latest one.
Also, if I use the get_date function, what happens if two (or more) emails arrive on the same day?
The MaildirMessage's method .get_date() gets you the timestamp of the message file on disc. Depending on your filesystem, this may have anywhere between two-second and nanosecond accuracy. The chances of two messages giving the same value with .get_date() are vastly smaller than they would be if it returned only a date.
However, if the message files were touched for some reason, the return value of .get_date() would not be relevant at all. Dovecot, for example, explicitly states that a file's mtime should not be changed.
There are several dates associated with a MaildirMessage:
1. The arrival timestamp, as encoded in the name of the message (the part before the first dot; these are "whole" seconds). If the part between the first and second dot has a segment of the form Mn, then n is the microsecond part of the arrival time and can be used to improve the resolution of the timestamp.
2. The timestamp of the file on disc.
3. The 'Date:' header field as set by the sending program (or added by some MTA).
4. The dates added by intermediate MTAs in the 'Received:' header field.
The last of these might not be available e.g. if you and the sender are on the same mail server. The third can be easily faked/incorrect (ever got spam in your inbox dated many years ago?). And the second is incorrect if the file ever got touched.
That leaves selecting on the first option:
d = {}
for name in md.keys():
    d.setdefault(int(name.split('.', 1)[0]), []).append(name)
result = sorted(d.items())[-1][1]
assert len(result) == 1 # might fail
msg = md.get_message(result[0])
If you are lucky, result is a list with a single item. But this value has only one-second resolution, so you might get multiple emails. In that case you have to decide how to pick one based on the other values (e.g. by sorting on the file timestamp from .get_date()), or just select the first one, or select one at random. (If you have the log file, you can search it for the keys of the result messages to determine which one arrived latest.)
If you did not convert to int and had old emails (i.e. from before 2001-09-09 01:46:40 UTC, when the timestamp grew from nine to ten digits), a string comparison would probably not give you the message with the latest arrival time.
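The Mn segment mentioned earlier can be used as a tie-breaker when several keys share the same second. A rough sketch, assuming keys follow the common 'seconds.middle.host' layout; the helper names `arrival_time` and `newest_key` and the sample keys in the comments are hypothetical.

```python
import re


def arrival_time(key):
    """Parse (seconds, microseconds) out of a Maildir key.

    Microseconds default to 0 when the part between the first and
    second dot has no 'Mn' segment. Raises ValueError if the part
    before the first dot is not a number.
    """
    seconds, _, rest = key.partition(".")
    middle = rest.split(".", 1)[0]
    m = re.search(r"M(\d+)", middle)
    micros = int(m.group(1)) if m else 0
    return int(seconds), micros


def newest_key(keys):
    """Return the key with the latest arrival time (integer comparison)."""
    return max(keys, key=arrival_time)


# Usage with a mailbox.Maildir instance `md`:
#   msg = md.get_message(newest_key(md.keys()))
```

Because the comparison is done on integers, a nine-digit timestamp sorts before a ten-digit one, which a plain string comparison would get wrong.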
Some hints for this:
You can open a Maildir with the mailbox.Maildir class (see the Documentation for mailbox)
You can iterate over all the mails in a Maildir via the method itervalues
Now you get all the mails in the Maildir. One of them is the most recent one.
The mails are objects of the class MaildirMessage, which is a subclass of Message. These classes are documented too (currently on the same page as mailbox).
With the method "get_date" on those objects, you can find out which one is the most recent. You still have to select it yourself.
That is as much help as a beginner needs; you should also do a little bit by yourself.
You should familiarize yourself with the Python documentation. I agree that it is not easy to find the right packages and learn how to use them, but you can try them out directly in the Python shell.
Ok, here another code snippet:
newest = None
for message in md.itervalues():
    if newest is None or message.get_date() > newest.get_date():
        newest = message
# now newest should contain the newest message
I had not seen your last question: get_date returns not only the date but also the time, because it gives the number of seconds since the epoch (normally 1970).
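To get from the newest message to the sender address the question asks about, something along these lines might work. It is a sketch using only the standard library (`mailbox` and `email.utils.parseaddr`); the function name `newest_sender` is made up, and it relies on the file timestamp from get_date, so it inherits the caveat that touched files give misleading results.

```python
import mailbox
from email.utils import parseaddr


def newest_sender(maildir_path):
    """Return the From address of the message with the latest file
    timestamp, or None if the mailbox is empty."""
    md = mailbox.Maildir(maildir_path, create=False)
    newest = max(md.itervalues(), key=lambda msg: msg.get_date(), default=None)
    if newest is None:
        return None
    # parseaddr splits 'Name <user@host>' into (name, address).
    return parseaddr(newest.get("From", ""))[1]


# Usage:
#   print(newest_sender('/home/abcd/Maildir'))
```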