Looping over Protocol Buffers attributes in Python

I would like help with recursively looping over all attributes/sub-objects contained in a Protocol Buffers message, assuming that we know neither their names nor how many there are.
As an example, take the following .proto file from the tutorial on the Google website:
message Person {
required string name = 1;
required int32 id = 2;
optional string email = 3;
enum PhoneType {
MOBILE = 0;
HOME = 1;
WORK = 2;
}
message PhoneNumber {
required string number = 1;
optional PhoneType type = 2 [default = HOME];
}
repeated PhoneNumber phone = 4;
}
and to use it:
person = tutorial.Person()
person.id = 1234
person.name = "John Doe"
person.email = "jdoe@example.com"
phone = person.phone.add()
phone.number = "555-4321"
phone.type = tutorial.Person.HOME
Given Person, how do I then access both the name and the value of each attribute: person.id, person.name, person.email, person.phone.number, person.phone.type?
I have tried the following; however, it doesn't seem to recurse into person.phone.number or person.phone.type.
object_of_interest = Person
while hasattr(object_of_interest, "_fields"):
    for obj in object_of_interest._fields:
        # Do_something_with_object(obj)  # e.g. print obj.name
        object_of_interest = obj
I have tried using obj.DESCRIPTOR.fields_by_name.keys to access the sub-elements, but these are the string names of the sub-objects, not the objects themselves.
obj.name gives me the name of the attribute, but I'm not sure how to actually get the value of that attribute. For example, obj.name may give me 'name', but how do I get 'John Doe' out of it?

I'm not super familiar with protobufs, so there may well be an easier way or API for this kind of thing. However, the example below shows how you could iterate over and introspect an object's fields and print them out. Hopefully it's enough to get you going in the right direction at least...
import addressbook_pb2 as addressbook
person = addressbook.Person(id=1234, name="John Doe", email="foo@example.com")
person.phone.add(number="1234567890")
def dump_object(obj):
    for descriptor in obj.DESCRIPTOR.fields:
        value = getattr(obj, descriptor.name)
        if descriptor.type == descriptor.TYPE_MESSAGE:
            if descriptor.label == descriptor.LABEL_REPEATED:
                # Explicit loop: map() is lazy on Python 3 and would never call dump_object
                for item in value:
                    dump_object(item)
            else:
                dump_object(value)
        elif descriptor.type == descriptor.TYPE_ENUM:
            # Look the name up by enum number rather than by position in the list
            enum_name = descriptor.enum_type.values_by_number[value].name
            print("%s: %s" % (descriptor.full_name, enum_name))
        else:
            print("%s: %s" % (descriptor.full_name, value))
dump_object(person)
which outputs
tutorial.Person.name: John Doe
tutorial.Person.id: 1234
tutorial.Person.email: foo@example.com
tutorial.Person.PhoneNumber.number: 1234567890
tutorial.Person.PhoneNumber.type: HOME
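If only the fields that are actually set matter, a variant of the same idea can be built on Message.ListFields(), which returns (descriptor, value) pairs for the populated fields. The following is a minimal sketch rather than a definitive implementation: the helper name walk is made up for illustration, and it uses descriptor_pb2.DescriptorProto (a message type that ships with the protobuf package) as a stand-in for the generated Person class so that it runs without addressbook_pb2. Map fields and repeated enums are not treated specially here.

```python
from google.protobuf import descriptor_pb2
from google.protobuf.descriptor import FieldDescriptor
from google.protobuf.message import Message


def walk(msg, prefix=""):
    """Yield (dotted_path, value) for every set leaf field (hypothetical helper)."""
    for fd, value in msg.ListFields():
        path = prefix + fd.name
        if fd.type == FieldDescriptor.TYPE_MESSAGE:
            if isinstance(value, Message):
                # Singular sub-message: recurse directly.
                yield from walk(value, path + ".")
            else:
                # Repeated sub-messages: recurse into each element.
                for index, item in enumerate(value):
                    yield from walk(item, "%s[%d]." % (path, index))
        elif fd.type == FieldDescriptor.TYPE_ENUM and isinstance(value, int):
            # Singular enum: translate the number into its symbolic name.
            yield path, fd.enum_type.values_by_number[value].name
        else:
            yield path, value


# Stand-in message; with generated code this would be e.g. addressbook_pb2.Person().
demo = descriptor_pb2.DescriptorProto(name="Person")
demo.field.add(name="id", number=2,
               type=descriptor_pb2.FieldDescriptorProto.TYPE_INT32)

for path, value in walk(demo):
    print("%s: %s" % (path, value))
```

Unlike iterating over DESCRIPTOR.fields, this skips fields that were never set, which may or may not be what you want.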

Related

How to set a protobuf Timestamp field in python?

I am exploring the use of protocol buffers and would like to use the new Timestamp data type which is in protobuf3. Here is my .proto file:
syntax = "proto3";
package shoppingbasket;
import "google/protobuf/timestamp.proto";
message TransactionItem {
optional string product = 1;
optional int32 quantity = 2;
optional double price = 3;
optional double discount = 4;
}
message Basket {
optional string basket = 1;
optional google.protobuf.Timestamp tstamp = 2;
optional string customer = 3;
optional string store = 4;
optional string channel = 5;
repeated TransactionItem transactionItems = 6;
}
message Baskets {
repeated Basket baskets = 1;
}
After generating python classes from this .proto file I'm attempting to create some objects using the generated classes. Here's the code:
import shoppingbasket_pb2
from google.protobuf.timestamp_pb2 import Timestamp
baskets = shoppingbasket_pb2.Baskets()
basket1 = baskets.baskets.add()
basket1.basket = "001"
basket1.tstamp = Timestamp().GetCurrentTime()
which fails with error:
AttributeError: Assignment not allowed to composite field "tstamp" in protocol message object.
Can anyone explain to me why this isn't working? I am nonplussed.
See Timestamp.
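Message-typed (composite) fields such as tstamp cannot be assigned with =; they are populated in place instead. I think you want:
basket1.tstamp.GetCurrentTime()
You could also parse it from an RFC 3339 string:
basket1.tstamp.FromJsonString("2022-03-26T22:23:34Z")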
I found this highly confusing, as it differs from how the other values were assigned in my demo, so I'd like to add this method using .FromDatetime():
.proto:
message User {
int64 id = 1;
string first_name = 2;
...
string phone_number = 7;
google.protobuf.Timestamp created_on = 8; // <-- NB
}
The response, UserMsgs.User(), is an instance of a class generated from the above .proto file, and it knows the type of each field.
def GetUser(self, request, context):
    response = UserMsgs.User()
    if request.id is not None and request.id > 0:
        usr = User.get_by_id(request.id)
        response.id = usr.id
        response.first_name = usr.first_name
        ...
        response.phone_number = str(usr.phone_number)
        response.created_on.FromDatetime(usr.created_on)  # <-- NB
    return response
So instead of assigning response.created_on with = like the other fields, we can use the built-in method .FromDatetime(), as mentioned here.
NB: Notice the lowercase t in Datetime
usr.created_on in my example is a python datetime, and is assigned to a google.protobuf.Timestamp field.
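To make the in-place style concrete, here is a small sketch of the Timestamp helper methods mentioned in the answers above. It works on standalone Timestamp objects, because the generated shoppingbasket_pb2 module is not available here; the assumption is that the same calls behave identically on a field such as basket1.tstamp. The variable names are illustrative only.

```python
from datetime import datetime

from google.protobuf.timestamp_pb2 import Timestamp

# Fill a Timestamp in place from a datetime (naive datetimes are treated as UTC).
ts = Timestamp()
ts.FromDatetime(datetime(2022, 3, 26, 22, 23, 34))

# Round-trip through the RFC 3339 string form.
as_text = ts.ToJsonString()
parsed = Timestamp()
parsed.FromJsonString(as_text)

# CopyFrom() is the message-level replacement for "=" when a Timestamp
# already exists, e.g. basket1.tstamp.CopyFrom(parsed) on a generated message.
copy = Timestamp()
copy.CopyFrom(parsed)

print(as_text)
print(copy.ToDatetime())
```

Note that these methods modify the object they are called on and return None, which is why the result of GetCurrentTime() cannot usefully be assigned to anything.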

Python container troubles

Basically, what I am trying to do is generate a JSON list of SSH keys (public and private) on a server using Python. I am using nested dictionaries, and while it works to an extent, each user's entry also shows every other user's keys; I need each user's entry to list only the keys that belong to that user.
Below is my code:
def ssh_key_info(key_files):
    for f in key_files:
        c_time = os.path.getctime(f)  # ctime of file (f): creation time on Windows, last metadata change on Unix
        username_list = f.split('/')  # splits on the / character
        user = username_list[2]  # takes index 2 of the split above, i.e. the user name in /home/<user>/...
        key_length_cmd = check_output(['ssh-keygen','-l','-f', f])  # runs the ssh-keygen command on the file (f)
        attr_dict = {}
        attr_dict['Date Created'] = str(datetime.datetime.fromtimestamp(c_time))  # converts the file time to a string
        attr_dict['Key_Length]'] = key_length_cmd[0:5]  # takes the first 5 characters of the ssh-keygen output
        ssh_user_key_dict[f] = attr_dict
        user_dict['SSH_Keys'] = ssh_user_key_dict
        main_dict[user] = user_dict
A list containing the absolute path of the keys (/home/user/.ssh/id_rsa for example) is passed to the function. Below is an example of what I receive:
{
"user1": {
"SSH_Keys": {
"/home/user1/.ssh/id_rsa": {
"Date Created": "2017-03-09 01:03:20.995862",
"Key_Length]": "2048 "
},
"/home/user2/.ssh/id_rsa": {
"Date Created": "2017-03-09 01:03:21.457867",
"Key_Length]": "2048 "
},
"/home/user2/.ssh/id_rsa.pub": {
"Date Created": "2017-03-09 01:03:21.423867",
"Key_Length]": "2048 "
},
"/home/user1/.ssh/id_rsa.pub": {
"Date Created": "2017-03-09 01:03:20.956862",
"Key_Length]": "2048 "
}
}
},
As can be seen, user2's key files are included in user1's output. I may be going about this completely wrong, so any pointers are welcome.
Thanks for the replies. I read up on nested dictionaries and found that the best answer on this post helped me solve the issue: What is the best way to implement nested dictionaries?
Instead of all the dictionaries, I simplified the code and now have just one dictionary. This is the working code:
import datetime
import os
from subprocess import check_output
class Vividict(dict):
    def __missing__(self, key):  # Sets and returns a new instance
        value = self[key] = type(self)()  # retain local pointer to value
        return value  # faster to return than dict lookup
main_dict = Vividict()
def ssh_key_info(key_files):
    for f in key_files:
        c_time = os.path.getctime(f)
        username_list = f.split('/')
        user = username_list[2]
        key_bit_cmd = check_output(['ssh-keygen','-l','-f', f])
        date_created = str(datetime.datetime.fromtimestamp(c_time))
        key_type = key_bit_cmd[-5:-2]
        key_bits = key_bit_cmd[0:5]
        main_dict[user]['SSH Keys'][f]['Date Created'] = date_created
        main_dict[user]['SSH Keys'][f]['Key Type'] = key_type
        main_dict[user]['SSH Keys'][f]['Bits'] = key_bits
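The first version most likely mixed the users together because ssh_user_key_dict and user_dict are single dictionaries shared by every user (their definitions are not shown in the question, so this is an assumption). For comparison, here is a minimal sketch of the grouping step alone using the standard library's collections.defaultdict; the function name group_keys_by_user is made up for illustration, and the ssh-keygen and timestamp lookups are left out so that it runs anywhere.

```python
from collections import defaultdict


def group_keys_by_user(key_files):
    """Map each user name to the key files found under that user's home directory."""
    keys_by_user = defaultdict(dict)
    for path in key_files:
        user = path.split('/')[2]        # '/home/<user>/.ssh/...' -> '<user>'
        keys_by_user[user][path] = {}    # a fresh attribute dict per key file
    return {user: {'SSH Keys': keys} for user, keys in keys_by_user.items()}


paths = [
    '/home/user1/.ssh/id_rsa',
    '/home/user2/.ssh/id_rsa',
    '/home/user1/.ssh/id_rsa.pub',
]
grouped = group_keys_by_user(paths)
print(sorted(grouped['user1']['SSH Keys']))
```

Because every user gets their own inner dictionary, a key file can only ever appear under the user whose home directory it lives in.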

How to get the type of a variable defined in a protobuf message?

I'm trying to do some 'translation' from protobuf files to Objective-C classes using Python. For example, given the protobuf message:
message Person {
required string name = 1;
required int32 id = 2;
optional string email = 3;
}
I want to translate it into an objc class:
@interface Person : NSObject
@property (nonatomic, copy) NSString *name;
@property (nonatomic, assign) int ID;
@property (nonatomic, copy) NSString *email;
@end
The key point is to acquire every property's name and type. For example, for 'optional string email' in the protobuf message, the name is 'email' and the type is 'string', so it should become NSString *email in Objective-C. I followed the official tutorial, wrote an addressbook.proto identical to the one in the tutorial, and compiled it. Then I wrote my Python code:
import addressbook_pb2 as addressbook
p = addressbook.Person()
all_fields = p.DESCRIPTOR.fields_by_name
# print "all fields: %s" %all_fields
field_keys = all_fields.keys()
# print "all keys: %s" %field_keys
for key in field_keys:
    one_field = all_fields[key]
    print(one_field.label)
This just gave me:
1
2
3
2
So I guess label is not what I need, while field_keys is just the list of names that I expect. I tried some other words, and did some search on the web, but didn't find the right answer.
If there's no way to acquire the type, I have another thought, which is to read and analyze every line of the protobuf source file in a pure 'Pythonic' way, but I really don't want to do this if it's not necessary.
Can anybody help me?
The FieldDescriptor class has a message_type member which, for a composite field, is a descriptor of the message type contained in that field; otherwise it is None.
Combining this with iteration over the fields_by_name dictionary of descriptors gives you the name and type of both composite and non-composite (raw) fields.
import addressbook_pb2 as addressbook
DESCRIPTORS = addressbook.Person.DESCRIPTOR.fields_by_name
for (field_name, field_descriptor) in DESCRIPTORS.items():
    if field_descriptor.message_type:
        # Composite field
        print(field_name, field_descriptor.message_type.name)
    else:
        # Raw type
        print(field_name, field_descriptor.type)
# TYPE_DOUBLE
# TYPE_FLOAT
# TYPE_INT64
# TYPE_UINT64
# TYPE_INT32
# TYPE_FIXED64
# TYPE_FIXED32
# TYPE_BOOL
# TYPE_STRING
# TYPE_GROUP
# TYPE_MESSAGE
# TYPE_BYTES
# TYPE_UINT32
# TYPE_ENUM
# TYPE_SFIXED32
# TYPE_SFIXED64
# TYPE_SINT32
# TYPE_SINT64
# MAX_TYPE
The raw types are integer class attributes of FieldDescriptor; see https://github.com/protocolbuffers/protobuf/blob/master/python/google/protobuf/descriptor.py
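Since field_descriptor.type is just an integer, a small lookup table makes the printed output readable. The sketch below hard-codes the numbers that correspond to the TYPE_* constants listed above (1 through 18, in that order); the table and the type_name helper are illustrative and hand-copied, so it is worth checking them against descriptor.py for the protobuf version in use.

```python
# Hand-copied from the TYPE_* constants listed above (illustrative table).
FIELD_TYPE_NAMES = {
    1: "TYPE_DOUBLE", 2: "TYPE_FLOAT", 3: "TYPE_INT64", 4: "TYPE_UINT64",
    5: "TYPE_INT32", 6: "TYPE_FIXED64", 7: "TYPE_FIXED32", 8: "TYPE_BOOL",
    9: "TYPE_STRING", 10: "TYPE_GROUP", 11: "TYPE_MESSAGE", 12: "TYPE_BYTES",
    13: "TYPE_UINT32", 14: "TYPE_ENUM", 15: "TYPE_SFIXED32",
    16: "TYPE_SFIXED64", 17: "TYPE_SINT32", 18: "TYPE_SINT64",
}


def type_name(type_number):
    """Return the symbolic name for a FieldDescriptor.type value (hypothetical helper)."""
    return FIELD_TYPE_NAMES.get(type_number, "UNKNOWN(%d)" % type_number)


print(type_name(9))
print(type_name(5))
```

With this in place, print(field_name, type_name(field_descriptor.type)) would show a name such as TYPE_STRING instead of a bare 9.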
Thanks to Marc's answer, I figured out a solution. This is just a thought, but it's a huge step for me.
Python code:
import addressbook_pb2 as addressbook
typeDict = {"1":"CGFloat", "2":"CGFloat", "3":"NSInteger", "4":"NSUInteger", "5":"NSInteger", "8":"BOOL", "9":"NSString", "13":"NSUInteger", "17":"NSInteger", "18":"NSInteger"}
attrDict = {"CGFloat":"assign", "NSInteger":"assign", "NSUInteger":"assign", "BOOL":"assign", "NSString":"copy"}
p = addressbook.Person()
all_fields = p.DESCRIPTOR.fields_by_name
field_keys = all_fields.keys()
for key in field_keys:
    one_field = all_fields[key]
    typeNumStr = str(one_field.type)
    className = typeDict.get(typeNumStr, "NSObject")
    attrStr = attrDict.get(className, "retain")
    propertyStr = "@property (nonatomic, %s) %s *%s" % (attrStr, className, key)
    print(propertyStr)
For the addressbook example, it prints:
@property (nonatomic, copy) NSString *email
@property (nonatomic, copy) NSString *name
@property (nonatomic, retain) NSObject *phone
@property (nonatomic, assign) NSInteger *id
Not the final solution, but it means a lot. Thank you, Marc!
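One possible next step, sketched here under assumptions: scalar types such as NSInteger are normally declared without the pointer star, so the formatter can drop the * whenever the attribute is assign. The table OBJC_TYPES and the function objc_property are hypothetical names, and the integer keys are the FieldDescriptor type numbers used in the code above.

```python
# FieldDescriptor type number -> (Objective-C type, property attribute).
# Illustrative mapping mirroring typeDict/attrDict from the post above.
OBJC_TYPES = {
    1: ("CGFloat", "assign"), 2: ("CGFloat", "assign"),
    3: ("NSInteger", "assign"), 4: ("NSUInteger", "assign"),
    5: ("NSInteger", "assign"), 8: ("BOOL", "assign"),
    9: ("NSString", "copy"), 13: ("NSUInteger", "assign"),
    17: ("NSInteger", "assign"), 18: ("NSInteger", "assign"),
}


def objc_property(field_name, type_number):
    """Render one property line; unknown types fall back to a retained NSObject."""
    objc_type, attr = OBJC_TYPES.get(type_number, ("NSObject", "retain"))
    star = "" if attr == "assign" else "*"   # only object types are pointers
    return "@property (nonatomic, %s) %s %s%s;" % (attr, objc_type, star, field_name)


print(objc_property("email", 9))
print(objc_property("id", 5))
print(objc_property("phone", 11))
```

Repeated fields and nested message types would still need their own handling (for example NSArray), which this sketch does not attempt.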

How to remove elements from JSON using Python

I receive the following data from an API request. I would like to search the data and select the Id where Name = "Steve" (3 in the example below).
The data returned from the API always has a different number of elements, and 'Steve' can appear at a different position in the returned data. The Id will also change.
(Getdata){
header =
(APIResponseHeader){
sessionToken = "xxxx"
}
Items[] =
(Summary){
Id = 1
Name = "John"
TypeId = 1
},
(Summary){
Id = 2
Name = "Jack"
TypeId = 1
},
(Summary){
Id = 3
Name = "Steve"
TypeId = 1
},
}
I think the format of the data is JSON(?), and I'm not sure how to convert it and then search it, if this is possible at all.
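The dump above does not look like JSON text; it reads more like the printed form of a response object with attributes. If that is the case (an assumption, since the client library is not named), no conversion is needed and the search is a plain loop over Items. The sketch below uses types.SimpleNamespace objects as stand-ins for the real response items, and find_id_by_name is a hypothetical helper name.

```python
from types import SimpleNamespace


def find_id_by_name(items, name):
    """Return the Id of the first item whose Name matches, or None if absent."""
    for item in items:
        if item.Name == name:
            return item.Id
    return None


# Stand-in data shaped like the response shown above.
items = [
    SimpleNamespace(Id=1, Name="John", TypeId=1),
    SimpleNamespace(Id=2, Name="Jack", TypeId=1),
    SimpleNamespace(Id=3, Name="Steve", TypeId=1),
]

steve_id = find_id_by_name(items, "Steve")
print(steve_id)
```

With the real response, the call would presumably be find_id_by_name(response.Items, "Steve"), where response is whatever object the API client returned.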

Are there any tools can build object from text directly, like google protocol buffer?

In most log-processing systems, log files are tab-separated text files, and the schema of the file is provided separately.
For example:
12 tom tom@baidu.com
3 jim jim@baidu.com
The schema is:
id : uint64
name : string
email : string
To find a record such as person.name == 'tom', the code is:
for each_line in sys.stdin:
    fields = each_line.strip().split('\t')
    if fields[1] == 'tom':  # magic number
        print(each_line)
There are a lot of magic numbers (1, 2, 3).
Is there a tool like Google Protocol Buffers (which is for binary data) that lets us build the object from text directly?
message Person {
uint64 id = 1;
string name = 2;
string email = 3;
}
so that we can then build a person like this: person = lib.BuildFromText(line)
for each_line in sys.stdin:
    person = lib.BuildFromText(each_line)  # no magic number
    if person.name == 'tom':
        print(each_line)
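import csv
# Schema: field name -> converter; relies on dicts preserving insertion order (Python 3.7+)
Person = {
    'id': int,
    'name': str,
    'email': str
}
persons = []
with open('CSV_FILE_NAME', 'r') as log_file:
    for row in csv.reader(log_file, delimiter='\t'):
        persons.append({name: convert(row[index]) for index, (name, convert) in enumerate(Person.items())})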
How is the lib.BuildFromText() function supposed to know how to name the fields? They are just values in the line you pass to it, right? Here is how to do it in Python:
import sys
from collections import namedtuple
Person = namedtuple('Person', 'id, name, email')
for each_line in sys.stdin:
    person = Person._make(each_line.strip().split('\t'))
    if person.name == 'tom':
        print(each_line)
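The two answers above can be combined: a namedtuple gives named access, and a per-field converter gives typed values. This is only a sketch under assumptions; SCHEMA and build_from_text are illustrative names (the latter standing in for the lib.BuildFromText the question wished for), and the schema is written by hand rather than read from a schema file.

```python
from collections import namedtuple

# Field order and converters, written out by hand to mirror the schema above.
SCHEMA = [("id", int), ("name", str), ("email", str)]
Person = namedtuple("Person", [name for name, _ in SCHEMA])


def build_from_text(line):
    """Split one tab-separated line and convert each value per SCHEMA."""
    values = line.rstrip("\n").split("\t")
    return Person(*(convert(value) for (_, convert), value in zip(SCHEMA, values)))


person = build_from_text("12\ttom\ttom@baidu.com\n")
print(person.name, person.id + 1)
```

Here person.id is a real int, so comparisons and arithmetic work without further casting, and no positional index appears in the calling code.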
