Python - Searching file lines for multiple patterns efficiently

Python - Searching file lines for multiple patterns efficiently - python

I am working on parsing a number of large files and want to be sure I'm doing it as efficiently as possible. One of the lines I am parsing looks like this (a Windows Security Event Log 4624):
Security/Microsoft-Windows-Security-Auditing ID [4624] :EventData/Data -> SubjectUserSid = S-1-0-0 SubjectUserName = - SubjectDomainName = - SubjectLogonId = 0x0000000000000000 TargetUserSid = S-1-1-11-1111111111-1111111111-1111111111-1111 TargetUserName = johndoe TargetDomainName = TestDomain TargetLogonId = 0x0000000001111111 LogonType = 3 LogonProcessName = NtLmSsp AuthenticationPackageName = NTLM WorkstationName = TestWorkstation LogonGuid = {00000000-0000-0000-0000-000000000000} TransmittedServices = - LmPackageName = NTLM V2 KeyLength = 128 ProcessId = 0x0000000000000000 ProcessName = - IpAddress = 1.1.1.1 IpPort = 11111
What I want to know is, what is the most efficient way to pull multiple fields from that line? I could partition the line repeatedly until I reach each field I am interested in, but I feel like repeatedly looping over the line is a waste of time/resources.
Is there an intelligent way to look at the line only once but pull out, for example, the following fields:
LogonType = 3
TargetUserName = johndoe
TargetUserSid = S-1-1-11-1111111111-1111111111-1111111111-1111
As an example, what I could do is repeat the following process:
part = line.partition('TargetUserName = ')[2]
username = part.partition(' ')[0]
to get each field that I want (the above example getting me just the username), but again that feels inefficient to me.
Is there a better way to handle it?

Each field name is a group of uppercase and lowercase characters. They are separated from their values by =. Each value is a group of non-whitespace characters. You can use re.findall with matching groups to locate all "letters = nonwhitespace" instances. This will give you a list of tuples, which you can either save or iterate through and pass to a format string:
>>> s = '''Security/Microsoft-Windows-Security-Auditing ID [4624] :EventData/Data -> SubjectUserSid = S-1-0-0 SubjectUserName = - SubjectDomainName = - SubjectLogonId = 0x0000000000000000 TargetUserSid = S-1-1-11-1111111111-1111111111-1111111111-1111 TargetUserName = johndoe TargetDomainName = TestDomain TargetLogonId = 0x0000000001111111 LogonType = 3 LogonProcessName = NtLmSsp AuthenticationPackageName = NTLM WorkstationName = TestWorkstation LogonGuid = {00000000-0000-0000-0000-000000000000} TransmittedServices = - LmPackageName = NTLM V2 KeyLength = 128 ProcessId = 0x0000000000000000 ProcessName = - IpAddress = 1.1.1.1 IpPort = 11111 '''
>>> import re
>>> for item in re.findall(r'([A-Za-z]+) = (\S+)', s):
... print('{} = {}'.format(*item))
...
SubjectUserSid = S-1-0-0
SubjectUserName = -
SubjectDomainName = -
SubjectLogonId = 0x0000000000000000
TargetUserSid = S-1-1-11-1111111111-1111111111-1111111111-1111
TargetUserName = johndoe
TargetDomainName = TestDomain
TargetLogonId = 0x0000000001111111
LogonType = 3
LogonProcessName = NtLmSsp
AuthenticationPackageName = NTLM
WorkstationName = TestWorkstation
LogonGuid = {00000000-0000-0000-0000-000000000000}
TransmittedServices = -
LmPackageName = NTLM
KeyLength = 128
ProcessId = 0x0000000000000000
ProcessName = -
IpAddress = 1.1.1.1
IpPort = 11111
You can also turn it into a dictionary for easy access:
>>> d = dict(re.findall(r'([A-Za-z]+) = (\S+)', s))
>>> d['LogonType']
'3'

st = 'Security/Microsoft-Windows-Security-Auditing ID [4624] :EventData/Data -> SubjectUserSid = S-1-0-0 SubjectUserName = - SubjectDomainName = - SubjectLogonId = 0x0000000000000000 TargetUserSid = S-1-1-11-1111111111-1111111111-1111111111-1111 TargetUserName = johndoe TargetDomainName = TestDomain TargetLogonId = 0x0000000001111111 LogonType = 3 LogonProcessName = NtLmSsp AuthenticationPackageName = NTLM WorkstationName = TestWorkstation LogonGuid = {00000000-0000-0000-0000-000000000000} TransmittedServices = - LmPackageName = NTLM V2 KeyLength = 128 ProcessId = 0x0000000000000000 ProcessName = - IpAddress = 1.1.1.1 IpPort = 11111';
using re module and re.findall you can I think get want you want
import re
li = re.findall(r'LogonType\s*=\s*\d+|TargetUserName\s*=\s*\w+|TargetUserSid\s*=\s*\w-.*?\s',st,re.MULTILINE| re.DOTALL)
>>>li
['TargetUserSid = S-1-1-11-1111111111-1111111111-1111111111-1111 ', 'TargetUserName = johndoe', 'LogonType = 3']

Related

Parsing logs to json Python

Folks,
I am trying to parse log file into json format.
I have a lot of logs, there is one of them
How can I parse this?
03:02:03.113 [info] ext_ref = BANK24AOS_cl_reqmarketcreditorderstate_6M8I1NT8JKYD_1591844522410384_4SGA08M8KIXQ reqid = 1253166 type = INREQ channel = BANK24AOS sid = msid_1591844511335516_KRRNBSLH2FS duration = 703.991 req_uri = marketcredit/order/state login = 77012221122 req_type = cl_req req_headers = {"accept-encoding":"gzip","connection":"close","host":"test-mobileapp-api.bank.kz","user-agent":"okhttp/4.4.1","x-forwarded-for":"212.154.169.134","x-real-ip":"212.154.169.134"} req_body = {"$sid":"msid_1591844511335516_KRRNBSLH2FS","$sid":"msid_1591844511335516_KRRNBSLH2FS","app":"bank","app_version":"2.3.2","channel":"aos","colvir_token":"GExPR0lOX1BBU1NXT1JEX0NMRUFSVEVYVFNzrzh4Thk1+MjDKWl/dDu1fQPsJ6gGLSanBp41yLRv","colvir_commercial_id":"-1","colvir_id":"000120.335980","openway_commercial_id":"6247520","openway_id":"6196360","$lang":"ru","ekb_id":"923243","inn":"990830221722","login":"77012221122","bank24_id":"262"} resp_body = {"task_id":"","status":"success","data":{"state":"init","applications":[{"status":"init","id":"123db561-34a3-4a8d-9fa7-03ed6377b44f","name":"Sulpak","amount":101000,"items":[{"name":"Switch CISCO x24","price":100000,"count":1,"amount":100000}]}],"segment":{"range":{"min":6,"max":36,"step":1},"payment_day":{"max":28,"min":1}}}}
Into this type of json, or any other format (but I guess json is best one)
{
"time":"03:02:03.113",
"class_req":"info",
"ext_ref":"BANK24AOS_cl_reqmarketcreditorderstate_6M8I1NT8JKYD_1591844522410384_4SGA08M8KIXQ",
"reqid":"1253166",
"type":"INREQ",
"channel":"BANK24AOS",
"sid":"msid_1591844511335516_KRRNBSLH2FS",
"duration":"703.991",
"req_uri":"marketcredit/order/state",
"login":"77012221122",
"req_type":"cl_req",
"req_headers":{
"accept-encoding":"gzip",
"connection":"close",
"host":"test-mobileapp-api.bank.kz",
"user-agent":"okhttp/4.4.1",
"x-forwarded-for":"212.154.169.134",
"x-real-ip":"212.154.169.134"
},
"req_body":{
"$sid":"msid_1591844511335516_KRRNBSLH2FS",
"$sid":"msid_1591844511335516_KRRNBSLH2FS",
"app":"bank",
"app_version":"2.3.2",
"channel":"aos",
"colvir_token":"GExPR0lOX1BBU1NXT1JEX0NMRUFSVEVYVFNzrzh4Thk1+MjDKWl/dDu1fQPsJ6gGLSanBp41yLRv",
"colvir_commercial_id":"-1",
"colvir_id":"000120.335980",
"openway_commercial_id":"6247520",
"openway_id":"6196360",
"$lang":"ru",
"ekb_id":"923243",
"inn":"990830221722",
"login":"77012221122",
"bank24_id":"262"
},
"resp_body":{
"task_id":"",
"status":"success",
"data":{
"state":"init",
"applications":[
{
"status":"init",
"id":"123db561-34a3-4a8d-9fa7-03ed6377b44f",
"name":"Sulpak",
"amount":101000,
"items":[
{
"name":"Switch CISCO x24",
"price":100000,
"count":1,
"amount":100000
}
]
}
],
"segment":{
"range":{
"min":6,
"max":36,
"step":1
},
"payment_day":{
"max":28,
"min":1
}
}
}
}
}
I am trying to split first whole text, but there I met another problem is to match keys to values depending on '=' sign. Also there might be some keys with empty values. For ex.:
type = INREQ channel = sid = duration = 1.333 (to get to know that there is an empty value, you need to pay attention on number of spaces. Usually there is 1 space between prev.value and next key). So this example should look like this:
{
"type":"INREQ",
"channel":"",
"sid":"",
"duration":"1.333"
}
Thanks ahead!

Here, one thing pass for duplicate key about "$sid":"msid_1591844511335516_KRRNBSLH2FS"
import re
text = """03:02:03.113 [info] ext_ref = reqid = 1253166 type = INREQ channel = BANK24AOS sid = msid_1591844511335516_KRRNBSLH2FS duration = 703.991 req_uri = marketcredit/order/state login = 77012221122 req_type = cl_req req_headers = {"accept-encoding":"gzip","connection":"close","host":"test-mobileapp-api.bank.kz","user-agent":"okhttp/4.4.1","x-forwarded-for":"212.154.169.134","x-real-ip":"212.154.169.134"} req_body = {"$sid":"msid_1591844511335516_KRRNBSLH2FS","$sid":"msid_1591844511335516_KRRNBSLH2FS","app":"bank","app_version":"2.3.2","channel":"aos","colvir_token":"GExPR0lOX1BBU1NXT1JEX0NMRUFSVEVYVFNzrzh4Thk1+MjDKWl/dDu1fQPsJ6gGLSanBp41yLRv","colvir_commercial_id":"-1","colvir_id":"000120.335980","openway_commercial_id":"6247520","openway_id":"6196360","$lang":"ru","ekb_id":"923243","inn":"990830221722","login":"77012221122","bank24_id":"262"} resp_body = {"task_id":"","status":"success","data":{"state":"init","applications":[{"status":"init","id":"123db561-34a3-4a8d-9fa7-03ed6377b44f","name":"Sulpak","amount":101000,"items":[{"name":"Switch CISCO x24","price":100000,"count":1,"amount":100000}]}],"segment":{"range":{"min":6,"max":36,"step":1},"payment_day":{"max":28,"min":1}}}}"""
index1 = text.index('[')
index2 = text.index(']')
new_text = 'time = '+ text[:index1-1] + ' class_req = ' + text[index1+1:index2] + text[index2+2:]
lst = re.findall(r'\S+? = |\S+? = \{.*?\} |\S+? = \{.*?\}$|\S+? = \S+? ', new_text)
res = {}
for item in lst:
key, equal, value = item.partition('=')
key, value = key.strip(), value.strip()
if value.startswith('{'):
try:
value = json.loads(value)
except:
print(value)
res[key] = value

you can try regulation in python.
here is what i write, it works for your problem.
for convenience i deleted string before "ext_ref...",you can directly truncate the raw string.
import re
import json
string = 'ext_ref = BANK24AOS_cl_reqmarketcreditorderstate_6M8I1NT8JKYD_1591844522410384_4SGA08M8KIXQ reqid = 1253166 type = INREQ channel = BANK24AOS sid = msid_1591844511335516_KRRNBSLH2FS duration = 703.991 req_uri = marketcredit/order/state login = 77012221122 req_type = cl_req req_headers = {"accept-encoding":"gzip","connection":"close","host":"test-mobileapp-api.bank.kz","user-agent":"okhttp/4.4.1","x-forwarded-for":"212.154.169.134","x-real-ip":"212.154.169.134"} req_body = {"$sid":"msid_1591844511335516_KRRNBSLH2FS","$sid":"msid_1591844511335516_KRRNBSLH2FS","app":"bank","app_version":"2.3.2","channel":"aos","colvir_token":"GExPR0lOX1BBU1NXT1JEX0NMRUFSVEVYVFNzrzh4Thk1+MjDKWl/dDu1fQPsJ6gGLSanBp41yLRv","colvir_commercial_id":"-1","colvir_id":"000120.335980","openway_commercial_id":"6247520","openway_id":"6196360","$lang":"ru","ekb_id":"923243","inn":"990830221722","login":"77012221122","bank24_id":"262"} resp_body = {"task_id":"","status":"success","data":{"state":"init","applications":[{"status":"init","id":"123db561-34a3-4a8d-9fa7-03ed6377b44f","name":"Sulpak","amount":101000,"items":[{"name":"Switch CISCO x24","price":100000,"count":1,"amount":100000}]}],"segment":{"range":{"min":6,"max":36,"step":1},"payment_day":{"max":28,"min":1}}}}'
position = re.search("req_headers",string) # position of req_headers
resp_body_pos = re.search("resp_body",string)
resp_body = string[resp_body_pos.span()[0]:]
res1 = {}
res1.setdefault(resp_body.split("=")[0],resp_body.split("=")[1])
print(res1)
before = string[:position.span()[0]]
after = string[position.span()[0]:resp_body_pos.span()[0]] # handle req_body seperately
res2 = re.findall("(\S+) = (\S+)",before)
print(res2)
res3 = re.findall("(\S+) = ({.*?})",after)
print(res3)
#res1 type: dict{'resp_body':'...'} content in resp_body
#res2 type: list[(),()..] content before req_head
#res3 type: list[(),()..] the rest content
and now you can do what you want to do with the data(.e.g. transform it into json respectively)
Hope this is helpful

Proper way to format date for Fedex API XML

I have a Django application where I am trying to make a call to Fedex's API to send out a shipping label for people wanting to send in a product for cash. When I try to make the call though it says there is a data validation issue with the Expiration field in the XML I am filling out. I swear this has worked in the past with me formatting the date as "YYYY-MM-DD", but now it is not. I read that with Fedex, you need to format the date as ISO, but that is also not passing the data validation. I am using a python package created to help with tapping Fedex's API.
Django view function for sending API Call
def Fedex(request, quote):
label_link = ''
expiration_date = datetime.datetime.now() + datetime.timedelta(days=10)
# formatted_date = "%s-%s-%s" % (expiration_date.year, expiration_date.month, expiration_date.day)
formatted_date = expiration_date.replace(microsecond=0).isoformat()
if quote.device_type != 'laptop':
box_length = 9
box_width = 12
box_height = 3
else:
box_length = 12
box_width = 14
box_height = 3
logging.basicConfig(stream=sys.stdout, level=logging.INFO)
## Page 411 of FedEx Dev Guide - 20.14 Email Labels
CONFIG_OBJ = FedexConfig(key=settings.FEDEX_KEY, password=settings.FEDEX_PASSWORD, account_number=settings.FEDEX_ACCOUNT,
meter_number=settings.FEDEX_METER, use_test_server=settings.USE_FEDEX_TEST)
fxreq = FedexCreatePendingShipRequestEmail(CONFIG_OBJ, customer_transaction_id='xxxxxx id:01')
fxreq.RequestedShipment.ServiceType = 'FEDEX_GROUND'
fxreq.RequestedShipment.PackagingType = 'YOUR_PACKAGING'
fxreq.RequestedShipment.DropoffType = 'REGULAR_PICKUP'
fxreq.RequestedShipment.ShipTimestamp = datetime.datetime.now()
# Special fields for the email label
fxreq.RequestedShipment.SpecialServicesRequested.SpecialServiceTypes = ('RETURN_SHIPMENT', 'PENDING_SHIPMENT')
fxreq.RequestedShipment.SpecialServicesRequested.PendingShipmentDetail.Type = 'EMAIL'
fxreq.RequestedShipment.SpecialServicesRequested.PendingShipmentDetail.ExpirationDate = formatted_date
email_address = fxreq.create_wsdl_object_of_type('EMailRecipient')
email_address.EmailAddress = quote.email
email_address.Role = 'SHIPMENT_COMPLETOR'
# RETURN SHIPMENT DETAIL
fxreq.RequestedShipment.SpecialServicesRequested.ReturnShipmentDetail.ReturnType = ('PENDING')
fxreq.RequestedShipment.SpecialServicesRequested.ReturnShipmentDetail.ReturnEMailDetail = fxreq.create_wsdl_object_of_type(
'ReturnEMailDetail')
fxreq.RequestedShipment.SpecialServicesRequested.ReturnShipmentDetail.ReturnEMailDetail.MerchantPhoneNumber = 'x-xxx-xxx-xxxx'
fxreq.RequestedShipment.SpecialServicesRequested.PendingShipmentDetail.EmailLabelDetail.Recipients = [email_address]
fxreq.RequestedShipment.SpecialServicesRequested.PendingShipmentDetail.EmailLabelDetail.Message = "Xxxxxx Xxxxxx"
fxreq.RequestedShipment.LabelSpecification = {'LabelFormatType': 'COMMON2D', 'ImageType': 'PDF'}
fxreq.RequestedShipment.Shipper.Contact.PersonName = quote.first_name + ' ' + quote.last_name
fxreq.RequestedShipment.Shipper.Contact.CompanyName = ""
fxreq.RequestedShipment.Shipper.Contact.PhoneNumber = quote.phone
fxreq.RequestedShipment.Shipper.Address.StreetLines.append(quote.address)
fxreq.RequestedShipment.Shipper.Address.City = quote.city
fxreq.RequestedShipment.Shipper.Address.StateOrProvinceCode = quote.state
fxreq.RequestedShipment.Shipper.Address.PostalCode = quote.zip
fxreq.RequestedShipment.Shipper.Address.CountryCode = settings.FEDEX_COUNTRY_CODE
fxreq.RequestedShipment.Recipient.Contact.PhoneNumber = settings.FEDEX_PHONE_NUMBER
fxreq.RequestedShipment.Recipient.Address.StreetLines = settings.FEDEX_STREET_LINES
fxreq.RequestedShipment.Recipient.Address.City = settings.FEDEX_CITY
fxreq.RequestedShipment.Recipient.Address.StateOrProvinceCode = settings.FEDEX_STATE_OR_PROVINCE_CODE
fxreq.RequestedShipment.Recipient.Address.PostalCode = settings.FEDEX_POSTAL_CODE
fxreq.RequestedShipment.Recipient.Address.CountryCode = settings.FEDEX_COUNTRY_CODE
fxreq.RequestedShipment.Recipient.AccountNumber = settings.FEDEX_ACCOUNT
fxreq.RequestedShipment.Recipient.Contact.PersonName = ''
fxreq.RequestedShipment.Recipient.Contact.CompanyName = 'Xxxxxx Xxxxxx'
fxreq.RequestedShipment.Recipient.Contact.EMailAddress = 'xxxxxx#xxxxxxxxx'
# Details of Person Who is Paying for the Shipping
fxreq.RequestedShipment.ShippingChargesPayment.PaymentType = 'SENDER'
fxreq.RequestedShipment.ShippingChargesPayment.Payor.ResponsibleParty.AccountNumber = settings.FEDEX_ACCOUNT
fxreq.RequestedShipment.ShippingChargesPayment.Payor.ResponsibleParty.Contact.PersonName = 'Xxxxx Xxxxx'
fxreq.RequestedShipment.ShippingChargesPayment.Payor.ResponsibleParty.Contact.CompanyName = 'Xxxxx Xxxxxx'
fxreq.RequestedShipment.ShippingChargesPayment.Payor.ResponsibleParty.Contact.PhoneNumber = 'x-xxx-xxx-xxxx'
fxreq.RequestedShipment.ShippingChargesPayment.Payor.ResponsibleParty.Contact.EMailAddress = 'xxxxxxx#xxxxxxxxx'
fxreq.RequestedShipment.ShippingChargesPayment.Payor.ResponsibleParty.Address.StreetLines = 'Xxxxx N. xXxxxxx'
fxreq.RequestedShipment.ShippingChargesPayment.Payor.ResponsibleParty.Address.City = 'Xxxxxxx'
fxreq.RequestedShipment.ShippingChargesPayment.Payor.ResponsibleParty.Address.StateOrProvinceCode = 'XX'
fxreq.RequestedShipment.ShippingChargesPayment.Payor.ResponsibleParty.Address.PostalCode = 'xxxxx'
fxreq.RequestedShipment.ShippingChargesPayment.Payor.ResponsibleParty.Address.CountryCode = 'US'
# Package Info
package1 = fxreq.create_wsdl_object_of_type('RequestedPackageLineItem')
package1.SequenceNumber = '1'
package1.Weight.Value = 1
package1.Weight.Units = "LB"
package1.Dimensions.Length = box_length
package1.Dimensions.Width = box_width
package1.Dimensions.Height = box_height
package1.Dimensions.Units = "IN"
package1.ItemDescription = 'Phone'
fxreq.RequestedShipment.RequestedPackageLineItems.append(package1)
fxreq.RequestedShipment.PackageCount = '1'
try:
fxreq.send_request()
label_link = str(fxreq.response.CompletedShipmentDetail.AccessDetail.AccessorDetails[0].EmailLabelUrl)
except Exception as exc:
print('Fedex Error')
print('===========')
print(exc)
print('==========')
return label_link
Error Log
Error:cvc-datatype-valid.1.2.1: \\'2017-11-3\\' is not a valid value for \\'date\\'.\\ncvc-type.3.1.3: The value \\'2017-11-3\\' of element \\'ns0:ExpirationDate\\' is not valid."\\n }\\n }' (Error code: -1)

How to send long message using smpp.twisted python library

How to send long message using smpp.twisted python library.
I know how to send short message before 160 bytes.
For example:
ESME_NUM = '9090'
phone = '123456'
short_message = 'There is not short message.'*15
submit_pdu = SubmitSM(
source_addr=ESME_NUM,
destination_addr=phone,
short_message=short_message,
source_addr_ton=SOURCE_ADDR_TON,
dest_addr_ton=DEST_ADDR_TON,
dest_addr_npi=DEST_ADDR_NPI,
esm_class=EsmClass(EsmClassMode.DEFAULT, EsmClassType.DEFAULT),
protocol_id=0,
registered_delivery=RegisteredDelivery(
RegisteredDeliveryReceipt.SMSC_DELIVERY_RECEIPT_REQUESTED),
replace_if_present_flag=ReplaceIfPresentFlag.DO_NOT_REPLACE,
data_coding=DataCoding(DataCodingScheme.DEFAULT, DataCodingDefault.UCS2),
)
submitSMDeferred = smpp.sendDataRequest(submit_pdu)
If short_message > 160 b I cannot send message.
I found solution with using message_payload, but sms split on parts.
length = len(short_message)
splitat = 160
parts = length/splitat +1
submit_pdu = None
submitSMDeferred = defer.Deferred()
if length > splitat:
for k in range(parts):
msgpart = short_message[k*splitat:k*splitat+splitat]
self.logger.info('%s - %s - %s' % (msgpart, parts, k))
submit_pdu = SubmitSM(
source_addr=self.ESME_NUM,
destination_addr=source_addr,
source_addr_ton=self.SOURCE_ADDR_TON,
dest_addr_ton=self.DEST_ADDR_TON,
dest_addr_npi=self.DEST_ADDR_NPI,
esm_class=EsmClass(EsmClassMode.DEFAULT, EsmClassType.DEFAULT),
protocol_id=0,
registered_delivery=RegisteredDelivery(
RegisteredDeliveryReceipt.SMSC_DELIVERY_RECEIPT_REQUESTED),
replace_if_present_flag=ReplaceIfPresentFlag.DO_NOT_REPLACE,
data_coding=DataCoding(DataCodingScheme.DEFAULT, DataCodingDefault.UCS2),
sar_msg_ref_num = 1,
sar_total_segments = parts,
sar_segment_seqnum = k+1,
message_payload=msgpart
)
submitSMDeferred = smpp.sendDataRequest(submit_pdu)

I'm sorry. You just can use message_payload instead short_message.
Example:
ESME_NUM = '9090'
phone = '123456'
short_message = 'There is not short message.'*15
submit_pdu = SubmitSM(
source_addr=ESME_NUM,
destination_addr=phone,
message_payload=short_message,
source_addr_ton=SOURCE_ADDR_TON,
dest_addr_ton=DEST_ADDR_TON,
dest_addr_npi=DEST_ADDR_NPI,
esm_class=EsmClass(EsmClassMode.DEFAULT, EsmClassType.DEFAULT),
protocol_id=0,
registered_delivery=RegisteredDelivery(
RegisteredDeliveryReceipt.SMSC_DELIVERY_RECEIPT_REQUESTED),
replace_if_present_flag=ReplaceIfPresentFlag.DO_NOT_REPLACE,
data_coding=DataCoding(DataCodingScheme.DEFAULT, DataCodingDefault.UCS2),
)
smpp.sendDataRequest(submit_pdu).addBoth(message_sent)
Stay it here.

convert contents of metadata file into variables list

Hi I m wanting to convert the contents of a file (in this case a Landsat 7 metadata file) into a series of variables defined by the contents of the file using Python 2.7. The file contents looks like this:
GROUP = L1_METADATA_FILE
GROUP = METADATA_FILE_INFO
ORIGIN = "Image courtesy of the U.S. Geological Survey"
REQUEST_ID = "0101305309253_00043"
LANDSAT_SCENE_ID = "LE71460402010069SGS00"
FILE_DATE = 2013-06-02T11:19:59Z
STATION_ID = "SGS"
PROCESSING_SOFTWARE_VERSION = "LPGS_12.2.1"
DATA_CATEGORY = "NOMINAL"
END_GROUP = METADATA_FILE_INFO
GROUP = PRODUCT_METADATA
DATA_TYPE = "L1T"
ELEVATION_SOURCE = "GLS2000"
OUTPUT_FORMAT = "GEOTIFF"
EPHEMERIS_TYPE = "DEFINITIVE"
SPACECRAFT_ID = "LANDSAT_7"
SENSOR_ID = "ETM"
SENSOR_MODE = "BUMPER"
WRS_PATH = 146
WRS_ROW = 040
DATE_ACQUIRED = 2010-03-10
GROUP = IMAGE_ATTRIBUTES
CLOUD_COVER = 0.00
IMAGE_QUALITY = 9
SUN_AZIMUTH = 137.38394502
SUN_ELEVATION = 48.01114126
GROUND_CONTROL_POINTS_MODEL = 55
GEOMETRIC_RMSE_MODEL = 3.790
GEOMETRIC_RMSE_MODEL_Y = 2.776
GEOMETRIC_RMSE_MODEL_X = 2.580
END_GROUP = IMAGE_ATTRIBUTES
Example of interested variable items:
GROUP = MIN_MAX_RADIANCE
RADIANCE_MAXIMUM_BAND_1 = 293.700
RADIANCE_MINIMUM_BAND_1 = -6.200
RADIANCE_MAXIMUM_BAND_2 = 300.900
RADIANCE_MINIMUM_BAND_2 = -6.400
RADIANCE_MAXIMUM_BAND_3 = 234.400
RADIANCE_MINIMUM_BAND_3 = -5.000
RADIANCE_MAXIMUM_BAND_4 = 241.100
RADIANCE_MINIMUM_BAND_4 = -5.100
RADIANCE_MAXIMUM_BAND_5 = 47.570
RADIANCE_MINIMUM_BAND_5 = -1.000
RADIANCE_MAXIMUM_BAND_6_VCID_1 = 17.040
RADIANCE_MINIMUM_BAND_6_VCID_1 = 0.000
RADIANCE_MAXIMUM_BAND_6_VCID_2 = 12.650
RADIANCE_MINIMUM_BAND_6_VCID_2 = 3.200
RADIANCE_MAXIMUM_BAND_7 = 16.540
RADIANCE_MINIMUM_BAND_7 = -0.350
RADIANCE_MAXIMUM_BAND_8 = 243.100
RADIANCE_MINIMUM_BAND_8 = -4.700
END_GROUP = MIN_MAX_RADIANCE
I am open to other ideas as I don't need all entries as variables, just a selection. And I see some headers are listed more than once. i.e. GROUP is used multiple times. I need to be able to select certain variables (integer values) and use in formulas in other areas of code. ANY help would be appreciated (novice python coder).

I'm not sure exactly what you are looking for, but maybe something like this:
s = '''GROUP = L1_METADATA_FILE
GROUP = METADATA_FILE_INFO
ORIGIN = "Image courtesy of the U.S. Geological Survey"
REQUEST_ID = "0101305309253_00043"
LANDSAT_SCENE_ID = "LE71460402010069SGS00"
FILE_DATE = 2013-06-02T11:19:59Z
STATION_ID = "SGS"
PROCESSING_SOFTWARE_VERSION = "LPGS_12.2.1"
DATA_CATEGORY = "NOMINAL"
END_GROUP = METADATA_FILE_INFO
GROUP = PRODUCT_METADATA
DATA_TYPE = "L1T"
ELEVATION_SOURCE = "GLS2000"
OUTPUT_FORMAT = "GEOTIFF"
EPHEMERIS_TYPE = "DEFINITIVE"
SPACECRAFT_ID = "LANDSAT_7"
SENSOR_ID = "ETM"
SENSOR_MODE = "BUMPER"
WRS_PATH = 146
WRS_ROW = 040
DATE_ACQUIRED = 2010-03-10'''
output = {} #Dict
for line in s.split("\n"): #Iterates through every line in the string
l = line.split("=") #Seperate by "=" and put into a list
output[l[0].strip()] = l[1].strip() #First word is key, second word is value
print output #Output is a dictonary containing all key-value pairs in your metadata seperated by "="
print output["SENSOR_ID"] #Outputs "ETM"
==============
Edited:
f = open('metadata.txt', 'r') #open file for reading
def build_data(f): #build dictionary
output = {} #Dict
for line in f.readlines(): #Iterates through every line in the string
if "=" in line: #make sure line has data as wanted
l = line.split("=") #Seperate by "=" and put into a list
output[l[0].strip()] = l[1].strip() #First word is key, second word is value
return output #Returns a dictionary with the key, value pairs.
data = build_data(f)
print data["IMAGE_QUALITY"] #prints 9

An strange python "if" syntax error

I get this error: Invaild syntax in my "if" statement and rly can't figur why, can anyone of you guys help me? I'm using python 3.2
here is the part of my code whit the error my code:
L = list()
LT = list()
tn = 0
players = 0
newplayer = 0
newplayerip = ""
gt = "start"
demsg = "start"
time = 1
status = 0
day = 1
conclient = 1
print("DONE! The UDP Server is now started and Waiting for client's on port 5000")
while 1:
try:
data, address = server_socket.recvfrom(1024)
if not data: break
################### reciving data! ###################
UPData = pickle.loads(data)
status = UPData[0][[0][0]
if status > 998: ##### it is here the error are given####
try:
e = len(L)
ori11 = UPData[0][1][0]
ori12 = UPData[0][1][1]
ori13 = UPData[0][1][2]
ori14 = UPData[0][1][3]
ori21 = UPData[0][1][4]
ori22 = UPData[0][1][5]
ori23 = UPData[0][1][6]
ori24 = UPData[0][1][7]
ori31 = UPData[0][2][0]
ori32 = UPData[0][2][1]
ori33 = UPData[0][2][2]
ori34 = UPData[0][2][3]
ori41 = UPData[0][2][4]
ori42 = UPData[0][2][5]
ori43 = UPData[0][2][6]
ori44 = UPData[0][2][7]
ori51 = UPData[0][3][0]
ori52 = UPData[0][3][1]
ori53 = UPData[0][3][2]
ori54 = UPData[0][3][3]
ori61 = UPData[0][3][4]
ori62 = UPData[0][3][5]
ori63 = UPData[0][3][6]
ori64 = UPData[0][3][7]
ori71 = UPData[0][4][0]
ori72 = UPData[0][4][1]
ori73 = UPData[0][4][2]
ori74 = UPData[0][4][3]
ori81 = UPData[0][4][4]
ori82 = UPData[0][4][5]
ori83 = UPData[0][4][6]
ori84 = UPData[0][4][7]
ori91 = UPData[0][5][0]
ori92 = UPData[0][5][1]
ori93 = UPData[0][5][2]
ori94 = UPData[0][5][3]
ori101 = UPData[0][5][4]
ori102 = UPData[0][5][5]
ori103 = UPData[0][5][6]
ori104 = UPData[0][5][7]
npcp11 = UPData[0][6][0]
npcp12 = UPData[0][6][1]
npcp13 = UPData[0][6][2]
npcp21 = UPData[0][6][3]
npcp22 = UPData[0][6][4]
npcp23 = UPData[0][6][5]
npcp31 = UPData[0][6][6]
npcp32 = UPData[0][6][7]
npcp33 = UPData[0][7][0]
npcp41 = UPData[0][7][1]
npcp42 = UPData[0][7][2]
npcp43 = UPData[0][7][3]
npcp51 = UPData[0][7][4]
npcp52 = UPData[0][7][5]
npcp53 = UPData[0][7][6]
npcp61 = UPData[0][7][7]
npcp62 = UPData[0][8][0]
npcp63 = UPData[0][8][1]
npcp71 = UPData[0][8][2]
npcp72 = UPData[0][8][3]
npcp73 = UPData[0][8][4]
npcp81 = UPData[0][8][5]
npcp82 = UPData[0][8][6]
npcp83 = UPData[0][8][7]
npcp91 = UPData[1][0][0]
npcp92 = UPData[1][0][1]
npcp93 = UPData[1][0][2]
npcp101 = UPData[1][0][3]
npcp102 = UPData[1][0][4]
npcp103 = UPData[1][0][5]
d0 = (status, )
d1 = (ori11,ori12,ori13,ori14,ori21,ori22,ori23,ori24)
d2 = (ori31,ori32,ori33,ori34,ori41,ori42,ori43,ori44)
d3 = (ori51,ori52,ori53,ori54,ori61,ori62,ori63,ori64)
d4 = (ori71,ori72,ori73,ori74,ori81,ori82,ori83,ori84)
d5 = (ori91,ori92,ori93,ori94,ori101,ori102,ori103,ori104)
d6 = (npcp11,npcp21,npcp31,npcp21,npcp22,npcp23,npcp31,npcp32)
d7 = (npcp33,npcp41,npcp42,npcp43,npcp51,npcp52,npcp53,npcp61)
d8 = (npcp62,npcp63,npcp71,npcp72,npcp72,npcp81,npcp82,npcp83)
d9 = (npcp91,npcp92,npcp93,npcp101,npcp102,npcp103)
pack1 = (d0,d1,d2,d3,d4,d5,d6,d7,d8)
pack2 = (d9, )
dat = pickle.dumps((pack1,pack2))
while tn < e:
server_socket.sendto(dat, (L[tn],3560))
tn = tn + 1
except:
pass
print("could not send data to some one or could not run the server at all")
else:
the part where the console tells me my error is is here:
if status > 998:

The problem is here:
status = UPData[0][[0][0]
The second opened bracket [ is not closed. The Python compiler keeps looking for the closing bracket, finds if on the next line and gets confused because if is not supposed to be inside brackets.
You may want to remove this bracket, or close it, according to your specific needs (the structure of UPData)

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.

Python - Searching file lines for multiple patterns efficiently - python

Related

Parsing logs to json Python

Proper way to format date for Fedex API XML

How to send long message using smpp.twisted python library

convert contents of metadata file into variables list

An strange python "if" syntax error

Categories

Resources