How to parse text output separated with space

The first line holds the field names and the remaining lines hold the values. Where a row has no data for a field, the column is padded with spaces.
In particular, bindings has no value in SHORTNAMES or APIGROUP, and pods has no value in APIGROUP.
$ kubectl api-resources
NAME          SHORTNAMES   APIGROUP   NAMESPACED   KIND
bindings                              true         Binding
pods          po                      true         Pod
deployments   deploy       apps       true         Deployment
Ultimately, I would like to treat the output as Python dicts whose keys are the field names.
As a first step, I think I need to replace each blank, space-padded value with a dummy value using a regex:
NAME          SHORTNAMES   APIGROUP   NAMESPACED   KIND
bindings      no-value     no-value   true         Binding
Is it possible?

Here is a solution with regex.
import re
data = """NAME          SHORTNAMES   APIGROUP   NAMESPACED   KIND
bindings                              true         Binding
pods          po                      true         Pod
deployments   deploy       apps       true         Deployment"""
# Raw strings keep the backslashes in \S and \s intact.
regex = re.compile(
    r"(?P<name>\S+)\s+"
    r"(?P<shortname>\S+)\s+"
    r"(?P<group>\S+)\s+"
    r"(?P<namespace>\S+)\s+"
    r"(?P<kind>\S+)"
)
# Match the header to learn the offset at which each column starts.
header = data.splitlines()[0]
for match in regex.finditer(header):
    name_index = match.start('name')
    shortname_index = match.start('shortname')
    group_index = match.start('group')
    namespace_index = match.start('namespace')
    kind_index = match.start('kind')
def get_text(line, index):
    # Read characters from the column offset up to the next space.
    result = ''
    for char in line[index:]:
        if char == ' ':
            break
        result += char
    if result:
        return result
    else:
        return "no-value"
resources = []
for line in data.splitlines()[1:]:
    resources.append({
        "name": get_text(line, name_index),
        "shortname": get_text(line, shortname_index),
        "group": get_text(line, group_index),
        "namespace": get_text(line, namespace_index),
        "kind": get_text(line, kind_index)
    })
print(resources)
And the output is (formatted):
[
    {
        'name': 'bindings',
        'shortname': 'no-value',
        'group': 'no-value',
        'namespace': 'true',
        'kind': 'Binding'
    },
    {
        'name': 'pods',
        'shortname': 'po',
        'group': 'no-value',
        'namespace': 'true',
        'kind': 'Pod'
    },
    {
        'name': 'deployments',
        'shortname': 'deploy',
        'group': 'apps',
        'namespace': 'true',
        'kind': 'Deployment'
    }
]
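The same idea can be generalized so that the column names are not hard-coded. This is only a minimal sketch, assuming every row lines up with the header columns; parse_fixed_width and its placeholder parameter are names made up for this example. The sample is built with str.format so the column widths are explicit:

```python
import re


def parse_fixed_width(text, placeholder="no-value"):
    """Parse column-aligned text into a list of dicts keyed by header name."""
    lines = text.splitlines()
    # Every header word marks the offset where its column starts.
    columns = [(m.group().lower(), m.start()) for m in re.finditer(r"\S+", lines[0])]
    starts = [start for _, start in columns]
    # A column ends where the next one starts; the last runs to end of line.
    ends = starts[1:] + [None]
    rows = []
    for line in lines[1:]:
        if not line.strip():
            continue  # skip blank lines
        rows.append({
            name: line[start:end].strip() or placeholder
            for (name, start), end in zip(columns, ends)
        })
    return rows


# Fixed-width sample mirroring the kubectl output above.
row_format = "{:<14}{:<13}{:<11}{:<13}{}"
sample = "\n".join(row_format.format(*cells) for cells in [
    ("NAME", "SHORTNAMES", "APIGROUP", "NAMESPACED", "KIND"),
    ("bindings", "", "", "true", "Binding"),
    ("pods", "po", "", "true", "Pod"),
    ("deployments", "deploy", "apps", "true", "Deployment"),
])

resources = parse_fixed_width(sample)
print(resources[0])
```

Slicing between column offsets, rather than reading up to the first space, also keeps values that contain spaces intact, as long as no value overflows its column.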

Related

How to build a nested dictionary of varying depth using for loop?

I have a Pandas table with thousands of rows, where the number of leading spaces in a row's parameter determines whether it is a substructure of the row above.
Parameter | Value
'country' 'Germany'
' city' 'Berlin'
'  area' 'A1'
' city' 'Munchen'
'  comment' 'a comment'
'country' 'France'
' city' 'Paris'
'  comment' 'a comment'
'state' 'California'
' comment' '123'
I also know which parameters are lists:
{
    'country': list,
    'city': list,
    'state': list
}
I want to create the following nested structure:
{
    "country": [
        {
            "Germany": {
                "city": [
                    {
                        "Berlin": {
                            "area": "A1"
                        }
                    },
                    {
                        "Munchen": {
                            "comment": "a comment"
                        }
                    }
                ]
            }
        },
        {
            "France": {
                "city": [
                    {
                        "Paris": {
                            "comment": "a comment"
                        }
                    }
                ]
            }
        }
    ],
    "state": [
        {
            "California": {
                "comment": 123
            }
        }
    ]
}
Since the level of each substructure depends only on the row before it, I thought a for loop would work well. But I am clearly missing something fundamental about building nested dictionaries in a for loop. A recursive solution would be fine too, but I am unsure whether it would be any easier here.
This is my current attempt, which is obviously a mess.
import itertools
import pandas as pd
params = ['country',' city','  area',' city','  comment','country',' city','  comment','state',' comment']
vals = ['Germany','Berlin','A1','Munchen','a comment','France','Paris','a comment','California','123']
conf = {'country':'list','city':'list'}
df = pd.DataFrame()
df['param'] = params
df['vals'] = vals
output_dict = dict()
level_path = dict()
for param, vals in df.values:
    d = output_dict
    hierarchy_level = sum(1 for _ in itertools.takewhile(str.isspace, param))  ## Count number of left most spaces
    param = param.lstrip()
    if hierarchy_level > 0:
        base_path = level_path[str(hierarchy_level-1)]
    else:
        base_path = []
    path = base_path + [param]
    for p in path:
        if p in conf:  ## It should be a list
            d.setdefault(p, [{}])
            d = d[p][-1]  ## How to understand if I should push a new list element or enter an existing one?
        else:
            d.setdefault(p, {})
            d = d[p]
    d[param] = vals
    level_path[str(hierarchy_level)] = path
and the output is:
{'country': [{'country': 'France',
              'city': [{'city': 'Paris',
                        'area': {'area': 'A1'},
                        'comment': {'comment': 'a comment'}}]}],
 'state': {'state': 'California', 'comment': {'comment': '123'}}}
I don't understand how to step in and out of the list elements inside the for loop, or how to tell whether I should push a new dictionary or enter an existing one.
Any input on what I am missing would be appreciated.
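One possible way to handle the stepping in and out is to keep a stack of the currently open containers, one per indentation level. This is a minimal sketch, assuming one leading space per nesting level (as in the attempt above) and that 'state' is also marked as a list; build_nested and list_params are names invented for this example:

```python
def build_nested(rows, list_params):
    """Build a nested dict from (param, value) rows indented by leading spaces."""
    root = {}
    # Stack of (level, container); the root sits below every real level.
    stack = [(-1, root)]
    for param, value in rows:
        name = param.lstrip(" ")
        level = len(param) - len(name)  # number of leading spaces
        # Step out of every container at the same or a deeper level.
        while stack[-1][0] >= level:
            stack.pop()
        parent = stack[-1][1]
        if name in list_params:
            # Push a new list element and step into it.
            child = {}
            parent.setdefault(name, []).append({value: child})
            stack.append((level, child))
        else:
            # Plain key/value pair: nothing to step into.
            parent[name] = value
    return root


rows = [
    ("country", "Germany"),
    (" city", "Berlin"),
    ("  area", "A1"),
    (" city", "Munchen"),
    ("  comment", "a comment"),
    ("country", "France"),
    (" city", "Paris"),
    ("  comment", "a comment"),
    ("state", "California"),
    (" comment", "123"),
]
nested = build_nested(rows, list_params={"country", "city", "state"})
print(nested["country"][0]["Germany"]["city"][0])
```

Because each row only compares its own level with the top of the stack, a repeated list parameter at the same level always appends a new element, while deeper rows land in the most recently opened one. Values are kept as strings here, so "123" is not converted to a number.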

Search for dictionary value and output "address" in python

I have a list of dictionaries whose values can themselves be lists of dictionaries, and so on.
filesystem = [
    {
        "type":"dir",
        "name":"examples",
        "ext":"",
        "contents":[
            {
                "type":"file",
                "name":"text_document",
                "ext":"txt",
                "contents":"This is a text document.\nIt has 2 lines!"
            }
        ]
    },
    {
        "type":"file",
        "name":"helloworld",
        "ext":"py",
        "contents":"print(\"Hello world\")"
    }
]
I need a way to search for a dictionary. For example, to get the examples folder, I want to write a path such as /examples and find the directory that the path points to. This needs to work with nested directories as well.
I have tried to match against a dictionary using wildcards:
target = {
    "type":"dir",
    "name":currentSearchDir,
    "ext":"",
    "contents":*
}
if currentSearch == target:
    print("found")
but, of course, it doesn't work.
Thanks.

Here is a recursive search:
filesystem = [
    {
        "type":"dir",
        "name":"examples",
        "ext":"",
        "contents":[
            {
                "type":"file",
                "name":"text_document",
                "ext":"txt",
                "contents":"This is a text document.\nIt has 2 lines!"
            }
        ]
    },
    {
        "type":"file",
        "name":"helloworld",
        "ext":"py",
        "contents":"print(\"Hello world\")"
    }
]
def search(data, name):
    for entry in data:
        if entry['name'] == name:
            return entry
        if isinstance(entry['contents'], list):
            sub = search(entry['contents'], name)
            if sub:
                return sub
    return None
print(search(filesystem, "examples"))
print(search(filesystem, "text_document"))
Output:
{'type': 'dir', 'name': 'examples', 'ext': '', 'contents': [{'type': 'file', 'name': 'text_document', 'ext': 'txt', 'contents': 'This is a text document.\nIt has 2 lines!'}]}
{'type': 'file', 'name': 'text_document', 'ext': 'txt', 'contents': 'This is a text document.\nIt has 2 lines!'}
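The question asks for lookup by path rather than by name alone. Here is a minimal sketch of that variant, assuming each path segment matches the "name" field exactly (without the extension); find_by_path and the shortened sample data are made up for this example:

```python
def find_by_path(entries, path):
    """Follow a /-separated path through nested "contents" lists."""
    current = None
    # Empty segments (leading or doubled slashes) are ignored.
    for part in (segment for segment in path.split("/") if segment):
        match = next((entry for entry in entries if entry["name"] == part), None)
        if match is None:
            return None  # no entry with that name at this level
        current = match
        # Only directories have a list of children to descend into.
        contents = match["contents"]
        entries = contents if isinstance(contents, list) else []
    return current


filesystem = [
    {
        "type": "dir",
        "name": "examples",
        "ext": "",
        "contents": [
            {"type": "file", "name": "text_document", "ext": "txt", "contents": "text"},
        ],
    },
    {"type": "file", "name": "helloworld", "ext": "py", "contents": "print(1)"},
]

print(find_by_path(filesystem, "/examples")["type"])
print(find_by_path(filesystem, "/examples/text_document")["ext"])
```

Unlike the recursive search, this only looks at one level per path segment, so two entries with the same name in different directories are not confused.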

Google DLP: "ValueError: Protocol message Value has no "stringValue" field."

I have a method that builds a table of multiple items for Google's DLP inspect API, which accepts either a ContentItem or a table of values.
Here is how the request is constructed:
import google.cloud.dlp
def redact_text(text_list):
    dlp = google.cloud.dlp.DlpServiceClient()
    project = 'my-project'
    parent = dlp.project_path(project)
    items = build_item_table(text_list)
    info_types = [{'name': 'EMAIL_ADDRESS'}, {'name': 'PHONE_NUMBER'}]
    inspect_config = {
        'min_likelihood': "LIKELIHOOD_UNSPECIFIED",
        'include_quote': True,
        'info_types': info_types
    }
    response = dlp.inspect_content(parent, inspect_config, items)
    return response
def build_item_table(text_list):
    rows = []
    for item in text_list:
        row = {"values": [{"stringValue": item}]}
        rows.append(row)
    table = {"table": {"headers": [{"name": "something"}], "rows": rows}}
    return table
When I run this I get the error ValueError: Protocol message Value has no "stringValue" field, even though this example and the docs say otherwise.
Is there something off in how I build the request?
Edit: Here's the output from build_item_table:
{
    'table':
    {
        'headers':
        [
            {'name': 'value'}
        ],
        'rows':
        [
            {
                'values':
                [
                    {
                        'stringValue': 'My name is Jenny and my number is (555) 867-5309, you can also email me at anemail#gmail.com, another email you can reach me at is email#email.com. '
                    }
                ]
            },
            {
                'values':
                [
                    {
                        'stringValue': 'Jimbob Doe (555) 111-1233, that one place down the road some_email#yahoo.com'
                    }
                ]
            }
        ]
    }
}

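Try string_value instead. The Python client uses the protobuf field names (snake_case, such as string_value), not the camelCase names used in the JSON representation (stringValue).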
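Applied to the helper from the question, the change could look like this. It is a sketch that only builds the request dict and does not call the DLP API; the header name is the placeholder from the question:

```python
def build_item_table(text_list):
    """Build a DLP table item with one single-cell row per input string."""
    # snake_case "string_value" is the protobuf field name the client expects.
    rows = [{"values": [{"string_value": text}]} for text in text_list]
    return {"table": {"headers": [{"name": "something"}], "rows": rows}}


item = build_item_table(["call (555) 867-5309", "mail me"])
print(item["table"]["rows"][0])
```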

how to print domain name for route53 DNS

I am new to Python and still learning. I am writing code to print the domain name, type and value of each record in a Route 53 hosted zone.
When it loops through a CNAME record, I get the CNAME's value and not its domain name.
import boto3
def list(zoneid, region, profile):
    a = []
    aws_session = boto3.session.Session(region_name=region, profile_name=profile)
    route53 = aws_session.client('route53')
    paginator = route53.get_paginator('list_resource_record_sets')
    page = paginator.paginate(
        HostedZoneId=zoneid,
        PaginationConfig={
            'MaxItems': 500,
            'PageSize': 500
        }
    )
    for i in page:
        for record in i['ResourceRecordSets']:
            if record['Type'] == 'CNAME':
                a.extend(x['Value'] for x in record['ResourceRecords'])
            elif record['Type'] == 'A':
                a.append(record['Name'])
    return a
print record['Name'] gives the domain name, but how can I include it in the line a.extend(x['Value'] for x in record['ResourceRecords'])?

There is no Value key at the top level of a resource record set; Value only appears inside ResourceRecords, as can be seen from the sample response in the docs:
{
    'ResourceRecordSets': [
        {
            'Name': 'string',
            'Type': 'SOA'|'A'|'TXT'|'NS'|'CNAME'|'MX'|'NAPTR'|'PTR'|'SRV'|'SPF'|'AAAA'|'CAA',
            'SetIdentifier': 'string',
            'Weight': 123,
            'Region': 'us-east-1'|'us-east-2'|'us-west-1'|'us-west-2'|'ca-central-1'|'eu-west-1'|'eu-west-2'|'eu-west-3'|'eu-central-1'|'ap-southeast-1'|'ap-southeast-2'|'ap-northeast-1'|'ap-northeast-2'|'sa-east-1'|'cn-north-1'|'cn-northwest-1'|'ap-south-1',
            'GeoLocation': {
                'ContinentCode': 'string',
                'CountryCode': 'string',
                'SubdivisionCode': 'string'
            },
            'Failover': 'PRIMARY'|'SECONDARY',
            'MultiValueAnswer': True|False,
            'TTL': 123,
            'ResourceRecords': [
                {
                    'Value': 'string'
                },
            ],
            'AliasTarget': {
                'HostedZoneId': 'string',
                'DNSName': 'string',
                'EvaluateTargetHealth': True|False
            },
            'HealthCheckId': 'string',
            'TrafficPolicyInstanceId': 'string'
        },
    ],
    'IsTruncated': True|False,
    'MaxItems': 'string',
    'NextToken': 'string'
}
I think you just want to refer to the Name key:
Name (string) --
The name of the domain you want to perform the action on.
Enter a fully qualified domain name, for example, www.example.com.
You can optionally include a trailing dot. If you omit the trailing
dot, Amazon Route 53 still assumes that the domain name that you
specify is fully qualified. This means that Amazon Route 53 treats
www.example.com (without a trailing dot) and www.example.com.
(with a trailing dot) as identical.
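To keep the domain name together with each value, one option is to collect tuples instead of bare values. This sketch works on plain dicts shaped like the response above, so it runs without AWS; collect_records, wanted_types and the sample page are made up for illustration:

```python
def collect_records(pages, wanted_types=("A", "CNAME")):
    """Return (name, type, value) tuples for the wanted record types."""
    rows = []
    for page in pages:
        for record in page["ResourceRecordSets"]:
            if record["Type"] not in wanted_types:
                continue
            # Record sets without ResourceRecords contribute no rows.
            for resource_record in record.get("ResourceRecords", []):
                rows.append((record["Name"], record["Type"], resource_record["Value"]))
    return rows


# Hand-written page in the documented response shape.
sample_pages = [{
    "ResourceRecordSets": [
        {"Name": "www.example.com.", "Type": "CNAME",
         "ResourceRecords": [{"Value": "target.example.net."}]},
        {"Name": "example.com.", "Type": "A",
         "ResourceRecords": [{"Value": "192.0.2.1"}, {"Value": "192.0.2.2"}]},
        {"Name": "example.com.", "Type": "NS",
         "ResourceRecords": [{"Value": "ns-1.example.org."}]},
    ]
}]

for name, record_type, value in collect_records(sample_pages):
    print(name, record_type, value)
```

In the function from the question, the pages yielded by paginator.paginate(...) could presumably be passed in place of sample_pages.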

Combination of two fields to be unique in Python Eve

In the Python Eve framework, is it possible to have a condition that checks whether the combination of two fields is unique?
For example, the definition below only requires firstname and lastname to each be unique on their own across the items in the resource.
people = {
    # 'title' tag used in item links.
    'item_title': 'person',
    'schema': {
        'firstname': {
            'type': 'string',
            'required': True,
            'unique': True
        },
        'lastname': {
            'type': 'string',
            'required': True,
            'unique': True
        }
    }
}
Instead, is there a way to require the combination of firstname and lastname to be unique?
Or is there a way to implement a custom validator for this?

You can probably achieve what you want by overriding _validate_unique and implementing custom logic there, taking advantage of self.document to retrieve the other field's value.
However, since _validate_unique is called for every unique field, you would end up performing your custom validation twice, once for firstname and once for lastname. Not really desirable. Of course the easy way out is setting up a fullname field, but I guess that's not an option in your case.
Have you considered going for a slightly different design? Something like:
{'name': {'first': 'John', 'last': 'Doe'}}
Then all you need is to make sure that name is required and unique:
{
    'name': {
        'type':'dict',
        'required': True,
        'unique': True,
        'schema': {
            'first': {'type': 'string'},
            'last': {'type': 'string'}
        }
    }
}

Inspired by Nicola and _validate_unique.
from eve.io.mongo import Validator
from eve.utils import config
from flask import current_app as app
class ExtendedValidator(Validator):
    def _validate_unique_combination(self, unique_combination, field, value):
        """ {'type': 'list'} """
        self._is_combination_unique(unique_combination, field, value, {})
    def _is_combination_unique(self, unique_combination, field, value, query):
        """ Test if the value combination is unique.
        """
        if unique_combination:
            query = {k: self.document[k] for k in unique_combination}
            query[field] = value
            resource_config = config.DOMAIN[self.resource]
            # exclude soft deleted documents if applicable
            if resource_config['soft_delete']:
                query[config.DELETED] = {'$ne': True}
            if self.document_id:
                id_field = resource_config['id_field']
                query[id_field] = {'$ne': self.document_id}
            datasource, _, _, _ = app.data.datasource(self.resource)
            if app.data.driver.db[datasource].find_one(query):
                key_names = ', '.join([k for k in query])
                self._error(field, "value combination of '%s' is not unique" % key_names)
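With this validator, the rule would presumably be attached to one field and list the other fields of the combination. The sketch below is an assumption based on how the code above builds its query, not something taken from the Eve docs; people_schema is a hypothetical schema, and build_unique_query is a stand-in written for this example that mirrors only the query construction, without touching MongoDB:

```python
# Hypothetical schema using the custom rule defined above.
people_schema = {
    "firstname": {
        "type": "string",
        "required": True,
        # firstname must be unique together with lastname
        "unique_combination": ["lastname"],
    },
    "lastname": {"type": "string", "required": True},
}


def build_unique_query(document, field, unique_combination):
    """Mirror the query built in _is_combination_unique (no database access)."""
    query = {key: document[key] for key in unique_combination}
    query[field] = document[field]
    return query


doc = {"firstname": "John", "lastname": "Doe"}
rule = people_schema["firstname"]["unique_combination"]
print(build_unique_query(doc, "firstname", rule))
```

The validator class itself would then be handed to the application, for example with Eve(validator=ExtendedValidator).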

I solved this by creating a dynamic field: a combination of functions and lambdas builds a hash from whichever fields you provide.
import hashlib
import jmespath
def unique_record(fields):
    def is_lambda(field):
        # Test if a variable is a lambda
        return callable(field) and field.__name__ == "<lambda>"
    def default_setter(doc):
        # Generate the composite list
        r = [
            str(field(doc)
                # Check is lambda
                if is_lambda(field)
                # jmespath is not required, but it enables using nested doc values
                else jmespath.search(field, doc))
            for field in fields
        ]
        # Generate MD5 hash from composite string (Keep it clean)
        return hashlib.md5(''.join(r).encode()).hexdigest()
    return {
        'type': 'string',
        'unique': True,
        'default_setter': default_setter
    }
Practical Implementation
My use case was a collection that limits the key-value pairs a user can create within it.
domain = {
    'schema': {
        'key': {
            'type': 'string',
            'minlength': 1,
            'maxlength': 25,
            'required': True,
        },
        'value': {
            'type': 'string',
            'minlength': 1,
            'required': True
        },
        'hash': unique_record([
            'key',
            lambda doc: request.USER['_id']
        ]),
        'user': {
            'type': 'objectid',
            'default_setter': lambda doc: request.USER['_id']  # User tenant ID
        }
    }
}
The function receives a list whose items are either strings or lambda functions; the lambdas set values dynamically at request time, in my case the user's "_id".
The function supports JSON queries through the JMESPath package. This isn't mandatory, but it leaves the door open for nested document values in other use cases.
NOTE: This only works with values that are set by the user at request time or injected into the request using the pre_GET trigger pattern, like the USER object I inject in the pre_GET trigger, which represents the user currently making the request.
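Stripped of Eve and jmespath, the core of this approach is a deterministic hash over selected values. Here is a minimal standard-library sketch; composite_hash, the "|" separator and the sample documents are assumptions made for this example and are not part of the answer's code:

```python
import hashlib


def composite_hash(doc, fields):
    """Hash the values selected by fields (dict keys or callables taking doc)."""
    parts = [
        str(field(doc)) if callable(field) else str(doc[field])
        for field in fields
    ]
    # A separator keeps ("ab", "c") and ("a", "bc") from producing the same input.
    return hashlib.md5("|".join(parts).encode()).hexdigest()


# The hash covers "key" and the user, but not "value".
fields = ["key", lambda doc: doc["user"]]
first = composite_hash({"key": "color", "value": "red", "user": "u1"}, fields)
same = composite_hash({"key": "color", "value": "blue", "user": "u1"}, fields)
other = composite_hash({"key": "color", "value": "red", "user": "u2"}, fields)
print(first == same, first == other)
```

Two documents with the same key and user collide on the hash even when their values differ, which is what lets a 'unique' constraint on the hash field reject the second one.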
