Python: Extract Key, Value Pairs From A Dictionary - Key Contains Specific Text

I currently have a dictionary that looks like this:
{OctetString('Ethernet8/6'): Integer(1),
OctetString('Ethernet8/7'): Integer(2),
OctetString('Ethernet8/8'): Integer(2),
OctetString('Ethernet8/9'): Integer(1),
OctetString('Vlan1'): Integer(2),
OctetString('Vlan10'): Integer(1),
OctetString('Vlan15'): Integer(1),
OctetString('loopback0'): Integer(1),
OctetString('mgmt0'): Integer(1),
OctetString('port-channel1'): Integer(1),
OctetString('port-channel10'): Integer(1),
OctetString('port-channel101'): Integer(1),
OctetString('port-channel102'): Integer(1)}
I want my dictionary to look like this:
{OctetString('Ethernet8/6'): Integer(1),
OctetString('Ethernet8/7'): Integer(2),
OctetString('Ethernet8/8'): Integer(2),
OctetString('Ethernet8/9'): Integer(1)}
I am not sure what the best way is to find these key, value pairs. I really want anything that matches 'Ethernet(\d*)/(\d*)'. My main goal is to match all the Ethernet values and then count them. For example: once I have the dict matching all of Ethernetx/x, I want to count the number of 1's and 2's.
Also, why do I get only Ethernet8/6 when I iterate the dictionary and print, but when I pprint the dictionary I end up with OctetString('Ethernet8/6')?
for k in snmp_comb: print k
Ethernet2/18
Ethernet2/31
Ethernet2/30
Ethernet2/32
Ethernet8/46

This should do it:
new_dict = dict()
for key, value in orig_dict.items():
    if 'Ethernet' in str(key):
        new_dict[key] = value
When you use print, Python calls the __str__ method on the OctetString object, which returns Ethernet8/6. pprint, on the other hand, formats objects with repr(), which calls __repr__ and gives OctetString('Ethernet8/6').
EDIT:
Stefan Pochmann has rightly pointed out below that the test 'Ethernet' in str(key) will match any key which contains the word Ethernet. The OP did mention using a regex in his post to match Ethernet(\d*)/(\d*), so this answer may not be suitable for anyone else looking to solve a similar problem.
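To make the regex variant and the counting step concrete, here is a minimal sketch. It is an illustration under assumptions, not the original poster's code: plain str keys and int values stand in for the OctetString and Integer objects, and the names ETHERNET_RE, filter_ethernet, and status_counts are made up for this example.

```python
import re
from collections import Counter

# Matches keys of the form Ethernet<digits>/<digits> and nothing else.
ETHERNET_RE = re.compile(r'Ethernet(\d+)/(\d+)')


def filter_ethernet(table):
    """Return a new dict with only the keys that fully match ETHERNET_RE."""
    return {key: value
            for key, value in table.items()
            if ETHERNET_RE.fullmatch(str(key))}


# Stand-in data: plain strings and ints instead of OctetString/Integer.
snmp_comb = {
    'Ethernet8/6': 1,
    'Ethernet8/7': 2,
    'Ethernet8/8': 2,
    'Ethernet8/9': 1,
    'Vlan1': 2,
    'mgmt0': 1,
    'port-channel1': 1,
}

ethernet_only = filter_ethernet(snmp_comb)

# Count how often each value occurs among the Ethernet entries.
status_counts = Counter(int(value) for value in ethernet_only.values())
print(status_counts)
```

Using fullmatch rather than a substring test means a key such as 'FastEthernet0' or 'Ethernet8' would be rejected.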

(I'll use the same 'Ethernet' in str(key) test as the accepted answer.)
If you want to keep the original dict and have the filtered version as a separate dictionary, I'd use a comprehension:
newdict = {key: value
           for key, value in mydict.items()
           if 'Ethernet' in str(key)}
If you don't want to keep the original dict, you can also just remove the entries you don't want:
for key in list(mydict):
    if 'Ethernet' not in str(key):
        del mydict[key]
The reason you get "OctetString('...')" is the same as this one:
>>> 'foo'
'foo'
>>> pprint.pprint('foo')
'foo'
>>> print('foo')
foo
The first two tests show you a representation you can use in source code, that's why there are quotes. It's what the repr function gets you. The third test prints the value for normal pleasure, so doesn't add quotes. The "OctetString('...')" is simply such a representation as well, and you can copy&paste it into source code and get actual OctetString objects again, rather than Python string objects. I guess pprint is mostly intended for developing, where it's more useful to get the full repr version.
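The same difference can be reproduced with any class that defines both methods. The sketch below uses a hypothetical InterfaceName class as a stand-in for OctetString (it is not the pysnmp class) to show which method print and pprint end up using.

```python
import pprint


class InterfaceName:
    """Hypothetical stand-in for OctetString, for illustration only."""

    def __init__(self, text):
        self.text = text

    def __str__(self):
        # Used by print() and str().
        return self.text

    def __repr__(self):
        # Used by repr(), the interactive prompt, and pprint.
        return "InterfaceName({!r})".format(self.text)


name = InterfaceName('Ethernet8/6')
as_printed = str(name)                   # what print(name) would show
as_repr = repr(name)                     # source-code-like representation
as_pprinted = pprint.pformat({name: 1})  # pprint formats keys with repr()

print(as_printed)
print(as_repr)
print(as_pprinted)
```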

Related

update string from a dictionary with the values from matching keys

def endcode(msg,secret_d):
    for ch in msg:
        for key,value in secret_d:
            if ch == key:
                msg[ch] = value
    return msg
encode('CAN YOU READ THIS',{'A':'4','E':'3','T':'7','I':'1','S':'5'})
This is my code. For every character in the string msg, the function should look the character up in the dictionary secret_d and, if ch is a key, replace it with the mapped string.
If ch is not a key in secret_d, then keep it unchanged.
For the example, the final result should be 'C4N YOU R34D 7H15'

Your function name is endcode but you are calling encode.
But more important, I'll give you a hint to what's going on. This isn't going to totally work, but it's going to get you back on track.
def endcode(msg,secret_d):
    newstr=""
    for ch in msg:
        for key,value in secret_d.items():
            if ch == key:
                newstr=newstr+value
    print(msg)
endcode('CAN YOU READ THIS',{'A':'4','E':'3','T':'7','I':'1','S':'5'})
But if you want a complete answer, here is mine.

A few issues:
As rb612 pointed out, there's a typo in your function definition ("endcode")
you are doing nothing with the return value of your function after calling it
msg[ch] is trying to assign items in a string, but that's not possible, strings are immutable. You'll have to build a new string. You cannot "update" it.
in order to iterate over (key, value) pairs of a dictionary d, you must iterate over d.items(). Iterating over d will iterate over the keys only.
That being said, here's my suggestion how to write this:
>>> def encode(msg, replacers):
... return ''.join([replacers.get(c, c) for c in msg])
...
>>> result = encode('CAN YOU READ THIS',{'A':'4','E':'3','T':'7','I':'1','S':'5'})
>>> result
'C4N YOU R34D 7H15'
Things to note:
dict.get can be called with a fallback value as the second argument. I'm telling it to just return the current character if it cannot be found within the dictionary.
I'm using a list comprehension as the argument for str.join instead of a generator expression for performance reasons, here's an excellent explanation.
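When every key is a single character, as in this example, str.translate is another option. This is an alternative technique, not the one used in the answer above; the function name encode_with_translate is made up for this sketch.

```python
def encode_with_translate(msg, replacers):
    """Replace single characters using a translation table.

    Assumes every key in replacers is exactly one character long;
    str.maketrans raises ValueError otherwise.
    """
    table = str.maketrans(replacers)
    return msg.translate(table)


secret = {'A': '4', 'E': '3', 'T': '7', 'I': '1', 'S': '5'}
result = encode_with_translate('CAN YOU READ THIS', secret)
print(result)
```

Characters that are not in the table are left unchanged, which matches the fallback behaviour of replacers.get(c, c).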

Error with Python dictionary: str object has no attribute append

I am writing code in python.
My input line is "all/DT remaining/VBG all/NNS of/IN "
I want to create a dictionary with one key and multiple values
For example - all:[DT,NNS]
groupPairsByKey={}
Code:
for line in fileIn:
lineLength=len(line)
words=line[0:lineLength-1].split(' ')
for word in words:
wordPair=word.split('/')
if wordPair[0] in groupPairsByKey:
groupPairsByKey[wordPair[0]].append(wordPair[1])
<getting error here>
else:
groupPairsByKey[wordPair[0]] = [wordPair[1]]

Your problem is that groupPairsByKey[wordPair[0]] is not a list, but a string!
Before appending value to groupPairsByKey['all'], you need to make the value a list.
Your solution is already correct, it works perfectly in my case. Try to make sure that groupPairsByKey is a completely empty dictionary.
By the way, this is what I tried:
>>> words = "all/DT remaining/VBG all/NNS of/IN".split()
>>> for word in words:
...     wordPair = word.split('/')
...     if wordPair[0] in groupPairsByKey:
...         groupPairsByKey[wordPair[0]].append(wordPair[1])
...     else:
...         groupPairsByKey[wordPair[0]] = [wordPair[1]]
...
>>> groupPairsByKey
{'of': ['IN'], 'remaining': ['VBG'], 'all': ['DT', 'NNS']}
>>>
Also, if your code is formatted like the one you posted here, you'll get an IndentationError.
Hope this helps!

Although it looks to me like you should be getting an IndentationError, if you are getting the message
str object has no attribute append
then it means
groupPairsByKey[wordPair[0]]
is a str, and strs do not have an append method.
The code you posted does not show how
groupPairsByKey[wordPair[0]]
could have a str value. Perhaps put
if wordPair[0] in groupPairsByKey:
    if isinstance(groupPairsByKey[wordPair[0]], str):
        print('{}: {}'.format(*wordPair))
        raise TypeError('expected a list, found a str')
into your code to help track down the culprit.
You could also simplify your code by using a collections.defaultdict:
import collections
groupPairsByKey = collections.defaultdict(list)
for line in fileIn:
    lineLength=len(line)
    words=line[0:lineLength-1].split(' ')
    for word in words:
        wordPair=word.split('/')
        groupPairsByKey[wordPair[0]].append(wordPair[1])
When you access a defaultdict with a missing key, the factory function -- in this case list -- is called and the returned value is used as the associated value in the defaultdict. Thus, a new key-value pair is automatically inserted into the defaultdict whenever it encounters a missing key. Since the default value is always a list, you won't run into the error str object has no attribute append anymore -- unless you have code which reassigns an old key-value pair to have a new value which is a str.
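As a runnable sketch of the defaultdict approach on the sample line from the question: the function name group_tags is made up for this example, and it uses split() without arguments so that the trailing space and newline do not produce empty tokens (a small departure from the code above).

```python
from collections import defaultdict


def group_tags(lines):
    """Group the tag after each '/' under the word before it."""
    groups = defaultdict(list)
    for line in lines:
        # split() with no arguments ignores leading/trailing whitespace.
        for token in line.split():
            word, tag = token.rsplit('/', 1)
            groups[word].append(tag)
    return dict(groups)


sample_lines = ["all/DT remaining/VBG all/NNS of/IN \n"]
grouped = group_tags(sample_lines)
print(grouped)
```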

In Python, you can do:
my_dict["all"] = my_string.split('/')

How list every item of dir(object)?

In Python, to find all attributes, there is:
dir(object)
object.__dict__.keys()
But what I want is to list what is in the second branch, not only the first branch. Is that a kind of recursive operation?
How to do that?
it's like
dir(dir(x) for x in dir(math))
I tried this and still get the same result, duplicated:
>>> for i in dir(math):
...     for j in i:
...         print dir(j)
and all results are the methods of str
Update: it seems that the dir() command returns a list of str. Here is a simple hack: I tried to exclude the reserved names to see if I could go further, but the result was only str
[i for i in dir(math) if i[0]!="_"]
[type(i) for i in dir(math) if i[0]!="_"]
Thank you again :)

object.__dict__.keys() # Just keys
object.__dict__.values() # Just values
object.__dict__.items() # Key-value pairs
Edit wait! I think I misunderstood. You want to list an object's properties, and those properties' properties and so on and so forth? Try something like this:
def discover(object):
    for key in dir(object):
        value = getattr(object, key)
        print(key, value)
        discover(value)
It's pretty crude, but that's the recursion I think you're looking for. Note that you will have to stop it manually at some point. There's no turtles at the bottom, it goes on and on.
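One way to make the recursion stop on its own is to pass a depth limit. The sketch below is an assumption-laden variant of the idea above, not the answerer's code: walk_attributes and max_depth are names invented here, and names starting with an underscore are skipped to keep the output small.

```python
import math


def walk_attributes(obj, max_depth=2, prefix=''):
    """Yield (dotted_path, value) pairs for public attributes, depth-limited."""
    if max_depth == 0:
        return
    for name in dir(obj):
        if name.startswith('_'):
            continue  # skip private and dunder names
        try:
            value = getattr(obj, name)
        except AttributeError:
            continue  # dir() can list names that getattr cannot resolve
        path = prefix + name
        yield path, value
        # Recurse into the attribute's own attributes, one level shallower.
        yield from walk_attributes(value, max_depth - 1, path + '.')


found = dict(walk_attributes(math, max_depth=2))
print(sorted(found)[:10])
```

With max_depth=2 this lists the names in math and the public attributes of each of those values, and then stops.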

Parse string with three-level delimitation into dictionary

I've found how to split a delimited string into key:value pairs in a dictionary elsewhere, but I have an incoming string that also includes two parameters that amount to dictionaries themselves: parameters with one or three key:value pairs inside:
clientid=b59694bf-c7c1-4a3a-8cd5-6dad69f4abb0&keyid=987654321&userdata=ip:192.168.10.10,deviceid:1234,optdata:75BCD15&md=AMT-Cam:avatar&playbackmode=st&ver=6&sessionid=&mk=PC&junketid=1342177342&version=6.7.8.9012
Obviously these are dummy parameters to obfuscate proprietary code, here. I'd like to dump all this into a dictionary with the userdata and md keys' values being dictionaries themselves:
requestdict = {'clientid' : 'b59694bf-c7c1-4a3a-8cd5-6dad69f4abb0', 'keyid' : '987654321', 'userdata' : {'ip' : '192.168.10.10', 'deviceid' : '1234', 'optdata' : '75BCD15'}, 'md' : {'Cam' : 'avatar'}, 'playbackmode' : 'st', 'ver' : '6', 'sessionid' : '', 'mk' : 'PC', 'junketid' : '1342177342', 'version' : '6.7.8.9012'}
Can I take the slick two-level delimitation parsing command that I've found:
requestDict = dict(line.split('=') for line in clientRequest.split('&'))
and add a third level to it to handle & preserve the 2nd-level dictionaries? What would the syntax be? If not, I suppose I'll have to split by & and then check & handle splits that contain : but even then I can't figure out the syntax. Can someone help? Thanks!

I basically took Kyle's answer and made it more future-friendly:
def dictelem(input):
    parts = input.split('&')
    listing = [part.split('=') for part in parts]
    result = {}
    for entry in listing:
        head, tail = entry[0], ''.join(entry[1:])
        if ':' in tail:
            entries = tail.split(',')
            result.update({ head : dict(e.split(':') for e in entries) })
        else:
            result.update({head: tail})
    return result

Here's a two-liner that does what I think you want:
dictelem = lambda x: x if ':' not in x[1] else [x[0],dict(y.split(':') for y in x[1].split(','))]
a = dict(dictelem(x.split('=')) for x in input.split('&'))

Can I take the slick two-level delimitation parsing command that I've found:
requestDict = dict(line.split('=') for line in clientRequest.split('&'))
and add a third level to it to handle & preserve the 2nd-level dictionaries?
Of course you can, but (a) you probably don't want to, because nested comprehensions beyond two levels tend to get unreadable, and (b) this super-simple syntax won't work for cases like yours, where only some of the data can be turned into a dict.
For example, what should happen with 'PC'? Do you want to make that into {'PC': None}? Or maybe the set {'PC'}? Or the list ['PC']? Or just leave it alone? You have to decide, and write the logic for that, and trying to write it as an expression will make your decision very hard to read.
So, let's put that logic in a separate function:
def parseCommasAndColons(s):
    bits = [bit.split(':') for bit in s.split(',')]
    try:
        return dict(bits)
    except ValueError:
        return bits
This will return a dict like {'ip': '192.168.10.10', 'deviceid': '1234', 'optdata': '75BCD15'} or {'AMT-Cam': 'avatar'} for cases where each comma-separated component has a colon inside it, but a list of lists like [['1342177342']] for cases where any of them don't.
Even this may be a little too clever; I might make the "is this in dictionary format" check more explicit instead of just trying to convert the list of lists and see what happens.
Either way, how would you put that back into your original comprehension?
Well, you want to call it on the value in the line.split('='). So let's add a function for that:
def parseCommasAndColonsForValue(keyvalue):
    if len(keyvalue) == 2:
        return keyvalue[0], parseCommasAndColons(keyvalue[1])
    else:
        return keyvalue
requestDict = dict(parseCommasAndColonsForValue(line.split('='))
                   for line in clientRequest.split('&'))
One last thing: Unless you need to run on older versions of Python, you shouldn't often be calling dict on a generator expression. If it can be rewritten as a dictionary comprehension, it will almost certainly be clearer that way, and if it can't be rewritten as a dictionary comprehension, it probably shouldn't be a 1-liner expression in the first place.
Of course breaking expressions up into separate expressions, turning some of them into statements or even functions, and naming them does make your code longer—but that doesn't necessarily mean worse. About half of the Zen of Python (import this) is devoted to explaining why. Or one quote from Guido: "Python is a bad language for code golf, on purpose."
If you really want to know what it would look like, let's break it into two steps:
>>> {k: [bit2.split(':') for bit2 in v.split(',')] for k, v in (bit.split('=') for bit in s.split('&'))}
{'clientid': [['b59694bf-c7c1-4a3a-8cd5-6dad69f4abb0']],
'junketid': [['1342177342']],
'keyid': [['987654321']],
'md': [['AMT-Cam', 'avatar']],
'mk': [['PC']],
'playbackmode': [['st']],
'sessionid': [['']],
'userdata': [['ip', '192.168.10.10'],
['deviceid', '1234'],
['optdata', '75BCD15']],
'ver': [['6']],
'version': [['6.7.8.9012']]}
That illustrates why you can't just add a dict call for the inner level—because most of those things aren't actually dictionaries, because they had no colons. If you changed that, then it would just be this:
{k: dict(bit2.split(':') for bit2 in v.split(',')) for k, v in (bit.split('=') for bit in s.split('&'))}
I don't think that's very readable, and I doubt most Python programmers would. Reading it 6 months from now and trying to figure out what I meant would take a lot more effort than writing it did.
And trying to debug it will not be fun. What happens if you run that on your input, with missing colons? ValueError: dictionary update sequence element #0 has length 1; 2 is required. Which sequence? No idea. You have to break it down step by step to see what doesn't work. That's no fun.
So, hopefully that illustrates why you don't want to do this.
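For comparison, here is a sketch of the more explicit "is this in dictionary format" check mentioned earlier, written as two small functions. The names parse_value and parse_request are invented for this example, and one design decision is an assumption: a value is turned into a dict only when every comma-separated part contains a colon; otherwise it is left as the original string.

```python
def parse_value(text):
    """Return a dict if every comma-separated part is key:value, else the text."""
    parts = text.split(',')
    if all(':' in part for part in parts):
        # Split on the first colon only, so values may contain colons.
        return dict(part.split(':', 1) for part in parts)
    return text


def parse_request(query):
    """Split on '&', then on the first '=', and parse each value."""
    result = {}
    for field in query.split('&'):
        key, _, value = field.partition('=')
        result[key] = parse_value(value)
    return result


request = ('clientid=b59694bf-c7c1-4a3a-8cd5-6dad69f4abb0&keyid=987654321'
           '&userdata=ip:192.168.10.10,deviceid:1234,optdata:75BCD15'
           '&md=AMT-Cam:avatar&playbackmode=st&ver=6&sessionid=&mk=PC'
           '&junketid=1342177342&version=6.7.8.9012')
parsed = parse_request(request)
print(parsed['userdata'])
print(parsed['md'])
```

Each decision (what counts as a nested dict, what happens to plain values) sits on its own line, which is the readability argument made above.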

How to retrieve from python dict where key is only partially known?

I have a dict that has string-type keys whose exact values I can't know (because they're generated dynamically elsewhere). However, I know that the key I want contains a particular substring, and that a single key with this substring is definitely in the dict.
What's the best, or "most pythonic" way to retrieve the value for this key?
I thought of two strategies, but both irk me:
for k,v in some_dict.items():
    if 'substring' in k:
        value = v
        break
-- OR --
value = [v for (k,v) in some_dict.items() if 'substring' in k][0]
The first method is bulky and somewhat ugly, while the second is cleaner, but the extra step of indexing into the list comprehension (the [0]) irks me. Is there a better way to express the second version, or a more concise way to write the first?

There is an option to write the second version with the performance attributes of the first one.
Use a generator expression instead of list comprehension:
value = next(v for (k,v) in some_dict.iteritems() if 'substring' in k)
The expression inside the parenthesis will return an iterator which you will then ask to provide the next, i.e. first element. No further elements are processed.
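On Python 3 the same idea is written with items(), and next() also accepts a default that is returned when nothing matches, instead of raising StopIteration. A minimal sketch, where first_value_with is a made-up helper name:

```python
def first_value_with(mapping, substring, default=None):
    """Return the value of the first key containing substring, else default."""
    # The generator is consumed lazily: iteration stops at the first match.
    return next((v for k, v in mapping.items() if substring in k), default)


some_dict = {'abc4589': 4578, 'abc7812': 798, 'kjuy45763': 1002}
value = first_value_with(some_dict, '7812')
missing = first_value_with(some_dict, 'zzz')
print(value, missing)
```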

How about this:
value = (v for (k,v) in some_dict.iteritems() if 'substring' in k).next()
It will stop immediately when it finds the first match.
But it still has O(n) complexity, where n is the number of key-value pairs. You need something like a suffix list or a suffix tree to speed up searching.

If there are many keys but the full key is easy to reconstruct from the substring, then it can be faster to reconstruct it. For example, often you know the start of the key but not the datestamp that has been appended to it, so you may only have to try 365 dates rather than iterate through millions of keys.
It's unlikely to be the case but I thought I would suggest it anyway.
e.g.
>>> names={'bob_k':32,'james_r':443,'sarah_p':12}
>>> firstname='james' #you know the substring james because you have a list of firstnames
>>> for c in "abcdefghijklmnopqrstuvwxyz":
...     name="%s_%s"%(firstname,c)
...     if name in names:
...         print(name)
...
james_r

class MyDict(dict):
    def __init__(self, *kwargs):
        dict.__init__(self, *kwargs)
    def __getitem__(self,x):
        return next(v for (k,v) in self.iteritems() if x in k)
# Defining several dicos ----------------------------------------------------
some_dict = {'abc4589':4578,'abc7812':798,'kjuy45763':1002}
another_dict = {'boumboum14':'WSZE x478',
                'tagada4783':'ocean11',
                'maracuna102455':None}
still_another = {12:'jfg',45:'klsjgf'}
# Selecting the dicos whose __getitem__ method will be changed -------------
name,obj = None,None
selected_dicos = [ (name,obj) for (name,obj) in globals().iteritems()
                   if type(obj)==dict
                   and all(type(x)==str for x in obj.iterkeys())]
print 'names of selected_dicos ==',[ name for (name,obj) in selected_dicos]
# Transforming the selected dicos in instances of class MyDict -----------
for k,v in selected_dicos:
    globals()[k] = MyDict(v)
# Example of getting a value ---------------------------------------------
print "some_dict['7812'] ==",some_dict['7812']
result
names of selected_dicos == ['another_dict', 'some_dict']
some_dict['7812'] == 798

I prefer the first version, although I'd use some_dict.iteritems() (if you're on Python 2) because then you don't have to build an entire list of all the items beforehand. Instead you iterate through the dict and break as soon as you're done.
On Python 3, some_dict.items() already returns a dictionary view, which is iterated lazily, so no intermediate list is built.
