a=0.77 ,b=0.2 ,c=0.20, d=0.79 ,z=(c+d), e=(z*a) ,output=(z+e)
I have a text file like the one above. I need parser logic that produces an equation like
output=(0.20+0.79)+((0.20+0.79)*a). What are some efficient ways to do it? Are there any libraries? Thank you!
A primitive method is to work with strings and use replace().
First, use split(',') to convert the string to a list:
['a=0.77 ', 'b=0.2 ', 'c=0.20', ' d=0.79 ', 'z=(c+d)', ' e=(z*a) ', 'output=(z+e)']
Next, use .strip() to remove spaces from the beginning and end of each element.
Then use .split('=') on every element to create nested lists:
[['a', '0.77'], ['b', '0.2'], ['c', '0.20'], ['d', '0.79'], ['z', '(c+d)'], ['e', '(z*a)'], ['output', '(z+e)']]
Next, use dict() to create a dictionary:
{'a': '0.77',
'b': '0.2',
'c': '0.20',
'd': '0.79',
'e': '(z*a)',
'output': '(z+e)',
'z': '(c+d)'}
Now take the first item, 'a': '0.77', and run .replace('a', '0.77') on the other items in the dictionary. Repeat this with the other values from the dictionary.
Finally, you get the dictionary
{'a': '0.77',
'b': '0.2',
'c': '0.20',
'd': '0.79',
'e': '((0.20+0.79)*0.77)',
'output': '((0.20+0.79)+((0.20+0.79)*0.77))',
'z': '(0.20+0.79)'}
and output has string ((0.20+0.79)+((0.20+0.79)*0.77))
import pprint
text = 'a=0.77 ,b=0.2 ,c=0.20, d=0.79 ,z=(c+d), e=(z*a) ,output=(z+e)'
parts = text.split(',') # create list
#print(parts)
parts = [item.strip() for item in parts] # remove spaces
#print(parts)
parts = [item.split('=') for item in parts] # create tuples
#print(parts)
parts = dict(parts) # create dict
#print(parts)
pprint.pprint(parts)
for key1, val1 in parts.items():
    for key2, val2 in parts.items():
        parts[key2] = parts[key2].replace(key1, val1)
pprint.pprint(parts)
print('output:', parts['output'])
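One caveat with plain str.replace() is that it substitutes any matching substring, so a variable named a would also be replaced inside a longer name such as alpha. A minimal sketch of an alternative, assuming variable names are ordinary identifiers and there are no circular definitions, uses re.sub() on whole identifiers and resolves references recursively. The names parse_assignments, expand and the keep parameter are illustrative, not part of the answer above.

```python
import re

def parse_assignments(text):
    """Split 'name=expr, name=expr' text into a dict of strings."""
    pairs = (item.split('=', 1) for item in text.split(','))
    return {name.strip(): expr.strip() for name, expr in pairs}

def expand(name, table, keep=()):
    """Recursively replace identifiers in table[name] with their definitions.

    Names listed in `keep` stay symbolic. Assumes no circular definitions.
    """
    def substitute(match):
        ident = match.group(0)
        if ident in table and ident not in keep:
            return expand(ident, table, keep)
        return ident
    # The pattern matches whole identifiers only, so 'a' never matches inside 'alpha'
    return re.sub(r'\b[A-Za-z_]\w*\b', substitute, table[name])

text = 'a=0.77 ,b=0.2 ,c=0.20, d=0.79 ,z=(c+d), e=(z*a) ,output=(z+e)'
table = parse_assignments(text)
print(expand('output', table))
print(expand('output', table, keep=('a',)))
```

Passing keep=('a',) leaves a unsubstituted, which is close to the form asked for in the question.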
In the following program I have come up with something that isn't working the way I want, and I need help figuring it out.
The first input takes a string
"Joe,123-5432 Linda,983-4123 Frank,867-5309"
It first replaces commas with white space, and converts it into a list.
['Joe', '123-5432', 'Linda', '983-4123', 'Frank', '867-5309']
However, I want to convert this list into a dictionary using the first entry as the key, and the second entry as the value. So it would look like this:
{'Joe':'123-5432', 'Linda':'983-4123', 'Frank':'867-5309'}
This is where my problem is (within the function). When I pass the input to the function, it breaks it up into individual characters rather than treating each split item as a whole string, which looks like this:
{'J': 'o', 'e': ' ', '1': '2', '3': ' ', '5': '3', ' ': '9', 'i': 'n', 'd': 'a', '8': '6', '-': '4', 'F': 'r', 'a': 'n', 'k': ' ', '7': '-', '0': '9'}
Which ya know is funny, but not my target here.
Later in the program, when Ecall gets an input, it cross references the list and pulls the phone number from the dictionary. Can you help me build a better comprehension for Pdict in the function that does this and not whatever I did?
def Convert(FormatInput):
    Pdict = {FormatInput[i]: FormatInput[i + 1] for i in range(0, len(FormatInput), 2)}
    return Pdict
user_input = input()
FormatInput=user_input.replace(",", " ")
Pdict=Convert(FormatInput)
Ecall = (input())
print(Pdict.get(Ecall, 'N/A'))
Use two different split operations instead of doing the replace to try to do it in a single split (which just makes things more difficult because now you've lost the information of which separator was which).
First split the original string (on whitespace) to produce a list of comma-separated entries:
>>> user_input = "Joe,123-5432 Linda,983-4123 Frank,867-5309"
>>> user_input.split()
['Joe,123-5432', 'Linda,983-4123', 'Frank,867-5309']
and then split on the commas within each entry so you have a sequence of pairs that you can pass to dict(). You can do the whole thing in one easy line:
>>> dict(entry.split(",") for entry in user_input.split())
{'Joe': '123-5432', 'Linda': '983-4123', 'Frank': '867-5309'}
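For context on why the original attempt produced single characters: Convert received FormatInput as a string (the result of replace(), never split into a list), so FormatInput[i] indexed one character at a time. A small sketch of the whole lookup flow, with a hard-coded string standing in for input() and a hypothetical helper name build_phone_book, might look like this:

```python
def build_phone_book(raw):
    """Turn 'Name,number Name,number ...' into {name: number}."""
    # Split on whitespace first, then on the first comma inside each entry
    return dict(entry.split(",", 1) for entry in raw.split())

phone_book = build_phone_book("Joe,123-5432 Linda,983-4123 Frank,867-5309")
print(phone_book.get("Linda", "N/A"))
print(phone_book.get("Zoe", "N/A"))
```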
Given two file paths
Z:\home\user\dfolder\NO,AG,GK.jpg
Z:\home\user\dfolder\NI,DG,BJ (1).jpg
The objective is to split each string and store the parts in a dict.
Currently, I first split each path using os.path.split to get the list s
s=['NO,AG,GK.jpg','NI,DG,BJ (1).jpg']
and iteratively split the string as below
import re
all_dic=[]
for ds in s:
    k=ds.split(",")
    # drop the ".jpg" extension and, if present, the " (n)" copy suffix
    kk=k[-1].split('.jpg')[0].split("(")[0].strip() if re.search(r'\(\d+\)', ds) else k[-1].split('.jpg')[0]
    nval={"f":k[0],"s":k[1],"t":kk}
    all_dic.append(nval)
But I am curious about a regex approach, or any one-liner.
One liner parsing using regex + inline list parsing:
import re
s = ['NO,AG,GK.jpg', 'NI,DG,BJ (1).jpg']
keys = ['f', 's', 't']
all_dic = [{keys[k]: x for k, x in enumerate(
    re.sub(r"(\s\(\d+\))?(\.jpg)?", "", item).split(','))} for item in s]
print(all_dic)
->
[{'f': 'NO', 's': 'AG', 't': 'GK'}, {'f': 'NI', 's': 'DG', 't': 'BJ'}]
Well, I think this is the easiest way to get the same output without using the split() function.
The regular expression takes only the letters and puts them in a list, so we don't even have to split the string or remove the (1) from it.
import re
s=['NO,AG,GK.jpg','NI,DG,BJ (1).jpg']
all_dic = []
for ds in s:
    regex = '[a-zA-Z]+'
    k = re.findall(regex,ds) # We extract all the matches (as a list)
    nval={'f':k[0],'s':k[1],'t':k[2]} # We create the dictionary
    all_dic.append(nval) # We append the dictionary to the list
print(all_dic)
# Output: [{'f': 'NO', 's': 'AG', 't': 'GK'}, {'f': 'NI', 's': 'DG', 't': 'BJ'}]
Also, you have the file extension in k[3], just in case you need it.
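Another variant, sketched here under the assumption that every file name has exactly three comma-separated fields and a .jpg extension, starts from the full paths and uses named groups so the dictionary keys come straight from the pattern. PATTERN and parse_path are illustrative names; ntpath is the standard-library module that splits Windows-style paths on any OS.

```python
import ntpath
import re

# Three comma-separated fields, an optional " (n)" copy suffix, then ".jpg"
PATTERN = re.compile(
    r'(?P<f>[^,]+),(?P<s>[^,]+),(?P<t>[^,]+?)(?:\s*\(\d+\))?\.jpg$'
)

def parse_path(path):
    """Return {'f': ..., 's': ..., 't': ...} or None if the name does not match."""
    name = ntpath.basename(path)  # handles backslash separators on any OS
    match = PATTERN.match(name)
    return match.groupdict() if match else None

paths = [
    r'Z:\home\user\dfolder\NO,AG,GK.jpg',
    r'Z:\home\user\dfolder\NI,DG,BJ (1).jpg',
]
all_dic = [parse_path(p) for p in paths]
print(all_dic)
```

Unlike the letters-only findall, this keeps digits or other characters inside a field and returns None for names that do not fit the expected shape.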
I wrote something like this to convert comma separated list to a dict.
def list_to_dict( rlist ) :
    rdict = {}
    i = len (rlist)
    while i:
        i = i - 1
        try :
            rdict[rlist[i].split(":")[0].strip()] = rlist[i].split(":")[1].strip()
        except IndexError :  # no ':' in this entry
            print(rlist[i] + ' Not a key value pair')
            continue
    return rdict
Isn't there a way to
for i, row = enumerate rlist
rdict = tuple ( row )
or something?
You can do:
>>> li=['a:1', 'b:2', 'c:3']
>>> dict(e.split(':') for e in li)
{'a': '1', 'c': '3', 'b': '2'}
If the strings in the list require stripping, you can do:
>>> li=["a:1\n", "b:2\n", "c:3\n"]
>>> dict(t.split(":") for t in map(str.strip, li))
{'a': '1', 'b': '2', 'c': '3'}
Or, also:
>>> dict(t.split(":") for t in (s.strip() for s in li))
{'a': '1', 'b': '2', 'c': '3'}
If I understand your requirements correctly, then you can use the following one-liner.
def list_to_dict(rlist):
    return dict(map(lambda s : s.split(':'), rlist))
Example:
>>> list_to_dict(['alpha:1', 'beta:2', 'gamma:3'])
{'alpha': '1', 'beta': '2', 'gamma': '3'}
You might want to strip() the keys and values after splitting in order to trim white-space.
return dict(map(lambda s : map(str.strip, s.split(':')), rlist))
You mention both colons and commas so perhaps you have a string with key/values pairs separated by commas, and with the key and value in turn separated by colons, so:
def list_to_dict(rlist):
    return {k.strip():v.strip() for k,v in (pair.split(':') for pair in rlist.split(','))}
>>> list_to_dict('a:1,b:10,c:20')
{'a': '1', 'c': '20', 'b': '10'}
>>> list_to_dict('a:1, b:10, c:20')
{'a': '1', 'c': '20', 'b': '10'}
>>> list_to_dict('a : 1 , b: 10, c:20')
{'a': '1', 'c': '20', 'b': '10'}
This uses a dictionary comprehension iterating over a generator expression to create a dictionary containing the key/value pairs extracted from the string. strip() is called on the keys and values so that whitespace will be handled.
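The comprehension raises ValueError when an entry has no colon or more than one. A hedged sketch of a more tolerant variant, using str.partition() so that only the first separator splits key from value and entries without a separator are skipped (mirroring the "Not a key value pair" branch in the question), could look like this. The function name pairs_to_dict and its parameters are illustrative.

```python
def pairs_to_dict(text, item_sep=',', kv_sep=':'):
    """Parse 'k:v, k:v' text, skipping entries that have no key/value separator."""
    result = {}
    for item in text.split(item_sep):
        key, sep, value = item.partition(kv_sep)  # splits on the first kv_sep only
        if not sep:  # separator not found: not a key/value pair
            continue
        result[key.strip()] = value.strip()
    return result

print(pairs_to_dict('a : 1 , b: 10, c:20'))
print(pairs_to_dict('url: http://example.com, junk, n: 3'))
```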
For a list of dictionaries
sample_dict = [
{'a': 'woot', 'b': 'nope', 'c': 'duh', 'd': 'rough', 'e': '1'},
{'a': 'coot', 'b': 'nope', 'c': 'ruh', 'd': 'rough', 'e': '2'},
{'a': 'doot', 'b': 'nope', 'c': 'suh', 'd': 'rough', 'e': '3'},
{'a': 'soot', 'b': 'nope', 'c': 'fuh', 'd': 'rough', 'e': '4'},
{'a': 'toot', 'b': 'nope', 'c': 'cuh', 'd': 'rough', 'e': '1'}
]
How do I make a separate dictionary that contains all the key, value pairs that match a certain key? With a list comprehension I created a list of all the key, value pairs like this:
container = [[key,val] for s in sample_dict for key,val in s.items() if key == 'a']
Now the container gave me
[['a', 'woot'], ['a', 'coot'], ['a', 'doot'], ['a', 'soot'], ['a', 'toot']]
Which is all fine... but if I do the same with a dictionary comprehension, I get only a single key, value pair. Why does this happen?
container = {key : val for s in sample_dict for key,val in s.items() if key == 'a'}
The container gives only a single element
{'a': 'toot'}
I want something like
{'a': ['woot','coot','doot','soot','toot']}
How do I do this with minimal change to the code above ?
You are generating multiple key-value pairs with the same key, and a dictionary will only ever store unique keys.
If you wanted just one key, you'd use a dictionary with a list comprehension:
container = {'a': [s['a'] for s in sample_dict if 'a' in s]}
Note that there is no need to iterate over the nested dictionaries in sample_dict if all you wanted was a specific key; in the above I simply test if the key exists ('a' in s) and extract the value for that key with s['a']. This is much faster than looping over all the keys.
Another option:
collect = lambda arr, x: { x: [ e.get(x) for e in arr] }
So, from here, you can construct the dict based on the original array and the key:
collect(sample_dict, 'a')
# {'a': ['woot', 'coot', 'doot', 'soot', 'toot']}
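If the values for every key are wanted rather than just 'a', one common pattern is to append into a collections.defaultdict(list). This is a sketch with a shortened copy of the sample data; the variable name grouped is illustrative.

```python
from collections import defaultdict

sample_dict = [
    {'a': 'woot', 'b': 'nope', 'e': '1'},
    {'a': 'coot', 'b': 'nope', 'e': '2'},
    {'a': 'doot', 'b': 'nope', 'e': '3'},
]

grouped = defaultdict(list)
for row in sample_dict:
    for key, val in row.items():
        grouped[key].append(val)  # duplicates of a key accumulate instead of overwriting

print(grouped['a'])
print(dict(grouped))
```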
I am a newbie to Python and am facing the issue below; please help me.
I read one file line by line; each line contains field names and their values,
and I have to find the field name and field value in each line. An example line is:
line=" A= 4 | B='567' |c=4|D='aaa' "
Since some field values are themselves strings, I am unable to create a regex to retrieve the field name and field value.
Please let me know a regex for the above example.
The output should be
A=4
B='567'
c=4
D='aaa'
The simplest solution I can think of is converting each line into a dictionary. I assume that you don't have any quote marks or | marks in your strings (see my comments on the question).
result = {}  # Initialize a dictionary
with open('input.txt') as f:  # Read the file line by line in a memory-efficient way
    for line in f:
        # Split line to pairs using '|', split each pair using '='
        pairs = [pair.split('=') for pair in line.split('|')]
        for pair in pairs:
            key, value = pair[0].strip(), pair[1].strip()
            try:  # Try an int conversion
                value = int(value)
            except ValueError:  # If it fails, strip quotes
                value = value.strip("'").strip('"')
            result[key] = value  # Add current item to the results dictionary
which, for the following input:
A= 4 | B='567' |c=4|D='aaa'
E= 4 | F='567' |G=4|D='aaa'
Would give:
{'A': 4, 'c': 4, 'B': '567', 'E': 4, 'D': 'aaa', 'G': 4, 'F': '567'}
Notes:
If you consider '567' to be a number, you can strip the " and ' before trying to convert it to an integer.
If you need to take floats into account, you can try value=float(value). Remember to do it after the int conversion attempt, because every int string also parses as a float.
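The two notes above can be combined into one small helper. This is a sketch, not part of the original answer: the name convert and the quoted_as_number flag are assumptions introduced for illustration.

```python
def convert(raw, quoted_as_number=False):
    """Convert a raw field value to int, float, or an unquoted string."""
    text = raw.strip()
    unquoted = text.strip('\'"')
    if unquoted != text and not quoted_as_number:
        return unquoted  # the value was quoted: keep it as a string
    for cast in (int, float):  # int first, since float() also accepts "4"
        try:
            return cast(unquoted)
        except ValueError:
            continue
    return unquoted

print(convert(" 4 "), convert("4.5"), convert("'567'"), convert("'567'", quoted_as_number=True))
```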
try this one:
import re
line = " A= 4 | B='567' |c=4|D='aaa' "
re.search( r'(?P<field1>.*)=(?P<value1>.*)\|(?P<field2>.*)=(?P<value2>.*)\|(?P<field3>.*)=(?P<value3>.*)\|(?P<field4>.*)=(?P<value4>.*)', line ).groups()
output:
(' A', ' 4 ', ' B', "'567' ", 'c', '4', 'D', "'aaa' ")
You can also try using \S* instead of .* if your fields and values do not contain whitespace. This will eliminate the whitespace from the output:
re.search( r'(?P<field1>\S*)\s*=\s*(?P<value1>\S*)\s*\|\s*(?P<field2>\S*)\s*=\s*(?P<value2>\S*)\s*\|\s*(?P<field3>\S*)\s*=\s*(?P<value3>\S*)\s*\|\s*(?P<field4>\S*)\s*=\s*(?P<value4>\S*)', line ).groupdict()
output:
{'field1': 'A',
'field2': 'B',
'field3': 'c',
'field4': 'D',
'value1': '4',
'value2': "'567'",
'value3': '4',
'value4': "'aaa'"
}
this will create related groups:
[ re.search( r'\s*([^=]+?)\s*=\s*(\S+)', group ).groups( ) for group in re.findall( r'([^=|]*\s*=\s*[^|]*)', line ) ]
output:
[('A', '4'), ('B', "'567'"), ('c', '4'), ('D', "'aaa'")]
does it help?
Assuming you don't have nasty things like nested quotes or unmatched quotes you can do it all with split and strip:
>>> line = " A= 4 | B='567' |c=4|D='aaa' "
>>> values = dict((x.strip(" '"), y.strip(" '")) for x,y in (entry.split('=') for entry in line.split('|')))
>>> values
{'A': '4', 'c': '4', 'B': '567', 'D': 'aaa'}
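If quoted values may themselves contain | or =, splitting on those characters breaks. A sketch of a single re.findall() pattern that handles this, assuming field names are word characters and quotes are properly matched (no escaped quotes inside values), is shown below. The name PAIR and the modified sample line are illustrative.

```python
import re

# name, '=', then a single-quoted value, a double-quoted value, or a bare token
PAIR = re.compile(r"""(\w+)\s*=\s*('[^']*'|"[^"]*"|[^|\s]+)""")

line = " A= 4 | B='567' |c=4|D='a|b=c' "
pairs = PAIR.findall(line)
print(pairs)
print(dict(pairs))
```

The quotes are kept in the captured values, so the caller can still tell strings from numbers before stripping them.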