python list parsing example

I'd like to know how to parse (or split) an element of a list.
I have a list of lists (of string) such as:
resultList = [['TWP-883 PASS'], ['TWP-1080 PASS'], ['TWP-1081 PASS']]
where:
resultList[0] = ['TWP-883 PASS']
resultList[1] = ['TWP-1080 PASS']
Essentially, I need a variable for each of the two entries in each element of the list. For example:
issueId = 'TWP-883'
status = 'PASS'
What would allow for iterating through this list and parsing such as above?

Well that's as simple as:
# You can also assign as you iterate as suggested in the comments.
# Assumes each inner list holds two separate strings, e.g. ['TWP-883', 'PASS'].
for issue, status in resultList:
    print issue, status
This outputs
TWP-883 PASS
TWP-1080 PASS
TWP-1081 PASS
TWP-1082 PASS
TWP-884 FAIL
TWP-885 PASS
Here's another example:
>>> x = [1, 2] # or (1, 2), or '12' works with collections
>>> y, z = x
>>> y
1
>>> z
2
>>>
Incidentally, in Python 3.x, you can also do:
In [1]: x = [1, 2, 3, 4]
In [2]: y, z, *rest = x
In [3]: y
Out[3]: 1
In [4]: z
Out[4]: 2
In [5]: rest
Out[5]: [3, 4]

You just need a simple for loop that exploits Python's tuple unpacking machinery.
for issueId, status in resultList:
    # Do stuff with issueId and status
    pass
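With the data exactly as shown in the question, each inner list holds a single string, so unpacking it directly into two names would raise a ValueError. A minimal sketch, assuming every string contains exactly one ID and one status separated by whitespace (the name parsed is introduced here only for illustration):

```python
# Data copied from the question: each inner list holds ONE string.
resultList = [['TWP-883 PASS'], ['TWP-1080 PASS'], ['TWP-1081 PASS']]

parsed = []  # hypothetical name: collects (issueId, status) pairs
for (entry,) in resultList:          # unpack the single string from each inner list
    issueId, status = entry.split()  # split on whitespace into exactly two parts
    parsed.append((issueId, status))

for issueId, status in parsed:
    print(issueId, status)
```

The (entry,) target fails loudly if an inner list ever holds more than one string, which is usually preferable to silently ignoring extra items.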

Note: I changed this answer to reflect an edit of the question. Specifically, I added a split() to separate the strings in the nested lists into two strings (issueId and status).
I would use list and dictionary comprehensions to turn your list of lists into a list of dictionaries with the keys issueId and status:
resultList = [['TWP-883 PASS'], ['TWP-1080 PASS'], ['TWP-1081 PASS']]
result_dicts = [{("issueId","status")[x[0]]:x[1] for x in enumerate(lst[0].split())} for lst in resultList]
Lookups can now be done in this way:
>>> result_dicts[0]["status"]
'PASS'
>>> result_dicts[0]["issueId"]
'TWP-883'
>>> result_dicts[1]
{'status': 'PASS', 'issueId': 'TWP-1080'}
>>>
To declare variables for each value in each dictionary in the list and print them, use the code below:
for entry in result_dicts:
    issueId = entry["issueId"]
    status = entry["status"]
    print("The status of {0: <10} is {1}".format(issueId, status))
Output:
The status of TWP-883    is PASS
The status of TWP-1080   is PASS
The status of TWP-1081   is PASS
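The nested comprehension above can be hard to read. One possibly simpler way to build the same list of dictionaries is to pair the key names with the split parts using zip. This is only a sketch, and the tuple name keys is an assumption introduced here:

```python
resultList = [['TWP-883 PASS'], ['TWP-1080 PASS'], ['TWP-1081 PASS']]
keys = ("issueId", "status")  # hypothetical helper name

# zip pairs each key with the matching piece of the split string
result_dicts = [dict(zip(keys, lst[0].split())) for lst in resultList]

print(result_dicts[0]["issueId"], result_dicts[0]["status"])
```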

If you wish to do more transformations later, use a generator expression:
g = (func(issueid, status) for issueid, status in resultList) # func returns non-None objects
If you just want to consume the iterable:
for issueid, status in resultList:
    # print(issueid, status)
    # whatever
    pass

You can get the lists of strings with
issueId = [y for x in resultList for (i, y) in enumerate(x[0].split()) if i % 2 == 0]
status = [y for x in resultList for (i, y) in enumerate(x[0].split()) if i % 2 == 1]
To go through every issueId and its corresponding status, you can use
for id, st in zip(issueId, status):
    print(id, " : ", st)

Related

Add to set elements of a splitted string

I'm receiving values as strings, separated by comma.
Example:
alpha, gane, delta
delta, opsirom, nado
I want to obtain a sorted list/set of unique values. I'm trying to use a set for uniqueness:
app = set()
for r in result:
    app = app | set(r.split[","])
but I get the following error:
TypeError: 'builtin_function_or_method' object is not subscriptable

If I understand your input correctly, I would use a mix of split and replace, plus set for uniqueness as you stated:
value_1 = "alpha, gane, delta, alpha"
aux_1 = value_1.replace(" ","").split(",")
a = list(set(aux_1))
print(a)
#Another list formatted as string arrives:
value_2 = "alpha, beta, omega, beta"
aux_2 = value_2.replace(" ","").split(",")
#Option 1:
a += list(set(aux_2))
a = list(set(a))
print(a)
#Option 2:
for i in aux_2:
    if i in a:
        pass
    else:
        a.append(i)
print(a)
Output for both cases:
['delta', 'gane', 'omega', 'beta', 'alpha']
After you receive another string, you can add its values to the full list (a in this case) and use set() again to eliminate further duplicates. Alternatively, check each value in the string against the full list and append it only if it is not already there.

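You can also use the code below (inputs is the comma-separated string; strip() removes the space after each comma):
splited_inputs = [s.strip() for s in inputs.split(',')]
unique_values = list(dict.fromkeys(splited_inputs))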

Try this:
s = "alpha, gane, delta, delta, opsirom, nado"
unique_values = list(set(s.split(', ')))
print(unique_values)
outputs:
['opsirom', 'delta', 'alpha', 'gane', 'nado']

You are not too far off. The immediate problem was the use of [] instead of () for the split function call.
In [151]: alist = """alpha, gane, delta
...: delta, opsirom, nado""".splitlines()
In [152]: alist
Out[152]: ['alpha, gane, delta', 'delta, opsirom, nado']
In [153]: aset = set()
In [154]: for astr in alist:
     ...:     aset |= set(astr.split(', '))
     ...:
In [155]: aset
Out[155]: {'alpha', 'delta', 'gane', 'nado', 'opsirom'}
The use of | to join sets is fine; I used the |= version. The split delimiter needed to be tweaked to avoid having both 'delta' and ' delta' in the result. Otherwise you might need to apply strip to each string. @Victor got this part right.
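The question also asked for the unique values to be sorted. A minimal sketch that combines the set approach with strip and sorted, assuming the input arrives as a list of comma-separated strings (the names lines, unique and result are illustrative):

```python
lines = ["alpha, gane, delta", "delta, opsirom, nado"]

unique = set()
for line in lines:
    # strip() removes the space left behind after each comma
    unique.update(part.strip() for part in line.split(","))

result = sorted(unique)  # sorted() returns a new, alphabetically ordered list
print(result)
```

Using strip instead of a ', ' delimiter tolerates inputs where the spacing after commas is inconsistent.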

List of dicts: Getting list of matching dictionary based on id

I'm trying to get the matching IDs and store the data into one list. I have a list of dictionaries:
list = [
{'id':'123','name':'Jason','location': 'McHale'},
{'id':'432','name':'Tom','location': 'Sydney'},
{'id':'123','name':'Jason','location':'Tompson Hall'}
]
Expected output would be something like
# {'id':'123','name':'Jason','location': ['McHale', 'Tompson Hall']},
# {'id':'432','name':'Tom','location': 'Sydney'},
How can I get matching data based on dict ID value? I've tried:
for item in mylist:
    list2 = []
    row = any(list['id'] == list.id for id in list)
    list2.append(row)
This doesn't work (it throws: TypeError: tuple indices must be integers or slices, not str). How can I get all items with the same ID and store into one dict?

First, you're iterating through the list of dictionaries in your for loop, but never referencing the current dictionary, which is stored in item. I think when you wrote list['id'] you meant item['id'].
Second, any() returns a boolean (true or false), which isn't what you want. Instead, maybe try row = [dic for dic in list if dic['id'] == item['id']]
Third, if you define list2 within your for loop, it will go away every iteration. Move list2 = [] before the for loop.
That should give you a good start. Remember that row is just a list of all dictionaries that have the same id.
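Putting the three points together, the corrected loop might look like the sketch below. It uses mylist rather than list as the variable name, and it only gathers the matching dictionaries; it does not yet merge them:

```python
mylist = [
    {'id': '123', 'name': 'Jason', 'location': 'McHale'},
    {'id': '432', 'name': 'Tom', 'location': 'Sydney'},
    {'id': '123', 'name': 'Jason', 'location': 'Tompson Hall'},
]

list2 = []  # defined once, before the loop, so it survives every iteration
for item in mylist:
    # every dictionary whose id matches the current item's id
    row = [dic for dic in mylist if dic['id'] == item['id']]
    list2.append(row)

print(len(list2[0]), len(list2[1]))
```

Note that list2 ends up with one row per input dictionary, so ids that occur more than once produce repeated rows.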

I would use kdopen's approach along with a merging method after converting the dictionary entries I expect to become lists into lists. Of course if you want to avoid redundancy then make them sets.
mylist = [
{'id':'123','name':['Jason'],'location': ['McHale']},
{'id':'432','name':['Tom'],'location': ['Sydney']},
{'id':'123','name':['Jason'],'location':['Tompson Hall']}
]
def merge(mylist,ID):
    matches = [d for d in mylist if d['id']== ID]
    shell = {'id':ID,'name':[],'location':[]}
    for m in matches:
        shell['name']+=m['name']
        shell['location']+=m['location']
        mylist.remove(m)
    mylist.append(shell)
    return mylist
updated_list = merge(mylist,'123')

Given this input
mylist = [
{'id':'123','name':'Jason','location': 'McHale'},
{'id':'432','name':'Tom','location': 'Sydney'},
{'id':'123','name':'Jason','location':'Tompson Hall'}
]
You can just extract it with a comprehension
matched = [d for d in mylist if d['id'] == '123']
Then you want to merge the locations. Assuming matched is not empty
final = matched[0]
final['location'] = [d['location'] for d in matched]
Here it is in the interpreter
In [1]: mylist = [
...: {'id':'123','name':'Jason','location': 'McHale'},
...: {'id':'432','name':'Tom','location': 'Sydney'},
...: {'id':'123','name':'Jason','location':'Tompson Hall'}
...: ]
In [2]: matched = [d for d in mylist if d['id'] == '123']
In [3]: final=matched[0]
In [4]: final['location'] = [d['location'] for d in matched]
In [5]: final
Out[5]: {'id': '123', 'location': ['McHale', 'Tompson Hall'], 'name': 'Jason'}
Obviously, you'd want to replace '123' with a variable holding the desired id value.
Wrapping it all up in a function:
def merge_all(df):
    ids = {d['id'] for d in df}
    result = []
    for id in ids:
        matches = [d for d in df if d['id'] == id]
        combined = matches[0]
        combined['location'] = [d['location'] for d in matches]
        result.append(combined)
    return result
Also, please don't use list as a variable name. It shadows the builtin list class.
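As an alternative sketch, the grouping can be done in a single pass with collections.defaultdict, without modifying the input dictionaries. The function name merge_by_id and its helper names are assumptions made for this example, and it assumes records sharing an id also share the same name:

```python
from collections import defaultdict

def merge_by_id(records):
    """Group records by 'id', collecting every 'location' into a list."""
    locations = defaultdict(list)  # id -> list of locations
    first_seen = {}                # id -> first record seen with that id
    for rec in records:
        locations[rec['id']].append(rec['location'])
        first_seen.setdefault(rec['id'], rec)
    # dict(original, location=...) makes a copy, leaving the input untouched
    return [dict(first_seen[i], location=locs) for i, locs in locations.items()]

mylist = [
    {'id': '123', 'name': 'Jason', 'location': 'McHale'},
    {'id': '432', 'name': 'Tom', 'location': 'Sydney'},
    {'id': '123', 'name': 'Jason', 'location': 'Tompson Hall'},
]
merged = merge_by_id(mylist)
print(merged)
```

Unlike the expected output in the question, this version always stores location as a list, even when an id occurs only once.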

how to modify list of actual variables in python

I want to do the same thing to a list of objects. I can call a function on all 3 of them like:
x = double(x)
y = double(y)
z = double(z)
but even by pre-modern standards this seems hideous. I do
In [4]: z = 0
In [5]: x = 0
In [6]: y = 0
In [7]: items = [x, y, z]
In [8]: for item in items:
   ...:     item = 5
   ...:
In [9]: print(x)
0
and no dice. How do you operate on a list of variables? I've been reading about getattr, but it doesn't seem to work for me.
I want to iterate over x, y, and z in this case and set them all equal to 5 in two lines of code: the for item in items line, and then a line that modifies each item.
For now, I get odd behavior, like all items are directly equal to each other:
In [11]: for item in items:
   ....:     print item is x
   ....:
True
True
True
In [12]: for item in items:
   ....:     print item is y
   ....:
True
True
True
I don't actually want a list, I want them to stay in memory with the same name so I can immediately do:
return {'income': income, 'ex_date': exdate, ...}
I didn't give enough info, but here is the goal, all 4 will have to be handled separately:
total_income = self.get_total_income(portfolio_id)
short_term = get_short_term(distros, 'ST')
long_term = get_lt(distros, 'LT')
total_distributions = self.get_totals(portfolio_id)
items = [total_income, short_term, long_term, total_distributions]
for item in items:
    do the same thing
return {'total_income': total_income, 'short_term': short_term, 'long_term': long_term, 'total_distributions': total_distributions}

The problem with your loop is that all you do is assign 5 to the name item, but never do anything to the original list.
To modify a list (or, more precise, get a new list of modified values) you can either use a list comprehension or map a function to all elements of a list.
>>> items = [0, 0, 0]
>>> [float(x) for x in items]
[0.0, 0.0, 0.0]
>>> list(map(float, items))  # list() is needed on Python 3, where map returns an iterator
[0.0, 0.0, 0.0]
Another example:
>>> items = [1,2,3]
>>> [x*10 for x in items]
[10, 20, 30]
>>> list(map(str, items))
['1', '2', '3']
edit in response to your comment on the question:
ya, I used a few yesterday. I don't actually want a list though, I want them to stay in memory with the same name
In that case, use a dictionary as your data structure instead of n lone variables. Example:
>>> my_values = {'x': 1, 'y': 2, 'z':3}
>>> my_values['y']
2
You can modify all your values by a given rule with a dictionary comprehension.
>>> my_values = {key:value*2 for key, value in my_values.items()}
>>> my_values
{'y': 4, 'x': 2, 'z': 6}
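A short sketch of why the loop in the question leaves x untouched, and how the dictionary approach avoids the problem. The dictionary name values is an assumption made for this example:

```python
x = y = z = 0
items = [x, y, z]

for item in items:
    item = 5  # rebinds the local name "item" only; x and the list are unaffected

print(x, items)

# In CPython all three names refer to the same int object 0,
# which is why "item is x" and "item is y" were True for every element.

values = {'x': 0, 'y': 0, 'z': 0}
for key in list(values):  # iterate over a copy of the keys
    values[key] = 5       # assignment through the container does stick

print(values)
```

The difference is that values[key] = 5 mutates the container, whereas item = 5 merely points a temporary name at a new object.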

You don't even need list comprehensions here. In your particular case this is enough:
x = y = z = 5

If you want to keep the same variable names but give them new values, you can do:
[x, y, z] = [5 for _ in items]
or
[x, y, z] = [double(i) for i in items]

What about that?
x = 1
y = 2
z = 3
l = [x, y, z]
l = [i**2 for i in l]
x, y, z = l
print(x, y, z)
1 4 9

How do you operate on a list of variables
Generally speaking you don't. If you have values you want to operate on as a list then you store them in a list not in individual variables. Or you can store them as items in a dictionary, or attributes of an object, just don't put them in separate variables.
For the particular case you asked about you could do this:
x, y, z = map(double, [x, y, z])
or for your other example:
total_income = self.get_total_income(var)
short_term = get_short_term(var)
long_term = get_lt(var)
total_distributions = self.get_totals(var)
items = [total_income, short_term, long_term, total_distributions]
result = {
    'total_income': total_income,
    'short_term': short_term,
    'long_term': long_term,
    'total_distributions': total_distributions
}
# N.B. Using list(...) to avoid modifying the dict while iterating.
for key, value in list(result.items()):
    result[key] = do_something(value)
return result

Double loop for whois.py

I'm using a dictionary with the id as the key and the names as the values. What I'm trying to do is get the names in the values that have the same name in them and put them in a list. Like for example with the name tim:
{'id 1': ['timmeh', 'user543', 'tim'], 'id 2': ['tim', 'timmeh', '!anon0543']}
whois_list = ['timmeh', 'user543', 'tim', '!anon0543']
The bot would append the names that are not in list yet. This is the code to execute this example:
def who(name):
    whois_list = []
    if not any(l for l in whois.whoisDB.values() if name.lower() in l):
        return "No alias found for <b>%s</b>." % name.title()
    else:
        for l in whois.whoisDB.values():
            if name.lower() in l:
                for names in l:
                    if names not in whois_list:
                        whois_list.append(names)
        return "Possible alias found for <b>%s</b>: %s" % (name.title(), whois_list)
The issue is: I do not want to have a double loop in this code, but I'm not really sure how to do it, if it's possible.

A logically equivalent, but shorter and more efficient solution is to use sets instead of lists.
Your innermost for loop simply extends whois_list with every non-duplicate name in l. If you originally define whois_list = set([]) then you can replace the three lines of the inner for loop with:
whois_list = whois_list.union(l)
For example,
>>> a = set([1,2,3])
>>> a = a.union([3,4,5])
>>> a
set([1, 2, 3, 4, 5])
You'll notice a prints out slightly differently, indicating that it is a set instead of a list. If this is a problem, you could convert it right before your return statement as in
>>> a = list(a)
>>> a
[1, 2, 3, 4, 5]
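Applied to the function from the question, the set-based idea might look like the sketch below. To keep it self-contained, the database is passed in as a parameter named whois_db, a stand-in for whois.whoisDB, and the aliases are sorted before formatting so the output is deterministic. Both choices are assumptions of this example, not part of the original code:

```python
def who(name, whois_db):
    """whois_db is a stand-in for whois.whoisDB: it maps an id to a list of names."""
    aliases = set()
    for names in whois_db.values():
        if name.lower() in names:
            aliases.update(names)  # adds only names not already present
    if not aliases:
        return "No alias found for <b>%s</b>." % name.title()
    return "Possible alias found for <b>%s</b>: %s" % (name.title(), sorted(aliases))

db = {'id 1': ['timmeh', 'user543', 'tim'], 'id 2': ['tim', 'timmeh', '!anon0543']}
print(who('tim', db))
print(who('bob', db))
```

Because an empty set means no entry matched, the separate any() pass over the database is no longer needed.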

using FOR statement on 2 elements at once python

I have the following list of variables and a mastervariable
a = (1,5,7)
b = (1,3,5)
c = (2,2,2)
d = (5,2,8)
e = (5,5,8)
mastervariable = (3,2,5)
I'm trying to check if 2 elements in each variable exist in the master variable, such that the above would show B (3,5) and D (5,2) as having at least 2 elements matching in the mastervariable. Also note that using sets would result in C showing up as matching, but I don't want to count C because only one of the elements in C is in mastervariable (i.e. 2 only shows up once in mastervariable, not twice).
I currently have the very inefficient:
if current_variable[0] == mastervariable[0]:
    if current_variable[1] == mastervariable[1]:
        True
    elif current_variable[2] == mastervariable[1]:
        True
    #### I don't use OR here because I need to know which variables match.
elif current_variable[1] == mastervariable[0]: ##<-- I'm now checking 2nd element
    # etc. etc.
I then continue to iterate like the above by checking each one at a time which is extremely inefficient. I did the above because using a FOR statement resulted in me checking the first element twice which was incorrect:
for i in a:
    for j in a:
        ### this checked if 1 was in the master variable and not 1,5 or 1,7
Is there a way to use 2 FOR statement that allows me to check 2 elements in a list at once while skipping any element that has been used already? Alternatively, can you suggest an efficient way to do what I'm trying?
Edit: Mastervariable can have duplicates in it.

For the case where matching elements can be duplicated so that set breaks, use Counter as a multiset - the duplicates between a and master are found by:
from collections import Counter
count_a = Counter(a)
count_master = Counter(master)
count_both = count_a + count_master
dups = Counter({e : min((count_a[e], count_master[e])) for e in count_a if count_both[e] > count_a[e]})
The logic is reasonably intuitive: if there's more of an item in the combined count of a and master, then it is duplicated, and the multiplicity is however many of that item are in whichever of a and master has less of them.
It gives a Counter of all the duplicates, where the count is their multiplicity. If you want it back as a tuple, you can do tuple(dups.elements()):
>>> a
(2, 2, 2)
>>> master
(1, 2, 2)
>>> dups = Counter({e : min((count_a[e], count_master[e])) for e in count_a if count_both[e] > count_a[e]})
>>> tuple(dups.elements())
(2, 2)
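For comparison, Counter also supports the & operator, which computes the multiset intersection (the minimum of the two counts for each element) directly. A minimal sketch using the same sample values; the name common is introduced here for illustration:

```python
from collections import Counter

a = (2, 2, 2)
master = (1, 2, 2)

# & keeps each element with the smaller of its two counts
common = Counter(a) & Counter(master)
print(tuple(common.elements()))
```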

Seems like a good job for sets. Edit: sets aren't suitable since mastervariable can contain duplicates. Here is a version using Counters.
>>> a = (1,5,7)
>>>
>>> b = (1,3,5)
>>>
>>> c = (2,2,2)
>>>
>>> d = (5,2,8)
>>>
>>> e = (5,5,8)
>>> D=dict(a=a, b=b, c=c, d=d, e=e)
>>>
>>> from collections import Counter
>>> mastervariable = (5,5,3)
>>> mvc = Counter(mastervariable)
>>> for k,v in D.items():
...     vc = Counter(v)
...     if sum(min(count, vc[item]) for item, count in mvc.items())==2:
...         print k
...
b
e
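The same idea can be wrapped in a small Python 3 function. This is a sketch under stated assumptions: the function name matching and the minimum parameter are made up for this example, and the check uses >= because the question asks for at least two matching elements:

```python
from collections import Counter

def matching(candidates, master, minimum=2):
    """Return the names whose tuples share at least `minimum` elements with master."""
    master_counts = Counter(master)
    return [
        name
        for name, values in candidates.items()
        # & is the multiset intersection, so duplicates are counted correctly
        if sum((Counter(values) & master_counts).values()) >= minimum
    ]

candidates = {
    'a': (1, 5, 7),
    'b': (1, 3, 5),
    'c': (2, 2, 2),
    'd': (5, 2, 8),
    'e': (5, 5, 8),
}
print(matching(candidates, (3, 2, 5)))
print(matching(candidates, (5, 5, 3)))
```

With the master variable from the question, (3, 2, 5), c is excluded because 2 occurs only once in the master variable.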
