Unable to match tuples items to list items - python

I have a list (tags) of integers. I want to map the list items to the value items of a dictionary (classes) and get the corresponding dictionary keys as output.
I am using:
h = classes.items()
for x in tags:
    for e in h:
        # print x, e,  # uncomment this line to make a diagnosis
        if x == e[1]:
            print e[0]
        else:
            print "No Match"
classes is the dictionary.
tags is the list whose items I want to map to classes. When I run this code, "No Match" is printed 2616 times.
2616 = 8 (number of tuples) * 327 (number of items in the tags list)

If I understand what you are trying to do maybe this will help
>>> tags
['0', '2', '1', '3', '4', '7', '2', '0', '1', '6', '3', '2', '8', '4', '1', '2', '0', '7', '5', '4', '1']
>>> classes
{'Tesla': 7, 'Nissan': 0, 'Honda': 5, 'Toyota': 6, 'Ford': 1, 'Mazda': 4, 'Ferrari': 2, 'Suzuki': 3}
tags is a list of strings, not integers - so let's convert it to a list of ints.
>>> tags = list(map(int, tags))
classes is a dictionary mapping car makes to ints, but we want to use the value as the lookup. We can invert the dictionary (swap keys and values)
>>> classes_inverse = {v: k for k, v in classes.items()}
Now this is what tags and classes_inverse look like
>>> tags
[0, 2, 1, 3, 4, 7, 2, 0, 1, 6, 3, 2, 8, 4, 1, 2, 0, 7, 5, 4, 1]
>>> classes_inverse
{0: 'Nissan', 1: 'Ford', 2: 'Ferrari', 3: 'Suzuki', 4: 'Mazda', 5: 'Honda', 6: 'Toyota', 7: 'Tesla'}
Now we can collect the values of the inverse dictionary for each item in the list.
>>> [classes_inverse.get(t, "No Match") for t in tags]
['Nissan', 'Ferrari', 'Ford', 'Suzuki', 'Mazda', 'Tesla', 'Ferrari', 'Nissan', 'Ford', 'Toyota', 'Suzuki', 'Ferrari', 'No Match', 'Mazda', 'Ford', 'Ferrari', 'Nissan', 'Tesla', 'Honda', 'Mazda', 'Ford']

For each tag, you iterate through all the dictionary items and print whether each one matched, when you expect at most one hit. For example, if you have 10 items, for each tag you'll print 1 hit and 9 misses.
Since you want to store this data, the easiest way is to invert the dictionary, i.e. turn key -> value into value -> key. However, this assumes that all values are unique, which your example implies.
def map_tags(tags, classes):
    tag_map = {value: key for key, value in classes.items()}
    return [tag_map.get(t, 'No match') for t in tags]
However, be careful. In your classes example the values are integers, while the tags are strings. The two types must match for the lookup to succeed. If the tags are intended to be strings, then change
tag_map.get(t, 'No match')
to
tag_map.get(int(t), 'No match')
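Inverting the dictionary keeps only the last key whenever two keys share a value. A minimal sketch, assuming duplicates might occur, that collects every key per value instead; the name map_tags_multi and the sample data are made up for illustration:

```python
from collections import defaultdict

def map_tags_multi(tags, classes):
    """Map each tag to the list of keys in `classes` whose value equals it."""
    inverse = defaultdict(list)
    for key, value in classes.items():
        inverse[value].append(key)
    # int(t) normalizes string tags such as '2' so they match integer values
    return [inverse.get(int(t), ['No match']) for t in tags]

classes = {'Ford': 1, 'Ferrari': 2, 'Lotus': 2}  # hypothetical data with a repeated value
print(map_tags_multi(['1', '2', '8'], classes))
# [['Ford'], ['Ferrari', 'Lotus'], ['No match']]
```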

Related

how can I compare two lists in python and if I have matches ~> I want matches and next value from another list

a = ['bww', '1', '23', 'honda', '2', '55', 'ford', '11', '88', 'tesla', '15', '1', 'kia', '2', '3']
b = ['ford', 'honda']
This should return all matches and the next value from list a.
Result -> ['ford', '11', 'honda', '2']
or even better ['ford 11', 'honda 2']
I am new to Python and would appreciate any help.
Here is a neat one-liner to solve what you are looking for. It uses a list comprehension, which iterates over 2 items (bi-gram) of the list at once and then combines the matching items with their next item using .join()
[' '.join([i,j]) for i,j in zip(a,a[1:]) if i in b] #<------
['honda 2', 'ford 11']
EXPLANATION:
You can use zip(a, a[1:]) to iterate over 2 items in the list at once (bi-gram), as a rolling window of size 2. It yields the tuples (a[0], a[1]), (a[1], a[2]), and so on.
Next you can compare the first item a[k] in each tuple (a[k], a[k+1]) with the elements of list b, using if i in b
If it matches, you keep that tuple and use ' '.join([i,j]) to join the two items into 1 string, as you expect.
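To make the rolling window concrete, here is a small sketch on a shortened copy of the list. Converting b to a set (the name wanted is introduced here) is an optional tweak, not part of the one-liner above:

```python
a = ['bww', '1', '23', 'honda', '2', '55', 'ford', '11', '88']
b = ['ford', 'honda']

pairs = list(zip(a, a[1:]))   # rolling window of size 2
print(pairs[:4])
# [('bww', '1'), ('1', '23'), ('23', 'honda'), ('honda', '2')]

wanted = set(b)               # set membership tests are O(1) on average
result = [' '.join([i, j]) for i, j in pairs if i in wanted]
print(result)
# ['honda 2', 'ford 11']
```

The result follows the order of a, which is why honda comes before ford.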
Rather than changing the data to suit the code (which some responders seem to think is appropriate) try this:
GROUP = 3
a = ['bmw', '1', '23', 'honda', '2', '55', 'ford', '11', '88', 'tesla', '15', '1', 'kia', '2', '3']
b = ['ford', 'honda']
c = [f'{a[i]} {a[i+1]}' for i in range(0, len(a)-1, GROUP) if a[i] in b]
print(c)
Output:
['honda 2', 'ford 11']
Note:
The assumption here is that input data are presented in groups of three but only the first two values in each triplet are needed.
If the assumption about grouping is wrong then:
c = [f'{a[i]} {a[i+1]}' for i in range(len(a)-1) if a[i] in b]
...which will be less efficient
Assuming all items are strings, and also assuming that every name in list a is followed by a number.
Code:-
a = ['bww', '1', 'honda', '2', 'ford', '11', 'tesla', '15', 'nissan', '2']
b = ['ford', 'honda']
res=[]
for check in b:
    for index in range(len(a)-1):
        if check==a[index]:
            res.append(check+" "+a[index+1])
print(res)
Output:-
['ford 11', 'honda 2']
List comprehension
Code:-
a = ['bww', '1', 'honda', '2', 'ford', '11', 'tesla', '15', 'nissan', '2']
b = ['ford', 'honda']
res=[check+" "+a[index+1] for check in b for index in range(len(a)-1) if check==a[index]]
print(res) #Same output
I hope this will help you
a = ['bww', 1, 'honda', 2, 'ford', 11, 'tesla', 15, 'nissan', 2]
b = ['ford', 'honda']
ls=[]
for item in b:
    if item in a:
        ls.append(item+" "+str(a[a.index(item)+1]))
print(ls)
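Note that a.index(item) rescans the list for every name in b. A sketch of a single-pass alternative, assuming each name is immediately followed by its value; the dictionary name following is introduced here for illustration:

```python
a = ['bww', '1', '23', 'honda', '2', '55', 'ford', '11', '88', 'tesla', '15', '1', 'kia', '2', '3']
b = ['ford', 'honda']

# Map every item to the item that follows it; the first occurrence wins.
following = {}
for item, nxt in zip(a, a[1:]):
    following.setdefault(item, nxt)

# Iterating over b keeps the results in b's order.
result = [f'{name} {following[name]}' for name in b if name in following]
print(result)
# ['ford 11', 'honda 2']
```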

How to store dictionary entries in a loop?

I've been trying to build a dictionary from a CSV file, with a tuple of strings as each key and an integer as each value, but am having trouble.
This is the code I have tried:
fullcsv = [['Brand', 'Swap', 'Candy1', 'Candy2', 'Capacity'],
           ['Willywonker', 'Yes', 'bubblegum', 'mints', '7'],
           ['Mars-CO', 'Yes', 'chocolate', 'bubblegum', '1'],
           ['Nestle', 'Yes', 'bears', 'bubblegum', '2'],
           ['Uncle Jims', 'Yes', 'chocolate', 'bears', '5']]
def findE(fullcsv):
    i = 0
    a = {}
    while i < len(fullcsv)-1:
        i = i + 1
        a[i] = ({(fullcsv[i][2],fullcsv[i][3]): int(fullcsv[i][4])})
    return a
This is the output for this chunk of code:
{1: {('bubblegum', 'mints'): 7},
2: {('chocolate', 'bubblegum'): 1},
3: {('bears', 'bubblegum'): 2},
4: {('chocolate', 'bears'): 5}}
But the output I'm looking for is more like this:
{('bubblegum', 'mints'): 7,
('chocolate', 'bubblegum'): 1,
('bears', 'bubblegum'): 2,
('chocolate', 'bears'): 5}
so that the tuples aren't numbered and also aren't in their own {}, but just in parentheses ().
Here is a slightly different way if you want.
def findE(fullcsv):
    new_dict = {}
    for entry in fullcsv[1:]:  # skip the header row
        # convert the capacity to int so the values match the desired output
        new_dict[(entry[2],entry[3])] = int(entry[-1])
    return new_dict
Within the function, you need to set the key/value pair of the dictionary like so:
a[(fullcsv[i][2],fullcsv[i][3])] = int(fullcsv[i][4])
so that the full function is
def findE(fullcsv):
    i = 0
    a = {}
    while i < len(fullcsv)-1:
        i = i + 1
        a[(fullcsv[i][2],fullcsv[i][3])] = int(fullcsv[i][4])
    return a
The general syntax is
dictionary[new_key] = new_value
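The same assignment can also be written as a dictionary comprehension. This is only a sketch on a shortened copy of the data; the name find_e and the unpacked column names are chosen here for readability:

```python
fullcsv = [['Brand', 'Swap', 'Candy1', 'Candy2', 'Capacity'],
           ['Willywonker', 'Yes', 'bubblegum', 'mints', '7'],
           ['Mars-CO', 'Yes', 'chocolate', 'bubblegum', '1']]

def find_e(rows):
    # rows[1:] skips the header; unpacking names the columns we need
    return {(candy1, candy2): int(capacity)
            for _brand, _swap, candy1, candy2, capacity in rows[1:]}

print(find_e(fullcsv))
# {('bubblegum', 'mints'): 7, ('chocolate', 'bubblegum'): 1}
```

If two rows share the same pair of candies, the later row overwrites the earlier one, just as with the explicit loop.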

How to match specific list sequence against dictreader row?

I have the following lists:
main_list:
[4, 1, 5]
iterated lists/two rows from dict:
['John', '1', '4', '3']
['Mary', '4', '1', '5']
the iterated list is from the below, dictionary being csv.DictReader(x):
for row in dictionary:
    print(list(row.values()))
I want the below to work, where if my main_list matches a sequence from the dictionary list, it will spit out the first column, in which the header is 'name':
if main_list in list(row.values()):
    print(row['name'])
For the example above, as Mary's items match 4, 1, 5, the final returned value should be Mary.
I'm new to Python, and I would appreciate any advice on how to work this out.
You can use extended tuple unpacking to split a row into its name and the rest.
name, *therest = ['Mary', '4', '1', '5']
Then make the comparison
test = [4, 1, 5]
if therest == [str(thing) for thing in test]:
    print(name)
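Put together with csv.DictReader, the idea might look like the sketch below. The in-memory CSV and the column names x, y, z are assumptions made for the example; only the name header comes from the question:

```python
import csv
import io

# Hypothetical CSV; only the 'name' header comes from the question.
sample = io.StringIO("name,x,y,z\nJohn,1,4,3\nMary,4,1,5\n")
main_list = [4, 1, 5]
target = [str(n) for n in main_list]   # CSV fields are always read as strings

matches = []
for row in csv.DictReader(sample):
    name, *rest = row.values()         # relies on 'name' being the first column
    if rest == target:
        matches.append(name)

print(matches)
# ['Mary']
```

Converting main_list to strings once, outside the loop, avoids repeating the conversion for every row.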

Converting colon separated list into a dict?

I wrote something like this to convert comma separated list to a dict.
def list_to_dict( rlist ) :
    rdict = {}
    i = len (rlist)
    while i:
        i = i - 1
        try :
            rdict[rlist[i].split(":")[0].strip()] = rlist[i].split(":")[1].strip()
        except :
            print rlist[i] + ' Not a key value pair'
            continue
    return rdict
Isn't there a way to
for i, row = enumerate rlist
rdict = tuple ( row )
or something?
You can do:
>>> li=['a:1', 'b:2', 'c:3']
>>> dict(e.split(':') for e in li)
{'a': '1', 'c': '3', 'b': '2'}
If the strings in the list require stripping, you can do:
>>> li=["a:1\n", "b:2\n", "c:3\n"]
>>> dict(t.split(":") for t in map(str.strip, li))
{'a': '1', 'b': '2', 'c': '3'}
Or, also:
>>> dict(t.split(":") for t in (s.strip() for s in li))
{'a': '1', 'b': '2', 'c': '3'}
If I understand your requirements correctly, then you can use the following one-liner.
def list_to_dict(rlist):
    return dict(map(lambda s : s.split(':'), rlist))
Example:
>>> list_to_dict(['alpha:1', 'beta:2', 'gamma:3'])
{'alpha': '1', 'beta': '2', 'gamma': '3'}
You might want to strip() the keys and values after splitting in order to trim white-space.
return dict(map(lambda s : map(str.strip, s.split(':')), rlist))
You mention both colons and commas so perhaps you have a string with key/values pairs separated by commas, and with the key and value in turn separated by colons, so:
def list_to_dict(rlist):
    return {k.strip():v.strip() for k,v in (pair.split(':') for pair in rlist.split(','))}
>>> list_to_dict('a:1,b:10,c:20')
{'a': '1', 'c': '20', 'b': '10'}
>>> list_to_dict('a:1, b:10, c:20')
{'a': '1', 'c': '20', 'b': '10'}
>>> list_to_dict('a : 1 , b: 10, c:20')
{'a': '1', 'c': '20', 'b': '10'}
This uses a dictionary comprehension iterating over a generator expression to create a dictionary containing the key/value pairs extracted from the string. strip() is called on the keys and values so that whitespace will be handled.
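The one-liners above raise a ValueError on an entry without a colon or with more than one, whereas the original function reported bad entries and carried on. A sketch that keeps that behavior, assuming everything after the first colon belongs to the value:

```python
def list_to_dict(rlist):
    """Build a dict from 'key:value' strings, skipping malformed entries."""
    rdict = {}
    for entry in rlist:
        key, sep, value = entry.partition(':')   # split at the first colon only
        if not sep:                              # no colon found
            print(f'{entry!r} is not a key/value pair')
            continue
        rdict[key.strip()] = value.strip()
    return rdict

print(list_to_dict(['a: 1', 'broken', 'time:12:30']))
# 'broken' is not a key/value pair
# {'a': '1', 'time': '12:30'}
```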

Flatten Entity-Attribute-Value (EAV) Schema in Python

I've got a csv file in something of an entity-attribute-value format (i.e., my event_id is non-unique and repeats k times for the k associated attributes):
event_id, attribute_id, value
1, 1, a
1, 2, b
1, 3, c
2, 1, a
2, 2, b
2, 3, c
2, 4, d
Are there any handy tricks to transform a variable number of attributes (i.e., rows) into columns? The key here is that the output ought to be an m x n table of structured data, where m = max(k); filling in missing attributes with NULL would be optimal:
event_id, 1, 2, 3, 4
1, a, b, c, null
2, a, b, c, d
My plan was to (1) convert the csv to a JSON object that looks like this:
data = [{'value': 'a', 'id': '1', 'event_id': '1', 'attribute_id': '1'},
{'value': 'b', 'id': '2', 'event_id': '1', 'attribute_id': '2'},
{'value': 'a', 'id': '3', 'event_id': '2', 'attribute_id': '1'},
{'value': 'b', 'id': '4', 'event_id': '2', 'attribute_id': '2'},
{'value': 'c', 'id': '5', 'event_id': '2', 'attribute_id': '3'},
{'value': 'd', 'id': '6', 'event_id': '2', 'attribute_id': '4'}]
(2) extract unique event ids:
events = set()
for item in data:
    events.add(item['event_id'])
(3) create a list of lists, where each inner list is a list of the attributes for the corresponding parent event.
attributes = [[k['value'] for k in j] for i, j in groupby(data, key=lambda x: x['event_id'])]
(4) create a dictionary that brings events and attributes together:
event_dict = dict(zip(events, attributes))
which looks like this:
{'1': ['a', 'b'], '2': ['a', 'b', 'c', 'd']}
I'm not sure how to get all inner lists to be the same length with NULL values populated where necessary. It seems like something that needs to be done in step (3). Also, creating n lists full of m NULL values had crossed my mind, then iterate through each list and populate the value using attribute_id as the list location; but that seems janky.
Your basic idea seems right, though I would implement it as follows:
import csv

events = {}  # we're going to keep track of the events we read in
with open('path/to/input', newline='') as infile:
    reader = csv.reader(infile, skipinitialspace=True)  # the sample has a space after each comma
    next(reader)  # skip the header row
    for event, _att, val in reader:
        event = int(event)  # use the same key type for the check and the append
        if event not in events:
            events[event] = []
        events[event].append(val)  # track all the values for this event

maxAtts = max(len(v) for v in events.values())  # the maximum number of attributes for any event

with open('path/to/output', 'w', newline='') as outfile:
    writer = csv.writer(outfile)
    writer.writerow(["event_id"] + list(range(1, maxAtts+1)))  # write out the header row
    for k in sorted(events):  # let's look at the events in sorted order
        # write out the event id, all the values for that event, and pad with "null" for any attributes without values
        writer.writerow([k] + events[k] + ['null']*(maxAtts-len(events[k])))
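The code above appends values in the order they are read, so an attribute missing from the middle of an event would shift the later values left. A sketch that places each value by its attribute_id instead, assuming attribute ids are integers; the in-memory sample (where event 2 deliberately lacks attribute 2) is made up for the example:

```python
import csv
import io

# Sample input in the question's format; event 2 deliberately lacks attribute 2.
raw = io.StringIO("event_id, attribute_id, value\n1, 1, a\n1, 2, b\n2, 1, a\n2, 3, c\n")

events = {}                                   # event_id -> {attribute_id: value}
reader = csv.reader(raw, skipinitialspace=True)
next(reader)                                  # skip the header row
for event, att, val in reader:
    events.setdefault(int(event), {})[int(att)] = val

# One column per attribute id seen anywhere in the input.
columns = sorted({att for atts in events.values() for att in atts})
table = [["event_id"] + columns]
for event in sorted(events):
    table.append([event] + [events[event].get(att, "null") for att in columns])

for row in table:
    print(row)
# ['event_id', 1, 2, 3]
# [1, 'a', 'b', 'null']
# [2, 'a', 'null', 'c']
```

The rows in table can be passed to csv.writer's writerows to produce the output file.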
