I am working on implementing a natural join in Python. The first two lines show the tables' attributes, and the next two show each table's tuples (rows).
Expected Output:
[['A', 1, 'A', 'a', 'A'],
['A', 1, 'A', 'a', 'Y'],
['A', 1, 'Y', 'a', 'A'],
['A', 1, 'Y', 'a', 'Y'],
['S', 2, 'B', 'b', 'S']]
And what I got:
[['A', 1, 'A', 'a', 'A', 'Y'],
['A', 1, 'A', 'a', 'A', 'Y']]
I have looked through the code and everything seems to be right. I would appreciate any help.
t1atts = ('A', 'B', 'C', 'D')
t2atts = ('B', 'D', 'E')
t1tuples = [['A', 1, 'A', 'a'],
['B', 2, 'Y', 'a'],
['Y', 4, 'B', 'b'],
['A', 1, 'Y', 'a'],
['S', 2, 'B', 'b']]
t2tuples = [[1, 'a', 'A'],
[3, 'a', 'B'],
[1, 'a', 'Y'],
[2, 'b', 'S'],
[3, 'b', 'E']]
def findindices(t1atts, t2atts):
    t1index=[]
    t2index=[]
    for index, att in enumerate(t1atts):
        for index2, att2 in enumerate(t2atts):
            if att == att2:
                t1index.append(index)
                t2index.append(index2)
    return t1index, t2index

def main():
    tpl=0; tpl2=0; i=0; j=0; count=0; result=[]
    t1index, t2index = findindices(t1atts, t2atts)
    for tpl in t1tuples:
        while tpl2 in range(len(t2tuples)):
            i=0; j=0
            while (i in range(len(t1index))) and (j in range(len(t2index))):
                if tpl[t1index[i]] != t2tuples[tpl2][t2index[j]]:
                    i=len(t1index)
                    j=len(t1index)
                else:
                    count+=1
                i+=1
                j+=1
            if count == len(t1index):
                extravals = [val for index, val in enumerate(t2tuples[tpl2]) if index not in t2index]
                temp = tpl
                tpl += extravals
                result.append(tpl)
                tpl = temp
            count=0
            tpl2+=1
    print result
Here's what I came up with. I'd do some more refactoring before calling it done.
import pprint
t1atts = ('A', 'B', 'C', 'D')
t2atts = ('B', 'D', 'E')
t1tuples = [
['A', 1, 'A', 'a'],
['B', 2, 'Y', 'a'],
['Y', 4, 'B', 'b'],
['A', 1, 'Y', 'a'],
['S', 2, 'B', 'b']]
t2tuples = [
[1, 'a', 'A'],
[3, 'a', 'B'],
[1, 'a', 'Y'],
[2, 'b', 'S'],
[3, 'b', 'E']]
t1columns = set(t1atts)
t2columns = set(t2atts)
t1map = {k: i for i, k in enumerate(t1atts)}
t2map = {k: i for i, k in enumerate(t2atts)}
join_on = t1columns & t2columns
diff = t2columns - join_on
def match(row1, row2):
    return all(row1[t1map[rn]] == row2[t2map[rn]] for rn in join_on)

results = []
for t1row in t1tuples:
    for t2row in t2tuples:
        if match(t1row, t2row):
            row = t1row[:]
            for rn in diff:
                row.append(t2row[t2map[rn]])
            results.append(row)
pprint.pprint(results)
And I get the expected results:
[['A', 1, 'A', 'a', 'A'],
['A', 1, 'A', 'a', 'Y'],
['A', 1, 'Y', 'a', 'A'],
['A', 1, 'Y', 'a', 'Y'],
['S', 2, 'B', 'b', 'S']]
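As a possible next refactoring step, the same join could be wrapped in a function. This is only a sketch, assuming each table is given as a tuple of attribute names plus a list of rows, as above. The name natural_join and its parameters are made up for this illustration. It keeps the extra columns in table order (a set has no guaranteed order) and never modifies the input rows.

```python
def natural_join(atts1, rows1, atts2, rows2):
    """Hypothetical helper: return (attributes, rows) of the natural join."""
    # Shared attribute names, in the order they appear in the first table.
    shared = [a for a in atts1 if a in atts2]
    idx1 = [atts1.index(a) for a in shared]
    idx2 = [atts2.index(a) for a in shared]
    # Positions of the columns that only the second table has, in table order.
    extra = [i for i, a in enumerate(atts2) if a not in shared]

    joined = []
    for r1 in rows1:
        for r2 in rows2:
            if all(r1[i] == r2[j] for i, j in zip(idx1, idx2)):
                # Build a new list so the input rows stay untouched.
                joined.append(list(r1) + [r2[k] for k in extra])
    out_atts = tuple(atts1) + tuple(atts2[k] for k in extra)
    return out_atts, joined


t1atts = ('A', 'B', 'C', 'D')
t2atts = ('B', 'D', 'E')
t1tuples = [['A', 1, 'A', 'a'], ['B', 2, 'Y', 'a'], ['Y', 4, 'B', 'b'],
            ['A', 1, 'Y', 'a'], ['S', 2, 'B', 'b']]
t2tuples = [[1, 'a', 'A'], [3, 'a', 'B'], [1, 'a', 'Y'],
            [2, 'b', 'S'], [3, 'b', 'E']]

atts, rows = natural_join(t1atts, t1tuples, t2atts, t2tuples)
print(atts)
for row in rows:
    print(row)
```

Building each result row with list(r1) + [...] creates a fresh list every time, so rows from the first table are never extended in place.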
OK, here is a solution; please verify it and let me know if it works for you.
I changed some of the names to make the code easier for me to follow:
#!/usr/bin/python
table1 = ('A', 'B', 'C', 'D')
table2 = ('B', 'D', 'E')
row1 = [['A', 1, 'A', 'a'],
['B', 2, 'Y', 'a'],
['Y', 4, 'B', 'b'],
['A', 1, 'Y', 'a'],
['S', 2, 'B', 'b']]
row2 = [[1, 'a', 'A'],
[3, 'a', 'B'],
[1, 'a', 'Y'],
[2, 'b', 'S'],
[3, 'b', 'E']]
def findindices(table1, table2):
    # Attribute names shared by both tables.
    inter = set(table1).intersection(set(table2))
    tup_index1 = [table1.index(x) for x in inter]
    tup_index2 = [table2.index(x) for x in inter]
    return tup_index1, tup_index2

def main():
    final_lol = list()
    tup_index1, tup_index2 = findindices(table1, table2)
    # list() so the pairs can be iterated repeatedly (zip is a one-shot iterator in Python 3).
    merge_tup = list(zip(tup_index1, tup_index2))
    for tup1 in row1:
        for tup2 in row2:
            for m in merge_tup:
                if tup1[m[0]] != tup2[m[1]]:
                    break
            else:
                # No break: every shared column matched.
                ls = []
                ls.extend(tup1)
                # Assumes the only non-shared column of table2 is its last one.
                ls.append(tup2[-1])
                final_lol.append(ls)
    return final_lol

if __name__ == '__main__':
    import pprint
    pprint.pprint(main())
Output:
[['A', 1, 'A', 'a', 'A'],
['A', 1, 'A', 'a', 'Y'],
['A', 1, 'Y', 'a', 'A'],
['A', 1, 'Y', 'a', 'Y'],
['S', 2, 'B', 'b', 'S']]
Related
I created two lists in Python:
ls = []
a = ['a','b','c','d','e','f']
i = 0
while i < 5:
    x = a[-1]
    a.pop(-1)
    a.insert(0, x)
    ls.insert(0, a)
    i += 1
print(ls)
What I want to do is add each rotation of the list of letters to the empty list, so that the result looks like this:
ls = [
['a','b','c','d','e','f'],
['f','a','b','c','d','e'],
['e','f','a','b','c','d'],
['d','e','f','a','b','c'],
['c','d','e','f','a','b'],
['b','c','d','e','f','a']
]
I would like to know where I made a mistake and what the solution is.
A list is a mutable object in Python, so when you insert the list a into ls, you are only adding a reference to a, not a copy of its current contents. Every entry in ls therefore points to the same list.
A workaround is to insert a copy of a into ls. You can create a copy with list(a), with the list's own a.copy() method, or with copy.copy(a) from the copy module. So ls.insert(0, a.copy()) would give the same result as the code below:
ls = []
a = ['a','b','c','d','e','f']
i = 0
while i < 5:
    x = a[-1]
    a.pop(-1)
    a.insert(0, x)
    ls.insert(0, list(a)) # updated this
    i += 1
print(ls)
Output:
[['b', 'c', 'd', 'e', 'f', 'a'], ['c', 'd', 'e', 'f', 'a', 'b'], ['d', 'e', 'f', 'a', 'b', 'c'], ['e', 'f', 'a', 'b', 'c', 'd'], ['f', 'a', 'b', 'c', 'd', 'e']]
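The difference between storing a reference and storing a copy can be checked directly. A minimal sketch (the values are arbitrary and only for illustration):

```python
a = ['x', 'y']
ls = []
ls.append(a)        # stores a reference to the same list object
ls.append(list(a))  # stores an independent copy of the current contents

a.append('z')       # mutate the original list afterwards

print(ls[0] is a)   # True: the first entry is the very same object
print(ls[1] is a)   # False: the second entry is a separate list
print(ls)           # [['x', 'y', 'z'], ['x', 'y']]
```

Only the entry that was stored by reference sees the later change to a.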
Another easy way to get your expected output:
ls = []
a = ['a','b','c','d','e','f']
for i in range(6):
    ls.append(a.copy())
    a = [a[-1]] + a[:-1]
print(ls)
Output :
[['a', 'b', 'c', 'd', 'e', 'f'], ['f', 'a', 'b', 'c', 'd', 'e'], ['e', 'f', 'a', 'b', 'c', 'd'], ['d', 'e', 'f', 'a', 'b', 'c'], ['c', 'd', 'e', 'f', 'a', 'b'], ['b', 'c', 'd', 'e', 'f', 'a']]
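If the standard library is an option, collections.deque has a rotate method that performs the same shift. This is an alternative sketch, not what the answer above uses; the variable names are chosen for this example only.

```python
from collections import deque

d = deque(['a', 'b', 'c', 'd', 'e', 'f'])
rotations = []
for _ in range(len(d)):
    rotations.append(list(d))  # list(d) copies the current state
    d.rotate(1)                # move the last element to the front

for row in rotations:
    print(row)
```

Because list(d) builds a new list on every pass, the stored rows do not change when the deque is rotated again.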
For example, I have a list:
[['aabbbb'], ['bbbbab'], ['babbab'], ['baaaaa'], ['bbbaaa'], ['bbbbaa']]
How do I split it so that I get [['a', 'a', 'b', 'b', 'b', 'b'], ...] and so on? It would be very useful, thanks!
You can use a list comprehension.
mylist = [['aabbbb'], ['bbbbab'], ['babbab'], ['baaaaa'], ['bbbaaa'], ['bbbbaa']]
new_list = [list(item[0]) for item in mylist]
This will return:
[['a', 'a', 'b', 'b', 'b', 'b'], ['b', 'b', 'b', 'b', 'a', 'b'], ['b', 'a', 'b', 'b', 'a', 'b'], ['b', 'a', 'a', 'a', 'a', 'a'], ['b', 'b', 'b', 'a', 'a', 'a'], ['b', 'b', 'b', 'b', 'a', 'a']]
Try the code below:
oldList = [['aabbbb'], ['bbbbab'], ['babbab'], ['baaaaa'], ['bbbaaa'], ['bbbbaa']]
newList = []
for i in oldList:
    newList.append(list(i[0]))
This is an algorithm that is meant to write down the permutations of the list P, and it does that well, but...
def p():
    global P
    P = ['a', 'b', 'c', 'd']
    perm(4)

per = []

def perm(k):
    global P
    if k==1:
        print(P)
        per.append(P)
    else:
        for i in range(k):
            P[i], P[k-1] = P[k-1], P[i]
            perm(k-1)
            P[i], P[k-1] = P[k-1], P[i]
When I want it to add the permutations to a global list (necessary for the rest of the program), there is a problem. It still prints all the permutations:
['b', 'c', 'd', 'a']
['b', 'c', 'd', 'a']
['d', 'b', 'c', 'a']
['d', 'b', 'c', 'a']
['b', 'd', 'c', 'a']
['b', 'd', 'c', 'a']
['a', 'c', 'b', 'd']
['a', 'c', 'b', 'd']
['b', 'a', 'c', 'd']
['b', 'a', 'c', 'd']
['a', 'b', 'c', 'd']
['a', 'b', 'c', 'd']
['b', 'd', 'a', 'c']
['b', 'd', 'a', 'c']
['a', 'b', 'd', 'c']
['a', 'b', 'd', 'c']
['b', 'a', 'd', 'c']
['b', 'a', 'd', 'c']
['a', 'd', 'b', 'c']
['a', 'd', 'b', 'c']
['b', 'a', 'd', 'c']
['b', 'a', 'd', 'c']
['a', 'b', 'd', 'c']
['a', 'b', 'd', 'c']
But when I check the list, every entry in it is the same:
[['a', 'b', 'd', 'c'], ['a', 'b', 'd', 'c'], ['a', 'b', 'd', 'c'], ['a', 'b',
'd', 'c'], ['a', 'b', 'd', 'c'], ['a', 'b', 'd', 'c'], ['a', 'b', 'd', 'c'],
['a', 'b', 'd', 'c'], ['a', 'b', 'd', 'c'], ['a', 'b', 'd', 'c'], ['a', 'b',
'd', 'c'], ['a', 'b', 'd', 'c'], ['a', 'b', 'd', 'c'], ['a', 'b', 'd', 'c'],
['a', 'b', 'd', 'c'], ['a', 'b', 'd', 'c'], ['a', 'b', 'd', 'c'], ['a', 'b',
'd', 'c'], ['a', 'b', 'd', 'c'], ['a', 'b', 'd', 'c'], ['a', 'b', 'd', 'c'],
['a', 'b', 'd', 'c'], ['a', 'b', 'd', 'c'], ['a', 'b', 'd', 'c']]
Could you please help me at least understand what the actual issue is?
Try using the built-in itertools module.
import itertools
P = [letter for letter in "abcd"]
def perm(permutate_this):
    return list(itertools.permutations(permutate_this))
print(perm(P))
Use
per.append(P[:])
to copy the list. You are appending a reference to the list, and that reference always points to the same data, so per contains the same reference over and over.
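Put together, the fix might look like the sketch below. It keeps the same swap-based recursion but drops the globals; the function name permutations_by_swapping and the local names work and found are invented for this example.

```python
def permutations_by_swapping(items):
    """Collect every permutation of items with the swap-based recursion."""
    work = list(items)  # work on a copy so the caller's list is untouched
    found = []

    def perm(k):
        if k == 1:
            found.append(work[:])  # store a snapshot, not a reference
        else:
            for i in range(k):
                work[i], work[k - 1] = work[k - 1], work[i]
                perm(k - 1)
                work[i], work[k - 1] = work[k - 1], work[i]  # undo the swap

    perm(len(work))
    return found


per = permutations_by_swapping(['a', 'b', 'c', 'd'])
print(len(per))
print(per[:3])
```

Each stored entry is its own list, so later swaps on work no longer change what is already in the result.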
Comment-remark of cdarke:
Slicing a list makes a shallow copy. If the list contains references to other objects (e.g. inner lists), only those references are copied, so you have the same problem for the inner lists. In that case you would have to resort to copy.deepcopy.
Example:
innerlist = [1,2,3]
l2 = [innerlist, 5, 6]
l3 = l2[:]
print(l2) # orig
print(l3) # the shallow copy
l3[2] = "changed" # l2[2] is unchanged
print(l2)
print(l3)
innerlist[2] = 999 # both (l2 and l3) will reflect this change in the innerlist
print(l2)
print(l3)
Output:
[[1, 2, 3], 5, 6] # l2
[[1, 2, 3], 5, 6] # l3
[[1, 2, 3], 5, 6] # l2 unchanged by l3[2]='changed'
[[1, 2, 3], 5, 'changed'] # l3 changed by -"-
[[1, 2, 999], 5, 6] # l2 and l3 affected by change in innerlist
[[1, 2, 999], 5, 'changed']
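For comparison, here is a small sketch of the copy.deepcopy case mentioned in the remark, using the same kind of nested list (the variable names are chosen for this example):

```python
import copy

inner = [1, 2, 3]
original = [inner, 5, 6]

shallow = original[:]           # copies only the outer list
deep = copy.deepcopy(original)  # copies the inner list as well

inner[2] = 999                  # change the shared inner list

print(shallow[0])  # [1, 2, 999]: the shallow copy still shares inner
print(deep[0])     # [1, 2, 3]: the deep copy has its own inner list
```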
It may be more convenient for you to use built-in permutations:
from itertools import permutations
arr = [1, 2, 3]
list(permutations(arr))
> [(1, 2, 3), (1, 3, 2), (2, 1, 3), (2, 3, 1), (3, 1, 2), (3, 2, 1)]
I have two lists, ['A', 'B', 'C', 'D'] and [1, 2, 3, 4]. Both lists will always have the same number of items. I need to multiply each string by its number, so the final product I am looking for is:
['A', 'B', 'B', 'C', 'C', 'C', 'D', 'D', 'D', 'D']
Nested list comprehension works too:
>>> l1 = ['A', 'B', 'C', 'D']
>>> l2 = [1, 2, 3, 4]
>>> [c for c, i in zip(l1, l2) for _ in range(i)]
['A', 'B', 'B', 'C', 'C', 'C', 'D', 'D', 'D', 'D']
In the above, zip returns (char, count) tuples:
>>> t = list(zip(l1, l2))
>>> t
[('A', 1), ('B', 2), ('C', 3), ('D', 4)]
Then, for every tuple, the second for loop runs count times to add the character to the result:
>>> [char for char, count in t for _ in range(count)]
['A', 'B', 'B', 'C', 'C', 'C', 'D', 'D', 'D', 'D']
I would use itertools.repeat for a nice, efficient implementation:
>>> letters = ['A', 'B', 'C', 'D']
>>> numbers = [1, 2, 3, 4]
>>> import itertools
>>> result = []
>>> for letter, number in zip(letters, numbers):
...     result.extend(itertools.repeat(letter, number))
...
>>> result
['A', 'B', 'B', 'C', 'C', 'C', 'D', 'D', 'D', 'D']
>>>
I also think it is quite readable.
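The same idea can also be written without the explicit loop by chaining the repeat iterators together. This is just a variation on the approach above, sketched with itertools.chain.from_iterable:

```python
from itertools import chain, repeat

letters = ['A', 'B', 'C', 'D']
numbers = [1, 2, 3, 4]

# repeat(letter, number) yields letter number times; chain joins the pieces.
result = list(chain.from_iterable(
    repeat(letter, number) for letter, number in zip(letters, numbers)
))
print(result)
```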
The code is pretty straightforward; see the inline comments:
l1 = ['A', 'B', 'C', 'D']
l2 = [1, 2, 3, 4]
res = []
for i, x in enumerate(l1): # by enumerating you get both the item and its index
    res += [x] * l2[i] # add the item l2[i] times; [x] keeps multi-character strings whole
print(res)
OUTPUT
['A', 'B', 'B', 'C', 'C', 'C', 'D', 'D', 'D', 'D']
You can use zip() like this:
a = ['A', 'B', 'C', 'D']
b = [1, 2, 3, 4]
final = []
for k,v in zip(a,b):
    final += [k for _ in range(v)]
print(final)
Output:
>>> ['A', 'B', 'B', 'C', 'C', 'C', 'D', 'D', 'D', 'D']
Or you can do the same with zip() and a list comprehension:
a = ['A', 'B', 'C', 'D']
b = [1, 2, 3, 4]
final = [k for k,v in zip(a,b) for _ in range(v)]
print(final)
Output:
>>> ['A', 'B', 'B', 'C', 'C', 'C', 'D', 'D', 'D', 'D']
You can use NumPy and then convert the NumPy array to a list:
import numpy as np
letters = ['A', 'B', 'C', 'D']
times = [1, 2, 3, 4]
np.repeat(letters, times).tolist()
#output
['A', 'B', 'B', 'C', 'C', 'C', 'D', 'D', 'D', 'D']
I am having a lot of trouble merging two simple nested lists while keeping the overall structure. For example:
# From this source:
x = [['a','b','c'],['d','e','f'],['g','h','i']]
y = [['A','B','C'],['D','E','F'],['G','H','I']]
# To this result:
z = [[['a','A'],['b','B'],['c','C']],
[['d','D'],['e','E'],['f','F']],
[['g','G'],['h','H'],['i','I']]]
Any idea?
Here is some (embarrassing) code I tried:
X = []
Y = []
for i in iter(x[:]):
    X.append('=')
    for v in iter(i[:]):
        X.append(v)
print X; print
for i in iter(y[:]):
    Y.append('=')
    for v in iter(i[:]):
        Y.append(v)
print Y; print
for i in zip(X, Y):
    print i
print([list(map(list,zip(*t))) for t in zip(x,y)])
[[['a', 'A'], ['b', 'B'], ['c', 'C']],
[['d', 'D'], ['e', 'E'], ['f', 'F']],
[['g', 'G'], ['h', 'H'], ['i', 'I']]]
The steps:
In [20]: zip(x,y) # zip the lists together
Out[20]:
[(['a', 'b', 'c'], ['A', 'B', 'C']),
(['d', 'e', 'f'], ['D', 'E', 'F']),
(['g', 'h', 'i'], ['G', 'H', 'I'])]
In [21]: t = (['a', 'b', 'c'], ['A', 'B', 'C']) # first "t"
In [22]: zip(*t) # transpose, row to columns, columns to rows
Out[22]: [('a', 'A'), ('b', 'B'), ('c', 'C')]
In [23]: list(map(list,zip(*t))) # convert inner tuples to lists
Out[23]: [['a', 'A'], ['b', 'B'], ['c', 'C']]
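The session above shows zip returning lists, as it does in Python 2. In Python 3, zip returns an iterator, so the intermediate steps need list() to be displayed. A Python 3 sketch of the same pairing, written here as a nested comprehension (the names row_x, row_y and pair are chosen for this example):

```python
x = [['a', 'b', 'c'], ['d', 'e', 'f'], ['g', 'h', 'i']]
y = [['A', 'B', 'C'], ['D', 'E', 'F'], ['G', 'H', 'I']]

# Outer zip pairs the rows; inner zip pairs the items within each row pair.
z = [[list(pair) for pair in zip(row_x, row_y)]
     for row_x, row_y in zip(x, y)]

for row in z:
    print(row)
```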