I'm a beginner in Python, and I am doing a binary search task using code that differs from the common approach. My issue is that I can't get the function to return a new, halved list. The first call works as it is supposed to, but on later calls the function doesn't return the new list.
My Code:
import random, math
user_choice=random.randint(0,100)
print (user_choice)
max=100
number_elements=50
number_list=random.sample(range(max), number_elements)
sort_list=sorted(number_list)
print (sort_list)
count=0
limit=int(math.sqrt(number_elements))
# divide by 2 the length of the number_list
def divide_list(sort_list):
    global count,number_elements
    number_elements=int(number_elements//2)
    count += 1
    half=len(sort_list)//2
    if user_choice <= sort_list[number_elements] :
        sort_list=sort_list[:half]
        print(sort_list)
    else :
        sort_list=sort_list[half:]
        print(sort_list)
    return sort_list
while len(sort_list)==0 or count <=limit:
    max /= 2
    divide_list(sort_list)
Output :
99
[5, 6, 8, 9, 14, 15, 17, 18, 19, 22, 23, 24, 26, 27, 28, 34, 35, 36, 38, 39, 40, 41, 44, 46, 47, 48, 50, 51, 53, 54, 55, 56, 57, 58, 61, 63, 64, 65, 67, 68, 69, 75, 76, 80, 81, 86, 90, 96, 97, 99],
[48, 50, 51, 53, 54, 55, 56, 57, 58, 61, 63, 64, 65, 67, 68, 69, 75, 76, 80, 81, 86, 90, 96, 97, 99]
[48, 50, 51, 53, 54, 55, 56, 57, 58, 61, 63, 64, 65, 67, 68, 69, 75, 76, 80, 81, 86, 90, 96, 97, 99]
[48, 50, 51, 53, 54, 55, 56, 57, 58, 61, 63, 64, 65, 67, 68, 69, 75, 76, 80, 81, 86, 90, 96, 97, 99]
[48, 50, 51, 53, 54, 55, 56, 57, 58, 61, 63, 64, 65, 67, 68, 69, 75, 76, 80, 81, 86, 90, 96, 97, 99]
[48, 50, 51, 53, 54, 55, 56, 57, 58, 61, 63, 64, 65, 67, 68, 69, 75, 76, 80, 81, 86, 90, 96, 97, 99]
[48, 50, 51, 53, 54, 55, 56, 57, 58, 61, 63, 64, 65, 67, 68, 69, 75, 76, 80, 81, 86, 90, 96, 97, 99]
[48, 50, 51, 53, 54, 55, 56, 57, 58, 61, 63, 64, 65, 67, 68, 69, 75, 76, 80, 81, 86, 90, 96, 97, 99]
[48, 50, 51, 53, 54, 55, 56, 57, 58, 61, 63, 64, 65, 67, 68, 69, 75, 76, 80, 81, 86, 90, 96, 97, 99]
Thanks for your help.
Your function returns the list, but you are not storing the returned value.
Solution:
Change the line divide_list(sort_list) to sort_list = divide_list(sort_list).
Explanation:
When you assign to sort_list inside the function (for example, the line sort_list=sort_list[half:]), you rebind the function's local variable sort_list, not the global variable of the same name. Storing the return value in the global variable, as in the solution above, updates the list, so the next call receives the halved list.
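A minimal sketch of the fix, under some assumptions: it uses a fixed list and target instead of random ones so the result is repeatable, the names halve, values and target are hypothetical, and the comparison looks at the middle of the current list rather than the global number_elements counter from the question.

```python
def halve(values, target):
    """Return the half of a sorted list that could contain target."""
    half = len(values) // 2
    # Every element before index `half` is <= values[half - 1].
    if target <= values[half - 1]:
        return values[:half]
    return values[half:]

values = list(range(0, 100, 2))  # sorted sample data: 0, 2, ..., 98
target = 62

steps = 0
while len(values) > 1:
    # Store the returned list; without this assignment the loop
    # would keep passing the same full list to the function.
    values = halve(values, target)
    steps += 1

print(values, steps)
```

The key line is the assignment inside the loop; the function itself never needs to touch a global.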
Print elements from a list with 10 numbers per line
from this
n = [85, 13, 99, 34, 71, 15, 82, 24, 64, 61,
67, 99, 50, 68, 25, 37, 32, 27, 14, 91,
79, 15, 47, 48, 74, 88, 64, 53, 77, 50,
24, 91, 87, 55, 60, 75, 91, 22, 47, 63,
81, 88, 26, 48, 69, 59, 84, 77, 28, 36,
59, 74, 89, 73, 91, 64, 55, 88, 90, 48,
73, 97, 98, 40, 93, 50, 78, 60, 44, 77,
82, 51, 53, 65, 98, 59, 94, 91, 52, 44,
65, 85, 72, 92, 49, 67, 58, 48, 62, 54,
89, 67, 58, 48, 85, 45, 77, 76, 81, 77]
to this (without brackets)
value n :
85, 13, 99, 34, 71, 15, 82, 24, 64, 61,
67, 99, 50, 68, 25, 37, 32, 27, 14, 91,
79, 15, 47, 48, 74, 88, 64, 53, 77, 50,
24, 91, 87, 55, 60, 75, 91, 22, 47, 63,
81, 88, 26, 48, 69, 59, 84, 77, 28, 36,
59, 74, 89, 73, 91, 64, 55, 88, 90, 48,
73, 97, 98, 40, 93, 50, 78, 60, 44, 77,
82, 51, 53, 65, 98, 59, 94, 91, 52, 44,
65, 85, 72, 92, 49, 67, 58, 48, 62, 54,
89, 67, 58, 48, 85, 45, 77, 76, 81, 77
You can try this:
n = [85, 13, 99, 34, 71, 15, 82, 24, 64, 61,
67, 99, 50, 68, 25, 37, 32, 27, 14, 91,
79, 15, 47, 48, 74, 88, 64, 53, 77, 50,
24, 91, 87, 55, 60, 75, 91, 22, 47, 63,
81, 88, 26, 48, 69, 59, 84, 77, 28, 36,
59, 74, 89, 73, 91, 64, 55, 88, 90, 48,
73, 97, 98, 40, 93, 50, 78, 60, 44, 77,
82, 51, 53, 65, 98, 59, 94, 91, 52, 44,
65, 85, 72, 92, 49, 67, 58, 48, 62, 54,
89, 67, 58, 48, 85, 45, 77, 76, 81, 77]
print("value n :")
for x in range(10):
    print(', '.join([str(num) for num in (n[x*10:x*10+10])]))
Output:
value n :
85, 13, 99, 34, 71, 15, 82, 24, 64, 61
67, 99, 50, 68, 25, 37, 32, 27, 14, 91
79, 15, 47, 48, 74, 88, 64, 53, 77, 50
24, 91, 87, 55, 60, 75, 91, 22, 47, 63
81, 88, 26, 48, 69, 59, 84, 77, 28, 36
59, 74, 89, 73, 91, 64, 55, 88, 90, 48
73, 97, 98, 40, 93, 50, 78, 60, 44, 77
82, 51, 53, 65, 98, 59, 94, 91, 52, 44
65, 85, 72, 92, 49, 67, 58, 48, 62, 54
89, 67, 58, 48, 85, 45, 77, 76, 81, 77
y = [", ".join([str(x) for x in n[10*(i-1):10*i]]) for i in range(1, 11)]
print("\n".join(y))
I first split the original list into 10 sublists, turned each one into a string, and then joined those strings with newlines.
You can get the desired output simply by using the sep argument of print:
print('value n :')
for i in range(0, len(n), 10):
    print(*n[i:i+10], sep = ', ')
Explanation
range(0, len(n), 10) produces the starting index of each row
n[i:i+10] is the slice of 10 values for each row
*n[i:i+10] uses the unpacking operator to turn the slice n[i:i+10] into positional arguments for print (i.e. equivalent to print(n[i], n[i+1], ..., n[i+9]))
sep = ', ' causes the positional arguments to be printed separated by a comma and a space
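The same slicing idea can be wrapped in a small helper that also copes with lists whose length is not a multiple of the row width. This is only a sketch; format_rows, per_row and sample are hypothetical names, not part of the answers above.

```python
def format_rows(values, per_row=10, sep=", "):
    """Return one string per row, each holding up to per_row values."""
    return [
        sep.join(str(v) for v in values[i:i + per_row])
        for i in range(0, len(values), per_row)
    ]

sample = list(range(1, 24))  # 23 values, so the last row is shorter
for row in format_rows(sample):
    print(row)
```

Returning the rows as strings rather than printing them directly makes the helper easy to check and to reuse, for example when writing to a file.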
Input:
n = [85, 13, 99, 34, 71, 15, 82, 24, 64, 61,
67, 99, 50, 68, 25, 37, 32, 27, 14, 91,
79, 15, 47, 48, 74, 88, 64, 53, 77, 50,
24, 91, 87, 55, 60, 75, 91, 22, 47, 63,
81, 88, 26, 48, 69, 59, 84, 77, 28, 36,
59, 74, 89, 73, 91, 64, 55, 88, 90, 48,
73, 97, 98, 40, 93, 50, 78, 60, 44, 77,
82, 51, 53, 65, 98, 59, 94, 91, 52, 44,
65, 85, 72, 92, 49, 67, 58, 48, 62, 54,
89, 67, 58, 48, 85, 45, 77, 76, 81, 77]
Code:
print('value n:')
for i in range(0, len(n), 10):
    print(*n[i:i+10], sep=', ')
Output:
value n:
85, 13, 99, 34, 71, 15, 82, 24, 64, 61
67, 99, 50, 68, 25, 37, 32, 27, 14, 91
79, 15, 47, 48, 74, 88, 64, 53, 77, 50
24, 91, 87, 55, 60, 75, 91, 22, 47, 63
81, 88, 26, 48, 69, 59, 84, 77, 28, 36
59, 74, 89, 73, 91, 64, 55, 88, 90, 48
73, 97, 98, 40, 93, 50, 78, 60, 44, 77
82, 51, 53, 65, 98, 59, 94, 91, 52, 44
65, 85, 72, 92, 49, 67, 58, 48, 62, 54
89, 67, 58, 48, 85, 45, 77, 76, 81, 77
Hope someone can shed some light on this. I am trying to learn my way around HDF5 files. Somehow this list of strings gets encoded into the file as an array of integers, but I'm not able to figure out how to decode it. I can load the file back into pandas using the read_hdf function, but that's not the point: I am trying to understand the encoding logic. Here is a summary of the example I was working with.
smiles.txt =
structure
[11CH2]1NCCN2C[C##H]3CCC[C##H]3c4cccc1c24
[11CH2]1NCCN2[C##H]3CCC[C##H]3c4cccc1c24
[11CH3]c1ccc(cc1)c2cc(nn2c3ccc(cc3)S(=O)(=O)N)C(F)(F)F
[11CH3]c1ccccc1O[C#H]([C##H]2CNCCO2)c3ccccc3
[11CH3]c1ccccc1S[C#H]([C##H]2CNCCO2)c3ccccc3
>>> import pandas as pd
>>> df = pd.read_csv('smiles.txt', header=0)
>>> df.to_hdf('smiles.h5', 'table')
I then explore the structure of the newly created HDF5 file:
>>> import h5py
>>> with h5py.File('smiles.h5',"r") as f:
...     f.visit(print)
table
table/axis0
table/axis1
table/block0_items
table/block0_values
>>> with h5py.File('smiles_temp', 'r') as f:
...     print(list(f.keys()))
...     print(f['/thekey/axis0'][:])
...     print(f['/thekey/axis1'][:])
...     print(f['/thekey/block0_items'][:])
...     print(f['/thekey/block0_values'][:])
['thekey']
[b'structure']
[0 1 2 3 4]
[b'structure']
[array([128, 4, 149, 123, 1, 0, 0, 0, 0, 0, 0, 140, 21,
110, 117, 109, 112, 121, 46, 99, 111, 114, 101, 46, 109, 117,
108, 116, 105, 97, 114, 114, 97, 121, 148, 140, 12, 95, 114,
101, 99, 111, 110, 115, 116, 114, 117, 99, 116, 148, 147, 148,
140, 5, 110, 117, 109, 112, 121, 148, 140, 7, 110, 100, 97,
114, 114, 97, 121, 148, 147, 148, 75, 0, 133, 148, 67, 1,
98, 148, 135, 148, 82, 148, 40, 75, 1, 75, 5, 75, 1,
134, 148, 104, 3, 140, 5, 100, 116, 121, 112, 101, 148, 147,
148, 140, 2, 79, 56, 148, 75, 0, 75, 1, 135, 148, 82,
148, 40, 75, 3, 140, 1, 124, 148, 78, 78, 78, 74, 255,
255, 255, 255, 74, 255, 255, 255, 255, 75, 63, 116, 148, 98,
137, 93, 148, 40, 140, 41, 91, 49, 49, 67, 72, 50, 93,
49, 78, 67, 67, 78, 50, 67, 91, 67, 64, 64, 72, 93,
51, 67, 67, 67, 91, 67, 64, 64, 72, 93, 51, 99, 52,
99, 99, 99, 99, 49, 99, 50, 52, 148, 140, 40, 91, 49,
49, 67, 72, 50, 93, 49, 78, 67, 67, 78, 50, 91, 67,
64, 64, 72, 93, 51, 67, 67, 67, 91, 67, 64, 64, 72,
93, 51, 99, 52, 99, 99, 99, 99, 49, 99, 50, 52, 148,
140, 54, 91, 49, 49, 67, 72, 51, 93, 99, 49, 99, 99,
99, 40, 99, 99, 49, 41, 99, 50, 99, 99, 40, 110, 110,
50, 99, 51, 99, 99, 99, 40, 99, 99, 51, 41, 83, 40,
61, 79, 41, 40, 61, 79, 41, 78, 41, 67, 40, 70, 41,
40, 70, 41, 70, 148, 140, 44, 91, 49, 49, 67, 72, 51,
93, 99, 49, 99, 99, 99, 99, 99, 49, 79, 91, 67, 64,
72, 93, 40, 91, 67, 64, 64, 72, 93, 50, 67, 78, 67,
67, 79, 50, 41, 99, 51, 99, 99, 99, 99, 99, 51, 148,
140, 44, 91, 49, 49, 67, 72, 51, 93, 99, 49, 99, 99,
99, 99, 99, 49, 83, 91, 67, 64, 72, 93, 40, 91, 67,
64, 64, 72, 93, 50, 67, 78, 67, 67, 79, 50, 41, 99,
51, 99, 99, 99, 99, 99, 51, 148, 101, 116, 148, 98, 46],
dtype=uint8)]
How does one go about returning the list of strings using h5py?
Just to clarify, the dataframe displays as:
In [2]: df = pd.read_csv('stack63452223.csv', header=0)
In [3]: df
Out[3]:
structure
0 [11CH2]1NCCN2C[C##H]3CCC[C##H]3c4cccc1c24
1 [11CH2]1NCCN2[C##H]3CCC[C##H]3c4cccc1c24
2 [11CH3]c1ccc(cc1)c2cc(nn2c3ccc(cc3)S(=O)(=O)N)...
3 [11CH3]c1ccccc1O[C#H]([C##H]2CNCCO2)c3ccccc3
4 [11CH3]c1ccccc1S[C#H]([C##H]2CNCCO2)c3ccccc3
In [11]: df._values
Out[11]:
array([['[11CH2]1NCCN2C[C##H]3CCC[C##H]3c4cccc1c24'],
['[11CH2]1NCCN2[C##H]3CCC[C##H]3c4cccc1c24'],
['[11CH3]c1ccc(cc1)c2cc(nn2c3ccc(cc3)S(=O)(=O)N)C(F)(F)F'],
['[11CH3]c1ccccc1O[C#H]([C##H]2CNCCO2)c3ccccc3'],
['[11CH3]c1ccccc1S[C#H]([C##H]2CNCCO2)c3ccccc3']], dtype=object)
or as a list of strings:
In [24]: df['structure'].to_list()
Out[24]:
['[11CH2]1NCCN2C[C##H]3CCC[C##H]3c4cccc1c24',
'[11CH2]1NCCN2[C##H]3CCC[C##H]3c4cccc1c24',
'[11CH3]c1ccc(cc1)c2cc(nn2c3ccc(cc3)S(=O)(=O)N)C(F)(F)F',
'[11CH3]c1ccccc1O[C#H]([C##H]2CNCCO2)c3ccccc3',
'[11CH3]c1ccccc1S[C#H]([C##H]2CNCCO2)c3ccccc3']
The h5 file is written by PyTables, which is a different library from h5py; generally h5py can read files written by PyTables, but the details can be complicated.
The top level keys:
['axis0', 'axis1', 'block0_items', 'block0_values']
A dataframe has axes (row and column). On another occasion I looked at how a dataframe stores its values, and found that it uses blocks, each holding columns with a common dtype. Here you have 1 column, and it is object dtype, since it contains strings.
Strings are a bit awkward in HDF5, especially unicode. NumPy arrays use a unicode string dtype; pandas uses object dtype, referencing Python strings (stored outside the dataframe). I suspect then that in saving such a frame PyTables is using a more complex referencing scheme (that isn't immediately obvious via h5py).
Guess that's a long answer to just say I don't know.
Pandas' own h5 load:
In [19]: pd.read_hdf('stack63452223.h5', 'table')
Out[19]:
structure
0 [11CH2]1NCCN2C[C##H]3CCC[C##H]3c4cccc1c24
1 [11CH2]1NCCN2[C##H]3CCC[C##H]3c4cccc1c24
2 [11CH3]c1ccc(cc1)c2cc(nn2c3ccc(cc3)S(=O)(=O)N)...
3 [11CH3]c1ccccc1O[C#H]([C##H]2CNCCO2)c3ccccc3
4 [11CH3]c1ccccc1S[C#H]([C##H]2CNCCO2)c3ccccc3
The h5 objects also have attrs,
In [38]: f['table'].attrs.keys()
Out[38]: <KeysViewHDF5 ['CLASS', 'TITLE', 'VERSION', 'axis0_variety', 'axis1_variety', 'block0_items_variety', 'encoding', 'errors', 'nblocks', 'ndim', 'pandas_type', 'pandas_version']>
Fiddling around I found that:
In [66]: x=f['table']['block0_values'][0]
In [67]: b''.join(x.view('S1').tolist())
Out[67]: b'\x80\x04\x95y\x01\x8c\x15numpy.core.multiarray\x94\x8c\x0c_reconstruct\x94\x93\x94\x8c\x05numpy\x94\x8c\x07ndarray\x94\x93\x94K\x85\x94C\x01b\x94\x87\x94R\x94(K\x01K\x05K\x01\x86\x94h\x03\x8c\x05dtype\x94\x93\x94\x8c\x02O8\x94\x89\x88\x87\x94R\x94(K\x03\x8c\x01|\x94NNNJ\xff\xff\xff\xffJ\xff\xff\xff\xffK?t\x94b\x89]\x94(\x8c)[11CH2]1NCCN2C[C##H]3CCC[C##H]3c4cccc1c24\x94\x8c([11CH2]1NCCN2[C##H]3CCC[C##H]3c4cccc1c24\x94\x8c6[11CH3]c1ccc(cc1)c2cc(nn2c3ccc(cc3)S(=O)(=O)N)C(F)(F)F\x94\x8c,[11CH3]c1ccccc1O[C#H]([C##H]2CNCCO2)c3ccccc3\x94\x8c,[11CH3]c1ccccc1S[C#H]([C##H]2CNCCO2)c3ccccc3\x94et\x94b.'
Looks like your strings are there. uint8 is a single-byte dtype, so the array can be viewed as bytes. Joining them, I see your strings, concatenated in some fashion.
Reformatting:
Out[67]: b'\x80\x04\x95y\x01\x8c\x15numpy.core.multiarray\x94\x8c\x0c_reconstruct\x94\x93\x94\x8c\x05numpy\x94\x8c\x07ndarray\x94\x93\x94K\x85\x94C\x01b\x94\x87\x94R\x94(K\x01K\x05K\x01\x86\x94h\x03\x8c\x05dtype\x94\x93\x94\x8c\x02O8\x94\x89\x88\x87\x94R\x94(K\x03\x8c\x01|\x94NNNJ\xff\xff\xff\xffJ\xff\xff\xff\xffK?t\x94b\x89]\x94(\x8c)
[11CH2]1NCCN2C[C##H]3CCC[C##H]3c4cccc1c24\x94\x8c(
[11CH2]1NCCN2[C##H]3CCC[C##H]3c4cccc1c24\x94\x8c6
[11CH3]c1ccc(cc1)c2cc(nn2c3ccc(cc3)S(=O)(=O)N)C(F)(F)F\x94\x8c,
[11CH3]c1ccccc1O[C#H]([C##H]2CNCCO2)c3ccccc3\x94\x8c,
[11CH3]c1ccccc1S[C#H]([C##H]2CNCCO2)c3ccccc3\x94et\x94b.'
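The leading bytes \x80\x04 are the header of a pickle protocol 4 stream, and names such as numpy.core.multiarray and _reconstruct are what a pickled NumPy array contains, so it seems likely that the uint8 array is simply a pickled object array. The sketch below illustrates that idea with the standard library only; original and as_uint8 are hypothetical stand-ins, not data read from the file.

```python
import pickle

# Stand-in for the uint8 array read from block0_values: pickle a list of
# strings and expose the result as a sequence of small integers.
original = ["first string", "second string"]
as_uint8 = list(pickle.dumps(original, protocol=4))

print(as_uint8[:2])  # [128, 4], the same leading bytes as in the dump above

# Joining the integers back into bytes and unpickling recovers the objects.
decoded = pickle.loads(bytes(as_uint8))
print(decoded)
```

With h5py and NumPy, the equivalent would presumably be pickle.loads(x.tobytes()) for the array x shown above, which should give back an object-dtype array holding the strings. That call is untested here, and pickle.loads should only be used on files you trust.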
I tried to slice a list into three new lists, but my method seems to be problematic. Can you help me see how I should do it? Thank you!
quiz = [[91, 94, 38, 48, 70, 85, 94, 59], [78, 96, 90, 55, 77, 82, 94, 60], [99, 94, 82, 77, 75, 89, 94, 93], [49, 92, 75, 48, 80, 95, 99, 98]]
midterm = []
final = []
I want quiz to hold the first five numbers of each inner list, midterm the next two, and final the last number:
quiz = [[91, 94, 38, 48, 70,], [78, 96, 90, 55, 77], [99, 94, 82, 77, 75,], [49, 92, 75, 48, 80]]
midterm = [[85, 94,],[82, 94,], [89, 94,], [95, 99,]]
final = [[59], [60], [93], [98]]
And here is my code:
quiz = [[91, 94, 38, 48, 70, 85, 94, 59], [78, 96, 90, 55, 77, 82, 94, 60], [99, 94, 82, 77, 75, 89, 94, 93], [49, 92, 75, 48, 80, 95, 99, 98]]
midterm = quiz[5:2]
final = midterm[5:1]
midterm = [i[5:7] for i in quiz]
final = [i[7:] for i in quiz]
quiz = [i[:5] for i in quiz]
How this works:
The [ ] expression is a list comprehension, a condensed version of a for loop that builds a list.
For example, the above code is the same as the following:
for i in quiz:
    midterm.append(i[5:7])
for i in quiz:
    final.append(i[7:])
tmp = []
for i in quiz:
    tmp.append(i[:5])
quiz = tmp
This iterates through all of the inner lists in quiz and takes the two, the one, and the five values for the separate lists. What you were doing wrong is that you treated quiz as a one-dimensional list rather than a two-dimensional one.
Your current code slices the outer list quiz, whose elements are the four inner lists, not the integers inside each of them. In addition, a slice such as quiz[5:2] starts past its stop index, so it returns an empty list.
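A short illustration of the difference between slicing the outer list and slicing each row, using only the first two rows of the data for brevity (the variable name row is an arbitrary choice):

```python
quiz = [[91, 94, 38, 48, 70, 85, 94, 59],
        [78, 96, 90, 55, 77, 82, 94, 60]]

# Slicing the outer list selects whole rows, not scores within a row.
print(quiz[0:1])   # a list containing only the first row
# A slice whose start is past its stop is empty.
print(quiz[5:2])   # []

# Slicing inside a comprehension applies the slice to every row.
midterm = [row[5:7] for row in quiz]
print(midterm)     # [[85, 94], [82, 94]]
```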
Here you go, using list comprehensions:
>>> quiz = [[91, 94, 38, 48, 70, 85, 94, 59], [78, 96, 90, 55, 77, 82, 94, 60], [99, 94, 82, 77, 75, 89, 94, 93], [49, 92, 75, 48, 80, 95, 99, 98]]
>>> new_quiz = [ x[:5] for x in quiz ]
>>> mid_term = [ x[5:7] for x in quiz ]
>>> final = [ x[-1:] for x in quiz ]
>>> new_quiz
[[91, 94, 38, 48, 70], [78, 96, 90, 55, 77], [99, 94, 82, 77, 75], [49, 92, 75, 48, 80]]
>>> mid_term
[[85, 94], [82, 94], [89, 94], [95, 99]]
>>> final
[[59], [60], [93], [98]]
Suppose I have made a large list of numbers, and I want to make another one which I will add, pairwise, with the first list.
Here's the first list, A:
[109, 77, 57, 34, 94, 68, 96, 72, 39, 67, 49, 71, 121, 89, 61, 84, 45, 40, 104, 68, 54, 60, 68, 62, 91, 45, 41, 118, 44, 35, 53, 86, 41, 63, 111, 112, 54, 34, 52, 72, 111, 113, 47, 91, 107, 114, 105, 91, 57, 86, 32, 109, 84, 85, 114, 48, 105, 109, 68, 57, 78, 111, 64, 55, 97, 85, 40, 100, 74, 34, 94, 78, 57, 77, 94, 46, 95, 60, 42, 44, 68, 89, 113, 66, 112, 60, 40, 110, 89, 105, 113, 90, 73, 44, 39, 55, 108, 110, 64, 108]
And here's B:
[35, 106, 55, 61, 81, 109, 82, 85, 71, 55, 59, 38, 112, 92, 59, 37, 46, 55, 89, 63, 73, 119, 70, 76, 100, 49, 117, 77, 37, 62, 65, 115, 93, 34, 107, 102, 91, 58, 82, 119, 75, 117, 34, 112, 121, 58, 79, 69, 68, 72, 110, 43, 111, 51, 102, 39, 52, 62, 75, 118, 62, 46, 74, 77, 82, 81, 36, 87, 80, 56, 47, 41, 92, 102, 101, 66, 109, 108, 97, 49, 72, 74, 93, 114, 55, 116, 66, 93, 56, 56, 93, 99, 96, 115, 93, 111, 57, 105, 35, 99]
How might I generate the arithmetic addition logic, processing each pairwise value one by one (A[0] and B[0], through A[99], B[99]) and producing the list C (A[0] + B[0] through A[99]+ B[99])?
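# Python 3: the built-in zip is lazy, so itertools.izip is not needed
result = [x + y for x, y in zip(A, B)]
Or:
import operator
# map passes one element from each list to operator.add
result = list(map(operator.add, A, B))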
Here are two possible options:
Use list comprehension.
Use NumPy.
I will be using shortened versions of your lists for convenience, and the element-wise sum will go into c.
List comprehension
a = [109, 77, 57, 34, 94, 68, 96]
b = [35, 106, 55, 61, 81, 109, 82]
c = [a_el + b_el for a_el,b_el in zip(a, b)]
NumPy
import numpy as np
a = np.array([109, 77, 57, 34, 94, 68, 96])
b = np.array([35, 106, 55, 61, 81, 109, 82])
c = a + b
With a list comprehension:
C = [A[i]+ B[i] for i in range(len(A))]
To avoid an IndexError when the lengths differ (note that this yields an empty list in that case):
C = [A[i]+ B[i] for i in range(len(A)) if len(A) == len(B)]
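If a silent empty result is not what you want, one alternative is to check the lengths up front and raise an error. This is a sketch of that design choice rather than something taken from the answers above, and add_pairwise is a hypothetical name.

```python
def add_pairwise(a, b):
    """Return the element-wise sum of two equal-length sequences."""
    if len(a) != len(b):
        # Fail loudly instead of silently truncating or returning [].
        raise ValueError(f"length mismatch: {len(a)} != {len(b)}")
    return [x + y for x, y in zip(a, b)]

print(add_pairwise([109, 77, 57], [35, 106, 55]))  # [144, 183, 112]
```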