How to save Jupyter notebook output in a file using python command? - python

I was trying to save the output of my Jupyter notebook 'npsmiles_descriptors.py' into a text file so that it can be used by other applications.
Can anyone help me out by pointing out the error in my last command? I am getting a syntax error.
df1 = df['smiles']
print(df['smiles'])
0 O1[C##H]2C[C#H](O)[C##]3([C#H]([C#H](OC(=O)c4...
1 OC[C#H]1N(CCC1)C(=O)[C##H](NC(=O)[C#H](CCCCC)...
2 O1[C##H](C[C#H](O)\C=C/[C##H]([C#H](O)[C#H](\...
3 O1[C#H](CO)[C##H](O)[C#H](O)[C##H](O)[C##H]1O...
4 O1[C##H]([C#H](C[C##H](C)[C#]1(O)CO)C)[C##H]1...
5 P(O[C#H]1[C#H](O[C##]2(O[C##H](C\C=C\c3nc(oc3...
6 O1[C#H](CO)[C##H](O)[C##H](O)[C##H]1n1c2NC=[N...
7 O1[C#H]2[C##H](CC[C##]3(O[C#]34[C##H]2C(=CC4)...
8 O1[C#H](CO)[C##H](O)[C##H](O)[C##H]1n1cnc(C(=...
9 O1[C##]2(C)[C##](O)([C##]3([C##H]([C#H](O)[C#...
10 S(=O)(=O)([O-])N1C[C#](OC)(NC(=O)C)C1=O
11 S1[C#H]2N(C(C(=O)[O-])=C(C1)COC(=O)N)C(=O)[C#...
12 O1[C##H]2C[C##]3(O[C#H]([C#H](CC)C)[C#H](C=C3...
13 O1[C##H](C)[C##H](O)[C##H]([NH3+])C[C##H]1O[C...
14 O=C1[C#]2(O)[C##H](C=C1C)[C#]1(O)[C#H]([C#H]3...
15 S(CC[NH3+])C=1C[C#H]2N(C=1C(=O)[O-])C(=O)[C##...
16 O=C1N(C)[C##H]([C#H](O)[C##H](C\C=C\C)C)C(=O)...
17 O1[C#H](/C(=C/[C#H]2C[C##H](OC)[C#H](O)CC2)/C...
18 O1C[C#H]1C(=O)CCCCC[C##H]1NC(=O)[C##H]2N(CCC2...
19 O(C)c1cc2N([C##H]3[C#]4([C#H]5[NH+](CC=C[C##]...
20 O(C)C1=CC=C2c3c(cc(OC)c(OC)c3OC)CC[C#H](NC(=O...
21 O=C([C##H](\C=C(\C=C\C(=O)N[O-])/C)C)c1ccc(N(...
22 O1[C#](C)([C#H]2[C#H](OC)[C#H](OC(=O)\C=C\C=C...
23 O1[C#H]2n3c4c(c5c(CNC5=O)c5c6c(n(c45)[C#]1(C)...
24 O1[C#H](CC)[C#](O)(C)[C#H](O)[C##H](C)C(=O)[C...
25 O1[C##H](CO)[C#H](O)[C##H](O)[C#H]([NH2+]C)[C...
26 S1[C#H]2N([C##H](C(=O)[O-])C1(C)C)C(=O)[C#H]...
27 O=C(N[C##H](O)C(=O)NCCCC[NH2+]CCC[NH3+])C[C##...
28 O1[C##H](CC(=O)[C##H](\C=C(/C)\[C##H](O)[C##H...
29 Oc1ccc(cc1)[C#H](O)[C##H](O)[C##H]1NC(=O)[C#H...
30 O1[C##H]2[C##](O)([C#]34O[C##H]5OC(=O)[C#H](O...
31 Clc1c2Oc3cc4[C##H](NC(=O)[C##H](NC(=O)[C#H](N...
32 O1[C##H](C)[C#H](C)[C#H](O)[C#H](\C=C\C=C\C=C...
33 Clc1c2c(C(O[C##H](C[C#H]3O[C##H]3/C=C\C=C\C(=...
34 O1[C#H](C[C##H](O)[C#H](C\C=C\Cc2c(C1=O)c(O)c...
35 S1C2=N[C#H](c3oc(c(n3)-c3oc(c(n3)-c3occ(n3)-c...
36 O1c2c3c4c(c(O)c2C)c(O)c(NC(=O)/C(=C\C=C\[C#H]...
37 O1[C##H](C[C#H](OC)[C##H](O)CC\C=C(\C=C\[C#H]...
38 O1[C##H](C\C=C\C=C\[C#H](O)[C##H](C[C#H](CC=O...
39 O1[C##]2(C(=O)[O-])[C#](O)(C(O)=O)[C#H](O[C#]...
40 O1C[C#H](CO)[C##H](O)C[C#]12OC[C##H](CC2)CCS
41 ClC(\C=C\[C##H](O)CC(C[C#H]1O[C#H]2[C#H](O)[C...
42 O1[C##H]2[C#H](O[C##]3([C#H](O[C##H]4[C#H](O[...
43 O(C)c1cc2c(nccc2[C##H](O)[C#H]2[N##H+]3C[C##H...
44 O1C[C#H](N=C1c1ccccc1O)C(=O)N[C##H](CCCCN([O-...
45 O(C)c1c(OC)c2[nH]c(cc2cc1OC)C(=O)N1C=2[C#]3([...
46 [S+](CCCNC(=O)c1nc(sc1)-c1nc(sc1)CCNC(=O)[C##...
47 O1[C#H](CCC\C=C\[C#H]2[C##H](C[C##H](O)C2)[C...
48 O1[C##]23[C##H]([C#H](C)C(=C)[C##H](O)[C##H]2...
49 s1cc(nc1C)\C=C(/C)\[C#H]1OC(=O)C[C#H](O)C(C)(...
50 S(C(=O)[C##]1(NC(=O)[C#H](C)[C##H]1O)[C##H](O...
51 Ic1c(C)c(C(S[C##H]2[C#H](O[C##H](ON[C#H]3[C#H...
52 O1[C##H]2O[C##]3(OO[C#]24[C##H](CC[C#H]([C##H...
53 O1[C##H](C[C##H](O)CC1=O)CC[C##H]1[C##H]2C(C=...
54 O1[C##H](C[C##H](OC(=O)[C##H](NC=O)CC(C)C)C\C...
55 O1[C##H](C[C#H]2CO[C##H](C\C(=C\C(OCCCCCCCCC(...
56 O1[C#H](C)[C#H](NC(=O)[C##H](NC(=O)[C#H](NC(=...
57 O=C(N[C##H](CC(C)C)C(=O)[O-])[C##H](O)[C#H]([...
58 OC/C(=C\CC\C(=C\CO)\C)/CC\C=C(\CC\C=C(\C)/C)/C
59 O(C)C1=C2C[C#H](C[C#H](OC)[C#H](O)[C#H](\C=C(...
Name: smiles, dtype: object
python npsmiles_descriptors.py > out.txt
File "<ipython-input-35-e13d177d9791>", line 1
python npsmiles_descriptors.py > out.txt
^
SyntaxError: invalid syntax

Use !python npsmiles_descriptors.py > out.txt instead. In a Jupyter notebook, !<command line command> is shorthand for
import os
os.system("<command line command>")
Credit: https://stackoverflow.com/a/47952494/14212394
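If the goal is simply to get the smiles column into a text file for other applications, you do not even need the shell escape; a minimal sketch that writes the column from inside the notebook (assuming df is already loaded and out.txt is the desired file name):
# write the smiles column straight to a text file, one entry per line
df['smiles'].to_csv('out.txt', index=False, header=False)
The ! form is still handy when you want to capture everything the script prints, not just one column.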

Related

convert pandas datetime column yyyy-mm-dd to YYYYMMDD

I have a dataframe with a datetime column in the format yyyy-mm-dd.
I would like to have it in integer format yyyymmdd. I keep getting an error using this
x=dates.apply(dt.datetime.strftime('%Y%m%d')).astype(int)
TypeError: descriptor 'strftime' requires a 'datetime.date' object but received a 'str'
This doesn't work since I tried to pass an array. I know that if I pass just one element it will convert, but how do I do it in a more pythonic way? I did try using a lambda but that didn't work either.
If your column is a string, you will need to first use pd.to_datetime:
df['Date'] = pd.to_datetime(df['Date'])
Then, use .dt datetime accessor with strftime:
df = pd.DataFrame({'Date':pd.date_range('2017-01-01', periods = 60, freq='D')})
df.Date.dt.strftime('%Y%m%d').astype(int)
Or use lambda function:
df.Date.apply(lambda x: x.strftime('%Y%m%d')).astype(int)
Output:
0 20170101
1 20170102
2 20170103
3 20170104
4 20170105
5 20170106
6 20170107
7 20170108
8 20170109
9 20170110
10 20170111
11 20170112
12 20170113
13 20170114
14 20170115
15 20170116
16 20170117
17 20170118
18 20170119
19 20170120
20 20170121
21 20170122
22 20170123
23 20170124
24 20170125
25 20170126
26 20170127
27 20170128
28 20170129
29 20170130
30 20170131
31 20170201
32 20170202
33 20170203
34 20170204
35 20170205
36 20170206
37 20170207
38 20170208
39 20170209
40 20170210
41 20170211
42 20170212
43 20170213
44 20170214
45 20170215
46 20170216
47 20170217
48 20170218
49 20170219
50 20170220
51 20170221
52 20170222
53 20170223
54 20170224
55 20170225
56 20170226
57 20170227
58 20170228
59 20170301
Name: Date, dtype: int32
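For the case in the question, where the column starts out as strings, a minimal end-to-end sketch (the column name 'Date' and the sample values are assumptions):
import pandas as pd
df = pd.DataFrame({'Date': ['2017-01-01', '2017-01-02', '2017-01-03']})
df['Date'] = pd.to_datetime(df['Date'])                         # parse the yyyy-mm-dd strings
df['Date_int'] = df['Date'].dt.strftime('%Y%m%d').astype(int)   # 20170101, 20170102, 20170103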

Analysing Json file in Python using pandas

I have to analyse a lot of data for my Bachelor's project.
The data will be handed to me in .json files. My supervisor has told me that it should be fairly easy if I just use Pandas.
Since I am all new to Python (I have decent experience with MatLab and C though) I am having a rough start.
If someone would be so kind as to explain to me how to do this, I would really appreciate it.
The files look like this:
{"columns":["id","timestamp","offset_freq","reprate_freq"],
"index":[0,1,2,3,4,5,6,7 ...
"data":[[526144,1451900097533,20000000.495000001,250000093.9642499983],[...
I need to import the data and analyse it (make some plots), but I'm not sure how to import data like this.
Ps. I have Python and the required packages installed.
You did not give the full format of the JSON file, but if it looks like
{"columns":["id","timestamp","offset_freq","reprate_freq"],
"index":[0,1,2,3,4,5,6,7,8,9],
"data":[[39,69,50,51],[62,14,12,49],[17,99,65,79],[93,5,29,0],[89,37,42,47],[83,79,26,29],[88,17,2,7],[95,87,34,34],[40,54,18,68],[84,56,94,40]]}
then you can do (I made up random numbers)
df = pd.read_json(file_name_or_Python_string, orient='split')
print(df)
id timestamp offset_freq reprate_freq
0 39 69 50 51
1 62 14 12 49
2 17 99 65 79
3 93 5 29 0
4 89 37 42 47
5 83 79 26 29
6 88 17 2 7
7 95 87 34 34
8 40 54 18 68
9 84 56 94 40
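For the plotting part, once the DataFrame is loaded you can plot straight from pandas; a short sketch (the file name data.json is a placeholder, and matplotlib is assumed to be installed):
import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_json('data.json', orient='split')
df.plot(x='timestamp', y='offset_freq')   # quick line plot of one column against the timestamp
plt.show()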

Encoding error when opening an Excel file with xlrd

I am trying to open an Excel file (.xls) using xlrd. This is a summary of the code I am using:
import xlrd
workbook = xlrd.open_workbook('thefile.xls')
This works for most files, but fails for files I get from a specific organization. The error I get when I try to open Excel files from this organization follows.
Traceback (most recent call last):
File "<console>", line 1, in <module>
File "/app/.heroku/python/lib/python2.7/site-packages/xlrd/__init__.py", line 435, in open_workbook
ragged_rows=ragged_rows,
File "/app/.heroku/python/lib/python2.7/site-packages/xlrd/book.py", line 116, in open_workbook_xls
bk.parse_globals()
File "/app/.heroku/python/lib/python2.7/site-packages/xlrd/book.py", line 1180, in parse_globals
self.handle_writeaccess(data)
File "/app/.heroku/python/lib/python2.7/site-packages/xlrd/book.py", line 1145, in handle_writeaccess
strg = unpack_unicode(data, 0, lenlen=2)
File "/app/.heroku/python/lib/python2.7/site-packages/xlrd/biffh.py", line 303, in unpack_unicode
strg = unicode(rawstrg, 'utf_16_le')
File "/app/.heroku/python/lib/python2.7/encodings/utf_16_le.py", line 16, in decode
return codecs.utf_16_le_decode(input, errors, True)
UnicodeDecodeError: 'utf16' codec can't decode byte 0x40 in position 104: truncated data
This looks as if xlrd is trying to open an Excel file encoded in something other than UTF-16. How can I avoid this error? Is the file being written in a flawed way, or is there just a specific character that is causing the problem? If I open and re-save the Excel file, xlrd opens the file without a problem.
I have tried opening the workbook with different encoding overrides but this doesn't work either.
The file I am trying to open is available here:
https://dl.dropboxusercontent.com/u/6779408/Stackoverflow/AEPUsageHistoryDetail_RequestID_00183816.xls
Issue reported here: https://github.com/python-excel/xlrd/issues/128
What are they using to generate that file?
They are using some Java Excel API (see below, link here), probably on an IBM mainframe or similar.
From the stack trace, the WRITEACCESS information can't be decoded as Unicode because of the 0x40 bytes (shown as # in the dump below).
For more information on the WRITEACCESS record in the XLS file format, see section 5.112 WRITEACCESS (page 277).
This field contains the username of the user that has saved the file.
Running xlrd.dump on the original file
import xlrd
xlrd.dump('thefile.xls')
gives
36: 005c WRITEACCESS len = 0070 (112)
40: d1 81 a5 81 40 c5 a7 83 85 93 40 c1 d7 c9 40 40 ????#?????#???##
56: 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 ################
72: 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 ################
88: 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 ################
104: 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 ################
120: 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 ################
136: 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 ################
After resaving it with Excel or in my case LibreOffice Calc the write access information is overwritten with something like
36: 005c WRITEACCESS len = 0070 (112)
40: 04 00 00 43 61 6c 63 20 20 20 20 20 20 20 20 20 ?~~Calc
56: 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
72: 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
88: 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
104: 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
120: 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
136: 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
Based on the spaces being encoded as 40, I believe the encoding is EBCDIC, and when we convert d1 81 a5 81 40 c5 a7 83 85 93 40 c1 d7 c9 40 40 to EBCDIC we get Java Excel API.
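A quick way to check that guess, using Python 3's bytes.fromhex and the cp037 (EBCDIC 037) codec; this is only a verification sketch:
raw = bytes.fromhex('d1 81 a5 81 40 c5 a7 83 85 93 40 c1 d7 c9 40 40')
print(raw.decode('cp037'))   # prints 'Java Excel API  '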
So yes, the file is being written in a flawed way: in BIFF8 and higher this field should be a Unicode string, while in BIFF3 to BIFF5 it should be a byte string in the encoding given by the CODEPAGE record, which is
152: 0042 CODEPAGE len = 0002 (2)
156: 12 52 ?R
1252 is Windows CP-1252 (Latin I) (BIFF4-BIFF5), which is not EBCDIC_037.
The fact that xlrd tried to use Unicode means it determined the version of the file to be BIFF8.
In this case, you have two options:
Fix the file before opening it with xlrd. You could check by dumping to a file rather than standard out, and if the WRITEACCESS record is indeed the problem, overwrite it with xlutils.save or another library.
Patch xlrd to handle your special case: in handle_writeaccess, add a try block and set strg to an empty string when unpack_unicode fails.
The following snippet
def handle_writeaccess(self, data):
    DEBUG = 0
    if self.biff_version < 80:
        if not self.encoding:
            self.raw_user_name = True
            self.user_name = data
            return
        strg = unpack_string(data, 0, self.encoding, lenlen=1)
    else:
        try:
            strg = unpack_unicode(data, 0, lenlen=2)
        except:
            strg = ""
    if DEBUG: fprintf(self.logfile, "WRITEACCESS: %d bytes; raw=%s %r\n", len(data), self.raw_user_name, strg)
    strg = strg.rstrip()
    self.user_name = strg
with
workbook=xlrd.open_workbook('thefile.xls',encoding_override="cp1252")
seems to open the file successfully.
Without the encoding override it complains ERROR *** codepage 21010 -> encoding 'unknown_codepage_21010' -> LookupError: unknown encoding: unknown_codepage_21010
This worked for me.
import xlrd
my_xls = xlrd.open_workbook('//myshareddrive/something/test.xls',encoding_override="gb2312")
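Once the workbook opens, reading the data back is the usual xlrd pattern; a short sketch (first worksheet assumed):
sheet = my_xls.sheet_by_index(0)          # first worksheet
for r in range(min(sheet.nrows, 5)):      # look at the first few rows
    print(sheet.row_values(r))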

Python code crashes when running, but not when debugging (Ctypes)

I am running into a REALLY weird case with a little class involving ctypes that I am writing. The objective of this class is to load a matrix that is in proprietary format into a python structure that I had to create (these matrices can have several cores/layers and each core/layer can have several indices that refer to only a few elements of the matrix, thus forming submatrices).
The code that test the class is this:
import numpy as np
from READS_MTX import mtx
import time
mymatrix=mtx()
mymatrix.load('D:\\MyMatrix.mtx', True)
and the class I created is this:
import os
import numpy as np
from ctypes import *
import ctypes
import time
def main():
    pass

#A mydll mtx can have several cores
#we need a class to define each core
#and a class to hold the whole file together
class mtx_core:
    def __init__(self):
        self.name=None         #Matrix core name
        self.rows=-1           #Number of rows in the matrix
        self.columns=-1        #Number of columns in the matrix
        self.type=-1           #Data type of the matrix
        self.indexcount=-1     #Tuple with the number of indices for each dimension
        self.RIndex={}         #Dictionary with all indices for the rows
        self.CIndex={}         #Dictionary with all indices for the columns
        self.basedata=None
        self.matrix=None

    def add_core(self, mydll,mat,core):
        nameC=ctypes.create_string_buffer(50)
        mydll.MATRIX_GetLabel(mat,0,nameC)
        nameC=repr(nameC.value)
        nameC=nameC[1:len(nameC)-1]
        #Add the information to the objects' methods
        self.name=repr(nameC)
        self.rows= mydll.MATRIX_GetBaseNRows(mat)
        self.columns=mydll.MATRIX_GetBaseNCols(mat)
        self.type=mydll.MATRIX_GetDataType(mat)
        self.indexcount=(mydll.MATRIX_GetNIndices(mat,0 ),mydll.MATRIX_GetNIndices(mat,0 ))
        #Define the data type Numpy will have according to the data type of the matrix in question
        dt=np.float64
        v=(self.columns*c_float)()
        if self.type==1:
            dt=np.int32
            v=(self.columns*c_long)()
        if self.type==2:
            dt=np.int64
            v=(self.columns*c_longlong)()
        #Instantiate the matrix
        time.sleep(5)
        self.basedata=np.zeros((self.rows,self.columns),dtype=dt)
        #Read matrix and puts in the numpy array
        for i in range(self.rows):
            mydll.MATRIX_GetBaseVector(mat,i,0,self.type,v)
            self.basedata[i,:]=v[:]
        #Reads all the indices for rows and put them in the dictionary
        for i in range(self.indexcount[0]):
            mydll.MATRIX_SetIndex(mat, 0, i)
            v=(mydll.MATRIX_GetNRows(mat)*c_long)()
            mydll.MATRIX_GetIDs(mat,0, v)
            t=np.zeros(mydll.MATRIX_GetNRows(mat),np.int64)
            t[:]=v[:]
            self.RIndex[i]=t.copy()
        #Do the same for columns
        for i in range(self.indexcount[1]):
            mydll.MATRIX_SetIndex(mat, 1, i)
            v=(mydll.MATRIX_GetNCols(mat)*c_long)()
            mydll.MATRIX_GetIDs(mat,1, v)
            t=np.zeros(mydll.MATRIX_GetNCols(mat),np.int64)
            t[:]=v[:]
            self.CIndex[i]=t.copy()

class mtx:
    def __init__(self):
        self.data=None
        self.cores=-1
        self.matrix={}
        mydll=None

    def load(self, filename, verbose=False):
        #We load the DLL and initiate it
        mydll=cdll.LoadLibrary('C:\\Program Files\\Mysoftware\\matrixDLL.dll')
        mydll.InitMatDLL()
        mat=mydll.MATRIX_LoadFromFile(filename, True)
        if mat<>0:
            self.cores=mydll.MATRIX_GetNCores(mat)
            if verbose==True: print "Matrix has ", self.cores, " cores"
            for i in range(self.cores):
                mydll.MATRIX_SetCore(mat,i)
                nameC=ctypes.create_string_buffer(50)
                mydll.MATRIX_GetLabel(mat,i,nameC)
                nameC=repr(nameC.value)
                nameC=nameC[1:len(nameC)-1]
                #If verbose, we list the matrices being loaded
                if verbose==True: print " Loading core: ", nameC
                self.datafile=filename
                self.matrix[nameC]=mtx_core()
                self.matrix[nameC].add_core(mydll,mat,i)
        else:
            raise NameError('Not possible to open file. TranCad returned '+ str(tc_value))
        mydll.MATRIX_CloseFile(filename)
        mydll.MATRIX_Done(mat)

if __name__ == '__main__':
    main()
When I run the test code in ANY form (double clicking, Python's IDLE, or PyScripter) it crashes with the familiar error "WindowsError: exception: access violation writing 0x0000000000000246", but when I debug the code using PyScripter, stopping in any inner loop, it runs perfectly.
I'd really appreciate any insights.
EDIT
The Dumpbin output for the DLL:
File Type: DLL
Section contains the following exports for CaliperMTX.dll
00000000 characteristics
52FB9F15 time date stamp Wed Feb 12 08:19:33 2014
0.00 version
1 ordinal base
81 number of functions
81 number of names
ordinal hint RVA name
1 0 0001E520 InitMatDLL
2 1 0001B140 MATRIX_AddIndex
3 2 0001AEE0 MATRIX_Clear
4 3 0001AE30 MATRIX_CloseFile
5 4 00007600 MATRIX_Copy
6 5 000192A0 MATRIX_CreateCache
7 6 00019160 MATRIX_CreateCacheEx
8 7 0001EB10 MATRIX_CreateSimple
9 8 0001ED20 MATRIX_CreateSimpleLike
10 9 00016D40 MATRIX_DestroyCache
11 A 00016DA0 MATRIX_DisableCache
12 B 0001A880 MATRIX_Done
13 C 0001B790 MATRIX_DropIndex
14 D 00016D70 MATRIX_EnableCache
15 E 00015B10 MATRIX_GetBaseNCols
16 F 00015B00 MATRIX_GetBaseNRows
17 10 00015FF0 MATRIX_GetBaseVector
18 11 00015CE0 MATRIX_GetCore
19 12 000164C0 MATRIX_GetCurrentIndexPos
20 13 00015B20 MATRIX_GetDataType
21 14 00015EE0 MATRIX_GetElement
22 15 00015A30 MATRIX_GetFileName
23 16 00007040 MATRIX_GetIDs
24 17 00015B80 MATRIX_GetInfo
25 18 00015A50 MATRIX_GetLabel
26 19 00015AE0 MATRIX_GetNCols
27 1A 00015AB0 MATRIX_GetNCores
28 1B 00016EC0 MATRIX_GetNIndices
29 1C 00015AC0 MATRIX_GetNRows
30 1D 00018AF0 MATRIX_GetVector
31 1E 00015B40 MATRIX_IsColMajor
32 1F 00015B60 MATRIX_IsFileBased
33 20 000171A0 MATRIX_IsReadOnly
34 21 00015B30 MATRIX_IsSparse
35 22 0001AE10 MATRIX_LoadFromFile
36 23 0001BAE0 MATRIX_New
37 24 00017150 MATRIX_OpenFile
38 25 000192D0 MATRIX_RefreshCache
39 26 00016340 MATRIX_SetBaseVector
40 27 00015C20 MATRIX_SetCore
41 28 00016200 MATRIX_SetElement
42 29 00016700 MATRIX_SetIndex
43 2A 0001AFA0 MATRIX_SetLabel
44 2B 00018E50 MATRIX_SetVector
45 2C 00005DA0 MAT_ACCESS_Create
46 2D 00005E40 MAT_ACCESS_CreateFromCurrency
47 2E 00004B10 MAT_ACCESS_Done
48 2F 00005630 MAT_ACCESS_FillRow
49 30 000056D0 MAT_ACCESS_FillRowDouble
50 31 00005A90 MAT_ACCESS_GetCurrency
51 32 00004C30 MAT_ACCESS_GetDataType
52 33 000058E0 MAT_ACCESS_GetDoubleValue
53 34 00004C40 MAT_ACCESS_GetIDs
54 35 00005AA0 MAT_ACCESS_GetMatrix
55 36 00004C20 MAT_ACCESS_GetNCols
56 37 00004C10 MAT_ACCESS_GetNRows
57 38 000055A0 MAT_ACCESS_GetRowBuffer
58 39 00005570 MAT_ACCESS_GetRowID
59 3A 00005610 MAT_ACCESS_GetToReadFlag
60 3B 00005870 MAT_ACCESS_GetValue
61 3C 00005AB0 MAT_ACCESS_IsValidCurrency
62 3D 000055E0 MAT_ACCESS_SetDirty
63 3E 000059F0 MAT_ACCESS_SetDoubleValue
64 3F 00005620 MAT_ACCESS_SetToReadFlag
65 40 00005960 MAT_ACCESS_SetValue
66 41 00005460 MAT_ACCESS_UseIDs
67 42 00005010 MAT_ACCESS_UseIDsEx
68 43 00005490 MAT_ACCESS_UseOwnIDs
69 44 00004D10 MAT_ACCESS_ValidateIDs
70 45 0001E500 MAT_pafree
71 46 0001E4E0 MAT_palloc
72 47 0001E4F0 MAT_pfree
73 48 0001E510 MAT_prealloc
74 49 00006290 MA_MGR_AddMA
75 4A 00006350 MA_MGR_AddMAs
76 4B 00005F90 MA_MGR_Create
77 4C 00006050 MA_MGR_Done
78 4D 000060D0 MA_MGR_RegisterThreads
79 4E 00006170 MA_MGR_SetRow
80 4F 00006120 MA_MGR_UnregisterThread
81 50 0001E490 UnloadMatDLL
Summary
6000 .data
5000 .pdata
C000 .rdata
1000 .reloc
1000 .rsrc
54000 .text

programming challenge help (python)? [duplicate]

This question already has answers here:
Euler project #18 approach
(10 answers)
Closed 9 years ago.
I'm trying to solve Project Euler problem 18/67. I have an attempt but it isn't correct.
tri = '''\
75
95 64
17 47 82
18 35 87 10
20 04 82 47 65
19 01 23 75 03 34
88 02 77 73 07 63 67
99 65 04 28 06 16 70 92
41 41 26 56 83 40 80 70 33
41 48 72 33 47 32 37 16 94 29
53 71 44 65 25 43 91 52 97 51 14
70 11 33 28 77 73 17 78 39 68 17 57
91 71 52 38 17 14 91 43 58 50 27 29 48
63 66 04 68 89 53 67 30 73 16 69 87 40 31
04 62 98 27 23 09 70 98 73 93 38 53 60 04 23'''
sum = 0
spot_index = 0
triarr = list(filter(lambda e: len(e) > 0, [[int(nm) for nm in ln.split()] for ln in tri.split('\n')]))
for i in triarr:
    if len(i) == 1:
        sum += i[0]
    elif len(i) == 2:
        spot_index = i.index(max(i))
        sum += i[spot_index]
    else:
        spot_index = i.index(max(i[spot_index],i[spot_index+1]))
        sum += i[spot_index]
print(sum)
When I run the program, it is always a little bit off of what the correct sum/output should be. I'm pretty sure that it's an algorithm problem, but I don't know how exactly to fix it or what the best approach to the original problem might be.
Your algorithm is wrong. Consider if there was a large number like 1000000 on the bottom row. Your algorithm might follow a path that doesn't find it at all.
The question hints that this one can be brute forced, but that there is also a more clever way to solve it.
Somehow your algorithm will need to consider all possible pathways/sums.
The brute force method is to try each and every one from top to bottom.
The clever way uses a technique called dynamic programming.
Here's the algorithm. I'll let you figure out a way to code it.
Start with the two bottom rows. At each element of the next-to-bottom row, figure out what the sum will be if you reach that element by adding the maximum of the two elements of the bottom row that correspond to the current element of the next-to-bottom row. For instance, given the sample above, the left-most element of the next-to-bottom row is 63, and if you ever reach that element, you will certainly choose its right child 62. So you can replace the 63 on the next-to-bottom row with 63 + 62 = 125. Do the same for each element of the next-to-bottom row; you will get 125, 164, 102, 95, 112, 123, 165, 128, 166, 109, 122, 147, 100, 54. Now delete the bottom row and repeat on the reduced triangle.
There is also a top-down algorithm that is dual to the one given above. I'll let you figure that out, too.
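If you want something concrete to test your own version against, here is one way the bottom-up reduction described above could be coded, reusing the triarr list already built in the question (a sketch, not the only way to do it):
rows = [row[:] for row in triarr]              # work on a copy of the triangle
while len(rows) > 1:
    last = rows.pop()                          # current bottom row
    prev = rows[-1]
    for j in range(len(prev)):
        prev[j] += max(last[j], last[j + 1])   # best of the two children below
print(rows[0][0])                              # maximum top-to-bottom total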
