How to read .dcm in Xcode using python?

I'm trying to create an app for viewing and analyzing DICOM slices. I already built this app in MATLAB, but MATLAB does not have enough tools to build a really nice GUI, and its 3D rendering is poor. So I spent a long time trying to use ITK and VTK to build the app in Xcode, without any success. One day I found the Xcode project PythonDicomDocument; this project (written in Python) can read and display a DICOM image! I have read a tutorial about Python and Cocoa, but I still can't understand how this project works. It has a file PythonDicomDocumentDocument.py:
from Foundation import *
from AppKit import *
from iiDicom import *
import objc
import dicom
import numpy
import Image
class PythonDicomDocumentDocument(NSDocument):
    imageView = objc.IBOutlet('imageView')
    def init(self):
        self = super(PythonDicomDocumentDocument, self).init()
        self.image = None
        return self
    def windowNibName(self):
        return u"PythonDicomDocumentDocument"
    def windowControllerDidLoadNib_(self, aController):
        super(PythonDicomDocumentDocument, self).windowControllerDidLoadNib_(aController)
        if self.image:
            self.imageView.setImageScaling_(NSScaleToFit)
            self.imageView.setImage_(self.image)
    def dataOfType_error_(self, typeName, outError):
        return None
    def readFromData_ofType_error_(self, data, typeName, outError):
        return NO
    def readFromURL_ofType_error_(self, absoluteURL, typeName, outError):
        if absoluteURL.isFileURL():
            slice = iiDcmSlice.alloc().initWithDicomFileSlice_(absoluteURL.path())
            dicomImage = slice.sliceAsNSImage_context_(True, None)
            if dicomImage:
                self.image = dicomImage
                #self.image = dicomImage
                return True, None
        return False, None
and file main.m:
#import <Python/Python.h>
#import <Cocoa/Cocoa.h>
int main(int argc, char *argv[])
{
    NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];
    NSBundle *mainBundle = [NSBundle mainBundle];
    NSString *resourcePath = [mainBundle resourcePath];
    NSArray *pythonPathArray = [NSArray arrayWithObjects: resourcePath, [resourcePath stringByAppendingPathComponent:@"PyObjC"], @"/System/Library/Frameworks/Python.framework/Versions/Current/Extras/lib/python/", nil];
    setenv("PYTHONPATH", [[pythonPathArray componentsJoinedByString:@":"] UTF8String], 1);
    NSArray *possibleMainExtensions = [NSArray arrayWithObjects: @"py", @"pyc", @"pyo", nil];
    NSString *mainFilePath = nil;
    for (NSString *possibleMainExtension in possibleMainExtensions) {
        mainFilePath = [mainBundle pathForResource: @"main" ofType: possibleMainExtension];
        if ( mainFilePath != nil ) break;
    }
    if ( !mainFilePath ) {
        [NSException raise: NSInternalInconsistencyException format: @"%s:%d main() Failed to find the Main.{py,pyc,pyo} file in the application wrapper's Resources directory.", __FILE__, __LINE__];
    }
    Py_SetProgramName("/usr/bin/python");
    Py_Initialize();
    PySys_SetArgv(argc, (char **)argv);
    const char *mainFilePathPtr = [mainFilePath UTF8String];
    FILE *mainFile = fopen(mainFilePathPtr, "r");
    int result = PyRun_SimpleFile(mainFile, (char *)[[mainFilePath lastPathComponent] UTF8String]);
    if ( result != 0 )
        [NSException raise: NSInternalInconsistencyException
                    format: @"%s:%d main() PyRun_SimpleFile failed with file '%@'. See console for errors.", __FILE__, __LINE__, mainFilePath];
    [pool drain];
    return result;
}
So I want to "translate" MATLAB code for reading .dcm:
directory = uigetdir; % a Finder window appears and the user chooses a folder with .dcm files
fileFolder = directory; % the path to the folder is saved in the variable fileFolder
dirOutput = dir(fullfile(fileFolder,'*.dcm')); % list the .dcm files in the specified folder
fileNames = {dirOutput.name}'; % and save their names
Names = char(fileNames);
numFrames = numel(fileNames); % count the number of files in the folder
for i = 1:numFrames
    Volume(:,:,i) = dicomread(fullfile(fileFolder,Names(i,:))); % create a 3D array of DICOM pixel data
end;
Could anyone please tell me how to run equivalent code for reading .dcm files in Xcode using Python?
I've heard that Python and MATLAB are similar.

Congratulations on choosing Python for working with DICOM; the SciPy/numpy/matplotlib clan is much better at dealing with huge amounts of volume data than MATLAB (or at least GNU Octave) in my experience.
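As a rough illustration, the MATLAB loop in the question maps onto numpy in a few lines. This is only a sketch: `load_volume` and its `read_slice` parameter are hypothetical names, where `read_slice` stands for any function that turns one file path into a 2D array (for example a loader like the `loadDicomImage` function in this answer). Sorting by file name is an assumption that may not match the anatomical slice order.

```python
import glob
import os

import numpy as np


def load_volume(folder, read_slice):
    """Stack every *.dcm file in `folder` into a (rows, cols, n_slices) array."""
    # Equivalent of dir(fullfile(fileFolder, '*.dcm')), sorted by file name.
    paths = sorted(glob.glob(os.path.join(folder, '*.dcm')))
    if not paths:
        raise ValueError('no .dcm files found in %r' % folder)
    # Equivalent of Volume(:,:,i) = dicomread(...): slices go on the last axis.
    return np.stack([read_slice(path) for path in paths], axis=-1)
```

All slices must have the same shape for `np.stack` to succeed.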
Trivial load-and-display code using GDCM's Python bindings, ConvertNumpy.py from GDCM's examples, and matplotlib:
#!/usr/bin/env python
import gdcm
import ConvertNumpy
import numpy as np
import matplotlib.pyplot as plt
def loadDicomImage(filename):
    reader=gdcm.ImageReader()
    reader.SetFileName(filename)
    reader.Read()
    gdcmimage=reader.GetImage()
    return ConvertNumpy.gdcm_to_numpy(gdcmimage)
image=loadDicomImage('mydicomfile.dcm')
plt.gray()
plt.imshow(image)
plt.show()
Note that if your DICOM data contains "padding" values significantly outside your image's air-bone range, they might confuse imshow's auto-scaling. Use the vmin and vmax parameters of that call to specify the range you actually want to see, or implement your own window-levelling code (which is trivial in numpy).
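A window-levelling helper could look roughly like the following. The function name `window_level` and the sample values are made up for illustration; the idea is simply to map a chosen intensity window onto [0, 1] and clip everything outside it.

```python
import numpy as np


def window_level(image, center, width):
    """Map [center - width/2, center + width/2] to [0, 1], clipping the rest."""
    low = center - width / 2.0
    high = center + width / 2.0
    scaled = (np.asarray(image, dtype=np.float64) - low) / (high - low)
    return np.clip(scaled, 0.0, 1.0)


# A fake slice containing a padding value (-2000) far below the air range.
fake_slice = np.array([[-2000, -1000, 0], [40, 400, 3000]])
print(window_level(fake_slice, center=40, width=400))
```

The result can be passed straight to `plt.imshow`, which then no longer needs to guess a display range.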

Related

Python ctypes writing data to be read by C executable

I'm trying to learn how to use the Python ctypes library to write data to a file that can easily be read by C executables. In the little test case that I've put together, I'm running into some problems with reading/writing character arrays.
At the moment, I have three source files. write_struct.py creates a simple struct with two entries, an integer value called git and a character array called command, then writes the struct to a file using ctypes.fwrite. read_struct.c and read_struct.h compile into an executable that internally defines an identical struct to the one in write_struct.py, then reads in the data written by the python script and prints it out.
At the moment, the following values are assigned in the python file (not literally in the manner shown below, scroll down to see the actual code):
git = 1
command = 'cp file1 file2'
And when run, the C executable prints the following:
git: 1
command:
I realize that the problem is almost certainly in how the command variable is being assigned in the python script. I have read that c_char_p() (the function I'm currently using to initialize the data in that variable) does not create a pointer to mutable memory, and create_string_buffer() should be used instead, however I'm not sure about how this works with either adding that data to a struct, or writing it to a file. I guess I'm also confused about how writing pointers/their data to a file works in the first place. What is the best way to go about doing this?
Thanks in advance to anyone that is able to help!!
The code of my three files is below for reference:
write_struct.py:
"""
write_struct.py
"""
from ctypes import *
libc = cdll.LoadLibrary("libc.so.6")
class DataStruct(Structure):
    _fields_ = [("git", c_int),
                ("command", c_char_p)
                ]
def main():
    pydata = DataStruct(1, c_char_p("cp file1 file2"))
    libc.fopen.argtypes = c_char_p, c_char_p
    libc.fopen.restype = c_void_p
    libc.fwrite = libc.fwrite
    libc.fwrite.argtypes = c_void_p, c_size_t, c_size_t, c_void_p
    libc.fwrite.restype = c_size_t
    libc.fclose = libc.fclose
    libc.fclose.argtypes = c_void_p,
    libc.fclose.restype = c_int
    f = libc.fopen("stored_data", "wb")
    libc.fwrite(byref(pydata), sizeof(pydata), 1, f)
    libc.fclose(f)
    return 0
main()
read_struct.c:
/*
 * read_struct.c
 *
 */
#include "read_struct.h"
int main()
{
    data_struct cdata = malloc(DATASIZE);
    FILE *fp;
    if ((fp = fopen("stored_data", "r")) != NULL) {
        fread(cdata, DATASIZE, 1, fp);
        printf("git: %i\n", cdata->git);
        printf("command:");
        printf("%s\n", cdata->command);
        fclose(fp);
    } else {
        printf("Could not open file\n");
        exit(1);
    }
    return 0;
}
read_struct.h:
/*
* read_struct.h
*
*/
#include <stdio.h>
#include <stdlib.h>
typedef struct _data_struct *data_struct;
struct _data_struct {
    int git;
    char command[40];
};
#define DATASIZE sizeof(struct _data_struct)
You can write binary data directly with Python. ctypes can be used to create the structure and supports bit fields and unions, or for simple structures the struct module can be used.
from ctypes import *
class DataStruct(Structure):
    _fields_ = [("git", c_int),
                ("command", c_char * 40)] # You want array here, not pointer
pydata = DataStruct(1,b'cp file1 file2') # byte string for initialization.
with open('stored_data','wb') as f: # write file in binary mode
    f.write(pydata) # ctypes support conversion to bytes
import struct
# See struct docs for formatting codes
# i = int (native-endian. Use <i to force little-endian, >i for big-endian)
# 40s = char[40] (zero-padded if initializer is shorter)
pydata = struct.pack('i40s',1,b'cp file1 file2')
with open('stored_data2','wb') as f:
    f.write(pydata)
Ref: https://docs.python.org/3/library/struct.html#format-strings
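To see why the array field matters, a quick round-trip check can help. The sketch below (Python 3; the `PointerStruct` name is made up for comparison) serialises the structure, unpacks it with the struct module the way the C reader's fread would see it, and prints the two structure sizes. A c_char_p field stores only an address, so the characters themselves never reach the file.

```python
import ctypes
import struct


class DataStruct(ctypes.Structure):
    # Matches: struct { int git; char command[40]; }
    _fields_ = [("git", ctypes.c_int),
                ("command", ctypes.c_char * 40)]


class PointerStruct(ctypes.Structure):
    # Hypothetical counterpart of the question's layout: the second field
    # holds a pointer, so only an address would be written to disk.
    _fields_ = [("git", ctypes.c_int),
                ("command", ctypes.c_char_p)]


raw = bytes(DataStruct(1, b"cp file1 file2"))   # the bytes f.write() would emit
git, command = struct.unpack("i40s", raw)       # same layout as the C struct
print(git, command.rstrip(b"\0"))
print(ctypes.sizeof(DataStruct), ctypes.sizeof(PointerStruct))
```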

camera image incorrectly formatted in ctypes pointer (python)

I am using a DLL to call functions that operate a camera from Python. I'm able to retrieve the image using ctypes, but it's formatted incorrectly: the returned image is duplicated and half of it is blank. What do I need to do to fix this?
I have a LabVIEW program that correctly takes images from the camera, so that is what they are supposed to look like.
Correct image retrieved using LabVIEW:
Image retrieved using Python:
The image is duplicated and also sideways in Python.
python code:
from ctypes import *
import numpy as np
import matplotlib.pyplot as plt
mydll = windll.LoadLibrary('StTrgApi.dll')
hCamera = mydll.StTrg_Open()
print(hCamera)
im_height = 1200
im_width = 1600
dwBufferSize = im_height * im_width
pbyteraw = np.zeros((im_height, im_width), dtype=np.uint16)
dwNumberOfByteTrans = 0
dwNumberOfByteTrans = (c_ubyte * dwNumberOfByteTrans)()
dwFrameNo = 0
dwFrameNo = (c_ubyte * dwFrameNo)()
dwMilliseconds = 3000
mydll.StTrg_TakeRawSnapShot(hCamera,
    pbyteraw.ctypes.data_as(POINTER(c_int16)), dwBufferSize*2,
    dwNumberOfByteTrans, dwFrameNo, dwMilliseconds)
print(pbyteraw)
plt.matshow(pbyteraw)
plt.show()
C++ code for taking the image:
DWORD dwBufferSize = 0;
if(!StTrg_GetRawDataSize(hCamera, &dwBufferSize))
{
    _tprintf(TEXT("Get Raw Data Size Failed.\n"));
    return(-1);
}
PBYTE pbyteRaw = new BYTE[dwBufferSize];
if(NULL != pbyteRaw)
{
    DWORD dwNumberOfByteTrans = 0;
    DWORD dwFrameNo = 0;
    DWORD dwMilliseconds = 3000;
    for(DWORD dwPos = 0; dwPos < 10; dwPos++)
    {
        if(StTrg_TakeRawSnapShot(hCamera, pbyteRaw, dwBufferSize,
            &dwNumberOfByteTrans, &dwFrameNo, dwMilliseconds))
        {
            TCHAR szFileName[MAX_PATH];
            if(is2BytesMode)
            {
                _stprintf_s(szFileName, _countof(szFileName), TEXT("%s\\%u.tif"), szBitmapFilePath, dwFrameNo);
                StTrg_SaveImage(dwWidth, dwHeight, STCAM_PIXEL_FORMAT_16_MONO_OR_RAW, pbyteRaw, szFileName, 0);
            }
            else
            {
                _stprintf_s(szFileName, _countof(szFileName), TEXT("%s\\%u.bmp"), szBitmapFilePath, dwFrameNo);
                StTrg_SaveImage(dwWidth, dwHeight, STCAM_PIXEL_FORMAT_08_MONO_OR_RAW, pbyteRaw, szFileName, 0);
            }
            _tprintf(TEXT("Save Image:%s\n"), szFileName);
        }
        else
        {
            _tprintf(TEXT("Fail:StTrg_TakeRawSnapShot\n"));
            break;
        }
    }
    delete[] pbyteRaw;
}
Based on your C++ code, something like this should work, but it is untested since I don't have your camera library. If you are using 32-bit Python, make sure the library functions are __stdcall before using WinDLL; otherwise use CDLL. On 64-bit Python it doesn't matter. Defining the argument types and return type helps catch errors. For output parameters, create instances of the correct ctypes type, then pass them with byref(). The way you were passing the output parameters was likely the cause of your crash; setting argtypes would have detected that the values weren't pointers to DWORDs.
from ctypes import *
from ctypes import wintypes as w
mydll = WinDLL('StTrgApi')
mydll.StTrg_Open.argtypes = None
mydll.StTrg_Open.restype = w.HANDLE
mydll.StTrg_GetRawDataSize.argtypes = w.HANDLE,w.PDWORD
mydll.StTrg_GetRawDataSize.restype = None
mydll.StTrg_TakeRawSnapShot.argtypes = w.HANDLE,w.PBYTE,w.DWORD,w.PDWORD,w.PDWORD,w.DWORD
mydll.StTrg_TakeRawSnapShot.restype = None
hCamera = mydll.StTrg_Open()
print(hCamera)
dwBufferSize = w.DWORD()
mydll.StTrg_GetRawDataSize(hCamera,byref(dwBufferSize))
pbyteraw = (w.BYTE * dwBufferSize.value)() # array length must be a Python int
dwNumberOfByteTrans = w.DWORD() # output parameters. Pass byref()
dwFrameNo = w.DWORD() # output parameters. Pass byref()
dwMilliseconds = 3000
mydll.StTrg_TakeRawSnapShot(hCamera,
                            pbyteraw,
                            dwBufferSize,
                            byref(dwNumberOfByteTrans),
                            byref(dwFrameNo),
                            dwMilliseconds)
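Once the buffer is filled, it still has to be viewed with the right pixel size. One possible explanation for a duplicated, half-blank picture is 8-bit data being interpreted as 16-bit pixels, though that is only a guess without the camera at hand. The sketch below uses a hypothetical helper, `buffer_to_image`, that infers the pixel size from the buffer length and assumes little-endian data in the 16-bit case.

```python
import numpy as np


def buffer_to_image(raw, width, height):
    """View a raw mono frame as a (height, width) array of 8- or 16-bit pixels."""
    nbytes = memoryview(raw).nbytes          # works for bytes and ctypes arrays
    bytes_per_pixel, remainder = divmod(nbytes, width * height)
    if remainder or bytes_per_pixel not in (1, 2):
        raise ValueError('buffer size %d does not fit a %dx%d image'
                         % (nbytes, width, height))
    # Assumption: 16-bit frames are stored little-endian.
    dtype = np.uint8 if bytes_per_pixel == 1 else np.dtype('<u2')
    return np.frombuffer(raw, dtype=dtype).reshape(height, width)


frame8 = buffer_to_image(bytes(range(6)), width=3, height=2)
print(frame8.shape, frame8.dtype)
```

With the sizes from the question, the call would be along the lines of `buffer_to_image(pbyteraw, 1600, 1200)`.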

Parsing C in python with libclang but generated the wrong AST

I want to use the libclang Python bindings to generate the AST of some C code. The source code is shown below.
#include <stdlib.h>
#include "adlist.h"
#include "zmalloc.h"
list *listCreate(void)
{
    struct list *list;
    if ((list = zmalloc(sizeof(*list))) == NULL)
        return NULL;
    list->head = list->tail = NULL;
    list->len = 0;
    list->dup = NULL;
    list->free = NULL;
    list->match = NULL;
    return list;
}
And here is the implementation I wrote:
#!/usr/bin/python
# vim: set fileencoding=utf-8
import clang.cindex
import asciitree
import sys
def node_children(node):
    return (c for c in node.get_children() if c.location.file.name == sys.argv[1])
def print_node(node):
    text = node.spelling or node.displayname
    kind = str(node.kind)[str(node.kind).index('.')+1:]
    return '{} {}'.format(kind, text)
if len(sys.argv) != 2:
    print("Usage: dump_ast.py [header file name]")
    sys.exit()
clang.cindex.Config.set_library_file('/usr/lib/llvm-3.6/lib/libclang-3.6.so')
index = clang.cindex.Index.create()
translation_unit = index.parse(sys.argv[1], ['-x', 'c++', '-std=c++11', '-D__CODE_GENERATOR__'])
print(asciitree.draw_tree(translation_unit.cursor, node_children, print_node))
But the final output of this test looks like this:
TRANSLATION_UNIT adlist.c
+--FUNCTION_DECL listCreate
+--COMPOUND_STMT
+--DECL_STMT
+--STRUCT_DECL list
+--VAR_DECL list
+--TYPE_REF struct list
Obviously, the final result is wrong: much of the code is left unparsed. I have tried traversing the translation unit, but the result is just what the tree shows, and many nodes are missing. Why is that, and is there a way to solve the problem? Thank you!
My guess is that libclang is unable to parse malloc(), because stdlib has not been included in this code and no user-defined definition of malloc has been provided.
The parse did not complete successfully, probably because you're missing some include paths.
You can confirm what the exact problem is by printing the diagnostic messages.
translation_unit = index.parse(sys.argv[1], args)
for diag in translation_unit.diagnostics:
    print(diag)
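If the diagnostics report missing headers, the include directories can be passed as compiler arguments. The helpers below are a small sketch with made-up names (`clang_args`, `format_diagnostic`); `-x`, `-I` and `-D` are standard clang options, and `severity`, `spelling` and `location` are attributes that libclang's Python diagnostics expose. The include directory in the usage note is a placeholder.

```python
def clang_args(include_dirs, language="c", defines=()):
    """Build the argument list to hand to index.parse()."""
    args = ["-x", language]
    args += ["-I" + path for path in include_dirs]
    args += ["-D" + name for name in defines]
    return args


def format_diagnostic(diag):
    """Render one diagnostic as 'file:line: [severity N] message'."""
    location = diag.location
    filename = location.file.name if location.file else "<unknown>"
    return "%s:%d: [severity %d] %s" % (
        filename, location.line, diag.severity, diag.spelling)
```

With a parsed unit this might be used as `tu = index.parse(path, clang_args(['./include']))` followed by `for d in tu.diagnostics: print(format_diagnostic(d))`. Since the file in the question is C, parsing it as `c` rather than `c++` may also be worth trying.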

Python convert C header file to dict

I have a C header file which contains a series of classes, and I'm trying to write a function which will take those classes, and convert them to a python dict. A sample of the file is down the bottom.
Format would be something like
class CFGFunctions {
    class ABC {
        class AA {
            file = "abc/aa/functions"
            class myFuncName{ recompile = 1; };
        };
        class BB
        {
            file = "abc/bb/functions"
            class funcName{
                recompile=1;
            }
        }
    };
};
I'm hoping to turn it into something like
{CFGFunctions:{ABC:{AA:"myFuncName"}, BB:...}}
# Or
{CFGFunctions:{ABC:{AA:{myFuncName:"string or list or something"}, BB:...}}}
In the end, I'm aiming to get the file path string (which is actually a path to a folder, but anyway) and the names of the classes that sit in the same class as that file/folder path.
I've had a look on SO, Google and so on, but most things I've found are about splitting lines into dicts, rather than n-deep 'blocks'.
I know I'll have to loop through the file; however, I'm not sure of the most efficient way to convert it to the dict.
I'm thinking I'd need to grab the outside class and its relevant brackets, then do the same for the text remaining inside.
If none of that makes sense, it's because I haven't quite made sense of the process myself, haha.
If any more info is needed, I'm happy to provide.
The following code is a quick mockup of what I'm sort of thinking...
It is most likely BROKEN and probably does NOT WORK, but it's roughly the process that I'm thinking of:
def get_data():
    fh = open('CFGFunctions.h', 'r')
    data = {} # will contain final data model
    # would probably refactor some of this into a function to allow better looping
    start = "" # starting class name
    brackets = 0 # number of brackets
    text= "" # temp storage for lines inside block while looping
    for line in fh:
        # find the class (start
        mt = re.match(r'Class ([\w_]+) {', line)
        if mt:
            if start == "":
                start = mt.group(1)
        else:
            # once we have the first class, find all other open brackets
            mt = re.match(r'{', line)
            if mt:
                # and inc our counter
                brackets += 1
            mt2 = re.match(r'}', line)
            if mt2:
                # find the close, and decrement
                brackets -= 1
                # if we are back to the initial block, break out of the loop
                if brackets == 0:
                    break
            text += line
    data[start] = {'tempText': text}
====
Sample file
class CfgFunctions {
    class ABC {
        class Control {
            file = "abc\abc_sys_1\Modules\functions";
            class assignTracker {
                description = "";
                recompile = 1;
            };
            class modulePlaceMarker {
                description = "";
                recompile = 1;
            };
        };
        class Devices
        {
            file = "abc\abc_sys_1\devices\functions";
            class registerDevice { recompile = 1; };
            class getDeviceSettings { recompile = 1; };
            class openDevice { recompile = 1; };
        };
    };
};
EDIT:
If possible, if I have to use a package, I'd like to have it in the programs directory, not the general python libs directory.
As you detected, parsing is necessary to do the conversion. Have a look at the package PyParsing, which is a fairly easy-to-use library to implement parsing in your Python program.
Edit: This is a very symbolic version of what it would take to recognize a very minimalistic grammar, somewhat like the example at the top of the question. It won't work as written, but it might put you in the right direction:
from pyparsing import ZeroOrMore, OneOrMore, \
    Keyword, Literal, ParseException
test_code = """
class CFGFunctions {
    class ABC {
        class AA {
            file = "abc/aa/functions"
            class myFuncName{ recompile = 1; };
        };
        class BB
        {
            file = "abc/bb/functions"
            class funcName{
                recompile=1;
            }
        }
    };
};
"""
class_tkn = Keyword('class')
lbrace_tkn = Literal('{')
rbrace_tkn = Literal('}')
semicolon_tkn = Literal(';')
assign_tkn = Literal('=')
# NOTE: identifier and assignment are deliberately left undefined in this
# symbolic sketch, and the recursive use of class_block would need Forward().
class_block = ( class_tkn + identifier + lbrace_tkn + \
    OneOrMore(class_block | ZeroOrMore(assignment)) + \
    rbrace_tkn + semicolon_tkn \
    )
def test_parser(test):
    try:
        results = class_block.parseString(test)
        print(test, ' -> ', results)
    except ParseException as s:
        print("Syntax error:", s)
def main():
    test_parser(test_code)
    return 0
if __name__ == '__main__':
    main()
Also, this code is only the parser - it does not generate any output. As you can see in the PyParsing docs, you can later add the actions you want. But the first step would be to recognize the what you want to translate.
And a last note: Do not underestimate the complexities of parsing code... Even with a library like PyParsing, which takes care of much of the work, there are many ways to get mired in infinite loops and other amenities of parsing. Implement things step-by-step!
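For a format this small, a hand-written recursive parser using only the standard library is also an option, which fits the wish to avoid installing packages. The following is a sketch, not a full grammar: `parse_classes` and `TOKEN_RE` are invented names, semicolons are treated as optional, and comments or other constructs are not handled.

```python
import re

# Quoted strings, identifiers, integers and the four punctuation marks.
TOKEN_RE = re.compile(r'"[^"]*"|[A-Za-z_]\w*|\d+|[{}=;]')


def parse_classes(text):
    """Turn nested 'class Name { key = value; ... };' blocks into dicts."""
    tokens = TOKEN_RE.findall(text)
    pos = 0

    def peek(offset=0):
        index = pos + offset
        return tokens[index] if index < len(tokens) else None

    def parse_body():
        nonlocal pos
        body = {}
        while peek() not in (None, '}'):
            token = peek()
            if token == ';':                      # optional separator
                pos += 1
            elif token == 'class':
                name = peek(1)
                if peek(2) != '{':
                    raise ValueError('expected "{" after class %s' % name)
                pos += 3
                body[name] = parse_body()
                if peek() != '}':
                    raise ValueError('missing "}" for class %s' % name)
                pos += 1
            else:                                 # key = value
                if peek(1) != '=' or peek(2) is None:
                    raise ValueError('expected "=" after %s' % token)
                value = peek(2)
                if value.startswith('"'):
                    value = value[1:-1]
                elif value.isdigit():
                    value = int(value)
                body[token] = value
                pos += 3
        return body

    result = parse_body()
    if pos != len(tokens):
        raise ValueError('unbalanced "}"')
    return result


sample = 'class A { file = "abc/aa"; class f { recompile = 1; }; };'
print(parse_classes(sample))
```

Each class becomes a dict keyed by its name, so the folder path and the sibling class names can then be read from the same dict.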
EDIT: A few sources for information on PyParsing are:
http://werc.engr.uaf.edu/~ken/doc/python-pyparsing/HowToUsePyparsing.html
http://pyparsing.wikispaces.com/
(Particularly interesting is http://pyparsing.wikispaces.com/Publications, with a long list of articles - several of them introductory - on PyParsing)
http://pypi.python.org/pypi/pyparsing_helper is a GUI for debugging parsers
There is also a 'tag' Pyparsing here on stackoverflow, Where Paul McGuire (the PyParsing author) seems to be a frequent guest.
* NOTE: *
From PaulMcG in the comments below: Pyparsing is no longer hosted on wikispaces.com. Go to github.com/pyparsing/pyparsing

Create NTFS junction point in Python

Is there a way to create an NTFS junction point in Python? I know I can call the junction utility, but it would be better not to rely on external tools.
Since Python 3.5 there's a function CreateJunction in _winapi module.
import _winapi
_winapi.CreateJunction(source, target)
I answered this in a similar question, so I'll copy my answer to that below. Since writing that answer, I ended up writing a Python-only (if you can call a module that uses ctypes Python-only) module for creating, reading, and checking junctions, which can be found in this folder. Hope that helps.
Also, unlike the answer that uses the CreateSymbolicLinkA API, the linked implementation should work on any Windows version that supports junctions. CreateSymbolicLinkA is only supported in Vista+.
Answer:
python ntfslink extension
Or if you want to use pywin32, you can use the previously stated method, and to read, use:
from win32file import *
from winioctlcon import FSCTL_GET_REPARSE_POINT
__all__ = ['islink', 'readlink']
# Win32file doesn't seem to have this attribute.
FILE_ATTRIBUTE_REPARSE_POINT = 1024
# To make things easier.
REPARSE_FOLDER = (FILE_ATTRIBUTE_DIRECTORY | FILE_ATTRIBUTE_REPARSE_POINT)
# For the parse_reparse_buffer function
SYMBOLIC_LINK = 'symbolic'
MOUNTPOINT = 'mountpoint'
GENERIC = 'generic'
def islink(fpath):
    """ Windows islink implementation. """
    if GetFileAttributes(fpath) & REPARSE_FOLDER:
        return True
    return False
def parse_reparse_buffer(original, reparse_type=SYMBOLIC_LINK):
    """ Implementing the below in Python:
    typedef struct _REPARSE_DATA_BUFFER {
        ULONG  ReparseTag;
        USHORT ReparseDataLength;
        USHORT Reserved;
        union {
            struct {
                USHORT SubstituteNameOffset;
                USHORT SubstituteNameLength;
                USHORT PrintNameOffset;
                USHORT PrintNameLength;
                ULONG Flags;
                WCHAR PathBuffer[1];
            } SymbolicLinkReparseBuffer;
            struct {
                USHORT SubstituteNameOffset;
                USHORT SubstituteNameLength;
                USHORT PrintNameOffset;
                USHORT PrintNameLength;
                WCHAR PathBuffer[1];
            } MountPointReparseBuffer;
            struct {
                UCHAR  DataBuffer[1];
            } GenericReparseBuffer;
        } DUMMYUNIONNAME;
    } REPARSE_DATA_BUFFER, *PREPARSE_DATA_BUFFER;
    """
    # Size of our data types
    SZULONG = 4 # sizeof(ULONG)
    SZUSHORT = 2 # sizeof(USHORT)
    # Our structure.
    # Probably a better way to iterate a dictionary in a particular order,
    # but I was in a hurry, unfortunately, so I used pkeys.
    buffer = {
        'tag' : SZULONG,
        'data_length' : SZUSHORT,
        'reserved' : SZUSHORT,
        SYMBOLIC_LINK : {
            'substitute_name_offset' : SZUSHORT,
            'substitute_name_length' : SZUSHORT,
            'print_name_offset' : SZUSHORT,
            'print_name_length' : SZUSHORT,
            'flags' : SZULONG,
            'buffer' : u'',
            'pkeys' : [
                'substitute_name_offset',
                'substitute_name_length',
                'print_name_offset',
                'print_name_length',
                'flags',
            ]
        },
        MOUNTPOINT : {
            'substitute_name_offset' : SZUSHORT,
            'substitute_name_length' : SZUSHORT,
            'print_name_offset' : SZUSHORT,
            'print_name_length' : SZUSHORT,
            'buffer' : u'',
            'pkeys' : [
                'substitute_name_offset',
                'substitute_name_length',
                'print_name_offset',
                'print_name_length',
            ]
        },
        GENERIC : {
            'pkeys' : [],
            'buffer': ''
        }
    }
    # Header stuff (slice end points are absolute offsets, not lengths)
    buffer['tag'] = original[:SZULONG]
    buffer['data_length'] = original[SZULONG:SZULONG+SZUSHORT]
    buffer['reserved'] = original[SZULONG+SZUSHORT:SZULONG+2*SZUSHORT]
    original = original[8:]
    # Parsing
    k = reparse_type
    for c in buffer[k]['pkeys']:
        if type(buffer[k][c]) == int:
            sz = buffer[k][c]
            bytes = original[:sz]
            buffer[k][c] = 0
            for b in bytes:
                n = ord(b)
                if n:
                    buffer[k][c] += n
            original = original[sz:]
    # Using the offset and length's grabbed, we'll set the buffer.
    buffer[k]['buffer'] = original
    return buffer
def readlink(fpath):
    """ Windows readlink implementation. """
    # This wouldn't return true if the file didn't exist, as far as I know.
    if not islink(fpath):
        return None
    # Open the file correctly depending on the string type.
    handle = CreateFileW(fpath, GENERIC_READ, 0, None, OPEN_EXISTING, FILE_FLAG_OPEN_REPARSE_POINT, 0) \
        if type(fpath) == unicode else \
        CreateFile(fpath, GENERIC_READ, 0, None, OPEN_EXISTING, FILE_FLAG_OPEN_REPARSE_POINT, 0)
    # MAXIMUM_REPARSE_DATA_BUFFER_SIZE = 16384 = (16*1024)
    buffer = DeviceIoControl(handle, FSCTL_GET_REPARSE_POINT, None, 16*1024)
    # Above will return an ugly string (byte array), so we'll need to parse it.
    # But first, we'll close the handle to our file so we're not locking it anymore.
    CloseHandle(handle)
    # Minimum possible length (assuming that the length of the target is bigger than 0)
    if len(buffer) < 9:
        return None
    # Parse and return our result.
    result = parse_reparse_buffer(buffer)
    offset = result[SYMBOLIC_LINK]['substitute_name_offset']
    ending = offset + result[SYMBOLIC_LINK]['substitute_name_length']
    rpath = result[SYMBOLIC_LINK]['buffer'][offset:ending].replace('\x00','')
    if len(rpath) > 4 and rpath[0:4] == '\\??\\':
        rpath = rpath[4:]
    return rpath
def realpath(fpath):
    from os import path
    while islink(fpath):
        rpath = readlink(fpath)
        if not path.isabs(rpath):
            rpath = path.abspath(path.join(path.dirname(fpath), rpath))
        fpath = rpath
    return fpath
def example():
    from os import system, unlink
    system('cmd.exe /c echo Hello World > test.txt')
    system('mklink test-link.txt test.txt')
    print 'IsLink: %s' % islink('test-link.txt')
    print 'ReadLink: %s' % readlink('test-link.txt')
    print 'RealPath: %s' % realpath('test-link.txt')
    unlink('test-link.txt')
    unlink('test.txt')
if __name__=='__main__':
    example()
Adjust the attributes in the CreateFile to your needs, but for a normal situation, it should work. Feel free to improve on it.
It should also work for folder junctions if you use MOUNTPOINT instead of SYMBOLIC_LINK.
You may want to check that
sys.getwindowsversion()[0] >= 6
if you put this into something you're releasing, since this form of symbolic link is only supported on Vista+.
You can use the Python win32 API modules (pywin32), e.g.
import win32file
win32file.CreateSymbolicLink(srcDir, targetDir, 1)
See http://docs.activestate.com/activepython/2.5/pywin32/win32file__CreateSymbolicLink_meth.html for more details.
If you do not want to rely on that either, you can always use ctypes and call the CreateSymbolicLink Win32 API directly, which is a simple call anyway.
Here is an example call using ctypes:
import ctypes
kdll = ctypes.windll.LoadLibrary("kernel32.dll")
# Backslashes are escaped so "\t" is not read as a tab. The Win32 signature
# takes the new link name first, then the existing target, then the flags.
kdll.CreateSymbolicLinkW(u"d:\\testdir_link", u"d:\\testdir", 1)
MSDN says the minimum supported client is Windows Vista.
Based on the accepted answer by Charles, here are improved (and cross-platform) versions of the functions (Python 2.7 and 3.5+).
islink() now also detects file symbolic links under Windows (just like the POSIX equivalent)
parse_reparse_buffer() and readlink() now actually detect the type of reparse point (NTFS Junction, symlink or generic) which is needed to correctly decode the path
readlink() no longer fails with access denied on NTFS Junctions or directory symlinks (unless you really have no permission to read attributes)
import os
import struct
import sys
if sys.platform == "win32":
    from win32file import *
    from winioctlcon import FSCTL_GET_REPARSE_POINT
__all__ = ['islink', 'readlink']
# Win32file doesn't seem to have this attribute.
FILE_ATTRIBUTE_REPARSE_POINT = 1024
# These are defined in win32\lib\winnt.py, but with wrong values
IO_REPARSE_TAG_MOUNT_POINT = 0xA0000003 # Junction
IO_REPARSE_TAG_SYMLINK = 0xA000000C
def islink(path):
    """
    Cross-platform islink implementation.
    Supports Windows NT symbolic links and reparse points.
    """
    if sys.platform != "win32" or sys.getwindowsversion()[0] < 6:
        return os.path.islink(path)
    return bool(os.path.exists(path) and GetFileAttributes(path) &
                FILE_ATTRIBUTE_REPARSE_POINT == FILE_ATTRIBUTE_REPARSE_POINT)
def parse_reparse_buffer(buf):
    """ Implementing the below in Python:
    typedef struct _REPARSE_DATA_BUFFER {
        ULONG  ReparseTag;
        USHORT ReparseDataLength;
        USHORT Reserved;
        union {
            struct {
                USHORT SubstituteNameOffset;
                USHORT SubstituteNameLength;
                USHORT PrintNameOffset;
                USHORT PrintNameLength;
                ULONG Flags;
                WCHAR PathBuffer[1];
            } SymbolicLinkReparseBuffer;
            struct {
                USHORT SubstituteNameOffset;
                USHORT SubstituteNameLength;
                USHORT PrintNameOffset;
                USHORT PrintNameLength;
                WCHAR PathBuffer[1];
            } MountPointReparseBuffer;
            struct {
                UCHAR  DataBuffer[1];
            } GenericReparseBuffer;
        } DUMMYUNIONNAME;
    } REPARSE_DATA_BUFFER, *PREPARSE_DATA_BUFFER;
    """
    # See https://learn.microsoft.com/en-us/windows-hardware/drivers/ddi/content/ntifs/ns-ntifs-_reparse_data_buffer
    data = {'tag': struct.unpack('<I', buf[:4])[0],
            'data_length': struct.unpack('<H', buf[4:6])[0],
            'reserved': struct.unpack('<H', buf[6:8])[0]}
    buf = buf[8:]
    if data['tag'] in (IO_REPARSE_TAG_MOUNT_POINT, IO_REPARSE_TAG_SYMLINK):
        keys = ['substitute_name_offset',
                'substitute_name_length',
                'print_name_offset',
                'print_name_length']
        if data['tag'] == IO_REPARSE_TAG_SYMLINK:
            keys.append('flags')
        # Parsing
        for k in keys:
            if k == 'flags':
                fmt, sz = '<I', 4
            else:
                fmt, sz = '<H', 2
            data[k] = struct.unpack(fmt, buf[:sz])[0]
            buf = buf[sz:]
    # Using the offset and lengths grabbed, we'll set the buffer.
    data['buffer'] = buf
    return data
def readlink(path):
    """
    Cross-platform implementation of readlink.
    Supports Windows NT symbolic links and reparse points.
    """
    if sys.platform != "win32":
        return os.readlink(path)
    # This wouldn't return true if the file didn't exist
    if not islink(path):
        # Mimic POSIX error
        raise OSError(22, 'Invalid argument', path)
    # Open the file correctly depending on the string type.
    if type(path) is type(u''):
        createfilefn = CreateFileW
    else:
        createfilefn = CreateFile
    # FILE_FLAG_OPEN_REPARSE_POINT alone is not enough if 'path'
    # is a symbolic link to a directory or a NTFS junction.
    # We need to set FILE_FLAG_BACKUP_SEMANTICS as well.
    # See https://learn.microsoft.com/en-us/windows/desktop/api/fileapi/nf-fileapi-createfilea
    handle = createfilefn(path, GENERIC_READ, 0, None, OPEN_EXISTING,
                          FILE_FLAG_BACKUP_SEMANTICS | FILE_FLAG_OPEN_REPARSE_POINT, 0)
    # MAXIMUM_REPARSE_DATA_BUFFER_SIZE = 16384 = (16 * 1024)
    buf = DeviceIoControl(handle, FSCTL_GET_REPARSE_POINT, None, 16 * 1024)
    # Above will return an ugly string (byte array), so we'll need to parse it.
    # But first, we'll close the handle to our file so we're not locking it anymore.
    CloseHandle(handle)
    # Minimum possible length (assuming that the length is bigger than 0)
    if len(buf) < 9:
        return type(path)()
    # Parse and return our result.
    result = parse_reparse_buffer(buf)
    if result['tag'] in (IO_REPARSE_TAG_MOUNT_POINT, IO_REPARSE_TAG_SYMLINK):
        offset = result['substitute_name_offset']
        ending = offset + result['substitute_name_length']
        rpath = result['buffer'][offset:ending].decode('UTF-16-LE')
    else:
        rpath = result['buffer']
    if len(rpath) > 4 and rpath[0:4] == '\\??\\':
        rpath = rpath[4:]
    return rpath
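The header layout can be exercised without touching the file system. The following sketch builds a synthetic mount-point (junction) reparse buffer with struct.pack and decodes it with struct.unpack_from. The helper names and the sample path are invented for illustration, and the buffer is hand-made rather than captured from Windows.

```python
import struct

IO_REPARSE_TAG_MOUNT_POINT = 0xA0000003


def build_mount_point_buffer(substitute, print_name):
    """Pack a minimal REPARSE_DATA_BUFFER for a junction (test data only)."""
    sub = substitute.encode('utf-16-le')
    prn = print_name.encode('utf-16-le')
    # PathBuffer holds both names, each followed by a 2-byte NUL terminator.
    path_buffer = sub + b'\0\0' + prn + b'\0\0'
    offsets = struct.pack('<4H', 0, len(sub), len(sub) + 2, len(prn))
    header = struct.pack('<IHH', IO_REPARSE_TAG_MOUNT_POINT,
                         len(offsets) + len(path_buffer), 0)
    return header + offsets + path_buffer


def substitute_name(buf):
    """Extract the substitute name from a mount-point reparse buffer."""
    tag, _data_length, _reserved = struct.unpack_from('<IHH', buf, 0)
    if tag != IO_REPARSE_TAG_MOUNT_POINT:
        raise ValueError('not a mount point: tag 0x%08X' % tag)
    sub_offset, sub_length, _prn_offset, _prn_length = struct.unpack_from('<4H', buf, 8)
    start = 16 + sub_offset          # PathBuffer begins after 8 + 8 header bytes
    return buf[start:start + sub_length].decode('utf-16-le')


buf = build_mount_point_buffer('\\??\\C:\\target', 'C:\\target')
print(substitute_name(buf))
```

Note that the name offsets are relative to the start of PathBuffer, not to the start of the whole buffer.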
You don't want to rely on external tools but you don't mind relying on the specific environment? I think you could safely assume that, if it's NTFS you're running on, the junction utility will probably be there.
But, if you mean you'd rather not call out to an external program, I've found the ctypes stuff to be invaluable. It allows you to call Windows DLLs directly from Python. And I'm pretty sure it's in the standard Python releases nowadays.
You'd just have to figure out which Windows DLL the CreateJunction() (or whatever Windows calls it) API call is in and set up the parameters and call. Best of luck with that, Microsoft don't seem to support it very well. You could disassemble the SysInternals junction program or linkd or one of the other tools to find out how they do it.
Me, I'm pretty lazy, I'd just call junction as an external process :-)
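For the external-process route, one option on Vista and later is the `mklink /J` command built into cmd.exe, rather than the separate junction utility. A minimal sketch follows; the helper names are hypothetical, and the second function only works on Windows.

```python
import subprocess


def junction_command(link_path, target_dir):
    """Return the cmd.exe invocation that creates a junction at link_path."""
    return ['cmd', '/c', 'mklink', '/J', link_path, target_dir]


def create_junction(link_path, target_dir):
    """Create the junction; raises CalledProcessError if mklink fails."""
    subprocess.check_call(junction_command(link_path, target_dir))
```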
