tarfile and user,group information problem

I am using the Python tarfile module to extract files from a *.tgz file. Here is what I use:
import tarfile
tar = tarfile.open("some.tar")
tar.extractall(".")
tar.close()
Assume "some.tar" contents as:
-a.txt ===> user:usr1 , group: grp1
-b.txt ===> user:usr2 , group: grp2
But after extracting I lose all of the user, group, and date information. The files now belong to whoever calls the script (in my case root). They become:
-a.txt ===> user:root , group: root
-b.txt ===> user:root , group: root
Is there a way to keep the owner and date information of the files?
From the tarfile module page:
-handles directories, regular files, hardlinks, symbolic links, fifos, character devices and block devices and is able to acquire and restore file information like timestamp, access permissions and owner.
From this statement I understand that it is very well possible to do this with the tarfile module, or do I understand it wrong?
Python version is 2.6.1
Edit: I am running this script as root
Thanks

As guettli says, you have to be root to be able to change the ownership of a file to somebody else. Otherwise, you would open a huge security hole. This is true both when using the tar(1) program and when using the tarfile package from Python.
Note, though, that some earlier versions of Python have a bug (see the issue in the comments below) that causes files extracted by root to be owned by root instead of the real owner (user and group).

First, your script needs to run as root (on Unix-like systems). Otherwise, you can't use chown.
You need to get the TarInfo object for each file:
http://docs.python.org/library/tarfile.html#tarfile.TarInfo
There you get uid (user ID) and gid (group ID), or alternatively the user and group names (uname and gname).
Then you need to call chown on each extracted file.
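The steps above could be sketched roughly as follows. This is only an illustration, not the tarfile module's own behaviour: the helper name extract_with_owner is made up, it assumes a Unix-like system, a trusted archive, and that the numeric uid/gid stored in the archive are meaningful on the target machine.

```python
import os
import tarfile


def extract_with_owner(archive_path, dest="."):
    """Extract all members, then chown each to the uid/gid stored in the archive.

    Hypothetical helper. Returns the member names whose ownership
    could not be changed.
    """
    failed = []
    with tarfile.open(archive_path) as tar:
        for member in tar.getmembers():
            tar.extract(member, dest)
            target = os.path.join(dest, member.name)
            try:
                # lchown changes a symlink itself rather than its target.
                os.lchown(target, member.uid, member.gid)
            except OSError:
                # Typically a permission error when the caller is not root.
                failed.append(member.name)
    return failed
```

Called as extract_with_owner("some.tar", "."), it returns an empty list when every chown succeeded. If the numeric IDs differ between machines, member.uname and member.gname could be resolved with the pwd and grp modules instead.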

Related

Find desktop folder in a custom location? [duplicate]

I have this small program and it needs to create a small .txt file in their 'My Documents' Folder. Here's the code I have for that:
textfile=open('C:\Users\MYNAME\Documents','w')
lines=['stuff goes here']
textfile.writelines(lines)
textfile.close()
The problem is that if other people use it, how do I change the MYNAME to their account name?

Use os.path.expanduser(path), see http://docs.python.org/library/os.path.html
e.g. expanduser('~/filename')
This works on both Unix and Windows, according to the docs.
Edit: forward slash due to Sven's comment.
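A minimal sketch of that idea, assuming the target file lives in a "Documents" folder directly under the home directory (the helper name path_in_home and the file name are made up for illustration, and a relocated Documents folder would not be found this way):

```python
import os


def path_in_home(*parts):
    """Join the given path components onto the current user's home directory."""
    return os.path.join(os.path.expanduser("~"), *parts)


# Hypothetical file name, used only for illustration.
target = path_in_home("Documents", "notes.txt")
```

The resulting path can then be passed to open(target, 'w') in place of the hard-coded C:\Users\MYNAME path.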

This works without any extra libs:
import ctypes.wintypes
CSIDL_PERSONAL = 5 # My Documents
SHGFP_TYPE_CURRENT = 0 # Get current, not default value
buf= ctypes.create_unicode_buffer(ctypes.wintypes.MAX_PATH)
ctypes.windll.shell32.SHGetFolderPathW(None, CSIDL_PERSONAL, None, SHGFP_TYPE_CURRENT, buf)
print(buf.value)
Also works if documents location and/or default save location is changed by user.

On Windows, you can use something similar to what is shown in the accepted answer to the question: Python, get windows special folders for currently logged-in user.
For the My Documents folder path, use shellcon.CSIDL_PERSONAL in the shell.SHGetFolderPath() function call instead of shellcon.CSIDL_MYPICTURES.
So, assuming you have the PyWin32 extensions [1] installed, this might work (see the caveat in the Update section below):
>>> from win32com.shell import shell, shellcon
>>> shell.SHGetFolderPath(0, shellcon.CSIDL_PERSONAL, None, 0)
u'<path\\to\\folder>'
Update: I just read something that said that CSIDL_PERSONAL won't return the correct folder if the user has changed the default save folder in the Win7 Documents library. This refers to what you can do in the library's Properties dialog:
The checkmark means that the path is set as the default save location.
I am currently unaware of a way to call the SHLoadLibraryFromKnownFolder() function through PyWin32 (there currently isn't a shell.SHLoadLibraryFromKnownFolder). However, it should be possible to do so using the ctypes module.
[1] Installers for the latest versions of the Python for Windows Extensions are currently available from: http://sourceforge.net/projects/pywin32

how to check for platform incompatible folder (file) names in python

I would like to be able to check from python if a given string could be a valid cross platform folder name - below is the concrete problem I ran into (folder name ending in .), but I'm sure there are some more special cases (e.g.: con, etc.).
Is there a library for this?
From python (3.2) I created a folder on Windows (7) with a name ending in dot ('.'), e.g. (without square brackets): [What I've done on my holidays, Part II.]
When the created folder was ftp'd (to linux, but I guess that's irrelevant), it did not have the dot in it anymore (and in return, this broke a lot of hyperlinks).
I've checked it from the command line, and it seems that the folder doesn't have the '.' in the filename
mkdir tmp.
dir
cd tmp
cd ..\tmp.
Apparently, adding a single dot at the end of the folder name is ignored, e.g.:
cd c:\Users.
works just as expected.

No, there is sadly no way to do this. For Windows you can use the following code to remove all illegal characters, but if someone still has a FAT filesystem you would have to handle that too, since it is stricter. Basically you will have to read the documentation for every filesystem and come up with a complete list. Here is the NTFS one as a starting point:
    import re

    # Reserved punctuation plus the control characters 0-31 (\x00-\x1f).
    ILLEGAL_NTFS_CHARS = r'[<>:/\\|?*"]|[\x00-\x1f]'

    def __removeIllegalChars(name):
        # removes characters that are invalid for NTFS
        return re.sub(ILLEGAL_NTFS_CHARS, "", name)
And then you need a list of "forbidden" names as well, to get rid of COM and the like. It is pretty much a complete mess, and that is ignoring Linux (although there it is pretty relaxed, as far as I know).

Do not end a file or directory name with a space or a period. Although
the underlying file system may support such names, the Windows shell
and user interface does not.
http://msdn.microsoft.com/en-us/library/aa365247.aspx#naming_conventions
That page will give you information about other illegal names too (for Windows, that is), including CON, as you said yourself.
If you respect those (seemingly harsh) rules, I think you'll be safe on Linux and most other systems too.
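As a rough sketch of how those Windows rules could be checked from Python (the function name is_portable_name is made up, and the reserved-name list covers only the device names from the Windows naming conventions; it is a starting point, not a complete cross-platform validator):

```python
import re

# Device names that Windows reserves, with or without an extension.
_RESERVED = {"CON", "PRN", "AUX", "NUL"}
_RESERVED |= {"COM%d" % i for i in range(1, 10)}
_RESERVED |= {"LPT%d" % i for i in range(1, 10)}

# Reserved punctuation and ASCII control characters.
_ILLEGAL = re.compile(r'[<>:"/\\|?*\x00-\x1f]')


def is_portable_name(name):
    """Return True if a single file or folder name passes the Windows rules."""
    if not name or name in (".", ".."):
        return False
    if _ILLEGAL.search(name):
        return False
    if name[-1] in " .":
        # Trailing dots and spaces are silently dropped by the Windows shell.
        return False
    stem = name.split(".", 1)[0].upper()
    return stem not in _RESERVED
```

With this check, the folder name from the question (ending in a dot) is rejected, as are "con" and "NUL.txt".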

Checking folder/file ntfs permissions using python

As the question title might suggest, I would very much like to know how to check the NTFS permissions of a given file or folder (hint: those are the ones you see in the "Security" tab). Basically, what I need is to take a path to a file or directory (on a local machine or, preferably, on a share on a remote machine) and get the list of users/groups and the corresponding permissions for this file/folder. Ultimately, the application is going to traverse a directory tree, reading permissions for each object and processing them accordingly.
Now, I can think of a number of ways to do that:
- parse cacls.exe output -- easily done, BUT, unless I'm missing something, cacls.exe only gives the permissions in the form of R|W|C|F (read/write/change/full), which is insufficient (I need permissions like "List folder contents", and extended permissions too)
- xcacls.exe or xcacls.vbs output -- yes, they give me all the permissions I need, but they are dreadfully slow: it takes xcacls.vbs about ONE SECOND to get the permissions of a local system file. Such speed is unacceptable
- win32security (it wraps around winapi, right?) -- I am sure it can be handled like this, but I'd rather not reinvent the wheel
Is there anything else I am missing here?

Unless you fancy rolling your own, win32security is the way to go. There's the beginnings of an example here:
http://timgolden.me.uk/python/win32_how_do_i/get-the-owner-of-a-file.html
If you want to live slightly dangerously (!) my in-progress winsys package is designed to do exactly what you're after. You can get an MSI of the dev version here:
http://timgolden.me.uk/python/downloads/WinSys-0.4.win32-py2.6.msi
or you can just checkout the svn trunk:
svn co http://winsys.googlecode.com/svn/trunk winsys
To do what you describe (guessing slightly at the exact requirements) you could do this:
    import codecs
    from winsys import fs

    base = "c:/temp"
    with codecs.open ("permissions.log", "wb", encoding="utf8") as log:
        for f in fs.flat (base):
            log.write ("\n" + f.filepath.relative_to (base) + "\n")
            for ace in f.security ().dacl:
                access_flags = fs.FILE_ACCESS.names_from_value (ace.access)
                log.write (u"  %s => %s\n" % (ace.trustee, ", ".join (access_flags)))
TJG

How does one add a svn repository build number to Python code?

EDIT: This question duplicates How to access the current Subversion build number? (Thanks for the heads up, Charles!)
Hi there,
This question is similar to Getting the subversion repository number into code
The differences being:
I would like to add the revision number to Python
I want the revision of the repository (not the checked out file)
I.e. I would like to extract the Revision number from the output of 'svn info', like so:
$ svn info
Path: .
URL: svn://localhost/B/trunk
Repository Root: svn://localhost/B
Revision: 375
Node Kind: directory
Schedule: normal
Last Changed Author: bmh
Last Changed Rev: 375
Last Changed Date: 2008-10-27 12:09:00 -0400 (Mon, 27 Oct 2008)
I want a variable with 375 (the Revision). It's easy enough to put $Rev$ into a variable to keep track of changes to a file. However, I would like to keep track of the repository's version, and I understand (and it seems so based on my tests) that $Rev$ only updates when the file changes.
My initial thoughts turn to using the svn/libsvn module built in to Python, though I can't find any documentation on or examples of how to use it.
Alternatively, I've thought of calling 'svn info' and regex'ing the number out, though that seems rather brutal. :)
Help would be most appreciated.
Thanks & Cheers.

There is a command called svnversion which comes with subversion and is meant to solve exactly that kind of problem.

Stolen directly from django:
    import os
    import re

    def get_svn_revision(path=None):
        rev = None
        if path is None:
            path = MODULE.__path__[0]
        entries_path = '%s/.svn/entries' % path

        if os.path.exists(entries_path):
            entries = open(entries_path, 'r').read()
            # Versions >= 7 of the entries file are flat text. The first line is
            # the version number. The next set of digits after 'dir' is the revision.
            if re.match(r'(\d+)', entries):
                rev_match = re.search(r'\d+\s+dir\s+(\d+)', entries)
                if rev_match:
                    rev = rev_match.groups()[0]
            # Older XML versions of the file specify revision as an attribute of
            # the first entries node.
            else:
                from xml.dom import minidom
                dom = minidom.parse(entries_path)
                rev = dom.getElementsByTagName('entry')[0].getAttribute('revision')

        if rev:
            return u'SVN-%s' % rev
        return u'SVN-unknown'
Adapt as appropriate. You might want to change MODULE to the name of one of your code modules.
This code has the advantage of working even if the destination system does not have subversion installed.

Python has direct bindings to libsvn, so you don't need to invoke the command line client at all. See this blog post for more details.
EDIT: You can basically do something like this:
from svn import fs, repos, core
repository = repos.open(root_path)
fs_ptr = repos.fs(repository)
youngest_revision_number = fs.youngest_rev(fs_ptr)

I use a technique very similar to this in order to show the current subversion revision number in my shell:
svnRev=$(echo "$(svn info)" | grep "^Revision" | awk -F": " '{print $2};')
echo $svnRev
It works very well for me.
Why do you want the python files to change every time the version number of the entire repository is incremented? This will make doing things like doing a diff between two files annoying if one is from the repo, and the other is from a tarball..
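The grep/awk extraction from the one-liner above can also be sketched in Python, operating on the text that 'svn info' prints. The function name parse_revision is made up for illustration, and obtaining the text (for example via subprocess) is left out:

```python
import re

# Matches the "Revision: 375" line but not "Last Changed Rev: 375".
_REVISION_RE = re.compile(r"^Revision:\s*(\d+)\s*$", re.MULTILINE)


def parse_revision(svn_info_output):
    """Return the revision number from 'svn info' output, or None if absent."""
    match = _REVISION_RE.search(svn_info_output)
    return int(match.group(1)) if match else None
```

Fed the 'svn info' output shown in the question, this returns the integer 375.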

If you want to have a variable in one source file that can be set to the current working copy revision, and that does not rely on Subversion and a working copy actually being available at the time you run your program, then SubWCRev may be your solution.
There also seems to be a linux port called SVNWCRev
Both perform substitution of $WCREV$ with the highest commit level of the working copy. Other information may also be provided.

Based on CesarB's response and the link Charles provided, I've done the following:
    try:
        from subprocess import Popen, PIPE
        _p = Popen(["svnversion", "."], stdout=PIPE)
        REVISION = _p.communicate()[0]
        _p = None # otherwise we get a wild exception when Django auto-reloads
    except Exception as e:
        print("Could not get revision number: %s" % e)
        REVISION = "Unknown"
Golly Python is cool. :)
