using diff util with --show-function-line for Python files - python

I like to compare Python sources with the diff utility, and I have seen that git diff shows the function name in the hunk header (for some file types, though not for Python). That is why I looked for a way to do the same with plain diff.
I found the -p, --show-c-function option, which doesn't work well with Python files (I only get the class name). I also found this option in the man page: -F, --show-function-line=RE
I then searched, without luck, for a good RE that matches Python function definitions so that they appear in my diff output.
I know there are def myname(): and async def myname(): ... and maybe more?
Does someone have a good regex for this?
I found this one, but it doesn't work with diff (the output has no function names):
diff -Nru -F '(?P<function>\w+)\s?\((?P<arg>(?P<args>\w+(,\s?)?)+)\)' modules/websocket/__init__.py.old modules/websocket/__init__.py
Regards,
Thomas
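A minimal sketch of the matching idea, in Python rather than diff itself: -F shows the most recent line before a hunk that matches RE, so the pattern only needs to recognise the start of a definition line. The pattern DEF_LINE and the helper enclosing_definition below are illustrative assumptions, not part of diff; constructs such as (?P<name>...) and \w are Python regex syntax and may well not be understood by diff, so the expression would need translating to whatever syntax your diff accepts.

```python
import re

# Assumed pattern: optional indentation, then "class", "def" or "async def".
DEF_LINE = re.compile(r"^\s*(?:class|(?:async\s+)?def)\s+\w+")


def enclosing_definition(lines, index):
    """Return the nearest line before lines[index] that matches DEF_LINE, or None."""
    for line in reversed(lines[:index]):
        if DEF_LINE.match(line):
            return line.strip()
    return None


source = [
    "class Socket:",
    "    def send(self, data):",
    "        return len(data)",
    "",
    "    async def recv(self):",
    "        return b''",
]
print(enclosing_definition(source, 2))  # a line inside send()
print(enclosing_definition(source, 5))  # a line inside recv()
```

Anchoring the pattern at the start of the line (after indentation) keeps it from matching calls such as foo(bar), which the original expression would also accept.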

Related

Basic issue with glob in python

I'm really inexperienced in Python, so forgive me if this question is stupid.
I'm trying to write a simple script that operates on all the files in a folder.
However, I apparently can only access the folder recursively!
Let me explain. I have a folder, DATA, with subfolders for each day (of the form YYYY-MM-DD).
If I try
for filename in glob.glob('C:\Users\My username\Documents\DATA\2021-01-20\*'):
    print filename
I get no output.
However, if I try instead
for filename in glob.glob('C:\Users\My username\Documents\DATA\*\*'):
    print filename
the output is as expected:
C:\Users\My username\Documents\DATA\2021-01-20\210120_HOPG_sputteredTip0001.sxm
C:\Users\My username\Documents\DATA\2021-01-20\210120_HOPG_sputteredTip0002.sxm
...
I even tried different folder names (removing the dashes, using letters in the beginning, using only letters, using a shorter folder name) but the result is still the same.
What am I missing?
(BTW: I am on python 2.7, and it's because the program I need for the data is only compatible with python 2)
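Beware of backslashes in string literals: in Python a backslash starts an escape sequence. In your first path, the \202 in \2021-01-20 is read as an octal escape, so the string no longer names the folder; the \* in your second path is not an escape, which is why that pattern works. Try prefixing your string with r to make it a raw string, like so:
for filename in glob.glob(r'C:\Users\My username\Documents\DATA\*'):
    print filename  # do your business
Edit:
As @poomerang has pointed out, a shorter answer explaining what the r prefix does in Python has previously been provided here.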
Official docs for Python string-literals: Python 2.7 and for Python 3.8.
Recursive file search, i.e. searching for files in a folder, its subfolders, sub-subfolders and so on, is not possible with glob in Python 2.7.
You have two options:
- Use os.walk (you might need to change your code's structure, however).
- Use the backported pathlib2 module from PyPI (https://pypi.org/project/pathlib2/), which should include a glob function supporting recursive search using the ** wildcard.
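As a rough sketch of the os.walk option: the helper below collects every file under a root folder whose name matches a shell-style pattern. The name find_files and the '*.sxm' pattern in the usage note are illustrative assumptions; the code uses only os.walk and fnmatch, so it should behave the same on Python 2.7 and Python 3.

```python
import fnmatch
import os


def find_files(root, pattern="*"):
    """Return the paths of all files under root whose names match pattern."""
    matches = []
    # os.walk visits root and every folder below it, however deeply nested.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in fnmatch.filter(filenames, pattern):
            matches.append(os.path.join(dirpath, name))
    return sorted(matches)
```

It could be called as find_files(r'C:\Users\My username\Documents\DATA', '*.sxm'), again with a raw string for the Windows path.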

Subversion Hook Script Windows, Python, pysvn

I'm trying to create a hook script for Subversion on Windows. I have a bat file that calls my Python script, but getting the log/comments seems to be beyond me.
I have pysvn installed and can get the transaction like this:
repos_path = sys.argv[1]
transaction_name = sys.argv[2]
transaction = pysvn.Transaction( repos_path, transaction_name)
I can also list what has changed:
transaction.changed(0)
What I cannot figure out is how to get the log/comment for the transaction. I realize that in pysvn there is a command similar to:
transaction.propget(propname,path)
But I cannot for the life of me get it to return anything. I assume propname should be "svn:log"; for path I have tried the file name, the repo path, and null, but all I get are errors.
At the end of the day I need to validate the comment. There will be matching against external data that will evolve, which is why I want to do it in Python rather than the bat file; plus, it may move to a Linux server later.
Am I missing something obvious? How do I get the log/comment as a string?
Thanks, Chris.
After a day of frustration, a great deal of trial and error, and better searching, I found that I need to use the revision property, not a plain property. For a given transaction, this returns the user-submitted comment:
transaction.revpropget("svn:log")
There are other useful revision properties; this returns all of them as a dictionary:
transaction.revproplist()
For example:
{'svn:log': 'qqqqqqq', 'svn:txn-client-compat-version': '1.9.7', 'svn:txn-user-agent': 'SVN/1.9.7 (x64-microsoft-windows) TortoiseSVN-1.9.7.27907', 'svn:author': 'harry', 'svn:date': '2017-12-14T16:13:52.361605Z'}
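A sketch of how the comment could then be validated in the hook, assuming it runs as a pre-commit hook (where a non-zero exit status rejects the commit). The rule itself (a minimum length and a leading issue key such as PROJ-123:) and the names ISSUE_KEY, log_message_problems and run_hook are hypothetical placeholders for whatever your external data requires; only pysvn.Transaction and revpropget come from the code above.

```python
import re
import sys

# Hypothetical rule: the comment must start with an issue key such as "PROJ-123: ".
ISSUE_KEY = re.compile(r"^[A-Z]+-\d+: \S")


def log_message_problems(message):
    """Return a list of reasons to reject the comment (empty if acceptable)."""
    problems = []
    text = (message or "").strip()
    if len(text) < 10:
        problems.append("comment is shorter than 10 characters")
    if not ISSUE_KEY.match(text):
        problems.append("comment does not start with an issue key like PROJ-123:")
    return problems


def run_hook(argv):
    """Hook entry point: returns 1 if the comment is rejected, otherwise 0."""
    import pysvn  # imported here so the checks above can be tested without pysvn

    transaction = pysvn.Transaction(argv[1], argv[2])
    problems = log_message_problems(transaction.revpropget("svn:log"))
    for problem in problems:
        sys.stderr.write(problem + "\n")
    return 1 if problems else 0
```

The script called from the bat file would end with sys.exit(run_hook(sys.argv)). Keeping the rule in its own function means it can be tested, and later replaced, without a repository at hand.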

How to find out if a folder is a hard link and get its real path

I'm trying to find out if a folder is actually a hard link to another and, if so, find out its real path.
I wrote a simple example in Python as follows (symLinks.py):
#python 3.4
import os
dirList = [x[0] for x in os.walk('.')]
print (dirList)
for d in dirList:
    print (os.path.realpath(d), os.path.islink(d))
"""
Given this directories structure:
<dir_path>\Example\
<dir_path>\Example\symLinks.py
<dir_path>\Example\hardLinkToF2 #hard link pointing to <dir_path>\Example\FOLDER1\FOLDER2
<dir_path>\Example\softLinkToF2 #soft link pointing to <dir_path>\Example\FOLDER1\FOLDER2
<dir_path>\Example\FOLDER1
<dir_path>\Example\FOLDER1\FOLDER2
The output from executing: C:\Python34\python <dir_path>\Example\symLinks.py is:
['.', '.\\FOLDER1', '.\\FOLDER1\\FOLDER2', '.\\hardLinkToF2']
<dir_path>\Example False
<dir_path>\Example\FOLDER1 False
<dir_path>\Example\FOLDER1\FOLDER2 False
<dir_path>\Example\hardLinkToF2 False
"""
In this example os.path.islink always returns False, both for a hard link and for a soft link.
On the other hand, os.path.realpath returns the actual path for soft links, but not for hard links.
I've made this example using python 3.4 in Windows 8.
I have no clue if I am doing something wrong or if there is another way to achieve it.
Not to be too harsh, but I spent 1 minute googling and got all the answers. Hint hint.
To tell if they are hardlinks, you have to scan all the files then compare their os.stat results to see if they point to the same inode. Example:
https://gist.github.com/simonw/229186
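The inode comparison described above can be sketched roughly as follows. This is not the code from the linked gist; find_hard_links is an illustrative name, and the sketch assumes a platform where os.lstat reports meaningful st_dev and st_ino values (as on POSIX systems). It applies to files only.

```python
import os
from collections import defaultdict


def find_hard_links(root):
    """Group files under root that share the same (device, inode) pair."""
    by_inode = defaultdict(list)
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            info = os.lstat(path)  # lstat: do not follow symbolic links
            by_inode[(info.st_dev, info.st_ino)].append(path)
    # Paths that share a key are hard links to the same file.
    return [sorted(paths) for paths in by_inode.values() if len(paths) > 1]
```

Comparing st_dev together with st_ino matters because inode numbers are only unique within one file system.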
For symbolic links in python on Windows, it can be trickier... but luckily this has already been answered:
Having trouble implementing a readlink() function
Per @ShadowRanger in the comments: make sure you are not using junctions instead of symbolic links, since they may not report correctly.
https://bugs.python.org/issue29250
Links to directories on Windows are implemented using reparse points. They can take the form of either "directory junctions" or "symbolic links". Hard links to directories are not possible on Windows NTFS.
At least as of Python 3.8 os.path.samefile(dir1, dir2) supports both symbolic links and directory junctions and will return True if both resolve to the same destination.
os.path.realpath(dirpath) will also work to give you the real (completely resolved) path for both symbolic links and directory junctions.
If you need to determine which of the two directories is a reparse point, you can leverage os.lstat() as os.path.islink() only supports symbolic links.
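import os
import stat

def is_reparse_point(dirpath):
    # Windows only: st_file_attributes is not available on other platforms.
    attributes = os.lstat(dirpath).st_file_attributes
    return bool(attributes & stat.FILE_ATTRIBUTE_REPARSE_POINT)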
Insofar as it may be valuable for testing, here are some useful utilities available in the Windows CMD shell.
Interrogate reparse point data:
>fsutil reparsepoint query <path to directory>
Create reparse points of both the "symbolic link" and "directory junction" variety *:
>mklink /d <symbolic link name> <path to target directory>
>mklink /j <junction name> <path to target directory>
You can read more about the difference between hard links and junctions, symbolic links, and reparse points in Microsoft's docs.
*Note that creating symbolic links typically requires Administrator privileges.

Inject the revision number in sourcecode (TortoiseSvn or SVN Shell)

I would like to inject the revision number in source code on commit.
I found out that I could do it through svn shell by doing something like:
find . -name '*.php' -exec svn propset svn:keywords "Rev" {} \;
However, someone else said that this would not work, as there are no files in the repository (as the files are encrypted), and that I should be able to do it in TortoiseSVN. I found the "Hook Scripts" section, but I have no experience at all with this stuff.
Could you give me some indication of what the command should look like if I would like the first lines of code to look like:
/*
* Version: 154
* Last modified on revision: 150
*/
I know that you could inject it by using $ver$, but how do I do it so that only files in certain directories with certain extensions get changed?
Don't write your own method for injecting version numbers. Instead:
- only introduce the replaced tags ($Revision$, etc.) in the files you want the replacement to happen for
- only enable replacement (using svn propset svn:keywords Revision or some such) for those files

How does one add a svn repository build number to Python code?

EDIT: This question duplicates How to access the current Subversion build number? (Thanks for the heads up, Charles!)
Hi there,
This question is similar to Getting the subversion repository number into code
The differences being:
I would like to add the revision number to Python
I want the revision of the repository (not the checked out file)
I.e. I would like to extract the Revision number from the output of 'svn info', like so:
$ svn info
Path: .
URL: svn://localhost/B/trunk
Repository Root: svn://localhost/B
Revision: 375
Node Kind: directory
Schedule: normal
Last Changed Author: bmh
Last Changed Rev: 375
Last Changed Date: 2008-10-27 12:09:00 -0400 (Mon, 27 Oct 2008)
I want a variable with 375 (the Revision). It's easy enough to put $Rev$ into a variable to keep track of changes on a file. However, I would like to keep track of the repository's version, and I understand (and my tests seem to confirm) that $Rev$ only updates when the file changes.
My initial thoughts turn to using the svn/libsvn module built in to Python, though I can't find any documentation on or examples of how to use them.
Alternatively, I've thought of calling 'svn info' and regex'ing the revision out, though that seems rather brutal. :)
Help would be most appreciated.
Thanks & Cheers.
There is a command called svnversion which comes with subversion and is meant to solve exactly that kind of problem.
Stolen directly from django:
import os
import re

def get_svn_revision(path=None):
    rev = None
    if path is None:
        path = MODULE.__path__[0]
    entries_path = '%s/.svn/entries' % path
    if os.path.exists(entries_path):
        entries = open(entries_path, 'r').read()
        # Versions >= 7 of the entries file are flat text. The first line is
        # the version number. The next set of digits after 'dir' is the revision.
        if re.match(r'(\d+)', entries):
            rev_match = re.search(r'\d+\s+dir\s+(\d+)', entries)
            if rev_match:
                rev = rev_match.groups()[0]
        # Older XML versions of the file specify revision as an attribute of
        # the first entries node.
        else:
            from xml.dom import minidom
            dom = minidom.parse(entries_path)
            rev = dom.getElementsByTagName('entry')[0].getAttribute('revision')
    if rev:
        return u'SVN-%s' % rev
    return u'SVN-unknown'
Adapt as appropriate. You might want to replace MODULE with the name of one of your code modules.
This code has the advantage of working even if the destination system does not have subversion installed.
Python has direct bindings to libsvn, so you don't need to invoke the command line client at all. See this blog post for more details.
EDIT: You can basically do something like this:
from svn import fs, repos, core
repository = repos.open(root_path)
fs_ptr = repos.fs(repository)
youngest_revision_number = fs.youngest_rev(fs_ptr)
I use a technique very similar to this in order to show the current subversion revision number in my shell:
svnRev=$(echo "$(svn info)" | grep "^Revision" | awk -F": " '{print $2};')
echo $svnRev
It works very well for me.
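For comparison, a rough Python equivalent of the grep/awk step, as a sketch only: it parses text that has already been captured from 'svn info'. The names REVISION_LINE and revision_from_svn_info are illustrative, and the sketch assumes the English "Revision:" label.

```python
import re

# Matches the "Revision: 375" line but not "Last Changed Rev: 375".
REVISION_LINE = re.compile(r"^Revision:\s*(\d+)\s*$", re.MULTILINE)


def revision_from_svn_info(info_text):
    """Return the revision from `svn info` output as an int, or None."""
    match = REVISION_LINE.search(info_text)
    return int(match.group(1)) if match else None


sample = """Path: .
URL: svn://localhost/B/trunk
Repository Root: svn://localhost/B
Revision: 375
Last Changed Rev: 375
"""
print(revision_from_svn_info(sample))
```

The input text could come from subprocess.check_output(["svn", "info"]), decoded to a string.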
Why do you want the Python files to change every time the version number of the entire repository is incremented? This will make things like doing a diff between two files annoying if one is from the repo and the other is from a tarball.
If you want a variable in one source file that can be set to the current working copy revision, and that does not rely on Subversion and a working copy actually being available at the time you run your program, then SubWCRev may be your solution.
There also seems to be a linux port called SVNWCRev
Both perform substitution of $WCREV$ with the highest commit level of the working copy. Other information may also be provided.
Based on CesarB's response and the link Charles provided, I've done the following:
try:
    from subprocess import Popen, PIPE
    _p = Popen(["svnversion", "."], stdout=PIPE)
    REVISION = _p.communicate()[0].strip()  # strip the trailing newline
    _p = None  # otherwise we get a wild exception when Django auto-reloads
except Exception, e:
    print "Could not get revision number: ", e
    REVISION = "Unknown"
Golly Python is cool. :)
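A Python 3 sketch of the same idea that also interprets the output, under a few assumptions. svnversion can print a single revision ("375"), a mixed range ("370:375") and trailing flags such as M for local modifications or S for switched; anything else is treated here as "not a working copy". The names VERSION_OUTPUT, parse_svnversion and working_copy_revision are illustrative, and text=True needs Python 3.7 or later.

```python
import re
import subprocess

# Assumed output shapes: "375", "370:375", "375M", "370:375MS".
VERSION_OUTPUT = re.compile(r"^(?:(\d+):)?(\d+)([MSP]*)$")


def parse_svnversion(output):
    """Return (lowest_rev, highest_rev, flags), or None if unrecognised."""
    match = VERSION_OUTPUT.match(output.strip())
    if match is None:
        return None
    low, high, flags = match.groups()
    high = int(high)
    return (int(low) if low else high, high, flags)


def working_copy_revision(path="."):
    """Run svnversion on path; return None if it is missing or fails."""
    try:
        output = subprocess.check_output(["svnversion", path], text=True)
    except (OSError, subprocess.CalledProcessError):
        return None
    return parse_svnversion(output)
```

Returning a tuple rather than the raw string lets the caller decide whether a mixed or locally modified working copy should still count as a valid build number.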
