I am trying to get each entity to draw a certain type of graph using the Understand Python API. All the inputs are good and the database is opened, but the single item is not drawn. There is no error and no output file. The code is listed below. The Python code is called from a C# application, which calls upython.exe with the associated arguments.
The Python file receives the SciTools directory and opens the Understand database. The entities are also an argument, which is loaded into a temp file. The outputPath references the directory where the SVG file will be placed. I can't figure out why the item.draw method isn't working.
import os
import sys
import argparse
import re
import json

# parse arguments
parser = argparse.ArgumentParser(description='Creates butterfly or call-by graphs from an Understand DB')
parser.add_argument('PathToSci', type=str, help="Path to Sci Understand libraries")
parser.add_argument('PathToDB', type=str, help="Path to the database you want to create graphs from")
parser.add_argument('PathToOutput', type=str, help='Path to where the graphs should be outputted')
parser.add_argument('TypeOfGraph', type=str,
                    help="The type of graph you want to generate. Same names as in the Understand GUI, e.g. 'Butterfly', 'Called By', 'Control Flow'")
parser.add_argument("entities", help='Path to JSON list file of entity long names you wish to create graphs for')
args, unknown = parser.parse_known_args()

# they may have entered a path with a space broken into multiple strings
if len(unknown) > 0:
    print("Unknown argument entered.\n Note: Individual arguments must be passed as a single string.")
    quit()

pathToSci = args.PathToSci
pathToDB = args.PathToDB
graphType = args.TypeOfGraph
entities = json.load(open(args.entities))
pathToOutput = args.PathToOutput

pathToSci = os.path.join(pathToSci, "Python")
sys.path.append(pathToSci)
import understand

db = understand.open(pathToDB)
count = 0
for name in entities:
    count += 1
    print("Completed: " + str(count) + "/" + str(len(entities)))
    # if it is an empty name don't make a graph
    if len(name) == 0:
        break
    pattern = re.compile((name + '$').replace("\\", "/"))
    print("pattern: " + str(pattern))
    sys.stdout.flush()
    ent = db.lookup(pattern)
    print("ent: " + str(ent))
    sys.stdout.flush()
    print("Type: " + str(type(ent[0])))
    sys.stdout.flush()
    for item in ent:
        try:
            filename = os.path.join(pathToOutput, item.longname() + ".svg")
            print("Graph Type: " + graphType)
            sys.stdout.flush()
            print("filename: " + filename)
            sys.stdout.flush()
            print("Item Kind: " + str(ent[0].kind()))
            sys.stdout.flush()
            item.draw(graphType, filename)
        except understand.UnderstandError:
            print("error creating graph")
            sys.stdout.flush()
        except Exception as e:
            print("Could not create graph for " + item.kind().longname() + ": " + item.longname())
            sys.stdout.flush()
            print(e)
            sys.stdout.flush()
db.close()
The output is below:
Completed: 1/1
pattern: re.compile('CSC03.SIN_COS$')
ent: [#lCSC03.SIN_COS#kacsc03.sin_cos(long_float,csc03.sctype)long_float#f./../../../IOSSP/Source_Files/OGP/OGP_71/csc03/csc03.ada]
Type: <class 'understand.Ent'>
Graph Type: Butterfly
filename: C:\Users\M73720\Documents\DFS\DFS-OGP-25-Aug-2022-11-24\SVGs\Entities\CSC03.SIN_COS.svg
Item Kind: Function
It turns out that it was a problem in the Understand API; the latest build corrected it. This was found by talking with the SciTools support group.
I am trying to unpack an Android 11 image / get info from the raw .img for SELinux info, symlinks, etc.
I am using this wonderful tool: https://github.com/cubinator/ext4/blob/master/ext4.py35.py
and my code looks like this:
#!/usr/bin/env python3
import argparse
import sys
import os
import ext4

parser = argparse.ArgumentParser(description='Read <modes, symlinks, contexts and capabilities> from an ext4 image')
parser.add_argument('ext4_image', help='Path to ext4 image to process')
args = parser.parse_args()

exists = os.path.isfile(args.ext4_image)
if not exists:
    print("Error: input file " f"[{args.ext4_image}]" " was not found")
    sys.exit(1)

file = open(args.ext4_image, "rb")
volume = ext4.Volume(file)

def scan_dir(root_inode, root_path=""):
    for entry_name, entry_inode_idx, entry_type in root_inode.open_dir():
        if entry_name == "." or entry_name == "..":
            continue
        entry_inode = root_inode.volume.get_inode(entry_inode_idx)
        entry_inode_path = root_path + "/" + entry_name
        if entry_inode.is_dir:
            scan_dir(entry_inode, entry_inode_path)
        if entry_inode_path[-1] == '/':
            continue
        xattrs_perms = list(entry_inode.xattrs())
        found_cap = False
        found_con = False
        if "security.capability" in f"{xattrs_perms}": found_cap = True
        if "security.selinux" in f"{xattrs_perms}": found_con = True
        contexts = ""
        capability = ", \"capabilities\", 0x0"
        if found_cap:
            if found_con:
                capability = f"{xattrs_perms[1:2]}"
            else:
                capability = f"{xattrs_perms[0:1]}"
            capability = capability.split(" ")[1][:-3][+2:].encode('utf-8').decode('unicode-escape').encode('ISO-8859-1')
            capability = hex(int.from_bytes(capability[4:8] + capability[14:18], "little"))
            capability = ", \"capabilities\", " f"{capability}"
            capability = f"{capability}"
        if found_con:
            contexts = f"{xattrs_perms[0:1]}"
            contexts = f"{contexts.split( )[1].split('x00')[0][:-1][+2:]}"
            contexts = f"{contexts}"
        filefolder = ''.join(entry_inode_path.split('/', 1))
        print("set_metadata(\""f"{filefolder}" "\", \"uid\", " f"{str(entry_inode.inode.i_uid)}" ", \"gid\", " f"{str(entry_inode.inode.i_gid)}" ", \"mode\", " f"{entry_inode.inode.i_mode & 0x1FF:0>4o}" f"{capability}" ", \"selabel\", \"" f"{contexts}" "\");")

scan_dir(volume.root)
file.close()
Then I just have to do ./read.py vendor.img and it works.
Until recently, when I tried a weird vendor.img from Android 11 and got this weird issue:
Traceback (most recent call last):
  File "./tools/metadata.py", line 53, in <module>
    scan_dir(volume.root)
  File "./tools/metadata.py", line 26, in scan_dir
    scan_dir(entry_inode, entry_inode_path)
  File "./tools/metadata.py", line 26, in scan_dir
    scan_dir(entry_inode, entry_inode_path)
  File "./tools/metadata.py", line 29, in scan_dir
    xattrs_perms = list(entry_inode.xattrs())
  File "/home/semaphore/unpacker/tools/ext4.py", line 976, in xattrs
    for xattr_name, xattr_value in self._parse_xattrs(inline_data[offset:], 0, prefix_override = prefix_override):
  File "/home/semaphore/unpacker/tools/ext4.py", line 724, in _parse_xattrs
    xattr_inode = self.volume.get_inode(xattr.e_value_inum, InodeType.FILE)
NameError: name 'xattr' is not defined
I have tried removing the if and keeping the code after the else only, here: https://github.com/cubinator/ext4/blob/master/ext4.py35.py#L722
Sadly, no luck. It looks like the tool is not finished? But there are no other alternatives.
Any help is welcome :)
Thank you.
EDIT: someone suggested replacing xattr with xattr_entry.
So I did, and I got this error: takes 2 positional arguments but 3 were given
I tried fixing that and got:
  File "/home/semaphore/unpacker/tools/ext4.py", line 724, in _parse_xattrs
    xattr_inode = self.volume.get_inode(xattr_entry.e_value_inum)
  File "/home/semaphore/unpacker/tools/ext4.py", line 595, in get_inode
    inode_table_offset = self.group_descriptors[group_idx].bg_inode_table * self.block_size
IndexError: list index out of range
And I could not fix this error :(
Maybe there's an alternative way to get the SELinux info, capabilities, uid, gid and permissions from a raw ext4 image?
I read that you tried to fix the issue yourself, but you never posted a snippet of the code you're currently using.
I am not sure, but it seems to me you modified the signature of get_inode instead of modifying which parameters get passed to it.
E.g. did you try:
xattr_inode = self.volume.get_inode(xattr_entry.e_value_inum)
I figured out how to do it in an alternative way.
First mount the image (needs root access):
os.system("sudo mount -t ext4 -o loop vendor.img vendor")
Then use: os.lstat and os.getxattr on each file. It gives all the information:
stat_info = os.lstat(file)
try:
    cap = hex(int.from_bytes(os.getxattr(file, "security.capability")[4:8] + os.getxattr(file, "security.capability")[14:18], "little"))
except:
    cap = "0x0"
try:
    selabel = os.getxattr(file, b"security.selinux", follow_symlinks=False).decode().strip('\n\0')
except:
    selabel = "u:object_r:unlabeled:s0"
metadata.append("set_metadata(\"/" + file + "\", \"uid\", " + str(stat_info.st_uid) + ", \"gid\", " + str(stat_info.st_gid) + ", \"mode\", " + oct(stat_info.st_mode)[-4:] + ", \"capabilities\", " + cap + ", \"selabel\", \"" + selabel + "\");")
Like so. This is the only solution I could find.
I'm trying to stop this code from giving me an error about a file I created called beloved.txt. I used FileNotFoundError to say not to give me the error and to print a message that the file was not found, but instead it's printing both the message and the error. How can I fix it?
def count_words(Filenames):
    with open(Filenames) as fill_object:
        contentInFill = fill_object.read()
        words = contentInFill.rsplit()
        word_length = len(words)
        print("The file " + Filename + " has " + str(word_length) + " words.")
    try:
        Filenames = open("beloved.txt", mode="rb")
        data = Filenames.read()
        return data
    except FileNotFoundError as err:
        print("Cant find the file name")

Filenames = ["anna.txt", "gatsby.txt", "don_quixote.txt", "beloved.txt", "mockingbird.txt"]
for Filename in Filenames:
    count_words(Filename)
A few tips:
Don't capitalize variables besides class names.
Use different variable names when referring to different things (i.e. don't use Filenames = open("beloved.txt", mode="rb") when you already have a global version of that variable and a local version of that variable, and now you are reassigning it to mean something different again!). This behavior will lead to headaches...
The main problem with the script, though, is trying to open a file outside your try statement. You can just move your code to be within the try:! I also don't understand except FileNotFoundError as err: when you don't use err. You should rewrite that as except FileNotFoundError: in this case :)
def count_words(file):
    try:
        with open(file) as fill_object:
            contentInFill = fill_object.read()
            words = contentInFill.rsplit()
            word_length = len(words)
            print("The file " + file + " has " + str(word_length) + " words.")
        with open("beloved.txt", mode="rb") as other_file:
            data = other_file.read()
        return data
    except FileNotFoundError:
        print("Cant find the file name")

filenames = ["anna.txt", "gatsby.txt", "don_quixote.txt", "beloved.txt", "mockingbird.txt"]
for filename in filenames:
    count_words(filename)
I also do not understand why you have your function return data when data is read from the same file regardless of the file you pass to the function. You will get the same result returned in all cases...
The "with open(Filenames) as fill_object:" line is what throws the exception, so you must at least enclose that line in the try part. In your code you first get the word count, and then you check for the specific file beloved.txt; this duplicated code leads to the duplicated messages. Suggestion:
def count_words(Filenames):
    try:
        with open(Filenames) as fill_object:
            contentInFill = fill_object.read()
            words = contentInFill.rsplit()
            word_length = len(words)
            print("The file " + Filename + " has " + str(word_length) + " words.")
    except FileNotFoundError as err:
        print("Cant find the file name")
So I have a rather general question I was hoping to get some help with. I put together a Python program that runs through and automates workflows at the state level for all the different counties. The entire program was created for research at school, not actual state work. Anyway, I have two designs shown below. The first is an updated version; it takes about 40 minutes to run. The second design shows the original work. Note that it is not a well-structured design. However, it takes about five minutes to run the entire program. Could anybody give any insight into why there are such differences between the two? The updated version is still ideal, as it is much more reusable (it can run and grab any dataset in the URL) and easy to understand. Furthermore, 40 minutes to get about a hundred workflows completed is still a plus. Also, this is still a work in progress. A couple of minor issues still need to be addressed in the code, but it is still a pretty cool program.
Updated Design
import os, sys, urllib2, urllib, zipfile, arcpy
from arcpy import env

path = os.getcwd()

def pickData():
    myCount = 1
    path1 = 'path2URL'
    response = urllib2.urlopen(path1)
    print "Enter the name of the files you need"
    numZips = raw_input()
    numZips2 = numZips.split(",")
    myResponse(myCount, path1, response, numZips2)

def myResponse(myCount, path1, response, numZips2):
    myPath = os.getcwd()
    for each in response:
        eachNew = each.split(" ")
        eachCounty = eachNew[9].strip("\n").strip("\r")
        try:
            myCountyDir = os.mkdir(os.path.expanduser(myPath + "\\counties" + "\\" + eachCounty))
        except:
            pass
        myRetrieveDir = myPath + "\\counties" + "\\" + eachCounty
        os.chdir(myRetrieveDir)
        myCount += 1
        response1 = urllib2.urlopen(path1 + eachNew[9])
        for all1 in response1:
            allNew = all1.split(",")
            allFinal = allNew[0].split(" ")
            allFinal1 = allFinal[len(allFinal)-1].strip(" ").strip("\n").strip("\r")
            numZipsIter = 0
            path8 = path1 + eachNew[9][0:len(eachNew[9])-2] + "/" + allFinal1
            downZip = eachNew[9][0:len(eachNew[9])-2] + ".zip"
            while numZipsIter < len(numZips2):
                if (numZips2[numZipsIter][0:3].strip(" ") == "NWI") and ("remap" not in allFinal1):
                    numZips2New = numZips2[numZipsIter].split("_")
                    if (numZips2New[0].strip(" ") in allFinal1 and numZips2New[1] != "remap" and numZips2New[2].strip(" ") in allFinal1) and (allFinal1[-3:] == "ZIP" or allFinal1[-3:] == "zip"):
                        urllib.urlretrieve(path8, allFinal1)
                        zip1 = zipfile.ZipFile(myRetrieveDir + "\\" + allFinal1)
                        zip1.extractall(myRetrieveDir)
                # maybe just have numZips2 (raw input) as the values before the county number
                # numZips2[numZipsIter][0:-7].strip(" ") in allFinal1 or numZips2[numZipsIter][0:-7].strip(" ").lower() in allFinal1) and (allFinal1[-3:]=="ZIP" or allFinal1[-3:]=="zip"
                elif (numZips2[numZipsIter].strip(" ") in allFinal1 or numZips2[numZipsIter].strip(" ").lower() in allFinal1) and (allFinal1[-3:] == "ZIP" or allFinal1[-3:] == "zip"):
                    urllib.urlretrieve(path8, allFinal1)
                    zip1 = zipfile.ZipFile(myRetrieveDir + "\\" + allFinal1)
                    zip1.extractall(myRetrieveDir)
                numZipsIter += 1

pickData()

# client picks shapefiles to add to map
# section for geoprocessing operations
# get the data frames
# add new data frame, title
# check spaces in ftp crawler
os.chdir(path)
env.workspace = path + "\\symbology\\"
zp1 = os.listdir(path + "\\counties\\")

def myGeoprocessing(layer1, layer2):
    # the code in this function is used for geoprocessing operations
    # it returns whatever output is generated from the tools used in the map
    try:
        arcpy.Clip_analysis(path + "\\symbology\\Stream_order.shp", layer1, path + "\\counties\\" + layer2 + "\\Streams.shp")
    except:
        pass
    streams = arcpy.mapping.Layer(path + "\\counties\\" + layer2 + "\\Streams.shp")
    arcpy.ApplySymbologyFromLayer_management(streams, path + '\\symbology\\streams.lyr')
    return streams

def makeMap():
    # original wetlands layers need to be entered as NWI_line or NWI_poly
    print "Enter the layer or layers you wish to include in the map"
    myInput = raw_input()
    counter1 = 1
    for each in zp1:
        print each
        print path
        zp2 = os.listdir(path + "\\counties\\" + each)
        for eachNew in zp2:
            # print eachNew
            if (eachNew[-4:] == ".shp") and ((myInput in eachNew[0:-7] or myInput.lower() in eachNew[0:-7]) or ((eachNew[8:12] == "poly" or eachNew[8:12] == 'line') and eachNew[8:12] in myInput)):
                print eachNew[0:-7]
                theMap = arcpy.mapping.MapDocument(path + '\\map.mxd')
                df1 = arcpy.mapping.ListDataFrames(theMap, "*")[0]
                # this is where we add our layers
                layer1 = arcpy.mapping.Layer(path + "\\counties\\" + each + "\\" + eachNew)
                if eachNew[7:11] == "poly" or eachNew[7:11] == "line":
                    arcpy.ApplySymbologyFromLayer_management(layer1, path + '\\symbology\\' + myInput + '.lyr')
                else:
                    arcpy.ApplySymbologyFromLayer_management(layer1, path + '\\symbology\\' + eachNew[0:-7] + '.lyr')
                # Assign legend variable for map
                legend = arcpy.mapping.ListLayoutElements(theMap, "LEGEND_ELEMENT", "Legend")[0]
                # add wetland layer to map
                legend.autoAdd = True
                try:
                    arcpy.mapping.AddLayer(df1, layer1, "AUTO_ARRANGE")
                    # geoprocessing steps
                    streams = myGeoprocessing(layer1, each)
                    # more geoprocessing options, add the layers to map and assign if they should appear in legend
                    legend.autoAdd = True
                    arcpy.mapping.AddLayer(df1, streams, "TOP")
                    df1.extent = layer1.getExtent(True)
                    arcpy.mapping.ExportToJPEG(theMap, path + "\\counties\\" + each + "\\map.jpg")
                    # Save map document to path
                    theMap.saveACopy(path + "\\counties\\" + each + "\\map.mxd")
                    del theMap
                    print "done with map " + str(counter1)
                except:
                    print "issue with map or already exists"
                counter1 += 1

makeMap()
Original Design
import os, sys, urllib2, urllib, zipfile, arcpy
from arcpy import env

response = urllib2.urlopen('path2URL')
path1 = 'path2URL'
myCount = 1
for each in response:
    eachNew = each.split(" ")
    myCount += 1
    response1 = urllib2.urlopen(path1 + eachNew[9])
    for all1 in response1:
        # print all1
        allNew = all1.split(",")
        allFinal = allNew[0].split(" ")
        allFinal1 = allFinal[len(allFinal)-1].strip(" ")
        if allFinal1[-10:-2] == "poly.ZIP":
            response2 = urllib2.urlopen('path2URL')
            zipcontent = response2.readlines()
            path8 = 'path2URL' + eachNew[9][0:len(eachNew[9])-2] + "/" + allFinal1[0:len(allFinal1)-2]
            downZip = str(eachNew[9][0:len(eachNew[9])-2]) + ".zip"
            urllib.urlretrieve(path8, downZip)

# Set the path to the directory where your zipped folders reside
zipfilepath = 'F:\Misc\presentation'
# Set the path to where you want the extracted data to reside
extractiondir = 'F:\Misc\presentation\counties'
# List all data in the main directory
zp1 = os.listdir(zipfilepath)
# Creates a loop which gives us each zipped folder automatically
# Concatenates zipped folder to original directory in variable done
for each in zp1:
    print each[-4:]
    if each[-4:] == ".zip":
        done = zipfilepath + "\\" + each
        zip1 = zipfile.ZipFile(done)
        extractiondir1 = extractiondir + "\\" + each[:-4]
        zip1.extractall(extractiondir1)

path = os.getcwd()
counter1 = 1
# get the data frames
# Create new layer for all files to be added to map document
env.workspace = "E:\\Misc\\presentation\\symbology\\"
zp1 = os.listdir(path + "\\counties\\")
for each in zp1:
    zp2 = os.listdir(path + "\\counties\\" + each)
    for eachNew in zp2:
        if eachNew[-4:] == ".shp":
            wetlandMap = arcpy.mapping.MapDocument('E:\\Misc\\presentation\\wetland.mxd')
            df1 = arcpy.mapping.ListDataFrames(wetlandMap, "*")[0]
            # print eachNew[-4:]
            wetland = arcpy.mapping.Layer(path + "\\counties\\" + each + "\\" + eachNew)
            # arcpy.Clip_analysis(path + "\\symbology\\Stream_order.shp", wetland, path + "\\counties\\" + each + "\\Streams.shp")
            streams = arcpy.mapping.Layer(path + "\\symbology\\Stream_order.shp")
            arcpy.ApplySymbologyFromLayer_management(wetland, path + '\\symbology\\wetland.lyr')
            arcpy.ApplySymbologyFromLayer_management(streams, path + '\\symbology\\streams.lyr')
            # Assign legend variable for map
            legend = arcpy.mapping.ListLayoutElements(wetlandMap, "LEGEND_ELEMENT", "Legend")[0]
            # add the layers to map and assign if they should appear in legend
            legend.autoAdd = True
            arcpy.mapping.AddLayer(df1, streams, "TOP")
            legend.autoAdd = True
            arcpy.mapping.AddLayer(df1, wetland, "AUTO_ARRANGE")
            df1.extent = wetland.getExtent(True)
            # Export the map to a JPEG
            arcpy.mapping.ExportToJPEG(wetlandMap, path + "\\counties\\" + each + "\\wetland.jpg")
            # Save map document to path
            wetlandMap.saveACopy(path + "\\counties\\" + each + "\\wetland.mxd")
            del wetlandMap
            print "done with map " + str(counter1)
            counter1 += 1
Have a look at this guide:
https://wiki.python.org/moin/PythonSpeed/PerformanceTips
Let me quote:
Function call overhead in Python is relatively high, especially compared with the execution speed of a builtin function. This strongly suggests that where appropriate, functions should handle data aggregates.
So effectively this suggests not factoring something out into a function that is going to be called hundreds of thousands of times.
In Python, functions won't be inlined, and calling them is not cheap. If in doubt, use a profiler to find out how many times each function is called and how long it takes on average. Then optimize.
You might also give PyPy a shot, as they have certain optimizations built in. Reducing the function call overhead in some cases seems to be one of them:
Python equivalence to inline functions or macros
http://pypy.org/performance.html
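To make the call overhead concrete, here is a small, self-contained micro-benchmark (the names and sizes are illustrative, not taken from the program above): the same summation done once through a per-element helper function and once inline.

```python
import timeit

def add(a, b):
    return a + b

def sum_with_calls(data):
    # one Python-level function call per element
    total = 0
    for x in data:
        total = add(total, x)
    return total

def sum_inline(data):
    # identical arithmetic, no call overhead
    total = 0
    for x in data:
        total = total + x
    return total

data = list(range(10000))
t_calls = timeit.timeit(lambda: sum_with_calls(data), number=100)
t_inline = timeit.timeit(lambda: sum_inline(data), number=100)
print("with calls: %.3fs, inline: %.3fs" % (t_calls, t_inline))
```

On CPython the call-heavy version is typically noticeably slower, which is exactly why hoisting tiny helpers out of hot inner loops, or passing whole aggregates into a function, pays off.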
This is my first question, and I apologize if it's a bit long on the code-example side.
As part of a job application I was asked to write a BitTorrent file parser that exposed some of the fields. I did the code, and was told my code was "not quite at the level that we require from a team lead". Ouch!
That's fine; it's been years since I have coded, and list comprehensions and generators did not exist back in the day (I started with COBOL, but have coded with C, C++, etc.). To me the below code is very clean. Sometimes there is no need to use more complex structures, syntax or patterns: "Keep it Simple".
Could I ask some Python gurus to critique this code, please? I believe it is useful to others to see where the code could be improved. There were more comments, etc. (the bencode.py is from http://wiki.theory.org/Decoding_bencoded_data_with_python ).
The areas I can think of:
in the display_* methods, use list comprehensions to avoid the string of "if"s
better list comprehension / generator usage
bad use of globals
stdin/stdout/piping? This was a simple assignment, so I thought it was not necessary.
I was personally proud of this code, so would like to know where I need to improve. Thanks.
#!/usr/bin/env python2
"""Bit Torrent Parsing

Parses a Bit Torrent file.

A basic parser for Bit Torrent files. Visit
http://wiki.theory.org/BitTorrentSpecification for the BitTorrent
specification.
"""

__author__ = "...."
__version__ = "$Revision: 1.0 $"
__date__ = "$Date: 2012/10/26 11:08:46 $"
__copyright__ = "Enjoy & Distribute"
__license__ = "Python"

import bencode
import argparse
from argparse import RawTextHelpFormatter
import binascii
import time
import os
import pprint

torrent_files = 0
torrent_pieces = 0


def display_root(filename, root):
    """prints main (root) information on torrent"""
    global torrent_files
    global torrent_pieces
    print
    print "Bit Torrent Metafile Structure root nodes:"
    print "------------------------------------------"
    print "Torrent filename: ", filename
    print "  Info: %d file(s), %d pieces, ~%d kb/pc" % (
        torrent_files,
        torrent_pieces,
        root['info']['piece length'] / 1024)
    if 'private' in root['info']:
        if root['info']['private'] == 1:
            print "  Publish presence: Private"
    print "  Announce: ", root['announce']
    if 'announce-list' in root:
        print "  Announce List: "
        for i in root['announce-list']:
            print "    ", i[0]
    if 'creation date' in root:
        print "  Creation Date: ", time.ctime(root['creation date'])
    if 'comment' in root:
        print "  Comment: ", root['comment']
    if 'created-by' in root:
        print "  Created-By: ", root['created-by']
    print "  Encoding: ", root['encoding']
    print


def display_torrent_file(info):
    """prints file information (single or multifile)"""
    global torrent_files
    global torrent_pieces
    if 'files' in info:
        # multipart file mode
        # directory, followed by filenames
        print "Files:"
        max_files = args.maxfiles
        display = max_files if (max_files < torrent_files) else torrent_files
        print "  %d File %d shown: " % (torrent_files, display)
        print "  Directory: ", info['name']
        print "  Filenames:"
        i = 0
        for files in info['files']:
            if i < max_files:
                prefix = ''
                if len(files['path']) > 1:
                    prefix = './'
                filename = prefix + '/'.join(files['path'])
                if args.filehash:
                    if 'md5sum' in files:
                        md5hash = binascii.hexlify(files['md5sum'])
                    else:
                        md5hash = 'n/a'
                    print '    %s [hash: %s]' % (filename, md5hash)
                else:
                    print '    %s ' % filename
                i += 1
            else:
                break
    else:
        # single file mode
        print "Filename: ", info['name']
    print


def display_pieces(pieceDict):
    """prints SHA1 hash for pieces, limited by arg pieces"""
    global torrent_files
    global torrent_pieces
    # global pieceDict
    # limit since a torrent file can have 1,000's of pieces
    max_pieces = args.pieces if args.pieces else 10
    print "Pieces:"
    print "  Torrent contains %s pieces, %d shown." % (
        torrent_pieces, max_pieces)
    print "  piece : sha1"
    i = 0
    while i < max_pieces and i < torrent_pieces:
        # print SHA1 hash in readable hex format
        print '  %5d : %s' % (i + 1, binascii.hexlify(pieceDict[i]))
        i += 1


def parse_pieces(root):
    """create dictionary [ piece-num, hash ] from info's pieces

    Returns the pieces dictionary. key is the piece number, value is the
    SHA1 hash value (20-bytes)

    Keyword arguments:
    root -- a Bit Torrent Metafile root dictionary
    """
    global torrent_pieces
    pieceDict = {}
    i = 0
    while i < torrent_pieces:
        pieceDict[i] = root['info']['pieces'][(20 * i):(20 * i) + 20]
        i += 1
    return pieceDict


def parse_root_str(root_str):
    """create dictionary [ piece-num, hash ] from info's pieces

    Returns the complete Bit Torrent Metafile Structure dictionary with
    relevant Bit Torrent Metafile nodes and their values.

    Keyword arguments:
    root_str -- a UTF-8 encoded string with root-level nodes (e.g., info)
    """
    global torrent_files
    global torrent_pieces
    try:
        torrent_root = bencode.decode(root_str)
    except StandardError:
        print 'Error in torrent file, likely missing separators like ":"'
    if 'files' in torrent_root['info']:
        torrent_files = len(torrent_root['info']['files'])
    else:
        torrent_files = 1
    torrent_pieces = len(torrent_root['info']['pieces']) / 20
    torrent_piece = parse_pieces(torrent_root)
    return torrent_root, torrent_piece


def readfile(filename):
    """read file and return file's data"""
    global torrent_files
    global torrent_pieces
    if os.path.exists(filename):
        with open(filename, mode='rb') as f:
            filedata = f.read()
    else:
        print "Error: filename: '%s' does not exist." % filename
        raise IOError("Filename not found.")
    return filedata


if __name__ == "__main__":
    parser = argparse.ArgumentParser(
        formatter_class=RawTextHelpFormatter,
        description=
            "A basic parser for Bit Torrent files. Visit "
            "http://wiki.theory.org/BitTorrentSpecification for "
            "the BitTorrent specification.",
        epilog=
            "The keys for the Bit Torrent MetaInfo File Structure "
            "are info, announce, announce-list, creation date, comment, "
            "created by and encoding. \n"
            "The Info Dictionary (info) is dependant on whether the torrent "
            "file is a single or multiple file. The keys common to both "
            "are piece length, pieces and private.\nFor single files, the "
            "additional keys are name, length and md5sum. For multiple files "
            "the keys are, name and files. files is also a dictionary with "
            "keys length, md5sum and path.\n\n"
            "Examples:\n"
            "torrentparse.py --string 'l4:dir14:dir28:file.exte'\n"
            "torrentparse.py --filename foo.torrent\n"
            "torrentparse.py -f foo.torrent -f bar.torrent "
            "--maxfiles 2 --filehash --pieces 2 -v")
    filegroup = parser.add_argument_group('Input File or String')
    filegroup.add_argument("-f", "--filename",
                           help="name of torrent file to parse",
                           action='append')
    filegroup.add_argument("-fh", "--filehash",
                           help="display file's MD5 hash",
                           action="store_true")
    filegroup.add_argument("-maxf", "--maxfiles",
                           help="display X filenames (default=20)",
                           metavar='X',
                           type=int, default=20)
    piecegroup = parser.add_argument_group('Torrent Pieces')
    piecegroup.add_argument("-p", "--pieces",
                            help="display X piece's SHA1 hash (default=10)",
                            metavar='X',
                            type=int)
    parser.add_argument("-s", "--string",
                        help="string for bencoded dictionary item")
    parser.add_argument("-v", "--verbose",
                        help="Display MetaInfo file to stdout",
                        action="store_true")
    args = parser.parse_args()

    if args.string:
        print
        text = bencode.decode(args.string)
        print text
    else:
        for fn in args.filename:
            try:
                filedata = readfile(fn)
                torrent_root, torrent_piece = parse_root_str(filedata)
            except IOError:
                print "Please enter a valid filename"
                raise
            if torrent_root:
                display_root(fn, torrent_root)
                display_torrent_file(torrent_root['info'])
                if args.pieces:
                    display_pieces(torrent_piece)
                verbose = True if args.verbose else False
                if verbose:
                    print
                    print "Verbose Mode: \nPrinting root and info dictionaries"
                    # remove pieces as its long. display it afterwards
                    pieceless_root = torrent_root
                    del pieceless_root['info']['pieces']
                    pp = pprint.PrettyPrinter(indent=4)
                    pp.pprint(pieceless_root)
                    print
                    print "Print info's piece information: "
                    pp.pprint(torrent_piece)
                    print
    print "\n"
The following snippet:
i = 0
while i < torrent_pieces:
    pieceDict[i] = root['info']['pieces'][(20*i):(20*i)+20]
    i += 1
should be replaced by:
for i in range(torrent_pieces):
    pieceDict[i] = root['info']['pieces'][(20*i):(20*i)+20]
That might be the kind of thing they want to see. In general, Python code shouldn't need explicit index variable manipulation in for loops very much.
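Taking that advice one step further, the rewritten loop is just a mapping from piece number to a fixed-width slice, so it collapses to a dict comprehension. A runnable sketch with fake piece data (the three 20-byte "digests" below are made up for illustration, standing in for root['info']['pieces']):

```python
# Three fake 20-byte "SHA1 digests" concatenated, as in a torrent's pieces string.
pieces = b"".join(bytes([i]) * 20 for i in range(3))
num_pieces = len(pieces) // 20

# One fixed-width slice per piece number, with no manual index bookkeeping.
piece_dict = {i: pieces[20 * i:20 * i + 20] for i in range(num_pieces)}

print(len(piece_dict))  # 3
```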
The first thing I notice is that you've got a lot of global variables. That's no good; your code is no longer threadsafe, for one problem. (I see now that you noted that in your question, but that is something that should be changed.)
This looks a little odd:
i = 0
for files in info['files']:
    if i < max_files:
        # ...
    else:
        break
Instead, you could just do this:
for file in info['files'][:max_files]:
    # ...
I also notice that you parse the file just enough to output all of the data pretty-printed. You might want to put it into appropriate structures. For example, have Torrent, Piece, and File classes.
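A minimal sketch of what those structures could look like, using modern dataclasses. The field names here are an assumption chosen to mirror the metainfo keys, not an API the BitTorrent spec or the answer mandates:

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Piece:
    index: int
    sha1: bytes  # 20-byte SHA1 digest of this piece

@dataclass
class File:
    path: str
    length: int = 0
    md5sum: Optional[str] = None  # optional per the metainfo spec

@dataclass
class Torrent:
    name: str
    piece_length: int
    files: List[File] = field(default_factory=list)
    pieces: List[Piece] = field(default_factory=list)

# Populate from parsed metainfo instead of printing as you go.
t = Torrent(name="example", piece_length=262144)
t.files.append(File(path="./dir1/file.ext", length=1024))
t.pieces.append(Piece(index=0, sha1=b"\x00" * 20))
```

With the data in objects like these, display becomes a separate concern, and the module-level globals (torrent_files, torrent_pieces) disappear into len(t.files) and len(t.pieces).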
Trying to learn some geospatial python. More or less following the class notes here.
My Code
#!/usr/bin/python

# import modules
import ogr, sys, os

# set working dir
os.chdir('/home/jacques/misc/pythongis/data')

# create the text file we're writing to
file = open('data_export.txt', 'w')

# import the required driver for .shp
driver = ogr.GetDriverByName('ESRI Shapefile')

# open the datasource
data = driver.Open('road_surveys.shp', 1)
if data is None:
    print 'Error, could not locate file'
    sys.exit(1)

# grab the datalayer
layer = data.GetLayer()

# loop through the features
feature = layer.GetNextFeature()
while feature:
    # acquire attributes
    id = feature.GetFieldAsString('Site_Id')
    date = feature.GetFieldAsString('Date')
    # get coordinates
    geometry = feature.GetGeometryRef()
    x = str(geometry.GetX())
    y = str(geometry.GetY()
    # write to the file
    file.Write(id + ' ' + x + ' ' + y + ' ' + cover + '\n')
    # remove the current feature, and get a new one
    feature.Destroy()
    feature = layer.GetNextFeature()

# close the data source
datasource.Destroy()
file.close()
Running that gives me the following:
File "shape_summary.py", line 38
file.write(id + ' ' + x + ' ' + y + ' ' + cover + '\n')
^
SyntaxError: invalid syntax
Running Python 2.7.1
Any help would be fantastic!
The previous line is missing a closing parenthesis:
y = str(geometry.GetY())
Also, just a style comment: it's a good idea to avoid using the variable name file in Python, because it actually has a meaning. Try opening a new Python session and running help(file)
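To see why reusing a builtin name bites: `file` no longer exists in Python 3, so this small demo shadows `list` instead, but the mechanics are identical to writing `file = open(...)` in Python 2:

```python
# Rebinding a builtin name shadows it for the rest of the scope.
# In Python 2, `file = open(...)` shadowed the built-in file type the same way.
list = [1, 2, 3]          # shadows the built-in list type
try:
    list("abc")           # the name now refers to our list, not the type
except TypeError as err:
    print("shadowed builtin:", err)

del list                  # remove the shadow; the builtin is visible again
print(list("abc"))        # ['a', 'b', 'c']
```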
1) write shouldn't be capitalized in your code (Python is case-sensitive).
2) Make sure id is a string; if it isn't, use str(id) in your term; same for cover, x and y.