Closed. This question needs details or clarity. It is not currently accepting answers.
Want to improve this question? Add details and clarify the problem by editing this post.
Closed 8 months ago.
The community reviewed whether to reopen this question 8 months ago and left it closed:
Original close reason(s) were not resolved
Improve this question
I'm new to Python and programming in general. I am taking a module at university which requires me to write some fairly basic programs in Python. However, I got this feedback on my last assignment:
There should be a header block containing the file name, author name, date created, date modified and python version
What is a header block? Is it just comments at the top of your code or is it be something which prints when the program runs? Or something else?
There's thing called Docstring in python (and here're some conventions on how to write python code in general - PEP 8) escaped by either triple single quote ''' or triple double quote """ well suited for multiline comments:
'''
File name: test.py
Author: Peter Test
Date created: 4/20/2013
Date last modified: 4/25/2013
Python Version: 2.7
'''
You also may used special variables later (when programming a module) that are dedicated to contain info as:
__author__ = "Rob Knight, Gavin Huttley, and Peter Maxwell"
__copyright__ = "Copyright 2007, The Cogent Project"
__credits__ = ["Rob Knight", "Peter Maxwell", "Gavin Huttley",
"Matthew Wakefield"]
__license__ = "GPL"
__version__ = "1.0.1"
__maintainer__ = "Rob Knight"
__email__ = "rob#spot.colorado.edu"
__status__ = "Production"
More details in answer here.
My Opinion
I use this this format, as I am learning, "This is more for my own sanity, than a necessity."
As I like consistency. So, I start my files like so.
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
# =============================================================================
# Created By : Jeromie Kirchoff
# Created Date: Mon August 18 18:54:00 PDT 2018
# =============================================================================
"""The Module Has Been Build for..."""
# =============================================================================
# Imports
# =============================================================================
from ... import ...
<more code...>
First line is the Shebang
And I know There's no reason for most Python files to have a shebang line but, for me I feel it lets the user know that I wrote this explicitly for python3. As on my mac I have both python2 & python3.
Line 2 is the encoding, again just for clarification
As some of us forget when we are dealing with multiple sources (API's, Databases, Emails etc.)
Line 3 is more of my own visual representation of the max 80 char.
I know "Oh, gawd why?!?" again this allows me to keep my code within 80 chars for visual representation & readability.
Line 4 & 5 is just my own way of keeping track as when working in a big group keeping who wrote it on hand is helpful and saves a bit of time looking thru your GitHub. Not relevant again just things I picked up for my sanity.
Line 7 is your Docstring that is required at the top of each python file per Flake8.
Again, this is just my preference. In a working environment you have to win everyone over to change the defacto behaviour. I could go on and on about this but we all know about it, at least in the workplace.
Header Block
What is a header block?
Is it just comments at the top of your code
or is it be something which prints when the program runs?
Or something else?
So in this context of a university setting:
Header block or comments
Header comments appear at the top of a file. These lines typically include the filename, author, date, version number, and a description of what the file is for and what it contains. For class assignments, headers should also include such things as course name, number, section, instructor, and assignment number.
Is it just comments at the top of your code or is it be something which prints when the program runs? Or something else?
Well, this can be interpreted differently by your professor, showcase it and ask!
"If you never ask, The answer is ALWAYS No."
ie:
# Course: CS108
# Laboratory: A13
# Date: 2018/08/18
# Username: JayRizzo
# Name: Jeromie Kirchoff
# Description: My First Project Program.
If you are looking for Overkill:
or the python way using "Module Level Dunder Names"
Standard Module Level Dunder Names
__author__ = 'Jeromie Kirchoff'
__copyright__ = 'Copyright 2018, Your Project'
__credits__ = ['Jeromie Kirchoff', 'Victoria Mackie']
__license__ = 'MSU' # Makin' Shi* Up!
__version__ = '1.0.1'
__maintainer__ = 'Jeromie Kirchoff'
__email__ = 'MyGmailUser#gmail.com'
__status__ = 'Prototype'
Add Your Own Custom Names:
__course__ = 'cs108'
__teammates__ = ['Jeromie Kirchoff']
__laboratory__ = 'A13'
__date__ = '2018/08/18'
__username__ = 'JayRizzo'
__description__ = 'My First Project Program.'
Then just add a little code to print if the instructor would like.
print('# ' + '=' * 78)
print('Author: ' + __author__)
print('Teammates: ' + ', '.join(__teammates__))
print('Copyright: ' + __copyright__)
print('Credits: ' + ', '.join(__credits__))
print('License: ' + __license__)
print('Version: ' + __version__)
print('Maintainer: ' + __maintainer__)
print('Email: ' + __email__)
print('Status: ' + __status__)
print('Course: ' + __course__)
print('Laboratory: ' + __laboratory__)
print('Date: ' + __date__)
print('Username: ' + __username__)
print('Description: ' + __description__)
print('# ' + '=' * 78)
End RESULT
Every time the program gets called it will show the list.
$ python3 custom_header.py
# ==============================================================================
Author: Jeromie Kirchoff
Teammates: Jeromie Kirchoff
Copyright: Copyright 2018, Your Project
Credits: Jeromie Kirchoff, Victoria Mackie
License: MSU
Version: 1.0.1
Maintainer: Jeromie Kirchoff
Email: MyGmailUser#gmail.com
Status: Prototype
Course: CS108
Laboratory: A13
Date: 2018/08/18
Username: JayRizzo
Description: My First Project Program.
# ==============================================================================
Notes: If you expand your program just set this once in the init.py and you should be all set, but again check with the professor.
If would like the script checkout my github.
Your instructor wants you to add some information to your assignment's source code's top section something like this, so you are right you will add comments:
####################################
# File name: ... #
# Author: ... #
# Submission: #
# Instructor: #
####################################
Very good discussion here --> What is the common header format of Python files?
The Python docstring should be concise, and not really contain revision history, or anything not directly related to the current version behaviour. I have yet to see "man" style docstrings and it may be just as well.
A flower box, with revision history independent of source control (as some of the revisions may pre-date your source control eventually) goes back to the days of reading code on paper or as emailed. We were not always as connected as we are now.
Using a modern IDE this has fallen out of favour, but can be seen for older/larger high level works. In some shops the sign in is not performed by the coder, especially if the code has been "shopped out". Some signin's are commented in a lazy, slovenly fashion.
So it varies, but :
#! /usr/bin/python
#--------------------------------#
# optional flower box
#--------------------------------#
"""
Multiple lines of doc if required
"""
import foo
import bar
__metastuff__ = 'some value'
I see the 'meta' higher up, notably in the youtube promotionals for "pycharm". People like to see it below the imports as it is really code and the imports are expected to come before the code. I can imagine it may become easy to get carried away. Sensible and informative comments in the low level code are worth way more than what is written upstairs anyway.
In the real world, just do what everybody else is doing on your project and you will be fine. It is common to re-use a template anyway, or copy and paste (i.e. ripoff) from a "prototype".
A header block are just comments at the top of the code. It doesn't print when the program runs.
A example could look like the following:
# File name: test.py
# Author: Peter Test
# Date created: 4/20/2013
# Date last modified: 4/25/2013
# Python Version: 2.7
# Begin code
a = 1
b = 2
c = a + b
print c
In this context, you are correct. A header block means a set of comments at the top of the source file that contains the requested information. It does not need to contain any code that does anything.
Related
I wrote a library using just ast and inspect libraries to parse and emit [uses astor on Python < 3.9] internal Python constructs.
Just realised that I really need to preserve comments afterall. Preferably without resorting to a RedBaron or LibCST; as I just need to emit the unaltered commentary; is there a clean and concise way of comment-preserving parsing/emitting Python source with just stdlib?
What I ended up doing was writing a simple parser, without a meta-language in 339 source lines:
https://github.com/offscale/cdd-python/blob/master/cdd/cst_utils.py
Implementation of Concrete Syntax Tree [List!]
Reads source character by character;
Once end of statement† is detected, add statement-type into 1D list;
†end of line if line.lstrip().startswith("#") or line not endswith('\\') and balanced_parens(line) else continue munching until that condition is true… plus some edge-cases around multiline strings and the like;
Once finished there is a big (1D) list where each element is a namedtuple with a value property.
Integration with builtin Abstract Syntax Tree ast
Limit ast nodes to modify—not remove—to: {ClassDef,AsyncFunctionDef,FunctionDef} docstring (first body element Constant|Str), Assign and AnnAssign;
cst_idx, cst_node = find_cst_at_ast(cst_list, _node);
if doc_str node then maybe_replace_doc_str_in_function_or_class(_node, cst_idx, cst_list)
…
Now the cst_list contains only changes to those aforementioned nodes, and only when that change is more than whitespace, and can be created into a string with "".join(map(attrgetter("value"), cst_list)) for outputting to eval or straight out to a source file (e.g., in-place overriding).
Quality control
100% test coverage
100% doc coverage
Support for last 6 versions of Python (including latest alpha)
CI/CD
(Apache-2.0 OR MIT) licensed
Limitations
Lack of meta-language, specifically lack of using Python's provided grammar means new syntax elements won't automatically be supported (match/case is supported, but if there's new syntax introduced since, it isn't [yet?] supported… at least not automatically);
Not builtin to stdlib so stdlib could break compatibility;
Deleting nodes is [probably] not supported;
Nodes can be incorrectly identified if there are shadow variables or similar issues that linters should point out.
Comments can be preserved by merging them back into the generated source code by capturing them with the tokenizer.
Given a toy program in a program variable, we can demonstrate how comments get lost in the AST:
import ast
program = """
# This comment lost
p1v = 4 + 4
p1l = ['a', # Implicit line joining comment for a lost
'b'] # Ending comment for b lost
def p1f(x):
"p1f docstring"
# Comment in function p1f lost
return x
print(p1f(p1l), p1f(p1v))
"""
tree = ast.parse(program)
print('== Full program code:')
print(ast.unparse(tree))
The output shows all comments gone:
== Full program code:
p1v = 4 + 4
p1l = ['a', 'b']
def p1f(x):
"""p1f docstring"""
return x
print(p1f(p1l), p1f(p1v))
However, if we scan the comments with the tokenizer, we can
use this to merge the comments back in:
from io import StringIO
import tokenize
def scan_comments(source):
""" Scan source code file for relevant comments
"""
# Find token for comments
for k,v in tokenize.tok_name.items():
if v == 'COMMENT':
comment = k
break
comtokens = []
with StringIO(source) as f:
tokens = tokenize.generate_tokens(f.readline)
for token in tokens:
if token.type != comment:
continue
comtokens += [token]
return comtokens
comtokens = scan_comments(program)
print('== Comment after p1l[0]\n\t', comtokens[1])
Output (edited to split long line):
== Comment after p1l[0]
TokenInfo(type=60 (COMMENT),
string='# Implicit line joining comment for a lost',
start=(4, 12), end=(4, 54),
line="p1l = ['a', # Implicit line joining comment for a lost\n")
Using a slightly modified version of ast.unparse(), replacing
methods maybe_newline() and traverse() with modified versions,
you should be able to merge back in all comments at their
approximate locations, using the location info from the comment
scanner (start variable), combined with the location info from the
AST; most nodes have a lineno attribute.
Not exactly. See for example the list variable assignment. The
source code is split out over two lines, but ast.unparse()
generates only one line (see output in the second code segment).
Also, you need to ensure to update the location info in the AST
using ast.increment_lineno() after adding code.
It seems some more calls to
maybe_newline() might be needed in the library code (or its
replacement).
I'm working on a project in my work using purely Python 3:
If I take my scanner, (Because I work in inventory) and anything I scan goes into a text doc, and I scan the location "117" , and then I scan any device in any other location, (the proceeding lines in the text doc "100203") and I run the script and it plugs in '117' in the search on our database and changes each of the devices (whether they were assigned to that location or not) into that location, (Validating those devices are in location '117')
My main question is the 3rd objective down from the Objectives list below that doesn't have "Done" after it.
Objective:
Pull strings from a text document, convert it into a dictionary. = (Text_Dictionary) **Done**
Assign the first var in the dictionary to a separate var. = (First_Line) **Done**
All proceeding var's greater then the first var in the dictionary should be passed into a function individually. = (Proceeding_Lines)
Side note: The code should loop in a fashion that should (.pop) the var from the dictionary/list, But I'm open for other alternatives. (Not mandatory)
What I already have is:
Project.py:
1 import re
2 import os
3 import time
4 import sys
5
6 with open(r"C:\Users\...\text_dictionary.txt") as f:
7 Text_Dictionary = [line.rstrip('\n') for line in
8 open(r"C:\Users\...\text_dictionary.txt")]
9
10 Text_Dict = (Text_Dictionary)
11 First_Line = (Text_Dictionary[0])
12
13 print("The first line is: ", First_Line)
14
15 end = (len(Text_Dictionary) + 1)
16 i = (len(Text_Dictionary))
17
What I have isn't much on the surface, but I have another "*.py" file fill of code that I am going to copy in for the action that I wish to preform on each of the vars in the Text_Dictionary.txt. Lines 15 - 16 was me messing with what I thought might solve this.
In the imported text document, the var's look very close to this (Same length)(All digits):
Text_Dictionary.txt:
117
23000
53455
23454
34534
...
Note: These values will change for each time the code is ran, meaning someone will type/scan in these lines of digits each time.
Explained concept:
Ideally, I would like to have the first line point towards a direction, and the rest of the digits would follow; however, each (Example: '53455') needs to be ran separately then the next in line and (Example: '117') would be where '53455' goes. You could say the first line is static throughout the code, unless otherwise changed inText_Dictionary.txt. '117'is ran in conjunction with each iteration proceeding it.
Background:
This is for inventory management for my office, I am in no way payed for doing this, but it would make my job a heck-of-a-lot easier. Also, I know basic python to get myself around, but this kinda stumped me. Thank you to whoever answers!
I've no clue what you're asking, but I'm going to take a guess. Before I do so, your code was annoying me:
with open("file.txt") as f:
product_ids = [line.strip() for line in f if not line.isspace()]
There. That's all you need. It protects against blank lines in the file and weird invisible spaces too, this way. I decided to leave the data as strings because it probably represents an inventory ID, and in the future that might be upgraded to "53455-b.42454#62dkMlwee".
I'm going to hazard a guess that you want to run different code depending on the number at the top. If so, you can use a dictionary containing functions. You said that you wanted to run code from another file, so this is another_file.py:
__all__ = ["dispatch_whatever"]
dispatch_whatever = {}
def descriptive_name_for_117(product_id):
pass
dispatch_whatever["117"] = descriptive_name_for_117
And back in main_program.py, which is stored in the same directory:
from another_file import dispatch_whatever
for product_id in product_ids[1:]:
dispatch_whatever[product_ids[0]](product_id)
I am working on Python in Eclipse + PyDev. When I create a new module from "Module: CLI (argparse)" template, Eclipse creates the following comment block in the beginning of the file.
#!/usr/local/bin/python2.7
# encoding: utf-8
'''
packagename.newfile -- shortdesc
packagename.newfile is a description
It defines classes_and_methods
#author: user_name
#copyright: 2015 organization_name. All rights reserved.
#license: license
#contact: user_email
#deffield updated: Updated
'''
It appears to have some kind of structure, I guess it's parsed by something? I have a few questions:
How is this type of comment structure called and by what programs is it used?
How do I include a multiline apache license 2.0 notice under #license?
What is updated field for?
Most Python documentation in this form is adhering to the Epydoc standard.
http://epydoc.sourceforge.net/manual-fields.html shows details of all of the fields that you've listed above. This file can then be parsed by Epydoc and documentation created from that.
This following example shows that multiline documentation is allowed:
def example():
"""
#param x: This is a description of
the parameter x to a function.
Note that the description is
indented four spaces.
#type x: This is a description of
x's type.
#return: This is a description of
the function's return value.
It contains two paragraphs.
"""
#[...]
This was sourced from http://epydoc.sourceforge.net/epydoc.html#fields
The updated part is showing the ability to add in any #style annotation. See http://epydoc.sourceforge.net/epydoc.html#adding-new-fields.
So #deffield updated: Updated means that there's a new annotation #updated. This would be used as follows
"""
#updated 17/02/2015
"""
This would then be rendered into the HTML created from Epydoc.
I'm fairly new to python, but I'm making a script and I want one of the functions to update a variable from another file. It works, but when I exit the script and reload it, the changes aren't there anymore. For example (this isn't my script):
#File: changeFile.txt
number = 0
#File: changerFile.py
def changeNumber():
number += 1
If I retrieve number during that session, it will return 1, but if I exit out and go back in again and retrieve number without calling changeNumber, it returns 0.
How can I get the script to actually save the number edited in changeNumber to changeFile.txt? As I said, I'm fairly new to python, but I've looked just about everywhere on the Internet and couldn't really find an answer that worked.
EDIT: Sorry, I forgot to include that in the actual script, there are other values.
So I want to change number and have it save without deleting the other 10 values stored in that file.
Assuming, as you show, that changeFile.txt has no other content whatever, then just change the function to:
def changeNumber():
global number # will not possibly work w/o this, the way you posted!
number += 1
with open('changeFile.txt', 'w') as f:
f.write('number = {}\n'.format(number))
ADDED: the OP edited the Q to mention (originally omitted!-) the crucial fact that changefile.txt has other lines that need to be preserved as well as the one that needs to be changed.
That, of course, changes everything -- but, Python can cope!-)
Just add import fileinput at the start of this module, and change the last two lines of the above snippet (starting with with) to:
for line in fileinput.input(['changefile.txt'], inplace=True):
if line.startswith('number ');
line = 'number = {}\n'.format(number)'
print line,
This is the Python 2 solution (the OP didn't bother to tell us if using Py2 or Py3, a crucial bit of info -- hey, who cares about making it easy rather than very hard for willing volunteers to help you, right?!-). If Python 3, change the last statement from print line, to
print(line, end='')
to get exactly the same desired effect.
I am working on a project that uses the Unity3D game engine. For some of the pipeline requirements, it is best to be able to update some files from external tools using Python. Unity's meta and anim files are in YAML so I thought this would be strait forward enough using PyYAML.
The problem is that Unity's format uses custom attributes and I am not sure how to work with them as all the examples show more common tags used by Python and Ruby.
Here is what the top lines of a file look like:
%YAML 1.1
%TAG !u! tag:unity3d.com,2011:
--- !u!74 &7400000
AnimationClip:
m_ObjectHideFlags: 0
m_PrefabParentObject: {fileID: 0}
...
When I try to read the file I get this error:
could not determine a constructor for the tag 'tag:unity3d.com,2011:74'
Now after looking at all the other questions asked, this tag scheme does not seem to resemble those questions and answers. For example this file uses "!u!" which I was unable to figure out what it means or how something similar would behave (my wild uneducated guess says it looks like an alias or namespace).
I can do a hack way and strip the tags out but that is not the ideal way to try to do this. I am looking for help on a solution that will properly handle the tags and allow me to parse & encode the data in a way that preserves the proper format.
Thanks,
-R
I also had this problem, and the internet was not very helpful. After bashing my head against this problem for 3 days, I was able to sort it out...or at least get a working solution. If anyone wants to add more info, please do. But here's what I got.
1) The documentation on Unity's YAML file format(they call it a "textual scene file" because it contains text that is human readable) - http://docs.unity3d.com/Manual/TextualSceneFormat.html
It is a YAML 1.1 compliant format. So you should be able to use PyYAML or any other Python YAML library to load up a YAML object.
Okay, great. But it doesn't work. Every YAML library has issues with this file.
2) The file is not correctly formed. It turns out, the Unity file has some syntactical issues that make YAML parsers error out on it. Specifically:
2a) At the top, it uses a %TAG directive to create an alias for the string "unity3d.com,2011". It looks like:
%TAG !u! tag:unity3d.com,2011:
What this means is anywhere you see "!u!", replace it with "tag:unity3d.com,2011".
2b) Then it goes on to use "!u!" all over the place before each object stream. But the problem is that - to be YAML 1.1 compliant - it should actually declare a tag alias for each stream (any time a new object starts with "--- "). Declaring it once at the top and never again is only valid for the first stream, and the next stream knows nothing about "!u!", so it errors out.
Also, this tag is useless. It basically appends "tag:unity3d.com,2011" to each entry in the stream. Which we don't care about. We already know it's a Unity YAML file. Why clutter the data?
3) The object types are given by Unity's Class ID. Here is the documentation on that:
http://docs.unity3d.com/Manual/ClassIDReference.html
Basically, each stream is defined as a new class of object...corresponding to the IDs in that link. So a "GameObject" is "1", etc. The line looks like this:
--- !u!1 &100000
So the "--- " defines a new stream. The "!u!" is an alias for "tag:unity3d.com,2011" and the "&100000" is the file ID for this object (inside this file, if something references this object, it uses this ID....remember YAML is a node-based representation, so that ID is used to denote a node connection).
The next line is the root of the YAML object, which happens to be the name of the Unity Class...example "GameObject". So it turns out we don't actually need to translate from Class ID to Human Readable node type. It's right there. If you ever need to use it, just take the root node. And if you need to construct a YAML object for Unity, just keep a dictionary around based on that documentation link to translate "GameObject" to "1", etc.
The other problem is that most YAML parsers (PyYAML is the one I tested) only support 3 types of YAML objects out of the box:
Scalar
Sequence
Mapping
You can define/extend custom nodes. But this amounts to hand writing your own YAML parser because you have to define EXPLICITLY how each YAML constructor is created, and outputs. Why would I use a Library like PyYAML, then go ahead and write my own parser to read these custom nodes? The whole point of using a library is to leverage previous work and get all that functionality from day one. I spent 2 days trying to make a new constructor for each class ID in unity. It never worked, and I got into the weeds trying to build the constructors correctly.
THE GOOD NEWS/SOLUTION:
Turns out, all the Unity nodes I've ever run into so far are basic "Mapping" nodes in YAML. So you can throw away the custom node mapping and just let PyYAML auto-detect the node type. From there, everything works great!
In PyYAML, you can pass a file object, or a string. So, my solution was to write a simple 5 line pre-parser to strip out the bits that confuse PyYAML(the bits that Unity incorrectly syntaxed) and feed this new string to PyYAML.
1) Remove line 2 entirely, or just ignore it:
%TAG !u! tag:unity3d.com,2011:
We don't care. We know it's a unity file. And the tag does nothing for us.
2) For each stream declaration, remove the tag alias ("!u!") and remove the class ID. Leave the fileID. Let PyYAML auto-detect the node as a Mapping node.
--- !u!1 &100000
becomes...
--- &100000
3) The rest, output as is.
The code for the pre-parser looks like this:
def removeUnityTagAlias(filepath):
"""
Name: removeUnityTagAlias()
Description: Loads a file object from a Unity textual scene file, which is in a pseudo YAML style, and strips the
parts that are not YAML 1.1 compliant. Then returns a string as a stream, which can be passed to PyYAML.
Essentially removes the "!u!" tag directive, class type and the "&" file ID directive. PyYAML seems to handle
rest just fine after that.
Returns: String (YAML stream as string)
"""
result = str()
sourceFile = open(filepath, 'r')
for lineNumber,line in enumerate( sourceFile.readlines() ):
if line.startswith('--- !u!'):
result += '--- ' + line.split(' ')[2] + '\n' # remove the tag, but keep file ID
else:
# Just copy the contents...
result += line
sourceFile.close()
return result
To create a PyYAML object from a Unity textual scene file, call your pre-parser function on the file:
import yaml
# This fixes Unity's YAML %TAG alias issue.
fileToLoad = '/Users/vlad.dumitrascu/<SOME_PROJECT>/Client/Assets/Gear/MeleeWeapons/SomeAsset_test.prefab'
UnityStreamNoTags = removeUnityTagAlias(fileToLoad)
ListOfNodes = list()
for data in yaml.load_all(UnityStreamNoTags):
ListOfNodes.append( data )
# Example, print each object's name and type
for node in ListOfNodes:
if 'm_Name' in node[ node.keys()[0] ]:
print( 'Name: ' + node[ node.keys()[0] ]['m_Name'] + ' NodeType: ' + node.keys()[0] )
else:
print( 'Name: ' + 'No Name Attribute' + ' NodeType: ' + node.keys()[0] )
Hope that helps!
-Vlad
PS. To Answer the next issue in making this usable:
You also need to walk the entire project directory and parse all ".meta" files for the "GUID", which is Unity's inter-file reference. So, when you see a reference in a Unity YAML file for something like:
m_Materials:
- {fileID: 2100000, guid: 4b191c3a6f88640689fc5ea3ec5bf3a3, type: 2}
That file is somewhere else. And you can re-cursively open that one to find out any dependencies.
I just ripped through the game project and saved a dictionary of GUID:Filepath Key:Value pairs which I can match against.