Find string of method in a file - python

Right now I have this to find a method:
def getMethod(text, a, filetype):
    start = a
    fin = a
    if filetype == "cs":
        for x in range(a, 0, -1):
            if text[x] == "{":
                start = x
                break
        for x in range(a, len(text)):
            if text[x] == "}":
                fin = x
                break
    return text[start:fin + 1]
How can I get the method that the index a is in?
I can't just find the nearest { and }, because constructs like new { } break that approach.
Say I have a file with a few methods and I want to find which method a given index x falls in, and then get the body of that method. For example, given the file:
private string x(){
    return "x";
}
private string b(){
    return "b";
}
private string z(){
    return "z";
}
private string a(){
    var n = new {l = "l"};
    return "a";
}
Suppose I have the index of "a", which let's say is 100. I then want the body of that method, so everything within its { and }. That is:
{
    var n = new {l = "l"};
    return "a";
}
But with what I have now, it would return:
{l = "l"};
return "a";
}

If my interpretation is correct, you are attempting to parse C# source code to find the method that includes a given position a in a .cs file, the content of which is in text.
Unfortunately, if you want a complete and accurate job, I think you will need a full C# parser.
If that sounds like too much work, I'd consider using a version of ctags that supports C# to generate a tags file, then search the tags file for the method that applies to a given source line, instead of scanning the original source file.

As Simon stated, if your problem is to parse source code, your best bet is a proper parser for that language.
If you're just looking to match up the braces, however, there is a well-known algorithm for that: Python parsing bracketed blocks.
Just be aware that source code is a complex beast, so don't expect this to work 100% of the time.
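For the simple case in the question, a depth counter fixes the new { } problem: walking left from the index, a } raises the depth and a { only counts as the method opener when the depth is zero, and symmetrically walking right. This is a sketch only (the function name get_method is mine), not a substitute for a real C# parser; it still has no notion of braces inside strings or comments:

```python
def get_method(text, a):
    """Return the innermost {...} block containing index a.
    Counts nesting depth so inner blocks like `new { }` are skipped."""
    depth = 0
    start = fin = a
    for x in range(a, -1, -1):        # walk left to the enclosing '{'
        if text[x] == "}":
            depth += 1
        elif text[x] == "{":
            if depth == 0:
                start = x
                break
            depth -= 1
    depth = 0
    for x in range(a, len(text)):     # walk right to the matching '}'
        if text[x] == "{":
            depth += 1
        elif text[x] == "}":
            if depth == 0:
                fin = x
                break
            depth -= 1
    return text[start:fin + 1]
```

Given the method a() from the example, an index on the return statement now yields the whole method body rather than the inner {l = "l"} block.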


How can I print 2 or more values from an array?

My code is like this:
main() {
  var courselist = ["dart", "flutter", "swift", "R"];
  print(courselist[0]);
}
But I want to print my output like this: 'dart' and 'R'.
How can I print 2 or more values from the list?
I have tried doing it like this:
print(courselist[0],[3]);
print(courselist[0],courselist[3]);
print(courselist[0] & [3]);
My code:
main() {
  var courselist = ["dart", "flutter", 24, 0.45];
  print(courselist[0]);
}
Error:
Harish#Harish:~/Desktop/dartfiles/Basics$ dart 01var_datatypes.dart
01var_datatypes.dart:26:20: Error: The method '[]' isn't defined for the class 'Object'.
- 'Object' is from 'dart:core'.
Try correcting the name to the name of an existing method, or defining a method named '[]'.
print(courselist[2][0]);
^^
You need to write courselist[i] to print the entry at index i. In Python you can write print(courselist[0], courselist[3]) to print both entries; writing print(courselist[0], [3]) or print(courselist[0] & [3]) will not work. If you have a large number of indices to print, you can avoid repeating the call by iterating over a list of indices. Note that this is Python syntax, not Dart:
vals = [0, 3]
for val in vals:
    print(courselist[val])
If you want to display two variables in Flutter (Dart), as your tag suggests, you can do it as shown below:
print('Something ${variable1} ${variable2}');
You can also add whatever text you want anywhere in the string.
Within a quoted string you can build your output by evaluating expressions and variables: surround expressions with ${expr}; for plain variables, just prefix with $, e.g. $varname.
main() {
  var courselist = ["dart", "flutter", "swift", "R"];
  print("Dart Course: ${courselist[0]}, R Course: ${courselist[3]}");
  num grade = 10;
  print("Amount $grade");
}

Update a value in a JSONObject without pointers in Python

I'm writing Python code that uses JSONObjects to communicate with a Java application. My problem is that I want to change a value in a JSONObject (in this example called py_json), and the nesting depth of that JSONObject is not fixed, but is known.
VarName is the input of the method: VarName[x] is the key at depth x, and the length of VarName is the depth of the JSONObject.
The code below would work, but I can't copy and paste it 100 times to be sure there are no deeper JSONObjects:
if length == 1:
    py_json[VarName[0]] = newValue
elif length == 2:
    py_json[VarName[0]][VarName[1]] = newValue
elif length == 3:
    py_json[VarName[0]][VarName[1]][VarName[2]] = newValue
In C I would solve it with pointers, like this:
int *pointer = NULL;
pointer = &py_json;
for (i = 0; i < length; i++) {
    pointer = &(*pointer[VarName[i]]);
}
*pointer = varValue;
But there are no pointers in Python.
Do you know a way to achieve a dynamic solution in Python?
Python's "variables" are just names bound to objects (instead of symbolic names for memory addresses as in C), and Python assignment doesn't copy a value to a new memory location; it only makes the name point to a different object (you probably want to read up on Python's names/variables for more on this).
In other words, the solution is basically the same as in C: use a for loop to walk down to the desired target (actually, to the parent of the desired target), then assign to it:
target = py_json
for i in range(0, length - 1):
    target = target[VarName[i]]
target[VarName[length - 1]] = newValue
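The same loop can be wrapped as a small runnable helper; the name set_nested and the sample structure below are mine, for illustration only:

```python
def set_nested(obj, path, value):
    """Walk to the parent of the target key, then assign,
    exactly as the loop above does."""
    target = obj
    for key in path[:-1]:
        target = target[key]
    target[path[-1]] = value

py_json = {"config": {"window": {"width": 100}}}  # hypothetical sample
set_nested(py_json, ["config", "window", "width"], 42)
# py_json["config"]["window"]["width"] is now 42
```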

Line-length based custom python JSON encoding for serializables

My problem is similar to Can I implement custom indentation for pretty-printing in Python's JSON module? and How to change json encoding behaviour for serializable python object?, but instead I'd like to collapse lines together whenever the entire JSON-encoded structure fits on a single line, with a configurable line length, in Python 2.X and 3.X. The output is intended as easy-to-read documentation of the JSON structures, rather than for debugging. To clarify: the result MUST be valid JSON, and must allow for the regular JSON encoding features of OrderedDicts/sort_keys, default handlers, and so forth.
The solution from the custom-indentation question does not apply, as the individual structures would need to know their serialized lengths in advance; adding a NoIndent class doesn't help, as any structure might or might not be indented. The solution from the json-serializable question does not apply either, as there aren't any (weird) custom overrides on the data structures; they're just regular lists and dicts.
For example, instead of:
{
  "#context": "http://linked.art/ns/context/1/full.jsonld",
  "id": "http://lod.example.org/museum/ManMadeObject/0",
  "type": "ManMadeObject",
  "classified_as": [
    "aat:300033618",
    "aat:300133025"
  ]
}
I would like to produce:
{
  "#context": "http://linked.art/ns/context/1/full.jsonld",
  "id": "http://lod.example.org/museum/ManMadeObject/0",
  "type": "ManMadeObject",
  "classified_as": ["aat:300033618", "aat:300133025"]
}
This should happen at any level of nesting within the structure, and across any number of levels of nesting, until the line length is reached. Thus if there were a list with a single object inside, with a single key/value pair, it would become:
{
  "#context": "http://linked.art/ns/context/1/full.jsonld",
  "id": "http://lod.example.org/museum/ManMadeObject/0",
  "type": "ManMadeObject",
  "classified_as": [{"id": "aat:300033618"}]
}
It seems like a recursive descent parser on the indented output would work, along the lines of #robm's approach to custom indentation, but the complexity seems to quickly approach that of writing a JSON parser and serializer.
Otherwise it seems like a very custom JSONEncoder is needed.
Your thoughts appreciated!
Very inefficient, but it seems to work so far:
import re

def _collapse_json(text, collapse):
    js_indent = 2
    lines = text.splitlines()
    out = [lines[0]]
    while lines:
        l = lines.pop(0)
        indent = len(re.split('\S', l, 1)[0])
        if indent and l.rstrip()[-1] in ['[', '{']:
            curr = indent
            temp = []
            stemp = []
            while lines and curr <= indent:
                if temp and curr == indent:
                    break
                temp.append(l[curr:])
                stemp.append(l.strip())
                l = lines.pop(0)
                indent = len(re.split('\S', l, 1)[0])
            temp.append(l[curr:])
            stemp.append(l.lstrip())
            short = " " * curr + ''.join(stemp)
            if len(short) < collapse:
                out.append(short)
            else:
                ntext = '\n'.join(temp)
                nout = _collapse_json(ntext, collapse)
                for no in nout:
                    out.append(" " * curr + no)
                l = lines.pop(0)
        elif indent:
            out.append(l)
    out.append(l)
    return out

def collapse_json(text, collapse):
    return '\n'.join(_collapse_json(text, collapse))
Happy to accept something else that produces the same output without crawling up and down constantly!
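A different angle on the same goal, for comparison: instead of post-processing the indented output, build the string top-down, keeping json.dumps's one-line rendering of any substructure whenever it fits the width. This is a sketch with a name of my choosing (dumps_compact); unlike the stated requirements, it does not pass through sort_keys or default handlers:

```python
import json

def dumps_compact(obj, width=80, indent=2, _level=0):
    """Render obj as JSON, collapsing any substructure whose one-line
    rendering fits within width (accounting for current indentation)."""
    flat = json.dumps(obj)
    if len(flat) + indent * _level <= width:
        return flat
    pad = " " * (indent * (_level + 1))
    close = " " * (indent * _level)
    if isinstance(obj, dict):
        items = ["%s: %s" % (json.dumps(k),
                             dumps_compact(v, width, indent, _level + 1))
                 for k, v in obj.items()]
        return "{\n" + ",\n".join(pad + it for it in items) + "\n" + close + "}"
    if isinstance(obj, list):
        items = [dumps_compact(v, width, indent, _level + 1) for v in obj]
        return "[\n" + ",\n".join(pad + it for it in items) + "\n" + close + "]"
    return flat  # scalars too long to fit still go on one line
```

Because every node is serialized by json.dumps itself, the output stays valid JSON at every collapse decision.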

Is there any way to fetch all field names of a collection in MongoDB?

Is there any way to fetch the fields (column names) of a collection in MongoDB, like we have in MySQL:
SHOW columns FROM table_name
Or any way to check whether a particular field exists in a collection?
First question: no, as each doc in a collection is independent. Second question: yes (using $exists):
# Get the count of docs that contain field 'fieldname'
db.coll.find({'fieldname': {'$exists': 1}}).count()
In MongoDB every document can contain a different number of fields, as well as different field names. That's why such a command is not in the Mongo APIs: in MySQL you can do it because every row has the same number of columns, with the same names. In MongoDB you cannot make this assumption. What you can do is check whether a field is present in a retrieved document simply by:
if field_name in doc:
    # Do stuff.
where field_name is the "column" name whose existence you want to check and doc is the current document pointed to by the cursor. Remember that doc is a dict, so you can treat it as you would any other dict in Python.
Since each document is separate from the others, there is no easy way to do this. However, if you wish to understand what you have in your collection, you can use Variety, as described here:
http://blog.mongodb.org/post/21923016898/meet-variety-a-schema-analyzer-for-mongodb
It basically map-reduces your collection to find out what fields you have in it.
As @JohnnyHK said, you can check the existence of a field by using $exists: http://docs.mongodb.org/manual/reference/operator/exists/
This is not best practice, but in the shell you can type:
Object.keys(db.posts.findOne())
Note: this doesn't show you the inner keys of nested objects; you can use map-reduce to get those, but if your documents are simple this pretty much does the task.
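The same union-of-keys idea can be done client-side in Python by iterating a cursor and collecting doc.keys(). A sketch: the list of dicts below stands in for a real db.coll.find() cursor, and like the map-reduce approach this has to touch every document:

```python
docs = [  # stand-in for a pymongo cursor, e.g. db.coll.find()
    {"_id": 1, "name": "a", "tags": ["x"]},
    {"_id": 2, "name": "b", "age": 30},
]

field_names = set()
for doc in docs:
    field_names.update(doc.keys())  # top-level field names only

# field_names == {"_id", "name", "tags", "age"}
```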
I faced the same problem when I was dealing with some third-party heterogeneous data, and I solved it using map-reduce on the entire collection. Here is the JS code I used, in case you find it useful:
function MapKeys() {
    // parms is passed in via the mapReduce 'scope' option
    var tmp, tmpEmpty, ChildObjTp, ChildIsAr;
    var levelCurrent = 0;
    var record = this;
    function isArray(obj) { return typeof(obj) == 'object' && (obj instanceof Array); }
    //function emptyIf(obj) { if (obj == 'tojson') { return ' '; } else { return obj + ' '; } } //#note date fields return .tojson so strip it
    function emptyIf(obj) { if (typeof(this[obj]) == 'function') { return ' '; } else { return obj + ' '; } } //#note date fields return .tojson so strip it
    //var toType = function(obj) { // * http://javascriptweblog.wordpress.com/2011/08/08/fixing-the-javascript-typeof-operator/
    //    return ({}).toString.call(obj).match(/\s([a-zA-Z]+)/)[1].toLowerCase()
    //}
    function keysToArray(obj, propArr, levelMax, _level) {
        /** example: r1 = keysToArray(doc, [null, []], 2, 0)
            _level is used for recursion and should always be called with 0;
            if levelMax is negative, returns the maximum level;
            levelMax = 0 means top level only, 2 up to the 2nd depth level, etc.
        */
        for (var key in obj) {
            if (obj.hasOwnProperty(key)) {
                if (obj[key] instanceof Object && !(obj[key] instanceof Array)) {
                    if (levelMax < 0 || _level + 1 <= levelMax) {
                        propArr[1].push(keysToArray(obj[key], [key, []], levelMax, _level + 1));
                    }
                }
                propArr[1].push(key);
            }
        }
        return propArr;
    }
    //----------------------------------------------------------------------------------------------
    function arrayToStr(lst, prevKey, delimiter, inclKeys, levelMax, _level, _levelMaxFound) {
        /** example: r2 = arrayToStr(r1, '', '|', true, 2, 0, 0)
            _level and _levelMaxFound are used for recursion and should always be called with 0;
            if levelMax is negative, returns the maximum level;
            levelMax = 0 means top level only, 2 up to the 2nd depth level, etc.
        */
        var rt, i;
        _levelMaxFound = Math.max(_level, _levelMaxFound);
        if (prevKey !== '') { prevKey += '.'; }
        var rtStr = '';
        if (lst[0]) { prevKey += lst[0] + '.'; }
        if (inclKeys) { rtStr += prevKey.slice(0, -1); }
        for (var n in lst[1]) {
            i = lst[1][n];
            if (typeof(i) == 'string') {
                rtStr += delimiter + prevKey + i;
            } else {
                if (levelMax < 0 || _level + 1 <= levelMax) {
                    rt = arrayToStr(i, prevKey.slice(0, -1), delimiter, inclKeys, levelMax, _level + 1, _levelMaxFound);
                    rtStr += delimiter + rt[0];
                    _levelMaxFound = Math.max(rt[1], _levelMaxFound);
                }
            }
        }
        if (rtStr[0] == delimiter) { rtStr = rtStr.slice(1); } // lstrip delimiters if any
        return [rtStr, _levelMaxFound];
    }
    //----------------------------------------------------------------------------------------------
    var keysV = keysToArray(this, [null, []], parms.levelMax, 0); // we can't sort here because the array is nested
    keysV = arrayToStr(keysV, '', ' ', parms.inclHeaderKeys, -1, 0, 0);
    var MaxDepth = keysV[1];
    keysV = keysV[0].split(' '); // so we can sort
    keysV.sort(); // sort to make sure identical records map to the same id
    keysV = keysV.join(' ');
    emit({type: 'fieldsGrp', fields: keysV}, {cnt: 1, percent: 0.0, depth: MaxDepth, exampleIds: [this._id]});
}

function ReduceKeys(key, values) {
    var total = {cnt: 0, percent: 0.0, depth: values[0].depth, exampleIds: []};
    for (var i in values) {
        total.cnt += values[i].cnt;
        if (total.exampleIds.length < parms.Reduce_ExamplesMax) {
            total.exampleIds = values[i].exampleIds.concat(total.exampleIds);
        }
    }
    return total;
}

Parsing an existing config file

I have a config file that is in the following form:
protocol sample_thread {
    { AUTOSTART 0 }
    { BITMAP thread.gif }
    { COORDS {0 0} }
    { DATAFORMAT {
        { TYPE hl7 }
        { PREPROCS {
            { ARGS {{}} }
            { PROCS sample_proc }
        } }
    } }
}
The real file may not have these exact fields, and I'd rather not have to describe the structure of the data to the parser before it parses.
I've looked at other configuration file parsers, but none that I've found seems able to accept a file with this syntax.
I'm looking for a module that can parse a file like this; any suggestions?
If anyone is curious, the file in question was generated by Quovadx Cloverleaf.
pyparsing is pretty handy for quick and simple parsing like this. A bare minimum would be something like:
import pyparsing

string = pyparsing.CharsNotIn("{} \t\r\n")
group = pyparsing.Forward()
group << (pyparsing.Group(pyparsing.Literal("{").suppress() +
                          pyparsing.ZeroOrMore(group) +
                          pyparsing.Literal("}").suppress())
          | string)
toplevel = pyparsing.OneOrMore(group)
Then use it as:
>>> toplevel.parseString(text)
['protocol', 'sample_thread', [['AUTOSTART', '0'], ['BITMAP', 'thread.gif'],
['COORDS', ['0', '0']], ['DATAFORMAT', [['TYPE', 'hl7'], ['PREPROCS',
[['ARGS', [[]]], ['PROCS', 'sample_proc']]]]]]]
From there you can get as sophisticated as you want (parse numbers separately from strings, look for specific field names, etc.). The above is pretty general: it just looks for strings (defined as any run of non-whitespace characters other than "{" and "}") and {}-delimited lists of strings.
Taking Brian's pyparsing solution another step, you can create a quasi-deserializer for this format by using the Dict class:
import pyparsing

string = pyparsing.CharsNotIn("{} \t\r\n")

# use Word instead of CharsNotIn, to do whitespace skipping
stringchars = pyparsing.printables.replace("{", "").replace("}", "")
string = pyparsing.Word(stringchars)

# define a simple integer, plus an auto-converting parse action
integer = pyparsing.Word("0123456789").setParseAction(lambda t: int(t[0]))

group = pyparsing.Forward()
group << (pyparsing.Group(pyparsing.Literal("{").suppress() +
                          pyparsing.ZeroOrMore(group) +
                          pyparsing.Literal("}").suppress())
          | integer | string)
toplevel = pyparsing.OneOrMore(group)
sample = """
protocol sample_thread {
    { AUTOSTART 0 }
    { BITMAP thread.gif }
    { COORDS {0 0} }
    { DATAFORMAT {
        { TYPE hl7 }
        { PREPROCS {
            { ARGS {{}} }
            { PROCS sample_proc }
        } }
    } }
}
"""
print toplevel.parseString(sample).asList()

# Now define something a little more meaningful for a protocol structure,
# and use Dict to auto-assign results names
LBRACE, RBRACE = map(pyparsing.Suppress, "{}")
protocol = (pyparsing.Keyword("protocol") +
            string("name") +
            LBRACE +
            pyparsing.Dict(pyparsing.OneOrMore(
                pyparsing.Group(LBRACE + string + group + RBRACE)
            ))("parameters") +
            RBRACE)

results = protocol.parseString(sample)
print results.name
print results.parameters.BITMAP
print results.parameters.keys()
print results.dump()
Prints
['protocol', 'sample_thread', [['AUTOSTART', 0], ['BITMAP', 'thread.gif'], ['COORDS',
[0, 0]], ['DATAFORMAT', [['TYPE', 'hl7'], ['PREPROCS', [['ARGS', [[]]], ['PROCS', 'sample_proc']]]]]]]
sample_thread
thread.gif
['DATAFORMAT', 'COORDS', 'AUTOSTART', 'BITMAP']
['protocol', 'sample_thread', [['AUTOSTART', 0], ['BITMAP', 'thread.gif'], ['COORDS', [0, 0]], ['DATAFORMAT', [['TYPE', 'hl7'], ['PREPROCS', [['ARGS', [[]]], ['PROCS', 'sample_proc']]]]]]]
- name: sample_thread
- parameters: [['AUTOSTART', 0], ['BITMAP', 'thread.gif'], ['COORDS', [0, 0]], ['DATAFORMAT', [['TYPE', 'hl7'], ['PREPROCS', [['ARGS', [[]]], ['PROCS', 'sample_proc']]]]]]
  - AUTOSTART: 0
  - BITMAP: thread.gif
  - COORDS: [0, 0]
  - DATAFORMAT: [['TYPE', 'hl7'], ['PREPROCS', [['ARGS', [[]]], ['PROCS', 'sample_proc']]]]
I think you will get further faster with pyparsing.
-- Paul
I'll try to answer what I think are the missing question(s)...
Configuration files come in many formats. There are well-known formats such as *.ini or Apache config; these tend to have many parsers available.
Then there are custom formats, which is what yours appears to be (it could be some well-defined format neither of us has seen before, but until you know what that is, it doesn't really matter).
I would start with the software this came from and see if it has a programming API that can load/produce these files. If nothing is obvious, give Quovadx a call; chances are someone has already solved this problem.
Otherwise you're probably on your own to create your own parser.
Writing a parser for this format would not be terribly difficult assuming that your sample is representative of a complete example. It's a hierarchy of values where each node can contain either a value or a child hierarchy of values. Once you've defined the basic types that the values can contain the parser is a very simple structure.
You could write this reasonably quickly using something like Lex/Flex or just a straight-forward parser in the language of your choosing.
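To illustrate how small such a parser can be, here is a hand-rolled recursive parser sketch in Python. The name parse_config and the tokenization are my own choices, with the same caveat as above: real files may have cases this misses (e.g. braces inside quoted values):

```python
import re

def parse_config(text):
    """Tokenize into braces and runs of non-brace, non-whitespace
    characters, then build nested lists recursively."""
    tokens = iter(re.findall(r"[{}]|[^{}\s]+", text))

    def parse_group():
        items = []
        for tok in tokens:
            if tok == "{":
                items.append(parse_group())  # descend into a nested block
            elif tok == "}":
                return items                 # close the current block
            else:
                items.append(tok)
        return items

    return parse_group()
```

On the question's sample this produces the same nested-list shape as the pyparsing answers above.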
You can easily write a script in Python to convert it to a Python dict. The format looks almost like hierarchical name-value pairs; the only problem seems to be entries like
COORDS {0 0}, where {0 0} isn't a name-value pair but a list,
so who knows what other such cases are in the format.
I think your best bet is to get a spec for the format and write a simple Python script to read it.
Your config file is very similar to JSON (pretty much, replace all your "{" and "}" with "[" and "]"). Most languages have a built-in JSON parser (PHP, Ruby, Python, etc.), and if not, there are libraries to handle it for you.
If you cannot change the format of the configuration file, you can read the file contents into a string, replace all the "{" and "}" characters by whatever means you prefer, then parse the string as JSON, and you're set.
I searched a little on the Cheese Shop, but I didn't find anything helpful for your example. Check the Examples page, and this specific parser (its syntax resembles yours a bit). I think this should help you write your own.
Look into Lex and Yacc. There's a bit of a learning curve, but they can generate parsers for any language.
Maybe you could write a simple script to convert your config into an XML file and then read it using lxml or Beautiful Soup. Your converter could use pyparsing or regular expressions, for example.
