Visualize Python function flow (e.g. as tree or concept map) - python

I have taught myself python in quite a haphazard way. So my question perhaps won't be very pythonic.
When I write functions in classes, I often lose overview of what each function does. I do try to name them properly. But still, they are sometimes smaller parts of code where it seems arbitrary in which functions to put them. So whenever I want to make changes, I still need to go through the entire code in order to figure out how my functions actually flow.
From what I understand, we have objects and we have functions, and these are our units for structuring our code. But this only gives you a flat structure. It doesn't allow you to do any kind of nesting, like in a tree diagram, over multiple levels. In particular, the code in my file doesn't automatically order itself so that the functions that are called first and more often are further up at the top, whereas helper functions would be further down in the document, or even nested.
In fact, even being able to visually nest lower-order subroutines "inside" a higher-order function that calls it would seem helpful. But it's not something that would be supported by Python's syntax. (Plus, it wouldn't quite suffice, because a sub-routine might be used by several higher-order functions.)
I imagine it would be useful to see all functions in my code visualized as a tree, or as a concept map:
Where each function is a dot. And calling order visualized by arrows. This way, I would also easily see which functions are more central, and which are more outliers, or even orphaned.
Yet perhaps this isn't even a case for another tool. Perhaps this is more a case of me not understanding proper coding. Perhaps there is something I can do differently in order to get the kind of intuitive overview over how my program works, without needing another tool.

Firstly, I am not quite sure why this isn't asked more often. Reading code is not intuitive at all! We should be able to visualize the evolution of a process or function so well that almost everyone can understand its behavior. In the 1960s or so, people had to be reasonably sure their programs would run, because getting access to the computer took time; today we execute or compile our program, run tests if we have them, and learn immediately whether it works. As a result there is less mental effort now: we execute a bit less code in our heads, and the computer a bit more. But we must still think about how the program behaves during execution in order to debug. For the future, it would be nice if the computer could just tell us how the program behaves.
You propose looking at a sort of tree of the program as a solution, and after all, the abstract syntax tree is literally a tree, but I don't think this is what we ought to spend our efforts on when it comes to visualizing systems. What would be preferable is an interactive view of how the program changes its intermediate data structures as a function of time.
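That said, the kind of overview the question asks for can be roughly approximated today with Python's standard ast module. The following is only a sketch: call_graph and SAMPLE are names made up for this illustration, calls are matched by name alone (methods on different classes with the same name are not told apart), and nothing is drawn; the result is just a mapping that a graph tool could render.

```python
import ast
from collections import defaultdict


def call_graph(source):
    """Map each function defined in `source` to the names it calls directly."""
    tree = ast.parse(source)
    graph = defaultdict(set)
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            graph[node.name]  # make sure functions without calls still appear
            for child in ast.walk(node):
                if isinstance(child, ast.Call):
                    func = child.func
                    if isinstance(func, ast.Name):         # plain call: helper()
                        graph[node.name].add(func.id)
                    elif isinstance(func, ast.Attribute):  # method call: self.helper()
                        graph[node.name].add(func.attr)
    return dict(graph)


# Hypothetical module text, used only to demonstrate the idea.
SAMPLE = '''
def main():
    data = load()
    report(clean(data))

def load():
    return [1, 2, None]

def clean(data):
    return [x for x in data if x is not None]

def report(data):
    print(len(data))

def orphan():
    pass
'''

graph = call_graph(SAMPLE)
called = set().union(*graph.values())
# Functions that nothing else in the file calls: entry points and orphans.
never_called = sorted(set(graph) - called)
```

Here never_called contains both main (the entry point) and orphan, which is the "outlier or orphaned" information the question was after.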
Currently, we have debuggers, but that's akin to examining a function by asking for its value at many points when we would much rather look at its graph. A lot of programming is done by doing something you feel is right, observing whether the behavior is correct, and, if it is not, reacting and making modifications to correct it.
Bret Victor discusses this topic in his essay, Learnable Programming. I highly recommend it. Even though it won't help you right now, maybe you can help others in the future by making these ideas more prevalent.
Onwards, then to where I tell you what you can do right now. In his book Clean Code, Robert C. Martin proposes structuring code much like how a newspaper is laid out.
Think of a well-written newspaper article. You read it vertically. At the top you expect a headline that will tell you what the story is about and allows you to decide whether it is something you want to read. The first paragraph gives you a synopsis of the whole story, hiding all the details while giving you the broad-brush concepts. As you continue downward, the details increase until you have all the dates, names, quotes, claims, and other minutia.
What is proposed, is to organize your program top-down, with higher level procedures that call mid-level procedures, which in turn call the lower level procedures. At any place, it should be obvious that (1) you are at the appropriate level of abstraction, and (2) you are looking at the part of the program implementing the behavior you seek to modify.
This means storing state at the level where it belongs, and not exposing it anywhere else. Procedures should take only the parameters they need, because more parameters means you must also think about more parameters when reasoning about the code.
This is the primary reason for decomposing procedures into smaller procedures. For example, here is some code I've written previously. It's not perfect, but you can see very clearly which part of the program you need to go to if you want to change anything.
Of course, higher order procedures are listed before any other. I'm telling you what I'm going to do, before I show you how I do it.
function formatLinks(matches, stripped) {
    let formatted_links = []
    for (const match of matches) {
        let diff = difference(match, stripped)
        if (isSimpleLink(diff)) {
            formatted_links.push(formatAsSimpleLink(diff))
        } else if (hasPrefix(diff)) {
            formatted_links.push(formatAsPrefixedLink(diff))
        } else if (hasSuffix(diff)) {
            formatted_links.push(formatAsSuffixedLink(diff))
        }
    }
    // There might be multiple links within a word
    // in which case we join them and strip any duplicated parts
    // (which are otherwise interpreted as suffixes)
    if (formatted_links.length > 1) {
        return combineLinks(formatted_links)
    }
    // Default case
    return formatted_links[0]
}
If JavaScript were a typed language, and if we could see an image of the decisions made in the code as a function of input and time, this could be even better.
I think Quokka.js and VS Code Debug Visualizer are both doing interesting work in this sector.


What are states vs stateless properties and advantages? [duplicate]

This question already has answers here:
Advantages of stateless programming?
(10 answers)
Closed 3 years ago.
I have read from various sources as a state being defined as something along the lines of:
'the program's condition regarding stored inputs';
'contents of memory locations at any given time in the program's
execution'
But then I look up what the characteristic of being stateless is (e.g "Haskell is stateless"):
'when an application isn't dependent on its state';
'physical state can't be changed'
'Same output for same input - address in memory always stays the same'
'methods don't depend on an instance and its corresponding instance
variables'
Now, I must have misunderstood the (vague?) former definition, as surely FP languages, which go hand-in-hand with the 'stateless' model, store inputs too? Or is this something about functions simply evaluating rather than mutating data?
Well, I sort of get that such models are sometimes powerful - after reading about its use in program verification, debugging and concurrency.
But it did get fairly complicated when I then read on about how:
"it eliminates a whole class of multithreading bugs related to race
conditions"
more expressive code (whatever that is?)
"static evaluation ... can be used to favourably guide computer's
positions [in a tic-tac-toe game tree]" https://www.cs.kent.ac.uk/people/staff/dat/miranda/whyfp90.pdf (or perhaps static typing is for another question entirely?)
So I also wondered about the advantages of being able to manipulate states in imperative programming languages, and many forums gave examples like how altering one's 'age' via calling add() will mutate the 'age' variable outside of its scope.
Maybe it's my lack of experience with OOP, but what are the exact advantages of using states in wider applications?
If any example code could be given, please could you try and stick to Python/Haskell as the contrasting representatives of these opposite disciplines? My inexpertise in reading other languages does seem to hinder my understanding of other posts' explanations.
If you consider a program where there is no state whatsoever, and the program is not able to read external states (for example, checking the system time), that would entail that any function, besides what values may be passed as parameters, would have no way of differentiating any different circumstances because no different circumstances can exist.
With this class of program, the run-time is effectively redundant as an optimizing compiler could derive any resulting outputs statically. It's essentially equivalent to writing out a large math equation. You can solve it, but the idea of "running the program" is redundant because it has an output inherent to the program itself that cannot change.
Obviously this doesn't resemble most programs, even ones that may be described as "stateless". Usually there is some minimal kind of state, such as input initially passed to the program. For example, imagine a program that outputs the first N digits of the square root of a number K. This program now has an initial state, but beyond knowing what N and K are, the program doesn't need to track program-level states like what day of the week it is, whether or not the user prefers MDY vs DMY date format, etc. However, since the program almost certainly involves some kind of dynamic looping to find N digits, it would need to have some kind of state associated with the loop (for example an iteration number).
So when code is referred to as "stateless" it's not a 100% promise, but a kind of qualifier of the degree to which the code depends on state.
So what are the advantages and disadvantages here? The more your code depends on state, the more likely it is to do something that the programmer wasn't expecting. Remember, state is something inherent to the runtime. But we don't write code at runtime. We can try to imagine all the different possible runtime states that could happen, but this quickly gets out of hand.
Examples
Here's an example of the same Python program written in a stateful and in a stateless way.
stateful.py
import math

angle_type = 'radians'

def cosecant(x):
    if angle_type == 'radians':
        return 1/math.sin(x)
    elif angle_type == 'degrees':
        return 1/math.sin(x*math.pi/180)
    else:
        raise NotImplementedError('cosecant is not implemented for angle type', angle_type)
================== RESTART: F:/Documents/Python/stateful.py ==================
>>> cosecant(1)
1.1883951057781212
>>> angle_type = 'degrees'
>>> cosecant(1)
57.298688498550185
>>> angle_type = 'turns'
>>> cosecant(1)
Traceback (most recent call last):
  File "<pyshell#16>", line 1, in <module>
    cosecant(1)
  File "F:/Documents/Python/stateful.py", line 11, in cosecant
    raise NotImplementedError('cosecant is not implemented for angle type', angle_type)
NotImplementedError: ('cosecant is not implemented for angle type', 'turns')
>>>
stateless.py
import math

def cosecant(x):
    return 1/math.sin(x)

def cosecant_degrees(x):
    return 1/math.sin(x*math.pi/180)
================= RESTART: F:/Documents/Python/stateless.py =================
>>> cosecant(1)
1.1883951057781212
>>> cosecant_degrees(1)
57.298688498550185
>>> cosecant_turns(1)
Traceback (most recent call last):
  File "<pyshell#19>", line 1, in <module>
    cosecant_turns(1)
NameError: name 'cosecant_turns' is not defined
>>>
This example may seem odd, but the same principles and pitfalls extend to more complex programs. We can see how using state here can lead to problems. Consider stateful.py. One problem is that the person calling cosecant doesn't necessarily know the current state of angle_type; they would have to set it every time before using cosecant to make sure that it will do what they want it to do. Since programmers tend to not like repeating themselves, someone may declare mymathlib.angle_type = 'degrees' at the start of their program, and assume nothing else will ever change it. But what if you are working in degrees, and then you call a subroutine that changes angle_type to 'radians' and doesn't change it back to 'degrees' upon finishing?
If other parts of the program are changing it, then the programmer effectively has to set the value to what they want every time before calling it. And even then, if code is running on multiple threads, there is no guarantee that when you set angle_type to 'degrees', another thread didn't immediately set it to something else before your cosecant call gets executed (a race condition).
In our stateless version of the program, all of these problems disappear. What's the cost? Well now we have 2 different functions instead of 1. Why is this a cost? Well, generally, it is considered good practice to keep an API smaller rather than larger, having multiple tools that do the same thing is confusing for people trying to use your library. Because of this principle, it is sometimes tempting to smoosh 2 or more different functions into 1 if they seem to be doing the same thing. In general this isn't terrible, in fact it often helps a library in terms of maintenance and usability. In this case though, it has caused more harm than good because in the stateful program, while the programmer can effectively use it to do the same thing, we no longer have the simple guarantee that the function will do what you think it will do every time.
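One way to keep a single function without bringing back the hidden state is to pass the unit explicitly. This is a sketch of a third variant that is not in the two files above; the unit parameter and its default are assumptions made for illustration.

```python
import math


def cosecant(x, unit="radians"):
    """Cosecant of x; the angle unit is an explicit argument, not a global."""
    if unit == "radians":
        return 1 / math.sin(x)
    if unit == "degrees":
        return 1 / math.sin(math.radians(x))
    raise ValueError(f"unsupported angle unit: {unit!r}")


in_radians = cosecant(1)
in_degrees = cosecant(1, unit="degrees")
```

Every call now states what it means, so no other part of the program (and no other thread) can change the outcome of a given call.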
This may on the surface seem like a silly example, but consider that Python's decimal module does nearly the same thing. In the decimal module, state variables, including the number of significant digits of precision and the rounding rules, are stored in the current thread's context. This isn't quite as problematic as the previous example. Since each thread has its own context, we don't have to worry about the race condition, but there are still potential trouble spots, like a subroutine that changes state without nicely changing it back for you.
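To make that trouble spot concrete, here is a small sketch using the real decimal.getcontext() and decimal.localcontext() calls. The two function names are hypothetical; the first changes the thread's context and leaves it changed, the second confines the change to a with block.

```python
import decimal
from decimal import Decimal


def leaky_third():
    # Changes the current thread's context and never restores it.
    decimal.getcontext().prec = 5
    return Decimal(1) / Decimal(3)


def tidy_third():
    # localcontext() works on a temporary copy that is discarded on exit.
    with decimal.localcontext() as ctx:
        ctx.prec = 5
        return Decimal(1) / Decimal(3)


decimal.getcontext().prec = 28

tidy = tidy_third()
prec_after_tidy = decimal.getcontext().prec    # the caller's precision is untouched

leaky = leaky_third()
prec_after_leaky = decimal.getcontext().prec   # the caller's precision was changed
```

Both functions return the same value, but only the second one leaves the caller's context as it found it.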
It's easy to rationalize this kind of design and say "a competent programmer should be able to analyze the code and be able to avoid these kinds of state-related problems if they take the time to think about it". This is true in theory, but if we look at how people actually write code, the principle of least effort trumps everything else most of the time. Look at examples of use of the decimal module in the official documentation, or any tutorial, and you will notice a common pattern. The references to decimal.getcontext() outnumber the references to decimal.setcontext() 10 to 1 -- that is, if setcontext is mentioned at all; often it is not. In order to manage state in a competent "safety first" kind of way, both of these tools are equally important, and if anything decimal.setcontext() should be used more often since it is what you use to guarantee consistent behavior. Because of this least-effort principle, it is inevitable that programmers, especially beginners, will write code like this that may work in testing, but has no hard guarantee of future safety as the program evolves.
Conclusion
So is stateful code evil, and stateless code is our savior? Well maybe. It could help to think of this little epigram as a guiding principle to avoid certain pitfalls. The reality is that often stateful code is hard to avoid, or even inherent to the program's design. For example, how could we make a video game without state? How do we make a health counter? How does the player move around the level and track its position? Can we have scores or win conditions? Stateful code is not going anywhere, but it does genuinely help to have an understanding of the common pitfalls of designing programs in that way.

Difference between Python methods that create a new variable and those that don't

Some methods don't need to make a new variable, i.e. lists.reverse() works like this:
lists = [123, 456, 789]
lists.reverse()
print(lists)
This method reverses the list itself (without a new variable).
Why are there various ways to produce a variable in Python?
Some calls can be chained, like variable.method().method2().method3(), but type(variable) and print() cannot. Why can't we write variable.print() or variable.type()?
Are there any philosophical reasons for this in Python?
You may be confused by the difference between a function and a method, and by three different purposes to them. As much as I dislike using SO for tutorial purposes, these issues can be hard to grasp from other documentation. You can look up function vs method easily enough -- once you know it's a (slightly) separate issue.
Your first question is a matter of system design. Python merely facilitates what programmers want to do, and the differentiation is common to many (most?) programming languages since ASM and FORTRAN crawled out of the binary slime pools in the days when dinosaurs roamed the earth.
When you design how your application works, you need to make a lot of implementation decisions: individual variables vs a sequence, in-line coding vs functions, separate functions vs encased functions vs classes and methods, etc. Part of this decision making is what each function should do. You've raised three main types:
(1) Process this data -- take the given data and change it, rearrange it, whatever needs doing -- but I don't need the previous version, just the improved version, so just put the new stuff where the old stuff was. This is used almost exclusively when one variable is getting processed; we don't generally take four separate variables and change each of them. In that case, we'd put them all in a list and change the list (a single variable). reverse falls into this class.
One important note is that for such a function, the argument in question must be mutable (capable of change). Python has mutable and immutable types. For instance, a list is mutable; a tuple is immutable. If you wanted to reverse a tuple, you'd need to return a new tuple; you can't change the original.
(2) Tell me something interesting -- take the given data and extract some information. However, I'm going to need the originals, so leave them alone. If I need to remember this cool new insight, I'll put it in a variable of my own. This is a function that returns a value. sqrt is one such function.
(3) Interact with the outside world -- input or output data permanently. For output, nothing in the program changes; we may present the data in an easy-to-read format, but we don't change anything internally. print is such a function.
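The three kinds can be seen side by side in a few lines. This is only an illustrative sketch; the variable names are made up.

```python
import math

readings = [9.0, 4.0, 16.0]

# (1) Process in place: the list itself changes and the method returns None.
outcome = readings.reverse()

# (2) Tell me something: the argument is left alone and a new value comes back.
root = math.sqrt(readings[0])

# (3) Interact with the outside world: nothing inside the program changes.
print(readings, root)

# A tuple is immutable, so "reversing" it has to build a new tuple.
point = (1, 2, 3)
flipped = point[::-1]
```

Because reverse() returns None, a chain such as readings.reverse().append(1) fails, whereas methods that return a new value can be chained freely.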
Much of this decision also depends on the function's designed purpose: is this a "verb" function (do something) or a noun/attribute function (look at this data and tell me what you see)?
Now you get the interesting job for yourself: learn the art of system design. You need to become familiar enough with the available programming tools that you have a feeling for how they can be combined to form useful applications.
See the documentation:
The reverse() method modifies the sequence in place for economy of space when reversing a large sequence. To remind users that it operates by side effect, it does not return the reversed sequence.

Syntax recognizer in python

I need a module or strategy for detecting that a piece of data is written in a programming language, not syntax highlighting where the user specifically chooses a syntax to highlight. My question has two levels, I would greatly appreciate any help, so:
Is there any package in python that receives a string(piece of data) and returns if it belongs to any programming language syntax ?
I don't necessarily need to recognize the syntax, but know if the string is source code or not at all.
Any clues are deeply appreciated.
Maybe you can use existing multi-language syntax highlighters. Many of them can detect language a file is written in.
You could have a look at methods around Bayesian filtering.
My answer somewhat depends on the amount of code you're going to be given. If you're going to be given 30+ lines of code, it should be fairly easy to identify some unique features of each language that are fairly common. For example, tell the program that if anything matches an expression like from * import * then it's Python (I'm not 100% sure that phrasing is unique to Python, but you get the gist). Other things you could look at that are usually slightly different would be class definition (i.e. Python always starts with 'class', C will start with a definition of the return so you could check to see if there is a line that starts with a data type and has the formatting of a method declaration), conditionals are usually formatted slightly differently, etc, etc. If you wanted to make it more accurate, you could introduce some sort of weighting system, features that are more unique and less likely to be the result of a mismatched regexp get a higher weight, things that are commonly mismatched get a lower weight for the language, and just calculate which language has the highest composite score at the end. You could also define features that you feel are 100% unique, and tell it that as soon as it hits one of those, to stop parsing because it knows the answer (things like the shebang line).
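The weighting idea could be sketched roughly as follows. Everything here is an assumption made for illustration: the two languages, the regular expressions, the weights and the threshold are arbitrary and would need tuning against real samples.

```python
import re

# Hypothetical features: (pattern, weight). More distinctive patterns weigh more.
FEATURES = {
    "python": [
        (re.compile(r"^[ \t]*from\s+[\w.]+\s+import\s+", re.M), 3),
        (re.compile(r"^[ \t]*def\s+\w+\s*\(.*\)\s*:", re.M), 3),
        (re.compile(r"^[ \t]*class\s+\w+.*:", re.M), 2),
    ],
    "c": [
        (re.compile(r"^[ \t]*#include\s*[<\"]", re.M), 3),
        (re.compile(r"^[ \t]*(?:int|void|char|float|double)\s+\w+\s*\([^)]*\)", re.M), 2),
        (re.compile(r";[ \t]*$", re.M), 1),
    ],
}


def score_languages(text):
    """Return a composite score per language: weight times number of matches."""
    return {
        language: sum(weight * len(pattern.findall(text)) for pattern, weight in features)
        for language, features in FEATURES.items()
    }


def guess_language(text, threshold=3):
    """Return the best-scoring language, or None if nothing looks like code."""
    scores = score_languages(text)
    best = max(scores, key=scores.get)
    return best if scores[best] >= threshold else None


python_guess = guess_language("from os import path\n\ndef main():\n    return path\n")
c_guess = guess_language("#include <stdio.h>\nint main(void) {\n    return 0;\n}\n")
prose_guess = guess_language("Hello, how are you today?")
```

The threshold is what answers the "is this source code at all?" part of the question: text that matches no feature strongly enough yields None.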
This would, of course, involve you knowing enough about the languages you want to identify to find unique features to look for, or being able to find people that do know unique structures that would help.
If you're given fewer than 30 or so lines of code, your answers from parsing like that are going to be far less accurate. In that case the easiest way to do it would probably be to take an appliance similar to Travis, and just run the code in each language (in a VM of course). If the code runs successfully in a language, you have your answer. If not, you would need a list of errors that are "acceptable" (as in they are errors in the way the code was written, not in the interpreter). It's not a great solution, but at some point your code sample will just be too short to give an accurate answer.

Working with trees of class instances in Python

I'm looking for more information about dealing with trees of class instances, and how best to go about calling methods on the leaves from the trunk. I have a trunk instance with many branch instances (in a dictionary), and each has many leaf instances (and dicts in the branches). The leaves are where the action really happens, and as such there are methods in the leaves for querying values, restoring values, and many other things.
This leads to what feels like duplication of code, as I might want to do something to all leaves of a branch, so there are methods in the branches for doing something to a leaf, a specified set of leaves, or all leaves known to that branch, though these do reduce the code duplication by simply looping over the leaves and asking them to do said things to themselves (thus the actual code doing the work is in one place in the leaf class).
Then the trunk comes in, where I might want to do something to the entire tree (i.e. all leaves) in one fell swoop, so I have methods there that ask all known objects to run their all-leaf functions. I start to feel pretty removed from the real action in the leaves this way, though it works fine, and the code seems fairly tight - extremely brief, readable, and functioning fine.
Another issue comes in logical groupings. There are bits of data I might want to associate with some, most, or all leaves, to indicate that they're part of some logical group, so currently the leaves themselves are all storing that kind of data. When I want to get a logical group, I have to scan all leaves and gather them back up, rather than having some sort of list at the trunk level. This actually all works fine, and is even pretty logical, yet it feels insane. Is this simply the nature of working with tree-like structures, because of their complexity, or are there other ways of doing these kinds of things? I prefer not to build secondary structures to connect to things from the opposite direction - e.g. making a structure with references to the leaves in a logical group, approaching them then from that more list-like direction. One bonus of keeping things all in a large tree like this is that it can be dumped and loaded in one shot with pickle.
I'd love to hear thoughts - any and all - from anyone else's experience with such things.
What I'm taking away from your question is that "everything works", but that the code is starting to feel unmanageable and difficult to reason about, and: is there a better way to do this?
The one thing your question is missing is a solid context. What sort of problem is your tree structure actually solving? What do these objects actually do? Are they all the same type of object, or is there a mix of objects? With some of these specifics you might get more practical responses.
As it stands, I would suggest checking out some resources on design patterns. Specifically the composite and visitor patterns.
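As a rough idea of what the composite pattern buys you here, consider the sketch below. The class names, the groups attribute and the helper functions are assumptions, since the question gives no concrete types. Every node answers the same leaves() question, so an operation on a branch or on the whole tree is written once, outside the node classes.

```python
class Leaf:
    def __init__(self, name, value, groups=()):
        self.name = name
        self.value = value
        self.groups = set(groups)   # logical groups this leaf belongs to

    def leaves(self):
        yield self


class Branch:
    """A trunk is simply a Branch whose children are Branches."""

    def __init__(self, name, children=None):
        self.name = name
        self.children = dict(children or {})

    def leaves(self):
        for child in self.children.values():
            yield from child.leaves()


def for_each_leaf(node, action):
    """Apply `action` to every leaf below `node` (a leaf, branch or trunk)."""
    return [action(leaf) for leaf in node.leaves()]


def in_group(node, group):
    """Collect the leaves below `node` that carry the given group label."""
    return [leaf for leaf in node.leaves() if group in leaf.groups]


trunk = Branch("trunk", {
    "b1": Branch("b1", {"a": Leaf("a", 1, {"red"}), "b": Leaf("b", 2)}),
    "b2": Branch("b2", {"c": Leaf("c", 3, {"red"})}),
})
all_values = for_each_leaf(trunk, lambda leaf: leaf.value)
red_names = [leaf.name for leaf in in_group(trunk, "red")]
```

The tree stays one plain object graph, so it can still be dumped and loaded in one shot with pickle, and no secondary structure is needed for the logical groups.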
On the book end of things you could have a look at Design Patterns and/or Refactoring to Patterns. Neither of these has any Python code in it, but if you don't mind Java, the latter is an excellent introduction to taking code structures that are hard to reason about and using a pattern to better organize them.
You might also have a look at Alex Martelli's talk on Python Design Patterns.
This question has some further resource links regarding patterns and python in general.
Hope that helps.

How do I design a class in Python?

I've had some really awesome help on my previous questions for detecting paws and toes within a paw, but all these solutions only work for one measurement at a time.
Now I have data that consists of:
about 30 dogs;
each has 24 measurements (divided into several subgroups);
each measurement has at least 4 contacts (one for each paw) and
each contact is divided into 5 parts and
has several parameters, like contact time, location, total force etc.
Obviously sticking everything into one big object isn't going to cut it, so I figured I needed to use classes instead of the current slew of functions. But even though I've read Learning Python's chapter about classes, I fail to apply it to my own code (GitHub link)
I also feel like it's rather strange to process all the data every time I want to get out some information. Once I know the locations of each paw, there's no reason for me to calculate this again. Furthermore, I want to compare all the paws of the same dog to determine which contact belongs to which paw (front/hind, left/right). This would become a mess if I continue using only functions.
So now I'm looking for advice on how to create classes that will let me process my data (link to the zipped data of one dog) in a sensible fashion.
How to design a class.
Write down the words. You started to do this. Some people don't and wonder why they have problems.
Expand your set of words into simple statements about what these objects will be doing. That is to say, write down the various calculations you'll be doing on these things. Your short list of 30 dogs, 24 measurements, 4 contacts, and several "parameters" per contact is interesting, but only part of the story. Your "locations of each paw" and "compare all the paws of the same dog to determine which contact belongs to which paw" are the next step in object design.
Underline the nouns. Seriously. Some folks debate the value of this, but I find that for first-time OO developers it helps. Underline the nouns.
Review the nouns. Generic nouns like "parameter" and "measurement" need to be replaced with specific, concrete nouns that apply to your problem in your problem domain. Specifics help clarify the problem. Generics simply elide details.
For each noun ("contact", "paw", "dog", etc.) write down the attributes of that noun and the actions in which that object engages. Don't short-cut this. Every attribute. "Data Set contains 30 Dogs" for example is important.
For each attribute, identify if this is a relationship to a defined noun, or some other kind of "primitive" or "atomic" data like a string or a float or something irreducible.
For each action or operation, you have to identify which noun has the responsibility, and which nouns merely participate. It's a question of "mutability". Some objects get updated, others don't. Mutable objects must own total responsibility for their mutations.
At this point, you can start to transform nouns into class definitions. Some collective nouns are lists, dictionaries, tuples, sets or namedtuples, and you don't need to do very much work. Other classes are more complex, either because of complex derived data or because of some update/mutation which is performed.
Don't forget to test each class in isolation using unittest.
Also, there's no law that says classes must be mutable. In your case, for example, you have almost no mutable data. What you have is derived data, created by transformation functions from the source dataset.
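A sketch of what immutable classes with derived data might look like, using frozen dataclasses from the standard library. The field names are guesses based on the parameters listed in the question ("contact time", "total force"), not the real data layout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contact:
    paw: str             # hypothetical label, e.g. "front left"
    contact_time: float
    total_force: float


@dataclass(frozen=True)
class Measurement:
    contacts: tuple      # a tuple of Contact objects

    @property
    def total_force(self):
        # Derived data: computed from the source values, never stored or updated.
        return sum(contact.total_force for contact in self.contacts)


measurement = Measurement(contacts=(
    Contact("front left", 0.21, 120.0),
    Contact("hind right", 0.34, 95.5),
))
combined = measurement.total_force
```

Assigning to a field of a frozen dataclass raises dataclasses.FrozenInstanceError, so the source data cannot be changed by accident.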
The following advice (similar to @S.Lott's advice) is from the book Beginning Python: From Novice to Professional.
Write down a description of your problem (what should the program do?). Underline all the nouns, verbs, and adjectives.
Go through the nouns, looking for potential classes.
Go through the verbs, looking for potential methods.
Go through the adjectives, looking for potential attributes
Allocate methods and attributes to your classes
To refine the class, the book also advises we can do the following:
Write down (or dream up) a set of use cases, i.e. scenarios of how your program may be used. Try to cover all the functionality.
Think through every use case step by step, making sure that everything we need is covered.
I like the TDD approach...
So start by writing tests for what you want the behaviour to be. And write code that passes. At this point, don't worry too much about design, just get a test suite and software that passes. Don't worry if you end up with a single big ugly class, with complex methods.
Sometimes, during this initial process, you'll find a behaviour that is hard to test and needs to be decomposed, just for testability. This may be a hint that a separate class is warranted.
Then the fun part... refactoring. After you have working software you can see the complex pieces. Often little pockets of behaviour will become apparent, suggesting a new class, but if not, just look for ways to simplify the code. Extract service objects and value objects. Simplify your methods.
If you're using git properly (you are using git, aren't you?), you can very quickly experiment with some particular decomposition during refactoring, and then abandon it and revert back if it doesn't simplify things.
By writing tested working code first you should gain an intimate insight into the problem domain that you couldn't easily get with the design-first approach. Writing tests and code push you past that "where do I begin" paralysis.
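A minimal sketch of that test-first loop with the standard unittest module. The function longest_contact and the dictionary keys are hypothetical and stand in for whatever behaviour you actually want from the paw data.

```python
import unittest


def longest_contact(contacts):
    """Return the contact with the greatest contact_time (hypothetical helper)."""
    if not contacts:
        raise ValueError("no contacts given")
    return max(contacts, key=lambda contact: contact["contact_time"])


class LongestContactTest(unittest.TestCase):
    def test_picks_the_longest_contact(self):
        contacts = [
            {"paw": "front left", "contact_time": 0.21},
            {"paw": "hind right", "contact_time": 0.34},
        ]
        self.assertEqual(longest_contact(contacts)["paw"], "hind right")

    def test_empty_input_is_an_error(self):
        with self.assertRaises(ValueError):
            longest_contact([])
```

Saved in a file, the tests can be run with python -m unittest. The tests are written first, the function is then made to pass them, and refactoring happens afterwards with the tests as a safety net.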
The whole idea of OO design is to make your code map to your problem, so when, for example, you want the first footstep of a dog, you do something like:
dog.footstep(0)
Now, it may be that for your case you need to read in your raw data file and compute the footstep locations. All this could be hidden in the footstep() function so that it only happens once. Something like:
class Dog:
    def __init__(self):
        self._footsteps = None

    def footstep(self, n):
        # Read the data only on the first call; later calls reuse the stored result.
        if self._footsteps is None:
            self.readInFootsteps(...)
        return self._footsteps[n]
[This is now a sort of caching pattern. The first time it goes and reads the footstep data, subsequent times it just gets it from self._footsteps.]
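The same caching idea can also be written with functools.cached_property (Python 3.8+). This is a runnable variant under assumptions: the constructor argument raw_rows and the load_count counter are invented so that the sketch is self-contained, and the real loading code would read the data file instead.

```python
from functools import cached_property


class Dog:
    def __init__(self, raw_rows):
        self._raw_rows = raw_rows   # stand-in for the raw data file
        self.load_count = 0         # only here to show how often loading runs

    @cached_property
    def footsteps(self):
        # Runs on first access only; the result is then stored on the instance.
        self.load_count += 1
        return [tuple(row) for row in self._raw_rows]

    def footstep(self, n):
        return self.footsteps[n]


dog = Dog([[0, 1], [2, 3]])
first = dog.footstep(0)
second = dog.footstep(1)
```

After both calls, dog.load_count is still 1: the expensive step ran once.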
But yes, getting OO design right can be tricky. Think more about the things you want to do to your data, and that will inform what methods you'll need to apply to what classes.
After skimming your linked code, it seems to me that you are better off not designing a Dog class at this point. Rather, you should use Pandas and dataframes. A dataframe is a table with columns. Your dataframe would have columns such as: dog_id, contact_part, contact_time, contact_location, etc.
Pandas uses Numpy arrays behind the scenes, and it has many convenience methods for you:
Select a dog by e.g.: my_measurements[my_measurements['dog_id'] == 'Charly']
Save the data: my_measurements.to_pickle('filename.pickle')
Consider using pandas.read_csv() instead of manually reading the text files.
Writing out your nouns, verbs, adjectives is a great approach, but I prefer to think of class design as asking the question what data should be hidden?
Imagine you had a Query object and a Database object:
The Query object will help you create and store a query -- store is the key here, as a function could help you create one just as easily. Maybe you could say: Query().select('Country').from_table('User').where('Country == "Brazil"'). It doesn't matter exactly what the syntax is -- that is your job! -- the key is that the object is helping you hide something, in this case the data necessary to store and output a query. The power of the object comes from the syntax of using it (in this case some clever chaining) and not needing to know what it stores to make it work. If done right the Query object could output queries for more than one database. It internally would store a specific format but could easily convert to other formats when outputting (Postgres, MySQL, MongoDB).
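A bare-bones sketch of such a chaining object follows. The method names are taken from the example above; to_sql and the stored fields are assumptions, and the string building is purely illustrative (real code should use parameterized queries rather than pasting conditions into SQL text).

```python
class Query:
    def __init__(self):
        # Hidden state: callers never need to touch these directly.
        self._columns = []
        self._table = None
        self._conditions = []

    def select(self, *columns):
        self._columns.extend(columns)
        return self                 # returning self is what enables chaining

    def from_table(self, table):
        self._table = table
        return self

    def where(self, condition):
        self._conditions.append(condition)
        return self

    def to_sql(self):
        """Render the stored query as one (simplified) SQL string."""
        if self._table is None:
            raise ValueError("from_table() was never called")
        columns = ", ".join(self._columns) or "*"
        sql = f"SELECT {columns} FROM {self._table}"
        if self._conditions:
            sql += " WHERE " + " AND ".join(self._conditions)
        return sql


sql = Query().select("Country").from_table("User").where("Country = 'Brazil'").to_sql()
```

A second output method for another database could read the same hidden fields, which is the point the answer makes about one stored format and several output formats.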
Now let's think through the Database object. What does this hide and store? Well clearly it can't store the full contents of the database, since that is why we have a database! So what is the point? The goal is to hide how the database works from people who use the Database object. Good classes will simplify reasoning when manipulating internal state. For this Database object you could hide how the networking calls work, or batch queries or updates, or provide a caching layer.
The problem is this Database object is HUGE. It represents how to access a database, so under the covers it could do anything and everything. Clearly networking, caching, and batching are quite hard to deal with depending on your system, so hiding them away would be very helpful. But, as many people will note, a database is insanely complex, and the further from the raw DB calls you get, the harder it is to tune for performance and understand how things work.
This is the fundamental tradeoff of OOP. If you pick the right abstraction it makes coding simpler (String, Array, Dictionary), if you pick an abstraction that is too big (Database, EmailManager, NetworkingManager), it may become too complex to really understand how it works, or what to expect. The goal is to hide complexity, but some complexity is necessary. A good rule of thumb is to start out avoiding Manager objects, and instead create classes that are like structs -- all they do is hold data, with some helper methods to create/manipulate the data to make your life easier. For example, in the case of EmailManager start with a function called sendEmail that takes an Email object. This is a simple starting point and the code is very easy to understand.
As for your example, think about what data needs to be together to calculate what you are looking for. If you wanted to know how far an animal was walking, for example, you could have AnimalStep and AnimalTrip (collection of AnimalSteps) classes. Now that each Trip has all the Step data, then it should be able to figure stuff out about it, perhaps AnimalTrip.calculateDistance() makes sense.
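The AnimalStep and AnimalTrip idea could look roughly like this. The x and y coordinates are an assumption about what a step records, and the method is spelled calculate_distance to follow Python naming conventions.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class AnimalStep:
    x: float   # hypothetical position of the step
    y: float


@dataclass(frozen=True)
class AnimalTrip:
    steps: tuple   # a tuple of AnimalStep objects, in walking order

    def calculate_distance(self):
        """Sum the straight-line distances between consecutive steps."""
        pairs = zip(self.steps, self.steps[1:])
        return sum(math.hypot(b.x - a.x, b.y - a.y) for a, b in pairs)


trip = AnimalTrip(steps=(AnimalStep(0, 0), AnimalStep(3, 4), AnimalStep(3, 10)))
distance = trip.calculate_distance()
```

The trip holds all the data the calculation needs, so nothing outside the class has to know how steps are stored.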
