Idiomatic Clojure equivalent of this Python code? - python

I wrote a simple stack-based virtual machine in Python, and now I'm trying to rewrite it in Clojure, which is proving difficult as I don't have much experience with Lisp. This Python snippet processes the bytecode, which is represented as a list of tuples like so:
[("label", "entry"),
("load", 0),
("load", 1),
("add",),
("store", 0)]
Or in Clojure:
[[:label :entry]
 [:load 0]
 [:load 1]
 [:add]
 [:store 0]]
When a Function object loads the bytecode, every "label" tuple is processed specially to mark that position, while every other tuple stays in the final bytecode. I would assume that the Clojure equivalent of this function would involve a fold, but I'm not sure how to do that in an elegant or idiomatic way. Any ideas?

Reading that Python snippet, it looks like you want the eventual output to look like
{:code [[:load 0]
        [:load 1]
        [:add]
        [:store 0]]
 :labels {:entry 0}}
It's much easier to write the code once you have a firm description of the goal, and indeed this is a pretty simple reduce. There are a number of stylistically different ways to write the reducing function, but this one seems easiest for me to read.
(defn load [asm]
  (reduce (fn [{:keys [code labels]} [op arg1 & args :as instruction]]
            (if (= :label op)
              {:code code
               :labels (assoc labels arg1 (count code))}
              {:code (conj code instruction)
               :labels labels}))
          {:code [], :labels {}}
          asm))
Edit
This version supports a name argument, and simplifies the reduction step by not repeating elements that don't change.
(defn load [name asm]
  (reduce (fn [program [op arg1 :as instruction]]
            (if (= :label op)
              (assoc-in program [:labels arg1] (count (:code program)))
              (update-in program [:code] conj instruction)))
          {:code [], :labels {}, :name name}
          asm))

I can't guarantee that this is idiomatic Clojure, but this is a functional version of your Python code, which should at least get you pretty close.
(def prog
  [[:label :entry]
   [:load 0]
   [:load 1]
   [:add]
   [:store 0]])
(defn parse [stats]
  (let [f (fn [[out-stats labels pc] stat]
            (if (= :label (first stat))
              [out-stats (conj labels [(second stat) pc]) pc]
              [(conj out-stats stat) labels (+ 1 pc)]))
        init [[] {} 0]]
    (reduce f init stats)))
(println (parse prog))
So I think you're correct that a fold is what you want. All functional folds walk a collection and "reduce" that collection into a single value. However, nothing says that the resulting single value can't also be a collection or, as in this case, a collection of collections.
In our case, we are going to use the three-parameter version of reduce, which lets us provide an initial accumulator value. We need this because we are going to track a lot of state as we iterate across the collection of bytecodes, and the two-parameter version pretty much requires that your accumulator be similar to the items in the list (cf. (reduce + [1 2 3 4])).
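Since you're coming from Python, the same distinction exists in functools.reduce, and a tiny aside in Python may make it concrete (this is only an illustration, not part of the Clojure answer):

from functools import reduce

reduce(lambda acc, x: acc + x, [1, 2, 3, 4])             # 10 -- two-argument form: the accumulator starts as the first item
reduce(lambda acc, x: acc + [x * x], [1, 2, 3, 4], [])   # [1, 4, 9, 16] -- explicit initial accumulator of a different shape than the items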
When working with a functional fold, you need to think in terms of what you are accumulating, and how each element in the input collection contributes to that accumulation. If you look at your Python code, there are three values that can be updated on each turn of the loop:
The output statements (self.code)
The label mapping (self.labels)
The program counter (pc)
Nothing else is written during the loop. So, our accumulator value will need to store those three values.
That previous bit is the most important part.
Once you have that, the rest should be pretty easy. We need an initial accumulator value, which has no code, no label mappings, and a PC that starts at 0. On each iteration, we will update the accumulator in one of two ways:
Add a new label mapping
Add a new output program statement, and increment the program counter
And now, the output:
[[[:load 0] [:load 1] [:add] [:store 0]]
 {:entry 0}
 4]
That's a 3-element vector. The first element is the program. The second element is the label mappings. The third element is the next PC value. Now, you might modify parse to only produce two values; that's not an unreasonable thing to do. There are reasons you might not want to do it, but that's more an issue of API design than anything. I'll leave it as an exercise to the reader.
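If it helps to bridge from the language you started in, here is roughly the same fold written with Python's functools.reduce, returning just the two useful values discussed above (purely an illustration, not your original loop):

from functools import reduce

def parse(stats):
    def step(acc, stat):
        code, labels, pc = acc
        if stat[0] == "label":
            new_labels = dict(labels)          # copy, since we never mutate the accumulator
            new_labels[stat[1]] = pc
            return (code, new_labels, pc)
        return (code + [stat], labels, pc + 1)
    code, labels, _pc = reduce(step, stats, ([], {}, 0))
    return code, labels                        # drop the pc, keeping only the two useful values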
I should also mention that, initially, I had omitted the let block and had simply inlined the named values. I decided to pull them out to hopefully increase readability. Again, I don't know which is more idiomatic. That might be more of a per-project convention.
Finally, I don't know if monads have really taken off in the Clojure community, but you could also create a monad for bytecode parsing, and define the operations "add-statement" and "add-label" to be values in that monad. This would greatly increase the set-up complexity, but would simplify the actual parsing code. In fact, it would allow your parsing code to look fairly procedural, which may or may not be a good thing. (don't worry, it's still functional and side-effect free; monads just let you hide plumbing.) If your Python sample is pretty representative of the kind of data you need to process, then monads are almost certainly unnecessary overhead. On the other hand, if you actually have to manage much more state than indicated by your sample, then monads might help to keep you sane.

(defn make-function [name code]
  (let [[code labels] (reduce (fn [[code labels] inst]
                                (if (= (first inst) :label)
                                  [code (assoc labels (second inst) (count code))]
                                  [(conj code inst) labels]))
                              [[] {}] ;; initial state of code and labels
                              code)]
    {:name name, :code code, :labels labels}))
It's a bit wide for my liking, but not too bad.

I'm going to give you a general solution for this kind of problem.
Most loops can be done effortlessly with a straightforward map, filter or reduce, and if your data structure is recursive, naturally the loop will be a recursion.
Your loop, however, is a different kind of loop. Your loop accumulates a result -- which would suggest using reduce -- but the loop also carries a local variable along (pc), so it's not a straight reduce.
It's a reasonably common kind of loop. If this were Racket, I would use for/fold, but since it's not, we will have to shoehorn your loop onto reduce.
Let's define a function called load which returns two things, the processed code and the processed labels. I will also use a helper function called is-label?.
(defn load [asm]
  (defn is-label? [x] (= (first x) :label))
  {:code <<< CODE GOES HERE >>>
   :labels <<< CODE GOES HERE >>>})
Right now, your loop does two things, it processes the code, and it processes the labels. As much as possible, I try to keep loops to a single task. It makes them easier to understand, and it often reveals opportunities for using the simpler loop constructs.
To get the code, we simply need to remove the labels. That's a call to filter.
{:code (filter (complement is-label?) asm)
 :labels <<< CODE GOES HERE >>>}
Reduce normally has only one accumulator, but your loop needs two: the result, and the local variable pc. I will package these two into a vector which will be immediately deconstructed by the body of the loop. The two slots of the vector will be my two local variables.
The initial values for these two variables appear as the 2nd argument to reduce.
(first
 (reduce
  (fn [[result, pc] inst]
    << MORE CODE >>)
  [{} 0] asm))
(Note how the initial values for the variables are placed far from their declaration. If the body is long, this can be hard to read. That's the problem Racket's for/fold solves.)
Once reduce returns, I call first to discard the local variable pc and keep just the result.
Filling in the body of the loop is straightforward. If the instruction is a label, assoc into the result; otherwise increase pc by one. In either case, I construct a vector containing new values for all the local variables.
(fn [[result, pc] [_ arg :as inst]]
  (if (is-label? inst)
    [(assoc result arg pc) pc]
    [result (inc pc)]))
This technique can be used to convert any accumulator-with-locals loop into a reduce. Here's the full code.
(defn load [asm]
  (defn is-label? [x] (= (first x) :label))
  {:code (filter (complement is-label?) asm)
   :labels (first
            (reduce
             (fn [[result, pc] [_ arg :as inst]]
               (if (is-label? inst)
                 [(assoc result arg pc) pc]
                 [result (inc pc)]))
             [{} 0] asm))})

Related

scala slower than python in constructing a set

I am learning Scala by converting some of my Python code to Scala. I just ran into a case where the Python code significantly outperforms the Scala code. The code is supposed to construct a set of candidate pairs based on some conditions. Scala has runtime performance comparable to Python's for all previous parts.
id_map is an array of maps from Long to sets of strings. The average number of key-value pairs per map is 1942.
The scala code snippet is below:
// id_map: Array[mutable.Map[Long, Set[String]]]
val candidate_pairs = id_map
  .flatMap(hashmap => hashmap.values)
  .filter(_.size >= 2)
  .flatMap(strset => strset.toList.combinations(2))
  .map(_.sorted)
  .toSet
and the corresponding python code is
from itertools import combinations

candidate_pairs = set()
for hashmap in id_map.values():
    for strset in hashmap.values():
        if len(strset) >= 2:
            for pair in combinations(strset, 2):
                candidate_pairs.add(tuple(sorted(pair)))
The Scala code snippet takes 80 seconds, while the Python version takes 10 seconds.
I am wondering how I can optimize the above code to make it faster. What I have been trying is updating the set with a for loop:
var candidate_pairs = Set.empty[List[String]]
for (
  hashmap: mutable.Map[Long, Set[String]] <- id_map;
  setstr: Set[String] <- hashmap.values if setstr.size >= 2;
  pair <- setstr.toList.combinations(2)
)
  candidate_pairs += pair.sorted
Although candidate_pairs is updated many times and each update creates a new set, this is actually faster than the previous Scala version, taking about 50 seconds, though still worse than Python. I tried using a mutable set, but the result is about the same as the immutable version.
Any help would be appreciated! Thanks!
Being slower than python sounds ... surprising.
First of all, make sure you have adequate memory settings, and that it is not spending half of those 80 seconds in GC.
Also, be sure to "warm up" the JVM (run your function a few times before doing the actual measurement), use exactly the same data for the Python and Scala runs (not just the same statistics, exactly the same data), and do not include the time spent acquiring or generating data in the measurement. Make several runs and compare the average time, not how long a single run took.
Having said that, a few ways to make your code faster:
- Adding .view (or .iterator) after id_map in your implementation cuts the execution time by about a factor of 4 in my experiments. (.view makes your chained transformations apply "lazily", essentially making a single pass through a single instance of the array instead of multiple passes with multiple copies.)
- Replacing .map(_.sorted) with

  .map {
    case List(a, b) if a < b => (a, b)
    case List(a, b)          => (b, a)
  }

  shaves off about another 75% (sorting two-element lists is mostly overhead). This changes the element type to tuples rather than lists (constructing lots of tiny lists also adds up), but tuples actually seem even more appropriate in this case.
- Removing .filter(_.size >= 2) (it is redundant anyway, and computing the size of a collection can get expensive) yields a further improvement, but a fairly small one that I did not bother to measure exactly.
Additionally, it may be cheaper to get rid of the separate sort step altogether and just add .sorted before .combinations. I have not tested this, because it would be futile without knowing more about your data profile.
These are general improvements that should help either way. It is hard to be sure you'll see the same effect that I do, since I don't know anything about your data beyond that average map size; the improvement you see might be even better than mine, or somewhat smaller, but you should see some.
I ran this version with some test Scala code I created. On a list of 1944 elements, it completed in about 15 ms on my laptop.
id_map
  .flatMap(hashmap => hashmap.values)
  .flatMap { strset =>
    if (strset.size >= 2) {
      strset.toIndexedSeq.combinations(2)
    } else IndexedSeq.empty
  }
  .map(_.sorted)
  .toSet
The main changes I made are to use an IndexedSeq instead of a List (which is a linked list), and to do the filtering on the fly.
I assume you don't want to hyper-optimize; if you did, you could still remove a lot of the intermediate collections created by the flatMap, the combinations, the conversion to IndexedSeq, and the toSet call.

Python 'pointer arithmetic' - Quicksort

In idiomatic C fashion, one can implement quicksort in a simple way with two arguments:
void quicksort(int inputArray[], int numelems);
We can safely use two arguments for later subdivisions (i.e. the partitions, as they're commonly called) via pointer arithmetic:
//later on in our quicksort routine...
quicksort(inputArray+last+1, numelems-last-1);
In fact, I even asked about this before on SO because I was untrained in pointer arithmetic at the time: see Passing an array to a function with an odd format - “v+last+1”
Basically, is it possible to replicate the same behavior in Python, and if so, how? I have noticed that lists can be subdivided with the colon inside square brackets (the slicing operator), but the slice operator does not pass the list from that point on; that is to say, the first element (index 0) is still the same in both cases.
As you're aware, Python's slice syntax makes a copy, so in order to manipulate a subsection of a list (not "array", in Python) in place, you need to pass around both the list and the start-index and size (or end-index) of the portion under discussion, much as you could in C. The signature of the recursive function would be something like:
def quicksort( inputList, numElems, startIndex = 0 ):
And the recursive call would be something like:
quicksort( inputList, numElems-last-1, last+1 )
Throughout the function you'd add startIndex to whatever list accesses you would make.
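For example, here is a minimal sketch along those lines; the Lomuto-style partition is my own assumption for illustration, not code from the question:

def quicksort(inputList, numElems, startIndex=0):
    # sorts inputList[startIndex : startIndex + numElems] in place
    if numElems <= 1:
        return
    end = startIndex + numElems - 1
    pivot = inputList[end]
    last = startIndex                                  # boundary of the "less than pivot" region
    for k in range(startIndex, end):
        if inputList[k] < pivot:
            inputList[k], inputList[last] = inputList[last], inputList[k]
            last += 1
    inputList[last], inputList[end] = inputList[end], inputList[last]
    quicksort(inputList, last - startIndex, startIndex)                   # left partition
    quicksort(inputList, numElems - (last - startIndex) - 1, last + 1)    # right partition

Every index is offset by startIndex, which is the Python stand-in for the pointer arithmetic in the C version.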
I suppose if you want to do something like that you could do the following:
# list we want to mutate
sort_list = [1,2,3,4,5,6,7,8,9,0]

# wrapper just so everything looks pretty; the processing could go here if we wanted
def wrapper(a, numlems):
    cut = len(a) - numlems
    # overwrites a part of the list with another one
    a[cut:] = process(a[cut:])

# processing of the slice
def process(a):
    # just to show it works
    a[1] = 15
    return a

wrapper(sort_list, 2)
print(sort_list)
wrapper(sort_list, 4)
print(sort_list)
wrapper(sort_list, 6)
print(sort_list)
This is probably considered pretty evil in python and I wouldn't really recommend it, but it does emulate the functionality you wanted.
For Python you only really need:
def quicksort(inputList, startIndex):
Then creating and concatenating slices works fine without the need for pointer-like functionality.
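A minimal sketch of that slice-and-concatenate style (note it returns a new sorted list rather than sorting in place, which is the trade-off of using slices):

def quicksort(inputList):
    if len(inputList) <= 1:
        return inputList
    pivot, rest = inputList[0], inputList[1:]            # slicing copies, so no index bookkeeping is needed
    left = [x for x in rest if x < pivot]
    right = [x for x in rest if x >= pivot]
    return quicksort(left) + [pivot] + quicksort(right)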

how to parallelize big for loops in python

I just got into Python, and I am still in the steep phase of the learning curve. Thanks in advance for any comments.
I have a big for loop to run (big in the sense of many iterations), for example:
for i in range(10000):
    for j in range(10000):
        f((i, j))
I thought that how to parallelize it would be a common question, and after hours of searching on Google I arrived at a solution using the "multiprocessing" module, as follows:
from multiprocessing import Pool

pool = Pool()
x = pool.map(f, [(i, j) for i in range(10000) for j in range(10000)])
This works when the loop is small. However, it is really slow if the loop is large, or sometimes a memory error occurs if the loops are too big. It seems that Python generates the list of arguments first and then feeds the list to the function "f", even when using xrange. Is that correct?
So this parallelization does not work for me because I do not really need to store all arguments in a list. Is there a better way to do this? I appreciate any suggestions or references. Thank you.
It seems that Python generates the list of arguments first and then feeds the list to the function "f", even when using xrange. Is that correct?
Yes, because you're using a list comprehension, which explicitly asks it to generate that list.
(Note that xrange isn't really relevant here, because you only have two ranges at a time, each 10K long; compared to the 100M of the argument list, that's nothing.)
If you want it to generate the values on the fly as needed, instead of all 100M at once, you want to use a generator expression instead of a list comprehension. Which is almost always just a matter of turning the brackets into parentheses:
x=pool.map(f,((i,j) for i in range(10000) for j in range(10000)))
However, as you can see from the source, map will ultimately just make a list if you give it a generator, so in this case, that won't solve anything. (The docs don't explicitly say this, but it's hard to see how it could pick a good chunksize to chop the iterable into if it didn't have a length…).
And, even if that weren't true, you'd still just run into the same problem again with the results, because pool.map returns a list.
To solve both problems, you can use pool.imap instead. It consumes the iterable lazily, and returns a lazy iterator of results.
One thing to note is that imap does not guess at the best chunksize if you don't pass one; it just defaults to 1, so you may need a bit of thought or trial and error to optimize it.
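For example, something like this; the chunksize value is just a placeholder you would tune, and larger chunks reduce per-task overhead when f is cheap:

from multiprocessing import Pool

def f(args):
    i, j = args           # f takes a single (i, j) tuple, as in the question
    return i * j          # placeholder work

if __name__ == '__main__':
    pool = Pool()
    pairs = ((i, j) for i in range(10000) for j in range(10000))
    for result in pool.imap(f, pairs, chunksize=1000):
        pass
    pool.close()
    pool.join()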
Also, imap will still queue up some results as they come in, so it can feed them back to you in the same order as the arguments. In pathological cases, it could end up queuing up (poolsize-1)/poolsize of your results, although in practice this is incredibly rare. If you want to solve this, use imap_unordered. If you need to know the ordering, just pass the indexes back and forth with the args and results:
args = ((i, j) for i in range(10000) for j in range(10000))

def indexed_f(indexed_args):
    # each element from enumerate(args) is (index, (i, j))
    index, (i, j) = indexed_args
    return index, f((i, j))

results = pool.imap_unordered(indexed_f, enumerate(args))
However, I notice that in your original code, you're not doing anything at all with the results of f(i, j). In that case, why even bother gathering up the results at all? In that case, you can just go back to the loop:
for i in range(10000):
    for j in range(10000):
        pool.apply_async(f, ((i, j),))   # args is a 1-tuple because f takes a single (i, j) tuple
However, imap_unordered may still be worth using, because it provides a very easy way to block until all of the tasks are done, while still leaving the pool itself running for later use:
from collections import deque

def consume(iterator):
    deque(iterator, maxlen=0)   # exhaust the iterator without storing results

x = pool.imap_unordered(f, ((i, j) for i in range(10000) for j in range(10000)))
consume(x)

why i can't reverse a list of list in python

I wanted to do something like this, but this code returns a list of None (I think it's because list.reverse() reverses the list in place):
map(lambda row: row.reverse(), figure)
I tried this one, but reversed returns an iterator:
map(reversed, figure)
Finally I did something like this, which works for me, but I don't know if it's the right solution:
def reverse(row):
    """Reverse a list and return it (list.reverse alone returns None)."""
    row.reverse()
    return row

map(reverse, figure)
If someone has a better solution that I'm not aware of, please let me know.
Kind regards,
The mutator methods of Python's mutable containers (such as the .reverse method of lists) almost invariably return None -- a few return one useful value, e.g. the .pop method returns the popped element, but the key concept to retain is that none of those mutators returns the mutated container: rather, the container mutates in place and the return value of the mutator method is not that container. (This is an application of the CQS design principle -- not quite as fanatical as, say, in Eiffel, the language devised by Bertrand Meyer, who also invented CQS, but that's just because in Python "practicality beats purity", cf. import this ;-).
Building a list is often costlier than just building an iterator, for the overwhelmingly common case where all you want to do is loop on the result; therefore, built-ins such as reversed (and all the wonderful building blocks in the itertools module) return iterators, not lists.
But what if you therefore have an iterator x but really truly need the equivalent list y? Piece of cake -- just do y = list(x). To make a new instance of type list, you call type list -- this is such a general Python idea that it's even more crucial to retain than the pretty-important stuff I pointed out in the first two paragraphs!-)
So, the code for your specific problem is really very easy to put together based on the crucial notions in the previous paragraphs:
[list(reversed(row)) for row in figure]
Note that I'm using a list comprehension, not map: as a rule of thumb, map should only be used as a last-ditch optimization when there is no need for a lambda to build it (if a lambda is involved then a listcomp, as well as being clearer as usual, also tends to be faster anyway!-).
Once you're a "past master of Python", if your profiling tells you that this code is a bottleneck, you can then know to try alternatives such as
[row[::-1] for row in figure]
applying a negative-step slicing (aka "Martian Smiley") to make reversed copies of the rows, knowing it's usually faster than the list(reversed(row)) approach. But -- unless your code is meant to be maintained only by yourself or somebody at least as skilled at Python -- it's a defensible position to use the simplest "code from first principles" approach except where profiling tells you to push down on the pedal. (Personally I think the "Martian Smiley" is important enough to avoid applying this good general philosophy to this specific use case, but, hey, reasonable people could differ on this very specific point!-).
You can also use a slice to get the reversal of a single list (not in place):
>>> a = [1,2,3,4]
>>> a[::-1]
[4, 3, 2, 1]
So something like:
all_reversed = [lst[::-1] for lst in figure]
...or...
all_reversed = map(lambda x: x[::-1], figure)
...will do what you want.
reversed_lists = [list(reversed(x)) for x in figure]
map(lambda row: list(reversed(row)), figure)
You can also simply do
for row in figure:
    row.reverse()
to change each row in place.

side effect gotchas in python/numpy? horror stories and narrow escapes wanted

I am considering moving from Matlab to Python/numpy for data analysis and numerical simulations. I have used Matlab (and SML-NJ) for years, and am very comfortable in the functional environment without side effects (barring I/O), but am a little wary of the side effects in Python. Can people share their favorite gotchas regarding side effects, and if possible, how they got around them? As an example, I was a bit surprised when I tried the following code in Python:
lofls = [[]] * 4 #an accident waiting to happen!
lofls[0].append(7) #not what I was expecting...
print lofls #gives [[7], [7], [7], [7]]
#instead, I should have done this (I think)
lofls = [[] for x in range(4)]
lofls[0].append(7) #only appends to the first list
print lofls #gives [[7], [], [], []]
thanks in advance
Confusing references to the same (mutable) object with references to separate objects is indeed a "gotcha" (suffered by all non-functional languages, ones which have mutable objects and, of course, references). A frequently seen bug in beginners' Python code is misusing a default value which is mutable, e.g.:
def addone(item, alist=[]):
    alist.append(item)
    return alist
This code may be correct if the purpose is to have addone keep its own state (and return the one growing list to successive callers), much as static data would work in C; it's not correct if the coder is wrongly assuming that a new empty list will be made at each call.
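If a fresh list per call is what you actually want, the usual fix is the None-sentinel idiom:

def addone(item, alist=None):
    if alist is None:      # build a new list on every call, not once at function-definition time
        alist = []
    alist.append(item)
    return alist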
Raw beginners used to functional languages can also be confused by the command-query separation design decision in Python's built-in containers: mutating methods that don't have anything in particular to return (i.e., the vast majority of mutating methods) return nothing (specifically, they return None) -- they're doing all their work "in-place". Bugs coming from misunderstanding this are easy to spot, e.g.
alist = alist.append(item)
is pretty much guaranteed to be a bug -- it appends an item to the list referred to by name alist, but then rebinds name alist to None (the return value of the append call).
While the first issue I mentioned is about an early-binding that may mislead people who think the binding is, instead, a late one, there are issues that go the other way, where some people's expectations are for an early binding while the binding is, instead, late. For example (with a hypothetical GUI framework...):
for i in range(10):
    Button(text="Button #%s" % i,
           click=lambda: say("I'm #%s!" % i))
this will show ten buttons saying "Button #0", "Button #1", etc, but, when clicked, each and every one of them will say it's #9 -- because the i within the lambda is late bound (with a lexical closure). A fix is to take advantage of the fact that default values for argument are early-bound (as I pointed out about the first issue!-) and change the last line to
click=lambda i=i: say("I'm #%s!" % i))
Now lambda's i is an argument with a default value, not a free variable (looked up by lexical closure) any more, and so the code works as intended (there are other ways too, of course).
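One of those other ways (reusing the hypothetical Button/say API from above) is a small factory function, so each button gets its own closure:

def make_click_handler(i):
    # each call to the factory closes over its own i, bound at loop time
    return lambda: say("I'm #%s!" % i)

for i in range(10):
    Button(text="Button #%s" % i,
           click=make_click_handler(i))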
I stumbled upon this one again recently (after years of Python), while trying to remove a small dependency on numpy.
If you come from Matlab, you should use and trust the numpy functions for single-type array handling. Along with matplotlib, they are very convenient packages for a smooth transition.
import numpy as np
np.zeros((4,)) # to make an array full of zeros [0,0,0,0]
np.zeros((4,1)) # another one full of zeros but 2 dimensions [[0],[0],[0],[0]]
np.zeros((4,0)) # an empty array like [[],[],[],[]]
np.zeros((0,4)) # another empty array, which can not be represented with python lists o_O
etc.
