In Python, one can have a list (similar to an array in swift):
>>> li=[0,1,2,3,4,5]
And perform a slice assignment on any part (or all) of the list:
>>> li[2:]=[99] # note the end index is not needed if you mean 'to the end'
>>> li
[0, 1, 99]
Swift has a similar slice assignment (this is in the swift interactive shell):
1> var arr=[0,1,2,3,4,5]
arr: [Int] = 6 values {
[0] = 0
[1] = 1
[2] = 2
[3] = 3
[4] = 4
[5] = 5
}
2> arr[2...arr.endIndex-1]=[99]
3> arr
$R0: [Int] = 3 values {
[0] = 0
[1] = 1
[2] = 99
}
So far, so good. But, there are a couple of issues.
First, Swift does not work for an empty list or if the index is past the endIndex. Python appends if the slice index is past the end index:
>>> li=[] # empty
>>> li[2:]=[6,7,8]
>>> li
[6, 7, 8]
>>> li=[0,1,2]
>>> li[999:]=[999]
>>> li
[0, 1, 2, 999]
The equivalent in Swift is an error:
4> var arr=[Int]()
arr: [Int] = 0 values
5> arr[2...arr.endIndex-1]=[99]
fatal error: Can't form Range with end < start
That is easy to test and code around.
Second issue is the killer: it is really slow in Swift. Consider this Python code to perform exact summations of a list of floats:
def msum(iterable):
    "Full precision summation using multiple floats for intermediate values"
    # Rounded x+y stored in hi with the round-off stored in lo. Together
    # hi+lo are exactly equal to x+y. The inner loop applies hi/lo summation
    # to each partial so that the list of partial sums remains exact.
    # Depends on IEEE-754 arithmetic guarantees. See proof of correctness at:
    # www-2.cs.cmu.edu/afs/cs/project/quake/public/papers/robust-arithmetic.ps
    partials = []  # sorted, non-overlapping partial sums
    for x in iterable:
        i = 0
        for y in partials:
            if abs(x) < abs(y):
                x, y = y, x
            hi = x + y
            lo = y - (hi - x)
            if lo:
                partials[i] = lo
                i += 1
            x = hi
        partials[i:] = [x]
    return sum(partials, 0.0)
It works by maintaining hi/lo partial summations so that msum([.1]*10) produces 1.0 exactly rather than 0.9999999999999999. The C equivalent of msum is part of Python's math library (math.fsum).
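To see the size of the difference, here is a quick interactive check (assuming math.fsum is the library function referred to above):
>>> import math
>>> sum([.1] * 10)
0.9999999999999999
>>> math.fsum([.1] * 10)  # full-precision summation from the math library
1.0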
I have attempted to replicate it in Swift:
func msum(it:[Double])->Double {
    // Full precision summation using multiple floats for intermediate values
    var partials=[Double]()
    for var x in it {
        var i=0
        for var y in partials {
            if abs(x) < abs(y) {
                (x, y)=(y, x)
            }
            let hi=x+y
            let lo=y-(hi-x)
            if abs(lo)>0.0 {
                partials[i]=lo
                i+=1
            }
            x=hi
        }
        // slow part trying to replicate Python's slice assignment partials[i:]=[x]
        if partials.endIndex>i {
            partials[i...partials.endIndex-1]=[x]
        }
        else {
            partials.append(x)
        }
    }
    return partials.reduce(0.0, combine: +)
}
Test the function and speed:
import Foundation
var arr=[Double]()
for _ in 1...1000000 {
    arr+=[10, 1e100, 10, -1e100]
}
print(arr.reduce(0, combine: +)) // will be 0.0
var startTime: CFAbsoluteTime!
startTime = CFAbsoluteTimeGetCurrent()
print(msum(arr), arr.count*5) // should be arr.count * 5
print(CFAbsoluteTimeGetCurrent() - startTime)
On my machine, that takes 7 seconds to complete. Python's native msum takes 2.2 seconds (about 3x faster) and the library fsum function takes 0.09 seconds (nearly 80x faster).
I have tried replacing partials[i...partials.endIndex-1]=[x] with partials.removeRange(i..<partials.endIndex) followed by an append. A little faster, but not much.
Questions:
Is this idiomatic Swift: partials[i...partials.endIndex-1]=[x]?
Is there a faster / better way?
First (as already said in the comments), there is a huge difference between non-optimized and optimized code in Swift ("-Onone" vs "-O" compiler option, or Debug vs. Release configuration), so for performance tests make sure that the "Release" configuration is selected. ("Release" is also the default configuration if you profile the code with Instruments.)
It has some advantages to use half-open ranges:
var arr = [0,1,2,3,4,5]
arr[2 ..< arr.endIndex] = [99]
print(arr) // [0, 1, 99]
In fact, that's how a range is stored internally, and it allows you
to insert a slice at the end of the array (but not beyond that as in Python):
var arr = [Int]()
arr[0 ..< arr.endIndex] = [99]
print(arr) // [99]
So
if partials.endIndex > i {
    partials[i...partials.endIndex-1]=[x]
}
else {
    partials.append(x)
}
is equivalent to
partials[i ..< partials.endIndex] = [x]
// Or: partials.replaceRange(i ..< partials.endIndex, with: [x])
However, that is not a performance improvement. It seems that
replacing a slice is slow in Swift. Truncating the array and
appending the new element with
partials.replaceRange(i ..< partials.endIndex, with: [])
partials.append(x)
reduced the time for your test code from about 1.25 to 0.75 seconds on my
computer.
As @MartinR points out, replaceRange is faster than slice assignment.
If you want maximum speed (based on my tests), your best bet is probably:
partials.replaceRange(i..<partials.endIndex, with: CollectionOfOne(x))
CollectionOfOne is faster than [x] because it just stores the element inline within the struct, rather than allocating memory like an array.
Related
Hey, I am trying to convert my Python code to R and can't seem to figure out the last part of the recurrence. If anyone who has experience in both languages could help, that would be great!
def robber(nums):
    if len(nums) == 0: return 0
    elif len(nums) <= 2: return max(nums)
    else:
        A = [nums[0], max(nums[0:2])]
        for i in range(2, len(nums)):
            A.append(max(A[i-1], A[i-2] + nums[i]))
        return A[-1]
Above is the Python version, and below is my attempt so far at converting it to R:
robbing <- function(nums) {
  if (length(nums) == 0){
    result <- 0
  }
  else if(length(nums) <= 2){
    result <- max(nums)
  }
  else{
    a <- list(nums[0], max(nums(0:2)))
    for (i in range(2, length(nums))){
      result <- max(a[i-1], a[i-2] + nums[i])
    }
  }
  #result <- a[-1]
}
You have a couple of problems.
You are zero-indexing your vectors. R is 1-indexed (the first element of y is y[1], not y[0]).
Ranges (slices in Python) in R are inclusive. Eg: 0:2 = c(0, 1, 2), while Python is right-exclusive: 0:2 = [0, 1].
R uses minus elements to "remove" elements of vectors, while Python uses these to extract from reverse order. Eg: y[-1] = y[2:length(y)] in R.
R's range function is not the same as Python's range function. The equivalent in R would be seq or a:b (example 3:n). Note again that it is right-inclusive while Python's is right-exclusive!
You are not storing your intermediary results in a as you are doing in Python. You need to do this at run-time.
And last: R functions will return the last evaluation by default, so there is no need to explicitly use return. This is not a problem per se, but something that can make code look cleaner (or less clean in some cases). So one option to fix your problem would be:
robber <- function(nums){
  n <- length(nums)              # <= Only compute length **once** =>
  if(n == 0)
    0                            # <= Returned because no more code is run after this =>
  else if(n <= 2)
    max(nums)                    # <= Returned because no more code is run after this =>
  else{
    a <- numeric(n)              # <= pre-allocate our vector =>
    a[1:2] <- cummax(nums[1:2])  # <= cummax instead of c(nums[1], max(nums[1:2])) =>
    for(i in 3:n){               # <= Note that we start at 3, because of R's 1-indexing =>
      a[i] <- max(a[i - 1], a[i - 2] + nums[i])
    }
    a[n]
  }
}
Note a few things:
I use the fact that R vectors are 1-indexed, and my range starts at 3 as a consequence of this.
I pre-allocate my a vector (here using numeric(n)). R vector expansion is slow, while appending to Python lists is amortized constant time, so pre-allocation is the recommended way to go in all cases (see the small Python sketch after these notes).
I compute the length once and store it in a variable: n <- length(nums). It is unnecessary to evaluate this multiple times, and it is recommended to store such intermediary results in a variable. This goes for any language, such as R, Python, and even compiled languages such as C++ (although for the latter the compiler is often smart enough not to recompute the result).
Last, I use cummax where I can. I feel there is an optimized way to get your result almost immediately using vectorization, but I can't quite see it.
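On the pre-allocation point, here is a rough Python-side sketch of why growing a list is not the bottleneck in Python that growing a vector is in R (the sizes and repeat counts here are arbitrary choices of mine, just for illustration):
import timeit

def grow(n=100000):
    a = []              # grows by amortized O(1) appends
    for i in range(n):
        a.append(i)
    return a

def prealloc(n=100000):
    a = [0] * n         # pre-allocated, then filled by index
    for i in range(n):
        a[i] = i
    return a

# Both take roughly the same time in CPython; in R, growing a vector
# element by element is far more expensive than pre-allocating it.
print(timeit.timeit(grow, number=50))
print(timeit.timeit(prealloc, number=50))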
I would avoid using a list, because appending to lists is slow (especially in R; a vector is much better). But we don't need any sequence or indexing at all if we use plain variables, as I show here.
You don't need to build a list.
All you need to keep in memory is the previous
and the preprevious value for res.
def robber(nums, res=0, prev=0, preprev=0): # local vars predefined here
    for x in nums:
        prev, preprev = res, prev
        res = max(prev, preprev + x)
    return res
This Python function does the same as the one you gave (try it out!).
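For example (a quick check of my own; the expected answer for [1, 2, 3, 1] is 4, i.e. rob the first and third houses):
>>> robber([1, 2, 3, 1])
4
>>> robber([])
0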
In R this would be:
robber <- function(nums, res=0, prev=0, preprev=0) {
  for (x in nums) {
    preprev <- prev
    prev <- res      # correct order important!
    res <- max(prev, preprev + x)
  }
  res
}
Moving the local variable definitions into the argument list saves three lines of code in R, which is why I did it.
I suggest changing result to return(), creating the object a outside the function, and using length() at the end of the function:
a <- list(nums[0], max(nums(0:2)))

robbing <- function(nums) {
  if (length(nums) == 0){
    return(0)
  }
  else if(length(nums) <= 2){
    return(max(nums))
  }
  else{
    for (i in range(2, length(nums))){
      return(max(a[i-1], a[i-2] + nums[i]))
    }
  }
  return(a[length(a)])
}
I'm inexperienced in Python and started with Python 3.4.
I read over the Python 3.x documentation on loop idioms, and haven't found a way of constructing a familiar C-family for-loop, i.e.
for (i = 0; i < n; i++) {
    A[i] = value;
}
Writing a for-loop like this in Python seems all but impossible by design. Does anyone know the reason why Python iteration over a sequence follows a pattern like
for x in iterable: # e.g. range, itertools.count, generator functions
    pass
Is this more efficient, more convenient, or does it reduce index-out-of-bounds errors?
for lower <= var < upper:
That was the proposed syntax for a C-style loop. I say "was the proposed syntax" because PEP 284 was rejected:
Specifically, Guido did not buy the premise that the range() format needed fixing, "The whole point (15 years ago) of range() was to *avoid* needing syntax to specify a loop over numbers. I think it's worked out well and there's nothing that needs to be fixed (except range() needs to become an iterator, which it will in Python 3.0)."
So no for lower <= var < upper: for us.
Now, how to get a C-style loop? Well, you can use range([start,]end[,step]).
for i in range(0, len(blah), 3):
    blah[i] += merp # alters every third element of blah
    # step defaults to 1 if left off
You can enumerate if you need both index and value:
for i, j in enumerate(blah):
    merp[j].append(i)
If you wanted to look at two (or more!) iterators together you can zip them (Also: itertools.izip and itertools.izip_longest)
for i, j in zip(foo, bar):
    if i == j: print("Scooby-Doo!")
And finally, there's always the while loop
i = 0
while i < upper:
    A[i] = b
    i += 1
Addendum: There's also PEP 276, which suggested making ints iterable; it was also rejected. It would still have been half-open.
range(n) produces a suitable iterable :)
for i in range(n):
    A[i] = value
For the more general case (not just counting) you should transform it to a while loop, e.g.
i = 0
while i < n:
    A[i] = value
    i += 1
The foreach-style loop is pretty common in most languages now, as it's rare that you need access to the index of the collection and more common that you only need the object itself. Furthermore, collections which require an iterator because there is no random access (e.g. a set) can be iterated with the exact same syntax as a randomly accessible collection.
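For instance, a tiny Python illustration (note that the iteration order over a set is not guaranteed):
for x in [1, 2, 3]:   # list: random access is available
    print(x)
for x in {1, 2, 3}:   # set: no indexing, but the same loop syntax
    print(x)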
In Python, the correct way of accessing the index while iterating is:
for i, x in enumerate(iterable):
At this point, i is your index and x is the item at iterable[i].
You'll want to look at using the range() function.
for i in range(n):
    A[i] = value
The function can be used either as range(n), which yields the integers 0 through n-1, or as range(start, end), which yields the integers from start up to, but not including, end. For example:
range(1, 5)
will give you the numbers 1, 2, 3, and 4.
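A quick interactive check (Python 3, where range returns a lazy range object rather than a list):
>>> list(range(5))
[0, 1, 2, 3, 4]
>>> list(range(1, 5))
[1, 2, 3, 4]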
Python is a higher-level language than C, and iterating over a high-level abstraction such as a 'sequence' is more naturally and safely expressed with another one, an 'iterator'. C doesn't really have such an abstraction, so it's hardly surprising that it expresses most traversal with a low-level, 'hand-operated' index or pointer increment. That's an artifact of the low-level nature of C, though; it would be silly for a higher-level language to use it as the primary building block for all looping constructs, and most, not just Python, don't.
The ideal way to get a C-style for loop in Python is to use range. Not many are aware that range(stop) is overloaded to also accept start, stop and step arguments, where step is optional. With this, you can do almost anything that you could with a C-style for loop:
range(start, stop[, step])
for (i = 0; i < 10; i++)
>>> range(10)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
for (i = 1; i < 11; i++)
>>> range(1, 11)
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
for (i = 0; i < 30; i=i+5)
>>> range(0, 30, 5)
[0, 5, 10, 15, 20, 25]
for (i = 0; i < 10; i=i+3)
>>> range(0, 10, 3)
[0, 3, 6, 9]
for (i = 0; i > -10; i--)
>>> range(0, -10, -1)
[0, -1, -2, -3, -4, -5, -6, -7, -8, -9]
Check https://docs.python.org/2/library/functions.html#range
First, Python has several ways to perform C-style for loops, the two most common being (the first you allude to in your post, i.e. using the range object returned by range):
for i in range(some_end_value):
    print(i)

# or the many times preferred
for i, elem in enumerate(some_list):
    print("i is at {0} and the value is {1}".format(i, elem))
As to why Python is set up this way, I think this has just become a more convenient and preferred way of setting up foreach-style loops, particularly as languages have moved away from the need to define arrays/lists with their max index. For instance, in Java one can also do:
for (int i : someArray) {
    System.out.println(i); // prints the current item of an integer array
}
C# (foreach (int i in someArray)) and C++ (for (auto &i : someArray)) also have their own foreach loops, while in C most people tend to write macros to get the functionality of a foreach loop.
It's just a convenient way to access dynamic arrays, lists, dicts, and other constructs. A while loop can still be used when you must modify the loop variable itself, or a second variable can be created and updated from the iterator.
The equivalent of the C-loop:
for (i = 0; i < n; i++) A[i] = value;
i.e., to set all items in an array to the same value if A is a numpy array:
A[:] = value
Or if len(A) > n then
A[:n] = value
If you want to create a Python list with n values:
A = [value] * n #NOTE: all items refer to the *same* object
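That shared-object note only matters for mutable values; a small illustration:
>>> row = [0]
>>> A = [row] * 3      # three references to the *same* inner list
>>> A[0].append(1)
>>> A
[[0, 1], [0, 1], [0, 1]]
>>> A = [0.0] * 3      # fine for immutable values such as numbers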
You could also replace values in the existing list:
A[:n] = [value]*n #NOTE: it may grow if necessary
Or without creating a temporary list:
for i in range(n): A[i] = value
The pythonic way to enumerate all values with corresponding indices while using the values:
for index, item in enumerate(A):
    A[index] = item * item
The code could also be written using a list comprehension:
A = [item * item for item in A] #NOTE: the original list object may survive
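If other names still refer to the original list and you want them to see the update, assign to the full slice instead (a minor variation of the line above):
B = A                               # second reference to the same list object
A[:] = [item * item for item in A]  # updates the object in place, so B changes too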
Don't try to write C in Python.
I am trying to play around with some R code I found recently that imitates parts of Norvig's spell checker written in Python; in particular, I am trying to work out the right way to implement the edit2 function in R:
def splits(word):
    return [(word[:i], word[i:])
            for i in range(len(word)+1)]

def edits1(word):
    pairs = splits(word)
    deletes = [a+b[1:] for (a, b) in pairs if b]
    transposes = [a+b[1]+b[0]+b[2:] for (a, b) in pairs if len(b) > 1]
    replaces = [a+c+b[1:] for (a, b) in pairs for c in alphabet if b]
    inserts = [a+c+b for (a, b) in pairs for c in alphabet]
    return set(deletes + transposes + replaces + inserts)

def edits2(word):
    return set(e2 for e1 in edits1(word) for e2 in edits1(e1))
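For reference, here is what splits produces (alphabet is assumed to be the lowercase letters, e.g. string.ascii_lowercase, as in Norvig's original code; it is not defined in the snippet above):
>>> splits('ab')
[('', 'ab'), ('a', 'b'), ('ab', '')]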
However, in my benchmarks it seems that generating thousands of small strings in R using paste0 (or str_c from stringr, or stri_join from stringi) results in code that is roughly 10x (or ~100x, or ~50x, respectively) slower than the Python implementation shown by Norvig. (Yes, the stringr and stringi-based functions interestingly are even slower than using paste0.) My questions are (with #3 being the main one I want resolved):
Am I doing this correctly (is the code "right")?
If so, is this a known issue of R (extremely slow string concatenation)?
Is there anything I can do about this to make this significantly faster (one or more orders of magnitude, at least) without rewriting the whole function in Rcpp11 or something like that?
Here is the R code I came up with for the edit2 function:
library(stringi)  # for stri_sub / stri_reverse used below

# 1. generate a list of all binary splits of a word
binary.splits <- function(w) {
  n <- nchar(w)
  lapply(0:n, function(x)
    c(stri_sub(w, 0, x), stri_sub(w, x + 1, n)))
}

# 2. generate a list of all bigrams for a word
bigram.unsafe <- function(word)
  sapply(2:nchar(word), function(i) substr(word, i-1, i))

bigram <- function(word)
  if (nchar(word) > 1) bigram.unsafe(word) else word

# 3. four edit types: deletion, transposition, replacement, and insertion
alphabet = letters

deletions <- function(splits) if (length(splits) > 1) {
  sapply(1:(length(splits)-1), function(i)
    paste0(splits[[i]][1], splits[[i+1]][2]), simplify=FALSE)
} else {
  splits[[1]][2]
}

transpositions <- function(splits) if (length(splits) > 2) {
  swaps <- rev(bigram.unsafe(stri_reverse(splits[[1]][2])))
  sapply(1:length(swaps), function(i)
    paste0(splits[[i]][1], swaps[i], splits[[i+2]][2]), simplify=FALSE)
} else {
  stri_reverse(splits[[1]][2])
}

replacements <- function(splits) if (length(splits) > 1) {
  sapply(1:(length(splits)-1), function(i)
    lapply(alphabet, function(symbol)
      paste0(splits[[i]][1], symbol, splits[[i+1]][2])))
} else {
  alphabet
}

insertions <- function(splits)
  sapply(splits, function(pair)
    lapply(alphabet, function(symbol)
      paste0(pair[1], symbol, pair[2])))

# 4. create a vector of all words at edit distance 1 given the input word
edit.1 <- function(word) {
  splits <- binary.splits(word)
  unique(unlist(c(deletions(splits),
                  transpositions(splits),
                  replacements(splits),
                  insertions(splits))))
}

# 5. create a simple function to generate all words of edit distance 1 and 2
edit.2 <- function(word) {
  e1 <- edit.1(word)
  unique(c(unlist(lapply(e1, edit.1)), e1))
}
If you start profiling this code, you will see that replacements and insertions have nested "lapplies" and seem to take 10x longer than the deletions or transpositions, because they generate far more spelling variants.
library(rbenchmark)
benchmark(edit.2('abcd'), replications=20)
This takes about 8 seconds on my Core i5 MacBook Air, while the corresponding Python benchmark (running the corresponding edit2 function 20 times) takes about 0.6 seconds, i.e., it is about 10-15 times faster!
I have tried using expand.grid to get rid of the inner lapply, but this made the code slower, not faster. And I know that using lapply in place of sapply makes my code a bit faster, but I do not see the point of using the "wrong" function (I want a vector back) for a minor speed bump. But maybe generating the result of the edit.2 function can be made much faster in pure R?
Performance of R's paste0 vs. Python's ''.join
The original title asked whether paste0 in R was 10x slower than string concatenation in Python. If it is, then there's no hope of writing an algorithm that relies heavily on string concatenation in R that is as fast as the corresponding Python algorithm.
I have
> R.version.string
[1] "R version 3.1.0 Patched (2014-05-31 r65803)"
and
>>> sys.version
'3.4.0 (default, Apr 11 2014, 13:05:11) \n[GCC 4.8.2]'
Here's a first comparison
> library(microbenchmark)
> microbenchmark(paste0("a", "b"), times=1e6)
Unit: nanoseconds
expr min lq median uq max neval
paste0("a", "b") 951 1071 1162 1293 21794972 1e+06
(so about 1s for all replicates) versus
>>> import timeit
>>> timeit.timeit("''.join(x)", "x=('a', 'b')", number=int(1e6))
0.119668865998392
I guess that's the 10x performance difference the original poster observed.
However, R works better on vectors, and the algorithm involves vectors
of words anyway, so we might be interested in the comparison
> x = y = sample(LETTERS, 1e7, TRUE); system.time(z <- paste0(x, y))
user system elapsed
1.479 0.009 1.488
and
>>> setup = '''
import random
import string
y = x = [random.choice(string.ascii_uppercase) for _ in range(10000000)]
'''
>>> timeit.Timer("map(''.join, zip(x, y))", setup=setup).repeat(1)
[0.362522566007101]
This suggests that we would be on the right track if our R algorithm
were to run at 1/4 the speed of python; the OP found a 10-fold
difference, so it looks like there's room for improvement.
R iteration versus vectorization
The OP uses iteration (lapply and friends), rather than vectorization. We can compare the vector version to various approaches to iteration with the following
f0 = paste0

f1 = function(x, y)
    vapply(seq_along(x), function(i, x, y) paste0(x[i], y[i]), character(1), x, y)

f2 = function(x, y) Map(paste0, x, y)

f3 = function(x, y) {
    z = character(length(x))
    for (i in seq_along(x))
        z[i] = paste0(x[i], y[i])
    z
}
f3c = compiler::cmpfun(f3) # explicitly compile

f4 = function(x, y) {
    z = character()
    for (i in seq_along(x))
        z[i] = paste0(x[i], y[i])
    z
}
Scaling the data back, defining the 'vectorized' solution as f0, and
comparing these approaches
> x = y = sample(LETTERS, 100000, TRUE)
> library(microbenchmark)
> microbenchmark(f0(x, y), f1(x, y), f2(x, y), f3(x, y), f3c(x, y), times=5)
Unit: milliseconds
expr min lq median uq max neval
f0(x, y) 14.69877 14.70235 14.75409 14.98777 15.14739 5
f1(x, y) 241.34212 250.19018 268.21613 279.01582 292.21065 5
f2(x, y) 198.74594 199.07489 214.79558 229.50684 271.77853 5
f3(x, y) 250.64388 251.88353 256.09757 280.04688 296.29095 5
f3c(x, y) 174.15546 175.46522 200.09589 201.18543 214.18290 5
with f4 being too painfully slow to include
> system.time(f4(x, y))
user system elapsed
24.325 0.000 24.330
So from this one can see the advice from Dr. Tierney: there may be a benefit to vectorizing those lapply calls.
Further vectorizing the updated original post
@fnl adopted the original code by partly unrolling the loops. There remain opportunities for more of the same, for instance,
replacements <- function(splits) if (length(splits$left) > 1) {
  lapply(1:(length(splits$left)-1), function(i)
    paste0(splits$left[i], alphabet, splits$right[i+1]))
} else {
  splits$right[1]
}
might be revised to perform a single paste call, relying on argument recycling (short vectors recycled until their length matches longer vectors)
replacements1 <- function(splits) if (length(splits$left) > 1) {
  len <- length(splits$left)
  paste0(splits$left[-len], rep(alphabet, each = len - 1), splits$right[-1])
} else {
  splits$right[1]
}
The values are in a different order, but that is not important for the algorithm. Dropping subscripts (prefix with -) is potentially more memory efficient. Similarly,
deletions1 <- function(splits) if (length(splits$left) > 1) {
  paste0(splits$left[-length(splits$left)], splits$right[-1])
} else {
  splits$right[1]
}

insertions1 <- function(splits)
  paste0(splits$left, rep(alphabet, each=length(splits$left)), splits$right)
We then have
edit.1.1 <- function(word) {
  splits <- binary.splits(word)
  unique(c(deletions1(splits),
           transpositions(splits),
           replacements1(splits),
           insertions1(splits)))
}
with some speed-up
> identical(sort(edit.1("word")), sort(edit.1.1("word")))
[1] TRUE
> microbenchmark(edit.1("word"), edit.1.1("word"))
Unit: microseconds
expr min lq median uq max neval
edit.1("word") 354.125 358.7635 362.5260 372.9185 521.337 100
edit.1.1("word") 296.575 298.9830 300.8305 307.3725 369.419 100
The OP indicates that their original version was 10x slower than
python, and that their original modifications resulted in a 5x
speed-up. We gain a further 1.2x speed-up, so are perhaps at the
expected performance of the algorithm using R's paste0. A next step is to ask whether alternative algorithms or implementations are more performant, in particular substr might be promising.
Following @LukeTierney's tips in the question's comments on vectorizing the paste0 calls and returning two vectors from binary.splits, I edited the functions to be correctly vectorized. I have added the additional modifications described by @MartinMorgan in his answer, too: dropping items using single negative subscripts instead of selection ranges (i.e., "[-1]" instead of "[2:n]", etc.; but NB: for multiple subscripts, as used in transpositions, this is actually slower) and, particularly, using rep to further vectorize the paste0 calls in replacements and insertions.
This results in the best possible answer (so far?) to implement edit.2 in R (thank you, Luke and Martin!). In other words, with the main hints provided by Luke and some subsequent improvements by Martin, the R implementation ends up roughly half as fast as Python (but see Martin's final comments in his answer below). (The functions edit.1, edit.2, and bigram.unsafe remain unchanged, as shown above.)
binary.splits <- function(w) {
  n <- nchar(w)
  list(left=stri_sub(w, rep(0, n + 1), 0:n),
       right=stri_sub(w, 1:(n + 1), rep(n, n + 1)))
}

deletions <- function(splits) {
  n <- length(splits$left)
  if (n > 1) paste0(splits$left[-n], splits$right[-1])
  else splits$right[1]
}

transpositions <- function(splits) if (length(splits$left) > 2) {
  swaps <- rev(bigram.unsafe(stri_reverse(splits$right[1])))
  paste0(splits$left[1:length(swaps)], swaps,
         splits$right[3:length(splits$right)])
} else {
  stri_reverse(splits$right[1])
}

replacements <- function(splits) {
  n <- length(splits$left)
  if (n > 1) paste0(splits$left[-n],
                    rep(alphabet, each=n-1),
                    splits$right[-1])
  else alphabet
}

insertions <- function(splits)
  paste0(splits$left,
         rep(alphabet, each=length(splits$left)),
         splits$right)
Overall, and to conclude this exercise, Luke's and Martin's suggestions made the R implementation run roughly half as fast as the Python code shown at the beginning, improving my original code by about a factor of 6. What worries me even more in the end, however, are two different issues: (1) the R code seems to be far more verbose (LOC, but might be polished up a bit) and (2) the fact that even a slight deviation from "correct vectorization" makes R code perform horribly, while in Python slight deviations from "correct Python" usually do not have such an extreme impact. Nonetheless, I'll keep on with my "coding efficient R" effort - thanks to everybody involved!
I have a list of sorted floats y, as well as a list of unsorted floats x.
Now, I need to find out for every element in x between which values of y it lies, preferably by index of y. So for example, if
y=[1,2,3,4,5]
x[0]=3.5
I would need the output for index 0 of x to be (2,3), because 3.5 is between y[2] and y[3].
Basically, it is the same as seeing y as bin edges and sorting x to those bins, I guess.
What would be the easiest way to accomplish that?
I would use zip (itertools.izip in Python 2.x) to accomplish this:
from itertools import islice  #, izip as zip  # if Python 2.x

def nearest_neighbours(x, lst):
    for l1, l2 in zip(lst, islice(lst, 1, None)):
        if l1 <= x <= l2:
            return l1, l2
    else:
        # ?
Example usage:
>>> nearest_neighbours(3.5, range(1, 6))
(3, 4)
You will have to decide what you want to happen if x isn't between any pair in lst (i.e. replace the # ? placeholder!). If you want indices (although your example isn't using them), have a play with enumerate.
Thanks - I'm aware of how to code that step-by-step. However, I was looking for a pretty/easy/elegant solution, and now I am using numpy.digitize(), which looks pretty to me and works nicely.
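A minimal sketch of that kind of approach using bisect from the standard library (the helper name and the handling of out-of-range values are my own assumptions, not part of the question):
from bisect import bisect_right

y = [1, 2, 3, 4, 5]

def neighbour_indices(value, edges):
    # Return (lo, hi) so that value lies between edges[lo] and edges[hi];
    # no bounds checking for values outside the edges.
    hi = bisect_right(edges, value)  # first index whose edge is > value
    return hi - 1, hi

print(neighbour_indices(3.5, y))     # (2, 3), i.e. between y[2] and y[3]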
Q: What would be the easiest way to accomplish that?
Instead of giving you the code, I think you should see this pseudo-code and try to write your own code! Don't just copy-paste code from the internet if you want to educate yourself!
Pseudocode:
// Assume that when you have a tie,
// you put the number in the smallest range.
// Here, b is between 2.1 and 3.5, instead of
// 3.5 and 4.1
float a[5] = {0.1, 1.1, 2.1, 3.5, 4.1}; // your y
float b = 3.5;                          // your x

// counter for the loop and indexes. Init i to the second element
integer i = 1, prev = -1, next;

// while we are not at the end of the array
while(i < 5) {
    // if b is in the range ( a(i-1), a(i) ]
    if(b <= a[i] && b > a[i - 1]) {
        // mark the indexes
        prev = i - 1;
        next = i;
    }
    // go to next element
    i++;
}

if(prev == -1)
    print "Number is not between some numbers"
else
    print prev, next
I think that this can make you understand the point and then be able to select the easiest way for you.
I tried a Codility sample question, answering in Python. I am not getting a score of 100 because it failed to finish in time on a large data set.
The following is the question:
A non-empty zero-indexed array A consisting of N integers is given.
The first covering prefix of array A is the smallest integer P such
that 0 ≤ P < N and such that every value that occurs in array A also
occurs in sequence A[0], A[1], ..., A[P].
For example, the first covering prefix of the following 5−element array A:
A[0] = 2 A[1] = 2 A[2] = 1
A[3] = 0 A[4] = 1
is 3, because sequence [ A[0], A[1], A[2], A[3] ] equal to [2, 2, 1, 0], contains all values that occur in array A.
My answer is:
def ps(A):
    N = len(A)
    if N == 0: return -1
    bit = {}
    for i in range(N):
        if not A[i] in bit.keys():
            bit[A[i]] = 1
            P = i
    return P
Result:
It doesn't give me 100 for this question because it thinks my algo is O(N**3), and failed test cases are
random_n_log_100000
random test 100 000 elements and n/log_2 n values. 10.025 s. TIMEOUT ERROR
running time: >10.02 sec., time limit: 9.82 sec.
random_n_10000
random test 10 000 elements and values. 1.744 s. TIMEOUT ERROR
running time: >1.74 sec., time limit: 1.10 sec.
random_n_100000
random test 100 000 elements and values. 10.025 s. TIMEOUT ERROR
running time: >10.02 sec., time limit: 9.94 sec.
Analysis:
At first I believed my code is O(N), as I assumed the key part of my code, A[i] in bit.keys(), has constant run-time, i.e. O(1). But perhaps on a large data set the hash function produces a lot of collisions, so the runtime is no longer O(1)?
Does O(N**3) mean O(N^3)? I ask because I have seen other posts where Codility reports an N-squared algorithm as O(N^2), so I suppose they are consistent in their reports?
If they really think my answer is O(N^3), is that reasonable given that my code only ran past their time limit by less than 1 second? Here I assume their time limit is set for an O(N) algorithm, because that is what they request in the question. If that is the case, I can't see why an O(N^3) algorithm would be only about 1 second too slow.
In Python 2, bit.keys() is a list, and testing if an element is in a list is O(n).
On the other hand, testing if an element is in a dict is O(1).
So change
if not A[i] in bit.keys():
to
if not A[i] in bit:
With this change, I believe your algorithm is O(n).
(Without the change, I believe your algorithm is O(n^2), not O(n^3).)
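A quick sanity check of the fixed version against the example from the question (the array [2, 2, 1, 0, 1], whose first covering prefix is 3):
>>> ps([2, 2, 1, 0, 1])
3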
Tried C# code. Don't know if it will work for higher values.
using System;
using System.Linq;

class Program
{
    static void Main(string[] args)
    {
        int[] A = { 2, 1, 1, 0, 3 };
        int i, n = A.Length;
        for (i = 0; i < n; i++)
        {
            if (A.Contains(i))
            {
                //
            }
            else
            {
                int p = i;
            }
        }
        Console.WriteLine("p not found");
    }
}