Quickselect algorithm in Swift
I recently answered a question
on Stack Overflow about finding the k-largest element in an array, and
present my implementation of the Quickselect algorithm in Swift for review.
This is essentially the iterative version described in
Wikipedia: Quickselect, except that the
partitioning does not move the pivot element to the front of the second partition.
That saves some swap operations but requires an additional check when updating the lower bound.
The language is Swift 3, compiled with Xcode 8.1.
extension Array where Element: Comparable {
public func kSmallest(_ k: Int) -> Element {
precondition(1 <= k && k <= count, "k must be in the range 1...count")
var a = self // A mutable copy.
var low = startIndex
var high = endIndex
while high - low > 1 {
// Choose random pivot element:
let pivotElement = a[low + Int(arc4random_uniform(UInt32(high - low)))]
// Partition elements such that:
// a[i] < pivotElement for low <= i < pivotIndex,
// a[i] >= pivotElement for pivotIndex <= i < high.
var pivotIndex = low
while a[pivotIndex] < pivotElement {
pivotIndex += 1
}
for i in pivotIndex+1 ..< high {
if a[i] < pivotElement {
swap(&a[pivotIndex], &a[i])
pivotIndex += 1
}
}
if k <= pivotIndex {
// k-smallest element is in the first partition:
high = pivotIndex
} else if k == pivotIndex + 1 {
// Pivot element is the k-smallest:
return pivotElement
} else {
// k-smallest element is in the second partition
low = pivotIndex
if a[low] == pivotElement {
low += 1
}
}
}
// Only single candidate left:
return a[low]
}
public func kLargest(_ k: Int) -> Element {
return kSmallest(count + 1 - k)
}
}
Examples:
let a = [ 9, 7, 6, 3, 4, 2, 5, 1, 8 ]
for k in 1 ... a.count {
print(a.kSmallest(k))
}
// 1 2 3 4 5 6 7 8 9
let b = [ "b", "a", "c", "h" ]
print(b.kLargest(2))
// "c"
Feedback on all aspects of the code is welcome, such as (but not limited to):
- Possible improvements (speed, clarity, swiftyness, ...).
- Naming. In particular: what would be a better name for kSmallest(_ k: Int) in consideration of the Swift API Design Guidelines?
- The check if a[low] == pivotElement looks artificial, but without it an infinite loop can occur, e.g. for an array with all elements equal. Is there a better solution?
Remark: To make this code compile with Swift 4 (or later), just replace the line
swap(&a[pivotIndex], &a[i])
with
a.swapAt(pivotIndex, i)
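Aside: one way to make the a[low] == pivotElement check unnecessary is a three-way ("Dutch national flag") partition that groups all elements equal to the pivot into a middle block, so either the search range strictly shrinks or the pivot block is hit. A hypothetical sketch, not the code under review (kSmallestThreeWay is a made-up name; uses Swift 4.2 APIs such as Int.random(in:) and swapAt):

```swift
import Foundation

extension Array where Element: Comparable {
    /// Hypothetical three-way-partition variant of quickselect.
    /// For illustration only; not the code under review.
    public func kSmallestThreeWay(_ k: Int) -> Element {
        precondition(1 <= k && k <= count, "k must be in the range 1...count")
        var a = self        // A mutable copy.
        var low = 0
        var high = count    // Half-open range [low, high).
        while true {
            let pivot = a[Int.random(in: low..<high)]
            // Partition into a[low..<lt] < pivot, a[lt..<gt] == pivot,
            // a[gt..<high] > pivot.
            var lt = low, i = low, gt = high
            while i < gt {
                if a[i] < pivot {
                    a.swapAt(lt, i); lt += 1; i += 1
                } else if a[i] > pivot {
                    gt -= 1; a.swapAt(i, gt)
                } else {
                    i += 1
                }
            }
            if k <= lt {            // Answer lies in the "< pivot" block.
                high = lt
            } else if k > gt {      // Answer lies in the "> pivot" block.
                low = gt
            } else {                // Answer falls in the "== pivot" block.
                return pivot
            }
        }
    }
}
```

Because the middle block is never empty, the range shrinks on every iteration, and an array with all elements equal terminates immediately.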
Tags: algorithm, swift, swift3
Don't you need to manipulate k before iterating the 2nd partition? (See wikipedia talk page.)
– greybeard
Nov 5 '16 at 8:54
@greybeard: No, all indices (low, high, pivotIndex) always refer to the original array indices. I am fairly sure that the method works correctly. Of course I may have overlooked a special case, but I tested it with many random and constructed arrays.
– Martin R
Nov 12 '16 at 10:53
edited Nov 15 at 13:22
asked Nov 1 '16 at 21:14
Martin R
2 Answers
Let's start by updating this line to Swift 4.2 :
let pivotElement = a[Int.random(in: low..<high)]
In my tests, Int.random(in:) is way faster than Int(arc4random_uniform(_:)); the comparisons below don't take that difference into consideration.
Efficiency
1st improvement
There is a small improvement to this algorithm that still gives a measurable performance gain: rearrange the conditions from the most probable to the least probable, so the most likely branches are tested first:
if k <= pivotIndex {
// k-smallest element is in the first partition:
high = pivotIndex
} else if k > pivotIndex + 1 {
// k-smallest element is in the second partition
low = pivotIndex
if a[low] == pivotElement {
low += 1
}
} else {
// Pivot element is the k-smallest:
return pivotElement
}
The first two conditions are equiprobable. The least probable case is left for last.
Benchmarks
The benchmarking code is the following:
let a = Array(1...10_000).shuffled()
var timings: [Double] = []
for k in 1 ... a.count {
let start = mach_absolute_time()
_ = a.kSmallest(k)
let end = mach_absolute_time()
timings.append(Double(end - start)/Double(1e3))
}
let average: Double = timings.reduce(0, +) / Double(timings.count)
print(average, "us")
var timings2: [Double] = []
for k in 1 ... a.count {
let start = mach_absolute_time()
_ = a.kSmallest2(k)
let end = mach_absolute_time()
timings2.append(Double(end - start)/Double(1e3))
}
let average2: Double = timings2.reduce(0, +) / Double(timings2.count)
print(average2, "us")
It prints the average time for looking up one kth smallest element.
kSmallest
is the original, kSmallest2
is the new one. They both operate on the same array a
to ensure fairness.
kSmallest2 is up to 7 μs faster per lookup; the fluctuation is due to the random arrangement of the array's elements. That translates into a total gain of up to ~70 ms for a 10,000-element array:
kSmallest 1.215636265 s (total time)
kSmallest2 1.138085315 s (total time)
In the worst case, in my tests, kSmallest2 may occasionally be 2 μs slower per lookup, which is attributable to the randomness of the pivot choice. Probabilistically, the comparisons favor the second version.
2nd improvement
The following improvement concerns arrays with duplicates, and avoids unnecessary loops:
while a[low] == pivotElement, high - low > 1 {
low += 1
}
Instead of hopping by one index alone:
if a[low] == pivotElement {
low += 1
}
Benchmarks
The following code was used:
let a: [Int] = Array(0..<100)
    .reduce(into: []) { $0 = $0 + Array(repeating: $1, count: Int.random(in: 10..<30)) }
    .shuffled()
var timings1: [Double] = []
for k in 1 ... a.count {
let start = mach_absolute_time()
_ = a.kSmallest(k)
let end = mach_absolute_time()
timings1.append(Double(end - start)/Double(1e6))
}
let average1: Double = timings1.reduce(0, +) / Double(timings1.count)
print("kSmallest", average1, "ms")
var timings2: [Double] = []
for k in 1 ... a.count {
let start = mach_absolute_time()
_ = a.kSmallest2(k)
let end = mach_absolute_time()
timings2.append(Double(end - start)/Double(1e6))
}
let average2: Double = timings2.reduce(0, +) / Double(timings2.count)
print("kSmallest2", average2, "ms")
var timings3: [Double] = []
for k in 1 ... a.count {
let start = mach_absolute_time()
_ = a.kSmallest3(k)
let end = mach_absolute_time()
timings3.append(Double(end - start)/Double(1e6))
}
let average3: Double = timings3.reduce(0, +) / Double(timings3.count)
print("kSmallest3", average3, "ms")
kSmallest3
has both the 1st and 2nd improvements.
Here are the results:
kSmallest 0.026760692307692326 ms
kSmallest2 0.026387321917808244 ms
kSmallest3 0.02211695943097998 ms
Conclusion
In an array with a high number of duplicates, implementing both improvements makes the code ~17% faster than the original. If the array has unique elements, kSmallest2 is naturally the fastest.
Naming
Personally, I would prefer nthSmallest(_ n: Int), for the following reasons:
- n instead of k, since the latter is usually used in constant naming.
- nth instead of n denotes ordinality.
- Since no comparison closure is provided, nthSmallest(_ n: Int) would be preferable to nthElement(_ n: Int), since it makes clear that elements are compared in ascending order. (nth_element is the name of such a function in C++.)
Readability
If readability and conciseness are paramount, here is an alternative that uses the partition(by:) method:
public func nthSmallest(_ n: Int) -> Element {
precondition(count > 0, "No elements to choose from")
precondition(1 <= n && n <= count, "n must be in the range 1...count")
var low = startIndex
var high = endIndex
var mutableArrayCopy = self
while high - low > 1 {
let randomIndex = Int.random(in: low..<high)
let randomElement = mutableArrayCopy[randomIndex]
//pivot will be the index returned by partition
let pivot = mutableArrayCopy.partition { $0 >= randomElement }
if n < pivot + 1 {
high = pivot
} else if n > pivot + 1 {
low = pivot
//Avoids infinite loops when an array has duplicates
while mutableArrayCopy[low] == randomElement, high - low > 1 {
low += 1
}
} else {
return randomElement
}
}
// Only single candidate left:
return mutableArrayCopy[low]
}
It is at least 50% slower in my tests because partition(by:) is O(n). The original swapping code is faster since it only considers swapping elements at indices between low and high - 1, and not the whole array.
Here are some tests:
[1, 3, 2, 4, 7, 8, 5, 6, 9, 10].nthSmallest(6) //6
[10, 20, 30, 40, 50, 60, 20, 30, 20, 10].nthSmallest(4) //20
["a", "a", "a", "a", "a"].nthSmallest(3) //"a"
A recursive rewrite of the original code seems more readable, but at the cost of execution time, since each call creates a mutable copy of the array.
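For illustration, such a recursive rewrite could recurse on index bounds over a single mutable copy, keeping the readability without a per-call array copy (kSmallestRecursive is a hypothetical name; uses Swift 4.2 APIs):

```swift
import Foundation

extension Array where Element: Comparable {
    /// Hypothetical recursive variant; not the code under review.
    /// Recurses on index bounds over one shared mutable copy.
    public func kSmallestRecursive(_ k: Int) -> Element {
        precondition(1 <= k && k <= count, "k must be in the range 1...count")
        var a = self    // Single mutable copy, captured by the nested function.
        func select(_ low: Int, _ high: Int) -> Element {
            if high - low == 1 { return a[low] }    // Only one candidate left.
            let pivotElement = a[Int.random(in: low..<high)]
            // Lomuto-style partition: a[low..<pivotIndex] < pivotElement.
            var pivotIndex = low
            for i in low..<high where a[i] < pivotElement {
                a.swapAt(pivotIndex, i)
                pivotIndex += 1
            }
            if k <= pivotIndex {
                return select(low, pivotIndex)
            } else if k == pivotIndex + 1 {
                return pivotElement
            } else {
                var newLow = pivotIndex
                // Same duplicate check as the iterative original.
                if a[newLow] == pivotElement { newLow += 1 }
                return select(newLow, high)
            }
        }
        return select(0, count)
    }
}
```

Since the nested select captures a by reference, no additional copies are made in the recursive calls; only the initial var a = self copy exists.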
In this context, the 2nd smallest element of an array is the second element after sorting the elements in increasing order, i.e. a.kSmallest(k) == a.sorted()[k-1] for 1 <= k <= a.count. In your example, the 2nd, 3rd, and 4th largest elements are all equal to 2.
– Martin R
Nov 17 at 19:25
Thanks for your suggestions. I will check them in more detail later, but in my first test [1, 2, 3, 2, 3, 4, 2].kSmallest3(2) never returned.
– Martin R
Nov 17 at 19:32
@MartinR Could you please benchmark the new code (kSmallest3
)? (Sorry for the delayed update. There is still a third improvement hopefully coming soon 🤞🏻)
– Carpsen90
4 hours ago
There are a lot of loops in here. I would probably turn at least one of them into a recursive function; actually, refactoring high, low, and pivotIndex into let constants with recursion should be faster due to optimizations. (I'm not a good enough FP programmer to mock something up for you quickly.)
Otherwise, one suggestion I'd make (just my preference), you have two major loops smooshed next to each other, perhaps another empty line could help readability:
// Partition elements such that:
// a[i] < pivotElement for low <= i < pivotIndex,
// a[i] >= pivotElement for pivotIndex <= i < high.
var pivotIndex = low
while a[pivotIndex] < pivotElement {
pivotIndex += 1
}
for i in pivotIndex+1 ..< high {
if a[i] < pivotElement {
swap(&a[pivotIndex], &a[i])
pivotIndex += 1
}
}
I would definitely try refactoring some of this to FP, then running it through some measureBlock()s
in Xcode to see if it's faster.
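For reference, measureBlock() was renamed to measure(_:) in Swift 3's XCTest. A minimal sketch of such a performance test (the class name and sampled k values are made up; assumes kSmallest(_:) is in scope, and shuffled() requires Swift 4.2):

```swift
import XCTest

final class KSmallestPerformanceTests: XCTestCase {
    // Hypothetical performance test using XCTest's measure(_:).
    func testKSmallestPerformance() {
        let input = Array(1...10_000).shuffled()
        measure {
            // Sample a spread of k values rather than all of them.
            for k in stride(from: 1, through: input.count, by: 1_000) {
                _ = input.kSmallest(k)
            }
        }
    }
}
```

measure(_:) runs the block several times and reports the average time and standard deviation in Xcode's test report.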
Hope this helps!
Thank you for your answer! – I intentionally chose the iterative version in order to avoid that arrays/slices (which are values in Swift) are passed down and then copied when mutated. I am not sure if a recursive version would really be faster, but I haven't tried it.
– Martin R
Nov 4 '16 at 15:11
@MartinR from the limited testing I've done, let tends to be significantly faster when passing in values (especially ones that are in scope, such as with recursion). high and low are both out of scope. However, I think keeping a as a var would be a good idea, since swift doesn't have persistent collections (to my knowledge). Also, from what I've heard, "copying" data doesn't actually happen (value) until the data has mutated (they share the same memory space); thus, it is essentially a reference. All of those calls to mutate pivotIndex + 1 should be sped up with let and recursion.
– Fluidity
Nov 4 '16 at 15:33
I hope my info isn't wrong xD I'm certainly still learning... :)
– Fluidity
Nov 4 '16 at 15:43
It is correct that arrays (and array slices) are copy-on-write. But here the function does mutate the array (slice), which means that additional copies would be made in the nested calls.
– Martin R
Nov 6 '16 at 15:40
I am sorry, but in its present form ("I would probably turn at least one of them into a recursive function", "I would definitely try refactoring some of this to FP") this answer is too vague to award the bounty.
– Martin R
Nov 12 '16 at 10:55
2 Answers
2
active
oldest
votes
2 Answers
2
active
oldest
votes
active
oldest
votes
active
oldest
votes
up vote
2
down vote
Let's start by updating this line to Swift 4.2 :
let pivotElement = a[Int.random(in: low..<high)]
In my tests, Int.random(in:)
is way faster than Int(arc4random_uniform)
, and the comparisons won't take that into consideration.
Efficiency
1st improvement
There is a small improvement to this algorithm, but it still gives a performance gain by rearranging the conditions from the most probable to the least, in order to take advantage of shortcut execution:
if k <= pivotIndex {
// k-smallest element is in the first partition:
high = pivotIndex
} else if k > pivotIndex + 1 {
// k-smallest element is in the second partition
low = pivotIndex
if a[low] == pivotElement {
low += 1
}
} else {
// Pivot element is the k-smallest:
return pivotElement
}
The first two conditions are equiprobable. The least probable case is left for last.
Benchmarks
The benchmarking code is the following:
let a = Array(1...10_000).shuffled()
var timings: [Double] =
for k in 1 ... a.count {
let start = mach_absolute_time()
_ = a.kSmallest(k)
let end = mach_absolute_time()
timings.append(Double(end - start)/Double(1e3))
}
let average: Double = timings.reduce(0, +) / Double(timings.count)
print(average, "us")
var timings2: [Double] =
for k in 1 ... a.count {
let start = mach_absolute_time()
_ = a.kSmallest2(k)
let end = mach_absolute_time()
timings2.append(Double(end - start)/Double(1e3))
}
let average2: Double = timings2.reduce(0, +) / Double(timings2.count)
print(average2, "us")
It prints the average time for looking up one kth smallest element.
kSmallest
is the original, kSmallest2
is the new one. They both operate on the same array a
to ensure fairness.
kSmallest2
is up to 7μs
faster per lookup. The fluctuation is due to the randomness of the arrangement of the elements of the array. Which translates into up to ~70ms
execution time gain for a 10.000-element array:
kSmallest 1.215636265 s (total time)
kSmallest2 1.138085315 s (total time)
In the worst case, in my tests, kSmallest2
may rarely be 2μs
slower per lookup, and it is to be blamed on the randomness of choosing a pivot. Comparisons should probabilistically favor the second version.
2nd improvement
The following improvement concerns arrays with duplicates, and avoids unnecessary loops:
while a[low] == pivotElement, high - low > 1 {
low += 1
}
Instead of hopping by one index alone:
if a[low] == pivotElement {
low += 1
}
Benchmarks
The following code was used:
let a : [Int] = (Array(0..<100))
.reduce(into: ) { $0 = $0 + Array(repeating: $1, count: Int.random(in: 10..<30))}
.shuffled()
var timings1: [Double] =
for k in 1 ... a.count {
let start = mach_absolute_time()
_ = a.kSmallest(k)
let end = mach_absolute_time()
timings1.append(Double(end - start)/Double(1e6))
}
let average1: Double = timings1.reduce(0, +) / Double(timings1.count)
print("kSmallest", average1, "ms")
var timings2: [Double] =
for k in 1 ... a.count {
let start = mach_absolute_time()
_ = a.kSmallest2(k)
let end = mach_absolute_time()
timings2.append(Double(end - start)/Double(1e6))
}
let average2: Double = timings2.reduce(0, +) / Double(timings2.count)
print("kSmallest2", average2, "ms")
var timings3: [Double] =
for k in 1 ... a.count {
let start = mach_absolute_time()
_ = a.kSmallest3(k)
let end = mach_absolute_time()
timings3.append(Double(end - start)/Double(1e6))
}
let average3: Double = timings3.reduce(0, +) / Double(timings3.count)
print("kSmallest3", average3, "ms")
kSmallest3
has both the 1st and 2nd improvements.
Here are the results:
kSmallest 0.026760692307692326 ms
kSmallest2 0.026387321917808244 ms
kSmallest3 0.02211695943097998 ms
Conclusion
In an array with a high number of duplicates, the original code is now ~17%
faster by implementing both improvements. If the array has unique elements, kSmallest2
is naturally the fastest.
Naming
Personally, I would prefer nthSmallest(_ n: Int)
, for the following reasons:
n
instead ofk
since the latter usually used in constant naming
nth
instead ofn
denotes ordinality- Since the comparison closure isn't provided,
nthSmallest(_ n: Int)
would be preferable tonthElement(_ n: Int)
since it means that we'll be comparing elements in an ascending order. (nth_element
is the name of such a function in C++)
Readability
If readability and conciseness are paramount, here is an alternative that uses the partition(by:)
method :
public func nthSmallest(_ n: Int) -> Element {
precondition(count > 0, "No elements to choose from")
precondition(1 <= n && n <= count, "n must be in the range 1...count")
var low = startIndex
var high = endIndex
var mutableArrayCopy = self
while high - low > 1 {
let randomIndex = Int.random(in: low..<high)
let randomElement = mutableArrayCopy[randomIndex]
//pivot will be the index returned by partition
let pivot = mutableArrayCopy.partition { $0 >= randomElement }
if n < pivot + 1 {
high = pivot
} else if n > pivot + 1 {
low = pivot
//Avoids infinite loops when an array has duplicates
while mutableArrayCopy[low] == randomElement, high - low > 1 {
low += 1
}
} else {
return randomElement
}
}
// Only single candidate left:
return mutableArrayCopy[low]
}
It is at least 50% slower in my tests because partition(b:)
is O(n)
. The original swapping code is faster since it only considers swapping elements at indices between low
and high - 1
and not the whole array.
Here are some tests:
[1, 3, 2, 4, 7, 8, 5, 6, 9, 10].nthSmallest(6) //6
[10, 20, 30, 40, 50, 60, 20, 30, 20, 10].nthSmallest(4) //20
["a", "a", "a", "a", "a"].nthSmallest(3) //"a"
A recursive rewrite of the original code seems more readable. But at the cost of execution time, since each time the function is called, a mutable copy of the array will be created.
In this context, the 2nd smallest element of an array is the second element after sorting the elements in increasing order, i.e.a.kSmallest(k) == a.sorted()[k-1]
for1<=k<=a.count
. In your example, the 2nd, 3rd, and fourth largest elements are all equal to 2.
– Martin R
Nov 17 at 19:25
Thanks for your suggestions. I will check them in more detail later, but in my first test[1, 2, 3, 2, 3, 4, 2].kSmallest3(2)
never returned.
– Martin R
Nov 17 at 19:32
@MartinR Could you please benchmark the new code (kSmallest3
)? (Sorry for the delayed update. There is still a third improvement hopefully coming soon 🤞🏻)
– Carpsen90
4 hours ago
add a comment |
up vote
2
down vote
Let's start by updating this line to Swift 4.2 :
let pivotElement = a[Int.random(in: low..<high)]
In my tests, Int.random(in:)
is way faster than Int(arc4random_uniform)
, and the comparisons won't take that into consideration.
Efficiency
1st improvement
There is a small improvement to this algorithm, but it still gives a performance gain by rearranging the conditions from the most probable to the least, in order to take advantage of shortcut execution:
if k <= pivotIndex {
// k-smallest element is in the first partition:
high = pivotIndex
} else if k > pivotIndex + 1 {
// k-smallest element is in the second partition
low = pivotIndex
if a[low] == pivotElement {
low += 1
}
} else {
// Pivot element is the k-smallest:
return pivotElement
}
The first two conditions are equiprobable. The least probable case is left for last.
Benchmarks
The benchmarking code is the following:
let a = Array(1...10_000).shuffled()
var timings: [Double] =
for k in 1 ... a.count {
let start = mach_absolute_time()
_ = a.kSmallest(k)
let end = mach_absolute_time()
timings.append(Double(end - start)/Double(1e3))
}
let average: Double = timings.reduce(0, +) / Double(timings.count)
print(average, "us")
var timings2: [Double] =
for k in 1 ... a.count {
let start = mach_absolute_time()
_ = a.kSmallest2(k)
let end = mach_absolute_time()
timings2.append(Double(end - start)/Double(1e3))
}
let average2: Double = timings2.reduce(0, +) / Double(timings2.count)
print(average2, "us")
It prints the average time for looking up one kth smallest element.
kSmallest
is the original, kSmallest2
is the new one. They both operate on the same array a
to ensure fairness.
kSmallest2
is up to 7μs
faster per lookup. The fluctuation is due to the randomness of the arrangement of the elements of the array. Which translates into up to ~70ms
execution time gain for a 10.000-element array:
kSmallest 1.215636265 s (total time)
kSmallest2 1.138085315 s (total time)
In the worst case, in my tests, kSmallest2
may rarely be 2μs
slower per lookup, and it is to be blamed on the randomness of choosing a pivot. Comparisons should probabilistically favor the second version.
2nd improvement
The following improvement concerns arrays with duplicates, and avoids unnecessary loops:
while a[low] == pivotElement, high - low > 1 {
low += 1
}
Instead of hopping by one index alone:
if a[low] == pivotElement {
low += 1
}
Benchmarks
The following code was used:
let a : [Int] = (Array(0..<100))
.reduce(into: ) { $0 = $0 + Array(repeating: $1, count: Int.random(in: 10..<30))}
.shuffled()
var timings1: [Double] =
for k in 1 ... a.count {
let start = mach_absolute_time()
_ = a.kSmallest(k)
let end = mach_absolute_time()
timings1.append(Double(end - start)/Double(1e6))
}
let average1: Double = timings1.reduce(0, +) / Double(timings1.count)
print("kSmallest", average1, "ms")
var timings2: [Double] =
for k in 1 ... a.count {
let start = mach_absolute_time()
_ = a.kSmallest2(k)
let end = mach_absolute_time()
timings2.append(Double(end - start)/Double(1e6))
}
let average2: Double = timings2.reduce(0, +) / Double(timings2.count)
print("kSmallest2", average2, "ms")
var timings3: [Double] =
for k in 1 ... a.count {
let start = mach_absolute_time()
_ = a.kSmallest3(k)
let end = mach_absolute_time()
timings3.append(Double(end - start)/Double(1e6))
}
let average3: Double = timings3.reduce(0, +) / Double(timings3.count)
print("kSmallest3", average3, "ms")
kSmallest3
has both the 1st and 2nd improvements.
Here are the results:
kSmallest 0.026760692307692326 ms
kSmallest2 0.026387321917808244 ms
kSmallest3 0.02211695943097998 ms
Conclusion
In an array with a high number of duplicates, the original code is now ~17%
faster by implementing both improvements. If the array has unique elements, kSmallest2
is naturally the fastest.
Naming
Personally, I would prefer nthSmallest(_ n: Int)
, for the following reasons:
n
instead ofk
since the latter usually used in constant naming
nth
instead ofn
denotes ordinality- Since the comparison closure isn't provided,
nthSmallest(_ n: Int)
would be preferable tonthElement(_ n: Int)
since it means that we'll be comparing elements in an ascending order. (nth_element
is the name of such a function in C++)
Readability
If readability and conciseness are paramount, here is an alternative that uses the partition(by:)
method :
public func nthSmallest(_ n: Int) -> Element {
    precondition(count > 0, "No elements to choose from")
    precondition(1 <= n && n <= count, "n must be in the range 1...count")
    var low = startIndex
    var high = endIndex
    var mutableArrayCopy = self
    while high - low > 1 {
        let randomIndex = Int.random(in: low..<high)
        let randomElement = mutableArrayCopy[randomIndex]
        // pivot will be the index returned by partition
        let pivot = mutableArrayCopy.partition { $0 >= randomElement }
        if n < pivot + 1 {
            high = pivot
        } else if n > pivot + 1 {
            low = pivot
            // Avoids infinite loops when an array has duplicates
            while mutableArrayCopy[low] == randomElement, high - low > 1 {
                low += 1
            }
        } else {
            return randomElement
        }
    }
    // Only single candidate left:
    return mutableArrayCopy[low]
}
It is at least 50% slower in my tests because partition(by:)
is O(n). The original swapping code is faster since it only considers swapping elements at indices between low
and high - 1, not the whole array.
Here are some tests:
[1, 3, 2, 4, 7, 8, 5, 6, 9, 10].nthSmallest(6) //6
[10, 20, 30, 40, 50, 60, 20, 30, 20, 10].nthSmallest(4) //20
["a", "a", "a", "a", "a"].nthSmallest(3) //"a"
A recursive rewrite of the original code seems more readable, but at the cost of execution time, since each call creates a mutable copy of the array.
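For illustration, here is one possible shape such a recursive version could take. This is only a sketch (the name nthSmallestRecursive is made up for this example); it uses a filter-based three-way partition for clarity, which allocates new arrays on every call instead of swapping in place, so it trades memory and speed for readability:

```swift
extension Array where Element: Comparable {
    public func nthSmallestRecursive(_ n: Int) -> Element {
        precondition(1 <= n && n <= count, "n must be in the range 1...count")

        func select(_ a: [Element], _ n: Int) -> Element {
            if a.count == 1 { return a[0] }
            let pivot = a[Int.random(in: a.indices)]
            // Three-way split: strictly smaller, equal, strictly larger.
            let smaller = a.filter { $0 < pivot }
            let equal   = a.filter { $0 == pivot }
            if n <= smaller.count {
                return select(smaller, n)
            } else if n <= smaller.count + equal.count {
                return pivot // The pivot itself is the n-th smallest.
            } else {
                return select(a.filter { $0 > pivot },
                              n - smaller.count - equal.count)
            }
        }
        return select(self, n)
    }
}

// [1, 2, 3, 2, 3, 4, 2].nthSmallestRecursive(2) == 2
```

The three-way split also sidesteps the duplicate-element infinite loop discussed above, because each recursive call excludes the pivot's entire run of equal elements.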
In this context, the 2nd smallest element of an array is the second element after sorting the elements in increasing order, i.e. a.kSmallest(k) == a.sorted()[k-1]
for 1 <= k <= a.count. In your example, the 2nd, 3rd, and 4th largest elements are all equal to 2.
– Martin R
Nov 17 at 19:25
Thanks for your suggestions. I will check them in more detail later, but in my first test, [1, 2, 3, 2, 3, 4, 2].kSmallest3(2)
never returned.
– Martin R
Nov 17 at 19:32
@MartinR Could you please benchmark the new code (kSmallest3
)? (Sorry for the delayed update. There is still a third improvement hopefully coming soon 🤞🏻)
– Carpsen90
4 hours ago
edited 4 hours ago
answered Nov 17 at 15:42
Carpsen90
1798
up vote
1
down vote
There are a lot of loops in here. I would probably turn at least one of them into a recursive function; actually, refactoring high, low and pivotIndex into let
constants with recursion should be faster due to optimizations. (I'm not a good enough FP programmer to quickly mock something up for you.)
Otherwise, one suggestion I'd make (just my preference), you have two major loops smooshed next to each other, perhaps another empty line could help readability:
// Partition elements such that:
//    a[i] < pivotElement  for low <= i < pivotIndex,
//    a[i] >= pivotElement for pivotIndex <= i < high.
var pivotIndex = low
while a[pivotIndex] < pivotElement {
    pivotIndex += 1
}
for i in pivotIndex+1 ..< high {
    if a[i] < pivotElement {
        swap(&a[pivotIndex], &a[i])
        pivotIndex += 1
    }
}
I would definitely try refactoring some of this to FP, then running it through some measureBlock()s
in Xcode to see if it's faster.
Hope this helps!
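A side note on measureBlock(): that is the Swift 2 spelling; from Swift 3 on, the XCTest method is measure(_:). A minimal harness, assuming the kSmallest extension from the question is compiled into the test target, might look like:

```swift
import XCTest

class QuickselectPerformanceTests: XCTestCase {

    func testKSmallestPerformance() {
        // Swift 3-style Fisher-Yates shuffle using arc4random_uniform.
        var a = Array(1...10_000)
        for i in stride(from: a.count - 1, to: 0, by: -1) {
            let j = Int(arc4random_uniform(UInt32(i + 1)))
            if i != j { swap(&a[i], &a[j]) }
        }

        // XCTest runs the block several times and reports the average.
        measure {
            for k in stride(from: 1, through: a.count, by: 100) {
                _ = a.kSmallest(k)
            }
        }
    }
}
```

This gives averaged, repeatable numbers without hand-rolling mach_absolute_time arithmetic.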
Thank you for your answer! – I intentionally chose the iterative version in order to avoid that arrays/slices (which are values in Swift) are passed down and then copied when mutated. I am not sure if a recursive version would really be faster, but I haven't tried it.
– Martin R
Nov 4 '16 at 15:11
@MartinR from the limited testing I've done, let tends to be significantly faster when passing in values (especially ones that are in-scope such as with recursion). high and low are both out of scope. However, I think keeping a as a var would be a good idea, since Swift doesn't have persistent collections (to my knowledge). Also, from what I've heard, "copying" data doesn't actually happen (value semantics) until the data has mutated (they share the same memory space) -- thus, it is essentially a reference. All of those calls to mutate pivotIndex + 1 should be sped up with let and recursion.
– Fluidity
Nov 4 '16 at 15:33
I hope my info isn't wrong xD I'm certainly still learning... :)
– Fluidity
Nov 4 '16 at 15:43
It is correct that arrays (and array slices) are copy-on-write. But here the function does mutate the array (slice), which means that additional copies would be made in the nested calls.
– Martin R
Nov 6 '16 at 15:40
I am sorry, but in its present form ("I would probably turn at least one of them into a recursive function", "I would definitely try refactoring some of this to FP") this answer is too vague to award the bounty.
– Martin R
Nov 12 '16 at 10:55
edited Nov 4 '16 at 15:44
answered Nov 4 '16 at 14:53
Fluidity
15811
Don't you need to manipulate k before iterating the 2nd partition? (See the Wikipedia talk page.)
– greybeard
Nov 5 '16 at 8:54
@greybeard: No, all indices (low, high, pivotIndex) always refer to the original array indices. I am fairly sure that the method works correctly. Of course I may have overlooked a special case, but I tested it with many random and constructed arrays.
– Martin R
Nov 12 '16 at 10:53
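The kind of randomized check described can be sketched as follows; it validates kSmallest against the sorted-array definition on small arrays with many duplicates. The extension body is the question's own code, lightly modernized for current Swift (Int.random instead of arc4random_uniform, and swapAt instead of swap(&,&), which traps under exclusive-access checking when both arguments index the same array):

```swift
import Foundation

extension Array where Element: Comparable {
    public func kSmallest(_ k: Int) -> Element {
        precondition(1 <= k && k <= count, "k must be in the range 1...count")
        var a = self // A mutable copy.
        var low = startIndex
        var high = endIndex
        while high - low > 1 {
            let pivotElement = a[Int.random(in: low..<high)]
            var pivotIndex = low
            while a[pivotIndex] < pivotElement { pivotIndex += 1 }
            for i in pivotIndex + 1 ..< high where a[i] < pivotElement {
                a.swapAt(pivotIndex, i)
                pivotIndex += 1
            }
            if k <= pivotIndex {
                high = pivotIndex
            } else if k == pivotIndex + 1 {
                return pivotElement
            } else {
                low = pivotIndex
                if a[low] == pivotElement { low += 1 }
            }
        }
        return a[low]
    }
}

// Randomized property check: a.kSmallest(k) must equal a.sorted()[k - 1],
// including on arrays with many duplicates.
for _ in 0..<100 {
    let size = Int.random(in: 1...50)
    let a = (0..<size).map { _ in Int.random(in: 0..<10) }
    let expected = a.sorted()
    for k in 1...a.count {
        assert(a.kSmallest(k) == expected[k - 1], "mismatch for k = \(k) in \(a)")
    }
}
print("all checks passed")
```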