Rather than an edge case, I'd call this a case of trying to satisfy two incompat...

xg15 · on Feb 20, 2017

Couldn't you easily satisfy both goals by moving the "copy all elements not marked for deletion into a new array" code into a "finally" clause?

If I remember correctly, that should ensure the code is executed even if the block contains a break.

That said, I'd wager Ruby's strange way of handling breaks in blocks passed to functions is the real source of problems here.

phamilton · on Feb 20, 2017

`..into a new array` is precisely what `reject` does. We're talking about `reject!` which modifies the existing array.

DannyBee · on Feb 21, 2017

Isn't the simple solution for doing it in place "swap reject element position with current end of array, decrement size of array?" What am i missing?

perfmode · on Feb 21, 2017

Wouldn't that change the order of elements?

DannyBee · on Feb 21, 2017

The only guarantee i saw was that the output array did not contain the rejected elements. (and looking at the reference docs, that's all it says)

It appears order is guaranteed to be maintained but it's not specified :)

danfang · on Feb 21, 2017

No. [1,2,3].reject! { |x| x == 1 } would give you [3,2], rearranging the order of elements. It would work for sets.

rtpg · on Feb 21, 2017

maybe there's a finalization step that's necessary to free up memory? This seems like the most straightforward solution though

brandonbloom · on Feb 21, 2017

xg15's comment still makes sense if you replace "copy all ..." with "slide the remaining elements down by the number of removed elements" to match the mutative behavior.

However, a bigger problem is concurrent modification: What happens if your reject! predicate has a side effect which modifies the array? Ideally you'd receive a "ConcurrentModificationException" or the like.

leereeves · on Feb 21, 2017

That would still require O(n) space (to store the list of elements to keep or remove) so why not simply use the regular `reject` ?

junke · on Feb 21, 2017

No, before sliding elements, you set a local variable "dirty" to true. You iterate over all elements and keep the ones that are not rejected by moving them at the end of the current max index. When you code finishes normally, dirty is set to false.

You have an "ensure" block (finally) protecting the code which handles the cases where dirty is true (moving all the remaining elements after the current max). Then you truncate the array.

brandonbloom · on Feb 21, 2017

Because you're saving the O(N) time copy...

iirc, Ruby's array doesn't expose the distinction between length and capacity, but internally it may decide to amortize the cost of such copies. For example, if reject only removes a small % of elements, it may choose not to shrink capacity.

leereeves · on Feb 21, 2017

You're not saving the O(N) time copy, just replacing it with an O(N) slide.

(That's assuming you copy at all, instead of simply updating the reference.)

brandonbloom · on Feb 22, 2017

I meant O(N) space, as you said "That would still require O(n) space" - but a slide does not require additional space.

rntz · on Feb 21, 2017

In a multithreaded context, even that isn't perfect; another thread can observe the array in an inconsistent state while you're mutating it. I wonder what Ruby does here?

drodgers · on Feb 21, 2017

If you're in a multithreaded context, then you need to handle locking of shared mutable memory; Ruby doesn't fix that for you (but it does make it pretty easy with stdlib Mutex and block syntax). MRI happens to make many basic operations thread-safe because of the GIL, but you shouldn't get into the habit of relying on that naively because it will stop working as you expect as soon as you start thinking of complex operations like reject! as 'just another atomic operation'.

Here's an example of how you'd manage the contention yourself:

    my_array = [1, 2, 3]
    my_array_lock = Mutex.new

    Thread.new do
        my_array_lock.synchronize do
            my_array.reject! { |x| x % 2 == 0 }
        end
    end

    Thread.new do
        my_array_lock.synchronize do
            my_array.push(4)
        end
    end

    Thread.new do
        my_array_lock.synchronize do
            # Because we've serialised our operations on the array, there's no chance
            # that we read the elements in an incorrect order
            my_array.each do |x|
                puts x
            end
        end
    end