Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Rather than an edge case, I'd call this a case of trying to satisfy two incompatible goals: fast and internally consistent after each deletion.

Either implementation will be a poor choice for some use cases (hence the bug reports about both).



Couldn't you easily satisfy both goals by moving the "copy all elements not marked for deletion into a new array" code into a "finally" clause?

If I remember correctly, that should ensure the code is executed even if the block contains a break.

That said, I'd wager Ruby's strange way of handling breaks in blocks passed to functions is the real source of problems here.


`..into a new array` is precisely what `reject` does. We're talking about `reject!` which modifies the existing array.


Isn't the simple solution for doing it in place "swap reject element position with current end of array, decrement size of array?" What am i missing?


Wouldn't that change the order of elements?


The only guarantee i saw was that the output array did not contain the rejected elements. (and looking at the reference docs, that's all it says)

It appears order is guaranteed to be maintained but it's not specified :)


No. [1,2,3].reject! { |x| x == 1 } would give you [3,2], rearranging the order of elements. It would work for sets.


maybe there's a finalization step that's necessary to free up memory? This seems like the most straightforward solution though


xg15's comment still makes sense if you replace "copy all ..." with "slide the remaining elements down by the number of removed elements" to match the mutative behavior.

However, a bigger problem is concurrent modification: What happens if your reject! predicate has a side effect which modifies the array? Ideally you'd receive a "ConcurrentModificationException" or the like.


That would still require O(n) space (to store the list of elements to keep or remove) so why not simply use the regular `reject` ?


No, before sliding elements, you set a local variable "dirty" to true. You iterate over all elements and keep the ones that are not rejected by moving them at the end of the current max index. When you code finishes normally, dirty is set to false.

You have an "ensure" block (finally) protecting the code which handles the cases where dirty is true (moving all the remaining elements after the current max). Then you truncate the array.


Because you're saving the O(N) time copy...

iirc, Ruby's array doesn't expose the distinction between length and capacity, but internally it may decide to amortize the cost of such copies. For example, if reject only removes a small % of elements, it may choose not to shrink capacity.


You're not saving the O(N) time copy, just replacing it with an O(N) slide.

(That's assuming you copy at all, instead of simply updating the reference.)


I meant O(N) space, as you said "That would still require O(n) space" - but a slide does not require additional space.


In a multithreaded context, even that isn't perfect; another thread can observe the array in an inconsistent state while you're mutating it. I wonder what Ruby does here?


If you're in a multithreaded context, then you need to handle locking of shared mutable memory; Ruby doesn't fix that for you (but it does make it pretty easy with stdlib Mutex and block syntax). MRI happens to make many basic operations thread-safe because of the GIL, but you shouldn't get into the habit of relying on that naively because it will stop working as you expect as soon as you start thinking of complex operations like reject! as 'just another atomic operation'.

Here's an example of how you'd manage the contention yourself:

    my_array = [1, 2, 3]
    my_array_lock = Mutex.new

    Thread.new do
        my_array_lock.synchronize do
            my_array.reject! { |x| x % 2 == 0 }
        end
    end

    Thread.new do
        my_array_lock.synchronize do
            my_array.push(4)
        end
    end

    Thread.new do
        my_array_lock.synchronize do
            # Because we've serialised our operations on the array, there's no chance
            # that we read the elements in an incorrect order
            my_array.each do |x|
                puts x
            end
        end
    end




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: