No More Array Errors (Part II) - Array_alg(C++)

Array_set removal

The Array_set function remove() can be implemented in a couple of different ways. First, we will use the Array Algorithm rem(), described in rem(C++), which searches sequentially for elements with a given value, removes them, and compacts storage as it goes by moving the remaining elements to the left (that is, toward lower array indices). rem() returns a pointer just beyond the last remaining element so that if the function result is less than b+n, it means that one or more elements have been removed:

        unsigned remove(const T& t,int count=1){
           check();
           unsigned result=0;
           if(count>0 &&
               (result=(rem(t,&b[0],&b[n])<&b[n]))){
               n--;
           }
           check();
           return result;
       }

This code has two problems. Most seriously, it is incorrect! Fortunately, however, the error will be caught when the invariant is checked on exit. It so happens that rem() is an unstable algorithm; that is, it does not guarantee that the relative order of the remaining elements will be preserved. Thus, an array that is sorted on entry to rem() is not guaranteed to be sorted on exit.

Fortunately, rem() is one of a family of algorithms (see ``Algorithm families'') described in rem(C++), and one of the members of this family is a stable algorithm called rem_s(). Let's use it:

        unsigned remove(const T& t,int count=1){
           check();
           unsigned result=0;
           if(count>0 &&
               (result=(rem_s(t,&b[0],&b[n])<&b[n]))){
               n--;
           }
           check();
           return result;
       }

The remaining problem with this implementation is its efficiency. The complexity of rem_s() is O(N); according to rem(C++), it does ``exactly N [equality] tests and at most P assignments,'' where N is the size of the array and P is the number of elements not equal to t. Surely we could do better by taking advantage of the sortedness and uniqueness properties guaranteed by the invariant; specifically, we can locate the element in at most lgN equality tests using bin_search() and close the gap with at most N assignments using copy():

        unsigned remove(const T& t,int count=1){
           check();
           T* p = 0;
           if(count>0 &&
               (p = (T*)bin_search(t,&b[0],&b[n]))){
               copy(p+1,&b[n],p);
               n--;
           }
           check();
           return p!=0;
       }

Although this solution is efficient, there is again an equally efficient way to do it with fewer lines of code: we can use the algorithm set_remove(), described in set_remove(C++):

        unsigned remove(const T& t,int count=1){
           check();
           unsigned result=0;
           if(count>0 &&
               set_remove(t,&b[0],&b[n])<&b[n]){
               n--;
               result=1;
           }
           check();
           return result;
       }