When working with sorted lists you often come to the point where you want to combine two or more of them.
This *merge* procedure forms the heart of merge sort
it works something like:

merge [1,3,4,5] [2,3,4] = [1,2,3,3,4,4,5]

This `merge` function is not in the Haskell standard library, and even if there were, it might not be very useful.

The problem is that when you need `merge` you often need a slight variation.
For example, you might want to remove duplicates,

merge_union [1,3,4,5] [2,3,4] = [1,2,3,4,5]

Or find the elements common to both lists,

merge_intersection [1,3,4,5] [2,3,4] = [3,4]

Or you want the difference, the symmetric difference, or...

The solution for all these problems is to make a more general `merge` function.
To do that we take a note from the most generic function over a single list, `foldr`.
The generic merge function is also a right fold, but over two lists.
Behold the type signature:

mergeByR :: (a -> b -> Ordering) -- ^ cmp: Comparison function -> (a -> b -> c -> c) -- ^ f_{xy}: Combine when a and b are equal -> (a -> c -> c) -- ^ f_{x}: Combine when a is less -> (b -> c -> c) -- ^ f_{y}: Combine when b is less -> c -- ^ z: Base case -> [a] -> [b] -> c -- ^ Argument lists and result list

Don't be scared by the size. The reason there are a lot of arguments is that for each case we use a different combining function:
If the smallest element comes from the first list we use `f _{x}`, if it comes from the second list we use

The lists from the example above can be aligned as follows:

xs = [1, 3, 4, 5 ] ys = [ 2, 3, 4 ] function to use = [f_{x}, f_{y}, f_{xy}, f_{xy}, f_{x}] mergeByR .... = f_{x}1 . f_{y}2 . f_{xy}3 3 . f_{xy}4 4 . f_{x}5 $ z

The function implementation is straightforward:

mergeByR cmp f_{xy}f_{x}f_{y}z = go where go [] ys = foldr f_{y}z ys go xs [] = foldr f_{x}z xs go (x:xs) (y:ys) = case cmp x y of LT -> f_{x}x (go xs (y:ys)) EQ -> f_{xy}x y (go xs ys) GT -> f_{y}y (go (x:xs) ys)

Now, let's look at some uses of this function.
First of all, the usual merge sort `merge` function:

mergeBy cmp = mergeByR cmp (\a b c -> a:b:c) (:) (:) [] merge = mergeBy compare

Instead of adding both `a` and `b` to the resulting list when they are equal, we can instead add only one of them, or even the result of some function on them.
This gives the set `union` operation:

unionByWith cmp f = mergeByR cmp (\a b c -> f a b:c) (:) (:) [] unionWith = unionByWith compare

If we ignore elements that occur in only one of the lists by setting `f _{x}` and

intersectionByWith cmp f = mergeByR cmp (\a b c -> f a b:c) (const id) (const id) [] intersectionWith = intersectionByWith compare

With these merge functions, implementing merge sort becomes simple.
All that is left to do is split a list in two, and recursively sort and merge.

split :: [a] -> ([a],[a]) split (x:y:zs) = let (xs,ys) = split zs in (x:xs,y:ys) split xs = (xs,[]) sort [] = [] sort [x] = [x] sort xs = let (ys,zs) = split xs in merge (sort ys) (sort zs)

If we replace `merge` by `unionWith` we instead get a sort that combines duplicate elements.

Besides set operations, `mergeByR` can also be (ab)used for other things, such as

zipWith = intersectionByWith (const $ const EQ)

Or a variant of `zipWith`, that keeps the tail of the longer list:

zipWith' = unionByWith (const $ const EQ)

We can even implement concatenation:

(++) = mergeByR (const $ const LT) undefined (:) (:) []

## Comments

I like this generic merge function. It is interesting that you made it possible for the function to return results of other types than lists, something I probably wouldn't have done. But none of your examples use the fact that you can return some other type. Do you have such an example? It would be nice to see if the extra power is really motivated.

My generic merge looks like:

So that captures, once and for all, the merging aspect but it leaves you with the problem of what to do with the MergeResult. Most uses do a mergeBy followed by a fold of some kind.

For example we can select out subsets, like the intersection, or lefts paired with maybe rights or vice versa. We can do it lazily with a right fold or if we're checking that all the input falls into some category we can do a left fold, accumulating the good results and failing if we find any bad, or partitioning the good and bad and returning both.

I suspect these two merges are equivalent. I could probably implement mine in terms of yours like:

The other way round is probably a foldr where we do case analysis on the merge result and apply on of

f_xy f_x f_yas appropriate.Josef: The flexible return type allows you to combine

mergeByRwith other folds, in the same way as foldr/build list fusion. For instance, conting the number of common elements with(length . intersection)can be fused to(mergeR (\_ _ n -> n+1) id id 0).Duncan: Yes, they are the same (or at most a foldr away). Case analysis on your

MergeResultis captured by a deconstructor with the type:Now

foldr (deMergeResult fx fxy fy) z . mergeBy cmpis the same asmergeByR.Would you mind if I kidnapped this for list-extras? I essentially already have it hidden away in Data.List.Extras.Pair, though not quite generalized from zipping to the merge case.

It can be generalized further to distinguish between the LT/GT cases and the cases where one list is longer than the other--- the particular use case being to detect length mismatch in a single traversal as is done by pairWithBy and friends.

Er, your blog seems to have eaten the link: http://hackage.haskell.org/cgi-bin/hackage-scripts/package/list-extras

The genericity of this function is great, but there is no such thing as a merge/intersection. You either want to merge or to intersect list but the two concepts are mutually exclusive. Using the appropriate one will probably be faster and easier to understand. Remember the UNIX philosophy? It applies very well to functional programming. Do one thing, but do it well

Twan, you don't need the extra general type to make your function work well with foldr/build fusion. The more specific alternative returning a list can be made to fuse just as well.

It's interesting that you should mention foldr/build fusion because your function is an example of where foldr/build fusion doesn't work that well. Suppose you have something like 'mergeByR cmp fxy fx fy z (build f) (build g)'. Now we would like to fuse both argument lists so that no intermediate lists remain. This seems impossible to do with the foldr/build framework. Switching to destroy/unfoldr solves the problem.

Nice! Though to be really (datatype-)generic, I would expect it not to take lists, but just "any"

f(probably aFunctor f, orTraversable f, or so).José: Most Functors can not be merged. For example,

data Pair a = Pair a ais a Functor, but it doesn't even allow concatenation.But there are some possibilities for further generalization. Note that

Orderingis isomorphic to a pair(Bool,Bool), not both False, saying which items come first. The combining functions could be replaced by a single one with typeMaybe a -> Maybe b -> c -> c, where it is never the case that both arguments are Nothing.To merge three lists you could use triples

(Bool,Bool,Bool), and alotof combining functions. Maybe you can do something interesting with a type other thanBool.## Reply