Finding rectangles, part 2: borders

In the previous post, we looked at finding axis aligned rectangles in a binary image. Today I am going to solve a variation of that problem:

Given a binary image, find the largest axis aligned rectangle with a 1 pixel wide border that consists entirely of foreground pixels.

Here is an example:
,
where white pixels are the background and blue is the foreground. The rectangle with the largest area is indicated in red.

Like the previous rectangle finding problem, this one also came up in my masters thesis. The application was to, given a scan of a book, find the part that is a page, cutting away clutter:
.

Specification

The types we are going to need are exactly the same as in my previous post:

-- An image is a 2D list of booleans, True is the foreground
type Image = [[Bool]]
-- An axis-aligned rectangle data Rect = Rect { left, top, width, height :: Int } deriving (Eq,Ord,Show)

The difference compared to last time is the contains function, which tells whether an image contains a given rectangle. We are now looking only at the borders of rectangles, or 'border rectangles' for short.

-- does an image contain a given border rectangle?
contains :: Image -> Rect -> Bool
contains im rect = isBorder (cropRect im rect)
-- crop an image to the pixels inside a given rectangle cropRect :: Image -> Rect -> Image cropRect im (Rect x y w h) = map cols (rows im) where rows = take h . drop y . (++repeat []) cols = take w . drop x . (++repeat False)
-- is the border of an image foreground? isBorder :: Image -> Bool isBorder im = and (head im) -- top border && and (last im) -- bottom border && and (map head im) -- left border && and (map last im) -- right border

Finding the largest border rectangle can again be done by enumerating all rectangles contained in the image, and picking the largest one:

largestRectspec :: Image -> Rect
largestRectspec = maximalRectBy area . allRects
allRects :: Image -> [Rect] allRects im = filter (im `contains`) rects where -- boring details omitted, see previous post

Just as before, this specification has runtime O(n6) for an n by n image, which is completely impractical.

An O(n4) algorithm

Unfortunately, the nice properties of maximal rectangles will not help us out this time. In particular, whenever a filled rectangle is contained in an image, then so are all smaller subrectangles So we could 'grow' filled rectangles one row or column at a time. This is no longer true for border rectangles.

We can, however, easily improve the above O(n6) algorithm to an O(n4) one by using the line endpoints. With those we can check if an image contains a rectangle in constant time. We just need to check all four of the sides:

-- pseudo code, not actually O(n^4) without O(1) array lookup
containsfast im (Rect x y w h)
    = r!!(x,y)     >= x+w -- top border
   && r!!(x,y+h-1) >= x+w -- bottom border
   && b!!(x,y)     >= y+h -- left border
   && b!!(x+w-1,y) >= y+h -- right border

Where r and b give the right and bottom endpoints of the horizontal and vertical lines through each pixel.
r = , b = .

An O(n3) algorithm

As the title of this section hints, a still more efficient algorithm is possible. The trick is to only look for rectangles with a specific height h. For any given height h, we will be able to find only maximal rectangles of that height.

For example, for h=6 we would expect to find these rectangles:
.
Notice how each of these rectangles consist of three parts: a left side, a middle and a right side:
.

The left and right parts both consist of a vertical line at least h pixels high. We can find those vertical lines by looking at the top (or bottom) line endpoints. The top endpoint for pixel (x,y+h-1) should be at most y,

let h = 6 -- for example
let av = zipWith2d (<=) (drop (h-1) t) y
-- in images:
av =  <= 
   = 

Each True pixel in av corresponds to a point where there is a h pixel high vertical line. So, a potential left or right side of a rectangle.

The middle part of each rectangle has both pixel (x,y) and (x,y+h-1) set,

let ah = zipWith2d (&&) a (drop (h-1) a)
-- in images:
ah =  && 
   = 

To find the rectangles of height h, we just need to find runs that start and end with a pixel in av, and where all pixels in between are in ah. First we find the left coordinates of the rectangles,

let leStep (av,ah,x) le
      | av = min le x  -- pixel in av ==> left part
      | ah = le        -- pixel in ah, but not av ==> continue middle part
      | otherwise = maxBound
let le = scanRightward leStep maxBound (zip2d3 av ah x)
le = 

Finally we need to look for right sides. These are again given by av. For each right side, le gives the leftmost left side, and h gives the height of the rectangles:

let mkRect x y av le
      | av = [Rect le y (x-le+1) h] -- pixel in av ==> right part
      | otherwise = []
let rects = zipWith2d4 mkRect x y av le
rects = 

Compare the resulting image to the one at the start of this section. We found the same rectangles.

Just like last time, all we need to do now is put the steps together in a function:

rectsWithHeight :: Int -> Image -> [Rect]
rectsWithHeight h a = concat . concat $ rects
  where
    x  = scanRightward (\_ x -> x + 1) (-1) a
    y  = scanDownward  (\_ y -> y + 1) (-1) a
    t  = scanDownward  (\(a,y) t -> if a then t else y+1) 0 (zip2d a y)
    ah = zipWith2d (&&) (drop (h-1) a) a
    av = zipWith2d (<=) (drop (h-1) t) y
    leStep (av,ah,x) le
      | av = min le x
      | ah = le
      | otherwise = maxBound
    le = scanRightward leStep maxBound (zip2d3 av ah x)
    mkRect x y av le
      | av = [Rect le y (x-le+1) h]
      | otherwise = []
    rects = zipWith2d4 mkRect x y av le

Of course, finding (a superset of) all maximal rectangles in an image is just a matter of calling rectsWithHeight for all possible heights.

findRectsfast :: Image -> [Rect]
findRectsfast im = concat [ rectsWithHeight h im | h <- [1..imHeight im] ]
largestRectfast :: Image -> Rect largestRectfast = maximalRectBy area . findRectsfast

Let's quickly check that this function does the same as the specification,

propfast_spec = forAll genImage $ \a -> largestRectspec a == largestRectfast a
λ> quickCheck propfast_spec
+++ OK, passed 100 tests.

Great.

Conclusions

The runtime of rectsWithHeight is linear in the number of pixels; and it is called n times for an n by n image. Therefore the total runtime of largestRectfast is O(n3). While this is much better than what we started with, it can still be quite slow. For example, the book page that motivated this problem is around 2000 pixels squared. Finding the largest rectangle takes on the order of 20003 = 8*109, or 8 giga-operations, which is still a pretty large number.

To make this algorithm faster in practice, I used a couple of tricks. Most importantly, if we know what function we are maximizing, say area, then we can stop as soon as we know that we can't possibly find a better rectangle. The idea is to start with h=imHeight im, and work downwards. Keep track of the area a of the largest rectangle. Then as soon as h * imWidth im < a, we can stop, because any rectangle we can find from then on will be smaller.

Is this the best we can do? No. I know an algorithm for finding all maximal border rectangles in O(n2*(log n)2) time. But it is rather complicated, and this post is long enough already. So I will save it for another time. If anyone thinks they can come up with such an algorithm themselves, I would love to read about it in the comments.

Comments

paolinoDate: 2011-10-13T13:29Zx

Hello Twan, while the problem looks difficult for the general case, it looks like in this specific case we are looking for a surrounding rectangle. If there is one, picking two opposite corners among the big ones should resolve. The element by element product of a modified r and b where we store the black pixel line length instead of the endpoint should reveal the biggest corner , a symmetric operation should reveal its opposite.

Twan van LaarhovenDate: 2011-10-13T14:21Zx

paolino: That method would indeed work. However, there are O(n2) top-left corners, and a similar number of bottom-right one. You would essentially need to check all combinations, which gives a runtime of O(n4). It is not always the case that the bigest rectangle is formed by the bigest corner:

.A#######
.#.......
.#.B###..
.#.#..#..
.#.###b..

Here A is the bigest top-left corner, but it does not form a rectangle (except the 1 pixel wide or tall ones). Instead you need the smaller corner B, together with b'.

paolinoDate: 2011-10-13T15:42Zx

Right, I just see that a search on the other diagonal is also important. Btw, it's right that worst case will be O (n^4). My idea was just trying to mimic what the eyes are doing to find the border and they are fast for the page example.

Twan van LaarhovenDate: 2011-10-16T13:51Zx

paolino: Combining the corners is actually one idea my supervisor came up with when I was working on this problem for my masters thesis. The idea was to sort the top-left and bottom-right corners by size, and then try the largest combinations first.

Your eyes are probably doing something slightly different (at least, mine are). Namely, they look for wider borders. And when you find a nice wide border on the top, left, bottom and right sides, you just combine those.

knifeziaDate: 2011-12-11T21:42Zx

Hello, cool article, but do you have how to determine largest rectangle which have verticles in cells with 1's ? Some fast approach ? :) Thanks for your time.

Twan van LaarhovenDate: 2011-12-12T10:27Zx

knifezia: Do you mean to find rectangles that look like this?

1000001
0000000
0000000
1000001

I have never thought about that.

The technique from this post can also be used for this problem. That is, find rectangles with a fixed height. The runtime would be O(n3). The faster algorithm described in my next post on this topic (which I haven't published yet) can perhaps also be adapted, but I don't yet see a way to do that.

Reply

(optional)
(optional, will not be revealed)
Name a function of type [[a]] -> [a]:
Use > code for code blocks, @code@ for inline code. Some html is also allowed.