The basic tools/values required for the task are:
- A connected component labeling method;
- Thresholds for determining whether to discard or keep a connected component;
- A metric for calculating the distance between connected components, plus a threshold for deciding whether to join them or not (this is required only if you actually want to do such a thing, which is still unclear).
The first is not available in PIL, but scipy provides it. If you don't want to use scipy either, consider the answer at https://stackoverflow.com/a/14350691/1832154. I took the code from that answer, adapted it to use PIL images instead of plain lists, and assumed its functions were placed in a module called wu_ccl. For the third item I used the simple chessboard distance, comparing every pair of points in an O(n^2) fashion.
Then, discarding components with 200 pixels or fewer, treating components closer than 100 pixels as belonging to the same bounding box, and padding each bounding box by 10 pixels, this is what we get:
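If scipy is acceptable, the labeling and the size threshold can be sketched roughly as follows. This is only an illustration, not the code used for the result shown here: the helper name `large_components` and its `minsize` parameter are made up for the example, and `scipy.ndimage.label` is used with its default structuring element (4-connectivity in 2D), which may differ from what the linked answer does.

```python
import numpy as np
from scipy import ndimage


def large_components(binary, minsize=200):
    """Label connected components and list the labels with more than minsize pixels.

    `binary` is a 2D boolean (or 0/1) array. Hypothetical helper for illustration.
    """
    # ndimage.label returns the labeled array and the number of components.
    labels, ncc = ndimage.label(binary)
    # sizes[k] is the number of pixels carrying label k (label 0 is background).
    sizes = np.bincount(labels.ravel())
    keep = np.flatnonzero(sizes > minsize)
    keep = keep[keep != 0]
    return labels, ncc, keep.tolist()
```

To feed it a PIL image, something like `np.asarray(img.convert('L')) > 30` should give a suitable boolean array, mirroring the threshold used in the script further down.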

You could simply raise the component threshold in order to keep only the largest one. You could also perform the two steps mentioned before this image in the reverse order: first join close components, then discard (this is not done in the code below).
While these are relatively simple tasks, the code is not so short, since we are not relying on any library to perform them. The following example code produces the image above. The merging of connected components is particularly long; I guess writing it in a rush produced much more code than needed.
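For the reverse order (join first, then discard), a minimal sketch could look like the following. It is not part of the code in this answer: the function names, the union-find grouping, and the choice to compare the summed pixel count of a group against the threshold are all assumptions made for illustration. `points` maps each label to its list of (x, y) points and `sizes` maps each label to its pixel count.

```python
from itertools import combinations


def chessboard_dist(a, b):
    """Smallest chessboard (Chebyshev) distance between two lists of (x, y) points."""
    return min(max(abs(x1 - x2), abs(y1 - y2))
               for x1, y1 in a for x2, y2 in b)


def join_then_discard(points, sizes, dist_threshold=100, minsize=200):
    """Group components closer than dist_threshold, then drop groups that are too small."""
    parent = {k: k for k in points}

    def find(k):
        # Follow parent links to the group representative (with path halving).
        while parent[k] != k:
            parent[k] = parent[parent[k]]
            k = parent[k]
        return k

    for a, b in combinations(points, 2):
        ra, rb = find(a), find(b)
        if ra == rb:
            continue  # already in the same group, skip the O(n^2) distance
        if chessboard_dist(points[a], points[b]) < dist_threshold:
            parent[ra] = rb

    groups = {}
    for k in points:
        groups.setdefault(find(k), set()).add(k)
    # Keep only the groups whose combined pixel count exceeds minsize.
    return [g for g in groups.values() if sum(sizes[k] for k in g) > minsize]
```

With this order, several small but nearby components can survive together even though each of them alone would have been discarded.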
import sys
from collections import defaultdict

from PIL import Image, ImageDraw

from wu_ccl import scan, flatten_label


def borders(img):
    # Keep only border pixels: clear every pixel whose 4 neighbors are all set.
    result = img.copy()
    res = result.load()
    im = img.load()
    width, height = img.size
    for x in range(1, width - 1):
        for y in range(1, height - 1):
            if not im[x, y]:
                continue
            if im[x, y-1] and im[x, y+1] and im[x-1, y] and im[x+1, y]:
                res[x, y] = 0
    return result


def do_wu_ccl(img):
    label, p = scan(img)
    ncc = flatten_label(p)
    # Relabel.
    width, height = label.size
    l = label.load()
    for x in range(width):
        for y in range(height):
            if l[x, y]:
                l[x, y] = p[l[x, y]]
    return label, ncc


def calc_dist(a, b):
    # Smallest chessboard distance between any point of a and any point of b.
    dist = float('inf')
    for p1 in a:
        for p2 in b:
            p1p2_chessboard = max(abs(p1[0] - p2[0]), abs(p1[1] - p2[1]))
            if p1p2_chessboard < dist:
                dist = p1p2_chessboard
    return dist


img = Image.open(sys.argv[1]).convert('RGB')
width, height = img.size

# Pad image.
img_padded = Image.new('L', (width + 2, height + 2), 0)
width, height = img_padded.size
# "discard" jpeg artifacts.
img_padded.paste(img.convert('L').point(lambda x: 255 if x > 30 else 0), (1, 1))

# Label the connected components.
label, ncc = do_wu_ccl(img_padded)

# Count number of pixels in each component and discard those too small.
minsize = 200
cc_size = defaultdict(int)
l = label.load()
for x in range(width):
    for y in range(height):
        cc_size[l[x, y]] += 1
cc_filtered = dict((k, v) for k, v in cc_size.items() if k > 0 and v > minsize)

# Consider only the borders of the remaining components.
result = Image.new('L', img.size)
res = result.load()
im = img_padded.load()
l = label.load()
for x in range(1, width - 1):
    for y in range(1, height - 1):
        if im[x, y] and l[x, y] in cc_filtered:
            res[x-1, y-1] = l[x, y]
result = borders(result)
width, height = result.size
# Intermediate result; it is overwritten by the final save below.
result.save(sys.argv[2])

# Collect the border points for each of the remaining components.
res = result.load()
cc_points = defaultdict(list)
for x in range(width):
    for y in range(height):
        if res[x, y]:
            cc_points[res[x, y]].append((x, y))
cc_points_l = list(cc_points.items())

# Perform a dummy O(n^2) method to determine whether two components are close.
grouped_cc = defaultdict(set)
dist_threshold = 100  # pixels
for i in range(len(cc_points_l)):
    ki = cc_points_l[i][0]
    grouped_cc[ki].add(ki)
    for j in range(i + 1, len(cc_points_l)):
        vi = cc_points_l[i][1]
        vj = cc_points_l[j][1]
        kj = cc_points_l[j][0]
        dist = calc_dist(vi, vj)
        if dist < dist_threshold:
            grouped_cc[ki].add(kj)
            grouped_cc[kj].add(ki)

# Flatten groups.
flat_groups = defaultdict(set)
used = set()
for group, v in list(grouped_cc.items()):
    work = set(v)
    if group in used:
        continue
    while work:
        gi = work.pop()
        if gi in flat_groups[group] or gi in used:
            continue
        used.add(gi)
        flat_groups[group].add(gi)
        new = grouped_cc[gi]
        if not flat_groups[group].issuperset(new):
            work.update(new)

# Draw a bounding box around each group.
draw = ImageDraw.Draw(img)
bpad = 10
for cc in flat_groups.values():
    data = []
    for vi in cc:
        data.extend(cc_points[vi])
    xsort = sorted(data)
    ysort = sorted(data, key=lambda x: x[1])
    # Padded bounding box.
    bbox = (xsort[0][0] - bpad, ysort[0][1] - bpad,
            xsort[-1][0] + bpad, ysort[-1][1] + bpad)
    draw.rectangle(bbox, outline=(0, 255, 0))

img.save(sys.argv[2])
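The bounding-box step at the end of the script sorts the points twice just to read off the extremes. A shorter equivalent, sketched here as a hypothetical helper `padded_bbox` (not part of the script above), takes the minimum and maximum directly. The optional clamping to the image size is an addition of this sketch; the original code lets the padded box extend past the image edges.

```python
def padded_bbox(points, pad=10, size=None):
    """Return (left, top, right, bottom) around (x, y) points, grown by pad pixels.

    If size=(width, height) is given, the box is clamped to the image
    (an assumption of this sketch, not something the original script does).
    """
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    left, top = min(xs) - pad, min(ys) - pad
    right, bottom = max(xs) + pad, max(ys) + pad
    if size is not None:
        width, height = size
        left, top = max(left, 0), max(top, 0)
        right, bottom = min(right, width - 1), min(bottom, height - 1)
    return (left, top, right, bottom)
```

The returned tuple has the same layout as the `bbox` passed to `draw.rectangle` in the script.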
Again, the function wu_ccl.scan (taken from the mentioned answer) needs to be adjusted; to do that, consider creating an image with mode 'I' inside it instead of using nested Python lists. I also made a slight change to flatten_label so that it returns the number of connected components (although that value is not actually used in the final code presented here).
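To illustrate the kind of change meant for flatten_label, here is a minimal sketch of a flattening pass that also returns the component count. It is not the code from the linked answer: it assumes the parent list `p` has the background at index 0 and that every entry points to itself or to an earlier index, which is the usual invariant in Wu-style two-pass labeling.

```python
def flatten_label(p):
    """Rewrite the parent list p in place so labels become 1..k, and return k.

    Assumes p[0] is the background and p[i] <= i for every i.
    """
    k = 0
    for i in range(1, len(p)):
        if p[i] < i:
            # p[i] points to an earlier entry, which already holds its final label.
            p[i] = p[p[i]]
        else:
            # i is a root: give it the next consecutive label.
            k += 1
            p[i] = k
    return k
```

For example, with `p = [0, 1, 1, 3, 3, 2]` the list becomes `[0, 1, 1, 2, 2, 1]` and the function returns 2.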