Remove Remains In A Letter Image With Python

I have a set of images that represent letters extracted from an image of a word. In some images there are remains of the adjacent letters and I want to eliminate them but I do not

Solution 1:

At the start of your question you mention that the letters were extracted from an image of a word.

If that extraction had separated the characters cleanly, you would not have run into this problem. The solution below works either way: you can apply it to the original word image to extract the letters, or to the letter images you already have to separate each letter from the remains of its neighbours.

Answer:

You can separate the characters by finding the convex hull of each contour and using the filled hull as a mask, like this.

code:

import cv2
import numpy as np

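# Load as grayscale and invert in place so the letters become white on black
# (findContours treats non-zero pixels as foreground)
img = cv2.imread('test.png', 0)
cv2.bitwise_not(img, img)
img2 = img.copy()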

ret, threshed_img = cv2.threshold(img, 127, 255, cv2.THRESH_BINARY)
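# findContours returns 3 values in OpenCV 3.x but 2 in OpenCV 4.x;
# the last two are always (contours, hierarchy)
contours, hier = cv2.findContours(threshed_img, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)[-2:]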

#--- Black image to be used to draw individual convex hull ---
black = np.zeros_like(img)
contours = sorted(contours, key=lambda ctr: cv2.boundingRect(ctr)[0])

for cnt in contours:
    hull = cv2.convexHull(cnt)

    #--- Fresh black canvas for this contour's mask ---
    black2 = black.copy()

    #--- Here is where I am filling the contour after finding the convex hull ---
    cv2.drawContours(black2, [hull], -1, (255, 255, 255), -1)
    r, t2 = cv2.threshold(black2, 127, 255, cv2.THRESH_BINARY)
    masked = cv2.bitwise_and(img2, img2, mask = t2)
    cv2.imshow("masked.jpg", masked)
    cv2.waitKey(0)

cv2.destroyAllWindows()

So, as I suggested, it is better to apply this when extracting the characters from the original image than to remove noise after extraction.

Solution 2:

I would try the following:

  1. Sum each image along its columns, so that the image is projected onto a vector with one value per column.
  2. Assuming that white = 0 and black = 1, find the first index in that vector whose value is 0 (the first blank column).
  3. Remove the image columns to the left of the index from step 2.
  4. Reverse the summed vector from step 1.
  5. Find the first index whose value is 0 in the reversed vector from step 4.
  6. Map that index back to the original column order and remove the image columns to the right of it.

This works nicely for a binary image where white = 0 and black = 1. If your images are not in that form, there are several ways around it, including thresholding the image first or using a tolerance level (e.g. in step 2, find the first index in the vector whose value is at or below the tolerance rather than exactly 0).
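The steps above can be sketched roughly as follows. This is only an illustration and not code from the answer: the function names (`column_sums`, `first_blank`, `crop_remnants`) and the `tolerance` parameter are made up for this example, and the image is a plain list of rows of 0/1 values so that the sketch runs without any dependencies. With a NumPy array the column projection would be `img.sum(axis=0)`. It also assumes that each side of the image either begins with a blank column or has a remnant touching the border that is separated from the letter by at least one blank column; a letter that itself touches the border would be cut off.

```python
def column_sums(img):
    """Step 1: project a binary image (rows of 0/1 values) onto one value per column."""
    return [sum(col) for col in zip(*img)]


def first_blank(profile, tolerance=0):
    """Index of the first column whose ink count is <= tolerance, or None if there is none."""
    for i, value in enumerate(profile):
        if value <= tolerance:
            return i
    return None


def crop_remnants(img, tolerance=0):
    """Keep only the columns between the first blank column seen from each side."""
    profile = column_sums(img)
    width = len(profile)

    left = first_blank(profile, tolerance)               # step 2
    if left is None:
        # No blank column anywhere: nothing separates remnant and letter,
        # so return an unchanged copy.
        return [list(row) for row in img]

    from_right = first_blank(profile[::-1], tolerance)   # steps 4 and 5
    right = width - 1 - from_right                       # back to original column order

    # Steps 3 and 6: drop columns left of `left` and right of `right`.
    return [list(row[left:right + 1]) for row in img]


# Hypothetical 3x8 sample: remnants in column 0 and columns 6-7, letter in columns 2-4.
sample = [
    [1, 0, 0, 1, 1, 0, 0, 1],
    [1, 0, 1, 0, 1, 0, 1, 1],
    [0, 0, 1, 1, 1, 0, 0, 1],
]

for row in crop_remnants(sample):
    print(row)
```

The crop keeps the blank separator columns themselves, which matches the wording of steps 3 and 6 (only the columns strictly outside the found indices are removed). Raising `tolerance` treats columns with a few stray pixels as blank, which is the variation suggested for images that are not perfectly clean.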
