We proved that we could find apples in apple bin images and fit boxes/masks to them to output pixel measurements of the detected fruit. The next step was to devise a suitable method to transform these pixel measurements into useful mm/inch values in real-world coordinates.
However, the captured apple bin could appear within the frame in a random position with varied camera angle and pitch. We found that humans couldn’t always replicate a photo with the perfect angle and pitch using a mobile phone. The solution: we applied a perspective transformation to stretch the bin top into a known frame position, mimicking the top-down view shown in image (2) below.
To achieve this, the following process was applied:
– Acquire input image as seen below in (1)
– Determine the four bin corner coordinates
– Apply perspective warp to desired frame dimension
(1) Raw input image of apple bin before perspective transformation with imperfect camera position
(2) Perspective transformed apple bin image to give top-down view
To develop a proof of concept, the four corners of the bin were determined manually. Once this is developed into an app, growers will take an image and then drag four markers onto the four corners of the bin. This process will be automated in the future.
With the physical dimensions of the apple bin known, the final frame dimensions can now be set. For example, if an orchard’s bins have top dimensions of 1200 mm × 1155 mm, the perspective-transformed output can be returned as 1200 px × 1155 px to give a mm-to-pixel ratio of 1:1. This can be scaled down proportionally to reduce the output image size while keeping the relationship between width and length consistent.
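The scaling bookkeeping is simple enough to show in a few lines. The 0.5 scale factor and the 40 px apple below are illustrative values, not measurements from our data.

```python
# Physical bin-top dimensions (mm) and a hypothetical downscale factor.
bin_w_mm, bin_h_mm = 1200, 1155
scale = 0.5                          # shrink the warped frame to half size

out_w_px = round(bin_w_mm * scale)   # output frame width in pixels
out_h_px = round(bin_h_mm * scale)   # output frame height in pixels
mm_per_px = 1 / scale                # each pixel now spans 2 mm

# A detection 40 px across in the scaled frame converts back to mm directly.
apple_px = 40
apple_mm = apple_px * mm_per_px
```

Because both axes are scaled by the same factor, the width-to-length relationship of the bin (and of every apple in it) is preserved.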
We’re almost there! Stay tuned for the next step: the actual apple size estimation.