
There are many potential uses of computer vision in viticulture, including counting buds and clusters, identifying pests, and monitoring fruit ripeness, to name just a few. You can train a model to classify or detect anything you can see with your eyes (and sometimes things you can’t). In this article, we break down the process of creating and deploying your own custom computer vision models using open tools that we either developed ourselves or that are freely available. Please do not hesitate to reach out with any questions or ideas you may have. We look forward to seeing what you build!

Define Your Task

Before diving in, it is important to start with the end in mind. Ask yourself: ‘What task are we looking to accomplish?’ and ‘How will my end users accomplish this task in the field?’

Let’s say your task is to count early shoots on your vines. Ideally, your end users will capture a photo or video of your vines from a few feet away and the system will count the shoots that it sees.

You can immediately imagine several problems that might come up in the process. First of all, if we capture images of vines head-on, we might also capture vines in the next row over. There are several ways to avoid this issue, including using depth-sensing optical equipment or capturing images at night with supplemental lighting. However, the simplest solution in this case might be to capture images from down low, pointing the camera up toward the sky so that only the vine we are interested in is captured. Sometimes we have to try several techniques to figure out which one produces the best results.

For the sake of this tutorial, let’s assume we are going to capture images of vines from below and that we are going to train this model to detect and count shoots just after bud break.
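
To make that goal concrete: “detect and count” ultimately boils down to running an object-detection model on an image and counting the detections that clear a confidence threshold. The sketch below is purely illustrative; the detection list, the “shoot” class name, and the 0.5 threshold are hypothetical stand-ins for whatever your trained model eventually returns.

```python
# Hypothetical detections returned by a trained shoot-detection model.
# Each detection carries a class label and a confidence score from 0 to 1.
detections = [
    {"class": "shoot", "confidence": 0.91},
    {"class": "shoot", "confidence": 0.84},
    {"class": "shoot", "confidence": 0.37},  # low-confidence guess
]

# Count only the detections we trust. The 0.5 threshold is a placeholder
# you would tune once you see how your own model performs.
CONFIDENCE_THRESHOLD = 0.5
shoot_count = sum(
    1 for d in detections
    if d["class"] == "shoot" and d["confidence"] >= CONFIDENCE_THRESHOLD
)
print(f"Shoots counted: {shoot_count}")  # -> Shoots counted: 2
```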

Capturing Imagery

Now that we understand our goal, it’s time to get out in the field and gather the images that will be used to train our model. When it comes to machine learning, the more data we can feed a model, the more accurate it tends to be, so we’ll want to capture as many images as we practically can. We also want to capture our raw images in the same fashion we expect our end users to capture them. For our example, that means snapping photos with our phone from below the vines, during the growth stage in which we expect the model to be used, and from the same distance we imagine our users standing at. To make your model more robust, consider capturing images in slightly different conditions: go out when it is sunny and when it is cloudy, in the morning, at mid-day, and in the evening. Use different phones if you have access to them. You might even capture video and use software to extract individual frames (see the sketch below).
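
If you go the video route, a short script can pull frames out automatically. Here is a minimal sketch using the OpenCV library; the file name, output folder, and sampling rate are placeholders you would adjust for your own footage.

```python
import os
import cv2  # OpenCV; install with `pip install opencv-python`

def extract_frames(video_path, output_dir, every_n_frames=30):
    """Save every Nth frame of a video as a JPEG for later annotation."""
    os.makedirs(output_dir, exist_ok=True)
    cap = cv2.VideoCapture(video_path)
    frame_idx = saved = 0
    while True:
        ok, frame = cap.read()
        if not ok:  # end of video
            break
        if frame_idx % every_n_frames == 0:
            out_path = os.path.join(output_dir, f"frame_{frame_idx:06d}.jpg")
            cv2.imwrite(out_path, frame)
            saved += 1
        frame_idx += 1
    cap.release()
    return saved

# Example: pull roughly one frame per second from 30 fps phone video.
# The file name and folder below are placeholders.
# extract_frames("vine_walkthrough.mp4", "raw_frames", every_n_frames=30)
```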

Remember, you don’t have to use all of your imagery at once when working on your model. But capturing extra images is generally easy, and it could help you down the road when you’re looking to improve your model’s accuracy.

[Image: IMG_3792.jpeg]

Annotate Your Images with Roboflow

Using Roboflow to Train a Fine-tuned Model

Using Google Colab to Train a Model
