Custom Computer Vision Models

There are many potential uses of computer vision in viticulture, including counting buds and clusters, identifying pests, and monitoring fruit ripeness, to name a few. You can train a model to classify or detect anything that you can see with your eyes (and sometimes things you can’t see). In this article, we do our best to break down the process of creating and deploying your own custom computer vision models using open tools that we developed or that are freely available. Please do not hesitate to reach out with any questions or ideas you may have. We look forward to seeing what you build!

Define Your Task

Before diving in, it is important to start with the end in mind. Ask yourself: ‘What is the task we are looking to accomplish?’ and ‘How will my end users accomplish this task in the field?’

Let’s say your task is to count early shoots on your vines. Ideally, your end users will capture a photo or video of your vines from a few feet away and the system will count the shoots that it sees.

You can immediately imagine several problems that might come up in the process. First of all, if we capture images of vines head-on, we might also capture vines in the next row over. There are several ways to avoid this issue, including using depth-sensing optical equipment or capturing images at night with artificial lighting. However, the simplest solution in this case might be to capture images from down low, pointing the camera up toward the sky so that only the vine we are interested in is captured. Sometimes we may have to try several techniques to figure out which method produces the best results.

For the sake of this tutorial, let’s assume we are going to capture images of vines from below and that we are going to train this model to detect and count shoots just after bud break.

Capturing Imagery

Now that we understand our goal, it’s time to get out in the field and gather images that will be used to train our model. When it comes to machine learning, more (and more varied) training data generally produces a more accurate model. Therefore, we’ll want to capture as many images as we practically can. We will want to capture our raw images in the same fashion that we imagine our end users capturing images. For our example, we’d want to snap images using our phone from below our vines during the phase of growth in which we expect the model to be used. Make sure you are at the same distance that you imagine your users being. To make your model more robust, consider capturing images in slightly different conditions. Go out when it is sunny and when it is cloudy, in the morning, at mid-day, and in the evening. Use different phones if you have access to them. You might even capture video and use a software program to pull frames out of the video.
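If you do record video, one possible way to pull frames out of it is with OpenCV. The sketch below is only an illustration: it assumes the opencv-python package is installed, and the function names, file names, and two-second spacing are hypothetical placeholders rather than part of any myEV tool. Spacing the saved frames a second or more apart helps avoid filling the dataset with near-duplicate images.

```python
from pathlib import Path


def frame_indices(total_frames: int, fps: float, seconds_between: float) -> list[int]:
    """Return the frame numbers to keep, one every `seconds_between` seconds."""
    if total_frames <= 0 or fps <= 0 or seconds_between <= 0:
        return []
    step = max(1, round(fps * seconds_between))
    return list(range(0, total_frames, step))


def extract_frames(video_path: str, out_dir: str, seconds_between: float = 2.0) -> int:
    """Save sampled frames as JPEG files and return how many were written."""
    import cv2  # assumption: installed with `pip install opencv-python`

    cap = cv2.VideoCapture(video_path)
    if not cap.isOpened():
        raise FileNotFoundError(f"Could not open video: {video_path}")

    fps = cap.get(cv2.CAP_PROP_FPS)
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    wanted = set(frame_indices(total, fps, seconds_between))

    Path(out_dir).mkdir(parents=True, exist_ok=True)
    written = 0
    index = 0
    while True:
        ok, frame = cap.read()  # read frames in order until the video ends
        if not ok:
            break
        if index in wanted:
            cv2.imwrite(str(Path(out_dir) / f"frame_{index:06d}.jpg"), frame)
            written += 1
        index += 1
    cap.release()
    return written
```

For example, calling extract_frames("row_12.mp4", "frames") on a hypothetical 30 fps clip would save roughly one image for every two seconds of footage.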

Remember, you don’t have to use all of your imagery all at once when working on your model. But having more images is generally easy to accomplish and could help you down the road when you’re looking to improve the accuracy of your model.

IMG_3792.jpeg
An example image of early Concord shoots.

Annotate Your Images with Roboflow

Annotation is the process of labeling your images and providing context to the model during the training process. In our example, we want to draw bounding boxes around all of the shoots on our vines as shown in the image below.

Screenshot 2024-05-02 at 11.45.11 AM.png

There are many tools that you can use to create these annotations but we have found Roboflow to be one of the easiest and most useful tools for managing large datasets of images and annotating them. The best part about Roboflow is that everything is done in the cloud which means you don’t need to manage files on your computer. Furthermore, you can invite collaborators to work on annotating your images together. Once you have an annotated dataset on Roboflow, you can easily export it in a format that can be used to train your model. Roboflow even has easy-to-use training built into their platform.
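To make the idea of an exported annotation format concrete, here is a small sketch of one widely used format for object detection, the YOLO text format. We are assuming that format for illustration only; your export may use a different one. In it, each bounding box becomes one line of text: a class index followed by the box center and size, each divided by the image width or height. The function name and the pixel values are made up for this example.

```python
def to_yolo_line(class_id: int, x_min: float, y_min: float,
                 x_max: float, y_max: float,
                 img_w: int, img_h: int) -> str:
    """Convert a pixel-space bounding box to one YOLO-format label line."""
    if x_max <= x_min or y_max <= y_min:
        raise ValueError("Box must have positive width and height")
    if img_w <= 0 or img_h <= 0:
        raise ValueError("Image dimensions must be positive")

    # Center and size, normalized to the 0-1 range by the image dimensions.
    x_center = (x_min + x_max) / 2 / img_w
    y_center = (y_min + y_max) / 2 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# A hypothetical shoot (class 0) boxed from (100, 200) to (300, 400)
# in a 1000 x 800 pixel image.
print(to_yolo_line(0, 100, 200, 300, 400, 1000, 800))
# 0 0.200000 0.375000 0.200000 0.250000
```

Roboflow handles this conversion for you on export, so there is no need to write these files by hand. The sketch is only meant to show what an annotation looks like once it leaves the annotation tool.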

It is important that you take your time and learn how to use Roboflow. Going slow and learning this tool will be invaluable in your computer vision journey.

Please read Getting Started with Roboflow before moving on:

Brad Dwyer, James Gallagher. (Mar 16, 2023). Getting Started with Roboflow. Roboflow Blog: Getting Started with Roboflow

Once you’ve familiarized yourself with Roboflow’s platform, upload your images and start annotating! When you're ready, you can use Roboflow’s automatic training process to create a hosted model that can be used right away within myEV!

If you are solely looking to detect and count objects using still images, head over to our cluster counting example, Point-and-Shoot Cluster Counting in myEV. In that example, you will simply replace the model URL with the URL of your new custom model.

https://orbitist.atlassian.net/wiki/spaces/EV/pages/138281026
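Counting from a still image usually comes down to counting the detections a model returns. The sketch below assumes a response shaped like a dictionary with a "predictions" list, where each prediction carries a "class" name and a "confidence" score. The sample response, the "shoot" class name, and the 0.5 confidence cutoff are all made up for illustration; check the actual response from your own model before relying on this shape.

```python
from collections import Counter


def count_detections(response: dict, min_confidence: float = 0.5) -> Counter:
    """Count predictions per class, ignoring low-confidence detections."""
    counts = Counter()
    for pred in response.get("predictions", []):
        if pred.get("confidence", 0.0) >= min_confidence:
            counts[pred.get("class", "unknown")] += 1
    return counts


# Hypothetical response for one image of a vine.
sample_response = {
    "predictions": [
        {"x": 212, "y": 340, "width": 40, "height": 62, "class": "shoot", "confidence": 0.91},
        {"x": 518, "y": 301, "width": 38, "height": 55, "class": "shoot", "confidence": 0.62},
        {"x": 770, "y": 415, "width": 25, "height": 30, "class": "shoot", "confidence": 0.31},
    ]
}

print(count_detections(sample_response)["shoot"])  # 2
```

Raising or lowering the cutoff trades missed shoots against false detections, so it is worth comparing a few values against hand counts from your own images.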

Using Google Colab to Train a Model

Roboflow provides a handful of free credits to use with their easy-to-use training system. However, if you’d like more control over your model, would like to train it many times to test its performance under multiple configurations, or would like to deploy the model with our video counting process, then read on.

We have created a Google Colaboratory (Colab) notebook that will work in conjunction with a dataset managed on Roboflow. When you are finished running through the notebook, you’ll have a ‘.pt’ file that you can use for Video Cluster Counting.

Configure the Colab

  1. In a new tab, open up our Colab notebook: https://colab.research.google.com/drive/1HIzjqz5iboY2M1p4rfB2ktytr3-bf8dL?usp=sharing

  2. Click ‘Copy to Drive’. This will copy the notebook to your Google Drive and open it in a new tab.

  3. In your copy of the Colab, go to the ‘Runtime’ menu and select ‘Change runtime type’. Make sure a GPU option, such as ‘T4 GPU’, is selected.

  4. Finally, adjust the settings block. Here is what each variable in that block means:

    • RF_WORKSPACE - The ID of your Roboflow workspace. See image below for where to obtain this in your Roboflow account.

    • RF_PROJECT - The ID of your Roboflow project. See image below for where to obtain this in your Roboflow account.

    • RF_VERSION - The version of your Roboflow dataset. See image below for where to obtain this in your Roboflow account.

    • RF_API_KEY - Your Roboflow API Key. You guessed it… see the image below for where to obtain this in your Roboflow account.

    • EXPORT_PATH - An optional location within your Google Drive where you can tell the script to save your trained model. There is an optional block at the end of the notebook that will connect to your Google Drive. If you use this option, be sure to accept the permissions prompts and give full access to your Google Drive.

    • TRAINING_EPOCHS - This tells the script how long to train your model. To make a long story short, there is an optimal amount of training for any given dataset. If you train the model too little, it’ll be wildly inaccurate. Train it too much and it will become ‘overfit’, meaning its definitions of classes become too rigidly tied to the dataset at hand, so it will also be inaccurate on new images. Usually a larger dataset can be trained through more ‘epochs’ without overfitting. This is an area where experience and trial and error come in. If you have a reasonably large dataset (say about 300 annotated images with augmentations that make the dataset several thousand images in size), 100 epochs is a good place to start. A small dataset with only a handful of images might only need 10 or 20 epochs. Our notebook will output training graphs that show metrics related to the training process. One graph that is particularly important is the class loss graph, which should look like a nice curve: starting high, dropping rapidly, and then leveling off with a tail that isn’t too long. This is an area where practice helps these concepts sink in. Eventually you’ll be able to compare the performance of a model with its graphs and have an intuitive understanding of what went wrong or right in the training process. Much has been written on this topic, so a quick search on Google or YouTube will go a long way in understanding the training process at a deeper level.
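The overfitting idea behind TRAINING_EPOCHS can be illustrated with a tiny example. It assumes you have one validation loss value per epoch; the numbers below are invented for illustration and do not come from a real training run. The helper functions are hypothetical names, not part of the notebook. They report the epoch where validation loss was lowest and how many epochs ran after that point.

```python
def best_epoch(val_losses: list[float]) -> int:
    """Return the 1-based epoch with the lowest validation loss."""
    if not val_losses:
        raise ValueError("Need at least one loss value")
    best_index = min(range(len(val_losses)), key=val_losses.__getitem__)
    return best_index + 1


def epochs_after_best(val_losses: list[float]) -> int:
    """Return how many epochs were trained after the best one."""
    return len(val_losses) - best_epoch(val_losses)


# Invented losses: a quick drop, a plateau, then a slow rise (a sign of overfitting).
losses = [2.10, 1.40, 0.95, 0.80, 0.74, 0.73, 0.75, 0.79, 0.84, 0.90]

print(best_epoch(losses))         # 6
print(epochs_after_best(losses))  # 4
```

If the best epoch falls well before the end of training, as it does here, fewer epochs would likely have been enough. If the loss is still dropping at the final epoch, more epochs may be worth trying.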

Run Through the Script

At this point, you should be all set to train your custom model! Simply step through one code block at a time, pressing the little play button on the left and waiting for a check mark to appear before moving on to the next block. Some blocks will take significantly longer to run than others. Be sure not to close the Colab or you will have to start at the top again (even though the check marks remain).
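Before pressing play, it can help to double-check the settings block, since a blank or mistyped value tends to surface as a confusing error several blocks later. The sketch below is not the notebook's actual code. The variable names follow the settings list above, every value is a placeholder, and check_settings and download_dataset are hypothetical helpers. The download step assumes the roboflow Python package is installed and that a "yolov8" export format is wanted, which may differ from what the notebook uses.

```python
# Hypothetical settings; replace every value with your own.
SETTINGS = {
    "RF_WORKSPACE": "my-workspace",
    "RF_PROJECT": "shoot-counting",
    "RF_VERSION": 1,
    "RF_API_KEY": "",      # paste your own key here and keep it private
    "EXPORT_PATH": "",     # optional, so it is not checked below
    "TRAINING_EPOCHS": 100,
}


def check_settings(settings: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means all good."""
    problems = []
    for key in ("RF_WORKSPACE", "RF_PROJECT", "RF_API_KEY"):
        if not str(settings.get(key, "")).strip():
            problems.append(f"{key} is empty")
    for key in ("RF_VERSION", "TRAINING_EPOCHS"):
        value = settings.get(key)
        if not isinstance(value, int) or value < 1:
            problems.append(f"{key} should be a positive whole number")
    return problems


def download_dataset(settings: dict, export_format: str = "yolov8"):
    """Fetch one dataset version from Roboflow (needs a valid API key)."""
    from roboflow import Roboflow  # assumption: `pip install roboflow`

    rf = Roboflow(api_key=settings["RF_API_KEY"])
    project = rf.workspace(settings["RF_WORKSPACE"]).project(settings["RF_PROJECT"])
    return project.version(settings["RF_VERSION"]).download(export_format)


print(check_settings(SETTINGS))  # ['RF_API_KEY is empty']
```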

Note that when you get to the block labeled ‘mount Google Drive’, Colab will ask permission to access your Google Drive. You must grant full access to your Drive in order for this script to work. When prompted, be sure to log in to the correct Google account and grant all permissions.

Finished!

When all of the code blocks have been run, you’ll have a model that can be used in computer vision deployments. If you set an ‘EXPORT_PATH’ in the notebook settings and ran the last block of the notebook, you’ll have a file on your Google Drive called ‘best.pt’. This file can be uploaded to your Roboflow project or used in our Video Cluster Counting notebook.
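If you ever need to copy the weights file yourself, for example because the last block was skipped, a few lines of Python are enough. This is a sketch under assumptions, not the notebook's code: export_weights is a hypothetical helper, and both paths in the usage note are placeholders. Where the training run writes best.pt depends on the notebook, and a mounted Google Drive typically appears under /content/drive/MyDrive in Colab.

```python
import shutil
from pathlib import Path


def export_weights(weights_file: str, export_dir: str) -> Path:
    """Copy a trained weights file into export_dir and return the new path."""
    src = Path(weights_file)
    if not src.is_file():
        raise FileNotFoundError(f"No weights file found at {src}")

    dest_dir = Path(export_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)  # create the folder if needed
    dest = dest_dir / src.name
    shutil.copy2(src, dest)  # copy2 also preserves the file's timestamps
    return dest
```

In a Colab session with Drive already mounted, a call might look like export_weights("path/to/best.pt", "/content/drive/MyDrive/models"), with both arguments adjusted to match your own run and EXPORT_PATH.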

Next Steps