Local-Area Relative Hybrid Performance Prediction Tool

This tool is affectionately referred to as “Tool #3”. Tool #1 (Wide-area Corn Hybrid Assessment) lets you, the user, define which part of the overall database to pull data from (i.e. create a subset of data) before you summarize relative hybrid performance. Tool #3, by contrast, uses the entirety of the database to make very specific, rifled (targeted) predictions of relative hybrid performance: predictions customized for your own unique location (latitude, longitude, and elevation) and management (planting rate and yield level) scenario.
 
To accomplish this, we use multivariate regression analyses of the master dataset to ascertain (i.e. “tease out”) hybrid behavioral patterns and tendencies. Having discovered the specific nature of each hybrid’s behavior, we can then use that knowledge to predict relative hybrid performance, provided the parent dataset for any given hybrid is “sufficiently large and varied” (adequate in number of loc’s, and covering a wide range of each variable: latitude, longitude, elevation, planting rate, and yield level).
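
To make the idea of “teasing out” a hybrid’s tendencies a bit more concrete, here is a minimal sketch of a regression on the five variables, written in plain Python. It is only an illustration: the function names, the straight-line (linear) form of the model, and every number in it are assumptions made up for this example, not the tool’s actual algorithm or data.

```python
def fit_linear(rows, targets):
    """Fit targets ~ intercept + linear terms by ordinary least squares."""
    n, k = len(rows), len(rows[0])
    # Center and scale each predictor so that values such as 30,000 seeds/ac
    # and 40 degrees of latitude end up on comparable scales.
    means = [sum(r[j] for r in rows) / n for j in range(k)]
    spans = [(max(r[j] for r in rows) - min(r[j] for r in rows)) or 1.0
             for j in range(k)]

    def scaled(row):
        return [1.0] + [(row[j] - means[j]) / spans[j] for j in range(k)]

    design = [scaled(r) for r in rows]
    size = k + 1
    # Augmented normal equations: [X'X | X'y].
    aug = [[sum(d[i] * d[j] for d in design) for j in range(size)]
           + [sum(d[i] * t for d, t in zip(design, targets))]
           for i in range(size)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        lead = aug[col][col]
        aug[col] = [v / lead for v in aug[col]]
        for r in range(size):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    coefs = [aug[i][size] for i in range(size)]

    def predict(row):
        return sum(c * x for c, x in zip(coefs, scaled(row)))

    return predict


def made_up_relative_yield(lat, lon, elev, rate, yield_level):
    """Invented 'true' behavior of one imaginary hybrid (% of location average)."""
    return (100 + 2.0 * (lat - 40) - 1.0 * (lon + 98) + 0.002 * (elev - 1800)
            + 0.0005 * (rate - 28000) - 0.05 * (yield_level - 190))


# Invented plot locations: (latitude, longitude, elevation, planting rate, yield level).
locations = [
    (40.0, -97.0, 1500, 30000, 200),
    (41.0, -98.5, 1800, 24000, 210),
    (39.5, -99.0, 2200, 28000, 150),
    (42.0, -96.5, 1300, 26000, 170),
    (40.5, -100.0, 2600, 32000, 230),
    (38.8, -97.8, 1600, 22000, 140),
    (41.5, -99.5, 1200, 34000, 190),
    (39.0, -96.0, 2000, 27000, 240),
]
relative_yields = [made_up_relative_yield(*loc) for loc in locations]

predict = fit_linear(locations, relative_yields)
my_farm = (40.2, -98.0, 1700, 29000, 195)  # lies inside every tested range
print(round(predict(my_farm), 2))
```

Real plot data is noisy, so a real fit would not reproduce the underlying pattern exactly the way this tidy, invented dataset does.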
 
Our estimation of “sufficiently large” is at least 50 loc’s of data. We will sometimes publish results for hybrids with as few as 30 loc’s of data, but please be warned that results from so few loc’s might not be entirely trustworthy. Please use discretion and good judgement.
 
“Sufficiently varied” is another matter all by itself. Having at least 50 loc’s of data for a hybrid does not necessarily ensure that we can accurately predict how that hybrid might perform for you. We have several examples in our database where we have over 50 loc’s of data for a hybrid, but none of them came from plots planted below 30,000 seeds per acre. Such a dataset is of absolutely no value to someone who wants to plant 27,000 seeds per acre.
 
As you should have noticed, I underscored “absolutely no value”. Please allow me to explain why. The algorithms used in this tool are complex and creative enough to do a very good job of predicting performance for combinations of lat/long/elev, planting rate, and yield level that lie within the landscape of the database itself (within the range of values tested for each variable). Once we wander beyond that explored landscape (exceed the tested range for any one variable; we call this “coloring outside the lines”), all bets are off. As I mentioned, our algorithms can be rather “creative”: once we extrapolate into territory with no data to hold them down, any given algorithm for any given hybrid might get really “wonky” (inaccurate; producing values that obviously should not be trusted) really quickly.
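
The “coloring outside the lines” check can be sketched in a few lines of Python. This is a minimal illustration, not the tool’s own code: the function name, the variable names, and the tested ranges below are all invented for the example (the 27,000 vs. 30,000 seeds per acre case mirrors the one described above).

```python
def variables_outside_tested_range(scenario, tested_ranges):
    """Return the names of variables whose value lies outside [low, high]."""
    outside = []
    for name, value in scenario.items():
        low, high = tested_ranges[name]
        if not low <= value <= high:
            outside.append(name)
    return outside


# Hypothetical tested ranges for one hybrid (made-up numbers).
tested_ranges = {
    "planting_rate": (30000, 36000),   # no plots planted below 30,000 seeds/ac
    "yield_level": (150, 260),
    "latitude": (38.5, 43.0),
    "longitude": (-101.0, -95.5),
    "elevation": (1000, 2800),
}

# A grower who wants to plant 27,000 seeds/ac.
scenario = {
    "planting_rate": 27000,
    "yield_level": 180,
    "latitude": 40.8,
    "longitude": -98.3,
    "elevation": 1900,
}

print(variables_outside_tested_range(scenario, tested_ranges))
```

Any name that comes back in the list means the prediction for that hybrid is an extrapolation and should not be trusted.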
 
To help you judge (i.e. “sense”) when a prediction might be becoming unreliable, we list the range of tested values for each variable, and we color-code that data table to flag cautions quickly. These should help guide the level of trust (i.e. “confidence”) that you place in any given prediction (tool output) for any given hybrid.
 
I personally place no confidence whatsoever in a prediction when: 1) the hybrid has fewer than 30 loc’s of data; 2) my inputs exceed the tested range for any of the 5 variables (even if just a little bit); or 3) the prediction exceeds 120% or falls below 80% of “location average” (i.e. anticipated “yield level”).
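
For anyone working with downloaded results, those three personal limits could be written as one small Python check. This is a sketch under assumptions: the function and argument names are hypothetical, and the prediction is assumed to be expressed as a percent of location average.

```python
def has_any_confidence(n_locs, n_vars_out_of_range, predicted_pct_of_avg):
    """Apply the three 'no confidence' limits described above."""
    if n_locs < 30:                       # 1) too few loc's of data
        return False
    if n_vars_out_of_range > 0:           # 2) any variable outside its tested range
        return False
    if predicted_pct_of_avg > 120 or predicted_pct_of_avg < 80:
        return False                      # 3) prediction too far from location average
    return True


print(has_any_confidence(n_locs=64, n_vars_out_of_range=0, predicted_pct_of_avg=104.5))
print(has_any_confidence(n_locs=64, n_vars_out_of_range=1, predicted_pct_of_avg=104.5))
```

Passing this check does not make a prediction trustworthy by itself; it only screens out the cases I would throw away immediately.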
 
In addition to the ranges of values for each of the variables, we also publish the number of loc’s/hits (n) as well as the correlation coefficient (R²) for each hybrid’s performance-predicting algorithm. Sometimes the correlation coefficient can be used as an indicator that the dataset is poor, i.e. that it contains too much variability (“noise”) to adequately discern how 2 variables (e.g. planting rate and yield) are related. However, by definition, the primary (exclusive?) purpose of the correlation coefficient is “to measure how strong a relationship is between two variables.”
 
In our case, Tool #3 seeks to ascertain the strength of the relationships between the independent variables (latitude, longitude, elevation, planting rate, and yield level) and the dependent variable (relative yield). Intuitively we (farmers, agronomists) know that these five things can drive hybrid performance (though they don’t always). Some hybrids like it out west where it’s hot, and where yields and planting rates are often low… some do not… and some really do not care…
 
So let me quickly point out why you might want to pay attention to the correlation coefficient in our little tool. Hybrids that are known as “superstars” are broadly adapted and act like both “race horses” and “work horses” at the same time; i.e. they are relatively high-yielding and competitive regardless of where you plant them, how you treat them, and how happy you do or do not make them. These hybrids will have low R² values (less than 0.35) because the relationships between several or all of the independent variables and relative yield are weak. Hybrids that are more strongly “driven” by these 5 variables will show high R² values (greater than 0.65).
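
As a rough illustration of how an R2 value might be computed and read, here is a short Python sketch. It assumes R2 is calculated the usual regression way (1 minus the residual sum of squares over the total sum of squares); we have not stated the tool’s exact formula here, so treat that as an assumption. The function names, the labels, and the sample numbers are invented for the example; only the 0.35 and 0.65 cut-offs come from the text above.

```python
def r_squared(observed, predicted):
    """R2 as 1 - (residual sum of squares / total sum of squares)."""
    mean_obs = sum(observed) / len(observed)
    ss_tot = sum((y - mean_obs) ** 2 for y in observed)
    ss_res = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    return 1 - ss_res / ss_tot


def describe_hybrid(r2):
    """Label a hybrid using the 0.35 and 0.65 cut-offs mentioned above."""
    if r2 < 0.35:
        return "broadly adapted (weakly driven by the 5 variables)"
    if r2 > 0.65:
        return "strongly driven by the 5 variables"
    return "in between"


# Made-up relative yields (% of location average) at five locations.
observed = [98, 102, 100, 104, 96]
predicted = [99, 101, 100, 102, 98]

r2 = r_squared(observed, predicted)
print(r2, describe_hybrid(r2))
```

Note that a low R2 under this reading says nothing about whether a hybrid yields well; it only says that the five variables explain little of how its relative yield moves around.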
 

Set-up Guidelines:
  1. Enter the requested economic considerations: “Avg Farm-Gate Price/Unit” (what you think the average bag of seed is worth) and “Grain Price” (what you anticipate receiving for your crop after harvest).

  2. Enter your own unique values for the 5 independent variables: “Planting Rate (seeds/ac)”, “Yield Level” (the farm-wide, field-wide, or zone-wide yield you anticipate harvesting), “Latitude”, “Longitude”, and “Elevation”.

  3. Enter the maturity range (low end and high end) that you’d like to see results for. (If you leave these at the default values, i.e. no maturity restrictions, you will soon see how “wonky” the results can be for hybrids that are too early or too late for you…)

  4. Click the “Calculate” button and view the results.

  5. If you’d like to download (save) the results to a CSV file, click the “Save Results” button. The file can be opened in Excel; we recommend you then “save as” an Excel file so you can further use and manipulate the data. Downloaded files do not include the color-coded cautionary system that is available on the website.

A couple additional notes:
  1. The lat/long relationships are fairly coarse (i.e. small differences have extremely small impacts) – you do not need to enter different lat/long’s for each of the fields (or zones within fields) that you farm. Just use one value of each for your whole farm (unless you have multiple farms that are separated by tens of miles).

  2. If you do have quite a range of different elevations across your farm, you may want to play around with different elevations a bit.

  3. We do recommend running different reports for different fields or even for different zones within fields if planting rates and yield levels vary much.

  4. A number of different combination/derivative reports can be created that do an excellent job of helping you to discover and understand different hybrid “behavior”. Your own creativity and imagination will prompt you to look at the resulting data in various ways. Do not rein this desire in. We encourage you to be creative and explore…

  5. However, creating these reports does take time. If time allows, we will generate useful examples of different derivative reports and make them available for you to download.
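
As one example of the kind of report described in note 3, the sketch below ranks hybrids separately for each management zone. Everything in it is hypothetical: the zone names, the location, and especially the two stand-in prediction functions, which are simple made-up formulas and not real hybrid algorithms from the tool.

```python
def rank_hybrids_by_zone(zones, predictors, location):
    """For each zone, list hybrids from highest to lowest predicted relative yield."""
    report = {}
    for zone_name, (rate, yield_level) in zones.items():
        scores = {hybrid: predict(*location, rate, yield_level)
                  for hybrid, predict in predictors.items()}
        report[zone_name] = sorted(scores, key=scores.get, reverse=True)
    return report


# Stand-in predictors (invented): Hybrid A likes high planting rates, Hybrid B low ones.
predictors = {
    "Hybrid A": lambda lat, lon, elev, rate, yl: 100 + 0.001 * (rate - 28000),
    "Hybrid B": lambda lat, lon, elev, rate, yl: 102 - 0.001 * (rate - 28000),
}

# One lat/long/elev for the whole farm; (planting rate, yield level) per zone.
location = (40.8, -98.3, 1900)
zones = {
    "dryland hilltop": (24000, 150),
    "irrigated bottom": (34000, 240),
}

for zone, ranking in rank_hybrids_by_zone(zones, predictors, location).items():
    print(zone, ranking)
```

The same loop could just as easily sweep a range of planting rates or elevations, which is one way to see how a particular hybrid “behaves” as a single variable changes.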

Derivative Reports - Tool #3

Creator Steven Dvorak demonstrates the functions and results of Tool #3 - Local-Area Relative Hybrid Performance Prediction Tool.  View the video tutorial on how this tool works.
 
Subscribe to the Veritas Seed Data channel to see when new videos are released.