
Question:

  • The dataset does not indicate which 9 of the GH3922E3 samples are seed-treated and which 9 are untreated. All of them are labeled GH3922E3, so this variety has 18 samples, while every other variety has 9.
  • I checked goldenharvestseeds.com and found no seed named GH3922E3, only GH2922E3. Is GH3922E3 a data-entry error, or is it the seed we actually used in our experiment? I would like to double-check. If it is GH3922E3, please provide its Adaptation to Soil Types data.
  • The following images illustrate the two questions above.
 
notion image

1. Hypotheses and Objectives

1.1. Research Question (Goal)

  • How many days after flooding can soybean yield be predicted most accurately?

1.2. Hypothesis

  • Hypothesis 1: The number of flooding days has a significant effect on soybean yield.
  • Hypothesis 2*: Seed (variety + seed treatment) has a significant effect on soybean yield.
  • Hypothesis 3*: There is a significant interaction between the number of flooding days and seed (variety + seed treatment) in their effect on soybean yield.
  • Hypothesis 4*: Some seed varieties are resistant to flooding. For these varieties, the number of flooding days has no significant effect on soybean yield.
  • Hypothesis 5: Vegetation indices (VIs) are significantly related to soybean yield.
    • *: Hypotheses marked with an asterisk are not currently the main goal.

1.3. Research Objectives

1.3.1. Objective 1: To evaluate the relationship between the independent variables (days to flood, seed, VIs) and the dependent variable (soybean yield) (Addressing Hypotheses 1, 2, 3, 5).

  • Actions:
      1. Perform an ANOVA to determine whether there is a significant difference between groups.
      2. Perform a post hoc multiple comparison test (Tukey's HSD) to determine which specific groups are significantly different from each other.
      3. Add error bars to the data visualizations to show the variability of, and confidence in, the results.
  • Measurable Outcomes: Statistical significance (p-values), effect sizes, and confidence intervals.
  • Deadline: 02/22/2024

2. Methods

2.1. Analyzing Data

2.1.0. Data Source

 

2.1.1. Different flooding days

In total, there were three flooding treatments: 0 days, 3 days, and 7 days.
notion image

2.1.2. Different seed (variety + seed treatment)

There are 19 varieties in total. GH3922E3 has 18 samples; the other varieties have 9 samples each.
notion image
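The sample-count check described above can be sketched as follows. This is a minimal illustration on synthetic labels: `VAR_A` and `VAR_B` are hypothetical variety names, and the expected count of 9 comes from the text above.

```python
from collections import Counter

def flag_unbalanced(varieties, expected=9):
    """Return {variety: count} for varieties whose sample count differs from `expected`."""
    counts = Counter(varieties)
    return {name: n for name, n in counts.items() if n != expected}

# Synthetic labels: two hypothetical varieties plus the doubled GH3922E3 label.
labels = ["VAR_A"] * 9 + ["VAR_B"] * 9 + ["GH3922E3"] * 18
unbalanced = flag_unbalanced(labels)
print(unbalanced)
```

A variety that shows up here likely mixes two seed treatments under one label, which is the situation raised in the question at the top of this post.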

Detail

GH3922E3 (18 samples)
other variety (9 samples)

2.1.3. Scatterplot of yield per plot

2.1.3.1. The following plot shows two different yield data.

  • One is the yield measured at harvest (Plot Yield (lb/5*28)).
  • The other is the moisture-adjusted yield (Plot Yield (bu/ac)).
notion image

2.1.3.2. These two yields are highly correlated. (The later analyses focus only on Plot Yield (bu/ac).)
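One way to quantify "highly correlated" is the Pearson correlation coefficient. The sketch below uses made-up numbers, not the experiment's data; the variable names are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

# Synthetic values standing in for the two yield columns.
harvest_lb = [10.0, 12.5, 8.0, 15.0, 11.0]
adjusted_bu_ac = [51.0, 63.0, 41.5, 76.0, 55.0]
r = pearson_r(harvest_lb, adjusted_bu_ac)
print(round(r, 4))
```

A value of r close to 1 would support dropping one of the two columns, as done here.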

notion image

2.1.3.3. Scatter distribution of Plot Yield (bu/ac) vs. Plot ID → Insight: The number of flooding days appears to have a clear effect on yield. This can be verified using ANOVA.

  • Below is a scatter plot that distinguishes between Variety + Seed Treatment and Flood Duration.
notion image
  • Below are two scatter plots that distinguish Variety + Seed Treatment and Flood Duration separately. The second plot suggests that the number of flooding days has a clear effect on yield. This can be verified using ANOVA.
notion image
notion image

2.2. Explore the relationship between flooding days and yield → Flood duration could have a significant effect on yield

2.2.1. Check out the yield data

  • Scatter plot of yield across plots for the three flood durations
  • Histogram of the overall yield distribution for the three flood durations
notion image
notion image

2.2.2. ANOVA validation

  • p-value: 1.243535e-26 → yield differs significantly among the three flood durations
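For reference, the arithmetic behind a one-way ANOVA can be sketched as below. This is not the code used for the result above; it uses synthetic yields and only computes the F statistic and the effect size (eta squared). The p-value would then come from the F distribution, for example via `scipy.stats.f_oneway`.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within, eta_squared) for a list of numeric groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]
    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Variation of observations around their own group mean.
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, group_means) for v in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    eta_squared = ss_between / (ss_between + ss_within)
    return f_stat, df_between, df_within, eta_squared

# Synthetic yields (bu/ac) for 0, 3, and 7 flooding days.
yield_0 = [60, 62, 58, 61, 59]
yield_3 = [58, 60, 57, 59, 61]
yield_7 = [40, 42, 38, 41, 39]
f_stat, df_b, df_w, eta_sq = one_way_anova([yield_0, yield_3, yield_7])
print(f_stat, df_b, df_w, round(eta_sq, 3))
```

A large F with a small p-value only says that at least one group differs; the post hoc test in the next section identifies which ones.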

result detail

notion image

2.2.3. Post hoc multiple comparison test (Tukey's HSD)

  • No significant difference between 0 and 3 days.
  • Significant difference between 0 and 7 days.
  • Significant difference between 3 and 7 days.
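The pairwise comparison behind Tukey's HSD can be sketched as follows, on synthetic yields. The critical value `Q_CRIT` is an assumption: roughly 3.77 is the tabulated studentized-range value for alpha = 0.05, 3 groups, and 12 within-group degrees of freedom, and it should be recomputed for real data (for example with `scipy.stats.studentized_range.ppf`). In practice a library routine such as `statsmodels.stats.multicomp.pairwise_tukeyhsd` would normally be used instead.

```python
import math
from itertools import combinations

def tukey_pairs(groups, q_crit):
    """groups: dict label -> list of values. Return {(a, b): (q, significant)}."""
    n_total = sum(len(v) for v in groups.values())
    means = {label: sum(v) / len(v) for label, v in groups.items()}
    ss_within = sum((x - means[label]) ** 2 for label, v in groups.items() for x in v)
    ms_within = ss_within / (n_total - len(groups))
    results = {}
    for a, b in combinations(groups, 2):
        # Tukey-Kramer standard error, which also handles unequal group sizes.
        se = math.sqrt(ms_within / 2 * (1 / len(groups[a]) + 1 / len(groups[b])))
        q = abs(means[a] - means[b]) / se
        results[(a, b)] = (q, q > q_crit)
    return results

Q_CRIT = 3.77  # assumed table value for alpha=0.05, k=3, df=12
yields = {
    "0 days": [60, 62, 58, 61, 59],
    "3 days": [58, 60, 57, 59, 61],
    "7 days": [40, 42, 38, 41, 39],
}
pairs = tukey_pairs(yields, Q_CRIT)
for pair, (q, significant) in pairs.items():
    print(pair, round(q, 2), significant)
```

With these synthetic numbers the pattern matches the one reported above: 0 vs. 3 days is not significant, while both comparisons against 7 days are.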

result detail

notion image

2.3. Explore the relationship between the VIs and yield

2.3.1. Check out the VIs data for 3 different imaging dates

2.3.1.1. Imaging date 07152023

  • On this imaging date, both the 3-day and the 7-day flood zones had actually been flooded for 3 days. The zones labeled as 7-day had so far received only 3 days of flooding and would remain flooded, while the 3-day zones were drained on this day (07152023).

Line chart with error bars (Mean and standard deviation)

notion image
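The values behind such an error-bar chart can be sketched as below. The NDVI numbers are synthetic, and the use of the sample standard deviation (n - 1 denominator) is an assumption, since the post does not say which variant was plotted.

```python
from statistics import mean, stdev

def error_bar_summary(values_by_group):
    """Return {group: (mean, sample standard deviation)} for plotting error bars."""
    return {group: (mean(v), stdev(v)) for group, v in values_by_group.items()}

# Synthetic per-plot NDVI values for each flood duration.
ndvi_by_flood = {
    "0 days": [0.80, 0.82, 0.78],
    "3 days": [0.70, 0.73, 0.67],
    "7 days": [0.55, 0.60, 0.50],
}
summary = error_bar_summary(ndvi_by_flood)
for group, (m, sd) in summary.items():
    print(group, round(m, 3), round(sd, 3))
```

The means would be the plotted points and the standard deviations the bar half-lengths, for example as the `yerr` argument of matplotlib's `errorbar`.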

Each VI - Line chart with error bars (Mean and standard deviation)

notion image

detail file

Exported HTML files of the execution code may need to be downloaded to view them.
link:

2.3.1.2. Imaging date 07192023

  • The 3-day flood zone had been drained for 4 days, and the 7-day flood zone was drained on this day (07192023).

Line chart with error bars (Mean and standard deviation)

notion image

Each VI - Line chart with error bars (Mean and standard deviation)

notion image

detail file

Exported HTML files of the execution code may need to be downloaded to view them.
link:

2.3.1.3. Imaging date 09122023

  • The soybeans should be at the R5 or R6 growth stage on this date.

The soil removal step fails on this date

  • The Otsu method assumes that the image contains two major luminance classes, so that a threshold separating them can be found. In the September image, almost the entire scene is covered by green crops, so the luminance distribution is concentrated in a narrow range and forms a single major peak. Without an obvious bimodal distribution, the Otsu method cannot reliably distinguish crops from soil, especially where the differences are subtle. To improve the segmentation, it may be necessary to explore other segmentation techniques, or to add image processing steps that enhance the contrast before thresholding.
  • Examples:
bad
notion image
good
notion image
good
notion image
bad
notion image
 
good
notion image
good
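The failure mode described above can be illustrated with a small Otsu sketch on synthetic values. This is not the segmentation code used for the images; it is a plain-Python version of the standard algorithm that also returns the ratio of between-class variance to total variance, which is high when the histogram is bimodal and lower when there is only one cluster.

```python
def otsu_threshold(values, bins=64):
    """Return (threshold, separability) where separability = between-class / total variance."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are identical; no threshold exists")
    width = (hi - lo) / bins
    hist = [0] * bins
    for v in values:
        hist[min(int((v - lo) / width), bins - 1)] += 1
    centers = [lo + (i + 0.5) * width for i in range(bins)]
    total = len(values)
    total_sum = sum(h * c for h, c in zip(hist, centers))
    best_var, best_idx = -1.0, 0
    count0, sum0 = 0, 0.0
    for i in range(bins - 1):
        count0 += hist[i]
        sum0 += hist[i] * centers[i]
        count1 = total - count0
        if count0 == 0 or count1 == 0:
            continue
        mean0 = sum0 / count0
        mean1 = (total_sum - sum0) / count1
        var_between = (count0 / total) * (count1 / total) * (mean0 - mean1) ** 2
        if var_between > best_var:
            best_var, best_idx = var_between, i
    total_mean = total_sum / total
    total_var = sum(h * (c - total_mean) ** 2 for h, c in zip(hist, centers)) / total
    return lo + (best_idx + 1) * width, best_var / total_var

# Synthetic index values: a soil cluster and a canopy cluster.
soil = [0.15 + 0.001 * i for i in range(100)]
crop = [0.75 + 0.001 * i for i in range(100)]
t_mixed, sep_mixed = otsu_threshold(soil + crop)   # July-like scene: soil and canopy
t_canopy, sep_canopy = otsu_threshold(crop)        # September-like scene: canopy only
print(round(t_mixed, 3), round(sep_mixed, 3), round(sep_canopy, 3))
```

In the canopy-only case Otsu still returns a threshold, but it splits the canopy cluster itself, which is consistent with the "bad" masks shown above. A low separability value could serve as a warning that the threshold should not be trusted.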

Line chart with error bars (Mean and standard deviation)

notion image

Each VI - Line chart with error bars (Mean and standard deviation)

notion image

detail file

Exported HTML files of the execution code may need to be downloaded to view them.
link:

2.3.2. ANOVA validation

2.3.2.1. Imaging date 07152023

In this analysis, 28 different vegetation indices were tested. The ANOVA results show that 27 of these indices demonstrate statistically significant differences, while only 1 index does not:
  • Significant Indices (27 total):
    • ['EXG', 'NDVI', 'GNDVI', 'NDRE', 'CCCI', 'CIgreen', 'CIred_edge', 'MSR', 'RDVI', 'RVI', 'TVI', 'IPVI', 'MTCI', 'MTVI', 'R_MCARI_MTVI', 'MSAVI', 'NDCI', 'NGRDI', 'OSAVI', 'PVI', 'RECI', 'RERI', 'RGVI', 'SRPI', 'SAVI', 'TCARI', 'TSAVI']
  • Non-Significant Index (1 total):
    • ['MCARI']

2.3.2.2. Imaging date 07192023

In this analysis, 28 different vegetation indices were tested. The ANOVA results show that all 28 of these indices demonstrate statistically significant differences:
  • Significant Indices (28 total):
    • ['EXG', 'NDVI', 'GNDVI', 'NDRE', 'CCCI', 'CIgreen', 'CIred_edge', 'MSR', 'RDVI', 'RVI', 'TVI', 'IPVI', 'MTCI', 'MCARI', 'MTVI', 'R_MCARI_MTVI', 'MSAVI', 'NDCI', 'NGRDI', 'OSAVI', 'PVI', 'RECI', 'RERI', 'RGVI', 'SRPI', 'SAVI', 'TCARI', 'TSAVI']
  • Non-Significant Indices (0 total):
    • []

2.3.2.3. Imaging date 09122023

In this analysis, 28 different vegetation indices were tested. The ANOVA results show that 26 of these indices demonstrate statistically significant differences, while 2 indices do not:
  • Significant Indices (26 total):
    • ['EXG', 'NDVI', 'GNDVI', 'NDRE', 'CIgreen', 'CIred_edge', 'MSR', 'RDVI', 'RVI', 'TVI', 'IPVI', 'MCARI', 'MTVI', 'R_MCARI_MTVI', 'MSAVI', 'NDCI', 'NGRDI', 'OSAVI', 'PVI', 'RECI', 'RERI', 'RGVI', 'SRPI', 'SAVI', 'TCARI', 'TSAVI']
  • Non-Significant Indices (2 total):
    • ['CCCI', 'MTCI']

2.3.3. Post hoc multiple comparison test (Tukey's HSD)

2.3.3.1. Imaging date 07152023

In this analysis, 28 different vegetation indices were tested. The results of the analysis of variance (ANOVA) showed that 27 of these indices were statistically significantly different and only 1 index was not significantly different. The results of the commonly used NDVI, GNDVI, and SAVI are presented here.
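For readers unfamiliar with these three indices, the sketch below shows their standard textbook definitions applied to synthetic band reflectances. The exact formulas and the SAVI soil factor used in this analysis are not shown in the post, so `L = 0.5` is an assumption (it is the commonly used default).

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index."""
    return (nir - red) / (nir + red)

def gndvi(nir, green):
    """Green NDVI: same form as NDVI with the green band in place of red."""
    return (nir - green) / (nir + green)

def savi(nir, red, soil_factor=0.5):
    """Soil-Adjusted Vegetation Index; soil_factor is the L term (0.5 assumed)."""
    return (1 + soil_factor) * (nir - red) / (nir + red + soil_factor)

# Synthetic reflectances for one canopy pixel.
nir, red, green = 0.5, 0.1, 0.15
print(round(ndvi(nir, red), 4), round(gndvi(nir, green), 4), round(savi(nir, red), 4))
```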

NDVI - Here's a legend to illustrate

notion image

GNDVI

notion image

SAVI

notion image

More VIs:

link:

2.3.3.2. Imaging date 07192023

In this analysis, 28 different vegetation indices were tested. The results of the analysis of variance (ANOVA) showed that 28 of these indices were statistically significantly different. The results of the commonly used NDVI, GNDVI, and SAVI are presented here.

NDVI

notion image

GNDVI

notion image

SAVI

notion image

More VIs:

link:

2.3.3.3. Imaging date 09122023

In this analysis, 28 different vegetation indices were tested. The results of the analysis of variance (ANOVA) showed that 26 of these indices were statistically significantly different and 2 indices were not significantly different. The results of the commonly used NDVI, GNDVI, and SAVI are presented here.

NDVI

notion image

GNDVI

notion image

SAVI

notion image

More VIs:

link:
 

2.3.4. Insights

  1. For the 07152023 imaging data: because both the 3-day and the 7-day flood zones had actually been flooded for 3 days at that point, we expected the vegetation indices (VIs) to be similar in both zones. However, the actual results did not match this expectation.
    1. Conjecture 1: Problems in the soil removal step let soil pixels into the dataset, leading to results that differ from the expectation.
    2. Conjecture 2: Reflection from the water surface leads to different results.

2.4. Verify conjectures

2.4.1. For Conjecture 1

  1. Inspect the soil-removed mask manually to check whether soil pixels remain in the data.
    1. Some of the crop boundaries were found to have soil captured in the dataset.
      1. notion image

2.4.2. For Conjecture 2

  1. No further verification at this time

3. Improved methods - EXG remove soil

  • Previously, the soil was removed with the Otsu method applied to NDVI. This time, we apply the Otsu method to EXG instead and look at the result.
    • Showing the result of soil removal

      notion image
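As background, the Excess Green index is commonly defined as 2g - r - b on chromatic coordinates (each band divided by the sum of the three bands). Whether this post's EXG uses normalized coordinates is not stated, so the sketch below is an assumption, and the pixel values are synthetic.

```python
def excess_green(red, green, blue):
    """Excess Green (ExG = 2g - r - b) on chromatic coordinates of one pixel."""
    total = red + green + blue
    if total == 0:
        return 0.0  # black pixel: no color information
    r, g, b = red / total, green / total, blue / total
    return 2 * g - r - b

# Synthetic pixels: a green canopy pixel and a brownish soil pixel.
exg_crop = excess_green(40, 120, 30)
exg_soil = excess_green(120, 100, 80)
print(round(exg_crop, 4), round(exg_soil, 4))
```

Canopy pixels score well above soil pixels, so a threshold on the ExG image (chosen by Otsu here) can separate the two classes when both are present in the scene.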

3.1. Explore the relationship between the VIs and yield

3.1.1. Check out the VIs data for 3 different imaging dates

3.1.1.1. Imaging date 07152023

  • On this imaging date, both the 3-day and the 7-day flood zones had actually been flooded for 3 days. The zones labeled as 7-day had so far received only 3 days of flooding and would remain flooded, while the 3-day zones were drained on this day (07152023).

Line chart with error bars (Mean and standard deviation)

notion image

Each VI - Line chart with error bars (Mean and standard deviation)

notion image

detail file

Exported HTML files of the execution code may need to be downloaded to view them.
link:

3.1.1.2. Imaging date 07192023

  • The 3-day flood zone had been drained for 4 days, and the 7-day flood zone was drained on this day (07192023).

Line chart with error bars (Mean and standard deviation)

notion image

Each VI - Line chart with error bars (Mean and standard deviation)

notion image

detail file

Exported HTML files of the execution code may need to be downloaded to view them.
link:

3.1.2. ANOVA validation

3.1.2.1. Imaging date 07152023

In this analysis, 28 different vegetation indices were tested. The ANOVA results show that 26 of these indices demonstrate statistically significant differences, while 2 indices do not:
  • Significant Indices (26 total):
    • ['EXG', 'NDVI', 'GNDVI', 'NDRE', 'CCCI', 'CIgreen', 'CIred_edge', 'MSR', 'RDVI', 'RVI', 'TVI', 'IPVI', 'MTCI', 'MTVI', 'R_MCARI_MTVI', 'MSAVI', 'NGRDI', 'OSAVI', 'PVI', 'RECI', 'RERI', 'RGVI', 'SRPI', 'SAVI', 'TCARI', 'TSAVI']
  • Non-Significant Indices (2 total):
    • ['MCARI', 'NDCI']

3.1.2.2. Imaging date 07192023

In this analysis, 28 different vegetation indices were tested. The ANOVA results show that 27 of these indices demonstrate statistically significant differences, while 1 index does not:
  • Significant Indices (27 total):
    • ['EXG', 'NDVI', 'GNDVI', 'NDRE', 'CCCI', 'CIgreen', 'CIred_edge', 'MSR', 'RDVI', 'RVI', 'TVI', 'IPVI', 'MTCI', 'MCARI', 'MTVI', 'R_MCARI_MTVI', 'MSAVI', 'NDCI', 'OSAVI', 'PVI', 'RECI', 'RERI', 'RGVI', 'SRPI', 'SAVI', 'TCARI', 'TSAVI']
  • Non-Significant Index (1 total):
    • ['NGRDI']

3.1.3. Post hoc multiple comparison test (Tukey's HSD)

3.1.3.1. Imaging date 07152023

In this analysis, 28 different vegetation indices were tested. The results of the analysis of variance (ANOVA) showed that 26 of these indices were statistically significantly different and 2 indices were not significantly different. The results of the commonly used NDVI, GNDVI, and SAVI are presented here.

NDVI

notion image

GNDVI

notion image

SAVI

notion image

More VIs:

link:

3.1.3.2. Imaging date 07192023

In this analysis, 28 different vegetation indices were tested. The results of the analysis of variance (ANOVA) showed that 27 of these indices were statistically significantly different and 1 index was not significantly different. The results of the commonly used NDVI, GNDVI, and SAVI are presented here.

NDVI

notion image

GNDVI

notion image

SAVI

notion image

More VIs:

link:

3.1.4. Insights

  1. For the 07152023 imaging data: because both the 3-day and the 7-day flood zones had actually been flooded for 3 days at that point, we expected the vegetation indices (VIs) to be similar in both zones. However, the actual results did not match this expectation.
    • After soil removal with EXG, there was still a significant difference between the 3-day and 7-day zones on 07152023.
    1. Conjecture 1: Reflection from the water surface leads to different results.

4. Data comparison between different imaging dates

  • Compare data from different imaging dates to see trends in GNDVI. Error bars show the mean and standard deviation of GNDVI.
    • notion image
  • Compare data from different imaging dates to see trends in canopy coverage. The error bars show the mean and standard deviation of canopy coverage.
    • notion image
      Current observations indicate that the impacts of flooding on crop health and yields are consistent with expectations. However, in order to more fully demonstrate and validate this trend, we still need to obtain and analyze data from additional imaging dates.
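Canopy coverage is not defined in this post. A common definition, assumed in the sketch below, is the fraction of pixels in a plot that remain classified as vegetation after soil removal. The mask is synthetic.

```python
def canopy_coverage(mask):
    """mask: 2D list of 0/1 values (1 = vegetation). Return the vegetation pixel fraction."""
    pixels = [p for row in mask for p in row]
    if not pixels:
        raise ValueError("empty mask")
    return sum(1 for p in pixels if p) / len(pixels)

# Synthetic 3 x 4 plot mask: the right-most column is bare soil.
plot_mask = [
    [1, 1, 1, 0],
    [1, 1, 1, 0],
    [1, 1, 1, 0],
]
coverage = canopy_coverage(plot_mask)
print(coverage)
```

Under this definition, any soil pixels that leak into the vegetation mask inflate the coverage value, so this metric inherits the soil removal issues discussed earlier.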

      More Charts

      notion image
      notion image
Tianqi