I am trying to calculate the genetic correlation (rG) between ancestor and descendant populations. However, because I do not know how the individuals are related, I am trying a simulation-based approach that uses phenotypes only (Zintras, 2011, Journal of Genetics).

Let's take the length of the longest branch and the number of leaf pairs as an example:

To remove the effect of generation (ancestors versus descendants), I work with the residuals of the two traits. This was repeated for two datasets from different regions, a and b:

# Length of longest branch / plant height
m1 <- lm(LengthLongestBranch ~ Generation, data = a)
summary(m1)
res3 <- resid(m1)
plot(fitted(m1), res3)
abline(0,0)
qqnorm(res3)
qqline(res3)

# Number of leaf pairs
m2 <- lm(NumbLeafPairs ~ Generation, data = a)
summary(m2)
res4 <- resid(m2)
plot(fitted(m2), res4)
abline(0,0)
qqnorm(res4)
qqline(res4)

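# Create a data frame for the residuals of the two traits.
# res and res2 are not defined in the code shown; they are assumed to be the
# residuals from the same two models in the repeated fit for the other region.
df <- data.frame(res)
df$res2 <- res2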

aheight <- c(res)
aleaf <- c(res2)

bheight <- c(res3)
bleaf <- c(res4)

# Initialize an empty vector to store the results
results <- numeric()

# Continue looping until the result vector has a length of 20
while (length(results) < 20) {
  # The vectors are not the same length, so I take a random sample of 10 from each
  shh <- sample(aheight, 10, replace = FALSE)
  shl <- sample(aleaf, 10, replace = FALSE)
  ssh <- sample(bheight, 10, replace = FALSE)
  ssl <- sample(bleaf, 10, replace = FALSE)
  
  one <- cov(shh, ssl)
  two <- cov(shl, ssh)
  three <- cov(shh, ssh)
  four <- cov(shl, ssl)
  
  # Keep the draw only if both "three" and "four" are positive,
  # so that the square root in the denominator is defined
  if (three > 0 && four > 0) {
    result <- (0.5 * (one + two)) / (sqrt(three * four))
    results <- c(results, result)
  }
}

# Display the results
#print(results)

mean(results)  # Should this be rG?
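
Written out, the value computed on each accepted pass of the loop is the following. The symbols are shorthand introduced here, not notation from the cited paper: $h$ and $l$ stand for the sampled height and leaf-pair residuals, and the subscripts $a$ and $b$ for the two datasets, so that `one`, `two`, `three` and `four` are the four covariances in the order they appear.

```latex
\hat{r}_G \;=\;
\frac{\tfrac{1}{2}\left[\operatorname{cov}(h_a, l_b) + \operatorname{cov}(l_a, h_b)\right]}
     {\sqrt{\operatorname{cov}(h_a, h_b)\,\operatorname{cov}(l_a, l_b)}}
```

The reported rG is then the mean of 20 such values.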

My trait measurements range from 4 to 33 leaf pairs and from 2.0 to 51 cm in height. However, some of the covariances come out negative, which affects the formula for the genetic correlation. When I run the loop until it has collected 20 values, the mean of the results (rG) sometimes turns out negative or above 1. As I understand it, it should lie between 0 and 1 (much like an r value from a regression).
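
The ratio in the loop is not a Pearson correlation: its denominator is built from two cross-covariances rather than from variances, so nothing forces it into the range -1 to 1. A minimal sketch with purely simulated, unrelated standard-normal vectors shows how often values outside that range can occur even without any true relationship. The function `sim_ratio` and the vectors `x1`, `y1`, `x2`, `y2` are made up for illustration and are not the real data.

```r
set.seed(42)  # arbitrary seed, only for reproducibility

# One pass of the loop body, applied to four unrelated simulated samples
sim_ratio <- function(n = 10) {
  x1 <- rnorm(n); y1 <- rnorm(n)   # stand-ins for the two traits in dataset a
  x2 <- rnorm(n); y2 <- rnorm(n)   # stand-ins for the two traits in dataset b
  three <- cov(x1, x2)
  four  <- cov(y1, y2)
  if (three > 0 && four > 0) {
    (0.5 * (cov(x1, y2) + cov(y1, x2))) / sqrt(three * four)
  } else {
    NA_real_                       # rejected draw, as in the while loop
  }
}

ratios <- replicate(2000, sim_ratio())
ratios <- ratios[!is.na(ratios)]   # keep accepted draws only

mean(abs(ratios) > 1)  # share of accepted draws outside [-1, 1]
range(ratios)          # smallest and largest accepted value
```

When `three` or `four` happens to be close to zero, the denominator becomes very small and the ratio can become very large, which would also pull the mean of only 20 values around.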

Can someone shed some light on what I might be missing? Maybe it is not even possible to calculate the genetic correlation based on simulations?
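
One detail that may matter here: each `sample()` call in the loop draws its own random set of values, so element i of one sampled vector is not matched to element i of another. A small sketch with simulated data compares independent draws with draws that share one index. The `height` and `leaf` vectors and the 0.8 slope are invented for illustration only.

```r
set.seed(1)  # arbitrary seed, only for reproducibility

n <- 200
height <- rnorm(n)
leaf   <- 0.8 * height + rnorm(n, sd = 0.6)  # simulated trait correlated with height

# Independent draws: position i in each sample refers to different plants
cov_indep <- replicate(1000, cov(sample(height, 10), sample(leaf, 10)))

# Shared index: both traits are taken from the same 10 plants
cov_paired <- replicate(1000, {
  idx <- sample(n, 10)
  cov(height[idx], leaf[idx])
})

mean(cov_indep)   # averages out close to zero, with random sign per draw
mean(cov_paired)  # stays close to the covariance built into the simulation
```

In this toy setting the independently sampled covariances carry no signal. Whether the same applies to covariances taken across two datasets of unrelated plants is a question about the method itself, which this sketch does not settle.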
