#### Getting the dispersion parameter from a negative binomial regression in Python

R code to find the dispersion parameter of the negative binomial regression model:

    library(MASS)  # provides glm.nb
    mod.Syn.L.plusGamma <- glm.nb(Syn ~ offset(I(1*log(L))), data = Lang.data)
    r = mod.Syn.L.plusGamma$theta

The dispersion parameter theta is estimated by `glm.nb` in R. The model summary is below:
```
Call:
glm.nb(formula = Syn ~ offset(I(1 * log(L))), data = Lang.data,
    init.theta = 71.78887386, link = log)

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept) -11.2986     0.1231  -91.77   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for Negative Binomial(71.7889) family taken to be 1)

    Null deviance: 494.73  on 3397  degrees of freedom
Residual deviance: 494.73  on 3397  degrees of freedom
AIC: 630.26

Number of Fisher Scoring iterations: 1

              Theta:  72
          Std. Err.:  1537
Warning while fitting theta: iteration limit reached

 2 x log-likelihood:  -626.26
```

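I am trying to replicate the same model in Python:

    import numpy as np
    import statsmodels.api as sm
    import statsmodels.formula.api as smf

    def model_3(data):
        formula = "Syn ~ 1"
        data['log_L'] = np.log(data['L'])  # offset on the log scale, as in the R model
        model = smf.glm(formula=formula, data=data, family=sm.families.NegativeBinomial(), offset=data['log_L'], observed=False).fit()
        return model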

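```
==============================================================================
Dep. Variable:                    Syn   No. Observations:                 3398
Model:                            GLM   Df Residuals:                     3397
Model Family:        NegativeBinomial   Df Model:                            0
Link Function:                    Log   Scale:                          1.0000
Method:                          IRLS   Log-Likelihood:                -313.43
Date:                Tue, 11 Jun 2024   Deviance:                       445.59
Time:                        14:20:10   Pearson chi2:                 3.37e+03
No. Iterations:                     6   Pseudo R-squ. (CS):              0.000
Covariance Type:            nonrobust
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept    -11.2997      0.125    -90.506      0.000     -11.544     -11.055
==============================================================================
```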

The intercept and its standard error are nearly identical to the values produced by the model in R; however, the dispersion parameter is not estimated during fitting. Some sources say that the dispersion parameter theta and the shape parameter alpha are aggregated so that their value equals one, which is the reported scale, but I am not sure that is completely true. What is clear is that neither alpha nor theta is directly accessible from the fitted model.
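For reference, here is a minimal sketch of how the two parameters are usually related, assuming both libraries use the common NB2 parameterization, where Var(Y) = mu + mu^2/theta in R's `glm.nb` and Var(Y) = mu + alpha*mu^2 in statsmodels, so that alpha = 1/theta. The helper names `nb2_variance` and `theta_to_alpha` are made up for illustration. As far as I understand, `sm.families.NegativeBinomial` takes `alpha` as a fixed argument (default 1.0) rather than estimating it, and the `Scale: 1.0000` in the summary is a separate GLM quantity.

```python
def nb2_variance(mu, alpha):
    """NB2 variance function: Var(Y) = mu + alpha * mu**2."""
    return mu + alpha * mu ** 2


def theta_to_alpha(theta):
    """Convert R's theta to alpha, assuming alpha = 1 / theta (NB2)."""
    return 1.0 / theta


# theta reported by glm.nb in the R summary above
theta_r = 71.78887386
alpha_from_r = theta_to_alpha(theta_r)

# Both forms of the variance agree for any mean, e.g. mu = 2.0
mu = 2.0
var_alpha_form = nb2_variance(mu, alpha_from_r)
var_theta_form = mu + mu ** 2 / theta_r

print(alpha_from_r, var_alpha_form, var_theta_form)
```

Under that assumption, passing `alpha=alpha_from_r` to `sm.families.NegativeBinomial` would fix the family at R's estimate instead of the default of 1.0.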

Are there other ways to estimate the dispersion parameter, maybe from the other model values such as variance and mu?
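To make the idea concrete, below is a rough sketch of a moment-based estimate built only from the observed counts and the fitted means. It assumes the NB2 variance function Var(Y) = mu + alpha*mu^2, so that the average of (y - mu)^2 - mu, divided by the average of mu^2, gives alpha. The function name `moment_alpha` and the toy numbers are hypothetical, and this is not the maximum-likelihood estimate that `glm.nb` reports.

```python
def moment_alpha(y, mu):
    """Moment estimate of the NB2 alpha from counts y and fitted means mu.

    Based on E[(y - mu)**2] = mu + alpha * mu**2. A negative result
    suggests no overdispersion relative to the Poisson model.
    """
    num = sum((yi - mi) ** 2 - mi for yi, mi in zip(y, mu))
    den = sum(mi ** 2 for mi in mu)
    return num / den


# Toy data (made up): four counts that share the same fitted mean
y_toy = [0, 2, 4, 10]
mu_toy = [4.0, 4.0, 4.0, 4.0]

alpha_hat = moment_alpha(y_toy, mu_toy)
# theta is the reciprocal of alpha, and only meaningful when alpha > 0
theta_hat = 1.0 / alpha_hat if alpha_hat > 0 else float("inf")

print(alpha_hat, theta_hat)
```

With a fitted statsmodels GLM, the fitted means should be available as `model.fittedvalues`, so something like `moment_alpha(data['Syn'], model.fittedvalues)` would apply this to the real data. Another option I have seen mentioned is `smf.negativebinomial`, which estimates alpha by maximum likelihood alongside the coefficients.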
