// Feb 20 2011 21:07:20
// Simulate an infection passing through a population
// 100 cases are exposed to the pathogen in each time period
// 2 groups, with different probability of infection at exposure
// (iff not already infected, hence these are hazard rates not probabilities)

// Note that the time-periods need not be of equal length, but that if
// they are, the underlying hazard is fixed. 

/*

What this demonstrates is that where the binary outcome is due to
exposure over time, rather than a one-off outcome, the OR and the RR are
defective. Instead a hazard rate is desired, and it happens that the
cloglog link in glm will give it. I'm not sure what to think about a
hazard model that doesn't know anything about the time of event.

A 1-unit change in x leads to a B change in log(-log(1-p))
exp(B) in -log(1-p)
exp(-exp(B)) in 1-p

if we reverse code, it is e(-e(B)) change in p

*/


postfile infect run t infnsm infsmo relrate OR clog using infect, replace every(1)

// Run the simulation 10 times
forvalues r = 1/10 {

// Create the simulated population
clear
set obs 5000
gen smoker = _n>_N/2
gen infected = 0
gen randso = 0

// tinf is time of infection -- if this is available proper hazard models are possible
gen tinf = 0

// At each of 200 steps ...
forvalues x = 1/200 {
  replace randso = uniform()
  sort randso
// ... expose a random 100 cases to the pathogen
  replace tinf = `x' if infected == 0 & _n<=100 & !smoker & uniform()<0.25
  replace tinf = `x' if infected == 0 & _n<=100 &  smoker & uniform()<0.50
  replace infected = 1 if tinf == `x'

// Calculate cloglog parameter, OR and RR
  qui tab smoker infected, matcell(tabdat)
  qui cloglog infected i.smoker
  matrix cll = e(b)
  scalar cll2 = cll[1,2]
  scalar infsmo  = tabdat[2,2]
  scalar infnsm  = tabdat[1,2]
  scalar relrate = (tabdat[2,2]/(tabdat[2,1]+tabdat[2,2]))/(tabdat[1,2]/(tabdat[1,1]+tabdat[1,2]))
  scalar OR      = (tabdat[2,2]/(tabdat[2,1]            ))/(tabdat[1,2]/(tabdat[1,1]            ))
  capture post infect (`r') (`x') (infnsm) (infsmo) (relrate) (OR) (cll2)
  }
save infhaz`r', replace
}
postclose infect