Slides

A copy of the slides can be found here.

Exercises

Shared slides

Again, please share your results with us and the other participants along the way by adding them to the shared slideshow here. We will use this slideshow to discuss our findings together.

3. Using the temporal/tiered PC algorithm (TPC)

We will now make use of the fact that the variables in nlsdata are organized in three temporal tiers, corresponding to the survey round in which they were measured: r1, r6, and r12.
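Each variable's tier is encoded in its name prefix. The following is a minimal base R sketch of how those prefixes could be inspected. It uses a toy vector of names: `r12_example` is a hypothetical placeholder, and the other names are taken from this exercise. On the real data, `names(nlsdata)` would take the place of `vars`.

```r
# Toy variable names; "r12_example" is a hypothetical placeholder
vars <- c("r1_mcollege", "r6_depressed", "r6_docvisits", "r6_exercise",
          "r12_example")

# The tier is taken to be everything before the first underscore
tier <- factor(sub("_.*$", "", vars), levels = c("r1", "r6", "r12"))

# Variables grouped by tier, in temporal order
split(vars, tier)

# Number of variables per tier
table(tier)
```

Taking the full prefix up to the underscore, rather than testing whether a name starts with "r1", avoids confusing `r1` with `r12`.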

3.1 Apply the temporal PC algorithm

Run the temporal PC algorithm:

tpcres <- tpc(nlsdata, sparsity = 0.05, order = c("r1", "r6", "r12"))

Plot the result and compare it with what you found in exercise 1.2 (ordinary PC algorithm):

  • Where do the two graphs agree, and where do they disagree?
  • Which graph is more informative, i.e., conveys more causal information? Consider both omitted adjacencies and orientations.
  • What proposed causal relationships in the new TPC result do you not trust?
  • What proposed causal relationships in the new TPC result do you find to be plausible?
  • Which of the two graphs, pcres and tpcres, do you find most plausible overall?
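Agreement on adjacencies and orientations can also be checked numerically. The sketch below is a base R illustration on two small toy graphs, not on `pcres` and `tpcres`. It assumes a convention chosen only for this example: `m["x", "y"] == 1` alone means x -> y, and 1 in both directions means an undirected edge. How to extract such matrices from the fitted objects depends on the package in use and is not shown.

```r
vars <- c("a", "b", "c")
empty <- matrix(0, 3, 3, dimnames = list(vars, vars))

# Toy graph 1: a - b, b - c (both undirected)
g1 <- empty
g1["a", "b"] <- g1["b", "a"] <- 1
g1["b", "c"] <- g1["c", "b"] <- 1

# Toy graph 2: a -> b, a -> c (both directed)
g2 <- empty
g2["a", "b"] <- 1
g2["a", "c"] <- 1

# Skeleton: TRUE wherever two variables are adjacent, ignoring direction
skeleton <- function(m) (m + t(m)) > 0

# List each adjacency once, using the upper triangle of the skeleton
list_adj <- function(s) {
  idx <- which(s & upper.tri(s), arr.ind = TRUE)
  paste(rownames(s)[idx[, "row"]], colnames(s)[idx[, "col"]], sep = " - ")
}

# Number of directed edges: a mark in one direction only
n_directed <- function(m) sum(m == 1 & t(m) == 0)

s1 <- skeleton(g1)
s2 <- skeleton(g2)

shared  <- list_adj(s1 & s2)    # adjacencies present in both graphs
only_g1 <- list_adj(s1 & !s2)   # adjacencies only in graph 1
only_g2 <- list_adj(s2 & !s1)   # adjacencies only in graph 2

c(g1 = n_directed(g1), g2 = n_directed(g2))
```

In this toy case the two graphs have the same number of adjacencies, but the second one orients both of its edges, so it carries more orientation information.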

3.2 Changing the tiers

We will now make three alternative versions of the dataset where variables are assigned differently to tiers:

# new dataset where mcollege is assigned to its own (first) tier
nlsdata_4tiers <- nlsdata
names(nlsdata_4tiers)[names(nlsdata_4tiers) == "r1_mcollege"] <- "r0_mcollege"

# new dataset where tiers r1 and r6 are collapsed into one (named r1)
nlsdata_bigearlytier <- nlsdata
names(nlsdata_bigearlytier)[names(nlsdata_bigearlytier) %in% c("r6_depressed", "r6_docvisits", "r6_exercise")] <- c("r1_depressed6", "r1_docvisits6", "r1_exercise6")

# new dataset where tiers r6 and r12 are collapsed into one (named r12)
nlsdata_biglatetier <- nlsdata
names(nlsdata_biglatetier)[names(nlsdata_biglatetier) %in% c("r6_depressed", "r6_docvisits", "r6_exercise")] <- c("r12_depressed6", "r12_docvisits6", "r12_exercise6")
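Note that indexing with `%in%` selects columns in the order in which they occur in the data frame, so the replacement names above are matched to the right variables only if the columns appear in the order depressed, docvisits, exercise. A minimal sketch of an order-independent alternative follows. It uses a hypothetical toy data frame `toy` and assumes that every `r6_` variable should be moved to the `r1` tier.

```r
# Hypothetical toy data with the r6 columns deliberately out of order
toy <- data.frame(
  r6_exercise  = c(1, 0),
  r1_mcollege  = c(0, 1),
  r6_depressed = c(0, 0)
)

# Rename by pattern: swap the r6_ prefix for r1_ and append "6" to keep
# track of the original survey round
is_r6 <- startsWith(names(toy), "r6_")
names(toy)[is_r6] <- paste0(sub("^r6_", "r1_", names(toy)[is_r6]), "6")

names(toy)
```

Because each new name is derived from the old one, no variable can be mislabeled, whatever the column order.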

Run the TPC algorithm on each of these new datasets using this code (note how tiers are specified):

tpcres_4tiers <- tpc(nlsdata_4tiers, sparsity = 0.05, order = c("r0", "r1", "r6", "r12"))
tpcres_bigearlytier <- tpc(nlsdata_bigearlytier, sparsity = 0.05, order = c("r1", "r12"))
tpcres_biglatetier <- tpc(nlsdata_biglatetier, sparsity = 0.05, order = c("r1", "r12"))
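Before running the algorithm on a renamed dataset, it can be worth checking that every variable's prefix appears in the tier vector. The helper `unmatched_vars()` below is a hypothetical base R sketch that takes the prefix to be the text before the first underscore. How `tpc` itself handles unmatched variables is not covered here.

```r
# Return the variables whose prefix does not occur in the tier order
unmatched_vars <- function(var_names, order) {
  prefix <- sub("_.*$", "", var_names)
  var_names[!(prefix %in% order)]
}

# Subset of the names in nlsdata_4tiers mentioned in this exercise
vars_4tiers <- c("r0_mcollege", "r6_depressed", "r6_docvisits", "r6_exercise")

# All prefixes covered: returns character(0)
all_covered <- unmatched_vars(vars_4tiers, c("r0", "r1", "r6", "r12"))

# Forgetting the new r0 tier: returns "r0_mcollege"
left_out <- unmatched_vars(vars_4tiers, c("r1", "r6", "r12"))
```

On the real data, `names(nlsdata_4tiers)` would take the place of `vars_4tiers`.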

Look at each of the models, and compare them with each other and with the results from exercises 3.1 and 2.1. Which version of the tier information seems to be most useful for the algorithm?

3.3 Update your CPDAG

Now, consider whether you want to change any parts of our first attempt at an expert CPDAG for these data, based on what you found using TPC.

  • Draw a new updated CPDAG. You can take a photo of it and share it in our joint slideshow.

3.4 Vary the test, sparsity level, or choice of variables

Go back to exercises 2.1, 2.3, or 2.4 and rerun them using the TPC algorithm instead of the PC algorithm.