Last updated: 2018-08-05
workflowr checks: ✔ R Markdown file: up-to-date
Great! Since the R Markdown file has been committed to the Git repository, you know the exact version of the code that produced these results.
✔ Environment: empty
Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproducibility it's best to always run the code in an empty environment.
✔ Seed: set.seed(20180609)
The command set.seed(20180609) was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.
✔ Session information: recorded
Great job! Recording the operating system, R version, and package versions is critical for reproducibility.
✔ Repository version: 191f152
The results in this page were generated with repository version 191f152 (committed via wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:
Ignored files:
Ignored: .DS_Store
Ignored: .Rhistory
Ignored: .Rproj.user/
Ignored: data/
Ignored: docs/.DS_Store
Ignored: docs/images/.DS_Store
Ignored: docs/images/.Rapp.history
Ignored: output/.DS_Store
Ignored: output/.Rapp.history
Ignored: output/MASHvFLASHgtex/.DS_Store
Ignored: output/MASHvFLASHsims/.DS_Store
Ignored: output/MASHvFLASHsims/backfit/.DS_Store
Ignored: output/MASHvFLASHsims/backfit/.Rapp.history
Note that any generated files, e.g. HTML, PNG, and CSS, are not included in this status report because it is okay for generated content to have uncommitted changes.
| File | Version | Author | Date | Message |
|---|---|---|---|---|
| Rmd | 191f152 | Jason Willwerscheid | 2018-08-05 | wflow_publish("analysis/conclusions.Rmd") |
In general, I find results obtained using FLASH to be more appealing than those obtained using MASH. While the simulation study showed that FLASH does poorly on independent effects, one doesn’t really expect to see large independent effects in the GTEx data. The situation where an identical effect is combined with a large unique effect is a much more plausible scenario, and FLASH outperforms MASH in such cases.
The canonical loadings of the OHF and “Top k” fits are essential to the success of FLASH. They make for more interpretable results and, at least visually, better explain the data.
It would be a good idea to increase the size of the random subset to get reasonable priors for all unique effects. With the present random subset, the prior for many tissues is effectively a point mass at zero, which guarantees that the final fit will miss any unique effects in those tissues. Alternatively, one might perform some post hoc manipulation of the priors by, for example, enforcing some minimum for each \(w_f\) and \(\sigma^2_f\).
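The second option above can be sketched in a few lines of base R. This is a hypothetical helper, not part of the actual fit: it assumes each tissue's unique-effect prior is a spike-and-slab with slab weight \(w_f\), and simply enforces a floor on that weight so no prior collapses to a point mass at zero.

```r
# Hypothetical post hoc adjustment (illustration only): enforce a minimum
# slab weight w_f for each tissue-specific prior. The complementary
# point-mass weight is implicitly 1 - w_f, so flooring w_f guarantees
# every tissue retains some prior mass on nonzero unique effects.
floor_slab_weight <- function(w_f, min_w = 0.01) {
  pmax(w_f, min_w)
}

# Example: two priors have (numerically) collapsed to a point mass at zero.
w <- c(0.30, 0, 1e-8, 0.90)
floor_slab_weight(w)
```

An analogous floor could be applied to each \(\sigma^2_f\); the right value for `min_w` would need to be chosen by inspecting how small the fitted weights actually get.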
Getting good loadings for the FLASH fits is crucial. (By “good”, I mean that they reflect some reality, but also that they are interpretable and, in general, sparse.) I think the loadings could be improved by adding nonnegativity constraints (particularly with the OHF approach). The next step, I think, is to experiment with fitting FLASH objects using “+uniform” ash priors.
Although the FLASH results are somewhat nicer than the MASH results, I'm not really comparing apples to apples, since I've used ED rather than FLASH to generate data-driven covariance matrices for MASH. It would be a good idea to repeat the analysis using covariance matrices derived from FLASH loadings.
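One way the last step might look, as a base-R sketch: take the FLASH factor/loading vectors over conditions and turn each into a rank-one covariance matrix for MASH. The matrix shapes and the normalization are assumptions for illustration; mashr's own interface for supplying covariance matrices would be used in practice.

```r
# Sketch (assumed shapes): F is an n_conditions x K matrix whose columns
# are FLASH factor values across conditions (e.g. GTEx tissues). Each
# column f_k yields a rank-one covariance matrix f_k f_k', normalized so
# its largest diagonal entry is 1, mimicking how data-driven covariance
# matrices are typically scaled for MASH.
cov_from_flash_factors <- function(F) {
  lapply(seq_len(ncol(F)), function(k) {
    f <- F[, k]
    U <- tcrossprod(f)   # rank-one covariance f_k f_k'
    U / max(diag(U))     # scale so the largest variance is 1
  })
}

set.seed(1)
F <- matrix(rnorm(44 * 3), nrow = 44)  # e.g. 44 tissues, 3 factors
U_list <- cov_from_flash_factors(F)
```

The resulting list could then be passed to MASH in place of the ED-derived matrices, making the FLASH-vs-MASH comparison rest on the same data-driven covariance structure.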
sessionInfo()
R version 3.4.3 (2017-11-30)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS High Sierra 10.13.6
Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] workflowr_1.0.1 Rcpp_0.12.17 digest_0.6.15
[4] rprojroot_1.3-2 R.methodsS3_1.7.1 backports_1.1.2
[7] git2r_0.21.0 magrittr_1.5 evaluate_0.10.1
[10] stringi_1.1.6 whisker_0.3-2 R.oo_1.21.0
[13] R.utils_2.6.0 rmarkdown_1.8 tools_3.4.3
[16] stringr_1.3.0 yaml_2.1.17 compiler_3.4.3
[19] htmltools_0.3.6 knitr_1.20
This reproducible R Markdown analysis was created with workflowr 1.0.1