Imputation is a method of inferring missing genotypes in a dataset. In our paper, "Imputation and quality control steps for combining multiple genome-wide datasets", we describe how to perform imputation on a large dataset and how to combine imputed datasets that were processed separately by genotyping platform into a single dataset. The imputation itself follows the standard instructions provided on the IMPUTE2 website. Below we provide two scripts: one combines imputed datasets, and the other converts IMPUTE2 files to PLINK files.
- Imputation Scripts - 1.0.0, released on January 15th, 2015
- impute2-group-join.py: This script takes all imputed files as input and produces one merged dataset as output.
- impute2-to-plink.py: This script converts IMPUTE2 files to PLINK format.
- Refer to each utility's --help option for more information.
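To illustrate what converting IMPUTE2 output to PLINK-style genotypes involves, here is a minimal Python sketch. It is not the code of impute2-to-plink.py and may differ from what that script does. It assumes the standard IMPUTE2 genotype file layout (five leading columns: SNP ID, rsID, position, allele A, allele B, followed by three probabilities per individual for the AA, AB and BB genotypes). The function name call_genotypes, the 0.9 calling threshold, and the example SNP rs123 are hypothetical choices made for illustration only.

```python
from typing import List, Tuple

# PLINK text formats use "0" to mark a missing allele.
MISSING = ("0", "0")


def call_genotypes(gen_line: str, threshold: float = 0.9) -> Tuple[str, List[Tuple[str, str]]]:
    """Turn one IMPUTE2 genotype line into hard calls, one per individual.

    The threshold is an assumed cut-off: a genotype is called only if its
    probability reaches it; otherwise the genotype is set to missing.
    """
    fields = gen_line.split()
    snp_id, rs_id, position, allele_a, allele_b = fields[:5]
    probs = [float(x) for x in fields[5:]]
    if len(probs) % 3 != 0:
        raise ValueError("expected three probabilities per individual")

    calls = []
    for i in range(0, len(probs), 3):
        p_aa, p_ab, p_bb = probs[i:i + 3]
        best = max(p_aa, p_ab, p_bb)
        if best < threshold:
            calls.append(MISSING)              # too uncertain to call
        elif best == p_aa:
            calls.append((allele_a, allele_a))
        elif best == p_ab:
            calls.append((allele_a, allele_b))
        else:
            calls.append((allele_b, allele_b))
    return rs_id, calls


# Hypothetical line with three individuals.
example = "--- rs123 1000 A G 0.98 0.02 0 0.05 0.93 0.02 0.4 0.35 0.25"
rs_id, calls = call_genotypes(example)
print(rs_id, calls)
# rs123 [('A', 'A'), ('A', 'G'), ('0', '0')]
```

The third individual has no genotype with probability of at least 0.9, so it is written as missing. A stricter or looser threshold trades call rate against call accuracy.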
For more information on imputation and our method, please refer to our paper.