Readme for NPUTE



Python (version 2.5-2.7, NOT version 3.0)

Numpy (









0 for impute, 1 for test

Single Window


A positive integer for the window size

Window File


The name of a file containing a list of window sizes, one per line (separated by newline characters).
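As a rough illustration of the window file layout described above, the following Python sketch builds the text of such a file from a list of window sizes.  The helper name format_window_list and the example sizes are ours, not part of NPUTE, and we assume a trailing newline after the last size is acceptable.

```python
# Hypothetical helper (not part of NPUTE): build the contents of a window file.
def format_window_list(sizes):
    """Return window-file text: one positive integer per line."""
    cleaned = []
    for size in sizes:
        size = int(size)
        if size <= 0:
            raise ValueError("window sizes must be positive integers")
        cleaned.append(str(size))
    # Assumption: a newline after the final entry is acceptable.
    return "\n".join(cleaned) + "\n"


# Example sizes chosen only for illustration.
text = format_window_list([5, 10, 15, 20])
print(text)
```

The resulting text can be saved under any file name and that name given as the Window File value.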

Window Range



Input File


The name of a comma-separated file containing the input data.  Rows are SNPs, columns are haplotypes.
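To make the row and column orientation concrete, here is a small Python sketch that splits comma-separated lines into rows (SNPs) and columns (haplotypes) and checks that every row has the same width.  The function name parse_snp_rows, the allele letters, and the use of "?" for an unknown value are assumptions made for this example only; check sample_data.csv for the actual coding.

```python
# Hypothetical helper (not part of NPUTE): inspect the shape of the input data.
def parse_snp_rows(lines):
    """Split comma-separated lines into a list of rows, one row per SNP."""
    rows = []
    for line in lines:
        line = line.strip()
        if line:  # skip blank lines
            rows.append(line.split(","))
    # Every SNP (row) should cover the same number of haplotypes (columns).
    widths = set(len(row) for row in rows)
    if len(widths) > 1:
        raise ValueError("rows have differing numbers of columns")
    return rows


# Illustrative data only: 3 SNPs by 4 haplotypes, "?" assumed to mean unknown.
sample = ["A,A,G,?", "C,T,?,C", "G,G,G,G"]
rows = parse_snp_rows(sample)
num_snps = len(rows)
num_haplotypes = len(rows[0])
print("%d SNPs x %d haplotypes" % (num_snps, num_haplotypes))
```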

Output File


The name of the file to output to.  DATA WILL BE OVERWRITTEN.





We recommend that you use NPUTE by first testing a large range of window sizes and then using the window size with the highest estimated accuracy for the actual imputation.  In this tutorial, we will guide you through the process of imputing the included sample data (sample_data.csv).


Once you have installed Python and Numpy, open a command-line interface: go to the Start Menu > Run*, type cmd, and press Enter.  Using DOS commands (if on Windows), navigate to the directory containing sample_data.csv.  The following command runs an imputation test on the data for a range of windows between 5 and 30 and writes the results to the file out.csv.


python -m 1 -r 5:30 -i sample_data.csv -o out.csv


After 5-15 minutes, the process will be complete and the output will be stored in out.csv.  When you open the file in Excel, you'll see that the highest estimated accuracy is for a window size of 12 (97.11% correct).
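If you would rather not pick the best window by eye, the selection step can be sketched in Python as follows.  This is only a sketch: it works on (window size, accuracy) pairs that you have already extracted from out.csv, because the exact layout of that file is not described here.  The function name best_window is ours, and apart from the window 12 figure quoted above, the accuracies below are made-up placeholders.

```python
# Hypothetical helper (not part of NPUTE): choose the most accurate window.
def best_window(results):
    """results: iterable of (window_size, accuracy) pairs.

    Returns the pair with the highest accuracy; on ties the first one seen wins.
    """
    best = None
    for window, accuracy in results:
        if best is None or accuracy > best[1]:
            best = (window, accuracy)
    if best is None:
        raise ValueError("no results given")
    return best


# Placeholder values, except window 12 (97.11%), which is quoted in the text.
results = [(5, 0.95), (12, 0.9711), (30, 0.96)]
window, accuracy = best_window(results)
print("best window: %d (%.2f%% correct)" % (window, accuracy * 100))
```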


Now to do the actual imputation at that window size, use the following command:


python -m 0 -w 12 -i sample_data.csv -o imputed_data_w12.csv


The unknowns will be imputed and all of the data will be written to imputed_data_w12.csv with imputed values in lower case.
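Because imputed values are written in lower case, a quick sanity check is to count them.  The Python sketch below does this for a list of lines; it assumes the output file uses the same comma-separated layout as the input, and the function name count_imputed and the sample lines are ours, for illustration only.

```python
# Hypothetical helper (not part of NPUTE): count lower-case (imputed) values.
def count_imputed(lines):
    """Return (imputed, total) value counts for comma-separated lines."""
    imputed = 0
    total = 0
    for line in lines:
        line = line.strip()
        if not line:  # skip blank lines
            continue
        for value in line.split(","):
            total += 1
            if value.islower():  # lower case marks an imputed value
                imputed += 1
    return imputed, total


# Illustrative lines only; real data would be read from imputed_data_w12.csv.
sample = ["A,A,G,a", "C,T,c,C", "G,G,G,G"]
imputed, total = count_imputed(sample)
print("%d of %d values were imputed" % (imputed, total))
```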



* If using Mac OS or Linux, use a terminal window (e.g. xterm) instead.




If you have any questions or comments, please feel free to contact us at or