Category: statistics

  • Binary encoding for human genetic data

    Abstract: The current representation of genetic data as [0,1,2] poses some key limitations on interpretation and analysis. The proposed solution treats the genetic data as a categorical variable with the following categories: [A,T,C,G,m]. Each nucleotide category indicates that at least one copy of that nucleotide is present at a given SNP, and the ‘m’ category indicates whether the SNP is homozygous…
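
    The encoding described in the abstract can be sketched roughly as follows. This is only an illustration built from the abstract's wording, not the authors' implementation: the function name `encode_genotype`, the two-letter genotype input (e.g. "AG"), and the column order A, T, C, G, m are all assumptions made for the example.

```python
# Hypothetical sketch: column order is assumed to follow the abstract's [A,T,C,G,m].
NUCLEOTIDES = "ATCG"


def encode_genotype(genotype: str) -> list[int]:
    """Encode a two-allele SNP genotype as five binary indicators [A, T, C, G, m]."""
    alleles = genotype.upper()
    if len(alleles) != 2 or any(a not in NUCLEOTIDES for a in alleles):
        raise ValueError(f"expected two nucleotides from ATCG, got {genotype!r}")
    # One indicator per nucleotide: 1 if at least one allele carries it.
    bits = [1 if base in alleles else 0 for base in NUCLEOTIDES]
    # 'm' indicator: 1 if both alleles are the same nucleotide (homozygous).
    bits.append(1 if alleles[0] == alleles[1] else 0)
    return bits


print(encode_genotype("AG"))  # heterozygous A/G
print(encode_genotype("AA"))  # homozygous A/A
```

    Under these assumptions a heterozygous A/G genotype sets the A and G indicators and leaves m at 0, while a homozygous A/A genotype sets only A and m, so the nucleotide identity stays visible in the encoded vector.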