The groundbreaking Human Genome Project has a diversity problem. Seventy percent of its first DNA sequence came from just one man, while the rest came from about 50 other volunteers.
Their data formed the backbone of the Human Genome Project's first DNA sequence, which has since become a reference: the standard to which every human DNA sequence is compared. Every person's genetic code is unique, so using just one reference genome (most of it from one person) to stand in for all of humanity has introduced subtle biases into genetics research.
A recent study of DNA from people of African descent showed that at least 300 million letters of DNA were missing from the reference genome.