Comments
danby t1_isgiefk wrote
I think a thing people maybe don't realise is that the 99.9% figure was a bit of a guesstimate 20 years ago. It has taken something like the 1000 Genomes Project to actually calculate the number and extent of the differences
RantingRobot t1_ishhfay wrote
Also, a concrete answer to the question doesn't really exist, since the number of differences varies depending on how you count them.
Some stretches of DNA do multiple, overlapping things, so is that counted as one difference or four? Some stretches of DNA can be the same in two people, but epigenetically expressed in one but not the other, so is that counted as one difference or none?
The number will always be kind of a guesstimate.
danby t1_ishjkfs wrote
Agreed. "Proportion of non-shared base pairs" is at least a decent, semi-objective way to compare two genomes without getting too far into the weeds about what exactly constitutes a difference. There are, at the end of the day, lots of differences that simply can't be expressed as a percentage difference (like gene/chromosome translocation)
Fmatosqg t1_ishmusp wrote
Since all of this is meant to produce proteins, it's only fair that the calculation is biased towards things that make different proteins.
So if a gene/allele gets moved to a different place, it still counts as no difference.
DreamWithinAMatrix t1_ishq800 wrote
Protein production used to be the thinking back in the day of the term "junk DNA", but we've since learned that there are actually sequences with non-protein-coding functions. Promoters and alternative splicing are the ones that come to mind. There are viral gene inserts that were originally thought to have no function but seem to be amplified in some regions, and are now hypothesized to be a source of accelerated evolution, such as in neurons, which may have contributed to how humans diverged from chimps. The epigenome consists of chemical marks such as methyl groups around the DNA, which can open or close chromatin to prevent genes from being expressed; this may be driven mainly by environmental conditions and can change frequently. Some portions of DNA can also fold back on themselves to prevent expression.
If you only look at the raw gene sequence and count only the protein-producing parts, you have no way of telling:
- how much
- how many kinds
- speed
- and whether the protein is currently being expressed
from the sequence alone. And there are so many of the above still being discovered that there's really no way to calculate all of that yet
joalheagney t1_isi6mvr wrote
Not to mention all the various segments that code for functional but non-protein encoding RNA.
doc_nano t1_ishqweh wrote
Well… sort of. While encoding proteins is arguably the most important and certainly the most visible function of the genome, there are parts that code for RNA that does not get translated into protein. These and other non-coding segments actually make up the majority of the human genome, and many of them play important roles. Though it is true that almost all those roles support the expression or regulation of proteins in some indirect way.
Also, a gene moving to a different locus can actually make a big difference, because the way it is expressed and regulated can change, even if it codes for the same protein.
danby t1_isis6zk wrote
> So if a gene/allele gets moved to a different place, it still counts as no difference.
Definitely not. Translocation often leads to, or implies, different expression of genes. As an aside, many, many translocations over large amounts of evolutionary time can lead to things like chromosome loss and/or speciation events. These are important forms of genetic change that do lead to real functional change, and they make genomes quite different in ways that aren't measurable by simple percentages.
BryKKan t1_isj5b4f wrote
See, that's the problem though. Simply translocating a sequence, with no alteration, can diminish or amplify expression dramatically. So that could still be considered a difference.
derefr t1_isih06b wrote
"Easy" — but impractical to calculate in practice — concrete answer: it's the information-theoretic co-compressibility of the all the dependent information required to construct one individual's proteome relative to another indivdual's.
(I.e., if you have all the DNA + methylations et al. of one person's genome stored in a file, which you then compress in an information-theoretically optimal way [not with a general-purpose compressor, but rather one that takes advantage of the structure of DNA, rearranging things to pack better], and then measure the file size of the result; and then you create another file which contains all that same [uncompressed] information, plus the information of a second person's DNA + methylations et al.; and you optimally compress that file; then by what percentage is the second optimally-compressed file larger than the first?)
Or, to use a fanciful analogy: if we had a machine to synthesize human cells "from the bottom up", and you had all the information required to print one particular human's cells stored somewhere — then how much more information would you need as a "patch" on the first human's data, to describe an arbitrary other particular human, on average?
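To make that concrete, here's a toy sketch of the idea in Python, using lzma as a crude stand-in for the DNA-aware compressor described above. The "genomes" are randomly generated and the absolute numbers are meaningless; only the relative sizes matter:

```python
import lzma
import random

random.seed(0)

def random_genome(n: int) -> bytes:
    """A made-up 'genome': n random bases, no biological structure."""
    return bytes(random.choice(b"ACGT") for _ in range(n))

genome_a = random_genome(200_000)

# genome_b differs from genome_a at 50 randomly chosen positions.
mutated = bytearray(genome_a)
for pos in random.sample(range(len(mutated)), 50):
    mutated[pos] = random.choice(b"ACGT")
genome_b = bytes(mutated)

size_a = len(lzma.compress(genome_a))
size_ab = len(lzma.compress(genome_a + genome_b))

# The extra bytes needed to describe B once A is already stored: the "patch".
print(f"A alone: {size_a} bytes; A then B: {size_ab} bytes; patch: {size_ab - size_a} bytes")
```

Since genome_b is almost entirely one long match against genome_a, the "patch" should come out far smaller than either genome compressed on its own, which is the co-compressibility intuition (a real DNA-aware compressor would do better still).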
Inariameme t1_isk4gr1 wrote
idk that i tend to agree with any of the computational architectures ;)
Simply: is DNA as linear as has been suggested, probabilistically?
snuffleupugus_anus t1_isnsed2 wrote
Would a metric like the ratio of varying base pairs to the differential in expressed proteins be a better metric? I realize that it's just a theoretical number and that we can't actually count literally every protein in a human body, but, as a thought experiment I suppose, is that a more meaningful depiction of actual genetic difference?
sunplaysbass t1_isgminf wrote
Half a percent range seems huge to me. But that's my know-nothing reaction.
Ixosis t1_isgv0ig wrote
Really isn’t that large when you find out we share 70% of our DNA with bananas
sunplaysbass t1_isgvgdx wrote
To me that is why 0.6% variance within humans is a lot, if we’re 30% off from being a banana.
powercow t1_ish1kn3 wrote
From what I read, we have less variation than other animals, due to some event 70,000 years ago that caused our population to collapse to only a few thousand.
PhilosopherFLX t1_ishdx0a wrote
Always wonder how that squares with Neanderthal interbreeding when Neanderthals mostly lived 130,000 to 40,000 years ago, right in the middle of 70,000.
ECEXCURSION t1_ishvhah wrote
Maybe Neanderthals hunted humans to the brink of extinction. Just like humans and vampires!
Sylvurphlame t1_isirxzk wrote
Nah. A giant race war is something humanity would never engage in…
Wait…
Angdrambor t1_isjtgb1 wrote
Makes you wonder if they hit that same bottleneck before we wiped them out.
Xais56 t1_ishh9ox wrote
Depends on the animal. I doubt cheetahs have much variance.
Something hardy and successful and desired by humans though I could see having huge variance. Cannabis plants must have incredible variance between sexually produced individuals. (I'm aware it's not an animal, but the point stands).
powercow t1_ishztdj wrote
Oh for sure, some have similar or even less than us. I was talking more about the average: we are a bit less genetically diverse than most. But especially among endangered species, I'd expect diversity to be lower than ours. Not all that long ago they discovered a family of stick insects that everyone thought was extinct, living in a bush on a remote island. Since only a single family of them was found, it's unlikely they are as diverse as we are.
LoreChano t1_ishdssl wrote
So this was about the time we started to create art and religion, among other things? I wonder if it's related.
jadierhetseni t1_isgwy84 wrote
Eh. It's hard to overstate how much of the genome isn't code-specific. That is, some of it is useless, some of it is structural (you need X bases of any sort), some of it is compositional (you need a lot of G and C, but the precise ratio isn't important), etc.
A lot of the major protein-coding, structural, and regulatory stuff is highly conserved, so there's a lot of overlap between any two species (e.g. humans + bananas)
But all of that other stuff? Eh. It can vary basically as much as it wants consequence-free, producing a lot of within species differences.
BiPoLaRadiation t1_isgxrl6 wrote
To be fair, the percentage of genes that differ is probably a lot higher than 30 percent. The 30 percent is the proportion of base-pair sequence that is similar between humans and bananas. So we and bananas both have a gene for, say, a sodium pump or some other gene shared by most living things, and on average the similarity between our average gene and theirs (of the roughly 7,000 genes compared in the original study) is about 40 percent (the actual original number), or less, because they tested gene products and not base pairs, so a lot of minor variability will still result in the same protein product.
If you were to compare on a gene-by-gene basis, probably none of our genes would be exactly the same as a banana's. We and bananas also each have multitudes of genes that are exclusive to us or to them, due to the structural differences and the long, long evolutionary divergence.
So a 0.6% difference in genetic sequence between humans, covering not just the base pairs of genes but also non-coding sequences, is actually really tiny. It's enough of a difference to do a lot, but it's not as big a difference as you're imagining.
Sylvurphlame t1_isis7ih wrote
The way my biology professor explained it, assuming I recall correctly after decades, is that it takes most of the DNA just to make a functional life form of any sort of complexity. So the amount that separates species, or individuals within a species, is relatively small. But important.
bschug t1_isj3h9q wrote
Is that overlap the same for every human, or are some humans closer to a banana than others?
sunplaysbass t1_isjecm7 wrote
Given this variance I can only assume some humans are closer or farther from being a banana than others. It could be a new path for eugenics, or perhaps a banana cult ranking system.
danby t1_islfy04 wrote
> To me that is why 0.6% variance within humans is a lot
Sure but this includes non coding and repetitive DNA which between individuals is somewhat unconstrained. If you look at only protein coding genes you get back down to variances closer to 0.1%
dunnp t1_ishmgl8 wrote
That's comparing just coding regions with bananas, not the non-coding regions, which are the vast majority of the human genome. So it's more like: 70% of the coding 2% of the genome is shared with bananas.
Thormeaxozarliplon t1_ishpbj8 wrote
That's only anecdotal. It's meant to show the common evolution of life. Most of that similarity is due to things like "housekeeping" genes and common biochemistry.
TomaszA3 t1_ish7nri wrote
I'll just drop here that small things can disable or enable almost the entirety of other "code". Like, change an "if" condition to its opposite: 0.0...1% of the code has been changed, but 99.9...% of the code is not executing at all. Or only half of the total code executes on one branch and the other half on another.
0.6% in such a highly flexible codebase could definitely carry massive functional changes (or none at all, which is the variation evolution works with).
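A hypothetical snippet to illustrate the point, with made-up function names; flipping one character in the condition is a tiny change to this "genome" but silences the main branch entirely:

```python
def process_everything(data):
    # Stands in for a large body of logic: most of the "genome".
    print(f"processing {len(data)} items")

def run(data):
    # Changing '>' to '<' below is a one-character edit to the source,
    # yet it stops the branch above from ever executing on normal input.
    if len(data) > 0:
        process_everything(data)
    else:
        print("skipped")

run([1, 2, 3])  # prints "processing 3 items"; with '<' it would print "skipped"
```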
sometimesgoodadvice t1_isp75d7 wrote
An interesting analogy but slightly flawed in terms of looking at genomes of already viable organisms. A person whose genome is sequenced to compare to the reference has already undergone the selection criteria for viability and development. Basically, there are plenty of sites where single mutations would lead to a complete breakdown of making a "human" but those would never be seen in a sequenced genome.
The other main difference is that of course code is written to be concise and concrete. As far as I know, no one pastes in some random code that doesn't perform a function just in case it may be needed in the future. Of course, biology works precisely in that way and the genome is a mess of evolutionary history with plenty of space for modification without really resulting in any functional change. So a better example of those 0.6% may be that you can have typos in the comments of the code. In fact, for any large piece of software, I would be surprised if the comment section did not contain at least 0.5% typos.
Shadows802 t1_ishatal wrote
So 99.4%?
promonk t1_ish7g8b wrote
Now I'm curious: whose genome is the human reference genome?
Kandiru t1_ishbuwl wrote
It's no one person's. It's a mishmash of several different high quality genomes, and then over time it's been changed to have the more common variants as the reference rather than the reference being a rare mutation for some genes.
promonk t1_ishdkr1 wrote
When you say "more common variants," common in what way?
I'm fascinated by the idea of a "reference human."
Kandiru t1_ishg2ww wrote
Say a certain position is an A for 90% of people, but a C for 10%. The A variant is more common than the C.
So where the reference previously had a C, in a later version it's often been changed to the most frequent base.
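A minimal sketch of that "most frequent base wins" rule, with made-up counts:

```python
from collections import Counter

# Hypothetical bases observed at one position across 100 sequenced genomes.
observed = ["A"] * 90 + ["C"] * 10

counts = Counter(observed)
reference_base, count = counts.most_common(1)[0]
print(reference_base, count / len(observed))  # A 0.9 -> the reference gets an A here
```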
promonk t1_ishu3eb wrote
I get that. What I'm curious about is sampling. 90% of which population? Is it 90 of some college-age kids being paid a hundred bucks for a cheek swab? Or is it drawn from a broad swathe of demographics and locations?
emfts t1_isiaa4s wrote
The first human reference genome (from the human genome project) was a group of people from all over, random volunteers.
You can read all about it here:
https://www.genome.gov/12513430/2004-release-ihgsc-describes-finished-human-sequence
Kandiru t1_isimfb1 wrote
The 1000 Genomes Project used populations from around the world.
http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/README_populations.md
Has a list of the ones used.
tsunamisurfer t1_ishsh54 wrote
Originally though, the reference genome was that of the first sequenced human genome, which I believe belonged to J Craig Venter.
Kandiru t1_isim3ny wrote
Actually there were two competing approaches at the beginning. Venter did sequence himself with shotgun sequencing, while the high fidelity BAC sequencing with Sanger sequencing was done on a range of different individuals spanning the genome.
So the first version of the reference was a mixture of them all.
Angdrambor t1_isjtnw2 wrote
What makes a genome "High quality"?
danby t1_islg7bo wrote
Though I only spent a handful of years in genome sequencing, I suspect what's meant here is that the sequence was based on several genomes for which they were able to prepare high-quality genomic libraries.
Angdrambor t1_ismpsxm wrote
What makes a genomic library high or low quality? Few errors? Faithful representation of the original?
Splatulance t1_isia4bj wrote
Typically the question of variance comes down to an aggregate statistic. The most common is the maximum likelihood estimate, which for a normal enough distribution (bell curve) is just the sample mean.
It's called maximum likelihood because it's the parameter value that makes the observed data most likely.
The more samples you have, the more genomes in this case, the better you can estimate the actual average. With enough samples the actual population mean is overwhelmingly likely to be the same as your estimate.
If the vast majority of people have 99% identical whatever, that's a very tightly grouped distribution around the mean with very low variance. It's practically a vertical line instead of a curve.
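A quick sketch of that convergence, with made-up numbers (a hypothetical true mean of 99.9% pairwise similarity with a tiny spread):

```python
import random
import statistics

random.seed(0)
true_mean, true_sd = 0.999, 0.0002  # hypothetical similarity distribution

for n in (10, 1_000, 100_000):
    sample = [random.gauss(true_mean, true_sd) for _ in range(n)]
    print(n, round(statistics.mean(sample), 6))
# The estimate settles ever closer to 0.999 as the sample grows.
```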
Slappy_G t1_isjxerj wrote
This is totally unrelated but I've also heard the figure of 1.6% for how different chimpanzees are compared to humans. So has that figure been revised, or are we saying that the variation inside of humans is much closer to the variation between the two species?
Cornelius_Physales t1_isig1v7 wrote
And even the reference of the 1000 Genomes Project leaves out low-complexity regions. The first whole genome of a human, from telomere to telomere, was only completely sequenced last year.
creperobot t1_isigt3p wrote
So what is the largest difference between the two most extreme samples?
PeanutSalsa OP t1_iskk2g1 wrote
How is it determined this is talking about both coding and non-coding DNA combined?
Ferociousfeind t1_isfvph0 wrote
It's generally not a very scientific number at all. There are multiple types of mutations that can occur, and they're not all readily comparable. If a certain section of 24 nucleotide pairs gets duplicated in one mutation event, is that 1 change, 24, or hundreds to thousands (by offsetting everything after the duplication)?
What this number really means is "human DNA is remarkably similar, human to human" and that's about as far as it goes.
Tractorcito22 t1_ishdyjn wrote
> remarkably
Is it remarkable though? I would imagine most species' DNA is unremarkably similar to the rest of that same species' DNA? Are humans more remarkable?
Sincerly_ t1_isi5c28 wrote
Yes, we are compared to other animals, and the reason we think this is that tens of thousands of years ago there was a supervolcano eruption that reduced our population to around 10,000 or fewer. So we are not as genetically diverse as other animals are.
Mylaur t1_isioah7 wrote
What?... So we would have been even more genetically diverse huh.
pihwlook t1_isjmc6t wrote
Why did this event not also constrict other species?
Because they had time to diverge before it, and our divergence came after it?
Sincerly_ t1_isk48bd wrote
It did affect certain animals, of course, but we were run to near extinction. Most of the animals we see today don't seem like their genetic diversity was affected that much. And yeah, most likely they diverged way before that; even before the supervolcano we had times of extremely low population.
DuskyDay t1_isjejxw wrote
From what I've read, other species have generally much higher variance.
Ffdmatt t1_isgfk3v wrote
Would it be better to say that we use the same base pieces for the code? We start with the same gene set and build from there. Mutations and differences in combinations are bound to occur (as random distribution helps prepare for change and survival anyway), but after millions of years I'd imagine a ton of the "formulas" end up the same or similar, simply because it's the most optimal combination for the function.
halfhalfnhalf t1_isghm3t wrote
It's not that our cells are optimally designed, it's that they are so intricate that any major deviation from the genome results in a non-viable organism.
Multi-cellular organisms are SO complex that there are extremely tight tolerances on most of their parts. A tiny deviation in one protein can mean the organism won't ever make it past fertilization. Most of those gene combinations were eliminated from the pool eons ago.
The 0.1% or whatever difference between humans is the wiggle room that can result in a viable human.
regular_modern_girl t1_ishp1le wrote
Yeah, genes are just sequences of codons which each correspond to an amino acid subunit of a protein. Certain amino acids have to be in just the right places in a protein's structure for it not to end up as a useless squiggly mess (useless at best, potentially toxic at worst; just look at the formation of amyloid plaques), and if even one base pair is off in DNA, that changes a given codon to another one (and unless the change is synonymous, there will be the wrong amino acid, and the whole protein is probably ruined).
I do 3D printing, and kind of think of protein synthesis and folding as similar in a certain way; when you're 3D printing something (on an FDM printer, at least), all it takes is one little crossing of one layer being set down wrong, and before you know it, you have an unrecognizable mass of plastic spaghetti that doesn't resemble what you were originally trying to print in the slightest, and you have no choice but to toss the whole thing in the recycle bin and start over. The problem is, with misfolded proteins there sometimes isn't any "starting over" if they're essential enough to a cell, and there often isn't an analogue to a recycle bin either (so some misfolded proteins can just keep accumulating until there's severe disease).
Basically, in both cases all it takes is one small error, and an entire print/protein ceases to be functional.
This is why mutations that lead to disease are generally more common than ones which end up being beneficial (as for an organism to benefit, it basically takes the altered protein actually being better than the original, or good for something else).
Splatulance t1_isiapmr wrote
Some dna isn't transcribed but has a significant impact on transcription/expression. Transcription is like the publicly exposed API
regular_modern_girl t1_islczzm wrote
Yeah I was kind of thinking how errors in promoters could be thought of like issues in the g-code (the programming language that 3D printers, laser cutters, etc. use) leading to certain layers not being printed and stuff like that.
Of course one aspect where this metaphor really breaks down is the time it takes to 3D print something versus a protein to fold into shape; the former takes anywhere from minutes to hours (depending on the size of the print, resolution, etc.), whereas the latter somehow occurs in just fractions of a second (and the mechanics of exactly how it happens so fast is still not entirely clear, which is why we still don’t really have accurate computer models of protein folding, and the field of protein engineering is still fairly nascent. Once we do have a better understanding, synthetic biology will enter a new age in which it will become not only possible to use tools like CRISPR Cas9 to edit genomes by inserting or removing pre-existing genes like we do now, but actually build entirely new genes from scratch, for novel proteins that have never existed in nature. We’ll basically have the most powerful pre-existing system for nano-engineering right at our fingertips).
Georgie_Leech t1_isgggwj wrote
That's just how DNA is though, it doesn't tell us anything about how similar it is across a given population. Like, you wouldn't make the observation that "most books are written transcriptions of language."
Muroid t1_isggua6 wrote
That’s really a better description of all DNA-utilizing life.
Humans are remarkably similar in their genetic code even by that baseline.
Quantum-Carrot t1_isgvs4l wrote
You also have to take into account epigenetics, like DNA methylation and histone modifications.
> I'd imagine a ton of the "formulas" end up the same or similar, simply because it's the most optimal combination for the function.
This happens even between different species. It's called convergent evolution.
regular_modern_girl t1_ishnfh1 wrote
I’ve brought this up several times here, but evolution doesn’t “optimize” things (at least in the way an intelligent being would), it’s a mindless process that stumbles onto “good enough” solutions for keeping organisms alive long enough to reproduce in a given niche. If evolution optimized things, we’d probably have really different anatomy.
adc34 t1_isgl231 wrote
I don't agree with so much in your comment that I have to reply, sorry.
First of all, we don't start with the same gene set. OK, maybe we did 4 billion years ago, but that's pure speculation, and we don't have any instruments to infer anything valuable from the fact that "there was a single organism at some point in time from which everything evolved". I'm pretty sure the whole picture is much, much more complicated. As for humans, I can say with high confidence that there was not a single evolved human being who became the genesis of all humans. Speciation is a complex thing, and there's always a period of hybridisation with closely related species. I won't delve into it deeper, but some fishes even rely on other species for their reproduction; that fish is actually really amazing, and there's a lot to unpack with its reproduction.
Secondly, there's no such thing as a random distribution that does something for organism fitness. The species that got an appropriate set of genes (and maybe even more than genes, like some epigenetic markers) survived. The ones that didn't, didn't. That's it. In many organisms there's not a unique set of genes ("formula") that lets them survive in a given environment. Gene regulation is incredibly intricate and has a ton of feedback loops. For example, some genes that are very important are often duplicated, like ribosomal or histone genes, and mutations in them don't do much.
Kevin_Uxbridge t1_ish01ze wrote
So similar that it bespeaks an interesting population history: recent bottlenecking and rapid expansion.
reginald_burke t1_ishxsag wrote
Don’t we have good definitions for this, such as the Levenshtein edit distance? For your example, Levenshtein would say 24 edits (via 24 additions).
Ferociousfeind t1_isib75n wrote
Single mutations can also involve the copying or deletion of large chunks of DNA. Levenshtein would be 23 edits off, because only one event was involved in adding a single 24-base segment of DNA. It's a simple thing to calculate, but it misses some of the behavior of mutation, and so misses a bit of the picture. The more true-to-life version is more complex, more nuanced, a bit more up to interpretation, and less capable of giving a single concrete percentage.
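A minimal sketch of the mismatch being described, using the classic dynamic-programming Levenshtein and a made-up sequence: one duplication event registers as 24 edits.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting single-character insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # delete ca
                            curr[j - 1] + 1,             # insert cb
                            prev[j - 1] + (ca != cb)))   # substitute
        prev = curr
    return prev[-1]

original = "ACGT" * 6 + "GGCC"          # made-up 28-base sequence
duplicated = original[:24] + original   # one event: the first 24 bases duplicated

print(levenshtein(original, duplicated))  # 24 edits for what was 1 mutation event
```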
light24bulbs t1_isi9wop wrote
Truly, that's just a count of the number of differing base pairs, which makes complete sense. This isn't that complicated. I'm sure you could argue it isn't the most RELEVANT figure a geneticist would be concerned with, but I think it's fair to say that's what they would take it to mean. I'd love to know if I'm wrong about that.
It's binary data: run a diff and give me the count. Since we are talking about the number 24, if there are 24 base pairs out of the total that differ, it's just 24 / total = variance ratio.
Likewise, the average is simple: take any two people and count the number of base pairs differing in each or present in one and not the other. Do that many times between different pairs of people; that's the average.
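For two already-aligned sequences of equal length, that count really is a one-liner (toy sequences below; the catch, per the rest of the thread, is that real genomes have insertions, deletions, and rearrangements, so they first need alignment):

```python
seq1 = "ACGTACGTAC"
seq2 = "ACCTACGTAA"  # differs at two positions

differing = sum(a != b for a, b in zip(seq1, seq2))
print(differing, differing / len(seq1))  # 2 differences -> variance ratio 0.2
```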
JackOCat t1_isic5l1 wrote
What no one explains that often is that it isn't really the differences in our proteins that make us different from each other (they do a bit, of course). What really differentiates humans is all the differences in the timing genes that control our development. The fluctuations there have huge effects on how we look and function at the macro level.
Nemisis_the_2nd t1_isgogji wrote
It's tangentially related, but people don't realise just how accurate and consistent DNA replication is. I don't have the error rate for humans, but for E. coli it's roughly 1 error per ~1,000,000,000,000 base pairs copied (give or take a zero; I also assume DNA error correction is taken into account here). For context, humans have ~3,200,000,000 nucleotides.
This incredibly low error rate means that organisms that are related to each other, even if the common ancestor was a few dozen generations ago, will have very similar DNA. As a result, the broad "99.9%" statistic would likely be accurate for both coding and non-coding DNA.
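Taking the figure above as roughly one error per 10^12 base pairs copied, the expected errors per full copy of a human-sized genome work out as follows (a back-of-the-envelope sketch using the comment's own numbers, not a measured human value):

```python
error_rate = 1e-12   # errors per base pair copied (figure quoted above, give or take a zero)
genome_size = 3.2e9  # approximate human genome length in base pairs

expected_errors = error_rate * genome_size
print(expected_errors)  # ~0.003 errors per complete genome replication
```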
Yusnaan t1_isgqqta wrote
Also don't forget that the epigenome (methylation, acetylation, histone modification, etc.) can make nearly identical DNA act very differently.
This has been observed in identical twins, and is still being heavily explored.
Thormeaxozarliplon t1_ishp3m7 wrote
This generally refers to total base pair comparison. However, the underlying concept of humans having very low genetic diversity is true. Things like gene allele frequencies are still an important way to measure diversity, and compared to most animals humans have extremely low genetic diversity due to human populations rapidly exploding in a geologically very short time as humans migrated around the world.
the__truthguy t1_ishsbbi wrote
This old statistic is always thrown out there to show how "similar" we are, but it's actually a pretty useless statistic. We share 96% of our DNA with chimps. 99.84% with Neanderthals. And 2-4% of non-Africans get their DNA from Neanderthals.
A single SNP or 1/3,054,815,723 of the genome can be the difference between a normal person and a cold-blooded killer.
dion_o t1_isillbz wrote
We share more than 99% of our DNA with chimps. And slightly less than 99% of our DNA with lettuce.
jawshoeaw t1_iskblhi wrote
We have almost no homology with lettuce; 10-15% of coding DNA, maybe. If you cherry-pick (lettuce-pick?) certain genes common to all life on Earth, then unsurprisingly we have 99% similar genes. But what about the genes for a spine? Or blood cells? There's no way to compare, because plants lack these genes completely.
And comparing to chimps is misleading too. Which genes are you comparing? Do they have to be 100% the same to count as 100% homologous? Any two cells in my body might be off by 0.1%; which changes count? One paper I read said we are only 18% homologous to chimps.
TheSwarm2006 t1_isikm8w wrote
To add on to this: most of the DNA tells the body how to make cells, organs, etc. Very little of this is a defining feature between humans and other animals. Also, DNA can repeat a LOT (look up salamander DNA repetition).
994phij t1_isj39ih wrote
> Your paper talks about an increased risk for shooting and stabbing behaviors
That's not the difference between a normal person and a cold blooded killer, it's a single factor amongst many.
the__truthguy t1_isj509v wrote
The MAOA 2-repeat allele is a very interesting mutation. I recommend reading all the papers published about it and educate yourself.
rva_law t1_isjbw6i wrote
A more recent and more accurate number is that 99.4% of coding and non-coding DNA is shared between Homo sapiens individuals. But also more recent, and possibly more important, is the timing of coding gene activation. Known as epigenetics, it's been shown that even shared coding DNA, expressed at slightly different times, results in differences between individuals.
murica_dream t1_isk5hog wrote
People don't appreciate how much it takes to code cells to create a beating heart from scratch. We all need the same basic body parts.
The misconception is our cultural obsession with being special, unique, and perfect to justify our existence.
In more mature cultures, the focus is on collective society, and you justify your existence by contributing to society. It can be volunteering, or doing the same exact things as other people; you bring value, and that justifies your worth.
dos0mething t1_ish6mhh wrote
This is a buzzword, plain and simple. It takes one nucleotide switch to go from perfectly healthy to sickle cell disease. The vast majority of genetic information is nonsense on purpose; if every line of genetic code mattered, any genetic insult would result in catastrophic change.
Ok_Common_1700 t1_ishcp2h wrote
99% of our DNA is noncoding (does not contain information for the production of proteins), but that doesn't mean it is junk. It may well be involved in the control of gene activity.
dos0mething t1_iskdr37 wrote
I didn't say all, I said the vast majority. I am well aware that there are epigenetic factors indirectly related to gene expression. What I'm saying is that there's a quality-over-quantity argument, in which a change in the right spot is what matters, not a raw number, which is what is touted in these buzzword statistics.
jethomas5 t1_ishnaya wrote
A fraction of DNA is there to aid in folding and orderly structure. That DNA has generally the same sequence, but it can get duplications and such without causing much trouble, and that does happen. That DNA is important, but it pretty clearly isn't coding for anything.
A lot of the rest is unclear. We COULD collect a lot of junk that caused no problems. It could just collect and do nothing. Useless fragments of ancient viruses etc. That's plausible. Or the same DNA could be somehow important. We might have libraries of inactivated viruses so we can recognize them if they show up again. There are lots and lots of possibilities.
The next 50 years are sure to give us a lot of exciting discoveries if climate change doesn't stop us. (Or nuclear war. That would be sad, to knock ourselves down entirely due to our own inability to solve problems.)
DarkestDusk t1_ishnlmf wrote
Well, thank you for the answer jethomas5.
CanadianJogger t1_isietuw wrote
>any genetic insult would result in catastrophic change.
That's nonsense. Every cell has its own DNA. After the zygote and blastocyst phases, genetic redundancy increasingly wards against catastrophe.
dos0mething t1_iskdz8j wrote
Don't think you read clearly what I wrote. I said that if there were no genetic redundancy or large swaths of intronic regions, any genetic insult would result in catastrophe.
adc34 t1_isfx6ri wrote
Both coding and non-coding DNA. Actually, 0.1% is a little bit outdated. The variance can be higher according to the 1000 Genomes Project. It is said in this article that:
> We find that a typical genome differs from the reference human genome at 4.1 million to 5.0 million sites. Although >99.9% of variants consist of SNPs and short indels, structural variants affect more bases: the typical genome contains an estimated 2,100 to 2,500 structural variants (∼1,000 large deletions, ∼160 copy-number variants, ∼915 Alu insertions, ∼128 L1 insertions, ∼51 SVA insertions, ∼4 NUMTs, and ∼10 inversions), affecting ∼20 million bases of sequence.
These ~20 million bases account for ~0.6% of the total genome length.
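The arithmetic behind that last sentence, as a quick check:

```python
affected_bases = 20_000_000  # ~20 million bases affected by structural variants
genome_length = 3.2e9        # approximate total genome length in base pairs

print(affected_bases / genome_length)  # 0.00625 -> ~0.6% of the genome
```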