The first complex cells had genes from a complex mix of species

Jun 11, 2026 - 19:10

Our ancestors’ genomes were built through successive waves of gene transfers.

We tend to view ourselves, and the complex cells that build us, as a branch of the tree of life distinct from the compact, seemingly featureless cells of bacteria and archaea. But we’ve found that our genome is actually a hybrid, a mish-mash of genes from bacteria and archaea, along with some that have evolved in our own lineage.

Scientists gradually settled on a simple explanation for this: the first complex cells were the product of a fusion between archaeal cells and bacteria, with the bacteria ultimately evolving into the mitochondria, a chemical-power-generating structure that still retains a bit of its own genome. Over time, many of the other bacterial genes were transferred to the nucleus of what was becoming what we now call a eukaryote, intermingling with the archaeal genes there.

But a new study has taken a careful look at some of the genes shared by all eukaryotes and concluded that the reality is a little more complicated: there were several waves of gene transfers from bacteria. The big picture of a merger between bacteria and archaea is still right, but it was only part of a larger picture in which gene transfers among species were commonplace.

Clouding the big picture

The road to the current picture was a complicated one. For starters, it took ages for anyone to even recognize that archaea were a distinct lineage. And the big advocate for mitochondria being the product of bacteria taking up residence in a different cell was laughed at for a number of years before her ideas became widely accepted, after which she started arguing that every complex structure inside eukaryotic cells had come about through similar processes. (There’s no evidence this is the case.)

Over time, especially as genome sequences became widely available, it became clear that the mitochondria’s genes, both the ones in their tiny remaining genome and the ones that are now found in our cells’ nucleus, originated in a bacterial lineage called alphaproteobacteria. But figuring out what had swallowed the alphaproteobacteria in the first place took a while longer. Suspicion fell on the archaea, but there are some key biochemical differences between them and us, and known archaea lacked even rudimentary versions of many of the systems that are key features of eukaryotes. Plus there were no archaeal genomes that were especially close to those of eukaryotes.

That only changed about a decade ago. After researchers developed the ability to assemble entire genomes from environmental samples without first separating out different cell types, they discovered the Asgard archaea, a group so closely related to eukaryotes that it led people to ask whether we shouldn’t just consider eukaryotes an elaborate branch of the archaea.

But even as the big picture became ever more complete, there was a steady addition of complications. For example, it became very clear that horizontal gene transfer—the swapping of genes among species that may only be distantly related—was incredibly common in microbial communities. Consistent with this, researchers continued to find clusters of genes in eukaryotes that came from lineages other than alphaproteobacteria.

Plus there was the challenge of figuring out what the ancestral eukaryotic genome looked like. If we defined a eukaryotic gene simply as a gene that hadn’t shown up in a bacterial or archaeal genome yet, that ran the risk of it turning up in a later discovery. We also may lack a broad enough collection of eukaryotic genomes to be confident of our ability to judge which genes belonged to the common ancestor of all eukaryotes. Different assumptions about issues like this could produce different evolutionary histories.

The first eukaryotes

That’s the situation that a group of Barcelona-based researchers decided to wade into. Their first step was to limit the number of species included in the eukaryotic family tree. The species we’ve sequenced tend to be heavily weighted toward animals and species found in familiar environments, resulting in an overrepresentation of some branches of the family tree. The team selected species that resulted in a more even distribution of samples across the tree.

Within the genomes they kept, they got rid of any genes that would produce what they termed “low complexity” proteins: think of something that repeats short stretches of the same amino acids many times. A lot of eukaryotic genes are also close relatives of one another, brought about by multiple duplications of an ancestral gene. The researchers only kept one gene from each of these collections of related genes. This resulted in a much smaller group of genes than is present in a typical eukaryotic genome.
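To make the two filtering steps concrete, here is a minimal Python sketch. It is not the researchers' pipeline, which the article doesn't detail. It assumes, purely for illustration, that low complexity can be approximated by the Shannon entropy of a protein's amino acid composition, and that each protein already carries a family label. The function names, the 2.5-bit threshold, and the example sequences are all hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(seq: str) -> float:
    """Shannon entropy (bits per residue) of the amino acid composition."""
    counts = Counter(seq)
    total = len(seq)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def filter_and_collapse(proteins, min_entropy=2.5):
    """Drop low-complexity proteins, then keep one protein per family.

    `proteins` is a list of (name, family, sequence) tuples. The family
    labels and the entropy threshold are illustrative assumptions.
    """
    kept = {}
    for name, family, seq in proteins:
        if not seq or shannon_entropy(seq) < min_entropy:
            continue  # repetitive, low-complexity sequence
        # The first surviving member of each family is its representative.
        kept.setdefault(family, name)
    return kept


# Hypothetical input: two near-identical relatives and two repetitive proteins.
example = [
    ("p1", "famA", "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"),
    ("p2", "famA", "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVA"),
    ("p3", "famB", "QQQQQQQQQQQQQQQQQQQQ"),
    ("p4", "famC", "SGSGSGSGSGSGSGSG"),
]
representatives = filter_and_collapse(example)
```

In this toy input, the two repetitive sequences fall below the threshold and are dropped, and only one of the two related proteins is kept. Real analyses typically rely on dedicated tools for both steps, and choosing a different representative per family is one way repeated runs could end up with different gene sets.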

And, once all of that was done, they repeated the process two more times, making different choices each time such that over half of the genes in each of the groups differed from the ones in any of the other groups. (Amusingly, the groups of genes in each were termed “orthologous groups,” leading to a lot of the paper talking about “OGs.”)

Looking through the functions of the genes that were present in these simplified genomes, the researchers could make some estimates of what sorts of genetic functions were present in the last common ancestor of all eukaryotes. Their conclusion is that the organism lived in an oxygen-containing environment and harvested energy either by eating other living things or feeding on their remains.

These cells already had complex interiors, with internal protein trackways traversed by motor proteins that move cargo within the cell. There were structures (lysosomes and peroxisomes) meant to digest proteins within the cells, and all the basics of eukaryotic metabolism, DNA replication, and RNA production. One big feature that was absent was the set of genes used to determine when a cell should divide and to manage the events that need to take place for that to happen. This may suggest that cell division started out as simply limited by metabolic concerns.

How’d this happen?

Roughly a third of gene groups appear to be distinct to eukaryotes and don’t have equivalents in the other domains of life. Some of those may have been present in the lineage that produced the last common ancestor of eukaryotes, and some may have been generated before eukaryotes really started to diversify and branch out.

As expected, many of the other genes came from either the Asgard archaea or Alphaproteobacteria, consistent with the big picture model of our origins. But the researchers also found roughly equal contributions from two other bacterial groups: Planctomycetota and Myxococcota. (All of the bacterial groups involved are diverse and relatively common, in sharp contrast to the Asgard archaea.) These results held up in each of the three gene sets the researchers had assembled, so they aren’t likely to be an artifact of the analysis.

There were also small contributions from a range of different bacterial groups. But species from the group of viruses that includes giant viruses contributed more than any single bacterial group.

The researchers also estimated the timing of when groups of genes were introduced. Asgard archaea represent the earliest contribution, as would be expected. But there was a bacterial lineage that introduced a lot of genes before the mitochondria were present and a second group that made a major contribution afterward. This makes sense if eukaryotes evolved within a microbial mat, where lots of species are in close proximity for long periods of time and may depend on each other for certain metabolites.

A complicated picture

While it’s clear that there was at least one case of endosymbiosis here—bacteria living on within another cell and eventually forming the mitochondria—it’s entirely possible there were others that were resolved in a way that only left a genetic remnant. It’s impossible to tell from this data.

The viral contribution may also be complex. It’s possible some of the genes are identified as viral because they’re found in present-day viruses, but had originated in other lineages that ultimately died out. Viruses commonly mediate horizontal gene transfer in microbial communities, which would be consistent with this sort of scenario.

The authors also acknowledge that this paper won’t be the last word on the topic. “Database completeness,” they write, “rather than methodological choices is likely to drive differences across studies.” In other words, as more genome sequences continue to find their way into public data banks, it may be worth revisiting this analysis. (The quoted statement also explains why their results may differ from others who have attempted similar things in the past.)

But, regardless of the uncertainties and the prospect of future revisions, the authors make one statement that will almost certainly remain correct, even if some of their results don’t hold up: “The prokaryotic-to-eukaryotic transition was probably a gradual and complex process.”

Nature, 2026. DOI: 10.1038/s41586-026-10639-9

John is Ars Technica's science editor. He has a Bachelor of Arts in Biochemistry from Columbia University, and a Ph.D. in Molecular and Cell Biology from the University of California, Berkeley. When physically separated from his keyboard, he tends to seek out a bicycle, or a scenic location for communing with his hiking boots.