Discovery
A new algorithm analyzes the text of the Bible to try to identify its authors.
Updated 10/14/2011 1:36:24 PM ET

A group of Israeli researchers has built a computer algorithm to decode one of the most important books in Western culture: the Bible.

The results accord generally with the consensus of scholars that the book contains writing styles defined as "priestly" and "non-priestly."

The scientists developed an algorithm able to analyze the writing styles found in different parts of the "five books of Moses," or Pentateuch: Genesis, Exodus, Leviticus, Numbers and Deuteronomy.

The algorithm compared sets of synonyms (called synsets) in blocks of text, along with "function" words, such as prepositions. It then looked at the distribution of the most common words in the Bible. By finding sets that were similar in any two blocks, it was able to group them according to the style they were written in.
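To make the idea concrete, here is a minimal sketch of grouping text blocks by synonym choice. It is not the researchers' code: the English toy synsets, the scoring rule and the function names are all assumptions made for illustration, and it leaves out the function words and common-word distributions the article also mentions.

```python
from itertools import combinations

# Hypothetical toy synsets: each concept maps to words treated as interchangeable.
SYNSETS = {
    "big": {"large", "great"},
    "speak": {"said", "spoke"},
}

def choices_by_concept(block, synsets=SYNSETS):
    """Map each concept to the synonyms that actually occur in the block."""
    words = set(block.lower().split())
    return {c: members & words for c, members in synsets.items() if members & words}

def similarity(a, b):
    """+1 for each shared concept expressed with the same synonym, -1 otherwise."""
    ca, cb = choices_by_concept(a), choices_by_concept(b)
    score = 0
    for concept in ca.keys() & cb.keys():
        score += 1 if ca[concept] & cb[concept] else -1
    return score

def split_in_two(blocks):
    """Seed two groups with the least similar pair, then assign every block
    to whichever seed it resembles more. Returns two lists of block indices."""
    i, j = min(
        combinations(range(len(blocks)), 2),
        key=lambda pair: similarity(blocks[pair[0]], blocks[pair[1]]),
    )
    groups = ([], [])
    for k, block in enumerate(blocks):
        closer_to_first = similarity(block, blocks[i]) >= similarity(block, blocks[j])
        groups[0 if closer_to_first else 1].append(k)
    return groups

blocks = [
    "he spoke of a great city",
    "she said the large house",
    "they spoke to the great king",
    "he said it was a large field",
]
print(split_in_two(blocks))
```

In this toy input, the blocks that prefer "spoke" and "great" end up in one group and those that prefer "said" and "large" in the other. That is the intuition behind grouping by style rather than by subject.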

The synonyms were identified using Hebrew roots that were translated the same way in the King James version, based largely on the work of the 19th century scholar James Strong.
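One way to picture that step is to group roots by the English word they share. This is a rough sketch under stated assumptions: the root identifiers and glosses are placeholders, not entries from Strong's work, and `build_synsets` is a hypothetical helper name.

```python
from collections import defaultdict

def build_synsets(root_to_gloss):
    """Group roots that share an English translation; keep groups of two or more."""
    by_gloss = defaultdict(set)
    for root, gloss in root_to_gloss.items():
        by_gloss[gloss].add(root)
    # A "synset" only tells us something if the writer had a choice of roots.
    return {gloss: roots for gloss, roots in by_gloss.items() if len(roots) > 1}

# Placeholder data for illustration only.
toy = {"root_1": "fear", "root_2": "fear", "root_3": "river"}
print(build_synsets(toy))
```

A gloss with only one root is dropped, since it offers no choice between synonyms and so carries no stylistic signal.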


Computer scientist Moshe Koppel of Bar-Ilan University, a member of the team that developed the algorithm, noted one interesting result: the synonyms for "God" weren't that important. "Some of the (synonyms) that do the heavy lifting on the Pentateuch had been noted before by scholars, but the most famous synset -- names of God -- actually didn't help at all."

That may sound counterintuitive, but Koppel said there are about 150 different sets, so the fact that a word of historical significance doesn't help determine authorship isn't that shocking.

To test the algorithm, the researchers used it to analyze two well-known books of the Bible, Jeremiah and Ezekiel, which scholars agree had two different authors. They cut the two texts into pieces and mixed the pieces together at random. The algorithm managed to separate the two with nearly 99 percent accuracy, demonstrating that the method worked.
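A test of this kind can be sketched in a few lines. It assumes the two sources are labeled "A" and "B" and that a clustering step, not shown here, returns a 0 or 1 for each chunk; the function names are illustrative rather than taken from the research.

```python
import random

def mix(chunks_a, chunks_b, seed=0):
    """Label chunks by source, then shuffle them together reproducibly."""
    labeled = [(c, "A") for c in chunks_a] + [(c, "B") for c in chunks_b]
    random.Random(seed).shuffle(labeled)
    return labeled

def two_way_accuracy(true_labels, predicted):
    """Fraction of chunks placed correctly. Cluster ids 0 and 1 are arbitrary,
    so score both possible mappings to sources and keep the better one."""
    matches = sum((t == "A") == (p == 0) for t, p in zip(true_labels, predicted))
    return max(matches, len(true_labels) - matches) / len(true_labels)

mixed = mix(["a1", "a2"], ["b1", "b2"])
truth = [label for _, label in mixed]
# A perfect clustering, with ids swapped relative to the labels, still scores 1.0.
swapped = [1 if label == "A" else 0 for label in truth]
print(two_way_accuracy(truth, swapped))
```

Scoring both mappings matters because a clustering algorithm has no way to know which of its groups corresponds to which book; it only knows the chunks belong apart.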

Koppel stressed that the algorithm can't say exactly how many authors the Bible has (or doesn't have). But it can say where styles change, and that alone can shed light on debates over authorship. Generally speaking, current scholarship divides the Pentateuch into two writing styles: priestly and non-priestly. In most areas the algorithm divided the text the same way, which would seem to support that division.

But there was one big caveat: the researchers had to tell the algorithm how many stylistic "families" they wanted the text to be split into. While asking for two gave a result that agreed generally with scholarly consensus, dividing the text into more than that seemed to stray from it.

University of Pennsylvania professor of linguistics Mark Liberman, who wasn't connected with the research, noted the big innovation was the use of synsets rather than just the location of words or their frequencies.

"The key to making such methods work is to hit on features (words or constructions or word-senses or whatever) that genuinely differentiate the authors," he said. "In their experiment on un-munging Jeremiah and Ezekiel, they found that word distributions did not work well; but synonym choice (as estimated in a clever way) did work."

That could make the algorithm useful for analyzing other historical texts, because it uses criteria not subject to interpretation. Ignoring what the writer "meant," it can quickly zero in on what was actually written. It can also pick up more subtle changes in word use and distribution than a human can, since it can instantly check through hundreds of synonym sets.

© 2012 Discovery Channel
