Google Definitions is a result shown at the top of Google's search results pages that displays a definition directly, so the user does not have to click through to a webpage to find it. This saves users time, but it also makes Google Definitions a zero-click search result.
You can learn a lot from Google’s patent:
Google Definitions Patent
(US Patent Application 20040236739)
Invented by Craig Nevill-Manning
Filed on June 27, 2003
Published on November 25, 2004
Patent granted on August 28, 2012
A system and method for providing definitions are described. A phrase to be defined is received. One or more documents, which each contain at least one definition, are determined. The phrase is matched to at least one of the definitions. One or more definitions for the phrase are presented.
Where do Google definitions come from?
Google sources its definitions from several locations, according to the patent application.
- Web crawling and spidering: Pages are crawled, and those determined to contain definitions are used.
- Authoritative sources: such as Wikipedia
- Real-time queries: The search engine finds definitions in real time.
- A mix: The search engine could use all three methods together.
How does Google determine that a document has definitions?
Here are some possibilities of how Google finds definitions:
- Terms: such as “glossary,” “definition,” “dictionary,” and other synonyms
- Wording: The flow of content, surrounding words, and meta
- HTML: Code that conveys meaning
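To make these signals concrete, here is a minimal sketch of how a page could be flagged as a likely definition source. It is only my illustration, not Google's method: the cue words come from the list above, and the function name looks_like_definition_page and the simple "title cue or <dl> tag" rule are assumptions.

```python
import re

# Cue words taken from the list above. The real signals Google uses are not public.
DEFINITION_CUES = ("glossary", "definition", "dictionary")


def looks_like_definition_page(title: str, body_html: str) -> bool:
    """Hypothetical check: cue word in the title, or a <dl> list in the markup."""
    has_cue = any(cue in title.lower() for cue in DEFINITION_CUES)
    has_dl = re.search(r"<dl\b", body_html, flags=re.IGNORECASE) is not None
    return has_cue or has_dl
```

A real system would likely weigh many more signals, such as the surrounding wording, rather than rely on a single yes/no rule.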
Possible methods used to parse documents and identify topics used for Google Definitions:
The patent application tells us that “definition containing documents” may be organized with headwords. A headword is the word or phrase being defined; it identifies an entry and is kept separate from the definition itself.
Some examples from the patent application of how Google may identify definitions:
- HTML definition tags: The <dl> HTML element represents a description list. The element encloses a list of groups of terms (specified using the <dt> element) and descriptions (provided by <dd> elements). Common uses for this element are implementing a glossary or displaying metadata (a list of key-value pairs).
- HTML separators: The search engine will look for HTML such as <p>, <tr>, and <li> to identify different definitions.
- HTML identifications: HTML such as <b>, <strong>, <em>, <code>, or <span> may be helpful in identifying headwords.
- The number of definitions: The patent application notes that a page needs a certain number of definitions to qualify; if it has fewer, Google may remove its definitions from consideration.
- Referring resources: Given the size of the internet, a definition can be corroborated when the same definition appears in multiple sources.
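The first and fourth bullets above can be sketched with Python's standard html.parser module. This is a simplified illustration under my own assumptions: the names GlossaryParser and extract_definitions are made up, the threshold of 2 is a placeholder (the patent application does not give a number here), and the parser expects explicit closing </dt> and </dd> tags.

```python
from html.parser import HTMLParser


class GlossaryParser(HTMLParser):
    """Collect (headword, definition) pairs from <dt>/<dd> elements."""

    def __init__(self):
        super().__init__()
        self.pairs = []       # finished (headword, definition) tuples
        self._tag = None      # "dt" or "dd" while inside one, else None
        self._headword = ""
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in ("dt", "dd"):
            self._tag = tag
            self._buffer = []

    def handle_data(self, data):
        if self._tag:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag != self._tag:
            return  # ignore inline tags such as </b> inside a term
        text = " ".join("".join(self._buffer).split())
        if tag == "dt":
            self._headword = text
        elif self._headword:
            self.pairs.append((self._headword, text))
        self._tag = None


def extract_definitions(html, min_definitions=2):
    """Return the pairs, or [] if the page has too few (hypothetical threshold)."""
    parser = GlossaryParser()
    parser.feed(html)
    return parser.pairs if len(parser.pairs) >= min_definitions else []
```

For example, feeding it a <dl> with two term and description pairs returns both pairs, while a page with a single pair returns an empty list and is dropped from consideration.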
How does Google process definitions?
- PageRank: Which definition Google decides to use could be based on PageRank.
- HTML processing: One or more of the following steps might be taken when presenting Google definitions:
  - Stripping HTML markup
  - Parsing out white space
  - Removing unnecessary punctuation (. : ; ! ? -)
  - Eliminating characters that are not alphanumeric or parentheses
  - Capitalizing the first letter of the definition
A definition could be disqualified if it:
- Starts with “see”
- Duplicates an existing definition
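The cleanup steps and disqualification rules above could look something like the sketch below. It is my own reading, not the patent's implementation: clean_definition and is_disqualified are hypothetical names, punctuation is only trimmed from the two ends of the text (an assumption), and "existing definition" is interpreted as a case-insensitive duplicate.

```python
import re


def clean_definition(raw: str) -> str:
    """Apply the cleanup steps in order; each step is a simplified guess."""
    text = re.sub(r"<[^>]+>", "", raw)     # strip HTML markup
    text = " ".join(text.split())          # collapse runs of white space
    text = text.strip(".:;!?- ")           # trim punctuation at both ends
    return text[:1].upper() + text[1:]     # capitalize the first letter


def is_disqualified(definition: str, existing: set) -> bool:
    """Reject "see ..." cross-references and duplicates of known definitions."""
    lowered = definition.lower()
    starts_with_see = re.match(r"see\b", lowered) is not None
    return starts_with_see or lowered in existing
```

Here existing is assumed to hold lowercased definitions that have already been accepted, so a second copy of the same text is skipped.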
Presenting additional information
An exact-match definition isn’t always possible. In those cases, Google could:
- Use related topics: A word could have a parent or related headwords that Google could show as a result.
- Suggest alternatives when no definitions are found: Alternate or related terms could be presented to check whether the searcher was looking for one of them.
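As a rough sketch of the fallback idea, the lookup below returns a definition on an exact match and otherwise suggests similar headwords. The patent application does not say how related terms are chosen; I am swapping in simple string similarity from Python's difflib as a stand-in, and the lookup function and the glossary dictionary are hypothetical.

```python
from difflib import get_close_matches


def lookup(phrase: str, glossary: dict) -> dict:
    """Exact match first; otherwise offer similar headwords as suggestions."""
    key = phrase.lower().strip()
    if key in glossary:
        return {"definition": glossary[key], "suggestions": []}
    # Stand-in for "related terms": headwords that are spelled similarly.
    suggestions = get_close_matches(key, list(glossary), n=3, cutoff=0.6)
    return {"definition": None, "suggestions": suggestions}
```

A production system would more plausibly use parent and related headwords, as the patent application describes, rather than spelling similarity alone.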
Google Definitions Takeaway
It’s essential to understand that Google may not use this process at all. Still, it offers insight into how Google may process these requests.
Credit must be given to Bill Slawski; I started reading his articles and found them too advanced, so I started re-writing them to understand the patents better. For a more advanced understanding of these patents, please check out his blog.
Published on: 2021-08-23
Updated on: 2021-09-03