N-Gram Analysis

An n-gram is a contiguous sequence of n items in a text, where the items may be words, numbers, symbols, or punctuation marks. N-gram models are useful in text analytics applications where word order matters, such as sentiment analysis, text classification, and text generation.

Applications

An n-gram model is a type of probabilistic language model that predicts the next item in a sequence from the n − 1 items before it, in the form of an (n − 1)-order Markov model.[2] N-gram models are now widely used in probability, communication theory, computational linguistics (for instance, statistical natural language processing), computational biology (for instance, biological sequence analysis), and data compression. Two benefits of n-gram models (and algorithms that use them) are simplicity and scalability: with larger n, a model can store more context with a well-understood space-time tradeoff, enabling small experiments to scale up efficiently.
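To make the Markov idea concrete, here is a minimal sketch of next-item prediction by counting. It assumes plain maximum-likelihood estimates with no smoothing; the function names and the toy sentence are invented for illustration and are not taken from any particular library.

```python
from collections import Counter, defaultdict


def train_ngram_model(tokens, n):
    """Count how often each item follows each (n - 1)-item context."""
    if n < 2:
        raise ValueError("n must be at least 2 so that a context exists")
    model = defaultdict(Counter)
    for i in range(len(tokens) - n + 1):
        context = tuple(tokens[i:i + n - 1])
        next_item = tokens[i + n - 1]
        model[context][next_item] += 1
    return model


def next_item_probabilities(model, context):
    """Maximum-likelihood estimate of P(next item | context)."""
    counts = model.get(tuple(context))
    if not counts:
        return {}  # unseen context: this unsmoothed sketch has no estimate
    total = sum(counts.values())
    return {item: count / total for item, count in counts.items()}


# Toy corpus (hypothetical), split on whitespace.
tokens = "the cat sat on the mat and the cat ran".split()

# n = 2 gives a first-order Markov model: one previous word of context.
model = train_ngram_model(tokens, n=2)
probs = next_item_probabilities(model, ["the"])
# "the" is followed by "cat" twice and "mat" once,
# so "cat" gets probability 2/3 and "mat" gets 1/3.
```

Raising n lengthens the context tuple, which is where the space-time tradeoff shows up: more distinct contexts must be stored, and each one is seen less often.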

Examples

The following are word-level 3-grams and 4-grams, with the number of times each appeared, from the Google n-gram corpus.[3]

3-grams

  • ceramics collectibles collectibles (55)
  • ceramics collectibles fine (130)
  • ceramics collected by (52)
  • ceramics collectible pottery (50)
  • ceramics collectibles cooking (45)

4-grams

  • serve as the incoming (92)
  • serve as the incubator (99)
  • serve as the independent (794)
  • serve as the index (223)
  • serve as the indication (72)
  • serve as the indicator (120)
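Counts like the ones listed above can be produced by tallying every n-gram in a corpus. The sketch below does this on a tiny made-up sentence, so the numbers are not Google's; `count_ngrams` is a hypothetical helper name, and whitespace splitting stands in for real tokenization.

```python
from collections import Counter


def count_ngrams(tokens, n):
    """Tally every run of n consecutive tokens."""
    # zip over n shifted copies of the token list yields each n-gram as a tuple.
    return Counter(zip(*(tokens[i:] for i in range(n))))


# Made-up text for illustration only.
text = "serve as the index and serve as the indicator and serve as the index"
counts = count_ngrams(text.split(), 4)

# Keep only the 4-grams that start with a given three-word prefix.
prefix = ("serve", "as", "the")
matches = {gram: c for gram, c in counts.items() if gram[:3] == prefix}

for gram, c in sorted(matches.items()):
    print(" ".join(gram), f"({c})")
```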

FAQs

  • How do you make N-Grams?
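
One common way to make n-grams is to slide a window of length n across a sequence of items and record each window. The sketch below assumes the simplest possible tokenization (splitting on whitespace); `make_ngrams` is a hypothetical helper, not a function from an existing library.

```python
def make_ngrams(items, n):
    """Return every run of n consecutive items as a tuple."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    # A sequence shorter than n yields an empty list.
    return [tuple(items[i:i + n]) for i in range(len(items) - n + 1)]


# Word-level 2-grams from a whitespace-split sentence.
sentence = "n-grams are easy to make"
word_bigrams = make_ngrams(sentence.split(), 2)

# Character-level 3-grams: a string is already a sequence of characters.
char_trigrams = make_ngrams("text", 3)
```

The same function works for words, characters, or any other item type, because it only relies on slicing the input sequence.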

Conclusion

N-gram models are a powerful tool for text analytics and should be considered when analyzing text data. They can be used to determine sentiment, classify documents, and generate text. If you’re looking for a way to get more insights from your text data, consider using an n-gram model.


Published on: 2022-03-28
Updated on: 2022-05-12

Isaac Adams-Hands

Isaac Adams-Hands is the SEO Director at SEO North, a company that provides Search Engine Optimization services. Isaac has considerable expertise in Search Engine Optimization, Server Administration, and Cyber Security, which gives him a leg up as a Google Algorithm Analyst and SEO Expert.