Google open-sources mT5, a multilingual model trained on over 101 languages

VentureBeat | Oct 26, 2020 at 5:37 PM
  • Not to be outdone by Facebook and Microsoft, both of which detailed cutting-edge machine learning language algorithms in late October, Google this week open-sourced a model called mT5 that the company claims achieves state-of-the-art results on a range of multilingual natural language processing tasks.
  • mC4, the Common Crawl-derived corpus on which mT5 was trained, covers 107 languages with 10,000 or more webpages across all of the 71 monthly scrapes released to date by Common Crawl.
  • Some studies suggest that open-domain question-answering models — models theoretically capable of responding to novel questions with novel answers — often simply memorize answers found in their training data, depending on the data set.