Amazon's new Alexa Teacher Models language model, AlexaTM 20B, has outperformed OpenAI's GPT-3 and Google's PaLM on several NLP benchmarks. The model has not yet been released publicly, but it has already impressed experts.
The previously introduced OpenAI GPT-3 and Google PaLM are decoder-only models, while AlexaTM 20B is a seq2seq model with both an encoder and a decoder, a design intended to improve machine translation (MT) and generalization performance. It demonstrates state-of-the-art (SOTA) few-shot learning ability.
Amazon's new language model outperformed GPT-3 on the SQuADv2 and SuperGLUE benchmarks with roughly one-eighth as many parameters (20 billion versus 175 billion), and achieved excellent performance on few-shot MT tasks.
AlexaTM 20B also led on benchmarks such as MLSum, outperforming all other models on one-shot summarization in Spanish, German, and French, and on most language pairs in one-shot MT tasks. On MT tasks involving English, the model outperformed GPT-3 but fell short of the larger PaLM model.
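To make the "one-shot" setting concrete: the model is shown a single solved example in its input and is then asked to complete a new one, with no retraining. The sketch below only builds such an input as text. The function name `build_few_shot_prompt` and the "German: ... / English: ..." layout are illustrative assumptions, not the prompt format AlexaTM 20B actually uses.

```python
from typing import List, Tuple


def build_few_shot_prompt(
    examples: List[Tuple[str, str]],
    query: str,
    source_label: str = "German",
    target_label: str = "English",
) -> str:
    """Format k solved examples followed by one unsolved query.

    Hypothetical layout for illustration only; real models may expect
    a different prompt format.
    """
    blocks = [
        f"{source_label}: {src}\n{target_label}: {tgt}"
        for src, tgt in examples
    ]
    # The final block leaves the target side empty for the model to complete.
    blocks.append(f"{source_label}: {query}\n{target_label}:")
    return "\n\n".join(blocks)


# One example in the prompt means "one-shot"; an empty list would be "zero-shot".
one_shot = build_few_shot_prompt(
    examples=[("Guten Morgen.", "Good morning.")],
    query="Wie geht es dir?",
)
print(one_shot)
```

Passing several example pairs instead of one turns the same input into a few-shot prompt.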
Check out the GitHub repository and arXiv paper.
The name seq2seq is short for "sequence-to-sequence". The approach originated as a class of recurrent neural network architectures in which an encoder reads the input sequence and a decoder generates the output sequence. Models of this kind can help not only with complex university-level mathematical problems but also with complex language tasks, including machine translation, chatbot creation, question answering, and text summarization.
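The encode-then-decode control flow of a sequence-to-sequence model can be sketched in a few lines. This is a toy illustration, not a neural network: `encode` and `decode_step` are hand-written stand-ins (hypothetical names) for the learned encoder and decoder, and the "task" is simply reversing the input.

```python
from typing import List

EOS = "<eos>"  # end-of-sequence marker that tells the loop to stop


def encode(source_tokens: List[str]) -> List[str]:
    """Stand-in encoder: read the whole input and return a context.

    A real encoder would return learned vectors; here the context is
    just a copy of the input tokens.
    """
    return list(source_tokens)


def decode_step(context: List[str], generated: List[str]) -> str:
    """Stand-in decoder step: choose the next output token from the
    context and the tokens generated so far."""
    position = len(generated)
    if position >= len(context):
        return EOS
    # Toy task: emit the input tokens in reverse order.
    return context[len(context) - 1 - position]


def seq2seq(source_tokens: List[str], max_len: int = 50) -> List[str]:
    """Encode the input once, then decode one token at a time."""
    context = encode(source_tokens)
    generated: List[str] = []
    for _ in range(max_len):
        token = decode_step(context, generated)
        if token == EOS:
            break
        generated.append(token)
    return generated


print(seq2seq(["a", "b", "c"]))  # ['c', 'b', 'a']
```

The point of the sketch is the shape of the loop: the input is read in full before any output is produced, and each output token depends on both the encoded input and the output so far.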