PaLM

Large language model developed by Google

Developer(s): Google AI
Predecessor: LaMDA
Successor: Gemini
Available in: English
Type: Large language model
Website: ai.google/discover/palm2/

PaLM (Pathways Language Model) is a 540 billion parameter transformer-based large language model developed by Google AI.[1] Researchers also trained smaller versions of PaLM, with 8 and 62 billion parameters, to test the effects of model scale.[2]

PaLM is capable of a wide range of tasks, including commonsense reasoning, arithmetic reasoning, joke explanation, code generation, and translation.[2][3][4][5] When combined with chain-of-thought prompting, PaLM achieved significantly better performance on datasets requiring multi-step reasoning, such as word problems and logic-based questions.[1][2]
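Chain-of-thought prompting works by including a worked reasoning trace in the prompt's exemplars, so the model emits its own intermediate steps before a final answer. The sketch below is a generic illustration of the technique, not PaLM's actual API or prompts; the exemplar question and helper function are invented for illustration.

```python
# Minimal sketch of chain-of-thought prompting: the same one-shot prompt
# with and without an intermediate reasoning trace in the exemplar answer.

def build_prompt(question: str, chain_of_thought: bool) -> str:
    """Build a one-shot prompt; optionally include worked reasoning."""
    exemplar_q = ("Q: Roger has 5 balls. He buys 2 cans of 3 balls each. "
                  "How many balls does he have now?")
    if chain_of_thought:
        # The exemplar demonstrates intermediate steps, nudging the model
        # to reason step by step before giving its answer.
        exemplar_a = ("A: Roger started with 5 balls. 2 cans of 3 balls "
                      "each is 6 balls. 5 + 6 = 11. The answer is 11.")
    else:
        # Standard prompting: the exemplar shows only the final answer.
        exemplar_a = "A: The answer is 11."
    return f"{exemplar_q}\n{exemplar_a}\n\nQ: {question}\nA:"

prompt = build_prompt("A baker makes 4 trays of 6 rolls. How many rolls?",
                      chain_of_thought=True)
```

On multi-step problems like the word problems mentioned above, the reasoning-trace variant is the one that yielded the reported gains.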

The model was first announced in April 2022 and remained private until March 2023, when Google launched an API for PaLM and several other technologies.[6] The API was initially available to a limited number of developers who joined a waitlist before it was released to the public.[7]

Google and DeepMind developed a version of PaLM 540B called Med-PaLM that is fine-tuned on medical data and outperforms previous models on medical question-answering benchmarks.[8][9] Med-PaLM was the first AI model to obtain a passing score on U.S. medical licensing questions. In addition to answering both multiple-choice and open-ended questions accurately, it also provides reasoning and is able to evaluate its own responses.[10]

Google also extended PaLM using a vision transformer to create PaLM-E, a state-of-the-art vision-language model that can be used for robotic manipulation.[11][12] The model can perform tasks in robotics competitively without the need for retraining or fine-tuning.[13]

In May 2023, Google announced PaLM 2 at the annual Google I/O keynote.[14] PaLM 2 is reported to be a 340 billion parameter model trained on 3.6 trillion tokens.[15]
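The reported parameter and token counts allow a rough training-compute estimate via the common approximation C ≈ 6ND (about six floating-point operations per parameter per token). The approximation and the resulting figure are illustrative estimates, not numbers Google has published.

```python
# Back-of-the-envelope training-compute estimate using the common
# C ≈ 6 * N * D rule of thumb (~6 FLOPs per parameter per token).

def training_flops(params: float, tokens: float) -> float:
    """Approximate total training compute in FLOPs."""
    return 6 * params * tokens

n_params = 340e9    # reported PaLM 2 parameter count
n_tokens = 3.6e12   # reported PaLM 2 training tokens
c = training_flops(n_params, n_tokens)  # ~7.3e24 FLOPs
```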

In June 2023, Google announced AudioPaLM for speech-to-speech translation, which uses the PaLM 2 architecture and initialization.[16]

Training

PaLM is pre-trained on a high-quality corpus of 780 billion tokens drawn from a variety of natural language use cases. The dataset includes filtered webpages, books, Wikipedia articles, news articles, source code from open-source repositories on GitHub, and social media conversations.[1][2] It is based on the dataset used to train Google's LaMDA model.[2] Social media conversations make up 50% of the corpus, which aids the model's conversational capabilities.[2]

PaLM 540B was trained over two TPU v4 Pods, each containing 3,072 TPU v4 chips attached to 768 hosts, connected using a combination of model and data parallelism; this was the largest TPU configuration described at the time.[2][17] It allowed for efficient training at scale, using 6,144 chips in total, and set a record for training efficiency among LLMs at this scale: a hardware FLOPs utilization of 57.8%.[3]
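Hardware FLOPs utilization (HFU) is the fraction of the hardware's theoretical peak throughput actually achieved during training. The sketch below shows how the quoted 57.8% relates to the chip count from the text; the 275 TFLOP/s peak per TPU v4 chip is an assumed figure for illustration, not one stated in this article.

```python
# Sketch of the "hardware FLOPs utilization" (HFU) metric quoted above:
# observed throughput divided by the cluster's theoretical peak.

N_CHIPS = 6144           # two Pods of 3,072 TPU v4 chips (from the text)
PEAK_PER_CHIP = 275e12   # assumed bf16 peak FLOP/s per TPU v4 chip

def hardware_flops_utilization(observed_flops_per_s: float) -> float:
    """Fraction of the cluster's peak throughput actually achieved."""
    return observed_flops_per_s / (N_CHIPS * PEAK_PER_CHIP)

# Working backwards, a 57.8% HFU would imply an aggregate throughput of
# roughly 0.578 * peak, i.e. on the order of 1e18 FLOP/s:
implied = 0.578 * N_CHIPS * PEAK_PER_CHIP
```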

See also

  • LaMDA, PaLM's predecessor
  • Gemini, PaLM's successor
  • Chinchilla

References

  1. ^ a b c Narang, Sharan; Chowdhery, Aakanksha. "Pathways Language Model (PaLM): Scaling to 540 Billion Parameters for Breakthrough Performance". ai.googleblog.com. Retrieved 17 March 2023.
  2. ^ a b c d e f g Chowdhery, Aakanksha; Narang, Sharan; Devlin, Jacob; et al. (2022). "PaLM: Scaling Language Modeling with Pathways". arXiv:2204.02311 [cs.CL].
  3. ^ a b Anadiotis, George (12 April 2022). "Google sets the bar for AI language models with PaLM". VentureBeat. Retrieved 17 March 2023.
  4. ^ Bastian, Matthias (5 April 2022). "Google PaLM: Giant language AI can explain jokes". THE DECODER. Retrieved 17 March 2023.
  5. ^ "Google: Why Is No One Talking About PaLM (NASDAQ:GOOG) | Seeking Alpha". seekingalpha.com. 12 December 2022. Retrieved 17 March 2023.
  6. ^ Vincent, James (14 March 2023). "Google opens up its AI language model PaLM to challenge OpenAI and GPT-3". The Verge. Retrieved 17 March 2023.
  7. ^ Huffman, Scott; Woodward, Josh. "PaLM API & MakerSuite: an approachable way to start prototyping and building generative AI applications". Retrieved 17 March 2023.
  8. ^ Singhal, Karan; Azizi, Shekoofeh; Tu, Tao; et al. (2022). "Large Language Models Encode Clinical Knowledge". arXiv:2212.13138 [cs.CL].
  9. ^ "MedPaLM: New Chatbots Will Soon Be Better Than Waiting For A Doctor". The Medical Futurist. 17 January 2023. Retrieved 17 March 2023.
  10. ^ Matias, Yossi; Corrado, Greg (14 March 2023). "Our latest health AI research updates". Google. Retrieved 17 March 2023.
  11. ^ Driess, Danny; Xia, Fei; Sajjadi, Mehdi S. M.; et al. (2023). "PaLM-E: An Embodied Multimodal Language Model". arXiv:2303.03378 [cs.LG].
  12. ^ Driess, Danny; Florence, Pete. "PaLM-E: An embodied multimodal language model". ai.googleblog.com. Retrieved 17 March 2023.
  13. ^ Edwards, Benj (7 March 2023). "Google's PaLM-E is a generalist robot brain that takes commands". Ars Technica. Retrieved 17 March 2023.
  14. ^ Lardinois, Frederic (May 10, 2023). "Google launches PaLM 2, its next-gen large language model". TechCrunch. Archived from the original on May 10, 2023. Retrieved May 10, 2023.
  15. ^ Elias, Jennifer (16 May 2023). "Google's newest A.I. model uses nearly five times more text data for training than its predecessor". CNBC. Retrieved 18 May 2023.
  16. ^ "AudioPaLM". google-research.github.io. Retrieved 2023-06-30.
  17. ^ "An empirical analysis of compute-optimal large language model training". www.deepmind.com. Retrieved 17 March 2023.