December 2, 2023

Taking a serious hit from OpenAI’s GPT-4, Google fought back with a new, more powerful large language model (LLM) to upgrade Bard and create a new suite of AI services—starting with a model aimed at doctors. It also teased its next-generation Gemini AI.

Introduced at the company’s I/O ’23 conference, PaLM 2 is described in an accompanying technical report, but Google has chosen to highlight three areas where it believes the new model is particularly strong.

The first is multilingualism. PaLM 2’s training data includes a larger proportion of non-English text, and the model can now pass a range of language exams at the “proficient” level. It outperforms Google’s own translation engine and shows a nuanced understanding of language, including idioms, metaphors and the culture behind them.

The second is “reasoning” — there has been a strong focus on math and science papers in the training data, and Google says the model demonstrates “improved capabilities in logic, commonsense reasoning, and math.” Mathematics in particular is an area where LLMs as a whole struggle; it’s not their strong suit. Indeed, while PaLM 2 does beat GPT-4 on selected benchmarks, the gains here appear incremental rather than revolutionary.

The third is coding, an area where these LLMs have great potential. Google claims that PaLM 2 is highly capable in Python and JavaScript, and also strong in a range of more specialized programming languages.


Introducing PaLM 2, Google’s next-generation large language model | Research Bytes

PaLM 2 has already launched as part of the company’s embattled Bard AI. It also now powers Duet AI, a collaboration feature across Workspace apps including Gmail and Google Docs, which generates images and text for your projects, helps you brainstorm ideas, organize spreadsheets, analyze and label data, and handle other small tasks designed to get things done.

But perhaps more interestingly, Google is branching out into industry-specific AI models, starting with one aimed specifically at doctors.

Most of the calibration and human feedback for Med-PaLM 2 was performed by a team of health researchers and medical professionals. As a result, it became the first AI to achieve an “expert” level on a test designed to mimic the U.S. medical licensing exam. It can answer a wide variety of health-related questions, drawing on a broad base of medical literature.

Like GPT-4, PaLM 2 is starting to gain multimodal capabilities — the ability to understand images and other media in the same way it “understands” text. In the context of Med-PaLM 2, that means it will soon be able to look at your X-rays and other medical scans and report on them, an area where AI has excelled in early trials, sometimes outpacing medical experts.


Med-PaLM 2, our expert medical LLM | Research Bytes

Google will open up the tool to a small group of users in the coming months, with the aim of “identifying safe, useful use cases” that would allow Med-PaLM 2 to roll out to doctors’ offices. It’s both exciting and intimidating. It promises a leap forward in healthcare and could put incredible tools in the hands of medical professionals.

At the same time, it’s hard to ignore the fact that many of humanity’s best and brightest students go into medicine. ChatGPT already outperforms human doctors at answering medical questions, with superior bedside manner and empathy as judged by healthcare professionals themselves, and these machines will inevitably expand their capabilities until they outperform doctors across the board. When that happens, it will be another slice of humble pie for a species that considers itself very special.

Google also used the opportunity to announce a restructuring plan that it hopes will “significantly accelerate” the development of next-generation artificial intelligence, merging the Google Research Brain team with DeepMind to form Google DeepMind.

With that, the company revealed what sounds like an absolute beast of an AI in development: “We’re already at work on Gemini — our next model created from the ground up to be multimodal, highly efficient at tool and API integrations, and built to enable future innovations like memory and planning. Gemini is still in training, but it’s already exhibiting multimodal capabilities never before seen in prior models. Once fine-tuned and rigorously tested for safety, Gemini will be available at various sizes and capabilities, just like PaLM 2, to ensure it can be deployed across different products, applications, and devices for everyone’s benefit.”

Gemini is being trained from day one on audio, video, images and other media as well as text, along with the ability to use external tools and APIs. That means that, compared to today’s leading LLMs, it is designed to learn more the way humans do, and its capacity to interact with the outside world through more than just a text window is built in rather than bolted on. It could well prove to be as big a leap as anything we’ve seen in the past six months, a sobering thought in itself.

On the surface, today’s announcements seem to indicate that Google has made solid progress, closing much of the gap on what OpenAI achieved with GPT-4 several months ago. The stock market certainly seemed pleased, pushing Alphabet stock up more than 4%, but it will be interesting to see how PaLM 2 performs in the harsh light of the real world in the coming weeks.

You can watch the full Google I/O keynote in the video below.


Google Keynote (Google I/O ’23)

Source: Google AI