Covering Scientific & Technical AI | Thursday, November 28, 2024

Microsoft Azure Bringing Semantic Search Public Preview to All Users 

As Microsoft continues to develop and bolster its growing collection of semantic search tools and features, the company is expanding their reach into new product lines to deliver cutting edge capabilities to a broader range of customers.

This week Azure Cloud is the latest Microsoft platform to get semantic search capabilities, as the company unveiled the arrival of the technology for all Azure users on a preview basis. Semantic search adds depth to simple keywords by also bringing in contextual meaning and intent to help users find answers that more closely match what they are searching for.

“Today, Microsoft is excited to announce that we’re bringing semantic search capabilities to all Azure customers in preview,” wrote a team of Microsoft staffers including Rangan Majumder, Microsoft’s vice president of search and AI, in a March 2 post on the Microsoft Research Blog.

“You no longer need a team of deep learning experts to take advantage of this technology: we packaged all the best AI at Scale technology, from models to software to hardware, into a single end-to-end AI solution,” the post continued. AI at Scale is a Microsoft initiative that aims to develop next-generation AI capabilities that are scaled across its products and AI platforms.

Among the new semantic search features being released in the Azure public preview is semantic ranking, which moves beyond keyword-based ranking to a Transformer-based semantic ranking engine that understands the meaning behind the text, according to the blog post. By capturing the meaning behind search terms, it aims to return more relevant results.

Semantic ranking swaps the traditional keyword-based ranking step for a ranking algorithm that uses deep neural networks. The algorithm prioritizes search results by how relevant their meaning is to the query.

Also included are semantic captions, which use extractive summarization to pull the snippet from a document that best summarizes why it might be relevant to a query, and semantic highlights, which go beyond keyword-based highlighting in captions and snippets to help users find the answer they are looking for more quickly, the post continued. “Machine reading comprehension enables semantic highlights of relevant words or phrases on answers and captions to save people time.”

Another new semantic feature, called instant answers, uses machine learning to read through all the documents in the search results, run extractive summarization, and then apply machine reading comprehension to provide a direct answer to a user’s question at the top of the results.

One other useful feature is also included – automatic spell correction. It turns out that 10 percent to 15 percent of the queries issued to search engines are misspelled, according to Microsoft’s research. “When a query is misspelled, it’s difficult for any of the downstream search components to deliver results that match intent,” the post reports. “Semantic search enables automatic spell correction, so customers don’t have to worry about having the perfect spelling.”
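To illustrate why spell correction has to run before the rest of the pipeline, here is a minimal sketch using Python's standard difflib module. It is not Microsoft's method, which the post does not describe; the vocabulary list, the `correct_query` helper and the 0.8 cutoff are all assumptions made for illustration.

```python
import difflib

# Hypothetical vocabulary; a real system would derive this from the index.
VOCAB = ["semantic", "search", "azure", "ranking", "caption"]

def correct_query(query, vocab=VOCAB, cutoff=0.8):
    """Replace each query word with its closest vocabulary word, if any."""
    corrected = []
    for word in query.lower().split():
        # get_close_matches returns at most n candidates scoring >= cutoff.
        match = difflib.get_close_matches(word, vocab, n=1, cutoff=cutoff)
        # Words with no close match are passed through unchanged.
        corrected.append(match[0] if match else word)
    return " ".join(corrected)

fixed = correct_query("semantc serch")
```

In this sketch the misspelled query is normalized before any retrieval happens, so the downstream components only ever see terms that exist in the vocabulary.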

Before including the new semantic search features in Azure under preview, the company tested them on its own products and found “dramatic improvements in results we achieved by applying AI at Scale technology,” the post said.


The Azure semantic search features integrate what Microsoft estimates are hundreds of development years and millions of dollars in compute time amassed by the company’s Bing search team. The researchers noted they have relied on recent developments in transformer-based language models to boost the quality of Bing search results.

So far, the semantic search engine only supports U.S. English. Microsoft said it expects to add other unspecified languages soon.

Microsoft has been steadily upgrading its enterprise search capabilities, recently targeting previously “unsearchable” unstructured data in the form of PDFs, Word documents, text files and JPEGs. The result was Azure Cognitive Search, a cloud-based service with built-in AI capabilities, announced in 2018.

The cognitive search engine is based on the BM25 algorithm (as in “best match”), an industry standard for information retrieval via full-text, keyword-based searches.
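For readers unfamiliar with BM25, the following is a rough sketch of the commonly published Okapi BM25 scoring formula. It is not Azure's implementation: the `bm25_scores` function, the whitespace tokenizer, the sample documents and the parameter values k1=1.2 and b=0.75 are assumptions chosen for illustration.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each document against the query with Okapi BM25."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(d) for d in tokenized) / n_docs
    # Document frequency: how many documents contain each term.
    df = Counter(term for d in tokenized for term in set(d))
    scores = []
    for d in tokenized:
        tf = Counter(d)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            # Rare terms get a higher inverse document frequency weight.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturates and is normalized by document length.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

docs = [
    "azure cognitive search adds semantic ranking",
    "bing uses transformer models for search",
    "the weather in seattle is rainy",
]
scores = bm25_scores("semantic search", docs)
```

The first document matches both query terms and scores highest, the second matches only "search", and the third matches nothing and scores zero. Note that the score depends only on which words appear, not on what they mean.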

Semantics-based ranking is applied on top of the results returned by a BM25-based ranker, Luis Cabrera-Cordon, group program manager for Azure Cognitive Search, explained in a March 2 post on the Microsoft Tech Community website.

“The resulting semantic answers are generated using an AI model that extracts key passages from the most relevant documents, then ranks them as the sought-after answer to a query,” wrote Cabrera-Cordon. A passage deemed by the model to be the most likely to answer a question is promoted as a semantic answer, he continued.
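The two-stage arrangement Cabrera-Cordon describes can be sketched as follows. This is a toy illustration, not Microsoft's pipeline: `keyword_score` is a simple term-overlap stand-in for BM25, `semantic_score` replaces the neural model with a hypothetical hand-written table of related terms, and each document is treated as a single passage.

```python
def keyword_score(query, doc):
    """Stand-in for BM25: count query terms that appear in the document."""
    doc_terms = set(doc.lower().split())
    return sum(1 for term in query.lower().split() if term in doc_terms)

# Hypothetical toy "semantic" model: a tiny table of related terms.
RELATED = {
    "car": {"car", "automobile", "vehicle"},
    "cost": {"cost", "price"},
}

def semantic_score(query, passage):
    """Count passage terms related in meaning to each query term."""
    passage_terms = set(passage.lower().split())
    score = 0
    for term in query.lower().split():
        related = RELATED.get(term, {term})
        score += len(related & passage_terms)
    return score

def search(query, docs, top_k=2):
    # Stage 1: keyword retrieval narrows the corpus to top_k candidates.
    candidates = sorted(
        docs, key=lambda d: keyword_score(query, d), reverse=True
    )[:top_k]
    # Stage 2: only those candidates are re-ranked by the semantic scorer.
    reranked = sorted(
        candidates, key=lambda d: semantic_score(query, d), reverse=True
    )
    # The top-ranked passage is promoted as the answer.
    return reranked, reranked[0]

docs = [
    "a car can cost a lot to insure",
    "the price of a car or other vehicle depends on the model",
    "bananas are yellow",
]
ranked, answer = search("car cost", docs)
```

Here the keyword stage ranks the first document highest because it contains both query words, while the re-ranking stage promotes the second document, which is closer in meaning to a question about price. Running the expensive scorer only on the keyword stage's short list is the design choice that keeps this kind of pipeline affordable.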

The company (NASDAQ: MSFT) also integrated into its query infrastructure new semantic search capabilities developed by its Bing search team. The enhanced search feature would allow app developers, for example, to apply semantic search tools to in-house or managed content.

Microsoft touts the cloud-based service as combining search relevance with improved development tools, including APIs and tools for scanning content in web, mobile and enterprise applications.

As reported last month, market tracker Forrester identified transformer models as one of the top five technologies for advancing AI. Microsoft’s AI framework is among the first to apply the approach to semantic search.

This story first appeared on sister website Datanami.

 

About the author: George Leopold

George Leopold has written about science and technology for more than 30 years, focusing on electronics and aerospace technology. He previously served as executive editor of Electronic Engineering Times. Leopold is the author of "Calculated Risk: The Supersonic Life and Times of Gus Grissom" (Purdue University Press, 2016).
