Fujitsu Introduces New AI Mechanisms for Hallucination Detection and Phishing URL Identification

September 26, 2023

TOKYO, September 26, 2023 -- Fujitsu today announced the launch of two new AI trust technologies to improve the reliability of responses from conversational AI models. The first detects hallucinations, a phenomenon in which generative AI creates incorrect or unrelated output. The second, jointly developed with Ben-Gurion University of the Negev at the Fujitsu Small Research Lab established at the university, detects phishing site URLs implanted in AI responses through poisoning attacks that inject false information.

With the new technologies, Fujitsu aims to provide corporate and individual users with a tool to evaluate the reliability of replies from conversational AI, ultimately contributing to more secure use of AI across a range of use cases, including by businesses aiming to deploy the technology in actual operations.

Professor Yuval Elovici, Ben-Gurion University, commented: “Generative AI stands as a critical domain, and within it, the hallucination detection technology Fujitsu has developed emerges as pivotal for establishing trustworthy conversational AI systems. Researchers from Ben-Gurion University (BGU) and Fujitsu have pioneered an innovative technique to enhance the security of AI-based URL filtering against adversarial threats. Our breakthrough focuses on tabular data, resulting in a more resilient defense mechanism against adversarial attacks in the realm of AI-driven URL filtering. Moving ahead, Fujitsu and Ben-Gurion University are set to collaborate on forging novel security-centric advancements within the realm of Generative AI.”

Fujitsu will include these new technologies in its conversational AI core engine provided through the “Fujitsu Kozuchi (code name) - Fujitsu AI Platform,” which offers users access to a wide range of powerful AI and ML technologies.

The technology to detect hallucinations in conversational AI will be available to users in Japan starting September 28, 2023, and the technology to detect phishing site URLs in responses of conversational AI will follow in October 2023. Both technologies will be available to corporate users as a demo environment via Kozuchi and to individual users via a dedicated portal site. Fujitsu plans to roll out both technologies to the global market in the future.

Technology for Highly Accurate Detection of Hallucination in Responses of Conversational AI

When applying conversational AI in business operations, businesses often extract information related to a question from pre-registered business data and add it as reference information when querying an external conversational AI. While this method improves the accuracy of replies and reduces hallucinations, it does not prevent them completely: in some cases the conversational AI fails to correctly extract the information related to the question and accordingly creates unrelated, incorrect replies. Although methods exist to estimate the degree to which an AI's reply might be a hallucination (the hallucination score), accurate estimation of this score remains difficult because conversational AI can express the same fact using many different phrasings.

Overview of technology to detect hallucinations in conversational AI. Credit: Fujitsu.

Based on the observation that conversational AI frequently generates incorrect information for proper nouns and numbers, and that the contents of replies tend to differ when a question is repeated, Fujitsu has developed a technology to identify and focus on the parts of sentences where hallucinations are likely to occur.

To calculate a highly accurate hallucination score, the new technology first breaks down the AI’s reply into its components (subject, predicate, object, etc.) and then automatically identifies named entities within the reply. As a next step, the technology leaves these named entities blank and repeatedly asks the external AI to fill in these specific expressions more accurately.
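The blank-and-re-ask procedure described above can be sketched roughly as follows. This is only an illustrative sketch, not Fujitsu's implementation: the names `mask_entity`, `hallucination_score`, and the `ask` callable (a stand-in for the external conversational AI) are hypothetical, the named entities are assumed to have been identified already, and scoring by the fraction of disagreeing answers is an assumption based on the observation that replies tend to differ when a question is repeated.

```python
from typing import Callable, Sequence

BLANK = "____"  # hypothetical placeholder used to blank out an entity


def mask_entity(reply: str, entity: str) -> str:
    """Replace the first occurrence of one named entity with a blank."""
    return reply.replace(entity, BLANK, 1)


def hallucination_score(
    reply: str,
    entities: Sequence[str],
    ask: Callable[[str], str],
    n_samples: int = 5,
) -> float:
    """Return the fraction of re-asked fill-ins that disagree with the reply.

    `ask` is an assumed stand-in for the external AI: it receives the reply
    with one entity blanked out and returns its proposed fill-in.
    """
    if not entities or n_samples <= 0:
        return 0.0
    disagreements = 0
    total = 0
    for entity in entities:
        masked = mask_entity(reply, entity)
        for _ in range(n_samples):
            answer = ask(masked).strip().lower()
            total += 1
            if answer != entity.lower():
                disagreements += 1
    return disagreements / total
```

Under these assumptions, a reply whose blanked entities are filled in the same way on every attempt scores near 0, while a reply whose fill-ins keep changing scores near 1. A real system would also need a more tolerant comparison than exact string matching, since the same entity can be written in several ways.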

Fujitsu benchmarked this technology using open data, including the WikiBio GPT-3 Hallucination Dataset, and found that it could improve the accuracy of detection (AUC-ROC) by approximately 22% compared to other state-of-the-art methods for detecting AI hallucinations, such as SelfCheckGPT.
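For readers unfamiliar with the metric, AUC-ROC can be read as the probability that a randomly chosen hallucinated reply receives a higher hallucination score than a randomly chosen correct one. The short sketch below computes it directly from that definition; the function name `auc_roc` is hypothetical, and the inputs in the usage note are made-up toy values, not Fujitsu's benchmark data.

```python
from typing import Sequence


def auc_roc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Compute AUC-ROC by comparing every positive/negative score pair.

    labels: 1 for a hallucinated reply, 0 for a correct reply.
    scores: hallucination scores, where higher means more suspicious.
    Tied scores count as half a win.
    """
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))
```

For example, with toy labels `[1, 1, 0, 0]` and scores `[0.9, 0.4, 0.5, 0.1]`, three of the four positive/negative pairs are ranked correctly, so the function returns 0.75. A value of 0.5 corresponds to random guessing and 1.0 to perfect separation.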

Technology for Detection of Phishing URLs in Responses of Conversational AI

Because conversational AI creates responses based on its training data, hostile entities can implant malicious information in that training data and thereby trick the AI into creating responses that include manipulated information, such as phishing URLs that lead to fake websites.

To address this issue, Fujitsu has developed a technology to detect manipulated URLs in the responses of conversational AI. Once the technology identifies a phishing URL, it issues a warning message to users.

Overview of technology to detect phishing URLs. Credit: Fujitsu.

Fujitsu’s new technology not only detects phishing URLs but also increases the AI’s resistance to existing attacks that trick AI models into making deliberate misjudgments, helping to ensure highly reliable responses from the AI. It builds on a technique jointly developed by Fujitsu and Ben-Gurion University of the Negev at the Fujitsu Small Research Lab established at Ben-Gurion University. The technique exploits the fact that hostile entities often target a single type of AI model: it detects malicious data by processing information with several different AI models and evaluating the differences in the rationale behind each model’s judgment.
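One way to picture the idea of comparing rationales across models is sketched below. It is a simplified illustration under stated assumptions, not the method developed by Fujitsu and Ben-Gurion University: each model's "rationale" is assumed to be a vector of per-feature attribution weights for the same tabular input, disagreement is measured with cosine distance, and the function names and the 0.5 threshold are hypothetical.

```python
from itertools import combinations
from math import sqrt
from typing import Sequence


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Return 1 - cosine similarity; 1.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def rationale_disagreement(attributions: Sequence[Sequence[float]]) -> float:
    """Mean pairwise cosine distance between the models' attribution vectors."""
    pairs = list(combinations(attributions, 2))
    if not pairs:
        return 0.0
    return sum(cosine_distance(a, b) for a, b in pairs) / len(pairs)


def is_suspicious(
    attributions: Sequence[Sequence[float]], threshold: float = 0.5
) -> bool:
    """Flag an input when the models disagree strongly about their rationale.

    The threshold is an arbitrary illustrative value, not a tuned setting.
    """
    return rationale_disagreement(attributions) > threshold
```

The intuition in this sketch is that an input crafted to fool one type of model tends to shift that model's reasoning while leaving the others unchanged, so an unusually large spread between rationales is treated as a warning sign.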

The technology can be used not only to detect phishing URLs but also to prevent general attacks that deceive AI models using tabular data, and can therefore also help protect other services from such attacks.

About Fujitsu

Fujitsu’s purpose is to make the world more sustainable by building trust in society through innovation. As the digital transformation partner of choice for customers in over 100 countries, our 124,000 employees work to resolve some of the greatest challenges facing humanity. Our range of services and solutions draws on five key technologies: Computing, Networks, AI, Data & Security, and Converging Technologies, which we bring together to deliver sustainability transformation. Fujitsu Limited (TSE:6702) reported consolidated revenues of 3.7 trillion yen (US$28 billion) for the fiscal year ended March 31, 2023 and remains the top digital services company in Japan by market share.

Source: Fujitsu
