Intellectual property rights versus AI training needs

Author : Zohar Kantor, QualiSense

28 August 2024

Access to copyrighted material will be essential to developing effective AI systems for manufacturing, argues Zohar Kantor, chief revenue and customer success officer at quality inspection specialist QualiSense.

OpenAI, the company behind ChatGPT, has been hit by a series of lawsuits alleging that it trained its models on copyrighted material available on the internet without permission. The outcome of these cases will be a major legal landmark in how copyright law is applied to AI training. In this article, Kantor explains how QualiSense has managed to access copyrighted production data, and why the AI systems the company has developed would not be effective without that data.

In April 2024, a group of eight US newspapers, including The New York Daily News, Chicago Tribune and Denver Post, sued OpenAI and Microsoft for allegedly using their copyrighted articles without permission to train AI models. The New York Times had already sued both OpenAI and Microsoft in December 2023, on similar grounds.

These legal challenges reflect broader issues in the field of generative AI, particularly concerning the ethical sourcing of training data and the accuracy and reliability of AI-generated content. But when training an AI model, there is no escaping the need for vast quantities of data. For an application like ChatGPT, which is designed to provide information about any topic, the volume of data required is extraordinary. Putting aside the legal rights and wrongs of these cases, you cannot train a model without this data.

In addition to the quantity of data, the relevance of that data is also crucial. If you want to build an application for a manufacturing environment, for example, you need manufacturing data. Unlike the data that is freely available on the internet, the relevant data here is closely guarded by manufacturing companies. They have a dilemma: they want to unlock the power of AI, but they won’t easily give away their data.

Model training for defect detection
If you want to build an AI model for a specific use case, for example quality inspection, you need data that is highly relevant to that specific use case. However, to achieve the end goal of a deployable model, there are different routes. If you are starting from zero every time, the process of building the model for your production line will take much longer, require more images, and necessitate greater input from the quality manager.

The alternative route, which achieves the same outcome, but in less time and with less hassle, is to develop an AI backbone, essentially pre-training a model with relevant data. In the same way that a human being would recognise a new car they had never seen before as belonging to the category of “car” based on their prior knowledge of cars, so too an AI model, trained on vast quantities of relevant manufacturing data, can recognise a “crack” or a “watermark” on a metal surface, based on its pre-training data.

This will not get you to the end goal, but it will give you a significant head start. Whereas ChatGPT can afford to make the occasional mistake, the KPIs for defect detection typically allow an error rate close to zero. It’s a different game, with a much higher standard. The only way you can achieve a deployable model that will meet this KPI is to tailor its training to the specific use case, by giving it data from your production line and feedback from the quality manager.
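The two-stage idea described above, a pre-trained backbone followed by tailoring on line-specific data, can be sketched in a few lines of Python. This is a toy illustration only, not QualiSense's method: the `backbone` function is a hypothetical, hand-written stand-in for a real pre-trained network, the "images" are synthetic rows of pixel values, and the function names, the 20-image training set and the hyperparameters are all assumptions chosen for the example.

```python
import math
import random


def backbone(image):
    """Hypothetical stand-in for a frozen, pre-trained feature extractor.

    Takes a flat list of pixel intensities and returns two features:
    the sharpest jump between neighbouring pixels and the overall range.
    """
    edge = max(abs(a - b) for a, b in zip(image, image[1:]))
    return [edge, max(image) - min(image)]


def make_image(rng, defective, size=16):
    """Synthetic 'surface scan': uniform grey plus noise, with an optional crack."""
    image = [0.5 + rng.uniform(-0.05, 0.05) for _ in range(size)]
    if defective:
        image[rng.randrange(1, size - 1)] = 0.0  # one dark pixel stands in for a crack
    return image


def score(weights, bias, features):
    """Linear score of the head; positive means 'defective'."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def fine_tune(images, labels, epochs=500, lr=0.5):
    """Train only a small logistic-regression head; the backbone stays frozen."""
    features = [backbone(img) for img in images]
    weights, bias = [0.0] * len(features[0]), 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            p = 1.0 / (1.0 + math.exp(-score(weights, bias, x)))
            error = p - y  # gradient of the log-loss with respect to the score
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias


def is_defective(weights, bias, image):
    return score(weights, bias, backbone(image)) > 0


rng = random.Random(0)
labels = [0, 1] * 10  # 20 labelled images standing in for production-line data
train_images = [make_image(rng, bool(y)) for y in labels]
weights, bias = fine_tune(train_images, labels)

# Check the tailored model against images it has not seen.
test_labels = [0, 1] * 50
test_images = [make_image(rng, bool(y)) for y in test_labels]
errors = sum(is_defective(weights, bias, img) != bool(y)
             for img, y in zip(test_images, test_labels))
print(f"held-out errors: {errors} / {len(test_labels)}")
```

Because the backbone already turns raw pixels into useful features, only the small head has to be learned from the line's own labelled images. In a real system the backbone would be a network pre-trained on large volumes of manufacturing data, and the labels would come from the quality manager's feedback.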

However, if the pre-training data is voluminous and relevant, you have a powerful backbone in place. With this backbone, you have already completed half the job of training a fully deployable AI model. The problem with building such a backbone is that, as the case of ChatGPT shows, you need a lot of data.

At QualiSense, for example, our solution has been to partner with Johnson Electric. We’ve secured an agreement with them that provides access to vast troves of manufacturing data from eighteen manufacturers across the world. This means that our model has a powerful backbone allowing it to recognise different types of defects on metal surfaces.

QualiSense is a fast-growing start-up developing AI software for manufacturing use cases. Find out more at https://qualisense.ai/.
