We must admit that AI is on its way to consuming not just the technology world but all of humankind, embracing us in ways both thinkable and unthinkable, from performing trivial tasks to handling the most complex functions of our day-to-day life. Be it predicting protein folding, self-driving cars, or auto-generating content, it is AI all the way, slowly and subtly becoming omnipresent. Probably for the first time in the history of inventions and innovations, we are witnessing such close collaboration among academia, businesses, and tech companies.
All are coming together and joining hands to innovate and develop AI-based, AI-driven, and AI-enabled applications, and it is all happening at blistering speed. We have already passed two decades of this millennium, and we have come a reasonably long way in giving life to algorithms that once lived only in research papers.
Responsible AI is an extremely broad and generic term, which anyone can interpret according to their own whims and fancies. To make it more meaningful in the context of the evolving best practices for AI systems, let us break it down into tenets.
AI is unfair, unexplainable, uninterpretable, untrustworthy, not reproducible, and so on. These are the complaints we hear most often, sometimes to the point where AI systems are shut down, such as the Houston teachers’ performance monitoring system, Amazon’s ads, Facebook’s hiring ads, and the US healthcare system. Some of these made headlines, while others were shut down quietly.
Some have been fixed for bias by third-party AI practitioners. Governments across the globe are in the process of issuing AI policies and regulations. Among them, Europe has issued the most comprehensive white paper on AI, setting out its approach to creating a responsible AI eco-system.
Fairness & Explainability:
The European Union white paper on AI requires, under fairness, that models treat individuals fairly (inclusiveness), without discrimination on grounds such as gender, ethnicity, or economic strata, with measures such as the Fairness Index, Disparity Index, and Equal Opportunity Index all within an acceptable range. At the same time, the individual must give consent to AI-driven profiling. If consent is given, the AI should be explainable to the individual with a high degree of clarity, and the individual should be able to contest the decision (for example, a denial of social benefits). A person has the right to complete transparency: the whole decision-making process should be reproducible and interpretable by the individual. It is fairness on both counts, in the model and in the individual’s rights.
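To make the idea of fairness measures "within an acceptable range" concrete, here is a minimal sketch in Python. It computes two commonly used group-fairness measures, the disparate impact ratio and the equal opportunity difference, on toy data. The function names, the toy data, and the 0.8 to 1.25 range (a common rule of thumb, not something the white paper prescribes) are all assumptions made for illustration.

```python
from typing import Sequence


def selection_rate(preds: Sequence[int]) -> float:
    """Share of favourable (1) decisions within a group."""
    return sum(preds) / len(preds)


def true_positive_rate(labels: Sequence[int], preds: Sequence[int]) -> float:
    """Share of truly qualified individuals (label 1) who got a favourable decision."""
    decisions_for_qualified = [p for y, p in zip(labels, preds) if y == 1]
    return sum(decisions_for_qualified) / len(decisions_for_qualified)


def disparate_impact(preds_unpriv: Sequence[int], preds_priv: Sequence[int]) -> float:
    """Ratio of selection rates; 1.0 means both groups are selected equally often."""
    return selection_rate(preds_unpriv) / selection_rate(preds_priv)


def equal_opportunity_difference(labels_unpriv, preds_unpriv, labels_priv, preds_priv) -> float:
    """Difference in true positive rates; 0.0 means qualified people fare equally."""
    return (true_positive_rate(labels_unpriv, preds_unpriv)
            - true_positive_rate(labels_priv, preds_priv))


def within_acceptable_range(ratio: float, low: float = 0.8, high: float = 1.25) -> bool:
    """Assumed rule-of-thumb range; a real project would set its own thresholds."""
    return low <= ratio <= high


# Toy data (hypothetical): group A is treated as privileged, group B as unprivileged.
labels_a = [1, 1, 1, 0, 0, 1, 0, 1]
preds_a = [1, 1, 0, 0, 0, 1, 0, 1]
labels_b = [1, 1, 0, 0, 1, 0, 1, 1]
preds_b = [1, 0, 0, 0, 0, 0, 1, 0]

di = disparate_impact(preds_b, preds_a)
eod = equal_opportunity_difference(labels_b, preds_b, labels_a, preds_a)

print(f"disparate impact: {di:.2f}")
print(f"equal opportunity difference: {eod:.2f}")
print(f"within assumed range: {within_acceptable_range(di)}")
```

In this toy example group B is selected half as often as group A, so the model would fall outside the assumed range and call for a closer look before deployment.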
While inclusivity is part of fairness, another dimension is “fair” reasoning for accepting or rejecting an application (any kind of application, say for college admission). Models should be able to explain the acceptance or rejection of every application. If the reasoning for a rejection is “fair”, the student contesting the outcome is convinced and the case is settled. Such workflows then become transparent, explainable, interpretable, and reproducible. AI systems of that kind gain the public’s trust.
When Google did the first classification of cats and dogs, everyone was thrilled with the outcome of the classifier, but a simple question asked back then could have made the whole explainability problem a lot easier to handle: how did the classifier arrive at ‘dog’ or ‘cat’, and what features did it use to tell them apart?
We witness bias today in most implementations. Gender and ethnicity have constantly been the core issues, be it in allocating healthcare benefits, hiring for jobs, predicting crime, or processing credit cards and loans.
Bias is purely man-made and has nothing to do with AI as such. AI does not yet have enough intelligence to hold human-like biases. Rather, it picks them up from the biased data used for training, which people supply, perhaps unknowingly.
If we humans can introduce biases through the datasets, knowingly or unknowingly, we should be able to de-bias the models as well. If accuracy and fairness are in contention, trade off accuracy for fairness and take responsibility for the outcomes delivered.
In the report ‘The Global AI agenda’, published by MIT in January 2021, Gary Marcus, founder and CEO of Robust.AI and Professor Emeritus at New York University, argues, “Every company probably ought to have an ethics board, and not just for its use of AI. But we cannot leave everything to self-regulation since the interests of companies are rarely fully aligned with the interests of society as a whole.” Even for companies with long experience of AI, he argues, human oversight should remain: “AI just isn’t smart enough yet to be fully trusted.”
This raises a serious question about auditing by qualified AI practitioners, much like financial auditing. As of today, there is no governing body outside the tech companies, or any company for that matter, that has access to their training datasets or can challenge them on how those datasets are used in developing products, or on the fairness of the models being developed and deployed.
In the absence of such third-party auditing, how do we even trust the data privacy policies posted by these companies? Trust is certainly questionable.
Another important part of the eco-system is effectively communicating a new AI model to both technical and non-technical teams across the organization, to get everyone on the same page about the model’s intended use, the number of datasets used for training, and the model’s performance, outcomes, benchmark metrics, limitations, trade-offs, fairness index, and so on. This would be the beginning of the model documentation, and it should include a feedback mechanism to receive and incorporate feedback from as diverse a set of internal, non-technical users as possible.
This would put the model to the test, checking it rigorously for fairness, transparency, reproducibility, and in some cases AI stupidity, before it is deployed in the public domain. The internal non-technical users would indirectly be playing the role of ethicists, helping the organization to have enough checks and balances to deliver an AI that is responsible in every possible aspect. These ethicists could very well be part of the eco-system. Of course, they need to be trained on the organization’s AI guidelines, government policies and regulations on AI, and any industry-specific guidelines too.
The following may be a few steps of a responsible AI eco-system that could deliver unbiased, explainable, accurate AI systems:
- Carefully identify and curate the training dataset
- Design model workflow
- Use IBM’s AI Fairness 360 for fairness testing
- Use Google’s Explainable AI or IBM’s Explainable AI for explainability
- Use Google’s Model Cards for documentation
- Deploy a diverse team of AI practitioners to develop the models
- Continuously monitor the models for any drift (concept, feature, or label drift)
- Allow third-party AI auditors to investigate the diversity of the datasets and the fairness quotient, transparency, interpretability, and reproducibility of the models and their workflows
- Allow trained ethicists to validate the model before and after deployment
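As one illustration of the monitoring step in the list above, the sketch below checks a single numeric feature for drift using the Population Stability Index (PSI), which compares the distribution seen at training time with the distribution seen in production. PSI is a swapped-in technique chosen for illustration, not one named above. The function name, the bin count, the toy data, and the 0.25 alert threshold (a common rule of thumb) are assumptions.

```python
import math
from typing import List, Sequence


def population_stability_index(expected: Sequence[float],
                               actual: Sequence[float],
                               bins: int = 10) -> float:
    """PSI between a reference sample (training) and a live sample (production)."""
    lo, hi = min(expected), max(expected)
    # Fall back to a width of 1.0 if the reference sample has a single value.
    width = (hi - lo) / bins or 1.0

    def proportions(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), bins - 1)  # clamp values outside the reference range
            counts[idx] += 1
        eps = 1e-6  # floor for empty bins so the logarithm stays defined
        return [max(c / len(values), eps) for c in counts]

    e = proportions(expected)
    a = proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


# Toy data (hypothetical): the live feature has shifted towards higher values.
training_feature = [i / 100 for i in range(100)]
live_feature = [0.5 + i / 200 for i in range(100)]

ALERT_THRESHOLD = 0.25  # assumed rule of thumb, tune per use case

score = population_stability_index(training_feature, live_feature)
if score > ALERT_THRESHOLD:
    print(f"PSI {score:.2f}: feature drift suspected, review the model")
else:
    print(f"PSI {score:.2f}: no significant drift")
```

A check like this covers feature drift only. Concept and label drift need ground-truth outcomes from production, so they are usually tracked separately.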
“Deborah Raji, a fellow at non-profit Mozilla, and Genevieve Fried, who advises members of the US Congress on algorithmic accountability, examined over 130 facial-recognition data sets compiled over 43 years. They found that researchers, driven by the exploding data requirements of deep learning, gradually abandoned asking for people’s consent. This has led more and more of people’s personal photos to be incorporated into systems of surveillance without their knowledge.”
This is a real threat if companies start selling the photos to third parties who use them against the public in whatever form. We upload our photos every moment, capturing our lives like never before, to Facebook, Instagram, WhatsApp, Google, you name it. How they are used, by whom, and for what purpose behind the doors of the data cloud is anyone’s guess. We are giving away our data assets free of charge to these companies, which in turn make billions of dollars in revenue. Security and privacy are the two most crucial arms of a trustworthy AI eco-system. Neither can be compromised for the other.
“If you strip all the identifying information, it doesn’t protect you as much as you’d think. Someone else can come back and put it all back together if they have the right kind of information,” Anil Aswani, PhD, lead author of the study and an engineer at UC Berkeley, said in a news release.
“In principle, you could imagine Facebook gathering step data from the app on your smartphone, then buying healthcare data from another company and matching the two,” he added. “Now they would have healthcare data that’s matched to names, and they could either start selling advertising based on that or they could sell the data to others.”
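The matching described in the quote above can be sketched in a few lines of Python. The example links "anonymized" health records, with names stripped, to named app records using shared attributes. All records, names, and field names are invented for illustration, and real linkage attacks are more sophisticated than this exact-match join.

```python
from collections import defaultdict
from typing import Dict, List, Sequence

# Hypothetical "anonymized" health data: names removed, other attributes kept.
health_records = [
    {"age": 34, "zip": "94704", "daily_steps": 10250, "diagnosis": "asthma"},
    {"age": 61, "zip": "94110", "daily_steps": 3100, "diagnosis": "diabetes"},
    {"age": 34, "zip": "94704", "daily_steps": 7800, "diagnosis": "hypertension"},
]

# Hypothetical app data: the same attributes, but with names attached.
app_records = [
    {"name": "Alice", "age": 34, "zip": "94704", "daily_steps": 10250},
    {"name": "Bob", "age": 61, "zip": "94110", "daily_steps": 3100},
    {"name": "Carol", "age": 34, "zip": "94704", "daily_steps": 7800},
    {"name": "Dave", "age": 45, "zip": "10001", "daily_steps": 5000},
]


def reidentify(health: List[dict], app: List[dict], keys: Sequence[str]) -> Dict[str, str]:
    """Attach a name to each health record whose key values match exactly one person."""
    names_by_key = defaultdict(list)
    for record in app:
        names_by_key[tuple(record[k] for k in keys)].append(record["name"])

    matches = {}
    for record in health:
        candidates = names_by_key.get(tuple(record[k] for k in keys), [])
        if len(candidates) == 1:  # a unique match means the person is re-identified
            matches[candidates[0]] = record["diagnosis"]
    return matches


# Age and zip code alone leave two people indistinguishable...
coarse = reidentify(health_records, app_records, keys=("age", "zip"))
# ...but adding step data singles out every individual in this toy set.
fine = reidentify(health_records, app_records, keys=("age", "zip", "daily_steps"))

print(coarse)
print(fine)
```

The more attributes two datasets share, the fewer people each combination fits, which is why stripping names alone offers limited protection.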
De-monetization of datasets:
The only way to fix this problem of privacy and security is to decentralize the cloud of datasets. Each of us should have our own cloud to hold our personal data (storage costs are already peanuts and will be insignificant in the coming years), purge all our data residing in the clouds of these tech companies, and grant access to whomever we wish to run algorithms on our data assets, either for a fee or for free depending on how the datasets are used.
Until then, while new tools are emerging in the industry to create a reasonably responsible AI eco-system, there is no vaccine yet for the data privacy disease that is shaking up the world. The data privacy pandemic is going to exist forever, and we need to learn to live with it!