The Need for an Algorithmic Accountability Bill in India

Hello everyone,

I’d like to discuss the pros and cons of proposing a framework for algorithmic accountability in India. Would anyone here support the adoption of such a framework, as has been done in Europe and is now being considered in the USA?

Part of Facebook’s and Google’s ability to perpetuate biases and undermine neutrality lies in their algorithms, which are often automated. They frequently use this automation as an excuse, claiming they lack control over outcomes for users: suppressed speech, as well as targeted advertisements and news that can easily lead to discriminatory outcomes and the exclusion of certain communities or sections of users.

What Europe has done is prohibit solely automated decisions where there could be a legal or similarly significant impact on an individual, along with a right to human-in-the-loop review and a non-binding right to explanation in cases without such an impact. There is ample evidence that, even without human intervention, Facebook’s algorithms are biased in how they show content to users, which is problematic when that content is an employment or housing opportunity, or fake news.

Therefore, would such a framework forcing accountability in the input process be feasible? In my opinion, regulating the input process would force such corporations to stop hiding behind trade secrets or a lack of human intervention, and to ensure that the output is not biased to the extent it is right now. I’d love to hear what everyone has to say, or anything I can learn about this issue.


The bad thing about bureaucracy is that it wants to regulate everything under the pretext of saving society; it plays the role of an autocrat rather than an assistant. That being said, regulation is good only when there is a massive downside if things go wrong, as with a nuclear power plant or a police force. We have to understand that tech itself has never been bad; it is the intentions of its users that make it bad.

AI and algorithms are vastly misunderstood. In simple terms, an algorithm is nothing but a sequence of steps for doing something, and AI (I use the term ML, or Machine Learning, because that’s what it is) is a mathematical model that stores the probability of two things occurring together. The most common use of ML is to predict what comes next: book recommendations, post recommendations, advertisements, etc. So there is nothing inherently wrong with these models.
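The “probability of two things together” idea can be made concrete with a toy co-occurrence model. This is a hypothetical sketch, not any company’s actual system: it simply counts how often items appear together and recommends the most frequent companion.

```python
from collections import Counter
from itertools import combinations

# Toy purchase "baskets"; real systems would learn from millions of these.
baskets = [
    {"book_a", "book_b"},
    {"book_a", "book_b"},
    {"book_a", "book_c"},
]

# Count how often each unordered pair of items occurs together.
pair_counts = Counter()
for basket in baskets:
    pair_counts.update(frozenset(p) for p in combinations(sorted(basket), 2))

def recommend(item):
    """Return the item most often seen together with `item`."""
    scores = {
        next(iter(pair - {item})): n
        for pair, n in pair_counts.items() if item in pair
    }
    return max(scores, key=scores.get)

print(recommend("book_a"))  # book_b co-occurs twice, book_c only once
```

Everything beyond counting (weighting, recency, personalisation) is refinement on top of this same basic idea.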

Since ML is just probability, that’s where the bias comes in: the model is the bias. The probabilities shift whenever someone generates positive or negative feedback for a prediction the model made. For example: if Upvote/Like, show more of the same; if Downvote/Dislike/Report Abuse (often missing), show less. Basically, confirmation bias is a good strategy for the algorithm. Companies use it to grow and gain traction and users. Simple.
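The Like/Dislike loop described above can be sketched in a few lines. This is a hypothetical toy, not any platform’s real ranking algorithm: positive feedback multiplies a topic’s score up, negative feedback multiplies it down, and the feed always surfaces whatever currently scores highest.

```python
# Every topic starts equally likely to be shown.
scores = {"politics": 1.0, "sports": 1.0, "cooking": 1.0}

def pick_post():
    # Surface the topic with the highest current score.
    return max(scores, key=scores.get)

def feedback(topic, liked):
    # Positive feedback reinforces the topic; negative feedback suppresses it.
    scores[topic] *= 1.2 if liked else 0.8

# A user who likes politics once will now be shown more politics,
# which invites more likes, which raises the score further: the loop.
feedback("politics", liked=True)
print(pick_post())
```

The self-reinforcing part is the last comment: each surfaced post creates more chances for feedback on that same topic, which is exactly the confirmation-bias dynamic described above.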

Probability is a good predictive indicator only if the experiment is repeated many times. In ML, that “experiment” is the data fed to the mathematical models to build the probability network. In the tech space there is a saying: “Garbage in, garbage out.” If you put in meaningless data, the model will still make something out of it. The same thing happens when there is very little data to train on. Now, this is where it gets interesting.
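The “many trials” point is easy to demonstrate: estimating a fair coin’s probability of heads from 5 flips is wildly unstable, while 100,000 flips lands very close to the true 0.5. A minimal illustration (the exact numbers depend on the random seed):

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

def estimate(n_flips):
    """Estimate P(heads) for a fair coin from n_flips simulated flips."""
    return sum(random.random() < 0.5 for _ in range(n_flips)) / n_flips

small = estimate(5)        # can easily come out 0.2 or 0.8
large = estimate(100_000)  # lands very close to 0.5
print(small, large)
```

A model trained on too little data is in the same position as the 5-flip estimate: it will confidently report a probability, but that probability is mostly noise.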

To create a good ML service you need lots of data, and that data cannot be cleaned manually by a company. Google created Captcha, which asks users to select boxes containing traffic lights, zebra crossings, etc., and used that data to train its self-driving cars. They had users do this a massive number of times; that’s how much effort it requires.

Now, coming to the point of regulation through manual intervention. To tell the truth, there already is manual intervention. The problem is that it cannot be done at scale, and hence such regulation is destined to fail. Regulations should not dictate how companies do something; that is for companies to decide. If regulation interjects into a company’s processes, it will slow them down or cause disinterest. We usually call that “License Raj”.

How to regulate then? In the case of self-driving cars, the regulation is that “anything” on the road has to follow traffic rules and must not cause accidents. Very clear regulations, applicable to all. Companies can choose their own ways to comply. That is how it should be done.

IMO, for FB, media regulations should apply since it allows news portals to exist on its platform; and if it allows political parties to exist, it should also be under the scanner of the Election Commission, and so on and so forth. Why? These entities are regulated in what kinds of activities they can do in physical space, and those regulations should apply in digital space as well. This doesn’t mean every regulation should carry over as-is, because:

  1. These entities have far-reaching consequences. For example, if a company announces on Twitter that its CEO is dumping shares in the market without telling SEBI, then SEBI can take action against it. There is a protocol for such entities to function under, while for an individual there is almost none; nearly everything falls under freedom of speech, and that is how it should be.
  2. The internet is subscriber media while TV and newspapers are broadcast media. The government banning adult websites is a bad idea because people visit them willingly, but banning the same content in newspapers or on TV is a good idea because it is broadcast. Perhaps even newspapers could have some slack.

Adding a few more things just for context; these can be ignored. There is inherent bias in the data fed into these models:

  • There is racial bias because most of the data is generated by users privileged enough to have internet access.
  • Face recognition fails because of a lack of racial diversity in the organizations building it.
  • Self-driving cars still have a very small chance of failing to predict what comes next and causing an accident, though far lower than humans.

I agree almost entirely with @neeraj, except that I don’t understand the conclusion neeraj draws at the end.

Regulating algorithms themselves is going to be impossible. Instead, responsibility should be placed on the companies that use algorithms. If Facebook curates posts/news using an algorithm, then Facebook is a media outlet and should be bound by the same regulations every media company is bound by.

Facebook usually dodges this by saying they are just dumb pipes that show user-generated content to other users. But that is a very clever, hideous lie. By algorithmically surfacing some content and hiding other content, Facebook is in effect becoming a publishing house.

So, Facebook should be presented with two options:

  1. Don’t use ML algorithms to change the feed; use a simple chronologically ordered feed, at which point they are just dumb pipes connecting users.
  2. Continue using ML algorithms but take responsibility for what they surface: censor fake news, censor hate speech, even if this is impossible at their scale.

Either they give up ML or take up responsibility.

This can probably be read alongside the intermediary liability approach. I agree with 85% of it.

I think both @asd and @neeraj have missed a few points.

First, on regulations: we don’t just regulate traffic, we also regulate vehicles (the product). They have to follow certain standards, usually for safety, and this applies to all (formal) industries. Similarly, digital products should also be safe, and there should be regulations to achieve that. Targeted ads, and algorithms designed to optimize the ‘clickability’ of those ads, often create a walled garden that ends up affecting society as a whole. There is also the concern of depression.

I think these can be solved through a good privacy law and by making sure companies collect only the ‘minimum’ required data.

As far as Algorithmic Accountability is concerned, I think it applies more to the government. The algorithms used in government services should be open-sourced, and the government should be responsible for ensuring they don’t negatively affect any citizen.


I agree with your perspective! Given that the essential business model of such companies is to sell our attention to third parties, the algorithms capturing our attention need to be regulated and made safe as you say. And yes, I agree that government algorithms should all be open sourced.


Regulating algorithms is the tricky part, and I don’t think it can be done. Algorithms are research, and research should be allowed to happen.

What needs regulation is the application of the algorithm and the way data is collected. Collected data should be non-identifiable, and people should have rights over their digital selves.

If research on an algorithm does need identifiable data, proper consent should be taken, as is done in pharma research… or something like that.

The government can provide incentives for research into safe and unbiased algorithms.


First, on regulations: we don’t just regulate traffic, we also regulate vehicles (the product). They have to follow certain standards, usually for safety.

I think my point did not come across effectively. My point was that regulations should not dictate how a company should do things; they should regulate only the outcome and let companies execute however they want. Take the safety example: suppose a vehicle needs to score at least 7 out of 10 to be allowed on the road. It is then in the company’s hands whether to use iron, aluminium or carbon fibre; that is not for the regulation to decide. Of course, there are exceptions, like you can’t use lead or radioactive material. In the case of social media posts, things are trickier and very subjective, and the debate often oscillates between Right/Left POVs; there is a very thin line before a post becomes hate speech. Hence, IMO, we should look at it on a case-by-case basis rather than building a regulation that could quickly evolve into censorship, which is far worse.

Targeted ads, and algorithms designed to optimize the ‘clickability’ of those ads, often create a walled garden that ends up affecting society as a whole

I have a different POV on this. Ads have not led to the creation of walled gardens; it was the unavailability of other (and better) options, and to some extent the breakdown of Net Neutrality, if not in principle then at least on the ground. And then there is the network effect. I think it is the consumer’s responsibility to decide what is good for them, given they know the facts.

I agree with your point about a good privacy law and government accountability, but IMO it should only be to educate, and again, let people choose for themselves.


Algorithmic equality is a complex area, to my understanding. We can approach it through the prisms of gender, exclusion and anti-discrimination doctrines, but also data protection. I think there will generally be several elements bringing legal protections from diverse areas, given the widespread impact of automated decision-making.

One such attempt is contained in the Indian Privacy Code, where Section 28 gives a person the option of an exemption from, or alternatives to, automated decision-making that has legal impacts. Such a system will require human oversight and supervision.


What concerns me about an individualistic framework is that I don’t believe it is a meaningful way to exercise control when data is aggregated at a scale that goes far, far beyond any one individual, yet how that data is employed has drastic consequences for everyone. Examples of this include disparities in employment advertisements based on gender, race and religion.

When an individual’s only option is to opt out of the system, it has no real impact on a corporation processing their data at a much larger level to influence, and engage in behavioural modification of, people en masse.

What could be explored in India is the concept of data trusts and consent exercised at a collective level. This has been touched upon in the non-personal data governance framework, but must be detailed much further!


Thanks Sriya. It’s extremely important that enforcement is manageable for people, who face a huge burden and information asymmetries that often make consent, accountability and liability processes a chimera. Here collective processes and group-based claims are useful, though they present their own challenges.

The primary understandings of the non-personal data framework, as we looked at it closely in discussions on this forum, were underdeveloped (my own reading and IFF’s comments relied heavily on those discussions; for instance, we discovered that the consultation paper did not even cite the Puttaswamy judgement), and I hope better articulations of it can follow. At the same time, group-based claims, such as those under the RTI Act, can be activated by many individuals (I also think this helps preserve autonomy and democratise claims, though risks exist here too), which may spur greater compliance and enforceability. Another measure may be standard-setting by the Data Protection Authority, which I hope is created soon. Unfortunately, I do think the absence of such a regulatory body has left us largely unprotected and searching for solutions within conventional frameworks.

My larger worry with models of data fiduciaries and trusts is how accountable these body corporates themselves will be in their governance towards the masses they represent. I know I am speaking from bias and am open to correction here, but they do seem fairly inspired by capital-market and financial regulation, which have often not been able to deliver on their promises. It also risks a property-rights-based approach, which is a useful but limited lens for looking at the wider issue of data protection and informational privacy as a more developed and wholesome human right.


I very much agree with all the points you have mentioned. The current iteration of the NPD Framework is worrying for all the reasons you’ve stated, and the conceptualisation of collective consent is itself flawed. I’m working on a short paper regarding this issue which I hope to share with you all in the forum soon.

There are some successful models of data trusts and collective consent being exercised such as the UK Biobank and Open Data Institute, which approach privacy as a wholesome human right.

We must definitely discuss this more and come up with ideas for models that can successfully implement consent, to more meaningfully give users an idea of what their data is used for. Currently, in my view, it is MOST important for all of us to realize that the “informed consent” model is quite simply broken, due to the sheer power imbalance between an individual and the use of their online behaviour as raw material for the machinery of surveillance capitalism.

Only when we move beyond the purported adequacy of informed consent can we have a meaningful dialogue about algorithmic accountability or collective consent, in my view!