In my opinion, the term ‘non-personal data’ is rather ambiguous, especially in an age of modern technology. Here’s why:
A. Personal data is defined as information relating to an identified or identifiable person. Non-personal data would therefore be information from which no person can be identified. The ambiguity arises when we try to determine what degree of identifiability is needed to categorize information as personal or non-personal data.
B. In an article I wrote (https://intellectechlaw.in/2020/06/26/blockchain-and-the-data-protection-dilemma-compatibility-of-gdpr-with-blockchain-system/), I discussed the challenges blockchain systems face in complying with the GDPR. One of the points discussed there is the scope of personal data. Quoting from the article:
"The basic premise for the applicability of GDPR rests on the data processed being “personal data”, that is, the data must be in relation to an identified or identifiable natural person. There are some concerns whether the GDPR would be applicable to blockchain servers where data is stored as encrypted or hashed data. The primary question would be whether the data stored would be identifiable to a natural person. GDPR recognizes the concept of ‘pseudonymisation’, that is, the processing of personal data in a way that it can no longer be identifiable to a specific data subject. However, it must be noted that pseudonymised data is not precluded from the data protection measures under the GDPR but is acknowledged as a recommendation to reduce the risks with respect to privacy of the data subjects. The EPRS also notes that even encrypted data would ‘likely’ qualify as personal data under GDPR as it is difficult to assess whether the encrypted data has been sufficiently anonymised. Additionally, Recital 26 to the GDPR also notes that pseudonymised personal data which could also be attributable to a natural person by use of any additional information would also have to comply with the GDPR obligations. Additional information could include internet protocol addresses, cookie identifiers, radio frequency identification tags, etc. as these may leave traces which may allow data to be attributable to a particular data subject.
Blockchain servers often use public keys which are essentially a string of letters and numbers that represent each user’s data – somewhat similar to an account number. There are also private keys, also letters and numbers, but somewhat similar to passwords. While the data stored in blockchain servers is encrypted or hashed, based on the technical design, such data could also be decrypted by the use of private keys. There is definitely some uncertainty regarding the degree of identifiability the data must have to come under the meaning of personal data. In this regard, the EPRS suggests that the appropriate test would be whether the controller or another person are able to identify the data subject in using all the ‘means reasonably likely to be used’. "
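To make the identifiability point in the quoted passage concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the e-mail addresses, the function names pseudonymise and reidentify, and the choice of an unsalted SHA-256 hash are all hypothetical and are not taken from any particular blockchain system. It shows why hashed data may still be attributable to a person when someone holds additional information, in this case a list of candidate identifiers.

```python
import hashlib
from typing import List, Optional


def pseudonymise(identifier: str) -> str:
    """Return the SHA-256 hex digest of an identifier (no salt, no key)."""
    return hashlib.sha256(identifier.encode("utf-8")).hexdigest()


def reidentify(stored_hash: str, candidates: List[str]) -> Optional[str]:
    """Hash each candidate and return the one matching stored_hash, if any."""
    for candidate in candidates:
        if pseudonymise(candidate) == stored_hash:
            return candidate
    return None


# Hypothetical ledger entry: only the hash of the identifier is stored.
stored = pseudonymise("alice@example.com")

# "Additional information": a list of known addresses held by some party.
known_emails = ["bob@example.com", "alice@example.com", "carol@example.com"]

# Anyone holding the list can link the stored hash back to a person.
print(reidentify(stored, known_emails))
```

The stored value looks meaningless on its own, yet a party with a plausible list of candidates can recover the link by hashing each one and comparing. Whether that counts as a means ‘reasonably likely to be used’ is exactly the question of degree raised above.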
C. In the Indian context, there is a further lack of clarity. The EPRS has suggested testing whether the data subject can be identified using all means reasonably likely to be used, which more or less comes down to whether the data has been permanently anonymised. No such stance has been taken in India so far.
D. In machine learning systems, the data fed to the algorithms is, again, encrypted data. The pertinent questions then are whether these machine learning systems are capable of ‘forgetting’ or erasing that data, and whether such data could be decrypted. Here is another relevant article