Ethical Considerations for Data Science: Transparency and Privacy

Sarah Wright
3 min read · Apr 15, 2021

Data privacy, and the lack of transparency about how one’s data is shared, are major ethical concerns in the data science community. As more and more personal data is collected and stored through online sources, it has become increasingly difficult for people to control what information is shared and how it will be used, and some companies aren’t as straightforward as others. In response to the recent explosion in the collection of consumer data, legislation such as the GDPR (General Data Protection Regulation) in the European Union and the CCPA (California Consumer Privacy Act) in California has been enacted to protect consumers’ data privacy. These laws require companies, among other things, to disclose what data is being collected and with whom it has been shared, and to erase data upon the consumer’s request. Even so, complying with the law is only a baseline: it’s crucial for organizations to be more proactive about their data protection policies and transparency.

Cases Surrounding Data Privacy and Transparency

Several controversies illustrate questionable ethics around data privacy and transparency. They have drawn significant criticism of how the organizations involved collect, store and share data and, in some cases, have damaged the companies financially. Let’s look at a few case studies:

  • Facebook Cambridge Analytica: Arguably the best-known abuse of data privacy was Facebook’s Cambridge Analytica scandal, in which researchers legally accessed user information for as many as 87 million people. The information was then used to target users with political ads based on their profiles, and led to Facebook being held responsible for breaching data protection laws and paying a fine of 5 billion dollars.
  • 23andMe Genomics: A more recent example of questionable data privacy and transparency involves the genetic testing company 23andMe. Genetic data is collected through saliva samples, and customers can then opt in to have their data shared with the company’s partners, including pharmaceutical labs, insurance companies, and academic or government entities. However, the vagueness about exactly how the data will be used has raised concerns about transparency and about the potential fallout of having one company possess the genetic information of millions of people.
  • Google: In 2018, Google faced scrutiny and legal challenges for tracking customers’ location data without their knowledge. Critics said the company was not transparent enough about when customers were being tracked, and that tracking may have continued after the feature was disabled. Especially in the context of the GDPR, many argued that the tracking notifications were misleading and that customers therefore could not meaningfully consent to having their location tracked. The issue exposed Google to the prospect of large fines and raised questions about consumer rights and privacy.

How Can Companies Better Protect Consumer Rights?

Organizations have not only an ethical obligation to protect consumer data but also a vested interest in doing so, to avoid hefty fines and loss of trust. There are various ways in which governments and organizations can proactively address these issues:

  • Legislation: While the US does not have an overarching data privacy law like the GDPR, many states are following California’s lead with the CCPA. New York, Maryland, Massachusetts, Hawaii and North Dakota have all proposed legislation that would give users the right to access their data and, in some cases, to erase or correct it.
  • Inform Consumers of Changes to Policy: Writing in Forbes, Stephen Ritter points to companies that actively inform their customers of updates to their consent and notification practices as an example of a proactive approach to data protection.
  • Establish Data Protection and Storage Practices: To mitigate the risk of consumer data being leaked, organizations can limit the amount of time data is stored. If data is stored long-term, a multi-faceted approach using methods such as anonymization, encryption and hashing can increase security.
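
To make the last point more concrete, here is a minimal Python sketch of two of those practices: dropping records that are older than a retention window, and replacing a direct identifier with a keyed hash before storage. Everything in it is an assumption made for illustration, including the record fields (email, country, collected_at), the one-year retention period, the placeholder key and the function names. Note too that keyed hashing is better described as pseudonymization than full anonymization, so it would normally sit alongside encryption and access controls rather than replace them.

```python
import hashlib
import hmac
from datetime import datetime, timedelta, timezone

# Hypothetical settings for illustration only. In practice the key would come
# from a secrets manager and the retention period from company policy.
SECRET_KEY = b"replace-with-a-real-secret"
RETENTION = timedelta(days=365)


def pseudonymize(value: str, key: bytes = SECRET_KEY) -> str:
    """Return a keyed SHA-256 digest (HMAC) of an identifier.

    The value is trimmed and lower-cased first so that the same email
    always maps to the same token.
    """
    normalized = value.strip().lower().encode("utf-8")
    return hmac.new(key, normalized, hashlib.sha256).hexdigest()


def apply_retention(records, now, retention=RETENTION):
    """Keep only records collected within the retention window."""
    cutoff = now - retention
    return [r for r in records if r["collected_at"] >= cutoff]


def prepare_for_storage(records, now):
    """Drop expired records, then store a token instead of the raw email."""
    return [
        {
            "user_id": pseudonymize(r["email"]),
            "country": r["country"],
            "collected_at": r["collected_at"],
        }
        for r in apply_retention(records, now)
    ]


if __name__ == "__main__":
    now = datetime(2021, 4, 15, tzinfo=timezone.utc)
    raw = [
        {"email": "Ana@example.com", "country": "US",
         "collected_at": datetime(2021, 1, 10, tzinfo=timezone.utc)},
        {"email": "bo@example.com", "country": "DE",
         "collected_at": datetime(2019, 6, 1, tzinfo=timezone.utc)},
    ]
    for row in prepare_for_storage(raw, now):
        print(row)
```

A keyed hash (HMAC) is used here instead of a plain hash because emails are easy to guess, and an unkeyed digest could be reversed by simply hashing a list of candidate addresses. With this design, anyone who obtains the stored tokens but not the key has a much harder time linking them back to individuals.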

As the collection and processing of data become an integral part of business practices, companies can maintain consumer trust by adopting ethical and intelligent data practices.

Sources:

Forbes: https://www.forbes.com/sites/forbestechcouncil/2020/03/31/the-ethical-data-dilemma-why-ethics-will-separate-data-privacy-leaders-from-followers/?sh=17a7227114c6

Council for Big Data, Ethics, and Society: https://bdes.datasociety.net/council-output/example-big-data-research-controversies/

DataFloq: https://datafloq.com/read/top-data-privacy-and-security-scandals/5701

Varonis: https://www.varonis.com/blog/us-privacy-laws/
