Social Research using Social Media 3: Ethics in Social Media Research

Whilst social media data is very easy to access and ‘harvest’ for social research, the ethical implications of doing this are complex. Nadia von Benzon’s third post in her series on social media research considers some of the issues around using pre-existing social media content as a data source. 
Issues of consent

A basic tenet of conducting contemporary research is to ensure that as researchers we have appropriate permission to collate, analyse and publish the information that we choose to work with. Whilst consent is only a legal requirement where research is physically ‘invasive’, it is normalised as a baseline consideration in most ethical frameworks and assessments surrounding research. Collecting this consent, however formally or informally, clearly requires direct communication with the creator or owner of the data.

However, the same freedom and accessibility that allows members of the public to post comments, longer form writing, and creative content online without cost or gatekeeping also means that it can be very difficult to identify or contact anyone with authority to offer consent to use social media material in research.

This may be exacerbated by the fact that authors of online content are often posting anonymously, and usually without any contact details. It is also common for social media content to be shared in a way that the material becomes ‘detached’ from the original author attribution. It can be impossible, therefore, to obtain informed consent from those whose words or other content we wish to incorporate in our research.  

The ethics of using social media as research data is further complicated by the nature of the different types of platforms available and the different characteristics or competencies of users. There are three key considerations when deciding whether or not it is reasonable to use material found online as data in research:

1. Public, private, semi-public and semi-private online spaces.

Not all social media spaces are created equal – in many ways. One of the significant differences between social media spaces – at least from an ethics-in-research perspective – is the potential size of the readership or audience that different platforms allow. I consider some online spaces to be so publicly accessible that writers or creators using them must assume that their work has the potential to ‘go viral’; that is, they must assume that work they publish to that space may be reproduced without their consent or knowledge.

This is particularly the case if the author knowingly publishes content anonymously, making it impossible for anyone intending to reproduce the work to seek consent. In these contexts it seems rational for researchers to assume that using the data is reasonable.

It might be helpful to think of social media spaces as falling into one of the following four categories:

Public social media: Spaces that do not require membership or log-in to view content; they are publicly accessible and free of charge. E.g. Guardian comments, published blogs, Twitter, YouTube, Mumsnet, Reddit. Some of these spaces may offer additional functionality (such as the ability to comment yourself) if you register as a user.
Semi-public social media: Spaces that require you to log in to the site (and perhaps have your own ‘profile’) but then give you largely free access to most content, such as business pages and public groups on Facebook, Instagram, TikTok and Pinterest. Some of these spaces offer limited functionality to non-registered users (e.g., TikTok allows you to view videos but not associated comments without logging in).
Semi-private social media: Spaces that are managed by specific gatekeepers who choose whether to approve new members for access. For example, closed Facebook groups.
Private social media: Social media groups that typically contain only friends or acquaintances or those ‘working together’ in a broad sense. This might include WhatsApp groups and private Facebook groups.


Personally, I would consider using content posted to public social media spaces without consent to be ethically preferable to using content from any of the other types of space in this way. At the other end of the spectrum, I would consider any research taking place in semi-private or private social media spaces, without contributor consent, to be covert research, requiring specific and rigorous ethical reflection.

From the table above, I’d highlight Mumsnet as an interesting example of the way in which a public social media platform acknowledges and engages with its potential as an archive for social researchers. In discussion threads, participants are often explicitly aware of the use of content by researchers and the media, with conversations on occasion turning to the utility of the material for these purposes, or the desirability of republication of discussion content elsewhere.

As a website, Mumsnet hosts events like Q and A sessions with researchers, serving to further raise awareness of the potential of discussion forums to be used as data, as well as providing a space for direct interactions between users of the platform and researchers.

2. Author competency

Author competency relates to the ability of the author of the online content to understand that the material they have posted online could be accessed by a researcher, and to understand the implications of this. 

At one end of the scale would be online publication by journalists or other public figures – perhaps using their own blogs, Twitter, Instagram or Facebook to publish images, ideas or commentary. These are people who will be very used to having their ‘publications’ disseminated widely, and can be assumed to publish content in acceptance of likely redistribution. 

We might also argue that public figures such as celebrities from all walks of life, who are held in public esteem, should be held responsible for their publications as a matter of public interest. 

At the other end of the scale we might see Facebook comments by elderly parents, in which personal communications are made public through a misunderstanding of the functionality of the forum. These sorts of contributions are clearly not expected to be disseminated or used beyond the intended readership, perhaps of a single person.

Most social media authors will fall between these two extremes; they will be ‘ordinary’ people publishing content to platforms in the knowledge that this information can be read or viewed by others. In legal terms, these ordinary people can be, and are, held accountable for their content, with convictions being made in cases where social media postings have been considered to constitute hate speech or incitement to violence. However, the morality of whether or not a researcher should use content, and the way in which they should use the content, is more of a grey area.

In my own paper on informed consent, I argue that using social media contributions in research, and even attributing the author’s name to the content, respects the agency of those posting online and acknowledges their contribution in the development of your understanding of a topic. However, not everyone who posts on social media has the capacity to understand the implications of this, and researchers need to consider the importance of trying to identify posts from children, or those who otherwise may lack the capacity to consider the long-term or broader implications of their social media contributions. 

3. Respecting emotions in social media content

Finally, I would also consider the nature of the data before deciding whether or not it was appropriate to ‘harvest’ social media contributions for analysis. Here I’m thinking particularly of a) the way in which people might respond when they are involved in a heated exchange online, b) discussion of very personal and sensitive issues, and c) the amount of time that has passed since the content was published.

In the first case I would be concerned that people posting to social media as part of a quick-fire discussion may well not be carefully considering their contributions, and may well post something that they later regret. Whilst many platforms allow you to delete contributions, this functionality is not universal, and some spaces will have rules against it. Moreover, participants might not remember that they need to go back and remove content, after the moment has passed. 

In terms of the sensitivity of topics, I do think it’s important to consider that for some people, the internet provides an important space for connection in very difficult life circumstances. This may be through the ability to share a story with complete strangers, or through the fact that social media allows people to make contact in spite of the barriers imposed by geographical distance and time difference. Social media may be seen by some people as a lifeline for connecting and sharing, and to capitalize on this for research – however interesting the discussion may be – seems morally dubious.

It’s also important to remember that, despite the quantity of content online, social media postings have a long shelf life. Inactive blogs may be forgotten rather than deleted, whilst decade-old conversations are still accessible through forums like Mumsnet. People’s views can change significantly over that sort of time frame, and engaging with ‘old’ material that might otherwise be forgotten may not feel appropriate.

Final thoughts

Having considered these issues and decided whether or not to use the data in hand in your research, you face further decisions about how to use it. Will the authors of the content be anonymized by assigning project-specific pseudonyms, and their data paraphrased so that it cannot be located through search engines? Or rather, will you name the authors, use their comments verbatim and cite the original sources, in order to respect the contributors as agentic authors who deserve attribution for their contribution to the project? From my perspective, these decisions need to reflect the research design and the nature of the data that you intend to use.

For more on the ethics of research using social media, I’d recommend:

AUTHOR BIO: Nadia von Benzon is a lecturer in Human Geography at Lancaster University. Her current research uses Facebook as a platform for sharing birth stories and she has also published from research using Mumsnet and . She is editor of the textbook Creative Methods for Human Geographers and is a member of the editorial board of the SRA blog.