Data collection’s new powers of observation

Technology is transforming the way health and social researchers gather data on how people live, work and make choices. The methods may be changing rapidly but the ethics remain the same.

In a world of smartphones and social media, traditional approaches to research run the risk of no longer gathering truly accurate data samples. To collect data about the whole community, researchers need to keep pace with technological change, reaching out to people on their preferred devices and channels.


From landlines to mobile

These challenges surfaced during preparation of the VicHealth Indicators Survey 2015, which was conducted using computer-assisted telephone interviews (CATI). For 2015 the decision was made to move to a dual sampling methodology, calling both landlines and mobile phones.

Many individuals and households have ditched the landline in favour of a mobile. In June 2016, 5.78 million Australians, around 31 per cent of the population, lived in a mobile-only household1. This was a sharp rise from the 3.4 million just a few years earlier in 2012.
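
As a rough check on the figures above, a short Python sketch (the variable names are illustrative and do not come from the ACMA report) works out the population base that the 31 per cent figure implies, and the size of the rise since 2012.

```python
# Figures quoted in the text (ACMA Communications report 2015-16).
mobile_only_2016 = 5.78e6   # people in mobile-only households, June 2016
share_2016 = 0.31           # quoted share of the population
mobile_only_2012 = 3.4e6    # people in mobile-only households, 2012

# Population base implied by "5.78 million is around 31 per cent".
implied_base = mobile_only_2016 / share_2016

# Relative growth in the mobile-only group between 2012 and 2016.
growth = (mobile_only_2016 - mobile_only_2012) / mobile_only_2012

print(f"Implied base: {implied_base / 1e6:.1f} million people")
print(f"Growth since 2012: {growth:.0%}")
```

The implied base comes out at roughly 18.6 million, which suggests the percentage may be calculated on a subset such as adults rather than all residents; the text does not specify.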

Among younger demographics the proportions are even higher: 40 per cent of 18–24-year-olds are mobile-only, rising to 50 per cent of people aged 25–34. Overcoming this skew to ensure a truly representative sample in the VicHealth Indicators Survey 2015 required a change in processes, according to Dr Annemarie Wright, VicHealth Manager of Knowledge and Health Equity.

Previously, potential survey respondents were sent ‘initial approach messages’ by mail, informing them about the survey and that a representative from VicHealth would be calling them to invite them to participate. Using mobile numbers as data sources for 2015 meant designing ‘approach messages’ to be sent via SMS, too.

VicHealth Indicators is also a highly specific survey, requiring around 300 respondents from each of Victoria’s 79 Local Government Areas (LGAs). This is easy to target with a landline, but random number generation for mobiles doesn’t even guarantee that the phone owner will be in the correct state, let alone the right LGA.

“There was a cost associated, in terms of resources and time,” says Wright, “but dual sampling definitely gave a more representative sample of the Victorian population than if we’d just gone with landline.”
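
To give a sense of the scale involved, the sketch below multiplies out the LGA quota from the text and then estimates how many numbers would have to be dialled to fill one LGA. The function `numbers_to_dial` and every rate passed to it are hypothetical, chosen only to illustrate why geographic screening makes mobile sampling more expensive; none of them are figures from the survey.

```python
import math

LGAS = 79              # Victorian Local Government Areas (from the text)
TARGET_PER_LGA = 300   # approximate respondents wanted per LGA (from the text)

total_target = LGAS * TARGET_PER_LGA


def numbers_to_dial(completes_needed: int, in_scope_rate: float,
                    completion_rate: float) -> int:
    """Expected numbers to dial to reach a target count of completed interviews.

    in_scope_rate: assumed share of dialled numbers whose owner lives in the
        target area (hypothetical).
    completion_rate: assumed share of in-scope contacts who complete the
        interview (hypothetical).
    """
    if not (0 < in_scope_rate <= 1 and 0 < completion_rate <= 1):
        raise ValueError("rates must be in (0, 1]")
    return math.ceil(completes_needed / (in_scope_rate * completion_rate))


# Hypothetical rates: landline numbers can be targeted by area, while randomly
# generated mobile numbers could belong to someone anywhere in the country.
landline_dials = numbers_to_dial(TARGET_PER_LGA, in_scope_rate=0.95, completion_rate=0.2)
mobile_dials = numbers_to_dial(TARGET_PER_LGA, in_scope_rate=0.01, completion_rate=0.2)

print(f"Total target: {total_target} respondents")
print(f"Landline dials for one LGA: {landline_dials}")
print(f"Mobile dials for one LGA: {mobile_dials}")
```

Under these made-up rates the mobile frame needs far more dials per completed interview, which is one way to picture the cost in resources and time that Wright describes.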

The CEO of the Social Research Centre, Darren Pennay, says that conducting a CATI interview via a mobile phone comes with its own set of considerations.

“When you call a person on their mobile there’s no guarantee that they’re at home and in the middle of dinner the way they so often are when you use a landline. They can be doing many different things: they might be out and about, they might even be driving,” Pennay says.

“This means that the first thing you have to do is make sure they’re able to speak safely. Even after that they might be distracted and this means you’ve got to keep the questionnaire quite short and to the point. If you can do that then the mobile phone is a really quite good way of collecting high quality data.”


Digital data collection

The challenges and opportunities increase when digital data collection is considered. Social media is often looked at as a brave new world when it comes to collecting digital data. It can be as simple as using social networks for conducting interviews and surveys, or it can go much deeper. 

This brave new world is not without its share of issues and pitfalls.

VicHealth and the Centre for Sport and Social Impact at La Trobe University used Facebook as the primary distribution channel for a survey around the 2016 AFL Pride Game2. The surveys were used to evaluate the effect of the Pride Game on changes in attitudes towards LGBTIQ groups in the community. Wright says that using social media in this survey allowed potential respondents to be targeted very precisely.

“Social media has been very helpful for getting a very specific data capture set. With the Pride Game evaluation, we could for example ensure we were capturing AFL fans by using Facebook page information.”

Web-based surveys have become a popular way of gathering data. Pennay even suggests that they are the “dominant mode of data collection in the social research industry at the moment”. Once again, however, the popularity of the mobile phone has shifted the goalposts for researchers.

“Nowadays 30–40 per cent of people who respond to a request to fill out an online survey will do so on a mobile device,” says Pennay. “That means researchers need to design questionnaires that display properly and collect data properly on a mobile device, whether that’s creating shorter questionnaires or surveys that scale to screen size or whatever is required.”

Using social media to gather data from surveys is just one aspect of what can be done. Dr Stefan Hajkowicz is the Senior Principal Scientist of Strategy and Foresight at the CSIRO and a co-author of the report Bright Futures: Megatrends impacting the mental wellbeing of young Victorians over the coming 20 years3.

Hajkowicz believes there is still a great deal of value in traditional data collection, but these days focuses his efforts on what he calls “transactional data”. This is the data that sits behind the scenes, tracking actions as they take place in time: appointments, payments, travel.

It’s this transactional data that forms a component of what’s become known as ‘Big Data’.


The rise and rise of Big Data

Big Data might be the buzz term du jour but it is notoriously hard to define4. It can be useful to think of it as large sets of structured and unstructured data coming from a variety of sources, often generated without the conscious input of the individual.

Advocates of Big Data see it having big benefits for healthcare outcomes, including developing more personalised and targeted healthcare5, as well as offering better ways to identify and analyse the effectiveness of healthcare interventions6. Big Data, Big Possibilities, a report published by the McKell Institute, argues the case for overcoming the legislative and technical barriers to the use of Big Data in the public health sector for the benefit of the broader community. It suggests, for example, that access to data combined with advanced risk stratification techniques could enable early identification of patients at high risk of developing chronic diseases like diabetes.

From a research perspective, Hajkowicz cautions against pushing into Big Data too early. “In terms of data reach, we’re more oversupplied than we’ve ever been before, and if we know how to use it, it tells us quite a bit. But we’re also seeing quite a few people being underwhelmed by Big Data analytics. It doesn’t guarantee useful insights in and of itself.”

However, with a clear strategy, Wright can see very identifiable and practical benefits. “We’re looking into the potential of Big Data as part of our research strategy for the next couple of years,” she says. “It has the potential to save taxpayers a great deal of money, because we can gather information to develop better programs more efficiently and potentially more cheaply.”

Pitfalls of the new data

The various new pathways of data collection require a very careful eye on the potential minefields, which include not only the obvious issues around privacy but also issues around the equitable representation of socio-economic groups.

Even something as seemingly innocuous as conducting surveys by mobile phone can skew a sample: the Social Research Centre found that, compared with respondents to landline surveys, mobile-only respondents are more likely to be male, younger, residing in capital cities and living in a group household7.

Pennay stresses that public opinion research is entering a new era, one in which traditional survey research may play a less dominant role. He says, “The proliferation of new technologies such as mobile devices and social media platforms is changing the societal landscape across which public opinion researchers operate. The ways in which people both access and share information about opinions, attitudes and behaviours have gone through perhaps a greater transformation in the last decade than in any previous point in history, and this trend appears likely to continue.

“The rapid adoption of smartphones and the ubiquity of social media are interconnected trends which may provide researchers with new data collection tools and alternative sources of information to augment or, in some cases, provide alternatives to more traditional data collection methods. However, this brave new world is not without its share of issues and pitfalls – technological, statistical, methodological and ethical.”


Infographic: 10% of Australians don't have an internet connection

Ethical frameworks under scrutiny

With ethical frameworks yet to be agreed upon or documented by researchers, the ethics of digital data collection are one of those potential pitfalls.

“Are there ethical issues?” asks Hajkowicz. “There are massive ones which are being overlooked by some companies. For a government or public body those concerns are even more important.”

Wright concurs but suggests that the basic tenets of ethics haven’t changed. “The same fundamental ethical principles are in play as with any data collection: the research must have merit and integrity, and must uphold the core values of justice, welfare and respect in relation to participant involvement. Simply put, you must ensure that you’re not disadvantaging or bringing harm to anyone.”

It’s likely to be some time before a competent framework for the ethical gathering of data from online sources is established (“years of trial and error”, according to Hajkowicz), but the end result will be worth it.

“The flipside to all this is that preventive health can be done so much better when we have access to personal data,” Hajkowicz says. 


1 ACMA Communications report 2015–16
3 VicHealth & CSIRO (2015) Bright Futures: Megatrends impacting the mental wellbeing of young Victorians over the coming 20 years. Victorian Health Promotion Foundation, Melbourne.
6 Raghupathi W and Raghupathi V, 2014, ‘Big data analytics in healthcare: promise and potential,’ Health Information Science and Systems, Vol. 2, No. 3.
7 VicHealth 2016, VicHealth Indicators Survey 2015, Victorian Health Promotion Foundation, Melbourne
8 Dhavan V. Shah, Joseph N. Cappella, W. Russell Neuman, Eszter Hargittai, ‘Is Bigger Always Better? Potential Biases of Big Data Derived from Social Network Sites,’ Annals of the American Academy of Political and Social Science, Vol. 659, No. 1, pp. 63–76.
9 ACMA Communications report 2015–16

