Where to Recruit for Security Development Studies:
Comparing Six Software Developer Samples
Are you planning a security study with developers and unsure where to recruit from? Well, you have landed in the right place!

This is the supplementary website for our paper “Where to Recruit for Security Development Studies: Comparing Six Software Developer Samples” which will appear at USENIX'22. In this paper, using a literature review and surveys across six recruitment platforms, we offer a better understanding of the advantages and disadvantages of available developer recruitment platforms.

This website contains all the essential details about our paper. For more information, please refer to our USENIX'22 paper.

Have questions or want to know more? Contact: kaur@sec.uni-hannover.de

Publication #


First page of the publication
Where to Recruit for Security Development Studies: Comparing Six Software Developer Samples
Harjot Kaur, Sabrina Amft, Daniel Votipka, Yasemin Acar, and Sascha Fahl.
31st USENIX Security Symposium (USENIX Security 2022), August 10-12, 2022.

Here is the paper PDF with the “artifacts available” badge and the supporting “artifact appendix”.
Abstract

Studying developers is an important aspect of usable security and privacy research. In particular, studying security development challenges such as the usability of security APIs, the secure use of information sources during development, or the effectiveness of IDE security plugins has raised interest in recent years. However, recruiting skilled participants with software development experience is particularly challenging, and it is often not clear what security researchers can expect from certain participant samples, which can make research results hard to compare and interpret. Hence, in this work we study, for the first time, the opportunities and challenges of different platforms for recruiting participants with software development experience for security development studies.

First, we identify popular recruitment platforms in 59 papers. Then, we conduct a comparative online study with 706 participants based on self-reported software development experience across six recruitment platforms. Using an online questionnaire, we investigate participants' programming and security experiences, skills and knowledge.

We find that participants across all samples report rich general software development and security experience, skills, and knowledge. Based on our results, we recommend developer recruitment from Upwork for practical coding studies and Amazon MTurk along with a pre-screening survey to reduce additional noise for larger studies. Both of these, along with Freelancer, are also recommended for security studies. We conclude the paper by discussing the impact of our results on future security development studies.

Replication Package

In order to support reproducibility of our work, we provide all the necessary supplementary materials in a replication package.

Filename                 Description
surveys.pdf              Screening and final survey
question-bank.pdf        Question bank created after the literature review
recruitment-emails.pdf   Recruitment emails for students
job-posts.pdf            Job posts for public platforms (MTurk, Prolific, Upwork, Freelancer)
additional-tables.pdf    Additional data supplementing the demographics in the paper
additional-figures.pdf   Additional result figures
consent-forms.pdf        Consent forms used for the screening and the final survey

Here is a neat little zip of the complete replication package.

Overview #

To understand the opportunities and challenges of different platforms for recruiting participants with software development experience for security development studies:

  • We first analysed 59 papers studying security expert work published in the last five years and identified common recruitment platforms as well as the experiences, skills and knowledge researchers required from their participants.
  • Based on the literature review results, we conducted a comparative online study across six recruitment platforms with 706 participants selected based on self-reported software development experience.
Motivation

Human factors research is essential for improving overall computer security and privacy. In particular, developers have received increasing research attention in the community in recent years. However, recruiting expert samples is often challenging, time-consuming and expensive: developers often hold well-paid jobs and have high workloads, so offering motivating incentives for their participation can be hard. Hence, a better understanding of the advantages and disadvantages of available recruitment platforms is valuable for the community and can help researchers recruit the participants they need more efficiently.

While some of the previous studies discussed their recruitment experiences, to the best of our knowledge, we are the first to systematically compare participant samples with software development experience across the popular recruitment platforms used in previous work.

Research questions

The goal of our comparative online study was to answer the following research questions:

Research Question 1

Which general software development and specific security development experiences, skills and knowledge can researchers expect from the common recruitment platforms we identified in previous work?

Research Question 2

How do the samples compare, and what are the differences between them?

Research Question 3

What should researchers take into account when considering sampling for a security development study?

Methodology #

In the following section, we investigate recruitment strategies and survey questions in security studies with experienced software developers. We aim to gain insights into common recruitment strategies and the experiences, skills and knowledge previous studies required from their participants. Therefore, we collected and reviewed five years of relevant research published at important security, privacy and HCI venues. We did not aim for an exhaustive literature review across all potential venues since the beginning of developer research; instead, our goal was to learn recent common practices. The literature review is the foundation of the comparative studies that follow.


Figure 1: Methodology overview: The literature review informed and motivated our comparative online studies.

Literature Selection and Survey #

  • Selection:

We broadly selected publications in the field of usable security and privacy that conducted user studies with security experts. We focused on work published between 2016 and 2020 at the top (usable) security and privacy and human-computer interaction venues: the USENIX Security Symposium (SEC), the ACM Conference on Computer and Communications Security (CCS), the IEEE Symposium on Security and Privacy (S&P), the Network and Distributed System Security Symposium (NDSS), the Symposium on Usable Privacy and Security (SOUPS), the ACM Conference on Human Factors in Computing Systems (CHI), as well as the Workshop on Usable Security (USEC) and its European counterpart (EuroUSEC). Overall, we found 74 papers that at least one author identified as relevant and used them for further analysis.

Although we only survey developers, studies with other types of experts are often related and similar in nature; including them cast a wider net, which helped us design a fitting survey questionnaire.

  • Survey:

For the remaining set of papers, we collected information about participant recruitment and, where available, survey questions. Two authors independently reviewed each paper for this information in detail. Overall, we found participant recruitment information in 58 papers and 363 questions in 45 papers. We assigned all extracted survey questions and answer options to one of the following categories (an illustrative sketch follows the list):

  • General: Demographics such as age, gender, or education.
  • Experiences, Skills and Knowledge: Security and programming experiences, skills and knowledge.
  • Scales: Established scales such as the System Usability Scale (SUS) or the Secure Software Development Self-Efficacy Scale (SSD-SES).
  • Specific: Study-specific questions, e.g., self-assessment of task success or failure.
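
As an illustration of this scheme, here is a minimal sketch of how such a categorization could be applied to an extracted question bank programmatically. The category names come from our scheme above; the keyword heuristics and example questions are purely illustrative assumptions, since the actual assignment in the paper was done manually by two authors.

```python
# Hypothetical sketch: tagging extracted survey questions with the four
# categories from our literature review. The keyword heuristics below are
# illustrative only; in the paper, two authors assigned categories manually.

CATEGORIES = {
    "General": ["age", "gender", "education"],
    "Experiences, Skills and Knowledge": ["security", "programming", "experience"],
    "Scales": ["SUS", "SSD-SES"],
}

def categorize(question: str) -> str:
    """Return the first category with a matching keyword, else 'Specific'."""
    q = question.lower()
    for category, keywords in CATEGORIES.items():
        if any(keyword.lower() in q for keyword in keywords):
            return category
    return "Specific"  # study-specific, e.g., self-assessed task success

for question in [
    "What is your age?",
    "How many years of programming experience do you have?",
    "Please rate the system using the SUS items.",
    "Did you complete the assigned task successfully?",
]:
    print(f"{categorize(question):<34} | {question}")
```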

Results #

In the following section, we report and discuss the results of our literature review. Overall, we identified 25 recruitment platforms. In 12 papers, recruitment was only described at a superficial level, e.g., as “social media ads” or “online cold calling”. We assigned all 25 strategies to six categories (number of papers in parentheses):

Unsolicited Emailing (10)

Papers in this category collected participants’ contact information and sent unsolicited emails. Example platforms for collecting email addresses are GitHub (5) and Google Play (5).

Social or Regional Contacts (75)

Strategies based on some form of professional or personal network (29), snowball sampling (10), as well as recruiting through security-related events (13) or regional expert meetups (4). This category also includes the distribution of flyers (6), Craigslist (2), and the recruitment of computer science students (11).

Social Media (10)

Posting study information on social media platforms such as Twitter (5), Facebook Groups (1) and Ads (2), and the chat software Slack (2).

Online Forums/Blogs (33)

Strategies relying on discussion platforms such as Reddit (8), online forums (8), mailing lists (15), and blogs (2) dedicated to computer science topics.

Paid Workers (4)

Freelancing or crowdsourcing platforms such as Prolific (1), Upwork (1), and Freelancer (2).

Networking (8)

Professional networking platforms such as LinkedIn (7) or its German counterpart Xing (1).

Survey #

Based on our literature survey findings, we designed, pre-tested and conducted a comparative online survey study with six samples of developers. Using an online questionnaire, we collected demographic information as well as data about participants’ security and programming knowledge, skills and experience.

Pre-test and Screening
  • Pre-test:

Before we recruited participants and conducted the surveys, we pre-tested our survey twice. We used our findings to iteratively revise and adapt the survey questions and answer options to minimize bias and maximize validity.

Pre-test 1 [cognitive interviews]:

Pre-test with members of our research group who were not part of this project.

Pre-test 2 [survey]:

Pre-test using two rounds of pilot studies on Prolific with 20 users each.

  • Screening:

For recruitment purposes, we created a short screening survey (see “Replication Package” above) that inquired about software development experience, current job role and gender, allowing us to filter eligible participants wherever necessary. For our final survey, we exclusively invited participants who claimed to have experience as a software developer in our screening survey.

To confirm the self-reported development experience of Prolific and MTurk users, we used two additional programming screening questions from Danilova et al. in a new screener and repeated these samples. The sketch below illustrates how such screener data could be used to select invitees.
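
This is a minimal sketch under assumptions: a hypothetical CSV export with illustrative field names and answer keys; the actual screening questions are documented in surveys.pdf in the replication package.

```python
# Hypothetical sketch: selecting final-survey invitees from screener responses.
# Field names and answer keys are illustrative assumptions; the real screening
# questions are documented in surveys.pdf in the replication package.

import csv

# Placeholder answer keys for the two programming screening questions
# (Danilova et al.); the actual questions and answers differ.
SCREEN_CORRECT = {"screen_q1": "b", "screen_q2": "d"}

def eligible(row: dict) -> bool:
    """Invite only respondents who report software development experience
    and answered both programming screening questions correctly."""
    return row["has_dev_experience"] == "yes" and all(
        row[question] == answer for question, answer in SCREEN_CORRECT.items()
    )

with open("screener_responses.csv", newline="") as f:
    invites = [row["participant_id"] for row in csv.DictReader(f) if eligible(row)]

print(f"{len(invites)} participants eligible for the final-survey invitation")
```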

Ethics

None of the involved institutions required a formal IRB approval. However, we only used previously established questions that always included options to decline to answer. Furthermore, every participant agreed to our consent form with detailed information about the study, the responsible researchers and their contact information, risks, benefits, as well as privacy and participant rights. At the end of the survey, participants could choose not to submit their answers, which excluded them from our analysis.

Our survey did not collect any PII except for the email addresses of participants interested in the raffle, which were deleted after the raffle was done. We stored the collected data on our encrypted cloud server, which only involved authors could access. Additionally, we used random six-digit numbers to identify valid submissions for compensation purposes, but these numbers were not stored or processed in any other way.

Compensation depended on the platform, but we aimed to pay at least the US federal minimum wage. For the screening surveys on Prolific and MTurk, we paid $0.15 for one minute of work, which was increased to $0.52 for three minutes in the rerun, while we awarded $5 for the full survey. Although we do not have data on the exact survey completion times, Prolific reported rates well above the US federal minimum wage. Participants we needed to contact via email had the chance to take part in a raffle for 20 gift cards worth $50 each.
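
As a quick arithmetic check, the sketch below converts these screener rates into hourly equivalents and compares them to the US federal minimum wage of $7.25 per hour:

```python
# Quick check: converting the screener payments into hourly rates.
FEDERAL_MINIMUM_WAGE = 7.25  # USD per hour (US federal rate)

payments = {
    "screener, first run": (0.15, 1),  # (payment in USD, minutes of work)
    "screener, rerun": (0.52, 3),
    # The $5 full survey is omitted: exact completion times are unknown,
    # but Prolific reported rates well above the minimum wage.
}

for name, (usd, minutes) in payments.items():
    hourly = usd / minutes * 60  # $9.00/hour and $10.40/hour, respectively
    status = "above" if hourly >= FEDERAL_MINIMUM_WAGE else "below"
    print(f"{name}: ${hourly:.2f}/hour ({status} the federal minimum wage)")
```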

Structure #

We built the survey from questions we extracted from papers in our literature analysis and from the 2020 StackOverflow Developer Survey. Overall, the survey consisted of five sections with 46 questions, progressing from general and specific job experience questions through organizational and coding questions to demographics. We distributed the survey in English, but translated it into German for the student sample.


Figure 2: Illustration of the survey flow. In each section, participants were presented with relevant questions as shown in the figure.

Distribution #

We distributed our survey on six different recruitment platforms, informed by our literature survey results: MTurk, Prolific, Upwork, Freelancer, Google Play and computer science students. Although we are not aware of previous work that recruited participants with software development experience from Amazon MTurk, we chose to include it because of its general popularity in usable security and privacy research and to investigate to what extent it can be used in future security developer studies.

Results #

In the following section, we give a brief overview of the findings from the comparative survey. Please refer to our USENIX'22 paper for details.

Valid Participants
Total valid overall: 706
  • MTurk (valid): 101
  • Prolific (valid): 122
  • Upwork (valid): 72
  • Freelancer (valid): 100
  • Google Play (valid): 103
  • CS Students (valid): 208

Demographics #

  • Cultural background and language proficiency: Participants across all samples reported a wide variety of ethnicities.


    Figure 3: Distribution of ethnicities among our participants.

    English (US) was reported as the native language by a majority of participants on MTurk (78.2%), which is likely due to most of them being located in the U.S. (71.3%). Participants on Prolific, Upwork and Freelancer reported a variety of native languages, with none of them standing out.

  • Education and employment: A majority of participants on MTurk (76.2%), Upwork (63.9%) and Freelancer (53.0%) reported having a Bachelor’s degree, while a Master’s degree was most common on Google Play (27.2%). A majority within the student sample reported being students (60.1%) or part-time employees (22.1%), as opposed to other samples, where most were full-time employees or self-employed. An exception here was Prolific, where 26.2% of the participants reported being students, the second-largest group after full-time employees (45.9%).

  • Workplace size and working hours: Across all platforms, the majority of participants (70.1%) reported working in a small company with fewer than 500 employees. Students (μ: 22.7) reported the lowest working hours per week overall, while participants on Freelancer (μ: 39.0), followed by Google Play (μ: 37.0) and MTurk (μ: 36.0), reported the highest.

Key Findings: Demographics.

  • Software development experience is by far most common.
  • Participants are also (less) experienced with security-relevant areas such as reverse engineering or vulnerability research.
  • The majority of our participants studied computer science with no security focus and many participants work full-time in smaller companies.
  • Disabilities are very rare.
  • Caregiving (for children) was only common on MTurk and Freelancer.
  • We find a wide variety of ethnicities and native languages across platforms.
  • Bachelor/Master degrees are common.

General Programming Experience and Knowledge #

  • Development experience: Google Play participants reported both the highest overall (μ: 17.8) and professional development experience (μ: 11.0). In contrast, students were the least experienced in terms of both overall (μ: 7.0) and professional experience (μ: 2.1).


    Figure 4: Years of total, professional and computer security experience for developers across all samples.

  • Development tasks per week: The majority of Freelancer (79.0%), Google Play (61.2%) and Upwork (63.8%) participants reported working more than 20 hours per week.

  • Proficiency in programming, scripting and markup languages: MTurk participants reported higher ratings for most programming languages as well as development areas, although no single language stood out.


    Figure 5: Developers’ average self-ratings for proficiency with the top 15 programming, scripting and markup languages.

When examining different development areas, we find that frontend and full stack development were rated the highest overall.


Figure 6: Developers’ average self-ratings for proficiency in different development areas.

Key Findings: General Programming Experience and Knowledge.

  • Google Play developers reported the most and student developers the least experience in years.
  • On Freelancer, Google Play, and Upwork we recruited mostly full-time developers.
  • MTurkers reported the highest proficiencies across most development areas and programming languages.

Security Experiences, Knowledge and Skill #

  • General security experience, knowledge and skills: Google Play participants self-reported the highest security experience (μ: 4.4) and Prolific users the lowest (μ: 1.3).

  • Implementing specific security features: Our participants reported the most experience with implementing input validation, authorization and authentication features, using API keys, using encryption, and storing user credentials. They reported the least experience with cryptographic key management, digital signatures and fraud prevention features.


Figure 7: Developers’ use of security features.

  • Security training: In almost all samples except Prolific, more than one third of participants had received security-related training at work, with Google Play (53.4%) and MTurk (54.5%) participants reporting extraordinarily high numbers.

Figure 8: Security-related training our participants received.

  • Security-related activities and events: Regarding security-related activities such as CTF contests or submissions to bug bounty programs, we found that participants most commonly attended security-related events or had previously disclosed vulnerabilities. However, this did not apply to Google Play developers, where only up to 22.3% participated in any activity, while, e.g., 42.3% of the less experienced students stated they had attended security-related events.

    Figure 9: Security-related activities and events our participants took part in. The values represent the percentage per sample that stated they had taken part in the respective activity/event.

Key Findings: Security Experiences, Knowledge and Skill.

  • MTurk, Freelancer and Upwork participants reported the highest values for most security-related questions, while CS students and, in part, Prolific participants reported the lowest.
  • Secure development features were used, but not by a majority.
  • Most common features included input validation, authorization and authentication, API keys, encryption and storing passwords.
  • More than a third of all participants reported security training at work.
  • Security activities were not common in most samples.
  • We found a security focus and security-related tasks to be more common among participants working in teams than among solo workers.

Recommendations #

In this section, we provide key recommendations for platforms suitable for certain security study types.

Participant Characteristics #

MTurk and Freelancer participants reported the second-highest overall and professional development experience, respectively, and can be considered reasonable developer recruitment alternatives to Google Play, with the benefit of faster recruitment.

We found MTurk to be the most diverse sample in terms of skills, as participants reported high proficiency levels for most development areas and languages.

Security experience, skills and knowledge #

Freelancer and Upwork participants often reported high values on security-related questions (in some instances higher than MTurk). They are viable options when researchers are trying to reach a diverse sample.

Participants who work in teams are more likely to have a security focus in their job or work on security-related tasks when compared to those who work alone. This is another pointer towards using platforms like MTurk, where comparatively more participants work in teams, to recruit security experts.

Sample Diversity #

Regarding gender, we find an overwhelming majority of male participants across all platforms, which is unfortunately common in security research. MTurk and Freelancer offer the widest degree of diversity within their samples with respect to ethnicity.

Recruitment Strategies #

Crowdsourcing #

We recommend MTurk for larger studies with more participants where some noise (i.e., fraudulent responses) in the data is acceptable. Studies that require fewer participants should use platforms like Freelancer or Upwork, where the data is less likely to be noisy.

Freelancers #

We recommend Upwork to recruit experienced developers for practical coding studies or similarly larger tasks.

Email Invites #

Distributing a survey via email is the fastest and cheapest option, but it also has the worst response rates. As there is no platform handling payments, offering compensation is more complex.

Key Points: Recruitment advice.

  • While there are developers on Prolific, they are less experienced in security topics.
  • Crowdsourcing platforms should only be used with filtering via screening questions, especially MTurk.
  • Freelancing platforms require a lot of manual work and are more expensive.
  • While emails are cheaper, they have low response rates.

Summary #

In this work, we first identified common recruitment strategies for user studies with participants with software development experience. We extracted relevant survey questions from these papers, and designed and tested a questionnaire to study participants’ general programming and security experience, skills and knowledge.

Finally, we surveyed 706 participants across six samples, and provide detailed insights into their survey responses.

  • We find that Google Play, Freelancer, Upwork, and MTurk participants reported the most professional software development experience.
  • Experience performing security tasks was similar across all platforms, with MTurk participants reporting the most security experience overall, and Upwork and Freelancer often performing high as well.
  • We see especially Google Play, Upwork, and Freelancer participants reporting the most experience with specific security tasks such as authorization/authentication, input validation and using API keys.
  • CS students and Prolific users reported the least experience performing security tasks.

Overall, we found that participants across samples varied significantly, and that the characteristics of different recruitment strategies highly influenced their suitability for different study types.

Acknowledgements #

We want to thank all our pre-test and final survey participants for their participation. We appreciate your knowledge, effort, and the valuable time you have generously given.

We hope that with this work and your contribution, the research community will better understand the pros and cons of different recruitment platforms while recruiting developers best suited for their studies.

Cite this Work #

@inproceedings{conf/usenix/kaur22,
author = {Harjot Kaur and Sabrina Amft and Daniel Votipka and Yasemin Acar and Sascha Fahl},
title = {Where to Recruit for Security Development Studies: Comparing Six Software Developer Samples},
booktitle = {31st USENIX Security Symposium (USENIX Security 22)},
year = {2022},
isbn = {978-1-939133-31-1},
address = {Boston, MA},
pages = {4041--4058},
url = {https://www.usenix.org/conference/usenixsecurity22/presentation/kaur},
publisher = {USENIX Association},
month = aug,
}
Kaur et al. "Where to Recruit for Security Development Studies: Comparing Six Software Developer Samples." 31st USENIX Security Symposium (USENIX Security 22), 2022.
Kaur, H., Amft, S., Votipka, D., Acar, Y., & Fahl, S. (2022, August). Where to Recruit for Security Development Studies: Comparing Six Software Developer Samples. In 31st USENIX Security Symposium (USENIX Security 22).
%0 Conference Proceedings
%T Where to Recruit for Security Development Studies: Comparing Six Software Developer Samples
%A Kaur, Harjot
%A Amft, Sabrina
%A Votipka, Daniel
%A Acar, Yasemin
%A Fahl, Sascha
%B 31st USENIX Security Symposium (USENIX Security 22)
%D 2022
TY  - CONF
T1  - Where to Recruit for Security Development Studies: Comparing Six Software Developer Samples
A1  - Kaur, Harjot
A1  - Amft, Sabrina
A1  - Votipka, Daniel
A1  - Acar, Yasemin
A1  - Fahl, Sascha
JO  - 31st USENIX Security Symposium (USENIX Security 22)
Y1  - 2022
ER  -
