Wikipedia talk:Subject Recruitment Approvals Group/Requests


Applications that are not made from a registered account will be denied

Why? What's wrong with an IP applying? You're planning on asking their real name, contact details, etc., so why demand an account? Josh Parris 06:19, 15 December 2009 (UTC)

A registered account allows us to monitor their activity and it provides a place for Wikipedians to contact them using the communication mechanisms of the system. Also, it could be difficult for researchers to participate in discussions about their application without an account since their IP address could be dynamic or they could need to edit from a different machine at some point. --EpochFail(talk|work) 20:44, 17 December 2009 (UTC)
Added to page. Josh Parris 05:11, 22 December 2009 (UTC)

Fine tuning

What's the process for a researcher changing their proposal in response to feedback - can it happen during deliberations? Are there instructions to help the researcher do that? Josh Parris 09:12, 15 December 2009 (UTC)

Why do you feel that there needs to be an explicit process for changing the application/request in response to feedback? --EpochFail(talk|work) 20:45, 17 December 2009 (UTC)
Because feedback left early in the process might be different to that left later in the process. Look at a contentious AfD nomination; the article becomes very dynamic as complaints are issued and addressed. A researcher might assume that the application is submitted, it's commented upon, and once rejected it's altered and resubmitted. If altering a proposal in response to feedback is explicitly permitted or denied, everyone knows where they stand. Josh Parris 05:11, 22 December 2009 (UTC)
I think I understand what you mean now. I see two possibilities being weighed here:
  • Applications can be changed at any time due to feedback.
  • Applications must remain constant through the application process.
I'm a fan of simply allowing people to change their applications at any time. It seems to be the norm for Wikipedia that we talk about things while we work on them. If we didn't allow changes during the approval process, I can imagine that many people would withdraw their applications simply to resubmit them with a quick change. I'll try to make this explicit in the article. --EpochFail(talk|work) 19:46, 4 January 2010 (UTC)

Recruitment end date

(Optional) In studies where recruitment has an end date and that end date has passed, User:StudyRecruitmentBot will update the posted message to indicate this.

Why not just disclose this date in the recruitment message? Josh Parris 05:47, 15 December 2009 (UTC)
The end of recruitment could possibly change. In the case where recruitment was only planned to be open for a week, the researchers could decide to leave it open longer in order to gather more subjects. The end of recruitment could also happen due to non-time-specific events, such as when enough subjects have been successfully recruited. --EpochFail(talk|work) 20:54, 15 December 2009 (UTC)
Suggestion: include that information in a template, and change the template when something happens. Initially, it could read:
I'm testing the effects of electric shocks on Wikipedia editors. Please join my study by following this link, open for a limited time.
and after changing the template, would read:
I'm testing the effects of electric shocks on Wikipedia editors. This study is now closed (the details of the study are here).
That saves the bot the complexity of figuring out where its messages have disappeared to when they're archived, or have been modified by other editors, or what-not. Except, of course, the text of the template will need to be tightly controlled; or at least, kept out of researchers' prying hands. Editprotected, perhaps. Josh Parris 02:21, 27 December 2009 (UTC)
I was imagining using transclusion as well to post the "study has been completed" message. I'm not sure the message would need to be write-protected, though. So long as the right people are watching the page, I can't imagine anyone taking advantage of the study complete/incomplete inclusion message. I think that the rest of the message should not be transcluded. --EpochFail(talk|work) 19:30, 4 January 2010 (UTC)
I'm going to assume this means I can remove the bit about the bot going back and messaging again/altering the message. Josh Parris 11:22, 5 January 2010 (UTC)
Sure. If we transclude, any member of SRAG or the researcher could make the modification without much trouble. --EpochFail(talk|work) 22:48, 5 January 2010 (UTC)
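
(For illustration, a minimal sketch of how flipping the transcluded status might work, assuming a pywikibot-based bot; the page titles here are hypothetical examples, not anything agreed above.)

    # Sketch: close recruitment by editing a single transcluded status page.
    # All page titles are hypothetical; assumes a configured pywikibot setup.
    import pywikibot

    site = pywikibot.Site('en', 'wikipedia')

    # Each posted recruitment message would transclude a per-study status
    # page rather than hard-coding the open/closed text, e.g.:
    #   I'm testing the effects of electric shocks on Wikipedia editors.
    #   {{Wikipedia:SRAG/Studies/Electric shocks/Status}}
    status = pywikibot.Page(site, 'Wikipedia:SRAG/Studies/Electric shocks/Status')

    # Editing this one page updates every posted message at once, so nobody
    # has to hunt down messages that were archived or edited in the meantime.
    status.text = ('This study is now closed '
                   '([[Wikipedia:SRAG/Studies/Electric shocks|details]]).')
    status.save(summary='Study recruitment closed')

(Because it's a single ordinary wiki edit, any SRAG member - or the researcher, if the status page isn't protected - could make the same change by hand.)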

Researchers create a random sample of editors

I pulled the {{bot}} filter out of this; the recruitment bot does that filtering, and does it later (later=better). Josh Parris 06:27, 15 December 2009 (UTC)

We were thinking that the filtering ought to occur in the sampling process with the researcher. There is the possibility that a researcher would like to sample a small group of highly sought-after participants, and a random sample generated from them may need to have a significant portion thrown out because they are ineligible due to having been recruited recently for other studies. So, a feedback loop - where the researcher submits a list, the bot checks to see how many are unavailable, and the researcher adjusts the sample before messages go out - seemed better than coming back after messages have been sent and saying "1/3 of the subjects were ineligible". -- PiperNigrum (hail|scan) 15:53, 15 December 2009 (UTC)
I presume a highly sought-after group would be admins, of whom there are only about a thousand active. I can see why such a desirable group could quickly tire of recruitment messages. I expect records would need to be kept to extend the time between messages.
I assume that you're proposing a double filter: the bot refusing to message {{nobots}} users, and the researcher refusing to put them in the sample group. I believe that this duplication of effort helps no-one; in my opinion it just increases the complexity of generating the list of candidates. It also means that someone who would have been available to recruit - because the template was removed after the candidate list was generated - won't be contacted, because they were never on the list in the first place.
The size of the sample to be provided by the researchers will be specified by SRAG, based on the researcher's desired number of participants and SRAG's estimated conversion rate. Let the researchers concentrate on what they know - the criteria - and SRAG on their area of expertise, recruitment. Josh Parris 02:21, 27 December 2009 (UTC)
The double filter for ensuring that editors are not over-bothered is a common mechanism used in these sorts of systems. Not only do we have policy against unnecessarily bothering people, but the system will also stop us if we accidentally ask it to message someone it shouldn't. We also need to be able to generate the sample around the editors who are not to be bothered, in order to ensure that the sample can even be made. For instance, if there were 1000 editors in a pool to be sampled from and 400 of them should not be bothered for some reason, the random sample should be gathered from the remaining 600. Otherwise, about 40% of the sample will fail when we actually try to send the message.
These sorts of problems can be worked out in process. If we had the researchers generate the pool (rather than the sample), we could subtract the list of editors they wouldn't be able to message. From the remaining editors, the researchers could build the sample. Finally, when the sample is to be messaged, our final failsafe will ensure we don't accidentally message anyone who added the {{bot}} template between the sample generation and the messaging. --EpochFail(talk|work) 22:46, 5 January 2010 (UTC)
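
(A sketch of the pool, sample, and failsafe flow described above, in Python; the function names and the shape of the opt-out data are illustrative assumptions, not part of any agreed design.)

    # Pool -> subtract opt-outs -> sample -> failsafe re-check at send time.
    import random

    def build_sample(pool, do_not_message, sample_size, seed=None):
        """Draw a random sample from the pool after subtracting editors who
        must not be messaged ({{bot}}/{{nobots}} opt-outs, editors recruited
        too recently, etc.), so the sample doesn't shrink after the fact."""
        eligible = [editor for editor in pool if editor not in do_not_message]
        rng = random.Random(seed)
        return rng.sample(eligible, min(sample_size, len(eligible)))

    def send_invitations(sample, do_not_message_now, post_message):
        """Final failsafe at messaging time: re-check the opt-out list in
        case anyone added {{bot}} between sampling and messaging."""
        for editor in sample:
            if editor in do_not_message_now:
                continue  # never message an opted-out editor
            post_message(editor)

    # EpochFail's example: a pool of 1000 with 400 untouchable editors
    # leaves 600 eligible; the sample is drawn from those 600.
    pool = [f'Editor{i}' for i in range(1000)]
    opted_out = set(pool[:400])
    sample = build_sample(pool, opted_out, sample_size=100, seed=42)
    send_invitations(sample, opted_out, post_message=print)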
These are clear arguments.

Just defining terms for those (like me) who haven't looked at stats for about 20 years:

  • Pool = the Users that meet the selection criteria
  • Sample = the number of participants the researcher has requested

So there's an intermediate value, the invitation group, that has to account for sign-up rates. It's the size of this that SRAG determines, right? Or am I using the wrong terminology? Josh Parris 10:47, 6 January 2010 (UTC)

Pool is correct. Sample = the invitation group. Then, the number of participants would be a subset of the sample.
At no time does SRAG (or the researcher) specify the number of people they would like to have in their study. SRAG can only say "We will ask these 500 people", not "We will get you 134 subjects," or even "We will ask these 500 people to get you about 120 subjects." There are too many variables involved in knowing how many of a sample will participate to make any predictions/guarantees.
-- PiperNigrum (hail|scan) 15:15, 6 January 2010 (UTC)
Won't the statistics for a study be nearly worthless if there are too few participants? I want to save everyone's time by eliminating studies that aren't going to be statistically significant.
I'd like to assert that (eventually) SRAG members will be more experienced than the researchers with take-up rates (given all the contributing factors in a study), and that they ought to be providing guidance to researchers about how many people will need to be invited to get the desired participation numbers. The community then decides whether bothering that many people within the pool is an appropriate price for the study to be successful.
But you've said that researchers shouldn't disclose a desired number of participants. You've hinted that the problem is making guarantees about conversion rates, but I think conversion rates would be very important information for SRAG to collect to inform future sample sizes. Otherwise, each time a study is proposed, the researcher has to make a wild guess as to how many invitations to ask for, and SRAG learns nothing from the recruitment for each study.
If a study sounds boring and unimportant, the invitation group would have to be quite large, and therefore the study is likely to fail to achieve consensus to proceed (on the assumption that the community doesn't like the idea of inviting 36,000 active editors to "rank twelve two-page essays on medieval Chinese pottery decoration techniques" in an attempt to quantify what kind of prose currently active editors prefer - a study that requires 120 participants for strong statistical results). The researcher (who is, in their spare time, a connoisseur of Ming dynasty ceramics) may think that this sounds like a delightful way to spend an hour and a half and only ask for 350 invitations; they may then be quite disappointed when 350 is approved and they get two participants. We've bothered 350 people - two of whom have wasted an hour and a half, wasted because now there's no study, not with two participants - and there's a suicidal researcher. Josh Parris 01:44, 9 January 2010 (UTC)
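
(For illustration, the arithmetic behind sizing an invitation group; the conversion rates below are invented numbers, not measured data from any study.)

    # Sketch: invitation-group size from an estimated sign-up rate.
    import math

    def invitations_needed(desired_participants, estimated_conversion_rate):
        """How many editors must be invited to expect the desired number
        of participants, given an estimated sign-up (conversion) rate."""
        return math.ceil(desired_participants / estimated_conversion_rate)

    # The pottery study above: 120 participants at an assumed 1-in-300
    # sign-up rate for a dull-sounding task means inviting 36,000 editors.
    print(invitations_needed(120, 1 / 300))  # 36000

    # The researcher's optimistic 350 invitations at the same rate would
    # be expected to yield barely one participant, nowhere near 120.
    print(round(350 * (1 / 300)))            # 1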
Not all studies that lack statistical significance are worthless. There are large categories of research that are not primarily concerned with statistical significance.
I think it seems reasonable that SRAG would eventually be able to advise researchers about appropriate sample sizes and effective recruitment messages, but managing recruitment seems like too much responsibility for a group of volunteers primarily concerned with evaluating recruitment requests.
I think that the concerns you've brought up about recruitment sizes and statistical significance are important and should enter the SRAG discussions, but I worry that turning those concerns into responsibilities for SRAG is unnecessary and would be mostly unworkable. --EpochFail(talk|work) 16:27, 11 January 2010 (UTC)
One thing that could be done is to have two messages for each recruitment drive (especially early on), so as to allow measuring response rates to different formats/content - meta-research, or recruitment research and hypothesis testing. Collecting information like that could accelerate the learning process; it wouldn't be so much of a gut-instinct thing, and the discussion could point to previous recruitment drives and say "testing has shown offering financial compensation in the recruitment message lowers the click-through rate by 64%".
Have we specified anywhere what ought (and ought not) to be discussed in an application? I've started something over at Wikipedia:Subject Recruitment Approvals Group/Requests Josh Parris 01:59, 12 January 2010 (UTC)
We have specified requirements for acceptance of applications. These should provide a guide for discussion. --EpochFail(talk|work) 16:42, 14 January 2010 (UTC)
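
(A sketch of the two-message comparison suggested above: split the invitation group randomly between message variants, then compare response rates afterwards. The variant assignment and response data here are invented for illustration.)

    # Sketch: measure response rates per recruitment-message variant.
    import random
    from collections import Counter

    def assign_variants(sample, variants, seed=0):
        """Randomly assign each invited editor one of the message variants."""
        rng = random.Random(seed)
        return {editor: rng.choice(variants) for editor in sample}

    def response_rates(assignments, responded):
        """Response rate per variant; `responded` is the set of invited
        editors who followed the recruitment link."""
        sent = Counter(assignments.values())
        hits = Counter(assignments[editor] for editor in responded)
        return {variant: hits[variant] / sent[variant] for variant in sent}

    # Invented example run: 1000 invitations, 35 responses overall.
    sample = [f'Editor{i}' for i in range(1000)]
    assignments = assign_variants(sample, variants=['A', 'B'])
    responded = set(random.Random(1).sample(sample, 35))
    print(response_rates(assignments, responded))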

Approved applications

Wikipedia:SRAG/Requests/Approved is a holding area for all approved applications. It needs to be created. Josh Parris 02:39, 5 January 2010 (UTC)

On that page there needs to be a delineation between approved and running applications; approved applications haven't got their sample together yet, plus any other administration that needs to be done, like message templates. Once that's all finalised, it becomes "running". Running studies can be closed, published, open, abandoned, and a bunch of other statuses, I guess. Josh Parris 02:43, 5 January 2010 (UTC)
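
(One reading of the lifecycle described above, modelled as a simple state set; the state names are taken from the comment, not from any agreed scheme.)

    # Sketch: possible statuses for entries at Wikipedia:SRAG/Requests/Approved.
    from enum import Enum

    class StudyStatus(Enum):
        APPROVED = 'approved'    # accepted, but sample and message
                                 # templates not yet finalised
        RUNNING = 'running'      # administration done, recruitment underway
        OPEN = 'open'            # still accepting participants
        CLOSED = 'closed'        # recruitment has ended
        PUBLISHED = 'published'  # results are available
        ABANDONED = 'abandoned'  # study discontinued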

I'm not sure why SRAG would be concerned with the status of a study. SRAG's only responsibility is to help with recruitment. Once the messages have been sent, SRAG is done until another request is filed. --EpochFail(talk|work) 22:12, 5 January 2010 (UTC)
Well, other than that interested parties want to know what's happening with the study - whether results are published (and if so, where), what other research has already been done on Wikipedia, etc. Josh Parris 10:47, 6 January 2010 (UTC)
This sounds like a job for WP:WikiProject Research, or possibly a spin-off project concerned only with cataloging research. --EpochFail(talk|work) 17:15, 6 January 2010 (UTC)