{"title": "Assessing Blinding in Clinical Trials", "book": "Advances in Neural Information Processing Systems", "page_first": 521, "page_last": 529, "abstract": "The interaction between the patient's expected outcome of an intervention and the inherent effects of that intervention can have extraordinary effects. Thus in clinical trials an effort is made to conceal the nature of the administered intervention from the participants in the trial, i.e. to blind it. Yet, in practice perfect blinding is impossible to ensure or even verify. The current standard is to follow up the trial with an auxiliary questionnaire, which allows trial participants to express their belief concerning the assigned intervention and which is used to compute a measure of the extent of blinding in the trial. If the estimated extent of blinding exceeds a threshold the trial is deemed sufficiently blinded; otherwise, the trial is deemed to have failed. In this paper we make several important contributions. Firstly, we identify a series of fundamental problems of the aforesaid practice and discuss them in the context of the most commonly used blinding measures. Secondly, motivated by the highlighted problems, we formulate a novel method for handling imperfectly blinded trials. We too adopt a post-trial feedback questionnaire but interpret the collected data using an original approach, fundamentally different from those previously proposed. Unlike previous approaches, ours is void of any ad hoc free parameters, is robust to small changes in auxiliary data and is not predicated on any strong assumptions used to interpret participants' feedback.", "full_text": "Assessing Blinding in Clinical Trials\n\nOgnjen Arandjelovi\u0107\n\nDeakin University, Australia\n\nAbstract\n\nThe interaction between the patient\u2019s expected outcome of an intervention and\nthe inherent effects of that intervention can have extraordinary effects. 
Thus in\nclinical trials an effort is made to conceal the nature of the administered intervention\nfrom the participants in the trial, i.e. to blind it. Yet, in practice perfect\nblinding is impossible to ensure or even verify. The current standard is to follow up\nthe trial with an auxiliary questionnaire, which allows trial participants to express\ntheir belief concerning the assigned intervention and which is used to compute a\nmeasure of the extent of blinding in the trial. If the estimated extent of blinding\nexceeds a threshold the trial is deemed sufficiently blinded; otherwise, the trial\nis deemed to have failed. In this paper we make several important contributions.\nFirstly, we identify a series of fundamental problems of the aforesaid practice and\ndiscuss them in the context of the most commonly used blinding measures. Secondly,\nmotivated by the highlighted problems, we formulate a novel method for handling\nimperfectly blinded trials. We too adopt a post-trial feedback questionnaire but interpret\nthe collected data using an original approach, fundamentally different from\nthose previously proposed. Unlike previous approaches, ours is void of any ad hoc\nfree parameters, is robust to small changes in auxiliary data and is not predicated\non any strong assumptions used to interpret participants\u2019 feedback.\n\n1 Introduction\n\nUltimately, the main aim of a clinical trial is straightforward: it is to examine and quantify the effectiveness\nof a treatment of interest. Effectiveness is evaluated relative to the effectiveness of a particular\nreference, the so-called control intervention. To ensure that the aforementioned comparison is\nmeaningful, it is of essential importance to ensure that any factors not inherently associated with the\ntwo interventions (treatment and control) are normalized (controlled) between the two groups. 
This\nensures that the observed differential outcome truly is the effect of differing interventions rather than\nany orthogonal, confounding variables. A related challenge is that of blinding. Blinding refers to the\nconcealment of the type of administered intervention from the individuals/patients participating in a\ntrial and its aim is to eliminate differential placebo effect between groups [10, 3, 11]. Although con-\nceptually simple, the problem of blinding poses dif\ufb01cult challenges in a practical clinical setup. We\nhighlight two speci\ufb01c challenges which most strongly motivate the work of the present paper. The\n\ufb01rst of these stems from the dif\ufb01culty of ensuring that absolute blinding with respect to a particular\ntrial variable is achieved. The second challenge arises as a consequence of the fact that blinding can\nonly be attempted with respect to those variables of the trial which have been identi\ufb01ed as revealing\nof the treatment administered. Put differently, it is always possible that a particular variable which\ncan reveal the nature of the treatment to a trial participant is not identi\ufb01ed by the trial designers\nand thus that no blinding with respect to it is attempted or achieved. This is a ubiquitous problem,\npresent in every controlled trial, and one which can severely affect the trial\u2019s outcome.\nGiven that it is both practically and in principle impossible to ensure perfect blinding, the practice\nof post hoc assessment of the level of blinding achieved has been gaining popularity and general\nacceptance by the clinical community. The key idea is to use a statistical model and the partici-\npants\u2019 responses to a generic post-trial questionnaire to quantify the participants\u2019 knowledge about\nthe administered intervention. 
While the statistical model used to this end has been a source of\ndisagreement between researchers, as discussed in detail in Sec 2, the general approach is shared\nby different methods described in the literature. In this paper we argue that this common approach\nsuffers from several important limitations. Motivated by these, in the present work we propose a\nnovel statistical framework and use it to derive an original method for integrated trial assessment\nwhich is experimentally shown to produce more meaningful and more clearly interpretable data.\n\nTable 1: Notational convention for mathematical symbols adopted in this paper.\n\nSymbol    Description\na         subscript specifying group assignment; a = C and a = T signify control and treatment groups\ng         subscript specifying membership belief; g = \u2212 and g = + signify belief in control and treatment group memberships, g = 0 signifies uncertainty\nPag       proportion of participants who were assigned to group a and believe the membership to be g\nPa        proportion of participants who were assigned to group a\nPg        proportion of participants who believe their group membership to be g\n\n2 Previous Work\n\nIn this section we describe the general methodology of auxiliary post-trial data collection, the two\nmost influential statistical models which use the aforesaid data to quantify the extent of blinding in\na trial, and discuss the key limitations of the existing approaches which motivate the work described\nin the present paper.\n\n2.1 Method 1: James\u2019s Blinding Index\n\nAt the heart of the so-called blinding index proposed by James et al. [7] is the observation that the\neffect of a particular intervention is affected by the participant\u2019s perception of the effectiveness of\nthe intervention the participant believes was administered. 
For example, a control group member\nwho incorrectly believes themselves to be a member of the treatment group may indeed experience positive\neffects expected from the studied treatment. This is the extensively studied placebo effect [2, 9].\nAuxiliary Data James et al. propose the use of a post-trial questionnaire to assess the level of\nblinding in a trial. The participants are asked if they believe that they were assigned to the (i) control\nor (ii) treatment groups, or (iii) if they are uncertain of their assignment (the \u201cdon\u2019t know\u201d response).\nExtensions of this scheme which attempt to harness more detailed information have also been used,\ne.g. allowing the participants to quantify the strength of their belief.\nBlinding Level The existing work on the assessment of trial blinding uses the collected auxiliary\ndata to calculate a statistic referred to as the blinding index. For a 3-tier auxiliary questionnaire,\nJames et al. [7] define their index as (our mathematical notation is summarized in Tab 1):\n\n$$\\rho_1 = \\frac{1}{2} \\left[ 1 + P_0 + (1 - P_0) \\cdot \\Delta \\right] \\qquad (1)$$\n\nIt can attain values in the interval [0, 1], higher values denoting increasing level of blindness. Thus\n\u03c11 = 1 indicates perfect blinding and \u03c11 = 0 an unblinded trial. The statistic \u2206 takes into account\nthe distribution of participants who have a decisive belief regarding their assignment:\n\n$$\\Delta = \\sum_{a \\in \\{C,T\\}} \\sum_{g \\in \\{+,-\\}} \\omega_{ag} \\, \\frac{P_{ag}(1 - P_0) - P_g (P_a - P_{a0})}{(1 - P_0)^2} \\qquad (2)$$\n\nThe constants \u03c9ag are weighting coefficients whose effect is to scale relative contributions of the\ncorrect and incorrect assignment guesses. To gain intuitive insight into the nature of \u03c11, consider the\nplot shown in Fig 1(a). 
It is readily apparent that \u03c11 is a concave function which attains its maximal\nvalue of 1 when (i) all participants are uncertain of their assignment or (ii) all participants have\nan incorrect belief regarding their assignment.\nIn comparison with the case of P0 = 1 the attainment of the maximal value \u03c11 = 1 for\nPT + = PC\u2212 = 0 is more questionable. While it is tempting to reason that blinding must have\nbeen successful since no participant correctly guessed their assignment, it would be erroneous to\ndo so. In particular, the consistency of the wrong belief amongst trial participants actually reveals\nunblinding, but with the participants\u2019 incorrect association of the unblinded factor with the corresponding\ngroup assignment. For example, the treatment may cause perceivable side effects (thus\nunblinding the participants) and the worsening of the condition of the treatment group participants.\nThis observation could lead them to the conclusion that they were assigned to the control\ngroup.\n\nFigure 1: Dependency of the blinding indexes (a) \u03c11 and (b) \u03c1\u20322 on the proportions of \u201cdon\u2019t know\u201d responses\nP0, and the correct assignment guesses PT + and PC\u2212. Note that although PT + and PC\u2212 are independent\nvariables, due to their symmetric contributions and for the purpose of easier visualization, in this plot it was\ntaken that PT + and PC\u2212 were always equal.\n\n2.2 Method 2: Bang\u2019s Blinding Index\n\nThe blinding index \u03c11 places a lot of value on those participants who plead ignorance regarding their\nassignment status. Bang et al. argue that the non-decisive \u201cdon\u2019t know\u201d response may not express a\ntrue lack of knowledge but rather that it may be a conservative response born out of desire to appear\nbalanced in judgement [1]. Thus, they propose an alternative which instead most heavily weights the\ncontribution of decisive responses. 
Because decisive responses can be in either the positive or the\nnegative direction, the index is asymmetrical and can be applied separately to treatment and control\ngroups. For a 3-tier auxiliary questionnaire, the index for the treatment group is defined as:\n\n$$\\rho'_2 = \\left( 2 \\, \\frac{P_{C-}}{P_{C-} + P_{T-}} - 1 \\right) \\cdot \\frac{P_{T-} + P_{T+}}{\\sum_{g \\in \\{-,0,+\\}} P_{Tg}} \\qquad (3)$$\n\nThe behaviour of \u03c1\u20322 can be seen in Fig 1(b) which plots it against the proportions of indecisive\nresponses and correct guesses. It is readily apparent that the plot has a form very different from that\nin Fig 1(a) showing the corresponding variation of \u03c11. Firstly, note that unlike \u03c11, the range of values\nfor \u03c12 is [\u22120.5, 0.5]. The value of \u03c12 = 0 indicates perfect blinding, \u03c12 = 0.5 an unblinded trial and\n\u03c12 = \u22120.5 an unblinded trial with incorrect assignment association, as discussed in Sec 2.1.\nAs the plot shows, this index achieves its perfect blinding value only when P0 = 1. Unlike \u03c11, the\ncase when PT + = PC\u2212 = 0 does not necessarily result in perfect blinding. Also, PT + = PC\u2212 = 1\nand P0 = 0 deems the trial unblinded, as does PT + = PC\u2212 = 0 and P0 = 0 but with the incorrect\nassignment association. Contrast this with the corresponding value of \u03c11.\n\n3 Limitations of the Current Best Standards\n\nIn the preceding sections we described two blinding indexes most widely used in practice to assess\nthe level of blinding in controlled clinical trials. To highlight and motivate the contribution of the\npresent work, we now analyze the limitations of the aforesaid approaches.\nAdjustment of Free Parameters One of the most obvious difficulties encountered when applying\neither of the described blinding indexes concerns the need to choose appropriate values for the free\nparameters in Equations (2) and (3) in their general form. 
These are the weighting constants \u03c9ag.\nRecall that their purpose is to scale the relative contributions of different responses. Although not\nwithout an intuitive appeal, a thorough analysis of this ad hoc approach reveals a series of problems,\nboth inherent and practical. Firstly, there is no objective underlying mechanism which would explain\nwhy the contributions of different responses should be combined linearly at all. What is more, even\nif linear combination is adopted, it is fundamentally the case that there is no principled method of\nchoosing the values of the weighting constants \u2013 the lack of observable \u201cground truth\u201d means that\nit is not possible to objectively compare the quality of different predictions. Lastly, the values of the\n\u201cbest\u201d weighting constant ratios are likely to differ from trial to trial.\nInterpretation of Participants\u2019 Feedback It is important to highlight that both the index of James\net al. and that of Bang et al. use the same type of feedback data collected from the trial\nparticipants \u2013 the participants\u2019 stated belief regarding their trial group assignment and their degree\nof confidence. Where the two approaches differ is in the interpretation of the participants\u2019 answers.\nJames et al. interpret the non-decisive, \u201cdon\u2019t know\u201d response as indicative of true lack of knowledge\nregarding the nature of the intervention (treatment or control). If the trial participants are ignorant of\ntheir group assignment, it is assumed that they have indeed been blinded. Consequently, \u03c11 heavily\nrelies on the proportion of the non-decisive participants. However, the \u201cdon\u2019t know\u201d response may\nnot truly represent lack of knowledge. 
Instead, this response may be seen as a conservative one,\nreflecting the participants\u2019 desire to appear balanced in their judgement or indeed the response that\nthe participants believe would please the trial administration staff. For this reason, \u03c1\u20322 relies mostly on the\nresponses of those trial participants who did express belief regarding their group assignment. Blindness\nis measured by comparing the observed statistics of decisive responses with those expected\nfrom an ideal, fully blinded trial. However, this interpretation of participants\u2019 responses is readily\ncriticized too. As Hemil\u00e4 amongst others notes, because the participants\u2019 feedback is collected post\nhoc it is possible that even a perfectly blinded subject becomes aware of the correct assignment by\nvirtue of observing the effects (or lack thereof) of the assigned intervention [5]. Considering the\nsame issue, Henneicke-von Zepelin [6] suggested that auxiliary data should be collected before or\nshortly after the commencement of a trial. However, this is in most cases unsatisfactory as the participants\nwould not have yet been exposed to any unblinded aspects of the trial. As we demonstrate\nin the next section, the approach proposed in this paper entirely avoids this problem.\nSensitivity to Small Input Differences Both James et al. and Bang et al. establish the level of\nblindness in a trial by computing a blinding index and then comparing it with a predefined threshold.\nThis hard thresholding whereby a trial is considered either sufficiently well blinded or not means\nthat the outcome of the blinding assessment can exhibit high sensitivity to small differences in\nparticipants\u2019 responses. 
The response of a single individual may change the assessment outcome.\nYet, such binarization in some form is necessitated by the nature of the blinding indexes because\nneither of the two described statistics has a clear practical interpretation in the clinical context. The\ntask of choosing the value of the aforesaid threshold suffers from much the same problems as the\ntask of selecting the values of the weighting constants, discussed previously \u2013 inherently, there is\nno objective and meaningful way of defining the optimal threshold value, and the value actually\nselected by the practitioner is likely to vary from trial to trial.\nInference Atomization The problem of high sensitivity to small input differences considered previously\nis but one of the consequences of inference atomization. Specifically, observe that the\nanalysis of the trial outcome data is separated from the blinding assessment. Indeed, only if the\ntrial is deemed sufficiently well blinded does the analysis of actual trial data proceed. Thus, if the\nblinding index falls short of the predetermined threshold, the data is effectively thrown away and\nthe trial needs to be repeated. On the other hand, if the blinding index exceeds the threshold, the\nanalysis of data is performed in the same manner regardless of the actual value of the index, that is,\nregardless of whether it is just above the threshold or if it indicates perfect blinding.\nThe variety of problems that emerges from the atomization of different statistical aspects of a trial\nis inherently rooted in the very nature of the framework adopted by James et al. and Bang\net al. alike. As stated earlier, neither of the two indexes has a clear practical interpretation in the\nclinical context. For example, neither tells the clinician the probability that a particular portion of\nthe participants were unblinded, nor the probability of a particular level of unblinding. 
Instead, from\nthe point of view of a clinician, the blinding index behaves like a black box which deems the trial\nwell blinded or not, with little additional insight.\n\n4 Principled Approach to Controlled Clinical Trial Data Analysis\n\nWe now describe a principled method for inference from collected trial data.\n\n4.1 Study Design and Outcome Model\n\nAs we demonstrated in the previous section, many of the problems of the approaches proposed by\nJames et al. and Bang et al. inherently stem from the underlying statistical model. Although our\napproach uses the same type of participants\u2019 feedback data, our statistical model differs significantly\nfrom that employed in previous works.\nIn the general case, the effectiveness of a particular intervention in a trial participant depends on\nthe inherent effects of the intervention, as well as the participant\u2019s expectations (conscious or not).\nThus, in the interpretation of trial results, we separately consider each population of participants\nwhich share the same combination of the type of intervention and the expressed belief regarding this\ngroup assignment. This is conceptually illustrated in Fig 2.\n\nFigure 2: Conceptual illustration of the proposed statistical model for the 3-tier feedback questionnaire. Dotted\nand solid lines show respectively the probability density functions of the measured trial outcome across\nindividuals in the three control and treatment sub-groups (axes: outcome magnitude against probability density).\n\nA key idea of the proposed method is that because the outcome of an intervention depends on both\nthe inherent effects of the intervention and the participants\u2019 expectations, the effectiveness should\nbe inferred in a like-for-like fashion. In other words, the response observed in, say, the sub-group\nof participants assigned to the control group whose feedback professes belief in the control group\nassignment should be compared with the response of only the sub-group of the treatment group who\nequally professed belief in the control group assignment. Similarly, the \u201cdon\u2019t know\u201d sub-groups\nshould be compared only with each other, as should the subgroups corresponding to the belief in the\ntreatment assignment. This idea is formalized next.\n\n4.2 Inference\n\nConsider two corresponding sub-groups, that is, sub-groups corresponding to different types of received\nintervention but the same response in the participants\u2019 feedback questionnaire. Furthermore,\nlet the benefit of an intervention observed in a particular participant be expressed as a real number\n$x^{(i)}_{ag}$. Thus, and without loss of generality, a greater $x^{(i)}_{ag}$ indicates greater benefit. For example, $x^{(i)}_{ag}$\nmay represent the amount of fat loss in a fat loss trial, the reduction in blood plasma LDL in a statin\ntrial etc. Our goal is to infer p(\u2206x), that is, the probability density function over the difference \u2206x\nin the benefit observed across the two compared sub-groups.\nLet $D_{Cg} = \\{x^{(1)}_{Cg}, \\ldots, x^{(n_{Cg})}_{Cg}\\}$ be the trial outcome data collected from a control sub-group and\n$D_{Tg} = \\{x^{(1)}_{Tg}, \\ldots, x^{(n_{Tg})}_{Tg}\\}$ that of the matching treatment sub-group. Then, if $D_g = D_{Cg} \\cup D_{Tg}$ is the\ntotality of all data of participants who believe they were assigned to the group g:\n\n$$p(\\Delta x \\mid D_g) = \\frac{p(D_g \\mid \\Delta x) \\, p(\\Delta x)}{p(D_g)} \\qquad (4)$$\n\nModelling the response of each sub-group using a normal distribution\n\n$$x^{(i)}_{Cg} \\sim \\mathcal{N}(m_{Cg}, \\sigma_g) \\quad \\text{and} \\quad x^{(j)}_{Tg} \\sim \\mathcal{N}(m_{Tg}, \\sigma_g) \\qquad (5)$$\n\nand remembering that for the underlying distributions it holds that $m_{Cg} + \\Delta x = m_{Tg}$, allows us to\nfurther write\n\n$$p(\\Delta x \\mid D_g) \\propto p(D_g \\mid \\Delta x) = \\int_{m_{Cg}} \\int_{\\sigma_g} p(D_g \\mid \\Delta x, m_{Cg}, \\sigma_g) \\, p(m_{Cg}) \\, p(\\sigma_g) \\, d\\sigma_g \\, dm_{Cg} \\qquad (6)$$\n\nwhere p(mCg) is a prior on the mean of the control sub-group and p(\u03c3g) a prior on the standard\ndeviation within sub-groups. What Eq (6) expresses is the process of probability density function\nmarginalization over nuisance variables mCg and \u03c3g. Since the values of these latent model variables\nare unknown, marginalization takes into account all of the possibilities and weights them in\nproportion to the supporting evidence.\nWhen two corresponding sub-groups of participants are considered, for uninformed priors over mCg\nand \u03c3g, the posterior distribution of \u2206x is given by:\n\n$$p(\\Delta x \\mid D_g) \\propto c_g^{-\\frac{n_{Cg} + n_{Tg} - 1}{2}} = c_g^{-\\frac{n_g - 1}{2}} \\qquad (7)$$\n\nwhere constant scaling factors have been omitted for clarity, and\n\n$$c_g = \\sum_{i=1}^{n_{Cg}} \\left( x^{(i)}_{Cg} \\right)^2 + \\sum_{j=1}^{n_{Tg}} \\left( x^{(j)}_{Tg} + \\Delta x \\right)^2 - \\left[ \\sum_{i=1}^{n_{Cg}} x^{(i)}_{Cg} + \\sum_{j=1}^{n_{Tg}} \\left( x^{(j)}_{Tg} + \\Delta x \\right) \\right]^2 / \\, (n_{Cg} + n_{Tg}) \\qquad (8)$$\n\nExtending to the joint inference over the entire data corpus, the posterior can be computed simply\nas a product of all sub-group pair posteriors (up to a scaling constant):\n\n$$p(\\Delta x \\mid \\cup_g D_g) \\propto \\prod_g p(\\Delta x \\mid D_g) \\propto \\prod_g c_g^{-\\frac{n_g - 1}{2}} \\qquad (9)$$\n\nThe estimate of the posterior distribution of \u2206x in Eq (9) is the best estimate that can be made 
using\nthe available data, and it is of the most interest to the clinician. However, as we will discuss in Sec 6,\nboth Eq (7) and (9) have signi\ufb01cance in the interpretation of trial results and their joint consideration\ncan be used to reveal important additional information about the effectiveness of the treatment.\n\n5 Experiments\n\nCertain advantages of the proposed methodology over previous approaches are ipso facto inherent\nin the theory, e.g. the absence of free parameters. Other claimed properties of the method, such as\nits robustness to small input differences, are not immediately obvious. In this section we present the\nresults of a series of experiments which demonstrate the superiority of the proposed method.\n\n5.1 Evaluation Methodology\n\nIn contrast to the methods of James et al. and Bang et al. which do not attempt to infer any objective\nand measurable quantity, the proposed approach pools all available data (trial outcomes and auxiliary\nquestionnaire feedback) in an effort to evaluate robustly the effectiveness of the studied treatment.\nThis feature of our method allows us to directly evaluate its performance. Speci\ufb01cally, we employ\na computer-based simulation whereby data is \ufb01rst randomly (or rather pseudo-randomly) generated\nusing a statistical model with adjustable parameters, followed by the application of the proposed\nmethod which is used to infer the said parameters. The values inferred by our method can then be\ndirectly compared with their known true values.\nExp 1: Reference For our \ufb01rst experiment, we simulated a trial involving 200 individuals, half of\nwhich were assigned to the control and half to the treatment group. For each of the groups, 60% of\nthe participants were taken to be in the \u201cundecided\u201d subgroups GC0 and GT 0. The remaining 40%\nof the participants was split between correct and incorrect guesses of the assigned intervention in\nproportion 3 : 1. 
In this initial experiment we assume that all participants correctly disclosed their\nbelief regarding which group they were assigned to. Note that this assumption is made purely in the\nprocess of generating data for the experiment \u2013 neither this nor any of the preceding information is\nused by our method to analyze the outcome of the trial.\nWe set the differential effect of treatment to \u2206x = 0.1 and the standard deviation of variability\nwithin each of the assignment-response subgroups to \u03c3\u2212 = \u03c30 = \u03c3+ = 0.1. Relative to genuine\nlack of belief in either control or treatment group assignments, belief in control group assignment\nwas set to exhibit negative effect of magnitude 0.2 and that in treatment group assignment a positive\neffect of magnitude 0.2. Intervention outcomes were then generated by repeated random draws from\nthe corresponding distributions. For example, the outcome associated with a participant in GC\u2212 was\ndetermined by a random draw from the normal distribution N (mC\u2212, \u03c3\u2212).\nThe result of applying the proposed method is summarized in Fig 3 which plots the posteriors (bold\nlines) corresponding to the three subgroups matched by the patients\u2019 post-trial belief and the amalgamated\nposterior. The maximum a posteriori (MAP) value of the estimate of the differential effectiveness\nof the treatment is \u2206x\u2217 \u2248 0.107, which is close to the true value of \u2206x = 0.1. In\ncomparison, when the differential effectiveness is estimated by subtracting the mean response of the\ncontrol group from that of the treatment group, without the use of our matching sub-groups based\nstatistical model, the estimate is \u2206x \u2248 0.141. Finally, the corresponding values of the blinding\nindices proposed by James et al. and Bang et al. are \u03c11 = 0.53 and \u03c1\u20322 = \u03c1\u20332 = 0.10. Notice that\nthe former indicates a level of blinding roughly half way between a perfectly blinded and unblinded\ntrial, while the latter deems the trial nearly perfectly blinded.\n\n5.1.1 Exp 2: Conservative Distortion\n\nWe modify the baseline experiment by simulating conservative behavioural tendency of participants\nin a trial. This was achieved by randomly choosing individuals from decisive subgroups and\nre-assigning them to their corresponding indecisive subgroup without changing their treatment\u2019s\nobserved effectiveness. The probability of re-assignment was set to pcons = 0.2.\nAs before, we applied the proposed method on the modified data and display the key results in\nFig 3. In addition to the new subgroup posteriors (dotted lines), for comparison in Fig 3(a) we\nalso show the three initial subgroup posteriors from Exp 1 (solid lines). The baseline (thick solid\nline) and new (thin solid line) amalgamated posteriors are shown in Fig 3(b). Fig 3(b) also shows the\nsemi-amalgamated posterior obtained using only decisive subgroups which, by experimental design,\ncomprise data of only those individuals who honestly disclosed their belief of group assignment.\nThe new MAP value for the differential effectiveness using the amalgamated posterior can be seen\nto be \u2206x\u2217 \u2248 0.122 and that using the semi-amalgamated posterior \u2206x\u2217 \u2248 0.116. In Sec 6 we will\nshow how the difference in statistical features of sub-group posteriors can be used to select the most\nreliable posteriors to amalgamate, as well as to reveal additional insight into the nature of the studied\ntreatment and the blinding in the trial.\n\nFigure 3: Exp 2: (a) Sub-group posteriors for the differential effect of treatment, computed using the data Dg of each\nexperimental sub-group comprising control and treatment individuals matched by their feedback. 
(b) Posterior\nfor the differential effect of treatment computed using all available data.\n\nExp 3: Asymmetric Progressive Unblinding Starting with the baseline setup, we simulate unblinding\nof previously undecided individuals of the treatment group. In other words, in each turn\nwe re-assign an individual from the subgroup GT 0 to the subgroup GT + and compute the new\ndistribution for \u2206x.\nThe robustness of our method is illustrated in Fig 4(a), which shows the MAP estimate of the effectiveness\nof the treatment after an increasing number of participants were unblinded. This estimate\nonly shows small random perturbations, with the corresponding standard deviation of 0.0054. The\nplots in Fig 4(b) show the variation of the two blinding indexes throughout the experiment. As expected\nfrom the change in the participants\u2019 auxiliary data, both indexes change in value dramatically.\nThe index of James et al. decreases, while that of Bang et al. increases in absolute value, indicating\nagreement on the lowered level of blinding.\n\nFigure 4: Exp 3: (a) The MAP estimate of the treatment effectiveness as the participants assigned to the\ntreatment group are progressively unblinded. (b) The values of the blinding indexes \u03c11 (blue line) and \u03c1\u20322 (red\nline), computed at each step of the progressive unblinding of the participants assigned to the treatment group.\n\nExp 4: Symmetric Progressive Unblinding As in Exp 3 we start with the baseline setup and\nsimulate unblinding of previously undecided individuals, this time in both the treatment and the control groups. In each turn we\nre-assign an individual from GT 0 to GT + and an individual from GC0 to GC\u2212, and compute the\nnew distribution for \u2206x.\nWe illustrate the robustness of the method by plotting the MAP estimate of the effectiveness of the\ntreatment in Fig 5(a). 
As before, the estimate only shows small random perturbations, as expected in\nany experiment of a stochastic nature, and is to be contrasted with the plots in Fig 5(b) which show\nthe changes in the two blinding indexes throughout the experiment. Again, with the change in the\nparticipants\u2019 auxiliary data, both indexes also change in value. It is insightful to observe that unlike\nin Exp 3, in this instance the values of the two indexes do not exhibit agreement on the direction\nof change of the level of blinding. This reflects the importance that the auxiliary data interpretation\nplays in the methods of both James et al. and Bang et al.\n\nFigure 5: Exp 4: (a) The MAP estimate of the treatment effectiveness as the participants assigned to both the\ntreatment and the control groups are progressively unblinded. (b) The values of the blinding indexes \u03c11 (blue\nline) and \u03c1\u20322 (red line), computed at each step of the progressive unblinding.\n\n6 Discussion\n\nDegenerate Cases One of the key ideas behind the present method is that it is meaningful to\ncompare only the sub-groups matched by their auxiliary responses. While a greater number of\nsubgroups may provide more precise auxiliary/blinding information, the introduced partitioning\nof data decreases the statistical strength of each comparison of the corresponding sub-groups. In\nan extreme case, a particular sub-group may be empty. In other words, it is possible that none of\nthe participants of the treatment or the control group expressed a particular belief regarding their\ntreatment assignment. Although this may appear as a problem at first, a more careful examination\nof such cases reveals that this is not so.\nFirstly, note that whenever at least one pair of matching sub-groups is non-empty, the proposed\nmethod is able to compute a meaningful estimate of differential treatment effectiveness. 
In instances\nwhen there are no non-empty matching sub-groups, the nature of the degeneracy can provide useful\ninsight to the clinician. The absence of individuals in G_{T+} may indicate that the participants assigned\nto the treatment group have either been poorly blinded but misidentified the received treatment, or\nthat the treatment was vastly ineffective and was recognized as such by the participants assigned to\nit. Similarly, the absence of individuals in G_{T\u2212} may indicate that the participants assigned to the\ntreatment group have either been poorly blinded and correctly identified the received treatment, or\nthat the treatment was obviously effective. In all cases, because degenerate data is trivial to\nrecognize, the clinician is immediately made aware of the presence of a major flaw in the experimental\ndesign. The cause of degeneration can then be determined using the knowledge of the administered\ninterventions, and the statistics of both auxiliary responses and trial outcomes.\n\nFurther Insight In Sec 4.2 we derived posteriors corresponding both to only a single pair of\ncorresponding sub-groups in Eq (7) and to the entirety of data, that is, all sub-groups in Eq (9).\nWhile the latter of these is of primary interest, the clinician can derive further useful insight into the\nnature of the studied treatment by comparative examination of sub-group posteriors too.\nThe least interesting case is when the sub-group posteriors and the total posterior exhibit similar\ncharacteristics (e.g. the location of the mode). However, consider the case when that is not so. For\nexample, let us say that the posterior corresponding to the two matching \u201cdon\u2019t know\u201d subgroups\nhas the mode near \u2206x \u2248 0 and the total posterior has a decidedly positive mode (with suitably small\nstandard deviations, to make the observation statistically significant). 
This could indicate that there\nmay be so-called \u201cnon-responders\u201d in the treatment group, i.e. individuals who did not respond\npositively to a treatment which in most people does produce a positive result [4, 8]. Similar\narguments can be made by considering differences between other sub-group posteriors. Ultimately,\nthe exact interpretation is in the hands of the clinicians who should use their insight into the nature\nof the administered interventions to infer further information of this type.\n\n7 Summary and Conclusions\n\nThis paper examined the problem of assessing the extent of blindness in a clinical trial. We\ndemonstrated a series of fundamental flaws in blinding index based approaches and thus proposed a novel\nframework. Central to our idea is that the comparison of the treatment and control groups\nshould be done in a like-for-like fashion, giving rise to the partitioning of participants into sub-groups,\neach sub-group sharing the same intervention and post-trial responses. A Bayesian framework was\nused to interpret jointly the auxiliary and trial outcome data, giving the clinician a meaningful and\nreadily understandable end result. The effectiveness of our method was demonstrated empirically in\na simulation study, which showed its robustness in a variety of scenarios.\n\nReferences\n[1] H. Bang, L. Ni, and C. E. Davis. Assessment of blinding in clinical trials. Contemp Clin Trials,\n25(2):143\u2013156, 2004.\n[2] H. K. Beecher. The powerful placebo. JAMA, 159(17):1602\u20131606, 1955.\n[3] F. Benedetti, H. S. Mayberg, T. D. Wager, C. S. Stohler, and J.-K. Zubieta. Neurobiological\nmechanisms of the placebo effect. J Neurosci, 25(45):10390\u201310402, 2005.\n[4] G. Costantino, F. Furfaro, A. Belvedere, A. Alibrandi, and W. Fries. Thiopurine treatment in\ninflammatory bowel disease: Response predictors, safety, and withdrawal in follow-up. J Crohns\nColitis, 2011.\n[5] H. Hemil\u00e4. 
Assessment of blinding may be inappropriate after the trial. Contemp Clin Trials,\n26(4):512\u2013514, 2005.\n[6] H.-H. Henneicke-von Zepelin. Letter to the editor. Contemp Clin Trials, 26(4):512, 2005.\n[7] K. E. James, D. A. Bloch, K. K. Lee, H. C. Kraemer, and R. K. Fuller. An index for assessing\nblindness in a multi-centre clinical trial: disulfiram for alcohol cessation\u2013a VA cooperative\nstudy. Stat Med, 15(13):1421\u20131434, 1996.\n[8] D. Karakitsos, J. Papanikolaou, A. Karabinis, R. Alalawi, M. Wachtel, C. Jumper, D. Alexopoulos,\nand P. Davlouros. Acute effect of sildenafil on central hemodynamics in mechanically\nventilated patients with WHO group III pulmonary hypertension and right ventricular failure\nnecessitating administration of dobutamine. Int J Cardiol, 2012.\n[9] H. S. Mayberg, J. A. Silva, S. K. Brannan, J. L. Tekell, R. K. Mahurin, S. McGinnis, and P. A.\nJerabek. The functional neuroanatomy of the placebo effect. 159:728\u2013737, 2002.\n[10] D. E. Moerman and W. B. Jonas. Deconstructing the placebo effect and finding the meaning\nresponse. Ann Intern Med, 136(6):471\u2013476, 2002.\n[11] G. H. Montgomery and I. Kirsch. Classical conditioning and the placebo effect. Pain,\n72(1\u20132):107\u2013113, 1997.\n", "award": [], "sourceid": 264, "authors": [{"given_name": "Ognjen", "family_name": "Arandjelovic", "institution": null}]}