
Inter-observer agreement in endoscopic scoring systems: Preliminary report of an ongoing study from the Italian Group for Inflammatory Bowel Disease (IG-IBD)

Digestive and Liver Disease, Volume 46, Issue 11, November 2014, Pages 969–973



Endoscopic activity has become a therapeutic endpoint in inflammatory bowel disease. The aim of this study was to evaluate inter-observer agreement for endoscopic scores in a real-life setting.


Fourteen gastroenterologists with experience in inflammatory bowel disease care and endoscopic scoring reviewed videos of ulcerative colitis (n = 13) and postoperative (n = 10) and luminal (n = 8) Crohn's disease. The Mayo subscore for ulcerative colitis, the Rutgeerts score for postoperative Crohn's disease, and the Crohn's disease endoscopic index of severity (CDEIS) and the simple endoscopic score for Crohn's disease (SES-CD) for luminal Crohn's disease were calculated. A subset of five endoscopic clips for each condition was assessed by 30 general gastroenterologists without specific experience in endoscopic scores. Kappa statistics and intraclass correlation coefficients were used to measure agreement.


Mayo subscore agreement was suboptimal: kappas were 0.53 (95% confidence interval 0.47–0.56) and 0.71 (0.67–0.76) for the experts and non-experts, respectively. Rutgeerts score agreement was moderate to good: kappas were 0.57 (0.51–0.65) and 0.67 (0.60–0.72). Agreements for CDEIS and SES-CD were good: intraclass correlation coefficients for the two groups were 0.83 (0.54–1.00) and 0.67 (0.36–0.97) for CDEIS and 0.93 (0.76–1.00) and 0.68 (0.35–0.97) for SES-CD, respectively.


The reproducibility of endoscopic scores in inflammatory bowel disease remains suboptimal, which could potentially have major effects on therapeutic choices.

Keywords: Interobserver agreement, SES-CD, CDEIS, Endoscopic scores, Mayo endoscopic subscore, Rutgeerts’ score.

1. Introduction

Ileocolonoscopy is an essential tool for the diagnosis and management of inflammatory bowel disease (IBD), namely Crohn's disease (CD) and ulcerative colitis (UC) [1], [2], and [3]. This procedure allows for the detection of elemental endoscopic lesions and for tissue sampling to confirm diagnosis [4]. Endoscopic findings have a major influence on disease outcomes in both CD [5], [6], [7], and [8] and UC [9], [10], [11], and [12] when the most severe endoscopic lesions are present. More recently, treatment-induced healing of mucosal lesions has been associated with more favourable long-term IBD courses [13], [14], [15], [16], [17], and [18]. These observations have led to the inclusion of endoscopic outcomes as therapeutic endpoints in more recent clinical trials [14], [19], [20], [21], [22], and [23]. Several scores have been proposed and used to objectively grade the endoscopic severity of IBD [5], [24], [25], [26], [27], and [28] in clinical trials and in routine clinical practice.

Although the severities of IBD lesions as assessed by ileocolonoscopy are significantly related to the outcomes of UC and CD, the endoscopic patterns of IBD are extremely variable. Moreover, endoscopic scores of activity are not widely used by IBD-dedicated gastroenterologists. In the original development of the different scores, formal tests of inter-observer agreement were conducted either with limited sets of observers or on limited sets of observations [25] and [26], and additional tests of the agreement of specific scores have been conducted on limited series of published data [27], [28], [29], [30], and [31].

Based on these observations, the Italian Group for Inflammatory Bowel Disease (IG-IBD) promoted a pilot study to explore the reliability of endoscopic scoring of IBD based on evaluations of the agreements for selected endoscopic scores. To this end, the inter-observer variability in the assessments of endoscopic scores was evaluated in a group of IBD-dedicated gastroenterologists with experience in IBD endoscopy (i.e., the “experts”) and in a group of general gastroenterologists (i.e., the “non-experts”) to outline the potential hazards of the routine use of endoscopic scores for IBD.

2. Methods

A group of 14 gastroenterologists with experience in the fields of IBD management and IBD endoscopy (i.e., the “experts”) were involved in dedicated meetings that focused on the evaluation of the inter-observer variability of assessments of endoscopic activity scores for IBD. During each meeting, each of the 14 involved gastroenterologists reviewed digitally recorded endoscopic clips of ulcerative colitis (13 videoclips), postoperative CD (10 videoclips), and luminal CD (8 videoclips) and scored the endoscopic activities observed in the clips.

The following four scoring systems were selected for the purposes of this study: the Mayo endoscopic subscore [24] for UC activity, the Rutgeerts score [5] for postoperative CD, and the Crohn's disease endoscopic index of severity (CDEIS) [25] and the simple endoscopic score for Crohn's disease (SES-CD) [26] for luminal CD.

A subset of the same clips (5 UC, 5 postoperative CD and 5 luminal CD) was also viewed by 30 gastroenterologists who had not received specific training related to endoscopic scores (i.e., the “non-experts”); they assessed the same 4 endoscopic scores during a dedicated independent meeting.

The experts had all been practicing endoscopy for more than 5 years, belonged to referral centres for IBD management and were accustomed to clinical trials that included endoscopic evaluations. Supplementary Table S1 details the characteristics and expertise of the 14 experts. All of the experts belonged to tertiary referral IBD centres and had previous experience with IBD scores; the median duration of endoscopy practice of this group was 21 years, and the median number of IBD patients who had been followed up at their centres was 1,750. During each meeting, the endoscopic scores were reviewed and discussed prior to video assessments to maximise concordance. The non-experts included gastroenterologists who belonged to primary/secondary IBD referral centres, had attended an IBD meeting, and had at least basic experience in digestive endoscopy but no formal training related to IBD endoscopic scores. The non-experts received a brief explanation of the four mentioned scores [5], [24], [25], and [26] just before scoring the endoscopic clips.

The experts provided their scores via electronic pads, and the scores were directly transferred to an electronic spreadsheet. The non-experts wrote their scores on dedicated anonymous scoring sheets, and the data were subsequently entered into an electronic spreadsheet. After every round of video clip scoring, the observers were allowed and encouraged to discuss the results, but they were not permitted to change their scores.

2.1. Statistical analyses

The scores were calculated with direct imputation (Mayo and Rutgeerts’ score) or after the calculation of the totals (CDEIS and SES-CD) following imputation with a telematic poll system, and the results were tabulated on an MS Office Excel 2007 spreadsheet.

Data analyses were performed with MedCalc statistical software (v.12.3, Mariakerke, Belgium). Categorical scores (Mayo and Rutgeerts scores) were summarised as frequencies, and semi-continuous scores (CDEIS and SES-CD) were summarised with medians and 95% confidence intervals (95% CI).

Inter-observer agreements were tested with Fleiss’ kappa statistics (kappas) [32] or intra-class correlation coefficients (ICCs) [33] as appropriate for each scoring modality. Agreement was considered [32] poor when the kappa (or ICC) was lower than 0.20, fair when the score was between 0.21 and 0.40, moderate when the score was between 0.41 and 0.60, good when the score was between 0.61 and 0.80, and very good when the score was above 0.80. The coefficient of variation (CV) was used as a measure of dispersion for each of the four scores. P values were considered significant when p < 0.05.
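As a rough illustration of the two steps described above, the following Python sketch computes Fleiss' kappa from categorical ratings and maps an agreement coefficient onto the five categories listed. It is not the MedCalc routine used in the study: the function names, the toy ratings, and the handling of values that fall exactly on a boundary (e.g., 0.20 counted as poor) are assumptions made for illustration only.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Fleiss' kappa for a list of subjects, each a list of category labels
    (one label per rater; every subject must have the same number of raters)."""
    n_subjects = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(row) != n_raters for row in ratings):
        raise ValueError("each subject needs the same number (>= 2) of ratings")

    counts = [Counter(row) for row in ratings]
    categories = set().union(*counts)

    # Observed agreement: mean proportion of agreeing rater pairs per subject.
    p_bar = sum(
        (sum(n * n for n in c.values()) - n_raters) / (n_raters * (n_raters - 1))
        for c in counts
    ) / n_subjects

    # Chance agreement from the overall proportion of ratings in each category.
    total = n_subjects * n_raters
    p_e = sum((sum(c[cat] for c in counts) / total) ** 2 for cat in categories)

    if p_e == 1.0:
        raise ValueError("kappa is undefined when only one category is ever used")
    return (p_bar - p_e) / (1.0 - p_e)


def agreement_label(value):
    """Map a kappa or ICC onto the categories used in the text.
    Assumption: a value exactly on a boundary goes to the lower category."""
    thresholds = ((0.20, "poor"), (0.40, "fair"), (0.60, "moderate"), (0.80, "good"))
    for upper, label in thresholds:
        if value <= upper:
            return label
    return "very good"


# Hypothetical Mayo subscores (0-3): four clips, each scored by five raters.
toy_ratings = [
    [0, 0, 0, 0, 1],
    [1, 1, 2, 1, 1],
    [2, 2, 2, 3, 2],
    [3, 3, 3, 3, 3],
]
kappa = fleiss_kappa(toy_ratings)
print(f"kappa = {kappa:.2f} ({agreement_label(kappa)})")
```

The input is kept as raw labels per clip rather than a pre-tabulated count matrix, so the same function can be applied to Mayo subscores (0 to 3) or Rutgeerts grades (i0 to i4) without recoding.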

3. Results

The inter-observer agreements for the Mayo endoscopic subscores in the assessments of the severity of UC were moderate (kappa 0.53) and good (kappa 0.71) among the experts and non-experts, respectively. Similarly, the agreements for the Rutgeerts scores were moderate (kappa 0.57) and good (kappa 0.67) in the two respective groups. The luminal CD score agreements were very good and good for the experts and non-experts, respectively. The ICC values for the SES-CD were 0.93 and 0.68 in the two respective groups, and the ICC values for the CDEIS were 0.83 and 0.67, respectively. The agreement measure and median coefficient of variation results for the different scoring systems for the different pairs of observations are reported in detail in Table 1, and a graphical depiction of the agreement results is presented in Fig. 1.
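The text does not state which form of the ICC was computed for the CDEIS and SES-CD. Purely as an illustration, the sketch below implements one common choice, the two-way, absolute-agreement, single-rater coefficient (ICC(A,1) in the notation of McGraw and Wong [33]), from a clips-by-raters table of total scores. The choice of this form, the function name, and the example SES-CD totals are all assumptions, not details taken from the study.

```python
def icc_absolute_agreement(scores):
    """Two-way, absolute-agreement, single-rater ICC for a table with one row
    per clip and one column per rater (every rater scores every clip)."""
    n = len(scores)        # number of clips
    k = len(scores[0])     # number of raters
    if n < 2 or k < 2 or any(len(row) != k for row in scores):
        raise ValueError("need a complete table with >= 2 clips and >= 2 raters")

    grand_mean = sum(map(sum, scores)) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Two-way ANOVA sums of squares (one observation per clip-rater cell).
    ss_rows = k * sum((m - grand_mean) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand_mean) ** 2 for m in col_means)
    ss_total = sum((x - grand_mean) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    denominator = ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    if denominator == 0:
        raise ValueError("ICC is undefined when all scores are identical")
    return (ms_rows - ms_error) / denominator


# Hypothetical SES-CD totals: four clips (rows) scored by three raters (columns).
example_scores = [
    [3, 4, 3],
    [12, 10, 13],
    [22, 25, 21],
    [7, 7, 9],
]
example_icc = icc_absolute_agreement(example_scores)
print(f"ICC = {example_icc:.2f}")
```

An absolute-agreement form penalises raters who score systematically higher or lower than their colleagues, which seems the relevant notion when a numeric cut-off on the total score drives a treatment decision; a consistency form would ignore such offsets.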

Table 1 Agreement measures: the Fleiss kappa values and intra-class correlation coefficients with 95% confidence intervals are reported for the expert and non-expert subgroups for the analysed endoscopic scores. Additionally, the medians of the coefficients of variation with the 95% CIs are reported for each score and each group of observations.

                                    Experts                  Non-experts
Score (measure)                     Value     95% CI         Value     95% CI
Mayo endoscopic subscore (kappa)    0.53      0.47–0.56      0.71      0.67–0.76
Mayo endoscopic subscore (CV)       31.49%    28.84–33.43    22.36%    21.27–23.45
Rutgeerts score (kappa)             0.57      0.51–0.65      0.67      0.60–0.72
Rutgeerts score (CV)                20.76%    18.18–22.81    11.31%    11.28–15.38
CDEIS (ICC)                         0.83      0.54–1.00      0.67      0.36–0.97
CDEIS (CV)                          24.68%    21.21–27.13    31.48%    29.34–34.34
SES-CD (ICC)                        0.93      0.76–1.00      0.68      0.35–0.97
SES-CD (CV)                         17.91%    16.35–18.66    31.61%    28.92–35.75

CDEIS, Crohn's disease endoscopic index of severity; SES-CD, simple endoscopic score for Crohn's disease; CV, coefficients of variation; ICC, intra-class correlation coefficients; CI, confidence intervals.
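The CV rows in Table 1 are described as medians of coefficients of variation. One plausible reading, assumed here rather than confirmed by the text, is that a CV (sample standard deviation divided by the mean, expressed as a percentage) is computed across observers for each clip, and the median is then taken across clips. A minimal Python sketch of that reading, with hypothetical function names and scores:

```python
from statistics import mean, median, stdev


def coefficient_of_variation(values):
    """CV in percent for one clip: sample SD of the observers' scores
    divided by their mean. Returns None when the mean is zero."""
    m = mean(values)
    if m == 0:
        return None  # CV is undefined, e.g. when every observer scores 0
    return 100.0 * stdev(values) / m


def median_cv(scores_by_clip):
    """Median of the per-clip CVs, skipping clips where the CV is undefined."""
    cvs = [coefficient_of_variation(clip) for clip in scores_by_clip]
    return median(cv for cv in cvs if cv is not None)


# Hypothetical Rutgeerts grades (i0-i4 coded 0-4): three clips, five observers.
example_clips = [
    [2, 2, 3, 2, 2],
    [4, 4, 3, 4, 4],
    [1, 2, 1, 1, 2],
]
print(f"median CV = {median_cv(example_clips):.1f}%")
```

Because the CV divides by the mean, clips with low scores inflate it even when observers differ by a single grade, so CVs for a 0 to 3 or 0 to 4 categorical score and for a summed score such as the SES-CD are not directly comparable in scale.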


Fig. 1 Graphic representation of agreements between endoscopic scores in both “expert” and “non-expert” gastroenterologist subgroups: Fleiss kappa values with 95% confidence intervals are reported for the Mayo endoscopic subscore and Rutgeerts score; intraclass correlation coefficients with 95% CIs are reported for the simple endoscopic score for Crohn's disease and the Crohn's disease endoscopic index of severity. CI, confidence interval; Mayo, Mayo endoscopic subscore; Rutgeerts, Rutgeerts score; ICC, intraclass correlation coefficients; SES-CD, simple endoscopic score for Crohn's disease; CDEIS, Crohn's disease endoscopic index of severity.

Comparisons of the scoring performances between the experts and non-experts revealed that the Mayo (p < 0.0001 for both the coefficient of variation and the kappa statistic) and Rutgeerts (p < 0.005 and p = 0.0001 for the coefficient of variation and kappa statistic, respectively) scores were significantly more uniformly assessed by the non-experts. In contrast, the assessments of the analytic scores (CDEIS and SES-CD) by the experts were more homogeneous than those of the non-experts (p < 0.0001 for both scores).

The coefficient of variation was significantly lower for the SES-CD than the CDEIS (p < 0.0001) among the experts, and no difference was observed among the non-experts (p = 0.6171).

4. Discussion

Endoscopic activity represents a relevant therapeutic goal, and several studies support the prognostic relevance of severe endoscopic lesions [5], [6], [9], and [11]. Recently, the increasing use of immunomodulators, including anti-TNF agents, has led to several lines of evidence that support the clinical relevance of complete or partial healing of the lesions of patients with IBD [7], [12], [13], [15], [16], [17], and [18]. The reliability and reproducibility of endoscopic scores are becoming a relevant issue in IBD because inter-individual differences in endoscopic scoring might result in differences in clinical decision-making.

Examination of the agreement of scores based on recorded endoscopic videos was chosen as the strategy to achieve the purposes of this study. Absolute agreements based on live endoscopic findings might be somewhat different, but we considered the use of recorded videos to be closest to a situation of central reviewing. Of course, different settings might potentially have led to slightly different results, and this limitation should be kept in mind when attempting to generalise the results of this study. Moreover, this study required different modalities of voting for the experts and non-experts for practical reasons, and this difference may have implications for the generalisation of our results; the experts provided scores via an electronic pad, but the non-experts provided scores on sheets of paper. However, discussion of the voting was not permitted in either case, and similar results should be expected from the two voting modalities.

Regarding UC, the reproducibility of the Mayo endoscopic subscore has formally been explored [29] and [30]. One study [29] found adequate inter- and intra-observer agreements for the experts, but the agreements were markedly lower when trainees were involved in the endoscopic scoring. Subsequent observations made in the past few years [27], [28], [34], and [35] suggest that the agreements of the Mayo endoscopic subscore may be suboptimal, but the individual components and elemental lesions, which are scored on a grading system, may produce greater agreement. Our results support this notion and confirm that the overall agreement of the Mayo endoscopic scores may be an issue. This observation led to proposals of two new endoscopic scores for UC: the ulcerative colitis endoscopic index of severity (UCEIS) [28] and [34], and the ulcerative colitis colonoscopic index of severity (UCCIS) [27] and [35]. Both of these scores have been shown to produce increased inter-observer agreement, although they share some limitations: they are more complex than the Mayo endoscopic subscore, they are not yet widely used in clinical practice, and their sensitivities for post-treatment variations are currently undefined. In a more recent trial [30], the intra-observer agreement of expert central reviewers was excellent, and the inter-observer agreements were also found to be adequate-to-optimal for the Mayo subscore, the UCEIS and a visual-analogue scale summarising endoscopic activity. These latter observations suggest that the goal of good agreement can be reached when central review is offered. Finally, the current regulatory definitions for the approval of new drugs for UC include endoscopic healing among the secondary endpoints for trials [36]. The findings of the present study confirm that the Mayo score might produce sub-optimal inter-observer agreements (0.53–0.71).
The present observations suggest that educational efforts are needed to increase the knowledge about and familiarity with the scoring system among users. It is conceivable that the use of more analytical scores, such as the UCEIS [28] or the UCCIS [27], might produce better inter-observer agreements in the future, but external evaluations of the agreements of these scores should be considered. Further factors that might have affected our results include the limited number of videos assessed (which might have led to under- or over-estimation because the interpretation of some of the selected videos may have been more controversial) and the methodology for the assessment of the videos, which was different from the assessment of live endoscopies. To overcome the former issue, further validation on a larger sample of videos is planned and will reduce the biases that were due to the selection of endoscopic clips as far as possible. Regarding the latter issue, the evaluations of endoscopic videos or still images remain surrogates for the evaluation of real endoscopies, but the assessment of endoscopic clips seems to be the best currently available strategy, and it is also the strategy that is employed when central review is needed [13], [14], [27], [29], [30], [31], [34], and [37].

Regarding postoperative CD, the degree of endoscopic disease recurrence is currently assessed via the Rutgeerts score [5], which has been available since the early 1990s. This score is commonly used both in clinical trials [8], [38], [39], [40], [41], [42], [43], and [44] and clinical practice [1] and [3]. The therapeutic implications of finding more advanced endoscopic lesions following surgical resection are relevant to clinical practice and the prognoses of the patients [5], [8], and [44]. For example, a Rutgeerts score greater than or equal to i2 has been shown to be significantly worse than scores of i0 or i1 in terms of recurrence at the anastomotic site. Nonetheless, the Rutgeerts score has not previously been formally validated, and its reproducibility has not been tested until now. In our study, the Rutgeerts score exhibited a good inter-observer agreement with kappa values close to or slightly greater than 0.60 among the experts and non-experts. Additionally, we believe that educational efforts are needed for this scoring system because differences in scoring may lead to differences in treatment; for example, asymptomatic patients with lesions are allowed to step up their maintenance treatments, while patients in whom no lesions are detected continue their previous treatments [3]. Recently, a clinical trial [8] compared a treatment algorithm that is based on endoscopic features to a symptom-driven therapeutic strategy; final results are not yet available because this trial is still underway.

Rather unexpectedly, we observed inter-observer agreements (as measured by the mean intra-class correlation coefficients) that were good to very good for more complex and analytical scores [25] and [26] regardless of whether the observers were experts. For these scores, the experts exhibited borderline-significantly higher agreements than did the non-experts; however, both groups exceeded 0.60, which indicates that the agreements were good. Substantial variations in CDEIS or SES-CD values following treatments (i.e., 50% reductions compared to baseline) have also been shown to be associated with significantly different clinical outcomes [45]. Because the evaluation of endoscopic activity is going to become a relevant issue for luminal CD and because therapeutic algorithms that are based on the degree of amelioration of a given treatment might be proposed in the near future, the improvement of the knowledge and agreement of the definitions for the scoring of the SES-CD and CDEIS is an important goal. Regarding the stenosis variable, which is considered in both the CDEIS and SES-CD, it should be remembered that the assessment of this variable on endoscopic videos is extremely subjective compared to the impressions that might arise during the procedure that result from the combination of the endoscopic pattern and the difficulty experienced in the attempt to pass the narrowed lumen. When drivers of disagreement in the assessments of the CDEIS and SES-CD have been analysed, the variable ‘stenosis’ has been suggested to be one of the most important [31]. This issue may be overcome by simply using ‘assessment rules’ that might reduce inter-observer variability and need to be shared between different observers prior to assessment.

Our data suggest that the agreements in endoscopic scoring might be lower than previously thought, even among experts. Moreover, scoring endoscopic activity with different methods might have substantial effects on the management of patients, and these effects might increase in the future because data on endoscopic healing will be used as the basis for newer treatment algorithms. We believe that every effort to minimise inter-observer disagreement should be undertaken, including but not limited to educational efforts, central reviewing systems [30] and [31], and electronically aided reporting systems. The higher levels of agreement for the CDEIS and SES-CD might be attributable to the intrinsic characteristics of luminal Crohn's scores, which are the results of the sums of single subscores that are attributed to individual lesions. In contrast, the agreements for the categorical scores (Mayo and Rutgeerts) might be lower because different types of lesions are considered within each degree of endoscopic severity.

Our data suggest that, in the near future when clinical trials that include endoscopic endpoints are planned, endoscopic recording and central reviewing are needed to limit the differences to the fullest extent possible. Regarding this issue, our data appear to support the notion that the SES-CD and CDEIS exhibit better agreements than do the Mayo and Rutgeerts scores. If our results are confirmed in larger and independent series, potential solutions include the use of different scoring systems that are more analytical (i.e., the UCEIS and UCCIS for ulcerative colitis) and focusing particular attention on how endoscopic activity data are recorded. In daily practice, it might be useful to register every procedure and to release the entire video to the patients as has been done for a long time for radiological and ultrasonographic examinations. This procedure might also reduce the need for repeated endoscopic examinations, which can occur when patients move between different centres, and potentially reduce disagreements in the clinical and endoscopic activities that are reported by different physicians.

Finally, the SES-CD and CDEIS were found to have similar and quite optimal inter-observer agreements, as evidenced by intra-class correlation coefficients above 0.80 when scored by experts. Comparison of the coefficients of variation of the scores of the experts (but not for the non-experts) revealed that the SES-CD exhibited significantly less variation than did the CDEIS. Based on this result, no recommendation can be given regarding which scoring system is preferable, and clinicians should thus use the system they are most familiar with because both had very good inter-observer agreements.

Conflict of interest

None declared.

Appendix A. Supplementary data

The following are the supplementary data to this article:


Supplementary Table S1 Characteristics of 14 ‘expert’ observers; in the final row characteristics are summarized as median (years practicing endoscopy, number of procedures performed yearly and number of IBD patients in care at the center) or percentage of cases in the most represented modality (type of referral center and previous experience with endoscopic scores).


  • [1] A. Dignass, G. Van Assche, J.O. Lindsay, et al. The second European evidence-based consensus on the diagnosis and management of Crohn's disease: current management. Journal of Crohn's and Colitis. 2010;4:28-62
  • [2] A. Dignass, J.O. Lindsay, A. Sturm, et al. Second European evidence-based consensus on the diagnosis and management of ulcerative colitis. Part 2: Current management. Journal of Crohn's and Colitis. 2012;6:991-1030
  • [3] V. Annese, M. Daperno, M.D. Rutter, et al. European evidence based consensus for endoscopy in inflammatory bowel disease. Journal of Crohn's and Colitis. 2013;7:982-1018
  • [4] F. Magro, C. Langner, A. Driessen, et al. European consensus on the histopathology of inflammatory bowel disease. Journal of Crohn's and Colitis. 2013;7:827-851
  • [5] P. Rutgeerts, K. Geboes, G. Vantrappen, et al. Predictability of the postoperative course of Crohn's disease. Gastroenterology. 1990;99:956-963
  • [6] M. Allez, M. Lemann, J. Bonnet, et al. Long term outcome of patients with active Crohn's disease exhibiting extensive and deep ulcerations at colonoscopy. American Journal of Gastroenterology. 2002;97:947-953
  • [7] K.F. Froslie, J. Jahnsen, B.A. Moum, et al. Mucosal healing in inflammatory bowel disease: results from a Norwegian population-based cohort. Gastroenterology. 2007;133:412-422
  • [8] E.K. Wright, P.P. De Cruz, M.A. Kamm, et al. Intestinal resection in Crohn's disease is associated with significant and durable improvement in health related quality of life although to a lesser extent in women and smokers. Results from the POCER study. Journal of Crohn's and Colitis. 2014;:DOP086
  • [9] F. Carbonnel, A. Lavergne, M. Lémann, et al. Colonoscopy of acute colitis. A safe and reliable tool for assessment of severity. Digestive Diseases and Sciences. 1994;39:1550-1557
  • [10] S.P. Travis, J.M. Farrant, C. Ricketts, et al. Predicting outcome in severe ulcerative colitis. Gut. 1996;38:905-910
  • [11] M. Daperno, R. Sostegni, N. Scaglione, et al. Outcome of a conservative approach in severe ulcerative colitis. Digestive and Liver Disease. 2004;36:21-28
  • [12] I.C. Solberg, I. Lygren, J. Jahnsen, et al. Clinical course during the first 10 years of ulcerative colitis: results from a population-based inception cohort (IBSEN Study). Scandinavian Journal of Gastroenterology. 2009;44:431-440
  • [13] P. Rutgeerts, R.H. Diamond, M. Bala, et al. Scheduled maintenance treatment with infliximab is superior to episodic treatment for the healing of mucosal ulceration associated with Crohn's disease. Gastrointestinal Endoscopy. 2006;63:433-442 quiz 464
  • [14] P. Rutgeerts, G. Van Assche, W.J. Sandborn, et al. Adalimumab induces and maintains mucosal healing in patients with Crohn's disease: data from the EXTEND trial. Gastroenterology. 2012;142:1102–11.e2
  • [15] X. Hebuterne, M. Lemann, Y. Bouhnik, et al. Endoscopic improvement of mucosal lesions in patients with moderate to severe ileocolonic Crohn's disease following treatment with certolizumab pegol. Gut. 2013;62:201-208
  • [16] F. Baert, L. Moortgat, G. Van Assche, et al. Mucosal healing predicts sustained clinical remission in patients with early-stage Crohn's disease. Gastroenterology. 2010;138:463-468 quiz e10–1
  • [17] S. Ardizzone, A. Cassinotti, P. Duca, et al. Mucosal healing predicts late outcomes after the first course of corticosteroids for newly diagnosed ulcerative colitis. Clinical Gastroenterology and Hepatology. 2011;9:48–9.e3
  • [18] J.F. Colombel, P. Rutgeerts, W. Reinisch, et al. Early mucosal healing with infliximab is associated with improved long-term clinical outcomes in ulcerative colitis. Gastroenterology. 2011;141:1194-1201
  • [19] P. Rutgeerts, W.J. Sandborn, B.G. Feagan, et al. Infliximab for induction and maintenance therapy for ulcerative colitis. New England Journal of Medicine. 2005;353:2462-2476
  • [20] W.J. Sandborn, G. van Assche, W. Reinisch, et al. Adalimumab induces and maintains clinical remission in patients with moderate-to-severe ulcerative colitis. Gastroenterology. 2012;142:257–65.e1–3
  • [21] W.J. Sandborn, B.G. Feagan, C. Marano, et al. Subcutaneous golimumab maintains clinical response in patients with moderate-to-severe ulcerative colitis. Gastroenterology. 2014;146:96–109.e1
  • [22] W.J. Sandborn, B.G. Feagan, C. Marano, et al. Subcutaneous golimumab induces clinical response and remission in patients with moderate-to-severe ulcerative colitis. Gastroenterology. 2014;146:85-95 quiz e14–5
  • [23] B.G. Feagan, P. Rutgeerts, B.E. Sands, et al. Vedolizumab as induction and maintenance therapy for ulcerative colitis. New England Journal of Medicine. 2013;369:699-710
  • [24] K.W. Schroeder, W.J. Tremaine, D.M. Ilstrup. Coated oral 5-aminosalicylic acid therapy for mildly to moderately active ulcerative colitis. A randomized study. New England Journal of Medicine. 1987;317:1625-1629
  • [25] J.Y. Mary, R. Modigliani. Development and validation of an endoscopic index of the severity for Crohn's disease: a prospective multicentre study, Groupe d’Etudes Therapeutiques des Affections Inflammatoires du Tube Digestif (GETAID). Gut. 1989;30:983-989
  • [26] M. Daperno, G. D’Haens, G. Van Assche, et al. Development and validation of a new, simplified endoscopic activity score for Crohn's disease: the SES-CD. Gastrointestinal Endoscopy. 2004;60:505-512
  • [27] S. Samuel, D.H. Bruining, E.V. Loftus Jr., et al. Validation of the ulcerative colitis colonoscopic index of severity and its correlation with disease activity measures. Clinical Gastroenterology and Hepatology. 2013;11:49–54.e1
  • [28] S.P. Travis, D. Schnell, P. Krzeski, et al. Reliability and initial validation of the ulcerative colitis endoscopic index of severity. Gastroenterology. 2013;145:987-995
  • [29] T. Osada, T. Ohkusa, T. Yokoyama, et al. Comparison of several activity indices for the evaluation of endoscopic activity in UC: inter- and intraobserver consistency. Inflammatory Bowel Diseases. 2010;16:192-197
  • [30] B.G. Feagan, W.J. Sandborn, G. D’Haens, et al. The role of centralized reading of endoscopy in a randomized controlled trial of mesalamine for ulcerative colitis. Gastroenterology. 2013;145:149–57.e2
  • [31] R. Khanna, G. Zou, G. D’Haens, et al. Agreement among central readers in the evaluation of endoscopic disease activity in Crohn's disease. Journal of Crohn's and Colitis. 2014;8(Suppl. 1):S13-S14
  • [32] J.L. Fleiss, B. Levin, M.C. Paik. The measurement of interrater agreement. Statistical methods for rates and proportions 3rd ed. (John Wiley & Sons, Inc., Hoboken, NJ, USA, 2004) 10.1002/0471445428.ch18
  • [33] K.O. McGraw, S.P. Wong. Forming inferences about some intraclass correlation coefficients. Psychological Methods. 1996;1:30-46
  • [34] S.P. Travis, D. Schnell, P. Krzeski, et al. Developing an instrument to assess the endoscopic severity of ulcerative colitis: the Ulcerative Colitis Endoscopic Index of Severity (UCEIS). Gut. 2012;61:535-542
  • [35] K.T. Thia, E.V. Loftus Jr., D.S. Pardi, et al. Measurement of disease activity in ulcerative colitis: interobserver agreement and predictors of severity. Inflammatory Bowel Diseases. 2011;17:1257-1264
  • [36]
  • [37] J.F. Colombel, W.J. Sandborn, W. Reinisch, et al. Infliximab, azathioprine, or combination therapy for Crohn's disease. New England Journal of Medicine. 2010;362:1383-1395
  • [38] C. Florent, A. Cortot, P. Quandale, et al. Placebo-controlled clinical trial of mesalazine in the prevention of early endoscopic recurrences after resection for Crohn's disease, Groupe d’Etudes Therapeutiques des Affections Inflammatoires Digestives (GETAID). European Journal of Gastroenterology and Hepatology. 1996;8:229-233
  • [39] K. Ewe, T. Bottger, H.J. Buhr, et al. Low-dose budesonide treatment for prevention of postoperative recurrence of Crohn's disease: a multicentre randomized placebo-controlled trial, German Budesonide Study Group. European Journal of Gastroenterology and Hepatology. 1999;11:277-282
  • [40] S.B. Hanauer, B.I. Korelitz, P. Rutgeerts, et al. Postoperative maintenance of Crohn's disease remission with 6-mercaptopurine, mesalamine, or placebo: a 2-year trial. Gastroenterology. 2004;127:723-729
  • [41] P. Rutgeerts, G. Van Assche, S. Vermeire, et al. Ornidazole for prophylaxis of postoperative Crohn's disease recurrence: a randomized, double-blind, placebo-controlled trial. Gastroenterology. 2005;128:856-861
  • [42] G.R. D’Haens, S. Vermeire, G. Van Assche, et al. Therapy of metronidazole with azathioprine to prevent postoperative recurrence of Crohn's disease: a controlled randomized trial. Gastroenterology. 2008;135:1123-1129
  • [43] W. Reinisch, S. Angelberger, W. Petritsch, et al. Azathioprine versus mesalazine for prevention of postoperative clinical recurrence in patients with Crohn's disease with endoscopic recurrence: efficacy and safety results of a randomised, double-blind, double-dummy, multicentre trial. Gut. 2010;59:752-759
  • [44] A. Orlando, F. Mocciaro, S. Renna, et al. Early post-operative endoscopic recurrence in Crohn's disease patients: data from an Italian Group for the study of inflammatory bowel disease (IG-IBD) study on a large prospective multicenter cohort. Journal of Crohn's and Colitis. 2014;10.1016/j.crohns.2014.02.010 pii: S1873-9946(14)00059-2 [Epub ahead of print]
  • [45] M. Ferrante, J.F. Colombel, W.J. Sandborn, et al. Validation of endoscopic activity scores in patients with Crohn's disease based on a post hoc analysis of data from SONIC. Gastroenterology. 2013;145:978–86.e5


a Gastroenterology Unit, Mauriziano Hospital Turin, Italy

b Bolzano Hospital, Bolzano, Italy

c Cancer Research and Cure Institute “Casa Sollievo Sofferenza”, San Giovanni Rotondo (FG), Italy

d University Tor Vergata, Rome, Italy

e University Hospital “Careggi”, Florence, Italy

f University Hospital “L. Sacco”, Milan, Italy

g “San Camillo-Forlanini” Hospitals, Rome, Italy

h Hospital “Cardarelli”, Naples, Italy

i Catholic University “S. Cuore”, Columbus Integrated Complex, Rome, Italy

j “Sandro Pertini” Hospital, Rome, Italy

k University of Bologna, “S. Orsola Malpighi” Hospital, Bologna, Italy

l University of Padua, Padua, Italy

m United Hospitals “Villa Sofia-Cervello”, Palermo, Italy

* Corresponding author at: Gastroenterology Unit, Mauriziano Hospital, Corso Re Umberto 109, I-10128 Turin, Italy. Tel.: +39 011 5082534; fax: +39 011 5082536.