The Evolution of Gleason Grading of Prostate Cancer

Aquesta Pathology, Brisbane, Queensland, Australia; University of Queensland Faculty of Medicine, Brisbane, Queensland, Australia; Department of Pathology and Molecular Medicine, Wellington School of Medicine and Health Sciences, University of Otago, Wellington, New Zealand; Department of Oncology-Pathology, Karolinska Institute, Stockholm, Sweden; Trillium Health Partners and University of Toronto, Mississauga, ON, Canada; and Wesley Hospital, Brisbane, Queensland, Australia

More than 50 years ago, Dr Donald Floyd Gleason created a unique system for the grading of prostate cancer. This grading system was based on histological findings from needle biopsies, transurethral resections and radical prostatectomy specimens of 270 patients enrolled in a study conducted by the Veterans Administration Co-operative Urological Research Group (VACURG). The majority of patients presented with extraprostatic (stage 3) disease, while almost 40% had metastatic disease. [1] A unique feature of the criteria proposed by Gleason was that grading was based entirely upon tumour architecture. [1] In contrast to systems for grading many other cancers, as well as those then in use for prostate cancer grading, cytological atypia was not considered to be a component of grading. [2][3][4][5] Another aspect of the Gleason system that differed from other systems was that, rather than focusing upon the highest grade, the grade representing the largest area of cancer and the grade representing the second largest area were added to give a score upon which patient management was based.
Five grading patterns were proposed. Grade 1 was defined as closely packed uniform glands forming well-circumscribed nodules. Grade 2 tumours were similar but the nodules were less well circumscribed, consisting of well-differentiated glands with variability of size and shape, while some cribriform patterns were permitted. Grade 3 was composed of infiltrating well-differentiated glands; however, this could also include cribriform glands, single cells and cords, and masses of cells. Grade 4 was defined as a diffuse growth of large polygonal cells resembling clear cell carcinoma of the kidney, while grade 5 tumours consisted of undifferentiated carcinoma with little or no glandular differentiation.
The Gleason system has been validated using cancer-specific mortality as the end point. [6] This early study found that both the primary and secondary patterns were important, with survival of patients having two tumour patterns falling between that expected for each individual pattern.

Changes to Gleason grading by Gleason
In 1974 Gleason made several changes to his grading system based on a larger study population of 1032 patients. While no changes were made to grade 1 criteria, those for grades 2-5 were significantly amended. Specifically, the presence of cribriform glands was now considered a feature of grade 3, rather than grade 2, tumours. Cribriform patterns in grade 3 were described as variable in size and could be large and infiltrating. In addition, papillary architecture was added to the features of grade 3 tumours. Pattern 4 was expanded to include raggedly infiltrating fused glands as well as coalescing and branching glands. Single cells were no longer included in the features of grade 3 tumours, but were now classified as grade 5. Grade 5 tumours also included carcinomas with comedonecrosis, signet ring cells, and nests and sheets of cells without a glandular architecture. [7] Using these improved criteria, and following recommendations that grading be performed at low magnification using a ×4 or ×10 objective, Gleason found that lower grade tumours were commonly low stage and that higher grade tumours were commonly high stage at presentation. Interobserver reproducibility was found to be 50%, rising to approximately 85% when agreement within ±1 grade was accepted. [7]

Gleason score groups (lumping/grouping of scores)
The establishment of 5 grades and 9 scores in the Gleason system was considered necessary to accommodate the heterogeneity and the variety of architectural patterns characteristic of prostate cancer. It became clear over time that the complexity of the grading system hindered survival analysis. In particular it was considered, for the purposes of research, that a smaller number of prognostic groups should be established. In order to reduce the number of grading/scoring categories, groupings designated low, intermediate and high grade were often utilized, although the challenge for most researchers was to decide which scores belonged to each category.
In 1977 Gleason commented on the "common practice" among research groups of "lumping" Gleason scores in an attempt to increase statistical significance in their studies. He criticized the use of score groups 2-4, 5-7 and 8-10 as he considered that this resulted in loss of useful clinical information, and that the scores within the middle group had significantly different outcomes. He suggested that separating the groups according to Gleason scores 2-5, 6, 7 and 8-10 would be a clinically valid alternative. It was considered that score groups 2-6 and 7-10 were also useful in distinguishing between prognostic groups. [8][9] Subsequently, others have used different combinations of Gleason scores as prognostic groups for the purpose of determining appropriate treatment options. [10][11][12][13][14][15][16][17][18] Some of these score groupings consisted of two categories representing low and high grade tumours. Other studies used three categories, representing low, intermediate and high grade, while still others investigated four or five categories for prognostic significance.
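The effect of the different "lumping" schemes described above can be illustrated with a minimal sketch. The scheme names, the `SCHEMES` table and the `score_group` function are hypothetical labels introduced here for illustration only; the score ranges themselves are those quoted in the text.

```python
# Score groupings quoted in the text; the dictionary keys are illustrative names only.
SCHEMES = {
    "criticized_three_tier": [(2, 4), (5, 7), (8, 10)],
    "gleason_suggested": [(2, 5), (6, 6), (7, 7), (8, 10)],
    "two_tier": [(2, 6), (7, 10)],
}


def score_group(score, scheme):
    """Return the label of the score group that contains a Gleason score (2-10)."""
    for low, high in SCHEMES[scheme]:
        if low <= score <= high:
            return str(low) if low == high else f"{low}-{high}"
    raise ValueError(f"Gleason score out of range: {score}")


# Scores 6 and 7 fall into the same group under the criticized scheme,
# but are kept apart under the grouping that Gleason suggested.
for s in (6, 7):
    print(s, score_group(s, "criticized_three_tier"), score_group(s, "gleason_suggested"))
```

Under the three-tier scheme that Gleason criticized, scores 5, 6 and 7 share a single category, which is the loss of information he objected to.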

Why change a "perfectly good" grading system?
Despite the attempted establishment of numerous other grading systems for prostate cancer, Gleason grading is the only system that has achieved worldwide acceptance and has remained in usage for more than fifty years. Despite this longevity it is evident that the system has problems. The diagnosis and management of prostate cancer have changed significantly over the last 50 years. In particular, prostate-specific antigen (PSA) testing and screening have become available, resulting in early detection of prostate cancer. The method of taking prostate needle biopsies has also changed. Whereas previously one or two thick gauge needle biopsies were used to sample a palpable abnormality, more recently sampling of multiple areas is performed using thin core biopsies. In addition, different methods are now used to optimize cancer detection, including multiparametric magnetic resonance imaging (MRI)/transrectal ultrasound (TRUS)-guided biopsies and transperineal biopsies. The methods of treatment have also changed dramatically, with a high proportion of patients receiving either curative treatment or active surveillance.
In addition to changes to the diagnosis and management of prostate cancer over recent years, it became clear that significant amendments to Gleason grading criteria were necessary. In particular, the appropriateness of including cribriform glands as a feature of Gleason pattern 3 has been questioned. During this period it also became apparent that some pathologists were not strictly adhering to Gleason's grading rules. In view of these developments it was widely considered that some evolution of Gleason grading was necessary for it to remain relevant to current practice. The necessity to amend Gleason grading and to adapt it to modern usage was embraced by the International Society of Urological Pathology (ISUP). In 2005 the ISUP convened a consensus conference in San Antonio, Texas, USA in order to re-formulate prostate cancer grading.

2005 ISUP modifications to Gleason grading
The 2005 ISUP Consensus Conference was attended by 52 invited international urological pathology experts, with decisions being reached through discussion and voting. [19] As a result of the meeting, major changes were agreed upon, although the resulting 2005 Modified Gleason System (MGS) classification was still largely based upon Gleason's original recommendations. It was agreed that Grade 1 cancers represented either non-malignant conditions or inadequately sampled tumours of higher grade, and as a consequence there was consensus that this grade should not be used. It was also agreed that while Grade 2 cancer may be found in the transition zone in resection specimens, this grade should not be applied to needle biopsies. The consequence of this is that GS 1+1, 1+2, 2+1 and 2+2 cancers should not be diagnosed in these specimens. It was decided that cribriform glands, other than those that are small, round and uniform with regular round lumens, should be classified as grade 4. An additional pattern, that of poorly formed glands, was added to the criteria of grade 4. The method of Gleason scoring in needle biopsies was significantly altered. Instead of summing the most common and the second most common grade to derive a score, the most common grade and the highest grade, no matter how small, were added to give the GS. In contrast, if the secondary pattern was of a lower grade and accounted for <5% of the tumour, it was excluded from the GS.
In contrast to the conventional Gleason classification, grading of variants was recommended and for most variants it was agreed that this should be based upon tumour architecture, ignoring cytologic changes. One exception to this was mucinous adenocarcinoma, for which consensus was not achieved regarding a preferred grading method. The MGS has been validated in several studies which have shown a better correlation between needle biopsy and radical prostatectomy GS, as well as with pT staging category and biochemical recurrence-free survival, than the conventional GS. [20][21][22] Despite this, one study with nadir PSA as the clinical end point found that both GS and MGS were of prognostic significance and that conventional GS outperformed MGS in needle biopsies. [23]
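The needle biopsy scoring rule described above can be sketched in code. This is a minimal illustration, not a reporting tool: the function name `biopsy_gleason_score` and its input format (a mapping from Gleason pattern to percentage of tumour) are hypothetical, ties in extent are broken toward the higher grade as an assumption, and it is assumed that when a minor lower-grade pattern is excluded the primary pattern is doubled to give the score.

```python
def biopsy_gleason_score(pattern_percent):
    """Return (primary, secondary) patterns for a needle biopsy under the 2005 rule.

    pattern_percent maps a Gleason pattern (3, 4 or 5) to its percentage of the tumour.
    """
    present = {g: p for g, p in pattern_percent.items() if p > 0}
    if not present:
        raise ValueError("no tumour pattern supplied")

    # Primary pattern: the most common grade (assumed tie-break: higher grade wins).
    primary = max(present, key=lambda g: (present[g], g))
    highest = max(present)

    if highest > primary:
        # Any higher grade, no matter how small, becomes the secondary pattern.
        secondary = highest
    else:
        # A lower-grade secondary pattern counts only if it occupies at least 5%.
        lower = {g: p for g, p in present.items() if g != primary and p >= 5}
        # Assumption: with no qualifying secondary pattern the primary is doubled.
        secondary = max(lower, key=lambda g: (lower[g], g)) if lower else primary
    return primary, secondary


primary, secondary = biopsy_gleason_score({3: 60, 4: 30, 5: 10})
print(f"GS {primary}+{secondary} = {primary + secondary}")
```

In this example the small pattern 5 component displaces the more extensive pattern 4 as the secondary pattern, which is the key difference from the original practice of summing the two most common grades.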

ISUP grading
By 2014 it had become clear, due to the availability of new scientific knowledge, that further amendments to the MGS of 2005 were necessary. It was apparent that, while the amendments would be minor, it was important that they should be undertaken. Timing for this was crucial in view of the imminent updating of the World Health Organisation (WHO) Classification of Tumours of the Urinary System and Male Genital Organs, which was due to be published in early 2016. A further consensus conference was convened under the auspices of the ISUP. In order to facilitate this, an organising committee of 6 expert uropathologists was appointed, and the resulting meeting consisted of 65 uropathologists and 17 urologists and oncologists from 19 countries. [24] The organising committee members presented evidence relating to various questions and these were later voted upon at the conference.
Recent studies have shown that prostate cancer with cribriform morphology behaves as an aggressive cancer. [25][26] It has also been shown that rounded cribriform cancers, previously classified as Gleason grade 3, are extremely rare without associated irregular cribriform glands or other patterns of grade 4. [27] Further, distinguishing these tumours from cribriform grade 4 tumours has been shown to be subjective, with little interobserver reproducibility even amongst experts. [28] From this it was decided that any cribriform cancer would be better designated Gleason grade 4 and this recommendation achieved consensus at the meeting. Similarly, it was agreed that all glomeruloid structures should be considered grade 4 as they were basically cribriform in architecture. The other important modification that achieved consensus at the meeting was that mucinous adenocarcinoma should be graded on the morphology of the underlying architecture and not uniformly considered to be grade 4.
A major feature of the conference was the development of a five-tier grading system based upon Gleason grading. The necessity for this arose from the recommendations of the 2005 MGS, where it was decided that Gleason patterns 1 and 2 should not be diagnosed on needle biopsy, which meant that the lowest GS diagnosable on needle biopsy would be 6. Given that 6 is in the middle of the range of scores 2-10, some patients were left thinking that they had intermediate grade cancer with an intermediate risk of aggressive behaviour. Recent studies had indicated that GS 6 tumours were indolent cancers, with one study even showing that these tumours have no metastatic potential. [29] Following on from Gleason's earlier prognostic group concept, it was suggested that this could be solved through the establishment of groupings of MGS. It was proposed that, as score 6 is the lowest possible score, this would be designated Grade 1, with GS 3+4 as Grade 2, GS 4+3 as Grade 3, GS 4+4, 3+5 or 5+3 as Grade 4 and GS 9-10 as Grade 5. The naming of this "new" grading system was the subject of much discussion. The organising committee had agreed that this would be ISUP Grade since the consensus meeting was convened under the auspices of the ISUP. However, without the prior knowledge of the other organising committee members, one committee member floated the idea that this system should be named after himself. This did not achieve consensus despite two separate votes and subsequently there was unanimous agreement by the ISUP Council that the term ISUP Grade should be applied.
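The mapping from Gleason score to ISUP Grade proposed at the 2014 conference can be written as a short lookup. This is a sketch under stated assumptions: the function name `isup_grade` is hypothetical, and inputs are restricted to patterns 3 to 5 because the text states that patterns 1 and 2 should not be diagnosed on needle biopsy.

```python
def isup_grade(primary, secondary):
    """Map a Gleason score, given as primary and secondary patterns, to ISUP Grade 1-5."""
    # Assumption: only patterns 3-5 are accepted, as for needle biopsy specimens.
    if primary not in (3, 4, 5) or secondary not in (3, 4, 5):
        raise ValueError("patterns must be 3, 4 or 5")

    score = primary + secondary
    if score == 6:
        return 1  # GS 3+3
    if score == 7:
        # The order of the patterns matters for score 7: 3+4 versus 4+3.
        return 2 if primary == 3 else 3
    if score == 8:
        return 4  # GS 4+4, 3+5 or 5+3
    return 5      # GS 9-10


for p, s in [(3, 3), (3, 4), (4, 3), (5, 3), (4, 5)]:
    print(f"GS {p}+{s} -> ISUP Grade {isup_grade(p, s)}")
```

Note that score 7 is the only score for which the order of the two patterns changes the assigned grade.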

Recent investigations have been undertaken to validate ISUP grading as a prognostic parameter for prostate cancer. Separate from and prior to the 2014 ISUP consensus conference, a large multi-institutional study was performed in an attempt to validate the newly proposed prognostic groups. Unfortunately, as the cases in this study were accessioned between 2005 and 2014, there can be no certainty as to which grading criteria were used. This is of particular importance as the cases in this study were not subjected to central review. [30] Subsequent studies, however, have validated the new ISUP grading system, with grading categories being shown to be significantly associated with death [31][32][33] or biochemical failure. [34][35][36][37]

The future of ISUP grading
It is now evident that improvements can be made to the 2014 ISUP grading to better predict patient outcome and to select treatment options. Specific features of the grading system in need of revision relate to ISUP grades containing Gleason pattern 4 carcinoma. An ISUP grade 2 tumour can have <5% to 50% of Gleason grade 4 tumour, with a risk of metastasis and cancer-related death proportionate to the amount of grade 4 tumour present. [38] A consequence of this is that the amount of grade 4 tumour present can influence treatment and, in particular, can be a factor influencing the decision between active surveillance and definitive treatment. Gleason score 4+3 cancer can have 50-95% of Gleason grade 4 and, again, the higher the proportion of grade 4 the worse the prognosis. [39] From these data it is evident that the percentage of Gleason pattern 4 present in needle biopsies should be factored into prostate cancer grading in order to maximize the prognostic information that is available to the clinician. Unfortunately there is currently no evidence to suggest which method should be used to assess the amount of pattern 4 present, and in particular whether this should be core or case based.
Further, it is undecided whether the percentage of a pattern present should be assessed as the area of tumour or the length of tumour within a core. There have also been suggestions that the presence of cribriform cancer be reported separately and distinct from other patterns of grade 4, as this may be associated with a worse outcome. [25][26] A further issue that requires addressing relates to the score groups that constitute the ISUP system. In the current system ISUP grade 4 consists of 4+4, 3+5 and 5+3 tumours. It has been demonstrated that these three categories are different prognostically, with 5+3 cancers, in some cases, appearing to be as aggressive as ISUP grade 5 cancer. [40]
In conclusion, ISUP grading, which is based upon grouped Gleason scores derived from the 2005 MGS, is already in widespread usage. The terminology here is of some importance, and it is inappropriate to label these categories as grade groups as they are groups not of grades but of scores. Clearly, this system in its current form requires further evolution so as to maximize the prognostic information that will more appropriately inform treatment.