Accurate quantification of bone loss facilitates preoperative planning and standardization for research purposes in patients who undergo revision TKA. The most commonly used classification to rate bone defects in this setting, the Anderson Orthopaedic Research Institute classification, does not quantify diaphyseal bone loss, and its reliability has not been well studied.
We developed a new classification scheme to rate bone defects in patients undergoing revision TKA and tested (1) the intraobserver and interobserver reliability of this classification for revision TKA based on preoperative radiographs, and (2) whether additional CT images might improve interobserver reliability.
This was a preregistered observational study. Interobserver reliability was analyzed using preoperative radiographs of 61 patients who underwent (repeat) revision TKA; the bone defects on these radiographs were rated by five experienced orthopaedic surgeons. For intraobserver reliability, ratings were repeated at least 2 weeks after the first rating (Timepoints 1 and 2). Directly after the radiographic assessments of Timepoint 2, the observers were provided with CT images of each patient and asked to rate the bone defects a third time (Timepoint 3) to assess the additional value of CT. Intraobserver and interobserver reliability were tested using Gwet’s agreement coefficient 2, which is a measure of agreement between observers in categorical data. Substantial agreement was defined as a coefficient from 0.61 to 0.80 and almost perfect agreement as a coefficient > 0.80.
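As an illustration of the agreement statistic named above, the following is a minimal sketch of Gwet's weighted agreement coefficient for several raters, written from the standard published formula. It is not the analysis code used in the study. The function names, the use of linear weights, and the assumption that every rater scored every patient are illustrative choices; the weighting scheme used in the study is not stated here. The `interpret` helper encodes only the two thresholds defined above.

```python
from typing import List, Optional, Sequence


def linear_weights(q: int) -> List[List[float]]:
    """Linear agreement weights (an assumed choice): 1 on the diagonal,
    decreasing with the distance between ordered categories."""
    if q < 2:
        raise ValueError("need at least two categories")
    return [[1.0 - abs(k - l) / (q - 1) for l in range(q)] for k in range(q)]


def gwet_ac2(ratings: Sequence[Sequence[int]], q: int,
             weights: Optional[List[List[float]]] = None) -> float:
    """Gwet's agreement coefficient for multiple raters.

    ratings[i][j] is the category (0..q-1) rater j gave to subject i.
    Assumes complete data: every rater rated every subject.
    With identity weights this reduces to the unweighted AC1.
    """
    w = weights if weights is not None else linear_weights(q)
    n = len(ratings)
    r = len(ratings[0])
    if n == 0 or r < 2:
        raise ValueError("need at least one subject and two raters")

    # counts[i][k]: number of raters who placed subject i in category k
    counts = [[list(row).count(k) for k in range(q)] for row in ratings]

    # Observed (weighted) agreement, averaged over subjects
    p_a = 0.0
    for c in counts:
        for k in range(q):
            weighted = sum(w[k][l] * c[l] for l in range(q))
            p_a += c[k] * (weighted - 1.0) / (r * (r - 1))
    p_a /= n

    # Chance agreement, based on the average prevalence of each category
    pi = [sum(c[k] for c in counts) / (n * r) for k in range(q)]
    t_w = sum(sum(row) for row in w)
    p_e = t_w / (q * (q - 1)) * sum(p * (1.0 - p) for p in pi)

    return (p_a - p_e) / (1.0 - p_e)


def interpret(coef: float) -> str:
    """Apply the two thresholds defined in the text; the label for lower
    values is a placeholder, not a category from the study."""
    if coef > 0.8:
        return "almost perfect"
    if coef >= 0.61:
        return "substantial"
    return "below substantial"


# Hypothetical toy data: 4 subjects, 2 raters, 3 ordered defect grades
toy = [[0, 0], [1, 1], [0, 1], [2, 2]]
toy_coef = gwet_ac2(toy, q=3)
toy_label = interpret(toy_coef)
```

On the toy data, the single disagreement is between adjacent grades, so the linear weights give it partial credit; passing an identity matrix as `weights` would instead treat it as a full disagreement and yield a lower coefficient.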
The intraobserver reliability varied between 0.55 (95% CI 0.40 to 0.71) and 0.87 (95% CI 0.78 to 0.96) in the epiphysis, between 0.69 (95% CI 0.58 to 0.80) and 0.98 (95% CI 0.95 to 1) in the metaphysis, and between 0.95 (95% CI 0.90 to 0.99) and 0.99 (95% CI 0.98 to 1) in the diaphysis. At Timepoint 1, the interobserver reliability varied between 0.48 (95% CI 0.39 to 0.57) and 0.49 (95% CI 0.42 to 0.56) in the epiphysis and between 0.81 (95% CI 0.75 to 0.87) and 0.88 (95% CI 0.83 to 0.93) in the metaphysis, and was 0.96 (95% CI 0.93 to 0.99) in the diaphysis. The interobserver reliability at Timepoint 2 was similar to that at Timepoint 1. The addition of CT images (Timepoint 3) did not improve reliability.
The bone defect classification was less reliable in the epiphysis than in the metaphysis and diaphysis. This finding may be explained by prosthetic components obscuring the epiphyseal region or by the more severe bone defects found there. The addition of CT scans did not improve reliability. Further reliability testing with observers from other institutions is necessary, as is validity testing that compares the classification with intraoperative findings.