The policy was originally created in 2011 and has been updated regularly with literature searches of the MedLine database.
The FDA approval of the Heartsbreath™ test was based on the results of a study sponsored by the National Heart, Lung, and Blood Institute, the Heart Allograft Rejection: Detection with Breath Alkanes in Low Levels (HARDBALL) study. (2) The HARDBALL study was a 3-year multicenter study of 1,061 breath samples in 539 heart transplant patients. Prior to scheduled endomyocardial biopsy (EMB), patient breath was analyzed by gas chromatography and mass spectroscopy for volatile organic compounds. The amount of C4-C20 alkanes and monomethyl alkanes was used to derive the marker for rejection known as the breath methylated alkane contour (BMAC). The BMAC results were compared with subsequent biopsy results as interpreted by two readers using the International Society for Heart and Lung Transplantation (ISHLT) biopsy grading system as the "gold standard" for rejection.
The authors of the HARDBALL study reported that the abundance of breath markers of oxidative stress was significantly greater in Grade 0, 1, or 2 rejection than in healthy normal persons, whereas in Grade 3 rejection the abundance of breath markers of oxidative stress was reduced, most likely due to accelerated catabolism of the alkanes and methyl alkanes that make up the BMAC. The authors also reported that, in identifying Grade 3 rejection, the negative predictive value of the breath test (97.2%) was similar to that of EMB (96.7%), and that the breath test could potentially reduce the total number of biopsies performed to assess for rejection in patients at low risk for Grade 3 rejection. The sensitivity of the breath test was 78.6% versus 42.4% with biopsy. However, the breath test had lower specificity (62.4%) and a lower positive predictive value (5.6%) in assessing Grade 3 rejection than biopsy (specificity 97%, positive predictive value 45.2%). In addition, the breath test was not evaluated in Grade 4 rejection.
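The predictive values quoted above depend on how common Grade 3 rejection is among the samples tested, not only on sensitivity and specificity. The following Python sketch illustrates that relationship using Bayes' rule. The sensitivity and specificity are the breath-test figures from the text; the 3% prevalence is a hypothetical value chosen for illustration and is not reported in this policy, so the output should not be expected to reproduce the published predictive values exactly.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from sensitivity, specificity and prevalence via Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Breath-test sensitivity and specificity for Grade 3 rejection (from the text).
SENSITIVITY = 0.786
SPECIFICITY = 0.624
# ASSUMPTION: hypothetical prevalence of Grade 3 rejection, for illustration only.
ASSUMED_PREVALENCE = 0.03

ppv, npv = predictive_values(SENSITIVITY, SPECIFICITY, ASSUMED_PREVALENCE)
print(f"PPV = {ppv:.1%}, NPV = {npv:.1%}")
```

With a rare target condition, even a test with moderate sensitivity yields a low positive predictive value and a high negative predictive value, which is the pattern reported for the breath test.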
Patterns of gene expression for development of the AlloMap™ test were studied in the Cardiac Allograft Rejection Gene Expression Observation (CARGO) study, which included eight U.S. cardiac transplant centers enrolling 650 cardiac transplant recipients encompassing over 5,000 clinical encounters. (3) The study included discovery and validation phases. In the discovery phase, patient blood samples were obtained at the time of EMB, and the expression levels of over 7,000 genes known to be involved in immune responses were assayed and compared to the biopsy results. A subset of 200 candidate genes was identified that showed promise as markers that could distinguish transplant rejection from quiescence, and from there, a panel of 11 genes was selected that could be evaluated using polymerase chain reaction (PCR) assays. A proprietary algorithm is applied to the results of the analysis, producing a single score that considers the contribution of each gene in the panel. The third phase in the development of the AlloMap test was a pivotal validation study designed to further evaluate the algorithm and establish performance characteristics of the test. This phase of the study was prospective and blinded, and enrolled 270 patients.
The results of the CARGO study have now appeared in abstracts presented at the 2005 Annual Meeting of the International Society for Heart and Lung Transplantation. (3) Investigators from the CARGO study reported on gene expression profiling (GEP) of peripheral blood mononuclear cells to identify patients with moderate/severe cardiac allograft rejection. The validation phase of the CARGO study, published in 2006, was prospective, blinded, and enrolled 270 patients. (3) Primary validation was conducted using samples from 63 patients independent from discovery phases of the study and enriched for biopsy-proven evidence of rejection. A prospectively defined test cutoff value of 20 resulted in correct classification of 84% of patients with moderate/severe rejection but just 38% of patients without rejection. Of note, in the “training set” used in the study, these rates were 80% and 59%, respectively. The authors evaluated the 11-gene expression profile on 281 samples collected at 1 year or more from 166 patients representative of the expected distribution of rejection in the target population (and not involved in discovery or validation phases of the study). When a test cutoff of 30 was used, the negative predictive value (no moderate/severe rejection) was 99.6%; however, only 3.2% of specimens had Grade 3 or higher rejection. In this population, Grade 1B scores were found to be significantly higher than Grade 0, 1A, and 2 scores but similar to Grade 3 scores. The sensitivity and specificity for determining quiescent versus early stages of rejection were not addressed.
Additional clinical experience is needed to confirm and extend the current results, and to address several important questions such as the best cutoff value and when to test. In addition, the impact of this test on management decisions and health outcomes is unknown. Some of these issues will be addressed by an ongoing randomized clinical trial, known as the Invasive Monitoring Attenuation through Gene Expression (IMAGE) study, comparing AlloMap molecular testing with traditional biopsy-based surveillance for heart transplant rejection. The IMAGE trial began recruiting subjects in January 2005.
Bernstein and colleagues studied GEP in heart transplant recipients with mild acute rejection and reported that the AlloMap score increased with biopsy grade; specifically, patients with Grade 1B rejection had scores similar to those of Grade 3A (indicating acute rejection), while those with Grade 1A rejection (i.e., focal rejection) had scores similar to those of Grade 0 (no rejection). (4) These findings suggest that patients with Grade 1B rejection (i.e., diffuse rejection) may have more serious sequelae than those with Grade 1A.
In a retrospective study, Eisen and colleagues examined the use of the AlloMap test for longitudinal monitoring of heart transplant rejection in 19 cases. (5) The authors reported that high AlloMap scores clustered with rejection or graft dysfunction.
Starling and colleagues examined the impact of corticosteroids on the AlloMap score and reported that the AlloMap score could distinguish rejection from quiescence and could also be used to monitor the response of patients to steroid treatment. (6)
Nodular endocardial (i.e., not myocardial) infiltrates are known as Quilty lesions; they appear to have minimal clinical implications but overlap in histologic characteristics with Grade 2 histology. Marboe and colleagues compared AlloMap scores in recipients without evidence of acute rejection, with and without Quilty lesions. (7) The authors reported that recipients forming Quilty lesions had a distinct gene expression profile.
No published studies or abstracts were found that examined how either Heartsbreath or AlloMap could be integrated into the management of the patient, either to select or deselect patients for EMB, or potentially replace EMB altogether. Evans and colleagues discussed the economic implications of noninvasive testing for cardiac allograft rejection, based on the assumption that a positive AlloMap test would result in a confirmatory biopsy, while a negative test would permit deferral of a biopsy. (8)
Based on the results of the CARGO study, the authors estimated that during the first post-transplant year, the number of EMBs would be halved, resulting in an aggregate cost savings for all heart transplant patients of 12 million dollars per year. It should be noted that GEP is also under investigation for other inflammatory conditions, such as Crohn’s disease and systemic lupus erythematosus (SLE).
A search of the MedLine database performed through August 2008 did not identify any evidence that would alter the conclusions reached above. A review from the California Technology Assessment Forum concluded that given the post-hoc change in the threshold and the small size of the CARGO primary validation study (reviewed above), “it would be prudent to require independent confirmation of the CARGO study results” before widespread adoption of AlloMap GEP to monitor heart transplant patients occurs. (9) This 2006 California Technology Assessment also noted that there were no studies that compared clinical outcomes of patients monitored with GEP to those of patients monitored with EMB. The design and objectives of the IMAGE trial have been reported; no results are available at this time.
A multicenter work group has published its post-CARGO clinical observations. (10) The group identified a number of factors that can affect AlloMap scores, including the time post-transplant, corticosteroid dosing, and transplant vasculopathy. (10, 11) Scores of 34 and above were considered positive, potentially indicating rejection, whereas scores below that threshold were considered negative with no evidence of rejection. Analysis of data from a number of centers collected post-CARGO showed that, at 1 year or more post-transplantation, an AlloMap threshold of 34 had a positive predictive value of 7.8% for biopsy grades of 3A/2R or higher and a negative predictive value of 100% for AlloMap scores below 34. There is insufficient information in this report to determine whether there are potential study biases. These findings are limited by a very low number of events; only five biopsy samples (2.4%) were found to have a grade of 2R or greater. At 1 year, 28% of the sample showed an elevated AlloMap score (>34) even though there was no evidence of rejection on biopsy. Thus, frequent monitoring with AlloMap could potentially result in an increase in the number of biopsies performed in stable patients who would not otherwise undergo routine biopsy. The significance of chronically elevated AlloMap scores in the absence of clinical manifestation of graft dysfunction and the actual impact on the number of biopsies performed are currently unknown. Controlled studies with a larger number of patients are needed to evaluate whether GEP might reduce the need for surveillance biopsy at various times post-transplant. The incremental value that AlloMap provides, above and beyond that obtained from the history and physical examination and basic laboratory studies, is uncertain.
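To illustrate why five events are too few to support a firm conclusion, the sketch below computes an exact (Clopper-Pearson) lower confidence bound for a proportion when every one of n events is detected. A negative predictive value of 100% means no rejection episode fell below the threshold; assuming all five events were included in that analysis (which the report summary above does not state explicitly), the observed sensitivity would be 5 of 5. For the special case of n successes in n trials, the exact two-sided lower bound is (alpha/2)^(1/n). The function name is hypothetical and used only for this example.

```python
def exact_lower_bound_all_detected(n_events, confidence=0.95):
    """Clopper-Pearson lower bound for a proportion when all n_events are successes."""
    if n_events <= 0:
        raise ValueError("n_events must be positive")
    alpha = 1.0 - confidence
    # Closed form of the exact bound for the special case k == n.
    return (alpha / 2.0) ** (1.0 / n_events)


# ASSUMPTION: all five Grade 2R-or-greater biopsies had an AlloMap score of 34 or above.
lower = exact_lower_bound_all_detected(5)
print(f"Observed sensitivity 5/5; 95% lower bound = {lower:.1%}")
```

Under these assumptions the data would remain compatible with a true sensitivity of roughly one half, which is consistent with the caution expressed above about the small number of events.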
Given the absence of substantial new trial results and the questions that remain regarding the impact of AlloMap on health outcomes, routine use of GEP in post-transplantation surveillance is considered experimental, investigational and unproven. The policy statements are unchanged.
A search of peer-reviewed literature through November 2010 identified no new clinical trial publications or any additional information that would change the coverage position, specific to Heartsbreath, of this medical policy.
Regarding AlloMap, Cadeiras and colleagues conducted a retrospective statistical analysis of data from 76 heart transplant recipients who underwent GEP with the AlloMap test at a single institution. (12) In multivariate analysis, statistically significant predictors of a high GEP score were a higher serum creatinine level, a higher corrected QT interval, a lower oxygen saturation level, and a lower platelet count. The authors concluded that the complex relationship between GEP test results and biological parameters warrants further study. The Cadeiras analysis did not address the previously stated limitations in the evidence base such as the optimal cutoff level for the AlloMap test and timing of testing.
Mehra et al. assessed the clinical implications of gene expression scores in 28 cardiac transplant recipients who progressed to moderate to severe rejection, 53 who progressed to mild rejection, and 46 who remained rejection free. (13) Those with gene scores of ≤20 (rejection free) or ≥30 (associated with progression to moderate to severe rejection) represented 44% of the cardiac transplant population within six months post-transplant. The authors concluded that early profiling permits development of a surveillance strategy.
In 2010, results of the IMAGE study were published. (14, 15) This was an industry-sponsored non-inferiority randomized controlled trial (RCT) that compared outcomes in 602 patients managed with the AlloMap test (n=297) or routine EMBs (n=305). The study was not blinded. The study included adult patients from 13 centers who underwent cardiac transplantation between one and five years previously, were clinically stable and had a left ventricular ejection fraction of at least 45%. In order to increase enrollment, the study protocol was later amended to include patients who had undergone transplantation between six months and one year earlier; this sub-group ultimately comprised only 15% of the final sample (n=87). Each transplant center used its own protocol for determining the intervals for routine testing. At all sites, patients in both groups underwent clinical and echocardiographic assessments in addition to the assigned surveillance strategy. According to the study protocol, patients underwent biopsy if they had signs or symptoms of rejection or allograft dysfunction at the clinic visits (or between visits), or if the echocardiogram showed a left ventricular ejection fraction decrease of at least 25% compared to the initial visit. Additionally, patients in the AlloMap group were biopsied if their test score was above a specified threshold; however, if they had two elevated scores with no evidence of rejection found on two previous biopsies, no additional biopsies were required. The AlloMap test score varies from 0 to 40, with higher scores indicating a higher risk of transplant rejection. The investigators initially used 30 as the cutoff for a positive score; the protocol was later amended to use a cutoff of 34 to minimize the number of biopsies needed. Fifteen patients in the AlloMap group and 26 in the biopsy group did not complete the study.
The primary outcome was a composite variable: the first occurrence of 1) rejection with hemodynamic compromise; 2) graft dysfunction due to other causes; 3) death; or 4) retransplantation. The trial was designed to test the non-inferiority of GEP with the AlloMap test compared to EMBs with respect to the primary outcome. Use of the AlloMap test was considered non-inferior to the biopsy strategy if the one-sided upper boundary of the 95% confidence interval (CI) for the hazard ratio (HR) comparing the 2 strategies was less than the pre-specified margin of 2.054. The margin was derived using the estimate of a 5% event rate in the biopsy group, taken from published observational studies, and allowed for an event rate of up to 10% in the AlloMap group. Secondary outcomes included death, the number of biopsies performed, biopsy-related complications and quality of life using the 12-item short-form (SF-12).
According to Kaplan-Meier analysis, the 2-year event rate was 14.5% in the AlloMap group and 15.3% in the biopsy group. The corresponding HR was 1.04 (95% CI=0.67 to 1.68). The upper boundary of the CI of the HR, 1.68, fell within the prespecified non-inferiority margin (2.054); thus gene expression profiling was considered non-inferior to EMB. Median follow-up was 19 months. The number of patients remaining in the Kaplan-Meier analysis after 300 days was 221 in the biopsy group and 207 in the AlloMap group; the number remaining after 600 days was 137 and 133, respectively. The secondary outcome, death from all causes at any time during the study, did not differ significantly between groups. There were a total of 13 (6.3%) deaths in the AlloMap group and 12 (5.5%) in the biopsy group (p=0.82). During the follow-up period, there were 34 treated episodes of graft rejection in the AlloMap group. Only 6 of the 34 (18%) patients presented solely with an elevated AlloMap score. Twenty patients (59%) presented with clinical signs or symptoms and/or graft dysfunction on echocardiogram and 7 patients had an elevated AlloMap score plus clinical signs or symptoms with or without graft dysfunction on echocardiogram.
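The non-inferiority margin of 2.054 and the decision rule described above can be sketched in a few lines of Python. The policy does not state the formula used to derive the margin; the calculation below is one plausible reconstruction, assuming proportional hazards so that the hazard ratio implied by two cumulative event rates is ln(1 - rate in the test group) / ln(1 - rate in the control group). The function names are hypothetical.

```python
import math


def hazard_ratio_from_event_rates(control_rate, test_rate):
    """Hazard ratio implied by two cumulative event rates under proportional hazards."""
    return math.log(1.0 - test_rate) / math.log(1.0 - control_rate)


def is_non_inferior(upper_ci_bound, margin):
    """Non-inferiority is declared when the upper CI bound lies below the margin."""
    return upper_ci_bound < margin


# 5% expected event rate with biopsy, up to 10% tolerated with AlloMap (from the text).
margin = hazard_ratio_from_event_rates(control_rate=0.05, test_rate=0.10)
print(f"Implied margin: {margin:.3f}")

# Upper boundary of the reported confidence interval for the hazard ratio (from the text).
print("Non-inferior:", is_non_inferior(upper_ci_bound=1.68, margin=margin))
```

Under this assumption the implied margin agrees with the pre-specified value to three decimal places, and the reported upper boundary of 1.68 lies below it. The margin itself still permits up to a doubling of the event rate.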
A total of 409 biopsies were performed in the AlloMap group and 1249 in the biopsy group; the biopsy rate differed significantly between groups (p<0.001). Most of the biopsies in the AlloMap group, 67%, were performed because of elevated gene-profiling scores. Another 17% were performed due to clinical or echocardiographic manifestations of graft dysfunction and 13% were performed as part of routine follow-up after treatment for rejection. There was 1 (0.3%) adverse event associated with biopsy in the AlloMap group and 4 (1.4%) in the biopsy group. In terms of quality of life, the physical-health and mental-health summary scores of the SF-12 were similar in the two groups at baseline and did not differ significantly between groups at two years.
A limitation of the study was that the threshold for a positive AlloMap test was changed part-way through the study; thus, the optimal test cut-off remains unclear. Moreover, the study was not blinded, which could have affected treatment decisions such as whether or not to recommend biopsy based on clinical findings. In addition, the study did not include a group that received only clinical and echocardiographic assessment, and therefore the value of AlloMap testing beyond that of clinical management alone cannot be determined. The uncertain incremental benefit of the AlloMap test is highlighted by the finding that only 6 of the 34 treated episodes of graft rejection detected during follow-up in the AlloMap group were initially identified due solely to an elevated gene-profiling score. Since 22 episodes of asymptomatic rejection were detected in the biopsy group, it is likely that the AlloMap test is not a sensitive test, possibly missing more than half of the episodes of asymptomatic rejection. Since clinical outcomes were similar in the 2 groups, there are at least 2 possible explanations: the clinical outcome of the study may not be sensitive to missed episodes of rejection, or it may not be necessary to treat asymptomatic rejection. In addition, the study was only statistically powered to rule out more than a doubling of the rate of the clinical outcome, which some may believe is an insufficient margin of non-inferiority. Finally, only 15% of the final study sample had undergone transplantation less than 1 year before study participation; therefore, findings may not be generalizable to the population of patients 6 to 12 months post-transplant.
The California Technology Assessment Forum published an updated literature review on gene expression profiling with AlloMap in 2010. (16) Their review concluded, based on findings of the CARGO and IMAGE studies, that there was sufficient evidence to conclude that AlloMap testing improves health outcomes for patients who are at least one year post-transplantation, but not for those between 6 and 12 months post-transplantation.
Practice Guidelines and Position Statements
In 2010, the International Society of Heart and Lung Transplantation issued guidelines for the care of heart transplant recipients. (17) The guidelines included the following recommendations:
- The standard of care for adult heart transplant recipients is to perform periodic endomyocardial biopsy during the first 6 to 12 months after transplant for rejection surveillance.
- After the first year post-transplant, EMB surveillance every four to six months is recommended for patients at higher risk of late acute rejection.
- GEP using the AlloMap test can be used to rule out acute heart rejection (Grade 2 or greater) in appropriate low-risk patients between 6 months and 5 years post-transplant.
There is insufficient evidence on the diagnostic accuracy of the Heartsbreath test, especially for Grades 3 and 4 rejection, and no published studies have evaluated the clinical utility of this test.
Thus, the coverage statement remains unchanged for the Heartsbreath test.
There is evidence on the diagnostic accuracy of the AlloMap test from the CARGO trial and post-CARGO publications. However, the evidence is not sufficiently rigorous to determine the true sensitivity and specificity of the test with certainty. The threshold indicating a positive test that seems to be currently accepted, a score of 34, evolved part-way through the data collection period of the subsequent non-inferiority trial (the IMAGE study) evaluating the test’s clinical utility. The IMAGE study had several methodological limitations, e.g., lack of blinding, and it was not able to determine whether AlloMap offers incremental benefit compared to biopsy performed on the basis of clinical exam and echocardiography. In patients who are less than 1 year post-transplant, the group that is at highest risk of transplant rejection, there are insufficient data on the clinical utility of AlloMap.
Additional published randomized clinical trials are needed to compare the clinical outcomes in patients managed with heart transplant rejection tests and EMB before changing the surveillance strategies in monitoring acute heart transplant rejection. According to ClinicalTrials.gov, an Early Invasive Monitoring Attenuation through Gene Expression (EIMAGE) Trial began recruiting patients in May 2010. The study is designed to evaluate the leukocyte gene expression profiling method in monitoring asymptomatic heart transplant patients for acute rejection early post-transplantation, from two to six calendar months (55 to 185 days). This two-arm study comparing AlloMap to EMB is planned for completion in September 2011. Thus, the coverage statement remains unchanged for the AlloMap test.
A search of peer-reviewed literature was performed through May 2013. The following is a summary of the key literature to date.
Findings from the HARDBALL study were published in 2004. Literature reviews identified no subsequent studies evaluating use of the Heartsbreath test to assess for graft rejection, nor any additional information that would change the coverage position of this medical policy.
Since the last policy update, two assessments have been published. First, a November 2011 Blue Cross Blue Shield Association (BCBSA) Technology Evaluation Center (TEC) Assessment reviewed the evidence for AlloMap testing. (18) The Assessment stated that the methods used in the studies reviewed were not clear, the data used to generate the values of the study were not available, and the results were not consistent with results from the actual patient sample. The studies all lacked sufficient description to determine whether there were biases, such as verification bias. In addition, the sensitivity of AlloMap testing was not reported in several of the studies reviewed. Therefore, the Assessment concluded that the evidence is insufficient to permit conclusions about the effect of the AlloMap test on health outcomes.
Second, an ECRI Emerging Technology Evidence Report published in December 2011 reviewed the accuracy of the AlloMap test. (19) The report concluded that, because sensitivity and specificity were undefined, the accuracy of the AlloMap test could not be determined. Reporting of mortality, occurrence of rejection episodes, complications, and quality of life was limited, as only the IMAGE trial (one RCT) reported these outcomes. In that one RCT, use of the AlloMap test resulted in fewer EMBs for heart transplant patients with stable allograft function; however, additional data would be helpful to confirm those results. Thus, no peer-reviewed scientific literature or additional information evaluating the use of AlloMap testing was identified that would change the coverage position of this medical policy.
Disclaimer for coding information on Medical Policies
Procedure and diagnosis codes on Medical Policy documents are included only as a general reference tool for each policy. They may not be all-inclusive.
The presence or absence of procedure, service, supply, device or diagnosis codes in a Medical Policy document has no relevance for determination of benefit coverage for members or reimbursement for providers. Only the written coverage position in a medical policy should be used for such determinations.
Benefit coverage determinations based on written Medical Policy coverage positions must include review of the member’s benefit contract or Summary Plan Description (SPD) for defined coverage vs. non-coverage, benefit exclusions, and benefit limitations such as dollar or duration caps.