The Tools at Hand
The Graphical Drilldown

Data Desk is a wonderful tool for finding patterns in data. In some situations you may want to look for patterns in summarized data and then drill down into the raw data to see the details for a specifically defined subset. Data Desk offers a variety of graphical and subsetting tools that can be used to perform such an analysis on data from any industry or experiment. Our consulting staff used a number of these tools for a challenging analysis of health care claims data, and you can use the same technique to drill down into your own data.

In analyzing the health care claims data, our objective was to produce graphical tools to help find patterns of resource inefficiencies and billing anomalies among providers of health care services. In this case, we investigated how long hospitals kept their patients. Payors are interested in this number because longer hospital stays are more expensive, take beds away from other patients who need them, and increase the chance that a patient catches a hospital-borne infection.

We started with a file containing several million records. First, we divided this file into several smaller files on the basis of medical diagnosis so that we could restrict our analysis to similar patients. Once we had created these homogeneous datasets, we built a regression model that predicted each patient's length of stay in the hospital using factors such as age, severity of illness, source of admission, and discharge status.
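The model-and-residual idea can be sketched outside Data Desk. The example below is a deliberately simplified illustration, not the model used in the analysis: it fits a one-predictor least-squares line (length of stay against age) in plain Python, whereas the analysis described above used several factors. The data values and the names `fit_line`, `residuals`, `ages`, and `stays` are hypothetical.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x (one predictor)."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def residuals(x, y, intercept, slope):
    """Observed value minus the value the fitted line predicts."""
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


# Hypothetical patients within one diagnosis group (made-up numbers).
ages = [30, 40, 50, 60, 70]
stays = [2.0, 3.0, 3.5, 5.0, 5.5]  # length of stay in days

intercept, slope = fit_line(ages, stays)
res = residuals(ages, stays, intercept, slope)
```

A positive residual marks a patient who stayed longer than the fitted line predicts, and a negative one marks a shorter stay. With an intercept in the model, the residuals sum to zero over the data used for the fit.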
The model can be thought of as a tool for removing all of the known variability. We were interested in the data that were left over, which are represented by the residuals. Using the Summaries by Group command, we summarized those residuals for each hospital and looked for hospitals that differed significantly from what the model predicted. We wanted to identify those "different" hospitals and explore deeper to see if we could understand why they were different. So we dropped the columns from the Summaries by Group table and plotted the Mean Residuals versus the Count in a scatterplot. We colored the points based on how statistically different each mean was from the overall mean. The points in red and orange were the ones of most interest to us.
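The grouping step can be approximated in a few lines of Python. This is a sketch under stated assumptions, not a reproduction of the Summaries by Group command. It measures how far each hospital's mean residual sits from the overall mean using a simple z-score, and the color cutoffs (2 and 3 standard errors) are hypothetical choices, since the article does not state the rule it used. The hospital labels and residual values are made up.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, stdev


def color_for(z):
    """Assumed color rule: cutoffs of 2 and 3 are illustrative only."""
    if abs(z) >= 3:
        return "red"
    if abs(z) >= 2:
        return "orange"
    return "gray"


def summarize_by_group(groups, values):
    """Count, mean, and z-score of each group's mean against the overall mean."""
    overall_mean = mean(values)
    overall_sd = stdev(values)
    by_group = defaultdict(list)
    for g, v in zip(groups, values):
        by_group[g].append(v)
    summary = {}
    for g, vals in by_group.items():
        n = len(vals)
        group_mean = mean(vals)
        # Standard error of a mean of n values drawn from the overall spread.
        z = (group_mean - overall_mean) / (overall_sd / sqrt(n))
        summary[g] = {"count": n, "mean": group_mean, "z": z, "color": color_for(z)}
    return summary


# Hypothetical residuals (days) for three hypothetical hospitals.
hospitals = ["A"] * 5 + ["B"] * 5 + ["C"] * 5
resid = [0.1, -0.1, 0.0, 0.2, -0.2,
         -0.1, 0.1, 0.0, -0.2, 0.2,
         2.0, 2.2, 1.8, 2.1, 1.9]

summary = summarize_by_group(hospitals, resid)
```

Plotting each group's `mean` against its `count`, colored by `color`, gives the same kind of picture as the scatterplot described above. Small groups need a larger mean residual to stand out because their standard error is larger.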
The next step of the drilldown was to look at the raw data for the patients in a particular hospital of interest. First we made a Variable Table with all of the important data for each patient, along with plots of Race, Sex, and Age. We assigned a HotSet Selector to the table and charts and enabled Automatic Update for each. Then we selected a hospital name in the Summaries by Group table. Each time we selected a new hospital name from the summary table, the plots and the Variable Table updated to show only the patients for the selected hospital.
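The selection step amounts to filtering the raw records by the chosen hospital and recomputing the detail views. The sketch below mimics that behavior in plain Python. It does not use or represent Data Desk's HotSet mechanism, and the record fields, values, and function names are all hypothetical.

```python
from collections import Counter

# Hypothetical patient-level records (made-up values).
patients = [
    {"hospital": "A", "sex": "F", "age": 34, "stay": 3},
    {"hospital": "A", "sex": "M", "age": 61, "stay": 5},
    {"hospital": "B", "sex": "F", "age": 47, "stay": 2},
    {"hospital": "C", "sex": "M", "age": 72, "stay": 9},
    {"hospital": "C", "sex": "F", "age": 68, "stay": 8},
    {"hospital": "C", "sex": "F", "age": 55, "stay": 7},
]


def drill_down(records, hospital):
    """Return only the raw records for the selected hospital."""
    return [r for r in records if r["hospital"] == hospital]


def tabulate(records, field):
    """Frequency table for one field, like the counts behind a bar chart."""
    return Counter(r[field] for r in records)


# "Selecting" hospital C recomputes the detail table and the summary counts.
selected = drill_down(patients, "C")
sex_counts = tabulate(selected, "sex")
ages = sorted(r["age"] for r in selected)
```

Calling `drill_down` again with a different hospital name rebuilds every dependent view from the new subset, which is the effect that Automatic Update provides in the workflow described above.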
Using these Data Desk tools allowed us to explore large datasets for anomalous behavior very efficiently and to drill down into the details very quickly to understand the causes of this behavior. Although this example describes analysis of health care claims data, the same approach can work with a variety of different datasets and applications.

Graphical Drilldown: Technique Summary
How can we help you? Learn more about Data Desk Training.