Dr. Google – Which health information can I trust?
We have all been there: Dr. Google is frequently consulted about symptoms of illness. The Internet offers a plethora of medical information that consumers can use to inform themselves about symptoms and about the benefits and harms of treatment options. Unfortunately, the quality of digital health information varies dramatically. Misleading information about medical interventions leads to the misjudgement of risks and prevents informed decisions, which can sometimes have serious consequences. To prevent this, it is important that you can recognise the quality of health information on the Internet. Our decision tree, a digital checklist, shows you what you need to pay attention to.
When do I need this figure?
If you are looking for health information on the Internet and find information that might help you, you can use the following decision tree.
What does the figure show?
The figure shows a decision tree that you can use to check any health information. Here you check whether the information provided helps you to make an informed decision or not.
A warning means that you are unlikely to be able to make an informed decision based on this health information. There can be many reasons for this: essential information may be withheld, or the information may be advertising or unprofessionally designed. In some cases, the decision tree can come to a wrong conclusion.
An "all-clear" means that the health information can be used to support an informed decision. There remains the possibility of a residual error in the decision tree.
You can also check this information against further quality criteria. Please note, however, that no source or checklist is ever perfect. With each additional feature that you check, the risk of an incorrect assessment of the information increases. Further features are:
- Is a health-related intervention described that one could decide for or against?
- Is there no recommendation as to what should be done?
- Is a possible evaluation of the decision options clearly distinguished from the facts or is there no evaluation at all?
- Is the quality of the studies used for health information addressed?
- Are numbers expressed both positively and negatively at the same time, e.g. how many people improve and how many do not?
- Is it clear how the provider produces this and other health information?
Further established criteria for quality-assured health information can be found in the checklists of the Institute for Quality and Efficiency in Health Care (IQWiG) and the Ärztliches Zentrum für Qualität in der Medizin (ÄZQ).
Where does the data that the decision tree is based on come from?
Cases – Which health information served as a basis?
662 pieces of health information (cases) from German-language websites were collected. 487 of these were researched by experts from the Harding Center for Risk Literacy, drawing on the following sources:
(1) Websites from the health category from web catalogues such as SimilarWeb.
(2) Search results from Bing and Google for the following terms or combinations of terms (see Bertelsmann, 2018): health information; health guide; diseases; cold; migraine; abdominal pain; joint pain; skin eczema; carpal tunnel syndrome; cancer; prostate cancer; prostatitis; breast cancer; psoriasis; herniated disc; back pain; osteoarthritis; sarcoidosis; influenza; how to recognize ....; What should I do about ...; What works against ...; Should I see a doctor about ...; How dangerous is ...; Which household remedies help against ...; How long helps ...; What from ...; How can I ... prevent?
(3) The sample was artificially enriched with randomly drawn pages from websites which, according to their own information, follow the guideline Gute Praxis Gesundheitsinformation ("oversampling" of rarer cases with the expression "supports informed decision-making" compared to a random selection).
175 additional cases were compiled by laypeople in a study on antibiotics for upper respiratory tract infections, early detection of ovarian cancer and the measles, mumps and rubella vaccination for children.
Target assessment – How was it determined whether an informed decision was enabled?
42 experts assessed the cases. They came from research on health information, from health insurance companies and from the Netzwerk Evidenzbasierte Medizin, and included representatives of health associations with professional experience in the field of health information.
Each piece of information was evaluated by three experts with regard to the question: Does the health information provided enable a layperson to make an informed decision? A four-step response format was used. The median of the three expert ratings was used as the criterion value for each case. The experts did not receive any information about the potential features used in the study.
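The aggregation of three expert ratings into one criterion value can be sketched as follows. This is a minimal illustration, not the study's code: the numeric coding of the four-step scale as 1 to 4, the case identifiers and the ratings themselves are assumptions made up for the example.

```python
from statistics import median

# Hypothetical ratings on the four-step scale. Coding the steps as 1..4
# is an assumption for illustration; the source does not state the coding.
ratings_by_case = {
    "case_001": (1, 2, 2),
    "case_002": (3, 4, 4),
    "case_003": (1, 1, 4),
}


def criterion_value(ratings):
    """Return the median of exactly three expert ratings."""
    if len(ratings) != 3:
        raise ValueError("expected exactly three expert ratings")
    return median(ratings)


# One criterion value per case.
criteria = {case: criterion_value(r) for case, r in ratings_by_case.items()}
```

With three raters the median is always one of the ratings actually given, and a single deviating expert (as in the hypothetical `case_003`) does not shift the criterion value.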
Based on the guideline Gute Praxis Gesundheitsinformation (DNEBM, 2016) and the DISCERN standards (ÄZQ, 2019), the RisikoAtlas team identified 31 and 39 features, respectively, as generally testable by laypeople. A cross-check with other publications on this question (Bernstam et al., 2008; Bunge et al., 2010; Zhang et al., 2015) and the elimination of redundant features resulted in 65 potential features.
The purpose of pre-selecting features was to limit the number of candidate features that the decision tree could use to distinguish between pieces of health information. The feature selection was guided by two criteria, testability by laypeople and statistical significance, and was performed in seven steps:
- Cases 1–100 were coded on 65 features by two independent research assistants, who then compared, discussed and harmonised their codings. A statistical feature selection was then performed (using random forests with the Boruta package in the statistical program R), and further features were eliminated based on feedback from the coders, who found them too difficult for laypeople to use. 22 features remained.
- Cases 101–499 were coded, compared, discussed and harmonised in groups of 100 by two independent assistants. Four further statistical feature selections were performed, which led to the elimination of seven more features.
- Cases 500–598 were used to check the coding behaviour of laypeople against a practised research assistant. Groups of five Clickworkers did not achieve satisfactory agreement on 3 features. The statistical feature selection did not change anything.
- In a laboratory study, laypeople coded cases 600–675 with regard to the remaining 12 features. The statistical feature selection did not change anything.
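The check of lay coders against a reference coder can be illustrated with a simple per-feature agreement rate. This is a sketch under assumptions: the source does not say which agreement measure or cut-off was used, so plain percentage agreement, the threshold of 0.8, the feature names and the codings below are all hypothetical.

```python
def agreement_by_feature(reference, lay_codings):
    """Share of lay codings that match the reference coder, per feature.

    reference:   {case_id: {feature: bool}}
    lay_codings: {case_id: [{feature: bool}, ...]}  (one dict per lay coder)
    """
    hits, totals = {}, {}
    for case_id, ref in reference.items():
        for coding in lay_codings.get(case_id, []):
            for feature, ref_value in ref.items():
                totals[feature] = totals.get(feature, 0) + 1
                if coding.get(feature) == ref_value:
                    hits[feature] = hits.get(feature, 0) + 1
    return {f: hits.get(f, 0) / totals[f] for f in totals}


THRESHOLD = 0.8  # hypothetical cut-off for "satisfactory" agreement

# Hypothetical codings by the research assistant (reference) ...
reference = {
    "c1": {"feature_a": True, "feature_b": False},
    "c2": {"feature_a": False, "feature_b": True},
}
# ... and by two lay coders per case.
lay_codings = {
    "c1": [{"feature_a": True, "feature_b": True},
           {"feature_a": True, "feature_b": False}],
    "c2": [{"feature_a": False, "feature_b": False},
           {"feature_a": False, "feature_b": False}],
}

rates = agreement_by_feature(reference, lay_codings)
unreliable = sorted(f for f, r in rates.items() if r < THRESHOLD)
```

In this toy data the lay coders always match the reference on `feature_a` but only once in four codings on `feature_b`, so `feature_b` would be flagged as too unreliable for lay use. A chance-corrected measure such as Cohen's kappa would be a stricter alternative.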
On the base model shown above.
Both the recursive algorithm by Marcus Buckmann and Özgür Şimşek (manuscript in preparation) and the FFTrees package (Phillips et al., 2017) were used for model identification. The ifan algorithm was used to optimise for balanced accuracy.
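To make the idea of model identification concrete, the following sketch shows how a fast-and-frugal tree classifies a case and how one exit structure can be picked by balanced accuracy. It is a simplified illustration, not the ifan algorithm or the recursive algorithm: it assumes binary cues in a fixed order, whereas the published algorithms also determine cue thresholds and cue order. The cue names and the toy data are hypothetical and are not the cues of the tree shown in the figure.

```python
from itertools import product


def fft_predict(case, cues, exits):
    """Classify one case with a fast-and-frugal tree.

    cues:  ordered cue names; each cue value is a bool, with True
           pointing towards the positive class.
    exits: one bool per non-final cue. True = leave the tree with a
           positive decision when the cue is True; False = leave with
           a negative decision when the cue is False.
    """
    for cue, exit_on_true in zip(cues[:-1], exits):
        if case[cue] == exit_on_true:
            return exit_on_true
    return case[cues[-1]]  # the final cue has both exits


def bacc_from_predictions(predictions, labels):
    """Mean of sensitivity and specificity (needs both classes present)."""
    tp = sum(p and y for p, y in zip(predictions, labels))
    tn = sum((not p) and (not y) for p, y in zip(predictions, labels))
    pos = sum(labels)
    neg = len(labels) - pos
    return (tp / pos + tn / neg) / 2


def best_exit_structure(cases, labels, cues):
    """Try every exit structure for a fixed cue order; keep the best."""
    best = None
    for exits in product([True, False], repeat=len(cues) - 1):
        preds = [fft_predict(c, cues, exits) for c in cases]
        score = bacc_from_predictions(preds, labels)
        if best is None or score > best[0]:
            best = (score, exits)
    return best


# Hypothetical toy data: three binary cues, label True = "enables an
# informed decision".
cues = ["cue_a", "cue_b", "cue_c"]
cases = [
    {"cue_a": True, "cue_b": True, "cue_c": True},
    {"cue_a": True, "cue_b": True, "cue_c": False},
    {"cue_a": True, "cue_b": False, "cue_c": True},
    {"cue_a": False, "cue_b": True, "cue_c": True},
    {"cue_a": False, "cue_b": False, "cue_c": False},
    {"cue_a": True, "cue_b": True, "cue_c": True},
]
labels = [True, False, False, False, False, True]

best = best_exit_structure(cases, labels, cues)
```

On this toy data the best structure issues a warning as soon as one cue is not met and gives the all-clear only if all three cues are met. Because every non-final cue has exactly one exit, a case can often be classified after checking only one or two cues.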
Model II – specifically for websites that inform about specific medical interventions
The second model is based not only on the assumption that the person seeking health information on the Internet is preparing a decision on a specific medical intervention, but also on the assumption that he or she is in a position to recognise whether a website serves this purpose at all. Thus, offerings that merely convey general knowledge or that are discussion platforms are not considered.
The health information cases were collected between 2017 and 2018, coded with respect to their features and assessed by experts. 661 cases were complete. The dataset was randomly divided into a training set (two thirds) and a test set (one third). The models have the following quality:
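A random division into two thirds training data and one third test data can be sketched as follows. The seed, the rounding rule and the use of plain integers as case identifiers are assumptions for illustration; the source does not state the exact set sizes or whether the split was stratified.

```python
import random


def split_cases(case_ids, train_share=2 / 3, seed=0):
    """Shuffle case ids reproducibly and split them into train and test."""
    ids = list(case_ids)
    random.Random(seed).shuffle(ids)  # fixed seed: an illustrative choice
    n_train = round(len(ids) * train_share)
    return ids[:n_train], ids[n_train:]


# 661 complete cases, identified here simply by 0..660 (hypothetical ids).
train_ids, test_ids = split_cases(range(661))
```

The tree is fitted on the training cases only. The quality figures are then computed on the test cases, which the fitting procedure has never seen.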
On the base model.
Balanced accuracy = 0.74. The specificity in rejecting information that does not enable an informed decision (86% of the test set) is 0.91. This means that 91 out of 100 pieces of information that do not enable an informed decision are recognised by the decision tree.
The sensitivity in confirming health information that enables an informed decision (14% of the test set) is 0.57.
Model II – specifically for websites that inform about specific medical interventions
Balanced accuracy = 0.75. The specificity in rejecting information that does not enable an informed decision (79% of the test set) is 0.91. This means that 91 out of 100 pieces of information that do not enable an informed decision are recognised by the decision tree.
The sensitivity in confirming health information that enables an informed decision (21% of the test set) is 0.59.
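Balanced accuracy is the mean of sensitivity and specificity, so the figures reported for both models can be checked with a few lines. The function and variable names are chosen for this sketch; the numbers are the ones stated above.

```python
def balanced_accuracy(sensitivity, specificity):
    """Mean of sensitivity and specificity."""
    return (sensitivity + specificity) / 2


base_model = balanced_accuracy(sensitivity=0.57, specificity=0.91)
model_ii = balanced_accuracy(sensitivity=0.59, specificity=0.91)

# For comparison: a rule that issues a warning for every case never
# confirms good information (sensitivity 0) but rejects all poor
# information (specificity 1).
always_warn = balanced_accuracy(sensitivity=0.0, specificity=1.0)
```

Rounded to two decimals this reproduces 0.74 for the base model and 0.75 for Model II. The comparison value shows why balanced accuracy suits these imbalanced test sets: a rule that always warns would be correct for 86% of the base model's test cases, yet its balanced accuracy is only 0.5.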
Potential for development
- Higher share of health information collected by laypeople in order to reflect actual search behaviour
- Use of cases with experimentally confirmed "informed decisions" - beyond expert judgements - for additional validation
- Further development for other languages that need their own training data
- Empirical evaluation with consumers
- ÄZQ (2019). Qualität von Gesundheitsinformationen im Internet. https://www.patienten-information.de/checklisten/qualitaet-von-gesundheitsinformationen.
- Bernstam, E. V., Walji, M. F., Sagaram, S., Sagaram, D., Johnson, C. W., & Meric‐Bernstam, F. (2008). Commonly cited website quality criteria are not effective at identifying inaccurate online information about breast cancer. Cancer, 112(6), 1206–1213.
- Bertelsmann (2018). Suche nach Gesundheitsinformationen. Report.
- Buckmann, M., & Şimşek, Ö. (manuscript in preparation). Recursive FFT algorithm.
- Bunge, M., Mühlhauser, I., & Steckelberg, A. (2010). What constitutes evidence-based patient information? Overview of discussed criteria. Patient Education and Counseling, 78(3), 316–328.
- DNEBM (2016). Leitlinie Gute Praxis Gesundheitsinformation.
- Phillips, N. D., Neth, H., Woike, J. K., & Gaissmaier, W. (2017). FFTrees: A toolbox to create, visualize, and evaluate fast-and-frugal decision trees. Judgment and Decision Making, 12(4), 344–368.
- Steckelberg, A., Berger, B., Köpke, S., Heesen, C., & Mühlhauser, I. (2005). Kriterien für evidenzbasierte Patienteninformationen. Zeitschrift für ärztliche Fortbildung und Qualität im Gesundheitswesen, 99, 6.
- Zhang, Y., Sun, Y., & Xie, B. (2015). Quality of health information for consumers on the web: A systematic review of indicators, criteria, tools, and evaluation results. Journal of the Association for Information Science and Technology, 66(10), 2071–2084.