FFT-method case validity
Make informed investments despite the grey capital market
How to construct a decision tree for a consumer problem - the FFT method of case-based feature validity
A. What do you need?
Once you have made a selection of potential features, you need to find out how often and under what circumstances they occur in the real world. For this you collect material of typical decision situations, e.g. real purchase offers, videos of real consulting situations or real informational services.
If such case material of typical decision situations is available, the FFT method of case-based feature validity is the method of choice. To plan it, you have to consider how many candidate features you want to examine, but also how rare the target is, i.e. what the decision tree should help to identify. If you need support during this process, please consult the final report on the Risk Atlas project from July 2020 or contact us.
If such case material from typical decision situations is not available, the FFT method of case-based feature validity is not suitable. In such a case, you have to use a different method.
The alternative is then the "expert view", which the modelling approach presented here has targeted from the outset. Several independent experts evaluate each individual case with a view to the goal of the development, e.g. "Does this health information allow an informed decision?" Their assessments must then be combined. The median of their judgements proves to be more robust than the arithmetic mean when combining the individual assessments.
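Why the median is the more robust choice can be seen in a small example. The following is only a sketch: the ratings are invented, and the four-step scale with 1 as the lowest and 4 as the highest value is an assumption made for illustration.

```python
from statistics import mean, median

def combine(ratings):
    """Return (median, mean) of the expert ratings for one case."""
    return median(ratings), mean(ratings)

# Hypothetical ratings of one case by three independent experts
# (assumed scale: 1 = does not enable an informed decision, 4 = does).
ratings = [1, 1, 4]

combined_median, combined_mean = combine(ratings)

# The median follows the two agreeing experts, while the mean
# is pulled towards the single deviating judgement.
print("median:", combined_median)
print("mean:", combined_mean)
```

With three experts the median is always one of the ratings actually given, so the criterion value stays on the original response scale.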
B. How do you proceed?
You now have to find out which features the selected cases (for example, real purchase offers) show or do not show. While describing your cases in terms of the potential features, you learn a lot about how well laypeople can actually check these features. You can expect to remove some features afterwards because consumers would have found them difficult to check.
In parallel to coding the cases in terms of the features, the expert evaluations are collected. Experts receive only the case material, never the features or the feature coding. The aim is to model the expert assessments independently (the expert view).
Because coding numerous potential features is very laborious, reducing the number of features at an early stage promises great efficiency gains. Before you select features with the help of laypeople by testing how well they understand them, a statistical approach is usually cheaper and easier. A simple statistical feature selection is worthwhile after just 100 cases. Various tools are available, e.g. the boruta package or the caret package (both available for the open-source statistical environment R). With boruta, the basic relevance of the individual features is checked by means of so-called random forests. If a feature behaves like a random number with regard to the expert assessments, the statistical recommendation is to drop it from further development for lack of significance. As with all statistical approaches based on random sampling, the following applies here too: if your background knowledge tells you that the feature should actually be significant, keep it for another round and code it in the next 100 cases as well.
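The idea of checking whether a feature "behaves like a random number" can be illustrated without any statistics package. The sketch below is a deliberately simplified stand-in and not the boruta procedure: instead of random-forest importance it uses the difference in the share of positive expert assessments between cases with and without the feature, and compares it with shuffled copies of the same feature. The function names, the threshold of 0.05 and the example data are assumptions.

```python
import random

def association(feature, criterion):
    """Absolute difference in the criterion rate between cases with and without the feature."""
    with_f = [c for f, c in zip(feature, criterion) if f == 1]
    without_f = [c for f, c in zip(feature, criterion) if f == 0]
    if not with_f or not without_f:
        return 0.0
    return abs(sum(with_f) / len(with_f) - sum(without_f) / len(without_f))

def looks_random(feature, criterion, n_shuffles=1000, alpha=0.05, seed=0):
    """True if shuffled copies of the feature do as well as the real one too often."""
    rng = random.Random(seed)
    observed = association(feature, criterion)
    shadow = list(feature)
    at_least_as_strong = 0
    for _ in range(n_shuffles):
        rng.shuffle(shadow)  # destroys any real link to the criterion
        if association(shadow, criterion) >= observed:
            at_least_as_strong += 1
    return at_least_as_strong / n_shuffles > alpha

# Hypothetical coding of 100 cases (1 = feature present / positive expert assessment).
criterion = [1] * 50 + [0] * 50
informative = [1] * 45 + [0] * 5 + [0] * 45 + [1] * 5
noise = [1, 0] * 50

for name, feature in [("informative", informative), ("noise", noise)]:
    verdict = "drop candidate" if looks_random(feature, criterion) else "keep"
    print(name, verdict)
```

A feature flagged in this way is only a candidate for removal; as described above, background knowledge can justify coding it for another 100 cases.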
Repeat this process of coding, assessing, and statistical feature selection every 100 cases and try to make the set of features more manageable. This has a positive effect on the coding effort, the expert assessments and also on the model finding.
If nothing changes during feature selection, if you have assessed 500 to 1000 cases depending on the number of features, or if you run the risk of falling below six features, it is worth modeling the decision tree on the basis of these case-feature assessment profiles.
The pipeline for development can be summarized in a simplified illustration:
You model a fast-and-frugal tree (FFT) using the portion of cases you select as training data, often 50% to 80% of the cases. This FFT has a certain ability to make the right decisions (assessments), which means that in the real world it will overlook some cases and give false alarms on others. To quantify this ability, either perform a statistical cross-validation (you repeatedly apply the decision tree to randomly drawn test cases) or apply it once to a collection of assessed cases that you set aside before modelling (20% to 50% of the data). Alternatively, you can collect a completely new sample of cases with feature codings and assessments (out-of-sample) and apply the decision tree to it, which requires additional time and effort.
Finally, the model must be tested in practice with laypeople. A randomised controlled study is useful here. It compares the decision intentions of consumers who are given the decision tree with those of consumers who receive nothing or a standard information sheet.
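How a finished tree is applied to cases that were set aside can be sketched as follows. This is a minimal illustration under stated assumptions: the tree, its feature names (`costs_stated`, `risks_explained`) and the four test cases are invented and are not the tree or the data of this project. Each node checks one feature and may exit with a decision; the last node exits either way, which the default decision represents here.

```python
# Hypothetical FFT: (feature, value that triggers an exit, decision at that exit).
TREE = [
    ("costs_stated", 0, "warning"),
    ("risks_explained", 0, "warning"),
]
DEFAULT = "all-clear"  # second exit of the last node

def classify(case):
    """Walk through the nodes in order and stop at the first exit that applies."""
    for feature, exit_value, decision in TREE:
        if case[feature] == exit_value:
            return decision
    return DEFAULT

def evaluate(test_cases, positive="all-clear"):
    """Return (sensitivity, specificity) against the expert assessments."""
    hits = misses = false_alarms = correct_rejections = 0
    for case, expert_label in test_cases:
        predicted = classify(case)
        if expert_label == positive:
            if predicted == positive:
                hits += 1
            else:
                misses += 1
        else:
            if predicted == positive:
                false_alarms += 1
            else:
                correct_rejections += 1
    sensitivity = hits / (hits + misses)
    specificity = correct_rejections / (correct_rejections + false_alarms)
    return sensitivity, specificity

# Hypothetical cases set aside before modelling, with their expert assessment.
test_cases = [
    ({"costs_stated": 1, "risks_explained": 1}, "all-clear"),
    ({"costs_stated": 0, "risks_explained": 1}, "warning"),
    ({"costs_stated": 1, "risks_explained": 0}, "warning"),
    ({"costs_stated": 1, "risks_explained": 1}, "warning"),  # tree wrongly gives the all-clear
]

sensitivity, specificity = evaluate(test_cases)
print(sensitivity, specificity)
```

The last test case shows the kind of error mentioned above: the tree gives the all-clear although the experts did not.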
In times of digitalisation, more and more people are investing their money on the Internet, including in products from the so-called grey capital market. There, providers are subject to less supervision. Increasingly, their offers attract customers with high interest rates or returns and the supposed security of the investment. Company investments, crowdfunding projects, direct investments or gold savings plans: with the growing abundance of offers, it becomes ever more difficult to distinguish trustworthy investment opportunities from those that are not. One of the characteristics of trustworthy offers is that they provide you, the consumer, with essential information so that you can weigh up potential returns and risks, and thus invest in an informed manner. But what kind of information is this? With our decision tree as a digital checklist, you can check whether a provider on the Internet enables you to make an informed investment decision.
When do I need this tool?
If you are considering investing money, e.g. to save money for the future, and if you then want to invest directly on the Internet instead of consulting a financial advisor, you need to be able to separate the wheat from the chaff. You must be able to sort out offers that stand in the way of an informed investment decision. Which offers enable you to weigh up potential returns and risks?
Imagine that you have now found an investment opportunity. The tool should help you to recognise whether the provider's offer enables you to make an informed decision or whether you should first check the offer with an advisor.
What does the figure show?
The figure shows a decision tree with which you can check German investment offers that allow a direct investment. Here you check whether the offer at hand allows you to weigh up possible returns, costs and risks or not.
A warning means that much speaks against making an informed decision on the basis of this offer. There can be many reasons for this: the provider is not interested in an informed decision on your part, or the offer is simply unprofessional. You should therefore not decide on the basis of the information given but instead discuss the offer with an expert or look for other offers. In some cases the decision tree can come to a wrong conclusion.
An "all-clear" means that the offer you have is probably designed to enable you to make an informed decision. We also always recommend obtaining a second opinion. Please note that the assessment does NOT say anything about the quality of the product itself. Nobody can tell you how the product will develop in the future. Even developments from the past are hardly indicative of future developments. In some cases the decision tree can come to a wrong conclusion.
You can also examine the offer more extensively. Please note, however, that no offer or checklist is ever perfect. With each additional feature that you check, the risk of an incorrect assessment of the offer increases. Further features are:
- Is the type of product clearly identified?
- Are risks diversified or is it explicitly stated that the investment offer will diversify?
- Is the link between return forecasts and risk addressed?
- Is there a warning: "The acquisition of this asset investment is associated with considerable risks and can lead to the complete loss of the invested capital"?
- Is it pointed out that past performance cannot predict actual or future performance?
- Are the initial investment costs, including all fees, stated?
- Are ongoing fees excluded or at least explicitly stated?
- Are fluctuations in return from past performance indicated?
- Is information available on the taxation of the investment?
- Are the risks of the offered investment opportunity specifically explained?
- Is there a leaflet accompanying the investment offer, e.g. a product information sheet?
- Is it suggested that others depend on your investment [NEGATIVE FEATURE]?
- Are there any statements about guaranteed above-average amounts [NEGATIVE FEATURE]?
If the product already existed in the past:
- Is the return for the previous year given?
- Is the 5-year return reported?
- Can the investment offer on the website be compared with products from other providers?
Regarding the company:
- Is information on the management team or the managing directors provided?
If there can be a supervisory authority:
- Is the supervisory authority responsible for the provider indicated?
- Is the provider or the investment offer itself listed in the databases of the supervisory authority BaFin?
- Is it mandatory to provide personal data in order to receive information about the offer?
- Is the offer shown available as a PDF file?
- Are there explanatory headings and subheadings in the text of the investment offer?
Where is the data coming from?
Cases – Which offers served as a basis?
693 German-language offers were compiled by experts from the Harding Center for Risk Literacy. Not included were summary pages of individual banks on various capital investments (i.e. key figures of certain investment options listed in tabular form) and consulting services offered by banks or independent brokers, insurance companies, and financial administrators. Included, however, were those "offers" that laypeople considered to be investment opportunities.
The following strategy was chosen for the 2018-2019 search:
For the first 100 cases, Google searches with single terms were used, in analogy to a study on the grey capital market (Verbraucherzentrale Hessen, 2016): bond, old-age provision, fund, investment, capital investment, returns, savings, call money, securities. Ten results were used for each term. For "returns" and "investment", two cases were not found on Google result pages 1-3 but only on page 5. If a hit led to a list of two or three very similar products, the offer mentioned first was selected.
For the further cases 101-500, these searches were repeated and extended, and they were made more targeted by combining the search terms with interest, share, warranty, gold, green ("Grüne"), interest-rate, precious metal, ETF. In addition to Google, the same searches were carried out on Facebook. A further 180 cases were obtained by laypeople searching within the framework of a RisikoAtlas study. In addition, individual offer information from projects on crowdfunding platforms was added manually.
Target assessment – How was it determined whether offer information enables an informed decision?
42 experts with academic or practical professional experience in the design of offer information assessed the cases.
Each piece of offer information was evaluated by three experts with regard to the statement: The information provided enables a layperson to make an informed investment decision. A four-step response format was used. The median of the three expert ratings was used as the criterion value for the individual case. The experts did not receive any information about the potential features used in the study.
So that experts did not have to assess every link found by laypeople, the following were not submitted: pages where no product was advertised at all (e.g. financial news), company websites (unless they presented only one offer), and very similar products of already assessed providers (e.g. differing only in numbers or names).
Potential features – Which features were considered?
Based on various sources (aside from German-language handouts, e.g. Aspara & Chakravarti, 2015; California Debt and Investment Advisory Commission, 2017; Consumer Fraud Research Group, 2006; Glazer, 2016; Investopedia, 2015a, 2015b, 2017; Investor.gov, 2017; IOSCO, 2002; Law Commission, 2014, Fiduciary duties of investment intermediaries, Consultation paper No. 215; Lee et al., 2013; Rensburg & Botha, 2014; Rutledge, 2012; The Share Center, 2017; US Securities and Exchange Commission, 2010, 2014), 138 features were selected. After redundancies had been eliminated and a first test had been carried out, 72 of them were regarded as assessable in principle by laypeople.
Selection of features and modelling
The purpose of pre-selecting features was to limit the number of candidates for the predictive model, which is meant to distinguish between investment information that enables an informed decision and information that does not. The feature selection was performed from two points of view, lay testability and statistical significance, in six steps:
- Cases 1-50 were coded in the 72 features, compared, discussed and harmonised by two independent research assistants. During coding, a total of 24 features were found not to be sufficiently comprehensible to laypeople.
- Cases 51-100 were coded in the 48 remaining features, compared, discussed and harmonised by two additional independent research assistants. A statistical feature selection was then performed (using random forests with boruta in the statistical program R). 37 features remained.
- Cases 101-384 were coded, compared, discussed and harmonised in groups of 100 by two different pairs of research assistants, and three more statistical feature selections were performed. 31 features remained.
- Cases 385-484 were used to check the coding behaviour of laypeople against that of a practised research assistant. Sets of five Clickworkers did not reach satisfactory agreement on 3 features. After the statistical feature selection, 24 features remained.
- In a laboratory study, laypeople coded cases 485-677, and two research assistants coded cases 678-693, with regard to the remaining 24 features. The statistical feature selection did not change anything.
Model I (shown above) – filter formats that are highly likely to enable an informed decision
Model I clears offers where experts fully agree that an informed decision is possible. Both the recursive algorithm by Marcus Buckmann and Özgür Şimşek (manuscript in preparation) and the FFTrees package (Phillips et al., 2017) were used for model identification. The ifan algorithm was used to optimise for balanced accuracy.
Model II – filter formats that clearly prevent an informed decision
Model II warns against offers where experts fully agree that an informed decision is not possible. The model contains only causally linked features, i.e. features that providers who intend to conceal information cannot fulfil without improving the information base for potential customers. Both the recursive algorithm by Marcus Buckmann and Özgür Şimşek (manuscript in preparation) and the FFTrees package (Phillips et al., 2017) were used for model identification. The ifan algorithm was used to optimise for balanced accuracy.
What is the quality of the data?
The cases were collected between 2017 and 2019, coded in terms of their features and assessed by experts. 693 cases were complete. The dataset was randomly divided into a training dataset (two thirds) and a test dataset (one third).
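A random division into two thirds and one third can be sketched in a few lines. This is only an illustration of the principle: the case numbering from 1 to 693 and the fixed seed are assumptions, and the sketch does not reproduce the actual split used in the project.

```python
import random

def split_cases(case_ids, train_share=2 / 3, seed=42):
    """Shuffle the case IDs and cut them into a training and a test part."""
    rng = random.Random(seed)  # fixed seed (an assumption) makes the split repeatable
    shuffled = list(case_ids)
    rng.shuffle(shuffled)
    n_train = round(len(shuffled) * train_share)
    return shuffled[:n_train], shuffled[n_train:]

# 693 complete cases, numbered 1 to 693 for illustration.
train_ids, test_ids = split_cases(range(1, 694))

print(len(train_ids), len(test_ids))
```

For 693 cases this yields 462 training cases and 231 test cases, and every case appears in exactly one of the two parts.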
The model for identifying offer information that is highly likely to enable an informed investment has the following quality:
A cross-validation of the identified decision tree resulted in the following quality measures: a balanced accuracy of 0.78 and a sensitivity of 0.83 in recognising offer information that enables an informed decision (share of 3% in the test set). This means that 83 out of 100 such pieces of offer information are recognised.
The specificity, i.e. the share of less or hardly informative offer information that is correctly confirmed as such, is 0.72.
The model for identifying offer information that definitely do not enable an informed investment has the following quality:
A cross-validation of the identified decision tree resulted in the following quality measures: a balanced accuracy of 0.69 and a sensitivity of 0.80 in recognising problematic offer information that definitely does not enable an informed decision (share of 29% in the test set). This means that 8 out of 10 pieces of problematic offer information are recognised.
The specificity in the confirmation of sometimes less or hardly informative offer information is 0.57.
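The quality measures reported for both models are linked: balanced accuracy is conventionally the mean of sensitivity and specificity. The short check below recomputes it from the rounded values given above. The results can differ from the reported figures in the last digit, because the reported values were presumably calculated from unrounded numbers.

```python
def balanced_accuracy(sensitivity, specificity):
    """Mean of sensitivity and specificity."""
    return (sensitivity + specificity) / 2

# Rounded values as reported above.
model_1 = balanced_accuracy(0.83, 0.72)  # model that gives the all-clear
model_2 = balanced_accuracy(0.80, 0.57)  # model that warns

print(f"Model I:  {model_1:.3f}")
print(f"Model II: {model_2:.3f}")
```

The recomputed values of about 0.775 and 0.685 are consistent with the reported balanced accuracies of 0.78 and 0.69.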
Potential for development
1. Continuous further development of the underlying training data due to changes in the market situation
2. Higher share of offers collected by laypeople in order to reflect their actual searching and finding behaviour