Paying for Performance

Magic Bullet or a Payment Reinvented?

By Andreas Kalk and Berit Kieselbach

The authors look at new initiatives in results-based health financing, also known as paying for performance, from the perspective of the historical tradition of linking pay to performance. They point out lessons learnt from past approaches and highlight potential pitfalls when transferring the approach from Western industrial settings to health systems in developing countries. Last but not least, they try to provide food for thought concerning the perception of labour relations and staff motivation underlying the different reform approaches.

Since the beginning of the new millennium it has become increasingly obvious that innovative reform approaches are needed to increase the efficiency and responsiveness of health systems in developing countries, which have shown little progress towards achieving the Millennium Development Goals (MDGs) within the suggested timeframe. A range of traditional approaches to health sector reform did not lead to the desired outcomes at the expected pace. As a consequence, certain multilateral and bilateral ‘donors’ embarked on a new route of results-based health financing, or paying for performance, and rolled out pilot schemes and country-wide initiatives linking payment to performance in several countries with different health systems, such as Rwanda and Cambodia.

For some years now, results-based financing or paying for performance (P4P) initiatives have received increasing attention in the international discussion and have frequently been celebrated as a panacea for ineffective health systems. However, a brief look into history shows that the idea of linking rewards to performance is not new at all.

Paying for performance is not new

Early examples date back to times long before the Greek or Roman empires. The first document in recorded history that clearly defines specific payments for specific results is the Babylonian Code of Hammurabi (18th century BC). It included an incentive system for traders, based on their profit or shortfalls. Travelling merchants earned half of the excess of their agent’s investment as a bonus; shortfalls, on the other hand, had to be covered fully from the merchant’s own funds.

Between the 16th and 18th centuries, the mercantilist school of economics prevailed and the first theories of wages were developed. Mercantilists believed that income and the amount of labour were negatively correlated and that workers had to be kept at subsistence level to reach the highest levels of performance. The doctrine of the “hungry worker”, who was believed to be the most productive, was the predominant view. This idea was strongly opposed by the economist Adam Smith, who was convinced of the idea of the “economic man”. He believed that workers increase their efforts in order to earn more.

Over the following centuries, piece rate systems were established, linking payment to the quantity produced, mostly for agricultural products or simple craft work. The more successful piece rate systems had common attributes: outcomes could be easily measured (e.g. by counting the items produced), and quality aspects left little or no room for interpretation. In the course of the industrial revolution during the late 18th century, and in view of the increased need to make production processes more efficient, piece rate systems reached their height, especially in labour-intensive industries. But most of these systems differentiated between types of labour: whilst workers at the operating level were paid per piece, the more skilled employees remained on fixed daily pay.

A short episode of profit-sharing approaches in the late 18th century ended quite abruptly, due to the dilution of responsibility or the so-called free-rider effect: more productive teams and departments were discouraged by those with lower performance.

With Frederick W. Taylor’s time studies, starting in 1882, and the development of scientific management theory, P4P acquired an early scientific basis. Taylor divided complex work processes into single standardized units. He saw workers from a mechanical perspective and perceived human beings as technical parts of the production process. Taylor believed that there was “one best way” to conduct a specified routine work process, which was assumed to be the most efficient regardless of the individual performing the task. He attempted to measure and optimize each individual step of the production process. This approach led to an extreme division of labour. The optimized work routines were linked to financial incentives for those workers who followed the standardized production methods within the determined time. The productivity increase in the American economy after World War I was partly due to rationalisation processes that had their roots in Taylorism. Nevertheless, the scientific management approach was eventually abandoned. The limited and monotonous role of the individual worker in the production process led to decreased motivation and work satisfaction. One of the main shortcomings of Taylor’s theory was that it completely ignored the effects of group and team processes on workers’ motivation; the unit of analysis was the individual worker in isolation.

Resistance to P4P

Taking into account the long history of performance-based payment initiatives, the question arises why these systems did not lead to a quantum leap in the evolution of salary systems. To answer this question, a glance at P4P’s setbacks and the resulting criticism is useful. This criticism is almost as old as the concept itself. It derives from a variety of disciplines: economic theory, the social sciences and, last but not least, Christian ethics.

As early as Roman times, Roman society promoted a “verum pretium” approach, an attempt to define prices reflecting the true amount of work performed to produce a certain outcome. This approach was later adapted by the Christian Church, which introduced the “justum pretium” approach, a just price doctrine accounting for fairness and controlling inflationary processes. In modern terms, the underlying idea corresponds to a ‘fair payment for a fair job’.

In times of feudalism, mercantilism and early economic theory before the 19th century, there was neither a strong interest in the nexus of payment and performance nor strong opposition to it. Only with the industrial revolution and the need to increase output did results-based financing mechanisms return to the agenda. During the same period, from 1848 onwards, civil society opposition movements emerged in alliance with trade unions, which struggled for time-based and ‘adequate’ payment.

Since the emergence of the social sciences in the early 20th century, a vast theoretical basis for worker motivation and behaviour has been established. The Hawthorne studies and other social science research produced empirical evidence on the factors determining worker motivation, challenging the prevailing view that motivation mainly depends on financial incentives. This evidence showed that intrinsic motivation, a supportive work environment, supervision, regular feedback and opportunities for career development are strongly related to performance as well.

Most of the research conducted in that period was derived from artificial laboratory settings and did not reflect the underlying processes under real working conditions. The first observations of the effects of P4P in natural settings, accompanied by evaluation studies, describe efforts to establish performance-based financing mechanisms in the public sector in Europe in the 1980s. Surveys amongst public sector employees at that time revealed that only a minority of employees perceived the scheme as an incentive to work beyond the job requirements. Many employees perceived the incentive schemes as prescriptive. The studies identified job content and career development prospects as the strongest incentives for public sector employees. There was, however, broad support for the general principle of linking performance to rewards.

Evaluations of performance-based remuneration systems in the health sector in the UK revealed several adverse effects. When health staff were directed towards certain target indicators, their efforts to reach these indicators sometimes resulted in dysfunctional behaviour that did not necessarily improve the health status of the population. Targets to reduce waiting times for appointments resulted in appointments no longer being given long in advance, and the target to reduce the reported number of cases of antibiotic-resistant infection led to a reduction in the number of blood cultures tested.

Do people and setting matter?

This brief glance into history has shown that the majority of results-based financing initiatives were established in industrial settings of the Western world. Their applicability within health care settings in the South requires careful consideration. Can we really assume that the work motivation of an industrial worker producing electromechanical relays in a factory is determined to the same extent by intrinsic incentives as that of a health care worker in rural Africa, struggling to improve the wellbeing of his or her community?

The risk of ‘crowding out’ has been mentioned frequently in this context. The term stems from economics and refers to external interventions, such as monetary incentives, that might undermine intrinsic motivation. The related theory assumes that health workers are motivated by their professional values and a strong service ethic. If schemes focus solely on increasing extrinsic motivation through external incentives, the motivation of health workers may shift towards these external rewards, and intrinsic motivation might disappear altogether. It is well established that external incentives increase the work motivation of health staff in the short term; after some time, however, external incentives tend to be perceived as a vested right, so that new incentives are needed to further increase motivation.

P4P initiatives have been particularly successful in settings where the quantity as well as the quality of outcomes can be easily assessed. Compared to standardized work routines at a conveyor belt, assessing the quantity and quality of medical work is much more complicated. ‘Good’ health care frequently goes beyond standard operating procedures and treatment guidelines and is thus difficult to measure in an objective way.

Establishing the right set of indicators covering the quality of clinical decision making and comprehensive care is particularly difficult. Indicators can only measure certain dimensions of comprehensive quality care, and most indicators are not sufficient to capture all necessary aspects. This results in a trade-off between the need to limit data collection and save resources on the one hand, and the need to cover the variety and complexity of the services delivered on the other.

Furthermore, in medical care the client is a part of the “production process”. Due to the asymmetry of information between the care provider and the client, the client is not always in a position to judge the quality of the treatment he receives and relies strongly on the judgement of the care provider, who acts as an agent on his behalf. This arrangement requires trust. However, if the care provider’s clinical judgement is influenced by additional incentives to achieve certain indicators, this trust is undermined. Health staff can then no longer successfully fulfil their role as agents acting on behalf of the patient.

Compared to most incentive schemes in the North, where the variable part of the remuneration amounts to 5-10% of the base salary, P4P initiatives in health care in countries of the South such as Rwanda add considerable amounts to the base salary as variable compensation. This effect is amplified by the fact that base salaries in most developing countries are very low and hardly cover essential needs, whilst the base salary in high-income countries enables workers and their families to maintain a decent lifestyle. Consequently, the urge for an underpaid health worker in a low-income country to do everything in his power to achieve the defined set of indicators is comparatively high. It ensures his and his family’s survival. ‘Gaming’ is the term often used to describe work behaviour directed towards increasing the rewarded outcomes by all means. In certain cases, this has led to inaccurate reporting, to increases in certain forms of diagnostics and treatment over others without medical need, and to the neglect of activities not directly targeted by an indicator. Clinical decision making might be replaced by the blind pursuit of rewarded outcome indicators.

No doubt about short-term benefits, but many questions to be answered

There is no doubt about the short-term benefits of linking payment to results. Examples from Rwanda and Cambodia have clearly shown increased interaction between clients and service providers, increased efficiency of health care services and impressive improvements in monitoring and evaluation, leading to greater accountability.

P4P interventions might provide a window of opportunity for broader management and organisational changes. Nonetheless, the rapid scaling-up of such an intervention without a comprehensive political framework for human resource development might pose a risk. Rigorous research into long-term outcomes and adverse effects should be conducted before pilot schemes are rolled out countrywide. Such research should build upon experience gained in other sectors, and it should incorporate a multisectoral perspective. It could be accompanied by a broader discussion about professional ethics in various settings.

