The salesperson, the craftsperson and the gardener
how to evaluate a mental health crisis service
Following the George Floyd protests of 2020, a whole crop of mobile mental health crisis services has emerged, driven by demand for non-police responses to calls for help in mental health crises. There are now roughly 40 in the wider Bay Area alone.
Many of them were established in 2021 or 2022, and funded using government money, which means that in the ways of governments-who-fund-things, the next year or two will be the time when they start needing to justify their continued existence to those who decide to pay for them.
Thanks for reading psychcrisis.org! Subscribe for free to receive new posts and support my work.
We evaluate a service to figure out whether what we’re doing is working to accomplish what we are trying to do. However, there are two common reasons we want to do that, and they’re in tension with each other:
To argue for its continued funding by people who have the power to fund it
To figure out how to improve it
1. To keep funding
Many funders require that the projects they fund are evaluated, so they can decide whether to keep funding them. This means that the people who are in charge of ‘evaluation’, if they are also employed by the funded project, essentially have a sales task in their lap–they must sell their project to its funders in order to keep it running.
When selling, the primary determinant of our success is whether the customer bought the service: how much they bought, whether they bought consistently and reliably, and whether they recommended it to their friends. To be a good salesperson, you must get into the mind of your potential customer and understand what drives them, and what they might want to buy. If you have the power to shape the product or service at all, you might try to adapt it to be a better fit for your potential customer’s needs. And (perhaps more insidiously) you might shape your narrative to resonate with the desires your customer has, downplaying ways in which the product or service isn’t a good fit for them.
If what you’re selling is a car, a laptop, or a t-shirt–something the potential customer will use up close, every day–then shaping the narrative to ignore things that are big and obvious will come back to bite you the minute they buy one, try it on for size, and kick up a fuss that you swindled them. No–if they are canny and going to use the thing themselves, anything you hide from them they will discover very shortly, and feel betrayed. So you are best off giving an honest description of it, and letting them decide for themselves.
Imagine, though, for a minute, that you are a real estate agent selling them a house. Not a house they would ever live in, or even see–but one you would look after and send them reports about. Moreover, you would not be renting it out for income, but throwing events there–the sort of events your buyer wanted thrown.
This state of affairs is like the position you are in if you are the leader of a funded mental health project. Your customer is someone working at a big office far away from you, and while they want the stated goals of the project (better mental health) to materialise, they aren’t the ones actually receiving the benefit. So, as the project lead, if you want your project to continue, your task is to show them that you are accomplishing a lot–so much, so much!–so that they will keep funding it and you can keep your job.
2. Improving the service
There’s another reason you might evaluate a service. You run it to accomplish an important task, and you want to determine if it is accomplishing that task. If you find out it isn’t, that could give you clues about what to change to get it on track.
Imagine you run a pie factory. Your job is to make a lot of pies. It’s someone else’s job to sell them, so all you have to do is make them well and for not too much money. You have a big factory, so you can’t check every individual pie–that would take too long. Instead, you set up a measurement system to check the intermediate steps to tell whether the pie-making is going as it should, and alert you when things are going wrong. If one of the machines breaks down, you want to know about it ASAP, so you can fix it.
Because pie-making has several steps, you need a collection of different measurements, which test different things and help you catch errors at different points in the process. Without such measurements, you would be like a pilot with no instruments–flying blind. You may discover you need to change what things you measure–if you start making a new kind of pie, which needs different spices, or a different sized pie, with different pie tins.
The practice of measurement here is part of the practice of your craft–it lets you see more clearly what is happening in your factory so you can make changes which will let you make better pies, more cheaply, and in the right quantities to meet demand (which you have no control over).
As the head pie-maker, you first and foremost want your measurements to be useful, and honest. Perhaps measuring the speed at which the pies get made is useless because they only get picked up once per day, so the thing that really matters is how many are warm and ready to eat at the daily pick-up and transport deadline. If your measurements are mistaken, you make bad pies, and you don’t accomplish your goal. Accurate, frequent measurement, that you can change when you need to know different things, is what you need. You need a set of tools for seeing clearly.
Sales and craftsmanship–bad measurement bedfellows
You may have already noticed in these stories that these two reasons for measurement have conflicting needs. Small, cheap, idiosyncratic, frequently changing, roughly accurate measurement is what you need if you are managing a process and trying to improve it, like a craftsperson. You are the audience for your measurements, after all.
If you are selling a service to a funder, you need opposite qualities–measurements that help you tell a story, to a customer who is not an expert in your field, not sitting next to you as you relate to a client. You need simple measures–counts that go up every year, that are easy to measure, and easy to include in a story that concludes with ‘and that’s why you should continue to fund this project’. Number of clients, number of appointments, number of treatments given. ‘We did more of the good things you paid us to do than before, so keep funding us and we will do more in the future’: this is table-stakes if you want a customer to continue to fund your project.
In the tussle for dominance between the metrics the salesperson needs and the metrics the craftsperson needs, in mental health, the salesperson wins.
The reason for this is simple–if the project loses funding, it ceases to exist, so there’s no service to measure in a craftsmanlike way.
Most projects never reach the point where their funding is so secure that they can turn their attention to measuring outcomes in a craftsmanlike way–to improve, debug, increase quality–so the only metrics they ever use are salesperson ones.
A concrete example:
I spent about a year volunteering every week on a suicide hotline. About three-quarters of the way through, I got into a minor argument with my shift supervisor. I had, somewhat naughtily, not been completing my call notes in the database after every call, for two shifts in a row. He (in the kind, polite way of someone trained in empathetic listening) called me out on it; it was, in fact, part of the agreed-upon responsibilities of a hotline volunteer to write detailed call notes documenting every call.
One relevant thing to know about suicide hotlines is that, at least on the one I volunteered at, for a major US city, about 80% of calls come in from regular callers–people who are not at risk of suicide, at least not anytime in the near future, and who may not even be particularly upset–but who call to have a chat. They may be socially isolated by mobility constraints, by mood problems, or by personality patterns that make it hard for others to talk to them. Or, they may be talkative people who have learned that they can call the suicide hotline once they have exhausted the goodwill of their friends. We would get to know them by name, enforce call limits so they couldn’t call more than four times a day, and document their calls like any others.
So I protested to the shift supervisor–why do I need to make a detailed call note (including a suicide risk assessment, details of the kind of distress they were experiencing, and a summary of the interventions I used) when this person calls four times a day and always says the same thing? I said it was a waste of time–none of my fellow hotline counsellors would ever read those notes, because nothing ever happened in the call interesting enough to warrant looking up later. (I mean this in the extreme sense. There were multiple callers who would call and tell exactly the same story from decades ago in every call, so you couldn’t easily distinguish one call of theirs from another).
From my perspective, helping these callers was a distraction from the purpose of the hotline–to help people survive when they wanted to end their life. I was happy to write detailed call notes when I did talk to someone who was at risk; knowing that, as happened several times, those notes might come in handy to a counsellor who encountered the same person again in a crisis and needed crucial context.
After almost an hour of back-and-forth, he revealed the stakes. The call notes, one for each incoming call, created a record of the number of calls the hotline received–and when it came time to renew the government grant that funded the salaries of the full-time employees of the hotline, the absolute number of calls was what mattered.
The aggregated statistics did not distinguish between four calls with different people, each with a weapon in their hands, who were talked down from the brink of suicide and kept alive through the compassion and rapport-building of their hotline counsellor, and four calls with the same guy who got bored of talking to his friends and wanted to bitch about his roommate for a few minutes before heading to the gym.
Is this good mental health care? Eh. Who knows. Is this helping more people stay alive who are at risk of suicide? Almost certainly not. And yet, these extra calls and the associated detailed documentation work were essential to the survival of the service. (I eventually negotiated a truce with my supervisor–I would create call records for these calls, but put the bare minimum of effort into writing them. As long as they recorded the date, time, and suicide risk, he said, he was happy.)
If this service, which has been running for decades and is operated by one of the most well-funded grant-based social services agencies in the city, is still driven by the struggle to survive, by the tyranny of salesperson’s metrics, what hope do these (tiny, new, prototype) government-funded mental health mobile crisis teams have of breaking out of this trap?
Perhaps it goes without saying, but–any service with an organizational structure big enough that the people leading it cannot directly watch and participate in all aspects of its delivery (like, for example, at the award-winning Suicide Crisis Centre UK I keep going on about)–needs craftsperson’s metrics.
You cannot steer to get where you want to go if you do not know where you are currently going.
What makes measuring the success of a mental health service all the more difficult is that if someone’s mental health gets better, you can’t observe it directly–and two people looking at the same situation may disagree about whether a person’s mental health got better or worse!
Measuring a mental health service like a gardener
A craftsperson is an imperfect metaphor for the person leading a mental health service, however, because a craftsperson works with unresponsive, dead materials.
In contrast, the people served by a mental health service are very much alive; active participants in (or resistors of) the process. A slightly better metaphor is that of a gardener; she weeds, prunes, plants, harvests, but cannot force the growth of a plant–the plant must do that itself, with its own constraints and capabilities, and within the ecosystem it inhabits.
An even better metaphor would take into account the fact that someone leading a mental health service is also a person, like the people they serve, and both operating the service and the mere act of measurement can change them as a person. A social worker, who entered the profession out of a warm-hearted desire to connect with people who are struggling, might be surprised to wake up and find herself, two decades later, a cold-eyed administrator who thinks of her clients more in terms of numbers, statistics, and medical reports than as unique, warm people she might, in other circumstances, have become friends with.
So, the craftsperson/gardener/more-than-gardener style metrics for the person leading a mental health service need to not only help her see clearly; they also need to be good for her, as a person. The metrics, measures, tests, observations, and other kinds of data need to support her (and the team she is a part of) becoming the person she wants to be, just as they need to support her in supporting her clients to survive and become the people they want to be.
What kinds of measurements work like this?
Let’s first lay out what questions such measurements need to help us answer. This is creating our yardstick–as we experiment with different metrics and ways of measuring, we can answer the question ‘how good a metric is this?’ by asking ‘how well does it help us answer our guiding questions?’
I’m going to lay out some sample questions for a mobile crisis service; you may, rightly, discover different questions that are more useful for operating the service you operate. You do you; you may improve the questions themselves over time too.
The purpose of my imagined mobile crisis service is simple (but not easy to accomplish!)–help people survive acute mental health crises with minimal harm to themselves, those around them, and the wider environment.
There are two big-daddy guiding questions implied upfront in that statement, which may be scary to even consider measuring:
How many people having mental health crises (who ask for our help) survive?
How many people having mental health crises (in the area we serve) survive?
The first question leads us to wonder how helpful we are to people who ask for our help. The second question, a much broader one, leads us to wonder how many of the people we aim to serve ask for our help and how they fare without it. Comparing the two lets us start to guess, tentatively, how helpful we are, compared to going it alone without us.
To ask ‘how helpful are we?’ in the context of life-or-death situations can be emotionally excruciating. It can be torturous to consider the possibility that our work isn’t helpful, but we can’t make it more helpful if we can’t consider the idea that it might not be helping already.
To ask ‘of the problem we have the responsibility to address, which portion are we addressing?’ is also difficult because it brings challenging options to the forefront. It is easier to help those who reach out, and ignore those who don’t, because figuring out how to encourage them to reach out, and helping them when they do, is much more difficult.
For our imaginary mobile crisis service, we might start by wondering what we would need to know, in order to find out which of the people we help survive. (For the minute, let’s assume that our salesperson metrics are being kindly taken care of by some administrator in an office with a database and we’re free to consider only craftsperson/gardener metrics.)
For one–how long would we need to keep up with the people we help? If someone asks for our help on Friday, we help them by taking them to a psych hospital, they stay for two weeks, then kill themselves the day they get out, well, we would want to know that, wouldn’t we? We didn’t help them survive, even if they survived the literal day we met them.
Maybe we don’t yet have a way of checking whether people survived longer than it took us to take them somewhere else. Perhaps the first step is figuring out if we can find this out. Can we call them? Can we ask their permission to follow up? What does HIPAA think?
(As is already probably clear, trying to answer a simple question is often harder than trying to answer a complicated one.)
Maybe a first attempt at creating a set of metrics to answer the question ‘how many of the people we helped survived?’ could look like:
How many people stayed alive while we were with them?
We asked everyone we came to help if we could call them later, a week and a month down the track, to see how they were doing–how many said yes?
Then, we called them when we said we would–how many answered? (What else can we find out about how they’re doing when we call?)
If they didn’t pick up, do they show up in any deaths databases we have access to? If so, they died. If they didn’t, we don’t know–they may not have died, but we can’t confirm they’re alive.
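To make the shape of this bookkeeping concrete, here is a minimal sketch of the honest-accounting logic behind those four steps. Everything in it is hypothetical–the field names, the three-way status categories, and the idea of keeping records this simply are illustrative assumptions, not a description of any real system:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class FollowUp:
    """One person we helped, tracked through the follow-up check-in steps."""
    consented_to_call: bool                      # did they agree to a follow-up call?
    answered_call: Optional[bool] = None         # None = call not (yet) attempted
    in_deaths_database: Optional[bool] = None    # None = database not (yet) checked

def status(record: FollowUp) -> str:
    """Classify what we actually know about this person's survival."""
    if record.answered_call:
        return "confirmed alive"
    if record.in_deaths_database:
        return "confirmed deceased"
    # No consent, no answer, or no database match: we simply don't know.
    return "unknown"

def summarize(records: list[FollowUp]) -> dict[str, int]:
    """Count outcomes, keeping the 'unknown' gap visible rather than hiding it."""
    counts = {"confirmed alive": 0, "confirmed deceased": 0, "unknown": 0}
    for record in records:
        counts[status(record)] += 1
    return counts
```

The deliberate design choice here is the `unknown` bucket: a person who didn’t pick up and doesn’t appear in a deaths database is neither counted as alive nor as dead. A salesperson’s metric would be tempted to fold that gap into a flattering total; a craftsperson’s metric keeps it in plain view.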
As you can see, this set of measurement strategies is hacky and imperfect. Look them up in a deaths database? Do we have anything easier to do than that? Maybe we find that calling people and asking essentially ‘are you alive?’ turns out to be quite rude and we want to find other ways of developing our relationship with them when we make that check-in call.
With measurements you’re using for guiding your operations, like a pilot using the instruments on their dashboard, a vague guess about a useful metric is more helpful than a precise measurement of a useless metric. Having weird gaps in the measurements you can assemble is to be expected–plenty of things that are important are very challenging to measure!
The point is not to consider this plan ‘the plan’, but to use it as a starting point to make better and better plans over time. Perhaps learning that someone we helped later died is upsetting, and we need to make space for that in our work culture. There’s a lot that can be done from this simple starting point.
Almost all measurement and evaluation of mental health services, particularly crisis services, happens from the perspective of a salesperson, working to ensure the survival and continued funding of their project. This goal needs different kinds of measurement than the goal of making the service better. Many people who work in mental health, and particularly at the frontline, began because they wanted to ‘make a difference’--to improve the mental health of other people, not to satisfy funders.
In order to do the first thing, we might need to keep two sets of books, at least for a while–one as a salesperson, and one as a craftsperson, as a gardener, as a person working with other people.