What’s the source of your data?
Our data source is the Financial Tracking Service (FTS), hosted by UN OCHA. FTS is a platform that captures contributions to humanitarian responses. FTS doesn’t capture everything – it relies on voluntary reporting by organisations. But it is the most comprehensive source of data for humanitarian funding.
How often do you publish forecasts?
We’ll publish four forecasts throughout the year. The first of these will be in January, with a new forecast every three months after that: April, July, and October.
How did you come up with a model for forecasting sector funding?
In short, we’ve developed a model that forecasts the amount of funding a sector will receive, based on an examination of how humanitarian funding has flowed in the past. Given the different orders of magnitude of sector funding (Emergency Telecomms around $10m, Food Security around $5bn), we first transformed our data to a new scale. We then carried out a multiple linear regression looking at a number of possible inputs.
The most important variables for determining funding in the next year are: funding in the previous year, the funding requirement, the medium-term growth rate of the sector, and whether it is going to be a ‘bumper’ year for the sector (think Health in 2020, for example).
We then use this model to forecast how much funding a sector will receive that year.
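To make this concrete, here is a minimal sketch in R (the language we use for the site) of this kind of regression. The log transform, the simulated data, and all variable names are illustrative assumptions for this example; the actual model specification may differ.

```r
# A minimal sketch of the kind of model described above. The log
# transform and the simulated data are illustrative assumptions,
# not the real training data or specification.
set.seed(1)
n <- 40  # pretend sector-years

train <- data.frame(
  funding_prev = exp(runif(n, log(1e7), log(5e9))),  # last year's funding (USD)
  growth_rate  = runif(n, -0.05, 0.15),              # medium-term growth rate
  bumper_year  = rbinom(n, 1, 0.1)                   # 1 if a 'bumper' year
)
train$requirement <- train$funding_prev * runif(n, 1.1, 1.6)  # appeal requirement
train$funding <- exp(0.95 * log(train$funding_prev) +         # fake outcome,
                     0.4 * train$bumper_year +                # for illustration
                     rnorm(n, 0, 0.15))                       # only

# Multiple linear regression on the log scale
model <- lm(log(funding) ~ log(funding_prev) + log(requirement) +
              growth_rate + bumper_year, data = train)

# Forecast a new sector-year, back-transformed to dollars
new_year <- data.frame(funding_prev = 5.0e9, requirement = 6.5e9,
                       growth_rate = 0.05, bumper_year = 0)
exp(predict(model, newdata = new_year))
```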
Do you have different methodologies for different forecasts?
Yes. We start with the approach above, looking at data from the previous year, and that forms the basis of our forecast. In January this accounts for 100% of the forecast, as we don’t yet have any data about the current year. This is the ‘outside view’. To borrow an analogy from sports betting, it is the probability of different outcomes before the match has begun, when we have no information specific to this particular match.
But as the year continues we gain more information. We can see how much funding sectors have received and adjust our forecast accordingly. This is the ‘inside view’ of the current year. Following on from the sports analogy, this is the probability of different outcomes during the match, a sort of ‘in-play’ view. For each successive forecast we therefore increase the weight of the ‘inside view’ and decrease the weight of the ‘outside view’ as the year goes on.
What are the steps for producing a forecast?
Firstly, we estimate any values from the previous year that are still missing, as the model needs a complete view of it. Funding data for the previous year may not be ‘complete’ in January – sometimes more funding is recorded after the fact.
Secondly, we compile all of our data into a single database and flag the ‘special case’ sectors (see below). This is important, as we treat these differently. We then set up our model, basing it on the multiple linear regression we’ve run in the past.
Thirdly, we calculate our forecasts by running the model 10,000 times in a Monte Carlo analysis. This lets us see the full range of potential outcomes for the sector in that year (see the sketch below).
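A minimal sketch of what such a Monte Carlo step could look like in R, assuming normal errors on the log scale; the point forecast and spread here are illustrative numbers, not the model’s real output.

```r
# Draw 10,000 possible funding outcomes for one sector-year.
# Assumptions: normal errors on the log scale, illustrative numbers.
set.seed(42)
mu    <- log(5.0e9)  # the model's point forecast on the log scale
sigma <- 0.12        # residual spread on the log scale

sims <- exp(rnorm(10000, mean = mu, sd = sigma))  # outcomes in dollars
summary(sims)
```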
Lastly, we’re then able to play with the distribution of potential outcomes and work out the probability of reaching a given percentage of the funding requirement, going into recession, or the other things that you see on the sector pages.
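Continuing the sketch above, turning simulated outcomes into the probabilities shown on the sector pages is a matter of counting draws; the requirement figure and the 80% threshold are just examples.

```r
# Convert simulated outcomes into a probability. The requirement
# figure and the 80% threshold are illustrative.
set.seed(42)
sims <- exp(rnorm(10000, log(5.0e9), 0.12))  # as in the previous sketch

requirement <- 6.5e9
mean(sims >= 0.8 * requirement)  # P(funding reaches 80% of the requirement)
```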
However, we should note that between steps two and three above we do something different for the April (Q1), July (Q2) and October (Q3) forecasts: we weight the outcome of the model above (the outside view) against the inside view. We look at how much funding has been received so far and adjust the model based on this number, applying a different weighting at different points in the year. In April this weighting is not very high, but by October it is. This means that the ‘in-play’ view becomes more important than the pre-match stats as we progress through the year (see the sketch below).
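A sketch of the blend in R. We only say above that the inside-view weight rises through the year, so the weights and figures below are hypothetical.

```r
# Blend the outside view (pre-season model) with the inside view
# (extrapolation from funding received so far). Weights are hypothetical.
inside_weight <- c(april = 0.25, july = 0.5, october = 0.85)

blend <- function(outside, inside, w) (1 - w) * outside + w * inside

# Example: outside view says $5.0bn; funding so far suggests $4.4bn.
blend(outside = 5.0e9, inside = 4.4e9, w = inside_weight["july"])
```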
What data is the model trained on?
Excluding certain years for certain sectors (e.g. pre-2018 for AoRs), the model is trained on data from 2014 to 2019.
How have you tested how the model works? And how good are your forecasts?
We’ve tested how the model works against funding data from previous years – but obviously testing against the past is different from forecasting the future. We’ll build up a published track record over time, which you’ll be able to find at the bottom of each sector page.
We’re keeping score of how well our forecasts do, and in 2022 we should be able to expand on this with a full page devoted to keeping score. We can’t do this yet, as we’re still in the middle of our first published forecasts.
Tell me about “Special Cases”
Special cases are one of two things. The first are sectors with funding so low in value that they just don’t behave like other sectors. Fundamentally, they are far more volatile and don’t fit well into our regression model. To see what we mean, compare the graphs of Food Security (a regular sector) and Emergency Telecommunications (a special case). Food Security follows a predictable path each year, but Telecommunications can suddenly jump due to a small number of funding flows.
This type of special case currently includes Camp Coordination / Management, Early Recovery, and Emergency Telecommunications. Our yardstick here is any sector below $100m in funding.
The other special cases are those termed ‘Areas of Responsibility’ (AoRs) of the Global Protection Cluster. These are Child Protection, Gender Based Violence, and Mine Action. Housing, Land and Property is also an AoR, but we’re not including it on the site for now due to a lack of data. The reason for treating these as ‘special cases’ is that they too are more volatile than other sectors, partly because they are emerging from the shadow of the wider Protection umbrella. Growth rates for Child Protection and GBV in particular have been high in recent years.
What this all means in practice is that the prediction intervals for these special cases are wider than for other sectors. We use the same methodology, but there’s simply more uncertainty around them, and we account for this by widening the range of possibilities (see the sketch below).
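One simple way to widen the range, sketched in R under our own assumptions: inflate the spread of the simulated outcomes. The 1.5x factor is hypothetical.

```r
# Widening the prediction interval for a special case by inflating
# the spread of simulated outcomes. The 1.5x factor is hypothetical.
set.seed(7)
base_sigma    <- 0.12               # log-scale spread, regular sector
special_sigma <- 1.5 * base_sigma   # widened spread, special case

regular <- exp(rnorm(10000, log(5.0e9), base_sigma))     # e.g. Food Security
special <- exp(rnorm(10000, log(9.0e7), special_sigma))  # e.g. Emergency Telecomms

# 90% prediction intervals: the special case's is relatively wider
quantile(regular, c(0.05, 0.95))
quantile(special, c(0.05, 0.95))
```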
How do you visualise the forecasts?
Go over to our sector pages to find out! We’re using R to generate the visualisations, and we use an awesome package called Highcharter to visualise pretty much everything here.
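For the curious, here’s a toy example of the kind of chart Highcharter can build; this is purely illustrative, not the site’s actual charting code.

```r
library(highcharter)  # the charting package mentioned above

# Toy example: plot the density of simulated funding outcomes for
# one sector. Illustrative only, not the site's actual code.
set.seed(42)
sims_bn <- exp(rnorm(10000, log(5.0), 0.12))  # simulated funding in $bn

hchart(density(sims_bn), type = "area", name = "Simulated funding ($bn)")
```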
Why are the forecasts based on probabilities?
So there’s a decision to make: do we generate ‘point estimates’ or ‘probabilistic forecasts’? A point estimate is a single number, so that would be, “we think Food Security will receive $5bn”. A probabilistic forecast would say, “we think Food Security will receive between $4.5bn and $5.5bn”. We’ve gone with the latter because it communicates the uncertainty inherent in our funding systems. No sector or context is guaranteed funding, and we’re not always totally sure how much will come in. A probabilistic forecast is more honest in that sense: it communicates that uncertainty.
Text and percentages are often assigned to different probabilities. Can you break down the meaning of the text?
Sure. It’s based on the Probability Yardstick (found here, pg. 29), which is used by the UK intelligence community to communicate uncertainty. Because some of our forecasts fall in the middle of the ranges found on the yardstick, we’ve expanded them out slightly (a sketch translating percentages into phrases follows the list), so:
0% – 7.5% = remote chance
7.5% – 22.5% = highly unlikely
22.5% – 37.5% = unlikely
37.5% – 52.5% = realistic possibility
52.5% – 77.5% = likely
77.5% – 92.5% = highly likely
92.5% – 100% = almost certain
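As a direct translation of the list above into code, here is a small R helper (the function name is ours):

```r
# Map a probability (as a percentage, 0-100) to its yardstick phrase.
yardstick <- function(p) {
  as.character(cut(
    p,
    breaks = c(0, 7.5, 22.5, 37.5, 52.5, 77.5, 92.5, 100),
    labels = c("remote chance", "highly unlikely", "unlikely",
               "realistic possibility", "likely", "highly likely",
               "almost certain"),
    include.lowest = TRUE, right = FALSE
  ))
}

yardstick(c(5, 45, 95))
# "remote chance" "realistic possibility" "almost certain"
```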