The Noise from Brazil: The corruption of Covid death databases
published: 17 April 2022 ...... last updated: 17 April 2022
Creating permanent panic
Ariel Karlinsky is an economist and statistician who has become well known during the Covid pandemic for his excess death estimates. Even if his name is not familiar, his data surely will be, since it is used as a reference by many official sources. Since 2021 his work has been the basis of the mortality and excess death estimates and graphs for the very popular Our World in Data and the Economist, and many other sites and researchers use his databases as a basis for their own work.
In an article Karlinsky wrote at around the same time his excess death paper was published, he clearly shows that his analytic skills are in dire need of improvement. His lack of competence, and of respect for all of the factors involved in calculating mortality baselines, is astonishing.
Karlinsky described excess deaths, especially excess deaths during a pandemic, as “incredibly easy to predict”. He also states that normally researchers don’t care about excess deaths or baselines, just annual deaths. It is alarming that someone who is creating mortality baselines based on trendlines doesn’t know that baselines are used every day by public health modellers, actuaries, and researchers in many fields. Entire professions and careers are built on them; it's astonishing that he thinks researchers normally ignore them and “only care for annual number deaths”.
Unfortunately Karlinsky’s ineptitude goes far beyond that; it is more than just naivety, recklessness and ignorance. His deliberate misinformation reaches the level of scientific fraud. Not only has he slandered other scientists who have done better and more honest work, he has intentionally corrupted his databases in a way that artificially and permanently keeps deaths in excess and makes it impossible to return to anywhere near a normal pre-Covid baseline.
Strange aggressive slander
Karlinsky’s baseless attack on a reputable Nobel Prize winner leads right to his own fraudulent activities.
Karlinsky went on a campaign of defamation and character assassination against Dr. Michael Levitt of Stanford and his excess death work. One of the main focuses of these attacks was excess death in Brazil; it was a strange choice to direct attention to that country for reasons we will soon see.
One of the tricks of those who engage in propaganda is to claim that they are using a scientist’s own method to prove them wrong, when in fact they are using a completely different method. Karlinsky either doesn’t understand the method Levitt was using, or he is outright committing professional slander. After reviewing the integrity of Karlinsky’s work, it is clearly the latter.
Karlinsky claimed that Dr. Levitt miscalculated his own excess deaths by 10.5 percentage points, but to reach that figure Karlinsky used his own, different methodology, not Dr. Levitt's. Dr. Levitt’s method showed Brazil with estimated excess deaths at about 7% above normal, not 17.5% as Karlinsky claims. Karlinsky then went on to say Dr. Levitt thought that this 7% increase in mortality was normal, when it was very clear Dr. Levitt was talking about excess.
Karlinsky heavily criticized Dr. Levitt for using Brazil's RC database for his 2020 excess death estimate. While it is true that following the RC database trend as a baseline past the year 2020 may have underestimated deaths if Dr. Levitt had continued using that method, that is not what was done. The estimate Dr. Levitt made for Brazil was a one-off that used the data point of the trend only for the year 2020.
Dr. Levitt had not been satisfied with the data from Brazil due to its lower quality and a lack of detail that prevents it from being age-adjusted in the same way as his other excess death estimates, and so Brazil isn’t included in his regular excess death charts and updates.
This is an important part of being a responsible scientist, carefully parsing all data and assessing whether or not it is sound enough to use.
Dr. Levitt's Brazil excess death estimate for 2020
In April 2021 Dr. Levitt made an excess death estimate of 94,000 using the linear regression trend of RC data as a baseline.
It is not as accurate as the estimates for some other countries due to the lower quality of Brazil’s data, and for the same reason it could not be age-adjusted like Dr. Levitt’s other excess death estimates. Still, his method was very reasonable given the circumstances, and the trend of Brazil’s other database, SIM, also had about the same value at that point in 2020.
At that point where deaths equal ~1,364,000 in the year 2020, both databases would have approximately equal trendlines.
Trendlines of yearly deaths reported by the two databases (RC and SIM) intersect in the year 2020. This shows the soundness of Dr. Michael Levitt’s 2020 excess deaths estimate of 93,000 (the difference between the red and green lines at the year 2020).
Another reason that Dr. Levitt was correct to use the RC data at that point is that, as we will see later, the improvement rate of the RC database shows it was on course to become equal to the SIM database in 2020.
This is why it's strange that Karlinsky chose to focus so much criticism on Brazil.
For someone who knows that a 7% increase in mortality is profound, Karlinsky has precious little regard for his own miscalculations that result in his all-cause death counts being many percentage points above what they should be.
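The trend-baseline calculation described above can be sketched in a few lines of Python. This is only a minimal illustration, not Dr. Levitt's actual code: it assumes an ordinary least-squares line fitted to the pre-2020 years and extrapolated one year ahead, and the function names and yearly counts are hypothetical placeholders (deliberately round numbers, not data from RC or SIM).

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def excess_vs_trend(years, deaths, target_year, observed):
    """Excess deaths = observed deaths minus the extrapolated trend value."""
    slope, intercept = fit_line(years, deaths)
    baseline = slope * target_year + intercept
    excess = observed - baseline
    return baseline, excess, 100.0 * excess / baseline


# Hypothetical placeholder series (NOT real registry data).
years = [2015, 2016, 2017, 2018, 2019]
deaths = [1000, 1020, 1040, 1060, 1080]
baseline, excess, pct = excess_vs_trend(years, deaths, 2020, observed=1177)
print(round(baseline), round(excess), round(pct, 1))  # prints: 1100 77 7.0
```

The excess is read off at a single year, which matches the one-off use of the trend described in the text; extending the same line several years ahead would be a different, riskier exercise.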
In fact Karlinsky artificially raises all-cause death counts by using an unwarranted scaling factor that increases the official listed deaths by a whopping 8%, and previously had used a scale factor that increased deaths even more, by 10%.
Taking a closer look at Karlinsky’s work exposes both Karlinsky’s corruption and why his attack on Dr. Levitt was slanderous.
Karlinsky uses scaling factors based on the observation that Brazil has two death registries (SIM and RC), and that they have had discrepancies in their death counts in the past as the RC continues to update its database and recording system. The observation is correct; it's what he did with it that's criminal.
The 2 main ways Karlinsky inflated excess deaths were by:
using preliminary data from the SIM database that is known to contain duplicate records and to overestimate current all-cause deaths
not acknowledging that SIM and RC databases are now virtually equal
SIM's current deaths are overcounted
Karlinsky declares that the SIM's higher preliminary death counts for 2020 are “more plausible” than the official RC counts for no reason other than that he personally likes them better. In fact there is reason to believe they are less plausible.
The last available fully vetted and officially approved SIM dataset is from the year 2019.
The preliminary all-cause death data from SIM that Karlinsky uses for Covid-era excess death estimates is inaccurate and unreliable, and it is well-known to overstate death count due to duplicates.
SIM itself describes the process of preparing this preliminary data for eventual registry in the official database, warning that early death reports contain errors and problems of data consistency. The most worrisome of these is duplicate recordings of the same death. The preparation process includes removing these duplicate filings, which can’t even start until 2 months after the month in which the death occurred. [3]
Karlinsky's death estimates are scaled up
To reconcile the differences in the preliminary data, Karlinsky uses a scaling factor to project how many deaths are happening in the pandemic era. Since excess death is found by subtracting a baseline of normal deaths from pandemic deaths, using a scale factor to inflate pandemic-era deaths consequently also overestimates excess death. Let’s examine how he derived his scale factor.
Karlinsky claims that the ratio between the preliminary SIM death counts and the official RC database makes an appropriate scale factor. His first scale factor, 1.10, was allegedly taken from the ratio between those databases for deaths reported in the first months of 2020, and it increased death counts by 10%.
However, the ratio between the two databases in the finalized official counts from 2019 was only 1.05. Had he used a ratio based on the properly reviewed and vetted data from the end of 2019, the scale factor would have been very much lower.
Knowing that the 2019 SIM/RC ratio was 1.05 and the 2020 ratio was 1.10, from this we might extrapolate that the SIM data overstates deaths by an equivalent of 5 points, since the preliminary SIM data from the start of 2020 was 1.1 times higher than RC (1.10 - 1.05 = 0.05).
As of February 2022 Karlinsky has revised the scale factor to 1.08. While better than the previous higher one of 1.10, it is still far too high.
Since we know preliminary SIM data overstates deaths by 0.05, if we wanted to entertain the idea of a significant discrepancy between the 2 databases, and the idea that comparing preliminary data to actual data was the best way of finding true deaths for that time period, then using 1.03 (1.08 - 0.05 = 1.03) as a scale factor would be more accurate than the 1.10 or 1.08 that Karlinsky uses.
A 1.03 scale would result in about 559,000 excess deaths, which is more than 162,000 fewer than Karlinsky’s estimate.
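The arithmetic in this section can be laid out explicitly. The sketch below only illustrates the reasoning in the text: it assumes excess deaths are computed as scale × reported deaths − baseline, and the function names and the reported/baseline totals are hypothetical placeholders, not figures from either registry.

```python
def adjusted_scale(current_scale, preliminary_ratio, finalized_ratio):
    """Remove the part of the SIM/RC ratio attributed to preliminary-data overcount."""
    overstatement = preliminary_ratio - finalized_ratio  # 1.10 - 1.05 = 0.05
    return current_scale - overstatement


def excess_deaths(reported, baseline, scale=1.0):
    """Assumed form: scaled reported deaths minus expected (baseline) deaths."""
    return scale * reported - baseline


scale = adjusted_scale(1.08, preliminary_ratio=1.10, finalized_ratio=1.05)
print(round(scale, 2))  # 1.03

# Hypothetical totals, chosen only to show how the scale factor moves the result.
reported, baseline = 1_000_000, 900_000
for s in (1.00, 1.03, 1.08, 1.10):
    print(s, round(excess_deaths(reported, baseline, s)))
```

Because the scale multiplies every reported death, under this assumed form each 0.01 added to the factor adds 1% of all reported deaths to the excess estimate, whatever the baseline is.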
We know the ratio should never be higher than 1.03, but using any scale is ill-conceived and would lead to large overestimations of regular death during the pandemic times because the RC database has been increasingly improving.
No scale is needed
Karlinsky claims that because the RC database has listed fewer deaths than SIM in the past, and because accurate SIM data lags so far behind the release of RC data for the same period, a scale factor derived from the differences between RC and SIM is needed to find the true number of deaths during the Covid pandemic, which in Karlinsky’s mind surely must be higher than official counts.
But the RC is continually improving its registration process at such a rate that it isn’t appropriate to use any scaling factor. Just 5 years ago there was a big difference between the 2 databases, but each year the RC has improved its registration system and the gap between the 2 registries has continued to narrow. Their trendlines project that they would become equal by the year 2020, when the pandemic started, yet Karlinsky has not accounted for this dramatic rate of improvement at all.
As you can see in the chart of the ratio of SIM to RC data, the RC data improvement rate shows there should not be any scale factor at all by 2020. The ratio of deaths between the two databases reaches “1” before the year 2020, meaning that at that point they have reached equivalency, so the current listed RC deaths should be close to being accurate for the pandemic years.
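The projection described above amounts to fitting a line through the yearly SIM/RC ratio and solving for the year at which it reaches 1. The following is a rough sketch under that assumption; the function names are hypothetical and the ratio series is an invented placeholder (only the 2019 value echoes the 1.05 quoted earlier), so the output illustrates the method rather than reproducing the chart.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def year_ratio_reaches_one(years, ratios):
    """Year at which the fitted SIM/RC ratio line crosses 1, or None if it never falls."""
    slope, intercept = fit_line(years, ratios)
    if slope >= 0:
        return None  # ratio is not declining, so the line never comes down to 1
    return (1.0 - intercept) / slope


# Hypothetical placeholder ratios (NOT the actual yearly SIM/RC values).
years = [2015, 2016, 2017, 2018, 2019]
ratios = [1.25, 1.20, 1.15, 1.10, 1.05]
crossing = year_ratio_reaches_one(years, ratios)
print(crossing)
```

With a steadily narrowing gap like this placeholder series, the crossing lands at roughly 2020; a steeper real decline would put it earlier.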
Taking all of these issues into account, using any scaling factor is misguided. While it is true that the RC database still might be slightly undercounting current deaths, that number is small, much smaller than the amount by which the SIM database overcounts.
Using a linear regression model of SIM deaths from 2015-2019 as a baseline, similar to Karlinsky’s method but without a scale factor and without the ‘fixed effects’ he uses that increase excess deaths by lowering the expected baseline, gives a much lower excess death estimate.
Without inflation of current official reported deaths, the total excess deaths for the years 2020 and 2021 would be about 462,000. That's almost 300,000 fewer than Karlinsky’s estimate.
True Brazil Excess Death 2020-2021 vs Karlinsky's Estimate
In his article Karlinsky says that the gap between expected deaths and actual deaths in Brazil does not appear to be closing. That should come as no surprise, as he himself has permanently propped open this gap with his unwarranted scale factor that will always inflate deaths by 8%, mathematically making it almost impossible for the gap to ever close.
In those conditions with a high permanent inflation of deaths, the normal death baseline would have to decrease by an incredible amount to ever return to normal.
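The size of that permanent gap can be checked with simple arithmetic. As a hedged sketch, assume (as argued above) that reported deaths are already complete and that a fixed scale factor is applied anyway before comparing with the baseline; the function names are hypothetical.

```python
def apparent_excess_pct(actual_over_baseline, scale):
    """Percent excess shown after a fixed scale is applied to actual deaths."""
    return 100.0 * (scale * actual_over_baseline - 1.0)


def ratio_needed_for_zero_excess(scale):
    """Actual/baseline ratio at which the scaled count equals the baseline."""
    return 1.0 / scale


# Deaths exactly at baseline still show about 8% excess with a 1.08 scale.
print(round(apparent_excess_pct(1.0, 1.08), 1))  # 8.0
# Actual deaths must fall to about 92.6% of baseline before the gap reads zero.
print(round(100.0 * ratio_needed_for_zero_excess(1.08), 1))  # 92.6
```

Under these assumptions, mortality would have to drop more than 7% below its normal level just for the reported excess to reach zero.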
That's how anyone with some basic math skills and ill intent can create a permanent panic: the forever pandemic.
Since the start of the Covid-19 pandemic, the scope of its severity has often been expressed in terms of Covid cases. But relying on who is recorded as a Covid case is not the best way to determine the impact of Covid; another way is to measure excess deaths. What excess death is, why it is important, and several prominent methods of measuring excess death during the Covid pandemic are examined.