On 2021-04-10 18:48:39, user Daniel Haake wrote:
Regarding version 6 of your study, my comment pointed out the statistical problems in your study design that lead to an overestimation of the calculated IFR (cf. https://www.medrxiv.org/content/10.1101/2020.07.23.20160895v6?versioned=true#disqus_thread). Thank you very much for your reply to my statement. I think that an exchange is important, because this is the only way to get reasonable results. Therefore, please do not regard my comments as criticism, but as suggestions for how to arrive at correct values. Since my statement still applies to version 7, I am responding to your reply here, in the comments on version 7.
Re: Re: The time of the determination of the death figures
Here you seem to have misunderstood me. I meant that, in your example of an infection wave with the study starting shortly after the peak, many people have not yet formed antibodies by the time the study starts. With the date you then chose for the death count, you capture 95% of the deaths, but only a much smaller proportion of those infected. This leads to an underestimated denominator and thus an overestimated IFR.
Just because it was also done that way in the Geneva seroprevalence study does not automatically mean it is correct. There are also many studies that used the study date for the number of deaths. For example:
https://www.who.int/bulleti...
https://www.medrxiv.org/con...
https://www.medrxiv.org/con...
However, I agree with you that the Santa Clara County study should be taken with a grain of salt, since the subjects were recruited via a Facebook ad and bias may thus have occurred. As I said, I understand the idea of taking a later date for the number of deaths. However, the associated problems regarding the underestimation of the infected, which I described in my previous reply, still remain.
It is still incomprehensible that you calculate a difference of 22-24 days, but then take a value 28 days after the study midpoint. This puts you 4-6 days later than your own calculation and thus automatically increases the IFR. Why do you elaborately calculate the difference of 22-24 days to determine the correct time, but then not use that value?
Let me give another example. Let's say we are testing at the peak of an infection wave. We then count all the deaths that occurred up to some later date, but we do not take into account that a large number of people still became infected after the study. Some of the counted dead will also have become infected after the study. Then we have recorded all the dead, but not all the infected. Or do you want to say that all the dead come from the first half of the infection wave and none from the second half (which would imply an IFR of 0% for the second half)? As you can see, it is problematic to take the number of deaths from a much later point in time, because the denominator of the quotient is then too small and the resulting IFR too high.
In general, only those deceased persons may be included who were clearly infected before the latest time at which study participants could have become infected. This is not the time of the study, since antibody tests only turn positive some time after an infection.
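To make the timing argument concrete, here is a minimal sketch with made-up numbers. The wave shape, the true IFR, and both lags (`SEROCONVERSION_LAG`, `DEATH_LAG`) are assumptions chosen for illustration and are not taken from the study; fixed lags are a simplification.

```python
import math

TRUE_IFR = 0.005          # assumed true IFR (hypothetical)
SEROCONVERSION_LAG = 14   # assumed days from infection to detectable antibodies
DEATH_LAG = 21            # assumed days from infection to death (fixed for simplicity)


def daily_infections(day, peak_day=60, width=15.0, peak_size=10_000):
    """Hypothetical bell-shaped infection wave."""
    return peak_size * math.exp(-((day - peak_day) ** 2) / (2 * width ** 2))


def cumulative_infections(up_to_day):
    """Total infections on days 0..up_to_day (inclusive)."""
    return sum(daily_infections(d) for d in range(0, up_to_day + 1))


def estimated_ifr(study_day, death_cutoff_offset):
    """IFR estimate when deaths are counted up to study_day + death_cutoff_offset."""
    # Only people infected early enough to have seroconverted are detected.
    detected = cumulative_infections(study_day - SEROCONVERSION_LAG)
    # Deaths observed by the cutoff stem from infections DEATH_LAG days earlier.
    deaths = TRUE_IFR * cumulative_infections(
        study_day + death_cutoff_offset - DEATH_LAG
    )
    return deaths / detected


study_day = 65  # shortly after the peak of the hypothetical wave

# Cutoff matched to the lags: deaths and detected infections cover the same people.
matched = estimated_ifr(study_day, DEATH_LAG - SEROCONVERSION_LAG)
# Later cutoff: deaths from infections that the serosurvey could not yet detect.
late = estimated_ifr(study_day, 28)

print(f"true IFR:             {TRUE_IFR:.4%}")
print(f"matched death cutoff: {matched:.4%}")
print(f"cutoff 28 days later: {late:.4%}")
```

With a matched cutoff the estimate recovers the assumed IFR; with the later cutoff the numerator also contains deaths from infections that are missing from the denominator, so the estimate comes out higher.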
Re: Re: PCR tests from countries with tracing programs
Is it really "PCR tests per confirmed case", and not "PCR tests per capita", that is the important parameter? Consider two example scenarios. In the first, we test every resident, and at that time 1% of the population is in a state where the PCR test is positive. We then know everyone's current status, but we only get 1 positive person per 100 tests performed. This testing data would then be excluded because the ratio of tests per positive case is too low, even though we tested everyone. Now consider the opposite case. We test in a country where we do not know exactly where and how many people are infected. We test in one region and assume that the result is transferable to the whole country. In fact this region is less affected than other regions; we just don't know it. We perform 10,000 tests and find 20 infected people, which gives a ratio of 1 positive test per 500 tests performed. That testing data would be included in your selection, even though the share of infected people in the country is actually higher. Therefore, "per confirmed case" is not the important parameter. If there is a high number of cases in a country, you could test everyone two or three times and know the situation very well, and the data would still be excluded. At the same time, studies with few tests, and thus high statistical uncertainty for the reasons mentioned earlier, can be included.
The comparison with South Korea is also problematic. 0 or 1 seropositive results are far too few to have any statistical significance; the statistical uncertainty is simply too high. And, as already mentioned, the results of these investigations cannot be transferred across the board to the other investigations.
Including reported case numbers from countries whose tracing systems you consider to work well leads to an overestimation of the IFR.
Re: Re: Study selection
I can understand that you screen out studies based on recruitment. I think that is statistically correct. I also see the danger that recruitment can prevent representative results. It is therefore understandable that you want to check which studies are useful and which are not.
Nevertheless, you sort out only the studies that report a low IFR and leave the studies with high values in your analysis. This leads to a shift toward the high values. Furthermore, studies that deviate upward are more problematic, because a larger shift is possible in that direction. Let's say there is a hypothetical virus with an actual IFR of 0.5%, and we have one study with a value of 0.3% and one with 1.5%. The high value is further away from the actual value and thus shifts the calculated value upward. With an actual IFR of 0.5%, you can in theory misestimate by at most 0.5 percentage points on the downside but by up to 99.5 percentage points on the upside. This is not surprising, because such distributions are right-skewed. If I remove both the study with the too-low value and the study with the too-high value, the actual value does not change. If I remove only the study with the too-low value, the calculated value shifts upward, because a stronger shift is possible in this direction. This leads to an overestimation of the IFR.
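The numbers from this example can be checked with a few lines. The sketch uses a plain unweighted mean as a stand-in for the pooled estimate, which is an assumption for illustration and not the metaregression used in the study.

```python
true_ifr = 0.5      # percent, hypothetical virus from the example
low_study = 0.3     # percent
high_study = 1.5    # percent

# Deviation of each study from the actual value.
low_error = true_ifr - low_study     # underestimate, in percentage points
high_error = high_study - true_ifr   # overestimate, in percentage points

# Largest possible errors in theory: an IFR is bounded by 0% and 100%.
max_downside = true_ifr - 0.0
max_upside = 100.0 - true_ifr

# Unweighted mean with both studies, and after dropping only the low one.
mean_both = (low_study + high_study) / 2
mean_without_low = high_study

print(f"errors: low {low_error:.1f} pp, high {high_error:.1f} pp")
print(f"max possible error: {max_downside:.1f} pp down, {max_upside:.1f} pp up")
print(f"mean of both studies: {mean_both:.1f}%")
print(f"mean after removing the low study: {mean_without_low:.1f}%")
```

Even with both studies included, the simple mean lies above the actual 0.5%; removing only the low study moves it further up.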
Re: Re: Adjustment of death rates for Europe due to excess mortality
You write in your reply that this is not relevant because reported deaths were used and not excess mortality. In Appendix Q you write:
"For example, the Belgian study used in our metaregression computed age-specific IFRs using seroprevalence findings in conjunction with data on excess mortality in Belgium". You may not have applied this to other studies. However, you are using a study that did. Accordingly, this is crucial and has an impact on your result.
Re: Re: Calculation of the IFR of influenza
You nevertheless calculate an age-specific IFR for COVID-19 and then calculate the overall IFR as it would look if infections were equally distributed across age groups, which in fact they are not. At the same time, you state an IFR for influenza which, as shown, you understate. Living conditions do not change within a short period of time, so the numbers remain comparable across years; it is therefore no problem to use the influenza IFR from several years. In this way you suggest that the numbers are comparable. But an IFR that assumes an equal distribution across age groups cannot be compared with an IFR that does not. Yet this is exactly what is being suggested. By the way, it is not only the media; it was also taken up by Dr. Drosten. The comparison is difficult for another reason as well: the influenza IFR comes from a situation in which vulnerable groups could already be protected to some extent by vaccination, and in which people may have gone through an infection in the past, which helps fight the disease and can lead to fewer problems. To be honest, however, one can argue that this is simply how the situation is, so I can understand if one nevertheless makes such a comparison. But then it should assume either an equal distribution over the age structure for both viruses or the actual distribution for both. There is one more problem: an estimated IFR is being compared with a measured one.
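A small sketch shows why the assumed age distribution of infections matters for the overall IFR. All numbers here (age groups, age-specific IFRs, population shares, attack rates) are invented for illustration and are not estimates for COVID-19 or influenza.

```python
# Hypothetical age-specific IFRs (as fractions) and population shares.
age_ifr = {"0-34": 0.0001, "35-64": 0.002, "65+": 0.05}
population_share = {"0-34": 0.40, "35-64": 0.40, "65+": 0.20}

# Two hypothetical infection patterns: the same attack rate in every age
# group versus a lower attack rate among the elderly.
attack_rate_equal = {"0-34": 0.10, "35-64": 0.10, "65+": 0.10}
attack_rate_unequal = {"0-34": 0.12, "35-64": 0.10, "65+": 0.05}


def overall_ifr(attack_rate):
    """Population-wide IFR = total deaths / total infections (per capita)."""
    infections = {g: population_share[g] * attack_rate[g] for g in age_ifr}
    deaths = sum(infections[g] * age_ifr[g] for g in age_ifr)
    return deaths / sum(infections.values())


ifr_equal = overall_ifr(attack_rate_equal)
ifr_unequal = overall_ifr(attack_rate_unequal)

print(f"overall IFR, equal attack rates:   {ifr_equal:.3%}")
print(f"overall IFR, unequal attack rates: {ifr_unequal:.3%}")
```

The same age-specific IFRs give different overall values depending on who gets infected, so two overall IFRs are only comparable if both rest on the same assumption about the age distribution.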
---------------------------------------------------
Additional comment
With the studies to date, it is very difficult to estimate how high the IFR actually is, because every method has problems. With antibody studies, the problem is that antibodies are not detectable in all infected people. With reported case numbers, the problem is the number of unreported cases. How could one calculate a clean IFR? By regularly testing a representative sample of the population. For example, one could test 1 per thousand of the population every week and see whether they are positive for COVID-19, and then follow how many of the positives die over time. The deceased could then be autopsied by default to determine whether they died from or with COVID-19. One must also define how long after infection a death still counts as a COVID-19 death. Is a person who died 10 months after infection still a COVID-19 death? It is mainly the elderly who are dying, and it is not unusual that they would have died over time even without an infection. Imagine that a 94-year-old dies 10 months after an infection. Can one then still say whether it was due to COVID-19? In this case, one would probably have to look at the medical history before and after the infection and at the symptoms the deceased had after the infection. Only with such a procedure is it possible to calculate a clean IFR. For proper comparability with influenza, the same procedure would also have to be used to calculate the influenza IFR. If you are really interested in a scientifically sound comparison of the IFRs, you should proceed in this way.