On 2020-12-02 21:30:36, user Alexis Rohou wrote:
I was asked by a journal to review this manuscript. Below is my review.
***
This manuscript explores the observation that Thon rings visible in amplitude spectra of micrographs decrease in amplitude as a function of spatial frequency (distance from the origin in F space) and that this decrease is more pronounced in micrographs collected with larger objective lens defocus.
Since the height of Thon rings from images of test specimens can be taken as an estimator of recoverable signal-to-noise ratio in experimental data recorded under identical conditions, this has led many practitioners to prefer to collect data as close to focus as possible. The dominant assumption in the field has been that the observed defocus-dependent contrast attenuation is due to imperfect spatial coherence of the electron source, but this manuscript provides compelling evidence that another phenomenon is responsible.
The authors note that a significant amount of signal is delocalized beyond the edges of the field of view and so cannot be recovered. Further, the authors point out that single-sideband (SSB) signal in the collected image (be it from features in the field of view but near its edges, or delocalized from features not present in the field of view), while it contributes power to the image, does not contribute to Thon rings because its amplitude is not modulated by the CTF.
I find the authors' evidence in support of this compelling:<br />
- experimentally, the nodes (local minima) between Thon rings do not reach the "noise floor" as would be predicted if all contrast in the image arose from phase contrast attenuated by a spatial-coherence envelope. Computationally, the authors show that this "Thon ring floor" is raised under conditions where more of the recorded image power consists of SSB signal (increased defocus or small field of view).<br />
- theory predicts that, at the fluences normally used in cryoEM, the spatial coherence of the illumination supplied by modern electron sources is such that one would not expect significant defocus-dependent attenuation effects<br />
- most compelling, the relative intensity of Thon rings in actual images is well predicted by the fraction of image features for which signal for both side bands is recorded (Fig 4)
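For readers who want to put numbers on the spatial-coherence argument in the second point, here is a minimal sketch of the textbook spatial-coherence envelope, E_s(k) = exp(-π² α² (Cs λ² k³ - Δz k)²). It is my own illustration, not the authors' calculation. It assumes that α is the 1/e half-width of a Gaussian source distribution and that underfocus is positive. The spherical aberration and the semi-angles below are placeholder values, not numbers taken from the manuscript.

```python
import math

def electron_wavelength_angstrom(kv: float) -> float:
    """Relativistic electron wavelength in angstrom for a voltage in kV."""
    v = kv * 1e3
    return 12.2643 / math.sqrt(v * (1.0 + 0.97845e-6 * v))

def spatial_coherence_envelope(k, defocus_a, cs_a, alpha_rad, wavelength_a):
    """Envelope E_s(k) for spatial frequency k (1/angstrom).

    defocus_a and cs_a are in angstrom, alpha_rad is the illumination
    semi-angle in radians (assumed Gaussian 1/e half-width).
    """
    grad = cs_a * wavelength_a**2 * k**3 - defocus_a * k  # dimensionless
    return math.exp(-(math.pi * alpha_rad * grad) ** 2)

if __name__ == "__main__":
    lam = electron_wavelength_angstrom(300.0)  # 300 kV, assumed
    cs = 2.7e7                                 # 2.7 mm in angstrom, assumed
    k = 1.0 / 2.3                              # the 2.3 angstrom feature
    for alpha in (1e-6, 5e-6, 2e-5):           # placeholder semi-angles
        for defocus_um in (1.0, 8.0):
            e = spatial_coherence_envelope(k, defocus_um * 1e4, cs, alpha, lam)
            print(f"alpha={alpha:.0e} rad, defocus={defocus_um} um: E_s={e:.3f}")
```

Because the exponent scales with α², the envelope at a given defocus is very sensitive to the assumed semi-angle, so the argument depends on the semi-angle that is realistic for the illumination actually used.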
My only significant reservation with this manuscript is about the "messaging", and specifically this sentence of the abstract: "The principal conclusion is that much higher values of defocus can be used than is currently thought to be possible". <br />
While the authors have convinced me that the negative effects of defocus were misunderstood and overstated, their claim that higher defocus could be used with no ill effect should be qualified (preferably in the abstract, and in the main text) to make it clear that they are referring only to the imaging part of the experiment, and not to the image processing part. There, high defocus values would force users of most packages to use very large box sizes at various stages of the process, creating unusually large computational burdens, and other problems may occur. If the authors want to keep the claim as is, they should add experimental results that support it, e.g. high-resolution apoferritin reconstructions obtained from both low and high defocus datasets, along with characterization of the mean SSNR, ResLog plot, or similar, in each case. In my opinion, it is probably better to keep the paper more or less as is and just qualify this claim.
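To illustrate the box-size concern, here is a rough sketch of how far signal at a given resolution is displaced from the particle, using the standard estimate |λ Δz k - Cs λ³ k³| (the CTF phase gradient divided by 2π). The function names, the 0.0197 Å wavelength (300 kV), the 120 Å particle diameter and the 1 Å pixel size are my own illustrative assumptions, not values from the manuscript.

```python
import math

def delocalization_a(resolution_a, defocus_a, wavelength_a=0.0197, cs_a=0.0):
    """Displacement (angstrom) of signal at the given resolution.

    Uses |lambda*dz*k - Cs*lambda^3*k^3| with k = 1/resolution.
    cs_a defaults to 0, which neglects spherical aberration.
    """
    k = 1.0 / resolution_a
    return abs(wavelength_a * defocus_a * k - cs_a * wavelength_a**3 * k**3)

def min_box_pixels(particle_diameter_a, resolution_a, defocus_a,
                   pixel_size_a, wavelength_a=0.0197, cs_a=0.0):
    """Smallest box edge (pixels) holding the particle plus delocalized signal."""
    spread = delocalization_a(resolution_a, defocus_a, wavelength_a, cs_a)
    extent_a = particle_diameter_a + 2.0 * spread  # signal spreads on both sides
    return math.ceil(extent_a / pixel_size_a)

if __name__ == "__main__":
    # Hypothetical case: ~120 angstrom particle, 2 angstrom target, 1 angstrom pixels
    for defocus_um in (1.0, 2.0, 4.0, 8.0):
        box = min_box_pixels(120.0, 2.0, defocus_um * 1e4, 1.0)
        print(f"defocus {defocus_um} um -> box of at least {box} pixels")
```

Under these assumptions the displacement grows linearly with defocus, from roughly 100 Å at 1 µm to roughly 800 Å at 8 µm, which is why the box size, and with it memory and compute, grows quickly.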
Beyond that, I have more minor suggestions / questions.
(1) Abstract: I'd encourage the authors to consider removing the sentence about correcting mag distortion ("We also show (...) many orientation"). If I understood correctly, this becomes significant only at very large defocus, and only if averaging spectra to a 1D curve before fitting. For these reasons, I think this is a rather minor point of the paper. In the context of the abstract, I think this aside distracts from the main message.
(2) Abstract: "and Ewald sphere correction". Perhaps I missed it, but I don't recall reading in the main text an explanation of why defocus should allow for better Ewald sphere correction, or a demonstration that this is the case. I suggest removing this from the abstract, or adding text explaining this, or a citation to a reference that does (on that note, after a quick re-read of Russo & Henderson 2018, I also don't see an obvious demonstration there that higher defocus yields better Ewald sphere curvature correction, but I'd happily stand corrected).
(3) Page 3: "This is because compensating information, which unfortunately is of no use, may enter the image from features that are outside the field of view." On first read, this sentence confused me - I think because the phrase "compensating information" threw me off. How about something like "This is because unrelated single-side-band signal delocalized from features outside the field of view may enter the image."?
(4) Page 4: "Since delocalized (...) high defocus values to record images (Russo and Henderson 2018b)". I think that for readers who, like me, are not well versed in the optics and maths of SSB imaging, this statement is difficult to understand. Could it be explained a little further / clarified? To spell out my confusion: why does the feasibility of recovering SSB information even in the absence of the Friedel mate mean that it should be advantageous to operate at higher defocus?
(5) Same paragraph ("We note that information in (...) become greatly reduced"). This whole paragraph argues (I think) that collecting highly defocused images is OK, yet wasn't one of the points of Downing & Glaeser (2008), cited in this paragraph, that the larger the defocus, the more CTF correction schemes or Wiener filters fail at retrieving all of the information (due to the "twin image" problem)? My apologies if I'm misunderstanding. If that's the case, perhaps other readers will also need a bit more hand-holding through this paragraph.
I loved all the detail poured into M&M, so I suggest specifying further:<br />
(6) Page 5: "annular zones of 1 reciprocal-space pixel" - how was interpolation done here? Nearest neighbor?<br />
(7) Page 5: "floated" - I assume this means adding a constant so that the average value is zero?<br />
(8) Page 6: "Smooth curve" - fix capitalization. Also, what kind of smooth curve?
Results:<br />
(9) Page 6: "The integrated power at 2.35 Å" - measured how? In real space in the white box?<br />
(10) Page 6: "(67% of intensity)" - 67% of which intensity?<br />
(11) Page 6: "~0.23 nm" - to guide the eye, please add a second x axis in figure 2, or replace the existing one, so that we can look for the 0.23 nm feature.
(12) Page 7: "The mean value of this noise spectrum can be regarded as the "zero baseline" for the power spectra of images recorded with a specimen". This noise floor will rise as a function of the number of electrons incident upon the detector. The choice of illumination condition when collecting "no-object"/"beam-only" images for these experiments is therefore important. I assume that the authors used the same illumination conditions as had been used in the actual experiment with a specimen. Is this correct? Either way, could the authors briefly mention somewhere what illumination conditions were used for this? <br />
-- I expect that using the same illumination condition would lead to an overestimate of the height of the noise floor. Indeed, during experiments with specimens, some fraction of electrons will be lost to apertures, leading to an overall decrease in the average number of electrons reaching the detector. One may thus expect the actual noise floor in "with-specimen" experiments to be even lower, perhaps making the authors' point even more striking.
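As a small illustration of why the illumination conditions matter for point (12), the sketch below simulates specimen-free images on an idealized counting detector and measures the flat level of their power spectra. It assumes pure Poisson shot noise with no DQE falloff or coincidence loss, which is a simplification of any real detector. The function name and the count values are hypothetical.

```python
import numpy as np

def shot_noise_floor(mean_counts, shape=(128, 128), seed=0):
    """Mean power-spectrum level of a specimen-free, Poisson-limited image."""
    rng = np.random.default_rng(seed)
    image = rng.poisson(mean_counts, size=shape).astype(float)
    image -= image.mean()  # remove the DC term
    # Normalizing by the pixel count makes the mean power equal the
    # pixel variance (Parseval's theorem).
    power = np.abs(np.fft.fft2(image)) ** 2 / image.size
    return power.mean()

if __name__ == "__main__":
    for counts in (10, 20, 40):  # hypothetical electrons per pixel
        print(counts, round(float(shot_noise_floor(counts)), 2))
```

In this idealized model the floor is proportional to the mean number of electrons per pixel, so any electrons lost to the specimen or to apertures would lower the floor accordingly.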
Discussion:<br />
(13) Page 7: "did not prevent images at 8 um defocus from being recoded at a resolution of 1.44 Å". Is this shown somewhere? Fig 1C shows 1.3 um defocus, not 8 um.
(14) Figure 2a: could the X axis be relabeled, or also labeled with spatial frequency in nm⁻¹ or Å⁻¹? This would help locate the 3.5 Å bump mentioned in the discussion.
(15) Suppl Figs 4 and 5: here also, having a second X axis, or a second set of labels with spatial frequencies would be helpful.
(16) Figures S4 and S5: The lower bound of the Thon rings is "raised" with increased defocus, as predicted by the increase in SSB signal, but why is this lower bound so much higher at around 0.5 Nyquist, while remaining low at the origin and edges of F space? Is this predicted by the model? Does it correspond to the FT of the shape of the circular mask used in generating the simulated images?
(17) Page 9: "to interference between the contributions (...) which is 2a". This sentence reads as though the two SSB beams are interfering constructively or destructively with each other. Unless I'm mistaken the interference is between the scattered beams and the unscattered beam, is it not? That's certainly what the next sentence seems to say.
(18) Page 9: "The persistence of lattice images within (...) displaced from the particle". Likely because of my lack of expertise, and specifically because I do not know what the "coherence diameter" is, this sentence was lost on me.
(19) Page 10: "We note that this behavior is different (...) envelope function". For completeness, how about adding a supplementary plot overlaying the observed behavior (as in Fig 4) and the prediction from the spatial coherence (at whatever beam characteristics best fit the data, to point out perhaps that an unrealistic illumination semi-angle would be needed to fit the data)? This would help readers like myself who are not quite certain what one would expect such plots to look like if spatial coherence were really at play here.
(20) On the subject of Figure 4, I am curious about why the last few points of the 2.3 Å series seem so far off the prediction. The authors made a point of saying that the power spectra were so oversampled that even at that frequency, they had 3 pixels sampling each ring. So why the discrepancy, if not undersampling/aliasing? This made me curious: what would an equivalent plot from the simulation data look like? Would the Thon ring amplitudes from this synthetic experiment be a closer match to the predictions (dashed lines in Figure 4)? If not, perhaps this mismatch is due to poor sampling of these very fine rings at high defocus after all?
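To make the sampling question in point (20) concrete, here is a rough sketch of how many Fourier-space pixels span one Thon ring period at a given resolution. It uses the fact that CTF² repeats each time the phase χ advances by π, so the local period is 1 / (2 |λ Δz k - Cs λ³ k³|). The function name, the 4096-pixel transform, the 1 Å pixel size and the 0.0197 Å wavelength are my own assumptions, not the authors' values, and any padding applied before the transform would change the answer.

```python
def thon_ring_period_pixels(resolution_a, defocus_a, n_pixels, pixel_size_a,
                            wavelength_a=0.0197, cs_a=0.0):
    """Fourier-space pixels per Thon ring period at the given resolution.

    Lengths are in angstrom. n_pixels is the edge of the (padded) transform.
    Requires a nonzero defocus or Cs so that the phase slope is nonzero.
    """
    k = 1.0 / resolution_a
    # (1 / 2 pi) * d(chi)/dk, with chi = pi*lambda*dz*k^2 - (pi/2)*Cs*lambda^3*k^4
    slope_a = abs(wavelength_a * defocus_a * k - cs_a * wavelength_a**3 * k**3)
    period_inv_a = 1.0 / (2.0 * slope_a)  # ring period in 1/angstrom
    fourier_pixel_inv_a = 1.0 / (n_pixels * pixel_size_a)
    return period_inv_a / fourier_pixel_inv_a

if __name__ == "__main__":
    for defocus_um in (1.0, 4.0, 8.0, 16.0):
        n = thon_ring_period_pixels(2.3, defocus_um * 1e4, 4096, 1.0)
        print(f"defocus {defocus_um} um: {n:.1f} pixels per ring at 2.3 angstrom")
```

A value approaching 2 pixels per period would mean the oscillation is close to the sampling limit, in which case the measured ring amplitude would be damped by the sampling alone.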
Summary and conclusions<br />
(21) Here might be a good place to formulate some caveat about the practicalities of processing data collected at very large defocus.
Figures & supplements<br />
(22) Figure S5: this would seem to argue strongly against evaluating the power spectrum using patches. Would the authors agree? If so, how about mentioning it in passing somewhere? The optimal way to compute power spectra for the purpose of CTF parameter fitting is still a topic being discussed in the literature of late, and this observation would seem to be relevant.