
Talk:COVID-19 pandemic in the United States

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by 67.169.166.36 (talk) at 00:19, 30 July 2020 (Number of US cases by date). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.


Recovered cases in the United States

Hi, I would like to ask why the number of recovered patients in the United States differs between the bar graph and the epidemiology overview chart. --User:42.60.88.59 Revision as of 15:58, 15 June 2020

Statistics and weekly periodicity?

The plots showing the number of new COVID-19 cases and the number of deaths show a fairly clear periodicity with a ~7 day period. If there is a reason for this, I think mentioning it in the article would be useful. Is it something to do with when people get tested and their weekly schedule? When they are more likely to interact with others (e.g. on weekends) or something else?

Need to cite sources, esp. in "Number of U.S. cases by date" and "Progression charts" sections

I have searched in vain for sourcing for the critical cases/deaths/etc charts in these two (sub)sections. Specific inline citations with live links should be provided for each chart. Can someone please add them, for WP:Verifiability? —RCraig09 (talk) 22:37, 14 July 2020 (UTC)[reply]

Sources:
But you are right, there are no sources indicated above or under the charts you mentioned, and people who update the charts should ideally fix that. A plain link to the source like this[1] would go a long way to help. --Dan Polansky (talk) 10:05, 16 July 2020 (UTC)[reply]
@Rider0101: (I saw you made updates of progression charts): What are the sources for the data in the progression charts? --Dan Polansky (talk) 08:22, 18 July 2020 (UTC)[reply]
There is Template:COVID-19 pandemic data/United States medical cases, and it has sources indicated in the table row "State Sources", on a per state basis. Perhaps the US level aggregates are taken from this table. --Dan Polansky (talk) 10:20, 23 July 2020 (UTC)[reply]

Daily charts are broken

The daily case/death charts, x axis, is now labelled only with months. But the labels such as April and May do not correspond to April 1, May 1, etc. This can be easily seen by looking at the charts from a few days ago, which had day-of-month numbers. 67.169.166.36 (talk) 09:40, 17 July 2020 (UTC)[reply]

I confirm the problem; the plots can be compared to those available at worldometers. I saw similar problems of bad x-axis labeling for x-axis of the date type in charts I was making on Wikiversity. It seems to happen once the number of data points exceeds some number. Perhaps someone would be inclined to open a ticket against the graphing add-in, but the issue may lie deeper, in the graphing library.
In the meantime, the issue could be addressed by a workaround: reduce the number of data points e.g. by dropping the starting values, in February and early March. When I dropped the first 10 values (Feb 26-Mar 6), the problem disappeared. I could drop that data but let me note that as new data points are added, more values will need to be dropped from the beginning. --Dan Polansky (talk) 07:13, 18 July 2020 (UTC)[reply]
I went ahead and dropped the first 10 values from multiple charts to fix the above so they now start at Mar 7. It helped. The visual information loss seems tolerable. But it is not ideal. --Dan Polansky (talk) 07:34, 19 July 2020 (UTC)[reply]

The fix Dan made works (thank you!), but as he said it's only temporary. I am completely unfamiliar with the charting tools, so I suggest someone more experienced should open a bug ticket.

After a second cup of coffee, I now see that my original report duplicated that from TrilliumLady a few hours before me.

I also now see the same (?) bug is affecting other charts on this page, but less obviously. As of today (July 21) both the >100,000 and 50,000-100,000 cases charts are depicting data points out to about July 28 (7 days in the future).

Another Talk poster mentioned the obvious 7-day periodicity in the daily charts. It seems pretty clear that some people in the reporting chains have ordinary 5-day work weeks.

The daily charts were very informative back in March when things changed rapidly. Today, things change slowly. So maybe the daily chart for Feb-May could be made static, as an archive, and then a new current chart could show weekly averages? That would reduce the data point count (1/7x), avoiding (for now) the charting bug, and it would hide the artificial weekend lag too. 67.169.166.36 (talk) 10:37, 21 July 2020 (UTC)[reply]

I fixed the two charts in "Number of U.S. cases by date" section as well, and even anonymous IP users can do that. Having weekly numbers sounds interesting; I am not sure how comfortable the updaters would be with that idea since their data sources probably show daily values. (The updaters still did not tell us what their sources are.) --Dan Polansky (talk) 11:48, 21 July 2020 (UTC)[reply]
The plots would ideally be smoothed by applying 7-day moving average, and I could do that but I do not know how the updaters would cope with that. Calculating the average of 7 values is easy, but it is harder than copying a value from one location to another. Ideally, the plotting framework would allow something like "y2=sma(y1, 7)", and calculate that automatically; "sma" stands for "simple moving average". --Dan Polansky (talk) 12:02, 21 July 2020 (UTC)[reply]
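A minimal sketch of the "sma(y1, 7)" idea and of the weekly averages suggested earlier in this thread, in plain Python. The function names `sma` and `weekly_means` and the sample counts are illustrative assumptions; nothing here is a feature of the charting tool used in the article.

```python
def sma(values, window=7):
    """Trailing simple moving average.

    Returns a list the same length as `values`; the first `window - 1`
    entries are None because a full window is not yet available.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    smoothed = []
    running_total = 0.0
    for i, value in enumerate(values):
        running_total += value
        if i >= window:
            # Drop the value that just left the trailing window.
            running_total -= values[i - window]
        smoothed.append(running_total / window if i >= window - 1 else None)
    return smoothed


def weekly_means(values, days_per_week=7):
    """Average each complete block of 7 consecutive days.

    A trailing partial week is dropped.
    """
    full_weeks = len(values) // days_per_week
    return [
        sum(values[w * days_per_week:(w + 1) * days_per_week]) / days_per_week
        for w in range(full_weeks)
    ]


# Hypothetical daily counts with a weekend dip (not real data).
daily = [100, 110, 120, 115, 105, 60, 50] * 2
print(sma(daily))           # one smoothed point per day once 7 days exist
print(weekly_means(daily))  # one point per week, i.e. 1/7 of the data points
```

The moving average keeps one point per day, so it hides the weekly reporting pattern but does not reduce the number of data points; only the weekly means would do both.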
The charts were significantly changed, in particular their size, which now makes their utility significantly less. I would recommend a larger size for the charts, especially those with the data of many states. Jaedglass (talk) 05:55, 24 July 2020 (UTC)[reply]
I set the width=700 from 900 previously (diff) because the charts were too wide and now they look perfectly fine on my 15" screen, not small at all, with no reduced utility. What screen size are you using or what kind of device that they seem too small to you? (In fact, I would be happy with width=500. And there are larger screens than 15" but also smaller screens.) --Dan Polansky (talk) 08:33, 25 July 2020 (UTC)[reply]
15.5 inch Dell Inspiron set with recommended resolution of 1366X768, and given the number of lines on the chart, at the current size the charts are not meaningful. Jaedglass (talk) 07:22, 26 July 2020 (UTC)[reply]
I have a similar screen size and resolution to the above. The charts and their level of detail appear meaningful. The number of horizontal lines appears sufficient. However, the number of vertical lines may be low if the chart starts showing month boundaries, which it does if the number of points exceeds a certain threshold, it seems; right now, the "Number of U.S. cases by date" charts have a good number of vertical lines, which was ensured by someone by removing some data points at the beginning. And right now, e.g. the "No. of new daily cases" chart has very few vertical lines, which is fixed not by increasing the chart size but rather by dropping several initial data points. --Dan Polansky (talk) 17:14, 27 July 2020 (UTC)[reply]

Chart of active cases, compared to other countries

US has the highest number of active cases within the world. Would it be useful to add a chart, comparing its trend with other most affected countries? --Traut (talk) 09:19, 18 July 2020 (UTC)[reply]

COVID-19 Active Cases per 100 000 population

The above is misleading in so far as it disregards growth in number of tests; the misleading effect is particularly pronounced for Sweden and the US; I don't know about Brazil. Comparing test positivity rates would be more useful. The above chart does give some idea and is not completely useless, but one has to keep in mind that there was significant growth in test counts, and it is really hard to keep that in mind for most readers. --Dan Polansky (talk) 11:56, 18 July 2020 (UTC)[reply]

(outdent) Below, let me replot the chart in a way that gives a very different picture for the U.S.:

Test positivity rate for US, calculated from Our World in Data, from owid-covid-data.csv[2], smoothed via 7-day moving average:

--Dan Polansky (talk) 12:09, 18 July 2020 (UTC)[reply]
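A minimal sketch of how a smoothed test positivity rate like the one described above could be computed. It assumes that positivity is the 7-day sum of new cases divided by the 7-day sum of new tests; whether the plotted chart smoothed the ratio or the inputs is not stated, and the function name and sample numbers are hypothetical.

```python
def positivity_rate(new_cases, new_tests, window=7):
    """Smoothed test positivity: trailing-window cases / trailing-window tests.

    Entries are None until a full window is available, or when no tests
    were reported in the window.
    """
    if len(new_cases) != len(new_tests):
        raise ValueError("new_cases and new_tests must have the same length")
    rates = []
    for i in range(len(new_cases)):
        if i < window - 1:
            rates.append(None)
            continue
        cases = sum(new_cases[i - window + 1:i + 1])
        tests = sum(new_tests[i - window + 1:i + 1])
        rates.append(cases / tests if tests > 0 else None)
    return rates


# Hypothetical data: daily cases double in the second week,
# but so does the number of daily tests.
cases = [10] * 7 + [20] * 7
tests = [100] * 7 + [200] * 7
rates = positivity_rate(cases, tests)
print(rates[6], rates[13])  # positivity is the same at the end of both weeks
```

In this made-up series the case count doubles while positivity stays flat, which is the effect of test growth that the comment above is pointing at.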

Of course, the more you test, the more cases you will find. But there's already a graph, including tests and cases. The suggested graph here is how US does, compared to other countries with current or former major outbreaks. Highest test rates you may find in UAE (many cases), high test rates in Bahrain (many, many cases), lower test rates in Qatar or Chile (many, many, many cases) - and how would you like to compare the quality and amount of testing between US and Brazil? So you will need a certain simplification when you do compare US with other countries. Here's another comparison of cases and deaths - selected by countries with most cases. Qatar with 3750 cases/100 000 population, Bahrain with 2170 cases/100k or Chile with 1750 cases/100k are beyond the grid. But deaths in Qatar and Bahrain are surprisingly low. I don't trust the data from worldometers.info very much. But for a comparison of tests per population, their numbers work sufficiently well. See the charts below - and sorry, for this quick comparison I did not bother to select the same chart colors --Traut (talk) 12:51, 18 July 2020 (UTC)[reply]
COVID-19 cases per 100 000 population from countries with the most cases
COVID-19 deaths per 100 000 population from countries with the most cases
I believe an intercountry comparison of daily test positivity rates is much more meaningful than an intercountry comparison of daily case counts. Such a comparison is meaningful even if test regimes are vastly different. In case of doubt, we should refrain from publishing country comparison charts; we should take pains not to mislead. Let me emphasize that I am not talking of test rates; I am talking of test positivity rates, that is, the ratio of daily new cases to daily new tests. worldometers.info data is probably generally okay, but they have the grave defect that it does not show daily test count alongside daily case count, as far as I know; but I do not see what worldometers.info has to do with the present proposal. --Dan Polansky (talk) 13:05, 18 July 2020 (UTC)[reply]
As for intercountry comparison of covid-coded deaths, that suffers from different covid-coding between countries and different testing regimes; better compare excess deaths, available for the US on US level and state level and for many European countries. --Dan Polansky (talk) 13:11, 18 July 2020 (UTC)[reply]
So obviously you care about test rates - but I don't. It was your suggestion to talk about tests, I didn't. But you can go to [worldometers] and sort the table by Tests/1M pop and find out about the total number of tests per country. This may indicate why some countries have high case numbers, because they have high test numbers. Take e.g. Gulf Daily News who claim 40% of tested people in Bahrain .[3]. But since you do not have exact and comparable data for those countries, you have to take what you got. The assumption of active case numbers is already a simplification, because hardly any country does test for recovered people. Here in my chart the assumption is that people are healthy 14 days after test, if they did not die. Official quarantine recommendations are down to 10 days instead by now. Once again: you can't compare countries by test numbers, because you do not have that kind of information. But you have official case counts (mine are from ECDC) and reasonable population counts (2018-12-31). --Traut (talk) 13:26, 18 July 2020 (UTC)[reply]
Worldometers does not plot daily test counts. For instance, on page https://www.worldometers.info/coronavirus/country/us/ it plots daily case counts but it does not plot daily test counts, so the page is gravely misleading.
Plotting daily case counts between countries while not accounting for growth of test rates is obviously misleading, and this issue cannot be addressed by saying "I don't care about test counts". It is not about what particular editors care about; it is about what is a fair representation and comparison and what is misleading the reader.
The reader can obtain intercountry comparisons of test positivity rates at https://ourworldindata.org/grapher/positive-rate-daily-smoothed?tab=chart; there is a default list of countries but more countries can be added by clicking on "Add country". --Dan Polansky (talk) 13:43, 18 July 2020 (UTC)[reply]
And the above chart that plots deaths for countries with most cases is misleading by choice of countries: why not show top countries by deaths per 100 000 pop? This way, Sweden artificially looks the worst, which it isn't. The reader can get a more relevant picture at File:COVID-19 Outbreak World Map Total Deaths per Capita.svg. For an accurate and relevant picture of Sweden, the reader is well advised to consult the charts at COVID-19 pandemic in Sweden#Additional data, charts and tables. --Dan Polansky (talk) 14:17, 18 July 2020 (UTC)[reply]
That's because the second chart does show the same countries as the first chart. It is not supposed to show the countries with the most deaths, but the deaths of the countries with the most cases. But again, the main question here is how to compare the number of active cases per capita for US vs. other highly affected countries. --Traut (talk) 14:59, 18 July 2020 (UTC)[reply]
Active cases per capita is 1) not reliably known, 2) distorted by different testing regimens, 3) not particularly relevant; better drop the comparison. Daily progression of test positivity rate is known for many countries, and is relevant. --Dan Polansky (talk) 15:06, 18 July 2020 (UTC)[reply]
As for "deaths of the countries with the most cases", I don't see how this is relevant. --Dan Polansky (talk) 15:07, 18 July 2020 (UTC)[reply]

Thanks for the feedback, Dan, you made your point, but you have a very different idea of what you want. Now I'd like feedback from other people who understand my idea to compare how US performs, compared to other countries. --Traut (talk) 15:45, 18 July 2020 (UTC)[reply]

Sure. For others: please read the above discussion and think of the intellectual duty of not misleading the reader. From what I can see, my substantive objections were not addressed above; rather, it all became very subjective, such as "I don't care" or "you have a different idea"; it should have been "X is fair enough", "Y is misleading", and the like. --Dan Polansky (talk) 15:59, 18 July 2020 (UTC)[reply]

I happened across this discussion, saw additional opinions had been sought and decided to voice mine. If I ought not to have because of my lack of understanding or such, I hope you'll forgive my temerity. Additionally, on reading the discussion, I recognized Traut's name, and, in case it's perceived as bias, I'm declaring that I contributed to a discussion in which Traut proposed similar graphs for another article, and broadly supported that point. However, I did not in any way come to this page because of Traut: I was looking at the article, then the talk page, then this discussion, and only then did I recognize Traut's name.

As to be expected from a third voice, I agree with some points from each commenter. As I see it, the discussion has the following main points: 1, is some sort of comparison graph useful, and 2, if so, what graph would best maximize readers' comprehension and minimize their being misled.

On the first topic, I wholeheartedly agree that a comparison graph is useful. An oft-mentioned concept in the communication of mathematics is whether a number is large. Without such a comparison a reader cannot easily tell whether the number of cases is large because of a relatively high number of infections or simply because the USA is a populous country.

On the second topic, I agree with Dan Polansky that the better statistic to chart is excess deaths, presumably per 100,000 people or similar. I think Dan Polansky is right about high testing rates misleading a reader but wrong about test positivity being a useful statistic. On positivity, as I understand it, even two countries with similar populations and numbers of infected might have completely different positivity rates just because one tests many more people than the other, so I don't see that as an illuminating statistic.

Also on the second topic, ideally the choice of comparison regions ought to have some clearly stated criteria: regions with similar area, GDP, population, population density or Covid policy. Other countries have much larger or smaller populations, so comparing the USA with the EU or ASEAN might be more appropriate. I can't offer any good concrete suggestions for that though.

In conclusion, I think Traut is right that a comparison is useful and Dan Polansky is right that the better statistic is excess deaths over average deaths, presumably per 100,000 people.

68.96.208.77 (talk) 17:32, 24 July 2020 (UTC)Constructive Feedback[reply]
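A minimal sketch of the "excess deaths over average deaths, per 100,000 people" statistic mentioned above. It assumes the baseline is the plain average of the same weeks in earlier years, which is only one of several possible baselines; the function name and all numbers are hypothetical.

```python
def excess_deaths_per_100k(observed, baseline_years, population):
    """Weekly excess deaths per 100,000 people.

    observed:       weekly all-cause death counts for the year of interest
    baseline_years: list of earlier years, each a list of weekly counts
                    aligned week by week with `observed`
    population:     population of the region
    """
    if population <= 0:
        raise ValueError("population must be positive")
    # Baseline for each week = mean of that week across the earlier years.
    baseline = [sum(week) / len(week) for week in zip(*baseline_years)]
    return [
        (obs - base) * 100_000 / population
        for obs, base in zip(observed, baseline)
    ]


# Hypothetical region of one million people, two weeks of data.
observed = [1200, 1500]
earlier = [[1000, 1000], [1100, 900]]
print(excess_deaths_per_100k(observed, earlier, 1_000_000))
```

Dividing by population is what makes regions of different size comparable, which is the concern raised above about the USA being a populous country.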

Dropping daily recoveries

I would be happy to drop the chart with daily recoveries. The data seems to be more noise than signal, and is not very important, I think. What would be important are daily hospitalizations, but we do not have that in the article. --Dan Polansky (talk) 08:55, 21 July 2020 (UTC)[reply]

I added weekly hospitalizations. What are the daily recoveries good for? Maybe I just lack the proper imagination for daily recoveries. Is anyone using them to calculate daily current cases (not new cases) and is that meaningfully reliable? --Dan Polansky (talk) 18:47, 21 July 2020 (UTC)[reply]
Notionally, comparing graphs of daily recoveries to daily new cases and daily deaths lets one build a picture how the treatment period changes over time, but the data for daily recoveries does not look to be of very high quality, which... limits its value. pauli133 (talk) 15:58, 23 July 2020 (UTC)[reply]
pauli133, do we at least know where that data is sourced from? --Dan Polansky (talk) 17:41, 23 July 2020 (UTC)[reply]
There's some handwaving about covidtracking and worldometers, but no actual reference, and I don't immediately see this data at either site. Whoever is adding it gets the numbers from SOMEWHERE, but if that's the CDC or random.org, I'm not yet sure. pauli133 (talk) 18:03, 23 July 2020 (UTC)[reply]

Number of US cases by date

Could we get these two charts sorted by highest/latest value? Right now they're sorted by... I don't even know what.

I'm happy to swap things around, but I don't want to step on toes belonging to any current maintainers. pauli133 (talk) 15:55, 23 July 2020 (UTC)[reply]

Do you mean ... sorting the legend? It would be ideal if the legend order matched the ranking on the most recent date - closest to the legend. Notably, about 1 in 10 men cannot distinguish all 12 of those colors, so this change would make the chart more accessible. However, if this change made the assignment of colors to states vary from day to day, that would be confusing for frequent readers. 67.169.166.36 (talk) 11:30, 25 July 2020 (UTC)[reply]

I went ahead and implemented, so you can see the result. The source is sorted alphabetically, and the y values (the legend) are sorted in order of cases, to match the endpoint on the graph. Should be easier to edit AND to read now. pauli133 (talk) 17:40, 29 July 2020 (UTC)[reply]

Looks great! 67.169.166.36 (talk) 00:19, 30 July 2020 (UTC)[reply]

Exponential fit

For July an exponential fit to the COVID-19 [laboratory-confirmed] cases in the United States table's mortality figures in the article is very good (deaths ≃ 120757.5035 × exp(0.005817882889 × day) [with day 1 = 8 July]; maximum discrepancy 421 out of 125,000 for period 8-22 July).

Extrapolating that fit (straightforward mathematics, absolutely no opening for anything subjective) gives 241,317 deaths for 2 Nov 20. Pol098 (talk) 16:14, 23 July 2020 (UTC)[reply]

Why would you want to do exponential fit when there is no current exponential growth of total deaths? I don't understand. Nor is there exponential growth of total cases or current hospitalizations. And what would be the underlying epidemiological model, unlimited exponential growth? --Dan Polansky (talk) 18:01, 23 July 2020 (UTC)[reply]
An exponential of form y = a × exp(b × x) gives a very good fit (curve-fitting gives deaths≃120757.5035 × exp(0.005817882889 × day) [day 1 = 8 July]); it matches the 15 data points with a maximum discrepancy of 241 (out of about 125,000). In terms of a graph: if number of deaths is plotted against date, with deaths on a logarithmic scale (intervals 1-10, 10-100, 100-1000, etc. equally spaced), the line is almost straight for the recent past, and can be well-fitted by the exponential form. There is no detailed modelling, underlying epidemiological or other, or indeed assumption involved, it is simple mathematical extrapolation.

While July has been quite closely exponential, the figures may well drop—or rise—beyond what simple extrapolation says. The extrapolated figures for 23 July and days following are 132,538; 133,311; 134,089; 134,872; 135,659; 136,450. Compare them with the numbers as they are added to the article. I don't call this a "prediction", it's just extending the curve as it was in the past 15 days.

For comparison I did this calculation on 16 July, with dates from 1-15 July; the extrapolated figure for 22 July was 126,196. The actual figure posted in the article today for 22 July was 126,511. There's absolutely no personal interpretation here; anyone can replicate this with the figures from the article today and on 16 July. HTH, Pol098 (talk) 18:31, 23 July 2020 (UTC)[reply]
I'm not 100% sure what you're trying to do, but: just because anyone can replicate it, does not mean that it is automatically our place to do so. If a reliable source makes a projection, we can quote it. pauli133 (talk) 21:23, 23 July 2020 (UTC)[reply]
Exponential extrapolation done without a critical attitude has already produced enough of a fiasco, has it not? Let's not do more of it. --Dan Polansky (talk) 08:35, 25 July 2020 (UTC)[reply]
"a projection" I was very clear that this is purely a mathematical exercise, not a prediction or projection. In fact, if I had to bet on it, I'd expect precautionary measures to kick in and make the actual figure months hence somewhat lower than the extrapolation (my point is simply that growth 1-22 July was demonstrably exponential). There are sources that talk about the growth and its exponential nature; my comment was to provide numbers to help anyone working with sources that provide words without numbers. My extrapolation has already diverged somewhat for 23-24 July; the actual figures are significantly higher than the extrapolated figures I gave. HTH, Pol098 (talk) 11:43, 25 July 2020 (UTC)[reply]
The above seems pretty bizarre to me; the claim that a certain extrapolation stating "241,317 deaths for 2 Nov 20" is not a projection seems hard to understand; the claim that it is not a prediction seems better. But what really caught my attention was the claim that the fit was "straightforward mathematics, absolutely no opening for anything subjective": sure, not being subjective is a property of all deterministic processes and algorithms. Some astrologers use deterministic transformations of sky observations to arrive at certain predictions, but that does not make these predictions meaningful. There is a whole class of algorithms using deterministic transformations to produce something that has certain properties (not all of them) of random outputs; these are pseudorandom generators. A particular deterministic procedure or method must be meaningful or be meaningfully applied; the procedure's being deterministic alone does not make its application meaningful. And the application of exponential fitting for something that would not grow exponentially for over three months (until the quoted 2 Nov 20) even in the case of zero interventions cannot possibly be meaningful. --Dan Polansky (talk) 16:28, 28 July 2020 (UTC)[reply]
OK, whether or not we call it a projection is semantics; "not a prediction" is certainly better, so please thus interpret my comment. The process I used is a simple least-squares fit to an exponential determined by two parameters (using an available tool); the visible straightness of the lines in the log graph in the article, and the small differences between the fitted function and reported numbers suggests that fitting an exponential function is suitable. I did this to try to understand what seems to be happening, and what might happen if it continued in the same way (hopefully not). The exponential nature of the curve in the article might be worth mentioning if a source has published it. Thanks for the comments. Best wishes, Pol098 (talk) 21:22, 29 July 2020 (UTC)[reply]
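A minimal sketch of a two-parameter exponential fit of the kind described in this thread. The tool actually used is not named, so this assumes an ordinary least-squares line fitted to (day, ln deaths), which may differ slightly from a direct least-squares fit on the raw counts. The function names are illustrative, and the data points are synthetic values generated from the quoted parameters, not the figures from the article's table.

```python
import math


def fit_exponential(xs, ys):
    """Fit y = a * exp(b * x) by least squares on (x, ln y).

    Returns (a, b). Requires at least two points and positive y values.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two or more (x, y) pairs of equal length")
    logs = [math.log(y) for y in ys]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_log = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (lg - mean_log) for x, lg in zip(xs, logs))
    b = sxy / sxx                         # slope of the log-linear line
    a = math.exp(mean_log - b * mean_x)   # intercept, back on the raw scale
    return a, b


def evaluate(a, b, x):
    """Value of the fitted curve a * exp(b * x) at x."""
    return a * math.exp(b * x)


# Parameters quoted in the discussion (day 1 = 8 July).
A_QUOTED = 120757.5035
B_QUOTED = 0.005817882889

# Synthetic points on the quoted curve for days 1..15 (assumption: stands in
# for the 15 table values, which are not reproduced here).
days = list(range(1, 16))
synthetic = [evaluate(A_QUOTED, B_QUOTED, d) for d in days]

a, b = fit_exponential(days, synthetic)
print(a, b)                              # recovers the quoted parameters
print(evaluate(A_QUOTED, B_QUOTED, 16))  # day 16 = 23 July, about 132538
```

The sketch only shows the mechanics of the fit and of extending the curve; whether such an extension is meaningful is the question debated above.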

New report "Tracking COVID-19 in the United States"

Available here, with a Guardian article about it here. -- Daniel Mietchen (talk) 15:18, 24 July 2020 (UTC)[reply]

Some good news on covid in the U.S.

Some good news concerning the development in the U.S. (there is also bad news, but that is out of scope of the following):

  • On U.S. level, new daily cases seem to be plateauing, starting to decline; current hospitalizations seem to approach a plateau or a peak[4].
  • In Arizona and Florida, both new daily cases and current hospitalizations are behind a peak and starting to decline[5][6].
  • In Texas, new daily cases are behind a peak and in decline; current hospitalizations slowed down and seem to be approaching a plateau[7].

--Dan Polansky (talk) 16:42, 28 July 2020 (UTC)[reply]

Semi-protected edit request on 29 July 2020

This sentence makes no sense.

Scott Gottlieb, former commissioner of the FDA, when a vaccine is ready for testing, about 25,000 people, in different groups, would be given the vaccine, two weeks apart, until 100,000 people have been inoculated over about six weeks.

What does Gottlieb have to do with it? He's making the sentence ungrammatical without making it make sense. Please leave him out and change the sentence to this.

When a vaccine is ready for testing, about 25,000 people, in different groups, will be given the vaccine, two weeks apart, until 100,000 people have been inoculated over about six weeks.

2601:5C6:8081:35C0:41B9:4147:7563:3EFE (talk) 00:38, 29 July 2020 (UTC)[reply]

The sentence does not have proper grammar and most certainly needs to be changed, but the sentence and its supposition appear to be attributed to Gottlieb. My proposed change would be: Gottlieb, former commissioner of the FDA, says that when a vaccine is ready for testing, about 25,000 people in different groups would be given the vaccine two weeks apart, until 100,000 people have been inoculated roughly for over six weeks. Tenryuu 🐲 ( 💬 • 📝 ) 00:53, 29 July 2020 (UTC)[reply]
@Tenryuu: I've taken the entire sentence out. The source is an OpEd written by Gottlieb, and in context he is proposing a stepped-wedge trial. The numbers are completely arbitrary and as far as I can tell, Gottlieb has no plan to actually implement it (he can't anyway as he left the FDA in 2019). The paragraph makes more sense without it as now it focuses on what is actually happening instead of hypotheticals. Ganbaruby! (Say hi!) 12:33, 29 July 2020 (UTC)[reply]
Ganbaruby, thanks for doing that. The article appears to be locked behind a paywall so I didn't have any context beyond that. —Tenryuu 🐲 ( 💬 • 📝 ) 15:01, 29 July 2020 (UTC)[reply]