Mathematics desk

Welcome to the Wikipedia Mathematics Reference Desk Archives

April 7

Area, vertex, and boundary centroids

Two related questions: For what polygons does the area centroid (centroid of the entire enclosed region) coincide with (a) the centroid of the vertices, or (b) the centroid of the boundary points? For (a), it's true for all triangles and of course for all regular polygons. I'm guessing it's true for all parallelograms too. But what other polygons? As for (b), it does not hold for all triangles or even for all isosceles triangles; it holds for all regular polygons, and I'm guessing for all parallelograms, but what else? Loraof (talk) 19:04, 7 April 2016 (UTC)[reply]

Trivial remarks: obviously for any region with a line of symmetry, all three centroids must lie on that line. If there are two lines of symmetry, all three lie at the intersection and so coincide. (In this case there is a rotational symmetry, and so it is reasonable to speak of "the center" of the region.) --JBL (talk) 20:50, 7 April 2016 (UTC)[reply]

Case (a) is much easier since the area centroid and vertex centroid are both rational functions of the vertex coordinates. For the edge centroid you need to weight by the edge lengths, which gives you √'s in the expression and everything gets messier. For quadrilaterals I'm pretty sure (meaning I have most of a proof worked out but it's too much trouble to write out the details) that (a) is true iff it's a parallelogram. The crucial point is, and correct me if I'm wrong about this, that as operators on quadrilaterals, the vertex centroid and area centroid both commute with affine transformations. So you can simplify by taking three points of your quadrilateral to be (0, 1), (0, 0), and (1, 0). This idea won't work for the edge centroid because an affine transformation generally does not scale all lengths by the same factor, so the edge weights change. --RDBury (talk) 03:30, 8 April 2016 (UTC)[reply]
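
The three centroids discussed above can be checked numerically. The following is a minimal sketch, not part of the original discussion: the function names and the sample polygons are illustrative assumptions, the area centroid uses the standard shoelace formula, and the boundary centroid weights each edge midpoint by the edge length. The last polygon uses the normalization suggested above, (0, 1), (0, 0), (1, 0), with an arbitrarily chosen fourth vertex (2, 2).

```python
from math import hypot, isclose

def edges(pts):
    # Consecutive vertex pairs, closing the polygon back to the first vertex.
    return zip(pts, pts[1:] + pts[:1])

def vertex_centroid(pts):
    n = len(pts)
    return (sum(x for x, _ in pts) / n, sum(y for _, y in pts) / n)

def area_centroid(pts):
    # Shoelace formula: Cx = sum((x0 + x1) * cross) / (6 * A), with A = sum(cross) / 2.
    twice_area = cx = cy = 0.0
    for (x0, y0), (x1, y1) in edges(pts):
        cross = x0 * y1 - x1 * y0
        twice_area += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    return (cx / (3 * twice_area), cy / (3 * twice_area))

def boundary_centroid(pts):
    # Each edge contributes its midpoint, weighted by its length (hence the square roots).
    total = cx = cy = 0.0
    for (x0, y0), (x1, y1) in edges(pts):
        length = hypot(x1 - x0, y1 - y0)
        total += length
        cx += length * (x0 + x1) / 2
        cy += length * (y0 + y1) / 2
    return (cx / total, cy / total)

def same_point(p, q, tol=1e-9):
    return isclose(p[0], q[0], abs_tol=tol) and isclose(p[1], q[1], abs_tol=tol)

# Illustrative polygons (assumed examples, not taken from the thread).
triangle = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]
parallelogram = [(0.0, 0.0), (3.0, 0.0), (4.0, 2.0), (1.0, 2.0)]
quad = [(0.0, 1.0), (0.0, 0.0), (1.0, 0.0), (2.0, 2.0)]

for name, poly in [("triangle", triangle), ("parallelogram", parallelogram), ("quad", quad)]:
    print(name, vertex_centroid(poly), area_centroid(poly), boundary_centroid(poly))
```

For these samples the vertex and area centroids agree for the triangle and the parallelogram but not for the non-parallelogram quadrilateral, and the boundary centroid of the 3-4-5 triangle differs from the other two. This is consistent with the claims in the thread, though a handful of examples is of course not a proof.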

Interpolation and extrapolation

I have been told, and told to teach, that interpolation is generally more reliable than extrapolation. Obviously, everything in science and statistics rests, at least tacitly, on a bed of assumptions; for example, in the case of science, we usually assume that induction works. What are the "minimal" statistical assumptions required for extrapolation to be generally less reliable than interpolation?--Leon (talk) 20:07, 7 April 2016 (UTC)[reply]

I don't think "interpolation is more reliable than extrapolation" is a formalizable statement in the sense you're hoping for. --JBL (talk) 20:52, 7 April 2016 (UTC)[reply]

From the lede of our article Extrapolation: "It is similar to interpolation, which produces estimates between known observations, but extrapolation is subject to greater uncertainty and a higher risk of producing meaningless results." -- ToE 21:14, 7 April 2016 (UTC)[reply]

@Thinking of England: If this is meant as a response to me then I don't understand its relevance to my comment. If it is meant only as a response to Leon, please feel free to remove or ignore this comment. --JBL (talk) 21:40, 7 April 2016 (UTC)[reply]

@Thinking of England: But that just tells me what I've already been told, with no explanation of why.

@Joel B. Lewis: In that case, is it in the same category as induction? By this I mean that it is an ill-defined heuristic that is important to science.--Leon (talk) 09:44, 8 April 2016 (UTC)[reply]

In my opinion, yes, this is a (well-founded) heuristic that one should expect to hold in any reasonable model, but not actually a statement with a fixed formal meaning. --JBL (talk) 20:21, 8 April 2016 (UTC)[reply]

(ec) I think it's something like this: if you have a simple regression line, I think you can calculate the variance of the predicted value (i.e., its variance over a large number of replications of the regression) as a function of the independent variable's value x. I think this variance is greater if you are outside the range of values of x that you used in the regression. The reason is this: draw a linear regression line through some data points. Suppose it correctly goes through a point in the middle, which I'll call the pivot point. Now except by coincidence, the estimated slope coefficient will not be exactly right; the correct line goes through the pivot point with a somewhat different slope. The farther away from the pivot point you look, the more the true line and your estimated line diverge from each other.

Another less formal way in which the maxim is true is this: a linear (or any other) regression may have its functional form misspecified. This might not matter much in the range that the x data used for the regression were in, but matters more the farther outside that range that you try to extrapolate. For instance, draw a bunch of data points almost exactly satisfying y=x². Pick a narrow range and run a linear regression. It may fit reasonably well for the data points included in the regression, but fits worse the farther outside that range you look. Loraof (talk) 21:20, 7 April 2016 (UTC)[reply]
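
The y=x² example above can be tried directly. This is a minimal sketch under assumed specifics: the sample range 1.0 to 2.0, the step of 0.1, and the helper names `fit_line` and `line_error` are illustrative choices, and the fit is an ordinary least-squares line computed by hand.

```python
def fit_line(xs, ys):
    # Ordinary least squares for y = slope * x + intercept.
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

# Data lying exactly on y = x^2, sampled only on the narrow range [1, 2].
xs = [1.0 + 0.1 * i for i in range(11)]
ys = [x * x for x in xs]
slope, intercept = fit_line(xs, ys)

def line_error(x):
    # Distance between the (misspecified) fitted line and the true curve at x.
    return abs(slope * x + intercept - x * x)

worst_inside = max(line_error(x) for x in xs)
print("worst error inside the fitted range:", worst_inside)
for x in (3.0, 5.0, 10.0):
    print("error at x =", x, "is", line_error(x))
```

Inside the fitted range the straight line stays close to the parabola, while the error grows roughly like the square of the distance from the middle of the range once you leave it.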

Assuming only that uncertainty increases monotonically with distance outside a range of samples, there is no limit to the error that an excessive extrapolation can incur. See Extrapolation#Quality of extrapolation. AllBestFaith (talk) 02:04, 8 April 2016 (UTC)[reply]

I'd say it has to do with the assumption that the underlying function you're interpolating is continuous and "smooth" on the same scale as your point sampling density. That is, when interpolating, you're typically assuming the tightest radius of curvature for your function in the region you're sampling is of the same order as, or greater than, the distance between any two sampled points. If those assumptions are true, you can't introduce substantial oscillations between any two points: the function is confined to a certain delta above/below the two points. Even with those minimal assumptions, though, extrapolation doesn't share that property. Even with a radius of curvature several times the point separation, you don't have to go very far before you can get a continuous function to go near-vertical. However, violating either of those assumptions can allow interpolation to be as error-prone as extrapolation. Discontinuous functions can jump all over the place, and if your tightest radius of curvature is allowed to be even a quarter of the point separation distance, then you can get arbitrarily far from the points and still be able to return to fit both sides. -- 162.238.240.55 (talk) 15:01, 10 April 2016 (UTC)[reply]
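
One way to make the smoothness argument above concrete is the standard error formula for a straight line through two points a and b: if |f''| ≤ M, the error at x is at most M·|x−a|·|x−b|/2. Between the points this is at most M·(b−a)²/8, while outside it grows without limit. Note that this sketch swaps a second-derivative bound in for the radius-of-curvature condition used in the comment; the test function sin (for which M = 1), the nodes 0 and 0.5, and all names are illustrative assumptions.

```python
from math import sin

def chord(f, a, b):
    # Straight line through (a, f(a)) and (b, f(b)).
    fa, fb = f(a), f(b)
    return lambda x: fa + (fb - fa) * (x - a) / (b - a)

a, b = 0.0, 0.5
M = 1.0  # bound on |f''| for f = sin
line = chord(sin, a, b)

def error_bound(x):
    # Standard bound on |f(x) - line(x)| when |f''| <= M everywhere.
    return M * abs(x - a) * abs(x - b) / 2

# Interpolation: sample the error on a fine grid between the two nodes.
grid = [a + (b - a) * i / 100 for i in range(101)]
worst_inside = max(abs(sin(x) - line(x)) for x in grid)
inside_bound = M * (b - a) ** 2 / 8

# Extrapolation: the same line evaluated well outside [a, b].
error_far = abs(sin(3.0) - line(3.0))

print("worst interpolation error:", worst_inside, "bound:", inside_bound)
print("extrapolation error at x = 3:", error_far, "bound:", error_bound(3.0))
```

The interpolation error is capped by a quantity that depends only on the spacing of the points, whereas the cap on the extrapolation error keeps growing with distance, which matches the asymmetry described above.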

I should add that even without radius of curvature issues (e.g. when you know the underlying functional form is linear, but with uncertain coefficients), extrapolation is still less accurate than interpolation, due to compounded errors of the slope. Within the range of your interpolated points, you know that any estimate should have an error ε, the error in fitting your points. However, your fitting protocol will have some error δ in its estimate of the slope. Even if δ is much less than ε, you get a δ of error for each unit you travel from the center. At some distance away from the points the sum of the δ's is going to be much greater than ε - if you go out far enough, arbitrarily so. -- 162.238.240.55 (talk) 15:18, 10 April 2016 (UTC)[reply]
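
The slope-error argument above can be illustrated with a small simulation. This is a sketch under assumed specifics: the true line y = 2x + 1, Gaussian noise with standard deviation 1, design points 0 through 10, the seed, and the helper names are all illustrative. It compares the simulated variance of the fitted line's prediction at a point x0 with the textbook formula σ²·(1/n + (x0 − x̄)²/Sxx) for simple linear regression.

```python
import random

def fit_line(xs, ys):
    # Ordinary least squares for y = slope * x + intercept.
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

def prediction_variance(xs, sigma, x0):
    # Textbook variance of the fitted value at x0 in simple linear regression.
    n = len(xs)
    x_bar = sum(xs) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sigma ** 2 * (1 / n + (x0 - x_bar) ** 2 / sxx)

def simulated_variance(xs, sigma, x0, reps=2000, seed=0):
    # Refit the line on fresh noisy data many times; measure the spread at x0.
    rng = random.Random(seed)
    preds = []
    for _ in range(reps):
        ys = [2.0 * x + 1.0 + rng.gauss(0.0, sigma) for x in xs]
        slope, intercept = fit_line(xs, ys)
        preds.append(slope * x0 + intercept)
    mean = sum(preds) / reps
    return sum((p - mean) ** 2 for p in preds) / (reps - 1)

xs = [float(i) for i in range(11)]  # design points 0..10, centered at x = 5
sigma = 1.0
for x0 in (5.0, 10.0, 30.0):
    print(x0, prediction_variance(xs, sigma, x0), simulated_variance(xs, sigma, x0))
```

The variance is smallest at the center of the data and grows quadratically with distance from it, so the slope error eventually dominates the fitting error once you move far enough outside the sampled range.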

I look at it as how far you extrapolate or interpolate from the last data point. If we put it in terms of the distance between the last two data points, then when you interpolate, the farthest you can be from a data point is 50% of that distance. An extrapolation of 50% of that distance might not be too bad, either. On the other hand, you can extrapolate much further, but should then expect much worse results. StuRat (talk) 15:26, 10 April 2016 (UTC)[reply]

After-the-fact meta-analyses of projections that hinged upon the use of mathematical extrapolation might get at something close to what you want, but I wouldn't expect that this specific statistical question has (yet) been the focus of any meta-analysis in the applied sciences. 4.35.219.219 (talk) 00:33, 13 April 2016 (UTC)[reply]

If time is the independent variable, and one does/doesn't have a grasp of contributing variables in a model, one would/wouldn't expect extrapolation to fit mathematical rules, per se. Extrapolation is going to have an error fitting expected -- essentially purely mathematical -- ranges iff one doesn't have a new confounding piece of information. 4.35.219.219 (talk) 00:39, 13 April 2016 (UTC)[reply]

A current example is the projection of global population before a leveling off or decline begins. Projection to and past 10 billion depends upon relatively static behavioral assumptions, as well as an absence of some catastrophic block. If family-size reduction turns out to be more 'geographically contagious', or a serious pathogen epidemic occurs, all bets are off on whether Earth's population goes much higher than it is right now. 4.35.219.219 (talk) 00:46, 13 April 2016 (UTC)[reply]