Wikipedia:Reference desk/Mathematics

Revision as of 13:15, 20 October 2020

Welcome to the mathematics section of the Wikipedia reference desk.

Want a faster answer?

Main page: Help searching Wikipedia

How can I get my question answered?

  • Select the section of the desk that best fits the general topic of your question (see the navigation column to the right).
  • Post your question to only one section, providing a short header that gives the topic of your question.
  • Type '~~~~' (that is, four tilde characters) at the end – this signs and dates your contribution so we know who wrote what and when.
  • Don't post personal contact information – it will be removed. Any answers will be provided here.
  • Please be as specific as possible, and include all relevant context – the usefulness of answers may depend on the context.
  • Note:
    • We don't answer (and may remove) questions that require medical diagnosis or legal advice.
    • We don't answer requests for opinions, predictions or debate.
    • We don't do your homework for you, though we'll help you past the stuck point.
    • We don't conduct original research or provide a free source of ideas, but we'll help you find information you need.



How do I answer a question?

Main page: Wikipedia:Reference desk/Guidelines

  • The best answers address the question directly, and back up facts with wikilinks and links to sources. Do not edit others' comments and do not give any medical or legal advice.
October 13

Quality of point distribution on a sphere

I'm looking for a function, ideally one calculable in less than O(n*n) time, that I can use to measure how evenly a set of points is spread over the surface of a sphere.

Does anybody have any pointers for where to start looking? — Preceding unsigned comment added by 2A01:E34:EF5E:4640:8089:C4C5:1392:7CE8 (talk) 18:43, 13 October 2020 (UTC)[reply]

A large set of points drawn independently from a uniform distribution over the surface will have some clusters and some "bald" spots; is my assumption correct that this should have a lower quality score than when the points repel each other and try to maintain a maximal social distance? (So that the set of vertices of a spherical regular dodecahedron has max quality for a set of 20 points?) One possible measure could be based on the potential energy of the configuration, using some force field (e.g., a repellent force whose magnitude is a decreasing function of the angular distance between the two points). This measure can be calculated in O(n*n) steps (in the RAM model). Lower energy corresponds to a more even spread.  --Lambiam 19:58, 13 October 2020 (UTC)[reply]
Thank you for your detailed answer. I had considered a potential-based approach, but it occurred to me that I could construct the Voronoi diagram and look at the area of the cells in O(n log n). This seems like overkill, though, so I was wondering whether there was a well-known accepted metric.
You might use the measures from the Thomson problem or the Tammes problem. See also [1]. It also occurs to me to find the largest empty circle, but I don't know how efficiently that can be calculated on the sphere. --Amble (talk) 20:44, 13 October 2020 (UTC)[reply]
The measure from the Thomson problem uses the force field model but with a different force, and Tammes nearly fits the model. Using a step function as the potential, where the cutoff point for the step is adjustable, will, with a bit of tweaking, get you the Tammes version. Any number of potential functions could be used, so it's not a matter of finding a measure but of deciding which measure is best suited to the problem at hand. --RDBury (talk) 21:24, 13 October 2020 (UTC)[reply]
Some examples are given in Diggle and Fisher (1983): [2]. As RDBury points out, there are lots of options, which may be more or less suited to the problem. I pointed to the Thomson and Tammes problems because they have existing literature and you can probably find some sample code. --Amble (talk) 21:37, 13 October 2020 (UTC)[reply]
The area on the sphere between two parallel planes is proportional to the distance between those planes (a theorem of Archimedes; this is used in more than one algo for sphere arrangements). So perhaps we can reduce the problem to a simpler one: how evenly are the coordinates in x, in y and in z distributed? You could do this by squaring the difference between a coordinate and the corresponding (2k-2n+1)/2n, where k is the coordinate's rank. —Tamfang (talk) 01:46, 20 October 2020 (UTC)[reply]
If two points are close to each other on the sphere, each pair of their coordinates will also be close, so nicely separated coordinates imply separated points. But some very even sphere distributions (e.g. the dodecahedral vertices) may have very uneven coordinate distributions, so the coordinate test may discard good point distributions or score them much lower than much worse distributions.  --Lambiam 09:12, 20 October 2020 (UTC)[reply]
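The pairwise potential-energy measure described above can be sketched in a few lines of Python. The inverse-angle potential used here is my own arbitrary choice (an assumption), just to make the idea concrete; any decreasing function of angular distance would do:

```python
import math
from itertools import combinations

def energy(points):
    """O(n*n) evenness score: sum of 1/theta over all point pairs,
    where theta is the angular distance. Lower = more evenly spread."""
    total = 0.0
    for p, q in combinations(points, 2):
        dot = max(-1.0, min(1.0, sum(a * b for a, b in zip(p, q))))
        total += 1.0 / math.acos(dot)
    return total

def normalize(v):
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)

# Octahedron vertices: a well-spread 6-point configuration.
spread = [(1,0,0), (-1,0,0), (0,1,0), (0,-1,0), (0,0,1), (0,0,-1)]
# Six points crowded near the north pole.
clustered = [normalize((0.01 * i, 0.02 * i, 1.0)) for i in range(1, 7)]

print(energy(spread) < energy(clustered))  # the spread set scores lower
```

For the octahedron, 12 pairs sit at angle π/2 and 3 antipodal pairs at π, so its score is 24/π + 3/π = 27/π, far below any clustered configuration's.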

Quadratic formula division

Can you divide numbers in the numerator of a simplified quadratic formula by a number in the denominator?

For example, if I had 2 ± sqrt(31) all over -3, could the negative sign in the denominator move up to the top? And could I divide the 2 by the -3 to get -2/3 ± sqrt(31)?

Thanks, User:Heyoostorm_talk! 22:36, 13 October 2020 (UTC)[reply]

Not quite. You need to distribute the -1/3 over both elements of the sum, so you'd get -2/3 ± (sqrt(31))/3 = -2/3 ± sqrt(31/9).
In certain contexts where the + vs the - is tracked, it would be more appropriate to write -2/3 ∓ sqrt(31/9), but that is not an issue here. Where it applies is in equations such as sin(±π/2) = ±1. Dividing both halves of that equation by -3 gives you (-1/3)sin(±π/2) = ∓1/3. See Plus–minus sign#In mathematics for more details.
But if all you are doing is "moving the negative sign up to the top" with a quadratic formula, then no problem. But you should keep in mind that this is because (-1)(±x) = ∓x, which in that context is the same thing as ±x. -- ToE 23:58, 13 October 2020 (UTC)[reply]
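A quick numeric check of the simplification above (plain Python, just to confirm the algebra for both choices of sign):

```python
import math

r = math.sqrt(31)
for sign in (+1, -1):
    original = (2 + sign * r) / (-3)
    simplified = -2 / 3 - sign * r / 3            # the sign flips: ± becomes ∓
    also = -2 / 3 - sign * math.sqrt(31 / 9)      # same thing written as sqrt(31/9)
    assert math.isclose(original, simplified)
    assert math.isclose(original, also)
print("both roots agree")
```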
Thanks, that helps. User:Heyoostorm_talk! 14:30, 14 October 2020 (UTC)[reply]

October 14

Stumped on a simple thing

I can solve this iteratively, but because my application is a real-time analysis project, I tried to find a closed-form solution. However, I'm running into a square root of a negative number, and I shouldn't be. I've been tearing my hair out most of the day trying to figure out where I went wrong.

The problem is simple enough: Given a set of N real values where N-1 values are known, what must the unknown Nth value be so that it lies d standard deviations from the mean of all the values? Basically I need to solve for xN in the expression xN = μ + d·σ, in which both terms on the right also depend on xN.

Actually there are two possible values for xN. Simple enough. So I start with the variance identity, with xi being one of the values, μ the mean, and σ the standard deviation: σ^2 = (x1^2 + ... + xN^2)/N - μ^2.

Because the value xN is unknown, the mean and variance can be rewritten as μ = (s1 + xN)/N and σ^2 = (s2 + xN^2)/N - μ^2,

where s1 = x1 + ... + x(N-1) and s2 = x1^2 + ... + x(N-1)^2 are the sums of the known N-1 values of xi and xi^2, respectively.

Expanding the first equation above, I get the quadratic (N-1)(N-1-d^2)·xN^2 - 2·s1·(N-1-d^2)·xN + (1+d^2)·s1^2 - d^2·N·s2 = 0.

Rather than working this out myself, I plugged this into the Wolfram Alpha solver, ran the result through the simplify operation, and got xN = (s1·(d^2 - N + 1) ± sqrt(-d^2·N·(d^2 - N + 1)·((N-1)·s2 - s1^2))) / ((N-1)·(d^2 - N + 1)).

The problem is, no matter how I break up the problem, do steps manually and then put them into Wolfram Alpha, the expression under the square root always comes out negative. And this is a set of real positive x numbers, using d=2 and N=20.

Is there anything wrong with my approach? As I said, this has been stumping me all day. There's something I'm just not seeing. ~Anachronist (talk) 04:06, 14 October 2020 (UTC)[reply]

@Anachronist: Why would the inside of this square root be always negative? Note that such a real xN sometimes exists, but does not always. We know that (N-1)·s2 - s1^2 ≥ 0, and so it hinges on the sign of the expression d^2 - N + 1. If this expression is negative then there is a real solution. In particular, say in the extreme case that N = 2. Then for any value of x1 I cannot reasonably ask for an x2 that is, say, a hundred standard deviations away from their mean. Note Chebyshev's inequality which places stringent restrictions on how many points can be a certain number of standard deviations away from the mean. To apply the inequality, consider the uniform distribution on your points, so that if there are m points at least d standard deviations from the mean, the probability of that event is given by m/N. With k = d the inequality then asserts that m/N ≤ 1/d^2, or m ≤ N/d^2. If d^2 > N then this forces m = 0, i.e. there are no real solutions. This proves that solutions cannot always exist. Note that this is a weaker restriction than d^2 - N + 1 ≤ 0; adding N - 1 to both sides gives d^2 ≤ N - 1, or N ≥ d^2 + 1, or d ≤ sqrt(N-1). --Jasper Deng (talk) 05:04, 14 October 2020 (UTC)[reply]
The expression under the square root isn't negative. There's a negative sign at the front, but (d^2 - N + 1) is also negative. The last term under the root, (N-1)·s2 - s1^2, is (N-1)^2 times the variance of the first N-1 values, so nonnegative.--101.98.109.114 (talk) 05:07, 14 October 2020 (UTC)[reply]
@Jasper Deng: and anonymous. Thank you. I'll point out that d^2 - N + 1 is always negative for my values of d and N (d=2 and N=20 in my tests). That means the second term under the radical, (N-1)·s2 - s1^2, must be the culprit; the expression under the square root is always negative in all of my tests with real positive numbers for 19 values of x when solving for the 20th. My computer program is doing this with running sums on a time series; I've checked s1, so maybe I'm not updating s2 properly. I'm not trying to solve for a bizarre distribution of data; it's just, say, 20 values randomly distributed between 80 and 90. When I solve it iteratively, I always get the correct solution. Let me go back and check my code.... ~Anachronist (talk) 05:20, 14 October 2020 (UTC)[reply]
@Anachronist: Assuming you calculated s1 and s2 correctly, clearly your implementation of the formula is not correct. In Python:
x = list(range(19))
d = 2
N = len(x) + 1
s_1 = sum(x)
s_2 = sum(x_i**2 for x_i in x)
x_N = -(-(-d**2*N*(d**2 - N + 1)*((N - 1)*s_2 - s_1**2))**0.5 - s_1*(d**2 - N + 1))/((N - 1)*(d**2 - N + 1))
print(x_N)
I get approximately -3.64911 as the solution. Remarkably, the existence of a real solution depends only on d and N, and not on any of the actual data points. I don't know which language you are using, but if you are using Python, note that x^2 is not the square of x. It is the exclusive or of x with 2. You need to use x**2, as above, to get the square.--Jasper Deng (talk) 05:30, 14 October 2020 (UTC)[reply]
@Jasper Deng: I'm not using a power operator, I'm doing x*x for x squared. Also, for d<<N, a real solution depends only on that second term under the radical, , and I have verified that it is always negative. I put the numbers in Excel and verified that s2 and s1 are mathematically correct, but I discovered that I was using the incorrect sample size. Instead of calculating the sums for the last N-1 data values to calculate the Nth value, I was using the last N values. That consistently gave me negative numbers under the square root.
It helped to talk this out with you. Thanks. ~Anachronist (talk) 06:38, 14 October 2020 (UTC)[reply]

It is clear that for N=10 and d=3, there is no solution because d^2 - N + 1 goes to zero, and it's in the denominator. I'm wondering how to get around that, because I can easily find a solution iteratively. ~Anachronist (talk) 14:23, 14 October 2020 (UTC)[reply]

That means your quadratic is actually linear. --JBL (talk) 14:54, 14 October 2020 (UTC)[reply]
You mean continuous?
Unfortunately, the specific case of d=3 and N=10 is one of the situations that are mandatory to solve in this computer program. Due to the instability of the closed-form solution, I think I need to go back to the iterative approach. I was hoping there was a way to avoid the CPU overhead of iterating this. ~Anachronist (talk) 16:37, 14 October 2020 (UTC)[reply]
@Anachronist: Nay. He means linear. The quadratic formula is not valid when the coefficient of the quadratic term is zero, as appears to be the case here. But in that case, your equation is linear and so the solution is in another form. (Indeed, if there were no solution, even iterative methods would be doomed to fail.)--Jasper Deng (talk) 16:47, 14 October 2020 (UTC)[reply]
Consider the quadratic equation ax^2 + bx + c = 0 with real coefficients. If a ≠ 0 and b^2 - 4ac ≥ 0 then it has solutions x = (-b ± sqrt(b^2 - 4ac))/(2a). If a ≠ 0 and b^2 - 4ac < 0 then it does not have real solutions. If a = 0 then actually it is not a quadratic equation at all, it is the much simpler linear equation bx + c = 0, with unique real solution x = -c/b whenever b ≠ 0. (Pretty sure your CPU should be able to handle that ;).)
Also this is probably not relevant to you but I think it is nice to notice that if b > 0 then (-b + sqrt(b^2 - 4ac))/(2a) → -c/b as a → 0, so the quadratic knows about its degenerate linear case; if b < 0 then it's the other root that approaches the root of the linear equation. --JBL (talk) 17:19, 14 October 2020 (UTC)[reply]
It seems to be less than stable near that discontinuity in the solution though. I tried d=2.9 with N=10 and got some wild looking results. I had solved this some years ago with an iterative technique. In trying to port my software to another platform I figured I'd try for a closed-form solution. I'll go back to iterative, but not the crude cut-stepsize-in-half approach I used before. I think Newton's method may be more efficient since the derivative of the function is pretty straightforward. ~Anachronist (talk) 22:19, 14 October 2020 (UTC)[reply]
@Anachronist: Why not just use the linear case's closed-form solution in the case of exactly one root? Or better yet, use Mueller's formula, which is still valid in the linear case (but you need to still take some care near the discontinuity). The downside of course is having a square root in the denominator, but you can also move it to the numerator by rationalizing the denominator. Indeed, when a quantity is very near zero, its square root is greater than the quantity itself, so you lose about half of your precision. You can partly save this by expanding the square root as a binomial series: sqrt(1 + u) = 1 + u/2 - u^2/8 + .... For a very near zero, truncating this series and substituting it for your square root will be a stabler solution than computing the square root "in full" with a square root function.--Jasper Deng (talk) 22:54, 14 October 2020 (UTC)[reply]
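The "Mueller's formula" mentioned above is presumably the alternate quadratic formula x = 2c/(-b ∓ sqrt(b^2 - 4ac)), which degrades gracefully as a → 0 (that reading is an assumption). A small sketch of why it is stabler for a nearly linear equation:

```python
import math

def roots_stable(a, b, c):
    """Roots of a*x^2 + b*x + c = 0. The first root uses the alternate
    ("citardauq") form 2c/(-b - sqrt(b^2 - 4ac)), which stays finite as
    a -> 0; assumes a real discriminant and no cancellation in -b - sqrt."""
    disc = math.sqrt(b * b - 4 * a * c)
    x1 = 2 * c / (-b - disc)                         # well-conditioned root
    x2 = (-b - disc) / (2 * a) if a != 0 else None   # companion root, absent when linear
    return x1, x2

a, b, c = 1e-10, 1.0, -1.0        # nearly linear equation: root close to 1
naive = (-b + math.sqrt(b * b - 4 * a * c)) / (2 * a)   # catastrophic cancellation
stable, _ = roots_stable(a, b, c)
exact = 1.0 - 1e-10               # root to first order in a
print(abs(naive - exact), abs(stable - exact))  # the stable error is typically far smaller
```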
Indeed, why not? Well, it's because all of this started as a fairly simple algebraic problem that I got stuck on because I had unknowingly used N instead of N-1 values to calculate the sums s1 and s2, consistently giving me a square root of a negative number. Once I figured out that the error was in my code and not my formulas, I found further complications at or near combinations of N and d that create a divide-by-zero error. So now to get around that and still avoid iteration, I need to account for special cases and calculate it as an approximation. The closed form turns out to be not so straightforward after all, in practice. I have concluded that if I'm going to be approximating it anyway, I may as well use Newton's method, which is straightforward, elegant, converges quickly, and better than the iterative solution I came up with years ago. I can make the iteration better, and the underlying calculations way more efficient. So I'll still have an improvement in this new version I'm porting.
This has been an interesting exercise and I learned more than I expected (and brought back memories of knowledge that had been long forgotten). Thanks for that. ~Anachronist (talk) 05:00, 15 October 2020 (UTC)[reply]
@Anachronist: And watch out: Newton's method won't do at roots of the derivative! The thing about numerical analysis really is that there's no one-size-fits-all for anything. This thing about "expand using a Taylor series to stabilize" is common. For example, computing sqrt(1 + x) - 1 directly is going to be less stable than using the truncated series x/2 - x^2/8 for x near zero.--Jasper Deng (talk) 06:01, 15 October 2020 (UTC)[reply]

Well, the derivative does have a suspicious denominator:

That quantity is N-1 times the variance of N-1 samples. Subtracting from that the sum of N-1 samples times the final Nth value may give me a negative square root. If that's the case (I haven't had time to test it), I'll go back to a binary search approximation. It'll still be better than what I had, because before I was inefficiently calculating the whole mean and standard deviation in each iteration, without first separating out s1 and s2, which can be treated as constants. So at least I'll be able to eliminate any loops in each iteration. ~Anachronist (talk) 18:58, 16 October 2020 (UTC)[reply]

@Anachronist: If I were you, I would get rid of square roots before applying Newton's method. Here that would mean the equation ((N-1)·xN - s1)^2 = d^2·(N·(s2 + xN^2) - (s1 + xN)^2). This does not introduce any extraneous solutions since xN - μ can be of either sign (corresponding to either solution). Square roots are bad for stability (again, they lose half of your precision almost automatically). By "binary search" I assume you mean the bisection method, but in that case you still need to ensure you have one point below the x-axis and another above in order to guarantee finding a root. Such points are guaranteed to exist, but for a quadratic coefficient very near zero, it will not do you any good, since you will have to go very far out in order to get at the root, which diverges as the coefficient goes to zero.--Jasper Deng (talk) 19:29, 16 October 2020 (UTC)[reply]
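This suggestion can be sketched directly: Newton's method applied to the square-root-free form ((N-1)·x - s1)^2 = d^2·(N·(s2 + x^2) - (s1 + x)^2). The sample data below is made up for illustration:

```python
def solve_newton(known, d, x0, tol=1e-12, max_iter=100):
    """Newton's method on f(x) = ((N-1)x - s1)^2 - d^2*(N*(s2 + x^2) - (s1 + x)^2),
    whose roots are the values x_N lying exactly d standard deviations from the
    mean of all N values. No square roots involved."""
    n = len(known) + 1                       # N, counting the unknown value
    s1 = sum(known)
    s2 = sum(v * v for v in known)
    x = float(x0)
    for _ in range(max_iter):
        f = ((n - 1) * x - s1) ** 2 - d * d * (n * (s2 + x * x) - (s1 + x) ** 2)
        fp = 2 * (n - 1) * ((n - 1) * x - s1) - d * d * (2 * n * x - 2 * (s1 + x))
        if fp == 0:                          # Newton fails at critical points
            break
        step = f / fp
        x -= step
        if abs(step) < tol:
            break
    return x

data = list(range(19))                       # hypothetical sample data
root = solve_newton(data, d=2, x0=min(data) - 1.0)
print(root)                                  # lower root, roughly -3.6491
```

Starting left of the parabola's vertex converges to the lower root; starting to the right of it finds the upper one.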
Yes, bisection method. My original algorithm never encountered an error over years of use by multiple people with millions of streaming real-time values, so I'm less worried about special situations than about division-by-zero or square-root-of-negative with other approaches. My old approach simply calculated, for a test value of xN, the mean and standard deviation of the entire set of N values, and adjusted xN depending on whether it was less than or greater than μ + d·σ calculated for that iteration, changing the sign and increment of the adjustment until xN fell within an acceptable error around μ + d·σ. All sequences of N values always had a variance, so it always converged. I just wasn't happy with the efficiency. No one ever complained that my algorithm was bogging things down. I just figured, since I'm porting this to another platform, that I would see what I can do to improve it. ~Anachronist (talk) 04:54, 17 October 2020 (UTC)[reply]
I manually obtained the following formula:
xN = (s1 ± d·sqrt(N·((N-1)·s2 - s1^2)/(N - 1 - d^2)))/(N - 1),
which does not suffer from any problems. Ruslik_Zero 13:28, 17 October 2020 (UTC)
The formula may be rewritten in a very simple form:
xN = m ± d·sqrt(N/(N - 1 - d^2))·σ′,

where

m = s1/(N - 1), σ′ = sqrt(s2/(N - 1) - m^2)
are the mean and standard deviation of the known N-1 values. The expression under the second square root is obviously positive. Under the first root it is positive if d^2 < N - 1. Ruslik_Zero 13:48, 17 October 2020 (UTC)
@Ruslik0: That's not valid for d^2 = N - 1 and unstable in cases close to that, which is what the OP is having trouble with.--Jasper Deng (talk) 16:45, 17 October 2020 (UTC)[reply]
It's a much simpler and easier-to-understand expression for the solution, though. ~Anachronist (talk) 16:49, 17 October 2020 (UTC)[reply]
I found an incredibly fast convergence for this:
  1. Start with a guess for xN; a good guess is the previous value found
  2. Evaluate x′ = μ + d·σ (the first equation at the start of this thread)
  3. Set xN = x′
  4. Repeat from step 2 until the change in xN is less than an acceptable error ε
This converges after just a couple of iterations (often 1 iteration for my error ε), and doesn't need the bisection method or Newton's method. ~Anachronist (talk) 02:16, 18 October 2020 (UTC)[reply]
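The scheme above is a fixed-point iteration on x ← μ(x) + d·σ(x). A minimal sketch with made-up data; the stopping tolerance and the starting guess (mean of the known values) are my own assumptions:

```python
import math

def fixed_point(known, d, tol=1e-9, max_iter=1000):
    """Iterate x <- mu + d*sigma, where mu and sigma include the candidate x.
    Converges when the map is a contraction near the root."""
    n = len(known) + 1
    s1 = sum(known)
    s2 = sum(v * v for v in known)
    x = s1 / (n - 1)                      # initial guess: mean of the known values
    for _ in range(max_iter):
        mu = (s1 + x) / n
        sigma = math.sqrt((s2 + x * x) / n - mu * mu)
        x_new = mu + d * sigma
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    raise RuntimeError("did not converge")

print(fixed_point(list(range(19)), d=2))  # upper root, roughly 21.6491
```

For this data the map's slope at the root is about 0.25, so each pass shrinks the error by roughly a factor of four; for larger d the slope grows and the iteration can stop contracting, matching the instability reported below.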
@Anachronist: You're doing what we call fixed-point iteration. This can converge extraordinarily quickly, but be careful to ensure you keep enough digits to avoid the loss of precision due to the square root, and also to choose a good initial guess (it must be within the roots' basin of attraction).--Jasper Deng (talk) 05:38, 18 October 2020 (UTC)[reply]
It turned out to be unstable for values of d larger than 2 or 3, so I went back to bisection. I'm satisfied that I eliminated the loops for calculating mean and standard deviation, to make it more efficient than my previous version. ~Anachronist (talk) 06:06, 19 October 2020 (UTC)[reply]
@Anachronist: Typically one only uses bisection method to obtain a good guess for a better/faster method like the fixed-point iteration or Newton's method. You probably also saw just how hard satisfying all the conditions for fixed-point iteration to work can be!--Jasper Deng (talk) 07:11, 19 October 2020 (UTC)[reply]

Primes (base 10) questions

Some questions on primes (all questions are base 10)

  1. Can it be proved that there is no maximum prime consisting only of the digits 1, 3, 7 & 9?
  2. Can it be proved that for every length of number there is a prime consisting only of the digits 1, 3, 7 & 9?
  3. (Had another question that built on top of it that turned out to be about right truncatable primes)

(Side question, do any of these answers change if 2 is allowed in the first position in the numbers?)Naraht (talk) 13:54, 14 October 2020 (UTC)[reply]

I believe that the answer to question 2 is that no proof is known (and the only way to say, "Yes, it can be proved", is if you have a proof), but that it is nevertheless almost certainly true, in the same way that Goldbach's conjecture is true. The number of primes of length n whose decimal representation contains only the digits 1, 3, 7 and 9 appears to grow proportionally to 4^n/n, so there are an increasing lot of those. Since 1 follows from 2, the same holds for 1.  --Lambiam 00:15, 15 October 2020 (UTC)[reply]
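The counts behind question 2 are easy to check empirically for small lengths; a brute-force sketch (trial division, fine up to length 6 or so):

```python
from itertools import product

def is_prime(m):
    """Trial division; adequate for the small numbers tested here."""
    if m < 2:
        return False
    f = 2
    while f * f <= m:
        if m % f == 0:
            return False
        f += 1
    return True

def count_1379_primes(n):
    """Count n-digit primes whose decimal digits are all in {1, 3, 7, 9}."""
    return sum(is_prime(int("".join(ds))) for ds in product("1379", repeat=n))

for n in range(1, 6):
    print(n, count_1379_primes(n))
```

For n = 1 the count is 2 (the primes 3 and 7), and for n = 2 it is 10 (11, 13, 17, 19, 31, 37, 71, 73, 79, 97).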

October 19

Linear estimation of a local minimum from discrete data

I have an array of data — say, six (x,y) values — characterized by a minimum y value in the middle of the set, increasing toward the ends of the x range. (The x,y values in this case come from a volatility smile, which I have characterized as a superposition of two hyperbolic functions, but that isn't important.) Basically, I have discrete points on a smooth curve, and I don't want to know or care what the underlying curve is.

Is there a quick linear method for approximating the value of x that corresponds to the minimum of the smooth curve, without resorting to spline fits, other polynomial fits, or fitting my own data model (which doesn't have a closed form solution)?

For example, I can eyeball it easily enough:

*



    *             *

              *
        *_ *
         ^
       minimum
     likely here

I'm just wondering if there's something I can do, perhaps with a weighted average of the slopes of each segment, that might give me an approximation of the minimum without curve-fitting. Somebody must have figured this out. Is there a technique for this? ~Anachronist (talk) 06:46, 19 October 2020 (UTC)[reply]

@Anachronist: Simply calculating the argmin from the data directly should suffice, right? This linear-time procedure will take less time than anything else I could suggest here. Otherwise, your answer will depend on what kind of smooth curve you select; you'd need to essentially find the root of the derivative and numerical differentiation really only works here with some interpolation of some sort.--Jasper Deng (talk) 07:16, 19 October 2020 (UTC)[reply]
You can fit a quadratic curve (or, for that matter, any polynomial of a fixed degree) in linear time. If you give the data points weights for some between 0 and 1, the higher values – presumably farther away from the minimum – get a lower weight, which is useful if the shape of the curve is poorly approximated by a parabola; that in the example looks more cuspy. (This method of using weights is insensitive to translations but sensitive to scaling of the values.) It will work poorly, though, if the curve has relatively flat ends. You can alternatively use a sliding window of a fixed width, say with indices to keep track of a data segment of length . The -th data point is stored at position After detecting the minimum in the stream of data à la JD, keep copying to the window for more data points; then stop. The discrete minimum is in the middle of the segment of the retained data points. Then use any method of curve fitting to find the minimum.  --Lambiam 08:27, 19 October 2020 (UTC)[reply]
@Jasper Deng: If you're suggesting that argmin is simply the minimum-value data point, no, that isn't what I need. In my crude illustration, I deliberately put two points (asterisks) at the same y-level, with different slopes on the left and right, to suggest that the minimum of the curve isn't right in the middle of those two lowest points, nor is it either one or the other point if the points are unequal by a tiny amount. In the illustration, the minimum (the underscore character) is closer to the left low point.
I guess I could restate the problem this way:
Given:
  • an unknown asymmetric curve having a first derivative that increases smoothly from negative to positive in an unknown non-linear way,
  • a sparse sampling of known (x,y) data points on that curve, equally spaced along the x axis,
  • a minimum y value in the range of (x,y) samples available,
what approach would result in a quick approximation of the x location (not the value) of the minimum of the curve?
@Lambiam: The best fit curve for data points in a volatility smile appears to be a skewed hyperbola. I haven't found an efficient way to fit such a thing without resorting to iterative techniques. You've given me an idea with sliding windows and linear interpolation though. I need to think about it. ~Anachronist (talk) 18:56, 19 October 2020 (UTC)[reply]
This is closely analogous (though the other way up) to estimating the position of the mode when data have been classified, the construction involving a pair of crossed straight lines at the top of the modal class bar of the drawn histogram. A simple estimate, which you would have to judge the adequacy of, comes from projecting inwards the straight lines given by joining the data points closest to the minimum to the respective next ones moving outwards. → 2A00:23C6:AA08:E500:40DF:4AE9:718D:E2EA (talk) 13:14, 20 October 2020 (UTC)[reply]
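The crossed-lines construction can be sketched directly. Which two points anchor each line follows my reading of the description above (an assumption), and the discrete minimum is assumed to be an interior point with at least two samples on each side:

```python
def crossed_lines_min(xs, ys):
    """Estimate the x-position of the minimum of a sampled curve by
    intersecting the line through the two points descending into the
    minimum with the line through the two points climbing out of it."""
    i = min(range(len(ys)), key=ys.__getitem__)        # discrete argmin
    j = i - 1 if ys[i - 1] < ys[i + 1] else i + 1      # lower neighbour
    l, r = min(i, j), max(i, j)                        # minimum lies in [xs[l], xs[r]]
    # Left line through points l-1 and l; right line through points r and r+1.
    ml = (ys[l] - ys[l - 1]) / (xs[l] - xs[l - 1])
    mr = (ys[r + 1] - ys[r]) / (xs[r + 1] - xs[r])
    # Intersect y = ys[l] + ml*(x - xs[l]) with y = ys[r] + mr*(x - xs[r]).
    return (ys[r] - ys[l] + ml * xs[l] - mr * xs[r]) / (ml - mr)

# Asymmetric V with slopes -2 and +1 and true minimum at x = 1.3.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [max(-2 * (x - 1.3), x - 1.3) for x in xs]
print(crossed_lines_min(xs, ys))   # recovers the kink near x = 1.3
```

For piecewise-linear data the intersection reproduces the kink; for smooth data it gives the kind of quick estimate asked about, at the cost of ignoring curvature.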

October 20