Revision as of 09:03, 10 January 2021

Welcome to the mathematics section
of the Wikipedia reference desk.


Shortcut

WP:RD/MA

Want a faster answer?

Main page: Help searching Wikipedia

How can I get my question answered?

Select the section of the desk that best fits the general topic of your question (see the navigation column to the right).
Post your question to only one section, providing a short header that gives the topic of your question.
Type '~~~~' (that is, four tilde characters) at the end – this signs and dates your contribution so we know who wrote what and when.
Don't post personal contact information – it will be removed. Any answers will be provided here.
Please be as specific as possible, and include all relevant context – the usefulness of answers may depend on the context.
Note:
- We don't answer (and may remove) questions that require medical diagnosis or legal advice.
- We don't answer requests for opinions, predictions or debate.
- We don't do your homework for you, though we'll help you past the stuck point.
- We don't conduct original research or provide a free source of ideas, but we'll help you find information you need.

Ready? Ask a new question!

How do I answer a question?

Main page: Wikipedia:Reference desk/Guidelines

The best answers address the question directly, and back up facts with wikilinks and links to sources. Do not edit others' comments and do not give any medical or legal advice.

January 3

Generating numbers with a modulus which is not a power of two?

Suppose you had a stream of bits from which to construct some numbers using a modulus of, say, 72. Does the fact that the modulus is not a power of two introduce any bias in the result? For some reason it just seems that it would. Earl of Arundel (talk) 14:53, 3 January 2021 (UTC)

Here's my attempt to remove any possible bias. (Although I'm not sure it's even necessary at this point!)

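/* ulong is not a standard C type; this typedef is assumed. */
typedef unsigned long ulong;

/* For every bit set in modulus, add bits reduced modulo that power of two. */
ulong generate(ulong bits, ulong modulus)
{
 ulong sum = 0;
 ulong power = 1;
 /* power runs through 1, 2, 4, ... and becomes 0 when it overflows */
 while(power != 0)
 {
  if(modulus & power)
   sum += bits % power;
  power <<= 1;
 }
 return sum;
}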

Earl of Arundel (talk) 15:05, 3 January 2021 (UTC)
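To see concretely whether plain reduction modulo 72 is biased, one can tabulate the residues of all values of a fixed bit width. The following is only a minimal sketch and not the poster's code; the function name `residue_counts` and its parameters are made up for illustration.

```c
#include <assert.h>

/* Count how often each residue occurs when every n_bits-wide value
 * (0 .. 2^n_bits - 1) is reduced modulo m.
 * counts must have room for m entries. */
void residue_counts(unsigned n_bits, unsigned m, unsigned long counts[])
{
    assert(n_bits < 32 && m > 0);
    for (unsigned r = 0; r < m; r++)
        counts[r] = 0;
    for (unsigned long v = 0; v < (1UL << n_bits); v++)
        counts[v % m]++;
}
```

With `n_bits = 7` and `m = 72`, residues 0 to 55 are hit twice and residues 56 to 71 only once, because 128 = 72 + 56. So taking 7 uniformly random bits modulo 72 is biased towards the smaller residues, whereas a modulus of 64 would give equal counts.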

This does not solve the issue for long streams. If you have bits galore, as in a cheaply obtained infinite stream, an easy solution is to chop up the stream into parts of length $n$ such that $m\leq 2^{n},$ where $m$ stands for the modulus, interpreting these as binary numbers and discarding all outside the range from $0$ to $m-1$. For a modulus of $72$ you should choose $n=7,$ which means that about 44% of the bits are thrown away. If you have to pay dearly for each bit, this is not a good approach.

Instead, think of the problem as having to convert an infinitely expanded fraction of a value in the unit interval given in some base $a$ (here with $a=2$) to one in base $b$ (here with $b=m$). For example, suppose you are given the stream $001001000011111101101010100...,$ (which happens to be the fractional part of $\pi$ expressed as a binary fraction, instead of the usual decimal $14159265...$). If the desired modulus is $m=3$, this should now become $0102110122220102...,$ again the same value, but this time ternary. Using this example, without looking ahead we know that the binary fraction is contained in the closed interval from $0.001001000011111101101010100{\overline {0}}$ to $0.001001000011111101101010100{\overline {1}}$, in which an overlined digit denotes an infinite repetition. The corresponding respective ternary fractions are $0.{\underline {0102110122220102}}1002...$ and $0.{\underline {0102110122220102}}2001...,$ in which the common initial part is underlined. So after having seen an initial segment $001001000011111101101010100$ of the input stream, you can already produce the initial segment $0102110122220102$ of the output stream.

This method uses all digits from the input; no information is discarded. There is a theoretical risk that you may have to wait indefinitely before a next output digit becomes certain. If the bitstream begins with $010101...,$ with an interminably repeated alternation of $0$ and $1$, you are kept in suspense about whether to produce a ternary $0$ or $1$ until the pattern is broken. For a random input stream this is not a real problem. You need to be prepared, though, to compute with very large integers. --Lambiam 16:03, 3 January 2021 (UTC)

There are intermediate schemes where you throw out some information, but not as much as you would by generating one number at a time. For example, if you want to generate numbers modulo 3, then interpret each 8-bit chunk as a number between 0 and 255, and if this is less than 243 then express it in base 3 to get 5 numbers modulo 3. You end up throwing out 1 − 243/256 ≈ 5% of the bits instead of 25% as when you generate the numbers one by one. --RDBury (talk) 01:55, 4 January 2021 (UTC)
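The 8-bit chunk scheme is short enough to sketch directly. This is an illustration only; the name `chunk_to_trits` and the convention of returning 0 for a rejected chunk are assumptions made for the example.

```c
#include <assert.h>

/* Convert one 8-bit chunk into 5 base-3 digits, most significant first.
 * Returns 5 on success, or 0 if the chunk is >= 3^5 = 243 and has to be
 * discarded to keep the output unbiased. */
int chunk_to_trits(unsigned chunk, int trits[5])
{
    assert(chunk < 256);
    if (chunk >= 243)
        return 0;
    for (int i = 4; i >= 0; i--) {
        trits[i] = (int)(chunk % 3);
        chunk /= 3;
    }
    return 5;
}
```

For example, the chunk 100 yields the trits 1, 0, 2, 0, 1 (since 100 = 81 + 2·9 + 1), while any chunk from 243 to 255 yields nothing and the caller moves on to the next 8 bits.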

On average you lose a small amount more. For each 8 bits in, you obtain 5 trits output with probability 243/256 and 0 trits with probability 13/256; on average 1215/256 = 4.74610 trits. The information content of a trit is log₂ 3 = 1.58496 bits. So the average information content extracted from 8 bits equals 4.74610 × 1.58496 = 7.52238 bits. The efficiency is then 7.52238/8 = 0.94030, so the loss is 0.05970 bits per input bit, or almost 6%. In a general formula, for $m$ base-$a$ digits in and $n$ base-$b$ digits out, where $b^{n}\leq a^{m}$, the efficiency is

${\frac {b^{n}\log(b^{n})}{a^{m}\log(a^{m})}}.$

Splitting the input up in chunks of 290 bits to get out batches of 47 base-72 digits, the efficiency is slightly above 0.99. --Lambiam 03:05, 4 January 2021 (UTC)
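As a check, the general formula can be evaluated for the two cases mentioned in this thread (values rounded). For 8 bits in and 5 trits out, $a=2$, $m=8$, $b=3$, $n=5$:

$${\frac {3^{5}\log(3^{5})}{2^{8}\log(2^{8})}}={\frac {243}{256}}\cdot {\frac {5\log _{2}3}{8}}\approx 0.94922\times 0.99060\approx 0.94030.$$

For 290 bits in and 47 base-72 digits out, $a=2$, $m=290$, $b=72$, $n=47$:

$${\frac {72^{47}\log(72^{47})}{2^{290}\log(2^{290})}}=2^{47\log _{2}72-290}\cdot {\frac {47\log _{2}72}{290}}\approx 2^{-0.0135}\times 0.99995\approx 0.9906.$$

The first factor is the probability that a chunk is accepted, and the second is the ratio of output information to input information for an accepted chunk.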

Yes, the 5% was a lower bound on the information lost, basically a raw percentage of the bits thrown out. Any computation done will reduce efficiency more. In this case it was the conversion from binary (8 bits) to ternary (5 × log₂ 3 ≈ 7.9 bits). You can get arbitrarily close to 100% efficiency by using larger and larger chunks, but it's a case of diminishing returns. You could also increase efficiency without increasing chunk size by being a bit smarter about throwing out bits. For example, if your chunk starts '11111' then you don't need to look at the remaining 3 bits to know that the number is >242, so just throw out those 5 and start over, saving yourself 3 bits. Again, it's a trade-off between simplicity and efficiency. --RDBury (talk) 05:49, 4 January 2021 (UTC)

Derivative of matrix

${\frac {d}{dt}}\ln {A(t)}=A^{-1}(t)A'(t)$

Or

$=A'(t)A^{-1}(t)$

Which is true?

How do you define the function $\ln$ on matrices? You can define it as the inverse – if it exists – of the function $\exp$ defined by a power series:

$\exp A=1+A+{\frac {A^{2}}{2}}+{\frac {A^{3}}{6}}+\cdots ~.$

But many of the conventional rules for determining derivatives do not work for matrices. For example, while

${\frac {d}{dt}}f(t)^{3}=3\left({\frac {d}{dt}}f(t)\right)f(t)^{2},$

the identity

${\frac {d}{dt}}A^{3}=3\left({\frac {d}{dt}}A\right)A^{2}$

is invalid. The correct rule is

${\frac {d}{dt}}A^{3}=\left({\frac {d}{dt}}A\right)A^{2}+A\left({\frac {d}{dt}}A\right)A+A^{2}\left({\frac {d}{dt}}A\right).$

The conventional rule fails because matrix multiplication is not commutative. I think that both of these rules for the derivative of the logarithm of a matrix likewise fail to hold. --Lambiam 19:00, 3 January 2021 (UTC)
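The difference between the two rules for the derivative of $A^{3}$ can be checked exactly on a small example. The sketch below assumes the linear family $A(t)=A_{0}+tA_{1}$ of 2×2 integer matrices; all type and function names are invented for the illustration. Writing $f(t)=A(t)^{3}$, the coefficient of $t$ in $f$ is the derivative at $t=0$, and it can be recovered without rounding from $f(1)-f(-1)=2f'(0)+2A_{1}^{3}$.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct { long e[2][2]; } Mat2;

Mat2 mat_mul(Mat2 x, Mat2 y)
{
    Mat2 r;
    for (int i = 0; i < 2; i++)
        for (int j = 0; j < 2; j++)
            r.e[i][j] = x.e[i][0] * y.e[0][j] + x.e[i][1] * y.e[1][j];
    return r;
}

/* Returns x + s*y. */
Mat2 mat_axpy(Mat2 x, long s, Mat2 y)
{
    Mat2 r;
    for (int i = 0; i < 2; i++)
        for (int j = 0; j < 2; j++)
            r.e[i][j] = x.e[i][j] + s * y.e[i][j];
    return r;
}

Mat2 mat_cube(Mat2 x) { return mat_mul(mat_mul(x, x), x); }

bool mat_equal(Mat2 x, Mat2 y)
{
    for (int i = 0; i < 2; i++)
        for (int j = 0; j < 2; j++)
            if (x.e[i][j] != y.e[i][j])
                return false;
    return true;
}

/* Exact derivative of (a0 + t*a1)^3 at t = 0, using
 * f(1) - f(-1) = 2*f'(0) + 2*a1^3. */
Mat2 derivative_of_cube(Mat2 a0, Mat2 a1)
{
    Mat2 plus = mat_cube(mat_axpy(a0, 1, a1));
    Mat2 minus = mat_cube(mat_axpy(a0, -1, a1));
    Mat2 a1_cubed = mat_cube(a1);
    Mat2 d;
    for (int i = 0; i < 2; i++)
        for (int j = 0; j < 2; j++)
            d.e[i][j] = (plus.e[i][j] - minus.e[i][j]) / 2 - a1_cubed.e[i][j];
    return d;
}

/* A'A^2 + A A' A + A^2 A' (the rule valid for matrices). */
Mat2 product_rule(Mat2 a, Mat2 da)
{
    Mat2 a2 = mat_mul(a, a);
    Mat2 sum = mat_mul(da, a2);
    sum = mat_axpy(sum, 1, mat_mul(mat_mul(a, da), a));
    return mat_axpy(sum, 1, mat_mul(a2, da));
}

/* 3 A' A^2 (the scalar rule, only valid when A and A' commute). */
Mat2 scalar_rule(Mat2 a, Mat2 da)
{
    Mat2 zero = {{{0, 0}, {0, 0}}};
    return mat_axpy(zero, 3, mat_mul(da, mat_mul(a, a)));
}
```

With $A_{0}={\begin{bmatrix}1&2\\3&4\end{bmatrix}}$ and $A_{1}={\begin{bmatrix}0&1\\0&0\end{bmatrix}}$, `derivative_of_cube` agrees with `product_rule` but not with `scalar_rule`; if $A_{1}$ is replaced by the identity, which commutes with everything, all three agree.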

If A is sufficiently close to I then you can define ln(A) as (A-I)-(A-I)²/2+(A-I)³/3-(A-I)⁴/4 ... . There are probably other A's for which ln(A) can be reasonably defined, but that doesn't solve the basic problem here, namely that the usual rules for calculus have to be drastically recast or thrown out altogether when you don't have commutativity. Note that

$\exp({\begin{bmatrix}0&0\\0&0\end{bmatrix}})=\exp({\begin{bmatrix}0&2\pi \\-2\pi &0\end{bmatrix}})=I.$

So defining the inverse of exp on matrices is trickier than it may look at first blush. --RDBury (talk) 02:21, 4 January 2021 (UTC)
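The claim that two different matrices have the same exponential can be checked numerically by summing the power series for exp. This is a rough sketch for 2×2 matrices in double precision, with made-up names and a fixed number of terms; it is not a robust general-purpose matrix exponential.

```c
#include <assert.h>

typedef struct { double e[2][2]; } Mat2d;

Mat2d mat2d_mul(Mat2d x, Mat2d y)
{
    Mat2d r;
    for (int i = 0; i < 2; i++)
        for (int j = 0; j < 2; j++)
            r.e[i][j] = x.e[i][0] * y.e[0][j] + x.e[i][1] * y.e[1][j];
    return r;
}

/* exp(A) = I + A + A^2/2! + A^3/3! + ..., truncated after `terms` terms. */
Mat2d mat2d_exp(Mat2d a, int terms)
{
    assert(terms > 0);
    Mat2d sum = {{{1.0, 0.0}, {0.0, 1.0}}};
    Mat2d term = sum;                       /* A^k / k!, starting at k = 0 */
    for (int k = 1; k < terms; k++) {
        term = mat2d_mul(term, a);
        for (int i = 0; i < 2; i++)
            for (int j = 0; j < 2; j++) {
                term.e[i][j] /= (double)k;
                sum.e[i][j] += term.e[i][j];
            }
    }
    return sum;
}

/* Largest absolute entry-wise difference between x and y. */
double mat2d_max_abs_diff(Mat2d x, Mat2d y)
{
    double worst = 0.0;
    for (int i = 0; i < 2; i++)
        for (int j = 0; j < 2; j++) {
            double d = x.e[i][j] - y.e[i][j];
            if (d < 0.0)
                d = -d;
            if (d > worst)
                worst = d;
        }
    return worst;
}
```

Evaluating `mat2d_exp` with, say, 60 terms on the zero matrix and on the matrix with entries 0, 2π, −2π, 0 gives the identity in both cases, up to rounding error in the second.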

See here and also here. Count Iblis (talk) 08:52, 4 January 2021 (UTC)

The issue of defining $\ln$ on matrices can be bypassed by reformulating the original question as: assuming that $A(t)=\exp B(t),$ can we express ${\tfrac {d}{dt}}B(t)$ as a product of the inverse $A^{-1}(t)$ and the derivative ${\tfrac {d}{dt}}A(t)$ of $A(t)$? And the answer is, no, not in general. It is true for the special case of diagonal matrices, for which multiplication is commutative, and then either order of the multiplicands works. While not immediately related, I can report that the following identity appears to hold:

$\exp(\lambda A)\exp(\mu A)=\exp((\lambda +\mu )A).$

It follows that

$(\exp A)^{n}=\exp(nA).$

--Lambiam 11:22, 4 January 2021 (UTC)

Generally $\exp(A)\exp(B)=\exp(A+B)$ if the operators commute. Ruslik_Zero 08:07, 5 January 2021 (UTC)
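That commutativity is really needed can be seen from a small worked example (the matrices are chosen here for illustration). Take

$$A={\begin{bmatrix}0&1\\0&0\end{bmatrix}},\qquad B={\begin{bmatrix}0&0\\1&0\end{bmatrix}},\qquad AB={\begin{bmatrix}1&0\\0&0\end{bmatrix}}\neq {\begin{bmatrix}0&0\\0&1\end{bmatrix}}=BA.$$

Since $A^{2}=B^{2}=0$, the power series give $\exp A=I+A$ and $\exp B=I+B$, so

$$\exp(A)\exp(B)={\begin{bmatrix}1&1\\0&1\end{bmatrix}}{\begin{bmatrix}1&0\\1&1\end{bmatrix}}={\begin{bmatrix}2&1\\1&1\end{bmatrix}}.$$

On the other hand $(A+B)^{2}=I$, so the series for $\exp(A+B)$ splits into even and odd powers:

$$\exp(A+B)=\cosh(1)\,I+\sinh(1)\,(A+B)={\begin{bmatrix}\cosh 1&\sinh 1\\\sinh 1&\cosh 1\end{bmatrix}}\approx {\begin{bmatrix}1.5431&1.1752\\1.1752&1.5431\end{bmatrix}},$$

which differs from $\exp(A)\exp(B)$.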

Matrix logarithm covers the topic of defining the logarithm of a matrix. Like in the 1D case, there are branches, but I am not too familiar with several complex variables or the behavior of this particular multivalued function.--Jasper Deng (talk) 10:26, 7 January 2021 (UTC)
Now there's an idea: Look it up on Wikipedia! Okay, I feel silly now. At least we managed to cover a good chunk of the article. In any case, the article doesn't mention derivatives. It just came to me that log does define a map from some open set in C^n² to C^n², so it has a derivative in that sense. From this point of view it would be an n² by n² matrix, or perhaps more accurately a linear map from the space of n×n matrices to itself, which may be complicated but it is defined. There's probably a way to get an expression for this, but it's not coming to me at the moment. --RDBury (talk) 00:15, 9 January 2021 (UTC)
It seems to me that the parameter $t$ of $A(t)$ with respect to which the derivative is taken is meant to range over $\mathbf {R} .$ --Lambiam 00:26, 9 January 2021 (UTC)
Yes, the chain rule implies that d/dt ln(A(t)) = Dln(A)(A'(t)) where Dln(A) is the derivative considered as a map from M_nn to M_nn. So knowing one is equivalent to knowing the other. Note that, by the above series for exp(A), you can write Dexp(A) = I + (ad_L(A)+ad_R(A))/2 + (ad_L²(A)+ad_L(A)ad_R(A)+ad_R(A)²)/6 +... where ad_L is the map U→AU and ad_R is the map U→UA. Similarly, I think you can get a series expression for Dln(A) when A is close to I, but it would be more complicated. I don't know if Derivative of the exponential map (linked above) is relevant here, because the domain of exp in that context is a Lie algebra; it might be applicable though. --RDBury (talk) 04:23, 9 January 2021 (UTC)

January 7

limit of Likelihood of tie for most selected of others

(From this week's RuPaul's Drag Race)... If 7 queens vote at random for one of the other queens (in this case they can't vote for themselves; Ben DeLaCreme isn't part of this), what is the probability of a tie for the queen with the most votes? Does this probability go up or down if they can vote for themselves? As the number of queens goes up, does the probability go up or down?

If self-votes are not allowed, the probability of a tie for 7 queens equals $129534/6^{7}=0.46273$. If allowed, it goes down to $372540/7^{7}=0.45236$. Increasing the number of queens increases the likelihood of a tie, except if that number was less than $3$. I suspect that the limit equals $1-1/e=0.63212$. --Lambiam 09:56, 7 January 2021 (UTC)

Here's a computer calculation showing the same results as stated above, as well as the results for lower numbers. --116.86.4.41 (talk) 10:45, 7 January 2021 (UTC)
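A brute-force enumeration of this kind can be sketched as follows. It is not the calculation linked above, just a minimal illustration: the name `count_ties`, its parameters, and the limit of 9 voters are choices made for this example. Each outcome is encoded as a number whose base-n digits are the votes.

```c
#include <assert.h>
#include <stdbool.h>

/* Count the ways n voters (2 <= n <= 9) can each vote for one of n
 * candidates such that at least two candidates share the highest vote
 * count. If allow_self is false, voter i may not vote for candidate i.
 * The number of admissible outcomes is stored in *total. */
unsigned long count_ties(int n, bool allow_self, unsigned long *total)
{
    assert(n >= 2 && n <= 9 && total != 0);
    unsigned long outcomes = 1;
    for (int i = 0; i < n; i++)
        outcomes *= (unsigned long)n;          /* n^n vote assignments */

    unsigned long ties = 0;
    *total = 0;
    for (unsigned long code = 0; code < outcomes; code++) {
        int votes[9] = {0};
        unsigned long rest = code;
        bool admissible = true;
        for (int voter = 0; voter < n; voter++) {
            int choice = (int)(rest % (unsigned long)n);
            rest /= (unsigned long)n;
            if (!allow_self && choice == voter) {
                admissible = false;            /* self-vote, skip outcome */
                break;
            }
            votes[choice]++;
        }
        if (!admissible)
            continue;
        (*total)++;

        int best = 0, holders = 0;             /* top count and who has it */
        for (int c = 0; c < n; c++) {
            if (votes[c] > best) {
                best = votes[c];
                holders = 1;
            } else if (votes[c] == best) {
                holders++;
            }
        }
        if (holders >= 2)
            ties++;
    }
    return ties;
}
```

For n = 7 this counts 129534 ties among the 6⁷ = 279936 outcomes without self-votes and 372540 ties among the 7⁷ = 823543 outcomes with self-votes, matching the fractions given above.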

January 10
