Properties of polynomial roots

Properties of polynomial roots

In mathematics, a univariate polynomial is an expression of the form

a_0 + a_1 x + \cdots + a_n x^n,\quad a_n\not = 0,

where the ai belong to some field, which, in this article, is always the field \mathbb C of the complex numbers. The natural number n is known as the degree of the polynomial.

In the following, p will be used to represent the polynomial, so we have

p = a_0 + a_1 x + \cdots + a_n x^n.

A root of the polynomial p is a solution of the equation p = 0: that is, a complex number a such that p(a) = 0.

The fundamental theorem of algebra combined with the factor theorem states that the polynomial p has n roots in the complex plane, if they are counted with their multiplicities.

This article concerns various properties of the roots of p, including their location in the complex plane.

Contents

  • Continuous dependence on coefficients 1
  • Algebraic expression of roots 2
  • Complex conjugate root theorem 3
  • Radical conjugate roots 4
  • Bounds on (complex) polynomial roots 5
    • Based on the Rouché theorem 5.1
    • Other bounds 5.2
      • Proof 5.2.1
    • Landau's inequality 5.3
    • Bounds on positive polynomial roots 5.4
    • Polynomials with real roots 5.5
  • Gauss–Lucas theorem 6
  • Statistical repartition of the roots 7
    • Asymptotic results 7.1
  • See also 8
  • Notes 9
  • References 10

Continuous dependence on coefficients

The n roots of a polynomial of degree n depend continuously on the coefficients.

This result implies that the eigenvalues of a matrix depend continuously on the matrix. A proof can be found in a book of Tyrtyshnikov.[1]

The problem of approximating the roots given the coefficients is ill-conditioned. See, for example, Wilkinson's polynomial.

Algebraic expression of roots

All roots of polynomials with rational coefficients are algebraic numbers, by definition of the latter. If the degree n of the polynomial is no greater than 4, all the roots of the polynomial can be written as algebraic expressions in terms of the coefficients—that is, applying only the four basic arithmetic operations and the extraction of n-th roots. But by the Abel-Ruffini theorem, this cannot be done in general for higher-degree equations.

Complex conjugate root theorem

The complex conjugate root theorem states that if the coefficients of a polynomial are real, then the non-real roots appear in pairs of the type a ± ib.

For example, the equation x2 + 1 = 0 has roots ± i.

Radical conjugate roots

It can be proved that if a polynomial P(x) with rational coefficients has a + √b as a root, where a, b are rational and \sqrt{b} is irrational, then a − √b is also a root. First observe that

\left(x - \left [ a + \sqrt b \right ] \right) \left(x - \left [ a - \sqrt b \right ] \right) = (x - a)^2 - b.

Denote this quadratic polynomial by D(x). Then, by the Euclidean division of polynomials,

P(x) = D(x)Q(x) + cx + d = ((x - a)^2 - b)Q(x) + cx + d, \,\!

where c, d are rational numbers (by virtue of the fact that the coefficients of P(x) and D(x) are all rational). But a + √b is a root of P(x):

P\left( a + \sqrt b \right) = c\left(a + \sqrt b \right) + d = (ac + d) + c \sqrt b = 0.

It follows that c, d must be zero, since otherwise the final equality could be arranged to suggest the irrationality of rational values (and vice versa). Hence P(x) = D(x)Q(x), for some quotient polynomial Q(x), and D(x) is a factor of P(x).[2]

This property may be generalized as: If an irreducible polynomial P has a root in common with a polynomial Q, then P divides Q evenly.

Bounds on (complex) polynomial roots

Based on the Rouché theorem

A very general class of bounds on the magnitude of roots is implied by the Rouché theorem. If there is a positive real number R and a coefficient index k such that

|a_k|\,R^k > |a_0|+\cdots+|a_{k-1}|\,R^{k-1}+|a_{k+1}|\,R^{k+1}+\cdots+|a_n|\,R^n

then there are exactly k (counted with multiplicity) roots of absolute value less than R. For k=0,n there is always a solution to this inequality, for example

  • for k=n,
R=1+\frac1{|a_n|}\max\{|a_0|,|a_1|,\dots, |a_{n-1}|\} or
R=\max\left(1,\,\frac1{|a_n|}\left(|a_0|+|a_1|+\cdots+|a_{n-1}|\right)\right)
are upper bounds for the size of all roots,
  • for k=0,
R=\frac{|a_0|}{|a_0|+\max\{|a_1|,|a_2|,\dots, |a_{n}|\}} or
R=\frac{|a_0|}{\max(|a_0|,\,|a_1|+|a_2|+\cdots+|a_{n}|)}

are lower bounds for the size of all of the roots.

  • for all other indices, the function
h(R)=|a_0|\,R^{-k}+\cdots+|a_{k-1}|\,R^{-1}-|a_k|+|a_{k+1}|\,R+\cdots+|a_n|\,R^{n-k}
is convex on the positive real numbers, thus the minimizing point is easy to determine numerically. If the minimal value is negative, one has found additional information on the location of the roots.

One can increase the separation of the roots and thus the ability to find additional separating circles from the coefficients, by applying the root squaring operation of the Dandelin-Graeffe iteration to the polynomial.

A different approach is by using the Gershgorin circle theorem applied to some companion matrix of the polynomial, as it is used in the Weierstraß–(Durand–Kerner) method. From initial estimates of the roots, that might be quite random, one gets unions of circles that contain the roots of the polynomial.

Other bounds

Useful upper bounds for the magnitudes of all of a polynomial's roots[3] include the near optimal Fujiwara bound[4]

2\, \max \left\{ \left|\frac{a_{n-1}}{a_n}\right|, \left|\frac{a_{n-2}}{a_n}\right|^{\frac{1}{2}}, \cdots, \left|\frac{a_0}{2a_n}\right|^\frac 1 n\right\},

which is an improvement (as the geometric mean) of Kojima's bound:[5]

2\,\max \left\{ \left|\frac{a_{n-1}}{a_n}\right|,\left|\frac{a_{n-2}}{a_{n-1}}\right|, \cdots, \left|\frac{a_0}{2a_1}\right|\right\}.

Other bounds are the Cauchy bound[6]

1+\max\left\{\left|\frac{a_0}{a_n}\right|,\left|\frac{a_1}{a_n}\right|,\cdots, \left|\frac{a_{n-1}}{a_n}\right|\right\}

and the Lagrange bound[7][8]

\max\left\{1,\sum_{i=0}^{n-1} \left|\frac{a_i}{a_n}\right|\right\}

or

\sum_{i=0}^{n-1} \left|\frac{a_i}{a_{i+1}}\right|.

These expressions return only bounds surpassing unity, so they cannot be used for some polynomials.

A tighter upper bound on the magnitudes of the roots is[9]

\max\left\{1+\frac{|a_{n-1}|}{|a_n|}, \frac{|a_{n-2}|}{|a_n|}, \dots , \frac{|a_0|}{|a_n|}\right\}.

Without loss of generality we assume the polynomial to be monic with general term aixi. Let { aj } be the set of negative coefficients. An upper bound for the positive real roots is given by the sum of the two largest numbers in the set { |aj|1/j }. This is an improvement on Fujiwara's bound which uses twice the maximum value of this set as its upper bound.

A similar bound also due to Lagrange holds for a polynomial with complex coefficients. Again assume the polynomial to be monic with general term aixi. Then the upper bound for the absolute values of the roots is given by the sum of the two greatest values in the set { |ai|1/i }. Again this is an improvement on Fujiwara's bound which uses twice the maximum value of this set as its upper bound.

A third bound also due to Lagrange holds for a polynomial with real coefficients. Let the aixn-i be the general term of the polynomial with 0 ≤ im. Let the first d terms of the polynomial have positive coefficients and let A be the maximum of these d coefficients. Then 1 + ( A / a0 )1/( 1 + d ) is an upper bound to the positive roots of the polynomial.

Sun and Hsieh obtained an improvement on Cauchy's bound.[10] assume the polynomial is monic with general term aixi. Sun and Hsieh showed that upper bounds 1 + d1 and 1 + d2 could be obtained from the following equations.

d_1 = \tfrac{1}{2} \left((| a_{n-1}| - 1) + \sqrt{(|a_{n-1}| - 1 )^2 + 4a } \right), \qquad a = \max \{ |a_i | \}.

d2 is the positive root of the cubic equation

Q(x) = x^3 + (2 - |a_{n-1}|) x^2 + (1 - |a_{n-1}| - |a_{n-2}| ) x - a, \qquad a = \max \{ |a_i | \}

They also noted that d2d1

Proof

Let ζ be a root of the polynomial

z^n+a_{n-1}z^{n-1}+\cdots+a_1z +a_0;

in order to prove the inequality |ζ| ≤ Rp we can assume, of course, |ζ| > 1. Writing the equation as

-\zeta^n=a_{n-1}\zeta^{n-1}+\cdots+a_1\zeta+a_0,

and using the Hölder's inequality we find

|\zeta|^n\leq \|a\|_p \left \|(\zeta^{n-1},\cdots,\zeta, 1) \right \|_q.

Now, if p = 1, this is

|\zeta|^n\leq\|a\|_1\max \left \{|\zeta|^{n-1},\cdots,|\zeta|,1 \right \} =\|a\|_1|\zeta|^{n-1},

thus

|\zeta|\leq \max\{1,\|a\|_1\}.

In the case 1 < p ≤ ∞, taking into account the summation formula for a geometric progression, we have

|\zeta|^n\leq \|a\|_p \left(|\zeta|^{q(n-1)}+\cdots+|\zeta|^q +1\right)^{\frac{1}{q}}=\|a\|_p \left(\frac{|\zeta|^{qn}-1}{|\zeta|^q-1}\right)^{\frac{1}{q}}\leq\|a\|_p \left(\frac{|\zeta|^{qn}}{|\zeta|^q-1}\right)^{\frac{1}{q}},

thus

|\zeta|^{nq}\leq \|a\|_p^q \frac{|\zeta|^{qn}}{|\zeta|^q-1}

simplifying we get,

|\zeta|^q\leq 1+\|a\|_p^q.

Therefore

|\zeta|\leq \left \| \left (1,\|a\|_p \right ) \right \|_q=R_p,

holds, for all 1 ≤ p ≤ ∞.

Landau's inequality

The previous bounds are upper bounds for each root separately. Landau's inequality provides an upper bound for the absolute values of the product of the roots that have an absolute value greater than one. This bound for the product of roots is not much greater than the preceding bounds of each root separately.[11]

Let z_1, \ldots, z_n be the n roots of the polynomial p, and

M(p)=|a_n|\prod_{j=1}^n \max(1,|z_j|).

Then

M(p)\le \sqrt{|a_0|^2 +|a_1|^2 +\cdots |a_n|^2}\,.

This bound is useful to bound the coefficients of a divisor of a polynomial: if

q=b_m x^m +\cdots+b_0

is a divisor of p, then

|b_0| +|b_1| +\cdots |b_m| \le 2^m\,\left | \frac{b_m}{a_n}\right |\, M(p)\,.

Bounds on positive polynomial roots

There also exist bounds on just the positive roots of polynomials; these bounds were developed by Akritas, Strzeboński and Vigklas based on previous work by Doru Stefanescu. They are used in the computer algebra systems Mathematica, Sage, SymPy, Xcas etc.[12][13]

Polynomials with real roots

It is possible to determine the bounds of the roots of a polynomial using Samuelson's inequality. This method is due to a paper by Laguerre.[14]

Let a_n x^n+a_{n-1}x^{n-1}+\ldots+a_1x+a_0 be a polynomial with all real roots. Then its roots are located in the interval with endpoints

x_\pm=-\frac{a_{n-1}}{na_n} \pm \frac{n-1}{na_n}\sqrt{a^2_{n-1} - \frac{2n}{n-1}a_n a_{n-2}}.

Example: The polynomial x^4+5x^3+5x^2-5x-6 has four real roots −3, −2, −1 and 1. The above formula gives

x_\pm=-\frac{5}{4} \pm \frac{3}{4}\sqrt{\frac{35}{3}};

thus its roots are contained in I = [-3.8117, 1.3117].

Gauss–Lucas theorem

The Gauss–Lucas theorem states that the convex hull of the roots of a polynomial contains the roots of the derivative of the polynomial.

A sometimes useful corollary is that if all roots of a polynomial have positive real part, then so do the roots of all derivatives of the polynomial.

A related result is Bernstein's inequality. It states that for a polynomial P of degree n with derivative P′ we have

\max_{|z| \leq 1} \big|P'(z)\big| \le n \max_{|z| \leq 1} \big|P(z)\big|.

Statistical repartition of the roots

The statistical properties of the roots of a random polynomial have been the subject of several studies. Let

p(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_2 x^2 + a_1 x + a_0

be a random polynomial. If the coefficients ai are independently and identically distributed with a mean of zero, the real roots are mostly located near ±1. The complex roots can be shown to be on or close to the unit circle.

If the coefficients are Gaussian distributed with a mean of zero and variance of σ then the mean density of real roots is given by the Kac formula[15][16]

m( x ) = \frac { \sqrt{ A( x ) C( x ) - B( x )^2 }} {\pi A( x )}

where

\begin{align} A( x ) &= \sigma \sum { x^{ 2i } } = \sigma \frac{ x^{ 2n } - 1 } { x - 1 }, \\ B( x ) &= \frac{ 1 } { 2 } \frac{ d } { dx } A( x ), \\ C( x ) &= \frac{ 1 } { 4 } \frac{ d^2 } { dx^2 } A( x ) + \frac{ 1 } { 4x } \frac{ d } { dx } A( x ). \end{align}

When the coefficients are Gaussian distributed with a non-zero mean and variance of σ, a similar but more complex formula is known.require('Module:No globals')

local p = {}

-- articles in which traditional Chinese preceeds simplified Chinese local t1st = { ["228 Incident"] = true, ["Chinese calendar"] = true, ["Lippo Centre, Hong Kong"] = true, ["Republic of China"] = true, ["Republic of China at the 1924 Summer Olympics"] = true, ["Taiwan"] = true, ["Taiwan (island)"] = true, ["Taiwan Province"] = true, ["Wei Boyang"] = true, }

-- the labels for each part local labels = { ["c"] = "Chinese", ["s"] = "simplified Chinese", ["t"] = "traditional Chinese", ["p"] = "pinyin", ["tp"] = "Tongyong Pinyin", ["w"] = "Wade–Giles", ["j"] = "Jyutping", ["cy"] = "Cantonese Yale", ["poj"] = "Pe̍h-ōe-jī", ["zhu"] = "Zhuyin Fuhao", ["l"] = "literally", }

-- article titles for wikilinks for each part local wlinks = { ["c"] = "Chinese language", ["s"] = "simplified Chinese characters", ["t"] = "traditional Chinese characters", ["p"] = "pinyin", ["tp"] = "Tongyong Pinyin", ["w"] = "Wade–Giles", ["j"] = "Jyutping", ["cy"] = "Yale romanization of Cantonese", ["poj"] = "Pe̍h-ōe-jī", ["zhu"] = "Bopomofo", }

-- for those parts which are to be treated as languages their ISO code local ISOlang = { ["c"] = "zh", ["t"] = "zh-Hant", ["s"] = "zh-Hans", ["p"] = "zh-Latn-pinyin", ["tp"] = "zh-Latn", ["w"] = "zh-Latn-wadegile", ["j"] = "yue-jyutping", ["cy"] = "yue", ["poj"] = "hak", ["zhu"] = "zh-Bopo", }

local italic = { ["p"] = true, ["tp"] = true, ["w"] = true, ["j"] = true, ["cy"] = true, ["poj"] = true, } -- Categories for different kinds of Chinese text local cats = { ["c"] = "", ["s"] = "", ["t"] = "", }

function p.Zh(frame) -- load arguments module to simplify handling of args local getArgs = require('Module:Arguments').getArgs local args = getArgs(frame) return p._Zh(args) end function p._Zh(args) local uselinks = not (args["links"] == "no") -- whether to add links local uselabels = not (args["labels"] == "no") -- whether to have labels local capfirst = args["scase"] ~= nil

        local t1 = false -- whether traditional Chinese characters go first
        local j1 = false -- whether Cantonese Romanisations go first
        local testChar
        if (args["first"]) then
                 for testChar in mw.ustring.gmatch(args["first"], "%a+") do
          if (testChar == "t") then
           t1 = true
           end
          if (testChar == "j") then
           j1 = true
           end
         end
        end
        if (t1 == false) then
         local title = mw.title.getCurrentTitle()
         t1 = t1st[title.text] == true
        end

-- based on setting/preference specify order local orderlist = {"c", "s", "t", "p", "tp", "w", "j", "cy", "poj", "zhu", "l"} if (t1) then orderlist[2] = "t" orderlist[3] = "s" end if (j1) then orderlist[4] = "j" orderlist[5] = "cy" orderlist[6] = "p" orderlist[7] = "tp" orderlist[8] = "w" end -- rename rules. Rules to change parameters and labels based on other parameters if args["hp"] then -- hp an alias for p ([hanyu] pinyin) args["p"] = args["hp"] end if args["tp"] then -- if also Tongyu pinyin use full name for Hanyu pinyin labels["p"] = "Hanyu Pinyin" end if (args["s"] and args["s"] == args["t"]) then -- Treat simplified + traditional as Chinese if they're the same args["c"] = args["s"] args["s"] = nil args["t"] = nil elseif (not (args["s"] and args["t"])) then -- use short label if only one of simplified and traditional labels["s"] = labels["c"] labels["t"] = labels["c"] end local body = "" -- the output string local params -- for creating HTML spans local label -- the label, i.e. the bit preceeding the supplied text local val -- the supplied text -- go through all possible fields in loop, adding them to the output for i, part in ipairs(orderlist) do if (args[part]) then -- build label label = "" if (uselabels) then label = labels[part] if (capfirst) then label = mw.language.getContentLanguage():ucfirst(

Asymptotic results

For large n, a number of asymptotic formulae are known. For a fixed x

m( x ) = \frac{ 1 } { \pi | 1 - x^2 | }

and

m( \pm 1 ) = \frac{ 1 } { \pi } \sqrt { \frac{ n^2 - 1 } { 12 } }

where m( x ) is the mean density of real roots. The expected number of real roots is

N_n = \frac{ 2 } { \pi } \ln n + C + O( n^{ -2 } )

where C is a constant approximately equal to 0.6257358072 and O() is the order operator.

This result has been shown by Kac, Erdös and others to be insensitive to the actual distribution of coefficients. Numerical testing of this formula has confirmed these earlier results.

See also

Notes

  1. ^
  2. ^
  3. ^
  4. ^ Fujiwara M (1916) Über die obere Schranke des absoluten Betrages der Wurzeln einer algebraischen Gleichung, Tôhoku Math J 10: 167–171
  5. ^ Kojima T (1917) On the limits of the roots of an algebraic equation, Tôhoku Math J 11 119–127
  6. ^ Cauchy AL (1829) Exercises de mathematique. Oeuvres 2 (9) p122
  7. ^ Lagrange J–L (1798) Traite de la r'esolution des equations numeriques. Paris.
  8. ^ Hirst HP & Macey WT (1997) Bounding the roots of polynomials. Coll Math J 28 (4) 292
  9. ^ Cohen, Alan M., "Bounds for the roots of polynomial equations", Mathematical Gazette 93, March 2009, 87–88.
  10. ^ Sun YJ and Hsieh JG (1996) A note on circular bound of polynomial zeros, IEEE Trans Circuits Syst. I 43, 476-478
  11. ^ Mignotte, Maurice, "Some useful bounds". Computer algebra, 259–263, Springer, Vienna, 1983
  12. ^
  13. ^
  14. ^ .
  15. ^ Kac M (1943) Bull Am Math Soc 49, 314
  16. ^ Kac M (1948) Proc London Math Soc 50, 390

References