Martin Evans science forum beginner
Joined: 10 Jun 2005
Posts: 3

Posted: Tue Jun 14, 2005 10:07 am Post subject:
Re: Probability question



Scott Hemphill wrote:
Quote:  Scott Hemphill <hemphill@hemphills.net> writes:
Martin Evans <Martin.Evans@arm.com> writes:
Scott Hemphill wrote:
Pr = (T-R-1)! / (R! * (T-R)!)
And what is the probability for 19 right answers out of 20? Hint: where
does the 20th answer go?
For T = 20 and R = 19 we get Pr = (20 - 19 - 1)!/(19! * (20 - 19)!) = 0!/19! = 0
I think this is what we would expect - the probability of getting exactly 19 questions
right would be zero, wouldn't it? PS - I don't need the hint, thanks! :P
No. 0! is not zero. It is one.
I mean, yes the probability is zero. But, no your formula doesn't produce
that result.
Scott

Scott Hemphill hemphill@alumni.caltech.edu
"This isn't flying. This is falling, with style." - Buzz Lightyear

I had forgotten that 0! is not zero - it's all too long ago!
There is a problem with the formula I derived anyway, as you have correctly said in another
posting, but I haven't had the time to look at it yet! I need to go round the loop again,
but I think the basic approach is right. 



Jon Haugsand science forum beginner
Joined: 03 May 2005
Posts: 37

Posted: Tue Jun 14, 2005 12:16 pm Post subject:
Re: Probability question, was Re: Mensa Forgot Another Possibility!



* Scott Hemphill
Quote:  and subfactorial(n) is the number of derangements of n objects, i.e.
the number of permutations for which none of objects are in their
original position.
subfactorial(n) is further developed:
subfactorial(n) = round(n!/e) + [n==0]
where round is the function which rounds to the nearest integer
and the bracket notation yields 1 if the boolean expression inside
is true and 0 if the boolean expression is false. I've used ==
for the equality operator, and e is Euler's constant e = 2.71828....

Interesting, but hardly educational. This /is/ a mysterious formula.
Why does it work? I don't really want an answer (unless you have some
smart and easy explanation), but will try to search the answer for
myself.
Anyway, it might be useful to point out a recursive solution to the
whole problem:
h(n,k) = number of ways K hats land on a correct head out of N heads.
h(n,n) = 1
h(n,n-1) = 0
h(n,k) = choose(n,k) * h(n-k,0)
h(n,0) = n! - [ sum(i=1,n) h(n,i) ]
This formula is implementable on a computer, and follows naturally
from the problem description. It is of course the h(n,0) that is
different, i.e. your subfactorial.
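For what it's worth, the recursion above might be sketched in Python roughly as follows. The function names `hats` and `derangements` are invented for this illustration; the base cases h(n,n) = 1 and h(n,n-1) = 0 fall out of the general rule, so they are not coded separately.

```python
from functools import lru_cache
from math import comb, factorial


@lru_cache(maxsize=None)
def derangements(n: int) -> int:
    """h(n,0): permutations of n hats with no hat on its own head."""
    # h(n,0) = n! - sum(i=1,n) h(n,i)
    return factorial(n) - sum(hats(n, i) for i in range(1, n + 1))


def hats(n: int, k: int) -> int:
    """h(n,k): ways exactly k of n hats land on the correct head."""
    if k == 0:
        return derangements(n)
    # Choose which k hats are right; the other n-k must all be wrong.
    return comb(n, k) * derangements(n - k)
```

For instance, `hats(20, 19)` comes out as 0, and `[hats(4, k) for k in range(5)]` gives `[9, 8, 6, 0, 1]`, which adds up to 4! = 24.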

Jon Haugsand
Dept. of Informatics, Univ. of Oslo, Norway, mailto:jonhaug@ifi.uio.no
http://www.ifi.uio.no/~jonhaug/, Phone: +47 22 85 24 92 



Scott Hemphill science forum beginner
Joined: 09 Jun 2005
Posts: 21

Posted: Tue Jun 14, 2005 1:41 pm Post subject:
Re: Probability question, was Re: Mensa Forgot Another Possibility!



Jon Haugsand <jonhaug@ifi.uio.no> writes:
Quote:  * Scott Hemphill
and subfactorial(n) is the number of derangements of n objects, i.e.
the number of permutations for which none of objects are in their
original position.
subfactorial(n) is further developed:
subfactorial(n) = round(n!/e) + [n==0]
where round is the function which rounds to the nearest integer
and the bracket notation yields 1 if the boolean expression inside
is true and 0 if the boolean expression is false. I've used ==
for the equality operator, and e is Euler's constant e = 2.71828....
Interesting, but hardly educational. This /is/ a mysterious formula.
Why does it work? I don't really want an answer (unless you have some
smart and easy explanation), but will try to search the answer for
myself.

That's why I gave a reference. There are some smart and easy explanations
depending on your background. I'll give one below which uses generating
functions.
Quote:  Anyway, it might be useful to point out a recursive solution to the
whole problem:
h(n,k) = number of ways K hats land on a correct head out of N heads.
h(n,n) = 1
h(n,n-1) = 0
h(n,k) = choose(n,k) * h(n-k,0)
h(n,0) = n! - [ sum(i=1,n) h(n,i) ]
This formula is implementable on a computer, and follows naturally
from the problem description. It is of course the h(n,0) that is
different, i.e. your subfactorial.

Rewriting your last equation:
n! = sum(k=0,n) h(n,k)
Substituting your next-to-last equation:
n! = sum(k=0,n) choose(n,k) h(n-k,0)
I'll notate h(n,0) as d(n), the number of derangements of n objects.
n! = sum(k=0,n) choose(n,k) d(n-k)
1 = sum(k=0,n) 1/k! d(n-k)/(n-k)!
The sequence generated by the right half of this equation for n = 0, 1, ...
is the convolution of the sequences 1/n! and d(n)/n!. Therefore the
generating function of the sequence for the left side of the equation,
(1, 1, ...) will be equal to the product of the generating functions
for 1/n! and d(n)/n!.
The generating function for (1, 1, ...) is sum(k>=0) 1*z^k = 1/(1-z).
The generating function for 1/n! is sum(k>=0) 1/k! z^k = e^z.
Let the generating function for d(n)/n! be D(z).
Then
1/(1-z) = e^z D(z), or D(z) = e^(-z)/(1-z)
D(z) = 1/(1-z) (z^0/0! - z^1/1! + z^2/2! - ... )
The value of d(n)/n! will be the coefficient of z^n. Since
1/(1-z) = 1 + z + z^2 + ..., the coefficient of z^n will be
sum(k=0,n) (-1)^k/k!
So
d(n)/n! = sum(k=0,n) (-1)^k/k!
Note that this sum goes to e^(-1) as n goes to infinity.
d(n) = n! sum(k=0,n) (-1)^k/k!
= n! (e^(-1) - sum(k>n) (-1)^k/k!)
It's pretty easy to establish that this sum for k>n is less than 1/2
for n > 0, and since d(n) is an integer, it must be the one you get
when you round n!/e to the nearest integer. The expression has to
be fixed to get the correct answer for n = 0. Hence,
d(n) = round(n!/e) + [n==0]
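As a quick sanity check, the two expressions for d(n) can be compared numerically. This is only a sketch; the names `d_sum` and `d_round` are made up here, and the rounded form relies on floating point, so it should only be trusted for modest n.

```python
from math import e, factorial


def d_sum(n: int) -> int:
    """d(n) = n! * sum(k=0,n) (-1)^k/k!, using exact integer arithmetic."""
    # n!/k! is an integer for k <= n, so no rounding is involved here.
    return sum((-1) ** k * (factorial(n) // factorial(k)) for k in range(n + 1))


def d_round(n: int) -> int:
    """d(n) = round(n!/e) + [n==0], using floating point."""
    return round(factorial(n) / e) + (1 if n == 0 else 0)
```

Both functions agree for small n, for example `d_sum(4)` and `d_round(4)` are both 9. For larger n the division `factorial(n) / e` eventually loses the precision needed to round correctly, so the integer sum is the safer of the two.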
Scott

Scott Hemphill hemphill@alumni.caltech.edu
"This isn't flying. This is falling, with style." - Buzz Lightyear



Scott Hemphill science forum beginner
Joined: 09 Jun 2005
Posts: 21

Posted: Tue Jun 14, 2005 1:56 pm Post subject:
Re: Probability question, was Re: Mensa Forgot Another Possibility!



Scott Hemphill <hemphill@hemphills.net> writes:
Quote:  d(n) = n! sum(k=0,n) (-1)^k/k!
= n! (e^(-1) - sum(k>n) (-1)^k/k!)
It's pretty easy to establish that this sum for k>n is less than 1/2
for n > 0, and since d(n) is an integer, it must be the one you get
when you round n!/e to the nearest integer. The expression has to
be fixed to get the correct answer for n = 0. Hence,

I meant to say that it's easy to establish that
n! sum(k>n) (-1)^k/k! has absolute value less than 1/2 for n > 0.
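One possible way to see this bound, sketched with the alternating series estimate (an assumption about the intended argument, which the post does not spell out):

```latex
n! \sum_{k>n} \frac{(-1)^k}{k!}
  = (-1)^{n+1} \left( \frac{1}{n+1}
      - \frac{1}{(n+1)(n+2)}
      + \frac{1}{(n+1)(n+2)(n+3)} - \cdots \right)
% The terms alternate in sign and strictly decrease in magnitude, so
\left| \, n! \sum_{k>n} \frac{(-1)^k}{k!} \right| < \frac{1}{n+1} \le \frac{1}{2}
  \qquad \text{for } n \ge 1 .
```

For n = 0 the same estimate only gives a bound of 1, which is consistent with the formula needing the [n==0] correction.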
Scott

Scott Hemphill hemphill@alumni.caltech.edu
"This isn't flying. This is falling, with style." - Buzz Lightyear



Jon Haugsand science forum beginner
Joined: 03 May 2005
Posts: 37

Posted: Tue Jun 14, 2005 6:31 pm Post subject:
Re: Probability question, was Re: Mensa Forgot Another Possibility!



* Scott Hemphill
Quote:  That's why I gave a reference. There are some smart and easy explanations
depending on your background. I'll give one below which uses generating
functions.

Thanks. I actually have a lot of background, but somehow generating
functions have escaped my education. Time to look into it.

Jon Haugsand
Dept. of Informatics, Univ. of Oslo, Norway, mailto:jonhaug@ifi.uio.no
http://www.ifi.uio.no/~jonhaug/, Phone: +47 22 85 24 92 



Scott Hemphill science forum beginner
Joined: 09 Jun 2005
Posts: 21

Posted: Thu Jun 16, 2005 2:10 pm Post subject:
Re: Probability question, was Re: Mensa Forgot Another Possibility!



Jon Haugsand <jonhaug@ifi.uio.no> writes:
Quote:  * Scott Hemphill
That's why I gave a reference. There are some smart and easy explanations
depending on your background. I'll give one below which uses generating
functions.
Thanks. I actually have a lot of background, but somehow generating
functions have escaped my education. Time to look into it.

I'll put in a plug for _Concrete Mathematics_. It is my favorite math
text. I discovered it after I was out of school, and it contains all
the useful math I had somehow missed.
You can also solve the original problem using an inversion formula.
If you have a function f defined in terms of g:
f(n) = sum(k=0,n) (-1)^k choose(n,k) g(k)
Then g is implicitly defined in terms of f. This formula can be inverted
to solve for g:
g(n) = sum(k=0,n) (-1)^k choose(n,k) f(k)
There are a variety of ways to prove this, but I like to use generating
functions, because they help me to prove identities involving "choose()"
(binomial coefficients) that I can't remember.
The generating function of a sequence is a way of dealing with the sequence
all at once. The generating function of a sequence (a0, a1, a2, ...) is
a0 + a1*z + a2*z^2 + ....
When you multiply two functions together:
(a0 + a1*z + a2*z^2 + ...) * (b0 + b1*z + b2*z^2 + ...)
you get:
(a0*b0 + (a0*b1+a1*b0)*z + (a0*b2+a1*b1+a2*b0)*z^2 + ...)
The coefficient of z^n is sum(k=0,n) a_k*b_(n-k), so this function is
the generating function of the sequence c_n = sum(k=0,n) a_k*b_(n-k).
The sequence c_n is the convolution of a_n and b_n. Whenever you have
a sum of terms with some factors involving k and some factors involving
n-k, you can recognize a convolution.
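The convolution rule is easy to try out on coefficient lists. The following is a minimal sketch; `convolve`, `exp_pos` and `exp_neg` are names chosen for this illustration, and the lists are simply truncated power series.

```python
from fractions import Fraction
from math import factorial


def convolve(a, b):
    """Coefficients of the product of two generating functions.

    c[n] = sum(k=0,n) a[k]*b[n-k], truncated to the shorter input length.
    """
    size = min(len(a), len(b))
    return [sum(a[k] * b[n - k] for k in range(n + 1)) for n in range(size)]


# Example: e^z * e^(-z) = 1, so convolving 1/n! with (-1)^n/n!
# should give the sequence (1, 0, 0, ...).
N = 8
exp_pos = [Fraction(1, factorial(n)) for n in range(N)]
exp_neg = [Fraction((-1) ** n, factorial(n)) for n in range(N)]
product = convolve(exp_pos, exp_neg)
```

Exact fractions are used so that the zero coefficients come out as exact zeros rather than as small floating-point residues.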
Starting with
f(n) = sum(k=0,n) (-1)^k choose(n,k) g(k)
f(n) = sum(k=0,n) (-1)^k n!/((n-k)!k!) g(k)
f(n)/n! = sum(k=0,n) (-1)^k g(k)/k! * 1/(n-k)!
Let F(z) be the generating function of f(n)/n! and G(z) be the generating
function of g(n)/n!. The right half of this last equation is the convolution
of the sequences (-1)^n g(n)/n! and 1/n!.
The generating function of (-1)^n g(n)/n! is G(-z) and the generating
function of 1/n! is e^z.
So
F(z) = G(-z) e^z
Solving for G(z) by substituting z -> -z:
G(z) = F(-z) e^z
This is the same equation as the previous one with the roles of F and G
reversed! So now we can follow the same steps we used to get the previous
equation in reverse order to arrive at:
g(n) = sum(k=0,n) (-1)^k choose(n,k) f(k)
(Using this inversion to solve the derangement problem is left as an
exercise.)
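A small numerical check of the inversion, and one possible way of applying it to derangements, might look like this. It is only a sketch: `transform` is a name invented here, and the choice g(k) = (-1)^k d(k) is one way (not necessarily the intended one) of fitting n! = sum(k=0,n) choose(n,k) d(k) into the pattern.

```python
from math import comb, factorial


def transform(seq):
    """Return f with f(n) = sum(k=0,n) (-1)^k choose(n,k) seq[k]."""
    return [
        sum((-1) ** k * comb(n, k) * seq[k] for k in range(n + 1))
        for n in range(len(seq))
    ]


# The transform is its own inverse: applying it twice returns the input.
g = [3, 1, 4, 1, 5, 9, 2, 6]
round_trip = transform(transform(g))

# Derangements: n! = sum(k=0,n) choose(n,k) d(k) fits the pattern with
# g(k) = (-1)^k d(k) and f(n) = n!, so inverting f recovers d.
f = [factorial(n) for n in range(8)]
d = [(-1) ** n * value for n, value in enumerate(transform(f))]
```

Here `round_trip` equals `g`, and `d` starts 1, 0, 1, 2, 9, 44, matching the derangement numbers from the earlier posts.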
Scott

Scott Hemphill hemphill@alumni.caltech.edu
"This isn't flying. This is falling, with style." - Buzz Lightyear



Jon Haugsand science forum beginner
Joined: 03 May 2005
Posts: 37

Posted: Thu Jun 16, 2005 7:58 pm Post subject:
Re: Probability question, was Re: Mensa Forgot Another Possibility!



* Scott Hemphill
Quote:  I'll put in a plug for _Concrete Mathematics_. It is my favorite math
text. I discovered it after I was out of school, and it contains all
the useful math I had somehow missed.

Yes I know. Somehow, I only read first half of Knuth's vol 1 in "The
Art of Computer Programming". Your first derivation of the
derangement formula is digested with satisfaction. Thanks. Made me
look into convolution in general where I found the following joyful
web site: http://www.jhu.edu/~signals/discreteconv/

Jon Haugsand
Dept. of Informatics, Univ. of Oslo, Norway, mailto:jonhaug@ifi.uio.no
http://www.ifi.uio.no/~jonhaug/, Phone: +47 22 85 24 92 



Scott Hemphill science forum beginner
Joined: 09 Jun 2005
Posts: 21

Posted: Thu Jun 16, 2005 11:27 pm Post subject:
Re: Probability question, was Re: Mensa Forgot Another Possibility!



Jon Haugsand <jonhaug@ifi.uio.no> writes:
Quote:  * Scott Hemphill
I'll put in a plug for _Concrete Mathematics_. It is my favorite math
text. I discovered it after I was out of school, and it contains all
the useful math I had somehow missed.
Yes I know. Somehow, I only read first half of Knuth's vol 1 in "The
Art of Computer Programming". Your first derivation of the
derangement formula is digested with satisfaction. Thanks. Made me
look into convolution in general where I found the following joyful
web site: http://www.jhu.edu/~signals/discreteconv/

Thanks for the link!
Scott

Scott Hemphill hemphill@alumni.caltech.edu
"This isn't flying. This is falling, with style." - Buzz Lightyear



alan truelove science forum beginner
Joined: 09 Jun 2005
Posts: 8

Posted: Wed Jun 29, 2005 10:44 pm Post subject:
Re: Probability question



As promised a correct solution (I hope) for "three rows of objects"
[The general problem (any number of rows of objects) is pretty
difficult]
I haven't bothered to type up the derivation of the formulas but will
certainly do so if anyone wants it.
I would be glad to hear of any blunders, or of a better method.
I find it hard to believe that no one has solved this before, so I
will communicate with a few old Cambridge buddies ..
To the original poster (rec.org.mensa) - see what you have started!

3 groups of n objects each are presented; select one object from each
group (a 'triad').
Given the first object, the (correct) other two are unique.
Repeat until all triads have been selected.
P(0,n,x) is the prob. of getting exactly x correct triads.
P(d,n,x) is the prob. of getting x triads correct when each group holds
n 'good' objects plus d 'dummy' objects, none of which can form part
of a correct triad.
Results (for n up to 20), i.e. P(0,n,x)
Problem with 3 rows of objects:

Check   n = no.   Probs for x = no. of correct triads,
sum     of objs   in each row x = n down to 0
1        2        0.25 0 0.75
1        3        0.028 0 0.25 0.722
1        4        0.002 0 0.031 0.181 0.786
1        5        2 zeros 0.002 0.018 0.157 0.822
1        6        3 zeros 0.001 0.013 0.137 0.849
1        7        4 zeros 0.001 0.01 0.121 0.868
1        8        6 zeros 0.008 0.109 0.883
1        9        7 zeros 0.006 0.098 0.896
1       10        8 zeros 0.005 0.09 0.905
1       11        9 zeros 0.004 0.082 0.913
1       12        10 zeros 0.003 0.076 0.92
1       13        11 zeros 0.003 0.071 0.926
1       14        12 zeros 0.003 0.066 0.931
1       15        13 zeros 0.002 0.062 0.936
1       16        14 zeros 0.002 0.058 0.94
1       17        15 zeros 0.002 0.055 0.943
1       18        16 zeros 0.002 0.052 0.946
1       19        17 zeros 0.001 0.05 0.949
1       20        18 zeros 0.001 0.047 0.951
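The first rows of this table can be cross-checked by a different route from the P(d,n,x) iteration: the counting recursion from the hats discussion earlier in the thread, applied to pairs of arrangements. This is a sketch under the assumption that rows 2 and 3 are each shuffled uniformly and independently against row 1, giving (n!)^2 equally likely arrangements; the function names are invented for the illustration.

```python
from fractions import Fraction
from functools import lru_cache
from math import comb, factorial


@lru_cache(maxsize=None)
def no_triads(n: int) -> int:
    """Arrangements of rows 2 and 3 (out of (n!)^2) with no correct triad."""
    return factorial(n) ** 2 - sum(exact_triads(n, k) for k in range(1, n + 1))


def exact_triads(n: int, k: int) -> int:
    """Arrangements with exactly k correct triads."""
    if k == 0:
        return no_triads(n)
    # Pick which k row-1 objects get both correct partners;
    # the remaining n-k objects must produce no correct triad.
    return comb(n, k) * no_triads(n - k)


def triad_probs(n: int):
    """Exact probabilities for x = n down to 0, the order used in the table."""
    total = factorial(n) ** 2
    return [Fraction(exact_triads(n, x), total) for x in range(n, -1, -1)]
```

Rounded to three places, `triad_probs(4)` gives 0.002, 0, 0.031, 0.181, 0.786, in agreement with the n = 4 row above.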
'match across 3 sets of 20
' alan j truelove July 1 05
'571 242 0153 'alan_truelove@hotmail.com
Imports System.Math
Public Class Form1
Inherits System.Windows.Forms.Form
#Region " Windows Form Designer generated code "
Public Sub New()
MyBase.New()
'This call is required by the Windows Form Designer.
InitializeComponent()
'Add any initialization after the InitializeComponent() call
End Sub
'Form overrides dispose to clean up the component list.
Protected Overloads Overrides Sub Dispose(ByVal disposing As Boolean)
If disposing Then
If Not (components Is Nothing) Then
components.Dispose()
End If
End If
MyBase.Dispose(disposing)
End Sub
'Required by the Windows Form Designer
Private components As System.ComponentModel.IContainer
'NOTE: The following procedure is required by the Windows Form Designer
'It can be modified using the Windows Form Designer.
'Do not modify it using the code editor.
Friend WithEvents Button1 As System.Windows.Forms.Button
<System.Diagnostics.DebuggerStepThrough()> Private Sub InitializeComponent()
Me.Button1 = New System.Windows.Forms.Button
Me.SuspendLayout()
'
'Button1
'
Me.Button1.Location = New System.Drawing.Point(24, 32)
Me.Button1.Name = "Button1"
Me.Button1.Size = New System.Drawing.Size(144, 64)
Me.Button1.TabIndex = 0
Me.Button1.Text = "Button1"
'
'Form1
'
Me.AutoScaleBaseSize = New System.Drawing.Size(5, 13)
Me.ClientSize = New System.Drawing.Size(292, 266)
Me.Controls.Add(Me.Button1)
Me.Name = "Form1"
Me.Text = "Form1"
Me.ResumeLayout(False)
End Sub
#End Region
Public Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click
Dim i, j, iii, jjj, ii, jj As Integer
ReDim P(21, 100, 100)
ReDim pstr(100)
ReDim ptot(100)
nmax = 20
dmax = 20
myline2 = " Problem with 3 rows of objects:"
ts6.WriteLine(myline2)
myline2 = "Check no. of objs Probs for x=no.of correct triads"
ts6.WriteLine(myline2)
myline2 = " Sum in each row x = n to 0"
ts6.WriteLine(myline2)
'P(0,n,x)= 1/n^2 of P(0,n-1,x-1) + 3(n-1) P(1,n-2,x) + (n-1)(n-2) P(2,n-3,x)
'P(1,n,x)= 1/(n+1)^2 of P(0,n,x) + 3n P(1,n-1,x) + n(n-1) P(2,n-2,x)
'P(2,n,x)= 1/(n+2)^2 of 4 P(1,n,x) + 5n P(2,n-1,x) + n(n-1) P(3,n-2,x)
'P(3,n,x)= 1/(n+3)^2 of 9 P(2,n,x) + 7n P(3,n-1,x) + n(n-1) P(4,n-2,x)
'P(d,n,x)= 1/(n+d)^2 of d^2 P(d-1,n,x) + (2d+1)n P(d,n-1,x) + n(n-1) P(d+1,n-2,x)
' P(d,n,x) is the prob of getting x correct triads, choosing from n 'good' entries, and
' d 'dummy' entries on each row (dummies cannot match up)
P(0, 1, 1) = 1 'if only one col, pr get 1 triad is 1
P(0, 1, 0) = 0 'if only 1 col, pr get 0 triads is nil
For n = 1 To nmax
P(0, n, n - 1) = 0
' prob getting exactly n-1 triads out of n cols (no dummies)
Next
Next
'    
For d = 0 To dmax
P(d, 0, 0) = 1
'if there are 0 good entries (any no of dummies) the prob of getting 0 triads is 1
P(d, 1, 1) = (d + 1) ^ (-2)
'if there is 1 good entry (d dummies) the prob of getting the 1 triad is 1/(d+1)^2
P(d, 1, 0) = 1 - (d + 1) ^ (-2)
'if there is 1 good entry (any no of dummies) the prob of getting 0 triads is 1 - prob of getting the 1 triad
'
For n = 1 To nmax
P(d, 0, n) = 0 'obviously the prob of getting >0 triads is nil
'
'prob of getting all n triads out of n good cols
'1/(n+d) * 1/(n+d-1) .. * 1/(d+1)
'get (d+n)!/d!
P(d, n, n) = (d + 1)
For jj = 2 To n
P(d, n, n) = P(d, n, n) * (jj + d)
Next
' we need 1 over the above, squared
P(d, n, n) = P(d, n, n) ^ (-2)
'   
Next
Next
'             
For n = 2 To 20 '0 'dummies', i.e. P(0,n,x) are the probs of getting x correct triads out of n objects (in each row)
d = 0
mystr = Round(P(d, n, n), 3)
xx = 0 'counts zero for print line
If mystr = "0" Then
pstr(d) = " 2 zeros "
xx = 2
Else
pstr(d) = mystr & " 0 "
End If
ptot(d) = P(d, n, n)
'note that P(d,n,n) was calculated above
' and P(d,n,n-1) is always zero
For x = n - 2 To 0 Step -1
P(d, n, x) = 0 'ready to calculate this value
If (x > 0) Then P(d, n, x) = P(d, n - 1, x - 1) / n ^ 2
'
If n > 1 Then P(d, n, x) = P(d, n, x) + _
3 * (n - 1) * P(d + 1, n - 2, x) / n ^ 2
'
If n > 2 Then P(d, n, x) = P(d, n, x) + _
(n - 1) * (n - 2) * P(d + 2, n - 3, x) / n ^ 2
mystr = Round(P(d, n, x), 3)
If mystr = "0" Then
xx = xx + 1
pstr(d) = xx & "_zeros "
Else
pstr(d) = pstr(d) & " " & mystr
End If
ptot(d) = ptot(d) + P(d, n, x)
Next
myline = ptot(0) & " =sum; n = no. of obj. each row= " _
& n & " ;probs(no. corr. triads, runs n to 0) " & pstr(0)
myline2 = Round(ptot(0), 3) & " " & n & " " & pstr(0)
'the string of probs for x=n down to x=0
ts6.WriteLine(myline2)
'MsgBox(myline)
'    
'We now calculate the P(d,n,x) for d>0, which quantities are defined
'at the beginning. These are just used in the iterative calculation.
For d = 1 To dmax - 1 'probably not necess for d to go this high
pstr(d) = ""
ptot(d) = 0
For x = n To 0 Step -1
P(d, n, x) = d ^ 2 * P(d - 1, n, x) / (n + d) ^ 2
If x < n Then P(d, n, x) = P(d, n, x) + _
(2 * d + 1) * n * P(d, n - 1, x) / (n + d) ^ 2
If n > 1 And x < n - 1 Then P(d, n, x) = P(d, n, x) + _
n * (n - 1) * P(d + 1, n - 2, x) / (n + d) ^ 2
ptot(d) = ptot(d) + P(d, n, x)
pstr(d) = pstr(d) & " " & Round(P(d, n, x), 3)
Next 'end of the x loop
' MsgBox(ptot(d) & " " & d & " " & n & " d, n " & pstr(d))
Next 'end of d = 1 to dmax-1 loop
'     
Next 'end of the loop n = 2 to 20
ts6.Close()
End 'stop the program
End Sub
End Class
'      
Module global
Public ts6 As System.IO.StreamWriter = _
System.IO.File.CreateText("C:/mathpuzz/results.doc")
Public P(,,), ptot() As Single
Public nfact, nplusdfact, dfact As Single
Public d, dmax, n, nmax, x, xx, kkk As Integer
Public pstr(0), mystr, myline, myline2 As String
End Module
'         



Pavel314 science forum addict
Joined: 29 Apr 2005
Posts: 78

Posted: Wed Aug 03, 2005 12:06 am Post subject:
Re: "Americans tap wine over beer"



"Ray Calvert" <gsinews@sbcglobal.net> wrote in message
news:%tcFe.447$aT1.336@newssvr19.news.prodigy.com...
Quote:  I read the article but did not see anything indicating that people were
considering chemicals when they were picking their poison. Maybe I missed
something. I wonder about the numbers though, They indicate Americans
drink about 24 gallons of beer and 2 gallons of wine per year. I assume
they mean for those who drink each, not an average of all Americans or of
all drinkers. That would indicate that people who drink beer drink about
235 beers a year while people who drink wine only drink about 50 glasses a
year. Wine drinkers must not reach for their alcohol of choice very often.
I certainly do more than my share by those standards.

50 glasses a year? That's less than one per week. I would guess that that's
the average for all Americans, not just wine drinkers.
From http://www.winexmagazine.com/archives/xercize.htm
"Per capita wine consumption in the United States in 1995 was: 2.13
gallons/adult"
which is close to your 2 gallons per year and for all adults, not just wine
drinkers. Of course, the average (mean) is more meaningful if you have the
standard deviation of the distribution. There's a lot of good statistical
information about wine consumption at
http://repositories.cdlib.org/cgi/viewcontent.cgi?article=1049&context=ucscecon
Table 2 gives the average and standard deviation of wine share but I'll have
to read this a few more times while sober to fully understand it. I
cross-posted to alt.sci.math.probability and sci.stat.math to see if any of
the stat wizards there could read anything into this. It looks like the
US-Canada share is increasing while the standard deviation as a ratio to
mean is decreasing, indicating a tightening up of the wine consumption
trends. (I really like the wine consumption by latitude graphs.)
Another interesting fact from that site:
"The approximate ratio of beer advertising to wine advertising is: 10 to 1"
Which is in the neighborhood of your 24 gallons of beer to 2 gallons of wine
per year, or 12 to 1. It looks like it pays to advertise.
Paul 



Duncan Smith science forum beginner
Joined: 29 Apr 2005
Posts: 21

Posted: Thu Aug 25, 2005 1:19 pm Post subject:
Re: question about linear regression



Joe wrote:
Quote:  "Dan Akers" <digikey@webtv.net> wrote in message
news:11828430CEF19132@storefull3135.bay.webtv.net...
Joe wrote;
snip
"I would like to make my best fit equation fit the data a little better,
can anyone suggest a method to accomplish this?
I am currently in the process of measuring the actual-predicted errors
(absolute value of), but I am not really sure what to do with those
errors or how to minimize them.
Any suggestions, sources, web sites, or tutorials greatly appreciated."
_____________________________________
Re;
Why are you using linear regression to model what seems to be a
nonlinear relation? Try a polynomial fit; quadratic, cubic, or even
higher order if you have the data...
Dan Akers
Hi Dan,
Thank you for the suggestions. As I said in my last post, I am using a
program called curvefit 1.3. It will automatically model over 30 equations
to the data. It does polynomials, quadratics, reciprocal logs, just about
any model you could think of. It even does interpolation (4 different types)
You can also define your own models and have it find the coefficients. I am
using 100-200 X,Y data pairs. So there can be several values of Y for any
given value of X. Is that what you mean by nonlinear? When I look at some
of the other models, they seem to be the same. They can't predict to one
decimal place. Maybe I am expecting too much from these models.

Maybe. If there's variability in the data (which there clearly is),
then there's a limit to how accurate / precise the predictions can be.
It's easy to find a model that will predict the data you're using to fit
it (exactly), but that will usually be an atrocious model for predicting
'new' data. For some X you have several values for Y. Are these Y
values invariably within 0.1 of each other? If not, how could you ever
find a model that will guarantee to give predicted values within 0.1 of
the true value of Y?
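Duncan's question can be put to the data directly. The sketch below uses made-up numbers and an invented helper name, `spread_by_x`; it measures how far apart the Y values at the same X are. Any single predicted value for that X must miss at least one of them by half that spread or more.

```python
from collections import defaultdict


def spread_by_x(pairs):
    """Map each X to (max Y - min Y) over the Y values observed at that X."""
    groups = defaultdict(list)
    for x, y in pairs:
        groups[x].append(y)
    return {x: max(ys) - min(ys) for x, ys in groups.items()}


# Made-up data with several Y values for the same X.
data = [(1, 2.0), (1, 2.6), (2, 3.1), (2, 3.9), (2, 3.5), (3, 5.0)]
spreads = spread_by_x(data)

# No function of X alone can be closer than this to every observed Y.
error_floor = max(spreads.values()) / 2
```

With these illustrative numbers the floor is about 0.4, so a guarantee of 0.1 is out of reach whatever curve is fitted.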
Duncan 



Joe science forum beginner
Joined: 25 Jun 2005
Posts: 22

Posted: Fri Aug 26, 2005 10:41 pm Post subject:
Re: question about linear regression



"Duncan Smith" <buzzard@urubu.freeserve.co.uk> wrote in message
news:deknh9$m3l$1@newsm1.svr.pol.co.uk...

Hi Duncan,
Well, that was sort of my point (pardon the pun). When I look at that many
data sets on a scatter plot, it looks more like a plane might define it
better, but I am not sure how to fit data to a plane, rather than a line.
No, the Y values are not within 0.1 of each other. That was why I was
thinking of using a look up table instead of trying to use an equation. My
new statistics book that I picked up yesterday has me intrigued with ideas
on covariance and correlation, so I am going to run those tests first, and
try to do this step by step. It also discusses multiple regression at the
end of the book, but I have no idea what that is, yet. It is a little more
advanced than my old book from college.
Joe 



Joe science forum beginner
Joined: 25 Jun 2005
Posts: 22

Posted: Fri Aug 26, 2005 10:44 pm Post subject:
Re: question about linear regression



"Dan Akers" <digikey@webtv.net> wrote in message
news:490430E6D61342@storefull3131.bay.webtv.net...
Quote:  Joe wrote;
"I am using 100-200 X,Y data pairs. So there can be several values of Y
for any given value of X. Is that what you mean by nonlinear?"
_____________________________________
Re;
I just re-read your latest post and I realize that I did not comprehend on
my first read, that you sometimes have multiple Y values for some X
values. If Y is indeed dependent on X, well, that right there tells you
that there is some issue with repeatability. Hence the empirical data
has a probabilistic error band associated with it.
Given that, I would suggest that you calculate the Y error terms (Y1-Y2)
for each X, where Y1 is the empirical and Y2 the calculated or
predicted, plug the summation of these into the standard deviation
equation I posted earlier to get the standard deviation of the dependent
variable Y, and then present the data with the primary calculated curve
along with the probabilistic range of your choosing.
Dan Akers

Hi Dan,
Thank you for the ideas on the standard deviation. I think that is the same
as the standard error that my new book mentions. Still, it doesn't look to
me like a line is going to be able to model this data. I looked at a
scatterplot as you suggested, and it looks like a plane might be better than
a line. I need to backtrack first and do some analysis of correlation on
just the data before I proceed with this any further.
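Dan's suggestion, quoted above, can be sketched for the straight-line case as follows. The standard deviation equation he refers to is not in this thread, so the n - 2 divisor used here is an assumption (it is the usual "standard error of the estimate" for a fitted line), and the helper names are invented.

```python
from math import sqrt


def fit_line(xs, ys):
    """Ordinary least-squares line y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def residual_sd(xs, ys, a, b):
    """Standard deviation of the errors Y1 - Y2 (observed minus predicted).

    Divides by n - 2 because two coefficients were estimated (an assumption;
    the formula from the earlier post is not available here).
    """
    sq = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return sqrt(sq / (len(xs) - 2))


# Made-up data with repeated X values.
xs = [1, 1, 2, 2, 3, 3]
ys = [2.0, 2.4, 3.1, 3.5, 4.0, 4.4]
a, b = fit_line(xs, ys)
s = residual_sd(xs, ys, a, b)

# A rough band around the fitted line at x = 2: prediction +/- 2 s.
band = (a + b * 2 - 2 * s, a + b * 2 + 2 * s)
```

The width of the band, rather than the fitted curve alone, is what tells you how precise a prediction can be expected to be.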
Joe 



Pavel314 science forum addict
Joined: 29 Apr 2005
Posts: 78

Posted: Thu Oct 20, 2005 2:17 am Post subject:
Re: Does poker betting exaggerate skill differences?



"Mark Spahn" <mspahn@localnet.com> wrote in message news:11l6nr3et21lnb2@corp.supernews.com...
Suppose five friends A, B, C, D, E have a weekly evening
of poker in which they each start with $10 and play until
one player has won all the money (a gain of $40).
Their skills differ so that their respective probabilities of
winning a hand of poker are .22, .21, .20, .19, .18.
Are their probabilities of winning for the evening (= 20 hands;
is that a reasonable number?) identical to their probabilities
of winning a hand?
During the course of a poker evening, as one player gains
more money than the others, does his having more money
to bet confer an advantage, so his probability of winning
for the evening rises? If so, then the players' probabilities
of winning an evening of poker might be, say,
.24, .22, .20, .18, .16.
Can anyone shed some light on this question?
 Mark Spahn
Mark,
I wrote a program in Ubasic to simulate the five friends playing poker. The program simulated 10,000 weekly games or 192.3 years. The program assumes the same bet on each hand. When a player loses all his money, he's out of the game and the others play on. When the final player has the entire $50, the game is over. Higher skill on the per-hand level definitely leverages your odds at the evening game level.
RESULTS
Player Win
Probability    $1 Bet    $5 Bet    $10 Bet
.18             1.47%     5.86%     17.95%
.19             5.32%    10.60%     18.70%
.20            13.11%    17.82%     20.45%
.21            29.25%    27.99%     20.86%
.22            50.85%    37.73%     22.04%
In my earlier reply to your problem, I said that I expected the less skilled players to have better odds with larger bets but I never expected the disproportionate results on the $1 bet games. It seems that over the long haul, skill gives you the ability to build up the cash cushion which helps you survive runs of bad luck later in the game.
Just as a test, I set everyone at the same skill level. Even at the $1 bet level the results are fairly even, which increases my confidence in my simulation program.
Player Win
Probability    $1 Bet
.20            19.92%
.20            20.48%
.20            19.73%
.20            20.07%
.20            19.80%
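The simulation described above might be sketched in Python along these lines. The original Ubasic program is not shown, so the rules coded here are assumptions: every player still in the game antes the fixed bet, the hand winner is drawn with chances proportional to the per-hand skill numbers of the players still in, and the winner collects the antes. The function names are invented for the illustration.

```python
import random


def play_evening(skills, stake, bet, rng):
    """Play hands until one player holds all the money; return (winner, hands)."""
    stacks = [stake] * len(skills)
    hands = 0
    while sum(1 for s in stacks if s > 0) > 1:
        alive = [i for i, s in enumerate(stacks) if s > 0]
        # Assumption: the hand winner is drawn among the players still in,
        # weighted by their per-hand skill numbers.
        winner = rng.choices(alive, weights=[skills[i] for i in alive])[0]
        for i in alive:
            if i != winner:
                paid = min(bet, stacks[i])  # assumption: everyone antes the bet
                stacks[i] -= paid
                stacks[winner] += paid
        hands += 1
    last = next(i for i, s in enumerate(stacks) if s > 0)
    return last, hands


def win_rates(skills, stake=10, bet=1, evenings=10_000, seed=0):
    """Fraction of simulated evenings won by each player."""
    rng = random.Random(seed)
    wins = [0] * len(skills)
    for _ in range(evenings):
        wins[play_evening(skills, stake, bet, rng)[0]] += 1
    return [w / evenings for w in wins]
```

A call such as `win_rates([.18, .19, .20, .21, .22], bet=1)` returns one win fraction per player. Whether the numbers resemble the tables in this post depends on how closely the assumed rules match the original program, and they will vary with the seed.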
Finally, I let four players have equal skill level and set the fifth at a significantly higher level. The results are as you might expect:
Player Win
Probability    $1 Bet
.18             2.98%
.18             3.12%
.18             2.69%
.18             3.15%
.28            88.06%
The moral is that if you're against better opponents, put everything on one test and trust to luck; if you are the most skilled, try to draw the contest out to bring your skill into play.
E.G., in roulette the house is the more "skilled" player by virtue of having the odds in its favor, I believe 52% for the house to 48% for the player. So put all your chips on one roll and cross your fingers.
Any suggestions on how to apply this lesson to the stock market?
Paul 



Mark Spahn science forum addict
Joined: 07 Jul 2005
Posts: 62

Posted: Thu Oct 20, 2005 5:15 am Post subject:
Re: Does poker betting exaggerate skill differences?



"Pavel314" <Pavel314@NOSPAM.comcast.net> wrote in message news:WMSdndCwYeeqY8veRVnsg@comcast.com...
Paul,
Wow, bravo! I am impressed, and just as surprised as you at how much the ability to accumulate a bankroll magnifies the hand-winning probability into a much larger game-winning probability. I remember reading somewhere that Richard Nixon, when he learned poker while in the navy, observed and studied many, many games before he actually began to bet money. It looks like that was a good plan, because the skill of a neophyte player, even if only slightly below par, will be decisively overwhelmed by the better skill of the other players.
 Mark 


