cagdas.ozgenc@gmail.com science forum beginner
Joined: 29 Mar 2006
Posts: 6

Posted: Mon Jul 03, 2006 1:32 pm Post subject:
modeling total probability less than 1



Greetings.
I am trying to fit a function that models the probability that a car loan
defaults, given the client's % downpayment. The free variable is the %
downpayment, which is bounded between 0 and 1 (0% to 100%). The
probability at 100% is obviously 0 (the credit amount is 0, so there is
nothing to default on). What's confusing to me is that the area under the
curve is not 1; it should instead be the average default rate of the
credit portfolio (the probability that a credit will default no matter
what the downpayment is).
First of all, is there a common distribution with such strict limits (0
and 1)? Secondly, how does one incorporate into a distribution the fact
that not all loans default?
Thank you for your time. 



gjedwards@gmail.com science forum addict
Joined: 20 May 2006
Posts: 70

Posted: Mon Jul 03, 2006 3:13 pm Post subject:
Re: modeling total probability less than 1



Presumably you have a list of downpayment sizes (S) and a binary
yes/no (D) for default.
What data exactly are you trying to fit a function to? You need to
think about the scale at which you are going to develop a model.
Are the possible values of the downpayment already discrete? I.e.,
is there a fixed number of possible downpayment %'s, or can a borrower
pay any amount? In the latter case (and possibly even in the former,
depending on the amount of data, etc.) you need to think about a histogram
with the downpayment size split into ranges, e.g. <10%, 11%-20%, etc.
To give you an example of the wrong scale and model, imagine one of the
data points was a person who made a 27.32334234% downpayment and
defaulted. Therefore you should never, ever, lend to someone who wants
to make a 27.32334234% downpayment - they are CERTAIN to default.
I *guess* you're probably much more interested in the probability that
people in the 20-30% range (say) default. The sum under this
histogram should be N (if you don't normalise). If not, you've missed a
data point somehow!
The right scale (and the relevant model) is partly down to your skills
as a statistician. But presumably you're interested in finding a
reasonably simple model. You should explore histograms at various
scales to see whether any obvious patterns appear, given the
amount of data you have.
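
The binning idea above can be sketched in a few lines of Python. This is only an illustration, not anything from the poster: the function name `binned_default_rates`, the equal-width bins, and the sample data are all assumptions made up for the example. S is taken to be a list of downpayment fractions in [0, 1] and D the matching yes/no default flags.

```python
def binned_default_rates(downpayments, defaulted, n_bins=10):
    """Return one (bin_low, bin_high, n_loans, n_defaults, rate) tuple per bin.

    downpayments: fractions in [0, 1]; defaulted: matching booleans.
    rate is None for a bin that contains no loans.
    """
    counts = [0] * n_bins
    defaults = [0] * n_bins
    for s, d in zip(downpayments, defaulted):
        # min() keeps a 100% downpayment in the last bin instead of overflowing.
        i = min(int(s * n_bins), n_bins - 1)
        counts[i] += 1
        defaults[i] += int(d)
    rows = []
    for i in range(n_bins):
        rate = defaults[i] / counts[i] if counts[i] else None
        rows.append((i / n_bins, (i + 1) / n_bins, counts[i], defaults[i], rate))
    return rows


# Made-up illustrative data, not taken from the thread.
S = [0.05, 0.08, 0.12, 0.15, 0.27, 0.27, 0.35, 0.50, 0.80, 1.00]
D = [True, True, False, True, True, False, False, False, False, False]

rows = binned_default_rates(S, D, n_bins=5)
total_defaults = sum(r[3] for r in rows)  # should equal the number of defaulters
for low, high, n, k, rate in rows:
    print(f"{low:.0%}-{high:.0%}: {k} of {n} loans defaulted")
```

Re-running with different values of `n_bins` is one way to explore the histogram at various scales; the un-normalised default counts should always add up to the total number of defaulters.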




gjedwards@gmail.com science forum addict
Joined: 20 May 2006
Posts: 70

Posted: Mon Jul 03, 2006 3:15 pm Post subject:
Re: modeling total probability less than 1



Sorry, that should read: where N is the total number of defaulters.




C6L1V@shaw.ca science forum Guru
Joined: 23 May 2005
Posts: 628

Posted: Mon Jul 03, 2006 3:48 pm Post subject:
Re: modeling total probability less than 1



Cagdas Ozgenc wrote:
Quote:  Greetings.
I am trying to fit a function to model the probability for a car loan
to default given the client's % downpayment. The free variable is the %
downpayment. The free variable is bounded between 0 and 1 (0%, 100%),
where probability at 100% is obviously 0 (meaning that credit amount is
0 and there is nothing to default). What's confusing to me is that the
area under curve is not 1,

This is very common in many areas of application. Such a random
variable is called "mixed", being partly continuous and partly
discrete. The "continuous" part may have a smooth probability curve,
but the "discrete" part does not. In fact, the CUMULATIVE probability
distribution should still go to 1 for large values of the variable, but
its graph will jump upwards at points of finite probability (like a
staircase). One very common example of this type is the
lifetime of a new piece of equipment. The lifetime is zero if you have
purchased a dud that does not work; otherwise, if it does work, you may
then have a lifetime probability curve that is smooth and nicely spread
out.
That being said, I cannot see that your application is of this type at
all. If I understand correctly, you are plotting the default
probability against downpayment. At each point, the default probability
is <= 1, and all you need is that P{default} + P{no default} = 1.
The area under your curve of default probability vs. downpayment does
not need to be 1, and I don't understand why you think it should be. In
fact, it ought to be considerably less than 1.
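
A small numerical sketch may make the distinction concrete. Everything in it is assumed for illustration: the curve `default_prob` is a made-up linear function that is 0 at a 100% downpayment, and `mix` is a hypothetical breakdown of what share of loans sits at each downpayment level. The area under the curve and the portfolio's average default rate are two different numbers; the latter is the curve weighted by how the loans are actually spread over downpayments.

```python
def default_prob(d):
    """Hypothetical default probability at downpayment fraction d (assumed, not fitted)."""
    return 0.3 * (1.0 - d)


# Area under the curve on [0, 1], using the midpoint rule.
n = 1000
area = sum(default_prob((i + 0.5) / n) for i in range(n)) / n

# Hypothetical portfolio mix: share of loans at each downpayment level.
mix = {0.1: 0.5, 0.2: 0.3, 0.5: 0.2}
portfolio_rate = sum(share * default_prob(d) for d, share in mix.items())

print(f"area under curve:       {area:.3f}")
print(f"portfolio default rate: {portfolio_rate:.3f}")
```

The two values agree only if loans happen to be spread uniformly over all downpayment levels, which there is no reason to expect.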
Quote:  it should be the average default of the
credit portfolio (Probability that a credit will default no matter what
the downpayment is).
First of all, is there a common distribution with such strict limits (0
and 1)?

There are many distributions like this. The most common one is called
the Beta distribution.
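
For reference, here is a minimal sketch of the Beta density using only the Python standard library. The parameters a = 2, b = 5 are arbitrary choices for illustration. Note that this is a genuine probability density for a quantity that lives on [0, 1], so unlike a default-probability curve it does integrate to 1.

```python
import math


def beta_pdf(x, a, b):
    """Density of the Beta(a, b) distribution at x in (0, 1)."""
    norm = math.gamma(a) * math.gamma(b) / math.gamma(a + b)
    return x ** (a - 1) * (1 - x) ** (b - 1) / norm


a, b = 2.0, 5.0  # arbitrary shape parameters, chosen only for the example
n = 10_000
xs = [(i + 0.5) / n for i in range(n)]  # midpoints of n equal sub-intervals

total = sum(beta_pdf(x, a, b) for x in xs) / n      # numerical integral of the density
mean = sum(x * beta_pdf(x, a, b) for x in xs) / n   # numerical mean, compare a / (a + b)

print(f"integral over [0, 1]: {total:.6f}")
print(f"mean: {mean:.6f}  (a / (a + b) = {a / (a + b):.6f})")
```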
Quote:  Secondly how does one incorporate the fact that not all loans
default into a distribution?

What "distribution" are you talking about? The downpayment amount is
not a random variable, so when you plot default probability vs.
downpayment, you are not plotting a distribution in the probability sense.
R.G. Vickson
Adjunct Professor, University of Waterloo




Jeremy Boden science forum Guru Wannabe
Joined: 28 Apr 2005
Posts: 144

Posted: Mon Jul 03, 2006 5:09 pm Post subject:
Re: modeling total probability less than 1



This doesn't sound like it's necessarily a very good model. 
If the loan is interest free, then a small down payment indicates either
that I have good reserves and can pay before any penalties are due or
that I'm very poor and likely to default.
Many UK loan companies appear to use post code (indicating relative
affluence) as a predictor of default.

Jeremy Boden 



cagdas.ozgenc@gmail.com science forum beginner
Joined: 29 Mar 2006
Posts: 6

Posted: Tue Jul 04, 2006 5:08 am Post subject:
Re: modeling total probability less than 1




OK. It hit me like lightning. I feel stupid. Downpayment is clearly not
a random variable. I understand now that I was plotting a family of
discrete Boolean random variables parametrized by downpayment.
Thanks for your help. 
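
To spell that picture out, here is a rough simulation sketch. The curve `default_prob` is a made-up assumption, not a fitted model. For each fixed downpayment d there is a separate yes/no random variable with P{default} = p(d), and at every d the two outcome frequencies add up to 1, whatever the area under p(d) happens to be.

```python
import random


def default_prob(d):
    """Hypothetical default probability at downpayment fraction d (an assumption)."""
    return 0.3 * (1.0 - d)


def simulate_defaults(d, n_loans, rng):
    """Draw n_loans independent Bernoulli(default_prob(d)) outcomes."""
    p = default_prob(d)
    return [rng.random() < p for _ in range(n_loans)]


rng = random.Random(0)  # fixed seed so the run is repeatable
freq = {}
for d in (0.1, 0.5, 1.0):
    outcomes = simulate_defaults(d, 20_000, rng)
    freq[d] = sum(outcomes) / len(outcomes)
    print(f"downpayment {d:.0%}: default {freq[d]:.3f}, no default {1 - freq[d]:.3f}")
```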


