Newlisp Forum Members Trend.

Kazimir Majorinc
Posts: 388
Joined: Thu May 08, 2008 1:24 am
Location: Croatia

Newlisp Forum Members Trend.

Post by Kazimir Majorinc »

Image

If you join the forum, the chance that you'll post at least once is ~70%.
If you have made n posts on the forum, the chance that you'll reach 2n posts is also ~70%, for each n.
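
A minimal sketch of how such a ratio could be computed from a list of per-member post counts. The function name and the sample counts are hypothetical, invented for illustration; they are not the forum's data. It also treats "will post" as "has posted so far", which is all a snapshot of a member list can show.

```python
def doubling_probability(post_counts, n):
    """Fraction of members with at least n posts who also have at least 2n posts."""
    reached_n = sum(1 for c in post_counts if c >= n)
    if reached_n == 0:
        return None  # nobody reached n posts, so the ratio is undefined
    reached_2n = sum(1 for c in post_counts if c >= 2 * n)
    return reached_2n / reached_n


# Hypothetical post counts, one per member (NOT real forum data).
sample_counts = [0, 0, 0, 1, 1, 2, 3, 5, 8, 20]

# Share of members who posted at least once.
posted = sum(1 for c in sample_counts if c >= 1) / len(sample_counts)
print("joined -> posted:", posted)

# Doubling ratio for a few values of n.
for n in (1, 2, 4):
    print(n, "->", 2 * n, ":", doubling_probability(sample_counts, n))
```

If the ratios printed for different n stay close to each other, the data shows the same kind of constancy described above.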

Robert Gorenc
Posts: 12
Joined: Tue Oct 27, 2009 9:01 pm
Location: Croatia

Re: Newlisp Forum Members Trend.

Post by Robert Gorenc »

Is it Kazimir's first Forum theorem? :-)

Kazimir Majorinc
Posts: 388
Joined: Thu May 08, 2008 1:24 am
Location: Croatia

Re: Newlisp Forum Members Trend.

Post by Kazimir Majorinc »

Yeah, my path to glory. :-) Unfortunately, 70% appears to be specific to the Newlisp forum. At lispforum.com, the probability that someone who has posted n times will go on to post 2n times is about 40%. Still, the number is strangely constant, so there is something behind these numbers.
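
One possible reading of a constant doubling ratio, offered as a sketch rather than as something established in this thread: let S(n) be the fraction of members with at least n posts (a symbol introduced here for illustration), and assume the ratio is exactly the same p for every n. Then, at powers of two, S follows a power law:

```latex
\[
\frac{S(2n)}{S(n)} = p \quad \text{for all } n
\;\Longrightarrow\;
S(2^k) = p^k\, S(1) = S(1)\,\bigl(2^k\bigr)^{\log_2 p},
\qquad \text{i.e.}\quad
S(n) = S(1)\, n^{\log_2 p} \ \text{ for } n = 2^k .
\]
```

Under that assumption, p ≈ 0.7 gives an exponent of about -0.51, and p ≈ 0.4 gives about -1.32.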

Robert Gorenc
Posts: 12
Joined: Tue Oct 27, 2009 9:01 pm
Location: Croatia

Re: Newlisp Forum Members Trend.

Post by Robert Gorenc »

So, how did you collect this data: manually, or with some nl script?
What is the symbol E in the formula y=2E+...., and what is R^2?
It was long ago that I worked with math..... :-)

Kazimir Majorinc
Posts: 388
Joined: Thu May 08, 2008 1:24 am
Location: Croatia

Re: Newlisp Forum Members Trend.

Post by Kazimir Majorinc »

The data is from http://newlispfanclub.alh.net/forum/memberlist.php . I just copied and pasted it into Excel.

2E-5 is 2*10^(-5)= 0.00002.
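
The same notation can be checked quickly in Python, which reads E as "times ten to the power of", just as Excel does in the chart label. This is only an illustrative check; the variable name is arbitrary.

```python
# "2E-5" is scientific notation for 2 * 10^(-5).
value = float("2E-5")

print(value == 2 * 10 ** -5)  # True
print(f"{value:.5f}")         # 0.00002
```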

R^2 is the "coefficient of determination", which shows how close the actual data is to the function above. If the data points were very chaotic, the trend might be the same, but R^2 would be much lower.

DrDave
Posts: 126
Joined: Wed May 21, 2008 2:47 pm

Re: Newlisp Forum Members Trend.

Post by DrDave »

Robert Gorenc wrote:
so, how did you collect this data, manually or use some nl script?
What is this symbol E if formula y=2E+.... and what is R^2??
It was long ago when I worked with math..... :-)

The coefficient of determination (R^2) for simple linear regressions like these tells you how much of the variability in the Y-values of the sample data can be accounted for by the model you created (your linear equation, which is the relationship between the X and Y values). The variability that is not accounted for by the model is unexplained (at least unexplained by your model; there might be some actual explanation for it).

So, if your sample points all lie EXACTLY on the plotted line of your model, R^2=1; that is, 100% of the variability in your Y sample data can be explained by your model equation.
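
As a rough illustration, R^2 can be computed as 1 minus the ratio of unexplained variability to total variability. The sketch below uses a hypothetical line y = 2x + 1 and invented sample points; none of it comes from the forum chart. The scattered points were chosen so that the same line is still their least-squares fit.

```python
def r_squared(ys, predicted):
    """R^2 = 1 - (unexplained variability) / (total variability of ys)."""
    mean_y = sum(ys) / len(ys)
    ss_tot = sum((y - mean_y) ** 2 for y in ys)                  # total
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predicted))    # unexplained
    return 1 - ss_res / ss_tot


def model(x):
    """Hypothetical model line y = 2x + 1 (an assumption for illustration)."""
    return 2 * x + 1


xs = [1, 2, 3, 4, 5]
predicted = [model(x) for x in xs]

on_line = [3, 5, 7, 9, 11]    # every point lies exactly on the line
scattered = [4, 3, 8, 9, 11]  # invented points that deviate from the line

print(r_squared(on_line, predicted))    # 1.0
print(r_squared(scattered, predicted))  # below 1: some variability is unexplained
```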

As with most models, you really need to be able to inspect the raw data (or at least see a plot of the raw data together with the model, as presented here). Just because R^2 is very high doesn't necessarily mean your model is good at predicting Y from X over all portions of the data region. For example, look at how much farther from the model line the sample data fall near the left end of the X scale than in the middle region. Think about a data set that has a logarithmic relationship but has a linear plot fitted to it. You can get a rather high R^2 because the sample data vary significantly from the linear model only near the extreme end.
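
A small sketch of that logarithmic case, under stated assumptions: the data are synthetic (y = ln x for x = 1..50, chosen arbitrarily), and the line is an ordinary least-squares fit written out by hand. It prints R^2 alongside the largest miss near the left end and in the middle, so the two regions can be compared.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Synthetic logarithmic data (an assumption, not the forum data).
xs = list(range(1, 51))
ys = [math.log(x) for x in xs]

slope, intercept = fit_line(xs, ys)
residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]

mean_y = sum(ys) / len(ys)
ss_tot = sum((y - mean_y) ** 2 for y in ys)
ss_res = sum(r ** 2 for r in residuals)
r2 = 1 - ss_res / ss_tot

left_miss = max(abs(r) for r in residuals[:5])       # x = 1..5
middle_miss = max(abs(r) for r in residuals[20:30])  # x = 21..30

print("R^2:", round(r2, 3))
print("largest miss, left end:", round(left_miss, 3))
print("largest miss, middle:  ", round(middle_miss, 3))
```

A single summary number hides where the misses are concentrated, which is why looking at the plot matters.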

DrDave
...it is better to first strive for clarity and correctness and to make programs efficient only if really needed.
"Getting Started with Erlang" version 5.6.2
