## What-If 0002: "SAT Guessing"

ThirstyMonkey
### Re: What-If 0002: SAT Guessing

mikewhite wrote:I think a major point missed by the article is that the SAT is not scored on a completely linear scale, and a perfect score does not correlate with having all the answers correct. This flaws the whole article, it's answering a different question (a much easier one I might add) than originally asked.

Several factors are taken into account, not just how many questions an individual got correct but also how everyone else performed. I think it would have been interesting to investigate how the bell curve for the random scores would be fitted given random answers (would it be a Normal distribution by the central limit theorem?), what the probability is that an individual would be given a perfect score (1600 or whatever it is) given the random answers, and then also, supposing everyone else in the world only supplied random answers to the test, what I would need to have scored in the normal SAT test to have achieved a perfect score in that scenario (I'd imagine anything greater than a 1000 in normal conditions would probably get you there, but it would be fun to learn the actual answer!).

I feel this was a halfhearted attempt at the question at best, but given my background in Math and statistics I probably hold it to higher standards. Also it would probably take a whole day's worth of work by a person both intelligent and motivated to get all that info, if it was even possible to get in the first place.

I know people have said this earlier in the thread, but I thought I'd flesh it out with suggestions of what I would have wanted to see. Sadly I'm too busy with work to hunt down the answers myself...

I was pretty disappointed by this entry for exactly these reasons. I believe SAT scores are scaled such that 500 is the average score and 100 is the standard deviation. I don't think it's perfect (perhaps intentionally so), but some effort is made toward that effect. Assuming the College Board would continue to abide by these rules despite a much denser score distribution than normal, this would imply that, for each individual section, one would have a 0.13% chance of getting a perfect score (that is, their score would lie 3 standard deviations above the mean). Thus, because the person is guessing randomly and the sections are independent, we would have 0.13^3 = 0.00246%. Apparently the percentile for a 2400 (as of 2006, according to Wikipedia) is 99.98. So, when people take the test normally, 0.02% get a perfect score. This increase makes sense because a person's results on the three sections are not independent. A person whose intelligence lies "three standard deviations above the mean" (whatever that means) would have a relatively good chance of getting a perfect score on each section.

The difference between my calculated rate and the reported one is a factor of 10, but I think it would be even bigger. Perhaps there are some statistical subtleties due to the tests having different numbers of questions. (I simply thought of it as: (chance of performing three SDs above mean)^3). Also, the reported rate of 2400s is below the chance of simply being three standard deviations above the mean, so the official scoring likely isn't as rigidly ruled as we would like to believe.

### Re: What-If 0002: SAT Guessing

So if EVERYONE took it randomly, then there'd be a much higher chance of getting a perfect score.

However, people who get perfect scores generally say the SATs are really easy, so I assume they get all or almost all the questions right. So if you're the only person who took it randomly, then you have to get almost all the questions right to get a perfect score, and randall's statistics are (roughly) correct.

gmalivuk
GNU Terry Pratchett
### Re: What-If 0002: SAT Guessing

Also, I don't believe the renormalization happens after every testing date. Rather, if averages for everyone who's taken it are drifting in one direction or another, they'll change the scoring for future tests.

So I don't think one testing session where everybody guesses would actually result in a wider range of raw scores being counted as "perfect".
Arariel
### Re: What-If 0002: SAT Guessing

Depends if 'everyone who took the SAT' means 'just this session' or actually everyone who took the SAT.

yedidyak
### Re: What-If 0002: SAT Guessing

I know that in the section I got a perfect score in I still got a couple wrong. That was out of 40-50 questions, so it can't be 99.98% right.

marsman57
### Re: What-If 0002: SAT Guessing

blowfishhootie wrote:
marsman57 wrote:SAT guessing was a huge disappointment after how cool Relativistic Baseball was.

I thought the exact opposite. I thought this one was interesting and at least a (very) slight bit relevant. There are people who guess blindly on all the questions on standardized tests. Not everyone, and not even close to enough to make the question realistically applicable, but it was still interesting to me to know what a person's chances of doing outstanding when guessing on the exam are.

I get where you are coming from. Maybe it would have been more interesting to me if he had gone on to explore what average score you should expect if you guessed on every question (probably not statistically that hard to figure out, but not something I know off the top of my head). Also, it would be fun to compare that to the strategy of answering "always C", but alas that last part would be dependent on the individual exam version.
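For what it's worth, the expected-score part can be sketched in a few lines. This assumes the scoring rule mentioned elsewhere in the thread (+1 per correct answer, a quarter point off per wrong answer, five choices per question); the 170-question count is just one of the figures quoted in the thread, and `expected_raw_score` is a made-up name for illustration.

```python
from fractions import Fraction


def expected_raw_score(num_questions, choices=5, penalty=Fraction(1, 4)):
    """Expected raw score when every answer is a uniform random guess.

    Assumes +1 per correct answer and -penalty per wrong answer
    (the quarter-point deduction is an assumption taken from the thread).
    """
    p_right = Fraction(1, choices)
    # Average contribution of a single guessed question.
    per_question = p_right * 1 - (1 - p_right) * penalty
    return num_questions * per_question


print(expected_raw_score(170))  # 0
```

Under those assumptions the penalty exactly cancels the lucky hits, so blind guessing averages a raw score of zero no matter how long the test is.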

yedidyak wrote:I know that in the section I got a perfect score in I still got a couple wrong. That was out of 40-50 questions, so it can't be 99.98% right.

If I recall correctly, the number that you can miss and get a perfect score is calibrated per test version. If a (relatively) lot of people are getting 100% of the questions correct, then even one miss will kill a perfect score, but if the number getting every single question correct is (relatively) low, there is a little bit more leniency*. I suppose it is kind of like a "curve" in an exam.

* - I am not exactly sure how they do it. It may be based on average scores rather than just the scores of top performers. Otherwise, you'd need to normalize for the number of top performers taking the exam which I am sure varies from test date to test date (as a top performer only needs to retake the test if they are shooting for a perfect score).

JimsMaher
### Re: What-If 0002: SAT Guessing

conlinism wrote:Correct me if I am wrong, but wouldn't the [probability] of guessing them all correctly be 1/5? Because the questions are independent of one another, so the probability of each is 1/5. The probability would not decrease as you progress through the questions...

Each question is independent of any other; however, the final result is dependent on each-and-every question.

Individual probabilities are compounded here.
To test these test probabilities we're looking at the set of all equally likely permutations.
Think of it like a table, and with each new question adding another dimension to the volume, with a length equal to the number of answers allowed.
Each block in the table is one unique combination of answers.

A test with only one question would be a line of choices.
A test of two questions would be a rectangle of choices.
A test of three questions ... a box of choices.
And that should be enough to understand the principle.

How do you find the volume of a box?
H x L x W

How do you find the number of possible results for a multiple choice test of three questions?
Q1 times Q2 times Q3
with Q# = the number of choices for each question

How do you find the odds of randomly selecting all three questions correct?
1 over { Q1 times Q2 times Q3 }

If all questions have the same number of options, like on the SAT, then the formula is much simpler.

odds = 1 over { x^n }
x = choices per question (5 on SAT) ... or standard length
n = number of questions (158 or 170 on SAT, depending on your source) ... or number of dimensions

1 over { 5^158 } = 1 over 273 691 106 313 440 834 164 790 934 236 311 744 390 676 160 522 756 986 671 302 399 197 224 820 837 082 148 727 859 021 164 476 871 490 478 515 625

1 over { 5^170 } = 1 over 66 819 117 752 304 891 153 513 411 678 787 046 970 379 922 002 626 217 449 048 437 304 009 966 024 678 258 966 762 456 338 983 611 203 730 106 353 759 765 625

Which means the odds were potentially overestimated by a factor of 244,140,625.
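These figures are easy to double-check, since Python integers have arbitrary precision. The following is only a sketch: `outcomes` is a hypothetical helper name, and the two question counts are the ones quoted in this post.

```python
CHOICES = 5  # answer options per question, as stated in the post


def outcomes(num_questions, choices=CHOICES):
    """Number of equally likely answer sheets for a test of this length."""
    return choices ** num_questions


small = outcomes(158)
large = outcomes(170)

print(small)            # full value of 5^158
print(large)            # full value of 5^170
print(large // small)   # ratio between the two estimates, i.e. 5^12 = 244140625
```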

masterfreek64
### Re: What-If 0002: SAT Guessing

The issue with this logic is that each SAT test's answer scale is weighted so that the points follow a certain distribution. I.e. if everyone guessed right, we would have the same proportion of 2400 SAT scores (in some tests, an empty sheet can score something like 400 points, because some people taking the same version of the test had negative points).

gmalivuk
GNU Terry Pratchett
### Re: What-If 0002: SAT Guessing

An empty sheet has positive points because they dock points for wrong answers but not for blank answers. So I believe 200 is what you'd get for a blank sheet, and since they take .25 (raw) points per incorrect answer, getting all the questions wrong would be 0. (Of course, there's about a 5e-16 chance of getting them all wrong, which makes that pretty unlikely as well.)
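A quick sanity check on that last figure, as a sketch only: it assumes 158 five-choice questions (one of the counts quoted in the thread) and independent uniform guesses, and `p_all_wrong` is a made-up name.

```python
def p_all_wrong(num_questions, choices=5):
    """Chance that every one of num_questions random guesses is wrong."""
    return ((choices - 1) / choices) ** num_questions


# Assumed question count of 158; the result is on the order of 5e-16.
print(f"{p_all_wrong(158):.1e}")
```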
MJZimmer88
### Re: What-If 0002: SAT Guessing

My only problem with "SAT Guessing" is that the probability mathematics used to answer the question doesn't really fall into the "physics" subject. It's just math.

And yes, blur the lines all you want... but while probabilities are important in physics, the statistics/calculations used are just straightforward from 4th-6th grade math classes, with BIGGER numbers.

(I capitalized "BIGGER" because I figured the word should appear BIGGER.)

Arariel
### Re: What-If 0002: SAT Guessing

gmalivuk wrote:An empty sheet has positive points because they dock points for wrong answers but not for blank answers. So I believe 200 is what you'd get for a blank sheet, and since they take .25 (raw) points per incorrect answer, getting all the questions wrong would be 0. (Of course, there's about a 5e-16 chance of getting them all wrong, which makes that pretty unlikely as well.)

Scores less than 200 are not reported. I think you may need to actually get some wrong (something to do with scaling) to get exactly 200 on one or two of the sections.

Jragonlord
### Re: What-If 0002: SAT Guessing

MJZimmer88 wrote:My only problem with "SAT Guessing" is that the probability mathematics used to answer the question doesn't really fall into the "physics" subject. It's just math.

Isn't Physics just applied math?
http://xkcd.com/435/

As for those questioning the scoring of the SAT, those who've been saying it's a normal distribution are correct. Mean = 500, SD = 100, anything below 3 SDs becomes a 200 automatically (lowest possible score), and anything above 3 SDs is automatically an 800.

On a side note, it's not too uncommon, it seems, that one gets higher than 200 for a blank SAT sheet. In most cases, you actually have to dip into negative points (obtained by answering more than four questions wrong for every question answered correctly, or any questions wrong for each question left blank) to get a 200.

drewder
### Re: What-If 0002: SAT Guessing

Arariel wrote:
gmalivuk wrote:An empty sheet has positive points because they dock points for wrong answers but not for blank answers. So I believe 200 is what you'd get for a blank sheet, and since they take .25 (raw) points per incorrect answer, getting all the questions wrong would be 0. (Of course, there's about a 5e-16 chance of getting them all wrong, which makes that pretty unlikely as well.)

Scores less than 200 are not reported. I think you may need to actually get some wrong (something to do with scaling) to get exactly 200 on one or two of the sections.

According to the people at the SAT office, a blank sheet would theoretically be a 200, but in reality the test would be rejected and considered canceled, and no score would be reported. The Snopes people theorize that you are right about needing to get some wrong, but from how they word it they can't say for certain.
http://web.archive.org/web/200501040116 ... ml#quest03
http://www.snopes.com/college/exam/sat.asp

gmalivuk
GNU Terry Pratchett
### Re: What-If 0002: SAT Guessing

Jragonlord wrote:As for those questioning the scoring of the SAT, those who've been saying it's a normal distribution are correct. Mean = 500, SD = 100, anything below 3 SDs becomes a 200 automatically (lowest possible score), and anything above 3 SDs is automatically an 800.
Sure, but do they check the distribution for a particular set of tests before scoring those same tests, or do they base current scoring on the distributions from previous tests?
Jragonlord
### Re: What-If 0002: SAT Guessing

gmalivuk wrote:
Jragonlord wrote:As for those questioning the scoring of the SAT, those who've been saying it's a normal distribution are correct. Mean = 500, SD = 100, anything below 3 SDs becomes a 200 automatically (lowest possible score), and anything above 3 SDs is automatically an 800.
Sure, but do they check the distribution for a particular set of tests before scoring those same tests, or do they base current scoring on the distributions from previous tests?

The scaling is calculated per test. Regardless of how any other test has been scored, the distributions are recalculated.

Now, the above statement is for any given test. I suppose that if they were to have someone retake an old test (presumably that they had not taken before), the scaling would have to be recalculated for their results. That last part's conjecture, but I imagine that there could be a statistical relevance to the recalculation, even if it would be tedious.

### Re: What-If 0002: SAT Guessing

Is it possible to get all of the questions right and get below a perfect score? That would suck.

Kain
### Re: What-If 0002: SAT Guessing

I would wager that if enough people got perfect scores such that a perfect score wasn't 3 sigma above the mean, they would probably have to reissue the test, as it would probably indicate some major cheating event occurred.
bitwiseshiftleft
### Re: What-If 0002: SAT Guessing

ThirstyMonkey wrote:I was pretty disappointed by this entry for exactly these reasons. I believe SAT scores are scaled such that 500 is the average score and 100 is the standard deviation. I don't think it's perfect (perhaps intentionally so), but some effort is made toward that effect. Assuming the College Board would continue to abide by these rules despite a much denser score distribution than normal, this would imply that, for each individual section, one would have a 0.13% chance of getting a perfect score (that is, their score would lie 3 standard deviations above the mean). Thus, because the person is guessing randomly and the sections are independent, we would have 0.13^3 = 0.00246%. Apparently the percentile for a 2400 (as of 2006, according to Wikipedia) is 99.98. So, when people take the test normally, 0.02% get a perfect score. This increase makes sense because a person's results on the three sections are not independent. A person whose intelligence lies "three standard deviations above the mean" (whatever that means) would have a relatively good chance of getting a perfect score on each section.

The difference between my calculated rate and the reported one is a factor of 10, but I think it would be even bigger. Perhaps there are some statistical subtleties due to the tests having different numbers of questions. (I simply thought of it as: (chance of performing three SDs above mean)^3). Also, the reported rate of 2400s is below the chance of simply being three standard deviations above the mean, so the official scoring likely isn't as rigidly ruled as we would like to believe.

I believe you mean 0.0013^3 = 0.000000246%. At this rate, if everyone in the US (not just the 17-year-olds) took the test, it'd be about even odds on whether one of them got a "perfect" 2400. It's because of the covariance that people ace it. Also, the test is designed for ordinary students, not outliers, so the real-life scores probably have a heavy tail.
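The arithmetic can be reproduced roughly as follows. This is a sketch under stated assumptions: scores are treated as exactly normal, the three sections as independent, and the US population as a round 300 million (a figure not given in the post). The function and constant names are made up. Note that the 0.000000246% figure corresponds to the unrounded tail probability of about 0.00135 rather than 0.0013.

```python
import math


def upper_tail(z):
    """P(Z > z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2))


p_section = upper_tail(3.0)       # about 0.00135, i.e. 3 SDs above the mean
p_all_three = p_section ** 3      # sections treated as independent

US_POPULATION = 300_000_000       # rough assumption, not from the post

# P(at least one perfect score) = 1 - (1 - p)^N, computed stably.
p_at_least_one = -math.expm1(US_POPULATION * math.log1p(-p_all_three))

print(f"per section: {p_section:.5f}")
print(f"all three:   {p_all_three * 100:.9f}%")
print(f"at least one perfect score among everyone: {p_at_least_one:.2f}")
```

With these inputs the last number comes out close to one half, which is where the "about even odds" estimate comes from.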

rmsgrey
### Re: What-If 0002: SAT Guessing

Jragonlord wrote:
gmalivuk wrote:
Jragonlord wrote:As for those questioning the scoring of the SAT, those who've been saying it's a normal distribution are correct. Mean = 500, SD = 100, anything below 3 SDs becomes a 200 automatically (lowest possible score), and anything above 3 SDs is automatically an 800.
Sure, but do they check the distribution for a particular set of tests before scoring those same tests, or do they base current scoring on the distributions from previous tests?

The scaling is calculated per test. Regardless of how any other test has been scored, the distributions are recalculated.

Now, the above statement is for any given test. I suppose that if they were to have someone retake an old test (presumably that they had not taken before), the scaling would have to be recalculated for their results. That last part's conjecture, but I imagine that there could be a statistical relevance to the recalculation, even if it would be tedious.

In principle, yes, if someone came along 20 years later and took the same SAT papers you did, then everyone's scores would have to be recalculated, and you'd never be sure what your SAT score actually was. In practice, you'd need at least tens of thousands of people to take the test to have any appreciable chance of shifting the curve enough to change people's scores by more than a couple of points, so they're not going to bother recalculating...

An interesting experiment would be to get a large sample of each year-group to answer questions from the previous year's SATs in addition to their own so you can get an idea of how much, or how little, variation there is between year groups. Another way of getting data between years would be to repeat some of the previous year's questions - say 10%.

The direct comparison between class of 2020 and class of 2021 on the same questions wouldn't tell us much about the relative levels of the two year-groups, but the comparison between the class of 2021's performance on 2020's papers and the class of 2022's performance on 2021's would be a sensible one to use. Of course, it's also an experiment that the more politically minded of the people behind the SATs wouldn't want carried out (unless they had an option on burying the results) - their best-case outcome is that the results show little to no variation year on year, while a result that showed that this year's 2300-scoring students would have only got 2000 last year would pull the rug out from under the SATs...

### Re: What-If 0002: SAT Guessing

As your resident SAT expert, I'd like to point out an error in this What-If:

47 in the newfangled writing section

There are actually 49 writing multiple-choice questions on each test: 35 questions in the 25-minute writing multiple-choice section + 14 questions in the 10-minute writing multiple-choice section. Thus, our denominator is off by a factor of 25.

Also, as others have pointed out, it's possible to miss questions and still achieve a perfect score. Here are some compiled curves from released exams: http://www.erikthered.com/tutor/SAT-Released-Test-Curves.pdf
Indeed, one can typically miss up to two Critical Reading questions and still score an 800 (it doesn't matter whether those two questions were incorrect or omitted: raw scores are rounded to the nearest integer, so 65/67 and 64.5/67 give identical scaled scores). Math sections for which 53/54 results in an 800 are more rare. Of course, all of this is nitpicking, as it's impossible to achieve a perfect score through guessing alone anyhow due to the essay.

Also interesting is the fact that a raw score of zero on any section rarely results in a score of 200 (not shown in the above link, which is apparently only aimed at overachievers). Typical scaled scores corresponding to a raw score of zero range from 210 to 270 (250 Critical Reading and 230 Math on this test: http://www.collegeboard.com/prod_downloads/prof/counselors/tests/sat/2007-08_sat_preparation_booklet.pdf). Sure, a blank test form results in a score of zero, but due to the rounding of raw scores, one can blindly guess on two questions on each section risk-free, resulting in -- at worst -- a raw score of -0.5, which rounds to zero (or with luck, a raw score of 2) on each section!
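The risk-free claim can be checked by enumerating every outcome of two blind guesses. This is only a sketch: it assumes five choices per question, a quarter-point penalty, and round-half-up rounding (which matches the statement that -0.5 rounds to zero); the names are made up for illustration.

```python
import math
from fractions import Fraction
from itertools import product

P_RIGHT = Fraction(1, 5)   # assumed: five answer choices per question
PENALTY = Fraction(1, 4)   # assumed: quarter point off per wrong answer


def round_half_up(x):
    """Round to the nearest integer, with halves going up (assumed rule)."""
    return math.floor(x + Fraction(1, 2))


def rounded_outcomes(num_guesses=2):
    """Map each rounded raw score to its probability under blind guessing."""
    dist = {}
    for sheet in product([True, False], repeat=num_guesses):
        raw = sum(1 if ok else -PENALTY for ok in sheet)
        prob = math.prod(P_RIGHT if ok else 1 - P_RIGHT for ok in sheet)
        score = round_half_up(raw)
        dist[score] = dist.get(score, 0) + prob
    return dist


for score, prob in sorted(rounded_outcomes().items()):
    print(score, prob)
```

Under these assumptions the lowest rounded score that two guesses can produce is zero, the same as leaving both questions blank.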

Arariel
### Re: What-If 0002: SAT Guessing

Kain wrote:I would wager that if enough people got perfect scores such that a perfect score wasn't 3 sigma above the mean, they would probably have to reissue the test, as it would probably indicate some major cheating event occurred.

Not necessarily; a few of the subject tests have 800 at a rather lowish percentile.

pizzazz
### Re: What-If 0002: SAT Guessing

mikewhite wrote:I think a major point missed by the article is that the SAT is not scored on a completely linear scale, and a perfect score does not correlate with having all the answers correct. This flaws the whole article, it's answering a different question (a much easier one I might add) than originally asked.

Several factors are taken into account, not just how many questions an individual got correct but also how everyone else performed. I think it would have been interesting to investigate how the bell curve for the random scores would be fitted given random answers (would it be a Normal distribution by the central limit theorem?), what the probability is that an individual would be given a perfect score (1600 or whatever it is) given the random answers, and then also, supposing everyone else in the world only supplied random answers to the test, what I would need to have scored in the normal SAT test to have achieved a perfect score in that scenario (I'd imagine anything greater than a 1000 in normal conditions would probably get you there, but it would be fun to learn the actual answer!).

I feel this was a halfhearted attempt at the question at best, but given my background in Math and statistics I probably hold it to higher standards. Also it would probably take a whole day's worth of work by a person both intelligent and motivated to get all that info, if it was even possible to get in the first place.

I know people have said this earlier in the thread, but I thought I'd flesh it out with suggestions of what I would have wanted to see. Sadly I'm too busy with work to hunt down the answers myself...

There's also the fact that on every test, one math or one reading section is not counted. I'm not sure if this was taken into account or not (Monroe mentioned the number of questions, but I don't know if that's total questions or questions counted).

jjcote
### Re: What-If 0002: SAT Guessing

