Journal of the Society for Psychical Research, (2008) 72, 86-97
by Rupert Sheldrake, Charles Overby and Ashwin Beeharee


Introduction

An apparent sense of being stared at is well known and very common (Sheldrake, 2003a): most people claim to have turned round to find that someone is staring at them; or, conversely, they claim to have stared at someone from behind who then turned round and looked back at them. This non-verbal form of social interaction can be associated with a variety of emotions on the part of the starer, including curiosity, sexual desire and anger (Sheldrake, 2003a). A scientific name for this phenomenon is scopaesthesia, from the Greek words for looking and knowing (Carpenter, 2005).

Is scopaesthesia just a matter of coincidence? Perhaps people sometimes simply turn round by chance, and if they find someone looking at them remember it, but forget the occasions when no one was looking. But perhaps people can sometimes really detect when they are being looked at. The best way to find out is through experimental tests in which people are stared at or not stared at in a randomized sequence of trials. The results of more than 30,000 trials imply that scopaesthesia is real (for a review, see Sheldrake, 2005). A meta-analysis of the experimental data has shown a highly significant effect overall (Radin, 2005).

In the simplest and most widely replicated procedure, people work in pairs, with a staree and a starer. In a randomized series of trials, the blindfolded starees sit with their backs to the starers, who either stare at the back of the starees' necks, or look away and think of something else. A mechanical sound signal, such as a bleep, marks the beginning of each trial. The starees guess whether they are being looked at or not, and their guesses are usually made within 10 seconds of the start of the trial. Each guess is either a hit or a miss, and is immediately recorded on a score sheet by the starer. A test session usually consists of 20 trials, and takes less than 5 minutes. The average hit rate by guessing alone would be 50%. In fact, in a long series of tests conducted by a number of different investigators, in more than 30,000 trials the overall hit rate was 54.7% (Sheldrake, 2005). Significant positive results were also obtained when starers and starees were separated by windows or one-way mirrors (Sheldrake, 2000; Experiment 1 by Colwell et al., 2000).

In some of these tests, a counterbalanced randomization procedure was used that deviated from "genuine randomness". Colwell et al. (2000), Marks and Colwell (2000) and Marks (2003) suggested that in trials that used these particular randomized sequences, and when starees were tested repeatedly, and when feedback was given, the starees' above-chance hit rate could have been due to implicit learning of a hidden structure in the sequences. However, this hypothesis is not supported by the fact that similar above-chance hit rates were also found in thousands of trials with other randomization methods, including the tossing of coins (Sheldrake, 1998, 1999). There were also positive, statistically significant hit rates in trials without feedback (Sheldrake 1999, 2000, 2001). Also, in most experiments starees were tested only once, and so there could have been no such implicit learning (Sheldrake, 2000, 2001, 2003b, 2005).

In this paper we describe an automated version of the standard staring test, using a computerized randomization protocol. For each trial, the system tells the starer whether to stare or not, and after a sound signal that indicates the trial is beginning, the staree guesses out loud by saying "looking" or "not looking". The starer records whether the guess is a hit or a miss on the computer. This procedure is repeated until the test is completed, with a total of 20 trials. The data from these tests are stored on an online database. This simple procedure makes the experiment easy to carry out in colleges and schools, and also by people at home.

We believe this is the first time a staring experiment has been carried out with an automated online system. Because these trials were unsupervised, their results cannot be taken as persuasive evidence for a genuine sense of being stared at. But they enable the effect of a range of variables to be explored, and can be of help in improving the design of automated tests. One advantage of making well designed automated tests available online is that they can be used as a basis for practical classes in schools and colleges, and also provide a basis for student projects.

This experiment was hosted on Rupert Sheldrake's web site from October 2003 onwards (www.sheldrake.org). We describe here the results up to January 1, 2007. There were 951 tests altogether, with a total of 19,020 trials. The variables included the use of blindfolds or not, the giving of feedback or not, the sex and age of the starees, and the relationships between starees and starers.

Methods

The test involved a series of 20 trials in which the starer was instructed at random to stare or not. In an improved, second version of the test, the computer gave a sound signal to indicate to the participants when each trial began. The staree guessed whether she was being stared at or not, and the starer entered this guess onto the computer by clicking a "Correct" or "Incorrect" button.

Participants were recruited online through Rupert Sheldrake's (RS) web site, and some were encouraged to do the test by RS at lectures and seminars, or in the course of radio interviews in the United Kingdom, Canada and the USA. Most starer/staree pairs were family members, and about half the participants were under 20 years old.

To start with, one of the participants had to register, filling in both people's names, email addresses, sexes, ages, relationship (e.g. friends, parent/child, siblings), and whether or not the staree was wearing a blindfold and/or receiving feedback. After registering, the starer pressed a button labelled "start trial" and the first instruction appeared, either "Stare" or "Do not stare". After the staree had made her guess and the starer had pressed the "Correct" or "Incorrect" button, a button labelled "Next trial" appeared, which the starer pressed when he was ready; the next instruction then appeared, and so on for a total of 20 trials.

In the second version of the test, a built-in sound signal indicated the beginning of each trial. As part of the registration process, there was a button labelled "click here to test sound". If the beep was audible, the person registering clicked a "yes" button, if not, a "no" button.

The system was programmed in Perl and JavaScript. It was tested thoroughly and supported most Internet browsers. For randomization of the looking and not-looking trials, it used Perl's built-in rand function, which is a pseudo-random number generator. This function calls upon the random number generator of the operating system, which was Linux. For each test the system generated a list of 20 "stare" or "do not stare" values, which were stored on the computer throughout the test, but which were, of course, invisible to the participants. This randomization was unconstrained: a given test could contain more or fewer than 10 staring trials, although on average there were 10 of each kind. The data were stored in an XML file, with a Perl front-end that supports simple data analysis activities such as searching and sorting by different criteria, including the age and sex of the starers and starees, the relationship of the participants, the use of the sound signal or not, and the use of blindfolds and feedback or not.
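
The unconstrained randomization described above can be illustrated with a minimal sketch. It is written in Python rather than the Perl actually used, and the function name `make_sequence`, the seed and the use of Python's `random` module are assumptions made for the example only.

```python
import random

def make_sequence(n_trials=20, rng=None):
    """Return n_trials independent 'stare' / 'do not stare' instructions.

    Each trial is drawn independently with probability 1/2, so the number
    of staring trials is not forced to equal n_trials / 2 (unconstrained).
    """
    rng = rng or random.Random()
    return ["stare" if rng.random() < 0.5 else "do not stare"
            for _ in range(n_trials)]

# Hypothetical seed, used only to make the illustration repeatable.
sequence = make_sequence(rng=random.Random(2003))
print(sequence.count("stare"), "staring trials out of", len(sequence))
```

Because each instruction is drawn independently, a single test may contain, say, 13 staring trials and 7 non-staring trials; only the long-run average is 10 of each.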

The test was described as follows to participants by RS on his web site:

Instructions for Conducting the Staring Test

This experiment involves people working in pairs, one the looker and the other the subject. The subject sits with his or her back to the looker, at least 2 meters away, and wears a blindfold (or if this is not possible keeps his or her eyes closed). I have found it convenient to use blindfolds of the kind used on airlines. The looker either looks or does not look at the subject in a series of 20 trials according to a random sequence. The looker sits near the computer screen on which the instruction for each trial will be displayed, and if the instruction says STARE, the looker looks at the back of the subject's neck. If the instructions are DO NOT STARE, the looker looks away and thinks about something else. With some computers, the beginning of the test will be signalled to the subject by means of a sound signal; but if your computer does not make a sound signal you will need to give one by means of a mechanical sound or beep. Here is a summary of the procedure:

  1. Fill in the User information (all entries MUST be completed). As part of this registration process there is a sound test. If your computer does not make a sound when you press the test button, you will need to signal the beginning of each trial to the subject by means of a mechanical click or beep. (Note: there may be a delay of several seconds between pressing the sound test button and the beep.)
  2. Click on the Begin Experiment button
  3. Follow the Instructions for Staring and/or Not Staring. If your computer does not give a sound signal, signal the beginning of the trial to the subject by means of a mechanical click or beep
  4. Ask the subject to respond 'Looking' or 'Not looking'. It is best to guess quite quickly, within 5-10 seconds.
  5. If the subject's response is correct, enter Correct, and if it is incorrect, enter Incorrect.
  6. You can do the experiment with or without feedback. If you decide to give trial-by-trial feedback to the subject, tell him or her if the guess is right or wrong
  7. When all 20 trials are complete, submit the data for permanent storage
  8. You can then either log off, do the test again with the same subject and looker, or switch roles

Statistics

The data from each 20-trial test were expressed in two different ways: first, the "score" method in terms of the total number of hits and misses; second, the "sign" method, in terms of the signs of the results for each 20-trial test. A total of 11 or more hits out of 20 counted as positive (+), 9 or less as negative (-) and 10 as equal (=). The sign method has the advantage that it gives an equal weighting to each participant, whereas a few high or low scoring participants could bias the total score method.
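
The two ways of expressing the data can be sketched as follows. The scores listed are hypothetical values chosen for illustration, and the function name `sign_of_test` is an assumption of the example.

```python
def sign_of_test(hits, n_trials=20):
    """Classify one test as '+', '-' or '=' relative to the chance score."""
    chance = n_trials / 2
    if hits > chance:
        return "+"
    if hits < chance:
        return "-"
    return "="

# Hypothetical scores (hits out of 20) from five tests, for illustration only.
scores = [11, 9, 10, 14, 7]

signs = [sign_of_test(h) for h in scores]   # "sign" method: one sign per test
total_hits = sum(scores)                    # "score" method: pooled hits
total_trials = 20 * len(scores)             # "score" method: pooled trials
print(signs, total_hits, total_trials)
```

In this illustration the test with 14 hits contributes three more hits to the pooled score than the test with 11 hits, but both count as a single + under the sign method.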

The total scores were analysed statistically using the exact binomial test, with the null hypothesis that the hit rate would be at the chance level of 50%.
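
For readers who wish to reproduce this kind of calculation, a minimal sketch of an exact binomial test is given here. It computes the one-sided upper tail under a chance level of 50%; whether the p values reported in this paper were one-sided or two-sided is not stated, and the function name `binom_tail_upper` is an assumption of the example.

```python
from math import comb

def binom_tail_upper(hits, trials):
    """Exact P(X >= hits) for X ~ Binomial(trials, 1/2)."""
    term = comb(trials, hits)          # number of outcomes with exactly `hits` hits
    favourable = 0
    for k in range(hits, trials + 1):
        favourable += term
        # C(n, k+1) = C(n, k) * (n - k) / (k + 1); the division is exact.
        term = term * (trials - k) // (k + 1)
    # Integer arithmetic keeps the sum exact; only the final step is a float.
    return favourable / 2 ** trials

# Small case that can be checked by hand: 15 or more hits in 20 trials.
print(binom_tail_upper(15, 20))   # 21700 / 1048576, about 0.0207
```

The same function can be applied to pooled totals, for example `binom_tail_upper(5281, 9960)`.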

The analysis of the sign data took into account only the number of + and - tests, and ignored = tests. These data were also analysed by the exact binomial test, with the null hypothesis that the total number of + tests would be 50% of the total number of + and - tests.

Comparisons between two sets of data (e.g. with feedback or without) were made by means of the Fisher exact test.
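
A minimal sketch of a two-sided Fisher exact test for a 2 x 2 table is given here as an illustration. It uses the common convention of summing the probabilities of all tables that are no more probable than the observed one; the exact counts entered into the tables for this paper are not reported, so the example counts are hypothetical, and the function name is an assumption.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Unnormalised hypergeometric weights for every table with the same
    # margins; all share the denominator C(row1 + row2, col1).
    weight = {x: comb(row1, x) * comb(row2, col1 - x) for x in range(lo, hi + 1)}
    observed = weight[a]
    # Sum the weights of tables that are no more probable than the observed one.
    extreme = sum(w for w in weight.values() if w <= observed)
    return extreme / comb(row1 + row2, col1)

# Hypothetical counts: rows are two conditions, columns are hits and misses.
print(fisher_exact_two_sided(3, 1, 1, 3))   # 34 / 70, about 0.486
```

Comparing integer weights rather than floating-point probabilities avoids rounding problems when deciding which tables are as extreme as the observed one.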

Results

Overall results

The combined results from all 951 tests showed a hit rate of 56.9%, very significantly above the chance expectation of 50% (p << 1 x 10^-6). The 95% confidence interval for this hit rate was from 56% to 58%.
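
The method used to obtain this confidence interval is not stated. As a rough check, assuming the usual normal approximation to the binomial distribution and treating the 19,020 trials as independent, with observed hit rate \hat{p} = 0.569 and n = 19020:

```latex
\hat{p} \pm 1.96\sqrt{\frac{\hat{p}\,(1-\hat{p})}{n}}
  = 0.569 \pm 1.96\sqrt{\frac{0.569 \times 0.431}{19020}}
  \approx 0.569 \pm 0.007
```

This corresponds to roughly 56.2% to 57.6%, which agrees with the reported interval when rounded to whole percentages.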

By the sign method, there were 572 + tests, 247 - tests, and 132 = tests. The excess of positive over negative signs was highly significant (p << 1 x 10^-6).

Effects of the automatic sound signal

The first version of this test had no automatic sound signal to indicate the beginning of each trial, so starers had to use some other signal, preferably a mechanical clicker or bleeper. If they did not have such a device, they had to improvise. Clearly signals such as tapping on the table could have given subtle cues to the staree, even if the starer was unaware of giving them. For this reason we developed a second version of the test in which the computer itself emitted a standard sound signal for the beginning of each test, two seconds after the starer received the instruction whether to look or not. However, not all computers had the capacity to emit sounds or the software to produce the sound signal. When registering for the test, participants pressed a button to activate a sample sound signal. They were then asked if their computer made a sound or not, and their response was recorded on the database of results.

We compared the results of tests with and without the automatic sound signal. There were 513 tests with and 438 tests without the sound signal (Table 1). The hit rate without the sound signal was very significantly higher than with the signal: 60.0% as opposed to 54.3% (p < 1 x 10^-6).

Table 1. Comparison of tests with and without an automatic sound signal ("auto signal"). The data are expressed both in terms of the total scores and in terms of the number of tests with scores above the chance level (+) and below the chance level (-). The p values for the statistical comparisons of results with and without the automatic sound signal are in the columns headed "p diff.".

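| Comparison | Tests | Hits % | p | p diff. | + | - | p | p diff. |
|---|---|---|---|---|---|---|---|---|
| Auto signal | 513 | 54.3 | <1 x 10^-6 | | 283 | 150 | <1 x 10^-6 | |
| No auto signal | 438 | 60.0 | <1 x 10^-6 | <1 x 10^-6 | 289 | 97 | <1 x 10^-6 | <1 x 10^-6 |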

The higher hit rates without the automatic sound signal could have been due to the leakage of information from starer to staree from the way the improvised signal was given, so all the following comparisons involve only tests in which a sound signal was given automatically.

The distribution of hits

The frequency distribution of hits showed a fairly symmetrical pattern, with a peak at 11 hits out of 20 (Figure 1). However, there were several tests with very high hit rates, showing up as a small blip at the right-hand end of the graph. Either the starees in these tests were exceptionally sensitive, or the participants were cheating or frivolously entering false data. To guard against the latter possibilities, we excluded the data from all tests in which the score was 18, 19 or 20 out of 20. Following this conservative approach, all the following analyses are based on data from tests with sound signals and with scores of 17 or below. After these exclusions, there was a total of 498 tests, with 5281 hits out of 9960 trials (53.0%; p < 1 x 10^-6; 268+ 150- 80=).
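
The selection rule described above can be sketched as follows. The records and their field names are hypothetical and do not reflect the actual structure of the stored data; they serve only to show how the two exclusion criteria combine.

```python
# Hypothetical records; the field names are assumptions, not the real schema.
tests = [
    {"hits": 12, "auto_signal": True},
    {"hits": 19, "auto_signal": True},    # excluded: score of 18 or more
    {"hits": 14, "auto_signal": False},   # excluded: no automatic sound signal
    {"hits": 9, "auto_signal": True},
]

MAX_SCORE = 17  # scores of 18, 19 or 20 out of 20 are dropped

def select_tests(records, max_score=MAX_SCORE):
    """Keep tests run with the automatic sound signal and scoring max_score or below."""
    return [r for r in records if r["auto_signal"] and r["hits"] <= max_score]

kept = select_tests(tests)
hit_rate = sum(r["hits"] for r in kept) / (20 * len(kept))
print(len(kept), "tests kept; pooled hit rate", round(100 * hit_rate, 1), "%")
```

The pooled hit rate is computed in the same way as the figure quoted above, where 5281 hits out of 9960 trials gives 53.0%.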

Figure 1. Distribution of scores in tests with automatic sound signals, showing the number of tests in which the scores were 0, 1, 2, 3, ... out of 20.


The pattern of results in looking and not-looking trials

In the 498 tests with sound signals and with scores of 17 or below out of 20, the hit rates were 55.3% in looking trials and 50.8% in not-looking trials (p < 1 x 10^-6; Table 2). A similar pattern was found in previous experiments, with higher hit rates in looking than in not-looking trials (Sheldrake, 2005).

Table 2. Comparison of looking and not-looking trials all conducted with an automatic sound signal. The data from a total of 513 tests are expressed both in terms of the total scores and in terms of the number of tests with scores above the chance level (+) and below the chance level (-). The p values for the statistical comparisons of results from looking and not-looking trials are in the columns headed "p diff.".

| Comparison | Trials | Hits % | p diff. | + | - | p diff. |
|---|---|---|---|---|---|---|
| Looking | 5100 | 56.6 | | 297 | 147 | |
| Not looking | 5260 | 52.0 | <1 x 10^-6 | 251 | 193 | 0.0003 |

Male and female starees and starers

Female starees scored slightly higher than males, 53.1% as opposed to 52.9%, but this difference was not significant statistically. In tests with female starers, the hit rates were higher than with male starers, 53.4% as opposed to 52.3%, but again this difference was not significant statistically. The higher hit rates with female starers were almost entirely confined to tests in which they were staring at males (Table 3): with female starers, male starees' hit rate was 54.6%, as opposed to only 51.4% with male starers. With female starees, the sex of the starer made no difference (Table 3).

Table 3. Comparison of tests with male and female subjects, male and female starers, and with different combinations of starers and subjects (Starer/subject). In all these tests there were automatic sound signals, and data from tests in which the scores were over 17 out of 20 have been excluded. The p values for the statistical comparisons of results are in the column headed "p diff.".

| Comparison | Tests | Hits % | p diff. |
|---|---|---|---|
| Male subjects | 135 | 52.9 | |
| Female subjects | 363 | 53.1 | NS (0.83) |
| Male starers | 152 | 52.3 | |
| Female starers | 346 | 53.4 | NS (0.31) |
| Male/male | 70 | 51.4 | |
| Female/male | 65 | 54.6 | NS (0.10) |
| Male/female | 82 | 53.1 | |
| Female/female | 281 | 53.1 | NS (0.99) |

Effects of starees' age

Starees of different ages had similar hit rates (Table 4) and no clear trend was apparent. The 14-16 age group had a lower hit rate than younger and older starees, and there were small differences between other age groups, but all hit rates were in the range 52.2% to 54.2%, except for the oldest age group, >50, where the hit rate was at chance.

Table 4. Hit rates with subjects in different age groups. In all these tests there were automatic sound signals, and data from tests with scores over 17 out of 20 have been excluded.

| Age | Tests | Hits % | p |
|---|---|---|---|
| 3-13 | 30 | 53.0 | NS |
| 14-16 | 127 | 52.2 | 0.01 |
| 17-20 | 65 | 54.2 | 0.002 |
| 21-30 | 77 | 53.8 | 0.002 |
| 31-40 | 48 | 53.5 | 0.02 |
| 41-50 | 130 | 53.2 | 0.005 |
| >50 | 21 | 50.2 | NS |

Effects of starer-staree relationships

The highest hit rates were with parent-child pairs and pairs of siblings (Table 5). In the case of parent-child pairs, there was a significantly (p = 0.03) higher hit rate with parents as starees (24 parents; 58.8%) than with children as starees (22 children; 51.5%).

Table 5. Hit rates with different relationships between subjects and lookers. In all these tests there were automatic sound signals, and data from tests in which the scores were over 17 out of 20 have been excluded.

| Relationship | Tests | Hits % | p |
|---|---|---|---|
| Parent/child | 46 | 55.4 | 0.0007 |
| Siblings | 31 | 55.5 | 0.004 |
| Partner/spouse | 61 | 52.5 | 0.04 |
| Other family | 189 | 52.6 | 0.0007 |
| Friends | 151 | 52.8 | 0.001 |

Effects of blindfolds

Hit rates were lower when starees wore blindfolds than when they did not. The average hit rate for the 349 blindfolded starees was 52.5% and for the 149 starees without blindfolds 54.2% (Table 6), but this difference was not significant statistically.

Table 6. Comparison of tests with and without blindfolds, and with and without feedback. In all these tests there were automatic sound signals, and data from tests in which the scores were over 17 out of 20 have been excluded. The p values for the statistical comparisons of results are in the column headed "p diff.".

| Comparison | Tests | Hits % | p | p diff. |
|---|---|---|---|---|
| Blindfold | 349 | 52.5 | 0.00001 | |
| No blindfold | 149 | 54.2 | 0.000002 | NS (0.11) |
| Feedback | 258 | 54.4 | <1 x 10^-9 | |
| No feedback | 240 | 51.6 | 0.02 | 0.005 |

Effects of feedback

Hit rates were higher when starees received trial-by-trial feedback than when they did not. The average hit rate with feedback was 54.4% compared with 51.6% without (Table 6). This difference was significant (p=0.005).

Changes in scores as tests progressed

We compared the hit rates in the first 10 trials with those in the second 10 trials to see if starees' hit rates changed as the test progressed. Overall, there was a slight decline, from 53.2% to 52.9%, but this difference was not significant statistically. Neither the presence or absence of feedback nor the use of blindfolds resulted in a significant increase or decrease of scores in the second half of the test (Table 7).

Table 7. Comparison of scores in trials 1-10 and trials 11-20 in all tests, in tests with and without blindfolds, and in tests with and without feedback. In all these tests there were automatic sound signals; data from tests in which the scores were over 17 out of 20 have been excluded. The p values for the statistical comparisons of results are in the column headed "p diff.".

| Comparison | Tests | 1-10 Hits % | 11-20 Hits % | p diff. |
|---|---|---|---|---|
| All | 513 | 53.2 | 52.9 | NS (0.74) |
| Blindfold | 349 | 52.9 | 52.1 | NS (0.74) |
| No blindfold | 149 | 53.8 | 54.6 | NS (0.65) |
| Feedback | 258 | 54.7 | 54.0 | NS (0.63) |
| No feedback | 240 | 51.6 | 51.5 | NS (0.97) |

Discussion

The fact that hit rates were significantly lower with an automated sound signal at the beginning of each trial (Table 1) suggests that when starers were using signalling methods of their own choice they could have conveyed conscious or unconscious cues to the starees. Because of this possibility, we took into account only the data from tests involving automated sound signals. We also took into account only those data from tests in which starees scored 17 or less out of 20, to guard against the possibility that elevated scores of 18, 19 or 20 (Figure 1) were a result of cheating or a frivolous entry of scores. After these exclusions, there was a total of 498 tests, with a hit rate of 53.0% (p < 1 x 10^-6); and by the sign method 268+ 150- 80= (p < 1 x 10^-6).

Given that these tests were unsupervised, it is possible that some of the participants with scores of 17 or below were also cheating. Some could even have been sceptics who entered false data in order to prove that entering false data was possible. But the pattern of results makes it unlikely that cheating or attempted sabotage by sceptics could have played a very large part. The number of people who scored above chance levels was 268, compared with 150 who scored below chance. It seems improbable that so many people cheated or that so many sceptics took part in order to enter false results. Also, if they had done so, they would have had to coordinate their efforts so as to produce the more or less symmetrical distribution curve of scores shown in Figure 1. Such a complex conspiracy seems highly improbable, given that the experiment took place over more than 3 years, with participants in many different countries.

As in previous research (Sheldrake, 2005), there was a higher hit rate in the looking trials than in the not-looking trials (Table 2). This difference can be explained either as a consequence of a response bias in favour of guessing "looking" as opposed to "not looking" (Schmidt, 2001), and/or because it may be easier to detect the presence of stares than the absence of stares: in other words, a signal may be easier to detect than the absence of a signal (Sheldrake, 2005).
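
The response-bias explanation can be made concrete with a simple worked case. Assume, purely for illustration, that a staree has no sensitivity at all and guesses "looking" with a fixed probability b on every trial, and that looking and not-looking trials are equally frequent:

```latex
P(\text{hit} \mid \text{looking}) = b, \qquad
P(\text{hit} \mid \text{not looking}) = 1 - b, \qquad
P(\text{hit}) = \tfrac{1}{2}\,b + \tfrac{1}{2}\,(1 - b) = \tfrac{1}{2}
```

With b greater than 1/2, such a bias raises the hit rate in looking trials and lowers it by the same amount in not-looking trials, while the overall hit rate remains at the chance level.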

Perhaps the greatest defect in the design of this experiment was that participants submitted their scores only after they had completed all 20 trials. This means that incomplete sets of data were not recorded; also, some people may not have submitted complete sets for one reason or another. There is no way of knowing how much unsubmitted data there was, or whether scores in unsubmitted tests were higher or lower than in the submitted data. In future online staring tests, the programming should be done in such a way that scores are entered into the computer's database trial by trial, so that no selective reporting is possible. This procedure has already been adopted in other online experiments. In an online telepathy test conducted through RS's web site, the hit rate in incomplete tests was considerably higher than in complete tests (Sheldrake & Lambert, 2007).
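
Trial-by-trial recording of the kind recommended here could be sketched as follows. This is an illustration under stated assumptions rather than a description of any existing system: the line-delimited JSON log, the field names and the function names are all hypothetical.

```python
import json
import os
import tempfile
import time

def record_trial(path, test_id, trial_no, instruction, correct):
    """Append one trial to a line-delimited JSON log as soon as it is scored."""
    entry = {
        "test_id": test_id,
        "trial": trial_no,
        "instruction": instruction,   # "stare" or "do not stare"
        "correct": correct,           # True for a hit, False for a miss
        "timestamp": time.time(),     # time at which the trial was recorded
    }
    with open(path, "a", encoding="utf-8") as log:
        log.write(json.dumps(entry) + "\n")

def load_trials(path):
    """Read back every recorded trial, including those from incomplete tests."""
    with open(path, encoding="utf-8") as log:
        return [json.loads(line) for line in log if line.strip()]

# Usage: a test abandoned after two trials still leaves two rows in the log.
log_path = os.path.join(tempfile.mkdtemp(), "trials.jsonl")
record_trial(log_path, "demo-test", 1, "stare", True)
record_trial(log_path, "demo-test", 2, "do not stare", False)
print(len(load_trials(log_path)))
```

Because each trial is written as soon as it is scored, the results of abandoned tests remain available for comparison with those of completed tests.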

Nevertheless, in spite of these limitations, the data show a number of patterns that are unlikely to be the result of cheating or selective reporting of the data, and may be of relevance for future research on this topic. There were similar hit rates with male and female starees, and with male and female starers (Table 3). The highest hit rates occurred between siblings and in parent/child pairs (Table 5), particularly with parents as starees. This is in agreement with a general tendency for emotionally close or closely related people to score higher in ganzfeld tests (Broughton & Alexander, 1997) and staring tests (Sheldrake, 2001a). In general there was little variation in sensitivity with age (Table 4). Hit rates were reduced by the use of blindfolds, but not significantly so (Table 6).

Feedback significantly increased hit rates (Table 6). This could imply that feedback was enabling starees to learn to detect subtle cues, such as sounds made by the starer. If so, there should have been an increasing hit rate as the test proceeded. But there was no such change (Table 7): feedback enhanced the hit rate right from the outset. This may have been simply because it made the test more interesting to do, and more engaging for the participants.

In tests with no feedback and with blindfolds the hit rate was 1879 out of 3700 (50.8%; NS), and by the sign method 85+ 62- 38= (p=0.03). This low hit rate could indicate that under "optimal" conditions the elimination of possible artefacts reduced sensitivity to very low levels, or else it might show that being blindfolded and receiving no feedback made subjects uncomfortable, or bored, and reduced their motivation and/or attention.

In conclusion, this online test enabled the effects of a range of variables to be compared, and made it clear that in future automated online tests it will be important to have a built-in sound signal, and also to collect data trial by trial. One advantage of trial-by-trial collection would be that the number of seconds taken for each trial could be monitored, in order to investigate whether hit rates were higher or lower with fast as opposed to slow responses. Also, in future tests it would be good to record the time at which the tests were done, so that the possible effects of variables like local sidereal time could be explored. The present system records only the date of the tests.

Although unsupervised online tests have the inherent disadvantage that they cannot eliminate possible cheating, they have the advantage that they permit the effects of a range of variables to be investigated with a wider variety of participants than is possible in a laboratory setting. Improved automated online procedures could also be used under supervised conditions in schools and colleges, and would provide a good basis for practical classes in which students could explore the sense of being stared at for themselves, and at the same time learn about the importance of statistical methodology and controlled experimentation.

Acknowledgements

We are grateful to all those who took part in these tests, and to John Caton for maintaining the web site (www.sheldrake.org) on which the test was hosted. Rupert Sheldrake thanks the Institute of Noetic Sciences, Petaluma, California, Mr. Addison Fischer, of Naples, Florida, and the Perrott-Warrick Fund, administered by Trinity College, Cambridge, for financial support.

Rupert Sheldrake,
Perrott-Warrick Project,
20 Willow Road,
London NW3 1TJ, UK

Charles Overby,
Halvemaansteeg 12,
1017 CR Amsterdam,
Netherlands

Ashwin Beeharee,
Department of Computer Science,
University College,
London WC1A 6BT, UK

References

Broughton, R. & Alexander, C.H. (1997) Autoganzfeld II: An attempted replication of the PRL ganzfeld research. Journal of Parapsychology 61, 209-226.

Carpenter, R.H.S. (2005) Does scopesthesia imply extramission? Journal of Consciousness Studies 12, 76-77.

Colwell, J., Schröder, S., & Sladen, D. (2000) The ability to detect unseen staring: A literature review and empirical tests. British Journal of Psychology 91, 71-85.

Marks, D., & Colwell, J. (2000) The psychic staring effect: An artifact of pseudo randomization. Skeptical Inquirer (September/October), 41-49.

Marks, D. (2003) What are we to make of exceptional experience? Part 3: unseen staring detection and ESP in pets. The Skeptic 16, 8-12.

Radin, D. (2005) The sense of being stared at: a preliminary meta-analysis. Journal of Consciousness Studies, 12, 95-100.

Schmidt, S. (2001) Empirische Testung der Theorie der morphischen Resonanz - Können wir entdecken wenn wir angeblickt werden? Forschende Komplementärmedizin 8, 48-50.

Sheldrake, R. (1998) The sense of being stared at: experiments in schools. Journal of the Society for Psychical Research 62, 311-323.

Sheldrake, R. (1999) The 'sense of being stared at' confirmed by simple experiments. Biology Forum 92, 53-76.

Sheldrake, R. (2000) The 'sense of being stared at' does not depend on known sensory clues. Biology Forum 93, 209-224.

Sheldrake, R. (2001a) Experiments on the sense of being stared at: The elimination of possible artefacts. Journal of the Society for Psychical Research 65, 122-137.

Sheldrake, R. (2001b) Research on the sense of being stared at. Skeptical Inquirer, March/April, 58-61.

Sheldrake, R. (2003a) The Sense of Being Stared At, And Other Aspects of the Extended Mind. London: Hutchinson.

Sheldrake, R. (2003b) The need for open-minded scepticism: A reply to David Marks. The Skeptic 16, 8-13.

Sheldrake, R. (2005) The sense of being stared at: Part 1: Is it real or illusory? Journal of Consciousness Studies 12, 10-31.

Sheldrake, R. & Lambert, M. (2007) An automated online telepathy test. Journal of Scientific Exploration 21, 511-522.