Kiwi Particle Physicist

June 25, 2006

B-Lab and Open Access to Data

JoAnne over at Cosmic Variance has an interesting post on whether our data should be made public. Apparently at SUSY06 Tao Han proposed that “the LHC data should be made available to the community.” Currently most particle physics experiments do not publicly release their raw or processed data, or release only part of it.

At Belle we make 1/nb of data available on request (Japanese only) for public outreach. Seeing as this corresponds to less than 1/600,000,000 of the total data, it is unlikely that anything will be found that hasn’t already been seen by the collaboration in the full data set, and we don’t really have to worry about other people going off and writing papers on our data.
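As a rough check of that fraction, here is a minimal Python sketch. It assumes 1/nb means one inverse nanobarn of integrated luminosity, and it takes a total of 600/fb, which is only inferred from the 1/600,000,000 figure above and is not an official Belle number. The function name is made up for illustration.

```python
# Unit conversion: 1 inverse femtobarn = 10^6 inverse nanobarns.
NB_PER_FB = 1_000_000


def released_fraction(released_nb: float, total_fb: float) -> float:
    """Fraction of the full data set that is publicly released.

    released_nb: released integrated luminosity in inverse nanobarns.
    total_fb:    total integrated luminosity in inverse femtobarns.
    """
    return released_nb / (total_fb * NB_PER_FB)


if __name__ == "__main__":
    # ASSUMPTION: 600/fb total, inferred from the ratio quoted in the post.
    fraction = released_fraction(released_nb=1.0, total_fb=600.0)
    print(f"released fraction: 1 in {1.0 / fraction:,.0f}")
```

Any total larger than 600/fb makes the released fraction smaller still, which is why the figure in the text is an upper bound.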

As far as I’m aware the data consists of the same quantities we use to do physics analyses: 4-vectors of the charged tracks, neutral pions and photons. I assume it also includes PID information on the charged tracks, so you can see the probability of a given track being an electron, muon, pion, kaon or proton.

The data is mainly used by high school students to search for new resonances. For example, they might combine two proton tracks to search for a doubly charged six-quark state. One small problem is that somebody actually found one. Apparently claims for the discovery of violation of conservation of energy or electric charge are quite common as well. Of course, when you consider the possibility of particle misidentification and the limited acceptance of the detector, these are easy enough to explain. I think it’s possible for someone to find just about anything they want to look for if they misinterpret the data correctly enough. That would obviously mean credibility issues for results published by anyone not on the collaboration.
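To make the "combine two proton tracks" step concrete, here is a minimal Python sketch of an invariant-mass calculation from 4-vectors. It is not the Belle outreach software. The function names and the (E, px, py, pz) tuple layout are assumptions made for illustration, with units of GeV and c = 1.

```python
import math
from itertools import combinations

PROTON_MASS = 0.938272  # GeV/c^2
PION_MASS = 0.139570    # GeV/c^2, charged pion


def four_vector(px, py, pz, mass):
    """Build (E, px, py, pz) from a momentum and a mass hypothesis."""
    energy = math.sqrt(px * px + py * py + pz * pz + mass * mass)
    return (energy, px, py, pz)


def invariant_mass(*particles):
    """Invariant mass of the summed 4-vectors: sqrt(E^2 - |p|^2)."""
    e = sum(p[0] for p in particles)
    px = sum(p[1] for p in particles)
    py = sum(p[2] for p in particles)
    pz = sum(p[3] for p in particles)
    m2 = e * e - px * px - py * py - pz * pz
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(m2, 0.0))


def pair_masses(tracks):
    """Invariant mass of every pair of tracks in one event."""
    return [invariant_mass(a, b) for a, b in combinations(tracks, 2)]


if __name__ == "__main__":
    # Two made-up tracks, treated first as protons and then as pions.
    momenta = [(0.5, 0.1, 0.3), (-0.4, 0.2, 0.6)]
    as_protons = [four_vector(*p, PROTON_MASS) for p in momenta]
    as_pions = [four_vector(*p, PION_MASS) for p in momenta]
    print("pp hypothesis:", pair_masses(as_protons))
    print("pi pi hypothesis:", pair_masses(as_pions))
```

The same two momenta give very different masses under the two hypotheses, so a misidentified track moves an entry to a different place in the mass spectrum. A bump search would histogram these pair masses over many events and look for a peak.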

Openly releasing data like this is great for physics outreach, and for getting students interested and involved in the experiment. I don’t think this is what Tao had in mind when he asked for the 4-vectors from the LHC experiments, though. I assume he is asking for all the processed data, so that particle physics theorists and others could do their own analyses.

Given the massive amounts of data involved with Belle or the LHC, though, I can’t see how this could possibly work. Surely the only way would be to cut down the number of events by releasing only selected signal events after an analysis has been performed by the collaboration, which would sort of defeat the point. Having the data out there might improve everyone’s confidence in the experiment, but in practice I just can’t see how anyone could possibly sift through all of it on their own and pick up something missed by the collaboration.

