Huge repositories of data collected by Internet companies are not accessible to outside scientists, leading some to complain that studies based on these data can’t be peer-reviewed.