Lessons from the lab: ALL data matters

Every day, graduate students in science and engineering generate data of varying quality, most of which – especially negative results – are never published. Journal referees and editors are the primary arbiters of what is most interesting or novel to the research community, and the nature of the peer review and journal acceptance process inevitably leads to the exclusion of some potentially valuable results. Excluding a sometimes-significant portion of results from publication is a detriment to researchers and to research progress, because others cannot glean a comprehensive view of all the work done and learn from past mistakes.

Every chemistry graduate student pursuing a PhD must pass a candidacy exam to be considered a PhD candidate. This usually takes the form of a presentation or written paper reviewed by a committee of four or five professors. Passing the exam indicates that the committee has confidence in your abilities and direction, leaving you with the task of making a contribution to science over the next two to four years. My candidacy presentation started at 8am on a Monday. I spent an hour and a half being questioned by my five-professor committee, and was then promptly sent out of the room to allow for deliberation. Twenty long minutes later, I was told that I had conditionally passed my candidacy exam, with an emphasis on conditionally. My committee informed me that the few positive results I had presented were not indicative of two years of work. Years of reading positive results in the literature and watching post-doctoral researchers pump out fantastic results had shown me what was valued, so I had built my presentation around the few positive results I had obtained. In subsequent talks with committee members, I learned that they had expected to see all the negative results I had generated, and how I had overcome experimental hurdles to obtain my few positive results. Had I included a summary of my negative results as well, it could have been a very different exam. This experience changed my perception of the importance of negative results, and of the process by which you learn from them in pursuing positive results. The lesson became even more apparent in the lab when I took on a project that required reproducing previously published work from our research group.

Reproducing results from past scientific publications is a common starting point for many research projects. It provides a basis for comparison and often validates a material or process for further application in the project. As a graduate student in chemistry, I started a solar-to-fuel conversion project by trying to reproduce a seminal paper published in our own research group a decade earlier. The fabrication method involved a number of steps that produced uniquely shaped silver nanowires in an array that held promise as a light-harvesting material. My professors remembered the process as robust, with straightforward methods that should take only a week or two to reproduce and extend to other materials. My initial attempts to produce the structures were unsuccessful. With two undergraduates working alongside me, we spent months varying half a dozen experimental parameters and purchasing fresh precursor materials, and still could not reliably obtain the structures. Even speaking with the first author over the phone didn't solve the problem: he said he didn't remember it being particularly difficult, and that there wasn't any trick to consistently producing the structures. What did become clear was that hundreds of samples had been produced in the course of perfecting the process before the results were published. Ultimately, the answer turned out to be a longer aging step than the one reported in the publication. In the end, reproducing the work cost hundreds of lab hours and thousands of dollars in microscopy characterization time, most of it spent figuring out how each parameter in the process explicitly impacted the structures. If only I could have seen the data from the hundreds of samples analyzed in producing the original work, I might have gleaned valuable insights into which variables to modify.

All data matters: it is an essential part of the research process and should be accessible to anyone viewing a published work. In the early days of science, researchers came together to dispute and validate the claims made by others in the field. Today, the digital revolution makes it easier than ever to communicate, organize, and access data, leaving perception as the biggest barrier to change. Here at Citrine, we care deeply about transparency through research data, and provide a platform to store, organize, and access all the results generated in producing great research.