Confessions of a Heavy User
Dr. Maria Mills, a postdoc in NHLBI’s Laboratory of Single Molecule Biophysics, and her colleagues in senior investigator Dr. Keir Neuman’s group are looking at how a protein interacts with a small molecule.
“We wanted to see whether this molecule changes the structure and dynamics of the protein,” she explained. “We do an atomistic complex simulation where every part of the protein and every part of the molecule—the physical equation of them—are all explicitly calculated…It’s very computationally expensive. It’s a massive amount of information that’s being calculated very quickly over and over again.”
Earlier this year, CIO Alastair Thomson, one of the folks who manage IT resources at NHLBI, approached Mills with an idea for a 2-month pilot project comparing cloud computing with Biowulf, a 20,000+ processor Linux cluster in NIH’s high-performance computing facility.
“They asked me to conduct a head-to-head comparison of my computational work on Biowulf and on the cloud to see if there were any advantages in using the cloud,” she said. “In terms of computational time, the performance was similar, but the cloud did save me time.”
With the cloud, she said, “you design a computer system that you want at the time and it’s immediately available to you. With Biowulf, we have a huge group of people sharing processors on a certain architecture and you have to wait until the processors that you need are free. Some days it’s quick. Some days you might be waiting a day or 2 days before your job’s run. You just have to wait it out. And since I tend to use a lot of processors for what I do—256 to 512 processors—sometimes I wait quite a while. And honestly, I sometimes feel like I’m hogging [the processors] a bit, because my jobs can take 60 hours.”
The work Mills is doing is considered basic research without a direct translational application. However, the specific protein system she’s analyzing is “involved in repairing problems that happen when DNA becomes too tangled, leading in humans to severe genetic defects, premature aging and susceptibility to cancer,” she said. “We’re not specifically looking for cures or treatment. We’re just trying to understand the system and maybe down the line people can use the information to help people who have these disorders.”
The lab uses what’s called a “magnetic tweezers manipulation technique,” Mills said. “We take a small piece of DNA, attach a magnetic bead to it and we can use the tweezers to manipulate the DNA. The proteins [we’re studying] reorganize DNA, unfold it, cut it and wrap it around itself. We can use these tweezers to measure what’s happening to DNA in the presence of these proteins. We can’t really see what’s actually happening to the proteins. That’s where the simulation comes in. It gives us a way to visualize the protein molecules since we can’t watch that directly.
“The magnetic tweezers experiments we’ve done indicate that the molecule we are looking at in the simulations stimulates the protein’s activity,” she said. “Hence our desire to understand how it affects the protein dynamics.”
Mills’ part of the process makes her a heavy processor user, which in turn made her an ideal candidate for testing a third-party IT provider. For the pilot, she worked closely with the cloud provider’s technicians to design the best computer setup and write scripts to make things move faster.
“At first it took some optimizing, but that was just sort of the learning curve,” she concluded. “The cloud was really nice, because it was resources temporarily dedicated to me, so I didn’t have to worry about waiting for them and I didn’t have to worry about other people not having access to resources they needed. It was a virtual computer cluster that I built and I’m using, and when I’m done with it I just shut it off.”
Mills hopes to publish results from her pilot soon.