AHEAD IN THE CLOUD
CIO Thomson Guides NHLBI into Off-Site Computing
Scientific data is not what it used to be. It used to come in megabytes and then gigabytes, but today terabytes and petabytes are becoming the norm and innovations in science seem to consume more bytes every day. That’s one reason Alastair Thomson, NHLBI chief information officer, and his NIH colleagues began looking to the cloud for IT resources.
“I remember back around 1992 or so I was ordering 1-gig servers and I thought, ‘How in the world will anyone ever use that much space?’” Thomson recalled. “Now we’ve got a new microscope going into Bldg. 14F that uses a massive amount of image data. We’re talking way past gigabytes or even terabytes these days.”
As the technology grows more sophisticated, so do the resources required to keep pace with it.
Cloud computing—paying a third party to handle IT infrastructure off-site—has been around awhile, nearly two decades according to some estimates. In recent years, however, those in charge of providing IT resources for large organizations have increasingly looked off-site for hardware, storage capacity, processing speed and power, and the electricity and cooling needed to run it all.
“You’re only paying for what you actually need,” Thomson explained. “There’s a big advantage in that the cloud is elastic. You use only what you need when you need it. You don’t have to keep paying for energy and resources you’re not using. It helps drive down the costs so we can invest more in the really valuable resource—people’s brains.”
For its first forays into the cloud, NHLBI tested the waters with what Thomson called “mundane uses, internal applications” like the institute’s web site. Its move to the cloud was virtually seamless.
“When we relocated the web site, we called it the biggest change you didn’t see,” Thomson recalled. “It was completely transparent to outside users.”
One use of the cloud seemed to give rise to the next and the next. Thomson immediately cites two successes in scientific research. “Once developers of image reconstruction software called the ‘Gadgetron’ began tapping the cloud’s power [see Software Forecast Bright with the Cloud], it became clear that there was real potential for new science using the cloud,” he pointed out.
Another application—involving big data—recently finished a pilot testing the cloud against traditional onsite computing [see Confessions of a Heavy User].
Thanks to the success of the Gadgetron and similar projects, when investigators approach him with requests for more IT resources, Thomson now routinely steers them toward the cloud to meet their scientific needs.
There When You Need It
Thomson and his colleagues also found the cloud to be a good alternative storage space for infrequently used data.
Once an investigator concludes a research study and the data from it is published, he explained, that PI most likely moves on to the next project and may not revisit the original data for months or years, if ever. Still, the now-dormant data must be properly stored for future reference, taking up valuable and expensive server capacity.
“We found that 80 percent of some of our data had not been accessed in 2 years,” Thomson said. “So we will put it on the cloud, at half the cost of in-house storage. Savings like that are important to us.”
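The savings Thomson describes can be sketched with simple arithmetic. The 80 percent “cold data” figure and the half-price cloud rate come from his remarks; the archive size and per-terabyte price below are purely illustrative assumptions.

```python
# Back-of-the-envelope storage-tiering savings.
# From the article: 80% of the data is "cold" (untouched for 2 years),
# and cloud storage runs about half the cost of in-house storage.
# TOTAL_TB and IN_HOUSE_PER_TB are hypothetical numbers for illustration.

TOTAL_TB = 1000                      # assumed archive size, in terabytes
COLD_FRACTION = 0.80                 # share not accessed in 2 years (article)
IN_HOUSE_PER_TB = 100.0              # assumed in-house cost, $/TB/month
CLOUD_PER_TB = IN_HOUSE_PER_TB / 2   # "half the cost" (article)

cold_tb = TOTAL_TB * COLD_FRACTION
before = TOTAL_TB * IN_HOUSE_PER_TB
after = (TOTAL_TB - cold_tb) * IN_HOUSE_PER_TB + cold_tb * CLOUD_PER_TB

print(f"Monthly cost, everything in-house: ${before:,.0f}")
print(f"Monthly cost after moving cold data: ${after:,.0f}")
print(f"Savings: {100 * (before - after) / before:.0f}%")
```

Under these assumptions, halving the storage price on 80 percent of the data cuts the total bill by 40 percent, which is the shape of the saving Thomson is pointing to.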
Soon some of NHLBI’s in-house applications, such as the Tracking system for Requests And Correspondence (TRAC), currently running in the institute’s data center, will head to the cloud, where they will benefit from high reliability and support for NHLBI’s continuity of operations plan.
That’s not to say the move to off-site computing happened for Thomson overnight or without worry. The security of IT data, for example, was a big consideration. “That’s one reason I was hesitant to drive too fast too early,” he said.
However, FedRAMP, the Federal Risk and Authorization Management Program, relieved much of the worry. The government-wide program, run by GSA, keeps a list of about two dozen cloud providers that have been vetted and deemed safe for use by federal agencies. Big-name IT companies such as Amazon, Google and Microsoft all have cloud products on FedRAMP’s list and are therefore well-equipped to tackle the potential headaches—power outages or hack attempts, for example—that come with maintaining a large computing facility.
FedRAMP is not the whole story, though; NHLBI’s IT security team conducts its own assessment and authorization process to ensure that each cloud service meets NIH requirements.
“What we learned is, with the phenomenal amount of money cloud providers spend,” Thomson said, “there’s no way NIH could afford the same level of security they already have in place.”
Coming up next, Thomson and other agency CIOs will share their initial experiences with cloud computing at the NIH Research Festival.
Also on the horizon are more collaborations among ICs and additional pilots with other intramural researchers on such big data topics as genomics, connecting datasets from large populations and sharing such resources with scientists around the world.
“Everyone is moving this way at some speed,” Thomson said. “Everyone is at least considering it.”
Software Forecast Bright with the Cloud
A few years ago, Dr. Michael Hansen and his NHLBI research group developed a new magnetic resonance imaging (MRI) framework called the Gadgetron. The software takes a scan’s raw signal data and quickly reconstructs it into images clinicians can review to diagnose disease.
MRI is a relatively slow and motion-sensitive technique. Patients have to hold their breath, for example, and there’s a lot of waiting to see whether the image came out clearly or the procedure has to be repeated.
The Gadgetron software, however, shortens the wait time considerably, and patients—oftentimes children—no longer have to hold their breath during the MRI scan.
Researchers at NIH, Children’s National Medical Center in Washington, D.C., and around the world routinely use Hansen’s software framework, which is available as open source software, capable of running on a wide variety of platforms.
Putting the Gadgetron to work, however, consumes a lot of IT resources, including up to 50 high-powered computers, Hansen said. The on-demand power offered through remote computing is what attracts the users of the framework.
“From a research perspective, where regular [usage rates] may change, the agility of the cloud is really valuable,” said Hansen, chief of the institute’s Laboratory of Imaging Technology and leader of a 5-person team of software engineers.
He recalled two “Aha!” moments in considering the cloud: one, when his team realized the extreme amounts of computational power they would need to run the Gadgetron, and two, when he weighed the feasibility, cost and time of assembling his own data center on campus.
“The ability to change directions quickly has real scientific value,” noted Hansen, who latched onto the cloud idea early.
“The strength of the cloud is not in our ability to transfer lots of data there,” he explained, “it is the amount of computing power available. In the case of MRI, a typical clinical scan may be just a few gigabytes, but the processing time is long and that is where cloud computing is useful.”
Applications like the Gadgetron could potentially be deployed in a more traditional cluster-computing facility, but users would have to reserve time and wait for their reconstruction jobs to complete. In a clinical environment, such a deployment strategy is not feasible.
“The bottom line is this,” said Hansen, “when you have the patient on the scanner you cannot wait for a batch job to complete. You need dedicated resources to process the data as the scan is taking place. That is costly since you would need a large amount of computing power per scanner. A cloud deployment allows us to both scale flexibly when there is demand—pay for what we need—and share resources easily with multiple scanners/sites.”
In the context of the total length of a patient study, he explained, the costs associated with clinical staff, nurses and anesthesiologists become an important part of the equation.
“Say a given scan sequence based on breath-holding takes 10 minutes,” Hansen pointed out. “With free-breathing scanning and Gadgetron reconstruction, we might be able to turn that into a 5-minute sequence. The cloud computing cost is, say, $10 per hour while we do the scanning, but the cost of just 5 minutes of extra scan time is much more than that. So one can make an argument that cloud computing is also cost-effective.”
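Hansen’s argument can be made concrete with his own numbers. The 10-minute versus 5-minute sequences and the $10/hour cloud rate come from his quote; the hourly cost of the scanner room and clinical staff is an assumed figure for illustration.

```python
# Hansen's cost argument, sketched with his numbers plus one assumption.
# From the article: a 10-minute breath-hold sequence becomes a 5-minute
# free-breathing sequence, with cloud compute at about $10/hour during
# the scan. The scanner-room rate is an illustrative assumption covering
# the scanner plus clinical staff, nurses and anesthesiologists.

CLOUD_RATE_PER_HR = 10.0        # from the article
SCAN_ROOM_RATE_PER_HR = 600.0   # assumed combined scanner/staff rate, $/hr

old_minutes, new_minutes = 10, 5
cloud_cost = CLOUD_RATE_PER_HR * (new_minutes / 60)
time_saved_value = SCAN_ROOM_RATE_PER_HR * ((old_minutes - new_minutes) / 60)

print(f"Cloud compute for the shorter scan: ${cloud_cost:.2f}")
print(f"Value of the 5 minutes of scanner time saved: ${time_saved_value:.2f}")
```

Under these assumptions the compute bill for the scan is well under a dollar, while the scanner time it frees up is worth tens of dollars, which is the trade-off Hansen describes.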
Combining scientific innovation with IT cost efficiency is what NHLBI CIO Alastair Thomson and his fellow CIOs want to provide for all NIH researchers.
“That’s very much how I see my role here,” Thomson said. “I try to give the PIs the tools and environment they need and get out of their way.”
Satisfied that Gadgetron users won’t realize the work is being done on the cloud, Hansen’s group now is fine-tuning the application and looking toward expanding availability.
“We hope to scale up and support the project to a stage where vendors pick it up,” he said.
Confessions of a Heavy User
Dr. Maria Mills, a postdoc in NHLBI’s Laboratory of Single Molecule Biophysics, and her colleagues in senior investigator Dr. Keir Neuman’s group, are looking at how a protein interacts with a small molecule.
“We wanted to see whether this molecule changes the structure and dynamics of the protein,” she explained. “We do an atomistic complex simulation where every part of the protein and every part of the molecule—the physical equation of them—are all explicitly calculated…It’s very computationally expensive. It’s a massive amount of information that’s being calculated very quickly over and over again.”
Earlier this year, NHLBI CIO Alastair Thomson approached Mills with an idea for a 2-month pilot comparing the cloud with Biowulf, a 20,000+ processor Linux cluster in NIH’s high-performance computing facility.
“They asked me to conduct a head-to-head comparison of my computational work on Biowulf and on the cloud to see if there were any advantages in using the cloud,” she said. “In terms of computational time, the performance was similar, but the cloud did save me time.”
With the cloud, she said, “you design a computer system that you want at the time and it’s immediately available to you. With Biowulf, we have a huge group of people sharing processors on a certain architecture and you have to wait until the processors that you need are free. Some days it’s quick. Some days you might be waiting a day or 2 days before your job’s run. You just have to wait it out. And since I tend to use a lot of processors for what I do—256 to 512 processors—sometimes I wait quite a while. And honestly, I sometimes feel like I’m hogging [the processors] a bit, because my jobs can take 60 hours.”
The work Mills is doing is considered basic research without a direct translational application. However, the specific protein system she’s analyzing is “involved in repairing problems that happen when DNA becomes too tangled, leading in humans to severe genetic defects, premature aging and susceptibility to cancer,” she said. “We’re not specifically looking for cures or treatment. We’re just trying to understand the system and maybe down the line people can use the information to help people who have these disorders.”
The lab uses what’s called a “magnetic tweezers manipulation technique,” Mills said. “We take a small piece of DNA, attach a magnetic bead to it and we can use the tweezers to manipulate the DNA. The proteins [we’re studying] reorganize DNA, unfold it, cut it and wrap it around itself. We can use these tweezers to measure what’s happening to DNA in the presence of these proteins. We can’t really see what’s actually happening to the proteins. That’s where the simulation comes in. It gives us a way to visualize the protein molecules since we can’t watch that directly.
“The magnetic tweezers experiments we’ve done indicate that the molecule we are looking at in the simulations stimulates the protein’s activity,” she said. “Hence our desire to understand how it affects the protein dynamics.”
Mills’ part of the process makes her a heavy processor user, which in turn made her an ideal candidate for testing a third-party IT provider. For the pilot, she worked closely with cloud provider technicians to design the best computer structure and write scripts to make things move faster.
“At first it took some optimizing, but that was just sort of the learning curve,” she concluded. “The cloud was really nice, because it was resources temporarily dedicated to me, so I didn’t have to worry about waiting for them and I didn’t have to worry about other people not having access to resources they needed. It was a virtual computer cluster that I built and I’m using, and when I’m done with it I just shut it off.”
Mills hopes to publish results from her pilot soon.