22.03.2017

Earthquake in the cloud

An earthquake alert centre in California creates computational risk and threat models at tens of thousands of geophysical grid points. To examine all possible earth movements along the San Andreas fault, millions of calculations are needed for each of these points. The calculations are carried out by virtual machines in so-called clouds. Polish researchers are involved in work on the optimal use of clouds in science.

In a project funded by the National Science Centre, Dr. Maciej Malawski from AGH University of Science and Technology, in collaboration with the University of Southern California in Los Angeles and the University of Notre Dame in Indiana, investigates the use of cloud computing for, among other applications, modelling seismically active areas.

The project leader explained that clouds provide virtual machines with enormous computing power. Their capabilities are the result of combining thousands of computers around the world, including supercomputers occupying large server rooms, into a single infrastructure.

The problem with the optimisation of calculations sent "to the cloud" is an economic dilemma. Both companies and groups of researchers have to consider whether they want to distribute tasks among multiple processors and get the results faster by paying more, or whether they care less about time and more about staying within the research project's budget. The appropriate distribution of tasks is a rather complicated issue, which is why, in large research projects, including geological ones, computer scientists work alongside the domain scientists. That is also the role of Dr. Malawski.

To assess the risk of an earthquake at a given point, seismologists create maps on which, through geophysical simulations of crustal movement, they can estimate the seismic wave at that point. The map of Southern California must consist of points spaced on a grid. Imagine that the grid has dimensions of 100 by 100, so the lines intersect at 10 thousand points. For each of these points, the risk from different earthquakes has to be calculated - Dr. Malawski told PAP.

But that's not all, Dr. Malawski emphasised, because these earthquakes may originate from different places where continental plates collide. In California there is the San Andreas tectonic fault, and at different points of the fault a sudden movement may occur, which in turn can lead to earthquakes. Accordingly, there are several thousand possible earthquake sources along the fault. For each point on the map, scientists must calculate the impact of the various possible earthquakes along the fault. That means 10 thousand points times several thousand possible scenarios - more than a million cases to calculate.
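As a rough illustration of that scale (the article only says "several thousand" scenarios, so 5,000 is used here as a stand-in), the arithmetic can be sketched in Python:

    # Rough scale estimate for the hazard calculation described above.
    grid_points = 100 * 100          # the 100 x 100 grid: 10,000 map points
    scenarios_per_point = 5_000      # assumed; the article says "several thousand"
    total_cases = grid_points * scenarios_per_point
    print(f"{total_cases:,} cases to compute")  # 50,000,000 - well over a million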

"Computing architectures such as clouds are very well suited for this. They allow to quickly access a large number of processors. Virtual machines are created on demand, and each of them can perform its share of calculations. Large calculation can be easily divided into pieces. The problem is, how to divided such a calculation into smaller portions, so that it can be calculated in a distributed architecture. And so that it can be calculated quickly and inexpensively at the same time" - said Dr. Malawski. His tasks in the project include the development of software enables such outsourcing of calculations.

Cloud infrastructures for researchers are being created throughout the world. In Poland, the PL Grid system links several clusters located in large cities, forming a computing infrastructure available to researchers. Optimising the calculations requires dividing them into smaller pieces. But the biggest computing clouds are provided by commercial suppliers, such as Google and Amazon.

"We are dealing with the cost factor, we can buy access to the processor by paying for the time infrastructure use. If we pay 1 cent per hour of CPU usage, and if we need thousands of processors for many hours, then the amounts get more substantial. The faster we want our task calculated, the more expensive it gets" - explained the computer scientist.

He added that if the experts split the problem into pieces that are too small, the coordination and synchronisation of the tasks will take a lot of time and work. It may therefore turn out that the result is not obtained much faster than if the tasks had been grouped into larger "portions".
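A toy model of this effect (all numbers are hypothetical, not from the project) shows why ever-finer splitting stops paying off once per-piece coordination overhead dominates:

    # Toy model: the parallel part shrinks as the work is split further,
    # but coordination cost grows with the number of pieces.
    def makespan(total_work, n_pieces, per_piece_overhead=0.2):
        return total_work / n_pieces + per_piece_overhead * n_pieces

    for n in (10, 100, 1_000):
        print(n, round(makespan(1_000.0, n), 1))
    # 10 -> 102.0, 100 -> 30.0, 1000 -> 201.0: beyond a point, splitting hurts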

"This is the basic problem of parallel programming - if I divide something into two pieces and calculate on two processors, I want it calculated two times faster than on one processor. But sometimes it turns out that the process slows down or calculates only 1.5 times faster. So is it cost-effective to pay twice for 10, 20 percent higher speed? These are the dilemmas of multi-criteria optimisation: what is more important: time - because, for example, we want to publish first or warn against mortal danger, or perhaps prudent management of the project budget" - concluded Dr. Malawski.

Optimisation of calculations is also possible in other fields of science, for example in DNA sequencing or the search for extrasolar planets.

PAP - Science and Scholarship in Poland, Karolina Duszczyk

kol/ mrt/ kap/

tr. RL

