Introduction
Sphinx is a scheduling middleware for data-intensive applications on a dynamically changing grid. It is developed under the auspices of the GriPhyN (Grid Physics Network) project. One important aspect of Sphinx is efficient data management for an optimal overall workflow. The dynamic nature of resources and jobs makes this a significant challenge. I am developing an adaptive data management component (DMC) on top of existing middleware to achieve efficient data management, including optimal data transfer, replica management and data transfer prediction. More details are available on the Sphinx web site.
Architecture
Supercomputing Conference (SC'03) Demo
At the Supercomputing conference, we demonstrated a collection of distributed services including the Sphinx scheduling service, Clarens, Chimera and ROOT. The goals of the demo were:
- Prototype a vertically integrated system that provides a transparent, seamless experience to the user.
- Distribute grid services using a uniform web service. We used the Clarens web service, developed by Caltech, to integrate the distributed services.
- Investigate request scheduling in a resource-limited and dynamic environment
- Investigate interactive vs. scheduled data analysis on a grid
In the demo scenario, we showed how a physicist can analyze data using the combined power of all the distributed services. The scenario involved data discovery, data analysis, interactive workflow generation, workflow scheduling and collaborative analysis.