Researchers often develop their own tools to facilitate their work and support the publication process. In very rare cases, some of them make the additional effort to share these tools with their whole research group or even with researchers at other institutions. In most cases, however, these tools evolve organically over time, following the immediate needs of the researcher. Questions of data management and sustainable data preservation are usually left for the future.
In this talk I will present the challenges of joining such a research group, BRIDGE, and slowly moving from an ad hoc environment to a largely managed, full-stack system. Our research involves modelling past, present and future climate, generating many TBs of data each week that require storage, analysis, visualisation, availability for reuse, and eventual archiving. Working towards a managed, sustainable outcome required close integration with University systems.
Through effective collaboration, we built a home-grown processing suite used to manage, process, analyse and share data. It is capable of scaling up to our full dataset, which is likely to exceed petabyte scale in the next few years. The suite is also used for teaching purposes.