Enjoy the poster session with complimentary drinks and nibbles in the Great Hall. A bar will also be available in the Aston Webb Foyer, accepting cash and cards.
N8 CIR is a new initiative designed to enable high-impact research across the eight northern universities in the N8 group (Leeds, Manchester, Durham, Sheffield, Liverpool, Newcastle, York, Lancaster). The centre of excellence in Computationally Intensive Research is working to identify "bottlenecks" in the research process where RSEs could use their skills, knowledge and expertise to help facilitate a wide range of research projects. Once identified, the N8 CIR will coordinate RSE effort to relieve these bottlenecks. This poster will outline the community that has been built, describe the initial work done and reflect on lessons learnt.
Model Domain Expert, Centre for Environmental Modelling and Computation, University of Leeds
I am an atmospheric scientist and RSE, specialising in helping people use weather and climate models for research into climate change and atmospheric pollution. I work with a domain-specific technical support group called the Centre for Environmental Modelling And Computation (https://www.cemac.leeds.ac.uk).
DYNAMO (Dynamic Analysis Modelling and Optimisation of GDI engines) is an R&D project co-sponsored by the Advanced Propulsion Centre (APC6 call), led by Ford Motor Company in collaboration with six other UK-based partners: Loughborough University, Bath University, Siemens CDA, the Hartree Centre, Cambustion and DE&TC. The project aims to significantly improve the fuel efficiency of two high-volume passenger vehicle powertrains, with the specific intent of simultaneously reducing CO2 and noxious emissions. During the project the team is helping to develop and mature new and upgraded advanced engine technology ready for commercialisation, and aims to revolutionise the process and methodology currently used to design and develop complex powertrains. High Performance Computing (HPC) allows scientists and engineers to solve complex, compute-intensive problems efficiently; the Hartree Centre is supporting the project with HPC facilities and with optimising the end results.
Edinburgh Genomics runs regular bioinformatics training courses for around twenty students at a time, often working on complex bioinformatics pipelines which require specialist software and reference data. In order to provide all participants with a full custom Linux desktop environment we now use AWS EC2 hosted instances running Ubuntu, XFCE4 and TigerVNC. After solving some initial problems and automating the setup and teardown processes we have arrived at a solution that has proved very effective and flexible. Here we share our practical experiences, lessons learned and several useful scripts.
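The launch side of such a setup can be sketched in a few lines. The Python fragment below is a minimal illustration only, assuming the AWS CLI is installed; the AMI id and key name are placeholder values, not Edinburgh Genomics' actual configuration or scripts.

```python
import subprocess

def launch_training_instances(count, ami_id, instance_type="t3.large",
                              key_name="training-key", dry_run=True):
    """Build (and optionally run) an AWS CLI call that launches `count`
    identical instances for a training course. All names here are
    illustrative placeholders."""
    cmd = [
        "aws", "ec2", "run-instances",
        "--image-id", ami_id,
        "--count", str(count),
        "--instance-type", instance_type,
        "--key-name", key_name,
    ]
    if dry_run:
        return cmd  # inspect the command without touching AWS
    return subprocess.run(cmd, check=True)
```

A matching teardown script would terminate the instances by id once the course ends, which is what makes the per-course cost predictable.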
The Hartree Centre is developing a Virtual Wind Tunnel (VWT) as a more cost-effective and time-saving alternative to an expensive physical wind tunnel. The VWT is a simulation that applies computational fluid dynamics (CFD) to a virtual design, replicating early wind tunnel tests so that the customer can proceed to a later stage in the optimisation process before physical wind tunnel tests are needed. The Virtual Wind Tunnel is designed to bring the power of High-Performance Computing to non-HPC experts. A comprehensive workflow has been developed to build the wind tunnel environment, automate the domain decomposition, produce an automatic mesh from a 3D model file (.obj / .stl), automatically configure the CFD engine, and submit the job onto Scafell Pike. This condenses more than 20 complex command lines into one simple command, removing the complexity of a virtual wind tunnel simulation for a non-technical user while still running the same high-quality simulation.
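The "one command" idea can be illustrated with a toy wrapper. Everything below is hypothetical: the stage names mirror the workflow described in the abstract, but the real VWT interface and internals are not public, so this is only a sketch of the pattern.

```python
def run_virtual_wind_tunnel(model_path):
    """Hypothetical single entry point: each step stands in for one stage
    of the workflow (environment, meshing, decomposition, CFD config,
    submission); the real tool would invoke the actual stage here."""
    steps = [
        "build wind tunnel environment",
        "generate mesh from " + model_path,
        "decompose domain",
        "configure CFD engine",
        "submit job to Scafell Pike",
    ]
    log = []
    for step in steps:
        log.append("[vwt] " + step)
    return log
```

The point of the pattern is that the user supplies only the model file; every other decision is encoded once, by an expert, inside the wrapper.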
I am an RSE working at STFC's supercomputing centre, the Hartree Centre. I am interested in automated workflows, cloud computing, HPC, best practices, and industry collaboration.
The Square Kilometre Array (SKA) project is an international effort to build the world’s largest radio telescope. Unlike traditional radio telescopes, the SKA will be defined largely by its software. Production is expected to start next year and will involve well over 100 developers working together to deliver a complex system consisting of everything from control systems to high-throughput signal processing to science pipelines running on cloud-like infrastructure. The challenge of managing and organising this effort to a single purpose, and delivering the telescope on time and on budget, is considerable. To make this achievable, the SKA has adopted the Scaled Agile Framework and is currently working towards putting it into practice. While more commonly used for large software projects in industry, this is a rather novel approach for the astronomical community. In this poster we will introduce the approach being taken and the tools being used to implement it for the SKA project.
Zacros (http://zacros.org) is a Graph-Theoretical Kinetic Monte Carlo software application for simulating molecular phenomena on catalytic surfaces. The implemented method for listing and randomly choosing a process is based on the idea that the next event to occur must be the one with the smallest waiting time. The main task is therefore reduced to creating a “catalogue” containing the waiting times of all realizable events, finding their minimum and updating the time values of the processes involved. A skip-list-based data structure was implemented in Zacros and further extended to provide an almost constant-time removal operation. Benchmarks have shown that the new algorithm outperforms the existing binary-heap implementation in a special class of problems. The “bottlenecks”, namely the operations that potentially hinder performance, are clearly identified and discussed along with possible performance improvements.
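The first-reaction selection scheme can be sketched with Python's `heapq` standing in for the event catalogue. This illustrates the binary-heap baseline described above, not the skip-list implementation, and the event names and rates are invented for the example.

```python
import heapq
import math
import random

def kmc_step(event_rates, t, rng=random.random):
    """One first-reaction KMC step: draw an exponential waiting time for
    each realizable event, then pick the event with the smallest one.
    The heap plays the role of the waiting-time 'catalogue'."""
    heap = []
    for name, rate in event_rates.items():
        dt = -math.log(rng()) / rate  # exponential waiting time
        heapq.heappush(heap, (dt, name))
    dt, event = heapq.heappop(heap)   # minimum waiting time wins
    return event, t + dt
```

In a real simulation the catalogue is updated incrementally rather than rebuilt each step; it is exactly that removal-and-update traffic that the skip-list variant accelerates.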
Code reviews are used in industry to increase the quality of submitted code. Different tools support the four general review strategies: pre-commit reviews, pull requests (merge requests in GitLab), post-commit reviews, and Gerrit reviews. The specifics of code development can favour one of these strategies, usually pull (or merge) requests. In academia the needs and goals can be very different. Code reviews are rarely used, for a variety of reasons, and code development and workflows depend heavily on academics' varying software engineering capacity. RSEs assisting development through code reviews will find that industry practices don't always apply. Large, established codebases developed by more than one person over a long period, with inconsistent coding practices and no testing, can be very difficult to handle; moreover, their commit history is usually linear and/or not useful. Post-commit reviews are an alternative to pull requests, and this overview of code review tools explores how the strategies can be combined and used in research software development environments.
The R research community has been extremely fortunate in the existence of rOpenSci (https://ropensci.org/). rOpenSci curates an impressive collection of community-contributed packages to support open reproducible research and data access in R. Its peer review system has not only improved the quality of contributed packages; the practice of reviewing itself has elevated the skills of the whole community involved by engaging us with best practice. The key to this success is a formalised review process based on detailed recommendations and guidelines (https://ropensci.github.io/dev_guide/). To help reviewers further, rOpenSci has also focused on tooling and automation. One such tool is pkgreviewr (https://github.com/ropenscilabs/pkgreviewr), an R package that automates some of the steps and guides reviewers through the review process. In this poster I’ll present an overview of the rOpenSci review process and the use of pkgreviewr to support it. I’ll also tell the story of how the package came to be and how such contributions can lead to deeper involvement with the community.
The launch of the ESA Sentinel 3A and 3B satellites, providing 300 m global ocean colour data, presents new opportunities to observe finer-scale oceanographic features. However, the satellites represent a 20-fold increase in data volumes compared to earlier sensors and hence a significant computational challenge for existing processing tools. In this poster we share our experience of reducing bottlenecks in satellite data processing workflows and expand upon the use of new technologies such as xarray and dask to address them. As an example, we present how an updated ocean-colour processing chain has been incorporated into our operational processing systems to allow processing of over 1.1 billion pixels per day as part of the ESA-funded project Earth Observation for Sustainable Development (EO4SD).
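The core idea behind xarray and dask at this scale is chunked, out-of-core computation: operate on slices that fit in memory rather than the whole day of pixels at once. The plain-Python toy below illustrates only that chunking idea; it is not the actual EO4SD pipeline or the dask API.

```python
def process_in_chunks(pixels, chunk_size, func):
    """Toy illustration of chunked processing: apply `func` to fixed-size
    slices of `pixels` instead of the full array, so memory use is
    bounded by chunk_size regardless of total data volume."""
    results = []
    for start in range(0, len(pixels), chunk_size):
        chunk = pixels[start:start + chunk_size]
        results.append(func(chunk))
    return results
```

In dask the chunks would additionally be scheduled lazily and in parallel across workers, which is where the throughput needed for a billion pixels a day comes from.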
Almost 50 years ago, The Numerical Algorithms Group (NAG) was formed from a collaboration between the Universities of Birmingham, Leeds, Manchester, Nottingham and Oxford. Its aim was, and still is, to improve computational research by improving the quality of the numerical libraries on which such research critically depends. You could argue that NAG was the first sustainable RSE group. Fast forward to 2019 and NAG remains close to its academic roots in many areas. We are industrial partners on international grants, collaborate with research groups across the UK, accept undergraduate industrial placement students, support Centres for Doctoral Training, fund PhD students and much more. Additionally, many universities around the UK have full site licenses for many of NAG's products and service contracts to provide training and support. This poster explores some of the RSE-related projects we are currently working on with our academic collaborators.
Please note that this poster is not part of the poster competition.
Many academics improve the impact of their research by collaborating with NAG to get their algorithms into our library and in front of customers. The feedback from these customers can lead to new research questions. Also, cloud computing!
Tuesday September 17, 2019 18:30 - 19:30 BST
6. The Great Hall, Aston Webb Building
Giving individual feedback to students while teaching is an important part of their development and learning. However, the level of feedback given to students is often limited by the availability of time and resources. RoboTA is an automated tool that provides continuous feedback and assessment for student coursework. As students submit partial versions of coursework, RoboTA evaluates their progress and provides feedback, including early detection of common mistakes and checks for best-practice compliance. In addition, a student performance dashboard has been developed to assist teaching assistants in rapidly assessing student and team performance throughout the course. The tool is modular in its construction, which means it should be easy to extend in the future. RoboTA is currently being developed for an undergraduate course in Software Engineering; in the future we plan to extend its functionality to a wider range of courses in Computer Science and potentially further into other departments. RoboTA is one strand of the Institute of Coding (IoC) work we are doing at the University of Manchester. The IoC is a consortium of universities and employers developing the next generation of digital talent at degree level and above.
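The shape of such automated feedback can be sketched as a set of rules run against each submission. The checks below are invented examples for illustration, not RoboTA's actual rules or interface.

```python
def check_submission(files, commit_messages):
    """Toy illustration of automated coursework checks: inspect the
    submitted file list and commit history, and return human-readable
    feedback strings. Rules here are hypothetical examples."""
    feedback = []
    if "README.md" not in files:
        feedback.append("Add a README describing the project.")
    short = [m for m in commit_messages if len(m.strip()) < 10]
    if short:
        feedback.append(
            f"{len(short)} commit message(s) are too short to be useful.")
    return feedback
```

Because each rule is independent, adding a new check for a new course is a matter of adding one more function, which is the modularity the abstract describes.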
Modern HPC systems are often heterogeneous and massively parallel, leading to highly complex codes which take significant effort to maintain and test. In addition, specialised expertise in profiling and optimisation for a particular target architecture is often required to get close to the peak theoretical performance. Researchers are not guaranteed to have received software development or HPC training, and don't necessarily have the time to spend on software maintenance.
We are working in the new world of data-driven research. Few fields have seen the explosive change and adoption of big data, AI and HPC systems as much as medical and life science research. This is shown by the University's recent Advanced Research Computing (ARC) and Compute and Storage for Life and Environmental Sciences (CaStLeS) initiatives. The challenges faced by researchers in this field are varied: they range from scripting and programming tasks such as file format conversions, to data-intensive tasks such as bioinformaticians' annotation of mouse embryo gene-expression databases, to computationally intensive tasks such as whole genome sequence analysis, to emerging AI applications such as computer vision for MRI scans. If this seems challenging for the researchers, it is more so for the research software engineers (RSEs) in this field, who have to support them with specialist knowledge in programming, secure and reliable data management, AI and statistics.
Regular testing is crucial to producing robust software. However, software that runs on HPC (High Performance Computing) systems can be difficult to test at scale, and continuous integration tools such as Jenkins lack compatibility with HPC systems, making it harder to run regular and consistent testing on HPC. The SLURM Plugin was developed to enable research software testing on HPC through Anvil, a Jenkins-based continuous integration service developed by the Science and Technology Facilities Council (STFC). Anvil users can now run tests on SCARF, an HPC cluster based at STFC.
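At its simplest, bridging CI and SLURM means having the CI job emit a batch submission instead of running tests locally. The sketch below builds such a command using standard sbatch flags; the partition and script names are illustrative placeholders, not Anvil's or SCARF's actual configuration.

```python
def slurm_test_command(script="run_tests.sh", partition="compute",
                       nodes=1, time_limit="00:30:00"):
    """Sketch of the sbatch invocation a CI plugin might emit to run a
    test suite under SLURM. Flag names (--partition, --nodes, --time)
    are standard sbatch options; the values are placeholders."""
    return [
        "sbatch",
        f"--partition={partition}",
        f"--nodes={nodes}",
        f"--time={time_limit}",
        script,
    ]
```

The harder part, which the plugin has to handle, is waiting for the queued job to finish and mapping its exit status back onto the CI build result.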
Breast cancer (BC) is the most common cancer in UK females, with about 1 in 8 developing it. Ovarian cancer (OC), although less prevalent, has worse survival rates as it is often not diagnosed until an advanced stage. For both, survival rates increase with earlier diagnosis. Risk models allow identification of women at highest and lowest risk, and so those most likely to benefit from alternative screening modalities and/or preventative treatments. BC and OC risks are multifactorial, depending on genetic, family history and other lifestyle/hormonal/reproductive risk factors. We have developed comprehensive BC (BOADICEA) and OC risk models. As risk prediction is computationally intensive, they have been optimised for real-time clinical use. The models will be accessible for clinical use via a new online tool (www.canrisk.org), developed by a team of software developers, clinicians and scientists following a standard framework and using established software engineering practices. Since such tools are classified as medical devices they must adhere to medical device regulations for safety, quality and efficacy (CE marking). We are currently undertaking the necessary additional risk/quality management and software engineering work.
The wide range of RSE teams that have been set up at institutions across the UK and internationally demonstrates the growing importance of research software engineering and how it is now considered to be a key aspect of the research environment. However, providing RSE capabilities at an institution is not only about setting up an RSE team. The wider research community and the many researcher/developers who are likely to remain within research groups are also important in ensuring sustainable, reliable and robust research software outputs. In this poster we will present our RSE activities at Imperial College London that cover research software development, a research software community and a programme of training workshops that are helping to develop the next generation of research software developers. We consider that this set of different activities offers a complete package to support research software and ensure strong and reproducible research outputs. It also has the potential to act as a template or case study for other institutions looking to bring together similar sets of RSE capabilities. The poster will include a group of co-authors representing the different RSE activity areas.
I am an EPSRC RSE Fellow based in the Department of Computing at Imperial College London and affiliated to Imperial's RSE team. My work focuses on developing tools and middleware to support scientists, and their codes, across a variety of domains.
SOMBRERO is a high-performance parallel HPC benchmark developed from research software used in Lattice Gauge Theory. SOMBRERO specifically tests the conjugate gradient inversion of a physically interesting sparse stencil operator across a four-dimensional array of small complex matrices. As the shape of these matrices and the vectors on which they act changes with the properties of the theory under study, so does the ratio of floating-point computations to the number of bytes sent across the interconnect. The benchmark has been run on a variety of platforms and architectures, including Intel, AMD and Arm microarchitectures and Intel and Mellanox interconnects. In this poster, we will present details of the operation of the benchmark and results from a sample of the systems we have tested.
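The kernel being benchmarked is the standard conjugate gradient iteration. The plain-Python version below shows the algorithm on a tiny dense symmetric positive-definite system; SOMBRERO's actual operator is a sparse four-dimensional stencil over small complex matrices, which this sketch does not attempt to reproduce.

```python
def conjugate_gradient(matvec, b, tol=1e-10, max_iter=100):
    """Solve A x = b for symmetric positive-definite A, where `matvec`
    applies A to a vector. Classic CG: the residual norm drives
    convergence; each iteration costs one matvec plus vector updates."""
    x = [0.0] * len(b)
    r = list(b)                      # residual r = b - A x, with x = 0
    p = list(r)                      # initial search direction
    rs_old = sum(ri * ri for ri in r)
    for _ in range(max_iter):
        ap = matvec(p)
        alpha = rs_old / sum(pi * api for pi, api in zip(p, ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rs_new = sum(ri * ri for ri in r)
        if rs_new < tol:
            break
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x
```

The single matvec per iteration is why the benchmark is sensitive to the compute-to-communication ratio: applying the stencil is where both the floating-point work and the halo exchange over the interconnect happen.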
A researcher reads an average of 22 papers per month and spends an average of 48 minutes on each paper, yet most of us can sustain only a few hours of deep reading at a time. SN4RE (Smart Notes for Research) is designed to help researchers manage their reading notes in a well-structured note format. The platform helps researchers grasp the essential information of a given paper and provides opportunities to share aspects of their reading that could help other researchers understand a paper, while saving time in getting a consistent overview. By providing a concise summary of each reading, it prevents rereading the same paper over and over. SN4RE is a responsive platform with numerous features to empower your reading notes.