RSEConUK 2019 has ended
The Fourth Conference of Research Software Engineering was held at the University of Birmingham.

Content from all sessions is licensed under a Creative Commons Attribution 2.0 UK: England & Wales License.
Parallel Talks
Tuesday, September 17
 

11:00 BST

#1A1 - Engagement and Outreach - Public engagement and RSEs - A vital role?
Research institutions are becoming increasingly aware of the need to share the benefits of their research with the public; a clear plan for disseminating information to and learning from the public (public engagement) is now a required element for all major UK research funding agencies.

RSEs are often viewed as external to the public engagement process, but their skills hold great potential to help researchers facilitate knowledge exchange with the public. For example, research results are often disseminated through RSE-developed dashboards and databases, citizen science projects often rely on technical platforms developed by RSEs, and data visualisation skills can transform the way the public understands complex topics.

In this talk, I will showcase a range of projects where RSE input has led to excellent public engagement (and discuss the strengths and limitations of each) with the aim of inspiring RSEs to consider their work in a public engagement context in the future. This will help empower the RSE to become more aware of the vital role they play in this process, and ultimately lead to the development of technical tools that are better designed from a public engagement viewpoint.

Speakers

Kirsty Pringle

Model Domain Expert, Centre for Environmental Modelling And Computation, University of Leeds
I am an atmospheric scientist and RSE, specialising in helping people use weather and climate models for research into climate change and atmospheric pollution. I work with a domain-specific technical support group called the Centre for Environmental Modelling And Computation (https://www.cemac.leeds.ac.uk...



Tuesday September 17, 2019 11:00 - 11:25 BST
1. Bramall - Elgar Concert Hall Aston Webb Building

11:00 BST

#1D1 - Legacy Software - A case study in optimisation of legacy software
An old serial program for Lattice Field Theory Simulations, initially written in fixed-form Fortran during decades of research work, needed to be optimised for modern architectures. In this talk I will describe the challenges, the decisions, and the steps we took in order to improve its performance and its scientific output (which is a multi-level optimisation problem), discuss the tools used and the experience we had with them (Intel VTune, ITAC, the Scalasca suite and the BSC Performance Tools), and report our - sometimes surprising - findings.

Speakers

Michele Mesiti

RSE, Swansea University



Tuesday September 17, 2019 11:00 - 11:25 BST
4. Aston Webb, Room WG12 Aston Webb Building

11:30 BST

#1A2 - Engagement and Outreach - BEAR Software – Coaching - Teaching Researchers to Fish
There is the saying: ‘Give a man a fish and feed him for a day; teach a man to fish and feed him for a lifetime’. This saying applies to researchers who are writing research software. Even with RSEs at a university, researchers will continue to write a lot of research software, and it is important for RSEs to positively impact this software. By working with researchers, we can upskill them so that they write high-quality software – leading to the SSI’s ‘Better Software, Better Research’.

I will talk about the coaching programme we provide to researchers, where we engage with researchers in several ways to improve the research software being written and to make it easier for this software to be distributed and reused. We offer a variety of training, coaching, and mentorship to researchers at all levels – from those just starting their research careers to senior researchers. I will present case studies detailing some of the coaching we have provided and how this has improved the research software being written at this university.

Speakers

Simon Branford

Research Software Engineer, University of Birmingham



Tuesday September 17, 2019 11:30 - 11:55 BST
1. Bramall - Elgar Concert Hall Aston Webb Building

11:30 BST

#1D2 - Legacy Software - Refactor, Rewrite or Retire
How do we define legacy software?
Code in this category can typically be identified by a lack of tests, the use of outdated practices (including unsupported dependencies), or a code base written in an obsolete programming language. Whilst this list is not exhaustive, these are usually the signs that software is going to cause you issues when you attempt to extend or scale the code.

Code is then characterised as falling into one of two simple categories: legacy code and non-legacy code. With no clear definition of legacy software, the boundary between the two is far from clear and is constantly moving. However, one thing that remains clear and consistent is that every piece of software will encounter the issue of migrating legacy systems in its life cycle.

Generally, there are three possible solutions when dealing with legacy software: refactor, rewrite, or retire. All three options should always be considered when reaching that point in the software development life cycle, and this talk examines the costs and benefits of each. The talk places a strong focus on testing when considering either rewriting or refactoring; establishing a good set of tests is paramount when using these approaches.

Speakers

Thomas Stainer

UKAEA
Coming from a particle physics background, I learnt to program during my PhD, where I quickly realised how powerful programming is as a skill. Since then, I have oscillated between the private and public sector, working in petrochemical and IT security industries. Currently I work...



Tuesday September 17, 2019 11:30 - 11:55 BST
4. Aston Webb, Room WG12 Aston Webb Building

12:00 BST

#1A3 - Engagement and Outreach - Developing and using a cluster of Raspberry Pis for outreach
The Supercomputing Wales project has as one of its goals increasing the public understanding of High-Performance Computing in Wales. To further this ambition, the RSE team at the Swansea Academy of Advanced Computing agreed to run a stand at the Swansea Science Festival 2018. To this end, we assembled a cluster of 16 Raspberry Pi single-board computers, and developed software to demonstrate some of the principles of parallel computing as well as one of the types of problem that the supercomputer in Swansea is used for in research. In this talk, we will describe the process of designing and constructing both the hardware and the software, and some of the challenges we encountered along the way, as well as discussing how well it works as a scaffold for discussions around parallel computing with members of the general public.

Speakers

Ed Bennett

Senior Research Software Engineer, Swansea University



Tuesday September 17, 2019 12:00 - 12:25 BST
1. Bramall - Elgar Concert Hall Aston Webb Building

12:00 BST

#1D3 - Legacy Software - Launching Java Applications into the Future
Over the years, the Java runtime platform has been hardened against a number of security vulnerabilities. Each time, maintainers have had to change their practices in order to ensure applications remain accessible to users.

In the first quarter of 2019, the Jalview (www.jalview.org) team migrated their mature interactive application from Java 1.8 to Java 11. It required some radical changes. Build, deployment, installation and launch mechanisms were recoded. Getdown (https://github.com/threerings/getdown) was adopted to replace Java Webstart, and adapted to give a hopefully seamless experience with an "install once, works forever" approach. JRE packaging methods were developed to allow over-the-air updates in preparation for future Java releases. We’re now adapting Getdown further to support "channels" so that the user can easily switch between a stable release, development release, or access older versions of Jalview to aid reproducibility.

I’ll examine the issues we encountered when migrating an interactive graphical desktop Java application relied on by thousands of researchers and educators to Java 11. I’ll also evaluate whether our new deployment model meets the future needs of our broad community of users.

Speakers

Ben Soares

Research Software Engineer, University of Dundee School of Life Sciences



Tuesday September 17, 2019 12:00 - 12:25 BST
4. Aston Webb, Room WG12 Aston Webb Building

13:30 BST

#2A1 - Careers and Culture - (Research Software) Engineering is not Research (Software Engineering)
With over a decade of experience in industry as a software engineer, architect, and manager, I was expecting a move into research software engineering to involve a straightforward application of my experiences and skills to a new domain. In fact I quickly discovered that the contexts and motivations are different enough that even when we adopt the same practices, it's often for very different reasons. Research Software Engineering needs a different set of guiding values and principles from industrial software engineering; Agile software development and Software Craftsmanship do not supply those values.

Speakers

Graham Lee

Head Labrarian, Labrary Ltd.
I make it easier and faster to make high-quality software that respects people's privacy and freedom. You can find me in universities or in companies, but that's probably what I'll be doing. Talk to me about continuous delivery, devops, team performance, measuring success, Python...



Tuesday September 17, 2019 13:30 - 13:55 BST
1. Bramall - Elgar Concert Hall Aston Webb Building

13:30 BST

#2D1 - Revitalising Legacy Languages - Teaching an old dog new tricks - object-oriented programming in Fortran
RSEs often need to work with code written in Fortran. Though most commonly thought of as a procedural language, modern Fortran features fully-fledged object-oriented programming (OOP) complete with inheritance, encapsulation, and polymorphism. This talk introduces the syntax and philosophy behind OOP in Fortran to those familiar with older versions of the language, showing how it naturally extends pre-existing features. It will also compare the Fortran approach to OOP to those used in other programming languages in a manner which can be understood by non-Fortran developers. Finally, this talk will reflect on the advantages and disadvantages of using OOP in Fortran and under what circumstances this is an appropriate paradigm.

Speakers

Chris MacMackin

Research Software Engineer, UKAEA
I am part of the newly-formed RSE group at the UK Atomic Energy Authority. We assist researchers in developing new software and promote best practices within the organisation. My interests include numerical methods and different programming paradigms. I am very knowledgeable about...



Tuesday September 17, 2019 13:30 - 13:55 BST
4. Aston Webb, Room WG12 Aston Webb Building

14:00 BST

#2A2 - Careers and Culture - From Trainee to Director - An Institutional RSE Career Pathway
RSEs perform a complex mix of research and service provision that has to fit into career paths in academia. This University has developed and implemented a career pathway tailored to RSEs as part of its Digital Research Strategy. A new team was formed in 2018 by merging and expanding existing RSE-like structures into a dedicated support facility outside the academic structures of faculties and schools. Job profiles reflecting a professional services-style of working in a research context were created. These start at Trainee Level (Digital Research Graduate Programme), from where role holders can progress to become established RSE team members (Digital Research Scientist). A set of senior role profiles allows further progression with a focus on managerial (Digital Research Service Team Leader), academic and research-driven (Senior Research Data Scientist), or professional services (Senior Digital Research Scientist) responsibilities. A senior management role with strategic responsibilities (Head of Digital Research Service) leads the team and reports to the director of the Digital Research division. As far as we know, this University is the first to incorporate a career path that recognises the distinct characteristics of RSE roles.

Speakers

Jurgen Mitsch

Digital Research Service Team Leader, University of Nottingham
The University of Nottingham's Digital Research Service (DRS) is a dedicated, core facility to support researchers and improve research quality. 18 RSEs provide bespoke, project-based support in Data Science/Data Analytics, Bioinformatics and Software Engineering. The DRS enable, improve...



Tuesday September 17, 2019 14:00 - 14:25 BST
1. Bramall - Elgar Concert Hall Aston Webb Building

14:00 BST

#2D2 - Revitalising Legacy Languages - Wasm! and the JavaScript is gone. What is WebAssembly and what can it offer research software?
WebAssembly (or Wasm) is a bleeding edge technology promising to massively expand the world of client-side web development outside the constraints of JavaScript, with potential for wider use as a universal sandboxed virtual machine runtime.

As a high-performance compilation target for the web, Wasm has potential applications in visualisation and performance-critical applications; I’ll explore these and other possible applications for research software. I’ll discuss the current state of Wasm, potential pitfalls and give an overview of the roadmap for the future.

Speakers

Drew Silcock

Research Software Engineer, STFC Hartree Centre



Tuesday September 17, 2019 14:00 - 14:25 BST
4. Aston Webb, Room WG12 Aston Webb Building

14:30 BST

#2A3 - Careers and Culture - "It works on my machine" - working as a research software engineer in a multi-partner international research project
When the software produced by a research project is to be re-used in several subsequent projects, it ought to be high-quality and sustainable. Yet in a heterogeneous team of researchers from several international institutions, each of them with their own background and priorities, all of them under stringent result and reporting constraints, and none of them a formally identified research software engineer, those who volunteer their time to provide quality control and run support software infrastructure are instrumental in ensuring that the team doesn't lose sight of the need for quality and sustainability.

In this talk, we will share our experience of taking up RSE duties in a multi-partner international space robotics project. We will give concrete examples of the technical and cultural challenges we faced, sometimes unsuccessfully. We will also sum up the lessons that we have learnt into recommendations for project leaders and RSEs, so that they can profit from our experience.

Speakers

Romain Michalec

Postdoctoral Researcher, University of Strathclyde
I work on the software side of robotics and am particularly interested in space and underwater applications. I am also involved in researcher support.



Tuesday September 17, 2019 14:30 - 14:55 BST
1. Bramall - Elgar Concert Hall Aston Webb Building

14:30 BST

#2D3 - Revitalising Legacy Languages - Developing Fortran using Python and Literate Programming
Xcompact3d is a high-performance CFD code for simulating turbulent flows. These flows require resolving a wide range of scales, for which compact finite differences are well suited: they offer so-called ‘quasi-spectral’ accuracy, and their compact stencil makes them computationally efficient. As part of an EPCC-funded project to implement a free-surface solver in Xcompact3d, new differencing schemes capable of resolving the discontinuous changes in field variables without introducing spurious oscillations were required; to this end, a fifth-order WENO scheme was implemented.

In this talk I will present the approach followed to implement the WENO scheme: rather than being written directly in the Xcompact3d source code, it was written as a ‘literate program’. This allows the description of the program and its implementation to be interleaved, easing review and understanding of the code. Furthermore, developing it externally made it trivial to build it as a stand-alone module, which could then be wrapped with f2py to quickly test the implementation, all within the same document. Once we were satisfied with the implementation, the exported Fortran code could be integrated with the Xcompact3d code base with minimal changes.

Speakers

Paul Bartholomew

Post-Doctoral Research Associate, Imperial College London



Tuesday September 17, 2019 14:30 - 14:55 BST
4. Aston Webb, Room WG12 Aston Webb Building
 
Wednesday, September 18
 

09:30 BST

#3A1 - Citation and Software Discovery - How to learn which software is developed in your institution?
I will present the Code4REF project (https://code4ref.github.io/) which aims at providing guidelines on recording research software in CRIS (Current Research Information Systems). Many universities use CRIS to record research publications, e.g. to display them on their webpages and, in the UK, for preparing submissions for REF (Research Excellence Framework) and reporting research outputs to funding bodies, while software outputs are much less common there.

We believe that scientific code needs to be treated as a primary research output, and should be equally well covered by CRIS. This will not only be useful for the above-mentioned purposes, but will also make it possible, for example, to get an overview of all research software developed at an institution, in a research group or by an individual developer, using CRIS or their public views. This will provide further evidence that software is vital for research, and will contribute to the campaign for the recognition of the RSE role within academia.

I will outline the current state of the project, and explain how you can contribute by providing guidance for further CRIS and promoting Code4REF in your institutions. I will also outline the vision of a further grassroots campaign that starts from Code4REF.

Speakers

Alexander Konovalov

Lecturer, University of St Andrews



Wednesday September 18, 2019 09:30 - 09:55 BST
5. Nuffield Building, Room G17 Nuffield Building

09:30 BST

#3C1 - Design Methodologies and Project Planning - Cheap, fast, or secure? When research projects meet real-world participants
It is often said that the 'S' in IoT stands for security. In a similar vein, the 'P' in the name might be said to stand for privacy-first design. There is a large and challenging gap between functional adequacy and best practice.

In this talk, we describe the process of developing the 'home gateway' for the SPHERE 100-homes project, a Linux-based research data aggregator installed into participant homes around Bristol in order to act as an endpoint for healthcare data collection on human participants.

We begin by briefly describing the regulatory landscape that applies to human-centred research data. We tested and used open-source packages and services designed to fill as many gaps in our service design as possible. We also had to find solutions for the further, specific challenges raised by the particular requirements of the project, such as data encryption at rest, robust behaviour in the face of unexpected input or events, and auditable data workflows. Finally, we look at the everyday challenges of safely and securely maintaining a sustainable platform in the face of the risks posed by real-world vulnerabilities - patching, system updates and responding to new flaws discovered in standards, hardware and firmware.

Speakers

Gregory J. L. Tourte

PhD student, Research Software Engineer, The University of Bristol



Wednesday September 18, 2019 09:30 - 09:55 BST
2. Aston Webb C Block Lecture Theatre Aston Webb Building

09:30 BST

#3D1 - Machine Learning - Pushing the Limits of Exoplanet Discovery via Direct Imaging with Deep Learning
One technique for detecting exoplanets is the direct detection of their thermal emission. The so-called direct imaging technique is suitable for observing young planets far from their star.

Because of the star's emission, these are very low signal-to-noise-ratio (SNR) measurements. Moreover, the limited and highly unbalanced ground truth hinders the use of supervised learning approaches to automatically detect planet signals in the images.

In this talk, we show how to bypass the scarcity of real data by training a Generative Adversarial Network. The synthetic images produced by the generative model can be assumed not to contain any planet and are augmented by artificially injecting planet signals. The data obtained are not just labelled: for the positive samples, the exact position of the object to detect is also known. CNN detectors trained on this synthetic dataset exhibit good predictive performance and, on real data, the models can re-confirm detections of bright sources. In this sense, the above technique seems to go beyond the current state of the art in exoplanet discovery via direct imaging.

Speakers


Wednesday September 18, 2019 09:30 - 09:55 BST
1. Bramall - Elgar Concert Hall Aston Webb Building

10:00 BST

#3A2 - Citation and Software Discovery - Building a network of connected research
The persistent identifier community is an academic-tangential tech community much like the research software engineers. (Some of us are even the same people!) We're trying to make sure that researchers, the stuff they do, and the tools they work with are identified in ways that are unique, persistent, and make sense to machines. We envision a future where there's a network of these identifiers tracing which articles were written by which people using which datasets and software, and so on. But what good is a network if no one is using it? You're invited to contribute to the movement, whether through open software development, or just by being opinionated about interconnected research. This talk will give you a high-level view of persistent identifiers, who's using them and why, and tools you can use now in your own research software endeavors.

Speakers

Robin Dasler

Product Manager, DataCite



Wednesday September 18, 2019 10:00 - 10:25 BST
5. Nuffield Building, Room G17 Nuffield Building

10:00 BST

#3C2 - Design Methodologies and Project Planning - Retrofitting research software engineering practices to active research tools
Researchers often develop their own tools to facilitate their work and support the publication process. In very rare cases, some of them will make the additional effort to make these tools available to the whole research group or even to researchers across institutions. However in most cases these tools tend to evolve and morph organically over time, following the immediate needs of the researcher. Questions like data management and sustainable data preservation practices are usually left to the future.

In this talk I will present the challenges of joining such a research group, BRIDGE, and slowly moving from an ad hoc environment to a largely managed full-stack system. Our research involves climate modelling, past, present and future, generating many TBs of data weekly that require storage, analysis, visualisation, availability for reuse, and eventual archiving. Working towards a managed, sustainable outcome required close integration with University systems.

Through effective collaboration, we built a home-grown processing suite used to manage, process, analyse and share data, capable of scaling up to suit our full dataset, which is likely to exceed petabyte scale in the next few years. This suite is also used for teaching purposes.

Speakers

Gregory J. L. Tourte

PhD student, Research Software Engineer, The University of Bristol



Wednesday September 18, 2019 10:00 - 10:25 BST
2. Aston Webb C Block Lecture Theatre Aston Webb Building

10:00 BST

#3D2 - Machine Learning - The Limitations of Machine Learning
Machine Learning (ML) is a popular topic in the research world. It has spread from basic computer science research to many other disciplines and RSEs are increasingly encouraged to be familiar with applying it to their projects.

In this talk I will discuss various limitations of ML, including how it can augment human bias, the problem with interpreting complex deep learning models and how real-world data can harm your model.

Speakers

Camilla Longden

Research Software Engineer, Microsoft Research
I'm an RSE at Microsoft Research in Cambridge. I work in the Deep Learning Engineering team in the ADA (All Data AI) group. I am also interested in diversity and inclusion in the technology sector.



Wednesday September 18, 2019 10:00 - 10:25 BST
1. Bramall - Elgar Concert Hall Aston Webb Building

10:30 BST

#3A3 - Citation and Software Discovery - How to make software fit in the research citation graph
References between research products (papers, monographs, software, data, etc.) provide context, inform about predecessors and precedents, and make it possible to trace citation, which in turn provides credit for researchers and institutions. Together, research products, researchers, institutions, and the relations between them form a graph, a "research citation graph". With the increasing awareness of the importance of software for research, and ongoing efforts to make citation work for software, I look at modeling software and its specific properties into this kind of graph, which has traditionally been used for text-based publications only. In this talk, I propose some changes that need to be made to fit software and its dependencies into a citation graph. These changes pertain to differences across types of research product, and concepts specific to software. I will suggest how to implement these changes for a citation system that is better suited to providing fair credit for research software than what we have right now. And finally, I will give an outlook on future work on automatically retrieving citation graphs for software, and weighting relations in them.

Speakers

Stephan Druskat

Doctoral researcher (software engineering), German Aerospace Center (DLR)
I'm a doctoral researcher in the Sustainable Software Engineering Group at the Institute for Software Technology, German Aerospace Center (DLR). In my work, I focus on research software sustainability and software citation. I'm a Special Collaborator of the Software Sustainability...



Wednesday September 18, 2019 10:30 - 10:55 BST
5. Nuffield Building, Room G17 Nuffield Building

10:30 BST

#3C3 - Design Methodologies and Project Planning - Got agility? A lightweight technique for Productivity Sustainability Improvement Planning (PSIP)
PSIP (Productivity Sustainability Improvement Planning) is a lightweight, iterative workflow where teams identify their most urgent software bottlenecks, and track progress on work to overcome them. PSIP captures the tacit, more subjective aspects of team collaboration, workflow planning, and progress tracking. In the potential absence of appropriate planning, PSIP is designed to bootstrap small, large, and loosely-coupled aggregate team capabilities into best practices, and encourage teams to adopt a culture of process improvement. In this talk we highlight the PSIP stories of two exascale scientific software teams, Exascale Atomistic capability for Accuracy, Length and Time (EXAALT), and Exascale MPI (Message Passing Interface) MPICH (High Performance Message Passing Interface). The EXAALT team used PSIP to adopt continuous integration practices. MPICH focuses on developing a production-ready high-performance MPI implementation that scales to supercomputers. MPICH used PSIP to develop tools for onboarding new team members. In discussing the strengths and weaknesses of the PSIP process in use with these and other research science software teams, we will discuss why PSIP helps teams mitigate technical risk.

Speakers

Elaine M. Raybourn

scientist, SNL

Rinku Gupta

Argonne National Laboratory
A passionate researcher focussing on software sustainability and developer productivity

Authors


Wednesday September 18, 2019 10:30 - 10:55 BST
2. Aston Webb C Block Lecture Theatre Aston Webb Building

10:30 BST

#3D3 - Machine Learning - Investigating Deep Learning Approaches for Robust Zooplankton Identification
Plankton support the marine ecosystem and are extremely sensitive to environmental change. The Marine Biological Association of the UK has been exploring new, autonomous imaging technology for rapid estimation of zooplankton abundance in order to improve monitoring and reporting speed to meet legislative monitoring requirements. In collaboration with EPCC, the development of robust, automated plankton classification systems is being explored. The Continuous Plankton Recorder Survey has been used to generate an image dataset of zooplankton species via a digital imaging system; this project outlines the development, training and validation of Convolutional Neural Networks (CNNs) on highly imbalanced datasets for rapid classification. Using a ready-labelled training dataset of 20 zooplankton species, a number of architectures and ensemble methods have been explored to obtain high-accuracy classification; robust strategies for handling extreme class imbalance have been developed such that species that occur very infrequently and in low numbers can be reliably classified in near real-time. By optimising the CNNs for use on GPUs on EPCC’s CIRRUS HPC system, we will show how we have improved on existing work in this rapidly evolving field.

Speakers

Chris Wood

Applications Consultant, EPCC, University of Edinburgh

Authors


Wednesday September 18, 2019 10:30 - 10:55 BST
1. Bramall - Elgar Concert Hall Aston Webb Building

13:30 BST

#4A1 - Community and Collaboration - Research Software Engineering in Jordan - The MaDiH (مديح) Project
MaDiH (مديح) involves King’s Digital Lab (KDL) / King's College London eResearch, the Hashemite University, the Council for British Research in the Levant (CBRL), the Department of Antiquities of Jordan, the Jordanian Open Source Association, and the EAMENA project. The project delivers training in Research Software Engineering (RSE) best practice, alongside white papers, a prototype National Data Catalogue, and a prototype National Heritage Portal. Workshops are identifying datasets, held both in Jordan and overseas, to ‘repatriate’ (through federation) data collected in Jordan but held offshore. The project is contributing to the development of Jordan’s digital cultural heritage, identifying key systems, datasets, standards, and policies, and aligning them to government digital infrastructure capabilities and strategies. By defining a robust architecture for digital cultural heritage, informed by RSE best practice, the aim is to assist the Department of Antiquities in their planning processes, help product development teams develop their systems, facilitate the aggregation of valuable datasets held in disparate repositories, and ensure data generated from research activity is properly stored and widely accessible.

Speakers

James Smithies

Director, King's Digital Lab, King's College London
I'm director of King's Digital Lab, a software engineering team specialising in arts & humanities and social science research. I am also Deputy Director of eResearch for King's College London.

Authors

Andrea Zerbini

Council For British Research in the Levant

Carol Palmer

Council For British Research in the Levant

Fadi Bala'awi

Hashemite University

Pascal Flohr

University of Oxford

Sahar Idwan

Hashemite University

Shaher Rababeh

Hashemite University



Wednesday September 18, 2019 13:30 - 13:55 BST
2. Aston Webb C Block Lecture Theatre Aston Webb Building

13:30 BST

#4D1 - Cloud Technologies and Case Studies - Case study of porting a pipeline to EMBL-EBI Cloud Portal
Case Study of Porting a Bioinformatics Pipeline into Clouds

Cloud providers have different UIs, architectures and APIs. For the research community, it is extremely important to be cloud-agnostic while enjoying the advantages of different clouds. It is also extremely important to make cloud technologies easily accessible to lab scientists with little to no training in clouds. Kubernetes and Docker, as de facto standards, are bringing such goals closer to reality.

We ported a legacy pipeline from IBM Load Sharing Facility, a bare-metal HPC stack, to Kubernetes on OpenStack. With minimal changes to the pipeline itself, but by using cloud features intelligently, we have made major improvements, changing the pipeline from a single-user local application to a shared multi-user application accessible over the Internet. To investigate and confirm the cloud-agnostic nature of our solution, we created a CI/CD toolchain to deploy the pipeline onto Kubernetes clusters created on all four major clouds: Google, Amazon, Microsoft and OpenStack. The pipeline can run consistently on GKE, EKS, AKS and EHK, where EHK is a Kubernetes service at the European Bioinformatics Institute from which research teams can request clusters.

The general solution that we have developed provides a common set of programming interfaces to support major cloud providers in a consistent, agnostic manner. This reduces the learning curve and skill requirements to port and deploy pipelines in the clouds. This talk presents the methods and the lessons learned during the exercise. It demonstrates the feasibility to rejuvenate legacy pipelines in the clouds with minimum effort.

Speakers

David Yuan

Cloud Bioinformatics Application Architect, European Bioinformatics Institute
David Yuan is a Cloud Bioinformatics Application Architect working at European Bioinformatics Institute (EBI), European Molecular Biology Laboratory (EMBL). He is driving cloud-adoption onto both private cloud (OpenStack) and public clouds (Google Cloud Platform, Amazon Web Services...



Wednesday September 18, 2019 13:30 - 13:55 BST
4. Aston Webb, Room WG12 Aston Webb Building

14:00 BST

#4A2 - Community and Collaboration - The Setup of an Institute for Scientific Software, connecting Applied Computing and Data Intensive Sciences
With the ever-increasing size of scientific collaborations and complexity of scientific instruments, the software needed to acquire, process and analyze the gathered data is gaining in complexity and size too. Unfortunately, the role and career path of scientists and engineers working on software R&D and developing scientific software is neither clearly established nor defined in many fields of natural science. In addition, the exchange of information between scientific software development and computer science departments at universities or computing schools is scattered and fragmented into individual initiatives. To address the above issues we propose an effort at a European level, which concentrates on strengthening the role of software developers in natural sciences, acts as a hub for the exchange of ideas among different stakeholders in computer science and scientific software, and forms a lobbying forum for software engineering in natural sciences at an international level. This contribution discusses in detail the motivation, role and interplay with other initiatives of a "Software Institute for Data Intensive Science", which is currently being discussed.

Speakers

Stefan Roiser

Senior Computing Engineer, CERN
I am a computing engineer in the IT department at CERN, Switzerland. My main interest is in data processing software frameworks and applications in high energy physics and their evolution in an evolving hardware landscape. Furthermore I am also interested in the possibility to establish...



Wednesday September 18, 2019 14:00 - 14:25 BST
2. Aston Webb C Block Lecture Theatre Aston Webb Building

14:00 BST

#4D2 - Cloud Technologies and Case Studies - Empowering domain experts with DARE, a new cloud-based platform and working environment
DARE focuses on empowering domain experts to invent and improve their methods and models by providing a new cloud-based platform and a working environment. We have initially focused on the seismology community, supplying advanced interfaces to support the Rapid Ground Motion Assessment (RA) application. It requires rapid data analyses, handling multiple data formats, multiple data sources, and availability of computing and storage resources on demand.
The new interfaces that we are building on DARE provide a fluent path from prototyping to production. Applications are not locked to platforms but can be moved to suitable new platforms without human intervention and with the encoded method’s semantics unchanged. To do so, we exploit different technologies, such as scientific workflows (CWL), stream-based data-flow systems (dispel4py), containers (Docker), infrastructure orchestration (Kubernetes), notebooks (Jupyter), and Cloud platforms. The DARE platform acts as an intermediary between users’ applications and the underlying computing resources, submitting applications and collecting (and storing) their provenance and results.

Speakers

Rosa Filgueira

Data Architect, Research Fellow, University of Edinburgh, EPCC
I’m a Computer Scientist with a background in High Performance and Data-Intensive Computing. I’ve always been in academia, engaging with researchers and domain scientists from different domains, such as geosciences, biomedicine and most recently digital humanities. I’ve designed...



Wednesday September 18, 2019 14:00 - 14:25 BST
4. Aston Webb, Room WG12 Aston Webb Building

14:30 BST

#4A3 - Community and Collaboration - Analytics and Insights about Cultivating the Software Engineering Community at DLR
Software development has increasingly become part of the daily work of many researchers in science and engineering. They are faced with software engineering challenges for which they are not trained. In 2005, the German Aerospace Center (DLR) started the "DLR Software Engineering Initiative" to support its researchers in addressing these challenges. One of the initiative's core elements is to set up and establish an active software engineering community within DLR.

Improving the activities of the DLR software engineering initiative is an ongoing challenge. For this purpose, a good understanding of the software engineering community within DLR is required. We present insights about the DLR software engineering community through an analysis of participation in the annual software engineering knowledge exchange workshops. These workshops can be considered the annual software engineering community event and therefore offer a good starting point for analyzing the community. In our analysis we focus on the return rate of the participants as well as the influence of the workshop's topic, its location, and the participants' origins.
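The abstract does not define "return rate". One plausible reading, assumed here, is the fraction of one workshop's participants who also attend the following one. The sketch below computes that from per-year attendance sets; the participant IDs and years are hypothetical and are not DLR data.

```python
def return_rate(previous, current):
    """Fraction of the previous workshop's participants who came back."""
    previous, current = set(previous), set(current)
    if not previous:
        raise ValueError("previous workshop has no participants")
    return len(previous & current) / len(previous)


# Hypothetical anonymised attendance records, keyed by workshop year.
attendance = {
    2017: {"p01", "p02", "p03", "p04"},
    2018: {"p02", "p03", "p05"},
    2019: {"p02", "p05", "p06", "p07"},
}

years = sorted(attendance)
rates = {
    later: return_rate(attendance[earlier], attendance[later])
    for earlier, later in zip(years, years[1:])
}

for year, rate in rates.items():
    print(f"{year}: {rate:.0%} of the previous year's participants returned")
```

The same attendance sets could be grouped by topic, location or home institute to examine the other factors the abstract mentions.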

Speakers

Carina Haupt

Head of Software Engineering Group, German Aerospace Center (DLR)
My goal is to improve the quality of software development in research. This is what I work for and what I do research for. Whether in the field of software engineering, open source or knowledge management.

Tobias Schlauch

Research Software Engineer, German Aerospace Center (DLR)
I am working at the German Aerospace Center (DLR) as a research software engineer. Since 2009, I serve as the representative of the DLR software engineering initiative which aims to improve research software development at DLR. In addition, I regularly support development teams to...



Wednesday September 18, 2019 14:30 - 14:55 BST
2. Aston Webb C Block Lecture Theatre Aston Webb Building

14:30 BST

#4D3 - Cloud Technologies and Case Studies - How to build and run an international open-data image repository
Image Data Resource (IDR, https://idr.openmicroscopy.org) is a public data repository containing over 100TB of life sciences imaging data from published studies in a searchable and reusable format. It is built from existing open-source tools but significant work was required to deploy and keep it running as a production service. I will talk about the journey from our first ever use of cloud services to scaling up a single-server system into the reliable public resource that exists today, including design choices, the mistakes we made, and the challenges we still face.

I will introduce some of the tools we use including Ansible, OpenStack, and Docker, but with a focus on the benefits of reproducible deployments rather than going into too much technical detail of particular tools. All of our infrastructure is open source and I will explain why, and provide links for people interested in finding out more.

This talk will hopefully provide an insight into how a public data service like this could be set up by your institution, including many of the considerations you may not have thought of.

Speakers

Simon Li

Software and Operations Engineer, Open Microscopy Environment
I'm interested in taking advantage of modern cloud technologies to build infrastructure for publishing and analysing open data, supporting the big push towards open science. I'm also interested in what makes a successful interdisciplinary collaboration.



Wednesday September 18, 2019 14:30 - 14:55 BST
4. Aston Webb, Room WG12 Aston Webb Building

14:30 BST

#4C2 - Building a high performance network for HPC in the Cloud
Amazon’s first engineering design principle is easily described as “start with the customer and work backwards”. That typically means we like to think hard about the problems our customers are trying to solve and come up with creative ideas that solve the problem, rather than starting with some cool gadget or tech and trying to find a use for it. The latest manifestation of this is the Elastic Fabric Adapter (EFA), which looks very different from “the usual approach” but delivers some serious application performance for HPC codes on AWS.

Speakers

Brendan Bouffler

Principal Product Manager for HPC, Amazon
Brendan Bouffler has 25 years of experience in the global tech industry creating large-scale systems for HPC environments, beginning in the 90’s when he helped co-found a US-based dot-com start-up to apply extreme computing to streaming media for broadcast video environments. The...


Wednesday September 18, 2019 14:30 - 15:00 BST
5. Nuffield Building, Room G17 Nuffield Building

15:30 BST

#5A2 - Teaching - Where do research software engineers come from? A new minor programme in CS at Lancaster University
As part of the Institute of Coding, the School of Computing and Communications at Lancaster University is creating a new minor programme. A year-long series of credit-bearing modules takes non-Computer Science students through computational thinking, learning to program in JavaScript and Python, and the history of computing, and teaches skills in applied areas of data analytics, information visualisation, virtual worlds, and physical computing. The programme is capped by a 5-week group project for students to showcase their new skills with projects relevant to their fields. Interdisciplinary and group communication is at the heart of the programme, with lab space being redesigned to help people collaborate more effectively.
But once students have finished this programme, which takes up a third of their first year, what are they going to do with their skills? Are these students, developing experts in their disciplines but with Computer Science skills, the RSEs of the future? Can their home programmes deal with their skills? This talk will discuss the programme's design, intended to give practical, applied skills and widen participation from disciplines across the university, and explore the implications for RSEs' identity in the future.

Speakers

Paul Dempster

Lecturer, Institute of Coding, Lancaster University
My primary interests are programming education and data analytics in education and gaming. As part of my Institute of Coding role, I am designing, implementing, and evaluating innovative Computer Science courses for non-CS major students which allow me to explore some of the questions...



Wednesday September 18, 2019 15:30 - 15:55 BST
4. Aston Webb, Room WG12 Aston Webb Building

15:30 BST

#5C1 - Tools and Case Studies - Autograd: A cool software trick for avoiding maths
With the ever-growing popularity of deep learning, much attention has been focused on tools to improve researcher efficiency. Backpropagation, the crucial step in training neural networks, is nothing but the chain rule with memoization. Autograd automates the computation of such gradients of a function. This enables the user to quickly explore the model space, by saving them from either having to code up exact analytical derivatives or making do with finite differences. This works whenever one can express a function using common differentiable Python, NumPy and SciPy primitives (those with .deriv).
This talk will describe the algorithm behind automatic differentiation and illustrate key steps with examples. Brief guidance will follow on how to make use of Autograd within your own projects. I will finish by reviewing some of the more recent advances and the exciting use cases to which these methods are being applied!
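To make "the chain rule with memoization" concrete, here is a minimal pure-Python sketch of reverse-mode automatic differentiation. It is not Autograd's implementation: the `Var` class, the `sin` helper and the `backward` function are hypothetical names invented for this illustration. Each operation records its inputs together with the local derivative, and a single reverse pass accumulates gradients.

```python
import math


class Var:
    """A scalar that remembers how it was computed."""

    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents  # tuple of (parent Var, local derivative)
        self.grad = 0.0

    def __add__(self, other):
        other = other if isinstance(other, Var) else Var(other)
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        other = other if isinstance(other, Var) else Var(other)
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))


def sin(x):
    """Sine of a Var; the local derivative is cos(x)."""
    return Var(math.sin(x.value), ((x, math.cos(x.value)),))


def backward(output):
    """Accumulate d(output)/d(node) into node.grad for every node."""
    order, seen = [], set()

    def visit(node):  # depth-first topological sort of the graph
        if id(node) in seen:
            return
        seen.add(id(node))
        for parent, _ in node.parents:
            visit(parent)
        order.append(node)

    visit(output)
    output.grad = 1.0
    # Visit each node once, outputs before inputs, so node.grad is
    # complete before it is pushed on to the node's parents.
    for node in reversed(order):
        for parent, local in node.parents:
            parent.grad += node.grad * local


# f(x, y) = x * y + sin(x), so df/dx = y + cos(x) and df/dy = x.
x, y = Var(2.0), Var(3.0)
f = x * y + sin(x)
backward(f)
print(f.value, x.grad, y.grad)
```

Because every node's gradient is stored and reused rather than recomputed along each path, the cost of the reverse pass is proportional to the size of the computation graph.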

Speakers

Pashmina Cameron

Sr RSE Lead, Microsoft Research Cambridge
Deep learning, Machine Learning, Computer vision, C++, Python



Wednesday September 18, 2019 15:30 - 15:55 BST
1. Bramall - Elgar Concert Hall Aston Webb Building

15:30 BST

#5D1 - HPC - Evaluating containerised genomics pipeline on HPC
Genomics pipelines are series of bioinformatics steps and tools/algorithms used to analyse genome data. A pipeline may run successfully in the environment in which it was created, but fail on other platforms due to differences in execution environment, which is where containers have an advantage. Genomic analyses are incredibly complex and often involve comparison of new data against multiple large-scale external datasets. High performance computing (HPC) helps to solve complex compute problems quickly and at scale. This session covers the evaluation of a containerised genomics pipeline on HPC. A performance benchmark is created by comparing the containerised pipeline against the bare-metal pipeline. Reproducibility is essential for the verification and advancement of scientific research. Containers can distribute an entire computing environment and hence support portability. This session is a showcase of work we have conducted at the Hartree Centre using only open source technologies to solve a real-world problem of vital importance, namely genomics workflows. The purpose of this case study is to evaluate different container technologies and assess their feasibility and performance on HPC.

Speakers

Aiman Shaikh

Research Software Engineer, STFC



Wednesday September 18, 2019 15:30 - 15:55 BST
5. Nuffield Building, Room G17 Nuffield Building

16:00 BST

#5A3 - Teaching - RSE Summit @ Microsoft - Cloudy in Brussels
In March, Microsoft (MS) hosted a summit for EMEA Research Software Engineers in Brussels. This talk will give an overview of the discussions, a feature of which was how MS sees RSEs as a focal point for the adoption of cloud computing for research in Higher Education. Often sitting at the interface between researchers and IT, RSEs have knowledge of both domains, work with both and teach.
Identifying and addressing knowledge gaps is an area of interest for Microsoft and following a wide-ranging discussion on training in digital skills, several themes developed which the talk will discuss: Data; Authentication; Funding; Managing Budgets; Deployment; Culture.
A recurring theme was also to identify deliverables to harness the enthusiasm of the summit. These included the report this talk summarises and a white paper on data in the cloud, but the key deliverable was the Research Software Reactor (RSR). The first workshop takes place at the end of May 2019. Using MS resources: BluePrint, DevOps and Learn, focussed initially on Azure, RSR aims to develop Proof of Concept open source deployment and learning materials for research driven workloads (https://github.com/research-software-reactor/guidelines). The talk will update on progress!

Speakers

Gerard Gorman

Imperial College London
Azure Cloud; HPC; modelling and data inversion; DSLs and code generation; education in computational science and engineering.

James Grant

Research Software Engineer, University of Bath

Brad Tipp

Director Higher Education Research, Microsoft Corp
I look after the Global Academic Research community as part of our Higher Education team at Microsoft Corp in Seattle. My aim is to make it easier for this community to get their research done faster and better by making our services work better for this community and by helping to...



Wednesday September 18, 2019 16:00 - 16:25 BST
4. Aston Webb, Room WG12 Aston Webb Building

16:00 BST

#5C2 - Tools and Case Studies - Air Quality & Python: Developing Online Analysis Tools
Poor surface air quality has a range of implications for human health and the economy. Analysing and interpreting the incoming data streams from air quality measurement stations is critical for tackling the problem and for developing early warning systems. I am using Python to develop a set of online analysis tools (ukatmos.org) to enable the public to quickly and easily plot air quality data in many ways, effectively freeing up information that is already publicly available but is held in awkward formats and often requires writing code to access. We anticipate these tools will also support data science classes at school, and can speed up scientific research by minimizing effort in repeating analyses.

The tools integrate numerous Python libraries (e.g. Pandas and NumPy), the Django web framework, the Plot.ly tools for creating interactive graphs, and SQL to handle the large data volumes. Developing these Python tools in an adaptive and scalable way allows them to grow as more data become available, e.g. satellite observations. Adaptability also includes evolving user requirements. This talk will follow the processes I went through developing these tools and show a working example of the project so far.
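As an illustration of the kind of analysis such tools automate, the sketch below aggregates hourly station readings into daily means. It deliberately uses only the Python standard library so that it runs anywhere; the readings are made-up values, and the code is not taken from the ukatmos.org tools, which per the abstract are built on Pandas and NumPy.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Hypothetical hourly pollutant readings: (ISO timestamp, concentration).
readings = [
    ("2019-09-17T22:00", 40.0),
    ("2019-09-17T23:00", 44.0),
    ("2019-09-18T00:00", 30.0),
    ("2019-09-18T01:00", 34.0),
    ("2019-09-18T02:00", 38.0),
]


def daily_means(rows):
    """Group readings by calendar day and average each group."""
    by_day = defaultdict(list)
    for timestamp, value in rows:
        by_day[datetime.fromisoformat(timestamp).date()].append(value)
    return {day.isoformat(): mean(values) for day, values in sorted(by_day.items())}


result = daily_means(readings)
for day, value in result.items():
    print(day, value)
```

With Pandas, the same grouping would typically be a time-indexed `resample` followed by `mean`, which is one reason a data-frame library suits this kind of workload.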

Speakers

Douglas Finch

Postdoc Research Associate, University of Edinburgh



Wednesday September 18, 2019 16:00 - 16:25 BST
1. Bramall - Elgar Concert Hall Aston Webb Building

16:00 BST

#5D2 - HPC - Pursuing and supporting reproducible workflows for all with Cylc
Cylc is a workflow engine, allowing anyone to run a custom schedule of inter-dependent tasks. Its general-purpose open-source nature has led to a technically diverse user base, from scientists & other RSEs to HPC analysts and operational forecasting staff, distributed across research centres worldwide. Development has been driven by the requirements of users; it has been an ongoing challenge to cater for the unique automation needs emerging from such varied contexts, for example building a system that scales both down, to transcend mere cron jobs, and up, to complex workflows of thousands of tasks. Hosting Cylc publicly on GitHub has been vital, enabling input, including code contributions, from all factions, and centralising communication between our international team working in opposite time zones. We invest much of our time providing training and support in the effective use of Cylc, and notably at the Met Office our users have spun up their own community group for effective and maintainable Cylc workflow design, adding the balancing act of directing, but not dictating, best practice. Hear us share our wisdom from the pursuit of reproducible workflows for all.
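The core idea of scheduling inter-dependent tasks can be sketched with Python's standard-library `graphlib` module (Python 3.9+). This is not Cylc's API or configuration format; the task names and dependencies are hypothetical and only show how a dependency graph determines a valid run order.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each task maps to the set of tasks it depends on.
dependencies = {
    "fetch_obs": set(),
    "run_model": {"fetch_obs"},
    "post_process": {"run_model"},
    "archive": {"run_model"},
    "publish": {"post_process", "archive"},
}

# static_order() yields every task after all of its dependencies.
order = list(TopologicalSorter(dependencies).static_order())
print(order)
```

A real workflow engine adds much more on top of this ordering, such as repeating cycles, job submission to batch systems, retries and monitoring.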

Speakers

Sadie Bartholomew

Scientific Software Engineer, Met Office



Wednesday September 18, 2019 16:00 - 16:25 BST
5. Nuffield Building, Room G17 Nuffield Building

16:30 BST

#5A4 - Testing - R Packages on ARM
Suppose you had a large community of R users. What proportion of them could get up and running straight away on a new architecture?

The University of Bristol has several hundred R users and two ARM architecture aarch64 based supercomputers - how will they fare? This talk investigates using continuous integration to provide quantitative answers.
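The abstract does not describe how the continuous integration results were summarised. As a minimal sketch of one way to turn per-package build outcomes into a single quantitative answer, the code below computes the fraction of packages that installed successfully; the package names and outcomes are placeholders, not results from the talk.

```python
def install_success_rate(results):
    """Fraction of tested packages that built and loaded on the target.

    `results` maps a package name to True (success) or False (failure).
    """
    if not results:
        raise ValueError("no packages were tested")
    return sum(results.values()) / len(results)


# Placeholder CI outcomes for an aarch64 build job (not real data).
ci_results = {
    "pkg_a": True,
    "pkg_b": True,
    "pkg_c": True,
    "pkg_d": False,
}

rate = install_success_rate(ci_results)
failed = sorted(name for name, ok in ci_results.items() if not ok)
print(f"{rate:.0%} of packages installed; failures: {failed}")
```

Weighting each package by how many users depend on it would answer the user-focused question in the abstract more directly than a plain package count.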

(This talk is essentially the same as the one given in HPC Champions on Monday afternoon)

Speakers

Chris Edsall

Research Software Engineer, University of Bristol
Physicist by training, spent a long time in environmental research institutes looking after high performance computing and doing what we now call RSE.



Wednesday September 18, 2019 16:30 - 16:55 BST
4. Aston Webb, Room WG12 Aston Webb Building

16:30 BST

#5C3 - Tools and Case Studies - From Paper to Tech - an RSE Project on Patient Safety in Leicester Hospitals
As many as 1 in 10 patients are harmed while receiving hospital care in high income countries; indeed, 1.8 million incidents were reported to NHS England in 2016-2017. Taking inspiration from aviation safety procedures, academics at the University of Leicester have been trying to track and monitor patient safety, both as a learning tool for medical students and to improve the patient experience. However, the researchers’ initial implementation was based on a manual process in which both data-gathering and data-analysis were carried out by hand. In this talk I will share how I converted the paper forms into a mobile app (for data gathering) and an accompanying web app for data analysis. I will discuss the frameworks I used to do this, Ionic and Django, and the lessons I learnt along the way. I will also discuss how the academics and I worked together, and how this benefited the project and ensured a successful outcome. I hope my experience can encourage and help other RSEs who may be facing a similar project in the future.

No prior technical knowledge is required.

Speakers

Teri Forey

Research Software Engineer, University of Leicester



Wednesday September 18, 2019 16:30 - 16:55 BST
1. Bramall - Elgar Concert Hall Aston Webb Building

16:30 BST

#5D3 - HPC - HPC Project Assessment - Stepping into the Unknown
High Performance Computing is a bit of a paradox. On one side you have leading-edge technology, cutting frontiers in using huge quantities of data... and on the other a workhorse workflow powered by FORTRAN with very few people around to progress the code. Somewhere in the middle is you, being asked to tackle projects from how to take clearer images of a black hole to making sure that the Physics Department’s core research doesn’t grind to a halt. How do you navigate between the known, solid projects and the unknown? This talk will take you through the dividing lines between known and unknown in HPC projects, discuss the characteristics which define each, and go through a handy set of tips on how to assess projects and see if they are a match for your (or your team’s) skill sets. Along the way we will share what we have learned in commercial HPC and on the open source projects we run, and drop some serious public cloud knowledge.

Speakers

Cristin Merritt

Sales & Marketing, Alces Flight Limited
I've been working in the field of change management for Tech for over 15 years with current work in cloud adoption for HPC. My focus is gathering together practical tools, tips, and tricks for those out there looking to manage change - and pair them with the open source, services...



Wednesday September 18, 2019 16:30 - 16:55 BST
5. Nuffield Building, Room G17 Nuffield Building
 