Data Scientist – Dimensions
Digital Science | Europe (remote)
With Dimensions, Digital Science launched an innovative research data and tool infrastructure, broadening the view of the research landscape after decades of focus on the publication/citation complex. The guiding principle, to deliver context, was to take different datasets out of their silos and create a heavily interlinked overarching dataset that describes the whole research lifecycle: from funding input (grants), through research outputs (publications) and the translation and application of research results (clinical trials, patents), to attention (altmetrics and citations) and finally to policy-level impact (mentions of research results in policy papers).
In total, Dimensions today contains more than 128 million documents with more than 4 billion connections between these records. For more information please visit https://dimensions.ai or try the free version of the Dimensions app at https://app.dimensions.ai.
This new position will strengthen the Dimensions Data Science team in its work to continuously broaden and improve the Dimensions data infrastructure, working with an international group of peers.
About Digital Science
Digital Science is a technology company working to make research more efficient. It invests in, nurtures and supports innovative businesses and technologies that make all parts of the research process more open and effective. The portfolio includes admired brands such as Altmetric, Dimensions, Figshare, ReadCube, Symplectic, IFI Claims, GRID, Overleaf, Labguru, BioRAFT, Peerwith, TetraScience and Transcriptic.
Responsibilities
- Develop Python code to perform data quality analyses and data aggregations throughout the whole Dimensions data warehouse
- Analyze data from multiple sources and map them to a unified data model
- Develop Python code to perform big data analytics
- Develop algorithms & processes to highlight and manage data issues
- Prototype and test new tools for analysis
- Create data visualisations
- Work together with an NLP team and other data scientists to generate more value
Skills & Requirements
- Strong hands-on experience with scientific bibliographic databases (e.g. PubMed)
- Hands-on experience in a Python environment
- Experience with PostgreSQL and Solr
- Experience with research inputs and outputs (grants, publications, patents, etc.)
- Background in data modelling/mapping
- Experience with source control management (Git preferred)
- Comfortable working on the Unix command line
- Understanding of agile methodologies
- Must be a self-learner, possessing inherent inquisitiveness
- Exceptional problem-solving and analytical skills
- Strong interpersonal, communication, and organizational skills
Please apply by sending a CV and cover letter via the link provided.