A community-powered search of machine learning strategy space to find NMR property prediction models.
The rise of machine learning (ML) has created an explosion in the potential strategies for using data to make scientific predictions. For physical scientists wishing to apply ML strategies to a particular domain, it can be difficult to assess in advance what strategy to adopt within a vast space of...
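Main authors: | Lars A Bratholm, Will Gerrard, Brandon Anderson, Shaojie Bai, Sunghwan Choi, Lam Dang, Pavel Hanchar, Addison Howard, Sanghoon Kim, Zico Kolter, Risi Kondor, Mordechai Kornbluth, Youhan Lee, Youngsoo Lee, Jonathan P Mailoa, Thanh Tu Nguyen, Milos Popovic, Goran Rakocevic, Walter Reade, Wonho Song, Luka Stojanovic, Erik H Thiede, Nebojsa Tijanic, Andres Torrubia, Devin Willmott, Craig P Butts, David R Glowacki |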
---|---|
Format: | article |
Language: | EN |
Published: | Public Library of Science (PLoS), 2021 |
Subjects: | Medicine; R; Science; Q |
Online access: | https://doaj.org/article/7bd641a416ff403b90f3c0a0e9726773 |
id |
oai:doaj.org-article:7bd641a416ff403b90f3c0a0e9726773 |
---|---|
record_format |
dspace |
spelling |
oai:doaj.org-article:7bd641a416ff403b90f3c0a0e9726773; 2021-12-02T20:06:48Z; ISSN 1932-6203; DOI 10.1371/journal.pone.0253612; https://doaj.org/article/7bd641a416ff403b90f3c0a0e9726773; 2021-01-01T00:00:00Z; https://doi.org/10.1371/journal.pone.0253612; https://doaj.org/toc/1932-6203; PLoS ONE, Vol 16, Iss 7, p e0253612 (2021) |
institution |
DOAJ |
collection |
DOAJ |
language |
EN |
topic |
Medicine; R; Science; Q |
description |
The rise of machine learning (ML) has created an explosion in the potential strategies for using data to make scientific predictions. For physical scientists wishing to apply ML strategies to a particular domain, it can be difficult to assess in advance what strategy to adopt within a vast space of possibilities. Here we outline the results of an online community-powered effort to swarm search the space of ML strategies and develop algorithms for predicting atomic-pairwise nuclear magnetic resonance (NMR) properties in molecules. Using an open-source dataset, we worked with Kaggle to design and host a 3-month competition which received 47,800 ML model predictions from 2,700 teams in 84 countries. Within 3 weeks, the Kaggle community produced models with comparable accuracy to our best previously published 'in-house' efforts. A meta-ensemble model constructed as a linear combination of the top predictions has a prediction accuracy which exceeds that of any individual model, 7-19x better than our previous state-of-the-art. The results highlight the potential of transformer architectures for predicting quantum mechanical (QM) molecular properties. |
format |
article |
author |
Lars A Bratholm, Will Gerrard, Brandon Anderson, Shaojie Bai, Sunghwan Choi, Lam Dang, Pavel Hanchar, Addison Howard, Sanghoon Kim, Zico Kolter, Risi Kondor, Mordechai Kornbluth, Youhan Lee, Youngsoo Lee, Jonathan P Mailoa, Thanh Tu Nguyen, Milos Popovic, Goran Rakocevic, Walter Reade, Wonho Song, Luka Stojanovic, Erik H Thiede, Nebojsa Tijanic, Andres Torrubia, Devin Willmott, Craig P Butts, David R Glowacki |
author_sort |
Lars A Bratholm |
title |
A community-powered search of machine learning strategy space to find NMR property prediction models. |
publisher |
Public Library of Science (PLoS) |
publishDate |
2021 |
url |
https://doaj.org/article/7bd641a416ff403b90f3c0a0e9726773 |
_version_ |
1718375367177142272 |