I spent months working on domain-specific search engines and knowledge discovery apps for biomedicine and eventually figured that synthesizing "insights" or building knowledge graphs by machine-reading the academic literature (papers) is *barely useful*: https://t.co/eciOg30Odc
— Markus Strasser (@mkstra) December 7, 2021
From the linked paper itself:
Contextual, Tacit Knowledge is not Digital, not Encoded or just not Machine-Interpretable yet
Most knowledge necessary to make scientific progress is not online and not encoded. All tools on top of that bias the exploration towards encoded knowledge only (drunkard's search), which is vanishingly small compared to what scientists need to embody to make progress.
Expert and crowd-curated world knowledge graphs (e.g. ConceptNet) can partially ground a machine learning model with some context and common sense, but that is light years away from an expert’s ability to understand implicature and weigh studies and claims appropriately.
ML systems are fine for pattern matching, recommendations and generating variants, but in the end great meta-research, including literature reviews, comes down to formulating an incisive research question, selecting the right studies and defining adequate evaluation criteria (metrics).
Conclusion: No matter how large the training corpus or how many parameters the model has, large language models are NOT going to be able to crank out new scientific discoveries. The necessary foundations aren't on the web. Sorry, no AGI.