Are researchers' texts and personal data already being reused by Elsevier to develop commercial AI reviewing products?
European copyright does not allow Text Mining on scientific articles
At present, many subscription journals require a complete copyright transfer once the author manuscript is accepted, either by contract or through acceptance of general terms and conditions.
In practice, transferring copyright to the publisher means that Text Mining (TM) is only possible where a Fair Use exception to copyright applies, as in the USA and the UK, and then only for non-commercial purposes. The rest of Europe has no such provision: even researchers working in universities that already pay for site licenses covering metadata, references and attached abstracts, or the full texts of articles, have to pay for additional licenses to perform TM.
The Plan S position shows that the current copyright situation of scientific articles may not be satisfactory for further research, including TM and Artificial Intelligence (AI) development. Plan S states that the researcher-author of an article should retain copyright and apply a CC-BY license, to allow easier re-use of the article, as stated in the Berlin Declaration. The EU copyright reform includes debates on introducing a kind of Fair Use exception for research purposes.
Plan S and the copyright reform clearly treat TM as a major issue for research. Some people and groups even propose allowing TM for commercial purposes, for the benefit of innovation. But to my knowledge, legacy publishers do not support the introduction of Fair Use.
Copyright in researchers' non-article texts
Under the copyright laws of every EU country, copyright belongs to the person who writes a text. Researcher-authors therefore retain copyright in their e-mails, comments to editors, texts entered in forms, author manuscripts before reviewing (pre-prints) and after reviewing (post-prints), and so on, even when a submission platform owned by a publisher was used (for example Evise, owned by Elsevier).
Where Fair Use exists, TM is only allowed for non-commercial purposes, so publishers, including Elsevier, may not legally text mine researchers' copyrighted non-article texts to develop commercial and patented AI reviewing products.
Highlight statements and manuscripts used for AI research
According to the article “If AI can fix peer review in science, AI can do anything”, a two-year postdoc at UCL in the UK, paid by Elsevier, apparently developed a product based on TM of author manuscripts, with a view to an AI reviewing system:
“reading each paper and identifying its key concepts, organizing key words by type, and identifying relationships between different key phrases. And it's not just an academic exercise: Mrs Isabelle Augenstein is on a two-year contract with Elsevier, one of the world's largest publishers of scientific research, to develop computational tools for their massive library of manuscripts”.
The conference proceedings article “A Supervised Approach to Extractive Summarisation of Scientific Papers” by Isabelle Augenstein clearly states that researcher-authors' non-article texts and author manuscripts were used to perform TM:
“The dataset is created by exploiting an existing resource, ScienceDirect, where many journals require authors to submit highlight statements along with their manuscripts. Using such highlight statements as gold statements has been proven a good gold standard for news documents”.
1. Was this really academic research, so that the Fair Use exception for non-commercial research could be considered relevant?
2. The proceedings article states at the end, in its Acknowledgments section: 'This work was partly supported by Elsevier'. Should this possible conflict of interest have been declared in a Conflict of Interest paragraph rather than under Acknowledgments?
3. Given that all researcher-authors retain copyright in their highlight statements and manuscripts, and that the research may have a commercial interest, were they all asked for re-use permission?
4. Is Elsevier already performing TM on researcher-authors' texts and personal data to develop a commercial, possibly patented, AI reviewing product?
5. What are universities' responsibilities in allowing or forbidding AI research sponsored by publishers for potentially patented commercial reviewing products?
6. Can EU researcher-authors ask for all their e-mails, comments, research highlights, author manuscripts, personal data, etc. to be erased from the Evise platform and from Elsevier's business, as the GDPR allows?
7. How can researchers' copyright be reinforced? Should Plan S develop a statement about copyrighted material belonging to researchers when they use a publisher's proprietary commercial submission platform?
8. Is Elsevier already taking for granted that the EU copyright reform will enable TM for commercial purposes, despite opposing the introduction of a Fair Use exception in that reform?
9. Could allowing TM for commercial purposes in the EU lead to massive reuse of researchers' personal data, including in the form of personal non-article texts, without their consent, and thereby infringe competition and personal data protection laws?
I tweeted at Elsevier twice and e-mailed the principal author of the proceedings article to get more details on the case. So far (13.1.2018) I have received no answer.
I welcome your feedback, comments and corrections to this text. Thank you in advance.