Ca cocotte! Publication Scientifique - a blog that clarifies the issues at stake in academic production

Thursday, 13 December 2018

Automatic reviewing tool

Are texts and personal data of researchers reused by Elsevier to develop commercial AI reviewing products?

[Publication 13.12.2018]
CC0-Graphicalbrain 


European copyright does not allow Text Mining on scientific articles

At present, many subscription journals require a complete copyright transfer once the author manuscript is accepted, either by contract or through acceptance of general terms and conditions.

In practice, copyright transfer to the publisher means that Text Mining (TM) is only possible where a Fair Use exception to copyright applies, such as in the USA and the UK, and then only for non-commercial purposes. The rest of Europe has no such provision: even researchers at universities that already pay site licences for metadata, references and attached abstracts, or full texts of articles, have to pay for additional licences to perform TM.

The Plan S position shows that the current copyright situation of scientific articles may not be satisfactory for further research, including TM and Artificial Intelligence (AI) development. Plan S states that the researcher-author of an article should retain copyright and apply a CC-BY licence, to allow easier re-use of the article, as stated in the Berlin Declaration. The EU copyright reform includes debates on introducing a kind of Fair Use exception for research purposes.

Plan S and the copyright reform clearly treat TM as a major issue for research. Some people and groups would even allow TM for commercial purposes, for the benefit of innovation. But to my knowledge, legacy publishers do not support the introduction of Fair Use.


Copyright assignment on researchers' non-article texts

Under the copyright law of any EU country, copyright belongs to the person who writes a text. As such, researcher-authors retain copyright on e-mails, comments to editors, texts entered in forms, author manuscripts before reviewing (pre-prints) and after reviewing (post-prints), etc., even when a submission platform owned by a publisher was used (for example Evise, owned by Elsevier).


Where Fair Use exists, TM is only allowed for non-commercial purposes, so publishers, including Elsevier, may not legally text mine researchers' copyrighted non-article texts to develop commercial and patented AI reviewing products.

Highlights statements and manuscripts used for AI research

According to the article 'If AI can fix peer review in science, AI can do anything', a two-year postdoc at UCL in the UK, paid by Elsevier, apparently developed a product based on TM of author manuscripts, with a view to an AI reviewing system:
“reading each paper and identifying its key concepts, organizing key words by type, and identifying relationships between different key phrases. And it's not just an academic exercise: Mrs Isabelle Augenstein is on a two-year contract with Elsevier, one of the world's largest publishers of scientific research, to develop computational tools for their massive library of manuscripts”.

The conference proceedings article 'A Supervised Approach to Extractive Summarisation of Scientific Papers' by Isabelle Augenstein clearly states that researcher-authors' non-article texts and author article manuscripts were used to perform TM:
“The dataset is created by exploiting an existing resource, ScienceDirect, where many journals require authors to submit highlight statements along with their manuscripts. Using such highlight statements as gold statements has been proven a good gold standard for news documents”.

My questions


1. Was this really academic research, such that the Fair Use exception for non-commercial research was considered applicable?

2. The proceedings article states at the end, in the acknowledgments section: 'This work was partly supported by Elsevier'. Should this not have been declared as a possible conflict of interest in a Conflict of Interest paragraph, rather than in the Acknowledgments?

3. Given that all researcher-authors retain copyright on their highlight statements and manuscripts, and that the research may have a potential commercial interest, were they all asked for permission to re-use them?

4. Is Elsevier already performing TM on researcher-authors' texts and personal data to develop a commercial, potentially patented AI reviewing product?

5. What are universities' responsibilities in allowing or forbidding AI research sponsored by publishers for potentially patented commercial reviewing products?

6. Can EU researcher-authors ask for all their e-mails, comments, research highlights, author manuscripts, personal data, etc. to be erased from the Evise platform and from Elsevier's business, as the GDPR allows?

7. How can researchers' copyright be reinforced? Should Plan S develop a statement about copyrighted material belonging to researchers when they use a proprietary commercial publisher's submission platform?

8. Is Elsevier already taking for granted that the EU will enable TM for commercial purposes in the copyright reform, despite opposing the introduction of a Fair Use exception in that reform?

9. Could allowing TM for commercial purposes in the EU lead to massive reuse of researchers' personal data, including in the form of personal non-article texts, without their consent, and infringe competition and personal data protection laws?

I tweeted twice at Elsevier and e-mailed the principal author of the proceedings article to get more details on the case, so far without success (13.12.2018): no answer.

https://twitter.com/SylvieVullioud/status/1072131320770367493
https://twitter.com/SylvieVullioud/status/1069905273211506688

I welcome your feedback, comments and corrections to the text. Thank you in advance.

2 comments:

  1. I think this is definitely academic research. That a postdoc is funded by Elsevier and undertaking research that could be of use to Elsevier doesn't to me suggest that the research isn't academic. After all, its results are published in scholarly journals and are part of the ongoing conversation in the field. The results are not kept hidden for only Elsevier to exploit.

    I don't know the answers to your technical/legal questions, so I won't comment on those.

    I will say that developing tools to help aid the process of peer review (such as suggesting potential reviewers) strikes me as a good idea. Having an AI reviewer replace peers, on the other hand -- especially on the grounds that the AI would not be biased, while humans are -- strikes me as both naive and horrific.

  2. Thank you for your comment, J. Britt Holbrook

    'the results are not kept hidden for only Elsevier to exploit'.
    To my knowledge, the raw and analysed data that could be re-used by others were not published. The raw data cannot be published, since the copyright of author manuscripts and of researchers' highlight statements belongs to the researchers. Could the analysed data be published for reuse by others?

    'Having an AI reviewer replace peers'
    I think this may not be the first practical application of this research. Publishers may introduce AI to help handle submission flows and to match or select human reviewers.

    Technical questions
    This is the most important point. EU academics (except in the UK) are denied Text Mining on references/abstracts/full texts because publishers hold the copyright. Researchers have to ask Elsevier for permission, and pay extra fees, to text mine.
    But Elsevier may have used the full texts of author manuscripts and/or their highlight statements, whose copyright belongs to researchers, without each individual's consent.

    It is possible that a consent procedure was followed but not reported in the article, but this is impossible to know, since I have not received any answer from Elsevier or the article's authors. Before Christmas, for sure, everybody is very busy.
