Date: 27 May 2025 @ 14:45 - 16:00

Timezone: Amsterdam

In this session, participants will learn a technique for extracting and parsing text from PDF documents stored in a reference manager library. The use case for this session is extracting text from the introduction sections of these articles; the extracted text can then optionally be used to generate a summary or parsed further, depending on the needs of the participant (e.g. for use in ASReview). Participants will also learn how to create PDFs from the extracted and parsed text and re-integrate them with reference manager software via the commonly used RIS reference file format.
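
As a rough illustration of the workflow described above, the following minimal sketch assumes the PDF text has already been extracted to a plain string. It is not the session's own script: the function names, the list of section headings, and the choice of RIS tags (N1 for the extracted text, L1 for the attached file) are all assumptions made for this example. Creating the PDF itself is left out because it requires a third-party library.

```python
import re

# Hypothetical heading patterns; real articles vary and may need more cases.
INTRO_HEADING = re.compile(
    r"^\s*(?:\d+\.?\s+)?(introduction|background)\s*$", re.IGNORECASE
)
NEXT_HEADING = re.compile(
    r"^\s*(?:\d+\.?\s+)?(methods?|materials and methods|methodology|results|related work)\s*$",
    re.IGNORECASE,
)


def extract_introduction(text):
    """Return the text between an introduction heading and the next known heading."""
    lines = text.splitlines()
    start = None
    for i, line in enumerate(lines):
        if start is None:
            if INTRO_HEADING.match(line):
                start = i + 1  # the introduction body begins after the heading
        elif NEXT_HEADING.match(line):
            return "\n".join(lines[start:i]).strip()
    if start is None:
        return ""  # no introduction heading found
    return "\n".join(lines[start:]).strip()


def to_ris_record(title, intro_text, pdf_path):
    """Build one RIS record; each line is a two-letter tag, two spaces, '- ', value."""
    flat = " ".join(intro_text.split())  # keep the note on a single line
    fields = [
        ("TY", "JOUR"),    # reference type: journal article (assumed)
        ("TI", title),
        ("N1", flat),      # notes field used here for the extracted text (assumed)
        ("L1", pdf_path),  # file attachment (assumed path to the generated PDF)
        ("ER", ""),        # end of record
    ]
    return "\n".join(f"{tag}  - {value}" for tag, value in fields) + "\n"
```

The heading match is deliberately simple: it only recognises a heading that stands alone on its line, so documents with unusual layouts would need additional patterns.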

All necessary software will be provided in a virtual disk image accessible on gROW. A companion guide to this procedure and a repository of related scripts will also be made available on the RU Library GitHub page.

Keywords: DCC, Literature search, Reference managers, Text Mining, All faculties

Venue: 1.05B Central Library Instruction Room

