Abstract (english) | This is the complete dataset on Croatian publications from 2017 indexed in the Web of Science Core Collection (WoSCC) citation indexes (SCI-EXP, SSCI, AHCI, and ESCI) and their OA status, together with data on the Croatian journals indexed in WoSCC, as analysed in the publication: Macan, B., Škorić, L. & Petrak, J. (2020) David among Goliaths: Open access publishing in scientific (semi-)periphery. Learned Publishing, 33 (4), 410-417. doi:10.1002/leap.1320. This case study analyses data on papers by Croatian authors published in 2017 and indexed in the four WoSCC citation indexes (SCI-EXP, SSCI, AHCI, and ESCI). The primary dataset (5,176 articles and reviews) was divided into two subsets, the open access (OA) subset (2,964 papers) and the non-OA subset (2,212 papers). We also used the primary dataset to create a subset of papers published in Croatian journals (1,588) as opposed to foreign ones. All papers were screened for full-text OA status, journal JCR quartile ranking, journal dominant discipline, and language of publication. OA papers prevailed with 57.3%. Most were available at publisher websites. The percentage of OA papers in Croatian journals was 99.8%. The share of OA papers was the highest in the humanities and social sciences, which also saw the highest share of papers in the Croatian language. |
Methods (english) | The primary sources of research data were the Clarivate Analytics WoSCC citation indexes: Science Citation Index Expanded (SCI-EXP), Social Sciences Citation Index (SSCI), Arts & Humanities Citation Index (AHCI), and Emerging Sources Citation Index (ESCI). The initial search of the WoSCC indexes was done in May 2019 using the search query "Croatia OR Hrvatska" in the "Basic Search" address field, combined with 2017 in the Year Published field. The WoSCC address field covers all of the authors’ affiliation addresses.
The results were then filtered by document type to include only "article" and "review" papers. What remained was our primary dataset of 5,176 papers, which we used as the basis for all further analysis. Clarivate Analytics InCites was used for the analysis of disciplinary distribution and of the quartile distribution (Q) of papers according to the journals’ impact factors (JIF). Hrčak and the journals’ homepages were used to identify the Croatian journals, and DOAJ was used for an additional check of their OA status.
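The query and filtering steps described above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure actually run in the WoS interface: it assumes the records have been exported and loaded as one dictionary per paper, keyed by the WoS field tags PY (year published), DT (document type) and C1 (author addresses). The function names and the sample records are hypothetical.

```python
# Hypothetical record layout: one dict per exported WoS record, keyed by
# field tags (PY = year published, DT = document type, C1 = author addresses).
KEEP_TYPES = {"article", "review"}
COUNTRY_TERMS = ("croatia", "hrvatska")


def matches_query(record, year=2017):
    """Mimic the search: 'Croatia OR Hrvatska' in the address, plus the year."""
    address = record.get("C1", "").lower()
    has_country = any(term in address for term in COUNTRY_TERMS)
    return has_country and str(record.get("PY", "")).strip() == str(year)


def keep_document_type(record):
    """Keep only records whose document type is 'article' or 'review'."""
    return record.get("DT", "").strip().lower() in KEEP_TYPES


def build_primary_dataset(records, year=2017):
    """Apply the address/year query first, then the document-type filter."""
    return [r for r in records
            if matches_query(r, year) and keep_document_type(r)]


# Invented sample records, for illustration only.
sample_records = [
    {"PY": "2017", "DT": "Article", "C1": "Univ Zagreb, Zagreb, Croatia"},
    {"PY": "2017", "DT": "Review", "C1": "Sveuciliste u Splitu, Hrvatska"},
    {"PY": "2017", "DT": "Editorial Material", "C1": "Univ Rijeka, Croatia"},
    {"PY": "2016", "DT": "Article", "C1": "Univ Osijek, Croatia"},
    {"PY": "2017", "DT": "Article", "C1": "Univ Ljubljana, Slovenia"},
]

primary = build_primary_dataset(sample_records)
print(len(primary))
```

Matching on the full address string mirrors the fact that the address field covers every author affiliation, so a paper qualifies if any one of its addresses mentions Croatia.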
Our primary dataset was divided into three subsets according to the goals of our analysis:
I. OA subset comprised 2,964 papers. It combined two subsets:
- a subset of 2,574 papers obtained by refining the primary dataset by the WoS Open Access filter (all OA types according to the WoS typology were included) and
- a subset of 390 papers published in Croatian journals which were not marked in WoS as OA papers, but were identified manually by journal name search in Hrčak and DOAJ.
II. Non-OA subset included the remaining 2,212 papers.
- randomized sample: a random sample was created in order to check whether any "hidden" OA papers remained within the non-OA subset. The sample size was based on an expected OA proportion of 20%, a 95% confidence interval, and a type 1 error of 5%. The OA status of the 212 papers thus obtained was checked manually using the article title as a Google Scholar search string. All retrieved papers were inspected to find the respective OA full text.
III. Croatian journals subset comprised 1,588 papers published in Croatian journals. |
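As a quick consistency check, the subset counts given above can be recombined, and the sampling step can be sketched as a simple random draw without replacement. The record does not state how the 212 papers were actually selected, so the draw below is an assumption; the integer identifiers, the seed and the function name are placeholders.

```python
import random

# Counts as stated in the description above.
PRIMARY = 5176         # articles and reviews in the primary dataset
WOS_FLAGGED_OA = 2574  # found with the WoS Open Access filter
MANUAL_OA = 390        # Croatian-journal papers identified via Hrčak and DOAJ
SAMPLE_SIZE = 212      # papers checked manually in Google Scholar

oa_total = WOS_FLAGGED_OA + MANUAL_OA          # size of the OA subset
non_oa_total = PRIMARY - oa_total              # size of the non-OA subset
oa_share = round(100 * oa_total / PRIMARY, 1)  # OA share in percent


def draw_sample(paper_ids, k, seed=2017):
    """Assumed method: simple random sample without replacement.

    A fixed seed makes the draw reproducible; the seed value is arbitrary.
    """
    rng = random.Random(seed)
    return rng.sample(paper_ids, k)


# Placeholder identifiers standing in for the non-OA records.
non_oa_ids = list(range(1, non_oa_total + 1))
sample = draw_sample(non_oa_ids, SAMPLE_SIZE)

print(oa_total, non_oa_total, oa_share, len(sample))
```

Fixing the seed is a design choice that lets the same 212 records be re-drawn later, which is useful if the manual OA check has to be repeated or audited.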