Document Type: Original Article

Authors

1 Professor, Knowledge and Information Science, Department of Knowledge and Information Science, School of Educational Sciences and Psychology, Shahid Chamran University of Ahvaz, Ahvaz, Iran

2 Associate Professor, Knowledge and Information Science, Department of Knowledge and Information Science, School of Social Sciences, Yazd University, Yazd, Iran

3 PhD Student, Knowledge and Information Science, Department of Knowledge and Information Science, School of Educational Sciences and Psychology, Shahid Chamran University of Ahvaz, Ahvaz, Iran

Abstract

Introduction: One problem in scientometric thematic analysis is choosing which of the bibliographic fields that carry subject content should be analyzed. This study compared the subject fields of documents to determine which field, or combination of fields, is suitable for a complete and proper thematic analysis in scientometrics.

Methods: This was a descriptive study with a content analysis approach. Scientific publications in the field of functional gastrointestinal disorders were extracted from the Scopus database. The analysis covered 13,798 documents and three fields: title, author keywords, and index keywords. After clustering with the K-Means method and calculating the inclusion index for the resulting clusters, the similarity of keywords between the three fields was determined.

Results: There was a high similarity between the index keywords and the author keywords (87.71 and 85.71). The low value of the inclusion index between the title field and the index keywords (0) suggested that there was little similarity between the controlled vocabulary and the words used by authors in titles, and that authors did not use the preferred vocabulary in titles.

Conclusion: Using the words of the title field yields results that reflect natural language. However, if the purpose of a study is to categorize terms, the index keywords field is the most appropriate.
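The abstract does not state the formula used for the inclusion index, so the following is only a minimal sketch under an assumption: it uses the set-based form common in co-word analysis, the number of shared terms divided by the size of the smaller term set, expressed as a percentage. The function name and the toy term lists are hypothetical and are not taken from the study's data.

```python
def inclusion_index(terms_a, terms_b):
    """Percentage of the smaller term set that also occurs in the other set.

    Assumed definition (not confirmed by the abstract):
        100 * |A ∩ B| / min(|A|, |B|)
    Terms are compared case-insensitively after trimming whitespace.
    """
    a = {t.strip().lower() for t in terms_a}
    b = {t.strip().lower() for t in terms_b}
    if not a or not b:
        # No terms on one side means nothing can be shared.
        return 0.0
    return 100.0 * len(a & b) / min(len(a), len(b))


# Hypothetical toy term sets for one cluster, one per bibliographic field.
author_keywords = ["irritable bowel syndrome", "constipation",
                   "dyspepsia", "quality of life"]
index_keywords = ["Irritable Bowel Syndrome", "Constipation",
                  "Dyspepsia", "Human", "Adult"]
title_terms = ["ibs", "bloating", "randomized trial"]

print(inclusion_index(author_keywords, index_keywords))  # 75.0
print(inclusion_index(title_terms, index_keywords))      # 0.0
```

In this toy case, three of the four author keywords also appear among the index keywords, while none of the title terms do, which mirrors the kind of contrast reported in the Results.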

Keywords
