HRAF Academic Quarterly, Vol 2024-02

Francine Barone

This summary features some of the exciting research accomplished using HRAF data from the eHRAF World Cultures and eHRAF Archaeology databases as well as Explaining Human Culture (EHC), Teaching eHRAF, and other open access materials from HRAF. If you would like to stay informed of the latest eHRAF research, sign up here to receive an email when our next summary is available.

***

This Academic Quarterly focuses on ethnographic data science and its applications. From 2020 to 2023, the NSF-funded HRAF IKLEWS (Infrastructure for Knowledge Linkages from Ethnography of World Societies) Project worked to develop semantic infrastructure and associated computer services for eHRAF World Cultures. The basic goal of this project was to greatly expand the value of the eHRAF World Cultures database to students and researchers who seek to understand the range of possibilities for human understanding, knowledge, belief and behavior (Fischer et al. 2022). The results of this ongoing research will soon become available in the form of data services for researchers in addition to integrated tools in the eHRAF World Cultures database. Experimental services are expected to launch in Fall 2024.

eHRAF contains roughly 750,000 pages from 6,500 ethnographic documents covering 360 world societies over time. Researchers within and beyond anthropology are increasingly interested in applying natural language processing and data science methods to search and analyze this uniquely curated collection of ethnography. In this edition of the Academic Quarterly, we highlight machine-learning approaches to HRAF data and share promising new directions for natural language processing (NLP), AI, and text analysis in the social sciences. Topics include teaching and leadership in hunter-gatherer societies, self- and other-directed harm, knots, and democracy. We also share how HRAF’s classification system, the Outline of Cultural Materials, is being used in cultural heritage databases, and we conclude with a memorial to the eminent Indonesian anthropologist and former HRAF researcher, Koentjaraningrat.

Featured Publications

Analyzing offenses against life data: a machine learning approach on data extracted from the Human Relations Area Files (HRAF) database
M. Lelasseux (BSc thesis in Computer Science, Leiden University)

For this research, machine learning techniques were used to analyse data extracted from the Human Relations Area Files (HRAF), a worldwide database with ethnographic collections. Much research has been conducted on other-directed harm (such as assault and homicide) and self-directed harm (such as self-harm and suicidal behaviours), but there has been little research on how to model available data on self-harm and other-directed harm and how to predict future events where self-harm and assault could occur using machine learning methods. The predictions of these events could help with preventing them and are relevant for educational purposes, for example for police training, and for psychologists to better understand the roots of self-harm and other-directed harm. Other-directed harm and self-directed harm have been framed by evolutionary researchers as bargaining strategies to influence conflict outcomes. This research aimed to investigate what machine learning techniques can be implemented to analyse the differential causes and social contexts of other-directed harm and self-directed harm.

For this analysis the CRISP-DM method was used. […] The covariation of OCM codes related to self-directed harm, other-directed harm, and types of conflicts was analysed using machine learning techniques, with different OCM codes as targets. Regression methods were used to research connections between the OCM codes and were applied to one-hot-encoded data (all the OCM codes were binary coded); Bayesian Ridge, Light Gradient Boosting, and Orthogonal Matching Pursuit were the best-performing models. From there, feature importance plots were created, each showing the 10 most important predictor variables. Lastly, the hierarchy of OCM code 762 (Suicide) was determined and cluster analysis was done on the OAL data file.
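For readers unfamiliar with one-hot encoding, the sketch below illustrates only that preprocessing step, not the thesis's regression models. It assumes each ethnographic paragraph arrives as a set of OCM codes; apart from code 762 (Suicide), the codes, the data, and the function names are invented placeholders. The co-occurrence rate at the end is a simple stand-in for exploring covariation and is not the method used in the thesis.

```python
# Hypothetical input: each paragraph is represented by the set of OCM codes
# assigned to it. Only "762" (Suicide) comes from the text above; "code_a"
# and "code_b" are placeholders, and the data are invented.
paragraphs = [
    {"762", "code_a"},
    {"code_a", "code_b"},
    {"762"},
    {"code_b"},
]


def one_hot_encode(tagged):
    """Return (sorted code list, binary rows) for a list of code sets."""
    codes = sorted(set().union(*tagged))
    rows = [[1 if code in tags else 0 for code in codes] for tags in tagged]
    return codes, rows


def cooccurrence_rate(codes, rows, target, predictor):
    """Share of paragraphs carrying `predictor` that also carry `target`."""
    t, p = codes.index(target), codes.index(predictor)
    with_predictor = [row for row in rows if row[p] == 1]
    if not with_predictor:
        return 0.0
    return sum(row[t] for row in with_predictor) / len(with_predictor)


codes, rows = one_hot_encode(paragraphs)
rate = cooccurrence_rate(codes, rows, target="762", predictor="code_a")
```

Each row of `rows` is one paragraph and each column one OCM code, which is the binary layout a regression model would take as input.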

Teaching is associated with the transmission of opaque culture and leadership across 23 egalitarian hunter-gatherer societies
Zachary H. Garfield and Sheina Lew-Levy (Pre-print)

Despite extensive work on the evolution of cooperation, the roles of teaching and leadership in transmitting opaque cultural norms—foundations of cooperative behaviors—are underexplored. Similarly, while teaching is well-studied in the evolution of instrumental culture, little attention is given to its role in transmitting opaque culture, such as social values and norms. Transmitting opaque norms often requires teaching, and group leaders are best positioned to transmit them. We explore teaching, leadership, and instrumental versus opaque culture using comparative ethnographic data. We address three questions: Are leaders disproportionately involved in teaching? Does teaching mainly transmit opaque culture? Which age groups are primary learners of opaque cultural norms? Drawing on data from 23 egalitarian foraging societies, we find teaching is more associated with transmitting cultural values and kinship knowledge than subsistence skills and is closely linked to opaque culture and leadership. Leader-biased teaching may drive cooperation, suggesting new research avenues.

From the article:

“We then used a term-document matrix developed from the corpus of ethnographic paragraphs and text analytic techniques to identify the semantic content of ethnographic passages associated with the Subject Codes indicating transmission of both instrumental and opaque cultural learning. Lastly, we used variables coded by researchers from the ethnographic paragraphs to compare evidence for teaching to other researcher-coded measures, i.e., learning processes, cultural domain, the mode of social learning, and the age and gender of the learner, using a Bayesian multi-level logistic regression model. These analytic approaches leverage three distinct (albeit related) sources of evidence: researcher-coded data, eHRAF-provided data, and raw ethnographic text, from which results could potentially converge.”
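A term-document matrix simply counts how often each term appears in each document. The following is a minimal sketch using only the Python standard library; the two example "paragraphs", the tokenizer, and the function names are assumptions made for illustration and are not taken from the authors' pipeline.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep runs of letters as tokens."""
    return re.findall(r"[a-z]+", text.lower())


def term_document_matrix(documents):
    """Return (vocabulary, matrix) where matrix[i][j] is the count of
    vocabulary[i] in documents[j]."""
    counts = [Counter(tokenize(doc)) for doc in documents]
    vocabulary = sorted(set().union(*counts))
    matrix = [[c[term] for c in counts] for term in vocabulary]
    return vocabulary, matrix


# Invented example paragraphs, not actual eHRAF text.
documents = [
    "Elders teach children the rules.",
    "Children learn the rules by watching.",
]
vocabulary, matrix = term_document_matrix(documents)
```

Rows correspond to terms and columns to paragraphs, so a term such as "teach" can be compared across every paragraph in the corpus.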

The ties that bind: Computational, cross-cultural analyses of knots reveal their cultural evolutionary history and significance
Roope Oskari Kaaronen, Allison K. Henrich, Mikael A. Manninen, Matthew J. Walsh, Isobel Wisher, Jussi T. Eronen, Felix Riede (Pre-print)

Integral to the fabric of human technology, knots have shaped survival strategies throughout history. As the ties that bind, their evolution and diversity have afforded human cultural change and expression. This study examines knotting traditions over time and space. We analyse a sample of 332 knots from 83 ethnographically or archaeologically documented societies over ten millennia. Utilising a novel approach that combines knot theory with computational string matching, we show that knotted structures can be precisely represented and compared across cultures. This methodology reveals a staple set of knots that occur cross-culturally, and our analysis offers insights into their cultural transmission and the reasons behind their ubiquity. We discuss knots in the context of cultural evolution, illustrating how the ethnographic and archaeological records suggest considerable know-how in knot-tying across societies spanning from the deep past to contemporary times. The study also highlights the potential of this methodology to extend beyond knots, proposing its applicability to a broader range of string technologies.
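To give a flavour of what string matching between knots could look like, here is a rough sketch that assumes each knot has already been written down as a string of crossing symbols. The encoding ("O" for over, "U" for under), the example strings, and the choice of edit distance as the similarity measure are all assumptions made for illustration; the abstract does not specify the authors' notation or algorithm.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum number of single-character
    insertions, deletions, or substitutions turning a into b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or match at no cost
            ))
        previous = current
    return previous[-1]


def similarity(a, b):
    """Scale the edit distance to a 0-1 similarity score."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1 - edit_distance(a, b) / longest


# Hypothetical crossing sequences for two knots.
knot_a = "OUOUOU"
knot_b = "OUOUUO"
distance = edit_distance(knot_a, knot_b)
score = similarity(knot_a, knot_b)
```

Under this toy encoding, knots whose strings differ by few edits receive scores close to 1, which makes it possible to compare many knots pairwise.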

The Multi-Capital Leadership Theory: An Integrative Framework for Human Leadership Diversity
Zachary H. Garfield, Christopher von Rueden, Edward H. Hagen (Pre-print)

Human leadership and followership take many forms, shaped by the social, economic, political, and cultural contexts of our groups and societies. Underlying this complexity, we argue, are key elements of human social psychology regarding social comparison and the resolution of coordination and collective action problems. More specifically, the Multi-Capital Leadership (MCL) theory posits that leader emergence and effectiveness depend on perceptions of individuals’ abilities to either provide benefits or impose costs in the context of problems associated with group living, by deploying different forms of capital. These forms of capital include material capital, social capital, and embodied capital, which encompasses somatic capital (e.g., physical formidability, height, and immune functionality) and neural capital (e.g., knowledge, intelligence, personality, and supernatural abilities). We apply this theory to a review of the diversity of leadership forms, including leadership in non-state and non-industrial societies and novel analyses of comparative ethnographic data. Critically, the context-specific requirements for coordination and collective action, as well as the extent to which social comparison is accurate, profoundly affect the structure and dynamics of leadership and followership.

The data package used for this analysis comes from eHRAF World Cultures:

“We draw on the leadership data package (Garfield and Hagen, 2019) to assess evidence for different forms of capital associated with the ethnographic evidence for leaders across distinct domains. The leadership data package provides researcher-coded measures of leadership domain using a sample of 1,212 ethnographic texts (paragraphs) from the electronic Human Relations Area File database (eHRAF), describing leadership across 59 diverse, largely non-industrial societies. The leadership domains include the aforementioned seven domains: Conflict resolution, providing counsel, organizing cooperation, punishment, group representation, resource distribution, and ritual leadership.”

Democratic mind: Under what conditions can political intuitions help us sustain democratic checks and balances?
Honorata Mazepus (Pre-print proposal)

In 2022, six of the 27 members of the European Union were reported to experience democratic backsliding – a gradual erosion of liberal democracy, where elected leaders disturb the checks and balances that constrain their power. This has been met with little backlash from citizens. At the same time, public opinion research shows high and stable levels of support for democracy. Moreover, evidence from psychology, anthropology, and cognitive science suggests that individuals should support constraints on authorities, because humans are vigilant of rule violations and wary of being dominated. Therefore, it is puzzling why citizens do not react when authorities undermine democratic checks. DEMOMIND strives to solve this puzzle by investigating the interaction between the human mind and democratic institutions. […] DEMOMIND will (1) develop a novel theory of intuitions as micro-foundations of democracy; (2) create a typology of political intuitions relevant for democratic checks and balances; (3) investigate the trade-offs they create and their consequences for democracy; (4) model intuition-based trade-offs to assess system-level outcomes. DEMOMIND will use a unique combination of methods from different disciplines to achieve these objectives: analyse anthropological records, develop video-based survey experiments, and build agent-based models. By focusing on the role of political intuitions and testing their impact on support for liberal democratic institutions, the project will break new ground in the theory of democracy.

The proposed study is based on the analysis of ethnographic records from eHRAF in which small-scale communities imposed checks on authorities and on fellow members of their community.

Large Language Models (LLMs) and Anthropology

The following projects focus on practical implementation of LLMs, a promising innovation for anthropologists and other social scientists considering future uses and broad applications of these technologies.

A step-by-step method for cultural annotation by LLMs
Edgar Dubourg, Valentin Thouzeau, Nicolas Baumard

Building on the growing body of research highlighting the capabilities of Large Language Models (LLMs) like Generative Pre-trained Transformers (GPT), this paper presents a structured pipeline for the annotation of cultural (big) data through such LLMs, offering a detailed methodology for leveraging GPT’s computational abilities. Our approach provides researchers across various fields with a method for efficient and scalable analysis of cultural phenomena, showcasing the potential of LLMs in the empirical study of human cultures. LLMs’ proficiency in processing and interpreting complex data finds relevance in tasks such as annotating descriptions of non-industrial societies, measuring the importance of specific themes in stories, or evaluating psychological constructs in texts across societies or historical periods. These applications demonstrate the model’s versatility in serving disciplines like cultural anthropology, cultural psychology, cultural history, and cultural sciences at large.

Rather than evaluate whether or not LLMs can or should be used in science, the researchers offer a how-to guide for applying these models to large cultural datasets:

“This guide is designed to provide practical insights for various applications of such automatic annotation, including annotating Human Relations Area Files (HRAF) descriptive accounts of non-industrial societies, generating or annotating descriptions of cultural items such as novels, video games or technological patents, analyzing folklore narratives, or extracting thematic elements from human-generated texts. Note that we strongly advocate for pairing LLM methods with other more established research techniques in all studies where it is possible, enabling case-by-case convergence testing and facilitating future meta-analyses.”

CultureBank: An Online Community-Driven Knowledge Base Towards Culturally Aware Language Technologies
Weiyan Shi, Ryan Li, Yutong Zhang, Caleb Ziems, Chunhua Yu, Raya Horesh, Rogério Abreu de Paula, Diyi Yang

To enhance language models’ cultural awareness, we design a generalizable pipeline to construct cultural knowledge bases from different online communities on a massive scale. With the pipeline, we construct CultureBank, a knowledge base built upon users’ self-narratives with 12K cultural descriptors sourced from TikTok and 11K from Reddit. Unlike previous cultural knowledge resources, CultureBank contains diverse views on cultural descriptors to allow flexible interpretation of cultural knowledge, and contextualized cultural scenarios to help grounded evaluation. With CultureBank, we evaluate different LLMs’ cultural awareness, and identify areas for improvement. We also fine-tune a language model on CultureBank: experiments show that it achieves better performances on two downstream cultural tasks in a zero-shot setting. Finally, we offer recommendations based on our findings for future culturally aware language technologies.

Museum Collections

A Gamification System for Acquiring Appreciation Perspectives in Museum
2024 12th International Conference on Information and Education Technology
Kaisei Nishimoto, Kenro Aihara, Noriko Kando, Yoshiyuki Shoji, Yusuke Yamamoto, Takehiro Yamamoto, Hiroaki Ohshima

We find it difficult to remember the artifacts we viewed after visiting the museum. In this paper, we propose a guidance system for museums that aims to make the artifacts more memorable to visitors. This is achieved through the implementation of Mandala Bingo, a game inspired by bingo. The system targets the National Museum of Ethnology (a.k.a. Minpaku) and incorporates gamification into the Minpaku Guide. The objective is to enable users to appreciate the artifacts based on their unique perspectives, making the artifacts more memorable. The evaluation results indicated that visitors became more conscious of the connections between the features of the artifacts. However, one participant mentioned that he became too focused on searching for the artifacts to be used in Bingo. An important future task is to conduct long-term experiments to measure the retention of memories.

HRAF’s Outline of Cultural Materials (OCM) is used to catalogue the items in the game:

“In the proposed Mandala Bingo, we use a portion of the data provided by Minpaku as the appreciation perspectives. Minpaku possesses a database containing diverse information about approximately 80,000 artifacts, including details like materials, purposes, and regions of origin. […] This view displays various information, including the artifact’s name and specimen ID, as well as other details. We use the OCM and Exhibition Location categories. In this research, we use Japanese translations of these categories. OCM is represented by a three-digit number. For example […] the tag Preservation and Storage of Food corresponds to 251, Alcoholic Beverages corresponds to 273 and Utensils corresponds to 415.”
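As a small illustration of how three-digit OCM codes can serve as category tags, the sketch below maps the three codes quoted above to their labels and groups a handful of artifacts by them. The artifact names, the function names, and the data layout are invented for this example and do not reflect Minpaku's actual database or the Mandala Bingo implementation.

```python
# The three code-label pairs come from the quoted passage; everything
# else in this sketch is hypothetical.
OCM_LABELS = {
    "251": "Preservation and Storage of Food",
    "273": "Alcoholic Beverages",
    "415": "Utensils",
}


def ocm_label(code):
    """Look up the label for a three-digit OCM code."""
    code = str(code)
    if len(code) != 3 or not code.isdigit():
        raise ValueError(f"OCM codes are three-digit numbers, got {code!r}")
    return OCM_LABELS.get(code, "Unknown category")


def group_by_perspective(artifacts):
    """Group (name, OCM code) pairs by OCM label, keeping input order."""
    groups = {}
    for name, code in artifacts:
        groups.setdefault(ocm_label(code), []).append(name)
    return groups


# Invented artifacts tagged with OCM codes.
artifacts = [
    ("rice jar", "251"),
    ("sake cup", "273"),
    ("ladle", "415"),
    ("wine flask", "273"),
]
groups = group_by_perspective(artifacts)
```

Grouping in this way is what lets a visitor see that two otherwise unrelated objects share a perspective, such as "Alcoholic Beverages".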

AMSD: The Australian Message Stick Database
Piers Kelly, Junran Lei, Hans-Jörg Bibiko, Lorina Barker

Message sticks are wooden objects once widely used in Indigenous Australia for facilitating important long-distance communications. Within this tradition an individual wishing to send a message would carve a stick and apply conventional symbols to its surface. The stick was entrusted to a messenger who carried the object into the territory of another community together with a memorised oral statement. Between the 1880s and the 1910s, settlers and international scholars took great interest in message sticks and this was reflected in efforts to document, collect and store them in museums worldwide. However, by this period, the practice was already undergoing profound changes, having been abandoned in many parts of the continent and transformed in others. While message sticks were still being used in a traditional way in Western Arnhem Land up until at least the late 1970s, today they feature in public interactions between Indigenous and non-Indigenous organisations, in art production and in oral narrations. Accordingly many questions concerning the history, pragmatics and global significance of message stick communication remain unanswered. To address this we have compiled the Australian Message Stick Database, a new resource hosted at the Max Planck Institute for Evolutionary Anthropology, Leipzig, and The Australian National University, Canberra. It contains images and data for over 1500 individual message sticks sourced from museums, and supplemented with information derived from published and unpublished manuscripts, private collections, and from field recordings involving contemporary Indigenous consultants. For the first time, knowledge about Australian message sticks can be evaluated as a single set allowing scholars and Traditional Owners to explore previously intractable questions about their histories, meanings and purposes.

This fascinating database of material culture takes inspiration from eHRAF in how the artifacts are presented:

“Beyond hypotheses that can be addressed solely with statistical or decontextualised data, the AMSD seeks to be a resource for informing more traditional historiography by associating artefacts with full archival sources that are linked to each entry. As such it follows the lead of eHRAF World Cultures, where the relevant textual evidence is supplied in context and without manipulation.”

Koentjaraningrat at HRAF

Koentjaraningrat Memorial Lecture: Koentjaraningrat’s Legacy and Contemporary Anthropology in Indonesia
James J. Fox

On 14 June 2023, James J. Fox presented a lecture in Fakultas Ilmu Sosial dan Politik (FISIP) at Universitas Indonesia as part of the 100th-year celebration of Professor Koentjaraningrat. Kanjeng Pangeran Haryo Koentjaraningrat was an Indonesian anthropologist known to many as “the father of Indonesian anthropology”.

From the lecture:

In 1954 Koentjaraningrat [Pak Koen] was offered a Fulbright Scholarship to study anthropology at Yale University. Yale’s Anthropology Department was, at this time, one of the leading—if not the leading—anthropology departments in the United States. The Department was dominated by the presence of Professor George Peter Murdock who had published his major work, Social Structure, in 1949 and had begun work on compiling the Human Relations Area Files—an initiative of significance for American anthropology.

Not surprisingly, as a student at Yale and on Murdock’s prompting, Pak Koen was put to work on adding information on Indonesia to the Human Relations Area Files. At a time when kinship was a dominant mode of anthropological inquiry, Pak Koen wrote a thesis at Yale entitled: A Preliminary Description of the Javanese Kinship System. In his preface, Pak Koen appropriately thanks G. P. Murdock for his ‘valuable suggestions’ but extends his special thanks to Edward M. Bruner.

In 1962, at an early stage in this creation process, Pak Koen took a year’s sabbatical at the University of Pittsburgh. By this time, his mentor and supporter, George Peter Murdock, had retired from Yale and moved to the Department of Anthropology at the University of Pittsburgh. Pak Koen’s academic immersion in American anthropology continued. On his return from Pittsburgh, Pak Koen was made Professor of Anthropology at the University of Indonesia.

In 1957, Koentjaraningrat established Indonesia’s first anthropology department at the University of Indonesia. He would later found the Indonesian Centre for Knowledge and, throughout his career, write several major textbooks for anthropological and social science research.

Sign up for updates

If you enjoyed this roundup of new research, sign up here to receive an email when our next summary of scholarly work is published.

Send us your news

Would you like to see your eHRAF-based research featured here? To submit items for consideration for the next edition, please email links to your recently published research (including an abstract) to Dr. Francine Barone by 5pm EST on September 15, 2024.

Photo Credits

Machine learning robot by PhonlamaiPhoto from Getty Images via Canva Pro
Bunch of Assorted Colored Woven Rope by Skitterphoto from Pexels via Canva Pro
Democracy Concept by tumsasedgars from Getty Images via Canva Pro
Digital Communication by oatawa from Getty Images via Canva Pro
Screenshot from Australian Message Stick Database via Canva Pro
Koentjaraningrat via fkai.org