Journal ArticleDOI

Quantitative analysis of large amounts of journalistic texts using topic modelling

TL;DR: A case study of New York Times coverage of nuclear technology from 1945 to the present shows that LDA is a useful tool for analysing trends and patterns in news content in large digital news archives relatively quickly.
Abstract
The huge collections of news content which have become available through digital technologies both enable and warrant scientific inquiry, challenging journalism scholars to analyse unprecedented amounts of texts. We propose Latent Dirichlet Allocation (LDA) topic modelling as a tool to face this challenge. LDA is a cutting-edge technique for content analysis, designed to automatically organize large archives of documents based on latent topics, measured as patterns of word (co-)occurrence. We explain how this technique works, how different choices by the researcher affect the results, and how the results can be meaningfully interpreted. To demonstrate its usefulness for journalism research, we conducted a case study of New York Times coverage of nuclear technology from 1945 to the present, partially replicating a study by Gamson and Modigliani. This shows that LDA is a useful tool for analysing trends and patterns in news content in large digital news archives relatively quickly.


Citations
Journal ArticleDOI

Applying LDA Topic Modeling in Communication Research: Toward a Valid and Reliable Methodology

TL;DR: The overall goal is to make LDA topic modeling more accessible to communication researchers and to ensure compliance with disciplinary standards by developing a brief hands-on user guide for applying LDA topic modeling.
Journal ArticleDOI

Smart literature review: a practical topic modelling approach to exploratory literature review

TL;DR: The aim of the paper is to enable the use of topic modelling for researchers by presenting a step-by-step framework on a case and sharing a code template, which enables huge amounts of papers to be reviewed in a transparent, reliable, faster, and reproducible way.
Journal ArticleDOI

Comparing Apples to Apple: The Effects of Stemmers on Topic Models

TL;DR: Despite their frequent use in topic modeling, it is found that stemmers produce no meaningful improvement in likelihood and coherence and in fact can degrade topic stability.
Journal ArticleDOI

Artificial intelligence in marketing: Topic modeling, scientometric analysis, and research agenda

TL;DR: The scientometric analyses reveal key concepts, keyword co-occurrences, authorship networks, top research themes, and landmark publications, tracing the evolution of the research field over time in terms of its dominant topics, diversity, and dynamics.
Journal ArticleDOI

When Communication Meets Computation: Opportunities, Challenges, and Pitfalls in Computational Communication Science

TL;DR: This special issue discusses the validity of using big data in communication science and showcases a number of new methods and applications in the fields of text and network analysis.
References
Proceedings Article

Latent Dirichlet Allocation

TL;DR: This paper proposed a generative model for text and other collections of discrete data that generalizes or improves on several previous models including naive Bayes/unigram, mixture of unigrams, and Hofmann's aspect model, also known as probabilistic latent semantic indexing (pLSI).
Journal ArticleDOI

Framing: Toward Clarification of a Fractured Paradigm

TL;DR: Reaching this goal would require a more self-conscious determination by communication scholars to plumb other fields and feed back their studies to outside researchers, and enhance the theoretical rigor of communication scholarship proper.
Journal ArticleDOI

Media Discourse and Public Opinion on Nuclear Power: A Constructionist Approach

TL;DR: This article explored the relationship between media discourse and public opinion by analyzing the discourse on nuclear power in four general audience media: television news coverage, newsmagazine accounts, editorial cartoons, and syndicated opinion columns.
Proceedings Article

Generating Typed Dependency Parses from Phrase Structure Parses

TL;DR: A system for extracting typed dependency parses of English sentences from phrase structure parses that captures inherent relations occurring in corpus texts that can be critical in real-world applications is described.
Proceedings ArticleDOI

Dynamic topic models

TL;DR: A family of probabilistic time series models is developed to analyze the time evolution of topics in large document collections, and dynamic topic models provide a qualitative window into the contents of a large document collection.