This dataset supports a study that investigates forms of address in 17th-century Dutch newspapers, extending previous research that focused primarily on letter corpora. The dataset comprises approximately 6,000 instances of address forms extracted from the Couranten Corpus (2022), a recently published collection of early modern Dutch newspapers. These forms were identified across a range of textual contexts, including national and international news items, quoted letters, direct speech (both spontaneous and scripted), and genre-specific newspaper discourse. Notably, articles originating from regions corresponding to the present-day United Kingdom and Switzerland yielded a relatively high frequency of address forms. Each instance was annotated for sociolinguistic variables such as context type and the social status of the addressee, which was predominantly high across the corpus. The data were analysed using a mixed-methods approach, combining descriptive statistics with machine learning techniques, including random forest classification and conditional inference trees.
The data were used for the paper "‘Ick en kan U.E. niet na-laten te adverteren…’: forms of address in 17th-century Dutch newspapers" by Machteld de Vos and Maria den Hartog, published in Nederlandse Taalkunde in 2025 (https://doi.org/10.5117/NEDTAA2025.3.003.DEVO).