Wikipedia is the world's largest online encyclopaedia. It has 303 active language editions, which were accessed from 1.7bn unique devices during October 2020. Now over twenty years old, the encyclopaedia has been studied by academics working within a range of disciplines since the mid-2000s, although it has only relatively recently begun to attract the attention of translation scholars as well. Within a short space of time we have learnt a considerable amount about topics such as translation quality, translation and cultural remembrance, multilingual knowledge production and point of view, the prominent role played by narratives in articles reporting on news stories, and how translation is portrayed in multiple language versions of the Wikipedia article on the term itself. However, translation largely remains Wikipedia's "dark matter": not only is it difficult to locate, but researchers have so far struggled to map out the full extent of its contribution to this multilingual resource. Our aim in organising this international event is to allow the research community to take stock of the progress made so far and to identify new avenues for future work.
Unlike other Wikipedia research that focuses on big data analytics, research on the “dark matter” of Wikipedia attends to the distinctive features and evolution of one or a few articles across their interlingual versions. This means that the scraping method should not overlook any fragment or detail of an article, while keeping the text clean and readable for further processing. In this workshop, several approaches to scraping Wikipedia articles will be introduced for a wide variety of research scenarios, most of them supported by official documentation. With the help of the interfaces and parsers provided by Wikipedia and other developers, users can control exactly which Wikipedia content they retrieve, such as tables, quotations and illustrations. The basics of programming and data science will also be introduced in this workshop. Working on Google's Colab platform with Python's rich libraries, participants will get a feel for scraping Wikipedia without installing any additional software on their own computers.
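The abstract does not name the specific tools used in the workshop, so the following is only an illustrative sketch of the kind of retrieval it describes: it assumes the MediaWiki Action API (api.php) and the mwparserfromhell parser, and the article title "Translation" and the language codes are placeholder examples rather than material from the workshop itself.

# Illustrative sketch (assumed tooling): fetch an article's raw wikitext
# through the MediaWiki Action API, parse it with mwparserfromhell, and
# list the article's counterparts in other language editions.
# Setup (e.g. in a Colab cell): pip install requests mwparserfromhell
import requests
import mwparserfromhell

API = "https://{lang}.wikipedia.org/w/api.php"

def fetch_wikitext(title, lang="en"):
    """Return the raw wikitext of one article in one language edition."""
    params = {
        "action": "parse",
        "page": title,
        "prop": "wikitext",
        "format": "json",
        "formatversion": "2",
    }
    resp = requests.get(API.format(lang=lang), params=params, timeout=30)
    resp.raise_for_status()
    return resp.json()["parse"]["wikitext"]

def interlanguage_titles(title, lang="en"):
    """Map language codes to the article's titles in other Wikipedia editions."""
    params = {
        "action": "query",
        "titles": title,
        "prop": "langlinks",
        "lllimit": "max",
        "format": "json",
        "formatversion": "2",
    }
    resp = requests.get(API.format(lang=lang), params=params, timeout=30)
    resp.raise_for_status()
    page = resp.json()["query"]["pages"][0]
    return {link["lang"]: link["title"] for link in page.get("langlinks", [])}

# "Translation" is a placeholder article chosen only for illustration.
wikitext = fetch_wikitext("Translation", lang="en")
parsed = mwparserfromhell.parse(wikitext)
print(parsed.strip_code()[:500])            # clean, readable plain text
print(len(parsed.filter_templates()))       # templates (infoboxes, quotations) remain accessible
print(interlanguage_titles("Translation"))  # the same article across language editions

On Google Colab, the only setup such a sketch would need is installing mwparserfromhell in a notebook cell, since requests is typically preinstalled; nothing has to be installed on the participant's own computer.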
Full Disclaimer
This video is archived and disseminated for educational purposes only. It is presented here with the permission of the speakers, who have mandated the means of dissemination.
Statements of fact and opinions expressed are those of the individual participants. HKBU and its Library assume no responsibility for the accuracy, validity, or completeness of the information presented.
Any downloading, storage, reproduction, and redistribution, in part or in whole, are strictly prohibited without the prior permission of the respective speakers. Please strictly observe copyright law.