Building And Using Comparable Corpora

Building and Using Comparable Corpora

Author : Serge Sharoff
Publisher : Springer Science & Business Media
ISBN 13 : 3642201288
Total Pages : 335 pages
Book Rating : 4.6/5 (422 downloads)

Released on 2013-12-13. Book excerpt: The 1990s saw a paradigm change in the use of corpus-driven methods in NLP. In the field of multilingual NLP (such as machine translation and terminology mining) this implied the use of parallel corpora. However, parallel resources are relatively scarce: many more texts are produced daily by native speakers of any given language than translated. This situation resulted in a natural drive towards the use of comparable corpora, i.e. non-parallel texts in the same domain or genre. Nevertheless, this research direction has not produced a single authoritative source suitable for researchers and students coming to the field. The proposed volume provides a reference source, identifying the state of the art in the field as well as future trends. The book is intended for specialists and students in natural language processing, machine translation and computer-assisted translation.

Building and Using Comparable Corpora for Multilingual Natural Language Processing

Author : Serge Sharoff
Publisher : Springer Nature
ISBN 13 : 3031313844
Total Pages : 138 pages
Book Rating : 4.0/5 (313 downloads)

Released on 2023-08-23. Book excerpt: This book provides a comprehensive overview of methods to build comparable corpora and of their applications, including machine translation, cross-lingual transfer, and various kinds of multilingual natural language processing. The authors begin with a brief history on the topic followed by a comparison to parallel resources and an explanation of why comparable corpora have become more widely used. In particular, they provide the basis for the multilingual capabilities of pre-trained models, such as BERT or GPT. The book then focuses on building comparable corpora, aligning their sentences to create a database of suitable translations, and using these sentence translations to produce dictionaries and term banks. Then, it is explained how comparable corpora can be used to build machine translation engines and to develop a wide variety of multilingual applications.

43rd Annual Meeting of the Association for Computational Linguistics

Author :
Publisher :
ISBN 13 : 9781932432534
Total Pages : 77 pages
Book Rating : 4.4/5 (325 downloads)

Released in 2005.

Using Comparable Corpora for Under-Resourced Areas of Machine Translation

Author : Inguna Skadiņa
Publisher : Springer
ISBN 13 : 3319990047
Total Pages : 323 pages
Book Rating : 4.3/5 (199 downloads)

Released on 2019-02-06. Book excerpt: This book provides an overview of how comparable corpora can be used to overcome the lack of parallel resources when building machine translation systems for under-resourced languages and domains. It presents a wealth of methods and open tools for building comparable corpora from the Web, evaluating comparability and extracting parallel data that can be used for the machine translation task. It is divided into several sections, each covering a specific task such as building, processing, and using comparable corpora, focusing particularly on under-resourced language pairs and domains. The book is intended for anyone interested in data-driven machine translation for under-resourced languages and domains, especially for developers of machine translation systems, computational linguists and language workers. It offers a valuable resource for specialists and students in natural language processing, machine translation, corpus linguistics and computer-assisted translation, and promotes the broader use of comparable corpora in natural language processing and computational linguistics.

Parallel Corpora for Contrastive and Translation Studies

Author : Irene Doval
Publisher : John Benjamins Publishing Company
ISBN 13 : 9027262845
Total Pages : 313 pages
Book Rating : 4.0/5 (272 downloads)

Released on 2019-03-20. Book excerpt: This volume assesses the state of the art of parallel corpus research as a whole, reporting on advances in both recent developments of parallel corpora – with some particular references to comparable corpora as well – and in ways of exploiting them for a variety of purposes. The first part of the book is devoted to new roles that parallel corpora can and should assume in translation studies and in contrastive linguistics, to the usefulness and usability of parallel corpora, and to advances in parallel corpus alignment, annotation and retrieval. There follows an up-to-date presentation of a number of parallel corpus projects currently being carried out in Europe, some of them multimodal, with certain chapters illustrating case studies developed on the basis of the corpora at hand. In most of these chapters, attention is paid to specific technical issues of corpus building. The third part of the book reflects on specific applications and on the creation of bilingual resources from parallel corpora. This volume will be welcomed by scholars, postgraduate and PhD students in the fields of contrastive linguistics, translation studies, lexicography, language teaching and learning, machine translation, and natural language processing.

Translation-Driven Corpora

Author : Federico Zanettin
Publisher : Routledge
ISBN 13 : 1317639855
Total Pages : 244 pages
Book Rating : 4.3/5 (176 downloads)

Released on 2014-04-08. Book excerpt: Electronic texts and text analysis tools have opened up a wealth of opportunities to higher education and language service providers, but learning to use these resources continues to pose challenges to scholars and professionals alike. Translation-Driven Corpora aims to introduce readers to corpus tools and methods which may be used in translation research and practice. Each chapter focuses on specific aspects of corpus creation and use. An introduction to corpora and overview of applications of corpus linguistics methodologies to translation studies is followed by a discussion of corpus design and acquisition. Different stages and tools involved in corpus compilation and use are outlined, from corpus encoding and annotation to indexing and data retrieval, and the various methods and techniques that allow end users to make sense of corpus data are described. The volume also offers detailed guidelines for the construction and analysis of multilingual corpora. Corpus creation and use are illustrated through practical examples and case studies, with each chapter outlining a set of tasks aimed at guiding researchers, students and translators to practice some of the methods and use some of the resources discussed. These tasks are meant as hands-on activities to be carried out using the materials and links available in an accompanying DVD. Suggested further readings at the end of each chapter are complemented by an extensive bibliography at the end of the volume. Translation-Driven Corpora is designed for use by teachers and students in the classroom or by researchers and professionals for self-learning. It is an invaluable resource for anyone interested in this fast growing area of scholarly and professional activity.

Computational Phraseology

Author : Gloria Corpas Pastor
Publisher : John Benjamins Publishing Company
ISBN 13 : 9027261393
Total Pages : 341 pages
Book Rating : 4.0/5 (272 downloads)

Released on 2020-05-15. Book excerpt: Whether you wish to deliver on a promise, take a walk down memory lane or even on the wild side, phraseological units (also often referred to as phrasemes or multiword expressions) are present in most communicative situations and in all world’s languages. Phraseology, the study of phraseological units, has therefore become a rare unifying theme across linguistic theories. In recent years, an increasing number of studies have been concerned with the computational treatment of multiword expressions: these pertain among others to their automatic identification, extraction or translation, and to the role they play in various Natural Language Processing applications. Computational Phraseology is a comparatively new field where better understanding and more advances are urgently needed. This book aims to address this pressing need, by bringing together contributions focusing on different perspectives of this promising interdisciplinary field.

Human Language Technologies

Author : Inguna Skadiņa
Publisher : IOS Press
ISBN 13 : 1607506408
Total Pages : 264 pages
Book Rating : 4.6/5 (75 downloads)

Released in 2010. Book excerpt: This book contains papers from the Fourth International Conference on Human Language Technologies - the Baltic Perspective (Baltic HLT 2010), held in Riga in October 2010. This conference is the latest in a series which provides a forum for sharing recent advances in human language processing, and promotes cooperation between the computer science and linguistics communities of the Baltic countries and the rest of the world. Bringing together scientists, developers, providers and users, the conference is an opportunity to exchange information, discuss problems, find new synergies, and promote i.

Corpus Analysis for Language Studies at the University Level

Author : Giedrė Valūnaitė Oleškevičienė
Publisher : Cambridge Scholars Publishing
ISBN 13 : 1527565947
Total Pages : 176 pages
Book Rating : 4.5/5 (275 downloads)

Released on 2021-02-08. Book excerpt: This book highlights corpora use in teaching foreign languages in university education. It will appeal to both academics and practitioners interested in the process of teaching foreign languages at more advanced levels while applying corpus analysis and building tools for corpus annotation. It provides a detailed case study of analyzing the terminology of constitutional law in both English and Lithuanian as an example to illustrate the possibility of integrating corpus analysis tools into the process of teaching foreign languages in university education. The book reveals that initial linguistic knowledge is essential when teaching and learning foreign languages at more advanced levels while applying corpus annotation. In addition, it shows that, even though the use of new corpus software is perceived as a positive, there are still certain issues to be solved in this regard, such as the constant renewal of public computers in universities and the technical and methodological support for teachers while using corpora tools.

Comparable Corpora and Computer-assisted Translation

Author : Estelle Maryline Delpech
Publisher : John Wiley & Sons
ISBN 13 : 1119002702
Total Pages : 304 pages
Book Rating : 4.1/5 (19 downloads)

Released on 2014-07-22. Book excerpt: Computer-assisted translation (CAT) has always used translation memories, which require the translator to have a corpus of previous translations that the CAT software can use to generate bilingual lexicons. This can be problematic when the translator does not have such a corpus, for instance, when the text belongs to an emerging field. To solve this issue, CAT research has looked into the leveraging of comparable corpora, i.e. a set of texts, in two or more languages, which deal with the same topic but are not translations of one another. This work had two primary objectives. The first is to assess the input of lexicons extracted from comparable corpora in the context of a specialized human translation task. The second objective is to identify bilingual-lexicon-extraction methods which best match the translators’ needs, determining the current limits of these techniques and suggesting improvements. The author focuses, in particular, on the identification of fertile translations, the management of multiple morphological structures, and the ranking of candidate translations. The experiments are carried out on two language pairs (English–French and English–German) and on specialized texts dealing with breast cancer. This research puts significant emphasis on applicability – methodological choices are guided by the needs of the final users. This book is organized in two parts: the first part presents the applicative and scientific context of the research, and the second part is given over to efforts to improve compositional translation. The research work presented in this book received the PhD Thesis award 2014 from the French association for natural language processing (ATALA).

Web As Corpus

Author : Maristella Gatto
Publisher : A&C Black
ISBN 13 : 1472571533
Total Pages : 250 pages
Book Rating : 4.4/5 (725 downloads)

Released on 2014-02-13. Book excerpt: Is the internet a suitable linguistic corpus? How can we use it in corpus techniques? What are the special properties that we need to be aware of? This book answers those questions. The Web is an exponentially increasing source of language and corpus linguistics data. From gigantic static information resources to user-generated Web 2.0 content, the breadth and depth of information available is breathtaking – and bewildering. This book explores the theory and practice of the “web as corpus”. It looks at the most common tools and methods used and features a plethora of examples based on the author's own teaching experience. This book also bridges the gap between studies in computational linguistics, which emphasize technical aspects, and studies in corpus linguistics, which focus on the implications for language theory and use.

Natural Language Understanding in a Semantic Web Context

Author : Caroline Barrière
Publisher : Springer
ISBN 13 : 3319413376
Total Pages : 317 pages
Book Rating : 4.3/5 (194 downloads)

Released on 2016-11-17. Book excerpt: This book serves as a starting point for Semantic Web (SW) students and researchers interested in discovering what Natural Language Processing (NLP) has to offer. NLP can effectively help uncover the large portions of data held as unstructured text in natural language, thus augmenting the real content of the Semantic Web in a significant and lasting way. The book covers the basics of NLP, with a focus on Natural Language Understanding (NLU), referring to semantic processing, information extraction and knowledge acquisition, which are seen as the key links between the SW and NLP communities. Major emphasis is placed on mining sentences in search of entities and relations. In the course of this “quest”, challenges will be encountered for various text analysis tasks, including part-of-speech tagging, parsing, semantic disambiguation, named entity recognition and relation extraction. Standard algorithms associated with these tasks are presented to provide an understanding of the fundamental concepts. Furthermore, the importance of experimental design and result analysis is emphasized, and accordingly, most chapters include small experiments on corpus data with quantitative and qualitative analysis of the results. This book is divided into four parts. Part I “Searching for Entities in Text” is dedicated to the search for entities in textual data. Next, Part II “Working with Corpora” investigates corpora as valuable resources for NLP work. In turn, Part III “Semantic Grounding and Relatedness” focuses on the process of linking surface forms found in text to entities in resources. Finally, Part IV “Knowledge Acquisition” delves into the world of relations and relation extraction.
The book also includes three appendices: “A Look into the Semantic Web” gives a brief overview of the Semantic Web and is intended to bring readers less familiar with the Semantic Web up to speed, so that they too can fully benefit from the material of this book. “NLP Tools and Platforms” provides information about NLP platforms and tools, while “Relation Lists” gathers lists of relations under different categories, showing how relations can be varied and serve different purposes. And finally, the book includes a glossary of over 200 terms commonly used in NLP. The book offers a valuable resource for graduate students specializing in SW technologies and professionals looking for new tools to improve the applicability of SW techniques in everyday life – or, in short, everyone looking to learn about NLP in order to expand his or her horizons. It provides a wealth of information for readers new to both fields, helping them understand the underlying principles and the challenges they may encounter.

Digital Sovereignty in Cyber Security: New Challenges in Future Vision

Author : Antonio Skarmeta
Publisher : Springer Nature
ISBN 13 : 3031360966
Total Pages : 182 pages
Book Rating : 4.0/5 (313 downloads)

Released on 2023-06-15. Book excerpt: This book constitutes papers presented during the workshop session titled “CyberSec4Europe - Research to Innovation: Common Research Framework on Security and Privacy” during the Privacy Symposium hosted by Università Ca’ Foscari in Venice, Italy, in April 2022. The 11 peer-reviewed selected papers present findings, conclusions, research, and recommendations in various security-related areas, from highly technical ones (e.g., software and network security) to law and human-centric ones (e.g., governance and cybersecurity awareness).

Chinese Lexical Semantics

Author : Donghong Ji
Publisher : Springer
ISBN 13 : 3642363377
Total Pages : 838 pages
Book Rating : 4.6/5 (423 downloads)

Released on 2013-02-15. Book excerpt: This book constitutes carefully reviewed and revised selected papers from the 13th Chinese Lexical Semantics Workshop, CLSW 2012, held in Wuhan, China, in July 2012. The 67 full papers and 17 short papers presented in this volume were carefully reviewed and selected from 169 submissions. They are organized in topical sections named: applications on natural language processing; corpus linguistics; lexical computation; lexical resources; lexical semantics; new methods for lexical semantics; and other topics.

Computational Linguistics and Intelligent Text Processing

Author : Alexander Gelbukh
Publisher : Springer
ISBN 13 : 3642549039
Total Pages : 581 pages
Book Rating : 4.6/5 (425 downloads)

Released on 2014-04-18. Book excerpt: This two-volume set, consisting of LNCS 8403 and LNCS 8404, constitutes the thoroughly refereed proceedings of the 14th International Conference on Intelligent Text Processing and Computational Linguistics, CICLing 2014, held in Kathmandu, Nepal, in April 2014. The 85 revised papers presented together with 4 invited papers were carefully reviewed and selected from 300 submissions. The papers are organized in the following topical sections: lexical resources; document representation; morphology, POS-tagging, and named entity recognition; syntax and parsing; anaphora resolution; recognizing textual entailment; semantics and discourse; natural language generation; sentiment analysis and emotion recognition; opinion mining and social networks; machine translation and multilingualism; information retrieval; text classification and clustering; text summarization; plagiarism detection; style and spelling checking; speech processing; and applications.

Intelligent Natural Language Processing: Trends and Applications

Author : Khaled Shaalan
Publisher : Springer
ISBN 13 : 3319670565
Total Pages : 776 pages
Book Rating : 4.3/5 (196 downloads)

Released on 2017-11-17. Book excerpt: This book brings together scientists, researchers, practitioners, and students from academia and industry to present recent and ongoing research activities concerning the latest advances, techniques, and applications of natural language processing systems, and to promote the exchange of new ideas and lessons learned. Taken together, the chapters of this book provide a collection of high-quality research works that address broad challenges in both theoretical and applied aspects of intelligent natural language processing. The book presents the state of the art in research on natural language processing, computational linguistics, applied Arabic linguistics and related areas. New trends in natural language processing systems are rapidly emerging – and finding application in various domains including education, travel and tourism, and healthcare, among others. Many issues encountered during the development of these applications can be resolved by incorporating language technology solutions. The topics covered by the book include: Character and Speech Recognition; Morphological, Syntactic, and Semantic Processing; Information Extraction; Information Retrieval and Question Answering; Text Classification and Text Mining; Text Summarization; Sentiment Analysis; Machine Translation; Building and Evaluating Linguistic Resources; and Intelligent Language Tutoring Systems.

Text, Speech and Dialogue

Author : Ivan Habernal
Publisher : Springer Science & Business Media
ISBN 13 : 3642235379
Total Pages : 457 pages
Book Rating : 4.6/5 (422 downloads)

Released on 2011-08-19. Book excerpt: This book constitutes the refereed proceedings of the 14th International Conference on Text, Speech and Dialogue, TSD 2011, held in Pilsen, Czech Republic, in September 2011. The 53 papers presented together with 2 invited talks were carefully reviewed and selected from 110 submissions. The main topic of this year's conference was "integrating modern Web with speech and language technologies". This year the Third International Workshop on Balto-Slavonic Natural Language was affiliated to TSD. The present book contains 8 contributions from this workshop.