18 Articles


 
Isabella Gagliardi and Maria Teresa Artese
When integrating data from different sources, there are problems of synonymy, different languages, and concepts of different granularity. This paper proposes a simple yet effective approach to evaluate the semantic similarity of short texts, especially k…
Journal: Big Data and Cognitive Computing    Format: Electronic

 
Dmitry Ponkin    pp. 18-29
The article studies the concept and technologies of pre-trained language models in the context of knowledge engineering. The author substantiates the relevance of the issue of the existence of internalized and implicit knowledge extracted from text corp…
Journal: International Journal of Open Information Technologies    Format: Electronic

 
Wenbo Zhang, Xiao Li, Yating Yang, Rui Dong and Gongxu Luo
Recently, the pretraining of models has been successfully applied to unsupervised and semi-supervised neural machine translation. A cross-lingual language model uses a pretrained masked language model to initialize the encoder and decoder of the translat…
Journal: Future Internet    Format: Electronic

 
Tharindu Ranasinghe and Marcos Zampieri
The pervasiveness of offensive content in social media has become a serious concern for online platforms. With the aim of improving online safety, a large number of studies applying computational models to identify such content have been pu…
Journal: Information    Format: Electronic

 
Célia Tavares, Luciana Oliveira, Pedro Duarte and Manuel Moreira da Silva
According to a recent study by OpenAI, Open Research, and the University of Pennsylvania, large language models (LLMs) based on artificial intelligence (AI), such as generative pretrained transformers (GPTs), may have potential implications for the job m…
Journal: Informatics    Format: Electronic

 
Fetoun Mansour AlZahrani and Maha Al-Yahya
Authorship attribution (AA) is a field of natural language processing that aims to attribute a text to its author. Although the literature includes several studies on Arabic AA in general, applying AA to classical Arabic texts has not gained similar attent…
Journal: Applied Sciences    Format: Electronic

 
Deptii Chaudhari and Ambika Vishal Pawar
Misinformation, fake news, and various propaganda techniques are increasingly used in digital media. Uncovering propaganda is challenging because it works systematically to influence individuals toward predetermined ends. While signifi…
Journal: Big Data and Cognitive Computing    Format: Electronic

 
Huaqing Cheng, Shengquan Liu, Weiwei Sun and Qi Sun
Topic models can extract consistent themes from large corpora for research purposes. In recent years, the combination of pretrained language models and neural topic models has gained attention among scholars. However, this approach has some drawbacks: in…
Journal: Applied Sciences    Format: Electronic

 
Fernando Fernández-Martínez, Cristina Luna-Jiménez, Ricardo Kleinlein, David Griol, Zoraida Callejas and Juan Manuel Montero
Intent recognition is a key component of any task-oriented conversational system. The intent recognizer can be used first to classify the user's utterance into one of several predefined classes (intents) that help to understand the user's current goal. T…
Journal: Applied Sciences    Format: Electronic

 
Wael H. Gomaa, Abdelrahman E. Nagib, Mostafa M. Saeed, Abdulmohsen Algarni and Emad Nabil
Automated scoring systems have been revolutionized by natural language processing, enabling the evaluation of students' diverse answers across various academic disciplines. However, this presents a challenge, as students' responses may vary significantly…
Journal: Big Data and Cognitive Computing    Format: Electronic

 
Shiqian Guo, Yansun Huang, Baohua Huang, Linda Yang and Cong Zhou
This paper proposes a method for improving the XLNet model to address the shortcomings of segmentation algorithms in processing Chinese, such as long sub-word lengths, long word lists, and incomplete word-list coverage. To address these issues, w…
Journal: Applied Sciences    Format: Electronic

 
David Kartchner, Davi Nakajima An, Wendi Ren, Chao Zhang and Cassie S. Mitchell
A major bottleneck preventing the extension of deep learning systems to new domains is the prohibitive cost of acquiring sufficient training labels. Alternatives such as weak supervision, active learning, and fine-tuning of pretrained models reduce this…
Journal: AI    Format: Electronic

 
Leon Kopitar, Iztok Fister, Jr. and Gregor Stiglic
Introduction: Type 2 diabetes mellitus is a major global health concern, but interpreting machine learning models for diagnosis remains challenging. This study investigates combining association rule mining with advanced natural language processing to im…
Journal: Information    Format: Electronic

 
Kirill Tyshchuk, Polina Karpikova, Andrew Spiridonov, Anastasiia Prutianova, Anton Razzhigaev and Alexander Panchenko
Embeddings, i.e., vector representations of objects such as texts, images, or graphs, play a key role in deep learning methodologies nowadays. Prior research has shown the importance of analyzing the isotropy of textual embeddings for transformer-based…
Journal: Information    Format: Electronic

 
Zhipeng Zhang, Shengquan Liu and Jianming Cheng
In recent years, large-scale pretrained language models have become widely used in natural language processing tasks. On this basis, prompt learning has achieved excellent performance in specific few-shot classification scenarios. The core idea of prompt…
Journal: Applied Sciences    Format: Electronic

 
Xuyang Wang, Yajun Du, Danroujing Chen, Xianyong Li, Xiaoliang Chen, Yongquan Fan, Chunzhi Xie, Yanli Li and Jia Liu
Domain-generalized few-shot text classification (DG-FSTC) is a new setting for few-shot text classification (FSTC). In DG-FSTC, the model is meta-trained on a multi-domain dataset and meta-tested on unseen datasets with different domains. However, previ…
Journal: Applied Sciences    Format: Electronic

 
Konlakorn Wongpatikaseree, Sattaya Singkul, Narit Hnoohom and Sumeth Yuenyong
Language resources are the main factor in speech-emotion-recognition (SER)-based deep learning models. Thai is a low-resource language with a smaller data size than high-resource languages such as German. This paper describes the framework of using a…
Journal: Big Data and Cognitive Computing    Format: Electronic

 
Jawaher Alghamdi, Yuqing Lin and Suhuai Luo
Researchers in natural language processing (NLP) have dedicated considerable effort to detecting and combating fake news using an assortment of machine learning (ML) and deep learning (DL) techniques. In this paper, a review of the existing stud…
Journal: Information    Format: Electronic
