Dr Chenghua Lin
Department of Computer Science
Senior Lecturer in Natural Language Processing
Deputy Director of Research
Member of the Natural Language Processing research group
Full contact details
Department of Computer Science
Regent Court (DCS)
211 Portobello
S1 4DP
- Profile
Dr Chenghua Lin is a Senior Lecturer in Natural Language Processing in the Department of Computer Science at the University of Sheffield. Prior to joining Sheffield, he was a SICSA Senior Lecturer in the Department of Computing Science, University of Aberdeen.
He received his PhD in Computer Science from the University of Exeter.
- Research interests
Dr Chenghua Lin's research centres on machine learning, natural language processing, and data and text mining. He is currently particularly interested in developing algorithms and models for sentiment analysis, text summarisation, natural language generation and cognitively inspired context learning.
- Publications
Journal articles
- Improving unsupervised keyphrase extraction by modeling hierarchical multi-granularity features. Information Processing & Management, 60(4), 103356-103356.
- PUF-Phenotype: A Robust and Noise-Resilient Approach to Aid Group-based Authentication with DRAM-PUFs Using Machine Learning. IEEE Transactions on Information Forensics and Security, 1-1.
- Understanding linearity of cross-lingual word embedding mappings. Transactions on Machine Learning Research.
- Token relation aware Chinese named entity recognition. ACM Transactions on Asian and Low-Resource Language Information Processing.
- Ada: Adversarial learning based data augmentation for malicious users detection. Applied Soft Computing, 117.
- Adaptive pre-training and collaborative fine-tuning: a win-win strategy to improve review analysis tasks. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 30, 622-634.
- Routine outcome monitoring in psychotherapy treatment using sentiment-topic modelling approach. International Journal on Advanced Science, Engineering and Information Technology, 11(6).
- Named entity aware transfer learning for biomedical factoid question answering. IEEE/ACM Transactions on Computational Biology and Bioinformatics.
- Summarising historical text in modern languages. EACL 2021 - 16th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings of the Conference, 3123-3142.
- A unified latent variable model for contrastive opinion mining. Frontiers of Computer Science, 14(2), 404-416.
- Extractive and abstractive sentence labelling of sentiment-bearing topics. Frontiers of Computer Science.
- Deep Ensemble Learning for News Stance Detection. arXiv preprint arXiv:1909.12233.
- Data-driven two-layer visual dictionary structure learning. Journal of Electronic Imaging, 28(02), 1-1.
- Sherlock: A semi-automatic framework for quiz generation using a hybrid semantic similarity measure. Cognitive Computation, 7, 667-679.
- Sentiment-topic modeling in text mining. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 5(5), 246-254.
- Expressing Quantities in Words: Towards a Computational Model.
- Sentiment Analysis in Social Media. Encyclopedia of Social Network Analysis and Mining, 1688-1699.
- Dynamic joint sentiment-topic model. ACM Transactions on Intelligent Systems and Technology, 5(1), 1-21.
- Weakly Supervised Joint Sentiment-Topic Detection from Text. IEEE Transactions on Knowledge and Data Engineering, 24(6), 1134-1145.
- A Text Reassembling Approach to Natural Language Generation.
- Split Over-Training for Unsupervised Purchase Intention Identification. International Journal of Advanced Trends in Computer Science and Engineering, 9(3), 3921-3928.
- Dialogue State Tracking with Pretrained Encoder for Multi-domain Task-oriented Dialogue Systems.
- Combining Pre-trained Word Embeddings and Linguistic Features for Sequential Metaphor Identification.
- Interpreting Verbal Metaphors by Paraphrasing.
- Domain-Driven and Discourse-Guided Scientific Summarisation, Lecture Notes in Computer Science (pp. 361-376). Springer Nature Switzerland
- Sentiment Analysis in Social Media, Encyclopedia of Social Network Analysis and Mining Springer
- Safety, Encyclopedia of Social Network Analysis and Mining (pp. 2281-2281). Springer New York
Conference proceedings papers
- Requirement Formalisation Using Natural Language Processing and Machine Learning: A Systematic Review. Proceedings of the 11th International Conference on Model-Based Software and Systems Engineering, 19 February 2023 - 21 February 2023.
- Improving Variational Autoencoders with Density Gap-based Regularization
- Tell me how to survey: literature review made simple with automatic reading path generation. Proceedings of 2022 IEEE 38th International Conference on Data Engineering (ICDE) (pp 3426-3438). Kuala Lumpur, Malaysia, 9 May 2022 - 12 May 2022.
- EtriCA: Event-Triggered Context-Aware Story Generation Augmented by Cross Attention. Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
- CM-Gen: A Neural Framework for Chinese Metaphor Generation with Explicit Context Modelling. Proceedings of the 29th International Conference on Computational Linguistics
- Development of a Benchmark Corpus to Support Entity Recognition in Job Descriptions. 2022 Language Resources and Evaluation Conference, LREC 2022 (pp 1201-1208)
- TranSHER: Translating Knowledge Graph Embedding with Hyper-Ellipsoidal Restriction. Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022 (pp 8517-8528)
- Making Science Simple: Corpora for the Lay Summarisation of Scientific Literature. Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022 (pp 10589-10604)
- Semi-deterministic and Contrastive Variational Graph Autoencoder for Recommendation. Proceedings of the 30th ACM International Conference on Information & Knowledge Management
- Cross-lingual word embedding refinement by ℓ1 norm optimisation. Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (pp 2690-2701). Virtual conference, 6 June 2021 - 11 June 2021.
- Guiding the Growth: Difficulty-Controllable Question Generation through Step-by-Step Rewriting. Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), August 2021.
- On the Low-density Latent Regions of VAE-based Language Models. Proceedings of Machine Learning Research (PMLR), Vol. 148 (pp 343-357)
- Highly Efficient Knowledge Graph Embedding Learning with Orthogonal Procrustes Analysis. NAACL-HLT 2021 - 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Conference (pp 2364-2375)
- Fast and Scalable Dialogue State Tracking with Explicit Modular Decomposition. NAACL-HLT 2021 - 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Conference (pp 289-295)
- Affective Decoding for Empathetic Response Generation. INLG 2021 - 14th International Conference on Natural Language Generation, Proceedings (pp 331-340)
- Highly Efficient Knowledge Graph Embedding Learning with Orthogonal Procrustes Analysis. Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, June 2021.
- Latent space factorisation and manipulation via matrix subspace projection. 37th International Conference on Machine Learning, ICML 2020, Vol. PartF168147-8 (pp 5872-5882)
- Improving Variational Autoencoder for Text Modelling with Timestep-Wise Regularisation. Proceedings of the 28th International Conference on Computational Linguistics, December 2020.
- DGST: a Dual-Generator Network for Text Style Transfer. Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), November 2020.
- Feature LDA: a Supervised Topic Model for Automatic Detection of Web API Documentations from the Web. Proceedings of the 11th International Semantic Web Conference
- End-to-End Sequential Metaphor Identification Inspired by Linguistic Theories. Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (pp 3888-3898). Florence, Italy, 28 July 2019 - 2 August 2019.
- QTUNA: A corpus for understanding how speakers use quantification. INLG 2019 - 12th International Conference on Natural Language Generation, Proceedings of the Conference (pp 124-129)
- Deep Ensemble Learning for News Stance Detection. Proceedings of the 5th International Conference on Computational Social Science
- Generating quantified descriptions of abstract visual scenes. INLG 2019 - 12th International Conference on Natural Language Generation, Proceedings of the Conference (pp 529-539)
- A stable variational autoencoder for text modelling. INLG 2019 - 12th International Conference on Natural Language Generation, Proceedings of the Conference (pp 594-599)
- A dual-attention hierarchical recurrent neural network for dialogue act classification. CoNLL 2019 - 23rd Conference on Computational Natural Language Learning, Proceedings of the Conference (pp 383-392)
- Word embedding and WordNet based metaphor identification and interpretation. Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (pp 1222-1231). Melbourne, Australia, 15 July 2018 - 20 July 2018.
- Generating descriptions for sequential images with local-object attention and global semantic context modelling. 2IS and NLG 2018 - Workshop on Intelligent Interactive Systems and Language Generation, Proceedings of the Workshop (pp 3-8)
- Modelling pro-drop with the rational speech acts model. INLG 2018 - 11th International Natural Language Generation Conference, Proceedings of the Conference (pp 159-164)
- Generating Description for Sequential Images with Local-Object Attention Conditioned on Global Semantic Context. Proceedings of the Workshop on Intelligent Interactive Systems and Language Generation (2IS&NLG) (pp 3-8)
- Modelling Various Kinds of Specifications. Proceedings of the International Workshop on Computational Models of Language Generation and Processing in Pragmatics
- Understanding how to explain package recommendations in the clothes domain. CEUR Workshop Proceedings, Vol. 2225 (pp 74-78)
- Assessing the Effectiveness of Affective Lexicons for Depression Classification (pp 65-69)
- Statistical NLG for generating the content and form of referring expressions. INLG 2018 - 11th International Natural Language Generation Conference, Proceedings of the Conference (pp 482-491)
- SimpleNLG-ZH: A linguistic realisation engine for Mandarin. INLG 2018 - 11th International Natural Language Generation Conference, Proceedings of the Conference (pp 57-66)
- Incorporating Constraints into Matrix Factorization for Clothes Package Recommendation. Proceedings of the 26th Conference on User Modeling, Adaptation and Personalization
- ABDN at SemEval-2018 Task 10: recognising discriminative attributes using context embeddings and WordNet. Proceedings of the 12th International Workshop on Semantic Evaluation (pp 1017-1021). Association for Computational Linguistics (ACL)
- Analysing the causes of depressed mood from depression vulnerable individuals. Proceedings of the International Workshop on Digital Disease Detection using Social Media 2017 (DDDSM-2017) (pp 9-17)
- Extracting and understanding contrastive opinion through topic relevant sentences. Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 2: Short Papers) (pp 395-400)
- Automatically Labelling Sentiment-Bearing Topics with Descriptive Sentence Labels (pp 299-312)
- Matrix factorization for package recommendations. CEUR Workshop Proceedings, Vol. 1892 (pp 23-28)
- Tracking Sentiment and Topic Dynamics from Social Media (pp 457-465)
- Statistics-based lexical choice for NLG from quantitative information. INLG 2016 - 9th International Natural Language Generation Conference, Proceedings of the Conference (pp 104-108)
- Generating pseudotransactions for improving sparse matrix factorization. Proceedings of the 10th ACM Conference on Recommender Systems (pp 439-442). ACM
- A curated corpus for sentiment-topic analysis. Emotion and Sentiment Analysis
- Automatically Predicting Quiz Difficulty Level Using Similarity Measures. Proceedings of the Knowledge Capture Conference - K-CAP 2015, 7 October 2015 - 10 October 2015.
- Applying Rule Extraction & Rule Refinement techniques to (Blackbox) Classifiers. Proceedings of the Knowledge Capture Conference - K-CAP 2015, 7 October 2015 - 10 October 2015.
- Web as Corpus Supporting Natural Language Generation for Online River Information Communication. Proceedings of the 24th International Conference on World Wide Web - WWW '15 Companion, 18 May 2015 - 22 May 2015.
- Hete-cf: Social-based collaborative filtering recommendation using heterogeneous relations. Proceedings of the 14th IEEE International Conference on Data Mining (pp 917-922). Shenzhen, China, 14 December 2014 - 17 December 2014.
- Hetpathmine: A novel transductive classification algorithm on heterogeneous information networks. Advances in Information Retrieval: 36th European Conference on IR Research, ECIR 2014, Amsterdam, The Netherlands, April 13-16, 2014, Proceedings, Vol. 8416 (pp 210-221). Springer, 13 April 2014 - 16 April 2014.
- Sherlock: A semi-automatic quiz generation system using linked data. CEUR Workshop Proceedings, Vol. 1272 (pp 9-12)
- Tracking sentiment and topic dynamics from social media. Sixth International AAAI Conference on Weblogs and Social Media
- Tracking sentiment and topic dynamics from social media. ICWSM 2012 - Proceedings of the 6th International AAAI Conference on Weblogs and Social Media (pp 483-486)
- Online Sentiment and Topic Dynamics Tracking over the Streaming Data. 2012 International Conference on Privacy, Security, Risk and Trust and 2012 International Conference on Social Computing, 3 September 2012 - 5 September 2012.
- Harnessing the crowds for automating the identification of web APIs. AAAI Spring Symposium - Technical Report, Vol. SS-12-04 (pp 58-63)
- Feature LDA: A supervised topic model for automatic detection of web API documentations from the web. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Vol. 7649 LNCS(PART 1) (pp 328-343)
- Sentence subjectivity detection with weakly-supervised learning. Proceedings of 5th International Joint Conference on Natural Language Processing (pp 1153-1161)
- Automatically extracting polarity-bearing topics for cross-domain sentiment classification. ACL-HLT 2011 - Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, Vol. 1 (pp 123-131)
- A comparative study of Bayesian models for unsupervised sentiment detection. CoNLL 2010 - Fourteenth Conference on Computational Natural Language Learning, Proceedings of the Conference (pp 144-152)
- Protein-Protein Interactions Classification from Text via Local Learning with Class Priors (pp 182-191)
- Joint sentiment/topic model for sentiment analysis. Proceedings of the 18th ACM Conference on Information and Knowledge Management - CIKM '09, 2 November 2009 - 6 November 2009.
- A multi-agent system for intelligent pervasive spaces. 2008 IEEE International Conference on Service Operations and Logistics, and Informatics, 12 October 2008 - 15 October 2008.
- Review of computer vision in intelligent environment design
- A multi-agent system for intelligent pervasive spaces. Proceedings of 2008 IEEE International Conference on Service Operations and Logistics, and Informatics, IEEE/SOLI 2008, Vol. 1 (pp 1005-1010)
- Latent Space Factorisation and Manipulation via Matrix Subspace Projection. ICML
Theses / Dissertations
- Language Model as an Annotator: Unsupervised Context-aware Quality Phrase Generation.
- The Iron(ic) Melting Pot: Reviewing Human Evaluation in Humour, Irony and Sarcasm Generation.
- Enhancing Biomedical Lay Summarisation with External Knowledge Graphs.
- Improving Biomedical Abstractive Summarisation with Knowledge Aggregation from Citation Papers.
- Effective Distillation of Table-based Reasoning Ability from LLMs.
- Audio Contrastive based Fine-tuning.
- Improving Medical Dialogue Generation with Abstract Meaning Representations.
- Requirement Formalisation using Natural Language Processing and Machine Learning: A Systematic Review.
- Metaphor Detection with Effective Context Denoising.
- FrameBERT: Conceptual Metaphor Detection with Frame Embedding Learning.
- Routine Outcome Monitoring in Psychotherapy Treatment using Sentiment-Topic Modelling Approach.
- MAP-Music2Vec: A Simple and Effective Baseline for Self-Supervised Music Audio Representation Learning, arXiv.
- HERB: Measuring Hierarchical Regional Bias in Pre-trained Language Models, arXiv.
- NGEP: A Graph-based Event Planning Framework for Story Generation.
- Improving Chinese Story Generation via Awareness of Syntactic Dependencies and Semantics.
- Making Science Simple: Corpora for the Lay Summarisation of Scientific Literature.
- PUF-Phenotype: A Robust and Noise-Resilient Approach to Aid Intra-Group-based Authentication with DRAM-PUFs Using Machine Learning.
- TranSHER: Translating Knowledge Graph Embedding with Hyper-Ellipsoidal Restriction.
- Recent Advances in Neural Text Generation: A Task-Agnostic Survey.
- Revisiting the linearity in cross-lingual embedding mappings: from a perspective of word analogies.
- Grants
Research Grants
- Natural Language Processing for Automated Requirements Formalisation, MDENet, 01/04/2022 - 31/07/2022, £4,916, as PI
- Professional activities and memberships
Member of the Natural Language Processing research group