

      Is Open Access

      Keyword extraction and summarization from unstructured text: A case study with open data from legal domain

      Proceedings of the Symposium on Open Data and Knowledge for a Post-Pandemic Era ODAK22, UK (ODAK 2022)
      Open Data and Knowledge for a Post-Pandemic Era
      June 30-July 1, 2022
      Entity Extraction, Information Extraction, Keyword Summarization, Natural Language Processing, Page Rank, RDF triples


            Information Extraction (IE) is a central task for the web and open data, and it is typically carried out with Natural Language Processing (NLP). Many extraction techniques exist; the harder problem is producing information that is useful and meaningful. Many search engines rely heavily on IE. This paper focuses on extracting named entities from natural language text and converting them into a knowledge graph of triples. The goal is to answer two types of queries: (i) keyword search that returns exact information, and (ii) summarization of a given keyword. A case study using open data from the legal domain is presented.
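
            The two query types named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the triples, entity names, and the functions `keyword_search`, `pagerank`, and `summarize` are all hypothetical, and the use of PageRank to rank the facts in a summary is an assumption based only on the paper's keyword list. Entity extraction itself is skipped; the sketch starts from triples that are assumed to have been extracted already.

```python
from collections import defaultdict

# Hypothetical (subject, predicate, object) triples. In practice these would
# come from an entity/relation extraction step over the source text.
TRIPLES = [
    ("AcmeCorp", "sued_by", "Villagers"),
    ("AcmeCorp", "operates_in", "Peru"),
    ("Villagers", "live_in", "Peru"),
    ("Villagers", "filed_in", "DistrictCourt"),
    ("DistrictCourt", "located_in", "Peru"),
]


def keyword_search(triples, keyword):
    """Return every triple whose subject or object equals the keyword exactly."""
    return [t for t in triples if keyword in (t[0], t[2])]


def pagerank(triples, damping=0.85, iterations=50):
    """Power-iteration PageRank over the directed subject -> object graph."""
    out_links = defaultdict(set)
    nodes = set()
    for subj, _, obj in triples:
        out_links[subj].add(obj)
        nodes.update((subj, obj))
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing edges is spread uniformly.
        dangling = sum(rank[v] for v in nodes if not out_links[v])
        new_rank = {v: (1 - damping) / n + damping * dangling / n for v in nodes}
        for subj in nodes:
            for obj in out_links[subj]:
                new_rank[obj] += damping * rank[subj] / len(out_links[subj])
        rank = new_rank
    return rank


def summarize(triples, keyword, top_k=2):
    """Return the keyword's triples, ordered by the PageRank of the other entity."""
    rank = pagerank(triples)
    hits = keyword_search(triples, keyword)

    def other_entity(triple):
        return triple[2] if triple[0] == keyword else triple[0]

    hits.sort(key=lambda t: rank[other_entity(t)], reverse=True)
    return hits[:top_k]


if __name__ == "__main__":
    print(keyword_search(TRIPLES, "Peru"))
    print(summarize(TRIPLES, "Villagers"))
```

            In this sketch, keyword search is an exact match against the graph, while summarization keeps only the facts that connect the keyword to its highest-ranked neighbours. A real pipeline would need entity normalization and a ranking scheme validated on the target corpus.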


            Author and article information

            July 2022
            Pages: 1-6
            School of Computing and Augmented Intelligence, Arizona State University, Mesa, Arizona
            © Singh et al. Published by BCS Learning & Development Ltd. Proceedings of the Symposium on Open Data and Knowledge for a Post-Pandemic Era ODAK22, UK

            This work is licensed under a Creative Commons Attribution 4.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/

            Brighton, UK
            Electronic Workshops in Computing (eWiC)

            ISSN 1477-9358, BCS Learning & Development

            Self URI (article page): https://www.scienceopen.com/hosted-document?doi=10.14236/ewic/ODAK22.9
            Self URI (journal page): https://ewic.bcs.org/

            Applied computer science, Computer science, Security & Cryptology, Graphics & Multimedia design, General computer science, Human-computer-interaction



