Natural Language Processing Projects, 1st ed. Build Next-Generation NLP Applications Using AI Techniques

Language: English

Authors: Kulkarni Akshay, Shivananda Adarsha, Kulkarni Anoosh

Leverage machine learning and deep learning techniques to build full-fledged natural language processing (NLP) projects. Projects throughout this book grow in complexity and showcase methodologies, optimization tips, and tricks to solve various business problems. You will use modern Python libraries and algorithms to build end-to-end NLP projects.

The book starts with an overview of natural language processing (NLP) and artificial intelligence to provide a quick refresher on algorithms. Next, it covers end-to-end NLP projects beginning with traditional algorithms and projects such as customer review sentiment and emotion detection, topic modeling, and document clustering. From there, it delves into e-commerce-related projects such as product categorization using the description of the product, a search engine to retrieve the relevant content, and a content-based recommendation system to enhance user experience. Moving forward, it explains how to build systems that find similar sentences using contextual embeddings, summarize huge documents using recurrent neural networks (RNN), suggest words automatically using long short-term memory networks (LSTM), and how to build a chatbot using transfer learning. It concludes with an exploration of next-generation AI and algorithms in the research space.

By the end of this book, you will have the knowledge needed to solve various business problems using NLP techniques.

What You Will Learn

Implement full-fledged intelligent NLP applications with Python
Translate real-world business problems involving text data into solutions built with NLP techniques
Leverage machine learning and deep learning techniques to perform smart language processing
Gain hands-on experience implementing end-to-end search engine information retrieval, text summarization, chatbots, text generation, document clustering and product classification, and more

Who This Book Is For

Data scientists, machine learning engineers, and deep learning professionals looking to build natural language applications using Python

Chapter 1: Natural Language Processing & Artificial Intelligence Overview

Chapter Goal: This is an introductory chapter. It provides a quick refresher on the topics to be covered in this book. Since this book teaches projects surrounding a specific area of technology, we will provide a brief introduction to the key concepts required for these projects. We will not be working on a specific project, but rather discussing some important concepts without going into detail. Each of these topics will be covered in depth in the specific chapters.

No of pages: 25

Sub - Topics:

1. Artificial intelligence paradigm

2. NLP and AI life cycle

3. NLP concepts (TF-IDF, word embeddings, many more)

4. Machine learning concepts (supervised learning, classification, unsupervised learning)

5. Deep learning concepts (CNN, RNN, LSTM)

Chapter 2: Product360 - Sentiment, Emotion & Trend Capturing System

Chapter Goal: Sentiment analysis involves finding the polarity of a sentence and labeling it as positive, negative, or neutral. Emotion detection involves identifying emotions (sadness, anger, happiness, etc.) in sentences. Data extracted from social media such as Twitter and Facebook and from e-commerce websites, once processed and analyzed using different NLP techniques, provides a 360-degree view of a product, which enables better decision making. This chapter introduces sentiment analysis to the reader and the various techniques that can be used to analyze text. We will apply sentiment, emotion, and trend analysis to review data from any e-commerce website such as Amazon, Zomato, or IMDb, which contain millions of customer reviews and star ratings. For this task, we will use Python libraries such as VADER and TextBlob.

No of pages: 30

Sub - Topics:

1. Text mining and various available libraries.

2. Data preprocessing.

3. Data cleaning tricks, optimized feature engineering

4. EDA

5. Sentiment analysis

6. Emotion and trend analysis
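
The polarity labeling described in the Chapter 2 goal can be illustrated with a minimal lexicon-based sketch. This is not the book's implementation: the tiny `POSITIVE` and `NEGATIVE` word sets and the `polarity` function are hypothetical, and libraries such as VADER rely on much larger, weighted lexicons plus rules for negation and intensity.

```python
# Hypothetical toy lexicon, made up for illustration only.
POSITIVE = {"good", "great", "love", "excellent", "happy"}
NEGATIVE = {"bad", "poor", "hate", "terrible", "sad"}


def polarity(sentence: str) -> str:
    """Label a sentence positive, negative, or neutral by counting lexicon hits."""
    # Lowercase each token and strip common trailing punctuation.
    tokens = [t.strip(".,!?").lower() for t in sentence.split()]
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


reviews = [
    "I love this phone, the camera is excellent!",
    "Terrible battery and poor build quality.",
    "It arrived on Tuesday.",
]
labels = [polarity(r) for r in reviews]
```

A counting approach like this ignores negation ("not good") and sarcasm, which is one reason dedicated libraries are usually preferred.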

Chapter 3: TED Talks Segmentation & Topics Extraction Using Machine Learning

Chapter Goal: Document clustering is an unsupervised learning process for grouping documents. For example, when there are a number of e-books, grouping them to build a structure around them saves time when looking for a book. Article grouping and product clustering are a few other examples. Once we identify the clusters, it is important to understand their properties. Topic modeling is therefore performed to extract topics from a set of documents and articles, to understand their content using keywords, and to tag the articles or documents with those topics.

In this chapter, we will see how to group TED talks based on their descriptions using clustering techniques such as K-Means and hierarchical clustering. Then we will perform topic modeling using Latent Dirichlet Allocation (LDA) to understand what defines each cluster. Important libraries for this problem include Gensim, NLTK, scikit-learn, and word2vec. We will use over 100k articles from different American publications.

No of pages: 30

Sub - Topics:

1. Data understanding and pre-processing

2. Computing TF-IDF

3. K-Means and hierarchical clustering

4. Evaluation and visualization

5. Topic modeling using Latent Dirichlet Allocation
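
As a rough illustration of the "Computing TF-IDF" step listed for Chapter 3, the sketch below computes the textbook weight tf(t, d) × log(N / df(t)) in plain Python. The function name `tf_idf` and the sample documents are assumptions for illustration. scikit-learn's vectorizer uses a smoothed and normalized variant, so its numbers would differ.

```python
import math
from collections import Counter


def tf_idf(docs: list) -> list:
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: the number of documents each term appears in.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            # Term frequency (share of the document) times inverse document frequency.
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical talk descriptions, already tokenized.
docs = [
    "climate change and energy".split(),
    "music and art".split(),
    "energy policy and climate".split(),
]
weights = tf_idf(docs)
```

A word that occurs in every document (here "and") gets weight zero, while a word unique to one document gets the highest weight. This is the property that makes the vectors useful as input to K-Means or hierarchical clustering.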

Chapter 4: Enhancing E-commerce Through Advanced Search Engine and Recommendation System

Chapter Goal: An information retrieval system searches product descriptions based on a search query text and returns the results. Search engines are the most common and best use case of information retrieval models. Information retrieval started from string or word comparison, which is not accurate because it doesn’t capture semantics. Advanced deep learning techniques have made information retrieval more accurate.

Recommender systems are everywhere and are used to create personalized recommendations that improve the user experience. There are many types of recommender systems, from collaborative filtering to graph-based. The type that depends on natural language processing is the content-based recommender system. It leverages the content of the item or the demographics of the user to make recommendations, and this information is purely in the form of text. In this chapter, we will use advanced deep learning and word embedding techniques, along with libraries such as scikit-learn, NLTK, Keras, and word2vec, to search and recommend items/products to customers. We will use Flipkart e-commerce sample data, which has the product name and its description.

No of pages: 30

Sub - Topics:

1. Information retrieval, word embeddings for IR, similarity scoring.

2. How content-based recommendation systems work

3. Data understanding and preprocessing

4. Search engine using word embeddings

5. Recommender system using KNN
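
The similarity scoring behind the Chapter 4 search engine can be sketched with cosine similarity over simple word-count vectors. This is a deliberately simplified stand-in: the chapter describes word embeddings, whereas the sketch uses plain counts, and the `search` function and product catalogue are hypothetical.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def search(query: str, products: dict, top_k: int = 2) -> list:
    """Rank product names by similarity between the query and each description."""
    q = Counter(query.lower().split())
    scored = [(cosine(q, Counter(desc.lower().split())), name) for name, desc in products.items()]
    # Keep the best top_k matches and drop products with no overlap at all.
    return [name for score, name in sorted(scored, reverse=True)[:top_k] if score > 0]


# Hypothetical catalogue, made up for illustration.
products = {
    "running shoes": "lightweight running shoes for men",
    "leather wallet": "brown leather wallet with card slots",
    "trail shoes": "waterproof trail shoes for hiking",
}
results = search("running shoes for women", products)
```

Because the vectors are built from exact words, a query such as "sneakers" would match nothing here. That is the semantic gap the chapter goal attributes to word comparison, and which embeddings are meant to close.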

Chapter 5: E-Commerce Product Categorization Model Using Deep Learning

Chapter Goal: Most of the time, classification problems are not binary but multiclass. Examples include categorizing retail products based on their descriptions and categorizing call center complaints. Complexity increases as the number of classes increases. Let’s solve this problem using deep learning techniques. We leverage deep neural networks using the Keras library. Feature engineering techniques like TF-IDF and word embeddings are considered. We will use product description data from an e-commerce company to categorize the products.

No of pages: 25

Sub - Topics:

1. Text pre-processing

2. Text to features using TF-IDF and word embeddings

3. Multi-class classification using deep neural networks

4. Parameter tuning and optimization

Chapter 6: Movie Genre Tagging

Chapter Goal: Categorizing movies into genres is one of the classic AI problems. Online movie booking platforms and review websites like IMDb tag movies with their respective genres. The genre can be action, adventure, comedy, romance, and so on.

Our goal here is to tag possible movie genres given the description of the movie. The model has to predict all possible classes (genres) the movie could belong to. We have already solved simple multi-class classification; in this chapter, let's explore how to solve a multi-label learning and classification problem.

No of pages: 25

Sub - Topics:

1. Text processing

2. Data preparation for modeling

3. Text to features

4. Multi-label classification using different algorithms

5. Parameter tuning and evaluation
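
The difference between multi-class and multi-label classification mentioned in the Chapter 6 goal shows up in how targets are encoded and how predictions are decoded. The sketch below is a minimal illustration, assuming a hypothetical four-genre label set and made-up model scores. It does not train a model.

```python
# Hypothetical label set; a real dataset would have many more genres.
GENRES = ["action", "adventure", "comedy", "romance"]


def to_multi_hot(labels: set) -> list:
    """Encode a set of genres as a 0/1 vector with one slot per genre."""
    return [1 if g in labels else 0 for g in GENRES]


def decode(scores: list, threshold: float = 0.5) -> list:
    """Keep every genre whose independent score clears the threshold."""
    return [g for g, s in zip(GENRES, scores) if s >= threshold]


# A movie can carry several genres at once, unlike in multi-class problems.
target = to_multi_hot({"action", "comedy"})
# Made-up per-genre probabilities, such as a sigmoid output layer might produce.
predicted = decode([0.91, 0.40, 0.75, 0.08])
```

In a multi-class setup only the single highest-scoring class would be returned. Here each genre is decided independently, so zero, one, or several genres can be predicted.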

Chapter 7: Content Recommendation for the Marketing Campaign

Chapter Goal: A content recommendation engine collects and analyzes data based on users' behavior on marketing content. This data is then used to offer personalized and relevant marketing materials. We can tailor the subjects of the emails based on historical interactions. We will use deep learning techniques using Keras along with word embeddings.

No of pages: 25

Sub - Topics:

1. Why content recommendation

2. Feature engineering

3. Open rate to find the right content

Chapter 8: Quora Question Pair Similarity

Chapter Goal: Over 100 million people visit Quora every month, so it's no surprise that many people ask similarly worded questions. Multiple questions with the same intent can cause seekers to spend more time finding the best answer to their question and make writers feel they need to answer multiple versions of the same question. The goal of this chapter is to predict which of the provided pairs of questions contain two questions with the same meaning using advanced deep learning techniques. Keras will be used to find the similarity score.

No of pages: 25

Sub - Topics:

1. Why predict similar questions?

2. Text preprocessing

3. Word embeddings

4. Finding similar questions
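
Before reaching for deep learning, a simple token-overlap baseline can give a feel for the question-pair task in Chapter 8. The sketch below uses Jaccard similarity, which is a swapped-in baseline and not the Keras model the chapter describes. The function name and sample questions are assumptions.

```python
def jaccard(q1: str, q2: str) -> float:
    """Share of distinct words that two questions have in common."""
    a = set(q1.lower().strip("?").split())
    b = set(q2.lower().strip("?").split())
    return len(a & b) / len(a | b) if a | b else 0.0


# Made-up example questions.
same_intent = jaccard("How do I learn Python?", "How can I learn Python quickly?")
different = jaccard("How do I learn Python?", "What is the capital of France?")
```

A baseline like this fails on paraphrases that share few words, which motivates the embedding-based similarity the chapter goes on to cover.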

Chapter 9: Resume Parsing & Shortlisting with Machine Learning

Chapter Goal: In the recruitment industry, millions of people upload resumes and apply for jobs every day on thousands of employment platforms. Businesses list their openings on these platforms, and job seekers come to apply. Every business has a dedicated recruitment team that manually goes through applicants' resumes and extracts relevant data to see if they are a fit. To automate this task, this project tries to convert unstructured resume data into a structured format. The model analyzes and extracts resume data, returns machine-readable output, and ranks the top resumes that best match the given job description. This helps to store and analyze data automatically.

No of pages: 25

Sub - Topics:

1. Resume parsing using various NLP techniques

2. NER

3. Shortlisting and ranking resumes
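
The two steps named in the Chapter 9 goal, turning unstructured resume text into structured data and ranking resumes against a job description, can be sketched with regular expressions and set overlap. Everything here is hypothetical: the `parse_resume` and `rank` helpers, the skill vocabulary, and the sample resumes. The chapter itself lists NER among its techniques, which this sketch does not attempt.

```python
import re


def parse_resume(text: str, known_skills: set) -> dict:
    """Extract an email address and known skills into a structured record."""
    email = re.search(r"[\w.+-]+@[\w-]+\.[\w.-]+", text)
    tokens = set(re.findall(r"[a-z+#]+", text.lower()))
    return {
        "email": email.group(0) if email else None,
        "skills": sorted(tokens & known_skills),
    }


def rank(resumes: dict, job_skills: set) -> list:
    """Order candidates by the share of required skills they list."""
    scores = {
        name: len(set(record["skills"]) & job_skills) / len(job_skills)
        for name, record in resumes.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Made-up skill vocabulary and resumes.
known_skills = {"python", "sql", "nlp", "java"}
resumes = {
    "asha": parse_resume("Contact: asha@example.com\nSkills: Python, SQL, NLP", known_skills),
    "ben": parse_resume("Contact: ben@example.org\nSkills: Java, Excel", known_skills),
}
shortlist = rank(resumes, job_skills={"python", "nlp"})
```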

Chapter 10: Building Chatbot Using Transfer Learning

Chapter Goal: A Question Answering (QA) system, also termed a “chatbot,” is very useful, as most deep learning-related problems can be modeled as question answering problems. Consequently, the field is one of the most researched in computer science today. The last few years have seen considerable developments and improvements in the state of the art, much of which can be credited to the rise of deep learning. In this chapter, we will build an end-to-end QA system using NLTK, modern deep learning algorithms, and transfer learning.

No of pages: 25

Sub - Topics:

1. Q&A system explained

2. Q&A architecture

3. Natural Language Understanding

4. Learn possible approaches and algorithms

5. How to use transfer learning

6. Fine-tuning and optimizing the network

7. End to end implementation and evaluation

Chapter 11: Summarization System Using RNN

Chapter Goal: With ever-growing data, reading a whole document is time-consuming. We need to summarize huge text corpora to make life easier. Text summarization is the process of creating a short summary of a longer document while preserving its meaning. It’s widely used in headline generation, review summarization, etc. There are many approaches to this problem, such as feature-based, graph-based, and sentence-embedding methods. Abstractive methods built on deep learning and reinforcement learning provide excellent results because they generate entirely new sentences that capture the meaning of the source document. In this chapter, we will discuss both extractive and abstractive methods for summarizing text. We will be using the NLTK, Gensim, scikit-learn, and Keras libraries.

No of pages: 30

Sub - Topics:

1. Text summarization using Extractive methods

2. Abstractive methods

3. Text summarization using deep learning

4. Text summarization using reinforcement learning
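
To make the extractive side of Chapter 11 concrete, the sketch below scores each sentence by the corpus frequency of its words and keeps the top-scoring ones. It is a simple feature-based approach under stated assumptions: the stop-word list is a tiny hypothetical one, the sentence splitter is a naive regular expression, and summing frequencies favors longer sentences.

```python
import re
from collections import Counter

# Tiny hypothetical stop-word list; NLTK provides a fuller one.
STOP_WORDS = {"the", "a", "is", "of", "and", "to", "in", "it", "are"}


def summarize(text: str, n_sentences: int = 1) -> list:
    """Return the n highest-scoring sentences, in their original order."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    words = [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOP_WORDS]
    freq = Counter(words)

    def score(sentence: str) -> int:
        # Stop words are absent from freq, so they contribute zero.
        return sum(freq[w] for w in re.findall(r"[a-z']+", sentence.lower()))

    top = sorted(sentences, key=score, reverse=True)[:n_sentences]
    return [s for s in sentences if s in top]


text = (
    "Solar panels convert sunlight into electricity. "
    "Solar electricity is cheap and solar panels last decades. "
    "Cats sleep a lot."
)
summary = summarize(text)
```

Extractive methods like this only select existing sentences. The abstractive methods in the chapter generate new sentences instead.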

Chapter 12: Automated Text Generation Using LSTM and Encoders

Chapter Goal: Text generation is a type of language modeling problem. Language modeling is the core problem for several natural language processing tasks such as speech to text, conversational systems, and text summarization. A trained language model learns the likelihood of occurrence of a word based on the previous sequence of words used in the text. Language models can operate at the character level, n-gram level, sentence level, or even paragraph level. In this chapter, we will create a language model for generating natural language text by implementing and training a state-of-the-art recurrent neural network. We will use the Python programming language for this purpose. The objective of this model is to generate new text, given some input text. We will start by building the architecture. We will be using the NLTK, Gensim, scikit-learn, and Keras libraries.

No of pages: 25

Sub - Topics:

1. Text generation concepts and application

2. Text generation architecture

3. Text preprocessing and feature engineering

4. Building the LSTM network model

5. Seq2Seq models
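
The idea in the Chapter 12 goal that a language model learns the likelihood of a word given the preceding words can be shown with a bigram count model, the simplest n-gram case. This is a stand-in for illustration, not the LSTM network the chapter builds. The function names and the toy corpus are assumptions.

```python
from collections import Counter, defaultdict


def train_bigram(corpus: list) -> dict:
    """Count how often each word follows each previous word."""
    follows = defaultdict(Counter)
    for sentence in corpus:
        tokens = sentence.lower().split()
        for prev, nxt in zip(tokens, tokens[1:]):
            follows[prev][nxt] += 1
    return follows


def next_word_probs(model: dict, prev: str) -> dict:
    """Relative-frequency estimate of P(next word | previous word)."""
    counts = model.get(prev, Counter())
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()} if total else {}


def generate(model: dict, seed: str, max_words: int = 5) -> str:
    """Greedily extend the seed with the most frequent next word."""
    words = [seed]
    for _ in range(max_words):
        counts = model.get(words[-1])
        if not counts:
            break  # The last word was never seen with a successor.
        words.append(counts.most_common(1)[0][0])
    return " ".join(words)


# Made-up training sentences.
corpus = ["the cat sat on the mat", "the cat ate the fish", "the dog sat on the rug"]
model = train_bigram(corpus)
probs = next_word_probs(model, "the")
```

A bigram model conditions on one previous word only. Recurrent networks such as LSTMs are used in the chapter because they can condition on a much longer history.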

Chapter 13: Future of NLP & Next-Gen Artificial Intelligence

Chapter Goal: In this chapter, let's summarize what we have learned so far in this book. We started from the basics and traditional tasks and moved on to advanced text generation problems. We implemented and explored how deep learning is well suited to natural language understanding. We learned classification, information retrieval systems, Q&A systems, and text generation. We will also explore why deep learning and other next-gen AI algorithms, such as GANs, capsule networks, differentiable neural computers, unsupervised/semi-supervised deep learning, attention networks, transfer learning, deep reinforcement learning, and meta-learning, are uniquely suited to NLP, what their shortcomings are, and how these algorithms could evolve and give state-of-the-art results in a slew of tasks under NLU and NLG.

No of pages: 12

Sub - Topics:

1. What did we learn

2. Future of NLP

3. Next-Gen learning algorithms for NLP

4. Deep reinforcement learning

5. What are the current challenges in NLP?

6. Research directions to solve the challenges

7. Current research in the NLP world

Akshay R Kulkarni is a renowned AI and machine learning (ML) evangelist and thought leader. He has consulted with Fortune 500 and global enterprises to drive AI and data science-led strategic transformations. Akshay has experience building and scaling AI and ML businesses and creating significant impact. He is currently Manager of Data Science & AI at Publicis Sapient on their core data science and AI team, where he is part of strategy and transformation interventions through AI. He manages high-priority growth initiatives around data science and works on AI engagements by applying state-of-the-art techniques. He is a Google Developers Expert–Machine Learning, published author of books on NLP and deep learning, and a regular speaker at major AI and data science conferences (including Strata, O’Reilly AI Conf, and GIDS). Akshay is a visiting faculty member for some of the top graduate institutes in India. In 2019, he was featured as one of the Top 40 Under 40 Data Scientists in India. In his spare time, he enjoys reading, writing, coding, and helping aspiring data scientists. He lives in Bangalore with his family.

Adarsha Shivananda is a senior data scientist on Indegene's Product and Technology team where he works on building machine learning and artificial intelligence (AI) capabilities for pharma products. He aims to build a pool of exceptional data scientists within and outside of the organization to solve problems through training programs, and always wants to stay ahead of the curve. Previously, he worked with Tredence Analytics and IQVIA. Adarsha has worked extensively in the pharma, healthcare, retail, and marketing domains. He lives in Bangalore and loves to read and teach data science.

Anoosh Kulkarni is a data scientist and senior consultant focused on artificial intelligence (AI). He has worked with global clients across multiple domains and helped them solve their business problems using mach

Covers NLP concepts and life cycle with simple and easy-to-follow end-to-end projects in Python

Includes the latest industry algorithms to implement and explain concepts and applications

Source code available at github.com/Apress/Natural-Language-Processing-Projects

Paperback

Publication date: 12-2021

317 pages

17.8x25.4 cm

Available from the publisher (supply lead time: 15 days).

Indicative price: €58.01

Subject of Natural Language Processing Projects:

Languages and programming

Keywords:

Natural Language Processing; Python; Deep Learning; Machine Learning; Text Analytics; CNN; RNN; LSTM; Recommendation System
