AI Chatbots – Can they transform user experience?

Chatbots

Over the past few months, chatbots have gained overwhelming popularity and are now at the forefront of the AI conversation.

Enthusiasts of Artificial Intelligence are well acquainted with the “Turing Test”, which is widely understood to be a criterion for machine intelligence. In the Turing Test, “if the human being conducting the test is unable to consistently determine whether an answer has been given by a computer or by another human being, then the computer is considered to have ‘passed’ the test.” Most AI systems have not passed this test in a convincing manner.

But some chatbots have reportedly managed to pass the Turing Test!

So, how can chatbots transform the user experience?

Let me explain with an example:

Consider a scenario where we need some information about the documents required for getting a driving license. Most of the time this information is displayed on the relevant website, yet contacting an official is still considered the better option to avoid multiple visits for one reason or another. Often, many people have the same questions, and answering these routine questions can be time-consuming even for an official. This is where chatbots can help. A chatbot can address many people at the same time, answering their routine questions. If programmed well, it can spare people multiple visits, making the process faster and less tedious.

WHAT EXACTLY IS A CHATBOT?

To explain in brief, chatbots are computer programs designed to convincingly simulate how a human would behave as a conversational partner. They are generally used in dialog systems for practical purposes such as customer service or information acquisition. Some chatbots also use Natural Language Processing systems.
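To make the idea concrete, here is a deliberately tiny, rule-based sketch in Java, roughly in the spirit of the driving-license example above. It is only an illustration under simple assumptions: the FaqBot class, keywords and answers are made up, and a real chatbot would replace the keyword lookup with NLP and proper dialog management.

```java
import java.util.Map;
import java.util.Scanner;

// A toy rule-based chatbot: it matches keywords in the user's message
// against canned answers, the way a simple FAQ bot for a licensing
// office might. (All keywords and answers are invented for this sketch.)
public class FaqBot {
    private static final Map<String, String> RULES = Map.of(
        "documents", "You need proof of identity, proof of address and two photographs.",
        "timing",    "The office is open from 9 AM to 5 PM on weekdays.",
        "fee",       "The application fee can be paid online or at the counter."
    );

    public static String reply(String message) {
        String text = message.toLowerCase();
        for (Map.Entry<String, String> rule : RULES.entrySet()) {
            if (text.contains(rule.getKey())) {
                return rule.getValue();
            }
        }
        return "Sorry, I did not understand. An official will get back to you.";
    }

    public static void main(String[] args) {
        Scanner in = new Scanner(System.in);
        System.out.println("Ask me about the driving license process (type 'quit' to exit).");
        while (in.hasNextLine()) {
            String line = in.nextLine();
            if (line.equalsIgnoreCase("quit")) break;
            System.out.println(reply(line));
        }
    }
}
```

Even this crude keyword lookup can answer the same routine question for any number of users at once, which is exactly the benefit described above.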

WHERE COULD CHATBOTS BE USED?

Chatbots can help improve the customer service of any business by providing immediate support, carrying out specialized tasks, increasing sales and offering a seamless mobile experience. They can also find a place in education, where they can serve as a tutoring resource for students and answer routine questions while freeing up time for human teachers.

Chatbots are attractive for several reasons. Messaging apps are already pervasive and familiar to most people. Chatbot messages are easy to read because they are short and to the point. They also work well on low bandwidth, since short text-based messages do not require a high-speed internet connection.

Taking into consideration the features and technical advantages of chatbots, they may well mark the beginning of an AI revolution and can indeed transform the user experience to a great extent!

(Image Source: www.scrapesentry.com)

Executing Long Running Tasks in Google App Engine – How to do it?

A question often flashes into the minds of developers, especially those who work on the Google Cloud Platform:

What if I am using the Google App Engine Platform as a Service and have long-running tasks that should run in the background for hours or maybe even days: is that possible?

Yes, it is possible. The answer is to use “Task Queues”, one of the most laudable features provided by Google App Engine. However, once the application is hosted in production, unexpected problems can arise with long-running tasks in task queues, and if they are not addressed, the application may behave surprisingly. Towards the end of this post, we will quickly discuss the configurations an application needs for a long-running task to run in a task queue.

Before getting into the discussion, let me quickly explain what Google App Engine and Task Queues are.

Google App Engine:

Google App Engine (commonly referred to as GAE) is a platform for building scalable web applications and mobile backends. It offers features such as automated security scanning for detecting web vulnerabilities, and it supports popular development tools such as Eclipse, IntelliJ, Maven, Git, Jenkins and PyCharm, which makes developing on GAE developer-friendly.

Moreover, features like user authentication with Google Accounts, the NoSQL Datastore, Memcache and Task Queues make GAE stand out.

Therefore, it is a popular choice for developing web applications hosted on the Google Cloud Platform.

GAE Task Queues:

Sometimes a user takes an action on the web application that triggers work which can be run outside of the user’s request and executed later.

For example, if a user wants to upload an “online” file to the web application, they can simply provide the link to the file. Instead of waiting for the upload to finish and being blocked from doing anything else in the application, the user can return any time later to check the progress of the upload. Here, the upload work is handed to a task in a task queue, which runs asynchronously outside the user’s request and completes the job. The user can carry on using the web application while the upload is still in progress in the background.
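To give a flavour of how this looks in code, here is a minimal sketch using the App Engine Java Task Queue API. The queue name “upload-queue”, the worker URL “/worker/upload” and the UploadController class are illustrative assumptions: the queue would need to be declared in queue.xml, and the worker URL mapped to a servlet that performs the actual upload.

```java
import com.google.appengine.api.taskqueue.Queue;
import com.google.appengine.api.taskqueue.QueueFactory;
import com.google.appengine.api.taskqueue.TaskOptions;

// Inside the request handler that receives the file link, hand the
// long-running upload work to a push queue and return immediately.
public class UploadController {

    public void scheduleUpload(String fileUrl) {
        // "upload-queue" is an assumed queue name declared in queue.xml.
        Queue queue = QueueFactory.getQueue("upload-queue");

        queue.add(TaskOptions.Builder
                .withUrl("/worker/upload")            // assumed worker servlet that does the job
                .param("fileUrl", fileUrl)            // payload handed to the worker
                .method(TaskOptions.Method.POST));
    }
}
```

The user’s request returns as soon as the task is enqueued; App Engine later dispatches an HTTP POST to the worker URL, where the actual download and upload run in the background.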

In this way, task queues help us carry out important work in the background.

Coming back to the topic, let us discuss how to run long-running tasks in task queues and which configurations prevent them from doing so in production.

Let’s go through some important background for achieving this:

Oftentimes an application behaves exactly as expected on the development server, but on the production server some of its features may not work or may not behave as expected. This is because when we deploy the application to the cloud, we use cloud resources and configurations that can differ from the development server. These resources can be configured properly by understanding the different instance types offered by the cloud service provider, which are generally well described in the documentation along with their respective costs.

GAE offers two types of instance configurations: frontend instances and backend instances. As the names suggest, frontend instances handle the operations carried out at the end-user level, while backend instances perform computations in the background.

By default, when we first deploy our application to GAE without editing appengine-web.xml (the application configuration file), the instance allocated is the most basic frontend instance. Naturally, each instance class comes with its own price.

As mentioned above, scalability is one of the most promising features of Google App Engine, and each instance can be configured with an appropriate scaling type. GAE offers three types of scaling: automatic scaling, basic scaling and manual scaling. The choice of scaling option can considerably affect the cost of running the application.

Since the default instance is a frontend instance, the default scaling is automatic scaling, as per the documentation. This is where the concern about running long tasks in a task queue arises. How?

As per the official documentation, under automatic scaling a task in a task queue can run for a maximum of 10 minutes. If your tasks finish within this 10-minute deadline, automatic scaling will not cause many problems. But we are talking about long-running tasks that exceed the 10-minute deadline. So, how do we make it work?

Since automatic scaling will not help, we can switch to basic scaling or manual scaling. However, a frontend instance only allows automatic scaling, so it is also necessary to use a backend instance instead of a frontend instance.

So, with a backend instance, configure the app to use basic or manual scaling. For more information on the scaling types, please refer to the official GAE documentation. (Link)

With a backend instance and manual or basic scaling, the 10-minute deadline no longer applies; a task can run in the background until it completes. However, basic scaling is preferable for controlling costs if you do not need complex initialization or to rely on the state of the instance’s memory over time.
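As a concrete illustration, a worker service configured for basic scaling on a backend (B-class) instance might use an appengine-web.xml along these lines. The service name, instance class and limits are assumptions to be tuned per application, and older SDK versions use the <module> element in place of <service>:

```xml
<!-- appengine-web.xml for a separate "worker" service that runs the
     long tasks; all values here are illustrative, not prescriptive. -->
<appengine-web-app xmlns="http://appengine.google.com/ns/1.0">
  <service>worker</service>

  <!-- B-class instance: required for basic or manual scaling -->
  <instance-class>B4</instance-class>

  <basic-scaling>
    <max-instances>2</max-instances>   <!-- cap instances to control cost -->
    <idle-timeout>30m</idle-timeout>   <!-- shut down idle workers -->
  </basic-scaling>
</appengine-web-app>
```

Tasks targeted at this service are then free to run until they finish, while the default frontend service keeps serving user requests under automatic scaling.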

Final Words:

While using the different features of Google App Engine, note that the application can behave differently on the development server and the production server. Before making a release, every feature of the application should be well tested on the production server.

Data Lake – Why should we use one?

Over the last few years, we have observed more growth in data than we have ever seen before. Many organizations see an opportunity in this big data and develop strategies to monetize it. But a major challenge remains: “Where do we store all the data?”

We have data warehouses that store data according to an organization’s prescribed standards. That means incoming data may be intercepted, put through various cleaning and smoothing operations, and only then stored in the data warehouse. This raises a concern: what should we do with data that will not be needed frequently, yet still consumes processing resources?

This is where the “Data Lake” can be introduced.

A data lake is a gigantic data repository where data is stored in its native form. It acts as a centralized repository where data coming from different sources is kept raw, without any cleaning or transformation, thereby preserving the data in its true form.

So why should one opt for a data lake?

Over the past couple of years, massive amounts of data have been generated, and this explosion of data needs to be addressed. Data lakes are often compared with data warehouses, but a data warehouse consists of different components and stores data according to standards enforced by its data transformation processes. A data lake can instead be thought of as a system that sits before the data warehouse.

The term “Data Lake” was first coined by James Dixon, CTO of Pentaho, in 2010 to contrast with the “data mart”, a smaller repository of refined data extracted from the raw data.

He explained: “If you think of a datamart as a store of bottled water – cleansed and packaged and structured for easy consumption – the data lake is a large body of water in a more natural state. The contents of the data lake stream in from a source to fill the lake and various users of the lake can come to examine, dive in, or take samples.”

Indeed, the data lake is not a replacement for the data warehouse; if designed right, it can complement your existing data warehouse and the two can work together effectively. The best part of this integration is that all formats of data (structured, semi-structured and unstructured) can be stored in one place.

(Image Source: www.solutionsreview.com)

Data Science – Understanding the concept and why it is important

Over the last decade, there has been massive growth in both the data generated and the data retained. This data is retained by companies as well as by you and me, isn’t it? We often call this “Big Data”.

Nowadays, the term “Data Science” is gaining wide recognition. But what does a data scientist do? Data scientists are the people who make sense of all this big data and determine what can be done with it to increase productivity.

Let’s understand with an example:

Consider a visit to a candy shop: a typical person picks only the candies they like, whereas a data scientist would collect every flavor and analyze them all, because they really need to know what each one tastes like. In short, the title “Data Scientist” encompasses many flavors of work. To me, that is the major difference between a “Data Scientist” and a “Statistician”, “Analyst” or “Engineer”: a data scientist does a little of each of the tasks done by a statistician, an analyst and an engineer.

To be more specific, a data scientist is one who does the following primary tasks:

  1. Data Cleaning
  2. Data Analysis
  3. Statistics
  4. Engineering

Let’s have a look at each of the tasks in brief:

  • Data Cleaning:

The data coming from different sources may contain a lot of noise, may be unformatted and may not be directly useful for generating valuable insights. This task ensures that all the data is well formatted and conforms to a set of rules and standards (a small code sketch of this step follows this list).

  • Data Analysis:

In this task, many plots of the data are made in order to understand its patterns. Through this process, theories about the data’s behavior are crafted in a way that is easy to communicate and easy to act on.

  • Statistics:

A data scientist builds models from the patterns uncovered during data analysis and develops strategies based on those statistics. The most challenging aspect of this task is that no model or statistic is a permanent solution to the problem at hand. A lot of time is therefore dedicated to evaluating and revising existing models, as well as going back to the data to engineer new features that help build better models.

  • Engineering:

The tasks discussed above are just the tip of the iceberg. Even state-of-the-art data models do not do anyone much good if their insights are not delivered to customers or users consistently. That means building a data product that can be used by people who are not data scientists, which can take many forms: chart visualizations, metrics on a dashboard, or a full application.
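To make the data-cleaning task above a little more concrete, here is a deliberately small sketch in Java. The record format (three comma-separated fields) and the cleaning rules are invented purely for illustration.

```java
import java.util.List;
import java.util.stream.Collectors;

// A toy illustration of the cleaning step: raw comma-separated records
// are trimmed, rows with missing or extra fields are dropped, and the
// first field is normalised to lower case before any analysis.
public class CleaningSketch {

    public static List<String[]> clean(List<String> rawRows) {
        return rawRows.stream()
                .map(String::trim)
                .filter(row -> !row.isEmpty())
                .map(row -> row.split(","))
                .filter(cols -> cols.length == 3)                 // rule: exactly 3 fields
                .filter(cols -> !cols[1].trim().isEmpty())        // rule: value must be present
                .map(cols -> new String[] {
                        cols[0].trim().toLowerCase(),             // standardise the key
                        cols[1].trim(),
                        cols[2].trim() })
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> raw = List.of("  Alice , 42, Pune ", "Bob,,Mumbai", "Carol,35,Indore");
        clean(raw).forEach(cols -> System.out.println(String.join("|", cols)));
    }
}
```

Real cleaning pipelines are of course far richer than this, but the idea is the same: enforce a small set of rules so that downstream analysis and modeling can trust the data.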

Looking at these tasks together, it is clear that the long-term life cycle of a data science project may involve going back and re-analyzing the models whenever new sources of data arrive and need to be incorporated.

Considering these traits and tasks, it is clear how important data science and data scientists can be to the growth of any organization in an era of intense competition and constant pressure to improve services.

(Image Courtesy: www.georgianpartners.com)

Research Paper Published on IEEE Xplore Digital Library

Greetings of the day…!

The research paper titled “Sensor Data Computing as a Service in Internet of Things”, which I presented at the International Conference on Colossal Data Analysis and Networking held at Indore, India in March 2016, was made available on the IEEE Xplore Digital Library on 19 September 2016. It feels like the months of perseverance have paid off.

Link to the digital library.

Please review the paper; your views and thoughts are welcome.

 

The details of the publication are as below:

Title: Sensor Data Computing as a Service in Internet of Things

Author: Karan N Tongay

DOI: 10.1109/CDAN.2016.7570963

Electronic ISBN: 978-1-5090-0669-4

The abstract can also be downloaded from the following link:

Download Abstract