Posts

Showing posts from January, 2018

SPARK SQL - Know How !!!

What is Spark SQL? Spark is one of the most successful projects of the Apache Software Foundation. Spark SQL is the Spark module that integrates relational processing with Spark's functional programming API. It lets you query data with SQL or the Hive query language, so anyone familiar with an RDBMS can easily relate to its syntax. Spark SQL also makes it easy to locate tables and metadata. Spark SQL works with structured and semi-structured data. Structured data has a schema, that is, a known set of fields. Semi-structured data is data in which the schema and the data are not separated. Spark SQL definition: put simply, Spark SQL is the Spark module for processing structured and semi-structured data. Hive limitations: Apache Hive was originally designed to run on top of Apache Hadoop. But it had considerable limita...

REST API Technology Overview

What Technology Goes Into an API? APIs are built on a specific set of technologies that a wide variety of developers can easily understand. A focus on simplicity means that APIs can work with any common programming language and be understood by any programmer, even one with little or no training in API technology. REST: An Application Programming Interface (API) is a set of clearly defined methods of communication between software components. A good API makes it easier to develop a computer program by providing all the building blocks. While the specifications vary from one API to another, the end goal is to provide value to the programmer through the services the API exposes. The most popular approach to delivering web APIs is Representational State Transfer (REST). This approach to API design takes advantage of the same internet mechanisms (based on HTTP) used to view regular web pages, so it has the advantage of faster im...
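To make the idea of reusing ordinary HTTP mechanisms concrete, here is a minimal sketch using only Python's standard library. It is an illustration under stated assumptions, not a production design: the `/books/<id>` URL layout, the `BOOKS` data, and the `fetch_book` helper are all hypothetical names invented for this example.

```python
import json
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-memory resource collection, for illustration only.
BOOKS = {1: {"id": 1, "title": "Example"}}


class BookHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # GET /books/<id> returns one resource as JSON, or 404 if it is missing.
        parts = self.path.strip("/").split("/")
        found = (
            len(parts) == 2
            and parts[0] == "books"
            and parts[1].isdigit()
            and int(parts[1]) in BOOKS
        )
        if found:
            status, payload = 200, BOOKS[int(parts[1])]
        else:
            status, payload = 404, {"error": "not found"}
        body = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        # Silence the default per-request logging.
        pass


def fetch_book(book_id):
    """Start a throwaway local server, issue one GET, return (status, JSON)."""
    server = HTTPServer(("127.0.0.1", 0), BookHandler)  # port 0: any free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        conn = HTTPConnection("127.0.0.1", server.server_port, timeout=5)
        conn.request("GET", f"/books/{book_id}")
        response = conn.getresponse()
        result = (response.status, json.loads(response.read()))
        conn.close()
        return result
    finally:
        server.shutdown()
        server.server_close()
```

The client side is a plain HTTP GET, the same kind of request a browser sends for a web page; only the response body (JSON instead of HTML) differs.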

Artificial Intelligence Revolution : A Chip with 5 Senses

Have you ever wished that your phone could talk like an actual person? Imagine if your computer could interact with you in a way that conveyed logic as well as emotion, or if, when taking a selfie, your phone told you that those shoes DO NOT go with that outfit. Incredible, isn’t it? Well, this dream-like fantasy is about to become a reality, because the latest revolution in this area of technology is a stepping stone to just that. Experts have managed to develop a chip that is capable of detecting input for all five senses, much as a human brain can. This device is remarkably similar to the human brain, being capable of sensing what is around it by sight, sound, smell, touch, and even taste. What’s even more exceptional is that, like a human brain, the chip consumes very little power, which makes integrating it into everyday hardware very easy. Even though it is still in its ear...

Apache Spark RDD - Sampling With Replacement and Sampling Without Replacement

Sampling is a popular Spark RDD operation. Sampling with replacement and sampling without replacement are two different ways to sample, and this article explains the difference between them. Sampling with replacement: Consider a population of potato sacks, each of which has either 12, 13, 14, 15, 16, 17, or 18 potatoes, and all the values are equally likely. Suppose that, in this population, there is exactly one sack with each number, so the whole population has seven sacks. If I sample two with replacement, then I first pick one (say 14). I had a 1/7 probability of choosing that one. Then I put it back and pick another. Every sack still has a 1/7 probability of being chosen. There are exactly 49 different possibilities here (assuming we distinguish between the first and second picks). They are: (12,12), (12,13), (12,14), (12,15), (12,16), (12,17), (12,18), (13,12), (13,13), (13,14), etc. Sampling without replacement: Consider the same population of potato sacks, ...
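The counting in the potato-sack example can be checked with a short sketch in plain Python. It enumerates ordered pairs rather than calling Spark, and the variable names are chosen here for illustration. With replacement, the second pick can repeat the first; without replacement, it cannot.

```python
from itertools import permutations, product

# The seven sacks from the example, identified by their potato counts.
sacks = [12, 13, 14, 15, 16, 17, 18]

# With replacement: the first sack goes back before the second pick,
# so every ordered pair is possible, including repeats such as (14, 14).
with_replacement = list(product(sacks, repeat=2))

# Without replacement: the first sack stays out,
# so the second pick must differ from the first.
without_replacement = list(permutations(sacks, 2))

print(len(with_replacement))     # 7 * 7 ordered pairs
print(len(without_replacement))  # 7 * 6 ordered pairs
```

The first count is 49, matching the figure above; the second is 42, because the seven pairs that repeat a sack drop out. In Spark itself, the choice between the two is made through the `withReplacement` argument of `RDD.sample`, which is not exercised in this sketch.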