Category: Big Data

29 Aug by Sohel Teli

Rack Awareness by sohel teli

Rack Awareness: Rack awareness is knowledge of the cluster topology, or more specifically of how the data nodes are distributed across the racks of a Hadoop cluster. This knowledge matters because of the assumption that collocated data nodes inside a single rack have more bandwidth and lower latency between them, whereas two data […]
27 Aug by Abhinav Khandelwal

Automated Text Classification with Machine Learning

In the era of digitization, information availability online has witnessed an exponential surge. The internet is brimming with textual content—be it emails, web pages, news, learning content, or journals—and this calls for effective ways to read, analyze and report this information efficiently. Text analysis is an integral aspect across several verticals, including marketing, product development, […]
26 Aug by Prabhu Sundarraj

Hadoop-3.1.0 Multi Node Connection and Configuration in easiest way

HADOOP MULTI NODE INSTALLATION PROCESS   [ Note: I am creating the multi-node setup in the easiest way, by creating and using everything in root mode on default users like (master, slave1, slave2…etc). Please don’t create any Hadoop users like (hadoop, hduser, …etc) to configure and share Hadoop installations. I am using the root user in all […]
19 Aug by Vasim Shaikh

Cloudera manager installation on google cloud

To deploy Cloudera Manager and CDH on a Google Compute VM instance, begin by creating an environment. The environment defines common settings, like region and key pair, that Cloudera Director uses with Google Cloud Platform. While creating an environment, you are also prompted to deploy its first cluster. Create an environment: Open a web […]