Apache Spark - Interview Questions

Apache Spark - GraphX - Interview Questions

What is Apache Spark GraphX?

Apache Spark GraphX is the Spark component for graphs and graph-parallel computation. It lets you work with the same data both as a graph and as collections (RDDs).

GraphX implements a variety of graph algorithms and provides a flexible API to utilize the algorithms.

What are the different types of operators provided in the Apache Spark GraphX library?

Apache Spark GraphX provides the following types of operators - Property operators, Structural operators and Join operators.

Property Operators - Property operators modify the vertex or edge properties using a user-defined map function and produce a new graph.

Structural Operators - Structural operators operate on the structure of an input graph and produce a new graph.

Join Operators - Join operators add data from external collections (RDDs) to a graph and produce a new graph.

What are the property operators provided in the GraphX library?

Property operators modify the vertex or edge properties using a user-defined map function and produce a new graph.

Property operators do not change the graph structure, so the resulting graph can reuse the structural indices of the original graph.

Apache Spark GraphX provides the following property operators - mapVertices(), mapEdges(), mapTriplets().
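The three property operators can be sketched as follows. This is a minimal illustration rather than a reference implementation: it assumes Spark with GraphX on the classpath and a local master, and the vertex names, edge weights and value names are made up for the example.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph, VertexId}
import org.apache.spark.rdd.RDD

// Assumption: local mode; in spark-shell this reuses the existing context.
val sc = SparkContext.getOrCreate(
  new SparkConf().setMaster("local[*]").setAppName("graphx-property-operators"))

// Hypothetical sample data: three named vertices, two weighted edges.
val vertices: RDD[(VertexId, String)] =
  sc.parallelize(Seq((1L, "alice"), (2L, "bob"), (3L, "carol")))
val edges: RDD[Edge[Int]] =
  sc.parallelize(Seq(Edge(1L, 2L, 5), Edge(2L, 3L, 7)))
val graph: Graph[String, Int] = Graph(vertices, edges)

// mapVertices: derive a new vertex attribute from (id, attribute).
val upper: Graph[String, Int] = graph.mapVertices((id, name) => name.toUpperCase)

// mapEdges: derive a new edge attribute from each edge.
val doubled: Graph[String, Int] = graph.mapEdges(e => e.attr * 2)

// mapTriplets: like mapEdges, but the function also sees both endpoint attributes.
val labelled: Graph[String, String] =
  graph.mapTriplets(t => s"${t.srcAttr}->${t.dstAttr}:${t.attr}")
```

In each case the set of vertices and edges is unchanged; only the attached attributes differ.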

What are the structural operators provided in the GraphX library?

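Structural operators operate on the structure of an input graph and produce a new graph.

Apache Spark GraphX provides the following structural operators.

reverse() - Returns a new graph with the direction of every edge reversed.

subgraph() - Returns a graph containing only the vertices and edges that satisfy the given vertex and edge predicates.

mask() - Returns the subgraph containing only the vertices and edges that are also present in another graph.

groupEdges() - Merges parallel edges (duplicate edges between the same pair of vertices) into a single edge.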
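The structural operators listed above might be used as in the sketch below. It assumes Spark with GraphX on the classpath and a local master; the sample graph and value names are invented for illustration. The graph is repartitioned before groupEdges() because that operator only merges parallel edges that sit in the same partition.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph, PartitionStrategy, VertexId}
import org.apache.spark.rdd.RDD

// Assumption: local mode; in spark-shell this reuses the existing context.
val sc = SparkContext.getOrCreate(
  new SparkConf().setMaster("local[*]").setAppName("graphx-structural-operators"))

// Hypothetical sample data: two parallel edges 1->2 and one edge 2->3.
val vertices: RDD[(VertexId, String)] =
  sc.parallelize(Seq((1L, "a"), (2L, "b"), (3L, "c")))
val edges: RDD[Edge[Int]] =
  sc.parallelize(Seq(Edge(1L, 2L, 1), Edge(1L, 2L, 4), Edge(2L, 3L, 2)))
val graph: Graph[String, Int] = Graph(vertices, edges)

// reverse: every edge src->dst becomes dst->src.
val reversed = graph.reverse

// subgraph: keep only vertices whose attribute is not "c";
// edges touching a removed vertex are dropped as well.
val withoutC = graph.subgraph(vpred = (id, attr) => attr != "c")

// mask: restrict `graph` to the vertices and edges also present in `withoutC`.
val masked = graph.mask(withoutC)

// groupEdges: merge parallel edges by summing their attributes.
val merged = graph
  .partitionBy(PartitionStrategy.RandomVertexCut)
  .groupEdges(_ + _)
```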

What are the join operators provided in the GraphX library?

Join operators join data from external collections (RDDs) with graphs. Apache Spark GraphX provides the following join operators.

joinVertices() - The joinVertices() operator joins the input RDD data with the vertices and returns a new graph. The new vertex properties are obtained by applying the user-defined map function to each vertex that has a matching entry in the RDD. Vertices without a matching value in the RDD retain their original value.

outerJoinVertices() - The outerJoinVertices() operator joins the input RDD data with the vertices and returns a new graph. The user-defined map function is applied to all vertices, including those that are not present in the input RDD, so it receives the joined value as an Option and may change the vertex property type.
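The difference between the two join operators can be seen in a small sketch. It assumes Spark with GraphX on the classpath and a local master; the graph, the `bonus` RDD and all values are hypothetical.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph, VertexId}
import org.apache.spark.rdd.RDD

// Assumption: local mode; in spark-shell this reuses the existing context.
val sc = SparkContext.getOrCreate(
  new SparkConf().setMaster("local[*]").setAppName("graphx-join-operators"))

// Hypothetical sample graph with an Int score on each vertex.
val vertices: RDD[(VertexId, Int)] =
  sc.parallelize(Seq((1L, 10), (2L, 20), (3L, 30)))
val edges: RDD[Edge[String]] =
  sc.parallelize(Seq(Edge(1L, 2L, "x"), Edge(2L, 3L, "y")))
val graph: Graph[Int, String] = Graph(vertices, edges)

// External data that covers vertices 1 and 2 only.
val bonus: RDD[(VertexId, Int)] = sc.parallelize(Seq((1L, 1), (2L, 2)))

// joinVertices: the function runs only for matched vertices;
// vertex 3 keeps its original value and the attribute type stays Int.
val joined: Graph[Int, String] =
  graph.joinVertices(bonus)((id, score, b) => score + b)

// outerJoinVertices: the function runs for every vertex and receives an Option,
// so the attribute type can change (here to a pair).
val outer: Graph[(Int, Option[Int]), String] =
  graph.outerJoinVertices(bonus)((id, score, b) => (score, b))
```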

What are the neighborhood aggregation operations provided in the GraphX library?

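Apache Spark GraphX provides the following neighborhood aggregation operations.

aggregateMessages() - Applies a user-defined send function to each edge triplet in the graph and uses a user-defined merge function to combine the messages arriving at each vertex.

mapReduceTriplets() - The legacy neighborhood aggregation operator from earlier GraphX versions, which aggregateMessages() replaces.

collectNeighbors() and collectNeighborIds() - Collect the neighboring vertices (with their attributes) or the neighboring vertex IDs at each vertex.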
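A short sketch of neighborhood aggregation follows. It assumes Spark with GraphX on the classpath and a local master; the sample graph (vertex attribute read as an age) and the value names are made up for illustration.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, EdgeDirection, Graph, VertexId, VertexRDD}
import org.apache.spark.rdd.RDD

// Assumption: local mode; in spark-shell this reuses the existing context.
val sc = SparkContext.getOrCreate(
  new SparkConf().setMaster("local[*]").setAppName("graphx-neighborhood-aggregation"))

// Hypothetical sample graph: vertex attribute is an age, edges mean "follows".
val vertices: RDD[(VertexId, Int)] =
  sc.parallelize(Seq((1L, 30), (2L, 40), (3L, 50)))
val edges: RDD[Edge[Int]] =
  sc.parallelize(Seq(Edge(1L, 3L, 1), Edge(2L, 3L, 1), Edge(1L, 2L, 1)))
val graph: Graph[Int, Int] = Graph(vertices, edges)

// aggregateMessages: every edge sends the source's age to its destination,
// and messages arriving at the same vertex are summed.
// Vertices that receive no message (vertex 1 here) are absent from the result.
val followerAgeSum: VertexRDD[Int] = graph.aggregateMessages[Int](
  ctx => ctx.sendToDst(ctx.srcAttr),
  _ + _)

// collectNeighborIds: IDs of adjacent vertices, ignoring edge direction.
val neighborIds: VertexRDD[Array[VertexId]] =
  graph.collectNeighborIds(EdgeDirection.Either)
```

Because aggregateMessages() only materializes the merged messages, it is usually preferable to collecting full neighbor lists when a simple aggregate such as a sum or count is all that is needed.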

How do you build graphs from a collection of vertices and edges in an RDD using the GraphX library?

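Apache Spark GraphX provides several operations to build graphs from RDDs of vertices and edges, or from an edge list file.

GraphLoader.edgeListFile() - Loads a graph from a text file in which each line holds a source vertex ID and a destination vertex ID.

Graph.apply() - Builds a graph from an RDD of vertices and an RDD of edges.

Graph.fromEdges() - Builds a graph from an RDD of edges only, creating the vertices mentioned by the edges and assigning them a default value.

Graph.fromEdgeTuples() - Builds a graph from an RDD of (source ID, destination ID) tuples, assigning the edges the value 1 and the vertices a default value.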
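The four builders can be compared side by side in the sketch below. It assumes Spark with GraphX on the classpath and a local master with the local file system as the default; the sample data, default values and the temporary file are invented for the example.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.Files
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph, GraphLoader, VertexId}
import org.apache.spark.rdd.RDD

// Assumption: local mode; in spark-shell this reuses the existing context.
val sc = SparkContext.getOrCreate(
  new SparkConf().setMaster("local[*]").setAppName("graphx-graph-builders"))

// Hypothetical sample data; vertex 4 appears in an edge but has no vertex entry.
val vertices: RDD[(VertexId, String)] =
  sc.parallelize(Seq((1L, "a"), (2L, "b"), (3L, "c")))
val edges: RDD[Edge[Int]] =
  sc.parallelize(Seq(Edge(1L, 2L, 7), Edge(2L, 3L, 8), Edge(3L, 4L, 9)))

// Graph.apply: vertices missing from the vertex RDD get the default attribute.
val fromBoth: Graph[String, Int] = Graph(vertices, edges, "unknown")

// Graph.fromEdges: all vertices are created from the edges with one default value.
val fromEdgesOnly: Graph[String, Int] = Graph.fromEdges(edges, "none")

// Graph.fromEdgeTuples: only (src, dst) pairs are given; edge attributes are set to 1.
val tuples: RDD[(VertexId, VertexId)] = sc.parallelize(Seq((1L, 2L), (2L, 3L)))
val fromTuples: Graph[Int, Int] = Graph.fromEdgeTuples(tuples, 0)

// GraphLoader.edgeListFile: one "srcId dstId" pair per line of a text file.
val edgeFile = Files.createTempFile("edges", ".txt")
Files.write(edgeFile, "1 2\n2 3\n".getBytes(StandardCharsets.UTF_8))
val fromFile: Graph[Int, Int] = GraphLoader.edgeListFile(sc, edgeFile.toString)
```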

What are the analytic algorithms provided in Apache Spark GraphX?

Apache Spark GraphX provides a set of algorithms to simplify analytics tasks.

PageRank - PageRank measures the importance of each vertex in a graph.

Connected Components - The connected components algorithm labels each connected component of the graph with the ID of its lowest-numbered vertex.

Triangle Counting - A vertex is part of a triangle when it has two adjacent vertices with an edge between them. GraphX implements a triangle counting algorithm in the TriangleCount object that determines the number of triangles passing through each vertex, providing a measure of clustering.
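The three algorithms can be tried on a toy graph as sketched below. It assumes Spark with GraphX on the classpath and a local master; the graph (one triangle plus one separate edge), the tolerance value and the names are chosen only for illustration. The edges are written with source ID below destination ID and the graph is partitioned first, which older GraphX versions required for triangle counting.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Graph, PartitionStrategy, VertexId}
import org.apache.spark.rdd.RDD

// Assumption: local mode; in spark-shell this reuses the existing context.
val sc = SparkContext.getOrCreate(
  new SparkConf().setMaster("local[*]").setAppName("graphx-algorithms"))

// Hypothetical sample graph: a triangle 1-2-3 and a separate edge 4-5.
val pairs: RDD[(VertexId, VertexId)] =
  sc.parallelize(Seq((1L, 2L), (2L, 3L), (1L, 3L), (4L, 5L)))
val graph: Graph[Int, Int] =
  Graph.fromEdgeTuples(pairs, 1).partitionBy(PartitionStrategy.RandomVertexCut)

// PageRank: iterate until ranks change by less than the given tolerance.
val ranks = graph.pageRank(0.0001).vertices

// Connected components: each vertex is labelled with the lowest vertex ID
// in its component.
val components = graph.connectedComponents().vertices

// Triangle count: number of triangles passing through each vertex.
val triangles = graph.triangleCount().vertices
```

Each call returns a new graph whose vertex attribute holds the result, so the results can be joined back to the original vertex data with the join operators.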

 
Important Keywords to Remember

Apache Spark GraphX
Property Operators
Structural Operators
Join Operators
Neighborhood Aggregation Operations
Analytic Algorithms
PageRank Algorithm
Connected Components Algorithm
Triangle Counting Algorithm