What is GraphDB?
GraphDB introduced in SQL Server 2017. We will examine GraphDB with examples in this article. At the end of the article, we will learn the architecture of GraphDB and how to query. In our article, we will also design a Normal table to address the difference between Relational DB and GraphDB.
GraphDB consists of Node and Edge tables. A Node represents an entity. This can be a person or organization. An Edge indicates the relationship between the two nodes to which it is connected. You can create Node and Edge tables under any Schema in the database, but they all belong to a logical graph.
Example
SQL Server Relational Database Example
In this example, we will see how the relational database query works and create three tables for it.
Our Tables: Personel, WorksIn ve Company
After creating these tables, we will insert our demo data into our tables and prepare related queries.
Let’s create our Demo database.
Create Database
1 2 | Create Database SQLTurkiye_GraphDB Go |
Create Demo Tables
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 | USE [SQLTurkiye_GraphDB] GO CREATE TABLE [dbo].[Personel]( [Personel_id] [int] NOT NULL, [Name] [nvarchar](75) NULL, [Gender] [nvarchar](6) NULL, PRIMARY KEY CLUSTERED ( [Personel_id] ASC )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY] ) ON [PRIMARY] GO CREATE TABLE [dbo].[WorksIn]( [Personel_id] [int] NULL, [Cmp_ID] [int] NULL, [Since] [int] NULL ) ON [PRIMARY] GO CREATE TABLE [dbo].[Company]( [cmp_ID] [int] NULL, [cmpName] [varchar](100) NULL ) ON [PRIMARY] GO |
Insert Records into Tables
Insert data into created tables.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 | INSERT INTO Personel VALUES (1,'Yusuf Kahveci','Male'); INSERT INTO Personel VALUES (2,'Yasar Kahveci','Male'); INSERT INTO Personel VALUES (3,'Julia Mertz Shand','Female'); INSERT INTO Personel VALUES (4,'March Hons','Male'); INSERT INTO Personel VALUES (5,'Lady Mish','Female'); INSERT INTO Personel VALUES (6,'Elizabeth Tork','Female'); INSERT INTO Company VALUES (1,'SQL Turkiye') INSERT INTO Company VALUES (2,'noSQLturkiye') INSERT INTO Company VALUES (3,'dataplatform services') INSERT INTO Company VALUES (4,'Orange Performance') INSERT INTO WorksIn VALUES(1,1,2015) INSERT INTO WorksIn VALUES(2,2,2014) INSERT INTO WorksIn VALUES(3,3,2016) INSERT INTO WorksIn VALUES(4,3,2016) INSERT INTO WorksIn VALUES(5,3,2014) INSERT INTO WorksIn VALUES(6,4,2014) |
You can see the columns we relate the tables in the graph below.
As you can see, we can access the WorksIn table by using Personel_id in the Personel table by matching it with Personel_id in WorksIn.
Then we can access the Company table by using Cmp_Id in WorksIn table by matching it with the cmp_ID in Company table.
The query should be as follows;
1 2 3 4 | SELECT p.[Name] FROM Personel p INNER JOIN WorksIn w ON p.Personel_id=w.Personel_id INNER JOIN Company c ON w.Cmp_ID=c.Cmp_ID WHERE c.cmpName='SQL Turkiye' |
In the query above, we accessed the data using Foreign Keys to find employees in the company. In this example we have reached this result with 3 tables and 3 FK, but as the complexity increases, we will be able to access the result using more tables and more FK. The structure will grow and the effort will increase.
Do not forget the above example and let’s analyze the same process in the GraphDB model using Nodes and Edges.
SQL Server GraphDB Example
Create Nodes and Edges Tables
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 | CREATE TABLE [dbo].[Company_Node]( [ID] [int] NOT NULL, [name] [varchar](100) NULL, [sector] [varchar](100) NULL, [city] [varchar](100) NULL ) AS NODE; CREATE TABLE [dbo].[Personel_Node]( [ID] [int] NOT NULL, [name] [varchar](100) NULL, [Gender] [char](10) NULL ) As Node; CREATE TABLE [dbo].[City_Node]( [ID] [int] NOT NULL, [name] [varchar](100) NULL, [stateName] [varchar](100) NULL ) As Node; CREATE TABLE [WorksIn_Edge] ([year] [int] )AS EDGE CREATE TABLE LocatedIn_Edge as edge; CREATE TABLE LivesIn_Edge as edge; |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 | INSERT INTO [Personel_Node] VALUES (1,'Yusuf Kahveci','Male'); INSERT INTO [Personel_Node] VALUES (2,'Yasar Kahveci','Male'); INSERT INTO [Personel_Node] VALUES (3,'Taj Shand','Male'); INSERT INTO [Personel_Node] VALUES (4,'Archer Lamble','Male'); INSERT INTO [Personel_Node] VALUES (5,'Piper Koch','FeMale'); INSERT INTO [Personel_Node] VALUES (6,'Katie Darwin','FeMale'); INSERT INTO [Company_Node] VALUES (1,'SQL Turkiye','IT','Istanbul'); INSERT INTO [Company_Node] VALUES (2,'noSQLturkiye','IT','Istanbul'); INSERT INTO [Company_Node] VALUES (3,'Fabrikam Land','Pharma','Jonesbough'); INSERT INTO [Company_Node] VALUES (4,'Nod Publishers', 'IT','Jonesbough'); INSERT INTO [City_Node] VALUES (1,'Istanbul','Atasehir'); INSERT INTO [City_Node] VALUES (2,'Istanbul','Atasehir'); INSERT INTO [City_Node] VALUES (3,'Jonesbough','Lancing'); INSERT INTO [City_Node] VALUES (4,'Abbeville','Lancing'); INSERT INTO [City_Node] VALUES (5,'Zortman','Wyoming'); INSERT INTO [City_Node] VALUES (6,'Zortman','Wyoming'); |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 | INSERT INTO [WorksIn_Edge] VALUES ((SELECT $node_id FROM [Personel_Node] WHERE ID = 1), (SELECT $node_id FROM [Company_Node] WHERE ID = 1),2015); INSERT INTO [WorksIn_Edge] VALUES ((SELECT $node_id FROM [Personel_Node] WHERE ID = 2), (SELECT $node_id FROM [Company_Node] WHERE ID = 2),2014); INSERT INTO [WorksIn_Edge] VALUES ((SELECT $node_id FROM [Personel_Node] WHERE ID = 3), (SELECT $node_id FROM [Company_Node] WHERE ID = 3),2015); INSERT INTO [WorksIn_Edge] VALUES ((SELECT $node_id FROM [Personel_Node] WHERE ID = 4), (SELECT $node_id FROM [Company_Node] WHERE ID = 3),2016); INSERT INTO [WorksIn_Edge] VALUES ((SELECT $node_id FROM [Personel_Node] WHERE ID = 5), (SELECT $node_id FROM [Company_Node] WHERE ID = 3),2014); INSERT INTO [WorksIn_Edge] VALUES ((SELECT $node_id FROM [Personel_Node] WHERE ID = 6), (SELECT $node_id FROM [Company_Node] WHERE ID = 4),2014); Insert into LocatedIn_Edge values ((select $node_id FROM [Company_Node] WHERE ID = 1), (select $node_id FROM [City_Node] where ID=2)) Insert into LocatedIn_Edge values ((select $node_id FROM [Company_Node] WHERE ID = 2), (select $node_id FROM [City_Node] where ID=1)) Insert into LocatedIn_Edge values ((select $node_id FROM [Company_Node] WHERE ID = 3), (select $node_id FROM [City_Node] where ID=3)) Insert into LocatedIn_Edge values ((select $node_id FROM [Company_Node] WHERE ID = 4), (select $node_id FROM [City_Node] where ID=2)) Insert into LivesIn_Edge values ((select $node_id FROM [City_Node] WHERE ID = 1), (select $node_id FROM [City_Node] where ID=6)) Insert into LivesIn_Edge values ((select $node_id FROM [City_Node] WHERE ID = 2), (select $node_id FROM [City_Node] where ID=5)) Insert into LivesIn_Edge values ((select $node_id FROM [City_Node] WHERE ID = 3), (select $node_id FROM [City_Node] where ID=4)) Insert into LivesIn_Edge values ((select $node_id FROM [City_Node] WHERE ID = 4), (select $node_id FROM [City_Node] where ID=2)) Insert into LivesIn_Edge values ((select $node_id FROM [City_Node] WHERE ID = 5), (select $node_id FROM [City_Node] where ID=3)) Insert into LivesIn_Edge values ((select $node_id FROM [City_Node] WHERE ID = 6), (select $node_id FROM [City_Node] where ID=1)) |
1 2 3 4 5 | SELECT * FROM [Personel_Node] SELECT * FROM [Company_Node] SELECT * FROM [City_Node] |
Select From Edge Tables;
Now we need to write CQL and EXEC to get results from Graph. Our example is to get the names of employees working in a specific company. Accordingly, our CQL is;
1 2 3 | SELECT PN.name FROM [Personel_Node] PN, [WorksIn_Edge], [Company_Node] CN WHERE MATCH(PN-(WorksIn_Edge)->CN) AND CN.name='SQL Turkiye'; |
As seen in the graph above, we see that the query will need 1 index search.
Now let’s work with different examples,
1 2 3 | SELECT CN.name FROM [Personel_Node] PN, [WorksIn_Edge], [Company_Node] CN WHERE MATCH(PN-(WorksIn_Edge)->CN) AND PN.name='Yusuf Kahveci'; |
1 2 3 4 | select Personel_Node.name ,Company_Node.name From [Personel_Node] ,[WorksIn_Edge], [Company_Node] ,LocatedIn_Edge,[City_Node] where MATCH(Personel_Node-(WorksIn_Edge)->Company_Node and Company_Node-(LocatedIn_Edge)->City_Node ) and WorksIn_Edge.year='2014' and Company_Node.name='Fabrikam Land' |