Thursday , December 26 2024

Fulltext Search On SQL Server

 

Full-text Search used to search for columns with large text, such as varchar (max).

You can do the same with LIKE, but LIKE will work very slowly in columns with large text data of this type.

Let me tell you a story about this.

 

One day, an application team came to me an they said:

“We want to buy a database server. What features it should have?”

 

I said:

“I want to check your queries at first and then I will say the necessary features if you really need it.”

 

When I examined the queries on the server, I saw that all the queries were searching on varchar (max) columns with “like”.

and CPUs were consistently above 90%.

I put a full-text index on the search columns and told them to use “contains” instead of “like”.

And as a result of this change, CPU utilization dropped to around 10% and application performance increased about 20 times.

As a result of this, they canceled buying new server.

As you can see, you can increase performance 20 times using full-text search.

 

How do we perform full-text search?

First, create a table for doing full text search tests in the database named test using the following script.

Note: The table you want to define a Full Text index must have a unique column.

 

Creating Full-Text Catalog:

We click New Full-Text Catalog under Storage-> Full Text Catalogs under the database as below.

 

In the screen that opens, we give a name and an owner to the catalog as below (you have to click … on the right side to give the owner) and we click ok.

You should decide whether to choose Sensitive or Insensitive. If you select Sensitive, ‘a’ = ‘á’ means the same.

 

After creating catalog, we need to create our full text index.

 

Creating Full-Text Index:

Right-click on the table we created at the beginning of the article, select Full-Text index and click on Define Full-Text Index ….

 

We click next on the screen and the following screen appears.

In this screen, we need to select a unique index on the table. The data will be accessed via this unique index.

Click Next.

 

On the next screen, the columns listed in the table.

We choose the column we want to create a fulltext index.

If we create a full text index on column varbinary (MAX) or image “type column” will be active.

Since this type of image file can not be searched, another column that gives reference to this column must be selected from Type Colum.

Click Next.

 

On the next screen, we choose whether or not the changes in the table are automatically transferred to the full-text index.

We prefer to do this automatically by selecting Automatically.

Because we select it automatically, when the full-text index is created, the table’s data will be automatically transferred to the index.

This process is called full population. Click next.

 

On the incoming screen,

In the “Select full-text catalog” section we select the <full text catalog> that I created earlier.

In the “Select index filegroup” section, we specify the filegroup where we will put the index.

Because our database is a test database, we have only PRIMARY filegroup.

I generally recommend creating another filegroup for indexes in Production databases.

You can learn the tricks of creating a database with large data from the article titled “How To Create a Database On SQL Server“.

“Select full-text stoplist” contains a list of words that will not be searched.

Right click on “Full Text Stoplists” and click on New Full-Text Stoplist ….

 

From the full text stoplist name we give a name to our stop list.

We also set an owner from the Owner section by clicking on the box next to it and choosing browse.

Click Create an empty stoplist to complete the process.

 

Right click on the stop list we created and click properties.

 

In the screen that appear, we click on “Add stop word” in the Action section and write a word in the stop list that we do not want it to return as a result in the search results.

Add the “dbtut” word as a stop word.

You should select your language from Full-text language.

 

When we come back to the full text index create screen again, you can see the Stoplist we created.

If you do not see it then click back and then click next.

After we have chosen our stoplist, we move forward by clicking next.

 

On the next screen, we complete the process by saying next and finish without making any changes.

 

Searching from Full-Text Index:

You can use Contains, ContainsTable, Freetext, FreetextTable functions for searching.

 

Contains:

Using this function you can search for an exact word. It works as follows.

It takes 2 parameters. The first parameter is the full text index’s column’s name. The other parameter is the word or words to search.

We are only looking for the word “Nurullah” with the script below.

 

With the following script we are looking for “Nurullah ÇAKIR” as a space between Nurullah and ÇAKIR.

 

ContainsTable:

It shows the rank (degree) of the searched word. You can use it in the following way. The parameters are the table, the column with the full-text index and the word to be searched.

 

Freetext:

It works in the same way as the Contains function.

When Contains finds the exact word you are looking for, the FREETEXT function will also result in different states of the word.

For example, if a row in the table had an expression like “ÇAKIR NURULLAH”, the following query would return a result.

Or he would accept the words “Nurullah” and “Nurullah” as the same, and he would return them as a result.

 

FreetextTable:

It also returns the rank (degree) as ContainsTable.

The difference from the ContainsTable is that the rank will be higher because it will accept the same words “ÇAKIR” and “ÇAKIRS”.

These functions have much more details. I will not go into more details on this article.

Loading

About dbtut

We are a team with over 10 years of database management and BI experience. Our Expertises: Oracle, SQL Server, PostgreSQL, MySQL, MongoDB, Elasticsearch, Kibana, Grafana.

Leave a Reply

Your email address will not be published. Required fields are marked *