The purpose of this article is to provide a way of cleaning up of HTML tags within the data. When we use various styles or tabular format data in UI using Rich Text Editor/ Rad Grid etc, it will save data in database with HTML tags.
When you save in database this kind of field you have:
An HTML element starts with a start tag (<p>) and ends with end tag (<p/>) and everything between Start tag and End tag is HTML element. e.g.
<b>Following are the popular databases: <br /> </b>1. SQL Server <br /> 2. Oracle <br /> 3.
Teradata <br /> 4. Sybase
We are using SQL Server CHARINDEX function to get the positions of ‘<’ and ‘>’ and once found replacing the string between <….> with blank using STUFF function. We are using WHILE Loop that will run till presence of ‘<’ & ‘>’ in the string. Below is the UDF script that performs HTML Tags clean up from data.
SET ANSI_NULLS ON GO SET QUOTED_IDENTIFIER ON GO CREATE FUNCTION [dbo].[usp_ClearHTMLTags] (@String NVARCHAR(MAX)) RETURNS NVARCHAR(MAX) AS BEGIN DECLARE @Start INT, @End INT, @Length INT WHILE CHARINDEX('<', @String) > 0 AND CHARINDEX('>', @String, CHARINDEX('<', @String)) > 0 BEGIN SELECT @Start = CHARINDEX('<', @String), @End = CHARINDEX('>', @String, CHARINDEX('<', @String)) SELECT @Length = (@End - @Start) + 1 IF @Length > 0 BEGIN SELECT @String = STUFF(@String, @Start, @Length, '') END END RETURN @String END GO
By using above UDF, We can clean the HTML tags from the data.