Oracle Function:Translate & Replace
Using Oracle function TRANSLATE and REPLACE to remove special characters from a string.
This is the definition of the function.
TRANSLATE(expr, from_string, to_string)
Descrption: [FROM ORACLE DOCUMENTATION]
TRANSLATE returns expr with all occurrences of each character in from_string replaced by its corresponding character in to_string. Characters in expr that are not in from_string are not replaced. If expr is a character string, then you must enclose it in single quotation marks. The argument from_string can contain more characters than to_string. In this case, the extra characters at the end of from_string have no corresponding characters in to_string. If these extra characters appear in char, then they are removed from the return value.
You cannot use an empty string for to_string to remove all characters in from_string from the return value. Oracle Database interprets the empty string as null, and if this function has a null argument, then it returns null.
Let’s see…
SQL> create table t6 (a varchar2(20));
Table created.
SQL> insert into t6 values ('ABC!@#$%^&*XYZ');
1 row created.
SQL> commit;
Commit complete.
SQL> select * from t6;
A
--------------
ABC!@#$%^&*XYZ
SQL> SELECT TRANSLATE(a, '!@#$%^&*', ' ') FROM t6;
TRANSLATE(A,'!@#$%^&
---------------------
ABC XYZ
#####
Notice that I removed the $ (DOLLAR) sign from the list of junk.
#####
SQL> SELECT TRANSLATE(a, '!@#%^&*', ' ') FROM t6;
TRANSLATE(A,'!@#%^&*
--------------------
ABC $XYZ
#####
I would create a function that accepts a varchar2 and returns a ‘cleaned’ varchar2 for this purpose.
#####
SQL> create or replace function clean_str(in_string in varchar2)
2 return varchar2
3 is
4 out_string varchar2(4000);
5 remove_char varchar2(100);
6 begin
7 remove_char:='!@#$%^&*æ';
8 select TRANSLATE(in_string,remove_char , ' ') into out_string FROM dual;
9 return (out_string);
10 end;
11 /
Function created.
SQL> select clean_str(a) from t6;
CLEAN_STR(A)
------------
ABC XYZ
I kept the last argument ‘ ‘ (a space) because of the NULL rule mentioned above in the description. I just asked Oracle to assign a single space instead.
SQL> insert into t6 values ('ABC!@#$XYZ%^&*XML');
1 row created.
SQL> commit;
Commit complete.
SQL> select * from t6;
A
-----------------
ABC!@#$%^&*XYZ
ABC!@#$XYZ%^&*XML
SQL> select clean_str(a) from t6;
CLEAN_STR(A)
------------
ABC XYZ
ABC XYZXML
Now that I have a string like ‘ABC XYZ’ and ‘ABC XYZXML’, I am going to use Oracle function REPLACE to remove the extra space(s). The advantage of REPLACE is that it replaces all the occurances of the string/character with the provided replacement string. So lets change the function a little bit.
SQL> create or replace function clean_str(in_string in varchar2)
2 return varchar2
3 is
4 out_string varchar2(4000);
5 remove_char varchar2(100);
6 begin
7 remove_char:='!@#$%^&*æ';
8 select
9 REPLACE(TRANSLATE(in_string,remove_char , ' '),' ',NULL)
10 into out_string
11 FROM dual;
12 return (out_string);
13 end;
14 /
Function created.
SQL> select clean_str(a) from t6;
CLEAN_STR(A)
------------
ABCXYZ
ABCXYZXML
So now all you have to do is keep a list of unwanted or special characters in the remove_char variable of the function and you are golden.
I hope this helps.
-Kirtan
Updated!!!
I just realized that there is no need to use REPLACE function above. I will keep it as it is just in case if someone wants to use it as it is. Below is an updated(and better IMHO) version of clean_str() function. The idea is to pass the character that you want to remove as the THIRD argument and concatenate the variable remove_char with the same character. It is going to be a space for this example.
create or replace function clean_str(in_string in varchar2)
return varchar2
is
out_string varchar2(4000);
remove_char varchar2(100);
begin
remove_char:=' ' || '!@#''''$%^&*æ';
select
--REPLACE(TRANSLATE(in_string,remove_char , ' '),' ',NULL)
TRANSLATE(in_string,remove_char , ' ')
into out_string
FROM dual;
return (out_string);
end;
/
What you can also do is you can allow the character that you want to be removed as an input to the function.
create or replace function clean_str1(in_string in varchar2,culprit in char)
return varchar2
is
out_string varchar2(4000);
remove_char varchar2(100);
begin
remove_char:=culprit || '!@#''''$%^&*æ';
select
--REPLACE(TRANSLATE(in_string,remove_char , ' '),' ',NULL)
TRANSLATE(in_string,remove_char , culprit)
into out_string
FROM dual;
return (out_string);
end;
/SQL> select clean_str1('KIR$#@%TAN',' ') from dual;
CLEAN_STR1('KIR$#@%TAN','')
--------------------------------------------------------------------------------
KIRTAN
SQL>
This entry was posted on Wednesday, February 28th, 2007 at 7:34 pm and is filed under Oracle. You can follow any responses to this entry through the RSS 2.0 feed. You can leave a response, or trackback from your own site.
February 28th, 2007 at 9:29 pm
Nice solution…i dont need this right now but will keep this in my knowledge base…
Thank you !
Neha
March 3rd, 2007 at 7:22 pm
Hi Kirtan
You don’t need to create a PL/SQL function or use REPLACE. TRANSLATE will do the job in one if you put the dummy character in the third argument as the first character in the second argument, e.g.:
TRANSLATE( in_string, CHR(0) || remove_chars, CHR(0) )
I’ve used CHR(0) but any character would do, even space.
Regards
Jonathan
March 4th, 2007 at 3:34 pm
Jonathan,
I created the function so that the ‘list’ of unwanted characters can be maintained cleanly. If you use TRANSLATE in your SQL statement, you would have to keep that list available for all developers at all times. To me it was a bit messy. BUT, I do see the point of not using REPLACE. I have updated my post above.
Kirtan
June 3rd, 2009 at 2:04 pm
USE regexp_replace oracle function
July 30th, 2009 at 1:06 pm
Kirtan,
I have a similar issue to remove the UNICODE character.
In my RDF file, i should accomidate it in such a way that what ever currupt characters (Especially UNICODE symbols) comes in between the email_address , it has to change to its subsequent character like this…
The output of ñ should be n. This is a kind of example..
When the input value is ‘john.muñoz@yahoo.com’ it should return ‘john.munoz@yahoo.com’
I have tried using the following functions but the output still shows me the unicode symbol…
select UNISTR(’john.muñoz@yahoo.com’) from dual
select remove_special_chars(’john.muñoz@yahoo.com’) from dual
Select translate(trim(email_address),’1234567890ABCDEFGHIJKLMNOPQRSTUVWXYZ WE8ISO8859P1 ~!#$%^&*()_+}{”:?><`=]['''';/,','1234567890ABCDEFGHIJKLMNOPQRSTUVWXYZ') from per_people_f
where employee_number = 3951
select CONVERT('john.muñoz@yahoo.com','US7ASCII') from dual
select regexp_replace('john.muñoz@yahoo.com', '[^-.@,a-z,A-Z,[:space:]]’) from dual;
If you have any ideas.. please let me know…
Thanks in advance
Ravi.
November 12th, 2009 at 5:51 am
Thanks dude!!!!!!!!!!!!!