Hotline for IT engineers/model developers: April 2012

Tuesday, April 24, 2012

T-SQL Programming - Useful Tips - Text Search

SQL Server 2008 R2

TEXT SEARCH

Case Sensitive Search
1. CHARINDEX('ExpressionToFind', 'ExpressionToSearch' COLLATE Latin1_General_CI_AS)

Here CHARINDEX function returns the position of the string found in the search. Returns 0 if not found.

Case Insensitive Search
1. Using CHARINDEX

CHARINDEX('ExpressionToFind', 'ExpressionToSearch' COLLATE Latin1_General_CS_AS)

- Here CHARINDEX function returns the position of the string found in the search. Returns 0 if not found.

2. Using Like
IF EXISTS(SELECT * FROM (SELECT 'This is a test' a) tab WHERE a like '%test')

- 'Like' does case insensitive search in T-SQL

Wednesday, April 18, 2012

How to retrieve last IDENTITY using JDBC

Background:

I was working on an existing module which was using JDBC for all the database operations. There was a scenario where front-end UI has to update a set of tables, meaning data will be inserted into parent table first, than in child tables using parent's ID to establish the referential integrity. Existing code was using database MAX function to fetch the ID which I felt not a good idea and might have concurrency issues when multiple users will be using the application. My investigation came up with few alternates which I have listed below.

Right now the application is pointing to the MS SQL Server database and all the solutions are for that database only. Plan is to do similar investigation in future to incorporate other databases like Oracle.

Solutions for MS SQL Server:

Solution-01: Using JDBC built-in functionality

Assumptions:

Table where the insertion is happening has a auto-generated key

Steps: (see code-01)

Use Statement.RETURN_GENERATED_KEYS during PreparedStatement creation
Execute insert statement
Get list of generated keys as ResultSet
Get the first value from ResultSet

Solution-02: Using MS SQL server OUTPUT clause feature in the query

Assumptions:

The query should use OUTPUT clause feature of SQL Server (I have another post, which explains how to use OUTPUT clause)

Steps: (see code-02)

Create a query with OUTPUT clause
Execute the query using ' executeQuery' method of Statement
Get list of generated keys as ResultSet
Get the first value from ResultSet

Resources:

How to use OUTPUT clause, this is my another post on this topic - http://peerprogrammer.blogspot.com/2012/04/how-to-retrieve-identity-of-table_18.html

Sample Codes:

--------------------

--- Code-01 - use JDBC feature to get IDENTITY

--------------------

public int getLastIdentity(String sql) {

PreparedStatement pStmt;

int lastIdentity = 0;

try {

pStmt = conn.prepareStatement( sql, Statement.RETURN_GENERATED_KEYS);

pStmt.executeUpdate();

ResultSet identities = pStmt.getGeneratedKeys();

identities.next();

lastIdentity = identities.getInt(1);

identities.close();

pStmt.close();

conn.close(); -- connection is created outside of this method

} catch (Exception e) { e.printStackTrace(); }

return lastIdentity;

}

--------------------

--- Code-02 - use SQL Server 2005 feature OUTPUT clause

--------------------

Input SQL should look like below and assuming the table's auto increment column name is 'id':

INSERT INTO 'tablename' OUTPUT inserted.id VALUES('First Column', 'Second Column');

public int getLastIdentity(String sql) {

PreparedStatement pStmt;

int lastIdentity = 0;

try {

pStmt = conn.prepareStatement(sql);

ResultSet identities = pStmt.executeQuery();

identities.next();

lastIdentity = identities.getInt(1);

identities.close();

pStmt.close();

conn.close(); -- connection is created outside of this method

} catch (Exception e) { e.printStackTrace(); }

return lastIdentity;

}

How to retrieve IDENTITY of a table

Background:
I have a Stored Procedure which need to insert data into a set of tables by maintaining the referential integrity. Meaning parent table ID value (ID is a Identity column) need to be fetched and used in child tables during insertion. There is a trigger defined in the parent table which inserts a record in another table. Logic in the SP is like 1. inserts a record in parent table 2. immediate statement retrieves last identity using @@IDENTITY global variable. As the trigger inserts record in another table, the @@IDENTITY variable returns the identity from second table. See Query-03 which represents the same scenario.

This problem drove me to investigate in detail about the approaches and their pros and cons to retrieve IDENTITY value of a table.

Factors need to consider: (mostly for @@IDENTITY, SCOPE_IDENTITY(), IDENT_CURRENT())
Different scopes:

The session scope - the connection which produced the IDENTITY value
The statement scope - where is the statement that produced the IDENTITY value
The table scope -the able produced the IDENTITY value

What is scope?
A scope is a module: a stored procedure, trigger, function, or batch. Therefore, two statements are in the same scope if they are in the same stored procedure, function, or batch.

What is @@IDENTITY?

@@IDENTITY is a system global variable
Returns the last IDENTITY value produced on a connection
Regardless of the table that produced the value
Regardless of the scope of the statement that produced the value
Returns IDENTITY of current session, ignores the scope

Note: From the attributes one thing is clear that developer has to be certain where the IDENTITY comes from and the scope of it unless same problem might occur which I explained above. (see Query-03 for sample code)

What are the alternatives?

Use SCOPE_IDENTITY()

Attributes:

Returns the last IDENTITY value produced on a connection and by a statement in the same scope

Regardless of the table that produced the value
Returns IDENTITY of current session and current scope

Notes:

IDENTITY created in Trigger or User Defined Function for other table(s) won't affect actual table's IDENTITY
(see Query-03 for sample code)

Use IDENT_CURRENT('tablename')

Attributes:

Returns last IDENTITY produced in a table
Regardless of the connection that created the value
regardless of the scope of the statement that produced the value

Notes:

IDENTITY created in Trigger or User Defined Function for other table(s) won't affect actual table's IDENTITY
If IDENTITY created in current scope is needed, this might not return correct value as some other connection might have already entered record in between
(see Query-03 for sample code)

Use OUTPUT clause:

Attributes:

Returns information from, or expressions based on, each row affected by an INSERT, UPDATE, DELETE (see Query-01)
Information can be retrieved in a tabular form only, cannot be retrieved in a variable (see Query-02)
Columns returned from OUTPUT reflect the data as it is after the INSERT, UPDATE, or DELETE statement has completed but before triggers are executed

Notes:

The Output clause is a new to SQL Server 2005
An UPDATE, INSERT, or DELETE statement that has an OUTPUT clause will return rows to the client even if the statement encounters errors and is rolled back. The result should not be used if any error occurs when you run the statement

Use SEQUENCE feature in column (similar as Oracle Sequence)

Notes:

Available in SQL Server 2011 only
I couldn't come up with sample code as not used SQL Server 2011 yet

Use custom sequence (a table to store sequence value and a Stored Procedure to modify the value)

Notes:

Custom logic
Concurrency need to be handled manually

Resources:

How to use OUTPUT clause - http://msdn.microsoft.com/en-us/library/ms177564.aspx
Problem in SCOPE_IDENTITY - (1) https://connect.microsoft.com/SQLServer/feedback/details/328811/scope-identity-sometimes-returns-incorrect-value (2) http://support.microsoft.com/kb/2019779

Sample Queries:

-------------
-- Query-01 - OUTPUT clause
-------------
-- Data table
CREATE TABLE test(id INT IDENTITY(1,1), value VARCHAR(100));
-- table for output
DECLARE @outputTable TABLE (id INT);

INSERT INTO test(value)
OUTPUT inserted.id
INTO @outputTable
VALUES('One');

SELECT * FROM test;
SELECT * FROM @outputTable;

-- Test OUTPUT with UPDATE
INSERT INTO test(value) VALUES('Two');
UPDATE test
SET value = 'One Changed'
OUTPUT deleted.id
INTO @outputTable
WHERE id=1
;
SELECT * FROM test;
SELECT * FROM @outputTable;

-- Test OUTPUT with DELETE
DELETE test
OUTPUT deleted.id
INTO @outputTable
WHERE id=2;
SELECT * FROM test;
SELECT * FROM @outputTable;
GO

-------------
-- Query-02 - Output of OUTPUT clause in a variable
-- Result: it fails because OUTPUT returns data in tabular format
-------------
-- Data table
CREATE TABLE test(id INT IDENTITY(1,1), value VARCHAR(100));
DECLARE @outputVariable INT;

INSERT INTO test(value)
OUTPUT inserted.id
INTO @outputVariable
VALUES('One');

SELECT * FROM test;
SELECT 'Variable= '+CONVERT(VARCHAR(100), @outputVariable);
GO

Error:
Msg 1087, Level 16, State 1, Line 6
Must declare the table variable "@outputVariable".

-------------
-- Query-03 - Usage of @@IDENTITY, SCOPE_IDENTITY(), IDENT_CURRENT()
-------------
-- Data tables
CREATE TABLE one(id INT IDENTITY(1,1), value VARCHAR(100));
CREATE TABLE two(id INT IDENTITY(1,1), value VARCHAR(100));
GO
-- Create trigger in table 'one'
CREATE TRIGGER TROne
ON one
AFTER INSERT
AS
DECLARE @value VARCHAR(100), @id INT;
SELECT @value=value, @id=id FROM inserted;

INSERT INTO two(value) VALUES('ID='+CONVERT(VARCHAR(100), @id) + ' is inserted in table one');
INSERT INTO two(value) VALUES('Value='+CONVERT(VARCHAR(100), @value) + ' is inserted in table one');
GO

INSERT INTO one(value) VALUES('Record one');
SELECT '@@IDENTITY= '+CONVERT(VARCHAR(100), @@IDENTITY);
SELECT '@@SCOPE_IDENTITY= '+CONVERT(VARCHAR(100), SCOPE_IDENTITY());
SELECT '@@IDENT_CURRENT for table one= '+CONVERT(VARCHAR(100), IDENT_CURRENT('one'));
SELECT '@@IDENT_CURRENT for table two= '+CONVERT(VARCHAR(100), IDENT_CURRENT('two'));
GO

Tuesday, April 3, 2012

Database Coding Convention

Each database object must have a short description
If needed, each object should have ‘Implementation Flow’ (bullet point) which will describe the logic implemented within it. Comment should point to proper bullet point
Each logic should have proper comment and should refer to appropriate ‘Implementation Flow’ point
Each database object must have ‘Version History’. Every time the object is modified an entry should be added. The format is as below-

Format - [Serial #] - [Author] - [Date] - [Short Description]
Example – 1 – Tom Brady – 03/20/2012 – Created
Example – 2 – Tom Brady – 03/21/2012 – Unit tested

Database object Naming Convention as follow -

Stored Procedure – prefix SP

i. Example - SPStoredProcedureName

Function – prefix FN

i. Example – FNFunctionName

Trigger – prefix TR

i. Example – TRTriggerName

Variable – camel case (first letter of first word in small case, first letter of subsequent words in upper case)

i. Example - @firstVariableName

Temp table – camel case

NULL checks need to be added for each mandatory input. If value is missing, error need to be raised

Example -

-- NULL check for inputs

IF @inputOne IS NULL OR @inputTwo IS NULL

BEGIN

RAISERROR('All inputs are required', 10, 1);

RETURN;

END;

TRY – CATCH block need to be added

[Not complete]

Monday, April 2, 2012

Reason a Stored Procedure is recompiled

Background:

I was investigating the fact that 'Table variables result in fewer re-compilations of a stored procedure as compared to temporary tables' which is mentioned in my another post 'Temp Table in SQL Server'. Than found two very good references (References-1 & 2) and felt like consolidating into fewer lines.

Query Plan related to Stored Procedure – Steps:

The Stored Procedure is executed
Check in cache to see it’s available
If not available in cache,

Load the Stored Procedure
Compile the query plan
Save compiled plan in cache

Reuse by subsequent calls until the plan is invalidated and forced to recompile

Reason a Stored Procedure is recompiled:

Use of a WITH RECOMPILE clause in the CREATE PROCEDURE or EXECUTE statement.
Schema changes to any of the referenced objects, including adding or dropping constraints, defaults, or rules.
Running sp_recompile for a table referenced by the procedure.
Restoring the database containing the procedure or any of the objects the procedure references (if you are performing cross-database operations).
Sufficient server activity causing the plan to be aged out of cache
A sufficient percentage of data changes in a table that is referenced by the stored procedure

Ways to counteract recompilation

i. Execute the statement using sp_executesql - Statements executed using the sp_executesql stored procedure are not compiled as part of the stored procedure plan

ii. Use sub-procedures to execute the statements that are causing the recompilations - the statement may still cause a recompilation, but instead of recompiling the large calling stored procedure, it will only recompile the small sub-procedure

iii. Use the KEEP PLAN option

The procedure interleaves Data Definition Language (DDL) and Data Manipulation Language (DML) operations.
The procedure performs certain operations on temporary tables

Ways to counteract recompilation

i. All statements that contain the name of a temporary table refer to a temporary table created in the same stored procedure, and not in a calling or called stored procedure, or in a string executed using the EXECUTE statement or sp_executesql stored procedure

ii. All statements that contain the name of a temporary table appear syntactically after the temporary table in the stored procedure or trigger.

iii. There are no DECLARE CURSOR statements whose SELECT statements reference a temporary table

iv. All statements that contain the name of any temporary table precede any DROP TABLE statement that references a temporary table

References:

URL - http://support.microsoft.com/kb/243586
URL - http://support.microsoft.com/kb/305977

Sunday, April 1, 2012

Temp Table in SQL Server

Background:

Recently my team started using Microsoft SQL Server 2008 R2 as the RDBMS in one of the project and was lucky to be part of developing two critical functionalities which required extensive database development (writing SPs, functions, queries etc). Because of the nature of the problem, I had to use something like temp table a lot. While doing the actual implementation, I wanted to dig further to make sure the concepts, pros and cons are fully clear, also realized to document my learning for larger audience.

The things I have considered during writing this post are – (1) explanation has to be short, simple and easy to remember (2) structure of information has to be well segmented (3) as much example as possible. Hence created three major sections: concepts (facts, pros/cons, short explanation), best practices and examples (tested code)

CONCEPTS:

Table variable:

Syntax: DECLARE @tableName TABLE (id INT);
Facts:

i. Behave like local variables in their scoping rules and will be disposed as soon as it’s out of scope

ii. Table variables are stored in ‘TempDB’

iii. Available constraints that can be used with table variables

1. CHECK

2. PRIMARY KEY

3. UNIQUE KEY

4. NULL/NOT NULL

Advantages

i. Considered as ‘Local/Private’ to the process and has less locking to the resources

ii. Transaction rollback doesn’t affect table variable as it exists for shorter span and not stored in persistent database

iii. Less logging necessary as not stored in persistent database

iv. Table variables result in fewer recompilations of a stored procedure as compared to temporary tables (reference-1)

Disadvantages

i. Local tables often disposed of before can be investigated

ii. Cannot be used in dynamic SQL. In EXECUTE and sp_ExecuteSQL (example-1)

iii. Cannot be used in ‘SELECT INTO’ to create them dynamically

iv. User defined functions cannot be called from CHECK constraints, DEFAULT values

v. Cannot be changed after declaration

vi. Cannot be truncated

vii. Cannot be dropped. Have to let go out of scope

viii. Must be referenced by an alias if needed to join

Table valued Parameter:

Facts:

i. A way to send multiple rows of data to a Stored Procedure or function without creating a temporary table or passing so many parameters

ii. ‘Table variable’ or ‘table type’ can be passed as ‘Table Valued Parameter’

iii. When table variables are passed as parameters below things happen behind the scene

1. In the TempDB, table variable is materialized

2. Reference is passed

Advantages:

i. Can be used to send client table to server

ii. Can be used to pass as Table Valued Parameter to SP or function

Disadvantages:

i. Table Valued Parameter is passed as Read-Only. No INSERT, UPDATE or DELETE operations are allowed

ii. Cannot be used in ‘SELECT INTO’ or ‘INSERT EXEC’ because of read only feature

Local Temporary Table:

Syntax: Table name starts with #
Facts:

i. Local Temporary tables are private to the process that created it

ii. If the same stored procedure is executed simultaneously by several processes, the Database Engine distinguishes the same tables created by the different processes

iii. Local temp tables are stored in ‘TempDB’

iv. Local temporary tables are dropped automatically at the end of the current session or procedure (Best Practices-4)

v. A session specific identifier is added to the name, to make sure that temp tables from other sessions don’t use the same name [same for Global Temp Table]

vi. To allow for the suffix, the table name specified for a local temporary name must be less than 116 characters [same for Global Temp Table]

vii. Within a stored procedure, a temporary table cannot be created, dropped, and then recreated a new temporary table with the same name (example-4 & 5) [same for Global Temp Table]

Advantages:

i. Can be referred within batches or in dynamic SQL (example-2 & 6)

ii. Can be referred by any nested stored procedures executed by the stored procedure that created the table (example-3)

iii. Reduction of required locks as only user has access to it

iv. Less logging necessary

Disadvantages:

i. Cannot be used in views and cannot associate triggers with them

ii. Table name limited by 116 characters

Global Temporary table:

Syntax: table name starts with ##
Facts:

i. Global temp tables work almost similar to local temp tables

ii. global tables aren’t private to the process that created it

iii. Global temp tables are visible to all sessions, until the session created it goes out of scope and referring sessions to the table go out of scope

Advantages:

i. Reduction of required locks as only user has access to it

ii. Less logging necessary

Disadvantages:

i. Usage is very limited

ii. Hard to manage data/flow consistency

Notes:

1. Table variables are completely isolated to the batch that creates them so no 're-resolution' has to occur when a CREATE or ALTER statement takes place, which may occur with a temporary table. Temporary tables need this 're-resolution' so the table can be referenced from a nested stored procedure. Table variables avoid this completely so stored procedures can use plan that is already compiled, thus saving resources to process the stored procedure (References-2)

References:

SP recompilation- I have found this link a superb one which explains in detail when a SP is recompiled (cover temp table related recompilation as well)
URL - http://support.microsoft.com/kb/243586
http://support.microsoft.com/kb/30597
http://www.sqlservercentral.com/articles/Temporary+Tables/66720/

BEST PRACTICES:

Need to put extra caution when using Local Temp Table in the called SP which is defined in calling SP because called SP might throw exception when called from any other places
Instead of Global Temp table, try to use permanent table if possible
Always use alias for tables (permanent/temp)
Don’t need to drop temp tables explicitly at the end of the Stored Procedure/block as SQL Server engine checks and drops any temp tables declared in the scope when the SP execution is ended

CODING:

Example-1: will not compile. Table variable is called in dynamic SQL

-- DROP PROCEDURE SPFirst

CREATE PROCEDURE SPFirst

DECLARE @tableValue TABLE (id INT);

EXECUTE('SELECT * FROM @tableValue');

Example-2: Will work where temp table is used in dynamic SQL

CREATE PROCEDURE SPFirst

CREATE TABLE #tempTable (id INT);

EXECUTE('SELECT * FROM # tempTable’);

Example-3: First SP calls second SP which uses the local temp table created in first. (To test, compile both SPs and execute last line)

-- DROP PROCEDURE SPFirst

CREATE PROCEDURE SPFirst

CREATE TABLE #tempTable (id INT);

INSERT INTO #tempTable VALUES(1);

INSERT INTO #tempTable VALUES(2);

EXECUTE SPSecond;

-- DROP PROCEDURE SPSecond

CREATE PROCEDURE SPSecond

SELECT * FROM #tempTable;

EXECUTE SPFirst;

Example-4: Works fine. Global temp table created and dropped from same SP

-- DROP PROCEDURE SPFirst

CREATE PROCEDURE SPFirst

CREATE TABLE ##tempTable (id INT);

INSERT INTO ##tempTable VALUES(1);

INSERT INTO ##tempTable VALUES(2);

DROP TABLE ##tempTable;

Example-5: Will not compile. Temp table (global/local) is created>dropped>recreated with same name. The reason is Temp Table works as local variable in a scope where multiple variable declaration with same name will throw compile time error

CREATE PROCEDURE SPFirst

CREATE TABLE ##tempTable (id INT);

INSERT INTO ##tempTable VALUES(1);

INSERT INTO ##tempTable VALUES(2);

DROP TABLE ##tempTable;

CREATE TABLE ##tempTable (id INT);

Example-6: Will compile but will fail during execution because temp table is defined in child session which is not available in parent session

-- DROP PROCEDURE SPFirst

CREATE PROCEDURE SPFirst

EXECUTE('CREATE TABLE #tempTable (id INT);INSERT INTO #tempTable VALUES(1);INSERT INTO #tempTable VALUES(2);');

SELECT * FROM #tempTable;

EXECUTE SPFirst;

Summary (Reference-03)

Feature	Table Variables	Temporary Tables
Scope	Current batch	Current session, nested stored procedures. Global: all sessions.
Usage	UDFs, Stored Procedures, Triggers, Batches.	Stored Procedures, Triggers, Batches.
Creation	DECLARE statement only.	CREATE TABLE statement. SELECT INTO statement.
Table name	Maximum 128 characters.	Maximum 116 characters.
Column data types	Can use user-defined data types. Can use XML collections.	User-defined data types and XML collections must be in tempdb to use.
Collation	String columns inherit collation from current database.	String columns inherit collation from tempdb database.
Indexes	Can only have indexes that are automatically created with PRIMARY KEY & UNIQUE constraints as part of the DECLARE statement.	Indexes can be added after the table has been created.
Constraints	PRIMARY KEY, UNIQUE, NULL, CHECK, but they must be incorporated with the creation of the table in the DECLARE statement. FOREIGN KEY not allowed.	PRIMARY KEY, UNIQUE, NULL, CHECK. Can be part of the CREATE TABLE statement, or can be added after the table has been created. FOREIGN KEY not allowed.
Post-creation DDL (indexes, columns)	Statements are not allowed.	Statements are allowed.
Data insertion	INSERT statement (SQL 2000: cannot use INSERT/EXEC).	INSERT statement, including INSERT/EXEC. SELECT INTO statement.
Insert explicit values into identity columns (SET IDENTITY_INSERT).	The SET IDENTITY_INSERT statement is not supported.	The SET IDENTITY_INSERT statement is supported.
Truncate table	Not allowed.	Allowed.
Destruction	Automatically at the end of the batch.	Explicitly with DROP TABLE statement. Automatically when session ends. (Global: also when other sessions have no statements using table.)
Transactions	Last only for length of update against the table variable. Uses less than temporary tables.	Last for the length of the transaction. Uses more than table variables.
Stored procedure recompilations	Not applicable.	Creating temp table and data inserts cause procedure recompilations.
Rollbacks	Not affected (Data not rolled back).	Affected (Data is rolled back).
Statistics	Optimizer cannot create any statistics on columns, so it treats table variable has having 1 record when creating execution plans.	Optimizer can create statistics on columns. Uses actual row count for generation execution plan.
Pass to stored procedures	SQL 2008 only, with predefined user-defined table type.	Not allowed to pass, but they are still in scope to nested procedures.
Explicitly named objects (indexes, constraints).	Not allowed.	Allowed, but be aware of multi-user issues.
Dynamic SQL	Must declare table variable inside the dynamic SQL.	Can use temporary tables created prior to calling the dynamic sql.