Procedure Time out: Procedures v. Functions

The place where I work is moving to MS SQL Server from Lotus Notes. I
have done a lot of coding in Lotus Notes, and have, I suppose,
intermediate skills in basic SQL -- queries, insert, updates, table
design, etc. I have a couple of questions, however. First, stored
procedures vs. functions. In my world, a function is a body of code
that returns a value; a procedure is a body of code that does things
but does not return a value (other than an error if it fails). At first
blush, MS SQL seems to follow this distinction. But further reading of
the Books Online and paper books (like Advanced T-SQL for SQL Server
2000) demonstrate that stored procedures also return values. So, why
use one over the other? Apparently, I can debug SQL stored procedures
(haven't tried yet) and it looks like I can't debug functions. On the
other hand, I can use a function in an SQL statement but I can't use a
procedure (I think). So, if I want to take two columns and do some
hand-waving over them to make a third column for each row in a row set,
I need to use a function.
I guess the above gives you a sense for what little I know. What are
people's thougts on functions vs. stored procedures. What's a good book
that might give me better insight into these sorts of issues without
spending a lot of time on what a left outer join or a derived table is?

Thanks.
BD"BlueDragon" <bluedragon013@.yahoo.com> wrote in message
news:1108255532.679335.314700@.l41g2000cwc.googlegr oups.com...
> The place where I work is moving to MS SQL Server from Lotus Notes. I
> have done a lot of coding in Lotus Notes, and have, I suppose,
> intermediate skills in basic SQL -- queries, insert, updates, table
> design, etc. I have a couple of questions, however. First, stored
> procedures vs. functions. In my world, a function is a body of code
> that returns a value; a procedure is a body of code that does things
> but does not return a value (other than an error if it fails). At first
> blush, MS SQL seems to follow this distinction. But further reading of
> the Books Online and paper books (like Advanced T-SQL for SQL Server
> 2000) demonstrate that stored procedures also return values. So, why
> use one over the other? Apparently, I can debug SQL stored procedures
> (haven't tried yet) and it looks like I can't debug functions. On the
> other hand, I can use a function in an SQL statement but I can't use a
> procedure (I think). So, if I want to take two columns and do some
> hand-waving over them to make a third column for each row in a row set,
> I need to use a function.
> I guess the above gives you a sense for what little I know. What are
> people's thougts on functions vs. stored procedures. What's a good book
> that might give me better insight into these sorts of issues without
> spending a lot of time on what a left outer join or a derived table is?
> Thanks.
> BD

Well... Maybe another way to look at it is...

I think I'd spend more time learning how a left outer join works than on
user defined functions.
I have one UDF, one trigger and a couple hundred stored procedures in the
database I've been working on recently.
So, for now, I'd suggest maybe forget about functions and take a look at
T-SQL.
Cursors would also be low on my list of things to worry about.
There are no cursors in any of those stored procedures I mention.
Although - there are good reasons to use them in some circumstances.

FWIW the function takes a date and number of months to subtract and returns
a char(6) field of yyyymm format. The database has a bunch of data
bucketed into months within year so there's a fair few stored procedures use
this UDF to work out so many months back or the like.

--
Regards,
Andy O'Neill|||Thank you for the response.
I wasn't denigrating the importance of joins; I think I understand them
fairly well, along with temp tables, derived tables, aggregate
functions, etc. The few books in my local Borders all seem to spend 25%
or more explaining these concepts, then going on to say how to crate a
stored procedure or a user function without discussing why one over the
other. I have a lot to learn about T-SQL but, in the end, it's just
another programming language. I have much more to learn about the ins
and outs of creating an RDBMS based application (much of which will com
only with experience, of course) and that's where I'm really looking
for guidance.
Thanks again.|||BlueDragon wrote:
> Thank you for the response.
> I wasn't denigrating the importance of joins; I think I understand
them
> fairly well, along with temp tables, derived tables, aggregate
> functions, etc. The few books in my local Borders all seem to spend
25%
> or more explaining these concepts, then going on to say how to crate
a
> stored procedure or a user function without discussing why one over
the
> other. I have a lot to learn about T-SQL but, in the end, it's just
> another programming language. I have much more to learn about the ins
> and outs of creating an RDBMS based application (much of which will
com
> only with experience, of course) and that's where I'm really looking
> for guidance.
> Thanks again.

A stored procedure, at its most basic, is simply a way of writing
T-SQL code so that it is physically stored within the dataserver. The
advantage of this is mainly that each time the SP is executed from a
client process, all the activity takes place at the server, without
having to send lots of T-SQL code across the network first. In
addition, the code is stored in a pre-compiled, optimised form, where
the most efficient sequence of table joins is already known, so time is
saved by not having to re-evaluate this for every execution.

Like anything else, this is a simplification, and you may not always
want to keep the pre-compiled form of the code for every execution - it
may actually hinder performance, but for now let's assume that it's a
good thing. Once you're more comfortable with writing T-SQL code, then
you can start worrying about the minuntiae of performance optimization.
:-)

There's nothing to stop the code in an SP from calling another SP or a
function, and SP code can even call itself (a RECURSIVE call). I don't
know if others will agree with this or not, but I tend to regard
functions as being utilities for manipulating data, and SPs as
utilities for retrieving or updating data, although SPs are perfectly
good at doing data manipulation too.

For example, I might write an SP to retrieve some raw date
information, and having got that data I might call a function that
converts it into a different format that my client application needs.
I might then pass that modified format to another SP which writes it
back to the database.

SPs have a further advantage, and that is database security. There is
absolutely no need to give any end user direct access to a table in a
database (i.e. they don't need to be granted the ability to SELECT,
INSERT, UPDATE or DELETE data). Instead, write a family of SPs which
allow them to achieve this only in a manner that the T-SQL code within
the SP allows (i.e. in a manner that you write and therefore control).
You then grant them EXECUTE permission on the SPs, and they can make
these data changes so far as the SP code allows them to. If they had
direct access to the tables, they would potentially be able to write
any command they wanted (e.g. delete * from <table>) which could be
very bad news indeed.|||On 12 Feb 2005 16:45:32 -0800, BlueDragon wrote:

(snip)
>I guess the above gives you a sense for what little I know. What are
>people's thougts on functions vs. stored procedures. What's a good book
>that might give me better insight into these sorts of issues without
>spending a lot of time on what a left outer join or a derived table is?

Hi BD,

I don't know of any books that focus primarly on this aspect of SQL
Server, but I can try to give you a quick rundown on some important
consideration.

As you already know, stored procedures offer pretty much everything you'd
need. You can execute almost all conceivable code in them, return a
success/failure code, return extra values in output parameters and return
one or even several result sets to the calling client. In fact, for a long
time stored procedures were all that SQL Server had. The only downside of
stored procedures is that you can't use them as part of an expression in a
query.

User-defined functions (hereafter: UDF) were added in SQL Server 2000, and
they were introduced in three flavors at once. Let's start with the
easiest type: the scalar UDF. This type of UDF takes zero or more input
parameters and returns one scalar value. A scalar UDF can be used as part
of any expression. That's the good news. The bad news is that they can
seriously slow down your query execution and that there are lots of
limitations to what you can do in a UDF. The main restrictions are that
you can't change any data in any table in a UDF and that you can't use any
non-deterministic function.

I originally included a lengthy explanation here about the reasons for
these limitations, but I decided to delete it from my message before
sending it - it's probably too technical and too long. If you do want to
know, just ask!
I will include a short explanation about the slowness of UDF: since they
operate on values, not on whole sets, they force SQL Server to use an
"internal cursor" and feed the rows one by one to the UDF. Since SQL
Server is completely optimized for set-based operations, any row-by-row
operation inevitably comes with a performance penalty.

A completely different type of UDF is the table-valued UDF. This type does
not return one single value, but a complete table, with as many columns
and as many rows as you'd like. This type of UDF can, of course, be called
in the FROM clause of any query. It is like creating a temporary table,
filling it with data, using it in a query, then deleting it again. In
fact, this is how SQL Server processes a table-valued UDF. So the good
news is that you can move the code to make the temp table out of your
procedure to a UDF, keeping the code cleaner. The bad news is (again) that
performance might suffer. If you can do it all in one query 9even though
it would be quite complex), the optimizer can do it's thing, shoving
things around, choosing between evaluation orders, etc. If you use a
table-valued UDF (or indeed even if you explicitly create and populate the
temp table), you force a specific order of evaluation on the optimizer: it
HAS to materialize the UDF first before the complete query can be
executed. And in the comparison of explicit temp table vs table valued
UDF, the temp table has the advantage that you can create indexes on it to
optimize data retrieval.

The third and last type of UDF is the inline table-valued UDF. This one
has lots in commont with the "normal" table-valued UDF. The difference is
that the whole function consists of only one select statement. This type
of UDF is in fact very much like a view, with the added advantage that you
can use parameters. The evaluation by SQL Server is also very much like a
view: it simply replaces the call to the UDF in your query with the SELECT
statement from the body of the UDF, then goes on to optimize the query as
if it was submitted like that. This UDF is the only one that doesn't come
with a performance penalty, as the optimizer only "sees" the query with
the UDF replaced by the equivalent SELECT statement (as derived table), so
it can optimize as much as it wants.

The above is a short description. A lot of details are omitted. But I hope
that it at least gives you a head start on procs, UDF's and their
differences.

Oh, and let's not forget to mention that all three types of UDF are
discsussed and illustrated with examples in Books Online, topic CREATE
FUNCTION.

Best, Hugo
--

(Remove _NO_ and _SPAM_ to get my e-mail address)|||BlueDragon (bluedragon013@.yahoo.com) writes:
> The place where I work is moving to MS SQL Server from Lotus Notes. I
> have done a lot of coding in Lotus Notes, and have, I suppose,
> intermediate skills in basic SQL -- queries, insert, updates, table
> design, etc. I have a couple of questions, however. First, stored
> procedures vs. functions. In my world, a function is a body of code
> that returns a value; a procedure is a body of code that does things
> but does not return a value (other than an error if it fails). At first
> blush, MS SQL seems to follow this distinction. But further reading of
> the Books Online and paper books (like Advanced T-SQL for SQL Server
> 2000) demonstrate that stored procedures also return values. So, why
> use one over the other? Apparently, I can debug SQL stored procedures
> (haven't tried yet) and it looks like I can't debug functions. On the
> other hand, I can use a function in an SQL statement but I can't use a
> procedure (I think). So, if I want to take two columns and do some
> hand-waving over them to make a third column for each row in a row set,
> I need to use a function.

I think Andy put it quite nicely when he said that his database has
several hundres of stored procedures, but only one function. The same
is true in our system: maybe 10 functions - over 3700 procedures.

Of course, the fact that functions were added in SQL 2000 and procedures
has been around since the 1980s may have something to do with it. But
procedure is a much more general concept. Functions comes with plenty
of restrictions, so there are many cases you cannot use them, even if
you wanted to.

A stored procedure can return data in three different ways:
o In a result set.
o In output parameters
o As a return value.

It seems that the return value confuses you. Well, the return value is
always an integer value, and it is my strong recommendation that you
only use it return failure/success. 0 for success, and anything else
for an error. Never return data with a return value.

As for result set or parameters, parameters are good for scalar values
but cannot be used for table data. Result sets can be used for both,
since a set of scalar values is just a single-row result set. However,
output parameters usually gives somewhat better performance.

As I've indicated, functions are a lot more specialised. First of all,
a function must never change database state. Of this reason a lot
of things are not permitted in a function. For instance, you cannot call
a stored procedure. Nor can you perform updates to tables etc etc.

There are three different sort of functions:
o scalar functions
o inline table functions
o multi-step table functions.

A scalar function returns one single value that can be of any data type.
You can call a function in a SELECT statement, and pass columns, constants
or variables as arguments.

An inlined table function is actually not a function at all. It consists
of a single SELECT statement, which is expanded into the query when
SQL Server builds the plan, and the actual computation order may be
recast. That is, it is just a parameterized view.

A multi-step table function returns a table that you can SELECT from
as it if was a table and you can join the function with other tables.
In previous versions of SQL Server you would have run a stored procedure
that filler a temp table with the data, and then you would have used
that table in the query.

You can pass parameters to table functions, but you can only pass constants
and variables; you cannot pass columns from other tables. (This changes
in SQL 2005.)

--
Erland Sommarskog, SQL Server MVP, esquel@.sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techin.../2000/books.asp|||I'm still a little at sea. First, I come from a programming background
although I've been doing T-SQL for at least 6 or 7.. days. But, pressed
for time, a much more experienced co-worker solved a problem where I
needed to parse a full contact name stored in a single column into it's
component parts to store in different columns. Had to account for names
with and without standard salutations, possible titles or generational
identifiers, etc. My first reaction would have been to write a
function; his was to write a stored procedure. BUT, at the end of the
day, if I understand BOL and the messages here, I can't use his stored
procedure to:

insert into NewTable
...
select spParseName(name) from OldTable

What I could do, I guess is write a script, create a cursor (I've
already learned to say yuck, here) , then iterate the cursor with the
stored procedure. Wouldn't it have been much faster to write a function
that I could have parsed the names in an SQL statement? Surely, this
scenario is not unusual. Or, in the alterative (and often likely) am I
missing something entirely?
Thanks|||BlueDragon (bluedragon013@.yahoo.com) writes:
> I'm still a little at sea. First, I come from a programming background
> although I've been doing T-SQL for at least 6 or 7.. days. But, pressed
> for time, a much more experienced co-worker solved a problem where I
> needed to parse a full contact name stored in a single column into it's
> component parts to store in different columns. Had to account for names
> with and without standard salutations, possible titles or generational
> identifiers, etc. My first reaction would have been to write a
> function; his was to write a stored procedure. BUT, at the end of the
> day, if I understand BOL and the messages here, I can't use his stored
> procedure to:
> insert into NewTable
> ...
> select spParseName(name) from OldTable
> What I could do, I guess is write a script, create a cursor (I've
> already learned to say yuck, here) , then iterate the cursor with the
> stored procedure. Wouldn't it have been much faster to write a function
> that I could have parsed the names in an SQL statement? Surely, this
> scenario is not unusual. Or, in the alterative (and often likely) am I
> missing something entirely?

If I get this right, it does not sound like a function would be useful
here. As I understand it, you pass a single item, and get back several.
A scalar function can only return a single value. You could of course do:

SELECT dbo.component(fullname, 1), dbo.component(fullname, 2) ...

and only get back one component at a time. But that would not very
effecient. While calling a UDF in a query is a better alternative
that iterate row for row, UDFs have quite an impact on performance.

He could have written a table-valued function, but a table-valued
function cannot accept a table column as a parameter, so you would be
back to the cursor. (This is different in SQL 2005.)

So a stored procedure should be best bet. But it could be done more
effeciently. He could operate on a temp table where the caller fills
in the names to split, and where he applies set-based statements to
split all names in one go. You as the caller create the temp table
(with name and definition specified by the cracker procedure). On
return, you have the splitted names.

There is some more information about this on
http://www.sommarskog.se/share_data.html#temptables.

--
Erland Sommarskog, SQL Server MVP, esquel@.sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techin.../2000/books.asp

Monday, February 20, 2012

Procedures v. Functions

No comments:

Post a Comment

Procedure Time out

Blog Archive

About Me