Thursday, February 16, 2012

About paging in MS SQL Server

Hi all,
Continue to my question early today, I have read some articles about
"paging" in MS Sql server, which is recommended by Tibor Karaszi. I didn't
read all of them, but I summarized the ideas about what I have read:
use SELECT TOP... statement, but with a range that selected by yourself. For
example, if you have a PK, and you know that your 30,001th row PK is PK1, so
you can have a statement like
SELECT TOP 10 * FROM Table1 WHERE PK >= PK1
By using this statment, you can get the records between 30,001 and 30,010.
The question is, how can I know that the PK for 30,001th records is PK1.
For some articles, it is suggested to use a chain identy. For example, each
time you get a page, you know the first and the last PK of that page. So you
can easily get the previous or next 10 records. However, if I am not get the
next 10, but the next 10 after 30 records, what can I do?
I had a test before I wrote this article. I generated a table with
10,000,000 records. When I am trying to get arbitrary 10 records, I can
almost get it by no time if I know the PK values. But what can I do if I
don't know the PK value?
With the same 10,000,000 records, I used the following stored procedure to
get the paging effect. Compare to the no time when I know the PK range, or
simply the starting value of PK, the 3 seconds result of following stored
procedure is not satisfied:
--sample--
CREATE PROCEDURE [dbo].[GetPageSet]
@.StartRow INT = 1,
@.PageSize INT = 10
AS
DECLARE @.PK1 INT
DECLARE MyCursor CURSOR SCROLL DYNAMIC FOR
SELECT PK FROM Table1
OPEN MyCursor
FETCH Relative @.StartRow INTO @.PK1
CLOSE MyCursor
DEALLOCATE MyCursor
SELECT TOP @.PageSize * FROM Table1
WHERE PK >=PK1
GO
----
I need 2-3 seconds to retrieve any 10 records from 10,000,000 records.
However, there are several things that I cannot get a satisified result:
1) Compare to no time if I know the PK range, this is not good enough.
2) It is said that the DYNAMIC Cursor will not as efficient as Static or
Keyset when using in Join. I didn't test the join. I wonder if it will be a
very big performance degrade if I have to use static or keyset cursor when I
need join.
3) I have to use the RELATIVE fetch because Dynamic cursor do not support
ABSOLUTE fetch.
4) If I don't use the DYNAMIC cursor, I will need nearly 3 minutes with same
procedure to get any 10 records.
5) If I don't use SCROLL cursor, I will not be able to fetch directly to
some specific row. I wonder what will happen if I fetch one by one until I
have iterated over 4,000,000 records.
Anyone have better solution?
By the way, anyone tested with ADO, OLE DB, or ODBC? Will it be better to
use these cursors? Is it possible that enhanced features in these cursors
can provide better solution for this "paging" problem?
Hope to get more disucssion on this problem.
Thx.
Joe> SELECT TOP @.PageSize * FROM Table1
> WHERE PK >=PK1
This would never compile. You cannot use a variable in a top clause.
Here is one way I handle it - http://www.aspfaq.com/2120, though admittedly
I've never tried this technique with 10,000,000 records (it's doubtful such
a resultset would be useful to page against; with that many rows, it is much
more likely that aggregates would be more useful).
It's doubtful you're going to find a solution that meets all of your
ridiculously (IMHO) stringent requirements *and* runs in under 3 seconds.
A|||Hi, Aaron
OK, The
SELECT TOP @.PageSize * FROM Table1
is only used when I was writting this artile. In my true SP, I used the
SELECT TOP 100 * FROM Table1
But that's not the critical point in this SP. What I want to know is wether
there are better solutions.
For example, I have read your article yesterday, which you also quoted
below. The first method you have used is using ADO. Which means you are
using a ADO cursor. Like you, i have used this method in a number of my old
programs. I have tested your page several times. I am not sure if the time
you give at the bottom of the page is the time you need to process the
"Paging" only. Let's assume it is. So to get 50 records out of your 10,000
records, you need:
ADO method: around 1.4 seconds
SP method: around 0.8 second
I am not sure how much you need when you get a huge table.
For the ASP method, you are using ADO cursor. According to document, the
server handle your request by create a client buffer. All the records will
be push into buffer first. If you are getting a huge table, I wonder how
much request a server can handle.
For the SP method, you are creating a auto-index through a temporary table,
which means you need to duplicate some records first. Again, if you are
handling a huge table, is there any concern?
In fact, I am not sure how the server will handle the statement I wrote in
the SP. I wonder if it will create a temp table too. If this is true, it
will be similiar to your SP.
What I want to know is: has anyone do study about this. Is there anyone who
can give a best method?
J
"Aaron Bertrand - MVP" <aaron@.TRASHaspfaq.com> wrote in message
news:Ou3$OP9RDHA.2248@.TK2MSFTNGP11.phx.gbl...
> > SELECT TOP @.PageSize * FROM Table1
> > WHERE PK >=PK1
> This would never compile. You cannot use a variable in a top clause.
> Here is one way I handle it - http://www.aspfaq.com/2120, though
admittedly
> I've never tried this technique with 10,000,000 records (it's doubtful
such
> a resultset would be useful to page against; with that many rows, it is
much
> more likely that aggregates would be more useful).
> It's doubtful you're going to find a solution that meets all of your
> ridiculously (IMHO) stringent requirements *and* runs in under 3 seconds.
> A
>

No comments:

Post a Comment