We have a massive cube. All but one of the fact tables are by 3am. But the final fact table isn't loaded in the SQL data warehouse until 11am. The users want the cube processed and available as immediately as possible. But the cube can't show ANY updated data until it shows ALL updated data including the fact table from 11am. Any thoughts on making this happen?
One option would be to process the cube at 3am in another database. Then process the final measure group at 11am. Then use the Synchronize command to get those changes to the live database soon after 11am. But there are two drawbacks... 1) This requires twice the amount of disk space as before and 2) I can't figure out a way to synchronize two databases on the same server.
Another option would be to process the cube at 3am but leave that transaction open (so end users can't see the new data) for 8 hours until you process the final measure group and then close the transaction. This doesn't seem like a good plan, but I wanted to hear people's thoughts.
I would certainly experiment with leaving transaction open for 8 hours. I don't see why it doesn't seem like a good plan to you. Note, that you will still have doubled usage of disk space, since two versions will be coexisting for the 8 hours.|||Well, maybe it's not a dumb idea. I guess the reason I was worried about it is just my relational background and the idea to never leave transactions open. But I guess that doesn't matter in this case.
So my main concern with the technical feasibility is whether the server will clean up the idle session (in so doing cancel the transaction). Am I going to have to run some cheap discover function ever 5 mins for 8 hours to keep the session alive? Any other way?
Thanks for the feedback Mosha.
|||If server canceling the idle session becomes a problem - you can always bump up the session expiration timeout.
|||I've got this working. Cube processing is broken up into two parts with the connection being dropped in between. Just like you said, it works fine. Just had to bump up the IdleOrphanSessionTimeout setting.
There is one problem, though. If we get a failure on part 2 of the processing (like if there's an invalid dimension key in a fact table) then that failure rolls back the entire transaction, not just the stuff being processed during part 2. Wondering if there's a way around this. I can walk you through reproducing this against Adventure Works:
First, change the query binding of the Reseller_Sales_2004 partition of the Reseller Sales measure group to be:
SELECT [dbo].[FactResellerSales].[ProductKey],-999 as OrderDateKey,[dbo].[FactResellerSales].[DueDateKey],[dbo].[FactResellerSales].[ShipDateKey],[dbo].[FactResellerSales].[ResellerKey], [dbo].[FactResellerSales].[EmployeeKey],[dbo].[FactResellerSales].[PromotionKey],[dbo].[FactResellerSales].[CurrencyKey],[dbo].[FactResellerSales].[SalesTerritoryKey],[dbo].[FactResellerSales].[SalesOrderNumber],[dbo].[FactResellerSales].[SalesOrderLineNumber],[dbo].[FactResellerSales].[RevisionNumber],[dbo].[FactResellerSales].[OrderQuantity],[dbo].[FactResellerSales].[UnitPrice],[dbo].[FactResellerSales].[ExtendedAmount],[dbo].[FactResellerSales].[UnitPriceDiscountPct],[dbo].[FactResellerSales].[DiscountAmount],[dbo].[FactResellerSales].[ProductStandardCost],[dbo].[FactResellerSales].[TotalProductCost],[dbo].[FactResellerSales].[SalesAmount],[dbo].[FactResellerSales].[TaxAmt],[dbo].[FactResellerSales].[Freight],[dbo].[FactResellerSales].[CarrierTrackingNumber],[dbo].[FactResellerSales].[CustomerPONumber]
FROM [dbo].[FactResellerSales]
WHERE OrderDateKey >= '915' AND OrderDateKey <= '1280'
(Note that "-999 as OrderDateKey" will cause a failure.)
Second, pull up the directory contains the 2004 partition for Internet Sales. (On your computer it should be something like c:\program files\Microsoft SQL Server\mssql.2\OLAP\data\Adventure Works DW.0.db\Adventure Works DW.18.cub\Fact Internet Sales 1.18.det\Internet_Sales_2004.16.prt) After part 1 completes, you'll see the number of files in this directory double. After part 2 fails, you'll see those new files disappear as the transaction was rolled back.
Then, compile a little VB.NET app, put a breakpoint on the Sleep command so you can see the number of files in that 2004 partition directory double. Then run it:
Public Shared Function Main(ByVal args As String()) As Integer
Dim sServer As String = "localhost"
Dim sDatabase As String = "Adventure Works DW"
ProcessPart1(sServer, sDatabase)
System.Threading.Thread.Sleep(5000) 'simulate a break in the program
ProcessPart2(sServer, sDatabase)
End Function
Public Shared Sub ProcessPart1(ByVal sServer As String, ByVal sDatabase As String)
Dim server As New Server()
server.Connect(sServer)
Dim database As Database = server.Databases.FindByName(sDatabase)
'save the session ID so we can retrieve it later and connect to this session later
'alternately, it could be saved to a SQL database instead
database.Annotations.SetText("ProcessCubeSessionID", server.SessionID)
database.Update(UpdateOptions.Default, UpdateMode.Update)
server.BeginTransaction()
Try
Dim cube As Cube = database.Cubes.FindByName("Adventure Works")
Dim mg As MeasureGroup = cube.MeasureGroups.FindByName("Internet Sales")
mg.Process(ProcessType.ProcessFull)
'after processing has succeeded, change the orphan session timeout
'to 24 hours so that this session we're about to leave open won't timeout
'before we're done with it
server.ServerProperties("IdleOrphanSessionTimeout").Value = CStr(24 * 60 * 60)
server.Update()
'false means don't end session
'leave the session open so we can connect to it later
'and continue with the open transaction
server.Disconnect(False)
'don't commit the transaction now
'leave it open until we finish the processing later
Catch ex As Exception
server.RollbackTransaction()
'blank out the ProcessCubeSessionID annotation
database.Annotations.SetText("ProcessCubeSessionID", Nothing, True)
database.Update(UpdateOptions.Default, UpdateMode.Update)
server.Disconnect(True)
Throw
End Try
End Sub
Public Shared Sub ProcessPart2(ByVal sServer As String, ByVal sDatabase As String)
Dim server As New Server()
server.Connect(sServer)
Dim database As Database = server.Databases.FindByName(sDatabase)
Dim oldSession As String = database.Annotations.GetText("ProcessCubeSessionID")
If String.IsNullOrEmpty(oldSession) Then
Throw New Exception("Was not able to find the existing session which " _
& "processed the other 95% of the cube and has the open transaction! " _
& "The cube will not be processed.")
End If
server.Disconnect(True)
Try
server.Connect(sServer, oldSession)
Dim cube As Cube = database.Cubes.FindByName("Adventure Works")
Dim mg As MeasureGroup = cube.MeasureGroups.FindByName("Reseller Sales")
mg.Process(ProcessType.ProcessFull) 'BUG?: if this fails, it rolls back the transaction!!!
server.CommitTransaction()
'blank out the ProcessCubeSessionID annotation
database.Annotations.SetText("ProcessCubeSessionID", Nothing, True)
database.Update(UpdateOptions.Default, UpdateMode.Update)
'set this property back to the default now that
'we're done needing orphan sessions to survive
server.ServerProperties("IdleOrphanSessionTimeout").Value = _
server.ServerProperties("IdleOrphanSessionTimeout").DefaultValue
server.Update()
server.Disconnect(True) 'true means to end session
Catch ex As Exception
'there was a problem with the last 5% of the processing...
'so leave the session open so the problem can be corrected and we can
'start with 95% of the processing from earlier in the morning
'being completed already
server.Disconnect(False) 'false means don't end session
Throw
End Try
End Sub
If you search for the line in the code which has a comment saying "BUG?" that's where the processing fails because of our tweak to the partition query. When that processing fails.
|||I'd still like to hear if anyone knows a workaround, but I've posted this as a bug:
https://connect.microsoft.com/SQLServer/feedback/ViewFeedback.aspx?FeedbackID=278000
|||I expect the bug that you filed to be resolved "By Design". The definition of the transaction is that it either fully succeed or fully fails. That's the letter "A" in the ACID definition - A for Atomicy. You have very long transaction, but it doesn't matter - if anything fails within transaction, the entire transaction is rolled back.
No comments:
Post a Comment