Wednesday, March 7, 2012

RecordSet into a DataFlow Task

In the control flow I have an "Execute SQL Task" that executes a stored procedure. The stored procedure returns a result set of about 2000 rows of data into a package variable that has been typed as Object to contain the data.

What I have not been able to figure out is how to access the rows of data (in the package variable) from within a data flow task. There does not seem to be a data flow source task to perform that operation.

What am I missing that would make this easy?

...cordell...

If you need control over every row of the result set, you may need to use a data flow task (source, transform, and destination components). Could you give an example of what you are trying to accomplish after getting the 2000 rows?

|||

In the data flow task...I need to parse through the data to generate 3 unique result sets of data. One of the result sets will go to a text file, the 2nd result set goes back into a staging table, and the 3rd result set goes to an Excel spreadsheet.

|||

Cordell Swannack wrote:

In the control flow I have an "Execute SQL Task" that executes a stored procedure. The stored procedure returns a result set of about 2000 rows of data into a package variable that has been typed as Object to contain the data.

What I have not been able to figure out is how to access the rows of data (in the package variable) from within a data flow task. There does not seem to be a data flow source task to perform that operation.

What am I missing that would make this easy?

...cordell...

This shows you how to do it:

Recordsets instead of raw files
(http://blogs.conchango.com/jamiethomson/archive/2006/01/04/2540.aspx)

I'm not sure if this is relevant though. If you are using a SQL statement to get the data, why not just put the SQL statement into an OLE DB Source adapter? Why bother with the rigmarole of a variable and then having to write code?

-Jamie

|||

Cordell Swannack wrote:

In the data flow task...I need to parse through the data to generate 3 unique result sets of data. One of the result sets will go to a text file, the 2nd result set goes back into a staging table, and the 3rd result set goes to an Excel spreadsheet.

I would wager you can accomplish all of this with a combination of the conditional split, multicast and/or derived column components.

Even if you can't, you would be better off employing a script component to parse through the data - there is still no need to put it into a variable first.
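To illustrate the conditional split idea: each row is routed down exactly one output based on an expression, much as if you ran three filtered queries over the same rows. This is only a sketch; the table and column names below are hypothetical, since the thread does not describe the actual schema.

```sql
-- A conditional split routes each row to exactly one output,
-- behaving like these three WHERE clauses over the same source rows.
SELECT * FROM dbo.TrialData WHERE Destination = 'TextFile'; -- goes to a Flat File destination
SELECT * FROM dbo.TrialData WHERE Destination = 'Staging';  -- goes to an OLE DB destination
SELECT * FROM dbo.TrialData WHERE Destination = 'Excel';    -- goes to an Excel destination
```

In the actual package, the `WHERE` clauses would be SSIS expressions on the conditional split's outputs, and each output would be wired to the corresponding destination component.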

-Jamie

|||

Jamie Thomson wrote:

Cordell Swannack wrote:

In the control flow I have an "Execute SQL Task" that executes a stored procedure. The stored procedure returns a result set of about 2000 rows of data into a package variable that has been typed as Object to contain the data.

What I have not been able to figure out is how to access the rows of data (in the package variable) from within a data flow task. There does not seem to be a data flow source task to perform that operation.

What am I missing that would make this easy?

...cordell...

This shows you how to do it:

Recordsets instead of raw files
(http://blogs.conchango.com/jamiethomson/archive/2006/01/04/2540.aspx)

I'm not sure if this is relevant though. If you are using a SQL statement to get the data, why not just put the SQL statement into an OLE DB Source adapter? Why bother with the rigmarole of a variable and then having to write code?

-Jamie

I was going to suggest something similar. In general, you would use data flows to perform row-by-row operations, like transformations, splits, sorts, etc. However, the control flow may be a better place for set-based operations, like an UPDATE or SELECT INTO.

In your particular case, use a data flow task and then drop in an OLE DB source component, any required transforms, and the destination component.

|||

Helpful suggestions....

While you can use the OLE DB Source adapter inside of the Data Flow task...I am executing a stored procedure that returns a large result set. When you click on the 'Preview' button...the data and all of the column names are returned. (Just proving that everything works.)

However, I haven't figured out how to set the column names typically set by the OLE DB Source component. Right now they are blank. Usually, at design time, when you reference a table or use a simple SELECT statement, the OLE DB Source component can map the columns specified by the table name or SELECT statement to create a list of columns that are used for the output, and thus by other components in the data flow task.

But when executing a stored procedure...the column names are not returned until completion (i.e., runtime) of the stored procedure. This is the problem.

...cordell...

|||

Cordell Swannack wrote:

Helpful suggestions....

While you can use the OLE DB Source adapter inside of the Data Flow task...I am executing a stored procedure that returns a large result set. When you click on the 'Preview' button...the data and all of the column names are returned. (Just proving that everything works.)

However, I haven't figured out how to set the column names typically set by the OLE DB Source component. Right now they are blank. Usually, at design time, when you reference a table or use a simple SELECT statement, the OLE DB Source component can map the columns specified by the table name or SELECT statement to create a list of columns that are used for the output, and thus by other components in the data flow task.

But when executing a stored procedure...the column names are not returned until completion (i.e., runtime) of the stored procedure. This is the problem.

...cordell...

You're absolutely right. This IS a problem with sprocs. And I agree that this sounds like a good justification for using the Execute SQL Task route. However, I suggest another workaround here:

Using stored procedures inside an OLE DB Source component
(http://blogs.conchango.com/jamiethomson/archive/2006/12/20/SSIS_3A00_-Using-stored-procedures-inside-an-OLE-DB-Source-component.aspx)

You may wish to employ it, as it will enable you to use the OLE DB Source component.
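One workaround common in that era (a sketch only; the procedure name is hypothetical, and the linked post may describe a different variant) is to prefix the EXEC in the OLE DB Source's SQL command so the designer can discover the output columns:

```sql
-- Entered as the "SQL command" text of an OLE DB Source.
-- The designer normally probes for metadata with FMTONLY ON, which
-- fails for procedures using temp tables; turning it off forces the
-- procedure to actually run so the output columns can be discovered.
SET FMTONLY OFF;
EXEC dbo.uspGetTrialData;
```

Note that the metadata probe will then execute the procedure for real, so this is only safe for procedures without side effects.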

-Jamie

|||

I found your blog entry on the subject just as I was receiving your email.

The reason a UDF doesn't work for my solution is the amount of transformation processing that must be performed on the data. Basically, I have tens of thousands of data records that are processed and then randomized into laboratory trial groups, which are then distributed out to various computer systems for analysis.

I ended up changing the stored procedure to create a table in the database...and then, once the SSIS work has completed...I drop the table. A less-than-perfect solution...but it gets the job done, and it is time to move on.
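The approach described can be sketched as the following three package steps (all object names are hypothetical):

```sql
-- 1. Execute SQL Task: the reworked procedure materializes its output
--    into a real table instead of returning a result set.
EXEC dbo.uspBuildTrialData; -- internally does SELECT ... INTO dbo.TrialDataWork

-- 2. Data flow: the OLE DB Source reads the real table, so column
--    metadata is available at design time.
SELECT * FROM dbo.TrialDataWork;

-- 3. Execute SQL Task: clean up once the data flow has finished.
DROP TABLE dbo.TrialDataWork;
```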

Thank you Jamie for your time to answer my questions and your patience.

...cordell...
