Saturday, February 25, 2012

Records comparsion

Please look at the code template below for my problem. I am comparing two tables. Table 99 is the current state of the table. I get in a fresh extract of Table 99 everyday from my source - DB2 and I insert that in Table 00. Then I execute the code template below so that Table 99 stays current with the source DB2 on a daily basis.
--
CREATE TABLE myTable99 (
Col1 char(1)
, Col2 char(1)
, Col3 char(1)
, Col4 char(1)
, Col5 char(1)
, CONSTRAINT myTable99_pk PRIMARY KEY (Col1, Col2))

CREATE TABLE myTable00 (
Col1 char(1)
, Col2 char(1)
, Col3 char(1)
, Col4 char(1)
, Col5 char(1)
, CONSTRAINT myTable00_pk PRIMARY KEY (Col1, Col2))
GO

INSERT INTO myTable99(Col1,Col2,Col3,Col4,Col5)
SELECT '1','1','a','b','c' UNION ALL
SELECT '1','2','d','e','f' UNION ALL
SELECT '1','3','g','h','i' UNION ALL
SELECT '1','4','j','k','l'
--DELETED

INSERT INTO myTable00(Col1,Col2,Col3,Col4,Col5)
SELECT '1','1','a','b','c' UNION ALL --NO CHANGE
SELECT '1','2','x','y','z' UNION ALL -- UPDATE
SELECT '1','3','g','h','i' UNION ALL --NO CHANGE
SELECT '2','3','j','b','c' --INSERT
GO

SELECT * FROM myTable99
SELECT * FROM myTable00
GO

--DO DELETES FIRST

DELETE FROM a
FROM myTable99 a
LEFT JOIN myTable00 b
ON a.Col1 = b.Col1
AND a.Col2 = b.Col2
WHERE b.Col1 IS NULL AND b.Col2 IS NULL

-- INSERT

INSERT INTO myTable99(Col1,Col2,Col3,Col4,Col5)
SELECT a.Col1, a.Col2, a.Col3, a.Col4, a.Col5
FROM myTable00 a
LEFT JOIN myTable99 b
ON a.Col1 = b.Col1
AND a.Col2 = b.Col2
WHERE b.Col1 IS NULL AND b.Col2 IS NULL

-- UPDATE

UPDATE a
SET Col3 = b.Col3
, Col4 = b.Col4
, Col5 = b.Col5
FROM myTable99 a
INNER JOIN myTable00 b
ON a.Col1 = b.Col1
AND a.Col2 = b.Col2
AND ( a.Col3 <> b.Col3
OR a.Col4 <> b.Col4
OR a.Col5 <> b.Col5)
GO

SELECT * FROM myTable99
SELECT * FROM myTable00
GO

DROP TABLE myTable00
DROP TABLE myTable99
GO

--
I want to process new and updated records.
Let me explain the problem -

The last row for myTable99 is - 1','4','j','k','l'

The last row for myTable00 is - '2','3','j','b','c'

Col1 & Col2 are the PK's. So the last row from Table 99 gets deleted and the one from Table00 gets inserted as a 'New' record.Now I want to do processing only on those records that are new. So I should process record '2','3','j','b','c' .

My processing depends on the Col3 (attribute). My problem is I do not want to process records if the attribute has been processed before. In this case Col3 has 'j' and it has been processed before in 1','4','j','k','l'. It's just that this record no longer exist. So when I get in '2','3','j','b','c' , I want to say that this record has been processed.

How do I do that with the Insert new records query? The same case is for updates as well.

Any help is appreciated.

Thank you.My processing depends on the Col3 (attribute). My problem is I do not want to process records if the attribute has been processed before. In this case Col3 has 'j' and it has been processed before in 1','4','j','k','l'. It's just that this record no longer exist. So when I get in '2','3','j','b','c' , I want to say that this record has been processed.

You lost me....

Huh?

Understand....Whatever is in Col3 attribute BELONGS to the composite key Col1+Col2...it does not transcend the row to some other key?

It belongs to that row...if it so happens that the same value is on another row that's fine...but it belongs to that other key...

Can you give the explanation another shot?|||Why so complicated?

Can you not flag the affected records after the import and use the EXISTS / NOT EXISTS clause to target those which have been actioned as you translate them from import source to stored data?|||I think he's saying, that if I messed with an "attribute" (bad choice of words here) of a value of 5 in column 3, then I don't want to touch ANY of the rows...across keys...right?|||Here is the processing deal. I get the value of col 3 and check to see if it exists in this other table Z.

So now when I get a new record and if I have already checked that col3 value in Table Z, then I don't really need to check it again?|||Clear as mud...what does that have to do with adding and updating rows?

And what are you checking col3 for?|||what does that have to do with adding and updating rows?
Well every row that is added/updated has the status of a new record. And all new records have to be processed.

And what are you checking col3 for?
I am checking if the col3 value of the new record exist in this other table called Table Z. If it exist then I spit out the contents of Table 99 and Table Z into another table say X. Now my client looks to see the degree of similarity in col3 and the column in Table Z. For instance - col3 value is 'j' and the column in Table Z is 'jo'

Now for a new record which has the same value of Col3 as an old/delted record that was spit out then I have many duplicate values of Col3 (but different primary key) in X. Now the client has to check the same value of Col3 again for degree of similarity with Table Z.

I know it's a bit crazy but that's how the data is. Let me know what you think.|||I;m thinking I'm glad I'm not you...

But a previous post by someone a good idea...

When you add or update a record, tag it with an indicator NEW_IND for example, the do you're comparison step where the ind = 'Y' or whatever...then when you're donme set them all bacl to N

What'chyou think?|||Well if I tag it with IND = 'Y' and do the comparison then isn't it possible that Col3 of the record with IND = 'Y' would have been compared before?|||You cannot delete the old records AND retain knowledge of the old record's values. You must retain a knowledge of what the old the old values were.

You probably need an intermediate table indicating which of the possible values you have already processed, or implement some ki9nd of flagging system such that you identify the records to be deleted, insert the new records, compare the col3 values and then delete the flagged records.|||I appologize for the tone of my previous message. I did not intend for it to sound aggressive.

I had to address a similar problem to create change files (LDIFs) for changes to an X500 directory. Our solution was to mirror the table structure and do a comparison of changes based off of the key values for the appropriate tables.

Prior to import (or live data manipulation) the two instances are identical. A series of EXISTS / NOT EXISTS comparisons returns the newly added or deleted records. These changses are passed to the ADD/DELETE tables.

At the field level we passed the field names with an INNER JOIN fo the PK to look for field level changes. These changes were passed to a MODIFY table.

I am sure that you can use a variation of this logic to solve your problem.

No comments:

Post a Comment