You are viewing limited content. For full access, please sign in.

Question

Question

Need some SQL help getting info out of LF

asked on July 12, 2018 Show version history

We have some metadata fields in our documents, such as name, student ID, status, DOB, etc. The authoritative source for these fields is our student information system. The Registrar's office wants the metadata in LF synced from the SIS so that the fields in the documents are always up to date. I have written a synchronization script in SQL, along with a workflow, to accomplish this. However, what I have found is that some LF document IDs (entry IDs) generated by the SQL statement are not found when processed by the workflow. This doesn't make sense as there wouldn't be any way for the IDs to disappear between the two processes, so IDs generated by the SQL statement should ALWAYS be found by the workflow. This makes me think that maybe my SQL statement isn't correct in some way. I am hoping someone can look at my SQL and sanity-check it for me. The whole process works just fine, except for the fact that maybe 1/10 or 1/100 of the entry IDs generated do not get found by the workflow.

 

Here is the SQL statement that generates the data out of LF. Please look it over and tell me if it looks right or if I am missing something. For instance, last night when it ran, out of 50,000 documents, 6 document IDs were not found. If you need more info, please let me know. Thanks!

 

Also, here is the way this works:

1) SQL stored procedure runs that generates the documents out of LF (below)

2) SQL SP then reads the authoritative info out of the SIS

3) SQL SP then does some processing which leaves only those records that have different field values in LF (sample data shown below) vs. what is in the SIS

4) Workflow runs which takes the records out of the built sync table (sample data shown below) and updates the field values in LF for the given entry IDs (DocID).

 

DECLARE @IDField       varchar(30) = 'Student ID';
DECLARE @SSNField      varchar(30) = 'SSN';
DECLARE @LNameField    varchar(30) = 'Last Name';
DECLARE @FNameField    varchar(30) = 'First Name';
DECLARE @MNameField    varchar(30) = 'Middle Initial';
DECLARE @DOBField      varchar(30) = 'Birth Date';
DECLARE @MatricField   varchar(30) = 'Matric Date';
DECLARE @StatusField   varchar(30) = 'Student Status';
DECLARE @PrivacyField  varchar(30) = 'Privacy Act';
DECLARE @DependField   varchar(30) = 'Dependency Affidavit';
DECLARE @DeceasedField varchar(30) = 'Deceased Student';

DECLARE @StudentRec    varchar(40) = 'Student Academic Record';
DECLARE @HousingRec    varchar(40) = 'Student Housing Record';
DECLARE @FinAidRec     varchar(40) = 'Student Financial Aid Record';

WITH TempTable (tocid, pset_name, prop_name, str_val, short_str_val, date_val) AS (
SELECT PV.tocid, PS.pset_name, prop_name, str_val, short_str_val, date_val
  FROM LF8.dbo.propval PV
    INNER JOIN LF8.dbo.propdef PD ON PV.prop_id = PD.prop_id 
           AND PD.prop_name IN  (@IDField, @SSNField, @LNameField, @FNameField, @MNameField, @DOBField, @MatricField, @StatusField, @PrivacyField, @DependField, @DeceasedField)
	LEFT  JOIN LF8.dbo.toc TOC ON TOC.tocid = PV.tocid AND TOC.parentid > 2
	INNER JOIN LF8.dbo.propset PS on TOC.pset_id = PS.pset_id 
	       AND PS.pset_name IN (@StudentRec, @HousingRec, @FinAidRec)
)
 INSERT INTO LFFieldData (DocID, StuID, SSN, LName, FName, MI, DOB, Matric, Status, Privacy, Depend, Deceased)
   SELECT tocid,
        max(CASE WHEN prop_name = @IDField       THEN str_val END) AS StuID,
        max(CASE WHEN prop_name = @SSNField      THEN str_val END) AS SSN,
        max(CASE WHEN prop_name = @LNameField    THEN str_val END) AS LNAME,
        max(CASE WHEN prop_name = @FNameField    THEN str_val END) AS FNAME,
        max(CASE WHEN prop_name = @MNameField    THEN str_val END) AS MI,
        max(CASE WHEN prop_name = @DOBField      THEN str_val END) AS DOB,
        max(CASE WHEN prop_name = @MatricField   THEN str_val END) AS MATRIC,
        max(CASE WHEN prop_name = @StatusField   THEN str_val END) AS STATUS,
        max(CASE WHEN prop_name = @PrivacyField  THEN str_val END) AS PRIVACY,
        max(CASE WHEN prop_name = @DependField   THEN str_val END) AS DEPEND,
        max(CASE WHEN prop_name = @DeceasedField THEN str_val END) AS DECEASED
     FROM TempTable
   GROUP BY tocid;

Here is a sample of the data generated by the SQL statement:

DocID	StuID	        SSN	        Lname	Fname	MI	DOB		Matric  Status	                Privacy	Depend	Deceased UpdateTS
11446	123456789	123451234	STUDENT	NEW	A	1/5/1970	2018/01	Admission No Show	No	N	No	 7/11/18 5:17 PM
11453	234567890	888990000	JONES	BOB	B	3/25/1969	2018/01	Inactive	        No	N	No	 7/11/18 5:17 PM
11480	456789012	555667777	SMITH	JOHN	C	2/19/1968	2016/08	Inactive	        No	N	No	 7/11/18 5:17 PM
11483	111223333	222334444	FRANK	CHERYL	D	4/12/1967	2018/08	Pending or Cancelled	No	N	No	 7/11/18 5:17 PM

 

0 0

Answer

SELECTED ANSWER
replied on July 12, 2018

I'm not terribly familiar with the repository schema yet, but I'm assuming that the workflow won't find deleted entries. Try adding this to the toc join condition:

AND TOC.toc_flags & 2048 = 0

Also, you have my inner pedant twitching by naming a CTE "TempTable"!

0 0
replied on July 12, 2018

Thank you for this suggestion. I added your join condition to the stored procedure and re-ran it and got a difference of around 7000 records, so I will let the entire process run tonight and see if I get any unknown entries. Either way I will post back with results.

0 0
replied on July 13, 2018

Preliminary results look promising! It ran last night with no 'entry not found' errors. I am going to let it run over the weekend to tell for sure, but so far it looks good! Thanks again, and I'll mark yours as the answer next week if there are no problems.

0 0
replied on July 19, 2018

Thank you very much. This has taken care of the issue.

0 0

Replies

replied on July 12, 2018

Mark,

The documents that don't update may be locked/checked out.  I ran into something similar with a metadata update workflow recently. In my For Each loop I set the condition you see below to make sure I didn't process any checked out documents.

This would make sense as to why you are not getting the same number processed, as returned by your search.

07-12-2018 checked out.PNG
0 0
replied on July 12, 2018

This entire process is run at night, so no users are in the system. Additionally, when I do a manual search in LF, the doc ID (entry ID) that is in the SQL table comes back not found. Therefore, it appears that the SQL statement that I included is returning an entry ID, but then somehow that entry ID does not exist. I don't know how I can be pulling this directly out of LF and the ID doesn't exist.

0 0
replied on July 12, 2018

Mark, are you running into contention with any other nightly jobs that do clean up on your repository?  Perhaps you are capturing Entry IDs of docs that did exist until after that clean up job completed.

It shouldn't return entries from the Recycle Bin, but that was my only other thought.

0 0
replied on July 12, 2018

It is possible that it was hitting deleted entries. I added a condition to my query as suggested earlier and I am going to see if that makes a difference tonight.

1 0
You are not allowed to follow up in this post.

Sign in to reply to this post.