You are viewing limited content. For full access, please sign in.

Question

Question

How to get all empty/unused fields list

asked on January 21, 2024

We have a customer which have 150+ fields within many templates.

They need to clean and delete all empty fields including the unused fields (either part of template or individual, and either assigned to documents or not)

 

How can we get this list? Is there a query to achieve this?

0 0

Answer

SELECTED ANSWER
replied on January 22, 2024 Show version history

You can always do a search in the repository like this: 

{[]:[Field Name]="*"}

That will return any entries with a value in that field.

 

But I can understand wanting to do a query instead.  You don't have to wait for the program to load a list of entries, and you don't have to work down the list of individual fields.

Here's a query, that when run against your Repository database, checks against the propdef table and the propval table to try to get a count of how many entries have a value in each field: 

SELECT
  def.[prop_id],
  def.[prop_name],
  SUM(
    CASE
      WHEN val.str_val IS NULL THEN 0
      WHEN val.str_val = '' THEN 0
      ELSE 1
    END
    +
    CASE
      WHEN val.num_val IS NULL THEN 0
      ELSE 1
    END
    +
    CASE
      WHEN val.date_val IS NULL THEN 0
      ELSE 1
    END
    +
    CASE
      WHEN val.bin_val IS NULL THEN 0
      ELSE 1
    END
  ) AS value_count
FROM [propval] AS val
LEFT JOIN [propdef] AS def ON def.[prop_id] = val.[prop_id]
GROUP BY def.[prop_id], def.[prop_name]
ORDER BY def.[prop_id], def.[prop_name]

 

Note that against a test environment with only a few thousand entries, that query wasn't too bad, only taking about 20 seconds to run.  But against a live environment with many millions of entries, it was looking to take MUCH longer to run.  So I added a WHERE statement to limit how many different fields it looks at with each run.  I limited it to 10 fields in this case, and it still took about a minute to run.  Then the next search just requires incrementing the WHERE statement from "1 AND 10" to "11 AND 20", then "21 AND 30", etc., etc. 

SELECT
  def.[prop_id],
  def.[prop_name],
  SUM(
    CASE
      WHEN val.str_val IS NULL THEN 0
      WHEN val.str_val = '' THEN 0
      ELSE 1
    END
    +
    CASE
      WHEN val.num_val IS NULL THEN 0
      ELSE 1
    END
    +
    CASE
      WHEN val.date_val IS NULL THEN 0
      ELSE 1
    END
    +
    CASE
      WHEN val.bin_val IS NULL THEN 0
      ELSE 1
    END
  ) AS value_count
FROM [propval] AS val
LEFT JOIN [propdef] AS def ON def.[prop_id] = val.[prop_id]
WHERE val.[prop_id] BETWEEN 1 AND 10
GROUP BY def.[prop_id], def.[prop_name]
ORDER BY def.[prop_id], def.[prop_name]

 

EDIT TO ADD: Here's a better query, based on feedback from @████████and more testing of my own.  This was drastically faster to run: 

SELECT
  def.[prop_id],
  def.[prop_name],
  CASE
    WHEN subquery.value_count IS NULL THEN 0
    ELSE subquery.value_count
  END AS value_count
FROM [propdef] AS def
LEFT JOIN (
  SELECT
    def.[prop_id],
    def.[prop_name],
    COUNT(def.[prop_id]) AS value_count
  FROM [propval] AS val
  LEFT JOIN [propdef] AS def ON def.[prop_id] = val.[prop_id]
  WHERE (val.str_val IS NOT NULL AND val.str_val <> '')
    OR val.num_val IS NOT NULL 
    OR val.date_val IS NOT NULL 
    OR val.bin_val IS NOT NULL 
  GROUP BY def.[prop_id], def.[prop_name]
) AS subquery ON subquery.[prop_id] = def.[prop_id]
ORDER BY def.[prop_id]

 

1 0
replied on January 22, 2024

It's probably faster to rewrite the SUM/CASE as a count using

WHERE (val.str_val IS NULL OR val.str_val = '')
AND val.num_val IS NULL 
AND val.date_val IS NULL 
AND val.bin_val IS NULL 

 

1 0
replied on January 22, 2024 Show version history

That's a really good idea!

Actually, I think that would just return a list of entries that don't have the value populated, whereas this would need to be a list of fields that are never populated on any entries.  How would you rewrite the query with that WHERE statement?

Actually, I figured out a way to do it.  And it gives much better results than the original query, which apparently excluded fields that were not even displayed on any entries at all.  I'll update my original reply.

1 0
replied on January 23, 2024

Thank you @████████ and @████████ very much for the useful script.

1 0
replied on January 23, 2024

You are very welcome!

1 0

Replies

You are not allowed to reply in this post.
You are not allowed to follow up in this post.

Sign in to reply to this post.