You are viewing limited content. For full access, please sign in.

Question

Question

Downloading Audit Trail Reports - XLSX and CSV; and problem with the CSV

asked on May 8

Hi all,

When we run an Audit Trail Report, and go to download the results - we see the following behaviour:

  • If the amount of rows is on the reasonable side, then the download is an XLSX file
  • If the amount of rows is very high (seems to be when in the 100k+ range) then the download automatically becomes a Zip file; inside of which is a CSV (rather than an XLSX)

 

I haven't been able to find any documentation on what the actual cut-off row figure is for when the system downloads as the Zip/CSV;  so the first question is - does anyone know what that figure is? Is it based on row count of data, or filesize of the download (i.e. if it calculates that the file download size will be very high, it reverts to Zip/CSV)?

The second issue is that when it becomes a CSV, the format for the CSV does not seem to be correct. When opened in any application like Excel or even Notepad, the layout and formatting isn't working cvorrectly as it is not properly delimited using commas (so values are concatenated into one column when they should be split into multiple columns; headers are concatenated etc).

This is how we see it when opening the "CSV" in Excel:

This is how we see it in Notepad:

You can clearly see commas are not in the correct place.

Is anyone else seeing this? And is there a workaround at all? We are on version Laserfiche Audit Trail 11.0.2306.3549

0 0

Answer

APPROVED ANSWER
replied on May 8

Disclaimer: I'm not on the Audit Trail product team. The following are my own thoughts based on public documentation.

Regarding the "csv" delimiter, the Laserfiche Audit Trail 11 Frequently Asked Questions (FAQ) on the Support Site has a possible hint:

Q: Previously, I could directly access the audit reporting SQL database to run my own queries and generate my own audit reports or charts as an automatic scheduled task. Can I directly read the search catalog?

Audit Trail 11 uses the Laserfiche Full Text Search Engine (LFFTS) to index the audit data in order to provide a better report generation experience. As the search catalog schema is undocumented and subject to change, Audit Trail 11 includes a small toolkit (PowerShell script) to export reports as Microsoft Excel files. Use Windows scheduled tasks to create an automated task that runs the script. By default, the Excel file is a TAB-delimited CSV format file. 

It's a reasonable guess that this PowerShell script is calling the same Export Report function as the web UI, which produces a TAB-delimited .csv file. A tab-delimited csv file sounds like an oxymoron but is apparently a thing

You could try changing the file extension to ".tsv" (Tab-Separated Values) before opening it in another application and see if that fixes the issues arising from their csv auto-handling.

2 0
replied on May 8

Hi Samuel - thank you for the prompt reply; and yes dead right - the "CSV" file is actually Tab Delimited (that dawned on me after posting the question). Importing the file using the text wizard in Excel and selecting Tab delimiting then works.

This will be ok as a workaround for now; but would be great to see the CSV actually be a CSV!

We have a customer who regularly runs a report which always exceeds the row/data limit and so they will be doing this a lot.

1 0
replied on May 8 Show version history

Glad to hear it! I'm guessing there were compelling reasons to use tabs as the delimiter, most likely that trying to properly escape all values that could contain commas was a nightmare. Otherwise I imagine we would have used normal comma-delimited CSVs. Using tabs seems like a deliberate design decision.

The more reasonable "fix" is likely outputting the files with a .tsv or .txt extension rather than .csv so applications with automatic CSV handling that (reasonably) assumes they're comma delimited don't get tripped up.

I'll bring it up with the dev team.

1 0
replied on May 8 Show version history

I've been informed that the .xlsx -> .csv row count threshold is indeed 100k rows. We compress the output file in a zip archive to make the file smaller to help avoid timeout issues that can occur with large file downloads.

1 0

Replies

You are not allowed to reply in this post.
You are not allowed to follow up in this post.

Sign in to reply to this post.