
Question

When importing docs via the Cloud API, is the file extension required?

asked on April 28

I think the only way to have documents correctly recognized in the repository after import via the API is to include the file extension. Is this correct? It doesn't appear that there is a way for me to pass in the specific MIME type, e.g. application/pdf, so I need to include the file extension.

If I don't include the file extension, the file is unreadable. 

If I am correct so far, the only downside is that all documents imported via the API keep their file extensions in the repository, whereas documents imported directly into the repo (not via the API) do not have file extensions. It's not a showstopper, but it will be confusing for some users and create an inconsistency.


Replies

replied on April 28

The file type can be inferred from the extension, but I'm pretty sure that if the source of the document correctly sets the MIME type when defining the Blob object, it will be read correctly by the API.

If that doesn't work you could always rename the document after import too!
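
If it helps, here's a minimal Python sketch of those two options (guessing the type from the file name versus supplying it explicitly). mimetypes is in the standard library; the application/octet-stream fallback is just my assumption for illustration, not something the API requires.

from typing import Optional
import mimetypes

def resolve_content_type(file_name: str, explicit_type: Optional[str] = None) -> str:
    # Prefer an explicitly supplied MIME type; otherwise guess from the extension
    if explicit_type:
        return explicit_type
    guessed, _ = mimetypes.guess_type(file_name)
    return guessed or "application/octet-stream"

print(resolve_content_type("invoice.pdf"))                 # application/pdf
print(resolve_content_type("invoice", "application/pdf"))  # application/pdf
print(resolve_content_type("invoice"))                     # application/octet-stream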

replied on April 28

Thanks. Just thinking this through... If I send a PDF without a file extension, I cannot open it in the repo. If I rename that doc with a .pdf extension, I can open it in the repo.

What's the difference between importing a PDF into the repo through the browser and importing via the API? When I import a PDF through the browser, it does not keep the file extension in the name.

Can I send a mimetype in the API POST? Or does the API determine it based on my incoming file extension?

replied on April 28

The API will use the extension if it exists in the file name, but when you are constructing the request to send the file through the API, you should define the file (in your code) with the mimetype. Something like this: https://stackoverflow.com/questions/55550834/how-to-set-mime-type-for-post-multipart-form-data-in-axios

const data = new FormData();

data.append('action', 'ADD');
data.append('param', 0);
data.append('secondParam', 0);
// The Blob's `type` option is what sets the MIME type on the uploaded file part
data.append('file', new Blob(['test payload'], { type: 'text/csv' }));
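
A rough Python equivalent of that FormData example, using the requests library, would look something like this. The URL and field names come straight from the StackOverflow example and are placeholders, not the Laserfiche endpoint.

import requests

# The 3-tuple is (filename, file bytes, MIME type); the third element plays
# the same role as the Blob's `type` option in the JavaScript snippet above.
response = requests.post(
    "https://example.com/upload",  # placeholder URL
    data={"action": "ADD", "param": 0, "secondParam": 0},
    files={"file": ("test.csv", b"test payload", "text/csv")},
)
print(response.status_code)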

 

replied on April 28

I can pass in the mimetype using the Cloud API?

 

https://api.laserfiche.com/repository/v2/Repositories/r-xxxxx/Entries/5/Folder/Import

 

{
  "name": "filename",
  "autoRename": true,
  "pdfOptions": {
    "generateText": true,
    "generatePages": true,
    "generatePagesImageType": "StandardColor",
    "keepPdfAfterImport": false
  },...
replied on April 28

The request body doesn't directly support it, but @████████ can confirm this.

What I'm saying (and what is missing from your example) is where you are actually attaching the file binary to the request. That part of your code is where the mimetype should come from.

Where are you making this request? Do you own that codebase? Is it Laserfiche to Laserfiche?

replied on April 29

I am making this request from my Python app to Laserfiche Cloud. The Python app is downloading files from an external web app and then importing them into LF.

 

Here's the relevant portion of my code. 
 

            # Prepare import request - use the new folder's ID
            import_url = f"{self.base_url}/Repositories/{self.config.repository_id}/Entries/{folder_entry_id}/Folder/Import"
            
            # Determine content type
            # Check if content is PDF by magic number
            is_pdf = isinstance(file_content, bytes) and file_content.startswith(b'%PDF')
            
            # Set appropriate content type - probably a pdf first, then word
            if is_pdf:
                content_type = 'application/pdf'
                # Add .pdf extension if it's missing
                if not file_name.lower().endswith('.pdf'):
                    file_name = f"{file_name}.pdf"
                print(f"Using PDF content type for {file_name}")
            else:
                # Use magic to detect content type
                content_type = magic.from_buffer(file_content, mime=True)
                if content_type == "application/msword":
                    file_name = f"{file_name}.doc"
                print(f"Using detected content type: {content_type} for {file_name}")

            # Prepare request data
            # Get template name from metadata or default to Building
            template_name = metadata.get("templateName") if metadata else "Building"

            # Base request data structure
            base_request = {
                "name": file_name,
                "autoRename": True,
                "pdfOptions": {
                    "generateText": True,
                    "generatePages": True,
                    "generatePagesImageType": "StandardColor", 
                    "keepPdfAfterImport": False
                }
            }

            # Prepare multipart form data
            files = {
                # (filename, file bytes, MIME type) - the third element sets the
                # Content-Type of this part of the multipart request
                'file': (file_name, file_content, content_type)
            }
            
            # Send import request
            print(f"Sending import request to: {import_url}")
            response = requests.post(
                import_url,
                headers=self._get_headers(),
                files=files,
                data={
                    'request': json.dumps(request_data)
                }
            )

 

Based on what you are saying, passing in the contentType with 'files' and packaging it with the POST should suffice.

I am seeing an error in my code also...I don't believe I am explicitly defining contentType. 
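
For example, I'm guessing I need something like this so content_type always ends up with a value before the files tuple is built (a rough sketch; the generic fallback type is just my assumption):

import magic  # python-magic, same library as in the snippet above

def detect_content_type(file_content: bytes) -> str:
    # Start from a generic default so content_type is always defined
    content_type = "application/octet-stream"
    if isinstance(file_content, bytes) and file_content.startswith(b"%PDF"):
        content_type = "application/pdf"
    elif file_content:
        content_type = magic.from_buffer(file_content, mime=True) or content_type
    return content_type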

 

replied on April 29

Hmm, I don't know if the declaration of content_type is your problem, because Python if statements don't create a new scope; a variable assigned inside one is still scoped to the enclosing function or module.

  1. You can print content_type to see what MIME type is being set (see the sketch below).
  2. Look at repository > details pane/tab > show more:
    1. When importing into the repository, does the full file appear? i.e., a 1MB file, not a 0KB file.
    2. What MIME type shows up in the repository in the details pane?
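
For step 1, something along these lines would surface both the MIME type being sent and the API's answer. The parameter names just mirror your earlier snippet; they are placeholders, not confirmed API details.

import json
import requests

def debug_import(import_url, headers, base_request, file_name, file_content, content_type):
    # Show exactly what is about to be sent
    print(f"Sending {file_name!r} as {content_type!r} ({len(file_content)} bytes)")
    response = requests.post(
        import_url,
        headers=headers,
        files={"file": (file_name, file_content, content_type)},
        data={"request": json.dumps(base_request)},
    )
    # The response body usually says why an import was rejected
    print(response.status_code)
    print(response.text[:500])
    return response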