Extract PDF Metadata

The pdf-info endpoint returns metadata from a PDF document.

In this tutorial, we use the pdf-info endpoint to extract a PDF's metadata and return it as a JSON document. We first call the pdf-info REST endpoint directly using cURL.

We then use the DynamicPDF API client libraries (C#, Node.js, Java, PHP, Go, and Python) to invoke the REST endpoint programmatically.

Required Resources

To complete this tutorial, you must add the Get Pdf Information sample to the samples folder in your cloud storage space using the File Manager. After adding the sample resources, you should see a samples/get-pdf-info-pdf-info-endpoint folder containing the resources for this tutorial.

Sample	Sample Folder	Resources
Get Pdf Information	`samples/get-pdf-info-pdf-info-endpoint`	`fw4.pdf`

From the File Manager, download fw4.pdf to your local system; this tutorial assumes c:/temp/dynamicpdf-api-samples/get-pdf-info.
After downloading, delete the documents and instructions from your cloud storage space using the File Manager.

Resource	Cloud/Local
`fw4.pdf`	local

tip

See Sample Resources for instructions on adding sample resources.

Obtaining API Key

This tutorial assumes a valid API key obtained from the DynamicPDF API's Portal. Refer to the following for instructions on getting an API key.

Apps and API Keys

tip

If you are not familiar with the File Manager or Apps and API Keys, refer to the following tutorial and relevant Users Guide pages.

Make Request Using API

Let's begin by invoking the pdf-info REST endpoint directly using cURL.

Create and execute the following cURL command.

curl -X POST "https://api.dpdf.io/v1.0/pdf-info" \
  -H "Authorization: Bearer DP.xxx-api-key-xxx" \
  -H "Content-Type: application/pdf" \
  --data-binary "@c:/temp/dynamicpdf-api-samples/get-pdf-info/fw4.pdf"

The cURL command makes a POST call to the pdf-info endpoint, passing the Authorization and Content-Type headers. It also sends the fw4.pdf file as binary data. The endpoint then returns a JSON document containing metadata describing the PDF.
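
The same request can be sketched in Python using only the standard library. This is a minimal sketch rather than part of any DynamicPDF client library: the function names `build_pdf_info_request` and `send_pdf_info_request` are hypothetical, and the URL and headers simply mirror the cURL command. Building the request separately from sending it makes the headers easy to inspect before any network call is made.

```python
import urllib.request

# Endpoint URL taken from the cURL command in this tutorial.
PDF_INFO_URL = "https://api.dpdf.io/v1.0/pdf-info"


def build_pdf_info_request(api_key: str, pdf_bytes: bytes) -> urllib.request.Request:
    """Build (but do not send) a POST request equivalent to the cURL command."""
    return urllib.request.Request(
        PDF_INFO_URL,
        data=pdf_bytes,  # raw PDF bytes, like curl's --data-binary
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/pdf",
        },
        method="POST",
    )


def send_pdf_info_request(api_key: str, pdf_path: str) -> str:
    """Read the PDF from disk, send it, and return the response body as text."""
    with open(pdf_path, "rb") as pdf_file:
        request = build_pdf_info_request(api_key, pdf_file.read())
    # urlopen raises urllib.error.HTTPError for non-2xx responses.
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

Calling `send_pdf_info_request` with a valid API key and the local path to fw4.pdf should return the JSON document as a string, assuming the endpoint behaves as it does for the cURL command.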

Examine API Response

After executing the cURL command you should see the following JSON metadata.

{
    "author": "SE:W:CAR:MP",
    "subject": "Employee's Withholding Certificate",
    "keywords": "Fillable",
    "creator": "Adobe LiveCycle Designer ES 9.0",
    "producer": "Adobe LiveCycle Designer ES 9.0",
    "title": "2021 Form W-4",
    "pages": [
        {
            "pageNumber": 1,
            "width": 611.976,
            "height": 791.968
        },
        {
            "pageNumber": 2,
            "width": 611.976,
            "height": 791.968
        },
        {
            "pageNumber": 3,
            "width": 611.976,
            "height": 791.968
        },
        {
            "pageNumber": 4,
            "width": 611.976,
            "height": 791.968
        }
    ],
    "formFields": {
        "signatureFields": null,
        "textFields": [
            {
                "name": "topmostSubform[0].Page1[0].Step1a[0].f1_01[0]",
                "value": null,
                "defaultValue": ""
            },
            {
                "name": "topmostSubform[0].Page1[0].Step1a[0].f1_02[0]",
                "value": null,
                "defaultValue": ""
            },
            {
                "name": "topmostSubform[0].Page1[0].Step1a[0].f1_03[0]",
                "value": null,
                "defaultValue": ""
            },
            {
                "name": "topmostSubform[0].Page1[0].Step1a[0].f1_04[0]",
                "value": null,
                "defaultValue": ""
            },
            {
                "name": "topmostSubform[0].Page1[0].f1_05[0]",
                "value": null,
                "defaultValue": ""
            },
            {
                "name": "topmostSubform[0].Page1[0].Step3_ReadOrder[0].f1_06[0]",
                "value": null,
                "defaultValue": ""
            },
            {
                "name": "topmostSubform[0].Page1[0].Step3_ReadOrder[0].f1_07[0]",
                "value": null,
                "defaultValue": ""
            },
            {
                "name": "topmostSubform[0].Page1[0].f1_08[0]",
                "value": null,
                "defaultValue": ""
            },
            {
                "name": "topmostSubform[0].Page1[0].f1_09[0]",
                "value": null,
                "defaultValue": ""
            },
            {
                "name": "topmostSubform[0].Page1[0].f1_10[0]",
                "value": null,
                "defaultValue": ""
            },
            {
                "name": "topmostSubform[0].Page1[0].f1_11[0]",
                "value": null,
                "defaultValue": ""
            },
            {
                "name": "topmostSubform[0].Page1[0].f1_13[0]",
                "value": null,
                "defaultValue": ""
            },
            {
                "name": "topmostSubform[0].Page1[0].f1_14[0]",
                "value": null,
                "defaultValue": ""
            },
            {
                "name": "topmostSubform[0].Page1[0].f1_15[0]",
                "value": null,
                "defaultValue": ""
            },
            {
                "name": "topmostSubform[0].Page3[0].f3_01[0]",
                "value": null,
                "defaultValue": ""
            },
            {
                "name": "topmostSubform[0].Page3[0].f3_02[0]",
                "value": null,
                "defaultValue": ""
            },
            {
                "name": "topmostSubform[0].Page3[0].f3_03[0]",
                "value": null,
                "defaultValue": ""
            },
            {
                "name": "topmostSubform[0].Page3[0].f3_04[0]",
                "value": null,
                "defaultValue": ""
            },
            {
                "name": "topmostSubform[0].Page3[0].f3_05[0]",
                "value": null,
                "defaultValue": ""
            },
            {
                "name": "topmostSubform[0].Page3[0].f3_06[0]",
                "value": null,
                "defaultValue": ""
            },
            {
                "name": "topmostSubform[0].Page3[0].f3_07[0]",
                "value": null,
                "defaultValue": ""
            },
            {
                "name": "topmostSubform[0].Page3[0].f3_08[0]",
                "value": null,
                "defaultValue": ""
            },
            {
                "name": "topmostSubform[0].Page3[0].f3_09[0]",
                "value": null,
                "defaultValue": ""
            },
            {
                "name": "topmostSubform[0].Page3[0].f3_10[0]",
                "value": null,
                "defaultValue": ""
            },
            {
                "name": "topmostSubform[0].Page3[0].f3_11[0]",
                "value": null,
                "defaultValue": ""
            }
        ],
        "choiceFields": null,
        "buttonFields": [
            {
                "name": "topmostSubform[0].Page1[0].c1_1[0]",
                "type": "checkBox",
                "value": null,
                "defaultValue": "",
                "exportValue": "1",
                "exportValues": null
            },
            {
                "name": "topmostSubform[0].Page1[0].c1_1[1]",
                "type": "checkBox",
                "value": null,
                "defaultValue": "",
                "exportValue": "2",
                "exportValues": null
            },
            {
                "name": "topmostSubform[0].Page1[0].c1_1[2]",
                "type": "checkBox",
                "value": null,
                "defaultValue": "",
                "exportValue": "3",
                "exportValues": null
            },
            {
                "name": "topmostSubform[0].Page1[0].Step2c[0].c1_2[0]",
                "type": "checkBox",
                "value": null,
                "defaultValue": "",
                "exportValue": "1",
                "exportValues": null
            }
        ],
        "pushButtons": null,
        "multiSelectListBoxFields": null
    },
    "customProperties": null,
    "xmpMetaData": true,
    "signed": false,
    "tagged": true
}
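
Because the response is ordinary JSON, it is straightforward to post-process. The following Python sketch summarizes a parsed response: it counts the pages, converts the first page's size to inches, and collects the form field names by group. The helper name `summarize_pdf_info` is hypothetical, and the conversion assumes the width and height are expressed in PDF points (1/72 inch), which the response itself does not state. Field groups that are null in the JSON, such as signatureFields, are treated as empty lists.

```python
import json


def summarize_pdf_info(info: dict) -> dict:
    """Summarize a parsed pdf-info response (hypothetical helper)."""
    pages = info.get("pages") or []
    form_fields = info.get("formFields") or {}

    # Null groups (e.g. "signatureFields": null) become empty lists.
    field_names = {
        group: [field["name"] for field in (fields or [])]
        for group, fields in form_fields.items()
    }

    first_page_inches = None
    if pages:
        # Assumption: dimensions are PDF points, 72 points per inch.
        first_page_inches = (
            round(pages[0]["width"] / 72, 2),
            round(pages[0]["height"] / 72, 2),
        )

    return {
        "title": info.get("title"),
        "page_count": len(pages),
        "first_page_inches": first_page_inches,
        "field_names": field_names,
    }


# A trimmed copy of the tutorial's response, for illustration only.
sample = json.loads("""
{
  "title": "2021 Form W-4",
  "pages": [{"pageNumber": 1, "width": 611.976, "height": 791.968}],
  "formFields": {
    "signatureFields": null,
    "textFields": [
      {"name": "topmostSubform[0].Page1[0].f1_05[0]", "value": null, "defaultValue": ""}
    ],
    "buttonFields": [
      {"name": "topmostSubform[0].Page1[0].c1_1[0]", "type": "checkBox"}
    ]
  }
}
""")
summary = summarize_pdf_info(sample)
print(summary["page_count"], summary["first_page_inches"])  # prints: 1 (8.5, 11.0)
```

Under the points assumption, each page of fw4.pdf works out to roughly 8.5 by 11 inches.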

Calling Endpoint Using Client Library

Although calling the pdf-info endpoint directly is straightforward, you can also use one of the DynamicPDF API client libraries.

Complete Source

You can access the complete source for this project at one of the following GitHub projects.

Language	File Name	Location (package/namespace/etc.)	GitHub Project
Java	`GetPdfInfo.java`	`com.dynamicpdf.api.examples`	https://github.com/dynamicpdf-api/java-client-examples
C#	`Program.cs`	`GetPdfInfo`	https://github.com/dynamicpdf-api/dotnet-client-examples
Node.js	`GetPdfInfo.js`	`nodejs-client-examples`	https://github.com/dynamicpdf-api/nodejs-client-examples
PHP	`GetPdfInfo.php`	`php-client-examples`	https://github.com/dynamicpdf-api/php-client-examples
Go	`pdf-info-example.go`	`go-client-examples`	https://github.com/dynamicpdf-api/go-client-examples/tree/main
Python	`PdfInfoExample.py`	`python-client-examples`	https://github.com/dynamicpdf-api/python-client-examples

tip

Click on the language tab of choice to view the tutorial steps for the particular language.

Available on NuGet:

Install-Package DynamicPDF.API

Create a new Visual Studio C# Console App (.NET Core) project named GetPdfInfo.
Add the DynamicPDF.API NuGet package to the project.
Create a new static method named Run that takes the API key and base path as strings.
Add the call to Run to Main.
Create a new PdfResource instance that takes the path to the PDF.
Create a new PdfInfo instance and pass the PdfResource instance to the constructor.
Add a call to the endpoint using the PdfInfo instance's Process method.
Check that the call was successful and if successful, then write the PDF information (as JSON metadata) to the console.

using DynamicPDF.Api;
using System;

namespace GetPdfInfo
{
    class Program
    {
        static void Main(string[] args)
        {
            Run("DP.xxx-api-key-xxx", "c:/temp/dynamicpdf-api-samples/get-pdf-info/");
        }

        public static void Run(String apiKey, String basePath)
        {
            PdfResource resource = new PdfResource(basePath + "fw4.pdf");
            PdfInfo pdfInfo = new PdfInfo(resource);
            pdfInfo.ApiKey = apiKey;
            PdfInfoResponse response = pdfInfo.Process();

            if (response.IsSuccessful)
            {
                Console.WriteLine(response.JsonContent);
            } else
            {
                Console.WriteLine(response.ErrorJson);
            }
        }
    }
}

Available on NPM:

npm i @dynamicpdf/api

Use npm to install the DynamicPDF API module.
Create a new class named GetPdfInfo.
Create a static Run method.
In the Run method, create a new PdfResource instance that takes the path to the PDF.
Create a new PdfInfo instance and pass the PdfResource instance to the constructor.
Add a call to the endpoint using the PdfInfo instance's process method.
Check that the call was successful and if successful, then write the PDF information (as JSON metadata) to the console.

import {
    PdfResource,
    PdfInfo
} from "@dynamicpdf/api"

export class GetPdfInfo {
    static async Run() {
        var resource = new PdfResource("./Resources/client-libraries-examples/fw4.pdf");
        var pdfInfo = new PdfInfo(resource);
        pdfInfo.apiKey = "xxxx--apikey--xxxx";
        var res = await pdfInfo.process();

        if (res.isSuccessful) {
            console.log(JSON.parse(res.content));
        } else {
            console.log(res.errorJson);
        }
    }
}
await GetPdfInfo.Run();

Run the application with node GetPdfInfo.js and the PDF metadata is written to the console.

Available on Maven:

https://search.maven.org/search?q=g:com.dynamicpdf.api

<dependency>
  <groupId>com.dynamicpdf.api</groupId>
  <artifactId>dynamicpdf-api</artifactId>
  <version>1.0.0</version>
</dependency>

Create a new Maven project and add the DynamicPDF API as a dependency.
Create a new class named GetPdfInfo with a main method.
Create a new method named Run.
Add the Run method call to main.
In the Run method, create a new PdfResource instance that takes the path to the PDF.
Create a new PdfInfo instance and pass the PdfResource instance to the constructor.
Add a call to the endpoint using the PdfInfo instance's process method.
Check that the call was successful and if successful, then write the PDF information (as JSON metadata) to the console.

package com.dynamicpdf.api.examples;

import com.dynamicpdf.api.PdfInfo;
import com.dynamicpdf.api.PdfInfoResponse;
import com.dynamicpdf.api.PdfResource;

public class GetPdfInfo {
    public static void main(String[] args) {
        GetPdfInfo.Run("DP.xxx-api-key-xxx",
                "C:/temp/dynamicpdf-api-samples/get-pdf-info/");
    }

    public static void Run(String apiKey, String basePath) {
        PdfResource resource = new PdfResource(basePath + "fw4.pdf");
        PdfInfo pdfInfo = new PdfInfo(resource);
        pdfInfo.setApiKey(apiKey);
        PdfInfoResponse response = pdfInfo.process();
        if (response.getIsSuccessful()) {
            System.out.println(response.getJsonContent());
        } else {
            System.out.println(response.getErrorJson());
        }
    }
}

Available as a Composer package:

composer require dynamicpdf/api

Use composer to ensure you have the required PHP libraries.
Create a new class named GetPdfInfo.
Add a Run method.
In the Run method, create a new PdfResource instance that takes the path to the PDF.
Create a new PdfInfo instance and pass the PdfResource instance to the constructor.
Add a call to the endpoint using the PdfInfo instance's Process method.
Check that the call was successful and if successful, then write the PDF information (as JSON metadata) to the console.

<?php

require __DIR__ . '/vendor/autoload.php';

use DynamicPDF\Api\PdfResource;
use DynamicPDF\Api\PdfInfo;

class GetPdfInfo
{
    private static string $BasePath = "C:/temp/dynamicpdf-api-samples/get-pdf-info/";

    public static function Run()
    {
        $resource = new PdfResource(GetPdfInfo::$BasePath . "fw4.pdf");
        $pdfInfo = new PdfInfo($resource);
        $pdfInfo->ApiKey = "DP.xxx-api-key-xxx";
        $response = $pdfInfo->Process();
        if ($response->IsSuccessful) {
            echo ($response->JsonContent);
        } else {
            echo (json_encode($response));
        }
    }
}
GetPdfInfo::Run();

Run the application php GetPdfInfo.php and the PDF metadata is written to the console.

Available as a Go package: https://pkg.go.dev/github.com/dynamicpdf-api/go-client

Ensure you have the required Go libraries.
Create a new file named pdf-info-example.go.
Add a main function.
In the main function, create a new PdfResource instance that takes the path to the PDF.
Create a new PdfInfo instance and pass the PdfResource instance to the constructor.
Add a call to the endpoint using the PdfInfo instance's Process method.
Check that the call was successful and if successful, then write the PDF information (as JSON metadata) to the console.
Run the application go run pdf-info-example.go and the PDF metadata is written to the console.

package main

import (
	"fmt"

	"github.com/dynamicpdf-api/go-client/endpoint"
	"github.com/dynamicpdf-api/go-client/resource"
)

func main() {
	// Load the PDF from the local file system.
	pdfResource := resource.NewPdfResourceWithResourcePath("C:/temp/dynamicpdf-api-samples/get-pdf-info/fw4.pdf", "fw4.pdf")
	pdfInfo := endpoint.NewPdfInfoResource(pdfResource)

	pdfInfo.Endpoint.BaseUrl = "https://api.dpdf.io/"
	pdfInfo.Endpoint.ApiKey = "DP.xxx-api-key-xxx"

	// Process returns a channel; receive the response from it.
	resp := pdfInfo.Process()
	res := <-resp

	// Print the JSON metadata only if the call succeeded.
	if res.IsSuccessful() {
		fmt.Print(string(res.Content().Bytes()))
	}
}

Available on PyPI: pip install dynamicpdf-api

Ensure you have the required Python libraries.
Create a new file named PdfInfoExample.py.
Add a run method.
In the run method, create a new PdfResource instance that takes the path to the PDF.
Create a new PdfInfo instance and pass the PdfResource instance to the constructor.
Add a call to the endpoint using the PdfInfo instance's process method.
Check that the call was successful and if successful, then write the PDF information (as JSON metadata) to the console.
Run the application python PdfInfoExample.py and the PDF metadata is written to the console.

from dynamicpdf_api.pdf_info import PdfInfo
from dynamicpdf_api.pdf_resource import PdfResource
import pprint
import json

def run(api_key, base_path):
    # Load the tutorial's sample PDF from the local file system.
    resource = PdfResource(base_path + "fw4.pdf")
    pdf_info = PdfInfo(resource)
    pdf_info.api_key = api_key
    response = pdf_info.process()
    # Parse the returned JSON and pretty-print it to the console.
    print(pprint.pformat(json.loads(response.json_content)))


if __name__ == "__main__":
    api_key = 'DP.xxx-api-key-xxx'
    base_path = "C:/temp/dynamicpdf-api-samples/get-pdf-info/"
    run(api_key, base_path)

In all six languages, the steps were similar. First, we created a new PdfResource instance by passing the path to the PDF to the constructor. Next, we created a new instance of the PdfInfo class, which abstracts the pdf-info endpoint. Then we called the PdfInfo instance's Process method to send the PDF to the endpoint. Finally, we printed the resultant JSON to the console.
