Extract PDF Metadata
The pdf-info endpoint returns metadata from a PDF document.
In this tutorial, we use the pdf-info endpoint to fetch metadata from a PDF document and return that metadata as a JSON document. We first call the pdf-info REST endpoint directly using cURL. We then invoke the endpoint programmatically using the DynamicPDF API client libraries for C#, Java, Node.js, PHP, Go, and Python.
Required Resources
To complete this tutorial, you must add the Get Pdf Information sample to your samples folder in your cloud storage space using the File Manager. After adding the sample resources, you should see a samples/get-pdf-info-pdf-info-endpoint folder containing the resources for this tutorial.
Sample | Sample Folder | Resources |
---|---|---|
Get Pdf Information | samples/get-pdf-info-pdf-info-endpoint | fw4.pdf |
- From the File Manager, download fw4.pdf to your local system; here we assume /temp/dynamicpdf-api-samples/get-pdf-info.
- After downloading, delete the documents and instructions from your cloud storage space using the File Manager.
Resource | Cloud/Local |
---|---|
fw4.pdf | local |
See Sample Resources for instructions on adding sample resources.
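Before calling the endpoint, it can help to confirm that the downloaded file is where the tutorial expects it and that it looks like a PDF. The following is a minimal sketch, not part of the DynamicPDF API: the helper name looks_like_pdf is hypothetical, and the check relies only on the fact that a well-formed PDF file starts with the bytes %PDF-.

```python
from pathlib import Path

# A well-formed PDF file starts with this header.
PDF_MAGIC = b"%PDF-"


def looks_like_pdf(path):
    """Return True if `path` is an existing file that starts with the PDF header.

    Hypothetical helper for this tutorial; it does not validate the whole file.
    """
    candidate = Path(path)
    if not candidate.is_file():
        return False
    with candidate.open("rb") as handle:
        return handle.read(len(PDF_MAGIC)) == PDF_MAGIC
```

For example, looks_like_pdf("c:/temp/dynamicpdf-api-samples/get-pdf-info/fw4.pdf") should return True once the sample has been downloaded to the folder assumed above.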
Obtaining API Key
This tutorial assumes a valid API key obtained from the DynamicPDF API's Portal. If you are not familiar with the File Manager or Apps and API Keys, refer to the relevant tutorial and Users Guide pages for instructions on getting an API key.
Make Request Using API
Let's begin by invoking the pdf-info REST endpoint directly using cURL.
- Create and execute the following cURL command.
curl -X POST "https://api.dpdf.io/v1.0/pdf-info" \
  -H "Authorization: Bearer DP.xxx-api-key-xxx" \
  -H "Content-Type: application/pdf" \
  --data-binary "@c:/temp/dynamicpdf-api-samples/get-pdf-info/fw4.pdf"
The cURL command makes a POST call to the pdf-info endpoint, passing the Authorization and Content-Type headers. It also sends the fw4.pdf file as binary data. The endpoint then returns a JSON document containing metadata describing the PDF.
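The same request can be expressed in code. The sketch below mirrors the cURL command using only Python's standard library; the function name build_pdf_info_request is hypothetical, and the block only builds the request object rather than sending it, so it can be inspected without a network call or a real API key.

```python
import urllib.request

# Endpoint URL taken from the cURL command above.
PDF_INFO_URL = "https://api.dpdf.io/v1.0/pdf-info"


def build_pdf_info_request(api_key, pdf_bytes):
    """Build (but do not send) a POST request equivalent to the cURL command.

    Hypothetical helper: `api_key` is the DynamicPDF API key and `pdf_bytes`
    is the raw content of the PDF, sent as the request body.
    """
    return urllib.request.Request(
        PDF_INFO_URL,
        data=pdf_bytes,
        method="POST",
        headers={
            "Authorization": "Bearer " + api_key,
            "Content-Type": "application/pdf",
        },
    )
```

To send it, read fw4.pdf in binary mode, pass the bytes to build_pdf_info_request, and hand the result to urllib.request.urlopen; the response body should then be the JSON metadata document.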
Examine API Response
- After executing the cURL command you should see the following JSON metadata.
{
"author": "SE:W:CAR:MP",
"subject": "Employee's Withholding Certificate",
"keywords": "Fillable",
"creator": "Adobe LiveCycle Designer ES 9.0",
"producer": "Adobe LiveCycle Designer ES 9.0",
"title": "2021 Form W-4",
"pages": [
{
"pageNumber": 1,
"width": 611.976,
"height": 791.968
},
{
"pageNumber": 2,
"width": 611.976,
"height": 791.968
},
{
"pageNumber": 3,
"width": 611.976,
"height": 791.968
},
{
"pageNumber": 4,
"width": 611.976,
"height": 791.968
}
],
"formFields": {
"signatureFields": null,
"textFields": [
{
"name": "topmostSubform[0].Page1[0].Step1a[0].f1_01[0]",
"value": null,
"defaultValue": ""
},
{
"name": "topmostSubform[0].Page1[0].Step1a[0].f1_02[0]",
"value": null,
"defaultValue": ""
},
{
"name": "topmostSubform[0].Page1[0].Step1a[0].f1_03[0]",
"value": null,
"defaultValue": ""
},
{
"name": "topmostSubform[0].Page1[0].Step1a[0].f1_04[0]",
"value": null,
"defaultValue": ""
},
{
"name": "topmostSubform[0].Page1[0].f1_05[0]",
"value": null,
"defaultValue": ""
},
{
"name": "topmostSubform[0].Page1[0].Step3_ReadOrder[0].f1_06[0]",
"value": null,
"defaultValue": ""
},
{
"name": "topmostSubform[0].Page1[0].Step3_ReadOrder[0].f1_07[0]",
"value": null,
"defaultValue": ""
},
{
"name": "topmostSubform[0].Page1[0].f1_08[0]",
"value": null,
"defaultValue": ""
},
{
"name": "topmostSubform[0].Page1[0].f1_09[0]",
"value": null,
"defaultValue": ""
},
{
"name": "topmostSubform[0].Page1[0].f1_10[0]",
"value": null,
"defaultValue": ""
},
{
"name": "topmostSubform[0].Page1[0].f1_11[0]",
"value": null,
"defaultValue": ""
},
{
"name": "topmostSubform[0].Page1[0].f1_13[0]",
"value": null,
"defaultValue": ""
},
{
"name": "topmostSubform[0].Page1[0].f1_14[0]",
"value": null,
"defaultValue": ""
},
{
"name": "topmostSubform[0].Page1[0].f1_15[0]",
"value": null,
"defaultValue": ""
},
{
"name": "topmostSubform[0].Page3[0].f3_01[0]",
"value": null,
"defaultValue": ""
},
{
"name": "topmostSubform[0].Page3[0].f3_02[0]",
"value": null,
"defaultValue": ""
},
{
"name": "topmostSubform[0].Page3[0].f3_03[0]",
"value": null,
"defaultValue": ""
},
{
"name": "topmostSubform[0].Page3[0].f3_04[0]",
"value": null,
"defaultValue": ""
},
{
"name": "topmostSubform[0].Page3[0].f3_05[0]",
"value": null,
"defaultValue": ""
},
{
"name": "topmostSubform[0].Page3[0].f3_06[0]",
"value": null,
"defaultValue": ""
},
{
"name": "topmostSubform[0].Page3[0].f3_07[0]",
"value": null,
"defaultValue": ""
},
{
"name": "topmostSubform[0].Page3[0].f3_08[0]",
"value": null,
"defaultValue": ""
},
{
"name": "topmostSubform[0].Page3[0].f3_09[0]",
"value": null,
"defaultValue": ""
},
{
"name": "topmostSubform[0].Page3[0].f3_10[0]",
"value": null,
"defaultValue": ""
},
{
"name": "topmostSubform[0].Page3[0].f3_11[0]",
"value": null,
"defaultValue": ""
}
],
"choiceFields": null,
"buttonFields": [
{
"name": "topmostSubform[0].Page1[0].c1_1[0]",
"type": "checkBox",
"value": null,
"defaultValue": "",
"exportValue": "1",
"exportValues": null
},
{
"name": "topmostSubform[0].Page1[0].c1_1[1]",
"type": "checkBox",
"value": null,
"defaultValue": "",
"exportValue": "2",
"exportValues": null
},
{
"name": "topmostSubform[0].Page1[0].c1_1[2]",
"type": "checkBox",
"value": null,
"defaultValue": "",
"exportValue": "3",
"exportValues": null
},
{
"name": "topmostSubform[0].Page1[0].Step2c[0].c1_2[0]",
"type": "checkBox",
"value": null,
"defaultValue": "",
"exportValue": "1",
"exportValues": null
}
],
"pushButtons": null,
"multiSelectListBoxFields": null
},
"customProperties": null,
"xmpMetaData": true,
"signed": false,
"tagged": true
}
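A response of this size is easier to work with once it is reduced to the values you care about. The following sketch condenses the JSON into a short summary. The function name summarize_pdf_info and the trimmed SAMPLE document are illustrative, and the conversion to inches assumes that width and height are expressed in PDF points (1/72 inch), which the response itself does not state. Note that empty field groups come back as null, so the code treats null as an empty list.

```python
import json

# Assumption: page sizes are reported in PDF points (1/72 inch).
POINTS_PER_INCH = 72


def summarize_pdf_info(info):
    """Condense a pdf-info response (already parsed into a dict)."""
    form = info.get("formFields") or {}      # formFields may be null
    pages = info.get("pages") or []
    buttons = form.get("buttonFields") or []  # field groups may be null
    sizes = {
        (round(p["width"] / POINTS_PER_INCH, 2),
         round(p["height"] / POINTS_PER_INCH, 2))
        for p in pages
    }
    return {
        "title": info.get("title"),
        "page_count": len(pages),
        "page_sizes_in": sorted(sizes),
        "text_field_count": len(form.get("textFields") or []),
        "checkbox_count": sum(1 for b in buttons if b.get("type") == "checkBox"),
        "signed": info.get("signed"),
    }


# Trimmed copy of the response above, kept short for illustration.
SAMPLE = json.loads("""
{
  "title": "2021 Form W-4",
  "pages": [
    {"pageNumber": 1, "width": 611.976, "height": 791.968},
    {"pageNumber": 2, "width": 611.976, "height": 791.968}
  ],
  "formFields": {
    "signatureFields": null,
    "textFields": [
      {"name": "topmostSubform[0].Page1[0].Step1a[0].f1_01[0]",
       "value": null, "defaultValue": ""}
    ],
    "buttonFields": [
      {"name": "topmostSubform[0].Page1[0].c1_1[0]", "type": "checkBox",
       "value": null, "defaultValue": "", "exportValue": "1",
       "exportValues": null}
    ]
  },
  "signed": false
}
""")
```

Calling summarize_pdf_info(SAMPLE) returns a small dictionary with the title, page count, distinct page sizes, and field counts, which is usually enough to decide how to process the document further.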
Calling Endpoint Using Client Library
Although using the pdf-info endpoint directly is straightforward, you can also use one of the DynamicPDF API client libraries.
Complete Source
You can access the complete source for this project at one of the following GitHub projects.
Language | File Name | Location (package/namespace/etc.) | GitHub Project |
---|---|---|---|
Java | GetPdfInfo.java | com.dynamicpdf.api.examples | https://github.com/dynamicpdf-api/java-client-examples |
C# | Program.cs | GetPdfInfo | https://github.com/dynamicpdf-api/dotnet-client-examples |
Nodejs | GetPdfInfo.js | nodejs-client-examples | https://github.com/dynamicpdf-api/nodejs-client-examples |
PHP | GetPdfInfo.php | php-client-examples | https://github.com/dynamicpdf-api/php-client-examples |
GO | pdf-info-example.go | go-client-examples | https://github.com/dynamicpdf-api/go-client-examples/tree/main |
Python | PdfInfoExample.py | python-client-examples | https://github.com/dynamicpdf-api/python-client-examples |
The tutorial steps for each language follow, in this order: C# (.NET), Node.js, Java, PHP, Go, and Python.
C# (.NET)
Available on NuGet:
Install-Package DynamicPDF.API
- Create a new Visual Studio C# Console App (.NET Core) project named GetPdfInfo.
- Add the DynamicPDF.API NuGet package to the project.
- Create a new static method named Run that takes the API key and base path as strings.
- Add the call to Run to Main.
- Create a new PdfResource instance that takes the path to the PDF.
- Create a new PdfInfo instance and pass the PdfResource instance to the constructor.
- Add a call to the endpoint using the PdfInfo instance's Process method.
- Check that the call was successful and, if so, write the PDF information (as JSON metadata) to the console.
using DynamicPDF.Api;
using System;

namespace GetPdfInfo
{
    class Program
    {
        static void Main(string[] args)
        {
            Run("DP.xxx-api-key-xxx", "c:/temp/dynamicpdf-api-samples/get-pdf-info/");
        }

        public static void Run(String apiKey, String basePath)
        {
            // Load the local PDF and wrap it in a PdfInfo endpoint instance.
            PdfResource resource = new PdfResource(basePath + "fw4.pdf");
            PdfInfo pdfInfo = new PdfInfo(resource);
            pdfInfo.ApiKey = apiKey;

            // Call the pdf-info endpoint and print the JSON metadata or the error.
            PdfInfoResponse response = pdfInfo.Process();
            if (response.IsSuccessful)
            {
                Console.WriteLine(response.JsonContent);
            }
            else
            {
                Console.WriteLine(response.ErrorJson);
            }
        }
    }
}
Node.js
Available on NPM:
npm i @dynamicpdf/api
- Use npm to install the DynamicPDF API module.
- Create a new class named GetPdfInfo.
- Create a static Run method.
- In the Run method, create a new PdfResource instance that takes the path to the PDF.
- Create a new PdfInfo instance and pass the PdfResource instance to the constructor.
- Add a call to the endpoint using the PdfInfo instance's process method.
- Check that the call was successful and, if so, write the PDF information (as JSON metadata) to the console.
import {
    PdfResource,
    PdfInfo
} from "@dynamicpdf/api"

export class GetPdfInfo {
    static async Run() {
        // Load the local PDF and wrap it in a PdfInfo endpoint instance.
        var resource = new PdfResource("c:/temp/dynamicpdf-api-samples/get-pdf-info/fw4.pdf");
        var pdfInfo = new PdfInfo(resource);
        pdfInfo.apiKey = "DP.xxx-api-key-xxx";

        // Call the pdf-info endpoint and print the parsed metadata or the error.
        var res = await pdfInfo.process();
        if (res.isSuccessful) {
            console.log(JSON.parse(res.content));
        } else {
            console.log(res.errorJson);
        }
    }
}
await GetPdfInfo.Run();
- Run the application with node GetPdfInfo.js and the PDF metadata is written to the console.
Java
Available on Maven:
https://search.maven.org/search?q=g:com.dynamicpdf.api
<dependency>
    <groupId>com.dynamicpdf.api</groupId>
    <artifactId>dynamicpdf-api</artifactId>
    <version>1.0.0</version>
</dependency>
- Create a new Maven project and add the DynamicPDF API as a dependency.
- Create a new class named GetPdfInfo with a main method.
- Create a new method named Run.
- Add the Run method call to main.
- In the Run method, create a new PdfResource instance that takes the path to the PDF.
- Create a new PdfInfo instance and pass the PdfResource instance to the constructor.
- Add a call to the endpoint using the PdfInfo instance's process method.
- Check that the call was successful and, if so, write the PDF information (as JSON metadata) to the console.
package com.dynamicpdf.api.examples;

import com.dynamicpdf.api.PdfInfo;
import com.dynamicpdf.api.PdfInfoResponse;
import com.dynamicpdf.api.PdfResource;

public class GetPdfInfo {

    public static void main(String[] args) {
        GetPdfInfo.Run("DP.xxx-api-key-xxx",
                "C:/temp/dynamicpdf-api-samples/get-pdf-info/");
    }

    public static void Run(String apiKey, String basePath) {
        // Load the local PDF and wrap it in a PdfInfo endpoint instance.
        PdfResource resource = new PdfResource(basePath + "fw4.pdf");
        PdfInfo pdfInfo = new PdfInfo(resource);
        pdfInfo.setApiKey(apiKey);

        // Call the pdf-info endpoint and print the JSON metadata or the error.
        PdfInfoResponse response = pdfInfo.process();
        if (response.getIsSuccessful()) {
            System.out.println(response.getJsonContent());
        } else {
            System.out.println(response.getErrorJson());
        }
    }
}
PHP
Available as a Composer package:
composer require dynamicpdf/api
- Use composer to ensure you have the required PHP libraries.
- Create a new class named GetPdfInfo.
- Add a Run method.
- In the Run method, create a new PdfResource instance that takes the path to the PDF.
- Create a new PdfInfo instance and pass the PdfResource instance to the constructor.
- Add a call to the endpoint using the PdfInfo instance's Process method.
- Check that the call was successful and, if so, write the PDF information (as JSON metadata) to the console.
<?php
require __DIR__ . '/vendor/autoload.php';

use DynamicPDF\Api\PdfResource;
use DynamicPDF\Api\PdfInfo;

class GetPdfInfo
{
    private static string $BasePath = "C:/temp/dynamicpdf-api-samples/get-pdf-info/";

    public static function Run()
    {
        // Load the local PDF and wrap it in a PdfInfo endpoint instance.
        $resource = new PdfResource(GetPdfInfo::$BasePath . "fw4.pdf");
        $pdfInfo = new PdfInfo($resource);
        $pdfInfo->ApiKey = "DP.xxx-api-key-xxx";

        // Call the pdf-info endpoint and print the JSON metadata or the response.
        $response = $pdfInfo->Process();
        if ($response->IsSuccessful) {
            echo ($response->JsonContent);
        } else {
            echo (json_encode($response));
        }
    }
}
GetPdfInfo::Run();
- Run the application with php GetPdfInfo.php and the PDF metadata is written to the console.
Go
Available as a Go package: https://pkg.go.dev/github.com/dynamicpdf-api/go-client
- Ensure you have the required Go libraries.
- Create a new file named pdf-info-example.go.
- Add a main function.
- In the main function, create a new PdfResource instance that takes the path to the PDF.
- Create a new PdfInfo instance and pass the PdfResource instance to the constructor.
- Add a call to the endpoint using the PdfInfo instance's Process method.
- Check that the call was successful and, if so, write the PDF information (as JSON metadata) to the console.
- Run the application with go run pdf-info-example.go and the PDF metadata is written to the console.
package main

import (
	"fmt"

	"github.com/dynamicpdf-api/go-client/endpoint"
	"github.com/dynamicpdf-api/go-client/resource"
)

func main() {
	// Load the local PDF and wrap it in a pdf-info endpoint instance.
	pdfResource := resource.NewPdfResourceWithResourcePath("C:/temp/dynamicpdf-api-samples/get-pdf-info/fw4.pdf", "fw4.pdf")
	pdfInfo := endpoint.NewPdfInfoResource(pdfResource)
	pdfInfo.Endpoint.BaseUrl = "https://api.dpdf.io/"
	pdfInfo.Endpoint.ApiKey = "DP.xxx-api-key-xxx"

	// Process returns a channel; receive the response from it.
	resp := pdfInfo.Process()
	res := <-resp

	// Print the JSON metadata if the call succeeded.
	if res.IsSuccessful() {
		fmt.Print(string(res.Content().Bytes()))
	}
}
Python
Available as a pip package:
pip install dynamicpdf-api
- Ensure you have the required Python libraries.
- Create a new file named PdfInfoExample.py.
- Add a run method.
- In the run method, create a new PdfResource instance that takes the path to the PDF.
- Create a new PdfInfo instance and pass the PdfResource instance to the constructor.
- Add a call to the endpoint using the PdfInfo instance's process method.
- Write the returned PDF information (as JSON metadata) to the console.
- Run the application with python PdfInfoExample.py and the PDF metadata is written to the console.
from dynamicpdf_api.pdf_info import PdfInfo
from dynamicpdf_api.pdf_resource import PdfResource
import pprint
import json


def run(api_key, basePath):
    # Load the local PDF and wrap it in a PdfInfo endpoint instance.
    resource = PdfResource(basePath + "fw4.pdf")
    pdf_info = PdfInfo(resource)
    pdf_info.api_key = api_key
    # Call the pdf-info endpoint and pretty-print the returned JSON metadata.
    response = pdf_info.process()
    print(pprint.pformat(json.loads(response.json_content)))


if __name__ == "__main__":
    api_key = 'DP.xxx-api-key-xxx'
    basePath = "C:/temp/dynamicpdf-api-samples/get-pdf-info/"
    run(api_key, basePath)
In all six languages, the steps were similar. First, we created a new PdfResource instance by passing the path to the PDF to the constructor. Next, we created a new instance of the PdfInfo class, which abstracts the pdf-info endpoint, and passed it the PdfResource instance. Finally, we called the Process method and printed the resulting JSON to the console.
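To make the shared pattern concrete, here is a rough sketch of what a client library does on your behalf: send the PDF to the endpoint, then report either the parsed metadata or the error body. It uses only Python's standard library and is not the DynamicPDF client's actual implementation. The function name process_pdf_info and the opener parameter are hypothetical; the latter exists only so the network call can be swapped out when experimenting.

```python
import json
import urllib.error
import urllib.request


def process_pdf_info(api_key, pdf_bytes, opener=urllib.request.urlopen):
    """Send a PDF to the pdf-info endpoint and return (is_successful, payload).

    On success the payload is the parsed JSON metadata; on an HTTP error it is
    the raw error body as text. `opener` is a hypothetical injection point
    that defaults to urllib.request.urlopen.
    """
    request = urllib.request.Request(
        "https://api.dpdf.io/v1.0/pdf-info",
        data=pdf_bytes,
        method="POST",
        headers={
            "Authorization": "Bearer " + api_key,
            "Content-Type": "application/pdf",
        },
    )
    try:
        with opener(request) as response:
            return True, json.loads(response.read())
    except urllib.error.HTTPError as err:
        # Non-2xx responses raise HTTPError; its body carries the error details.
        return False, err.read().decode("utf-8", errors="replace")
```

Returning a success flag alongside the payload mirrors the success check used in the client library examples, so the calling code can branch in the same way.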