Serverless PDF Generation from HTML (WYSIWYG as PDF)

Prerequisite:

How do you generate a PDF?

Puppeteer:

Puppeteer is a Node library that provides a high-level API to control Chrome or Chromium over the DevTools Protocol. By default, it launches the browser in headless mode and can do everything a modern browser can, including rendering HTML with CSS. It is released under the Apache License 2.0, which permits commercial use.

Show me the code

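const puppeteer = require('puppeteer');

// Renders the page at pageUrl in headless Chrome and saves it as a PDF under /tmp.
const generatePDF = async (pageUrl, newPdfFileName) => {
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    // Treat navigation as finished once there have been no network connections for 500 ms.
    await page.goto(pageUrl, {
      waitUntil: 'networkidle0'
    });
    await page.pdf({
      path: `/tmp/${newPdfFileName}`,
      format: 'a4',
      printBackground: true // include CSS backgrounds in the PDF
    });
  } finally {
    // Always close the browser, even if navigation or rendering fails.
    await browser.close();
  }
};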

What does the above code do?

Sample Usage:

generatePDF('https://gsswain.com/resume', 'GSSwain-Resume.pdf');

Do I need to host the HTML pages publicly?
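Puppeteer can also render markup that is not served from any URL: page.setContent loads an HTML string directly into the page. The following is a minimal sketch of that approach, not code from the repository. The function name generatePDFFromHtml and its parameters are hypothetical, and it assumes the caller passes in a browser that was already launched with Puppeteer.

```javascript
// Hypothetical helper: renders an HTML string to a PDF file using an
// already-launched Puppeteer browser (passed in by the caller).
const generatePDFFromHtml = async (browser, html, outputPath) => {
  const page = await browser.newPage();
  try {
    // setContent loads the markup directly, so no URL has to be reachable.
    await page.setContent(html, { waitUntil: 'networkidle0' });
    await page.pdf({ path: outputPath, format: 'a4', printBackground: true });
  } finally {
    // Close the page even if rendering fails; the caller owns the browser.
    await page.close();
  }
  return outputPath;
};
```

A caller would launch the browser with puppeteer.launch(), await generatePDFFromHtml(browser, '<h1>Hello</h1>', '/tmp/hello.pdf'), and then close the browser. Passing the browser in keeps the helper reusable across several documents in one session.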

How does this solution help?

Have you used this in production?

Generating PDF from HTML on AWS Lambda

  • The Lambda sits behind API Gateway. The API accepts a request payload in JSON format. This payload must contain a URL for which a PDF needs to be generated. The Lambda generates the required PDF and puts it in an S3 bucket. Finally, the API responds with a 201 HTTP status code and a location header containing the S3 object URL. (To support CORS, the S3 URL needs to be returned in the response body as well.)
  • For this example, the S3 bucket has a bucket policy which only allows public access to objects tagged with public=yes. (Ideally, you should block public access to S3 and share either an S3 presigned URL or a CloudFront signed URL.)
  • A usage plan restricts the maximum number of requests a client can make, and an API key is mandatory to access the API.
  • The Puppeteer dependency is put into a Lambda Layer. Instead of the puppeteer npm package, we need to use chrome-aws-lambda, as recommended in the Puppeteer troubleshooting guide.
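The request flow described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the repository's code: createHandler, buildPutObjectParams, generatePdf and storePdf are hypothetical names, and the PDF rendering and S3 upload are injected as functions so the sketch runs without Puppeteer or the AWS SDK. The response shape (statusCode, headers, body) is the one API Gateway expects from a Lambda proxy integration.

```javascript
// Hypothetical helper: parameters for an S3 PutObject call. The Tagging value
// matches the public=yes tag that the bucket policy in this example checks.
const buildPutObjectParams = (bucket, key, pdfBuffer) => ({
  Bucket: bucket,
  Key: key,
  Body: pdfBuffer,
  ContentType: 'application/pdf',
  Tagging: 'public=yes'
});

// Builds a Lambda handler from two injected functions (both assumptions):
//   generatePdf(url)   resolves to the local path of the rendered PDF
//   storePdf(filePath) resolves to the URL of the uploaded S3 object
const createHandler = ({ generatePdf, storePdf }) => async (event) => {
  let url;
  try {
    // With proxy integration, the request payload arrives as a JSON string.
    const parsed = new URL(JSON.parse(event.body).url);
    if (!['http:', 'https:'].includes(parsed.protocol)) {
      throw new Error('Unsupported protocol');
    }
    url = parsed.toString();
  } catch (err) {
    return {
      statusCode: 400,
      body: JSON.stringify({ message: 'A valid http(s) "url" is required' })
    };
  }

  const filePath = await generatePdf(url);
  const pdfUrl = await storePdf(filePath);

  return {
    statusCode: 201,
    headers: { location: pdfUrl, 'content-type': 'application/json' },
    // The URL is repeated in the body for CORS clients.
    body: JSON.stringify({ pdfUrl })
  };
};
```

Rejecting anything other than http and https URLs is an extra precaution added in this sketch, so that a request cannot point the headless browser at local files.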

Lambda Source Code

SAM template (template.yaml)

Lambda Handler (pdf-generator/src/app.js)

Pdf Generation Request Handler (pdf-generator/src/pdf-generation-request-handler.js)

Pdf Generation Request Adapter (pdf-generator/src/pdf-generation-request-adapter.js)

Pdf Generation Service (pdf-generator/src/pdf-generation-service.js)

Pdf Storage Service (pdf-generator/src/s3-pdf-storage-service.js)

Pdf Storage Request (pdf-generator/src/pdf-storage-request.js)

S3 PDF Storage Request Adapter (pdf-generator/src/s3-pdf-storage-request-adapter.js)

File Service (pdf-generator/src/file-service.js)

Pdf Generation Response Adapter (pdf-generator/src/pdf-generation-response-adapter.js)

Config (pdf-generator/src/config.js)

package.json (pdf-generator/package.json)

build.sh

.npmignore

Lambda Layer package.json (dependencies/nodejs/node14/package.json)

main.yml (.github/workflows/main.yml)

Parameters for sam local (sam-local-env.json)

Build on Local:

Test on Local:

Start the API

Sample Request:

curl -i -X POST \
http://127.0.0.1:3000/generate-pdf/ \
-H 'cache-control: no-cache' \
-H 'content-type: application/json' \
-d '{
"url": "https://gsswain.com/resume"
}'

Sample Response:

HTTP/1.0 201 CREATED
Content-Type: application/json
Access-Control-Allow-Origin: http://localhost:8080
location: https://dummyS3Url/3c8e8e4c-f7b4-4f5b-8e85-8dc06f8a22c3_1615739992609_353075938260911200.pdf
Content-Length: 105
Server: Werkzeug/1.0.1 Python/3.8.8
Date: Sun, 14 Mar 2021 16:39:57 GMT
{"pdfUrl":"https://dummyS3Url/3c8e8e4c-f7b4-4f5b-8e85-8dc06f8a22c3_1615739992609_353075938260911200.pdf"}
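The file name in the sample response appears to combine a UUID, a millisecond timestamp and a random number, which keeps concurrent requests for the same page from overwriting each other's output. The snippet below is a sketch of one way to produce names in that shape. The pattern is inferred from the sample only, and the helper name uniquePdfFileName is hypothetical.

```javascript
const { randomUUID } = require('crypto');

// Hypothetical helper: builds "<uuid>_<epoch millis>_<random integer>.pdf".
// The timestamp can be passed in, which makes the result easier to test.
const uniquePdfFileName = (now = Date.now()) =>
  `${randomUUID()}_${now}_${Math.floor(Math.random() * 1e18)}.pdf`;
```

Note that crypto.randomUUID requires Node 14.17 or later.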

Deploy on AWS:

Test the deployed solution

Sample Request:

curl -i -X POST \
https://<YOUR-API-ID>.execute-api.ap-southeast-2.amazonaws.com/Prod/generate-pdf/ \
-H 'cache-control: no-cache' \
-H 'content-type: application/json' \
-H 'x-api-key: <YOUR-API-KEY_GOES_HERE>' \
-d '{
"url": "https://gsswain.com/resume"
}'

Sample Response:

HTTP/2 201
content-type: application/json
content-length: 171
location: <GENERATED_PDF_URL>
date: Sun, 14 Mar 2021 15:26:36 GMT
x-amzn-requestid: 8aa9cd12-aaa9-4628-96db-e2d7b952feb5
access-control-allow-origin: https://gsswain.com
x-amz-apigw-id: cLuuTFlYSwMF5dw=
x-amzn-trace-id: Root=1-604e2b28-1cff94e27f34e4e35ec1a6e8;Sampled=0
{"pdfUrl":"<GENERATED_PDF_URL>"}

Cleanup:

Delete all objects in S3 bucket

Delete the CloudFormation Stack

Summary:

References:

GSSwain


Technology enthusiast, learner, builder