DNSTrail uses iframes as screenshots. Trailcap pre-processes them for us.
To make this fast and secure, there are a few things we need to do:
- The iframe HTML shouldn't have any sensitive information left over from when we captured it, like hidden form elements holding session IDs.
- The iframe HTML file should be completely self-contained (inline styles, images).
- The iframe HTML should not include any JavaScript (and it's displayed with the sandbox attribute anyway).
- The iframe should be as small as possible.
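
The requirements above can be spot-checked before a capture is shipped. This is only a rough sketch: `auditIframeHtml` is a hypothetical helper, not part of Trailcap, the regexes are heuristics rather than a real HTML parse, and the default size budget is an arbitrary assumption.

```javascript
// Hypothetical helper: heuristic checks for the four requirements above.
// Regex matching is approximate; a real audit should use an HTML parser.
function auditIframeHtml(html, maxBytes = 100 * 1024 /* arbitrary budget */) {
  const problems = [];

  // Hidden form fields may still carry session IDs or tokens.
  if (/<input\b[^>]*\btype\s*=\s*["']?hidden\b/i.test(html)) {
    problems.push('hidden input present');
  }

  // Scripts and inline event handlers should already be gone.
  if (/<script\b/i.test(html)) {
    problems.push('script tag present');
  }
  if (/<[^>]+\son[a-z]+\s*=/i.test(html)) {
    problems.push('inline event handler present');
  }

  // A remote src or stylesheet link means the file is not self-contained.
  // data: URIs do not match, so inlined images pass.
  if (/\bsrc\s*=\s*["']?(?:https?:)?\/\//i.test(html) ||
      /<link\b[^>]*\bhref\s*=\s*["']?(?:https?:)?\/\//i.test(html)) {
    problems.push('external resource reference present');
  }

  if (Buffer.byteLength(html, 'utf8') > maxBytes) {
    problems.push('larger than the size budget');
  }
  return problems;
}

// Example: an empty array means none of the heuristics fired.
console.log(auditIframeHtml('<div style="color:red">hello</div>'));
```

An empty result only means these particular patterns were not found, so it's no substitute for looking over the capture by hand.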
Trailcap brute-force minimizes an HTML file. It uses puppeteer to render screenshots while it tries to remove:
- Each DOM element
- Each attribute
- Each member of a "class" attribute.
It only considers a modification a success if the result of a render is a pixel-perfect match for the original file.
Basically, it tries to reduce an HTML file to the minimum required to maintain its visual presentation.
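
The remove-and-compare loop described above can be pictured with a toy model. This is a minimal sketch, not Trailcap's actual code: `minimize`, `render` and `sameOutput` are hypothetical names, the "page" is just an array of parts, and rendering is synchronous here, whereas a real Puppeteer render would be asynchronous.

```javascript
// Greedy one-at-a-time removal (illustrative only).
// `parts` stands in for removable units (elements, attributes, class names),
// `render` turns a list of parts into something comparable, and
// `sameOutput` decides whether two renders are identical.
function minimize(parts, render, sameOutput) {
  const pristine = render(parts);
  let kept = parts.slice();

  // Walk backwards so a removal never shifts the indexes still to visit.
  for (let i = kept.length - 1; i >= 0; i--) {
    const candidate = kept.slice(0, i).concat(kept.slice(i + 1));
    if (sameOutput(pristine, render(candidate))) {
      kept = candidate; // the removal is invisible, so keep it
    }
  }
  return kept;
}

// Toy "renderer": only visible parts contribute to the output.
const parts = [
  { text: 'A', visible: true },
  { text: 'x', visible: false },
  { text: 'B', visible: true },
  { text: 'y', visible: false },
];
const render = (ps) => ps.filter((p) => p.visible).map((p) => p.text).join('');
const result = minimize(parts, render, (a, b) => a === b);
console.log(result.length); // prints 2: only the visible parts survive
```

Each candidate costs one full render and one comparison, so the work grows with the number of removable parts.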
Trailcap is slow, and this is really unavoidable. Every modification triggers a Blink render, PNG generation, and an image compare. On my maxed-out 2017 iMac, the Cloudflare admin page (~90K) takes about 5 minutes to process, and gets the page down to 38K.
The CNN homepage (1.2MB) takes much longer (1h), and is reduced to 80K.
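
The image compare that every modification pays for could look something like the sketch below. It assumes both screenshots have already been decoded from PNG into raw RGBA bytes of the same dimensions; `countDifferentPixels` and `isPixelPerfect` are hypothetical names, and Trailcap may well do this step differently.

```javascript
// Count the pixels that differ between two raw RGBA buffers.
// Assumes 4 bytes per pixel and identical width and height;
// PNG decoding is out of scope for this sketch.
function countDifferentPixels(a, b) {
  if (a.length !== b.length || a.length % 4 !== 0) {
    throw new Error('buffers must hold RGBA data of the same dimensions');
  }
  let different = 0;
  for (let i = 0; i < a.length; i += 4) {
    if (a[i] !== b[i] || a[i + 1] !== b[i + 1] ||
        a[i + 2] !== b[i + 2] || a[i + 3] !== b[i + 3]) {
      different++;
    }
  }
  return different;
}

// "Pixel-perfect" means not a single pixel differs.
const isPixelPerfect = (a, b) => countDifferentPixels(a, b) === 0;

// Two 2-pixel images that differ only in the second pixel's red channel.
const before = Uint8Array.from([255, 0, 0, 255, 0, 0, 0, 255]);
const after = Uint8Array.from([255, 0, 0, 255, 1, 0, 0, 255]);
console.log(countDifferentPixels(before, after)); // prints 1
```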
- Load the page you'd like to capture in Chrome.
- Use the DevTools inspector to edit any visible content you'd like to change (for example, swap real values for example.com or user@example.com). Obscure API keys, domains, usernames, etc.
- Export the HTML file with the "SinglePage" Chrome extension. This inlines images and styles, and removes JavaScript.
- Run that file through Trailcap. Wait 15 to 20 hours.
- Use DNSTrail's interactive editor to mark up your PageCap.
Note: I haven't converted this to an installable executable yet (it's just index.js), but the rest of the command-line usage should be accurate.
trailcap [--verbose] [--dump] [--diff] [--show] [--phase phase0] inputfile.html
- --verbose (or -V) prints activity to STDERR while it operates.
- --dump (or -d) will write snap-%d.png images every time a modification is attempted, before it's compared to the pristine initial page. pristine.png is also generated.
- --diff (or -D) will write a diff-%d.png image for each modification it makes that does not match the pristine page.
- --show (or -s) will take puppeteer out of headless mode so you can see what it's doing. This can lead to false positives: if you interact with that instance of Chrome, it'll be caught by puppeteer's screenshot (including scrolling).
- --phase PHASE (or -p) selects the phases trailcap will use. PHASE must be one of denode, deattribute, declass, purgecss or minify.

The --phase option may be given multiple times, and defaults to --phase denode --phase deattribute.

Note that the declass phase is particularly slow, but normally very useful. purgecss and minify are pretty fast, but mangle the output enough to make it hard to reason about.
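
To make the option rules concrete (repeatable --phase, short aliases, and the default phases), here is a minimal sketch of how the arguments might be parsed. `parseArgs` and the shape of the returned object are assumptions for illustration only, not what index.js actually does.

```javascript
// Hypothetical argument parser mirroring the options documented above.
const PHASES = ['denode', 'deattribute', 'declass', 'purgecss', 'minify'];
const FLAGS = new Map([
  ['--verbose', 'verbose'], ['-V', 'verbose'],
  ['--dump', 'dump'], ['-d', 'dump'],
  ['--diff', 'diff'], ['-D', 'diff'],
  ['--show', 'show'], ['-s', 'show'],
]);

function parseArgs(argv) {
  const opts = { verbose: false, dump: false, diff: false, show: false, phases: [], input: null };
  for (let i = 0; i < argv.length; i++) {
    const arg = argv[i];
    if (FLAGS.has(arg)) {
      opts[FLAGS.get(arg)] = true;
    } else if (arg === '--phase' || arg === '-p') {
      const phase = argv[++i]; // the phase name is the next argument
      if (!PHASES.includes(phase)) throw new Error(`unknown phase: ${phase}`);
      opts.phases.push(phase);
    } else if (arg.startsWith('-')) {
      throw new Error(`unknown option: ${arg}`);
    } else {
      opts.input = arg; // anything else is treated as the input file
    }
  }
  // With no --phase given, fall back to the documented default.
  if (opts.phases.length === 0) opts.phases = ['denode', 'deattribute'];
  return opts;
}

const parsed = parseArgs(['--verbose', '-p', 'declass', '--phase', 'minify', 'inputfile.html']);
console.log(parsed.phases.join(',')); // prints declass,minify
```

In a real script the array would come from `process.argv.slice(2)`.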