
Allow splitting the data portion into multiple files. #101

@Nodja


I wanted to host a chat analytics page on GitHub Pages, since it seems ideal for hosting a static file like this, but the Discord server I'm using comes to almost 400MB of data and there's a 100MB file size limit. I could use LFS, but I've run into bandwidth issues with it in the past.

The solution would be to split the data into separate files.

I've managed to do this manually, but it would be ideal if this was supported natively, since my method is not optimal. Here's how I achieved it:

  • Generate a report.html file like normal.
  • Extract the contents of the script tag with the 'data' id to a separate report.txt file. Leave the <script> tag itself in place but empty, as we're gonna fill it in later.
  • Split the file into 25MB chunks. I used a Python script:
split.py
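# Size of each output file in bytes (25 MiB)
chunk_size = 25 * 1024 * 1024

# Reads report.txt, writes report-001.txt, report-002.txt, ...
prefix = "report"

with open(prefix + ".txt", "rb") as input_file:
    chunk_number = 1
    while True:
        chunk_data = input_file.read(chunk_size)
        if not chunk_data:
            break  # reached the end of the input file

        # Zero-pad the index so the chunk files sort in order
        output_file_name = f"{prefix}-{str(chunk_number).zfill(3)}.txt"

        with open(output_file_name, "wb") as output_file:
            output_file.write(chunk_data)

        chunk_number += 1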
  • Load the split files using a new script tag. The script tag needs to be in the <head> tag so it loads before anything else. I used an external file, like so: <script src="loader.js"></script>, but inline is fine.
loader.js
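// Chunk files, listed in the order they must be concatenated
var files = [
  "report-001.txt",
  "report-002.txt",
  "report-003.txt",
  "report-004.txt",
  "report-005.txt"
  // etc.
];

var data = "";

for (let i = 0; i < files.length; i++) {
  let file = files[i];
  let rawFile = new XMLHttpRequest();
  // Third argument false = synchronous request, so chunks are appended in order
  rawFile.open("GET", file, false);
  rawFile.onreadystatechange = function () {
    if (rawFile.readyState === 4) {
      // Status 0 is accepted as well, for non-HTTP loads such as file://
      if (rawFile.status === 200 || rawFile.status === 0) {
        data += rawFile.responseText;
      }
    }
  };
  rawFile.send(null);
}

// Put the reassembled data back into the emptied script tag
var data_script = document.getElementById("data");
data_script.innerHTML = data;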
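
The extraction in the second step can probably be scripted too. Here is a minimal Python sketch, assuming the element is written as `<script id="data" ...>` (I only know it has the 'data' id, the exact attributes are a guess). `extract_data` is a made-up helper name, not something the tool provides. It returns the HTML with the script tag emptied, plus the extracted contents.

```python
import re
from typing import Tuple

# Assumption: the data element looks like <script id="data" ...>...</script>.
# The attribute layout is a guess; adjust the pattern to the real report.html.
DATA_SCRIPT = re.compile(
    r'(<script\b[^>]*\bid=["\']data["\'][^>]*>)(.*?)(</script>)',
    re.DOTALL | re.IGNORECASE,
)


def extract_data(html: str) -> Tuple[str, str]:
    """Return (html with the data script emptied, extracted data)."""
    match = DATA_SCRIPT.search(html)
    if match is None:
        raise ValueError("no <script> element with id 'data' found")
    # Keep the opening and closing tags, drop only what is between them.
    stripped = html[: match.start(2)] + html[match.end(2):]
    return stripped, match.group(2)
```

To use it, read report.html into a string, call `extract_data`, then write the second return value to report.txt and the first back to report.html. This loads the whole report into memory, which may or may not be acceptable for a 400MB file.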

That's it. The only issue is that rendering is completely locked until all the chunks are downloaded, since the requests are synchronous. It would be best if loading was supported natively instead of a hack like this.
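
Until something like that exists, two parts of the hack can at least be made less error-prone. The sketch below (all function names are made up for illustration) generates the `var files = [...]` list for loader.js instead of typing it by hand, and checks that each chunk is valid UTF-8 on its own. That check matters because loader.js appends each `responseText` separately, and a byte-level split can cut a multi-byte character in half. I don't know whether the data ever contains non-ASCII bytes, so the check may be unnecessary.

```python
import json
from typing import List


def split_bytes(data: bytes, chunk_size: int) -> List[bytes]:
    """Cut data into consecutive chunks of at most chunk_size bytes."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def chunk_names(prefix: str, count: int) -> List[str]:
    """File names in the same report-001.txt style that split.py writes."""
    return [f"{prefix}-{str(n).zfill(3)}.txt" for n in range(1, count + 1)]


def loader_file_list(names: List[str]) -> str:
    """Render the `var files = [...]` declaration used by loader.js."""
    return "var files = " + json.dumps(names, indent=2) + ";\n"


def decodes_cleanly(chunks: List[bytes]) -> bool:
    """True if every chunk is valid UTF-8 when decoded on its own."""
    for chunk in chunks:
        try:
            chunk.decode("utf-8")
        except UnicodeDecodeError:
            return False
    return True
```

If `decodes_cleanly` returns False for the real chunks, the boundaries would need to be moved so they don't fall inside a character.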
