| 1 | +# nbgitpuller - downloader plugin documentation |
| 2 | + |
| 3 | +nbgitpuller uses [pluggy](https://pluggy.readthedocs.io/en/stable/) as a framework |
| 4 | +to load any installed nbgitpuller-downloader plugins. There are three downloader plugins |
| 5 | +available right now: |
| 6 | +- [nbgitpuller-downloader-googledrive](https://github.com/jupyterhub/nbgitpuller-downloader-googledrive) |
| 7 | +- [nbgitpuller-downloader-dropbox](https://github.com/jupyterhub/nbgitpuller-downloader-dropbox) |
| 8 | +- [nbgitpuller-downloader-generic-web](https://github.com/jupyterhub/nbgitpuller-downloader-generic-web) |
| 9 | + |
| 10 | + |
| 11 | +There are several pieces to be aware of for the plugin to work correctly: |
| 12 | +1. The setup.cfg (or setup.py) file must contain an entry_points definition. |
| 13 | +For example: |
| 14 | +> [options.entry_points] |
| 15 | +nbgitpuller = |
| 16 | + dropbox=nbgitpuller_downloader_dropbox.dropbox_downloader |
| 17 | + |
| 18 | +2. The module named in the entry point (dropbox_downloader in the example above) must implement |
| 19 | +the function handle_files(query_line_args), and that function must be decorated with `@hookimpl`. |
| 20 | +3. To use the decorator, the plugin must import it: |
| 21 | + - `from nbgitpuller.hookspecs import hookimpl` |
| 22 | +4. The handle_files function in your plugin must return |
| 23 | + two pieces of information: |
| 24 | + - the name of the folder that holds the archive's contents after decompression |
| 25 | + - the path to the local git repo that mimics a remote origin repo |
| 26 | + |
| 27 | +nbgitpuller provides a function in plugin_helper.py called handle_files_helper. Given a URL, the |
| 28 | +extension of the file to decompress (zip or tar), and the progress function (described |
| 29 | +below), it downloads the archive and returns the correct information. You are also free to |
| 30 | +implement the functionality of handle_files_helper yourself in your |
| 31 | +plug-in. Some use cases are not covered by the currently available plugins, such as authenticating against |
| 32 | +the web server or service where your archive is kept. Either way, it is worth |
| 33 | +studying the handle_files_helper function in nbgitpuller to see how it |
| 34 | +is implemented. |
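
As an illustration of the authentication use case mentioned above, here is a minimal sketch of how a custom plugin might prepare an authenticated download request. It assumes the service accepts a bearer token in an `Authorization` header; the function name `build_authenticated_request`, the URL, and the token are all hypothetical and are not part of nbgitpuller.

```python
import urllib.request


def build_authenticated_request(url, token):
    """Build a GET request that carries a bearer token; nothing is sent yet."""
    # Assumption: the archive host accepts "Authorization: Bearer <token>".
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


# Hypothetical URL and token, for illustration only.
request = build_authenticated_request(
    "https://example.org/private/materials.zip", "my-secret-token"
)
# Passing `request` to urllib.request.urlopen() would perform the download.
```

A plugin that needs this would perform the download itself instead of delegating it to handle_files_helper.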
| 35 | + |
| 36 | +The remaining steps are illustrated by the handle_files function of the [nbgitpuller-downloader-dropbox](https://github.com/jupyterhub/nbgitpuller-downloader-dropbox) plugin: |
| 37 | +``` |
| 38 | +@hookimpl |
| 39 | +def handle_files(query_line_args): |
| 40 | +    query_line_args["repo"] = query_line_args["repo"].replace("dl=0", "dl=1")  # dropbox: dl set to 1 |
| 41 | +    ext = determine_file_extension(query_line_args["repo"]) |
| 42 | +    query_line_args["extension"] = ext |
| 43 | +    loop = asyncio.get_event_loop() |
| 44 | +    tasks = handle_files_helper(query_line_args), query_line_args["progress_func"]() |
| 45 | +    result_handle, _ = loop.run_until_complete(asyncio.gather(*tasks)) |
| 46 | +    return result_handle |
| 47 | +``` |
| 48 | + |
| 49 | +The following points describe what happens in handle_files, in this example, up to and including |
| 50 | +the call to the handle_files_helper function: |
| 51 | + |
| 52 | +1) The parameter, query_line_args, contains all the query arguments included in the nbgitpuller link. This means you |
| 53 | + can put keyword arguments into your nbgitpuller links and read them in the handle_files |
| 54 | + function. |
| 55 | + For example, you might set up a link like this: |
| 56 | + `http://[your hub]/hub/user-redirect/git-pull?repo=[link to your archive]&keyword1=value1&keyword2=value2&provider=dropbox&urlpath=tree%2F%2F` |
| 57 | + In your handle_files function, you can then read your custom arguments: |
| 58 | + |
| 59 | + ``` |
| 60 | + query_line_args["keyword1"] |
| 61 | + query_line_args["keyword2"] |
| 62 | + ``` |
| 63 | +2) The query_line_args parameter also includes the progress function, which monitors the download_q |
| 64 | + for messages. Messages in the download_q are written to the UI so users can see the progress and |
| 65 | + the steps being taken to download their archives. Both are passed to handle_files_helper |
| 66 | + as part of query_line_args and are accessed like this: |
| 67 | + ``` |
| 68 | + query_line_args["progress_func"] |
| 69 | + query_line_args["download_q"] |
| 70 | + ``` |
| 71 | +3) The first line of the handle_files function for the dropbox downloader is specific to Dropbox. The URL to a file |
| 72 | + in Dropbox contains one URL query parameter (dl=0). This parameter tells Dropbox whether to download the |
| 73 | + file or open it in its browser-based file system. To download the file, this parameter |
| 74 | + needs to be changed to dl=1. |
| 75 | +4) The next line determines the file extension (zip, tar.gz, etc.). |
| 76 | + This is added to the query_line_args map and passed to handle_files_helper so that |
| 77 | + the application knows which utility to use to decompress the archive: unzip or tar -xzf. |
| 78 | +5) So that the user does not have to wait idly while the download finishes, the |
| 79 | + download of the archive is made non-blocking with the asyncio package. Here are the steps: |
| 80 | + - get the event loop |
| 81 | + - set up two tasks: |
| 82 | + - a call to handle_files_helper with our arguments |
| 83 | + - the progress function, called via query_line_args["progress_func"] |
| 84 | + - execute the two tasks in the event loop. |
| 85 | +6) The function returns two pieces of information to nbgitpuller: |
| 86 | + - the directory name of the decompressed archive |
| 87 | + - the local_origin_repo path. |
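
Steps 3 to 6 above can be sketched in a self-contained form. The sketch below runs without nbgitpuller installed, so everything in it is a stand-in: `fake_helper` plays the role of handle_files_helper, `progress_loop` plays the role of the progress function, `guess_extension` stands in for determine_file_extension, and an `asyncio.Queue` with a `None` sentinel stands in for download_q. The returned tuple shape, the extension values, and the example URL are assumptions made for illustration.

```python
import asyncio
from pathlib import PurePosixPath
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def force_dropbox_download(url):
    """Return the URL with dl=1 so Dropbox serves the file itself (step 3)."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    query = [(key, "1" if key == "dl" else value) for key, value in pairs]
    return urlunsplit(parts._replace(query=urlencode(query)))


def guess_extension(url):
    """Guess the archive type from the URL path (step 4); hypothetical helper."""
    name = PurePosixPath(urlsplit(url).path).name.lower()
    if name.endswith(".zip"):
        return "zip"
    if name.endswith((".tar", ".tar.gz", ".tgz")):
        return "tar"
    raise ValueError(f"unsupported archive type: {name}")


async def fake_helper(args):
    """Stand-in for handle_files_helper: report progress, return two values."""
    queue = args["download_q"]
    await queue.put("downloading")
    await asyncio.sleep(0)  # yield control so the progress loop can run
    await queue.put("decompressing")
    await queue.put(None)  # sentinel (assumption): tells the progress loop to stop
    return ("materials", "/tmp/origin/materials")  # folder name, local origin repo path


async def progress_loop(queue, seen):
    """Stand-in for the progress function: collect messages until the sentinel."""
    while True:
        message = await queue.get()
        if message is None:
            return
        seen.append(message)


async def run_download(url):
    """Run the helper and the progress loop concurrently (step 5)."""
    args = {"repo": force_dropbox_download(url), "download_q": asyncio.Queue()}
    args["extension"] = guess_extension(args["repo"])
    seen = []
    result, _ = await asyncio.gather(
        fake_helper(args), progress_loop(args["download_q"], seen)
    )
    return result, seen, args


# Hypothetical Dropbox share link, for illustration only.
result, messages, args = asyncio.run(
    run_download("https://www.dropbox.com/s/abc123/materials.zip?dl=0")
)
```

Parsing the query string, rather than replacing the text "dl=0", avoids touching other parts of the URL that happen to contain the same characters. The plugin shown earlier uses a plain string replacement instead.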
| 88 | + |
| 89 | +The details of what happens in handle_files_helper can be found by studying the function. Essentially, the archive is downloaded, decompressed, and set up in the file |
| 90 | +system to act as a remote repository (i.e. the local_origin_repo path). |
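
The decompression step, and the folder name it yields, can be sketched as follows. This is not the code of handle_files_helper: it uses Python's standard zipfile module in place of the unzip utility, it assumes the archive has a single top-level folder, and it leaves out the creation of the local origin repository. The function name `extract_and_find_root` and the file names are hypothetical.

```python
import tempfile
import zipfile
from pathlib import Path


def extract_and_find_root(archive_path, dest_dir):
    """Extract a zip archive and return the name of its single top-level folder."""
    with zipfile.ZipFile(archive_path) as archive:
        # The first path component of every member is its top-level entry.
        roots = {name.split("/", 1)[0] for name in archive.namelist() if name.strip("/")}
        archive.extractall(dest_dir)
    if len(roots) != 1:
        raise ValueError(f"expected one top-level folder, found {sorted(roots)}")
    return roots.pop()


# Build a small throwaway archive so the sketch can be run as-is.
with tempfile.TemporaryDirectory() as tmp:
    archive_file = Path(tmp) / "materials.zip"
    with zipfile.ZipFile(archive_file, "w") as archive:
        archive.writestr("materials/notebook.ipynb", "{}")
        archive.writestr("materials/data/values.csv", "a,b\n1,2\n")
    root = extract_and_find_root(archive_file, Path(tmp) / "unpacked")
    extracted_ok = (Path(tmp) / "unpacked" / root / "notebook.ipynb").is_file()
```

The folder name found here corresponds to the first of the two values that handle_files returns.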
| 91 | + |
| 92 | + |