3 changes: 3 additions & 0 deletions example-apps/internal-knowledge-search/.flaskenv
@@ -0,0 +1,3 @@
FLASK_APP=api/app.py
FLASK_RUN_PORT=3001
FLASK_DEBUG=1
7 changes: 7 additions & 0 deletions example-apps/internal-knowledge-search/.gitignore
@@ -0,0 +1,7 @@
frontend/build
frontend/node_modules
api/__pycache__
.venv
venv
.DS_Store
.env
179 changes: 179 additions & 0 deletions example-apps/internal-knowledge-search/README.md
@@ -0,0 +1,179 @@
# Elastic Internal Knowledge Search App

This is a sample app that demonstrates how to build an internal knowledge search application with document-level security on top of Elasticsearch.

**Requires Elasticsearch 8.11.0 or later.**


## Download the Project

Download the project from Github and extract the `internal-knowledge-search` folder.

```bash
curl https://codeload.github.com/elastic/elasticsearch-labs/tar.gz/main | \
tar -xz --strip=2 elasticsearch-labs-main/example-apps/internal-knowledge-search
```

## Installing and connecting to Elasticsearch

### Install Elasticsearch

There are a number of ways to install Elasticsearch. Cloud is best for most use cases. Visit the [Install Elasticsearch](https://www.elastic.co/search-labs/tutorials/install-elasticsearch) tutorial for more information.
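
If you'd rather run Elasticsearch locally for development, a single-node Docker container is a quick option. This is a minimal sketch, assuming Docker is installed; it disables security for local experimentation only, and the image tag is just an example matching the minimum required version.

```bash
# Single-node dev instance; security disabled for convenience, do not use in production
docker run -p 9200:9200 \
  -e "discovery.type=single-node" \
  -e "xpack.security.enabled=false" \
  docker.elastic.co/elasticsearch/elasticsearch:8.11.0
```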

### Connect to Elasticsearch

This app requires the following environment variables to be set to connect to Elasticsearch:

```sh
export ELASTICSEARCH_URL=...
export ELASTIC_USERNAME=...
export ELASTIC_PASSWORD=...
```

You can add these to a `.env` file for convenience. See the `env.example` file for a `.env` file template.

You can also set the `ELASTIC_CLOUD_ID` instead of the `ELASTICSEARCH_URL` if you're connecting to a cloud instance and prefer to use the cloud ID.
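
For example, a `.env` file could look like the following, where all values are placeholders to replace with your own deployment details:

```sh
ELASTICSEARCH_URL=https://localhost:9200
ELASTIC_USERNAME=elastic
ELASTIC_PASSWORD=changeme
# Alternatively, when connecting to an Elastic Cloud deployment:
# ELASTIC_CLOUD_ID=<your-cloud-id>
```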

# Workplace Search Reference App

This application shows you how to build an application using [Elastic Search Applications](https://www.elastic.co/guide/en/enterprise-search/current/search-applications.html) for a Workplace Search use case.
![img.png](img.png)

The application uses the [Search Application Client](https://github.com/elastic/search-application-client). Refer to this [guide](https://www.elastic.co/guide/en/enterprise-search/current/search-applications-search.html) for more information.

## Running the application

### Configuring mappings (subject to change in the near future)

The application uses two mapping files (these will be replaced with a corresponding UI in the near future).
One specifies how the fields of the documents in your indices map to the rendered search result.
The other maps a source index to a corresponding logo.

#### Data mapping

The data mappings are located inside [config/documentsToSearchResultMappings.json](src/config/documentsToSearchResultMappings.json).
Each entry maps the fields of the documents to the search result UI component for a specific index. The mapping expects `title`, `created`, `previewText`, `fullText`, and `link` as keys.
For each key, specify the name of the document field it should be populated from.

##### Example:

Content document:

````json
{
  "name": "Document name",
  "_timestamp": "2342345934",
  "summary": "Some summary",
  "description": "Some description text",
  "listing_url": "some listing url"
}
````

Mapping:
````json
{
  "search-mongo": {
    "title": "name",
    "created": "_timestamp",
    "previewText": "summary",
    "fullText": "description",
    "link": "listing_url"
  }
}
````

#### Logo mapping
You can specify a logo for each index behind the search application. Place your logo inside [data-source-logos](public/data-source-logos) and configure
your mapping as follows:

````json
{
  "search-index-1": "data-source-logos/some_logo.png",
  "search-index-2": "data-source-logos/some_other_logo.webp"
}
````

### Configuring the search application

To use the index filtering and sorting in the UI, you should update the search template of your search application:

`PUT _application/search_application/{YOUR_SEARCH_APPLICATION_NAME}`
````json
{
  "indices": [{YOUR_INDICES_USED_BY_THE_SEARCH_APPLICATION}],
  "template": {
    "script": {
      "lang": "mustache",
      "source": """
      {
        "query": {
          "bool": {
            "must": [
              {{#query}}
              {
                "query_string": {
                  "query": "{{query}}"
                }
              }
              {{/query}}
            ],
            "filter": {
              "terms": {
                "_index": {{#toJson}}indices{{/toJson}}
              }
            }
          }
        },
        "from": {{from}},
        "size": {{size}},
        "sort": {{#toJson}}sort{{/toJson}}
      }
      """,
      "params": {
        "query": "",
        "size": 10,
        "from": 0,
        "sort": [],
        "indices": []
      }
    }
  }
}
````
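
With this template in place, a search request against the application can pass the template parameters directly. The values below are illustrative only (the index name is borrowed from the mapping example above):

`POST _application/search_application/{YOUR_SEARCH_APPLICATION_NAME}/_search`
````json
{
  "params": {
    "query": "vacation policy",
    "size": 10,
    "from": 0,
    "sort": [],
    "indices": ["search-mongo"]
  }
}
````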

### Setting the search app variables

You need to set the search application name and the search endpoint in the UI to the corresponding values. You'll get these values when [creating a search application](https://www.elastic.co/guide/en/enterprise-search/current/search-applications.html). Note that for the endpoint you should use just the host, excluding the `/_application/search_application/{application_name}/_search` path.
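
For example, the values entered in the UI might look like this (both are hypothetical placeholders):

```
Search application name: my-search-application
Search endpoint:         https://my-deployment.es.us-central1.gcp.cloud.es.io
```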

### Disable CORS

By default, Elasticsearch is configured to disallow cross-origin resource requests. To call Elasticsearch from the browser, you will need to [enable CORS on your Elasticsearch deployment](https://www.elastic.co/guide/en/elasticsearch/reference/current/behavioral-analytics-cors.html#behavioral-analytics-cors-enable-cors-elasticsearch).
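
If you do enable CORS, the settings go into `elasticsearch.yml` (or the deployment's user settings in Elastic Cloud). A sketch along these lines should work; adjust the origin to wherever the frontend is served from and verify the exact header and method lists against the linked guide:

```yaml
http.cors.enabled: true
http.cors.allow-origin: "http://localhost:3000"
http.cors.allow-credentials: true
http.cors.allow-methods: OPTIONS, POST
http.cors.allow-headers: X-Requested-With, X-Auth-Token, Content-Type, Content-Length, Authorization
```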

If you don't feel comfortable enabling CORS on your Elasticsearch deployment, you can instead set the search endpoint in the UI to `http://localhost:3001/api/search_proxy` (change the host if you're running the backend elsewhere). This makes the backend act as a proxy for the search calls, which is most likely what you'd do in production anyway.


### Set up DLS with SharePoint Online (SPO)

1. Create a connector in Kibana named `search-sharepoint`.
2. Start connectors-python, if using connector clients.
3. Enable DLS.
4. Run an access control sync.
5. Run a full sync.
6. Define the mappings, as described above in this README.
7. Create a search application.
8. Enable CORS, following the [search application security guide](https://www.elastic.co/guide/en/elasticsearch/reference/master/search-application-security.html#search-application-security-cors-elasticsearch).

### Change your API host

By default, this app will run on `http://localhost:3000` and the backend on `http://localhost:3001`. If you are running the backend in a different location, set the environment variable `REACT_APP_API_HOST` to wherever you're hosting your backend, plus the `/api` path.
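
For example, with a backend running on a hypothetical separate host:

```sh
export REACT_APP_API_HOST=http://my-backend-host:3001/api
```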


### Run API and frontend

```sh
# Launch API app
flask run

# In a separate terminal launch frontend app
cd frontend && npm install && npm run start
```

You can now access the frontend at http://localhost:3000. Changes are automatically reloaded.
185 changes: 185 additions & 0 deletions example-apps/internal-knowledge-search/api/app.py
@@ -0,0 +1,185 @@
from flask import Flask, jsonify, request, Response, current_app
from flask_cors import CORS
from elasticsearch_client import elasticsearch_client
import os
import sys
import requests

app = Flask(__name__, static_folder="../frontend/build", static_url_path="/")
CORS(app)


def get_identities_index(search_app_name):
    """Find the ACL filter index backing the given search application."""
    search_app = elasticsearch_client.search_application.get(name=search_app_name)
    identities_indices = elasticsearch_client.indices.get(index=".search-acl-filter*")
    secured_indices = [
        app_index
        for app_index in search_app["indices"]
        if ".search-acl-filter-" + app_index in identities_indices
    ]
    if len(secured_indices) > 0:
        return ".search-acl-filter-" + secured_indices[0]
    else:
        raise ValueError(
            f"Could not find identities index for search application {search_app_name}"
        )


@app.route("/")
def api_index():
return app.send_static_file("index.html")


@app.route("/api/default_settings", methods=["GET"])
def default_settings():
return {
"elasticsearch_endpoint": os.getenv("ELASTICSEARCH_URL") or "http://localhost:9200"
}


@app.route("/api/search_proxy/<path:text>", methods=["POST"])
def search(text):
response = requests.request(
method="POST",
url=os.getenv("ELASTICSEARCH_URL") + '/' + text,
data=request.get_data(),
allow_redirects=False,
headers={"Authorization": request.headers.get(
"Authorization"), "Content-Type": "application/json"}
)

return response.content


@app.route("/api/persona", methods=["GET"])
def personas():
try:
search_app_name = request.args.get("app_name")
identities_index = get_identities_index(search_app_name)
response = elasticsearch_client.search(
index=identities_index, size=1000)
hits = response["hits"]["hits"]
personas = [x["_id"] for x in hits]
personas.append("admin")
return personas

except Exception as e:
current_app.logger.warn(
"Encountered error %s while fetching personas, returning default persona", e
)
return ["admin"]


@app.route("/api/indices", methods=["GET"])
def indices():
try:
search_app_name = request.args.get("app_name")
search_app = elasticsearch_client.search_application.get(
name=search_app_name)
return search_app['indices']

except Exception as e:
current_app.logger.warn(
"Encountered error %s while fetching personas, returning default persona", e
)
return ["admin"]


@app.route("/api/api_key", methods=["GET"])
def api_key():
search_app_name = request.args.get("app_name")
role_name = search_app_name + "-key-role"
default_role_descriptor = {}
default_role_descriptor[role_name] = {
"cluster": [],
"indices": [
{
"names": [search_app_name],
"privileges": ["read"],
"allow_restricted_indices": False,
}
],
"applications": [],
"run_as": [],
"metadata": {},
"transient_metadata": {"enabled": True},
"restriction": {"workflows": ["search_application_query"]},
}
identities_index = get_identities_index(search_app_name)
try:
persona = request.args.get("persona")
if persona == "":
raise ValueError("No persona specified")
role_descriptor = {}

if persona == "admin":
role_descriptor = default_role_descriptor
else:
identity = elasticsearch_client.get(
index=identities_index, id=persona)
permissions = identity["_source"]["query"]["template"]["params"][
"access_control"
]
role_descriptor = {
"dls-role": {
"cluster": ["all"],
"indices": [
{
"names": [search_app_name],
"privileges": ["read"],
"query": {
"template": {
"params": {"access_control": permissions},
"source": """{
"bool": {
"filter": {
"bool": {
"should": [
{
"bool": {
"must_not": {
"exists": {
"field": "_allow_access_control"
}
}
}
},
{
"terms": {
"_allow_access_control.enum": {{#toJson}}access_control{{/toJson}}
}
}
]
}
}
}
}""",
}
},
}
],
"restriction": {"workflows": ["search_application_query"]},
}
}
api_key = elasticsearch_client.security.create_api_key(
name=search_app_name+"-internal-knowledge-search-example-"+persona, expiration="1h", role_descriptors=role_descriptor)
return {"api_key": api_key['encoded']}

except Exception as e:
current_app.logger.warn(
"Encountered error %s while fetching api key", e)
raise e


@app.cli.command()
def create_index():
    """Create or re-create the Elasticsearch index."""
    basedir = os.path.abspath(os.path.dirname(__file__))
    sys.path.append(f"{basedir}/../")


if __name__ == "__main__":
    app.run(port=3001, debug=True)