Tidying up more applies_to tags in the Troubleshooting section #4473
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -4,65 +4,33 @@ | |
| - https://www.elastic.co/guide/en/elasticsearch/reference/current/increase-capacity-data-node.html | ||
| applies_to: | ||
| stack: | ||
| deployment: | ||
| eck: | ||
| ess: | ||
| ece: | ||
| self: | ||
| products: | ||
| - id: elasticsearch | ||
| --- | ||
|
|
||
| # Increase the disk capacity of data nodes [increase-capacity-data-node] | ||
|
|
||
| :::::::{tab-set} | ||
| Disk capacity pressures may cause index failures, unassigned shards, and cluster instability. | ||
|
|
||
| ::::::{tab-item} {{ech}} | ||
| In order to increase the disk capacity of the data nodes in your cluster: | ||
| {{es}} uses [disk-based shard allocation watermarks](elasticsearch://reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#disk-based-shard-allocation) to manage disk space on nodes, which can block allocation or indexing when nodes run low on disk space. Refer to [](/troubleshoot/elasticsearch/fix-watermark-errors.md) for additional details on how to address this situation. | ||
|
|
||
| 1. Log in to the [{{ecloud}} console](https://cloud.elastic.co?page=docs&placement=docs-body). | ||
| 2. On the **Hosted deployments** panel, click the gear under the `Manage deployment` column that corresponds to the name of your deployment. | ||
| 3. If autoscaling is available but not enabled, enable it. You can do this by clicking the button `Enable autoscaling` on a banner like the one below: | ||
| To increase the disk capacity of the data nodes in your cluster, complete these steps: | ||
|
|
||
| :::{image} /troubleshoot/images/elasticsearch-reference-autoscaling_banner.png | ||
| :alt: Autoscaling banner | ||
| :screenshot: | ||
| ::: | ||
| 1. [Estimate how much disk capacity you need](#estimate-required-capacity). | ||
| 1. [Increase the disk capacity](#increase-disk-capacity-of-data-nodes). | ||
|
|
||
| Or you can go to `Actions > Edit deployment`, check the checkbox `Autoscale` and click `save` at the bottom of the page. | ||
|
|
||
| :::{image} /troubleshoot/images/elasticsearch-reference-enable_autoscaling.png | ||
| :alt: Enabling autoscaling | ||
| :screenshot: | ||
| ::: | ||
| ## Estimate the amount of required disk capacity [estimate-required-capacity] | ||
|
|
||
| 4. If autoscaling has succeeded the cluster should return to `healthy` status. If the cluster is still out of disk, check if autoscaling has reached its limits. You will be notified about this by the following banner: | ||
| The following steps explain how to retrieve the current disk watermark configuration of the cluster and how to check the current disk usage on the nodes. | ||
|
|
||
| :::{image} /troubleshoot/images/elasticsearch-reference-autoscaling_limits_banner.png | ||
| :alt: Autoscaling banner | ||
| :screenshot: | ||
| ::: | ||
|
|
||
| or you can go to `Actions > Edit deployment` and look for the label `LIMIT REACHED` as shown below: | ||
|
|
||
| :::{image} /troubleshoot/images/elasticsearch-reference-reached_autoscaling_limits.png | ||
| :alt: Autoscaling limits reached | ||
| :screenshot: | ||
| ::: | ||
|
|
||
| If you are seeing the banner click `Update autoscaling settings` to go to the `Edit` page. Otherwise, you are already in the `Edit` page, click `Edit settings` to increase the autoscaling limits. After you perform the change click `save` at the bottom of the page. | ||
| :::::: | ||
|
|
||
| ::::::{tab-item} Self-managed | ||
| In order to increase the data node capacity in your cluster, you will need to calculate the amount of extra disk space needed. | ||
|
|
||
| 1. First, retrieve the relevant disk thresholds that will indicate how much space should be available. The relevant thresholds are the [high watermark](elasticsearch://reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#cluster-routing-watermark-high) for all the tiers apart from the frozen one and the [frozen flood stage watermark](elasticsearch://reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#cluster-routing-flood-stage-frozen) for the frozen tier. The following example demonstrates disk shortage in the hot tier, so we will only retrieve the high watermark: | ||
| 1. Retrieve the relevant disk thresholds that indicate how much space should be available. The relevant thresholds are the [high watermark](elasticsearch://reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#cluster-routing-watermark-high) for all the tiers apart from the frozen one and the [frozen flood stage watermark](elasticsearch://reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#cluster-routing-flood-stage-frozen) for the frozen tier. The following example demonstrates disk shortage in the hot tier, so only the high watermark is retrieved: | ||
|
|
||
| ```console | ||
| GET _cluster/settings?include_defaults&filter_path=*.cluster.routing.allocation.disk.watermark.high* | ||
| ``` | ||
|
|
||
| The response will look like this: | ||
| The response looks like this: | ||
|
|
||
| ```console-result | ||
| { | ||
|
|
@@ -83,33 +51,138 @@ | |
| } | ||
| ``` | ||
|
|
||
| The above means that in order to resolve the disk shortage we need to either drop our disk usage below the 90% or have more than 150GB available, read more on how this threshold works [here](elasticsearch://reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#cluster-routing-watermark-high). | ||
| The above means that in order to resolve the disk shortage, disk usage must drop below 90%, or more than 150GB must be available. Read more on how this threshold works [here](elasticsearch://reference/elasticsearch/configuration-reference/cluster-level-shard-allocation-routing-settings.md#cluster-routing-watermark-high). ||
|
|
||
| 2. The next step is to find out the current disk usage, this will indicate how much extra space is needed. For simplicity, our example has one node, but you can apply the same for every node over the relevant threshold. | ||
| 1. Find the current disk usage, which in turn indicates how much extra space is required. For simplicity, our example has one node, but you can apply the same steps to every node that is over the relevant threshold. ||
|
|
||
| ```console | ||
| GET _cat/allocation?v&s=disk.avail&h=node,disk.percent,disk.avail,disk.total,disk.used,disk.indices,shards | ||
| ``` | ||
|
|
||
| The response will look like this: | ||
| The response looks like this: | ||
|
|
||
| ```console-result | ||
| node disk.percent disk.avail disk.total disk.used disk.indices shards | ||
| instance-0000000000 91 4.6gb 35gb 31.1gb 29.9gb 111 | ||
| ``` | ||
|
|
||
| 3. The high watermark configuration indicates that the disk usage needs to drop below 90%. To achieve this, 2 things are possible: | ||
| In this scenario, the high watermark configuration indicates that the disk usage needs to drop below 90%, while the current disk usage is 91%. | ||
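As a supplementary check (not part of the original steps), you can ask {{es}} to explain why a specific shard is unassigned or cannot be allocated; the index name `my-index` below is a placeholder for one of your own indices:

```console
GET _cluster/allocation/explain
{
  "index": "my-index",
  "shard": 0,
  "primary": true
}
```

If the high watermark is the cause, the per-node decisions in the response mention the disk threshold that blocked allocation.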
|
|
||
|
|
||
| ## Increase the disk capacity of your data nodes [increase-disk-capacity-of-data-nodes] | ||
|
|
||
| Here are the most common ways to increase disk capacity: | ||
|
|
||
| * You can expand the disk space of the existing nodes. This is typically achieved by replacing your nodes with ones with higher capacity. | ||
| * You can add additional data nodes to the data tier that is short of disk space, increasing the overall capacity of that tier and potentially improving performance by distributing data and workload across more resources. | ||
|
|
||
| When you add another data node, the cluster doesn't recover immediately and it might take some time until shards are relocated to the new node. | ||
| You can check the progress with the following API call: | ||
|
|
||
| ```console | ||
| GET /_cat/shards?v&h=state,node&s=state | ||
| ``` | ||
|
|
||
| If in the response the shards' state is `RELOCATING`, it means that shards are still moving. Wait until all shards turn to `STARTED`. | ||
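After relocation completes, a quick way to confirm that disk pressure is resolved is the disk indicator of the health API (available in recent {{es}} versions):

```console
GET _health_report/disk
```

A `green` disk indicator means no data node is above its watermark.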
|
|
||
| :::::::{applies-switch} | ||
|
|
||
| ::::::{applies-item} { ess:, ece: } | ||
|
Contributor: Open question for ECE and ECH (cc: @shainaraskas and @yetanothertw). No need to address this in this PR at this moment, but maybe we want to register this somewhere else. Not sure if it's worthy to add that info.

Contributor (Author): I definitely think this is valid information. It might make more sense to tackle it in a separate PR as there're a few places that could benefit from this change. Would it make sense to tie that to Shaina's draft issue or will I open a separate one?

Contributor: I'm totally ok to tie that to Shaina's draft.

Collaborator: please add another one

Contributor (Author): Opened #4552 |
||
|
|
||
| :::{warning} | ||
| :applies_to: ece: | ||
| In ECE, resizing is limited by your [allocator capacity](/deploy-manage/deploy/cloud-enterprise/ece-manage-capacity.md). | ||
| ::: | ||
|
|
||
| To increase the disk capacity of the data nodes in your cluster: | ||
|
|
||
| * to add an extra data node to the cluster (this requires that you have more than one shard in your cluster), or | ||
| * to extend the disk space of the current node by approximately 20% to allow this node to drop to 70%. This will give enough space to this node to not run out of space soon. | ||
| 1. Log in to the [{{ecloud}} console](https://cloud.elastic.co?page=docs&placement=docs-body) or ECE Cloud UI. | ||
| 1. On the home page, find your deployment and select **Manage**. | ||
| 1. Go to **Actions** > **Edit deployment** and check that autoscaling is enabled. Adjust the **Enable Autoscaling for** dropdown menu as needed and select **Save**. | ||
| 1. If autoscaling is successful, the cluster returns to a `healthy` status. | ||
| If the cluster is still out of disk, check if autoscaling has reached its set limits and [update your autoscaling settings](/deploy-manage/autoscaling/autoscaling-in-ece-and-ech.md#ec-autoscaling-update). | ||
|
|
||
| 4. In the case of adding another data node, the cluster will not recover immediately. It might take some time to relocate some shards to the new node. You can check the progress here: | ||
| You can also add more capacity by adding more nodes to your cluster and targeting the data tier that may be short of disk. For more information, refer to [](/troubleshoot/elasticsearch/add-tier.md). | ||
|
|
||
| :::::: | ||
|
|
||
| ::::::{applies-item} { self: } | ||
| To increase the data node capacity in your cluster, you can [add more nodes](/deploy-manage/maintenance/add-and-remove-elasticsearch-nodes.md) to the cluster, or increase the disk capacity of existing nodes. Disk expansion procedures depend on your operating system and storage infrastructure and are outside the scope of Elastic support. In practice, this is often achieved by [removing a node from the cluster](https://www.elastic.co/search-labs/blog/elasticsearch-remove-node) and reinstalling it with a larger disk. | ||
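If you replace a node with one that has a larger disk, you can drain its shards first using the cluster-level allocation exclude setting; the node name `data-node-1` here is a hypothetical example:

```console
PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.exclude._name": "data-node-1"
  }
}
```

Once `GET _cat/allocation?v` shows no shards left on the excluded node, it can be shut down. After the replacement node joins, set the exclusion back to `null` so the node can receive shards again.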
|
|
||
| :::::: | ||
|
|
||
| ::::::{applies-item} { eck: } | ||
| To increase the disk capacity of data nodes in your {{eck}} cluster, you can either add more data nodes or increase the storage size of existing nodes. | ||
|
|
||
| **Option 1: Add more data nodes** | ||
|
|
||
| 1. Update the `count` field in your data node NodeSet to add more nodes: | ||
|
|
||
| ```yaml subs=true | ||
| apiVersion: elasticsearch.k8s.elastic.co/v1 | ||
| kind: Elasticsearch | ||
| metadata: | ||
| name: quickstart | ||
| spec: | ||
| version: {{version.stack}} | ||
| nodeSets: | ||
| - name: data-nodes | ||
| count: 5 # Increase from previous count | ||
| config: | ||
| node.roles: ["data"] | ||
| volumeClaimTemplates: | ||
| - metadata: | ||
| name: elasticsearch-data | ||
| spec: | ||
| accessModes: | ||
| - ReadWriteOnce | ||
| resources: | ||
| requests: | ||
| storage: 100Gi | ||
| ``` | ||
|
|
||
| 1. Apply the changes: | ||
|
|
||
| ```sh | ||
| kubectl apply -f your-elasticsearch-manifest.yaml | ||
| ``` | ||
|
|
||
| ECK automatically creates the new nodes, and {{es}} relocates shards to balance the load. You can monitor the progress using: ||
|
|
||
| ```console | ||
| GET /_cat/shards?v&h=state,node&s=state | ||
| ``` | ||
|
|
||
| If the response shows shards in the `RELOCATING` state, shards are still moving. Wait until all shards turn to `STARTED` or until the disk health indicator turns `green`. ||
| :::::: | ||
| **Option 2: Increase storage size of existing nodes** | ||
|
|
||
| 1. If your storage class supports [volume expansion](https://kubernetes.io/docs/concepts/storage/persistent-volumes/#expanding-persistent-volumes-claims), you can increase the storage size in the `volumeClaimTemplates`: | ||
|
|
||
| ```yaml subs=true | ||
| apiVersion: elasticsearch.k8s.elastic.co/v1 | ||
| kind: Elasticsearch | ||
| metadata: | ||
| name: quickstart | ||
| spec: | ||
| version: {{version.stack}} | ||
| nodeSets: | ||
| - name: data-nodes | ||
| count: 3 | ||
| config: | ||
| node.roles: ["data"] | ||
| volumeClaimTemplates: | ||
| - metadata: | ||
| name: elasticsearch-data | ||
| spec: | ||
| accessModes: | ||
| - ReadWriteOnce | ||
| resources: | ||
| requests: | ||
| storage: 200Gi # Increased from previous size | ||
| ``` | ||
|
|
||
| 1. Apply the changes. If the volume driver supports `ExpandInUsePersistentVolumes`, the filesystem will be resized online without restarting {{es}}. Otherwise, you may need to manually delete the Pods after the resize so they can be recreated with the expanded filesystem. | ||
|
Check notice on line 183 in troubleshoot/elasticsearch/increase-capacity-data-node.md
|
||
|
|
||
| ::::::: | ||
| For more information, refer to [](/deploy-manage/deploy/cloud-on-k8s/update-deployments.md) and [](/deploy-manage/deploy/cloud-on-k8s/volume-claim-templates.md). | ||
|
|
||
| :::::: | ||
| ::::::: | ||