Commit 524c1a3

Updated readme to use aws CLI and fixed paths. Updated spark_sql version (#233)
1 parent 0b81d2e commit 524c1a3

2 files changed: 27 additions & 9 deletions

emr-serverless-spark/README.md

Lines changed: 25 additions & 7 deletions
@@ -5,12 +5,28 @@ We will run a Java Spark job on EMR Serverless using a simple Java "Hello World"
 ## Prerequisites
 
 * LocalStack
-* `aws` CLI & `awslocal` script
+* `aws` CLI
 * Docker
 * Java and Maven
 
 ## Installation
 
+### Configuring a custom profile
+Configure a custom profile to use with LocalStack. Add the following profile to your AWS configuration file (by default, this file is at ~/.aws/config):
+```shell
+[profile localstack]
+region=us-east-1
+output=json
+endpoint_url = http://localhost:4566
+```
+
+Add the following profile to your AWS credentials file (by default, this file is at ~/.aws/credentials):
+```shell
+[localstack]
+aws_access_key_id=test
+aws_secret_access_key=test
+```
+
 Before creating the EMR Serverless job, we need to create a JAR file containing the Java code. We have the `java-demo-1.0.jar` file in the current directory. Alternatively, you can create the JAR file yourself by following the steps below.
 
 ```bash
@@ -21,14 +37,15 @@ mvn package
 Next, we need to create an S3 bucket to store the JAR file. To do this, run the following command:
 
 ```bash
+cd ..
 export S3_BUCKET=test
-awslocal s3 mb s3://$S3_BUCKET
+aws s3 mb s3://$S3_BUCKET
 ```
 
 You can now copy the JAR file from your current directory to the S3 bucket:
 
 ```bash
-awslocal s3 cp java-demo-1.0.jar s3://${S3_BUCKET}/code/java-spark/
+aws s3 cp hello-world/target/java-demo-1.0.jar s3://${S3_BUCKET}/code/java-spark/java-demo-1.0.jar
 ```
 
 ## Creating the EMR Serverless Job
@@ -42,7 +59,7 @@ export JOB_ROLE_ARN=arn:aws:iam::000000000000:role/emr-serverless-job-role
 We can now create an EMR Serverless application, which will run Spark 3.3.0. Run the following command:
 
 ```bash
-awslocal emr-serverless create-application \
+aws emr-serverless create-application \
 --type SPARK \
 --name serverless-java-demo \
 --release-label "emr-6.9.0" \
@@ -73,7 +90,7 @@ export APPLICATION_ID='<application-id>'
 Start the EMR Serverless application:
 
 ```shell
-awslocal emr-serverless start-application \
+aws emr-serverless start-application \
 --application-id $APPLICATION_ID
 ```
 
@@ -82,7 +99,7 @@ awslocal emr-serverless start-application \
 You can now run the EMR Serverless job:
 
 ```bash
-awslocal emr-serverless start-job-run \
+aws emr-serverless start-job-run \
 --application-id $APPLICATION_ID \
 --execution-role-arn $JOB_ROLE_ARN \
 --job-driver '{
@@ -103,6 +120,7 @@ awslocal emr-serverless start-job-run \
 The Spark logs will be written to the S3 bucket specified in the `logUri` parameter. You can stop the EMR Serverless application with the following command:
 
 ```bash
-awslocal emr-serverless stop-application \
+aws emr-serverless stop-application \
 --application-id $APPLICATION_ID
+
 ```
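
Note on using the new profile: the updated README commands call plain `aws`, so the `localstack` profile still has to be selected, for example through the `AWS_PROFILE` environment variable. The following is a minimal sketch and is not part of the commit. It writes the two profile fragments from the README diff into a scratch directory (`LS_AWS_DIR` is a hypothetical name) and points the CLI at them through `AWS_CONFIG_FILE` and `AWS_SHARED_CREDENTIALS_FILE`, so an existing `~/.aws` setup is left untouched. It assumes an AWS CLI release recent enough to honor `endpoint_url` in the config file.

```shell
# Sketch only: LS_AWS_DIR is a hypothetical scratch location, not from the commit.
LS_AWS_DIR="$(mktemp -d)"

# Same profile fragments as in the README diff.
cat > "$LS_AWS_DIR/config" <<'EOF'
[profile localstack]
region=us-east-1
output=json
endpoint_url = http://localhost:4566
EOF

cat > "$LS_AWS_DIR/credentials" <<'EOF'
[localstack]
aws_access_key_id=test
aws_secret_access_key=test
EOF

# Point the CLI at the scratch files and select the profile for this shell.
export AWS_CONFIG_FILE="$LS_AWS_DIR/config"
export AWS_SHARED_CREDENTIALS_FILE="$LS_AWS_DIR/credentials"
export AWS_PROFILE=localstack

# Optional sanity check; skipped when the aws CLI is not installed.
if command -v aws >/dev/null 2>&1; then
  aws configure list || echo "aws configure list failed" >&2
fi
```

With `AWS_PROFILE` exported, commands such as `aws s3 mb s3://$S3_BUCKET` should resolve to the LocalStack endpoint without an explicit `--profile` or `--endpoint-url` flag.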

emr-serverless-spark/hello-world/pom.xml

Lines changed: 2 additions & 2 deletions
@@ -8,8 +8,8 @@
 <dependencies>
 <dependency> <!-- Spark dependency -->
 <groupId>org.apache.spark</groupId>
-<artifactId>spark-sql_2.12</artifactId>
-<version>3.3.0</version>
+<artifactId>spark-sql_2.13</artifactId>
+<version>3.5.0</version>
 <scope>provided</scope>
 </dependency>
 </dependencies>
