[GV-22] Integrate usageAnalytics microservice into GV #40
base: main
Conversation
Set up for production (needed for security)
Remove untracked files
set up proper rate limiting
* chore: remove unused dependencies in api
* feat: update dockerfile for prod
* feat: add v1 version of server to track old server
* refactor: remove deprecated routes and functions from new server
* chore: remove unused packages from website
* chore: remove unused packages from web server
* feat: update web server for prod
* Add logic for redirecting to the error page
* Add errors for getEntry. Add error handling for getStudent. Fix bug for checking if an email is a student. The bug is due to not using the await keyword
* Fix security issue for printing out the invalid key in the logs.
* Revert home.js back - This redirect will be given to another ticket: GV-30
* Improve error handling in backend API calls
  - Added specific error handling for missing authorization headers (401)
  - Ensured proper HTTP status codes (404, 500) are returned for key API operations
  - Updated error logging for better debugging
* chore: return 4xx and 5xx errors for the API calls
* chore: Add KeyNotFoundError to grades API for 404 error handling
* chore: change 2 lines for styling purposes in progressQueryString
* fix: handle database errors, 4xx, and 5xx errors for several API endpoints
* fix: Change error switch case to use err.name. typeoferror does not identify the error name
* Fix code scanning alert no. 1: Use of externally-controlled format string
  Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
* Update students/index.js error message
* Fix code scanning alert no. 6: Use of externally-controlled format string
  Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
* Fix code scanning alert no. 4: Use of externally-controlled format string
  Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
* Manually merge changes from main
* fix: Security vulnerability for throwing errors regarding student email
* refactor(lint): reduce scope of local vars

---------
Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
Co-authored-by: Connor Bernard <connorbernard@berkeley.edu>
* feat: add endpoint for accepting file uploads
* feat: add support for indexing concept map schema
* Fix code scanning alert no. 8: Missing rate limiting
  Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
* fix(sec): limit scope of unsync to uploads folder

---------
Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
* refactor(structure): move errors to lib
* feat: add native logging
* docs: add doc comment to morgan
* fix: ensure current internal proxying doesn't throw errors
* feat: add dev config for live updates
* Added label to profile button
* GV-42:Ensure accessibility standards are met
* GV-42/ensure-accessibility-standards-are-met
* GV-42/ensure-accessibility-standards-are-met
* GV-42
…d list as the selected student (#20)
* Add schema validation for JSON input to CM
* Update requirements.txt
* Raise error when validation fails
* Add functionality to prevent injection and DDOS attacks
* Add functionality to prevent injection and DDOS targeting input text file
* Add os to requirements.txt
* Rough draft of index.test.js. Need to look through and check that everything works, plus need to add to package.json
* Set up dependencies, final step recheck index.test.js
* Changes so far, going to hard reset to see if i can get dependencies to work
* Changes that Paulina helped me with, to be revisited if my current approach doesnt work
* Very very close. Need to amend the tests in index.test.js, but dependencies have been fixed
* Almost done. Fix 4th test case
* Complete index.test.js with wanted tests
* Fix: readded module:type line to package.json
* Sync CM config with Fa24 HAID and add SID of duplicate of Fa24 HAID
* Update parser to handle commas, colons, etc within node names + remove sp. character from wordle-lite node name
* Remove temporary comment
* Restore original formatting of student and class levels
* Update Berkeley_CS10.txt
* Allow all characters other than left bracket in node names
* Revert to old dev sheet ID
…30)
* Process post request with mastery learning info
* Restore index method to original state
* Generate CM from post parameters
* Access nested student and class mastery keys
* Bug fix + return correct error code
* Fix validation bug + load constants from file
* Attempt to POST CM through the iframe + transfer CM-200 changes
* Implement api endpoint to return mastery mapping and post to CM in iframe
* Make dedicated router for mastery mapping + cleanup
* Formatting changes
* Formatting changes
* Formatting changes
* Formatting change
* Misc. changes + optimize getTopicsFromUser
* Rename variables per convention
  Co-authored-by: Connor <connorbernard@berkeley.edu>
* Nest async function within useEffect
* Minor change
* Use axios to retrieve CM html
* Remove error handling
* Mark deprecated functions as such

---------
Co-authored-by: Connor <connorbernard@berkeley.edu>
* Updated Grading Breakdown Segment
* Updated Buckets segment of Buckets

* feat: update to grid2 stable
* chore: update outdated dependencies

* feat: update cm to SP25
* fix: revert style update

* Added project 3 to CM
* Added midterm, which was previously not added
* Added Fractal portion of Midterm
| GitGuardian id | GitGuardian status | Secret | Commit | Filename | |
|---|---|---|---|---|---|
| 14519307 | Triggered | Google Cloud Keys | be17226 | api/auth/service_account.json | View secret |
🛠 Guidelines to remediate hardcoded secrets
- Understand the implications of revoking this secret by investigating where it is used in your code.
- Replace and store your secret safely, following best practices for secrets management.
- Revoke and rotate this secret.
- If possible, rewrite git history. Rewriting git history is not a trivial act. You might completely break other contributing developers' workflow and you risk accidentally deleting legitimate data.
To avoid such incidents in the future, consider:
- following these best practices for managing and storing secrets, including API keys and other credentials;
- installing secret detection on pre-commit to catch secrets before they leave your machine and ease remediation.
🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.
```python
overall, average, graph_file = log_analytics.analyze_csv_logs(filepath, title)

# Read the graph file
with open(graph_file, "rb") as img_file:
```
Check failure: Code scanning / CodeQL
Uncontrolled data used in path expression (High). The path depends on a user-provided value.
Copilot Autofix (AI, 9 months ago):
To fix the problem, we need to ensure that the constructed file paths are contained within the intended directories. This can be achieved by normalizing the paths and verifying that they start with the expected base directories. We will use os.path.normpath to normalize the paths and then check if they start with CSV_DIR or GRAPH_DIR.
```diff
@@ -83,3 +83,5 @@
 filename = f"{base_filename}_{timestamp}.csv"
-filepath = os.path.join(CSV_DIR, filename)
+filepath = os.path.normpath(os.path.join(CSV_DIR, filename))
+if not filepath.startswith(CSV_DIR):
+    raise Exception("Invalid file path")
@@ -93,2 +95,5 @@
 overall, average, graph_file = log_analytics.analyze_csv_logs(filepath, title)
+graph_file = os.path.normpath(graph_file)
+if not graph_file.startswith(GRAPH_DIR):
+    raise Exception("Invalid graph file path")
```
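A note on this suggested fix: a bare `startswith` prefix check can be fooled by a sibling directory that shares the prefix (e.g. `/data/csv` vs `/data/csv_evil`). A stricter containment test compares whole path components via `os.path.commonpath`; this is only a sketch, and the directory names are hypothetical:

```python
import os

def is_contained(base_dir: str, untrusted_path: str) -> bool:
    """Return True only if untrusted_path resolves inside base_dir."""
    base = os.path.abspath(base_dir)
    target = os.path.abspath(os.path.join(base, untrusted_path))
    # commonpath compares whole path components, so "/data/csv_evil"
    # is not treated as being inside "/data/csv".
    return os.path.commonpath([base, target]) == base

print(is_contained("/data/csv", "report.csv"))         # True
print(is_contained("/data/csv", "../csv_evil/x.csv"))  # False
```

The same idea would apply to the GRAPH_DIR check in the suggestion above.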
```python
filepath, _, _, error_msg, status_code = process_log_data(log_type, email)

if status_code != 200:
    return jsonify({"error": error_msg}), status_code
```
Check warning: Code scanning / CodeQL
Reflected server-side cross-site scripting (Medium). Cross-site scripting vulnerability due to a user-provided value.
Copilot Autofix (AI, 9 months ago):
To fix the problem, we need to ensure that any user-provided input included in the JSON response is properly sanitized or escaped. In this case, we can use the flask.escape function to escape the log_type before including it in the error message. This will prevent any malicious scripts from being executed if the JSON response is rendered in an unsafe manner by a client.
```diff
@@ -78,3 +78,4 @@
 else:
-    return None, None, None, f"Unknown log type: {log_type}", 400
+    from flask import escape
+    return None, None, None, f"Unknown log type: {escape(log_type)}", 400
```
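One caveat worth flagging on this suggestion: `flask.escape` was removed in Flask 2.3; the direct replacement is `markupsafe.escape`. For a dependency-free illustration of the same escaping idea, the standard library's `html.escape` behaves similarly (the `log_type` value below is simulated input, not from this codebase):

```python
from html import escape  # stdlib; escapes &, <, > and (by default) quotes

log_type = "<script>alert(1)</script>"  # simulated user-controlled input
message = f"Unknown log type: {escape(log_type)}"
print(message)  # Unknown log type: &lt;script&gt;alert(1)&lt;/script&gt;
```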
```python
if status_code != 200:
    return jsonify({"error": error_msg}), status_code

return send_file(filepath, as_attachment=True)
```
Check failure: Code scanning / CodeQL
Uncontrolled data used in path expression (High). This path depends on a user-provided value.
Copilot Autofix (AI, 9 months ago):
To fix the problem, we need to validate the log_type parameter to ensure it only contains allowed values. This can be achieved by defining a list of allowed log types and checking if the provided log_type is in this list before proceeding with file operations. This approach ensures that only valid log types are processed, preventing any potential path traversal attacks.
```diff
@@ -45,4 +45,9 @@
+ALLOWED_LOG_TYPES = ["login", "email", "api-login", "student-grades", "specific-student", "bins"]
+
 def process_log_data(log_type, email=None, days=30):
     """Common function to process log data for different endpoints"""
+    if log_type not in ALLOWED_LOG_TYPES:
+        return None, None, None, f"Unknown log type: {log_type}", 400
+
     ts_filter = get_timestamp_filter(days)
@@ -77,4 +82,2 @@
         base_filename = "bins_access"
-    else:
-        return None, None, None, f"Unknown log type: {log_type}", 400
```
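Since the autofix leaves the original if/elif chain that maps each log type to a filename, an alternative worth considering is a single lookup table that validates the input and resolves the base filename in one step. This is only a sketch: the mapping values are hypothetical except for `bins_access`, which appears in the suggestion above.

```python
# Hypothetical mapping from log type to CSV base filename; the real
# values live in the service's if/elif chain.
LOG_TYPE_FILENAMES = {
    "login": "login_access",
    "email": "email_access",
    "bins": "bins_access",
}

def resolve_base_filename(log_type):
    """Return the base filename, or None for unknown/unsafe log types."""
    return LOG_TYPE_FILENAMES.get(log_type)

print(resolve_base_filename("bins"))           # bins_access
print(resolve_base_filename("../etc/passwd"))  # None
```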
```python
file_path = os.path.join(CSV_DIR, file_name)
log_analytics.write_entries_to_csv(entries, file_path)
cleanup_logs(base_filename)
return send_file(file_path, as_attachment=True)
```
Check failure: Code scanning / CodeQL
Uncontrolled data used in path expression (High). The path depends on a user-provided value.
Copilot Autofix (AI, 9 months ago):
To fix the problem, we need to ensure that the constructed file path is contained within a safe root folder. We can achieve this by normalizing the path using os.path.normpath and then checking that the normalized path starts with the root folder (CSV_DIR). This will prevent any path traversal attacks by ensuring that the file path remains within the intended directory.
```diff
@@ -170,3 +170,5 @@
 file_name = f'{base_filename}_{datetime.now().strftime("%Y%m%d_%H%M%S")}.csv'
-file_path = os.path.join(CSV_DIR, file_name)
+file_path = os.path.normpath(os.path.join(CSV_DIR, file_name))
+if not file_path.startswith(CSV_DIR):
+    return jsonify({"error": "Invalid file path"}), 400
 log_analytics.write_entries_to_csv(entries, file_path)
```
```python
_, overall, average, graph_base64, status_code = process_log_data(log_type, email)

if status_code != 200:
    return jsonify({"error": graph_base64}), status_code
```
Check warning: Code scanning / CodeQL
Reflected server-side cross-site scripting (Medium). Cross-site scripting vulnerability due to a user-provided value.
Copilot Autofix (AI, 9 months ago):
To fix the problem, we need to ensure that any user-provided input, such as the log_type parameter, is properly sanitized before being included in the response. The best way to fix this issue is to use the escape function from the flask module to sanitize the log_type value before including it in the error message.
We will modify the code in the get_analytics function to escape the log_type value before including it in the error message.
```diff
@@ -194,3 +194,4 @@
 if status_code != 200:
-    return jsonify({"error": graph_base64}), status_code
+    from flask import escape
+    return jsonify({"error": escape(graph_base64)}), status_code
```
```python
overall, average, graph_file = log_analytics.analyze_csv_logs(csv_file_path, f"Access Density for {email}")

with open(graph_file, "rb") as img_file:
```
Check failure: Code scanning / CodeQL
Uncontrolled data used in path expression (High). The path depends on a user-provided value.
Copilot Autofix (AI, 9 months ago):
To fix the problem, we need to ensure that the constructed file paths are contained within the intended directories. This can be achieved by normalizing the paths and verifying that they start with the expected base directory. We will use os.path.normpath to normalize the paths and then check if they start with CSV_DIR or GRAPH_DIR.
```diff
@@ -281,3 +281,5 @@
 csv_file_name = f'{base_filename}_{datetime.now().strftime("%Y%m%d_%H%M%S")}.csv'
-csv_file_path = os.path.join(CSV_DIR, csv_file_name)
+csv_file_path = os.path.normpath(os.path.join(CSV_DIR, csv_file_name))
+if not csv_file_path.startswith(CSV_DIR):
+    raise Exception("Invalid file path")
 log_analytics.write_entries_to_csv(entries, csv_file_path)
@@ -286,2 +288,5 @@
 overall, average, graph_file = log_analytics.analyze_csv_logs(csv_file_path, f"Access Density for {email}")
+graph_file = os.path.normpath(graph_file)
+if not graph_file.startswith(GRAPH_DIR):
+    raise Exception("Invalid file path")
```
```python
return render_template("analytics_result.html",
                       title="Error",
                       overall_accesses="",
                       average_accesses="",
                       graph_image=""), status_code
```
Check warning: Code scanning / CodeQL
Reflected server-side cross-site scripting (Medium). Cross-site scripting vulnerabilities due to user-provided values.
Copilot Autofix (AI, 9 months ago):
To fix the problem, we need to ensure that the log_type parameter is properly validated and sanitized before being used in the render_template function. The best way to fix this issue is to use the escape function from the flask module to sanitize the log_type parameter before including it in the template. This will prevent any malicious scripts from being executed.
```diff
@@ -3,3 +3,3 @@
-from flask import Flask, send_file, request, jsonify, render_template
+from flask import Flask, send_file, request, jsonify, render_template, escape
 from datetime import datetime, timedelta
@@ -309,3 +309,3 @@
-title = f"Student: {email}" if log_type == "specific-student" else log_type.replace("-", " ").title()
+title = f"Student: {email}" if log_type == "specific-student" else escape(log_type.replace("-", " ").title())
```
```python
if __name__ == "__main__":
    app.run(host="0.0.0.0", port=4000, debug=True)
```
Check failure: Code scanning / CodeQL
Flask app is run in debug mode (High).
Copilot Autofix (AI, 9 months ago):
To fix the problem, we need to ensure that the Flask application does not run in debug mode in a production environment. The best way to achieve this is by using environment variables to control the debug mode. This way, we can set the debug mode to True during development and False during production without changing the code.
We will modify the app.run method to check an environment variable (e.g., FLASK_DEBUG) to determine whether to enable debug mode. This approach maintains the existing functionality while enhancing security.
```diff
@@ -319,2 +319,3 @@
 if __name__ == "__main__":
-    app.run(host="0.0.0.0", port=4000, debug=True)
+    debug_mode = os.getenv("FLASK_DEBUG", "False").lower() in ["true", "1", "t"]
+    app.run(host="0.0.0.0", port=4000, debug=debug_mode)
```
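The suggested FLASK_DEBUG parsing is easy to exercise in isolation; a quick sketch (the helper name and the dict-based env argument are illustrative only, the FLASK_DEBUG variable name follows the suggestion above):

```python
import os

def debug_enabled(env=os.environ):
    """Treat 'true', '1', or 't' (any case) as enabling debug mode."""
    return env.get("FLASK_DEBUG", "False").lower() in ["true", "1", "t"]

print(debug_enabled({"FLASK_DEBUG": "1"}))  # True
print(debug_enabled({}))                    # False (safe default)
```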
```python
def write_entries_to_csv(entries, output_csv):
    with open(output_csv, 'w', newline='') as csvfile:
```
Check failure: Code scanning / CodeQL
Uncontrolled data used in path expression (High). The path depends on a user-provided value.
Copilot Autofix (AI, 9 months ago):
To fix the problem, we need to ensure that the constructed file path is validated before it is used to access the file system. We can achieve this by normalizing the path and ensuring it is contained within the intended base directory. This approach will prevent path traversal attacks by ensuring that the file path does not escape the base directory.
- Normalize the constructed file path using os.path.normpath.
- Check that the normalized path starts with the base directory (CSV_DIR).
- Raise an exception or return an error response if the path validation fails.
```diff
@@ -133,3 +133,5 @@
 file_name = f'{base_filename}_{datetime.now().strftime("%Y%m%d_%H%M%S")}.csv'
-file_path = os.path.join(CSV_DIR, file_name)
+file_path = os.path.normpath(os.path.join(CSV_DIR, file_name))
+if not file_path.startswith(CSV_DIR):
+    return jsonify({"error": "Invalid file path"}), 400
 log_analytics.write_entries_to_csv(entries, file_path)
@@ -144,3 +146,5 @@
 file_name = f'{base_filename}_{datetime.now().strftime("%Y%m%d_%H%M%S")}.csv'
-file_path = os.path.join(CSV_DIR, file_name)
+file_path = os.path.normpath(os.path.join(CSV_DIR, file_name))
+if not file_path.startswith(CSV_DIR):
+    return jsonify({"error": "Invalid file path"}), 400
 log_analytics.write_entries_to_csv(entries, file_path)
@@ -155,3 +159,5 @@
 file_name = f'{base_filename}_{datetime.now().strftime("%Y%m%d_%H%M%S")}.csv'
-file_path = os.path.join(CSV_DIR, file_name)
+file_path = os.path.normpath(os.path.join(CSV_DIR, file_name))
+if not file_path.startswith(CSV_DIR):
+    return jsonify({"error": "Invalid file path"}), 400
 log_analytics.write_entries_to_csv(entries, file_path)
@@ -170,3 +176,5 @@
 file_name = f'{base_filename}_{datetime.now().strftime("%Y%m%d_%H%M%S")}.csv'
-file_path = os.path.join(CSV_DIR, file_name)
+file_path = os.path.normpath(os.path.join(CSV_DIR, file_name))
+if not file_path.startswith(CSV_DIR):
+    return jsonify({"error": "Invalid file path"}), 400
 log_analytics.write_entries_to_csv(entries, file_path)
@@ -181,3 +189,5 @@
 file_name = f'{base_filename}_{datetime.now().strftime("%Y%m%d_%H%M%S")}.csv'
-file_path = os.path.join(CSV_DIR, file_name)
+file_path = os.path.normpath(os.path.join(CSV_DIR, file_name))
+if not file_path.startswith(CSV_DIR):
+    return jsonify({"error": "Invalid file path"}), 400
 log_analytics.write_entries_to_csv(entries, file_path)
```
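Since this suggestion repeats the same three-line normpath/startswith check five times, a small helper could centralize it. This is only a sketch: safe_csv_path is a hypothetical name, CSV_DIR here is a placeholder, and the startswith check is tightened with os.sep to avoid prefix collisions such as /tmp/csv vs /tmp/csv_evil.

```python
import os

CSV_DIR = "/tmp/csv"  # placeholder; the real service defines its own CSV_DIR

def safe_csv_path(file_name):
    """Join file_name onto CSV_DIR, rejecting anything that escapes it."""
    file_path = os.path.normpath(os.path.join(CSV_DIR, file_name))
    # Appending os.sep ensures "/tmp/csv_evil/x.csv" is not accepted
    # just because it shares the "/tmp/csv" string prefix.
    if not file_path.startswith(CSV_DIR + os.sep):
        raise ValueError("Invalid file path")
    return file_path

print(safe_csv_path("report.csv"))  # /tmp/csv/report.csv
```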
Jira Ticket
Description
Create a new router and connect it to the existing schema. The added feature is usageAnalytics: a new tab was added and the Docker configuration updated.
Test the full integration with GradeView.
Type of Change
Changes
Edited website, docker-compose, and api/v2.
Testing
Tested by running locally on different laptops.
Checklist
- Branch name follows <ticket-id>/<brief-description-of-change>
- PR title follows [<ticket-id>] <brief-description-of-change>

Screenshots/Video
Additional Notes