[GV-22] Integrate usageAnalytics microservice into GV #40
base: main
Conversation
Set up for production (needed for security)
Remove untracked files
set up proper rate limiting
* chore: remove unused dependencies in api
* feat: update dockerfile for prod
* feat: add v1 version of server to track old server
* refactor: remove deprecated routes and functions from new server
* chore: remove unused packages from website
* chore: remove unused packages from web server
* feat: update web server for prod
* Add logic for redirecting to the error page
* Add errors for getEntry. Add error handling for getStudent. Fix bug for checking if an email is a student. The bug is due to not using the await keyword
* Fix security issue for printing out the invalid key in the logs.
* Revert home.js back - This redirect will be given to another ticket: GV-30
* Improve error handling in backend API calls
  - Added specific error handling for missing authorization headers (401)
  - Ensured proper HTTP status codes (404, 500) are returned for key API operations
  - Updated error logging for better debugging
* chore: return 4xx and 5xx errors for the API calls
* chore: Add KeyNotFoundError to grades API for 404 error handling
* chore: change 2 lines for styling purposes in progressQueryString
* fix: handle database errors, 4xx, and 5xx errors for several API endpoints
* fix: Change error switch case to use err.name. typeoferror does not identify the error name
* Fix code scanning alert no. 1: Use of externally-controlled format string
  Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
* Update students/index.js error message
* Fix code scanning alert no. 6: Use of externally-controlled format string
  Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
* Fix code scanning alert no. 4: Use of externally-controlled format string
  Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
* Manually merge changes from main
* fix: Security vulnerability for throwing errors regarding student email
* refactor(lint): reduce scope of local vars

---------
Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
Co-authored-by: Connor Bernard <connorbernard@berkeley.edu>
* feat: add endpoint for accepting file uploads
* feat: add support for indexing concept map schema
* Fix code scanning alert no. 8: Missing rate limiting
  Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
* fix(sec): limit scope of unsync to uploads folder

---------
Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
* refactor(structure): move errors to lib
* feat: add native logging
* docs: add doc comment to morgan
* fix: ensure current internal proxying doesn't throw errors
* feat: add dev config for live updates
* Added label to profile button
* GV-42:Ensure accessibility standards are met
* GV-42/ensure-accessibility-standards-are-met
* GV-42/ensure-accessibility-standards-are-met
* GV-42
…d list as the selected student (#20)
* Add schema validation for JSON input to CM
* Update requirements.txt
* Raise error when validation fails
* Add functionality to prevent injection and DDOS attacks
* Add functionality to prevent injection and DDOS targeting input text file
* Add os to requirements.txt
* Rough draft of index.test.js. Need to look through and check that everything works, plus need to add to package.json
* Set up dependencies, final step recheck index.test.js
* Changes so far, going to hard reset to see if i can get dependencies to work
* Changes that Paulina helped me with, to be revisited if my current approach doesnt work
* Very very close. Need to amend the tests in index.test.js, but dependencies have been fixed
* Almost done. Fix 4th test case
* Complete index.test.js with wanted tests
* Fix: readded module:type line to package.json
* Sync CM config with Fa24 HAID and add SID of duplicate of Fa24 HAID
* Update parser to handle commas, colons, etc within node names + remove sp. character from wordle-lite node name
* Remove temporary comment
* Restore original formatting of student and class levels
* Update Berkeley_CS10.txt
* Allow all characters other than left bracket in node names
* Revert to old dev sheet ID
…30)
* Process post request with mastery learning info
* Restore index method to original state
* Generate CM from post parameters
* Access nested student and class mastery keys
* Bug fix + return correct error code
* Fix validation bug + load constants from file
* Attempt to POST CM through the iframe + transfer CM-200 changes
* Implement api endpoint to return mastery mapping and post to CM in iframe
* Make dedicated router for mastery mapping + cleanup
* Formatting changes
* Formatting changes
* Formatting changes
* Formatting change
* Misc. changes + optimize getTopicsFromUser
* Rename variables per convention
  Co-authored-by: Connor <connorbernard@berkeley.edu>
* Nest async function within useEffect
* Minor change
* Use axios to retrieve CM html
* Remove error handling
* Mark deprecated functions as such

---------
Co-authored-by: Connor <connorbernard@berkeley.edu>
* Updated Grading Breakdown Segment
* Updated Buckets segment of Buckets

* feat: update to grid2 stable
* chore: update outdated dependencies

* feat: update cm to SP25
* fix: revert style update

* Added project 3 to CM
* Added midterm, which was previously not added
* Added Fractal portion of Midterm
| GitGuardian id | GitGuardian status | Secret | Commit | Filename | |
|---|---|---|---|---|---|
| 14519307 | Triggered | Google Cloud Keys | be17226 | api/auth/service_account.json | View secret |
🛠 Guidelines to remediate hardcoded secrets
- Understand the implications of revoking this secret by investigating where it is used in your code.
- Replace and store your secret safely, following best practices for secrets management.
- Revoke and rotate this secret.
- If possible, rewrite git history. Rewriting git history is not a trivial act. You might completely break other contributing developers' workflow and you risk accidentally deleting legitimate data.
To avoid such incidents in the future, consider:
- following these best practices for managing and storing secrets, including API keys and other credentials;
- installing secret detection on pre-commit to catch secrets before they leave your machine and ease remediation.
🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.
```python
overall, average, graph_file = log_analytics.analyze_csv_logs(filepath, title)

# Read the graph file
with open(graph_file, "rb") as img_file:
```
Check failure: Code scanning / CodeQL
Uncontrolled data used in path expression (High). The path depends on a user-provided value.
Copilot Autofix (AI, 9 months ago):
To fix the problem, we need to ensure that the constructed file paths are contained within the intended directories. This can be achieved by normalizing the paths and verifying that they start with the expected base directories. We will use os.path.normpath to normalize the paths and then check if they start with CSV_DIR or GRAPH_DIR.
```diff
@@ -83,3 +83,5 @@
 filename = f"{base_filename}_{timestamp}.csv"
-filepath = os.path.join(CSV_DIR, filename)
+filepath = os.path.normpath(os.path.join(CSV_DIR, filename))
+if not filepath.startswith(CSV_DIR):
+    raise Exception("Invalid file path")
@@ -93,2 +95,5 @@
 overall, average, graph_file = log_analytics.analyze_csv_logs(filepath, title)
+graph_file = os.path.normpath(graph_file)
+if not graph_file.startswith(GRAPH_DIR):
+    raise Exception("Invalid graph file path")
```
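A note on this suggested fix: a bare `startswith` prefix check can be fooled by a sibling directory that shares the prefix (e.g. `/data/csv` vs `/data/csv_evil`). A stricter containment test compares whole path components via `os.path.commonpath`; this is only a sketch, and the directory names are hypothetical:

```python
import os

def is_contained(base_dir: str, untrusted_path: str) -> bool:
    """Return True only if untrusted_path resolves inside base_dir."""
    base = os.path.abspath(base_dir)
    target = os.path.abspath(os.path.join(base, untrusted_path))
    # commonpath compares whole path components, so "/data/csv_evil"
    # is not treated as being inside "/data/csv".
    return os.path.commonpath([base, target]) == base

print(is_contained("/data/csv", "report.csv"))         # True
print(is_contained("/data/csv", "../csv_evil/x.csv"))  # False
```

The same idea would apply to the GRAPH_DIR check in the suggestion above.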
```python
filepath, _, _, error_msg, status_code = process_log_data(log_type, email)

if status_code != 200:
    return jsonify({"error": error_msg}), status_code
```
Check warning: Code scanning / CodeQL
Reflected server-side cross-site scripting (Medium). Cross-site scripting vulnerability due to a user-provided value.
Copilot Autofix (AI, 9 months ago):
To fix the problem, we need to ensure that any user-provided input included in the JSON response is properly sanitized or escaped. In this case, we can use the flask.escape function to escape the log_type before including it in the error message. This will prevent any malicious scripts from being executed if the JSON response is rendered in an unsafe manner by a client.
```diff
@@ -78,3 +78,4 @@
 else:
-    return None, None, None, f"Unknown log type: {log_type}", 400
+    from flask import escape
+    return None, None, None, f"Unknown log type: {escape(log_type)}", 400
```
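One caveat worth flagging on this suggestion: `flask.escape` was removed in Flask 2.3; the direct replacement is `markupsafe.escape`. For a dependency-free illustration of the same escaping idea, the standard library's `html.escape` behaves similarly (the `log_type` value below is simulated input, not from this codebase):

```python
from html import escape  # stdlib; escapes &, <, > and (by default) quotes

log_type = "<script>alert(1)</script>"  # simulated user-controlled input
message = f"Unknown log type: {escape(log_type)}"
print(message)  # Unknown log type: &lt;script&gt;alert(1)&lt;/script&gt;
```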
```python
if status_code != 200:
    return jsonify({"error": error_msg}), status_code

return send_file(filepath, as_attachment=True)
```
Check failure: Code scanning / CodeQL
Uncontrolled data used in path expression (High). This path depends on a user-provided value.
Copilot Autofix (AI, 9 months ago):
To fix the problem, we need to validate the log_type parameter to ensure it only contains allowed values. This can be achieved by defining a list of allowed log types and checking if the provided log_type is in this list before proceeding with file operations. This approach ensures that only valid log types are processed, preventing any potential path traversal attacks.
```diff
@@ -45,4 +45,9 @@
+ALLOWED_LOG_TYPES = ["login", "email", "api-login", "student-grades", "specific-student", "bins"]
+
 def process_log_data(log_type, email=None, days=30):
     """Common function to process log data for different endpoints"""
+    if log_type not in ALLOWED_LOG_TYPES:
+        return None, None, None, f"Unknown log type: {log_type}", 400
+
     ts_filter = get_timestamp_filter(days)
@@ -77,4 +82,2 @@
         base_filename = "bins_access"
-    else:
-        return None, None, None, f"Unknown log type: {log_type}", 400
```
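Since the autofix leaves the original if/elif chain that maps each log type to a filename, an alternative worth considering is a single lookup table that validates the input and resolves the base filename in one step. This is only a sketch: the mapping values are hypothetical except for `bins_access`, which appears in the suggestion above.

```python
# Hypothetical mapping from log type to CSV base filename; the real
# values live in the service's if/elif chain.
LOG_TYPE_FILENAMES = {
    "login": "login_access",
    "email": "email_access",
    "bins": "bins_access",
}

def resolve_base_filename(log_type):
    """Return the base filename, or None for unknown/unsafe log types."""
    return LOG_TYPE_FILENAMES.get(log_type)

print(resolve_base_filename("bins"))           # bins_access
print(resolve_base_filename("../etc/passwd"))  # None
```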
```python
file_path = os.path.join(CSV_DIR, file_name)
log_analytics.write_entries_to_csv(entries, file_path)
cleanup_logs(base_filename)
return send_file(file_path, as_attachment=True)
```
Check failure: Code scanning / CodeQL
Uncontrolled data used in path expression (High). The path depends on a user-provided value.
Copilot Autofix (AI, 9 months ago):
To fix the problem, we need to ensure that the constructed file path is contained within a safe root folder. We can achieve this by normalizing the path using os.path.normpath and then checking that the normalized path starts with the root folder (CSV_DIR). This will prevent any path traversal attacks by ensuring that the file path remains within the intended directory.
```diff
@@ -170,3 +170,5 @@
 file_name = f'{base_filename}_{datetime.now().strftime("%Y%m%d_%H%M%S")}.csv'
-file_path = os.path.join(CSV_DIR, file_name)
+file_path = os.path.normpath(os.path.join(CSV_DIR, file_name))
+if not file_path.startswith(CSV_DIR):
+    return jsonify({"error": "Invalid file path"}), 400
 log_analytics.write_entries_to_csv(entries, file_path)
```
```python
_, overall, average, graph_base64, status_code = process_log_data(log_type, email)

if status_code != 200:
    return jsonify({"error": graph_base64}), status_code
```
Check warning: Code scanning / CodeQL
Reflected server-side cross-site scripting (Medium). Cross-site scripting vulnerability due to a user-provided value.
Copilot Autofix (AI, 9 months ago):
To fix the problem, we need to ensure that any user-provided input, such as the log_type parameter, is properly sanitized before being included in the response. The best way to fix this issue is to use the escape function from the flask module to sanitize the log_type value before including it in the error message.
We will modify the code in the get_analytics function to escape the log_type value before including it in the error message.
```diff
@@ -194,3 +194,4 @@
 if status_code != 200:
-    return jsonify({"error": graph_base64}), status_code
+    from flask import escape
+    return jsonify({"error": escape(graph_base64)}), status_code
```
```python
overall, average, graph_file = log_analytics.analyze_csv_logs(csv_file_path, f"Access Density for {email}")

with open(graph_file, "rb") as img_file:
```
Check failure: Code scanning / CodeQL
Uncontrolled data used in path expression (High). The path depends on a user-provided value.
Copilot Autofix (AI, 9 months ago):
To fix the problem, we need to ensure that the constructed file paths are contained within the intended directories. This can be achieved by normalizing the paths and verifying that they start with the expected base directory. We will use os.path.normpath to normalize the paths and then check if they start with CSV_DIR or GRAPH_DIR.
```diff
@@ -281,3 +281,5 @@
 csv_file_name = f'{base_filename}_{datetime.now().strftime("%Y%m%d_%H%M%S")}.csv'
-csv_file_path = os.path.join(CSV_DIR, csv_file_name)
+csv_file_path = os.path.normpath(os.path.join(CSV_DIR, csv_file_name))
+if not csv_file_path.startswith(CSV_DIR):
+    raise Exception("Invalid file path")
 log_analytics.write_entries_to_csv(entries, csv_file_path)
@@ -286,2 +288,5 @@
 overall, average, graph_file = log_analytics.analyze_csv_logs(csv_file_path, f"Access Density for {email}")
+graph_file = os.path.normpath(graph_file)
+if not graph_file.startswith(GRAPH_DIR):
+    raise Exception("Invalid file path")
```
```python
return render_template("analytics_result.html",
                       title="Error",
                       overall_accesses="",
                       average_accesses="",
                       graph_image=""), status_code
```
Check warning: Code scanning / CodeQL
Reflected server-side cross-site scripting (Medium). Cross-site scripting vulnerabilities due to user-provided values.
Copilot Autofix (AI, 9 months ago):
To fix the problem, we need to ensure that the log_type parameter is properly validated and sanitized before being used in the render_template function. The best way to fix this issue is to use the escape function from the flask module to sanitize the log_type parameter before including it in the template. This will prevent any malicious scripts from being executed.
```diff
@@ -3,3 +3,3 @@
-from flask import Flask, send_file, request, jsonify, render_template
+from flask import Flask, send_file, request, jsonify, render_template, escape
 from datetime import datetime, timedelta
@@ -309,3 +309,3 @@
-title = f"Student: {email}" if log_type == "specific-student" else log_type.replace("-", " ").title()
+title = f"Student: {email}" if log_type == "specific-student" else escape(log_type.replace("-", " ").title())
```
```python
if __name__ == "__main__":
    app.run(host="0.0.0.0", port=4000, debug=True)
```
Check failure: Code scanning / CodeQL
Flask app is run in debug mode (High).
Copilot Autofix (AI, 9 months ago):
To fix the problem, we need to ensure that the Flask application does not run in debug mode in a production environment. The best way to achieve this is by using environment variables to control the debug mode. This way, we can set the debug mode to True during development and False during production without changing the code.
We will modify the app.run method to check an environment variable (e.g., FLASK_DEBUG) to determine whether to enable debug mode. This approach maintains the existing functionality while enhancing security.
```diff
@@ -319,2 +319,3 @@
 if __name__ == "__main__":
-    app.run(host="0.0.0.0", port=4000, debug=True)
+    debug_mode = os.getenv("FLASK_DEBUG", "False").lower() in ["true", "1", "t"]
+    app.run(host="0.0.0.0", port=4000, debug=debug_mode)
```
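The suggested FLASK_DEBUG parsing is easy to exercise in isolation; a quick sketch (the helper name and the dict-based env argument are illustrative only, the FLASK_DEBUG variable name follows the suggestion above):

```python
import os

def debug_enabled(env=os.environ):
    """Treat 'true', '1', or 't' (any case) as enabling debug mode."""
    return env.get("FLASK_DEBUG", "False").lower() in ["true", "1", "t"]

print(debug_enabled({"FLASK_DEBUG": "1"}))  # True
print(debug_enabled({}))                    # False (safe default)
```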
```python
def write_entries_to_csv(entries, output_csv):
    with open(output_csv, 'w', newline='') as csvfile:
```
Check failure: Code scanning / CodeQL
Uncontrolled data used in path expression (High). The path depends on a user-provided value.
Copilot Autofix (AI, 9 months ago):
To fix the problem, we need to ensure that the constructed file path is validated before it is used to access the file system. We can achieve this by normalizing the path and ensuring it is contained within the intended base directory. This approach will prevent path traversal attacks by ensuring that the file path does not escape the base directory.
- Normalize the constructed file path using os.path.normpath.
- Check that the normalized path starts with the base directory (CSV_DIR).
- Raise an exception or return an error response if the path validation fails.
```diff
@@ -133,3 +133,5 @@
 file_name = f'{base_filename}_{datetime.now().strftime("%Y%m%d_%H%M%S")}.csv'
-file_path = os.path.join(CSV_DIR, file_name)
+file_path = os.path.normpath(os.path.join(CSV_DIR, file_name))
+if not file_path.startswith(CSV_DIR):
+    return jsonify({"error": "Invalid file path"}), 400
 log_analytics.write_entries_to_csv(entries, file_path)
@@ -144,3 +146,5 @@
 file_name = f'{base_filename}_{datetime.now().strftime("%Y%m%d_%H%M%S")}.csv'
-file_path = os.path.join(CSV_DIR, file_name)
+file_path = os.path.normpath(os.path.join(CSV_DIR, file_name))
+if not file_path.startswith(CSV_DIR):
+    return jsonify({"error": "Invalid file path"}), 400
 log_analytics.write_entries_to_csv(entries, file_path)
@@ -155,3 +159,5 @@
 file_name = f'{base_filename}_{datetime.now().strftime("%Y%m%d_%H%M%S")}.csv'
-file_path = os.path.join(CSV_DIR, file_name)
+file_path = os.path.normpath(os.path.join(CSV_DIR, file_name))
+if not file_path.startswith(CSV_DIR):
+    return jsonify({"error": "Invalid file path"}), 400
 log_analytics.write_entries_to_csv(entries, file_path)
@@ -170,3 +176,5 @@
 file_name = f'{base_filename}_{datetime.now().strftime("%Y%m%d_%H%M%S")}.csv'
-file_path = os.path.join(CSV_DIR, file_name)
+file_path = os.path.normpath(os.path.join(CSV_DIR, file_name))
+if not file_path.startswith(CSV_DIR):
+    return jsonify({"error": "Invalid file path"}), 400
 log_analytics.write_entries_to_csv(entries, file_path)
@@ -181,3 +189,5 @@
 file_name = f'{base_filename}_{datetime.now().strftime("%Y%m%d_%H%M%S")}.csv'
-file_path = os.path.join(CSV_DIR, file_name)
+file_path = os.path.normpath(os.path.join(CSV_DIR, file_name))
+if not file_path.startswith(CSV_DIR):
+    return jsonify({"error": "Invalid file path"}), 400
 log_analytics.write_entries_to_csv(entries, file_path)
```
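Since this suggestion repeats the same three-line normpath/startswith check five times, a small helper could centralize it. This is only a sketch: safe_csv_path is a hypothetical name, CSV_DIR here is a placeholder, and the startswith check is tightened with os.sep to avoid prefix collisions such as /tmp/csv vs /tmp/csv_evil.

```python
import os

CSV_DIR = "/tmp/csv"  # placeholder; the real service defines its own CSV_DIR

def safe_csv_path(file_name):
    """Join file_name onto CSV_DIR, rejecting anything that escapes it."""
    file_path = os.path.normpath(os.path.join(CSV_DIR, file_name))
    # Appending os.sep ensures "/tmp/csv_evil/x.csv" is not accepted
    # just because it shares the "/tmp/csv" string prefix.
    if not file_path.startswith(CSV_DIR + os.sep):
        raise ValueError("Invalid file path")
    return file_path

print(safe_csv_path("report.csv"))  # /tmp/csv/report.csv
```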
Jira Ticket
Description
Create a new router and connect it to the existing schema. The added feature is usageAnalytics: a new tab was added and the Docker configuration updated.
Test the full integration with GradeView.
Type of Change
Changes
Edited website, docker-compose, and api/v2.
Testing
Tested by running locally on different laptops.
Checklist
- Branch name follows <ticket-id>/<brief-description-of-change>
- PR title follows [<ticket-id>] <brief-description-of-change>

Screenshots/Video
Additional Notes