Commit 11b217d

GDS Fundamentals SME Review (#459)
* update as per comments
* go live

---------

Co-authored-by: Adam Cowley <adam.cowley@neo4j.com>
1 parent 57aed6a commit 11b217d

10 files changed, +111 -6 lines changed

Lines changed: 60 additions & 0 deletions
@@ -0,0 +1,60 @@
+sequenceDiagram
+    participant U as User
+    participant A as Agent
+    participant R as Read Tool
+    participant D as Neo4j Docs MCP
+
+    U->>A: Fact-check @2-tiers lesson
+
+    Note over A: Step 1: Read lesson content
+    par Parallel reads
+        A->>R: Read lesson.adoc
+        A->>R: Read questions/1-choosing.adoc
+    end
+    R-->>A: Lesson content (4 tiers, limits, features)
+
+    Note over A: Step 2: Discover Aura docs
+    A->>D: list_manual_pages("aura")
+    D-->>A: 100+ Aura doc URLs
+
+    Note over A: Step 3: Read core tier docs
+    par Batch 1
+        A->>D: read_page(main aura page)
+        A->>D: read_page(create-instance)
+        A->>D: read_page(backup-restore)
+    end
+    D-->>A: Tier names, backup frequencies
+
+    Note over A: Step 4: Verify security claims
+    par Batch 2
+        A->>D: read_page(encryption)
+        A->>D: read_page(single-sign-on)
+        A->>D: read_page(cloud-providers)
+    end
+    D-->>A: CMEK for VDC, SSO for BC+VDC
+
+    Note over A: Step 5: Check RBAC & limits
+    par Batch 3
+        A->>D: read_page(migration-free)
+        A->>D: read_page(user-management)
+    end
+    D-->>A: RBAC exists for all tiers
+
+    Note over A: Step 6: Additional verification
+    par Batch 4
+        A->>D: read_page(instance-resources)
+        A->>D: read_page(ip-filtering)
+    end
+
+    par Batch 5
+        A->>D: read_page(billing-dimensions)
+        A->>D: read_page(new-console)
+    end
+
+    Note over A: Step 7: Analyze & compare
+    A->>A: Cross-reference claims vs docs
+    A->>A: Identify verified/unverified/incorrect
+
+    A->>U: Fact-check report with corrections
+
+
Binary file not shown (157 KB)

asciidoc/courses/gds-fundamentals/course.adoc

Lines changed: 2 additions & 2 deletions
@@ -1,9 +1,9 @@
-= Get started with the Graph Data Science library
+= Get started with Graph Data Science
 :usecase: recommendations
 :categories: data-scientist:1, data-analysis:10, intermediate:3
 :duration: 3-4 hours
 :caption: Learn the fundamentals of Neo4j Graph Data Science
-:status: draft
+:status: active
 :key-points: Graph projections, Algorithm execution, Algorithm configuration, Relationship aggregation, Projection modeling
 :graph-analytics-plugin: true
 

asciidoc/courses/gds-fundamentals/modules/2-gds-basic-concepts/lessons/2-graph-projection-basics/lesson.adoc

Lines changed: 6 additions & 0 deletions
@@ -16,6 +16,12 @@ By the end of this lesson, you will understand:
 * What graph structures you're creating when you project
 * Why different projection types matter for algorithms
 
+[NOTE]
+.Algorithm requirements drive projection choices
+====
+Different algorithms have different requirements for graph structure. Some algorithms work optimally on monopartite graphs (single node type), while others are designed for bipartite graphs (two distinct node types). As you learn projection techniques throughout this module, keep in mind that your projection choices should be guided by which algorithms you plan to use. You'll learn more about algorithm-specific requirements in Module 3.
+====
+
 
 == Cypher Projection Anatomy
 

asciidoc/courses/gds-fundamentals/modules/3-working-with-algorithms/lessons/1-algorithms-overview/lesson.adoc

Lines changed: 28 additions & 0 deletions
@@ -159,6 +159,12 @@ Let's say you're a producer, and you want a star who will bridge multiple fan co
 
 You could use these results.
 
+[NOTE]
+.Computational complexity and large graphs
+====
+Betweenness centrality is computationally expensive, especially on large graphs. Brandes' algorithm, the standard exact method, runs in O(n·m) time on unweighted graphs (n nodes, m relationships), which approaches O(n³) on dense graphs, so it can take hours or even days to run on graphs with millions of nodes.
+====
+
 
 == Community Detection: Finding Groups
 
@@ -362,6 +368,28 @@ Different questions require different algorithm categories:
 
 The same projection can answer multiple questions. One projection cannot answer all questions equally.
 
+[NOTE]
+.Graph size and algorithm performance
+====
+When working with large graphs, algorithm performance becomes critical. Some algorithms that work well on small datasets become impractical on graphs with millions of nodes:
+
+**Performance considerations by graph size:**
+
+* **Small graphs (<1M nodes):** Most algorithms run quickly; choose based on your analytical question
+* **Medium graphs (1-10M nodes):** Avoid exact betweenness centrality
+* **Large graphs (>10M nodes):** Prioritize scalable algorithms and consider approximate versions
+
+**Approximate algorithms:**
+
+Many computationally expensive algorithms have approximate versions that trade some accuracy for significant speed improvements. These use sampling or heuristics to provide results faster. Look for parameters like:
+
+* `samplingSize` - Controls how much of the graph to sample
+* `maxIterations` - Limits computation time for iterative algorithms
+* `tolerance` - Sets convergence thresholds for early stopping
+
+Check the link:https://neo4j.com/docs/graph-data-science/current/[GDS documentation] for algorithm-specific parameters and their approximate variants.
+====
+
 == What's next
 
 You now understand the five main categories of algorithms in GDS and the types of questions each can answer. You've seen how the same data can be modeled differently depending on your analytical question.
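
To make the sampling trade-off in the note above concrete, here is a minimal Python sketch comparing traversal work for exact and sampled betweenness centrality. It assumes a Brandes-style cost of one pass over all relationships per source node, which is an assumption about the algorithm rather than something this diff states. The function names and graph sizes are hypothetical and not part of the GDS API.

```python
def traversal_work(source_count: int, relationship_count: int) -> int:
    """Approximate work: one pass over every relationship per source node.

    Assumption: Brandes-style cost model; real runtimes depend on hardware,
    graph shape, and concurrency.
    """
    return source_count * relationship_count


def sampling_speedup(node_count: int, sampling_size: int) -> float:
    """How many times fewer source traversals a sampled run needs than an exact run."""
    if not 0 < sampling_size <= node_count:
        raise ValueError("sampling_size must be between 1 and node_count")
    return node_count / sampling_size


# Hypothetical graph: 5 million nodes, 50 million relationships.
nodes, rels = 5_000_000, 50_000_000
sample = 10_000  # hypothetical value for a samplingSize-style parameter

print(f"exact:   {traversal_work(nodes, rels):.2e} relationship visits")
print(f"sampled: {traversal_work(sample, rels):.2e} relationship visits")
print(f"speedup: {sampling_speedup(nodes, sample):.0f}x fewer source traversals")
```

The ratio depends only on how many source nodes are sampled, which is why a sampling parameter is usually the first thing to tune on medium and large graphs.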

asciidoc/courses/gds-fundamentals/modules/3-working-with-algorithms/lessons/2-five-execution-modes/lesson.adoc

Lines changed: 10 additions & 0 deletions
@@ -397,6 +397,16 @@ The output shows how many nodes and relationships will be processed, and how muc
 
 GDS operates entirely in heap memory. For large graphs or complex algorithms, you may need to increase your heap size.
 
+[NOTE]
+.Estimate for time planning, not just memory
+====
+While estimate mode primarily shows memory requirements, it's also valuable for understanding computational scale. The `nodeCount` and `relationshipCount` in the estimate output, combined with knowledge of an algorithm's complexity, help you predict execution time.
+
+For example, if estimate shows your graph has 250 million nodes and you're planning to run betweenness centrality (O(n·m) with Brandes' algorithm, approaching O(n³) on dense graphs), you can anticipate an extremely long runtime, potentially days. This is when you should consider approximate algorithms or alternative approaches before starting a job that may not finish in a practical amount of time.
+
+Use estimate mode as your first check for both memory feasibility and computational practicality.
+====
+
 To check or modify heap settings, open your `neo4j.conf` file:
 
 image::images/neo4j_conf.png[the main instance page of Neo4j Desktop 2. Click the three dots, hover on Open and choose neo4j.conf]
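
The time-planning idea in the note above can be sketched as a back-of-envelope calculation in Python. This is only an illustration. It assumes the work scales as source nodes times relationships (a Brandes-style model). The throughput figure and the relationship count are made-up placeholders, and `estimated_hours` is a hypothetical helper, not a GDS procedure. Only the 250 million node count comes from the note.

```python
from typing import Optional


def estimated_hours(
    node_count: int,
    relationship_count: int,
    visits_per_second: float,
    sampling_size: Optional[int] = None,
) -> float:
    """Rough runtime in hours from the counts reported by estimate mode.

    Assumptions: work = source nodes * relationships, and a constant
    throughput that you would need to measure on your own hardware.
    """
    if visits_per_second <= 0:
        raise ValueError("visits_per_second must be positive")
    sources = node_count if sampling_size is None else min(sampling_size, node_count)
    return sources * relationship_count / visits_per_second / 3600


# nodeCount from the note; relationshipCount and throughput are hypothetical.
node_count = 250_000_000
relationship_count = 1_000_000_000
throughput = 1e9  # assumed relationship visits per second

print(f"exact:   {estimated_hours(node_count, relationship_count, throughput):,.0f} h")
print(f"sampled: {estimated_hours(node_count, relationship_count, throughput, 10_000):,.1f} h")
```

Even with generous throughput assumptions, the exact run lands far outside a practical time budget while the sampled run does not, which is the kind of decision the estimate output is meant to support.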

asciidoc/courses/gds-fundamentals/modules/3-working-with-algorithms/lessons/4-understanding-gds-docs/lesson.adoc

Lines changed: 0 additions & 1 deletion
@@ -79,7 +79,6 @@ Or, if you absolutely had to use Leiden, you could add relationship weights to s
 
 Most algorithms support multiple configurations, but checking these attributes first saves time and helps you understand the algorithm's capabilities.
 
-
 == Reading Algorithm Syntax
 
 The syntax section shows you exactly how to call an algorithm. Here's an example for PageRank:

asciidoc/courses/gds-fundamentals/modules/4-essential-projection-techniques/lessons/4-projection-modeling/questions/1-question-driven-design.adoc

Lines changed: 1 addition & 1 deletion
@@ -1,5 +1,5 @@
 [.question]
-= Question-Driven Projection Design
+= Using questions to drive projection design
 
 You're asked: "Which actors are most influential in Hollywood based on their collaboration network?"
 

asciidoc/courses/gds-product-introduction/course.adoc

Lines changed: 2 additions & 1 deletion
@@ -4,7 +4,8 @@
 :duration: 30 minutes
 :next: graph-data-science-fundamentals
 :caption: Gain a high-level technical understanding of the Neo4j Graph Data Science (GDS) library
-:status: active
+:status: redirect
+:redirect: /courses/gds-fundamentals/
 :key-points: Graph Data Science, Graph projections, Installation options, GDS licensing
 
 == Course Description

asciidoc/courses/graph-data-science-fundamentals/course.adoc

Lines changed: 2 additions & 1 deletion
@@ -4,8 +4,9 @@
 :duration: 1 hour
 :next: gds-shortest-paths
 :caption: Learn all you need to know about Graph Algorithms and Machine Learning Pipelines
-:status: active
 :key-points: Graph Data Science, Graph algorithms, Machine learning pipelines, GDS machine learning operations
+:status: redirect
+:redirect: /courses/gds-fundamentals/
 
 == Course Description
 

0 commit comments
