Neo4j Guide — Graph Database with Cypher Query Language
Neo4j is the world’s leading graph database, representing data as nodes connected by relationships with properties, and queried using Cypher — an intuitive declarative query language for graph traversal and pattern matching.
What You’ll Learn
By the end of this tutorial, you’ll understand the graph data model (nodes, relationships, properties), write Cypher queries using MATCH, CREATE, and WHERE, create and query graph indexes, use graph algorithms for pathfinding and recommendations, and connect Neo4j to Python applications.
Why Neo4j Matters
Graph databases solve problems that are painful in relational databases: social networks, recommendation engines, fraud detection, and knowledge graphs. LinkedIn uses graphs for “People You May Know”, Netflix for recommendations, and banks for fraud detection. Durga Antivirus Pro uses Neo4j for analyzing relationships between threat actors, malware variants, and compromised systems. Learning Neo4j gives you a unique skill in one of the fastest-growing database categories.
Neo4j Learning Path
flowchart LR
A[SQL Basics] --> B[PostgreSQL]
B --> C[Neo4j]
C --> D[Elasticsearch]
D --> E[MongoDB]
E --> F[Database Design]
C --> G{You Are Here}
style G fill:#f90,color:#fff
What Is Neo4j? (The “Why” First)
Think about how SQL databases model relationships: they use foreign keys and JOINs. If you want to find “friends of friends who bought products I viewed”, you need multiple JOINs across several tables, and each additional hop adds another join, so queries get harder to write and slower to run as the depth grows. In Neo4j, relationships are first-class citizens: they are stored as direct references between nodes rather than computed via JOINs at query time. Following connections several hops deep stays concise: MATCH (me)-[:FRIEND*1..5]->(friend)-[:BOUGHT]->(product) finds products bought by friends up to 5 hops away, in one query.
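To make the idea of following stored references concrete, here is a minimal pure-Python sketch (not Neo4j code, and not how Neo4j is implemented internally). It keeps each person's friends in an adjacency list and walks outward hop by hop, which is roughly what the variable-length FRIEND traversal above asks the database to do. The names friends, bought, and products_within_hops, along with the sample data, are hypothetical and exist only for this illustration.

```python
from collections import deque

# Hypothetical sample data: each person maps directly to their friends,
# so following a relationship is a dictionary lookup, not a table join.
friends = {
    "me": ["ann"],
    "ann": ["ben"],
    "ben": ["cat"],
    "cat": [],
}
bought = {
    "ann": ["keyboard"],
    "ben": ["monitor"],
    "cat": ["webcam"],
}


def products_within_hops(start, max_hops):
    """Collect products bought by people 1..max_hops friend-hops from start."""
    seen = {start}
    queue = deque([(start, 0)])
    products = set()
    while queue:
        person, depth = queue.popleft()
        if depth == max_hops:
            continue  # do not expand beyond the hop limit
        for friend in friends.get(person, []):
            if friend in seen:
                continue
            seen.add(friend)
            products.update(bought.get(friend, []))
            queue.append((friend, depth + 1))
    return sorted(products)


print(products_within_hops("me", 2))  # ['keyboard', 'monitor']
print(products_within_hops("me", 3))  # ['keyboard', 'monitor', 'webcam']
```

Raising the hop limit only changes one argument; the work done depends on how many people are reachable, not on how many people exist in total.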
Graph vs Relational Model
| Concept | SQL Database | Neo4j |
|---|---|---|
| Entities | Rows in tables | Nodes |
| Relationships | FOREIGN KEY + JOIN | Native relationships |
| Depth queries | Complex recursive CTEs | Simple pattern matching |
| Schema | Rigid (ALTER TABLE) | Schema-optional |
| Performance | Degrades with JOIN depth | Depends on the size of the subgraph traversed, not on total data size |
The Graph Model
flowchart TB
Alice[Person: Alice] -->|FRIENDS_WITH| Bob[Person: Bob]
Alice -->|BOUGHT| Laptop[Product: Laptop]
Bob -->|BOUGHT| Mouse[Product: Mouse]
Bob -->|BOUGHT| Headphones[Product: Headphones]
Laptop -->|BELONGS_TO| Electronics[Category: Electronics]
Mouse -->|BELONGS_TO| Electronics
Headphones -->|BELONGS_TO| Electronics
Alice -->|REVIEWED| Laptop
style Alice fill:#f90,color:#fff
style Bob fill:#f90,color:#fff
Every node has labels (like a table name) and properties (key-value pairs). Every relationship has a type and direction, and can also have properties.
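As a rough mental model of the sentence above, the property graph can be sketched with two small Python dataclasses. This is only an illustration of the data model, not Neo4j's storage format; the class names Node and Relationship and the helper outgoing are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    labels: set            # e.g. {"Person"}; a node may carry several labels
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    start: Node            # direction runs from start ...
    end: Node              # ... to end
    type: str              # e.g. "BOUGHT"
    properties: dict = field(default_factory=dict)


alice = Node({"Person"}, {"name": "Alice"})
bob = Node({"Person"}, {"name": "Bob"})
laptop = Node({"Product"}, {"name": "Laptop"})

relationships = [
    Relationship(alice, bob, "FRIENDS_WITH"),
    Relationship(alice, laptop, "BOUGHT", {"quantity": 1}),
    Relationship(alice, laptop, "REVIEWED", {"rating": 5}),
]


def outgoing(node, rel_type):
    """Names of nodes reached from `node` over relationships of one type."""
    return [
        r.end.properties["name"]
        for r in relationships
        if r.start is node and r.type == rel_type
    ]


print(outgoing(alice, "BOUGHT"))        # ['Laptop']
print(outgoing(alice, "FRIENDS_WITH"))  # ['Bob']
```

Note that two relationships of different types connect the same pair of nodes here, each with its own properties, which is awkward to express with a single foreign key.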
Creating Nodes and Relationships with Cypher
// Create person nodes
CREATE (alice:Person {name: 'Alice', age: 28, city: 'New York'})
CREATE (bob:Person {name: 'Bob', age: 35, city: 'Los Angeles'})
CREATE (charlie:Person {name: 'Charlie', age: 22, city: 'Chicago'})
// Create product nodes
CREATE (laptop:Product {name: 'MacBook Pro', price: 2499})
CREATE (mouse:Product {name: 'Magic Mouse', price: 79})
CREATE (headphones:Product {name: 'AirPods Pro', price: 249})
// Create category nodes
CREATE (electronics:Category {name: 'Electronics'})
// Create relationships
CREATE (alice)-[:FRIENDS_WITH]->(bob)
CREATE (bob)-[:FRIENDS_WITH]->(charlie)
CREATE (alice)-[:BOUGHT {date: '2026-06-01', quantity: 1}]->(laptop)
CREATE (bob)-[:BOUGHT {date: '2026-06-05', quantity: 2}]->(mouse)
CREATE (bob)-[:BOUGHT {date: '2026-06-06', quantity: 1}]->(headphones)
CREATE (laptop)-[:BELONGS_TO]->(electronics)
CREATE (mouse)-[:BELONGS_TO]->(electronics)
CREATE (headphones)-[:BELONGS_TO]->(electronics)
CREATE (alice)-[:REVIEWED {rating: 5, text: 'Great laptop!'}]->(laptop)
Querying with MATCH
// Find all products that Alice bought
MATCH (alice:Person {name: 'Alice'})-[:BOUGHT]->(product:Product)
RETURN product.name, product.price;
// Output:
// product.name | product.price
// MacBook Pro | 2499
// Find friends of Alice and what they bought
MATCH (alice:Person {name: 'Alice'})-[:FRIENDS_WITH]->(friend:Person)
MATCH (friend)-[:BOUGHT]->(product:Product)
RETURN friend.name, product.name, product.price;
// Output:
// friend.name | product.name | product.price
// Bob | Magic Mouse | 79
// Bob | AirPods Pro | 249
// Find products that Alice's friends bought (but Alice didn't buy)
MATCH (alice:Person {name: 'Alice'})-[:FRIENDS_WITH]->(friend:Person)
MATCH (friend)-[:BOUGHT]->(product:Product)
WHERE NOT EXISTS {
MATCH (alice)-[:BOUGHT]->(product)
}
RETURN DISTINCT product.name, product.price;
// Output:
// product.name | product.price
// Magic Mouse | 79
// AirPods Pro | 249
Path Queries and Variable Depth
// Find all products in the Electronics category (any depth)
MATCH (product:Product)-[:BELONGS_TO*]->(cat:Category {name: 'Electronics'})
RETURN product.name;
// Find friends up to 3 hops away from Alice
MATCH (alice:Person {name: 'Alice'})-[:FRIENDS_WITH*1..3]->(distant:Person)
RETURN DISTINCT distant.name;
// Find shortest path between two people
MATCH path = shortestPath(
(alice:Person {name: 'Alice'})-[:FRIENDS_WITH*]-(charlie:Person {name: 'Charlie'})
)
RETURN [node IN nodes(path) | node.name] AS friend_chain;
// Output:
// friend_chain
// [Alice, Bob, Charlie]
Adding Indexes and Constraints
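// Create an index for fast node lookups by property
CREATE INDEX person_age_index FOR (p:Person) ON (p.age);
// Create a unique constraint (like a UNIQUE constraint in SQL).
// The constraint is backed by its own index on p.name, so do not also
// create a separate index on the same label and property.
CREATE CONSTRAINT person_name_unique FOR (p:Person) REQUIRE p.name IS UNIQUE;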
// Create a composite index
CREATE INDEX person_city_age_index FOR (p:Person) ON (p.city, p.age);
// Verify indexes
SHOW INDEXES;
Using Neo4j with Python
from neo4j import GraphDatabase


class Neo4jClient:
    def __init__(self, uri, user, password):
        self.driver = GraphDatabase.driver(uri, auth=(user, password))

    def close(self):
        self.driver.close()

    def get_recommendations(self, user_name):
        with self.driver.session() as session:
            result = session.run("""
                MATCH (user:Person {name: $name})
                      -[:FRIENDS_WITH]->(friend:Person)
                      -[:BOUGHT]->(product:Product)
                WHERE NOT EXISTS {
                    MATCH (user)-[:BOUGHT]->(product)
                }
                RETURN product.name AS product,
                       product.price AS price,
                       friend.name AS bought_by
                ORDER BY price DESC
            """, name=user_name)
            return [record.data() for record in result]


# Use the client
client = Neo4jClient("bolt://localhost:7687", "neo4j", "password")
recommendations = client.get_recommendations("Alice")
for rec in recommendations:
    print(f"{rec['bought_by']} bought {rec['product']} (${rec['price']})")
client.close()
Expected output:
Bob bought AirPods Pro ($249)
Bob bought Magic Mouse ($79)
Graph Algorithms in Neo4j
Graph algorithms are provided by the Graph Data Science (GDS) library, a separate plugin that must be installed alongside Neo4j. In GDS 2.x, algorithms run against a named in-memory graph projection rather than an inline nodeProjection/relationshipProjection map:
// Project Person nodes and FRIENDS_WITH relationships into memory
CALL gds.graph.project('friends', 'Person', 'FRIENDS_WITH');
// PageRank: find influential nodes
CALL gds.pageRank.write('friends', {
writeProperty: 'pagerank'
});
// Check PageRank scores
MATCH (p:Person)
RETURN p.name, p.pagerank
ORDER BY p.pagerank DESC;
// Community detection (Louvain)
CALL gds.louvain.write('friends', {
writeProperty: 'community'
});
// Check communities
MATCH (p:Person)
RETURN p.community, collect(p.name) AS members;
// Drop the projection when finished
CALL gds.graph.drop('friends');
Common Neo4j Errors
1. IndexNotFound
Trying to use an index that doesn’t exist. Fix: Check with SHOW INDEXES and create the index first with CREATE INDEX.
2. ConstraintValidationFailed
Violating a uniqueness constraint. Fix: Check for existing nodes with the same property value before creating, or use MERGE instead of CREATE.
3. SyntaxError: unexpected token
Cypher syntax error. Fix: Check keyword spelling, parentheses, and arrow directions (-> vs -). Remember: -[:RELATION]-> has a specific format.
4. Neo.TransientError.General.MemoryPoolOutOfMemoryError
The query requires more memory than is available. Fix: Use EXPLAIN or PROFILE to find the expensive part of the query, add LIMIT to reduce result size, or raise the heap limit (dbms.memory.heap.max_size in Neo4j 4.x, renamed to server.memory.heap.max_size in Neo4j 5).
5. Relationship Direction Confusion
// WRONG — no arrow direction
MATCH (a:Person)--(b:Person)
// RIGHT — directed relationship
MATCH (a:Person)-[:FRIENDS_WITH]->(b:Person)
Fix: Always specify relationship direction and type for production queries.
6. Cartesian Product Warning
// WRONG — no connection between a and b creates cartesian product
MATCH (a:Person), (b:Product)
RETURN a.name, b.name;
Fix: Connect the nodes with a relationship pattern in your MATCH clause. If you genuinely need every combination, keep the disconnected pattern but restrict each side with WHERE or LIMIT first so the product stays small.
7. Connection refused
The Neo4j server isn’t running or the port is wrong. Fix: Check the Neo4j status: sudo systemctl status neo4j. Default ports: 7687 for Bolt (bolt://localhost:7687) and 7474 for HTTP (http://localhost:7474).
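When "Connection refused" appears, a quick first step is to confirm that anything is listening on the Bolt port at all. The sketch below uses only Python's standard library; the function name bolt_port_is_open is hypothetical, and the host and port defaults assume a local install on the default Bolt port.

```python
import socket


def bolt_port_is_open(host="localhost", port=7687, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds.

    This only checks that something is listening; it does not verify
    credentials or that the listener is actually Neo4j.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if bolt_port_is_open():
    print("Port 7687 is reachable; check the URI scheme and credentials next.")
else:
    print("Nothing is listening on port 7687; is Neo4j running?")
```

If the port is reachable but the driver still fails, the cause is more likely authentication or an encrypted/unencrypted scheme mismatch than a stopped server.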
Practice Questions
1. What is a node in Neo4j?
A node represents an entity in the graph (like a person, product, or category). Nodes have labels (types) and properties (key-value attributes). They’re similar to rows in a relational table but can have multiple labels.
2. How is a relationship different from a foreign key in SQL?
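A relationship in Neo4j is stored as a direct reference between nodes, so following it does not get slower as the rest of the graph grows. A SQL foreign key is only a value; resolving it requires a JOIN, typically through an index lookup. Relationships can also carry properties, whereas a foreign key column cannot, so SQL needs a separate join table for that.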
3. What does MERGE do in Cypher?
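MERGE is like a combined MATCH + CREATE: it looks for the whole pattern and, if no match exists, creates the whole pattern. Use MERGE to avoid duplicate nodes and relationships, merging each node on its own before merging the relationship between them.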
4. Challenge: Write a Cypher query that recommends products to a user based on what similar users (same city) bought.
MATCH (user:Person {name: 'Alice'})
MATCH (similar:Person {city: user.city})
WHERE similar <> user
MATCH (similar)-[:BOUGHT]->(product:Product)
WHERE NOT EXISTS {
MATCH (user)-[:BOUGHT]->(product)
}
RETURN product.name, COUNT(*) AS popularity
ORDER BY popularity DESC
LIMIT 5;
5. What is the difference between MATCH and OPTIONAL MATCH?
MATCH requires a pattern to exist — if it doesn’t, no results are returned. OPTIONAL MATCH is like SQL’s LEFT JOIN — it returns NULLs when no match is found.
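The MERGE behaviour from question 3 can be sketched from Python as an idempotent "record a purchase" helper. This is a minimal sketch under stated assumptions: the names MERGE_PURCHASE, record_purchase, and PrintingTx are hypothetical, and the stand-in transaction class only exists so the example runs without a server. The Cypher merges each node separately and then the relationship, so running it twice does not duplicate Alice or the product.

```python
MERGE_PURCHASE = """
MERGE (p:Person {name: $person})
MERGE (prod:Product {name: $product})
MERGE (p)-[b:BOUGHT]->(prod)
ON CREATE SET b.quantity = $quantity
ON MATCH SET b.quantity = b.quantity + $quantity
"""


def record_purchase(tx, person, product, quantity=1):
    """Run the MERGE query inside a transaction-like object.

    `tx` is anything with a run(query, **params) method, such as the
    transaction the official Python driver passes to a transaction function.
    """
    tx.run(MERGE_PURCHASE, person=person, product=product, quantity=quantity)


class PrintingTx:
    """Stand-in for a real transaction so the sketch runs without a server."""

    def run(self, query, **params):
        print("would run MERGE with", params)


record_purchase(PrintingTx(), "Alice", "MacBook Pro")
```

With a running database and version 5 of the official driver, the same function could be handed to session.execute_write(record_purchase, "Alice", "MacBook Pro") instead of the stand-in.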
Real-World Task: Build a Fraud Detection Graph
Model a fraud detection system — similar to what Durga Antivirus Pro uses for identifying threat patterns:
// Create the graph model
CREATE (alice:User {name: 'Alice', email: 'alice@example.com', risk_score: 0.1})
CREATE (device1:Device {fingerprint: 'ABC123', os: 'Windows 10', ip: '192.168.1.1'})
CREATE (device2:Device {fingerprint: 'DEF456', os: 'Windows 10', ip: '10.0.0.1'})
CREATE (tx1:Transaction {id: 'TXN001', amount: 500, status: 'approved'})
CREATE (tx2:Transaction {id: 'TXN002', amount: 1500, status: 'flagged'})
// Connect entities
CREATE (alice)-[:USED_DEVICE {timestamp: '2026-06-07 10:00'}]->(device1)
CREATE (alice)-[:MADE_TRANSACTION {timestamp: '2026-06-07 11:00'}]->(tx1)
CREATE (alice)-[:MADE_TRANSACTION {timestamp: '2026-06-07 12:00'}]->(tx2)
CREATE (device1)-[:SAME_FINGERPRINT]->(device2)
// Find potentially fraudulent patterns: same device fingerprint
// used by different users
MATCH (u1:User)-[:USED_DEVICE]->(:Device)-[:SAME_FINGERPRINT]->(:Device)<-[:USED_DEVICE]-(u2:User)
WHERE id(u1) < id(u2)
RETURN u1.name AS user1, u2.name AS user2, 'Shared device fingerprint' AS pattern;
// Find high-risk transactions from users sharing a device
MATCH path = (u:User)-[:USED_DEVICE]->(d:Device)-[:SAME_FINGERPRINT*1..2]->()
MATCH (u)-[:MADE_TRANSACTION]->(tx:Transaction)
WHERE tx.amount > 1000
RETURN u.name, tx.amount, tx.status, d.fingerprint;
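The logic behind the first pattern (one device fingerprint seen under several accounts) can also be prototyped in plain Python before committing to a graph model. The sketch below is an illustration only; the sample observations, the user "Mallory", and the function name users_sharing_fingerprints are hypothetical and not part of the dataset above.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical (user, device fingerprint) observations.
device_usage = [
    ("Alice", "ABC123"),
    ("Mallory", "ABC123"),
    ("Bob", "XYZ789"),
]


def users_sharing_fingerprints(usage):
    """Return sorted (user1, user2, fingerprint) triples for shared devices."""
    users_by_fingerprint = defaultdict(set)
    for user, fingerprint in usage:
        users_by_fingerprint[fingerprint].add(user)

    pairs = []
    for fingerprint, users in users_by_fingerprint.items():
        # combinations() over sorted names yields each pair once,
        # like the id(u1) < id(u2) filter in the Cypher query.
        for u1, u2 in combinations(sorted(users), 2):
            pairs.append((u1, u2, fingerprint))
    return sorted(pairs)


print(users_sharing_fingerprints(device_usage))
# [('Alice', 'Mallory', 'ABC123')]
```

The graph version becomes more attractive once the pattern spans several hops (shared device, then shared card, then shared address), where this kind of grouping code has to be repeated for every link type.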
Try It Yourself
Start Neo4j locally (Docker recommended) and run these exploration queries:
// View database schema
CALL db.schema.visualization();
// Count nodes and relationships by type
MATCH (n)
RETURN labels(n) AS label, COUNT(*) AS count
ORDER BY count DESC;
MATCH ()-[r]->()
RETURN type(r) AS relationship_type, COUNT(*) AS count
ORDER BY count DESC;
// Find all paths between two nodes (up to 6 hops)
MATCH path = (alice:Person {name: 'Alice'})-[*1..6]-(bob:Person {name: 'Bob'})
RETURN [node IN nodes(path) | node.name] AS path_nodes,
[rel IN relationships(path) | type(rel)] AS path_rels;
// Find isolated nodes (no relationships)
MATCH (n)
WHERE NOT (n)--()
RETURN n;
These graph traversal patterns power DodaZIP’s file relationship mapping and Durga Antivirus Pro’s threat actor network analysis, connecting seemingly unrelated events into actionable intelligence.
What’s Next
Congratulations on completing this Neo4j tutorial! Here’s where to go from here:
- Practice daily — Consistency is more important than long study sessions
- Build a project — Apply what you learned by building something real
- Explore related topics — Check out other tutorials in the same category
- Join the community — Discuss with other learners and share your progress
Remember: every expert was once a beginner. Keep coding!
Built by the developers of Doda Browser, DodaZIP, and Durga Antivirus Pro.