Neo4j Guide — Graph Database with Cypher Query Language

DodaTech Updated Jun 7, 2026 9 min read

Neo4j is the world’s leading graph database, representing data as nodes connected by relationships with properties, and queried using Cypher — an intuitive declarative query language for graph traversal and pattern matching.

What You’ll Learn

By the end of this tutorial, you’ll understand the graph data model (nodes, relationships, properties), write Cypher queries using MATCH, CREATE, and WHERE, create and query graph indexes, use graph algorithms for pathfinding and recommendations, and connect Neo4j to Python applications.

Why Neo4j Matters

Graph databases solve problems that are painful in relational databases: social networks, recommendation engines, fraud detection, and knowledge graphs. LinkedIn uses graphs for “People You May Know”, Netflix for recommendations, and banks for fraud detection. Durga Antivirus Pro uses Neo4j for analyzing relationships between threat actors, malware variants, and compromised systems. Learning Neo4j gives you a unique skill in one of the fastest-growing database categories.

Neo4j Learning Path

    flowchart LR
    A[SQL Basics] --> B[PostgreSQL]
    B --> C[Neo4j]
    C --> D[Elasticsearch]
    D --> E[MongoDB]
    E --> F[Database Design]
    C --> G{You Are Here}
    style G fill:#f90,color:#fff

Prerequisites: Familiarity with SQL or basic database concepts. Python experience is helpful for the driver examples. No prior graph theory knowledge is needed.

What Is Neo4j? (The “Why” First)

Think about how SQL databases model relationships: they use foreign keys and JOINs. If you want to find “friends of friends who bought products I viewed”, you need several JOINs across multiple tables, and each extra hop adds another self-join whose intermediate result can grow multiplicatively. In Neo4j, relationships are first-class citizens: each one is stored as a direct reference between two nodes rather than computed at query time via JOINs, so the cost of a traversal depends on how much of the graph it touches, not on the total size of the database. Multi-hop questions also stay short to write: MATCH (me)-[:FRIEND*1..5]->(friend)-[:BOUGHT]->(product) finds products bought by people up to 5 FRIEND hops away from me, in a single pattern.
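
To make the contrast concrete, here is a toy Python sketch of the idea. It is an analogy only, not how Neo4j stores data internally: each person holds direct references to friends and purchases, so the traversal touches only the nodes it walks through. The names friends, bought, and products_bought_within, and the sample people, are made up for illustration.

```python
from collections import deque

# Toy adjacency lists: each person points directly at friends and purchases.
friends = {
    "me": ["ann"],
    "ann": ["raj"],
    "raj": ["zoe"],
    "zoe": [],
}
bought = {
    "me": ["keyboard"],
    "ann": ["laptop"],
    "raj": ["monitor"],
    "zoe": ["webcam"],
}

def products_bought_within(start, max_hops):
    """Collect products bought by people 1..max_hops friend hops from start."""
    seen = {start}
    queue = deque([(start, 0)])
    products = set()
    while queue:
        person, depth = queue.popleft()
        if depth > 0:
            # Only friends count, not the starting person
            products.update(bought.get(person, []))
        if depth == max_hops:
            continue  # do not expand beyond the hop limit
        for friend in friends.get(person, []):
            if friend not in seen:
                seen.add(friend)
                queue.append((friend, depth + 1))
    return products

print(sorted(products_bought_within("me", 2)))
```

Raising max_hops only visits the extra people reachable at that depth; nothing in the loop depends on how many unrelated people exist elsewhere in the dictionaries.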

Graph vs Relational Model

Concept	SQL Database	Neo4j
Entities	Rows in tables	Nodes
Relationships	FOREIGN KEY + JOIN	Native relationships
Depth queries	Complex recursive CTEs	Simple pattern matching
Schema	Rigid (ALTER TABLE)	Schema-optional
Performance	Degrades with JOIN depth	Depends on the subgraph traversed, not total data size

The Graph Model

    flowchart TB
    Alice[Person: Alice] -->|FRIENDS_WITH| Bob[Person: Bob]
    Alice -->|BOUGHT| Laptop[Product: Laptop]
    Bob -->|BOUGHT| Mouse[Product: Mouse]
    Bob -->|BOUGHT| Headphones[Product: Headphones]
    Laptop -->|BELONGS_TO| Electronics[Category: Electronics]
    Mouse -->|BELONGS_TO| Electronics
    Headphones -->|BELONGS_TO| Electronics
    Alice -->|REVIEWED| Laptop
    style Alice fill:#f90,color:#fff
    style Bob fill:#f90,color:#fff

Every node has labels (like a table name) and properties (key-value pairs). Every relationship has a type and direction, and can also have properties.
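
As a rough mental model, the property graph can be pictured with two small Python classes. This is a sketch for intuition only; Node and Relationship here are hypothetical classes, not the types exposed by the Neo4j driver.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    labels: set                                      # e.g. {"Person"}
    properties: dict = field(default_factory=dict)   # key-value pairs

@dataclass
class Relationship:
    start: Node                                      # direction: start -> end
    rel_type: str                                    # e.g. "BOUGHT"
    end: Node
    properties: dict = field(default_factory=dict)   # relationships carry data too

alice = Node({"Person"}, {"name": "Alice"})
laptop = Node({"Product"}, {"name": "Laptop"})
bought = Relationship(alice, "BOUGHT", laptop, {"quantity": 1})

print(f"({bought.start.properties['name']})-[:{bought.rel_type}]->({bought.end.properties['name']})")
```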

Creating Nodes and Relationships with Cypher

// Create person nodes
CREATE (alice:Person {name: 'Alice', age: 28, city: 'New York'})
CREATE (bob:Person {name: 'Bob', age: 35, city: 'Los Angeles'})
CREATE (charlie:Person {name: 'Charlie', age: 22, city: 'Chicago'})

// Create product nodes
CREATE (laptop:Product {name: 'MacBook Pro', price: 2499})
CREATE (mouse:Product {name: 'Magic Mouse', price: 79})
CREATE (headphones:Product {name: 'AirPods Pro', price: 249})

// Create category nodes
CREATE (electronics:Category {name: 'Electronics'})

// Create relationships
CREATE (alice)-[:FRIENDS_WITH]->(bob)
CREATE (bob)-[:FRIENDS_WITH]->(charlie)
CREATE (alice)-[:BOUGHT {date: '2026-06-01', quantity: 1}]->(laptop)
CREATE (bob)-[:BOUGHT {date: '2026-06-05', quantity: 2}]->(mouse)
CREATE (bob)-[:BOUGHT {date: '2026-06-06', quantity: 1}]->(headphones)
CREATE (laptop)-[:BELONGS_TO]->(electronics)
CREATE (mouse)-[:BELONGS_TO]->(electronics)
CREATE (headphones)-[:BELONGS_TO]->(electronics)
CREATE (alice)-[:REVIEWED {rating: 5, text: 'Great laptop!'}]->(laptop)
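
When the same kind of data is created from application code, values are normally passed as query parameters rather than pasted into the query string. The sketch below assumes session is a Neo4j driver session (or any object with a compatible run method); create_person and RecordingSession are hypothetical names, and RecordingSession exists only so the example runs without a database.

```python
CREATE_PERSON = (
    "CREATE (p:Person {name: $name, age: $age, city: $city}) "
    "RETURN p.name AS name"
)

def create_person(session, name, age, city):
    """Run a parameterized CREATE; values travel separately from the query text."""
    return session.run(CREATE_PERSON, name=name, age=age, city=city)

class RecordingSession:
    """Stand-in for a driver session that only records what it is asked to run."""

    def __init__(self):
        self.calls = []

    def run(self, query, **params):
        self.calls.append((query, params))
        return []

session = RecordingSession()
create_person(session, "Alice", 28, "New York")
print(session.calls[0][1])
```

Keeping the query text constant also lets the database reuse its cached execution plan across calls.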

Querying with MATCH

// Find all products that Alice bought
MATCH (alice:Person {name: 'Alice'})-[:BOUGHT]->(product:Product)
RETURN product.name, product.price;

// Output:
// product.name   | product.price
// MacBook Pro    | 2499

// Find friends of Alice and what they bought
MATCH (alice:Person {name: 'Alice'})-[:FRIENDS_WITH]->(friend:Person)
MATCH (friend)-[:BOUGHT]->(product:Product)
RETURN friend.name, product.name, product.price;

// Output:
// friend.name | product.name  | product.price
// Bob         | Magic Mouse   | 79
// Bob         | AirPods Pro   | 249

// Find products that Alice's friends bought (but Alice didn't buy)
MATCH (alice:Person {name: 'Alice'})-[:FRIENDS_WITH]->(friend:Person)
MATCH (friend)-[:BOUGHT]->(product:Product)
WHERE NOT EXISTS {
    MATCH (alice)-[:BOUGHT]->(product)
}
RETURN DISTINCT product.name, product.price;

// Output:
// product.name  | product.price
// Magic Mouse   | 79
// AirPods Pro   | 249

Path Queries and Variable Depth

// Find all products in the Electronics category (any depth)
MATCH (product:Product)-[:BELONGS_TO*]->(cat:Category {name: 'Electronics'})
RETURN product.name;

// Find friends up to 3 hops away from Alice
MATCH (alice:Person {name: 'Alice'})-[:FRIENDS_WITH*1..3]->(distant:Person)
RETURN DISTINCT distant.name;

// Find shortest path between two people
MATCH path = shortestPath(
    (alice:Person {name: 'Alice'})-[:FRIENDS_WITH*]-(charlie:Person {name: 'Charlie'})
)
RETURN [node IN nodes(path) | node.name] AS friend_chain;

// Output:
// friend_chain
// [Alice, Bob, Charlie]
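
For intuition about what a shortest-path search over unweighted relationships does, here is a breadth-first search in plain Python. It is a sketch of the general technique, not Neo4j's implementation, and it assumes Bob and Charlie are also friends, as the output above implies. FRIENDSHIPS and shortest_path are illustrative names.

```python
from collections import deque

# Friendships treated as undirected, like the undirected pattern in the query.
FRIENDSHIPS = [("Alice", "Bob"), ("Bob", "Charlie")]

def shortest_path(edges, start, goal):
    """Return the list of names on a fewest-hops path, or None if unreachable."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, []).append(b)
        neighbors.setdefault(b, []).append(a)

    previous = {start: None}          # also serves as the visited set
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if current == goal:
            path = []
            while current is not None:  # walk back to the start
                path.append(current)
                current = previous[current]
            return path[::-1]
        for nxt in neighbors.get(current, []):
            if nxt not in previous:
                previous[nxt] = current
                queue.append(nxt)
    return None

print(shortest_path(FRIENDSHIPS, "Alice", "Charlie"))
```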

Adding Indexes and Constraints

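// Create an index for fast node lookups by property
CREATE INDEX product_name_index FOR (p:Product) ON (p.name);

// Create a uniqueness constraint (like a UNIQUE constraint in SQL).
// The constraint is backed by its own index on Person.name, so do not
// also create a separate index on that same label and property.
CREATE CONSTRAINT person_name_unique FOR (p:Person) REQUIRE p.name IS UNIQUE;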

// Create a composite index
CREATE INDEX person_city_age_index FOR (p:Person) ON (p.city, p.age);

// Verify indexes
SHOW INDEXES;
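
Conceptually, an index trades some extra storage for faster lookups. The Python analogy below is only that, an analogy: it compares scanning every record with looking a value up in a prebuilt map. The names people, find_by_scan, name_index, and find_by_index are illustrative.

```python
people = [
    {"name": "Alice", "city": "New York"},
    {"name": "Bob", "city": "Los Angeles"},
    {"name": "Charlie", "city": "Chicago"},
]

def find_by_scan(name):
    """Without an index: check every record until one matches."""
    for person in people:
        if person["name"] == name:
            return person
    return None

# With an index: build a name -> record map once, then look up directly.
name_index = {person["name"]: person for person in people}

def find_by_index(name):
    return name_index.get(name)

print(find_by_index("Bob") == find_by_scan("Bob"))
```

Both functions return the same record; the difference is that the scan grows with the number of records while the map lookup does not.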

Using Neo4j with Python

from neo4j import GraphDatabase

class Neo4jClient:
    def __init__(self, uri, user, password):
        self.driver = GraphDatabase.driver(uri, auth=(user, password))

    def close(self):
        self.driver.close()

    def get_recommendations(self, user_name):
        with self.driver.session() as session:
            result = session.run("""
                MATCH (user:Person {name: $name})
                    -[:FRIENDS_WITH]->(friend:Person)
                    -[:BOUGHT]->(product:Product)
                WHERE NOT EXISTS {
                    MATCH (user)-[:BOUGHT]->(product)
                }
                RETURN product.name AS product,
                       product.price AS price,
                       friend.name AS bought_by
                ORDER BY price DESC
            """, name=user_name)
            return [record.data() for record in result]

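# Use the client (replace the placeholder credentials with your own)
client = Neo4jClient("bolt://localhost:7687", "neo4j", "password")
try:
    recommendations = client.get_recommendations("Alice")
    for rec in recommendations:
        print(f"{rec['bought_by']} bought {rec['product']} (${rec['price']})")
finally:
    # Always release the driver's connections, even if the query fails
    client.close()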

Expected output:

Bob bought AirPods Pro ($249)
Bob bought Magic Mouse ($79)
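
A variation worth knowing is the managed transaction, where the driver runs a function inside a transaction and can retry it on transient failures. The sketch below assumes version 5 of the official Python driver, where sessions expose execute_read; recommended_products and _read_products are hypothetical helper names, and FakeTx and FakeSession are stand-ins returning canned rows so the example runs without a database.

```python
RECOMMEND_QUERY = """
MATCH (user:Person {name: $name})-[:FRIENDS_WITH]->(:Person)-[:BOUGHT]->(product:Product)
WHERE NOT EXISTS { MATCH (user)-[:BOUGHT]->(product) }
RETURN DISTINCT product.name AS product
ORDER BY product
"""

def _read_products(tx, name):
    # Consume the result inside the transaction function
    return [record["product"] for record in tx.run(RECOMMEND_QUERY, name=name)]

def recommended_products(session, name):
    """Run the query in a managed read transaction."""
    return session.execute_read(_read_products, name)

# --- Stand-ins so the sketch runs without a database ---
class FakeTx:
    def run(self, query, **params):
        return [{"product": "AirPods Pro"}, {"product": "Magic Mouse"}]

class FakeSession:
    def execute_read(self, work, *args, **kwargs):
        return work(FakeTx(), *args, **kwargs)

print(recommended_products(FakeSession(), "Alice"))
```

With a real driver, the session would come from driver.session() and the fake classes would be dropped.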

Graph Algorithms in Neo4j

Neo4j offers graph algorithms through the Graph Data Science (GDS) library, a plugin installed separately from the database. In GDS 2.x, you first project the graph into an in-memory catalog and then run algorithms against that named projection:

// Project Person nodes and FRIENDS_WITH relationships into memory
CALL gds.graph.project('friends', 'Person', 'FRIENDS_WITH');

// PageRank: find influential nodes
CALL gds.pageRank.write('friends', {
    writeProperty: 'pagerank'
});

// Check PageRank scores
MATCH (p:Person)
RETURN p.name, p.pagerank
ORDER BY p.pagerank DESC;

// Community detection (Louvain)
CALL gds.louvain.write('friends', {
    writeProperty: 'community'
});

// Check communities
MATCH (p:Person)
RETURN p.community, collect(p.name) AS members;

// Release the in-memory projection when finished
CALL gds.graph.drop('friends');
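
To see what PageRank computes, here is a small power-iteration version in plain Python. It is a textbook-style sketch, not the GDS implementation, so its scores (normalized here to sum to 1) will not match the numbers GDS writes. The pagerank function, its default parameters, and the three sample edges are assumptions made for illustration.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Return a dict of node -> score using simple power iteration."""
    nodes = sorted({n for edge in edges for n in edge})
    out_links = {n: [] for n in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Every node receives a small baseline share
        new_rank = {n: (1.0 - damping) / len(nodes) for n in nodes}
        for src in nodes:
            # Nodes without outgoing links spread their score evenly
            targets = out_links[src] or nodes
            share = damping * rank[src] / len(targets)
            for dst in targets:
                new_rank[dst] += share
        rank = new_rank
    return rank

scores = pagerank([("Alice", "Bob"), ("Bob", "Charlie"), ("Alice", "Charlie")])
for name, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{name}: {score:.3f}")
```

Charlie ranks highest here because both other people point to Charlie, while nobody points to Alice.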

Common Neo4j Errors

1. `IndexNotFound`

Trying to use an index that doesn’t exist. Fix: Check with SHOW INDEXES and create the index first with CREATE INDEX.

2. `ConstraintValidationFailed`

Violating a uniqueness constraint. Fix: Check for existing nodes with the same property value before creating, or use MERGE instead of CREATE.

3. `SyntaxError: unexpected token`

Cypher syntax error. Fix: Check keyword spelling, balanced parentheses and brackets, and arrow syntax. A relationship pattern is written -[:TYPE]-> (or <-[:TYPE]- for the reverse direction), with the type inside square brackets.

4. `Neo.TransientError.General.MemoryPoolOutOfMemoryError`

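The query requires more memory than the database is allowed to use for it. Fix: Use EXPLAIN or PROFILE to find the expensive operators and optimize the query. Add LIMIT or process data in smaller batches to reduce the working set, or raise the memory limits in neo4j.conf: the heap setting is server.memory.heap.max_size in Neo4j 5 (dbms.memory.heap.max_size in 4.x), and any configured transaction memory limit may also need adjusting.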

5. Relationship Direction Confusion

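// RISKY: undirected and untyped, so it matches any relationship in
// either direction and can return each connected pair twice
MATCH (a:Person)--(b:Person)

// BETTER: directed, typed relationship
MATCH (a:Person)-[:FRIENDS_WITH]->(b:Person)

Fix: Specify the relationship type, and the direction when it matters, in production queries. Use an undirected pattern only when you deliberately want to ignore direction.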

6. Cartesian Product Warning

// WRONG — no connection between a and b creates cartesian product
MATCH (a:Person), (b:Product)
RETURN a.name, b.name;

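Fix: Connect the nodes with a relationship pattern in your MATCH clause. If you genuinely need a cross product, filter each side first so both inputs stay small, because the result has one row for every combination of a and b. OPTIONAL MATCH does not produce a cross product; it only keeps rows for which a pattern has no match.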
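
The row-count difference is easy to see in plain Python. The sketch below reuses the people and products from this tutorial's sample data; rows_disconnected and rows_connected are illustrative names.

```python
from itertools import product

people = ["Alice", "Bob", "Charlie"]
products = ["MacBook Pro", "Magic Mouse", "AirPods Pro"]
bought = [("Alice", "MacBook Pro"), ("Bob", "Magic Mouse"), ("Bob", "AirPods Pro")]

# Disconnected patterns: every person is paired with every product.
rows_disconnected = list(product(people, products))

# Connected pattern: only pairs linked by a BOUGHT relationship.
rows_connected = [pair for pair in rows_disconnected if pair in bought]

print(len(rows_disconnected), len(rows_connected))
```

With 3 people and 3 products the difference is 9 rows versus 3; with thousands of nodes on each side the disconnected form grows into millions of rows.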

7. Connection refused

Neo4j server isn’t running or the port is wrong. Fix: Check the Neo4j status: sudo systemctl status neo4j. Default ports: 7687 for Bolt (bolt://localhost:7687) and 7474 for HTTP (http://localhost:7474).

Practice Questions

1. What is a node in Neo4j?

A node represents an entity in the graph (like a person, product, or category). Nodes have labels (types) and properties (key-value attributes). They’re similar to rows in a relational table but can have multiple labels.

2. How is a relationship different from a foreign key in SQL?

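A relationship in Neo4j is stored as a direct reference between two nodes, so following it takes roughly constant time per hop regardless of how large the database is. A SQL foreign key is only a value; resolving it requires a JOIN, typically an index lookup per row. Relationships can also carry properties; a foreign key column cannot, so SQL needs a separate join table for that.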

3. What does MERGE do in Cypher?

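MERGE is like a combined MATCH + CREATE: it looks for the whole pattern and creates the whole pattern if no match exists. Use MERGE to avoid duplicate nodes and relationships, and merge each node on its own before merging the relationship between them; otherwise a partial match can still create duplicate nodes.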
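
The match-or-create behavior can be mimicked in a few lines of Python. This is a simplified sketch for a single node, not how Neo4j implements MERGE; the in-memory nodes list and the merge_node helper are hypothetical.

```python
nodes = []  # each node is a (label, properties) tuple in this toy store

def merge_node(label, **properties):
    """Return a node whose properties include these values, creating it if absent."""
    for node in nodes:
        node_label, node_props = node
        if node_label == label and all(
            node_props.get(key) == value for key, value in properties.items()
        ):
            return node, False   # matched, nothing created
    node = (label, dict(properties))
    nodes.append(node)
    return node, True            # no match, so the node was created

_, created_first = merge_node("Person", name="Alice")
_, created_second = merge_node("Person", name="Alice")
print(created_first, created_second, len(nodes))
```

Running the same merge twice leaves one node in the store, which is the property that makes MERGE safe to repeat.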

4. Challenge: Write a Cypher query that recommends products to a user based on what similar users (same city) bought.

MATCH (user:Person {name: 'Alice'})
MATCH (similar:Person {city: user.city})
    WHERE similar <> user
MATCH (similar)-[:BOUGHT]->(product:Product)
WHERE NOT EXISTS {
    MATCH (user)-[:BOUGHT]->(product)
}
RETURN product.name, COUNT(*) AS popularity
ORDER BY popularity DESC
LIMIT 5;

5. What is the difference between MATCH and OPTIONAL MATCH?

MATCH requires a pattern to exist — if it doesn’t, no results are returned. OPTIONAL MATCH is like SQL’s LEFT JOIN — it returns NULLs when no match is found.
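
The difference shows up clearly on the tutorial's sample data, where Charlie has bought nothing. The Python sketch below imitates the two behaviors with list comprehensions; match_rows and optional_match_rows are illustrative names, not driver functions.

```python
people = ["Alice", "Bob", "Charlie"]
bought = {"Alice": ["MacBook Pro"], "Bob": ["Magic Mouse", "AirPods Pro"]}

def match_rows():
    """Like MATCH: people without a purchase produce no row."""
    return [(p, item) for p in people for item in bought.get(p, [])]

def optional_match_rows():
    """Like OPTIONAL MATCH: people without a purchase keep a row with None."""
    return [(p, item) for p in people for item in (bought.get(p) or [None])]

print(len(match_rows()), len(optional_match_rows()))
print(optional_match_rows()[-1])
```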

Real-World Task: Build a Fraud Detection Graph

Model a fraud detection system — similar to what Durga Antivirus Pro uses for identifying threat patterns:

// Create the graph model
CREATE (alice:User {name: 'Alice', email: 'alice@example.com', risk_score: 0.1})
// Second user (illustrative) so the shared-device query below has a match
CREATE (bob:User {name: 'Bob', email: 'bob@example.com'})
CREATE (device1:Device {fingerprint: 'ABC123', os: 'Windows 10', ip: '192.168.1.1'})
CREATE (device2:Device {fingerprint: 'ABC123', os: 'Windows 10', ip: '10.0.0.1'})
CREATE (tx1:Transaction {id: 'TXN001', amount: 500, status: 'approved'})
CREATE (tx2:Transaction {id: 'TXN002', amount: 1500, status: 'flagged'})

// Connect entities
CREATE (alice)-[:USED_DEVICE {timestamp: '2026-06-07 10:00'}]->(device1)
CREATE (bob)-[:USED_DEVICE {timestamp: '2026-06-07 10:30'}]->(device2)
CREATE (alice)-[:MADE_TRANSACTION {timestamp: '2026-06-07 11:00'}]->(tx1)
CREATE (alice)-[:MADE_TRANSACTION {timestamp: '2026-06-07 12:00'}]->(tx2)
CREATE (device1)-[:SAME_FINGERPRINT]->(device2);

// Find potentially fraudulent patterns: devices with the same fingerprint
// used by different users (the email comparison reports each pair once)
MATCH (u1:User)-[:USED_DEVICE]->(:Device)-[:SAME_FINGERPRINT]-(:Device)<-[:USED_DEVICE]-(u2:User)
WHERE u1.email < u2.email
RETURN u1.name AS user1, u2.name AS user2, 'Shared device fingerprint' AS pattern;

// Find high-value transactions from users whose device shares a fingerprint
MATCH (u:User)-[:USED_DEVICE]->(d:Device)-[:SAME_FINGERPRINT*1..2]-(:Device)
MATCH (u)-[:MADE_TRANSACTION]->(tx:Transaction)
WHERE tx.amount > 1000
RETURN DISTINCT u.name, tx.amount, tx.status, d.fingerprint;
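
The shared-fingerprint check boils down to grouping users by fingerprint and reporting every pair inside a group. Here is that logic as a plain Python sketch; the usage records, including the users Bob and Carol, and the shared_fingerprint_pairs helper are made up for illustration.

```python
from collections import defaultdict
from itertools import combinations

# (user, device fingerprint) observations; hypothetical sample data
usage = [("Alice", "ABC123"), ("Bob", "ABC123"), ("Carol", "XYZ789")]

def shared_fingerprint_pairs(records):
    """Return (user1, user2, fingerprint) for each pair sharing a fingerprint."""
    users_by_fingerprint = defaultdict(set)
    for user, fingerprint in records:
        users_by_fingerprint[fingerprint].add(user)

    pairs = []
    for fingerprint, users in sorted(users_by_fingerprint.items()):
        # Sorting the users reports each pair once, in a stable order
        for user1, user2 in combinations(sorted(users), 2):
            pairs.append((user1, user2, fingerprint))
    return pairs

print(shared_fingerprint_pairs(usage))
```

In the graph version the grouping step disappears, because the shared device connection is already stored as a relationship that the pattern can follow.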

FAQ

What is the difference between Neo4j and a relational database?

Neo4j stores relationships as first-class citizens with direct references between nodes, so traversal cost depends on the portion of the graph visited rather than on total data size. SQL databases compute relationships via JOINs, which become slow as depth increases. Use relational for tabular data, Neo4j for connected data.

Is Neo4j ACID compliant?

Yes. Neo4j supports full ACID transactions. Write operations within a single transaction are atomic, consistent, isolated, and durable.

What is the Cypher query language?

Cypher is Neo4j’s declarative graph query language, designed to be intuitive using ASCII art for patterns: (node)-[:RELATIONSHIP]->(node). It’s like SQL for graphs — but optimized for pattern matching and path traversal.

Can I use Neo4j with Python?

Yes. Neo4j provides official drivers for Python, Java, JavaScript, .NET, and Go. The Python driver uses the Bolt binary protocol for efficient communication.

What is the Graph Data Science library?

GDS is a plugin for Neo4j that provides graph algorithms: centrality (PageRank, Betweenness), community detection (Louvain, Label Propagation), pathfinding (shortest path, A*), and node embedding.

Try It Yourself

Start Neo4j locally (Docker recommended) and run these exploration queries:

// View database schema
CALL db.schema.visualization();

// Count nodes and relationships by type
MATCH (n)
RETURN labels(n) AS label, COUNT(*) AS count
ORDER BY count DESC;

MATCH ()-[r]->()
RETURN type(r) AS relationship_type, COUNT(*) AS count
ORDER BY count DESC;

// Find all paths between two nodes (up to 6 hops)
MATCH path = (alice:Person {name: 'Alice'})-[*1..6]-(bob:Person {name: 'Bob'})
RETURN [node IN nodes(path) | node.name] AS path_nodes,
       [rel IN relationships(path) | type(rel)] AS path_rels;

// Find isolated nodes (no relationships)
MATCH (n)
WHERE NOT (n)--()
RETURN n;

These graph traversal patterns power DodaZIP’s file relationship mapping and Durga Antivirus Pro’s threat actor network analysis — connecting seemingly unrelated events into actionable intelligence.

What’s Next

Congratulations on completing this Neo4j tutorial! Here’s where to go from here:

Practice daily — Consistency is more important than long study sessions
Build a project — Apply what you learned by building something real
Explore related topics — Check out other tutorials in the same category
Join the community — Discuss with other learners and share your progress

Remember: every expert was once a beginner. Keep coding!

Built by the developers of Doda Browser, DodaZIP, and Durga Antivirus Pro.
