What is Cypher?
Cypher is a graph query language that is used to query the Neo4j Database. Just like you use SQL to query a MySQL database, you would use Cypher to query the Neo4j Database.
A simple Cypher query can look something like this:
MATCH (m:Movie)
WHERE m.released > 2000
RETURN m LIMIT 5
Expected Result: The above query will return movies that were released after the year 2000, limiting the result to 5 items.
Try
- Write a query to retrieve all the movies released after the year 2005.
Answer:
MATCH (m:Movie) WHERE m.released > 2005 RETURN m
- Write a query to return the count of movies released after the year 2005. (Hint: you can use the count(m) function to return the count.)
Answer:
MATCH (m:Movie) WHERE m.released > 2005 RETURN count(m)
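As a further illustration of count, the sketch below groups the count by release year. It assumes the same Movie nodes and released property used above; the aliases year and movies are names chosen here purely for illustration.

```cypher
// Count movies per release year, for movies released after 2005.
// Cypher groups by the non-aggregated return value (m.released) automatically.
MATCH (m:Movie)
WHERE m.released > 2005
RETURN m.released AS year, count(m) AS movies
ORDER BY year
```

There is no separate GROUP BY clause in Cypher: any returned expression that is not an aggregate acts as the grouping key.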
Nodes and Relationships
Nodes and Relationships are the basic building blocks of a graph database.
Nodes
Nodes represent entities. A node in a graph database is similar to a row in a relational database.
In the picture below we can see 2 kinds of nodes: Person and Movie. When writing a Cypher query, a node is enclosed in parentheses, like (p:Person), where p is a variable and Person is the type of node it refers to.
Relationship
Two nodes can be connected with a relationship. In the above image, ACTED_IN, REVIEWED, PRODUCED, WROTE and DIRECTED are all relationships connecting the corresponding types of nodes.
When writing a Cypher query, relationships are enclosed in square brackets, like [w:WORKS_FOR], where w is a variable and WORKS_FOR is the type of relationship it refers to.
Two nodes can be connected by more than one relationship.
MATCH (p:Person)-[d:DIRECTED]-(m:Movie)
WHERE m.released > 2010
RETURN p,d,m
Expected Result: The above query will return every Person node that directed a movie released after 2010, together with the DIRECTED relationship and the Movie node.
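Since two nodes can be connected by more than one relationship, it can be useful to look for such pairs. The following is a minimal sketch, assuming Person nodes have a name property and Movie nodes have a title property (both appear later in this guide); relTypes is a name chosen here for illustration.

```cypher
// Collect the relationship types between each Person/Movie pair
// and keep only the pairs connected by more than one relationship.
MATCH (p:Person)-[r]->(m:Movie)
WITH p, m, collect(type(r)) AS relTypes
WHERE size(relTypes) > 1
RETURN p.name, m.title, relTypes
```

A person who both acted in and directed the same movie, for example, would show up here with two relationship types.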
Try
- Write a query to get all the people who acted in a movie that was released after 2010.
Answer:
MATCH (p:Person)-[r:ACTED_IN]-(m:Movie) WHERE m.released > 2010 RETURN p,r,m
Labels
A label is a name or identifier for a node; a relationship has a type instead. In the image below, Movie and Person are node labels and ACTED_IN, REVIEWED, etc. are relationship types.
When writing a Cypher query, labels and relationship types are prefixed with a colon, like :Person or :ACTED_IN. You can bind the matched nodes to a variable by putting the variable name before the colon. For example, (p:Person) means that the variable p refers to nodes labeled Person.
Labels are used when you want to perform operations only on a specific type of node. For example,
MATCH (p:Person)
RETURN p
LIMIT 20
will return only Person nodes (limited to 20 items), while
MATCH (n)
RETURN n
LIMIT 20
will return nodes of all kinds (limited to 20 items).
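To see which labels exist in the graph in the first place, one option is the built-in labels() function, which returns the list of labels on a node. A small sketch (nodeLabels is just an illustrative alias):

```cypher
// List each distinct label combination present in the graph.
MATCH (n)
RETURN DISTINCT labels(n) AS nodeLabels
```

labels() returns a list because a node can carry more than one label.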
Properties
Properties are name-value pairs that are used to add attributes to nodes and relationships.
To return specific properties of a node, you can write:
MATCH (m:Movie)
RETURN m.title, m.released

Expected Result: This will return only the title and released properties of each Movie node, not the whole nodes.
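If you do not yet know which properties a node has, the built-in keys() and properties() functions can help. The sketch below assumes a Movie node titled "Cloud Atlas" exists, as in the examples later in this guide; the aliases are illustrative.

```cypher
// keys() returns the property names of a node,
// properties() returns all of its name-value pairs as a map.
MATCH (m:Movie)
WHERE m.title = 'Cloud Atlas'
RETURN keys(m) AS propertyNames, properties(m) AS propertyMap
```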
Try
- Write a query to get the name and born properties of the Person node.
Answer:
MATCH (p:Person) RETURN p.name, p.born
Create a Node
The CREATE clause can be used to create a new node or a relationship.
CREATE (p:Person {name: 'John Doe'})
RETURN p
The above statement will create a new Person node with the property name set to John Doe.
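Note that CREATE does not check whether a matching node already exists, so running the statement above more than once adds a new node each time. A quick way to check this, sketched here with an illustrative alias johnDoes:

```cypher
// Count the Person nodes named 'John Doe'.
// Each run of the CREATE statement above adds one more such node.
MATCH (p:Person)
WHERE p.name = 'John Doe'
RETURN count(p) AS johnDoes
```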
Try
- Create a new Person node with a property name having the value of your name.
Answer:
CREATE (p:Person {name: '<Your Name>'}) RETURN p
Finding Nodes with Match and Where Clause
The MATCH clause is used to find nodes that match a particular pattern. This is the primary way of getting data from a Neo4j database.
In most cases, MATCH is used along with certain conditions to narrow down the result.
MATCH (p:Person {name: 'Tom Hanks'})
RETURN p
This is one way of doing it, although this way (without a WHERE clause) only supports exact matches on property values.
Another way is to use a WHERE clause, which allows for more complex filtering, including >, <, STARTS WITH, ENDS WITH, etc.
MATCH (p:Person)
WHERE p.name = "Tom Hanks"
RETURN p
Both of the above queries will return the same results.
You can read more about the WHERE clause and the list of all filters here: https://neo4j.com/docs/cypher-manual/current/clauses/where/
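As an example of filtering that needs a WHERE clause, the sketch below combines a string prefix test with a numeric comparison. It assumes the name and born properties on Person nodes; the prefix 'Tom' and the cutoff year 1970 are arbitrary values picked for illustration.

```cypher
// People whose name starts with 'Tom' and who were born before 1970.
MATCH (p:Person)
WHERE p.name STARTS WITH 'Tom' AND p.born < 1970
RETURN p.name, p.born
```

Neither condition can be expressed with the inline {name: ...} form, since that form only tests for equality.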
Try
- Find the movie with title "Cloud Atlas".
Answer:
MATCH (m:Movie {title: "Cloud Atlas"}) RETURN m
- Get all the movies that were released after 2010 and before 2015.
Answer:
MATCH (m:Movie) WHERE m.released > 2010 AND m.released < 2015 RETURN m
Merge Clause
The MERGE clause is used to either
- match the existing nodes and bind them, or
- create new node(s) and bind them.
It is a combination of MATCH and CREATE, and it also lets you specify additional actions depending on whether the data was matched or created.
MERGE (p:Person {name: 'John Doe'})
ON CREATE SET p.createdAt = timestamp()
ON MATCH SET p.lastLoggedInAt = timestamp()
RETURN p
The above statement will create the Person node if it does not exist and set its createdAt property to the current timestamp. If the node already exists, it will instead set the lastLoggedInAt property to the current timestamp.
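One thing to keep in mind is that MERGE matches on the whole pattern, including every property inside the curly braces. The sketch below therefore keeps only the identifying property in the pattern and sets the rest in ON CREATE; the born value 1980 is made up for illustration.

```cypher
// Match on name only; born is set only when the node is newly created.
MERGE (p:Person {name: 'John Doe'})
ON CREATE SET p.born = 1980
RETURN p
```

Writing MERGE (p:Person {name: 'John Doe', born: 1980}) instead would create a second node whenever the existing John Doe node has no born property or a different value for it.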
Try
- Write a query using MERGE to create a Movie node with title "Greyhound". If the node does not exist, set its released property to 2020 and its lastUpdatedAt property to the current timestamp. If the node already exists, only set lastUpdatedAt to the current timestamp. Return the movie node.
Answer:
MERGE (m:Movie {title: 'Greyhound'}) ON CREATE SET m.released = 2020, m.lastUpdatedAt = timestamp() ON MATCH SET m.lastUpdatedAt = timestamp() RETURN m
Create a Relationship
A Relationship connects 2 nodes.
MATCH (p:Person), (m:Movie)
WHERE p.name = "Tom Hanks" AND m.title = "Cloud Atlas"
CREATE (p)-[w:WATCHED]->(m)
RETURN type(w)
The above statement will create a :WATCHED relationship between the existing Person and Movie nodes and return the type of the relationship (i.e. WATCHED).
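Like CREATE for nodes, CREATE for relationships adds a new relationship on every run. If that is not what you want, one possible alternative is to use MERGE on the relationship pattern, sketched here with the same nodes and relationship type as above:

```cypher
// Create the WATCHED relationship only if it does not already exist
// between these two nodes; otherwise bind the existing one to w.
MATCH (p:Person), (m:Movie)
WHERE p.name = "Tom Hanks" AND m.title = "Cloud Atlas"
MERGE (p)-[w:WATCHED]->(m)
RETURN type(w)
```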
Try
- Create a :WATCHED relationship between the node you created for yourself previously in step 6 and the movie Cloud Atlas, and then return the type of the created relationship.
Answer:
MATCH (p:Person), (m:Movie) WHERE p.name = "<Your Name>" AND m.title = "Cloud Atlas" CREATE (p)-[w:WATCHED]->(m) RETURN type(w)
Relationship Types
In Neo4j, a relationship is either incoming or outgoing, depending on which node you look at it from.
In the above picture, the Tom Hanks node is said to have an outgoing relationship while the Cloud Atlas node is said to have an incoming relationship.
Relationships always have a direction. However, you only have to pay attention to the direction where it is useful.
To denote an outgoing or an incoming relationship in Cypher, we use -> or <-.
Example:
MATCH (p:Person)-[r:ACTED_IN]->(m:Movie)
RETURN p,r,m
In the above query, Person has an outgoing relationship and Movie has an incoming relationship.
However, in the case of the movies dataset, the direction of the relationship is not that important, and the query returns the same result even without denoting the direction. So the query
MATCH (p:Person)-[r:ACTED_IN]-(m:Movie)
RETURN p,r,m
will return the same result as the one above.
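Direction does matter once the arrow points the wrong way. As a sketch, assuming every ACTED_IN relationship in the dataset goes from a Person to a Movie, the reversed pattern below should find no matches (reversedMatches is an illustrative alias):

```cypher
// Same pattern as above, but with the arrow reversed (Movie -> Person).
MATCH (p:Person)<-[r:ACTED_IN]-(m:Movie)
RETURN count(r) AS reversedMatches
```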
Try
- Write a query to find the Person and Movie nodes that are connected by a REVIEWED relationship that is outgoing from the Person node and incoming to the Movie node.
Answer:
MATCH (p:Person)-[r:REVIEWED]->(m:Movie) RETURN p,r,m
Advanced Cypher queries
Let’s look at some questions that you can answer with Cypher queries.
- Finding who directed the movie Cloud Atlas
MATCH (m:Movie {title: 'Cloud Atlas'})<-[d:DIRECTED]-(p:Person) RETURN p.name
- Finding all people who have co-acted with Tom Hanks in any movie
MATCH (tom:Person {name: "Tom Hanks"})-[:ACTED_IN]->(:Movie)<-[:ACTED_IN]-(p:Person) RETURN p.name
- Finding all people related to the movie Cloud Atlas in any way
MATCH (p:Person)-[relatedTo]-(m:Movie {title: "Cloud Atlas"}) RETURN p.name, type(relatedTo)
In the above query, we did not specify a relationship type, only the variable relatedTo, so the query matches every relationship between any Person node and the movie node "Cloud Atlas".
- Finding movies and actors that are up to 3 hops away from Kevin Bacon
MATCH (p:Person {name: 'Kevin Bacon'})-[*1..3]-(hollywood) RETURN DISTINCT p, hollywood
Note: in the above query, hollywood refers to any node in the database (in this case Person and Movie nodes).
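The [*1..3] part of the last query is a variable-length pattern: it matches paths of 1, 2 or 3 relationships. To restrict the match to paths of exactly 3 relationships, a single number can be given instead. A minimal sketch using the same Kevin Bacon node:

```cypher
// Nodes reachable from Kevin Bacon by a path of exactly 3 relationships.
MATCH (p:Person {name: 'Kevin Bacon'})-[*3]-(hollywood)
RETURN DISTINCT hollywood
```

A node returned here may also be reachable by a shorter path; the pattern only requires that some path of length 3 exists.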
Great Job!
Now you know the basics of writing Cypher queries. You are on your way to becoming a graphista! Congratulations.
Feel free to play around with the data by writing more Cypher queries. If you want to learn more about Cypher, you can use one of the resources below:
- Cypher Manual - detailed manual on Cypher syntax
- Online Training - Introduction to Neo4j - If you are new to Neo4j and like to learn through an online class, this is the best place to get started.