Ikai Lan says

I say things!

Issuing App Engine datastore queries with the Low-Level API

Last time, I wrote an introduction to using the low-level API for creating entities, setting keys, and getting keys by value.

Basic queries and sorts

Key lookups work when we already know the keys, but it’s often very useful to be able to query entities by their properties. Consider the Person entities we created for the last example, Alice and Bob:

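import com.google.appengine.api.datastore.DatastoreService;
import com.google.appengine.api.datastore.DatastoreServiceFactory;
import com.google.appengine.api.datastore.Entity;

// Entity(kind, keyName): the kind comes first, then the key name
Entity alice = new Entity("Person", "Alice");
alice.setProperty("gender", "female");
alice.setProperty("age", 20);

Entity bob = new Entity("Person", "Bob");
bob.setProperty("gender", "male");
// Store age as a number, like Alice's, so that sorting by age compares like with like
bob.setProperty("age", 23);

DatastoreService datastore = DatastoreServiceFactory.getDatastoreService();
datastore.put(alice);
datastore.put(bob);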

Let’s create a query to find the first 10 Persons that are female and sort them by age ascending. How would we write this?

Query findFemalesQuery = new Query("Person");
findFemalesQuery.addFilter("gender", FilterOperator.EQUAL, "female");
findFemalesQuery.addSort("age", SortDirection.ASCENDING);
datastore.prepare(findFemalesQuery).asList(FetchOptions.Builder.withLimit(10));

Here are the steps we took:

  1. Created a Query object, specifying the kind to query.
  2. Added a filter. Note that the operator is typesafe: we pass the FilterOperator enum value we want to use.
  3. Added a sort. As with the filter, we name the property to sort on and pass an enum value for either ascending or descending order.
  4. Prepared the query and fetched the results, either as an Iterator or as a List of Entities. When fetching, we can pass a set of options. In the example above, we use FetchOptions.Builder to set the only option we care about, the limit: we only want 10 results, so we call withLimit() and pass it 10.

The query interface works well because it’s typesafe where the datastore is typesafe, and not so where the datastore is not: you won’t get errors at runtime because you misspelled “WHERE”, for instance, but you do have to be careful not to misspell the names of the properties you are querying. The flexibility of this interface means that we are no longer constrained by the “every object must have the same bag of properties” frame of thinking. Furthermore, because we don’t need to know the property names a priori (getProperties() returns a Map), we can iterate through the properties and figure out the key/value pairs at runtime. This leads to some very powerful abstractions.
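To make the runtime-discovery point concrete, here is a minimal sketch. It relies only on the fact that getProperties() returns a Map; a plain Map<String, Object> stands in for it so the example runs without the App Engine SDK. The PropertyInspector class and its describe() method are hypothetical names invented for this illustration, not part of the low-level API.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

public class PropertyInspector {

    // Renders every property as "name=value (Type)", sorted by property name,
    // without knowing any of the property names in advance.
    public static String describe(Map<String, Object> properties) {
        Map<String, Object> sorted = new TreeMap<String, Object>(properties);
        StringBuilder out = new StringBuilder();
        for (Map.Entry<String, Object> entry : sorted.entrySet()) {
            if (out.length() > 0) {
                out.append(", ");
            }
            Object value = entry.getValue();
            String type = (value == null) ? "null" : value.getClass().getSimpleName();
            out.append(entry.getKey()).append('=').append(value)
               .append(" (").append(type).append(')');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Stand-in for the Map a Person entity's getProperties() would return (assumption).
        Map<String, Object> alice = new LinkedHashMap<String, Object>();
        alice.put("gender", "female");
        alice.put("age", 20L);
        System.out.println(describe(alice));
    }
}
```

With a real entity, the same helper could presumably be called as describe(alice.getProperties()), since it only depends on the Map.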

Doing a keys only query

It sometimes makes sense to retrieve only the keys in a given query. It’s actually incredibly easy, as long as we know what to expect:

Query findFemalesQuery = new Query("Person");
findFemalesQuery.addFilter("gender", FilterOperator.EQUAL, "female");
findFemalesQuery.addSort("age", SortDirection.ASCENDING);
findFemalesQuery.setKeysOnly();

List<Entity> results = datastore.prepare(findFemalesQuery).asList(
FetchOptions.Builder.withLimit(10));

The only difference in building the Query object is the call to setKeysOnly(). The query still returns a List of Entity objects, but with only the Kind and Key populated. If we wrote a test for this, it would look like this:

Entity alice = results.get(0);
assertEquals("Return Key for Entity", KeyFactory.createKey("Person", "Alice"), alice.getKey());
assertNull("Should not return gender property", alice.getProperty("gender"));
assertEquals("Returns Entities with no properties", 0, alice.getProperties().size());

Only the Kind and Key are populated in these Entity objects. Even though the API looks similar, the behavior under the hood is completely different. Recall how a regular query works:

  1. Traverse an index and retrieve keys
  2. Using those keys, fetch the entities from the datastore

The time to do a query depends on the index traversal time as well as the number of entities to retrieve. In a keys only query, this is what happens:

  1. Traverse an index and retrieve keys

We completely eliminate step 2 from the process. If all we want is Key information, or if we are counting entities (and the count can be done using only indexes), this is the approach we would take.
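The difference between the two paths can be sketched with a toy model in plain Java. This is not how the datastore is implemented; it only mirrors the two steps described above. A sorted map plays the role of the index, a hash map plays the role of entity storage, and a counter records how many entity fetches each kind of query performs. The class name KeysOnlyModel, its methods, and the extra person Carol are all made up for the illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class KeysOnlyModel {
    // Toy index: rows of "gender/age/key" kept in sorted order, each pointing at an entity key.
    private final TreeMap<String, String> index = new TreeMap<String, String>();
    // Toy entity storage: key -> properties.
    private final Map<String, Map<String, Object>> store = new HashMap<String, Map<String, Object>>();
    // Number of entity fetches (step 2) performed so far.
    public int fetches = 0;

    public void put(String key, String gender, long age) {
        Map<String, Object> props = new HashMap<String, Object>();
        props.put("gender", gender);
        props.put("age", age);
        store.put(key, props);
        // Zero-pad the age so that string order matches numeric order in this toy index.
        index.put(String.format("%s/%03d/%s", gender, age, key), key);
    }

    // Step 1 only: scan the index rows for one gender, in age order.
    public List<String> keysOnly(String gender, int limit) {
        List<String> keys = new ArrayList<String>();
        // '0' is the character right after '/', so this range covers exactly the "gender/" prefix.
        for (String key : index.subMap(gender + "/", gender + "0").values()) {
            if (keys.size() == limit) {
                break;
            }
            keys.add(key);
        }
        return keys;
    }

    // Step 1 followed by step 2: one entity fetch per key found in the index.
    public List<Map<String, Object>> fullQuery(String gender, int limit) {
        List<Map<String, Object>> results = new ArrayList<Map<String, Object>>();
        for (String key : keysOnly(gender, limit)) {
            fetches++;
            results.add(store.get(key));
        }
        return results;
    }

    public static void main(String[] args) {
        KeysOnlyModel model = new KeysOnlyModel();
        model.put("Alice", "female", 20);
        model.put("Bob", "male", 23);
        model.put("Carol", "female", 19); // hypothetical extra person
        System.out.println(model.keysOnly("female", 10) + " fetches=" + model.fetches);
        model.fullQuery("female", 10);
        System.out.println("fetches after full query=" + model.fetches);
    }
}
```

In this model the keys-only path never touches entity storage, which is the saving described above; the full query pays one fetch per result.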

Ancestor Queries

Let’s pretend Alice and Bob have child entities:

Entity madHatter = new Entity("Friend", "Mad Hatter", alice.getKey());
Entity doormouse = new Entity("Friend", "Doormouse", alice.getKey());
Entity chesireCat = new Entity("Friend", "Chesire Cat", alice.getKey());

Entity redQueen = new Entity("Friend", "Red Queen", bob.getKey());

datastore.put(madHatter);
datastore.put(doormouse);
datastore.put(chesireCat);
datastore.put(redQueen);

Alice now has Friends Mad Hatter, Doormouse and the Chesire Cat as child entities, while Bob has only the Red Queen. How do we find all friends of Alice or Bob? Like so:

Query friendsOfAliceQuery = new Query("Friend");
friendsOfAliceQuery.setAncestor(alice.getKey());

List<Entity> results = datastore.prepare(friendsOfAliceQuery).asList(FetchOptions.Builder.withDefaults());

Query friendsOfBobQuery = new Query("Friend");
friendsOfBobQuery.setAncestor(bob.getKey());

results = datastore.prepare(friendsOfBobQuery).asList(FetchOptions.Builder.withDefaults());

What’s great about these queries is that the datastore knows exactly where to start. Keys embed parent Key information: Mad Hatter, Doormouse and the Chesire Cat all have “Alice” as a prefix in their key (this is also why you cannot change an entity’s entity group after creation). So we know that we just need to start the query from Alice’s Key and traverse the entities whose Keys sort after Alice’s and share that prefix. It’s also a great way of organizing data. Just be aware that too many transactions on a single entity group will destroy your throughput, so keep entity groups as small as possible.
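The prefix idea can be illustrated with a small stand-alone sketch. It is a toy model, not the datastore’s real key encoding: keys are written as strings such as “Person:Alice/Friend:Mad Hatter” (a made-up format that assumes names never contain “/”), kept in a sorted set, and an ancestor query becomes a range scan over the keys that start with the ancestor’s path. The class AncestorScanModel and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

public class AncestorScanModel {
    // All key paths in sorted order, e.g. "Person:Alice/Friend:Mad Hatter".
    private final TreeSet<String> keys = new TreeSet<String>();

    public void put(String keyPath) {
        keys.add(keyPath);
    }

    // Returns every key path below the ancestor by scanning one contiguous range.
    public List<String> descendantsOf(String ancestorPath) {
        // Descendants all start with ancestorPath + "/". Since '0' is the character
        // right after '/', the half-open range [ancestor + "/", ancestor + "0")
        // contains exactly the keys with that prefix.
        return new ArrayList<String>(keys.subSet(ancestorPath + "/", ancestorPath + "0"));
    }

    public static void main(String[] args) {
        AncestorScanModel model = new AncestorScanModel();
        model.put("Person:Alice");
        model.put("Person:Alice/Friend:Mad Hatter");
        model.put("Person:Alice/Friend:Doormouse");
        model.put("Person:Alice/Friend:Chesire Cat");
        model.put("Person:Bob");
        model.put("Person:Bob/Friend:Red Queen");
        System.out.println(model.descendantsOf("Person:Alice"));
        System.out.println(model.descendantsOf("Person:Bob"));
    }
}
```

Because the scan is driven purely by the prefix, deeper descendants (a child of Mad Hatter, say) would fall inside the same range in this model.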

Summary

Hopefully this blog post explains a few more features of the low-level API. Understanding the low-level API is an important step in understanding the datastore, and understanding the datastore is a critical step for learning how to build efficient, optimized applications for App Engine.


Written by Ikai Lan

July 13, 2010 at 4:43 pm

13 Responses


  1. Keep those articles coming!
    The GAE doco really lacks information on using the low-level API!

    Guillaume Laforge

    July 14, 2010 at 8:06 am

  2. Hi,

    I’m using native Api. I like them much more than JDO. :)

    There is one thing that I haven’t managed yet: an entity (a project, e.g.) has 2 different collections of children (tasks and linksToStaff), and those are saved in the same entity group.
    Is there a way to retrieve all keys from a given ancestor regardless of the type?

    Query friendsOfBobQuery = new Query(ancestorKey);

    doesn’t work for me. :(

    Query friendsOfBobQuery = new Query("Friend");
    friendsOfBobQuery.setAncestor(bob.getKey());

    Uberto Barbini

    July 19, 2010 at 6:33 am

  3. The lack of documentation on the low-level API is a very big problem.
    I’m very happy to have found your blog.
    But I have a question for you.
    Suppose I have an Entity:

    Entity alice = new Entity("Person", "Alice");
    alice.setProperty("gender", "female");
    alice.setProperty("age", 20);
    ArrayList<String> role = new ArrayList<String>();
    role.add("admin");
    role.add("authenticated");
    alice.setProperty("role", role);

    Now, how can I retrieve all the Entities whose role equals admin?
    How can the query look inside the ArrayList?
    Thanks.

    Salvatore Belardo

    July 27, 2010 at 2:17 pm

  4. Same thing – you should just be able to call addFilter with an EQUAL operation.

    Ikai Lan

    July 27, 2010 at 2:25 pm

  5. Yes, but what is the notation? Is it simply this?

    Query query = new Query("Person");
    query.addFilter("role", FilterOperator.EQUAL, "admin");

    Does the low-level API iterate by itself over every element of the ArrayList?

    salvatore belardo

    July 27, 2010 at 2:30 pm

  6. Is it possible to do an ancestor query on the parent of a parent? I guess in your example if madHatter had a child and redQueen also had a child both say “FriendOfFriend”, could I query “FriendOfFriend” using the alice and bob keys?

    David

    July 28, 2010 at 5:31 am

  7. Excellent post. It would be very interesting to compare the datastore to the bigtable implementation itself. In bigtable, there are column families and timestamp indexing; I guess that translates to a three-level hierarchy in the datastore low-level API (first level: the entity key; second level: the column family key; third level: the timestamp key). What is still needed is some info on how to actually model data that fits the bigtable philosophy in the first place.

    Jungleman

    July 31, 2010 at 3:48 pm

  8. The datastore actually sits on top of a layer that sits on top of BigTable. Could be a good topic one of these days. Maybe I’ll do it at a future Google Developer Day or DevFest.

    Ikai Lan

    July 31, 2010 at 3:57 pm

  9. Hello Ikai. Thanks for the posts about the low-level API. I know you dabble (or dabbled) in scala, so I wanted to tell you (and any interested scala users) about highchair. Among other modules, it provides a module for working with persistent GAE entities with a simple, idiomatic scala library and a type-safe query DSL ( https://github.com/chrislewis/highchair/wiki/Datastore ). Thanks again for the articles!

    Chris Lewis

    May 31, 2011 at 7:31 am

  10. Hey, thanks for the comment! I don’t know if I like the “DSL” style so much where you omit the method operator, but looks neat.

    Any examples with the Async datastore? It’s also not immediately clear to me how one would do writes, entity group management, transactions, etc.

    Ikai Lan

    June 2, 2011 at 12:07 pm

  11. Obviously I prefer the DSL; I find it more natural with the added bonus of compile-time checking (something I value highly for non-toy apps). To each his own.

    Support for async is an open ticket – I expect to address it soon.

    Writes are done through the ‘Kind’ put method: Person.put(personInstance). That’s not in the wiki, but it’s in the specs (ex: https://github.com/chrislewis/highchair/blob/0.0.4/datastore/src/test/scala/datastore/EntitySpec.scala ).

    Could you elaborate on the kinds of operations involved in “entity group management”? I’m aware of entity groups, but I don’t have a clear understanding of what they are or their implications.

    Transactions could be managed directly on the ‘raw’ datastore api. Not elegant, but it should work. I’ve had it on the mental list to support them explicitly.

    Thanks for the feedback. I’ve opened tickets for entity group and transaction support (async already planned). The reason it seems so bare is simply because I develop the project alone and on an as-needed basis – that includes requests from users.

    Thanks again – I hope you’ll share more info (or literature) on entity groups.

    Chris Lewis

    June 2, 2011 at 1:03 pm

  12. Hello,

    Thanks for this post, it is really helpful. I tried it and it works perfectly for inserting, updating and deleting. I am stuck on read performance when I have “relationships” between several entities.

    Let’s consider the following example. A Person lives at an Address that is in a specific Country.

    Entity country = new Entity("Country");
    country.setProperty("name", "France");

    Entity address = new Entity("Address");
    address.setProperty("street", "Rue de la Paix");
    address.setProperty("city", "Paris");
    address.setProperty("country", country.getKey());

    Entity alice = new Entity("Person");
    alice.setProperty("name", "alice");
    alice.setProperty("gender", "female");
    alice.setProperty("age", 20);
    alice.setProperty("address", address.getKey());

    Let’s imagine I have thousands of Persons. If I want to display a list of the Persons showing their name, their city and their country, how should I build my queries? I fear I will have to run a really huge number of queries to display this information.

    If you have any idea, I would be grateful.
    Thanks for your help,

    Joshua

    July 6, 2011 at 2:06 am

  13. The answer is not to store them in different entities. Normalization is evil.

    Instead, store country and address information on the Person entity.

    Ikai Lan

    July 16, 2011 at 4:29 pm

