Our application would not be very useful if it could not store and retrieve data. A number of databases are available to web applications, and to Google App Engine projects. We are going to use Google’s Datastore, which is Google’s interpretation of a NoSQL database. In this article, we will explore how we can use Datastore in our application.

What is Datastore? How is it different from MySQL?

Traditionally, web apps have used the very popular MySQL database to store data. MySQL is a very powerful database, but it has its limitations when the datasets it manages get very large. Datastore is structured very differently, and is very efficient with enormous data sets. It does, however, bring some major differences in use.

In a traditional MySQL database, tables are created with a set structure that any data inserted into their rows must adhere to. With MySQL you can also query multiple tables at once using joins. With Datastore, instead of managing tables and table structures, we define a Kind, and each item of that Kind is known as an entity. For instance, in MySQL you may have a Users table that stores rows of user data. In Datastore, you would have a Kind named Users, and store entities of that Kind in the datastore. There is no requirement for any two entities of the same Kind to have the same structure or the same properties set. This makes Datastore very flexible, but it means you need to manage your entities more closely in your application, as they may not all contain the same level of information. Datastore also does not allow query joins, and is somewhat limited in the types of query you can perform. The tradeoff, however, is very fast and efficient access to your data from any instance or module of your application.
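To make the point about flexible entities concrete, here is a minimal sketch in plain PHP. It does not talk to Datastore at all: the two "entities" are just arrays, and getProperty() is a hypothetical helper, not part of any library. It only illustrates why application code has to cope with properties that some entities of a Kind do not have.

```php
<?PHP // example only

// Two hypothetical entities of the same Kind ("Users"), modelled as plain arrays.
// Nothing forces them to share the same properties.
$Users = [
	['Name' => 'Alice', 'Email' => 'alice@example.com', 'Phone' => '555-0100'],
	['Name' => 'Bob'], // no Email or Phone property set
];

// Hypothetical helper: read a property, falling back to a default when it is missing.
function getProperty(array $Entity, $Property, $Default = null){
	return array_key_exists($Property, $Entity) ? $Entity[$Property] : $Default;
}

foreach($Users as $User){
	echo getProperty($User, 'Name') . ': ' . getProperty($User, 'Email', 'no email on record') . PHP_EOL;
}
```

In a MySQL table the missing columns would at least exist as NULL. With entities, the property may simply not be there, so a defaulting helper like this is one way to keep the rest of the code simple.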

Datastore also has what I perceive to be an advantage over MySQL, in that you can manage all of your entity structures using your application code. Gone are the days of having to manually or programmatically modify table structures, and risk corrupting or breaking your application due to the changes. Let’s look at how we can get started with Datastore.

Accessing Datastore from PHP

When you read the documentation for Google’s Datastore, found at https://cloud.google.com/datastore/docs/concepts/overview, you will notice that Google provides libraries for Go, Java, Node.js, Python and Ruby. But where is PHP? At this stage, Datastore does not have any native support for PHP. Google does, however, provide access to Datastore through APIs and through a protocol buffer service. Don’t worry too much about this, as there is an open-source library, thanks to developer Tom Walder, that provides a very easy to implement connection to Datastore. It can be found at https://github.com/tomwalder/php-gds. We will be using this library to work with datastore. I have installed the library using composer, as we are already using composer for other dependencies and autoloading.

PHP-GDS

Let’s have a look at how Tom Walder’s library breaks down communication with Datastore. There are essentially 3 layers we need to look at. The first is the gateway object, which manages our connection with datastore. The second is the store object, which manages the storage and retrieval of our entities. The last is our entity object, which holds all of the properties of an entity.

The Gateway

We do not need to understand much about the gateway object. It does its work in the background and allows our stores to communicate with datastore. There are 2 ways to connect to datastore with a gateway. The first uses Google’s Client API, which connects over external API calls. The second uses Protocol Buffers, which are far more efficient. We will be using protocol buffers in our application. To create a protocol buffer gateway, after including the required library (either with composer or by installing it manually), we simply need the following code:

<?PHP

	$Gateway = new \GDS\Gateway\ProtoBuf(null, 'namespace');
	
?>

Using this code we have created a gateway object using protocol buffers. Note that we pass two arguments to the constructor. The first is the dataset. We have set this to null because, by default, it is detected automatically and set to our application’s dataset by App Engine. The second argument is a namespace. We will use namespaces in our application to manage access to our data, as namespacing allows us to separate data between users of our application. For example, this application will allow freelancers to sign up and manage their projects. To achieve this, each freelancer or agency will be assigned its own data namespace. In this way, we can store all of the entities relating to an individual agency or freelancer within their own namespace in our datastore. This means that even if we accidentally select all entities through a gateway, only the entities in that gateway’s namespace, which belong to the individual freelancer, will ever be shown to them. We will also use some namespaces for system entities, such as sessions and users, that are required globally throughout the system. To use a different namespace, we simply replace 'namespace' with the namespace of our choosing.
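As a rough sketch of how a per-account namespace might be derived, the helper below builds a namespace string from an account ID. The function name namespaceForAccount() and the 'Account-' prefix are my own inventions for illustration. The validation assumes that Datastore namespaces may only contain letters, digits, '.', '_' and '-', up to 100 characters, so check the current Datastore documentation before relying on it.

```php
<?PHP // example only

// Hypothetical helper: build a Datastore namespace for one agency/freelancer account.
// Assumption: namespaces are limited to letters, digits, '.', '_' and '-' (max 100 chars).
function namespaceForAccount($AccountId){
	$Namespace = 'Account-' . $AccountId;
	if(!preg_match('/^[0-9A-Za-z._-]{1,100}$/', $Namespace)){
		throw new InvalidArgumentException('Invalid namespace: ' . $Namespace);
	}
	return $Namespace;
}

echo namespaceForAccount(42) . PHP_EOL; // Account-42
```

The returned string would then be passed as the second argument when creating the gateway. Rejecting unexpected characters up front keeps user-supplied values from ever selecting someone else’s namespace.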

We can see from the line of code we have just written, that connecting to our datastore is relatively simple, however it does require us to add arguments to our code to get a valid gateway back. Let’s add a static function to our Container class, in order to help automate the process later on.

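<?PHP // core/container.class.php

	class Container{

		// ...existing Container methods...

		// Create a protocol buffer gateway for the given namespace
		public static function newGateway($Namespace){
			return new \GDS\Gateway\ProtoBuf(null, $Namespace);
		}

	}
	
?>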
	
<?PHP // example only
	$Gateway = Container::newGateway('Sessions');	
?>

By adding a static method to our container, we can now create a gateway object by calling our Container’s newGateway method and providing our chosen namespace. In the example above, I have set Sessions as the namespace.

The Store

The next layer is our store, which will handle communication between our app and datastore. To create our store, we can extend the store class provided by the PHP-GDS library. We then define, in our new store class, the planned structure of our entity.

<?PHP // core/sessionstore.class.php

	class SessionStore extends GDS\Store{
	
		public function buildSchema(){
			$Schema = new GDS\Schema('Session');
			$Schema->addString('Data', FALSE);
			$Schema->addDateTime('LastRequest', FALSE);
			$Schema->addDateTime('ExpireTime', TRUE);
			return $Schema;
		}

		public function __construct($Gateway){
			parent::__construct($this->buildSchema(), $Gateway);
			$this->setEntityClass('Session');
		}
	
	}
	
?>

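Here we have created a SessionStore class, and we have defined some properties we expect our entities to have. This is done by defining a buildSchema() function and creating a GDS\Schema object. The second argument to each add method is a boolean that controls whether Datastore indexes the property. Only indexed properties can be used in query filters, which is why ExpireTime is indexed here. The available data types, and the method that adds each one, are listed below: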

  • String – addString()
  • Integer – addInteger()
  • DateTime – addDateTime()
  • Float – addFloat()
  • Boolean – addBoolean()
  • StringList – addStringList()
  • Geopoint – addGeopoint()

For our sessions, we have defined:

  • String – Data, to store our session data
  • DateTime – LastRequest, to store the time of the last request for this session
  • DateTime – ExpireTime, to store the expiration time of the session

Note that we have not defined a key or an ID for our sessions. There are two built-in key types, which we will see later, that identify our entities. They are KeyId and KeyName.

You will also see that we have overridden the __construct() function in our SessionStore class. We first call the parent::__construct() method, providing our schema and our gateway object, and you can see that our gateway object becomes a dependency. Also in our __construct() method, we define our EntityClass. This is a great feature introduced by Tom Walder that allows us to set the class of the entity that will be stored using this store. By doing this, we can pass in an entity object of this class, and when we retrieve it, it will be returned to us as an object of that class, with all the properties pre-filled. No need to manually map our search results to our class!

As we have a dependency in our __construct() method, let’s add this new SessionStore class to our dependency injection Container class.

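<?PHP // core/container.class.php

	class Container{

		// ...existing Container methods...

		// Create a protocol buffer gateway for the given namespace
		public static function newGateway($Namespace){
			return new \GDS\Gateway\ProtoBuf(null, $Namespace);
		}

		// Create a SessionStore with its gateway dependency already injected
		public static function newSessionStore(){
			return new SessionStore(self::newGateway('Session'));
		}

	}

?>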

You can see we have now automated the dependency injection requirements of both our gateway and our store. We can now create our store by simply calling Container::newSessionStore(), which will make life much easier!

The Entity

The last layer is the entity itself. The PHP-GDS library provides a template for an entity, and we can simply extend the entity class and provide a custom entity name. In this case, we will call it Session. This will allow our SessionStore to create Session objects, which will be complete entities.

<?PHP // core/session.class.php

	class Session extends GDS\Entity{
	
	}
	
?>

That’s all there is to our Session class. It will act as an empty shell, ready to accept any data we input from our application, or from our SessionStore. We can pass it to our SessionStore to be stored in our datastore as well.

We have now set up the 3 components we need to manage our datastore communications for Session objects. The gateway we created can be re-used with all of our stores and all of our entity objects. We simply need to set the correct namespace when instantiating it.

To use datastore for other entities, we now just need to create new Store and Entity classes, in the same way as we have for our Session and SessionStore.

Using our Datastore Objects in our Application

Now that we know how to set up our gateway, store and entity classes, let’s have a look at how we can use them in our application to store and retrieve our data. We have set up datastore access for sessions, so let’s now look at how to set our session properties and save them using our store.

<?PHP // example only

	$Session = new Session();
	$Session->Data = 'Example Data';
	$Session->LastRequest = new DateTime();
	$Session->ExpireTime = new DateTime('+ 900 seconds');
	$Session->setKeyName('ExampleSession');

	$SessionStore = Container::newSessionStore();
	$SessionStore->upsert($Session);
	
?>

To set the data to save to our datastore, we simply created a new instance of our Session class, and set its properties to the data we wish to save. You will notice I have also set a KeyName, using setKeyName. This is a method built in to the GDS\Entity class. In a real session scenario the data would be set by our CustomSessionHandler class. The above is simply for demonstration purposes.

To insert our data, we called our SessionStore’s upsert method and passed in our entity. The upsert method will insert a new entity, and will update an existing entity when the KeyName or KeyId of the entity we pass in matches the existing entity’s key.
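The insert-or-update behaviour can be pictured with a small in-memory sketch. This is not how PHP-GDS or Datastore is implemented. It is only an array keyed by key name, with a hypothetical upsertByKeyName() function, to show that the same call either creates the entity or replaces the one already stored under that key.

```php
<?PHP // example only

// In-memory illustration of upsert semantics; this is not how PHP-GDS is implemented.
// $Entities maps a key name to that entity's properties.
function upsertByKeyName(array &$Entities, $KeyName, array $Properties){
	$Existed = array_key_exists($KeyName, $Entities);
	$Entities[$KeyName] = $Properties; // insert, or replace the stored entity
	return $Existed ? 'updated' : 'inserted';
}

$Entities = [];
echo upsertByKeyName($Entities, 'ExampleSession', ['Data' => 'First']) . PHP_EOL;  // inserted
echo upsertByKeyName($Entities, 'ExampleSession', ['Data' => 'Second']) . PHP_EOL; // updated
echo count($Entities) . PHP_EOL; // 1
```

Note that the second call replaces the whole stored entity rather than merging properties into it, so an entity should be passed to upsert with every property you want to keep.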

To retrieve entities from our datastore, there are a number of query methods built in to the PHP-GDS Store class.

  • fetchById($str_id) – By passing in an ID, we will get back the entity with that ID
  • fetchByIds(array $arr_ids) – By passing in an array of IDs, we will get back an array of entities with those IDs
  • fetchByName($str_name) – By passing in a KeyName, we will get back the entity with that key name set
  • fetchByNames(array $arr_names) – By passing in an array of KeyNames, we will get back an array of entities with those KeyNames set
  • query($str_query, $arr_params) – The query method allows you to query the datastore using the GQL query language, which is quite similar to standard SQL. Parameters are designated in the query using the @ symbol followed by an identifier, e.g. @Last. The parameter’s value is then set in the params array using ['Last' => 'Value']
  • fetchOne($str_query, $arr_params) – The fetch one method allows you to query the datastore and return just one entity. This could be useful when sorting entities by date or time in a query and returning the oldest or most recent.
  • fetchAll($str_query, $arr_params) – The fetch all method will allow you to query the datastore and return all matching entities.
  • fetchPage($int_page_size, $mix_offset) – The fetch page method allows you to fetch a page of results setting a page size, and an offset value.
  • fetchEntityGroup($obj_entity) – The fetchEntityGroup method allows you to fetch an entire entity group from the datastore. Datastore allows you to set ancestry for entities, so for a person entity you could set a parent, grandparent etc. The fetchEntityGroup allows you to fetch all entities in the ancestry tree, when the root entity is passed in.
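As an illustration of a GQL query with a named parameter, the sketch below wraps fetchAll() in a hypothetical helper, fetchExpiredSessions(). It takes the store as an argument, so it works with the SessionStore from this article or with any object that offers the same fetchAll($str_query, $arr_params) method. I am assuming here that the library accepts a DateTime object as a parameter value. The filter relies on ExpireTime being an indexed property in our schema.

```php
<?PHP // example only

// Hypothetical helper: fetch every session whose ExpireTime has passed.
// $SessionStore is expected to be the SessionStore from this article (or any
// object offering the same fetchAll($str_query, $arr_params) method).
// Assumption: a DateTime object is accepted as a GQL parameter value.
function fetchExpiredSessions($SessionStore, DateTime $Now){
	return $SessionStore->fetchAll(
		'SELECT * FROM Session WHERE ExpireTime < @Now', // @Now is the named parameter
		['Now' => $Now]                                  // value bound to @Now
	);
}
```

In the application this could be called as fetchExpiredSessions(Container::newSessionStore(), new DateTime()), for example from a clean-up task that then deletes each returned session.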

Let’s look at a simple example of fetching the session we just stored.

<?PHP // example only
	
	$SessionStore = Container::newSessionStore();
	$Session = $SessionStore->fetchByName('ExampleSession');
	
?>
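A fetch can come back empty, and a stored session may already have expired, so in practice the result needs checking before use. The helper below is a sketch of that check. loadActiveSession() is a hypothetical name, and it rests on two assumptions: that fetchByName() returns null when nothing matches, and that ExpireTime comes back either as a date object or as a date string, depending on the library version.

```php
<?PHP // example only

// Hypothetical helper: return the stored session, or null if it is missing or expired.
// Assumptions: fetchByName() returns null when nothing matches, and ExpireTime comes
// back either as a date object or as a date string, depending on the library version.
function loadActiveSession($SessionStore, $KeyName, DateTime $Now){
	$Session = $SessionStore->fetchByName($KeyName);
	if($Session === null){
		return null; // nothing stored under this key name
	}
	$Expire = $Session->ExpireTime;
	if(!($Expire instanceof DateTimeInterface)){
		$Expire = new DateTime($Expire); // normalise a date string to an object
	}
	return ($Expire > $Now) ? $Session : null;
}
```

Passing the store and the current time in as arguments keeps the function easy to exercise with a stand-in store, which matters because datastore itself is not available from the command line.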

Finally, to delete an entity, we use the store’s delete() method, and pass in either the entity we wish to delete, or an empty entity with the correct KeyId or KeyName set.

<?PHP // example only

	$SessionStore = Container::newSessionStore();
	$Session = $SessionStore->fetchByName('ExampleSession');
	$SessionStore->delete($Session);
	
?>

We have now discussed how to set our application up to store our entities using stores and a gateway, and we have seen how to insert, update, fetch and delete our entities. We have not yet discussed testing our datastore interactions with PHPUnit. This is because datastore is not available to us from the command line, as it requires the App Engine SDK to function. In the next article, we will write a very basic datastore emulator that will allow us to test basic datastore interactions.