Tuesday, April 22, 2014

Part 1 - Testing your Datastore models with Google Cloud Platform Interactive Console

This is the first in a series of short articles covering a way of testing Google's Datastore models using the Interactive Console of the dev server included with the Google Cloud SDK. We'll also see a good way (at least for us) to create and load large amounts of test data for our models, so you can work with datasets similar to the ones you're going to use in production. Finally, we'll see how easy it is to integrate these tools into a simple app using Ferris MVC.

Note: I'm using Google Cloud Python SDK version 1.9.3 on 64-bit Ubuntu Linux 12.04. Because I want this short intro to grow over time, I'll be using the excellent MVC framework "Ferris" from CloudSherpas from the beginning. The IDE I'm using is Emacs 23.3.

1. Creating your first "Person" Datastore model 

1. Create a blank project 

First you need an empty project to start with. To create one, just make a copy of the new_project_template directory included with the Google Cloud SDK.

 cp -R new_project_template my_new_project   

2. Install Ferris MVC

Download Ferris MVC from the official site:

 wget -c https://bitbucket.org/cloudsherpas/ferris-framework/get/master.zip  

Unzip the package in your "my_new_project" folder or directory

 $ unzip master.zip -d .
 $ cd cloudsherpas-ferris-framework-*/
 $ mv * ../
 $ cd ..
 $ rm -fr cloudsherpas-ferris-framework-*/

Edit and save your app.yaml file so that it identifies your project

 #my_new_project/app.yaml  
 application: my-new-app-id  #.appspot.com                                                
 version: 1.0  

Start your app for the first time to check that everything is OK

Launch the dev server like this:

 $ dev_appserver.py my_new_project   

Then open your browser at http://localhost:8080 (local app URL) and at http://localhost:8000 (local admin console).

3. Creating your first Datastore Model 

For the purposes of this article we'll create a simple two-property model for the Kind "Person". This model includes the "name" of the person and also a multi-value text field for saving tags related to this person. Following the directory structure we just created by copying Ferris into our project, we need to create our model under the folder app/models, like this (you can use the text editor or IDE of your choice):

 # app/models/person.py
 from ferris import BasicModel
 from google.appengine.ext import ndb

 class Person(BasicModel):
   """
   NDB model for Person
   name: StringProperty - required
   tags: StringProperty(repeated=True) - optional
   """
   name = ndb.StringProperty(required=True)
   tags = ndb.StringProperty(repeated=True)

Once you have created your model, restart your development server: stop it with CTRL+C and run dev_appserver.py my_new_project again.

Using the interactive console to test your model 

Google's Cloud SDK contains a very nice administration app that includes an interactive console for testing your code or data. In this case, we are going to use the interactive console to test our simple Person model by creating new entries and running some queries against it.

Mass creation of test entries with sample data 

One of the things that I find most annoying when testing an app is the lack of real, or at least close to real, data. A lot of problems (performance, scalability, validation and data modeling, to name a few) can be tracked down during the early stages of development if one can count on an adequate dataset to work with. Even for our simple model it is good to have a decent-sized dataset for trying out queries and insertions over its tags or names.

Although the development server included with the GC SDK appears to have utilities to mass-insert data into Google's Datastore, I could not find a good resource or reference for that procedure when using NDB (the official documentation mentions the DB implementation but not NDB). In this scenario I remembered the excellent Perl::Maker package and decided to search for a Python equivalent. What I found was "Faker", a very simple yet powerful tool for creating user data: names, surnames and general paragraphs (lorem ipsum). So what we are going to do now is add Faker to our project and start using the Interactive Console to run some tests.

1. Download Faker from the official GitHub site

https://github.com/joke2k/faker

2. Unzip Faker inside your project folder 

 $ cd my_new_project  
 $ unzip faker-master.zip -d .   
 $ mv faker-master faker   

3. Creating mass data from Interactive Console 

Start your development server (if you haven't already)

 $ dev_appserver.py my_new_project   

Launch the Admin Console

Point your browser to http://localhost:8000, then click on "Interactive Console" in the left menu bar.



Once there, clear the default code in the text box.
Finally, copy and paste the following Python code into the text box:

from random import choice, randrange
from faker import Faker
from app.models.person import Person

# a new instance of Faker
fake = Faker()
NENTRIES = 10

# a list which contains fake tags to randomize over
ftags = ['engineering', 'custom-making', 'writing', 'electrician',
         'interiors-design', 'dancing', 'smiling', 'drifting', 'drinking']

# range(NENTRIES) runs exactly NENTRIES times (range(1, NENTRIES) would stop one short)
for i in range(NENTRIES):
    # create a new name
    fname = fake.name()
    # randomize the number of tags (0 to 8) and the tags themselves;
    # the same tag can be picked more than once
    ftags_new = [choice(ftags) for j in range(randrange(len(ftags)))]
    print ftags_new
    # save the new person to the datastore every time
    person = Person(name=fname, tags=ftags_new)
    person.put()

After executing the code above, you should see the randomly assigned tag list of each new entry printed, similar to this:

['interiors-design', 'writing', 'electrician', 'dancing', 'interiors-design', 'drinking', 'smiling', 'interiors-design']
['electrician', 'writing']
['interiors-design']
['interiors-design', 'interiors-design', 'dancing', 'smiling', 'smiling']
['drifting', 'electrician', 'dancing', 'interiors-design', 'dancing', 'drinking']
['electrician', 'writing', 'drifting', 'writing', 'writing', 'drinking', 'interiors-design', 'engineering']
['interiors-design', 'interiors-design']
['writing']
['drinking', 'drinking', 'drinking', 'engineering']
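
The sample output above shows the same tag repeated within a single entry, because each tag is drawn independently. If you'd rather have each tag appear at most once per person, and get the same data on every run, something like the following sketch may help. It uses only the standard library; random_tags, max_tags and the seed value are names I made up for this illustration and are not part of Faker or NDB.

```python
import random

FTAGS = ['engineering', 'custom-making', 'writing', 'electrician',
         'interiors-design', 'dancing', 'smiling', 'drifting', 'drinking']

def random_tags(pool, rng, max_tags=8):
    """Return between 0 and max_tags distinct tags drawn from pool."""
    count = rng.randint(0, min(max_tags, len(pool)))
    # sample() never picks the same element twice
    return rng.sample(pool, count)

# A seeded generator makes the test data reproducible between runs.
rng = random.Random(2014)
for _ in range(3):
    print(random_tags(FTAGS, rng))
```

In the console script you would pass the returned list as the tags argument when building each Person.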

Now you can go to "Datastore Viewer" in the console to see the entries you just created. In this case you created just 10 entries, but the number can be higher.
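
When the number of entries grows, calling put() once per entity means one Datastore write call per person. NDB also offers ndb.put_multi(), which saves a list of entities in a single call, so one option is to build the entities first and save them in chunks. The chunking part can be sketched with the standard library alone; chunked and the batch size of 500 are my own assumptions for illustration, not something prescribed by the SDK.

```python
def chunked(items, size):
    """Yield consecutive slices of items with at most size elements each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]

# Stand-in data: in the console these would be Person entities.
people = ['person-%d' % n for n in range(1200)]

for batch in chunked(people, 500):
    # In the Interactive Console this line would be: ndb.put_multi(batch)
    print(len(batch))
```

Keeping the chunking separate from the saving makes it easy to try different batch sizes without touching the code that builds the entities.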

...

In the next part we'll cover querying the Datastore, including repeated properties, and also some very nice options for timing your entry creation process.


References

[1] Google's App Engine Official Documentation - https://developers.google.com/appengine/?csw=1
[2] Ferris MVC - http://ferris-framework.appspot.com
[3] Faker - https://github.com/joke2k/faker