Thursday, January 20, 2011

TUTORIAL: Migrate MySQL to GAE (Google App Engine)


This post is based on Nick's Blog post "Advanced Bulk Loading, part 3: Alternate datasources", on "Migration to a Better Datastore", and on the Google App Engine guide "Uploading data".

From Nick's Blog: "The bulkloader automatically supports loading data from CSV files, but it's not restricted to that.
The key to this is the generate_records method. This method accepts a filename, and is expected to yield a list of strings for each entity to be uploaded. By overriding this method, we can load from any datasource we wish."

This is an example (also taken from Nick's blog :P), with some fixes I had to add:

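import datetime
import urlparse

import MySQLdb
from google.appengine.ext import db
from google.appengine.tools import bulkloader


def custom_date(fmt):
  """Returns a converter that parses a date string using fmt.

  Not defined in the original listing; this sketch assumes the MySQL column
  holds text such as '01/20/2011'.
  """
  return lambda value: datetime.datetime.strptime(value, fmt).date()


class MySQLLoader(bulkloader.Loader):
  def __init__(self, kind_name, query, converters):
    self.query = query
    # Unbound call, so self must be passed explicitly.
    bulkloader.Loader.__init__(self, kind_name, converters)

  def initialize(self, filename, loader_opts):
    # --loader_opts arrives as a query string, e.g. "passwd=passwd&db=things".
    self.connect_args = dict(urlparse.parse_qsl(loader_opts))

  def generate_records(self, filename):
    """Generates records from a MySQL database."""
    # Unpack the options as keyword arguments (passwd=..., db=...).
    conn = MySQLdb.connect(**self.connect_args)
    cursor = conn.cursor()
    cursor.execute(self.query)
    # Call fetchone() repeatedly until it returns None (no more rows).
    return iter(cursor.fetchone, None)


class BlogPost(db.Model):
  title = db.TextProperty(required=True)
  date = db.DateProperty(required=True, auto_now_add=True)
  body = db.TextProperty(required=True)


class BlogPostLoader(MySQLLoader):
  def __init__(self):
    MySQLLoader.__init__(self, 'BlogPost',
                         'SELECT title, date, body FROM posts',
                         [('title', str),
                          ('date', custom_date('%m/%d/%Y')),
                          ('body', str)
                         ])

# The bulkloader looks for a module-level list named `loaders`.
loaders = [BlogPostLoader]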
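
The generate_records method above relies on the two-argument form of iter(), which keeps calling a function until it returns a sentinel value. Here is a minimal sketch of that pattern on its own. It uses the standard sqlite3 module as a stand-in for MySQLdb so it runs without a MySQL server; the table contents are made-up sample data and the free function generate_records(conn, query) is only illustrative, not part of the bulkloader API.

```python
import sqlite3


def generate_records(conn, query):
    """Return an iterator over result rows, one tuple per row."""
    cursor = conn.cursor()
    cursor.execute(query)
    # iter(callable, sentinel) calls cursor.fetchone() repeatedly and
    # stops as soon as it returns the sentinel value None.
    return iter(cursor.fetchone, None)


# In-memory stand-in for the MySQL `posts` table (hypothetical sample data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (title TEXT, date TEXT, body TEXT)")
conn.executemany(
    "INSERT INTO posts VALUES (?, ?, ?)",
    [("First", "01/20/2011", "Hello"), ("Second", "01/21/2011", "World")],
)

rows = list(generate_records(conn, "SELECT title, date, body FROM posts ORDER BY title"))
for row in rows:
    print(row)
```

Because rows are fetched one at a time, the whole result set never has to be built as a list inside the loader, which should help with large tables.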


And here is how to use it. The loader above never reads the --filename value, so "sarasa" is just a placeholder:

appcfg.py upload_data --config_file=blogpost_loader.py --filename=sarasa --loader_opts="passwd=passwd&db=things" --kind=BlogPost
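
To see what the --loader_opts value turns into, here is a small sketch of the parsing step that initialize() performs. The helper name parse_loader_opts is hypothetical, and the try/except import is only there so the snippet runs on both Python 2 (as used in this post) and Python 3.

```python
try:
    from urlparse import parse_qsl  # Python 2, as used in this post
except ImportError:
    from urllib.parse import parse_qsl  # Python 3


def parse_loader_opts(loader_opts):
    """Turn a query-string style option value into a dict of keyword arguments."""
    return dict(parse_qsl(loader_opts))


connect_args = parse_loader_opts("passwd=passwd&db=things")
print(connect_args)

# Extra keys simply become extra keyword arguments (host and user are
# illustrative values, not from the post).
more_args = parse_loader_opts("host=localhost&user=root&passwd=passwd&db=things")
print(more_args)
```

Each key in the resulting dict is then passed to MySQLdb.connect as a keyword argument, so the option names should match the parameter names that connect accepts.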

If you want to load the data into the development server:

appcfg.py upload_data --config_file=blogpost_loader.py --filename=sarasa --loader_opts="passwd=passwd&db=things" --kind=BlogPost --url=http://localhost:8080/remote_api

Don't forget to add the remote_api handler to the handlers section of app.yaml:

- url: /remote_api
  script: $PYTHON_LIB/google/appengine/ext/remote_api/handler.py
  login: admin

Also open http://localhost:8080/remote_api in your browser and sign in with any email address, checking the "Sign in as Administrator" box.