Using the bulkloader with Java App Engine

The latest release of the datastore bulkloader greatly simplifies import and export of data from App Engine applications for developers. We’ll go through a step by step example for using this tool with a Java application. Note that only setting up Remote API is Java specific – everything can be used with Python applications. Unlike certain phone companies, this is one store that doesn’t care what language your application is written in.

Checking for our Prerequisites:

If you already have Python 2.5.x and the Python SDK installed, skip this section.

First off, we’ll need to download the Python SDK. This example assumes we have Python version 2.5.x installed. If you’re not sure what version you have installed, open up a terminal and type “python”. This opens up a Python REPL, with the first line displaying the version of Python you’re using. Here’s example output:

Python 2.5.1 (r251:54869, Apr 18 2007, 22:08:04)
[GCC 4.0.1 (Apple Computer, Inc. build 5367)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>>

(Yes, Pythonistas, the version on my laptop is ooooooooold).

Download the Python SDK from the following link. As of the writing of this post, the newest version is 1.3.4: Direct link.

Unzip this file. It’ll be easier for you if you place this in your path. Linux and OS X users will append this in their ~/.bash_profile:

PATH="/path/to/where/you/unzipped/appengine:${PATH}"
export PATH

To test that everything is working, type:

appcfg.py

You’ll see a page of usage commands that starts out like this:

Usage: appcfg.py [options] <action>
 
Action must be one of:
create_bulkloader_config: Create a bulkloader.yaml from a running application.
cron_info: Display information about cron jobs.
download_data: Download entities from datastore.
help: Print help for a specific action.
request_logs: Write request logs in Apache common log format.
rollback: Rollback an in-progress update.
update: Create or update an app version.
update_cron: Update application cron definitions.
update_dos: Update application dos definitions.
update_indexes: Update application indexes.
update_queues: Update application task queue definitions.
upload_data: Upload data records to datastore.
vacuum_indexes: Delete unused indexes from application.
Use 'help <action>' for a detailed description.

…. (and so forth)

Now we can go ahead and start using the bulkloader.

About the author

I'm Ikai and I've been fascinated by technology since before I could walk. I became the 'go to' guy for my friend's tech issues and after getting the same questions over and over, I decided to write down my answers. I hope you find the same value that my friend did.

Leave a Comment