Example usage for org.apache.hadoop.mapreduce.lib.db DBSplitter split

Introduction

On this page you can find example usage for the split method of org.apache.hadoop.mapreduce.lib.db.DBSplitter.

Prototype

List<InputSplit> split(Configuration conf, ResultSet results, String colName) throws SQLException;

Document

Given a ResultSet containing one record (and already advanced to that record) with two columns (a low value, and a high value, both of the same type), determine a set of splits that span the given values.
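
To make the description above concrete, here is a minimal sketch of how a low and a high value might be turned into a set of contiguous ranges. It does not use the Hadoop classes and is not Hadoop's own splitter implementation: `SplitSketch`, `splitPoints`, and `toClauses` are hypothetical names introduced for illustration, the column is assumed to be integral, and `low <= high` is assumed with no overflow handling.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of Hadoop.
class SplitSketch {

    /** Returns boundary values from low to high that delimit at most numSplits ranges. */
    static List<Long> splitPoints(long low, long high, int numSplits) {
        int n = Math.max(1, numSplits);
        // Use a step of at least 1 so a narrow range still makes progress.
        long step = Math.max(1L, (high - low) / n);
        List<Long> points = new ArrayList<>();
        points.add(low);
        for (int i = 1; i < n; i++) {
            long p = low + i * step;
            if (p >= high) {
                break; // range is narrower than the requested number of splits
            }
            points.add(p);
        }
        points.add(high);
        return points;
    }

    /** Turns boundaries into WHERE-style range clauses; only the last range includes its upper bound. */
    static List<String> toClauses(String colName, List<Long> points) {
        List<String> clauses = new ArrayList<>();
        for (int i = 0; i + 1 < points.size(); i++) {
            boolean last = (i + 2 == points.size());
            String lower = colName + " >= " + points.get(i);
            String upper = colName + (last ? " <= " : " < ") + points.get(i + 1);
            clauses.add(lower + " AND " + upper);
        }
        return clauses;
    }

    public static void main(String[] args) {
        // Assumed bounding values: MIN(id) = 0, MAX(id) = 100, four target splits.
        for (String clause : toClauses("id", splitPoints(0L, 100L, 4))) {
            System.out.println(clause);
        }
    }
}
```

With the assumed bounds 0 and 100 and four target splits, this yields four ranges whose boundaries are 0, 25, 50, 75, and 100. Keeping the lower and upper conditions as separate strings matches the two-argument shape of the split constructed in the usage example below.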

Usage

From source file: co.nubetech.apache.hadoop.mapred.DataDrivenDBInputFormat.java

License: Apache License

/** {@inheritDoc} */
public List<InputSplit> getSplits(Configuration job) throws IOException {

    int targetNumTasks = job.getInt(MRJobConfig.NUM_MAPS, 1);
    if (1 == targetNumTasks) {
        // There's no need to run a bounding vals query; just return a split
        // that separates nothing. This can be considerably more optimal for
        // a large table with no index.
        List<InputSplit> singletonSplit = new ArrayList<InputSplit>();
        singletonSplit.add(
                new org.apache.hadoop.mapreduce.lib.db.DataDrivenDBInputFormat.DataDrivenDBInputSplit("1=1",
                        "1=1"));
        return singletonSplit;
    }

    ResultSet results = null;
    Statement statement = null;
    Connection connection = getConnection();
    try {
        statement = connection.createStatement();

        results = statement.executeQuery(getBoundingValsQuery());
        results.next();

        // Based on the type of the results, use a different mechanism
        // for interpolating split points (i.e., numeric splits, text
        // splits, dates, etc.)
        int sqlDataType = results.getMetaData().getColumnType(1);
        DBSplitter splitter = getSplitter(sqlDataType);
        if (null == splitter) {
            throw new IOException("Unknown SQL data type: " + sqlDataType);
        }

        //return convertSplit(splitter.split(job, results, getDBConf()
        //      .getInputOrderBy()));
        return splitter.split(job, results, getDBConf().getInputOrderBy());
    } catch (SQLException e) {
        // Keep the original SQLException as the cause so the stack trace is not lost.
        throw new IOException(e.getMessage(), e);
    } finally {
        // More-or-less ignore SQL exceptions here, but log in case we need
        // it.
        try {
            if (null != results) {
                results.close();
            }
        } catch (SQLException se) {
            LOG.debug("SQLException closing resultset: " + se.toString());
        }

        try {
            if (null != statement) {
                statement.close();
            }
        } catch (SQLException se) {
            LOG.debug("SQLException closing statement: " + se.toString());
        }

        try {
            connection.commit();
            closeConnection();
        } catch (SQLException se) {
            LOG.debug("SQLException committing split transaction: " + se.toString());
        }
    }
}
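
The example above chooses a splitter from the JDBC type of the first result column via getSplitter, whose body is not shown. The following is a rough sketch of what such a dispatch on java.sql.Types constants could look like. It is an assumption for illustration, not the code of this class: `SplitterKindSketch` and `splitterKindFor` are hypothetical names, and the method returns a plain label rather than a DBSplitter instance. Returning null for unrecognized types corresponds to the "Unknown SQL data type" branch above.

```java
import java.sql.Types;

// Hypothetical illustration only; returns a label instead of a real DBSplitter.
class SplitterKindSketch {

    /** Maps a java.sql.Types constant to a splitting strategy label, or null if unsupported. */
    static String splitterKindFor(int sqlDataType) {
        switch (sqlDataType) {
            case Types.TINYINT:
            case Types.SMALLINT:
            case Types.INTEGER:
            case Types.BIGINT:
                return "integer";
            case Types.REAL:
            case Types.FLOAT:
            case Types.DOUBLE:
                return "float";
            case Types.NUMERIC:
            case Types.DECIMAL:
                return "decimal";
            case Types.CHAR:
            case Types.VARCHAR:
            case Types.LONGVARCHAR:
                return "text";
            case Types.DATE:
            case Types.TIME:
            case Types.TIMESTAMP:
                return "date";
            case Types.BIT:
            case Types.BOOLEAN:
                return "boolean";
            default:
                return null; // caller decides how to report an unsupported type
        }
    }

    public static void main(String[] args) {
        System.out.println(splitterKindFor(Types.BIGINT));
        System.out.println(splitterKindFor(Types.BLOB));
    }
}
```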