Package weka.core.converters
Class DatabaseLoader
- java.lang.Object
-
- weka.core.converters.AbstractLoader
-
- weka.core.converters.DatabaseLoader
-
- All Implemented Interfaces:
java.io.Serializable, BatchConverter, DatabaseConverter, IncrementalConverter, Loader, OptionHandler, RevisionHandler
public class DatabaseLoader extends AbstractLoader implements BatchConverter, IncrementalConverter, DatabaseConverter, OptionHandler
Reads Instances from a database. Can read a database in batch or incremental mode.
In incremental mode, MySQL and HSQLDB are supported.
For all other DBMS a pseudo-incremental mode is used:
In pseudo-incremental mode the instances are read into main memory all at once and then provided to the user incrementally.
For incremental loading the rows in the database table have to be ordered uniquely.
The reason is that only a single row is fetched at a time, by extending the user query with a LIMIT clause.
If this extension is impossible, instances will be loaded pseudo-incrementally. To ensure that every row is fetched exactly once, the rows have to be ordered.
Therefore a (primary) key is necessary. This approach was chosen instead of using JDBC driver facilities because the latter differ between drivers.
If you use the DatabaseSaver and save instances with an automatically generated primary key (its name is defined in DatabaseUtils), this primary key will be used for ordering but will not be part of the output. The user-defined SQL query to extract the instances should not contain LIMIT or ORDER BY clauses (see the -Q option).
In addition, for incremental loading, you can define in the DatabaseUtils file how many distinct values a nominal attribute is allowed to have. If this number is exceeded, the column will become a string attribute.
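The LIMIT-based fetching described above can be illustrated with a minimal sketch. It is not taken from the Weka sources: the class `LimitQuerySketch`, the method `rowQuery`, and the MySQL-style `LIMIT 1 OFFSET n` syntax are assumptions chosen for illustration, and the actual clause DatabaseLoader generates may differ per DBMS.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper, not part of Weka: builds a query that returns one row. */
final class LimitQuerySketch {

    /**
     * Appends an ORDER BY over the key columns and a MySQL-style
     * "LIMIT 1 OFFSET n" clause to the user query, so that the row at
     * position rowIndex (0-based) in key order is the only row returned.
     */
    static String rowQuery(String userQuery, List<String> keyColumns, int rowIndex) {
        if (keyColumns.isEmpty()) {
            // Without a unique ordering, rows could be skipped or repeated.
            throw new IllegalArgumentException("at least one key column is required");
        }
        if (rowIndex < 0) {
            throw new IllegalArgumentException("rowIndex must not be negative");
        }
        return userQuery.trim()
            + " ORDER BY " + String.join(", ", keyColumns)
            + " LIMIT 1 OFFSET " + rowIndex;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("id");
        // Print the queries that would fetch the first three rows one by one.
        for (int i = 0; i < 3; i++) {
            System.out.println(rowQuery("SELECT * FROM Results0", keys, i));
        }
    }
}
```

The sketch also shows why the user query itself should not contain LIMIT or ORDER BY clauses: appending a second set of such clauses would produce invalid SQL.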
In batch mode no string attributes will be created.
Valid options are:
-url <JDBC URL> The JDBC URL to connect to. (default: from DatabaseUtils.props file)
-user <name> The user to connect with to the database. (default: none)
-password <password> The password to connect with to the database. (default: none)
-Q <query> SQL query of the form SELECT <list of columns>|* FROM <table> [WHERE] to execute. (default: Select * From Results0)
-P <list of column names> List of column names uniquely defining a DB row (separated by ', '). Used for incremental loading. If not specified, the key will be determined automatically, if possible with the used JDBC driver. The auto ID column created by the DatabaseSaver won't be loaded.
-I Sets incremental loading
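As a rough illustration of the option list above, the following sketch assembles an option array in plain Java. The class `LoaderOptionsSketch`, its method names, and the JDBC URL and user name in `main` are hypothetical placeholders; only the flags themselves (-url, -user, -password, -Q, -P, -I) come from this page.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper, not part of Weka: assembles a DatabaseLoader option array. */
final class LoaderOptionsSketch {

    /** Builds the option array; null or empty values are left out so defaults apply. */
    static String[] buildOptions(String url, String user, String password,
                                 String query, String keys, boolean incremental) {
        List<String> opts = new ArrayList<>();
        addIfPresent(opts, "-url", url);
        addIfPresent(opts, "-user", user);
        addIfPresent(opts, "-password", password);
        addIfPresent(opts, "-Q", query);
        addIfPresent(opts, "-P", keys);
        if (incremental) {
            opts.add("-I"); // flag without an argument
        }
        return opts.toArray(new String[0]);
    }

    private static void addIfPresent(List<String> opts, String flag, String value) {
        if (value != null && !value.isEmpty()) {
            opts.add(flag);
            opts.add(value);
        }
    }

    public static void main(String[] args) {
        // The URL and user name below are placeholders, not real endpoints.
        String[] opts = buildOptions("jdbc:mysql://localhost:3306/example", "alice",
                null, "SELECT * FROM Results0", "id", true);
        System.out.println(Arrays.toString(opts));
    }
}
```

An array built this way could then be handed to setOptions(java.lang.String[]), which according to this page throws java.lang.Exception if the options cannot be set.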
- Version:
- $Revision: 11199 $
- Author:
- Stefan Mutter (mutter@cs.waikato.ac.nz)
- See Also:
Loader, Serialized Form
-
-
Field Summary
-
Fields inherited from interface weka.core.converters.Loader
BATCH, INCREMENTAL, NONE
-
-
Constructor Summary
DatabaseLoader()
Constructor
-
Method Summary
Modifier and Type, Method, Description
void connectToDatabase()
Opens a connection to the database.
Instances getDataSet()
Returns the full data set in batch mode (header and all instances at once).
java.lang.String getKeys()
Gets the names of the key columns.
Instance getNextInstance(Instances structure)
Reads the data set incrementally: gets the next instance in the data set, or returns null if there are no more instances to get.
java.lang.String[] getOptions()
Gets the setting.
java.lang.String getPassword()
Returns the database password.
java.lang.String getQuery()
Gets the query to execute against the database.
java.lang.String getRevision()
Returns the revision string.
Instances getStructure()
Determines and returns (if possible) the structure (internally the header) of the data set as an empty set of instances.
java.lang.String getUrl()
Gets the URL.
java.lang.String getUser()
Gets the user name.
java.lang.String globalInfo()
Returns a string describing this Loader.
java.lang.String keysTipText()
The tip text for this property.
java.util.Enumeration listOptions()
Lists the available options.
static void main(java.lang.String[] options)
Main method.
java.lang.String passwordTipText()
The tip text for this property.
java.lang.String queryTipText()
The tip text for this property.
void reset()
Resets the Loader ready to read a new data set.
void resetStructure()
Resets the structure of instances.
void setKeys(java.lang.String keys)
Sets the key columns of a database table.
void setOptions(java.lang.String[] options)
Sets the options.
void setPassword(java.lang.String password)
Sets the user password for the database.
void setQuery(java.lang.String q)
Sets the query to execute against the database.
void setSource()
Sets the database URL using the DatabaseUtils file.
void setSource(java.lang.String url)
Sets the database URL.
void setSource(java.lang.String url, java.lang.String userName, java.lang.String password)
Sets the database URL, user and password.
void setUrl(java.lang.String url)
Sets the database URL.
void setUser(java.lang.String user)
Sets the database user.
java.lang.String urlTipText()
The tip text for this property.
java.lang.String userTipText()
The tip text for this property.
-
Methods inherited from class weka.core.converters.AbstractLoader
setRetrieval, setSource, setSource
-
-
-
-
Method Detail
-
globalInfo
public java.lang.String globalInfo()
Returns a string describing this Loader.
- Returns:
- a description of the Loader suitable for displaying in the explorer/experimenter gui
-
reset
public void reset() throws java.lang.Exception
Resets the Loader ready to read a new data set.
- Specified by:
reset in interface Loader
- Overrides:
reset in class AbstractLoader
- Throws:
java.lang.Exception
- if an error occurs while disconnecting from the database
-
resetStructure
public void resetStructure()
Resets the structure of instances
-
setQuery
public void setQuery(java.lang.String q)
Sets the query to execute against the database.
- Parameters:
q
- the query to execute
-
getQuery
public java.lang.String getQuery()
Gets the query to execute against the database.
- Returns:
- the query
-
queryTipText
public java.lang.String queryTipText()
The tip text for this property.
- Returns:
- the tip text
-
setKeys
public void setKeys(java.lang.String keys)
Sets the key columns of a database table.
- Parameters:
keys
- a String containing the key columns in a comma separated list.
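A comma-separated key list of the kind setKeys expects can be prepared and checked with a few lines of plain Java. This is only a sketch under assumptions: `KeyListSketch`, `parseKeys`, and `joinKeys` are hypothetical names, the column names in `main` are made up, and how DatabaseLoader itself tokenizes the string is not specified on this page.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper, not part of Weka: handles comma-separated key lists. */
final class KeyListSketch {

    /** Splits a comma-separated list into trimmed, non-empty column names. */
    static List<String> parseKeys(String keys) {
        List<String> result = new ArrayList<>();
        if (keys == null) {
            return result;
        }
        for (String part : keys.split(",")) {
            String name = part.trim();
            if (!name.isEmpty()) {
                result.add(name);
            }
        }
        return result;
    }

    /** Joins column names with ", ", the separator mentioned for the -P option. */
    static String joinKeys(List<String> names) {
        return String.join(", ", names);
    }

    public static void main(String[] args) {
        // Column names are made up for the example.
        List<String> names = parseKeys("run_id, fold ,seed");
        System.out.println(names);
        System.out.println(joinKeys(names));
    }
}
```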
-
getKeys
public java.lang.String getKeys()
Gets the names of the key columns.
- Returns:
- the names of the key columns
-
keysTipText
public java.lang.String keysTipText()
The tip text for this property.
- Returns:
- the tip text
-
setUrl
public void setUrl(java.lang.String url)
Sets the database URL.
- Specified by:
setUrl in interface DatabaseConverter
- Parameters:
url
- string with the database URL
-
getUrl
public java.lang.String getUrl()
Gets the URL.
- Specified by:
getUrl in interface DatabaseConverter
- Returns:
- the URL
-
urlTipText
public java.lang.String urlTipText()
The tip text for this property.
- Returns:
- the tip text
-
setUser
public void setUser(java.lang.String user)
Sets the database user.
- Specified by:
setUser in interface DatabaseConverter
- Parameters:
user
- the database user name
-
getUser
public java.lang.String getUser()
Gets the user name.
- Specified by:
getUser in interface DatabaseConverter
- Returns:
- name of database user
-
userTipText
public java.lang.String userTipText()
The tip text for this property.
- Returns:
- the tip text
-
setPassword
public void setPassword(java.lang.String password)
Sets the user password for the database.
- Specified by:
setPassword in interface DatabaseConverter
- Parameters:
password
- the password
-
getPassword
public java.lang.String getPassword()
Returns the database password.
- Returns:
- the database password
-
passwordTipText
public java.lang.String passwordTipText()
The tip text for this property.
- Returns:
- the tip text
-
setSource
public void setSource(java.lang.String url, java.lang.String userName, java.lang.String password)
Sets the database URL, user and password.
- Parameters:
url
- the database URL
userName
- the user name
password
- the password
-
setSource
public void setSource(java.lang.String url)
Sets the database URL.
- Parameters:
url
- the database url
-
setSource
public void setSource() throws java.lang.Exception
Sets the database URL using the DatabaseUtils file.
- Throws:
java.lang.Exception
- if something goes wrong
-
connectToDatabase
public void connectToDatabase()
Opens a connection to the database
-
getStructure
public Instances getStructure() throws java.io.IOException
Determines and returns (if possible) the structure (internally the header) of the data set as an empty set of instances.
- Specified by:
getStructure in interface Loader
- Specified by:
getStructure in class AbstractLoader
- Returns:
- the structure of the data set as an empty set of Instances
- Throws:
java.io.IOException
- if an error occurs
-
getDataSet
public Instances getDataSet() throws java.io.IOException
Returns the full data set in batch mode (header and all instances at once).
- Specified by:
getDataSet in interface Loader
- Specified by:
getDataSet in class AbstractLoader
- Returns:
- the full data set (header and all instances) as a set of Instances
- Throws:
java.io.IOException
- if there is no source or parsing fails
-
getNextInstance
public Instance getNextInstance(Instances structure) throws java.io.IOException
Reads the data set incrementally: gets the next instance in the data set, or returns null if there are no more instances to get. If the structure hasn't yet been determined by a call to getStructure, this method does so before returning the next instance in the data set.
- Specified by:
getNextInstance in interface Loader
- Specified by:
getNextInstance in class AbstractLoader
- Parameters:
structure
- the dataset header information, will get updated in case of string or relational attributes
- Returns:
- the next instance in the data set as an Instance object or null if there are no more instances to be read
- Throws:
java.io.IOException
- if there is an error during parsing
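The calling pattern implied by this contract (keep asking for the next item until null comes back) and the pseudo-incremental mode described in the class overview (everything is loaded into memory first, then handed out one item at a time) can be sketched in plain Java. `PseudoIncrementalSketch` is a hypothetical stand-in that works on arbitrary objects; it does not use Weka's Instance or Instances classes and is not the loader's actual implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Hypothetical stand-in, not part of Weka: serves preloaded rows one at a time. */
final class PseudoIncrementalSketch<T> {

    private final Iterator<T> rows;

    /** Copies all rows into memory up front, as in pseudo-incremental mode. */
    PseudoIncrementalSketch(List<T> allRows) {
        this.rows = new ArrayList<>(allRows).iterator();
    }

    /** Returns the next row, or null once every row has been handed out. */
    T next() {
        return rows.hasNext() ? rows.next() : null;
    }

    public static void main(String[] args) {
        PseudoIncrementalSketch<String> source =
            new PseudoIncrementalSketch<>(Arrays.asList("row-1", "row-2", "row-3"));
        String row;
        // Same loop shape a caller would use: stop when null is returned.
        while ((row = source.next()) != null) {
            System.out.println(row);
        }
    }
}
```

With a DatabaseLoader, the analogous loop would obtain the header once through getStructure() and then call getNextInstance(structure) until it returns null.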
-
getOptions
public java.lang.String[] getOptions()
Gets the setting.
- Specified by:
getOptions in interface OptionHandler
- Returns:
- the current setting
-
listOptions
public java.util.Enumeration listOptions()
Lists the available options.
- Specified by:
listOptions in interface OptionHandler
- Returns:
- an enumeration of the available options
-
setOptions
public void setOptions(java.lang.String[] options) throws java.lang.Exception
Sets the options.
Valid options are:
-url <JDBC URL> The JDBC URL to connect to. (default: from DatabaseUtils.props file)
-user <name> The user to connect with to the database. (default: none)
-password <password> The password to connect with to the database. (default: none)
-Q <query> SQL query of the form SELECT <list of columns>|* FROM <table> [WHERE] to execute. (default: Select * From Results0)
-P <list of column names> List of column names uniquely defining a DB row (separated by ', '). Used for incremental loading. If not specified, the key will be determined automatically, if possible with the used JDBC driver. The auto ID column created by the DatabaseSaver won't be loaded.
-I Sets incremental loading
- Specified by:
setOptions in interface OptionHandler
- Parameters:
options
- the options
- Throws:
java.lang.Exception
- if options cannot be set
-
getRevision
public java.lang.String getRevision()
Returns the revision string.
- Specified by:
getRevision in interface RevisionHandler
- Returns:
- the revision
-
main
public static void main(java.lang.String[] options)
Main method.
- Parameters:
options
- the options
-
-